Message ID | 166377993287.1737053.10258297257583703949.stgit@firesoul (mailing list archive) |
---|---|
State | Accepted |
Commit | fb33ec016b8710281343ce73bec92bfe54bad4fa |
Delegated to: | Netdev Maintainers |
Series | [net-next] xdp: improve page_pool xdp_return performance |
Jesper Dangaard Brouer <brouer@redhat.com> writes:

> During LPC2022 I met up with my page_pool co-maintainer Ilias. When
> discussing page_pool code we realised/remembered certain optimizations
> had not been fully utilised.
>
> Since commit c07aea3ef4d4 ("mm: add a signature in struct page") struct
> page has a direct pointer to the page_pool object this page was
> allocated from.
>
> Thus, with this info it is possible to skip the rhashtable_lookup to
> find the page_pool object in __xdp_return().
>
> The rcu_read_lock can be removed as it was tied to xdp_mem_allocator.
> The page_pool object is still safe to access as it tracks inflight pages
> and (potentially) schedules final release from a work queue.
>
> Created a micro benchmark of XDP redirecting from mlx5 into veth with
> an XDP_DROP bpf-prog on the peer veth device. This increased performance
> by 6.5%, from approx 8.45 Mpps to 9 Mpps, corresponding to using 7 nanosec
> (27 cycles at 3.8 GHz) less per packet.
>
> Suggested-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

Nice! The two of you should get together in person more often ;)

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Hi Jesper,

On Wed, Sep 21, 2022 at 07:05:32PM +0200, Jesper Dangaard Brouer wrote:
> During LPC2022 I met up with my page_pool co-maintainer Ilias. When
> discussing page_pool code we realised/remembered certain optimizations
> had not been fully utilised.
>
> Since commit c07aea3ef4d4 ("mm: add a signature in struct page") struct
> page has a direct pointer to the page_pool object this page was
> allocated from.
>
> Thus, with this info it is possible to skip the rhashtable_lookup to
> find the page_pool object in __xdp_return().
>
> The rcu_read_lock can be removed as it was tied to xdp_mem_allocator.
> The page_pool object is still safe to access as it tracks inflight pages
> and (potentially) schedules final release from a work queue.
>
> Created a micro benchmark of XDP redirecting from mlx5 into veth with
> an XDP_DROP bpf-prog on the peer veth device. This increased performance
> by 6.5%, from approx 8.45 Mpps to 9 Mpps, corresponding to using 7 nanosec
> (27 cycles at 3.8 GHz) less per packet.

Thanks for the detailed testing

> Suggested-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>  net/core/xdp.c |   10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 24420209bf0e..844c9d99dc0e 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -375,19 +375,17 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_reg_mem_model);
>  void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct,
>  		  struct xdp_buff *xdp)
>  {
> -	struct xdp_mem_allocator *xa;
>  	struct page *page;
>
>  	switch (mem->type) {
>  	case MEM_TYPE_PAGE_POOL:
> -		rcu_read_lock();
> -		/* mem->id is valid, checked in xdp_rxq_info_reg_mem_model() */
> -		xa = rhashtable_lookup(mem_id_ht, &mem->id, mem_id_rht_params);
>  		page = virt_to_head_page(data);
>  		if (napi_direct && xdp_return_frame_no_direct())
>  			napi_direct = false;
> -		page_pool_put_full_page(xa->page_pool, page, napi_direct);
> -		rcu_read_unlock();
> +		/* No need to check ((page->pp_magic & ~0x3UL) == PP_SIGNATURE)
> +		 * as mem->type knows this a page_pool page
> +		 */
> +		page_pool_put_full_page(page->pp, page, napi_direct);
>  		break;
>  	case MEM_TYPE_PAGE_SHARED:
>  		page_frag_free(data);

Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
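The comment in the patch refers to the signature check that return paths need when they only have a `struct page` and no `xdp_mem_info` telling them where the page came from. Below is a minimal sketch of that pattern, assuming the page_pool fields introduced by commit c07aea3ef4d4 (`page->pp`, `page->pp_magic`, `PP_SIGNATURE`); the helper names are invented for illustration and are not kernel APIs.

```c
#include <linux/mm.h>		/* struct page, put_page() */
#include <linux/poison.h>	/* PP_SIGNATURE */
#include <net/page_pool.h>	/* page_pool_put_full_page() */

/* Illustrative sketch only: the style of check a generic return path needs
 * when it has nothing but a struct page and must decide whether page->pp is
 * trustworthy. In __xdp_return() the mem->type == MEM_TYPE_PAGE_POOL case
 * already guarantees the page came from a page_pool, so the check is skipped.
 */
static bool sketch_page_is_pp(const struct page *page)
{
	/* The low bits of pp_magic may carry extra state, hence the mask. */
	return (page->pp_magic & ~0x3UL) == PP_SIGNATURE;
}

static void sketch_return_page(struct page *page, bool allow_direct)
{
	if (sketch_page_is_pp(page))
		page_pool_put_full_page(page->pp, page, allow_direct);
	else
		put_page(page);	/* not a page_pool page: plain refcount drop */
}
```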
Hello:

This patch was applied to netdev/net-next.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 21 Sep 2022 19:05:32 +0200 you wrote:
> During LPC2022 I met up with my page_pool co-maintainer Ilias. When
> discussing page_pool code we realised/remembered certain optimizations
> had not been fully utilised.
>
> Since commit c07aea3ef4d4 ("mm: add a signature in struct page") struct
> page has a direct pointer to the page_pool object this page was
> allocated from.
>
> [...]

Here is the summary with links:
  - [net-next] xdp: improve page_pool xdp_return performance
    https://git.kernel.org/netdev/net-next/c/fb33ec016b87

You are awesome, thank you!
```diff
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 24420209bf0e..844c9d99dc0e 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -375,19 +375,17 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_reg_mem_model);
 void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct,
 		  struct xdp_buff *xdp)
 {
-	struct xdp_mem_allocator *xa;
 	struct page *page;
 
 	switch (mem->type) {
 	case MEM_TYPE_PAGE_POOL:
-		rcu_read_lock();
-		/* mem->id is valid, checked in xdp_rxq_info_reg_mem_model() */
-		xa = rhashtable_lookup(mem_id_ht, &mem->id, mem_id_rht_params);
 		page = virt_to_head_page(data);
 		if (napi_direct && xdp_return_frame_no_direct())
 			napi_direct = false;
-		page_pool_put_full_page(xa->page_pool, page, napi_direct);
-		rcu_read_unlock();
+		/* No need to check ((page->pp_magic & ~0x3UL) == PP_SIGNATURE)
+		 * as mem->type knows this a page_pool page
+		 */
+		page_pool_put_full_page(page->pp, page, napi_direct);
 		break;
 	case MEM_TYPE_PAGE_SHARED:
 		page_frag_free(data);
```
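Why is it safe to drop `rcu_read_lock()` and dereference `page->pp` directly? The commit message's argument is a lifetime rule: a page_pool tracks its inflight pages and defers final release until they have all come back. The sketch below is a highly simplified, hypothetical model of that rule, not the real implementation; all names are invented, and the actual accounting in net/core/page_pool.c uses hold/release counters plus a deferred-release work queue rather than a plain refcount.

```c
#include <linux/refcount.h>
#include <linux/slab.h>

/* Toy model: the pool holds one reference for itself plus one per page that
 * is currently out in the wild. The memory cannot go away while any page is
 * still outstanding, so a returning page may always follow its pool pointer
 * without RCU protection.
 */
struct toy_pool {
	refcount_t users;	/* 1 for the pool + 1 per inflight page */
};

static void toy_pool_put(struct toy_pool *pool)
{
	if (refcount_dec_and_test(&pool->users))
		kfree(pool);
}

/* Page leaves the pool (e.g. handed to the stack): it "carries" a reference. */
static void toy_pool_page_out(struct toy_pool *pool)
{
	refcount_inc(&pool->users);
}

/* Page comes back (the __xdp_return() case): drop the carried reference. */
static void toy_pool_page_in(struct toy_pool *pool)
{
	toy_pool_put(pool);
}

/* Driver teardown: drop the pool's own reference; the object survives until
 * the last outstanding page has been returned.
 */
static void toy_pool_destroy(struct toy_pool *pool)
{
	toy_pool_put(pool);
}
```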
During LPC2022 I met up with my page_pool co-maintainer Ilias. When
discussing page_pool code we realised/remembered certain optimizations
had not been fully utilised.

Since commit c07aea3ef4d4 ("mm: add a signature in struct page") struct
page has a direct pointer to the page_pool object this page was
allocated from.

Thus, with this info it is possible to skip the rhashtable_lookup to
find the page_pool object in __xdp_return().

The rcu_read_lock can be removed as it was tied to xdp_mem_allocator.
The page_pool object is still safe to access as it tracks inflight pages
and (potentially) schedules final release from a work queue.

Created a micro benchmark of XDP redirecting from mlx5 into veth with
an XDP_DROP bpf-prog on the peer veth device. This increased performance
by 6.5%, from approx 8.45 Mpps to 9 Mpps, corresponding to using 7 nanosec
(27 cycles at 3.8 GHz) less per packet.

Suggested-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 net/core/xdp.c |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)
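The quoted numbers are internally consistent: 8.45 Mpps is about 118.3 ns per packet, 9 Mpps is about 111.1 ns, a saving of roughly 7 ns, which at 3.8 GHz is about 27 cycles, and 9/8.45 is a 6.5% rate increase. A small standalone program (not part of the patch) reproducing the arithmetic:

```c
/* Sanity check of the benchmark arithmetic quoted in the commit message:
 * convert packet rates to per-packet cost and the saving to CPU cycles.
 */
#include <stdio.h>

int main(void)
{
	double before_mpps = 8.45, after_mpps = 9.0, ghz = 3.8;
	double ns_before = 1e3 / before_mpps;	/* ns per packet at 8.45 Mpps */
	double ns_after  = 1e3 / after_mpps;	/* ns per packet at 9.0 Mpps */
	double saved_ns  = ns_before - ns_after;

	printf("before: %.1f ns/pkt, after: %.1f ns/pkt\n", ns_before, ns_after);
	printf("saved:  %.1f ns/pkt (~%.0f cycles at %.1f GHz)\n",
	       saved_ns, saved_ns * ghz, ghz);
	printf("rate increase: %.1f%%\n",
	       (after_mpps / before_mpps - 1.0) * 100.0);
	return 0;
}
```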