Message ID | 20230727144336.1646454-3-aleksander.lobakin@intel.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | page_pool: a couple of assorted optimizations | expand |
On 2023/7/27 22:43, Alexander Lobakin wrote: > diff --git a/net/core/skbuff.c b/net/core/skbuff.c ... > +bool page_pool_return_skb_page(struct page *page, bool napi_safe) Still having the 'page_pool_' prefix seems odd here when it is in the skbuff.c where most have skb_ or napi_ prefix, is it better to rename it to something like napi_return_page_pool_page()? > +{ > + struct napi_struct *napi; > + struct page_pool *pp; > + bool allow_direct; > + > + page = compound_head(page); > + > + /* page->pp_magic is OR'ed with PP_SIGNATURE after the allocation > + * in order to preserve any existing bits, such as bit 0 for the > + * head page of compound page and bit 1 for pfmemalloc page, so > + * mask those bits for freeing side when doing below checking, > + * and page_is_pfmemalloc() is checked in __page_pool_put_page() > + * to avoid recycling the pfmemalloc page. > + */ > + if (unlikely((page->pp_magic & ~0x3UL) != PP_SIGNATURE)) > + return false; > + > + pp = page->pp; > + > + /* Allow direct recycle if we have reasons to believe that we are > + * in the same context as the consumer would run, so there's > + * no possible race. > + */ > + napi = READ_ONCE(pp->p.napi); > + allow_direct = napi_safe && napi && > + READ_ONCE(napi->list_owner) == smp_processor_id(); > + > + /* Driver set this to memory recycling info. Reset it on recycle. > + * This will *not* work for NIC using a split-page memory model. > + * The page will be returned to the pool here regardless of the > + * 'flipped' fragment being in use or not. > + */ > + page_pool_put_full_page(pp, page, allow_direct); > + > + return true; > +} > +EXPORT_SYMBOL(page_pool_return_skb_page); > + > static bool skb_pp_recycle(struct sk_buff *skb, void *data, bool napi_safe) > { > if (!IS_ENABLED(CONFIG_PAGE_POOL) || !skb->pp_recycle) >
From: Yunsheng Lin <linyunsheng@huawei.com> Date: Fri, 28 Jul 2023 20:02:51 +0800 > On 2023/7/27 22:43, Alexander Lobakin wrote: >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c > > ... > >> +bool page_pool_return_skb_page(struct page *page, bool napi_safe) > > Still having the 'page_pool_' prefix seems odd here when it is in the > skbuff.c where most have skb_ or napi_ prefix, is it better to rename > it to something like napi_return_page_pool_page()? Given that how the function that goes next is named, maybe skb_pp_return_page() (or skb_pp_put_page())? > >> +{ >> + struct napi_struct *napi; >> + struct page_pool *pp; >> + bool allow_direct; >> + >> + page = compound_head(page); >> + >> + /* page->pp_magic is OR'ed with PP_SIGNATURE after the allocation >> + * in order to preserve any existing bits, such as bit 0 for the >> + * head page of compound page and bit 1 for pfmemalloc page, so >> + * mask those bits for freeing side when doing below checking, >> + * and page_is_pfmemalloc() is checked in __page_pool_put_page() >> + * to avoid recycling the pfmemalloc page. >> + */ >> + if (unlikely((page->pp_magic & ~0x3UL) != PP_SIGNATURE)) >> + return false; >> + >> + pp = page->pp; >> + >> + /* Allow direct recycle if we have reasons to believe that we are >> + * in the same context as the consumer would run, so there's >> + * no possible race. >> + */ >> + napi = READ_ONCE(pp->p.napi); >> + allow_direct = napi_safe && napi && >> + READ_ONCE(napi->list_owner) == smp_processor_id(); >> + >> + /* Driver set this to memory recycling info. Reset it on recycle. >> + * This will *not* work for NIC using a split-page memory model. >> + * The page will be returned to the pool here regardless of the >> + * 'flipped' fragment being in use or not. >> + */ >> + page_pool_put_full_page(pp, page, allow_direct); >> + >> + return true; >> +} >> +EXPORT_SYMBOL(page_pool_return_skb_page); >> + >> static bool skb_pp_recycle(struct sk_buff *skb, void *data, bool napi_safe) (this one) >> { >> if (!IS_ENABLED(CONFIG_PAGE_POOL) || !skb->pp_recycle) >> Thanks, Olek
On 2023/7/28 21:58, Alexander Lobakin wrote: > From: Yunsheng Lin <linyunsheng@huawei.com> > Date: Fri, 28 Jul 2023 20:02:51 +0800 > >> On 2023/7/27 22:43, Alexander Lobakin wrote: >>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >> >> ... >> >>> +bool page_pool_return_skb_page(struct page *page, bool napi_safe) >> >> Still having the 'page_pool_' prefix seems odd here when it is in the >> skbuff.c where most have skb_ or napi_ prefix, is it better to rename >> it to something like napi_return_page_pool_page()? > > Given that how the function that goes next is named, maybe > skb_pp_return_page() (or skb_pp_put_page())? skb_pp_put_page() seems better. And I like napi_pp_put_page() with 'napi_' prefix better as it does not take a skb as parameter and the naming is aligned with the 'napi_safe' parameter. > >> >>> +{ >>> + struct napi_struct *napi; >>> + struct page_pool *pp; >>> + bool allow_direct; >>> + >>> + page = compound_head(page); >>> + >>> + /* page->pp_magic is OR'ed with PP_SIGNATURE after the allocation >>> + * in order to preserve any existing bits, such as bit 0 for the >>> + * head page of compound page and bit 1 for pfmemalloc page, so >>> + * mask those bits for freeing side when doing below checking, >>> + * and page_is_pfmemalloc() is checked in __page_pool_put_page() >>> + * to avoid recycling the pfmemalloc page. >>> + */ >>> + if (unlikely((page->pp_magic & ~0x3UL) != PP_SIGNATURE)) >>> + return false; >>> + >>> + pp = page->pp; >>> + >>> + /* Allow direct recycle if we have reasons to believe that we are >>> + * in the same context as the consumer would run, so there's >>> + * no possible race. >>> + */ >>> + napi = READ_ONCE(pp->p.napi); >>> + allow_direct = napi_safe && napi && >>> + READ_ONCE(napi->list_owner) == smp_processor_id(); >>> + >>> + /* Driver set this to memory recycling info. Reset it on recycle. >>> + * This will *not* work for NIC using a split-page memory model. >>> + * The page will be returned to the pool here regardless of the >>> + * 'flipped' fragment being in use or not. >>> + */ >>> + page_pool_put_full_page(pp, page, allow_direct); >>> + >>> + return true; >>> +} >>> +EXPORT_SYMBOL(page_pool_return_skb_page); >>> + >>> static bool skb_pp_recycle(struct sk_buff *skb, void *data, bool napi_safe) > > (this one) > >>> { >>> if (!IS_ENABLED(CONFIG_PAGE_POOL) || !skb->pp_recycle) We may need the 'IS_ENABLED(CONFIG_PAGE_POOL' checking in the newly moved function too. >>> > > Thanks, > Olek > > . >
From: Yunsheng Lin <linyunsheng@huawei.com> Date: Sat, 29 Jul 2023 19:40:19 +0800 > On 2023/7/28 21:58, Alexander Lobakin wrote: >> From: Yunsheng Lin <linyunsheng@huawei.com> >> Date: Fri, 28 Jul 2023 20:02:51 +0800 >> >>> On 2023/7/27 22:43, Alexander Lobakin wrote: >>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >>> >>> ... >>> >>>> +bool page_pool_return_skb_page(struct page *page, bool napi_safe) >>> >>> Still having the 'page_pool_' prefix seems odd here when it is in the >>> skbuff.c where most have skb_ or napi_ prefix, is it better to rename >>> it to something like napi_return_page_pool_page()? >> >> Given that how the function that goes next is named, maybe >> skb_pp_return_page() (or skb_pp_put_page())? > > skb_pp_put_page() seems better. > > And I like napi_pp_put_page() with 'napi_' prefix better as > it does not take a skb as parameter and the naming is aligned > with the 'napi_safe' parameter. Ah, I see. Sounds reasonable. I'll pick napi_pp_put_page() for the next round, I like this one. > >> >>> >>>> +{ >>>> + struct napi_struct *napi; >>>> + struct page_pool *pp; >>>> + bool allow_direct; >>>> + >>>> + page = compound_head(page); >>>> + >>>> + /* page->pp_magic is OR'ed with PP_SIGNATURE after the allocation >>>> + * in order to preserve any existing bits, such as bit 0 for the >>>> + * head page of compound page and bit 1 for pfmemalloc page, so >>>> + * mask those bits for freeing side when doing below checking, >>>> + * and page_is_pfmemalloc() is checked in __page_pool_put_page() >>>> + * to avoid recycling the pfmemalloc page. >>>> + */ >>>> + if (unlikely((page->pp_magic & ~0x3UL) != PP_SIGNATURE)) >>>> + return false; >>>> + >>>> + pp = page->pp; >>>> + >>>> + /* Allow direct recycle if we have reasons to believe that we are >>>> + * in the same context as the consumer would run, so there's >>>> + * no possible race. >>>> + */ >>>> + napi = READ_ONCE(pp->p.napi); >>>> + allow_direct = napi_safe && napi && >>>> + READ_ONCE(napi->list_owner) == smp_processor_id(); >>>> + >>>> + /* Driver set this to memory recycling info. Reset it on recycle. >>>> + * This will *not* work for NIC using a split-page memory model. >>>> + * The page will be returned to the pool here regardless of the >>>> + * 'flipped' fragment being in use or not. >>>> + */ >>>> + page_pool_put_full_page(pp, page, allow_direct); >>>> + >>>> + return true; >>>> +} >>>> +EXPORT_SYMBOL(page_pool_return_skb_page); >>>> + >>>> static bool skb_pp_recycle(struct sk_buff *skb, void *data, bool napi_safe) >> >> (this one) >> >>>> { >>>> if (!IS_ENABLED(CONFIG_PAGE_POOL) || !skb->pp_recycle) > > We may need the 'IS_ENABLED(CONFIG_PAGE_POOL' checking in the newly > moved function too. The first person who noticed this, for sure we should have it! > >>>> >> >> Thanks, >> Olek >> >> . >> Thanks, Olek
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 888e3d7e74c1..7effd94efd6c 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -32,7 +32,6 @@ #include <linux/if_packet.h> #include <linux/llist.h> #include <net/flow.h> -#include <net/page_pool/types.h> #if IS_ENABLED(CONFIG_NF_CONNTRACK) #include <linux/netfilter/nf_conntrack_common.h> #endif @@ -3421,6 +3420,8 @@ static inline void skb_frag_ref(struct sk_buff *skb, int f) __skb_frag_ref(&skb_shinfo(skb)->frags[f]); } +bool page_pool_return_skb_page(struct page *page, bool napi_safe); + static inline void napi_frag_unref(skb_frag_t *frag, bool recycle, bool napi_safe) { diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h index 4a0270291deb..c7aef6c75935 100644 --- a/include/net/page_pool/types.h +++ b/include/net/page_pool/types.h @@ -156,8 +156,6 @@ struct page_pool { struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp); struct page *page_pool_alloc_frag(struct page_pool *pool, unsigned int *offset, unsigned int size, gfp_t gfp); -bool page_pool_return_skb_page(struct page *page, bool napi_safe); - struct page_pool *page_pool_create(const struct page_pool_params *params); struct xdp_mem_info; diff --git a/net/core/page_pool.c b/net/core/page_pool.c index 2a75f61264c5..7a23ca6b1124 100644 --- a/net/core/page_pool.c +++ b/net/core/page_pool.c @@ -906,42 +906,3 @@ void page_pool_update_nid(struct page_pool *pool, int new_nid) } } EXPORT_SYMBOL(page_pool_update_nid); - -bool page_pool_return_skb_page(struct page *page, bool napi_safe) -{ - struct napi_struct *napi; - struct page_pool *pp; - bool allow_direct; - - page = compound_head(page); - - /* page->pp_magic is OR'ed with PP_SIGNATURE after the allocation - * in order to preserve any existing bits, such as bit 0 for the - * head page of compound page and bit 1 for pfmemalloc page, so - * mask those bits for freeing side when doing below checking, - * and page_is_pfmemalloc() is checked in __page_pool_put_page() - * to avoid recycling the pfmemalloc page. - */ - if (unlikely((page->pp_magic & ~0x3UL) != PP_SIGNATURE)) - return false; - - pp = page->pp; - - /* Allow direct recycle if we have reasons to believe that we are - * in the same context as the consumer would run, so there's - * no possible race. - */ - napi = READ_ONCE(pp->p.napi); - allow_direct = napi_safe && napi && - READ_ONCE(napi->list_owner) == smp_processor_id(); - - /* Driver set this to memory recycling info. Reset it on recycle. - * This will *not* work for NIC using a split-page memory model. - * The page will be returned to the pool here regardless of the - * 'flipped' fragment being in use or not. - */ - page_pool_put_full_page(pp, page, allow_direct); - - return true; -} -EXPORT_SYMBOL(page_pool_return_skb_page); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index ac8f421f8ab3..3084ef59400b 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -73,7 +73,7 @@ #include <net/mpls.h> #include <net/mptcp.h> #include <net/mctp.h> -#include <net/page_pool/types.h> +#include <net/page_pool/helpers.h> #include <net/dropreason.h> #include <linux/uaccess.h> @@ -879,6 +879,45 @@ static void skb_clone_fraglist(struct sk_buff *skb) skb_get(list); } +bool page_pool_return_skb_page(struct page *page, bool napi_safe) +{ + struct napi_struct *napi; + struct page_pool *pp; + bool allow_direct; + + page = compound_head(page); + + /* page->pp_magic is OR'ed with PP_SIGNATURE after the allocation + * in order to preserve any existing bits, such as bit 0 for the + * head page of compound page and bit 1 for pfmemalloc page, so + * mask those bits for freeing side when doing below checking, + * and page_is_pfmemalloc() is checked in __page_pool_put_page() + * to avoid recycling the pfmemalloc page. + */ + if (unlikely((page->pp_magic & ~0x3UL) != PP_SIGNATURE)) + return false; + + pp = page->pp; + + /* Allow direct recycle if we have reasons to believe that we are + * in the same context as the consumer would run, so there's + * no possible race. + */ + napi = READ_ONCE(pp->p.napi); + allow_direct = napi_safe && napi && + READ_ONCE(napi->list_owner) == smp_processor_id(); + + /* Driver set this to memory recycling info. Reset it on recycle. + * This will *not* work for NIC using a split-page memory model. + * The page will be returned to the pool here regardless of the + * 'flipped' fragment being in use or not. + */ + page_pool_put_full_page(pp, page, allow_direct); + + return true; +} +EXPORT_SYMBOL(page_pool_return_skb_page); + static bool skb_pp_recycle(struct sk_buff *skb, void *data, bool napi_safe) { if (!IS_ENABLED(CONFIG_PAGE_POOL) || !skb->pp_recycle)
Currently, touching <net/page_pool/types.h> triggers a rebuild of more than half of the kernel. That's because it's included in <linux/skbuff.h>. And each new include to page_pool/types.h adds more [useless] data for the toolchain to process per each source file from that pile. In commit 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling"), Matteo included it to be able to call a couple of functions defined there. Then, in commit 57f05bc2ab24 ("page_pool: keep pp info as long as page pool owns the page") one of the calls was removed, so only one was left. It's the call to page_pool_return_skb_page() in napi_frag_unref(). The function is external and doesn't have any dependencies. Having very niche page_pool_types.h included only for that looks like an overkill. As %PP_SIGNATURE is not local to page_pool.c (was only in the early submissions), nothing holds this function there. Teleport page_pool_return_skb_page() to skbuff.c, just next to the main consumer, skb_pp_recycle(). It's used also in napi_frag_unref() -> {__,}skb_frag_unref(), so no `static` unfortunately. Maybe next time. Now, touching page_pool_types.h only triggers rebuilding of the drivers using it and a couple of core networking files. Suggested-by: Jakub Kicinski <kuba@kernel.org> # make skbuff.h less heavy Suggested-by: Alexander Duyck <alexanderduyck@fb.com> # move to skbuff.c Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> --- include/linux/skbuff.h | 3 ++- include/net/page_pool/types.h | 2 -- net/core/page_pool.c | 39 --------------------------------- net/core/skbuff.c | 41 ++++++++++++++++++++++++++++++++++- 4 files changed, 42 insertions(+), 43 deletions(-)