Message ID | 20230105214631.3939268-6-willy@infradead.org (mailing list archive)
---|---
State | Superseded
Delegated to: | Netdev Maintainers
Series | Split netmem from struct page
Hi Matthew,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]
[also build test WARNING on bpf/master net/master net-next/master linus/master v6.2-rc2 next-20230105]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Matthew-Wilcox-Oracle/netmem-Create-new-type/20230106-054852
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20230105214631.3939268-6-willy%40infradead.org
patch subject: [PATCH v2 05/24] page_pool: Start using netmem in allocation path.
config: arm64-randconfig-r033-20230105
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 8d9828ef5aa9688500657d36cd2aefbe12bbd162)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm64 cross compiling tool for clang build
        # apt-get install binutils-aarch64-linux-gnu
        # https://github.com/intel-lab-lkp/linux/commit/37522b9f56a9bac5d16428140c925698c26b07d1
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Matthew-Wilcox-Oracle/netmem-Create-new-type/20230106-054852
        git checkout 37522b9f56a9bac5d16428140c925698c26b07d1
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash net/core/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from net/core/page_pool.c:13:
   include/net/page_pool.h:111:21: warning: due to lvalue conversion of the controlling expression, association of type 'const struct netmem' will never be selected because it is qualified [-Wunreachable-code-generic-assoc]
           return page_to_pfn(netmem_page(nmem));
   include/net/page_pool.h:100:8: note: expanded from macro 'netmem_page'
           const struct netmem:    (const struct page *)nmem,      \
   include/net/page_pool.h:116:21: warning: due to lvalue conversion of the controlling expression, association of type 'const struct netmem' will never be selected because it is qualified [-Wunreachable-code-generic-assoc]
           return page_to_nid(netmem_page(nmem));
   include/net/page_pool.h:100:8: note: expanded from macro 'netmem_page'
           const struct netmem:    (const struct page *)nmem,      \
   include/net/page_pool.h:126:22: warning: due to lvalue conversion of the controlling expression, association of type 'const struct netmem' will never be selected because it is qualified [-Wunreachable-code-generic-assoc]
           return page_to_virt(netmem_page(nmem));
   include/net/page_pool.h:100:8: note: expanded from macro 'netmem_page'
           const struct netmem:    (const struct page *)nmem,      \
   include/net/page_pool.h:126:22: warning: due to lvalue conversion of the controlling expression, association of type 'const struct netmem' will never be selected because it is qualified [-Wunreachable-code-generic-assoc]
   include/net/page_pool.h:100:8: note: expanded from macro 'netmem_page'
           const struct netmem:    (const struct page *)nmem,      \
   include/net/page_pool.h:131:22: warning: due to lvalue conversion of the controlling expression, association of type 'const struct netmem' will never be selected because it is qualified [-Wunreachable-code-generic-assoc]
           return page_address(netmem_page(nmem));
   include/net/page_pool.h:100:8: note: expanded from macro 'netmem_page'
           const struct netmem:    (const struct page *)nmem,      \
   include/net/page_pool.h:136:24: warning: due to lvalue conversion of the controlling expression, association of type 'const struct netmem' will never be selected because it is qualified [-Wunreachable-code-generic-assoc]
           return page_ref_count(netmem_page(nmem));
   include/net/page_pool.h:100:8: note: expanded from macro 'netmem_page'
           const struct netmem:    (const struct page *)nmem,      \
>> net/core/page_pool.c:309:22: warning: due to lvalue conversion of the controlling expression, association of type 'const struct netmem' will never be selected because it is qualified [-Wunreachable-code-generic-assoc]
           struct page *page = netmem_page(nmem);
   include/net/page_pool.h:100:8: note: expanded from macro 'netmem_page'
           const struct netmem:    (const struct page *)nmem,      \
   net/core/page_pool.c:337:25: warning: due to lvalue conversion of the controlling expression, association of type 'const struct netmem' will never be selected because it is qualified [-Wunreachable-code-generic-assoc]
           pool->p.init_callback(netmem_page(nmem), pool->p.init_arg);
   include/net/page_pool.h:100:8: note: expanded from macro 'netmem_page'
           const struct netmem:    (const struct page *)nmem,      \
   net/core/page_pool.c:368:9: warning: due to lvalue conversion of the controlling expression, association of type 'const struct netmem' will never be selected because it is qualified [-Wunreachable-code-generic-assoc]
           return netmem_page(nmem);
   include/net/page_pool.h:100:8: note: expanded from macro 'netmem_page'
           const struct netmem:    (const struct page *)nmem,      \
   net/core/page_pool.c:410:44: warning: due to lvalue conversion of the controlling expression, association of type 'const struct netmem' will never be selected because it is qualified [-Wunreachable-code-generic-assoc]
           pool->alloc.cache[pool->alloc.count++] = netmem_page(nmem);
   include/net/page_pool.h:100:8: note: expanded from macro 'netmem_page'
           const struct netmem:    (const struct page *)nmem,      \
   10 warnings generated.


vim +309 net/core/page_pool.c

   306	
   307	static bool page_pool_dma_map(struct page_pool *pool, struct netmem *nmem)
   308	{
 > 309		struct page *page = netmem_page(nmem);
   310		dma_addr_t dma;
   311	
   312		/* Setup DMA mapping: use 'struct page' area for storing DMA-addr
   313		 * since dma_addr_t can be either 32 or 64 bits and does not always fit
   314		 * into page private data (i.e 32bit cpu with 64bit DMA caps)
   315		 * This mapping is kept for lifetime of page, until leaving pool.
   316		 */
   317		dma = dma_map_page_attrs(pool->p.dev, page, 0,
   318					 (PAGE_SIZE << pool->p.order),
   319					 pool->p.dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
   320		if (dma_mapping_error(pool->p.dev, dma))
   321			return false;
   322	
   323		page_pool_set_dma_addr(page, dma);
   324	
   325		if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
   326			page_pool_dma_sync_for_device(pool, page, pool->p.max_len);
   327	
   328		return true;
   329	}
   330	
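[Editor's note] The diagnostic itself is a C11 _Generic subtlety: the controlling expression undergoes lvalue conversion before an association is chosen, and lvalue conversion strips top-level qualifiers, so an association spelled with a qualified type such as 'const struct netmem' can never match. Below is a minimal, self-contained illustration of the same clang warning; the struct demo type and which_assoc() macro are invented for the demo and are not the kernel's netmem_page() macro.

```c
#include <stdio.h>

struct demo { int dummy; };

/*
 * Clang's -Wunreachable-code-generic-assoc fires on the first association:
 * the controlling expression *(p) is lvalue-converted, which drops the
 * top-level const, so only the unqualified 'struct demo' branch can match.
 */
#define which_assoc(p) _Generic(*(p),			\
	const struct demo: "const association",		\
	struct demo:       "non-const association")

int main(void)
{
	const struct demo d = { 0 };

	/* Prints "non-const association" when built with clang. */
	printf("%s\n", which_assoc(&d));
	return 0;
}
```

One common way to avoid the warning is to key the selection on pointer types (e.g. 'const struct demo *' vs 'struct demo *'), since pointee qualifiers are part of the pointer type and survive lvalue conversion.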
On 05/01/2023 22.46, Matthew Wilcox (Oracle) wrote:
> Convert __page_pool_alloc_page_order() and __page_pool_alloc_pages_slow()
> to use netmem internally.  This removes a couple of calls
> to compound_head() that are hidden inside put_page().
> Convert trace_page_pool_state_hold(), page_pool_dma_map() and
> page_pool_set_pp_info() to take a netmem argument.
>
> Saves 83 bytes of text in __page_pool_alloc_page_order() and 98 in
> __page_pool_alloc_pages_slow() for a total of 181 bytes.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/trace/events/page_pool.h | 14 +++++------
>  net/core/page_pool.c             | 42 +++++++++++++++++---------------
>  2 files changed, 29 insertions(+), 27 deletions(-)

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

Question below.

> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 437241aba5a7..4e985502c569 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
[...]
> @@ -421,7 +422,8 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
>  		page = NULL;
>  	}
>
> -	/* When page just alloc'ed is should/must have refcnt 1. */
> +	/* When page just allocated it should have refcnt 1 (but may have
> +	 * speculative references) */
>  	return page;

What does it mean page may have speculative references?

And do I/we need to worry about that for page_pool?

--Jesper
On Fri, Jan 06, 2023 at 02:59:30PM +0100, Jesper Dangaard Brouer wrote:
> On 05/01/2023 22.46, Matthew Wilcox (Oracle) wrote:
> > Convert __page_pool_alloc_page_order() and __page_pool_alloc_pages_slow()
> > to use netmem internally.  This removes a couple of calls
> > to compound_head() that are hidden inside put_page().
> > Convert trace_page_pool_state_hold(), page_pool_dma_map() and
> > page_pool_set_pp_info() to take a netmem argument.
> >
> > Saves 83 bytes of text in __page_pool_alloc_page_order() and 98 in
> > __page_pool_alloc_pages_slow() for a total of 181 bytes.
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> >  include/trace/events/page_pool.h | 14 +++++------
> >  net/core/page_pool.c             | 42 +++++++++++++++++---------------
> >  2 files changed, 29 insertions(+), 27 deletions(-)
>
> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
>
> Question below.
>
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index 437241aba5a7..4e985502c569 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> [...]
> > @@ -421,7 +422,8 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
> >  		page = NULL;
> >  	}
> >
> > -	/* When page just alloc'ed is should/must have refcnt 1. */
> > +	/* When page just allocated it should have refcnt 1 (but may have
> > +	 * speculative references) */
> >  	return page;
>
> What does it mean page may have speculative references?
>
> And do I/we need to worry about that for page_pool?

An excellent question.  There are two code paths (known to me) which take
speculative references on a page, and there may well be more.  One is in
GUP and the other is in the page cache.  Both take the form of:

	rcu_read_lock();
again:
	look-up-page
	try-get-page-ref
	check-lookup
	if lookup-failed
		drop-page-ref;
		goto again;
	rcu_read_unlock();

If a page _has been_ in the page tables, then GUP can find it.  If a page
_has been_ in the page cache, then filemap can find it.  Because there's
no RCU grace period between freeing and reallocating a page, it actually
means that any page can see its refcount temporarily raised.

Usually the biggest problem is consumers assuming that they will be the
last code to call put_page() / folio_put(), and can do their cleanup at
that time (when the last caller of folio_put() may be GUP or filemap
which knows nothing of what you're using the page for).

I didn't notice any problems with temporarily elevated refcounts while
doing the netmem conversion, and it's something I'm fairly sensitive to,
so I think you got it all right and there is no need to be concerned.

Hope that's helpful!
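[Editor's note] To make the shape above a little more concrete, here is a hedged C sketch of the lookup/try-get/re-check pattern. get_page_unless_zero() and put_page() are the real kernel refcount helpers; struct lookup, lookup_candidate() and lookup_still_valid() are invented stand-ins for whatever structure (page tables for GUP, the xarray for filemap) is being walked.

```c
/*
 * Sketch only: the speculative-reference pattern used by GUP and the
 * page cache.  The lookup helpers are hypothetical placeholders.
 */
static struct page *get_page_speculative(struct lookup *l)
{
	struct page *page;

	rcu_read_lock();
again:
	page = lookup_candidate(l);		/* may race with free/reuse */
	if (!page)
		goto out;
	if (!get_page_unless_zero(page))	/* try to take a reference */
		goto again;			/* refcount hit zero: retry lookup */
	if (!lookup_still_valid(l, page)) {	/* re-check after taking the ref */
		put_page(page);			/* lost the race: drop ref, retry */
		goto again;
	}
out:
	rcu_read_unlock();
	return page;
}
```

As Matthew notes, the practical risk is code that assumes it will see the final put_page()/folio_put() and does its cleanup there; for page_pool the only visible effect is that a racer can transiently raise a page's refcount.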
On Thu, Jan 05, 2023 at 09:46:12PM +0000, Matthew Wilcox (Oracle) wrote:
> Convert __page_pool_alloc_page_order() and __page_pool_alloc_pages_slow()
> to use netmem internally.  This removes a couple of calls
> to compound_head() that are hidden inside put_page().
> Convert trace_page_pool_state_hold(), page_pool_dma_map() and
> page_pool_set_pp_info() to take a netmem argument.
>
> Saves 83 bytes of text in __page_pool_alloc_page_order() and 98 in
> __page_pool_alloc_pages_slow() for a total of 181 bytes.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/trace/events/page_pool.h | 14 +++++------
>  net/core/page_pool.c             | 42 +++++++++++++++++---------------
>  2 files changed, 29 insertions(+), 27 deletions(-)
>
> diff --git a/include/trace/events/page_pool.h b/include/trace/events/page_pool.h
> index 113aad0c9e5b..d1237a7ce481 100644
> --- a/include/trace/events/page_pool.h
> +++ b/include/trace/events/page_pool.h
> @@ -67,26 +67,26 @@ TRACE_EVENT(page_pool_state_release,
>  TRACE_EVENT(page_pool_state_hold,
>
>  	TP_PROTO(const struct page_pool *pool,
> -		 const struct page *page, u32 hold),
> +		 const struct netmem *nmem, u32 hold),
>
> -	TP_ARGS(pool, page, hold),
> +	TP_ARGS(pool, nmem, hold),
>
>  	TP_STRUCT__entry(
>  		__field(const struct page_pool *, pool)
> -		__field(const struct page *, page)
> +		__field(const struct netmem *, nmem)
>  		__field(u32, hold)
>  		__field(unsigned long, pfn)
>  	),
>
>  	TP_fast_assign(
>  		__entry->pool = pool;
> -		__entry->page = page;
> +		__entry->nmem = nmem;
>  		__entry->hold = hold;
> -		__entry->pfn = page_to_pfn(page);
> +		__entry->pfn = netmem_pfn(nmem);
>  	),
>
> -	TP_printk("page_pool=%p page=%p pfn=0x%lx hold=%u",
> -		  __entry->pool, __entry->page, __entry->pfn, __entry->hold)
> +	TP_printk("page_pool=%p netmem=%p pfn=0x%lx hold=%u",
> +		  __entry->pool, __entry->nmem, __entry->pfn, __entry->hold)
>  );
>
>  TRACE_EVENT(page_pool_update_nid,
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 437241aba5a7..4e985502c569 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -304,8 +304,9 @@ static void page_pool_dma_sync_for_device(struct page_pool *pool,
>  					  pool->p.dma_dir);
>  }
>
> -static bool page_pool_dma_map(struct page_pool *pool, struct page *page)
> +static bool page_pool_dma_map(struct page_pool *pool, struct netmem *nmem)
>  {
> +	struct page *page = netmem_page(nmem);
>  	dma_addr_t dma;
>
>  	/* Setup DMA mapping: use 'struct page' area for storing DMA-addr
> @@ -328,12 +329,12 @@ static bool page_pool_dma_map(struct page_pool *pool, struct page *page)
>  }
>
>  static void page_pool_set_pp_info(struct page_pool *pool,
> -				  struct page *page)
> +				  struct netmem *nmem)
>  {
> -	page->pp = pool;
> -	page->pp_magic |= PP_SIGNATURE;
> +	nmem->pp = pool;
> +	nmem->pp_magic |= PP_SIGNATURE;
>  	if (pool->p.init_callback)
> -		pool->p.init_callback(page, pool->p.init_arg);
> +		pool->p.init_callback(netmem_page(nmem), pool->p.init_arg);
>  }
>
>  static void page_pool_clear_pp_info(struct netmem *nmem)
> @@ -345,26 +346,26 @@ static void page_pool_clear_pp_info(struct netmem *nmem)
>  static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
>  						 gfp_t gfp)
>  {
> -	struct page *page;
> +	struct netmem *nmem;
>
>  	gfp |= __GFP_COMP;
> -	page = alloc_pages_node(pool->p.nid, gfp, pool->p.order);
> -	if (unlikely(!page))
> +	nmem = page_netmem(alloc_pages_node(pool->p.nid, gfp, pool->p.order));
> +	if (unlikely(!nmem))
>  		return NULL;
>
>  	if ((pool->p.flags & PP_FLAG_DMA_MAP) &&
> -	    unlikely(!page_pool_dma_map(pool, page))) {
> -		put_page(page);
> +	    unlikely(!page_pool_dma_map(pool, nmem))) {
> +		netmem_put(nmem);
>  		return NULL;
>  	}
>
>  	alloc_stat_inc(pool, slow_high_order);
> -	page_pool_set_pp_info(pool, page);
> +	page_pool_set_pp_info(pool, nmem);
>
>  	/* Track how many pages are held 'in-flight' */
>  	pool->pages_state_hold_cnt++;
> -	trace_page_pool_state_hold(pool, page, pool->pages_state_hold_cnt);
> -	return page;
> +	trace_page_pool_state_hold(pool, nmem, pool->pages_state_hold_cnt);
> +	return netmem_page(nmem);
>  }
>
>  /* slow path */
> @@ -398,18 +399,18 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
>  	 * page element have not been (possibly) DMA mapped.
>  	 */
>  	for (i = 0; i < nr_pages; i++) {
> -		page = pool->alloc.cache[i];
> +		struct netmem *nmem = page_netmem(pool->alloc.cache[i]);
>  		if ((pp_flags & PP_FLAG_DMA_MAP) &&
> -		    unlikely(!page_pool_dma_map(pool, page))) {
> -			put_page(page);
> +		    unlikely(!page_pool_dma_map(pool, nmem))) {
> +			netmem_put(nmem);
>  			continue;
>  		}
>
> -		page_pool_set_pp_info(pool, page);
> -		pool->alloc.cache[pool->alloc.count++] = page;
> +		page_pool_set_pp_info(pool, nmem);
> +		pool->alloc.cache[pool->alloc.count++] = netmem_page(nmem);
>  		/* Track how many pages are held 'in-flight' */
>  		pool->pages_state_hold_cnt++;
> -		trace_page_pool_state_hold(pool, page,
> +		trace_page_pool_state_hold(pool, nmem,
>  					   pool->pages_state_hold_cnt);
>  	}
>
> @@ -421,7 +422,8 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
>  		page = NULL;
>  	}
>
> -	/* When page just alloc'ed is should/must have refcnt 1. */
> +	/* When page just allocated it should have refcnt 1 (but may have
> +	 * speculative references) */
>  	return page;
>  }
>
> --
> 2.35.1
>

Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
diff --git a/include/trace/events/page_pool.h b/include/trace/events/page_pool.h
index 113aad0c9e5b..d1237a7ce481 100644
--- a/include/trace/events/page_pool.h
+++ b/include/trace/events/page_pool.h
@@ -67,26 +67,26 @@ TRACE_EVENT(page_pool_state_release,
 TRACE_EVENT(page_pool_state_hold,
 
 	TP_PROTO(const struct page_pool *pool,
-		 const struct page *page, u32 hold),
+		 const struct netmem *nmem, u32 hold),
 
-	TP_ARGS(pool, page, hold),
+	TP_ARGS(pool, nmem, hold),
 
 	TP_STRUCT__entry(
 		__field(const struct page_pool *, pool)
-		__field(const struct page *, page)
+		__field(const struct netmem *, nmem)
 		__field(u32, hold)
 		__field(unsigned long, pfn)
 	),
 
 	TP_fast_assign(
 		__entry->pool = pool;
-		__entry->page = page;
+		__entry->nmem = nmem;
 		__entry->hold = hold;
-		__entry->pfn = page_to_pfn(page);
+		__entry->pfn = netmem_pfn(nmem);
 	),
 
-	TP_printk("page_pool=%p page=%p pfn=0x%lx hold=%u",
-		  __entry->pool, __entry->page, __entry->pfn, __entry->hold)
+	TP_printk("page_pool=%p netmem=%p pfn=0x%lx hold=%u",
+		  __entry->pool, __entry->nmem, __entry->pfn, __entry->hold)
 );
 
 TRACE_EVENT(page_pool_update_nid,
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 437241aba5a7..4e985502c569 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -304,8 +304,9 @@ static void page_pool_dma_sync_for_device(struct page_pool *pool,
 					  pool->p.dma_dir);
 }
 
-static bool page_pool_dma_map(struct page_pool *pool, struct page *page)
+static bool page_pool_dma_map(struct page_pool *pool, struct netmem *nmem)
 {
+	struct page *page = netmem_page(nmem);
 	dma_addr_t dma;
 
 	/* Setup DMA mapping: use 'struct page' area for storing DMA-addr
@@ -328,12 +329,12 @@ static bool page_pool_dma_map(struct page_pool *pool, struct page *page)
 }
 
 static void page_pool_set_pp_info(struct page_pool *pool,
-				  struct page *page)
+				  struct netmem *nmem)
 {
-	page->pp = pool;
-	page->pp_magic |= PP_SIGNATURE;
+	nmem->pp = pool;
+	nmem->pp_magic |= PP_SIGNATURE;
 	if (pool->p.init_callback)
-		pool->p.init_callback(page, pool->p.init_arg);
+		pool->p.init_callback(netmem_page(nmem), pool->p.init_arg);
 }
 
 static void page_pool_clear_pp_info(struct netmem *nmem)
@@ -345,26 +346,26 @@ static void page_pool_clear_pp_info(struct netmem *nmem)
 static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
 						 gfp_t gfp)
 {
-	struct page *page;
+	struct netmem *nmem;
 
 	gfp |= __GFP_COMP;
-	page = alloc_pages_node(pool->p.nid, gfp, pool->p.order);
-	if (unlikely(!page))
+	nmem = page_netmem(alloc_pages_node(pool->p.nid, gfp, pool->p.order));
+	if (unlikely(!nmem))
 		return NULL;
 
 	if ((pool->p.flags & PP_FLAG_DMA_MAP) &&
-	    unlikely(!page_pool_dma_map(pool, page))) {
-		put_page(page);
+	    unlikely(!page_pool_dma_map(pool, nmem))) {
+		netmem_put(nmem);
 		return NULL;
 	}
 
 	alloc_stat_inc(pool, slow_high_order);
-	page_pool_set_pp_info(pool, page);
+	page_pool_set_pp_info(pool, nmem);
 
 	/* Track how many pages are held 'in-flight' */
 	pool->pages_state_hold_cnt++;
-	trace_page_pool_state_hold(pool, page, pool->pages_state_hold_cnt);
-	return page;
+	trace_page_pool_state_hold(pool, nmem, pool->pages_state_hold_cnt);
+	return netmem_page(nmem);
 }
 
 /* slow path */
@@ -398,18 +399,18 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	 * page element have not been (possibly) DMA mapped.
 	 */
 	for (i = 0; i < nr_pages; i++) {
-		page = pool->alloc.cache[i];
+		struct netmem *nmem = page_netmem(pool->alloc.cache[i]);
 		if ((pp_flags & PP_FLAG_DMA_MAP) &&
-		    unlikely(!page_pool_dma_map(pool, page))) {
-			put_page(page);
+		    unlikely(!page_pool_dma_map(pool, nmem))) {
+			netmem_put(nmem);
 			continue;
 		}
 
-		page_pool_set_pp_info(pool, page);
-		pool->alloc.cache[pool->alloc.count++] = page;
+		page_pool_set_pp_info(pool, nmem);
+		pool->alloc.cache[pool->alloc.count++] = netmem_page(nmem);
 		/* Track how many pages are held 'in-flight' */
 		pool->pages_state_hold_cnt++;
-		trace_page_pool_state_hold(pool, page,
+		trace_page_pool_state_hold(pool, nmem,
 					   pool->pages_state_hold_cnt);
 	}
 
@@ -421,7 +422,8 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 		page = NULL;
 	}
 
-	/* When page just alloc'ed is should/must have refcnt 1. */
+	/* When page just allocated it should have refcnt 1 (but may have
+	 * speculative references) */
 	return page;
 }
Convert __page_pool_alloc_page_order() and __page_pool_alloc_pages_slow()
to use netmem internally.  This removes a couple of calls to
compound_head() that are hidden inside put_page().  Convert
trace_page_pool_state_hold(), page_pool_dma_map() and
page_pool_set_pp_info() to take a netmem argument.

Saves 83 bytes of text in __page_pool_alloc_page_order() and 98 in
__page_pool_alloc_pages_slow() for a total of 181 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/trace/events/page_pool.h | 14 +++++------
 net/core/page_pool.c             | 42 +++++++++++++++++---------------
 2 files changed, 29 insertions(+), 27 deletions(-)
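[Editor's note] On the compound_head() point in the changelog: put_page() must resolve a possibly-tail page to its head before dropping the reference, while a struct netmem by construction refers to the head. Below is a hedged sketch of why a netmem-based put can skip that step; the real netmem_put() helper is introduced earlier in this series, and this is only an illustrative approximation of the idea, not the patch's code.

```c
/*
 * Illustrative only.  This rests on the assumption (established by
 * earlier patches in the series) that a struct netmem overlays the
 * head page, i.e. a folio, so no compound_head() lookup is needed.
 */
static inline void netmem_put_sketch(struct netmem *nmem)
{
	struct folio *folio = (struct folio *)nmem;	/* already the head */

	folio_put(folio);				/* drop the reference directly */
}

/*
 * put_page(page), by contrast, is essentially
 * folio_put(page_folio(page)), and page_folio() hides a compound_head()
 * lookup -- the calls the changelog says are removed from the
 * allocation paths.
 */
```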