diff mbox series

[RFC,net-next,2/3] net: page_pool: use alloc_pages_bulk in refill code path

Message ID 161419300618.2718959.11165518489200268845.stgit@firesoul (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [RFC,net-next,1/3] net: page_pool: refactor dma_map into own function page_pool_dma_map

Checks

Context                         Check    Description
netdev/cover_letter             warning  Series does not have a cover letter
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Clearly marked for net-next
netdev/subject_prefix           success  Link
netdev/cc_maintainers           fail     8 maintainers not CCed: davem@davemloft.net hawk@kernel.org ilias.apalodimas@linaro.org daniel@iogearbox.net kuba@kernel.org john.fastabend@gmail.com bpf@vger.kernel.org ast@kernel.org
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              fail     Errors and warnings before: 1 this patch: 6
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               warning  WARNING: line length of 81 exceeds 80 columns
netdev/build_allmodconfig_warn  fail     Errors and warnings before: 1 this patch: 6
netdev/header_inline            success  Link
netdev/stable                   success  Stable not CCed

Commit Message

Jesper Dangaard Brouer Feb. 24, 2021, 6:56 p.m. UTC
There are cases where the page_pool needs to refill with pages from the
page allocator. Some workloads cause the page_pool to release pages
instead of recycling them.

For these workloads, it can improve performance to bulk-allocate pages
from the page allocator to refill the alloc cache.

One example is an XDP-redirect workload with the 100G mlx5 driver (which
uses page_pool) redirecting xdp_frame packets into a veth that does
XDP_PASS to create an SKB from the xdp_frame, which then cannot return
the page to the page_pool. In this case, we saw[1] an improvement of
18.8% from using the alloc_pages_bulk API (3,677,958 pps -> 4,368,926
pps).

[1] https://github.com/xdp-project/xdp-project/blob/master/areas/mem/page_pool06_alloc_pages_bulk.org

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 net/core/page_pool.c |   65 ++++++++++++++++++++++++++++++++------------------
 1 file changed, 41 insertions(+), 24 deletions(-)

Comments

Ilias Apalodimas Feb. 24, 2021, 8:15 p.m. UTC | #1
Hi Jesper, 

On Wed, Feb 24, 2021 at 07:56:46PM +0100, Jesper Dangaard Brouer wrote:
> There are cases where the page_pool need to refill with pages from the
> page allocator. Some workloads cause the page_pool to release pages
> instead of recycling these pages.
> 
> For these workload it can improve performance to bulk alloc pages from
> the page-allocator to refill the alloc cache.
> 
> For XDP-redirect workload with 100G mlx5 driver (that use page_pool)
> redirecting xdp_frame packets into a veth, that does XDP_PASS to create
> an SKB from the xdp_frame, which then cannot return the page to the
> page_pool. In this case, we saw[1] an improvement of 18.8% from using
> the alloc_pages_bulk API (3,677,958 pps -> 4,368,926 pps).
> 
> [1] https://github.com/xdp-project/xdp-project/blob/master/areas/mem/page_pool06_alloc_pages_bulk.org
> 
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

[...]

> +	/* Remaining pages store in alloc.cache */
> +	list_for_each_entry_safe(page, next, &page_list, lru) {
> +		list_del(&page->lru);
> +		if (pp_flags & PP_FLAG_DMA_MAP) {
> +			page = page_pool_dma_map(pool, page);
> +			if (!page)

As I commented on the previous patch, I'd prefer the put_page() here to be
called explicitly, instead of hiding it in page_pool_dma_map().

> +				continue;
> +		}
> +		if (likely(pool->alloc.count < PP_ALLOC_CACHE_SIZE)) {
> +			pool->alloc.cache[pool->alloc.count++] = page;
> +			pool->pages_state_hold_cnt++;
> +			trace_page_pool_state_hold(pool, page,
> +						   pool->pages_state_hold_cnt);
> +		} else {
> +			put_page(page);
> +		}
> +	}
> +out:
>  	if (pool->p.flags & PP_FLAG_DMA_MAP) {
> -		page = page_pool_dma_map(pool, page);
> -		if (!page)
> +		first_page = page_pool_dma_map(pool, first_page);
> +		if (!first_page)
>  			return NULL;
>  	}
>  
[...]

Cheers
/Ilias
Jesper Dangaard Brouer Feb. 26, 2021, 2:31 p.m. UTC | #2
On Wed, 24 Feb 2021 22:15:22 +0200
Ilias Apalodimas <ilias.apalodimas@linaro.org> wrote:

> Hi Jesper, 
> 
> On Wed, Feb 24, 2021 at 07:56:46PM +0100, Jesper Dangaard Brouer wrote:
> > There are cases where the page_pool need to refill with pages from the
> > page allocator. Some workloads cause the page_pool to release pages
> > instead of recycling these pages.
> > 
> > For these workload it can improve performance to bulk alloc pages from
> > the page-allocator to refill the alloc cache.
> > 
> > For XDP-redirect workload with 100G mlx5 driver (that use page_pool)
> > redirecting xdp_frame packets into a veth, that does XDP_PASS to create
> > an SKB from the xdp_frame, which then cannot return the page to the
> > page_pool. In this case, we saw[1] an improvement of 18.8% from using
> > the alloc_pages_bulk API (3,677,958 pps -> 4,368,926 pps).
> > 
> > [1] https://github.com/xdp-project/xdp-project/blob/master/areas/mem/page_pool06_alloc_pages_bulk.org
> > 
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>  
> 
> [...]
> 
> > +	/* Remaining pages store in alloc.cache */
> > +	list_for_each_entry_safe(page, next, &page_list, lru) {
> > +		list_del(&page->lru);
> > +		if (pp_flags & PP_FLAG_DMA_MAP) {
> > +			page = page_pool_dma_map(pool, page);
> > +			if (!page)  
> 
> As I commented on the previous patch, i'd prefer the put_page() here to be
> explicitly called, instead of hiding in the page_pool_dma_map()

I fully agree.  I will fix up the code.

> > +				continue;
> > +		}
> > +		if (likely(pool->alloc.count < PP_ALLOC_CACHE_SIZE)) {
> > +			pool->alloc.cache[pool->alloc.count++] = page;
> > +			pool->pages_state_hold_cnt++;
> > +			trace_page_pool_state_hold(pool, page,
> > +						   pool->pages_state_hold_cnt);
> > +		} else {
> > +			put_page(page);
> > +		}
> > +	}
> > +out:
> >  	if (pool->p.flags & PP_FLAG_DMA_MAP) {
> > -		page = page_pool_dma_map(pool, page);
> > -		if (!page)
> > +		first_page = page_pool_dma_map(pool, first_page);
> > +		if (!first_page)
> >  			return NULL;
> >  	}
> >    
> [...]
Patch

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 50d52aa6fbeb..e0ae95fc59f0 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -210,44 +210,61 @@  noinline
 static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 						 gfp_t _gfp)
 {
-	struct page *page;
+	const int bulk = PP_ALLOC_CACHE_REFILL;
+	struct page *page, *next, *first_page;
+	unsigned int pp_flags = pool->p.flags;
+	unsigned int pp_order = pool->p.order;
+	int pp_nid = pool->p.nid;
+	LIST_HEAD(page_list);
 	gfp_t gfp = _gfp;
 
-	/* We could always set __GFP_COMP, and avoid this branch, as
-	 * prep_new_page() can handle order-0 with __GFP_COMP.
-	 */
-	if (pool->p.order)
+	/* Don't support bulk alloc for high-order pages */
+	if (unlikely(pp_order)) {
 		gfp |= __GFP_COMP;
+		first_page = alloc_pages_node(pp_nid, gfp, pp_order);
+		if (unlikely(!first_page))
+			return NULL;
+		goto out;
+	}
 
-	/* FUTURE development:
-	 *
-	 * Current slow-path essentially falls back to single page
-	 * allocations, which doesn't improve performance.  This code
-	 * need bulk allocation support from the page allocator code.
-	 */
-
-	/* Cache was empty, do real allocation */
-#ifdef CONFIG_NUMA
-	page = alloc_pages_node(pool->p.nid, gfp, pool->p.order);
-#else
-	page = alloc_pages(gfp, pool->p.order);
-#endif
-	if (!page)
+	if (unlikely(!__alloc_pages_bulk_nodemask(gfp, pp_nid, NULL,
+						  bulk, &page_list)))
 		return NULL;
 
+	/* First page is extracted and returned to caller */
+	first_page = list_first_entry(&page_list, struct page, lru);
+	list_del(&first_page->lru);
+
+	/* Remaining pages store in alloc.cache */
+	list_for_each_entry_safe(page, next, &page_list, lru) {
+		list_del(&page->lru);
+		if (pp_flags & PP_FLAG_DMA_MAP) {
+			page = page_pool_dma_map(pool, page);
+			if (!page)
+				continue;
+		}
+		if (likely(pool->alloc.count < PP_ALLOC_CACHE_SIZE)) {
+			pool->alloc.cache[pool->alloc.count++] = page;
+			pool->pages_state_hold_cnt++;
+			trace_page_pool_state_hold(pool, page,
+						   pool->pages_state_hold_cnt);
+		} else {
+			put_page(page);
+		}
+	}
+out:
 	if (pool->p.flags & PP_FLAG_DMA_MAP) {
-		page = page_pool_dma_map(pool, page);
-		if (!page)
+		first_page = page_pool_dma_map(pool, first_page);
+		if (!first_page)
 			return NULL;
 	}
 
 	/* Track how many pages are held 'in-flight' */
 	pool->pages_state_hold_cnt++;
-
-	trace_page_pool_state_hold(pool, page, pool->pages_state_hold_cnt);
+	trace_page_pool_state_hold(pool, first_page, pool->pages_state_hold_cnt);
 
 	/* When page just alloc'ed is should/must have refcnt 1. */
-	return page;
+	return first_page;
 }
 
 /* For using page_pool replace: alloc_pages() API calls, but provide