Message ID | 20220705113515.54342-1-huangguangbin2@huawei.com (mailing list archive) |
---|---|
State | Accepted |
Commit | d810d367ec40a1031173a447bd0146cf48e98733 |
Delegated to: | Netdev Maintainers |
Series | [net-next,v3] net: page_pool: optimize page pool page allocation in NUMA scenario |
On 05/07/2022 13.35, Guangbin Huang wrote:
> From: Jie Wang <wangjie125@huawei.com>
>
> Currently NIC packet receiving performance based on page pool deteriorates
> occasionally. To analyze the causes of this problem, page allocation stats
> are collected. Here are the stats when NIC rx performance deteriorates:
>
> bandwidth(Gbits/s)             16.8        6.91
> rx_pp_alloc_fast               13794308    21141869
> rx_pp_alloc_slow               108625      166481
> rx_pp_alloc_slow_ho            0           0
> rx_pp_alloc_empty              8192        8192
> rx_pp_alloc_refill             0           0
> rx_pp_alloc_waive              100433      158289
> rx_pp_recycle_cached           0           0
> rx_pp_recycle_cache_full       0           0
> rx_pp_recycle_ring             362400      420281
> rx_pp_recycle_ring_full        6064893     9709724
> rx_pp_recycle_released_ref     0           0
>
> The rx_pp_alloc_waive count indicates that a large number of pages are on
> a NUMA node that differs from the NIC device's NUMA node. Therefore these
> pages can't be reused by the page pool. As a result, many new pages have to
> be allocated by __page_pool_alloc_pages_slow, which is time consuming. This
> causes the NIC rx performance fluctuations.
>
> The main reason for the large number of NUMA-mismatched pages in the page
> pool is that the page pool uses alloc_pages_bulk_array to allocate the
> original pages. This function is not suitable for page allocation in a
> NUMA scenario. So this patch uses alloc_pages_bulk_array_node, which takes
> a NUMA node id as an input parameter, to ensure NUMA consistency between
> the NIC device and the allocated pages.
>
> Repeated NIC rx performance tests were performed 40 times. NIC rx bandwidth
> is higher and more stable compared to the data above. Here are the stats
> from three tests: the rx_pp_alloc_waive count is zero, and rx_pp_alloc_slow,
> which counts pages allocated from the slow path, is relatively low.
>
> bandwidth(Gbits/s)             93          93.9        93.8
> rx_pp_alloc_fast               60066264    61266386    60938254
> rx_pp_alloc_slow               16512       16517       16539
> rx_pp_alloc_slow_ho            0           0           0
> rx_pp_alloc_empty              16512       16517       16539
> rx_pp_alloc_refill             473841      481910      481585
> rx_pp_alloc_waive              0           0           0
> rx_pp_recycle_cached           0           0           0
> rx_pp_recycle_cache_full       0           0           0
> rx_pp_recycle_ring             29754145    30358243    30194023
> rx_pp_recycle_ring_full        0           0           0
> rx_pp_recycle_released_ref     0           0           0
>
> Signed-off-by: Jie Wang <wangjie125@huawei.com>

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

> ---
> v2->v3:
> 1, Delete the #ifdefs
> 2, Use 'pool->p.nid' in the call to alloc_pages_bulk_array_node()
>
> v1->v2:
> 1, Remove two inappropriate comments.
> 2, Use NUMA_NO_NODE instead of numa_mem_id() for code maintenance.
> ---
>  net/core/page_pool.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index f18e6e771993..b74905fcc3a1 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -389,7 +389,8 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
>  	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
>  	memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
>
> -	nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
> +	nr_pages = alloc_pages_bulk_array_node(gfp, pool->p.nid, bulk,
> +					       pool->alloc.cache);
>  	if (unlikely(!nr_pages))
>  		return NULL;
>
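To put numbers on the "large number of pages" claim, the counters quoted above can be reduced to two ratios. The following is only a reader's sketch in userspace C, not part of the patch; the helper names `waive_share()` and `slow_per_1000_fast()` are hypothetical and exist purely for this illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper: ratio of rx_pp_alloc_waive to rx_pp_alloc_slow.
 * Returns 0.0 when no slow-path allocation was recorded.
 */
static double waive_share(unsigned long slow, unsigned long waive)
{
	assert(waive <= slow);	/* holds for every sample quoted above */
	return slow ? (double)waive / (double)slow : 0.0;
}

/*
 * Hypothetical helper: slow-path allocations per 1000 fast-path
 * allocations, a rough measure of how often the fast path misses.
 */
static double slow_per_1000_fast(unsigned long fast, unsigned long slow)
{
	return fast ? 1000.0 * (double)slow / (double)fast : 0.0;
}
```

Fed with the degraded samples, `waive_share(108625, 100433)` and `waive_share(166481, 158289)` come out at roughly 0.92 and 0.95, while the three samples taken with the patch give 0. In all five samples the difference rx_pp_alloc_slow - rx_pp_alloc_waive equals rx_pp_alloc_empty; that is an observation about the quoted numbers only, not a statement about how the counters are defined.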
On Thu, 7 Jul 2022 at 22:14, Jesper Dangaard Brouer <jbrouer@redhat.com> wrote:
>
> On 05/07/2022 13.35, Guangbin Huang wrote:
> > From: Jie Wang <wangjie125@huawei.com>
> >
> > [...]
> >
> > Signed-off-by: Jie Wang <wangjie125@huawei.com>
>
> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
>
> [...]

Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Hello:

This patch was applied to netdev/net-next.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 5 Jul 2022 19:35:15 +0800 you wrote:
> From: Jie Wang <wangjie125@huawei.com>
>
> Currently NIC packet receiving performance based on page pool deteriorates
> occasionally. To analyze the causes of this problem, page allocation stats
> are collected. Here are the stats when NIC rx performance deteriorates:
>
> bandwidth(Gbits/s)             16.8        6.91
> rx_pp_alloc_fast               13794308    21141869
> rx_pp_alloc_slow               108625      166481
> rx_pp_alloc_slow_ho            0           0
> rx_pp_alloc_empty              8192        8192
> rx_pp_alloc_refill             0           0
> rx_pp_alloc_waive              100433      158289
> rx_pp_recycle_cached           0           0
> rx_pp_recycle_cache_full       0           0
> rx_pp_recycle_ring             362400      420281
> rx_pp_recycle_ring_full        6064893     9709724
> rx_pp_recycle_released_ref     0           0
>
> [...]

Here is the summary with links:
  - [net-next,v3] net: page_pool: optimize page pool page allocation in NUMA scenario
    https://git.kernel.org/netdev/net-next/c/d810d367ec40

You are awesome, thank you!
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index f18e6e771993..b74905fcc3a1 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -389,7 +389,8 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
 	memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);

-	nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
+	nr_pages = alloc_pages_bulk_array_node(gfp, pool->p.nid, bulk,
+					       pool->alloc.cache);
 	if (unlikely(!nr_pages))
 		return NULL;
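The effect of the one-line change above can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: `struct model_page`, `model_bulk_alloc()`, `model_count_waived()`, `model_round()` and `BULK` are hypothetical names invented for this note. The model assumes that a node-agnostic bulk request is served from the node of the allocating CPU, and it reduces the reuse check to the rule stated in the commit message (a page whose node differs from the device node cannot be reused).

```c
#include <assert.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)	/* same sentinel value the kernel uses */
#define BULK 64			/* hypothetical batch size for this model */

/* Hypothetical stand-in for struct page: only the home node is modelled. */
struct model_page {
	int nid;
};

/*
 * Models a bulk allocation of 'n' pages.
 * ASSUMPTION: a node-agnostic request (want_nid == NUMA_NO_NODE) is served
 * from the node of the allocating CPU, a node-aware one from 'want_nid'.
 */
static size_t model_bulk_alloc(struct model_page *cache, size_t n,
			       int cpu_nid, int want_nid)
{
	int nid = (want_nid == NUMA_NO_NODE) ? cpu_nid : want_nid;
	size_t i;

	for (i = 0; i < n; i++)
		cache[i].nid = nid;
	return n;
}

/*
 * Models the reuse check: a page whose node differs from the device node
 * cannot be reused and is counted as waived. Returns the waived count.
 */
static size_t model_count_waived(const struct model_page *cache, size_t n,
				 int dev_nid)
{
	size_t i, waived = 0;

	for (i = 0; i < n; i++)
		if (cache[i].nid != dev_nid)
			waived++;
	return waived;
}

/* One allocate-then-recycle round; returns how many pages were waived. */
static size_t model_round(int cpu_nid, int dev_nid, int node_aware)
{
	struct model_page cache[BULK];
	size_t got;

	assert(dev_nid != NUMA_NO_NODE);	/* the model needs a concrete node */
	got = model_bulk_alloc(cache, BULK, cpu_nid,
			       node_aware ? dev_nid : NUMA_NO_NODE);
	return model_count_waived(cache, got, dev_nid);
}
```

In this model, `model_round(1, 0, 0)` (allocating CPU on node 1, device on node 0, node-agnostic request) waives the whole batch, while `model_round(1, 0, 1)` waives nothing because the request names the device node. That mirrors the drop of rx_pp_alloc_waive to zero reported in the commit message, within the limits of the assumptions stated above.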