Message ID | 20241109023303.3366500-1-kuba@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Commit | ef04d290c01301b7467df48425c36891d86ff417 |
Delegated to: | Netdev Maintainers |
Series | [net-next] net: page_pool: do not count normal frag allocation in stats |
On 09/11/2024 03.33, Jakub Kicinski wrote:
> Commit 0f6deac3a079 ("net: page_pool: add page allocation stats for
> two fast page allocate path") added increments for "fast path"
> allocation to page frag alloc. It mentions performance degradation
> analysis but the details are unclear. Could be that the author
> was simply surprised by the alloc stats not matching packet count.
>
> [...]
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---

LGTM

Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
On Fri, 8 Nov 2024 18:33:03 -0800, Jakub Kicinski <kuba@kernel.org> wrote:
> Commit 0f6deac3a079 ("net: page_pool: add page allocation stats for
> two fast page allocate path") added increments for "fast path"
> allocation to page frag alloc. It mentions performance degradation
> analysis but the details are unclear. Could be that the author
> was simply surprised by the alloc stats not matching packet count.
>
> [...]
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
On Sat, 9 Nov 2024 at 04:33, Jakub Kicinski <kuba@kernel.org> wrote:
>
> Commit 0f6deac3a079 ("net: page_pool: add page allocation stats for
> two fast page allocate path") added increments for "fast path"
> allocation to page frag alloc. It mentions performance degradation
> analysis but the details are unclear. Could be that the author
> was simply surprised by the alloc stats not matching packet count.
>
> [...]
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> [...]
> --
> 2.47.0
>

Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 8 Nov 2024 18:33:03 -0800 you wrote:
> Commit 0f6deac3a079 ("net: page_pool: add page allocation stats for
> two fast page allocate path") added increments for "fast path"
> allocation to page frag alloc. It mentions performance degradation
> analysis but the details are unclear. Could be that the author
> was simply surprised by the alloc stats not matching packet count.
>
> In my experience the key metric for page pool is the recycling rate.
> Page return stats, however, count returned _pages_ not frags.
> This makes it impossible to calculate recycling rate for drivers
> using the frag API. Here is example output of the page-pool
> YNL sample for a driver allocating 1200B frags (4k pages)
> with nearly perfect recycling:
>
> [...]

Here is the summary with links:
  - [net-next] net: page_pool: do not count normal frag allocation in stats
    https://git.kernel.org/netdev/net-next/c/ef04d290c013

You are awesome, thank you!
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index a813d30d2135..f89cf93f6eb4 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -950,6 +950,7 @@ netmem_ref page_pool_alloc_frag_netmem(struct page_pool *pool,
 	if (netmem && *offset + size > max_size) {
 		netmem = page_pool_drain_frag(pool, netmem);
 		if (netmem) {
+			recycle_stat_inc(pool, cached);
 			alloc_stat_inc(pool, fast);
 			goto frag_reset;
 		}
@@ -974,7 +975,6 @@ netmem_ref page_pool_alloc_frag_netmem(struct page_pool *pool,
 
 	pool->frag_users++;
 	pool->frag_offset = *offset + size;
-	alloc_stat_inc(pool, fast);
 	return netmem;
 }
 EXPORT_SYMBOL(page_pool_alloc_frag_netmem);
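To illustrate how the two hunks above shift the counters, here is a toy model in plain C. It is not kernel code. It assumes that each page is counted once when it is taken from the pool, that the first fragment of a fresh page returns through the `frag_reset` path before reaching the removed `alloc_stat_inc(pool, fast)` (the hunk context suggests this but does not show it), and that every page is recycled exactly once. The names `toy_stats`, `toy_simulate` and `toy_rate_pct` are hypothetical.

```c
#include <assert.h>

/* Toy counters; these are NOT the kernel's page pool stats structures. */
struct toy_stats {
	unsigned long alloc;	/* allocation events counted */
	unsigned long recycle;	/* pages returned to the pool */
};

/*
 * Model `pages` page lifetimes, each page carved into
 * page_size / frag_size fragments.
 * count_frags != 0 models the accounting before the patch,
 * count_frags == 0 models the accounting after it.
 */
static struct toy_stats toy_simulate(unsigned long pages,
				     unsigned int frag_size,
				     unsigned int page_size,
				     int count_frags)
{
	struct toy_stats s = { 0, 0 };
	unsigned int frags_per_page;
	unsigned long p;

	assert(frag_size > 0 && frag_size <= page_size);
	frags_per_page = page_size / frag_size;

	for (p = 0; p < pages; p++) {
		/* Taking the page from the pool is counted once. */
		s.alloc += 1;
		/* Assumption: only frags after the first hit the removed increment. */
		if (count_frags)
			s.alloc += frags_per_page - 1;
		/* The whole page is recycled once all frags are released. */
		s.recycle += 1;
	}
	return s;
}

/* Recycling rate in percent, as recycled pages per counted allocation. */
static double toy_rate_pct(struct toy_stats s)
{
	return s.alloc ? 100.0 * (double)s.recycle / (double)s.alloc : 0.0;
}
```

With 1200-byte fragments and 4096-byte pages, `toy_rate_pct(toy_simulate(n, 1200, 4096, 1))` comes out at about 33.3%, while the same call with `count_frags` set to 0 gives 100%.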
Commit 0f6deac3a079 ("net: page_pool: add page allocation stats for
two fast page allocate path") added increments for "fast path"
allocation to page frag alloc. It mentions performance degradation
analysis but the details are unclear. Could be that the author
was simply surprised by the alloc stats not matching packet count.

In my experience the key metric for page pool is the recycling rate.
Page return stats, however, count returned _pages_ not frags.
This makes it impossible to calculate recycling rate for drivers
using the frag API. Here is example output of the page-pool
YNL sample for a driver allocating 1200B frags (4k pages)
with nearly perfect recycling:

    $ ./page-pool
      eth0[2]   page pools: 32 (zombies: 0)
                refs: 291648 bytes: 1194590208 (refs: 0 bytes: 0)
                recycling: 33.3% (alloc: 4557:2256365862 recycle: 200476245:551541893)

The recycling rate is reported as 33.3% because we give out
4096 // 1200 = 3 frags for every recycled page.

Effectively revert the aforementioned commit. This also aligns
with the stats we would see for drivers which do the fragmentation
themselves, although that's not a strong reason in itself.

On the (very unlikely) path where we can reuse the current page
let's bump the "cached" stat. The fact that we don't put the page
in the cache is just an optimization.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: hawk@kernel.org
CC: ilias.apalodimas@linaro.org
CC: lorenzo@kernel.org
CC: wangjie125@huawei.com
CC: huangguangbin2@huawei.com
---
 net/core/page_pool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
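The 33.3% figure in the example output can be checked by hand from the four counters on its last line. The sketch below assumes the rate is the sum of the two recycle counters divided by the sum of the two alloc counters, and that the pairs are printed as slow:fast and ring:cache. Neither is stated in the message, though the result depends only on the two sums, so the order within each pair does not matter. `struct pp_counters` and `recycling_pct` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four counters printed by the sample. */
struct pp_counters {
	uint64_t alloc_slow;
	uint64_t alloc_fast;
	uint64_t recycle_ring;
	uint64_t recycle_cache;
};

/* Recycling rate in percent: recycled pages per counted allocation. */
static double recycling_pct(const struct pp_counters *c)
{
	uint64_t allocs, recycled;

	assert(c);	/* caller must pass valid counters */
	allocs = c->alloc_slow + c->alloc_fast;
	recycled = c->recycle_ring + c->recycle_cache;

	if (!allocs)
		return 0.0;
	return 100.0 * (double)recycled / (double)allocs;
}

/* Values copied from the example output (field order is an assumption). */
static const struct pp_counters sample = {
	.alloc_slow	= 4557ULL,
	.alloc_fast	= 2256365862ULL,
	.recycle_ring	= 200476245ULL,
	.recycle_cache	= 551541893ULL,
};
```

For these counters `recycling_pct(&sample)` evaluates to roughly 33.3, which matches the reported rate and the 1-in-3 ratio implied by 4096 // 1200 = 3.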