Message ID | 20230111042214.907030-1-willy@infradead.org (mailing list archive)
Series     | Split netmem from struct page
On 2023/1/11 12:21, Matthew Wilcox (Oracle) wrote:
> The MM subsystem is trying to reduce struct page to a single pointer.
> The first step towards that is splitting struct page by its individual
> users, as has already been done with folio and slab. This patchset does
> that for netmem which is used for page pools.

As the page pool is only used on the rx side of the net stack, depending
on the driver, a lot more memory for the net stack comes from
page_frag_alloc_align(), the kmem cache, etc.
Naming it netmem seems a little overkill; perhaps a more specific name
for the page pool, such as pp_cache?

@Jesper & Ilias
Any better idea?
And it seems some APIs may need changing too, as we are not pooling
'pages' now.

>
> There are some relatively significant reductions in kernel text size
> from these changes. They don't appear to affect performance at all,
> but it's nice to save a bit of memory.
>
> v3:
>  - Rebase to next-20230110
>  - Add received Acked-by and Reviewed-by tags (thanks!)
>  - Mark compat functions in page_pool.h (Ilias)
>  - Correct a patch title
>  - Convert hns3 driver (and page_pool_dev_alloc_frag())
>  - Make page_pool_recycle_direct() accept a netmem or page pointer
>
> Matthew Wilcox (Oracle) (26):
>   netmem: Create new type
>   netmem: Add utility functions
>   page_pool: Add netmem_set_dma_addr() and netmem_get_dma_addr()
>   page_pool: Convert page_pool_release_page() to
>     page_pool_release_netmem()
>   page_pool: Start using netmem in allocation path.
                                                    ^
nit: there is a '.' at the end of the patch title.

>   page_pool: Convert page_pool_return_page() to
>     page_pool_return_netmem()
>   page_pool: Convert __page_pool_put_page() to __page_pool_put_netmem()
>   page_pool: Convert pp_alloc_cache to contain netmem
>   page_pool: Convert page_pool_defrag_page() to
>     page_pool_defrag_netmem()
>   page_pool: Convert page_pool_put_defragged_page() to netmem
>   page_pool: Convert page_pool_empty_ring() to use netmem
>   page_pool: Convert page_pool_alloc_pages() to page_pool_alloc_netmem()
>   page_pool: Convert page_pool_dma_sync_for_device() to take a netmem
>   page_pool: Convert page_pool_recycle_in_cache() to netmem
>   page_pool: Remove __page_pool_put_page()
>   page_pool: Use netmem in page_pool_drain_frag()
>   page_pool: Convert page_pool_return_skb_page() to use netmem
>   page_pool: Allow page_pool_recycle_direct() to take a netmem or a page
>   page_pool: Convert frag_page to frag_nmem
>   xdp: Convert to netmem
>   mm: Remove page pool members from struct page
>   page_pool: Pass a netmem to init_callback()
>   net: Add support for netmem in skb_frag
>   mvneta: Convert to netmem
>   mlx5: Convert to netmem
>   hns3: Convert to netmem
>
>  Documentation/networking/page_pool.rst        |   5 +
>  .../net/ethernet/hisilicon/hns3/hns3_enet.c   |  16 +-
>  drivers/net/ethernet/marvell/mvneta.c         |  48 +--
>  drivers/net/ethernet/mellanox/mlx5/core/en.h  |  10 +-
>  .../net/ethernet/mellanox/mlx5/core/en/txrx.h |   4 +-
>  .../net/ethernet/mellanox/mlx5/core/en/xdp.c  |  24 +-
>  .../net/ethernet/mellanox/mlx5/core/en/xdp.h  |   2 +-
>  .../net/ethernet/mellanox/mlx5/core/en_main.c |  12 +-
>  .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 130 +++++---
>  include/linux/mm_types.h                      |  22 --
>  include/linux/skbuff.h                        |  11 +
>  include/net/page_pool.h                       | 228 ++++++++++++---
>  include/trace/events/page_pool.h              |  28 +-
>  net/bpf/test_run.c                            |   4 +-
>  net/core/page_pool.c                          | 274 +++++++++---------
>  net/core/xdp.c                                |   7 +-
>  16 files changed, 493 insertions(+), 332 deletions(-)
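For reference, here is a minimal sketch of the kind of type the first
two patches ("netmem: Create new type", "netmem: Add utility functions")
introduce. The layout below is an assumption pieced together from the
page_pool members currently living in struct page; it is not a verbatim
copy of the patches.

/*
 * Overlay type for page_pool memory. For now it aliases struct page;
 * the shared fields (flags, _refcount) must sit at the same offsets,
 * which the real patches would enforce with static_assert(offsetof()).
 */
struct netmem {
	unsigned long flags;		/* must line up with page->flags */
	unsigned long pp_magic;		/* marks page_pool-owned memory */
	struct page_pool *pp;
	unsigned long _pp_mapping_pad;
	unsigned long dma_addr;		/* set by PP_FLAG_DMA_MAP pools */
	union {
		unsigned long dma_addr_upper;	/* 32-bit arch, 64-bit DMA */
		atomic_long_t pp_frag_count;	/* frag API refcount */
	};
	atomic_t _refcount;	/* must line up with page->_refcount */
};

/* folio/slab-style converters between the two views of the memory */
static inline struct netmem *page_netmem(struct page *page)
{
	return (struct netmem *)page;
}

static inline struct page *netmem_page(struct netmem *nmem)
{
	return (struct page *)nmem;
}

With that in place, the final "mm: Remove page pool members from struct
page" patch can drop all pp_* knowledge from struct page itself.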
On Wed, Jan 11, 2023 at 04:25:46PM +0800, Yunsheng Lin wrote:
> On 2023/1/11 12:21, Matthew Wilcox (Oracle) wrote:
> > The MM subsystem is trying to reduce struct page to a single pointer.
> > The first step towards that is splitting struct page by its individual
> > users, as has already been done with folio and slab. This patchset does
> > that for netmem which is used for page pools.
>
> As the page pool is only used on the rx side of the net stack, depending
> on the driver, a lot more memory for the net stack comes from
> page_frag_alloc_align(), the kmem cache, etc.
> Naming it netmem seems a little overkill; perhaps a more specific name
> for the page pool, such as pp_cache?
>
> @Jesper & Ilias
> Any better idea?
> And it seems some APIs may need changing too, as we are not pooling
> 'pages' now.

I raised the question of naming in v1, six weeks ago, and nobody had
any better names. Seems a little unfair to ignore the question at first
and then bring it up now. I'd hate to miss the merge window because of
a late-breaking major request like this.

https://lore.kernel.org/netdev/20221130220803.3657490-1-willy@infradead.org/

I'd like to understand what we think we'll do in networking when we trim
struct page down to a single pointer. All these usages that aren't from
page_pool -- what information does networking need to track
per-allocation? Would it make sense for netmem to describe all memory
used by the networking stack, and have allocators other than page_pool
also return netmem, or does the normal usage of memory in the net stack
not need to track that information?

> > Matthew Wilcox (Oracle) (26):
> >   netmem: Create new type
> >   netmem: Add utility functions
> >   page_pool: Add netmem_set_dma_addr() and netmem_get_dma_addr()
> >   page_pool: Convert page_pool_release_page() to
> >     page_pool_release_netmem()
> >   page_pool: Start using netmem in allocation path.
>                                                     ^
> nit: there is a '.' at the end of the patch title.

Thanks, I'll fix that for v4.
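For readers wondering what "Mark compat functions in page_pool.h" from
the v3 notes means in practice, here is a sketch of the compat pattern;
the signatures are guesses from the patch titles, not quotes from the
series, and __page_pool_recycle_direct() is a hypothetical helper.

/* The netmem function becomes the real implementation... */
struct netmem *page_pool_alloc_netmem(struct page_pool *pool, gfp_t gfp);

/* ...and the legacy name survives as a thin compat wrapper, using the
 * page_netmem()/netmem_page() converters sketched earlier, so that
 * unconverted drivers keep compiling. */
static inline struct page *page_pool_alloc_pages(struct page_pool *pool,
						 gfp_t gfp)
{
	return netmem_page(page_pool_alloc_netmem(pool, gfp));
}

/* One plausible way (the series may do it differently) to let
 * page_pool_recycle_direct() accept either a netmem or a page, per the
 * v3 notes: dispatch on the pointer type with _Generic. */
#define page_pool_recycle_direct(pool, mem)				\
	__page_pool_recycle_direct(pool, _Generic((mem),		\
		struct page *:	page_netmem((struct page *)(mem)),	\
		struct netmem *: (struct netmem *)(mem)))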
On 2023/1/11 21:21, Matthew Wilcox wrote:
> On Wed, Jan 11, 2023 at 04:25:46PM +0800, Yunsheng Lin wrote:
>> On 2023/1/11 12:21, Matthew Wilcox (Oracle) wrote:
>>> The MM subsystem is trying to reduce struct page to a single pointer.
>>> The first step towards that is splitting struct page by its individual
>>> users, as has already been done with folio and slab. This patchset does
>>> that for netmem which is used for page pools.
>>
>> As the page pool is only used on the rx side of the net stack, depending
>> on the driver, a lot more memory for the net stack comes from
>> page_frag_alloc_align(), the kmem cache, etc.
>> Naming it netmem seems a little overkill; perhaps a more specific name
>> for the page pool, such as pp_cache?
>>
>> @Jesper & Ilias
>> Any better idea?
>> And it seems some APIs may need changing too, as we are not pooling
>> 'pages' now.
>
> I raised the question of naming in v1, six weeks ago, and nobody had
> any better names. Seems a little unfair to ignore the question at first
> and then bring it up now. I'd hate to miss the merge window because of
> a late-breaking major request like this.

I was not keeping a very close eye on the mailing list, so forgive me
for missing it. As this version contains changes to the hns3 driver, it
caught my eye and I took some time to go through it.

>
> https://lore.kernel.org/netdev/20221130220803.3657490-1-willy@infradead.org/
>
> I'd like to understand what we think we'll do in networking when we trim
> struct page down to a single pointer. All these usages that aren't from
> page_pool -- what information does networking need to track
> per-allocation?

These are the kinds of memory used by the net stack, to my limited
understanding:
1. pages allocated directly from the page allocator, such as via
   dev_alloc_pages().
2. pages allocated from the page pool API.
3. pages allocated from the page frag API, such as
   page_frag_alloc_align().
4. buffers allocated from the kmem cache API, which may be converted to
   pages using virt_to_page().
5. pages allocated by another subsystem and used by the net stack (or
   allocated by the net stack and used by another subsystem) through
   copy-avoidance optimizations, such as sendfile and splice.

For case 1, I do not think networking needs to track per-allocation
information now, as its users already take care of that where the
information is needed.
For case 3, I view it as a pool far smaller than the page pool; we may
merge the page frag API into the page pool, or find common information
needed by both of them.
Cases 4 & 5 seem similar to each other. I am not sure how "trimming
struct page down to a single pointer" works yet, but at the least it may
need a common field such as page->pp_magic so that different subsystems
can use the same page in a coordinated way, as the page pool does now?

> Would it make sense for netmem to describe all memory
> used by the networking stack, and have allocators other than page_pool
> also return netmem, or does the normal usage of memory in the net stack
> not need to track that information?

The above question seems hard to answer now. If "reducing struct page to
a single pointer" does not affect cases 4 & 5 or other cases I missed,
it seems fine to limit the change to the page pool as this patchset does
for now; it would be better if we could come up with a better name than
'netmem'.
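To illustrate the pp_magic coordination mentioned for cases 4 & 5: this
is roughly what the page pool and the skb-recycling path already do
today, simplified from net/core/page_pool.c (the real code has more
checks). Any shrunken struct page would still need a stamp like this for
other subsystems to recognize pool-owned memory.

/* Allocation side: the pool stamps every page it hands out. */
static void page_pool_set_pp_info(struct page_pool *pool, struct page *page)
{
	page->pp = pool;
	page->pp_magic |= PP_SIGNATURE;
}

/* Free side: the skb path returns a page to its pool only if the stamp
 * matches; anything else falls back to the normal put_page() path. */
bool page_pool_return_skb_page(struct page *page)
{
	page = compound_head(page);

	/* The low bits of pp_magic may be reused for other state,
	 * hence the mask. */
	if ((page->pp_magic & ~0x3UL) != PP_SIGNATURE)
		return false;

	page_pool_put_full_page(page->pp, page, false);
	return true;
}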
On 11/01/2023 14.21, Matthew Wilcox wrote:
> On Wed, Jan 11, 2023 at 04:25:46PM +0800, Yunsheng Lin wrote:
>> On 2023/1/11 12:21, Matthew Wilcox (Oracle) wrote:
>>> The MM subsystem is trying to reduce struct page to a single pointer.
>>> The first step towards that is splitting struct page by its individual
>>> users, as has already been done with folio and slab. This patchset does
>>> that for netmem which is used for page pools.
>> As the page pool is only used on the rx side of the net stack, depending
>> on the driver, a lot more memory for the net stack comes from
>> page_frag_alloc_align(), the kmem cache, etc.
>> Naming it netmem seems a little overkill; perhaps a more specific name
>> for the page pool, such as pp_cache?
>>
>> @Jesper & Ilias
>> Any better idea?

I like the 'netmem' name.

>> And it seems some APIs may need changing too, as we are not pooling
>> 'pages' now.

IMHO it would be overkill to rename the page_pool to e.g. netmem_pool,
as it would generate too much churn and be hard to follow in git, since
the filename page_pool.c would also have to be renamed. I guess we keep
page_pool for historical reasons ;-)

> I raised the question of naming in v1, six weeks ago, and nobody had
> any better names. Seems a little unfair to ignore the question at first
> and then bring it up now. I'd hate to miss the merge window because of
> a late-breaking major request like this.
>
> https://lore.kernel.org/netdev/20221130220803.3657490-1-willy@infradead.org/
>
> I'd like to understand what we think we'll do in networking when we trim
> struct page down to a single pointer. All these usages that aren't from
> page_pool -- what information does networking need to track
> per-allocation? Would it make sense for netmem to describe all memory
> used by the networking stack, and have allocators other than page_pool
> also return netmem,

This is also how I see the future: other netstack "allocators" can
return and work with 'netmem' objects. IMHO we are already cramming too
many use-cases into page_pool (like the frag support Yunsheng added).
IMHO there is room for other netstack "allocators" that can utilize
netmem. The page_pool is optimized for RX-NAPI workloads; using it for
other purposes is a mistake IMHO. People should create other netstack
"allocators" that solve their specific use-cases. E.g. the TX path
likely needs another "allocator" optimized for the TX use-case.

> or does the normal usage of memory in the net stack not need to
> track that information?

The page refcnt is (obviously) used by the netstack as tracked
information. I have seen drivers that use the DMA mapping stored
directly in the page/'netmem', instead of having to store it separately
in the driver.

--Jesper
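A concrete example of that last point, with a hypothetical ring
structure (example_rx_ring and its fields are made up for illustration;
page_pool_dev_alloc_pages() and page_pool_get_dma_addr() are the
existing APIs): the driver keeps no parallel array of DMA addresses,
because the pool stored the mapping in the page/'netmem' itself.

/* Hypothetical RX descriptor ring, for illustration only. */
struct example_rx_ring {
	struct { __le64 addr; } *desc;
	struct page **pages;
	unsigned int next;
};

/* Assumes the pool was created with PP_FLAG_DMA_MAP, so the pool maps
 * each page and records the DMA address in it. */
static int example_rx_refill(struct example_rx_ring *ring,
			     struct page_pool *pool)
{
	struct page *page = page_pool_dev_alloc_pages(pool);

	if (!page)
		return -ENOMEM;

	/* Read the mapping back out of the page/'netmem' itself. */
	ring->desc[ring->next].addr =
		cpu_to_le64(page_pool_get_dma_addr(page) +
			    XDP_PACKET_HEADROOM);
	ring->pages[ring->next] = page;
	return 0;
}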
On 2023/1/12 18:15, Jesper Dangaard Brouer wrote:
> On 11/01/2023 14.21, Matthew Wilcox wrote:
>> On Wed, Jan 11, 2023 at 04:25:46PM +0800, Yunsheng Lin wrote:
>>> On 2023/1/11 12:21, Matthew Wilcox (Oracle) wrote:
>>>> The MM subsystem is trying to reduce struct page to a single pointer.
>>>> The first step towards that is splitting struct page by its individual
>>>> users, as has already been done with folio and slab. This patchset does
>>>> that for netmem which is used for page pools.
>>> As the page pool is only used on the rx side of the net stack, depending
>>> on the driver, a lot more memory for the net stack comes from
>>> page_frag_alloc_align(), the kmem cache, etc.
>>> Naming it netmem seems a little overkill; perhaps a more specific name
>>> for the page pool, such as pp_cache?
>>>
>>> @Jesper & Ilias
>>> Any better idea?
>
> I like the 'netmem' name.

Fair enough. I just pointed out why netmem might not be appropriate when
we have not yet figured out how netmem will work through the whole
networking stack. It is eventually your and David/Jakub's call to decide
the naming anyway.

>
>>> And it seems some APIs may need changing too, as we are not pooling
>>> 'pages' now.
>
> IMHO it would be overkill to rename the page_pool to e.g. netmem_pool,
> as it would generate too much churn and be hard to follow in git, since
> the filename page_pool.c would also have to be renamed. I guess we keep
> page_pool for historical reasons ;-)

I think this is a matter of conflict between backward and forward
maintainability. IMHO we should prefer forward maintainability over
backward maintainability. And Greg offers a possible way to fix the
backport problem:
https://www.spinics.net/lists/kernel/msg4648826.html

For git history, I suppose that is a pain we have to pay for future
maintainability.

>
>> I raised the question of naming in v1, six weeks ago, and nobody had
>> any better names. Seems a little unfair to ignore the question at first
>> and then bring it up now. I'd hate to miss the merge window because of
>> a late-breaking major request like this.
>>
>> https://lore.kernel.org/netdev/20221130220803.3657490-1-willy@infradead.org/
>>
>> I'd like to understand what we think we'll do in networking when we trim
>> struct page down to a single pointer. All these usages that aren't from
>> page_pool -- what information does networking need to track
>> per-allocation? Would it make sense for netmem to describe all memory
>> used by the networking stack, and have allocators other than page_pool
>> also return netmem,
>
> This is also how I see the future: other netstack "allocators" can
> return and work with 'netmem' objects. IMHO we are already cramming too

I am not sure how "other netstack 'allocators' can return and work with
'netmem' objects" would work. I suppose it means putting different
unions for different allocators in struct netmem, like struct page does?
Isn't that bringing back the very problem Matthew is trying to fix with
this patchset?

> many use-cases into page_pool (like the frag support Yunsheng added).
> IMHO there is room for other netstack "allocators" that can utilize

I do not understand why the frag support is viewed as "cramming
use-cases into the page pool". In my defence, the frag support for rx
fits within the page pool; it just extends the pool to return smaller
buffers than before (see the frag sketch below). If I created another
allocator for that, I might reinvent a lot of the wheels the page pool
has already invented.

> netmem. The page_pool is optimized for RX-NAPI workloads; using it for
> other purposes is a mistake IMHO. People should create other netstack
> "allocators" that solve their specific use-cases. E.g. the TX path
> likely needs another "allocator" optimized for the TX use-case.
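The frag sketch referenced above: the existing entry point lets a driver
carve sub-page buffers out of pool pages while the pool does the
per-page accounting via pp_frag_count. example_rx_buf is a hypothetical
driver structure; page_pool_dev_alloc_frag() is the existing API.

/* Hypothetical driver-side bookkeeping for a sub-page RX buffer. */
struct example_rx_buf {
	struct page *page;
	unsigned int offset;
};

/* The pool must be created with PP_FLAG_PAGE_FRAG for this to work. */
static int example_alloc_half_page(struct page_pool *pool,
				   struct example_rx_buf *buf)
{
	unsigned int offset;
	struct page *page;

	/* Returns a page plus an offset into it; the pool keeps handing
	 * out the same page until its frags are exhausted, tracking the
	 * users in pp_frag_count. */
	page = page_pool_dev_alloc_frag(pool, &offset, PAGE_SIZE / 2);
	if (!page)
		return -ENOMEM;

	buf->page = page;
	buf->offset = offset;
	return 0;
}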