Message ID   | tencent_76E62F6A47A7C7E818FC7C74A6B02772F308@qq.com (mailing list archive)
State        | Rejected
Delegated to | Netdev Maintainers
Series       | [net-next,v1] net: page_pool: add page_pool_put_page_nosync()
On 2024/12/19 11:11, Guowei Dang wrote:
> Add page_pool_put_page_nosync() to respond to dma_sync_size being 0.
>
> The purpose of this is to make the semantics more obvious and may
> enable removing some checkings in the future.

It would be good to describe the actual use case of the above API in the
commit log too.

>
> And in the long term, treating the nosync scenario separately provides
> more flexibility for the user and enable removing of the
> PP_FLAG_DMA_SYNC_DEV in the future.
>
> Since we do have a page_pool_put_full_page(), adding a variant for
> the nosync seems reasonable.
>
> Suggested-by: Yunsheng Lin <linyunsheng@huawei.com>
> Acked-by: Furong Xu <0x1207@gmail.com>
> Signed-off-by: Guowei Dang <guowei.dang@foxmail.com>
> ---
>  Documentation/networking/page_pool.rst |  5 ++++-
>  include/net/page_pool/helpers.h        | 17 +++++++++++++++++
>  2 files changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/networking/page_pool.rst b/Documentation/networking/page_pool.rst
> index 9d958128a57c..a83f7c071132 100644
> --- a/Documentation/networking/page_pool.rst
> +++ b/Documentation/networking/page_pool.rst
> @@ -62,7 +62,8 @@ a page will cause no race conditions is enough.
>     :identifiers: struct page_pool_params
>
>  .. kernel-doc:: include/net/page_pool/helpers.h
> -   :identifiers: page_pool_put_page page_pool_put_full_page
> +   :identifiers: page_pool_put_page
> +                 page_pool_put_page_nosync page_pool_put_full_page
>                   page_pool_recycle_direct page_pool_free_va
>                   page_pool_dev_alloc_pages page_pool_dev_alloc_frag
>                   page_pool_dev_alloc page_pool_dev_alloc_va
> @@ -93,6 +94,8 @@ much of the page needs to be synced (starting at ``offset``).
>  When directly freeing pages in the driver (page_pool_put_page())
>  the ``dma_sync_size`` argument specifies how much of the buffer needs
>  to be synced.
> +If the ``dma_sync_size`` argument is 0, page_pool_put_page_nosync() should be
> +used instead of page_pool_put_page().

It would be good to describe when the user should call
page_pool_put_page_nosync() and when the user should call
page_pool_put_page(), as the above doesn't really help the user decide
which API to use.

If I recall correctly, there is some use case where the user is able to
tell that it is ok to skip the dma sync to improve performance by
calling page_pool_put_page_nosync(), even when the page_pool is created
with the PP_FLAG_DMA_SYNC_DEV flag set.

>
>  If in doubt set ``offset`` to 0, ``max_len`` to ``PAGE_SIZE`` and
>  pass -1 as ``dma_sync_size``. That combination of arguments is always
> diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
> index e555921e5233..5cc68d48624a 100644
> --- a/include/net/page_pool/helpers.h
> +++ b/include/net/page_pool/helpers.h
> @@ -340,12 +340,14 @@ static inline void page_pool_put_netmem(struct page_pool *pool,
>   * the allocator owns the page and will try to recycle it in one of the pool
>   * caches. If PP_FLAG_DMA_SYNC_DEV is set, the page will be synced for_device
>   * using dma_sync_single_range_for_device().
> + * page_pool_put_page_nosync() should be used if dma_sync_size is 0.
>   */
>  static inline void page_pool_put_page(struct page_pool *pool,
>                                        struct page *page,
>                                        unsigned int dma_sync_size,
>                                        bool allow_direct)
>  {
> +	DEBUG_NET_WARN_ON_ONCE(!dma_sync_size);
>  	page_pool_put_netmem(pool, page_to_netmem(page), dma_sync_size,
>  			     allow_direct);
>  }
> @@ -372,6 +374,21 @@ static inline void page_pool_put_full_page(struct page_pool *pool,
>  	page_pool_put_netmem(pool, page_to_netmem(page), -1, allow_direct);
>  }
>
> +/**
> + * page_pool_put_page_nosync() - release a reference on a page pool page
> + * @pool: pool from which page was allocated
> + * @page: page to release a reference on
> + * @allow_direct: released by the consumer, allow lockless caching
> + *
> + * Similar to page_pool_put_page(), but will not DMA sync the memory area.
> + */
> +static inline void page_pool_put_page_nosync(struct page_pool *pool,
> +					     struct page *page,
> +					     bool allow_direct)
> +{
> +	page_pool_put_netmem(pool, page_to_netmem(page), 0, allow_direct);
> +}
> +
>  /**
>   * page_pool_recycle_direct() - release a reference on a page pool page
>   * @pool: pool from which page was allocated
On Thu, 19 Dec 2024 11:11:38 +0800 Guowei Dang wrote:
> Add page_pool_put_page_nosync() to respond to dma_sync_size being 0.
>
> The purpose of this is to make the semantics more obvious and may
> enable removing some checkings in the future.
>
> And in the long term, treating the nosync scenario separately provides
> more flexibility for the user and enable removing of the
> PP_FLAG_DMA_SYNC_DEV in the future.
>
> Since we do have a page_pool_put_full_page(), adding a variant for
> the nosync seems reasonable.

You should provide an upstream user with the API.

But IMHO this just complicates the already very large API, for little
benefit. I'm going to leave this in patchwork for a day in case page
pool maintainers disagree, but I vote "no".
From: Jakub Kicinski <kuba@kernel.org>
Date: Thu, 19 Dec 2024 06:24:38 -0800

(to the author of the patch)

> On Thu, 19 Dec 2024 11:11:38 +0800 Guowei Dang wrote:
>> Add page_pool_put_page_nosync() to respond to dma_sync_size being 0.

If PP_FLAG_DMA_SYNC_DEV is set, dma_sync_size == 0 can happen only when
the HW didn't write anything *and* the driver uses only one page per
frame, no frags. A very unlikely case I'd say; adding a separate wrapper
for it makes no sense.

>>
>> The purpose of this is to make the semantics more obvious and may
>> enable removing some checkings in the future.

Which checks do you want to remove?

>>
>> And in the long term, treating the nosync scenario separately provides
>> more flexibility for the user and enable removing of the
>> PP_FLAG_DMA_SYNC_DEV in the future.

Why remove SYNC_DEV?

>>
>> Since we do have a page_pool_put_full_page(), adding a variant for
>> the nosync seems reasonable.

Not really. put_full_page() is for cases when either the HW-written size
is unknown or the driver uses frags; those are common and widely used.

> You should provide an upstream user with the API.

Would be nice to see a real example, as I don't understand the purpose
of this function either.

> But IMHO this just complicates the already very large API,
> for little benefit.
> I'm going to leave this in patchwork for a day in case page
> pool maintainers disagree, but I vote "no".

I don't see a reason for this either.

Thanks,
Olek
Hi Jakub,

On Thu, 19 Dec 2024 at 16:24, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 19 Dec 2024 11:11:38 +0800 Guowei Dang wrote:
> > Add page_pool_put_page_nosync() to respond to dma_sync_size being 0.
> >
> > The purpose of this is to make the semantics more obvious and may
> > enable removing some checkings in the future.
> >
> > And in the long term, treating the nosync scenario separately provides
> > more flexibility for the user and enable removing of the
> > PP_FLAG_DMA_SYNC_DEV in the future.
> >
> > Since we do have a page_pool_put_full_page(), adding a variant for
> > the nosync seems reasonable.
>
> You should provide an upstream user with the API.
> But IMHO this just complicates the already very large API,
> for little benefit.

+1000. I think the API has grown more than it has to, and we now have
way too many abstractions. I'll try to find some time and see if I can
come up with a cleanup that makes sense.

Thanks
/Ilias

> I'm going to leave this in patchwork for a day in case page
> pool maintainers disagree, but I vote "no".
From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Date: Fri, 20 Dec 2024 14:27:16 +0200

> Hi Jakub
>
> On Thu, 19 Dec 2024 at 16:24, Jakub Kicinski <kuba@kernel.org> wrote:
>>
>> On Thu, 19 Dec 2024 11:11:38 +0800 Guowei Dang wrote:
>>> Add page_pool_put_page_nosync() to respond to dma_sync_size being 0.
>>>
>>> The purpose of this is to make the semantics more obvious and may
>>> enable removing some checkings in the future.
>>>
>>> And in the long term, treating the nosync scenario separately provides
>>> more flexibility for the user and enable removing of the
>>> PP_FLAG_DMA_SYNC_DEV in the future.
>>>
>>> Since we do have a page_pool_put_full_page(), adding a variant for
>>> the nosync seems reasonable.
>>
>> You should provide an upstream user with the API.
>> But IMHO this just complicates the already very large API,
>> for little benefit.
>
> +1000, I think the API has grown more than it has to and we now have
> way too many abstractions.
>
> I'll try to find some time and see if I can come up with a cleanup
> that makes sense

I'd remove:

* the explicit 1-page-per-buffer API (leaving only the hybrid mode);
* order > 0 support (barely used, if at all?);
* usage without DMA map and sync for device.

Also, converting the users to netmem would allow removing some wrappers.

Thanks,
Olek
diff --git a/Documentation/networking/page_pool.rst b/Documentation/networking/page_pool.rst
index 9d958128a57c..a83f7c071132 100644
--- a/Documentation/networking/page_pool.rst
+++ b/Documentation/networking/page_pool.rst
@@ -62,7 +62,8 @@ a page will cause no race conditions is enough.
    :identifiers: struct page_pool_params

 .. kernel-doc:: include/net/page_pool/helpers.h
-   :identifiers: page_pool_put_page page_pool_put_full_page
+   :identifiers: page_pool_put_page
+                 page_pool_put_page_nosync page_pool_put_full_page
                  page_pool_recycle_direct page_pool_free_va
                  page_pool_dev_alloc_pages page_pool_dev_alloc_frag
                  page_pool_dev_alloc page_pool_dev_alloc_va
@@ -93,6 +94,8 @@ much of the page needs to be synced (starting at ``offset``).
 When directly freeing pages in the driver (page_pool_put_page())
 the ``dma_sync_size`` argument specifies how much of the buffer needs
 to be synced.
+If the ``dma_sync_size`` argument is 0, page_pool_put_page_nosync() should be
+used instead of page_pool_put_page().

 If in doubt set ``offset`` to 0, ``max_len`` to ``PAGE_SIZE`` and
 pass -1 as ``dma_sync_size``. That combination of arguments is always
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index e555921e5233..5cc68d48624a 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -340,12 +340,14 @@ static inline void page_pool_put_netmem(struct page_pool *pool,
 * the allocator owns the page and will try to recycle it in one of the pool
 * caches. If PP_FLAG_DMA_SYNC_DEV is set, the page will be synced for_device
 * using dma_sync_single_range_for_device().
+ * page_pool_put_page_nosync() should be used if dma_sync_size is 0.
 */
 static inline void page_pool_put_page(struct page_pool *pool,
                                       struct page *page,
                                       unsigned int dma_sync_size,
                                       bool allow_direct)
 {
+	DEBUG_NET_WARN_ON_ONCE(!dma_sync_size);
 	page_pool_put_netmem(pool, page_to_netmem(page), dma_sync_size,
 			     allow_direct);
 }
@@ -372,6 +374,21 @@ static inline void page_pool_put_full_page(struct page_pool *pool,
 	page_pool_put_netmem(pool, page_to_netmem(page), -1, allow_direct);
 }

+/**
+ * page_pool_put_page_nosync() - release a reference on a page pool page
+ * @pool: pool from which page was allocated
+ * @page: page to release a reference on
+ * @allow_direct: released by the consumer, allow lockless caching
+ *
+ * Similar to page_pool_put_page(), but will not DMA sync the memory area.
+ */
+static inline void page_pool_put_page_nosync(struct page_pool *pool,
+					     struct page *page,
+					     bool allow_direct)
+{
+	page_pool_put_netmem(pool, page_to_netmem(page), 0, allow_direct);
+}
+
 /**
  * page_pool_recycle_direct() - release a reference on a page pool page
  * @pool: pool from which page was allocated