| Message ID | 20250314-page-pool-track-dma-v1-2-c212e57a74c2@redhat.com (mailing list archive) |
|---|---|
| State | New |
| Series | Fix late DMA unmap crash for page pool |
On 3/14/2025 6:10 PM, Toke Høiland-Jørgensen wrote:
> Change the single-bit booleans for dma_sync into an unsigned long with
> BIT() definitions so that a subsequent patch can write them both with a
> single WRITE_ONCE() on teardown. Also move the check for the sync_cpu
> side into __page_pool_dma_sync_for_cpu() so it can be disabled for
> non-netmem providers as well.

I guess this patch is in preparation for disabling the
page_pool_dma_sync_for_cpu() related API on teardown?

It seems unnecessary to disable the page_pool_dma_sync_for_cpu()
related API on teardown: page_pool_dma_sync_for_cpu() has the same
calling assumption as the alloc API, which is not supposed to be
called by drivers once page_pool_destroy() has been called.
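To make that calling assumption concrete, here is a minimal, hypothetical
driver sketch (struct my_ring, my_rx_poll() and my_teardown() are
illustrative names, not from the patch): the sync-for-cpu helper is only
ever called between pool creation and page_pool_destroy(), exactly like
the alloc API.

#include <net/page_pool/helpers.h>

struct my_ring {                        /* hypothetical driver state */
        struct page_pool *pool;
        u32 buf_len;
};

static int my_rx_poll(struct my_ring *ring)
{
        struct page *page = page_pool_alloc_pages(ring->pool, GFP_ATOMIC);

        if (!page)
                return -ENOMEM;

        /* Valid only while the pool is live, same as the alloc above */
        page_pool_dma_sync_for_cpu(ring->pool, page, 0, ring->buf_len);
        return 0;
}

static void my_teardown(struct my_ring *ring)
{
        /* After this, the driver must not call any of the APIs above */
        page_pool_destroy(ring->pool);
}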
Yunsheng Lin <yunshenglin0825@gmail.com> writes:

> On 3/14/2025 6:10 PM, Toke Høiland-Jørgensen wrote:
>> Change the single-bit booleans for dma_sync into an unsigned long with
>> BIT() definitions so that a subsequent patch can write them both with a
>> single WRITE_ONCE() on teardown. Also move the check for the sync_cpu
>> side into __page_pool_dma_sync_for_cpu() so it can be disabled for
>> non-netmem providers as well.
>
> I guess this patch is in preparation for disabling the
> page_pool_dma_sync_for_cpu() related API on teardown?
>
> It seems unnecessary to disable the page_pool_dma_sync_for_cpu()
> related API on teardown: page_pool_dma_sync_for_cpu() has the same
> calling assumption as the alloc API, which is not supposed to be
> called by drivers once page_pool_destroy() has been called.

Sure, we could keep it to the dma_sync_for_dev() direction only, but
making both directions use the same variable to store the state, and
just resetting it at once, seemed simpler and more consistent.

-Toke
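For context, a sketch of what "resetting it at once" presumably looks
like in the follow-up patch (an assumption, not a quote from that
patch; the function name is hypothetical): with both directions in one
word, teardown can disable them with a single store, paired with the
READ_ONCE() in the sync helpers.

/* Assumed shape of the teardown-side disable: clears PP_DMA_SYNC_DEV
 * and PP_DMA_SYNC_CPU together with one write.
 */
static void page_pool_disable_sync(struct page_pool *pool)
{
        WRITE_ONCE(pool->dma_sync, 0);
}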
On Fri, Mar 14, 2025 at 3:12 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Change the single-bit booleans for dma_sync into an unsigned long with
> BIT() definitions so that a subsequent patch can write them both with a
> single WRITE_ONCE() on teardown. Also move the check for the sync_cpu
> side into __page_pool_dma_sync_for_cpu() so it can be disabled for
> non-netmem providers as well.
>
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>

Reviewed-by: Mina Almasry <almasrymina@google.com>

> ---
>  include/net/page_pool/helpers.h | 6 +++---
>  include/net/page_pool/types.h   | 8 ++++++--
>  net/core/devmem.c               | 3 +--
>  net/core/page_pool.c            | 9 +++++----
>  4 files changed, 15 insertions(+), 11 deletions(-)
>
> diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
> index 582a3d00cbe2315edeb92850b6a42ab21e509e45..7ed32bde4b8944deb7fb22e291e95b8487be681a 100644
> --- a/include/net/page_pool/helpers.h
> +++ b/include/net/page_pool/helpers.h
> @@ -443,6 +443,9 @@ static inline void __page_pool_dma_sync_for_cpu(const struct page_pool *pool,
>                                                 const dma_addr_t dma_addr,
>                                                 u32 offset, u32 dma_sync_size)
>  {
> +       if (!(READ_ONCE(pool->dma_sync) & PP_DMA_SYNC_CPU))
> +               return;
> +
>         dma_sync_single_range_for_cpu(pool->p.dev, dma_addr,
>                                       offset + pool->p.offset, dma_sync_size,
>                                       page_pool_get_dma_dir(pool));
> @@ -473,9 +476,6 @@ page_pool_dma_sync_netmem_for_cpu(const struct page_pool *pool,
>                                   const netmem_ref netmem, u32 offset,
>                                   u32 dma_sync_size)
>  {
> -       if (!pool->dma_sync_for_cpu)
> -               return;
> -
>         __page_pool_dma_sync_for_cpu(pool,
>                                      page_pool_get_dma_addr_netmem(netmem),
>                                      offset, dma_sync_size);

I think moving the check to __page_pool_dma_sync_for_cpu is fine, but I
would have preferred to keep it as-is, actually. I think if we're
syncing netmem we should check dma_sync_for_cpu, because the netmem may
not be dma-syncable, but pages will likely always be dma-syncable. Some
driver may have opted to do a perf optimization by calling
__page_pool_dma_sync_for_cpu on a dma-addr that it knows came from a
page, to save some cycles of netmem checking.
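A sketch of the pattern Mina describes (hypothetical driver code,
assuming buffers that are known to be real pages, never unreadable
netmem):

/* Hypothetical fast path: the driver knows this buffer is a real page,
 * so it takes the cached DMA address and calls the double-underscore
 * helper directly, skipping the netmem checks. After this patch, this
 * path also pays the READ_ONCE() + PP_DMA_SYNC_CPU test.
 */
static void my_sync_rx_buf(const struct page_pool *pool, struct page *page,
                           u32 offset, u32 len)
{
        dma_addr_t dma = page_pool_get_dma_addr(page);

        __page_pool_dma_sync_for_cpu(pool, dma, offset, len);
}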
> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
> index df0d3c1608929605224feb26173135ff37951ef8..fbe34024b20061e8bcd1d4474f6ebfc70992f1eb 100644
> --- a/include/net/page_pool/types.h
> +++ b/include/net/page_pool/types.h
> @@ -33,6 +33,10 @@
>  #define PP_FLAG_ALL    (PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV | \
>                          PP_FLAG_SYSTEM_POOL | PP_FLAG_ALLOW_UNREADABLE_NETMEM)
>
> +/* bit values used in pp->dma_sync */
> +#define PP_DMA_SYNC_DEV        BIT(0)
> +#define PP_DMA_SYNC_CPU        BIT(1)
> +
>  /*
>   * Fast allocation side cache array/stack
>   *
> @@ -175,12 +179,12 @@ struct page_pool {
>
>         bool has_init_callback:1;       /* slow::init_callback is set */
>         bool dma_map:1;                 /* Perform DMA mapping */
> -       bool dma_sync:1;                /* Perform DMA sync for device */
> -       bool dma_sync_for_cpu:1;        /* Perform DMA sync for cpu */
>  #ifdef CONFIG_PAGE_POOL_STATS
>         bool system:1;                  /* This is a global percpu pool */
>  #endif
>
> +       unsigned long dma_sync;
> +
>         __cacheline_group_begin_aligned(frag, PAGE_POOL_FRAG_GROUP_ALIGN);
>         long frag_users;
>         netmem_ref frag_page;
> diff --git a/net/core/devmem.c b/net/core/devmem.c
> index 7c6e0b5b6acb55f376ec725dfb71d1f70a4320c3..16e43752566feb510b3e47fbec2d8da0f26a6adc 100644
> --- a/net/core/devmem.c
> +++ b/net/core/devmem.c
> @@ -337,8 +337,7 @@ int mp_dmabuf_devmem_init(struct page_pool *pool)
>         /* dma-buf dma addresses do not need and should not be used with
>          * dma_sync_for_cpu/device. Force disable dma_sync.
>          */
> -       pool->dma_sync = false;
> -       pool->dma_sync_for_cpu = false;
> +       pool->dma_sync = 0;
>
>         if (pool->p.order != 0)
>                 return -E2BIG;
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index acef1fcd8ddcfd1853a6f2055c1f1820ab248e8d..d51ca4389dd62d8bc266a9a2b792838257173535 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -203,7 +203,7 @@ static int page_pool_init(struct page_pool *pool,
>         memcpy(&pool->slow, &params->slow, sizeof(pool->slow));
>
>         pool->cpuid = cpuid;
> -       pool->dma_sync_for_cpu = true;
> +       pool->dma_sync = PP_DMA_SYNC_CPU;
>

More pedantically, this should have been pool->dma_sync |=
PP_DMA_SYNC_CPU, but it doesn't matter, since this variable is
zero-initialized, I think.

--
Thanks,
Mina
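Mina's point, spelled out (assuming the pool struct is zero-allocated
at creation, which is why both forms behave identically at this spot in
page_pool_init()):

pool->dma_sync = PP_DMA_SYNC_CPU;       /* fine: no flags can be set yet */
pool->dma_sync |= PP_DMA_SYNC_CPU;      /* pedantic form: preserves any
                                         * previously set bits */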
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 582a3d00cbe2315edeb92850b6a42ab21e509e45..7ed32bde4b8944deb7fb22e291e95b8487be681a 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -443,6 +443,9 @@ static inline void __page_pool_dma_sync_for_cpu(const struct page_pool *pool,
                                                const dma_addr_t dma_addr,
                                                u32 offset, u32 dma_sync_size)
 {
+       if (!(READ_ONCE(pool->dma_sync) & PP_DMA_SYNC_CPU))
+               return;
+
        dma_sync_single_range_for_cpu(pool->p.dev, dma_addr,
                                      offset + pool->p.offset, dma_sync_size,
                                      page_pool_get_dma_dir(pool));
@@ -473,9 +476,6 @@ page_pool_dma_sync_netmem_for_cpu(const struct page_pool *pool,
                                  const netmem_ref netmem, u32 offset,
                                  u32 dma_sync_size)
 {
-       if (!pool->dma_sync_for_cpu)
-               return;
-
        __page_pool_dma_sync_for_cpu(pool,
                                     page_pool_get_dma_addr_netmem(netmem),
                                     offset, dma_sync_size);
diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index df0d3c1608929605224feb26173135ff37951ef8..fbe34024b20061e8bcd1d4474f6ebfc70992f1eb 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -33,6 +33,10 @@
 #define PP_FLAG_ALL    (PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV | \
                         PP_FLAG_SYSTEM_POOL | PP_FLAG_ALLOW_UNREADABLE_NETMEM)

+/* bit values used in pp->dma_sync */
+#define PP_DMA_SYNC_DEV        BIT(0)
+#define PP_DMA_SYNC_CPU        BIT(1)
+
 /*
  * Fast allocation side cache array/stack
  *
@@ -175,12 +179,12 @@ struct page_pool {

        bool has_init_callback:1;       /* slow::init_callback is set */
        bool dma_map:1;                 /* Perform DMA mapping */
-       bool dma_sync:1;                /* Perform DMA sync for device */
-       bool dma_sync_for_cpu:1;        /* Perform DMA sync for cpu */
 #ifdef CONFIG_PAGE_POOL_STATS
        bool system:1;                  /* This is a global percpu pool */
 #endif

+       unsigned long dma_sync;
+
        __cacheline_group_begin_aligned(frag, PAGE_POOL_FRAG_GROUP_ALIGN);
        long frag_users;
        netmem_ref frag_page;
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 7c6e0b5b6acb55f376ec725dfb71d1f70a4320c3..16e43752566feb510b3e47fbec2d8da0f26a6adc 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -337,8 +337,7 @@ int mp_dmabuf_devmem_init(struct page_pool *pool)
        /* dma-buf dma addresses do not need and should not be used with
         * dma_sync_for_cpu/device. Force disable dma_sync.
         */
-       pool->dma_sync = false;
-       pool->dma_sync_for_cpu = false;
+       pool->dma_sync = 0;

        if (pool->p.order != 0)
                return -E2BIG;
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index acef1fcd8ddcfd1853a6f2055c1f1820ab248e8d..d51ca4389dd62d8bc266a9a2b792838257173535 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -203,7 +203,7 @@ static int page_pool_init(struct page_pool *pool,
        memcpy(&pool->slow, &params->slow, sizeof(pool->slow));

        pool->cpuid = cpuid;
-       pool->dma_sync_for_cpu = true;
+       pool->dma_sync = PP_DMA_SYNC_CPU;

        /* Validate only known flags were used */
        if (pool->slow.flags & ~PP_FLAG_ALL)
@@ -238,7 +238,7 @@ static int page_pool_init(struct page_pool *pool,
                if (!pool->p.max_len)
                        return -EINVAL;

-               pool->dma_sync = true;
+               pool->dma_sync |= PP_DMA_SYNC_DEV;

                /* pool->p.offset has to be set according to the address
                 * offset used by the DMA engine to start copying rx data
@@ -291,7 +291,7 @@ static int page_pool_init(struct page_pool *pool,
        }

        if (pool->mp_ops) {
-               if (!pool->dma_map || !pool->dma_sync)
+               if (!pool->dma_map || !(pool->dma_sync & PP_DMA_SYNC_DEV))
                        return -EOPNOTSUPP;

                if (WARN_ON(!is_kernel_rodata((unsigned long)pool->mp_ops))) {
@@ -466,7 +466,8 @@ page_pool_dma_sync_for_device(const struct page_pool *pool,
                              netmem_ref netmem,
                              u32 dma_sync_size)
 {
-       if (pool->dma_sync && dma_dev_need_sync(pool->p.dev))
+       if ((READ_ONCE(pool->dma_sync) & PP_DMA_SYNC_DEV) &&
+           dma_dev_need_sync(pool->p.dev))
                __page_pool_dma_sync_for_device(pool, netmem, dma_sync_size);
 }
Change the single-bit booleans for dma_sync into an unsigned long with
BIT() definitions so that a subsequent patch can write them both with a
single WRITE_ONCE() on teardown. Also move the check for the sync_cpu
side into __page_pool_dma_sync_for_cpu() so it can be disabled for
non-netmem providers as well.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 include/net/page_pool/helpers.h | 6 +++---
 include/net/page_pool/types.h   | 8 ++++++--
 net/core/devmem.c               | 3 +--
 net/core/page_pool.c            | 9 +++++----
 4 files changed, 15 insertions(+), 11 deletions(-)
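As a usage note, the new representation keeps both sync directions in a
single word, so they can be tested or cleared together; a hedged sketch
(the helper name is illustrative, not part of the patch):

/* Illustrative only; not from the patch. True if either direction of
 * DMA sync is still enabled for this pool.
 */
static bool page_pool_any_sync_enabled(const struct page_pool *pool)
{
        return READ_ONCE(pool->dma_sync) &
               (PP_DMA_SYNC_DEV | PP_DMA_SYNC_CPU);
}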