Message ID | 20250325-page-pool-track-dma-v2-2-113ebc1946f3@redhat.com (mailing list archive)
---|---
State | Superseded
Series | Fix late DMA unmap crash for page pool
On Tue, 25 Mar 2025 16:45:43 +0100 Toke Høiland-Jørgensen wrote:
> Change the single-bit booleans for dma_sync into an unsigned long with
> BIT() definitions so that a subsequent patch can write them both with a
> single WRITE_ONCE() on teardown. Also move the check for the sync_cpu
> side into __page_pool_dma_sync_for_cpu() so it can be disabled for
> non-netmem providers as well.

Can we make them just bools without the bit width?
Less churn and actually fewer bytes.

I don't see why we'd need to wipe them atomically.
In fact I don't see why we're touching dma_sync_cpu, at all,
it's driver-facing and the driver is gone in the problematic
scenario.
Jakub Kicinski <kuba@kernel.org> writes:

> On Tue, 25 Mar 2025 16:45:43 +0100 Toke Høiland-Jørgensen wrote:
>> Change the single-bit booleans for dma_sync into an unsigned long with
>> BIT() definitions so that a subsequent patch can write them both with a
>> single WRITE_ONCE() on teardown. Also move the check for the sync_cpu
>> side into __page_pool_dma_sync_for_cpu() so it can be disabled for
>> non-netmem providers as well.
>
> Can we make them just bools without the bit width?
> Less churn and actually fewer bytes.

Ah! Didn't realise that was possible, excellent solution :)

> I don't see why we'd need to wipe them atomically.
> In fact I don't see why we're touching dma_sync_cpu, at all,
> it's driver-facing and the driver is gone in the problematic
> scenario.

No you're right, but it felt weird to change just one of them, so
figured I'd go with both. But keeping them both as bool, and just making
dma_sync a full-width bool works, so I'll respin with that and leave
dma_sync_cpu as-is.
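For illustration, a rough sketch of the direction agreed on above, using the field names from this series (the helper name below is hypothetical, not something from the patch): dma_sync becomes a full-width bool so its address can be taken and teardown can clear it with a plain WRITE_ONCE(), which is not possible for a single-bit bitfield, while dma_sync_for_cpu stays a bitfield.

struct page_pool {
	/* ... */
	bool has_init_callback:1;	/* slow::init_callback is set */
	bool dma_map:1;			/* Perform DMA mapping */
	bool dma_sync_for_cpu:1;	/* Perform DMA sync for cpu (unchanged) */
#ifdef CONFIG_PAGE_POOL_STATS
	bool system:1;			/* This is a global percpu pool */
#endif

	bool dma_sync;			/* Perform DMA sync for device;
					 * full-width so teardown can clear it
					 * with WRITE_ONCE()
					 */
	/* ... */
};

/* Hot path pairs the teardown write with READ_ONCE() (hypothetical helper) */
static bool page_pool_should_sync_dev(const struct page_pool *pool)
{
	return READ_ONCE(pool->dma_sync) && dma_dev_need_sync(pool->p.dev);
}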
On Wed, 26 Mar 2025 09:12:34 +0100 Toke Høiland-Jørgensen wrote:
> > I don't see why we'd need to wipe them atomically.
> > In fact I don't see why we're touching dma_sync_cpu, at all,
> > it's driver-facing and the driver is gone in the problematic
> > scenario.
>
> No you're right, but it felt weird to change just one of them, so
> figured I'd go with both. But keeping them both as bool, and just making
> dma_sync a full-width bool works, so I'll respin with that and leave
> dma_sync_cpu as-is.

Opinion on dma_sync_cpu clearing probably depends on mental model.
No strong feelings but perhaps add a comment next to clearing it
for the likes of myself saying that this technically shouldn't be
needed as we only expect drivers to ask for CPU sync?
On Wed, 26 Mar 2025 04:23:47 -0700 Jakub Kicinski wrote:
> On Wed, 26 Mar 2025 09:12:34 +0100 Toke Høiland-Jørgensen wrote:
> > > I don't see why we'd need to wipe them atomically.
> > > In fact I don't see why we're touching dma_sync_cpu, at all,
> > > it's driver-facing and the driver is gone in the problematic
> > > scenario.
> >
> > No you're right, but it felt weird to change just one of them, so
> > figured I'd go with both. But keeping them both as bool, and just making
> > dma_sync a full-width bool works, so I'll respin with that and leave
> > dma_sync_cpu as-is.
>
> Opinion on dma_sync_cpu clearing probably depends on mental model.
> No strong feelings but perhaps add a comment next to clearing it
> for the likes of myself saying that this technically shouldn't be
> needed as we only expect drivers to ask for CPU sync?

Ah, misread, I thought you meant "as-is" == "as is in this series". Thanks!
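A sketch of the kind of comment being asked for here, placed next to wherever the respin ends up clearing the CPU-sync flag on teardown (the helper name and placement are hypothetical, not code from the series):

/* Hypothetical teardown helper illustrating the suggested comment */
static void page_pool_disable_dma_sync(struct page_pool *pool)
{
	/* Stop device syncs: the pages may outlive the DMA device. */
	WRITE_ONCE(pool->dma_sync, false);

	/* Technically not needed: only drivers request CPU syncs, and the
	 * driver is already gone by the time the pool is torn down. Clear
	 * the flag anyway so both sync paths are visibly disabled.
	 */
	pool->dma_sync_for_cpu = false;
}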
On 25 Mar 16:45, Toke Høiland-Jørgensen wrote:
>Change the single-bit booleans for dma_sync into an unsigned long with
>BIT() definitions so that a subsequent patch can write them both with a
>single WRITE_ONCE() on teardown. Also move the check for the sync_cpu
>side into __page_pool_dma_sync_for_cpu() so it can be disabled for
>non-netmem providers as well.
>
>Reviewed-by: Mina Almasry <almasrymina@google.com>
>Tested-by: Yonglong Liu <liuyonglong@huawei.com>
>Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>---
> include/net/page_pool/helpers.h | 6 +++---
> include/net/page_pool/types.h   | 8 ++++++--
> net/core/devmem.c               | 3 +--
> net/core/page_pool.c            | 9 +++++----
> 4 files changed, 15 insertions(+), 11 deletions(-)
>
>diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
>index 582a3d00cbe2315edeb92850b6a42ab21e509e45..7ed32bde4b8944deb7fb22e291e95b8487be681a 100644
>--- a/include/net/page_pool/helpers.h
>+++ b/include/net/page_pool/helpers.h
>@@ -443,6 +443,9 @@ static inline void __page_pool_dma_sync_for_cpu(const struct page_pool *pool,
> 						const dma_addr_t dma_addr,
> 						u32 offset, u32 dma_sync_size)
> {
>+	if (!(READ_ONCE(pool->dma_sync) & PP_DMA_SYNC_CPU))
>+		return;
>+

page_pool_dma_sync_for_cpu() is a wrapper for this function, and it assumes
pages were created with the DMA flag, so you are adding this unnecessary
check for that path.

Just change page_pool_dma_sync_for_cpu() to directly call
dma_sync_single_range_for_cpu(...) as part of this patch.

> 	dma_sync_single_range_for_cpu(pool->p.dev, dma_addr,
> 				      offset + pool->p.offset, dma_sync_size,
> 				      page_pool_get_dma_dir(pool));
>@@ -473,9 +476,6 @@ page_pool_dma_sync_netmem_for_cpu(const struct page_pool *pool,
> 				  const netmem_ref netmem, u32 offset,
> 				  u32 dma_sync_size)
> {
>-	if (!pool->dma_sync_for_cpu)
>-		return;
>-
> 	__page_pool_dma_sync_for_cpu(pool,
> 				     page_pool_get_dma_addr_netmem(netmem),
> 				     offset, dma_sync_size);

[... remainder of the quoted patch trimmed; the full diff appears below ...]
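A rough sketch of the change the reviewer is suggesting, assuming the page-backed wrapper keeps its current argument list (this illustrates the proposed direction, not code from the series): the page wrapper calls the sync routine directly and skips the pool->dma_sync check, since pages reaching it come from a pool created with DMA mapping enabled.

static inline void page_pool_dma_sync_for_cpu(const struct page_pool *pool,
					      const struct page *page,
					      u32 offset, u32 dma_sync_size)
{
	/* Pages handed to this wrapper were set up with PP_FLAG_DMA_MAP,
	 * so sync unconditionally instead of re-checking pool->dma_sync
	 * the way the netmem path has to.
	 */
	dma_sync_single_range_for_cpu(pool->p.dev,
				      page_pool_get_dma_addr(page),
				      offset + pool->p.offset, dma_sync_size,
				      page_pool_get_dma_dir(pool));
}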
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 582a3d00cbe2315edeb92850b6a42ab21e509e45..7ed32bde4b8944deb7fb22e291e95b8487be681a 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -443,6 +443,9 @@ static inline void __page_pool_dma_sync_for_cpu(const struct page_pool *pool,
 						const dma_addr_t dma_addr,
 						u32 offset, u32 dma_sync_size)
 {
+	if (!(READ_ONCE(pool->dma_sync) & PP_DMA_SYNC_CPU))
+		return;
+
 	dma_sync_single_range_for_cpu(pool->p.dev, dma_addr,
 				      offset + pool->p.offset, dma_sync_size,
 				      page_pool_get_dma_dir(pool));
@@ -473,9 +476,6 @@ page_pool_dma_sync_netmem_for_cpu(const struct page_pool *pool,
 				  const netmem_ref netmem, u32 offset,
 				  u32 dma_sync_size)
 {
-	if (!pool->dma_sync_for_cpu)
-		return;
-
 	__page_pool_dma_sync_for_cpu(pool,
 				     page_pool_get_dma_addr_netmem(netmem),
 				     offset, dma_sync_size);
diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index df0d3c1608929605224feb26173135ff37951ef8..fbe34024b20061e8bcd1d4474f6ebfc70992f1eb 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -33,6 +33,10 @@
 #define PP_FLAG_ALL		(PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV | \
 				 PP_FLAG_SYSTEM_POOL | PP_FLAG_ALLOW_UNREADABLE_NETMEM)
 
+/* bit values used in pp->dma_sync */
+#define PP_DMA_SYNC_DEV		BIT(0)
+#define PP_DMA_SYNC_CPU		BIT(1)
+
 /*
  * Fast allocation side cache array/stack
  *
@@ -175,12 +179,12 @@ struct page_pool {
 
 	bool has_init_callback:1;	/* slow::init_callback is set */
 	bool dma_map:1;			/* Perform DMA mapping */
-	bool dma_sync:1;		/* Perform DMA sync for device */
-	bool dma_sync_for_cpu:1;	/* Perform DMA sync for cpu */
 #ifdef CONFIG_PAGE_POOL_STATS
 	bool system:1;			/* This is a global percpu pool */
 #endif
 
+	unsigned long dma_sync;
+
 	__cacheline_group_begin_aligned(frag, PAGE_POOL_FRAG_GROUP_ALIGN);
 	long frag_users;
 	netmem_ref frag_page;
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 6802e82a4d03b6030f6df50ae3661f81e40bc101..955d392d707b12fe784747aa2040ce1a882a64db 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -340,8 +340,7 @@ int mp_dmabuf_devmem_init(struct page_pool *pool)
 	/* dma-buf dma addresses do not need and should not be used with
 	 * dma_sync_for_cpu/device. Force disable dma_sync.
 	 */
-	pool->dma_sync = false;
-	pool->dma_sync_for_cpu = false;
+	pool->dma_sync = 0;
 
 	if (pool->p.order != 0)
 		return -E2BIG;
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index acef1fcd8ddcfd1853a6f2055c1f1820ab248e8d..d51ca4389dd62d8bc266a9a2b792838257173535 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -203,7 +203,7 @@ static int page_pool_init(struct page_pool *pool,
 	memcpy(&pool->slow, &params->slow, sizeof(pool->slow));
 
 	pool->cpuid = cpuid;
-	pool->dma_sync_for_cpu = true;
+	pool->dma_sync = PP_DMA_SYNC_CPU;
 
 	/* Validate only known flags were used */
 	if (pool->slow.flags & ~PP_FLAG_ALL)
@@ -238,7 +238,7 @@ static int page_pool_init(struct page_pool *pool,
 		if (!pool->p.max_len)
 			return -EINVAL;
 
-		pool->dma_sync = true;
+		pool->dma_sync |= PP_DMA_SYNC_DEV;
 
 		/* pool->p.offset has to be set according to the address
 		 * offset used by the DMA engine to start copying rx data
@@ -291,7 +291,7 @@ static int page_pool_init(struct page_pool *pool,
 	}
 
 	if (pool->mp_ops) {
-		if (!pool->dma_map || !pool->dma_sync)
+		if (!pool->dma_map || !(pool->dma_sync & PP_DMA_SYNC_DEV))
 			return -EOPNOTSUPP;
 
 		if (WARN_ON(!is_kernel_rodata((unsigned long)pool->mp_ops))) {
@@ -466,7 +466,8 @@ page_pool_dma_sync_for_device(const struct page_pool *pool,
 			      netmem_ref netmem,
 			      u32 dma_sync_size)
 {
-	if (pool->dma_sync && dma_dev_need_sync(pool->p.dev))
+	if ((READ_ONCE(pool->dma_sync) & PP_DMA_SYNC_DEV) &&
+	    dma_dev_need_sync(pool->p.dev))
 		__page_pool_dma_sync_for_device(pool, netmem, dma_sync_size);
 }