Message ID | 20230524171904.3967031-14-catalin.marinas@arm.com (mailing list archive)
---|---
State | New
Series | mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8
On 24/05/2023 6:19 pm, Catalin Marinas wrote:
> Similarly to the direct DMA, bounce small allocations as they may have
> originated from a kmalloc() cache not safe for DMA. Unlike the direct
> DMA, iommu_dma_map_sg() cannot call iommu_dma_map_sg_swiotlb() for all
> non-coherent devices as this would break some cases where the iova is
> expected to be contiguous (dmabuf). Instead, scan the scatterlist for
> any small sizes and only go the swiotlb path if any element of the list
> needs bouncing (note that iommu_dma_map_page() would still only bounce
> those buffers which are not DMA-aligned).
>
> To avoid scanning the scatterlist on the 'sync' operations, introduce an
> SG_DMA_USE_SWIOTLB flag set by iommu_dma_map_sg_swiotlb(). The
> dev_use_swiotlb() function together with the newly added
> dev_use_sg_swiotlb() now check for both untrusted devices and unaligned
> kmalloc() buffers (suggested by Robin Murphy).
>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Robin Murphy <robin.murphy@arm.com>
> ---
[...]
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index 87aaf8b5cdb4..330a157c5501 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -248,6 +248,29 @@ static inline void sg_unmark_end(struct scatterlist *sg)
>  	sg->page_link &= ~SG_END;
>  }
>
> +#define SG_DMA_BUS_ADDRESS	(1 << 0)
> +#define SG_DMA_USE_SWIOTLB	(1 << 1)
> +
> +#ifdef CONFIG_SWIOTLB
> +static inline bool sg_is_dma_use_swiotlb(struct scatterlist *sg)

Nit: can we decide whether this API is named sg_<verb>_<flag name> or
sg_dma_<verb>_<flag suffix>? I'm leaning towards the latter, which would
be consistent with the existing kerneldoc if not all the code (which I
shall probably now send a patch to fix).

However, my internal grammar parser just cannot cope with the "is use"
double-verb construct, so I would be inclined to collapse this
particular one to "sg_dma_use_swiotlb" either way. Yes, it breaks the
pattern, but that's what the English language does best :)

Otherwise,

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

I don't seem to have got round to giving this a spin on my Juno today,
but will try to do so soon.

Cheers,
Robin.

> +{
> +	return sg->dma_flags & SG_DMA_USE_SWIOTLB;
> +}
> +
> +static inline void sg_dma_mark_use_swiotlb(struct scatterlist *sg)
> +{
> +	sg->dma_flags |= SG_DMA_USE_SWIOTLB;
> +}
> +#else
> +static inline bool sg_is_dma_use_swiotlb(struct scatterlist *sg)
> +{
> +	return false;
> +}
> +static inline void sg_dma_mark_use_swiotlb(struct scatterlist *sg)
> +{
> +}
> +#endif
> +
>  /*
>   * CONFIG_PCI_P2PDMA depends on CONFIG_64BIT which means there is 4 bytes
>   * in struct scatterlist (assuming also CONFIG_NEED_SG_DMA_LENGTH is set).
> @@ -256,8 +279,6 @@ static inline void sg_unmark_end(struct scatterlist *sg)
>   */
>  #ifdef CONFIG_PCI_P2PDMA
>
> -#define SG_DMA_BUS_ADDRESS	(1 << 0)
> -
>  /**
>   * sg_dma_is_bus address - Return whether a given segment was marked
>   * as a bus address
On Wed, May 24, 2023 at 06:19:02PM +0100, Catalin Marinas wrote:
> Similarly to the direct DMA, bounce small allocations as they may have
> originated from a kmalloc() cache not safe for DMA.
[...]
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index 87aaf8b5cdb4..330a157c5501 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -248,6 +248,29 @@ static inline void sg_unmark_end(struct scatterlist *sg)
>  	sg->page_link &= ~SG_END;
>  }
>
> +#define SG_DMA_BUS_ADDRESS	(1 << 0)
> +#define SG_DMA_USE_SWIOTLB	(1 << 1)
> +
> +#ifdef CONFIG_SWIOTLB

s/CONFIG_SWIOTLB/CONFIG_NEED_SG_DMA_FLAGS ?
Otherwise, there's a compiler error if SWIOTLB=y but IOMMU=n

Thanks

[...]
On Sat, May 27, 2023 at 12:36:30AM +0800, Jisheng Zhang wrote:
> On Wed, May 24, 2023 at 06:19:02PM +0100, Catalin Marinas wrote:
> > diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> > index 87aaf8b5cdb4..330a157c5501 100644
> > --- a/include/linux/scatterlist.h
> > +++ b/include/linux/scatterlist.h
> > @@ -248,6 +248,29 @@ static inline void sg_unmark_end(struct scatterlist *sg)
> >  	sg->page_link &= ~SG_END;
> >  }
> >
> > +#define SG_DMA_BUS_ADDRESS	(1 << 0)
> > +#define SG_DMA_USE_SWIOTLB	(1 << 1)
> > +
> > +#ifdef CONFIG_SWIOTLB
>
> s/CONFIG_SWIOTLB/CONFIG_NEED_SG_DMA_FLAGS ?
> Otherwise, there's compiler error if SWIOTLB=y but IOMMU=n

Yes, I pushed a fixup to the kmalloc-minalign branch.

> > +static inline bool sg_is_dma_use_swiotlb(struct scatterlist *sg)
> > +{
> > +	return sg->dma_flags & SG_DMA_USE_SWIOTLB;
> > +}
> > +
> > +static inline void sg_dma_mark_use_swiotlb(struct scatterlist *sg)
> > +{
> > +	sg->dma_flags |= SG_DMA_USE_SWIOTLB;
> > +}
> > +#else
> > +static inline bool sg_is_dma_use_swiotlb(struct scatterlist *sg)
> > +{
> > +	return false;
> > +}
> > +static inline void sg_dma_mark_use_swiotlb(struct scatterlist *sg)
> > +{
> > +}
> > +#endif

And I wonder whether we should keep these accessors in the scatterlist.h
file. They are only used by the dma-iommu.c code.
On 26/05/2023 8:22 pm, Catalin Marinas wrote:
> On Sat, May 27, 2023 at 12:36:30AM +0800, Jisheng Zhang wrote:
>> On Wed, May 24, 2023 at 06:19:02PM +0100, Catalin Marinas wrote:
>>> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
>>> index 87aaf8b5cdb4..330a157c5501 100644
>>> --- a/include/linux/scatterlist.h
>>> +++ b/include/linux/scatterlist.h
>>> @@ -248,6 +248,29 @@ static inline void sg_unmark_end(struct scatterlist *sg)
>>>  	sg->page_link &= ~SG_END;
>>>  }
>>>
>>> +#define SG_DMA_BUS_ADDRESS	(1 << 0)
>>> +#define SG_DMA_USE_SWIOTLB	(1 << 1)
>>> +
>>> +#ifdef CONFIG_SWIOTLB
>>
>> s/CONFIG_SWIOTLB/CONFIG_NEED_SG_DMA_FLAGS ?
>> Otherwise, there's compiler error if SWIOTLB=y but IOMMU=n
>
> Yes, I pushed a fixup to the kmalloc-minalign branch.

I'd agree that CONFIG_NEED_SG_DMA_FLAGS is the better option.

>>> +static inline bool sg_is_dma_use_swiotlb(struct scatterlist *sg)
>>> +{
>>> +	return sg->dma_flags & SG_DMA_USE_SWIOTLB;
>>> +}
>>> +
>>> +static inline void sg_dma_mark_use_swiotlb(struct scatterlist *sg)
>>> +{
>>> +	sg->dma_flags |= SG_DMA_USE_SWIOTLB;
>>> +}
>>> +#else
>>> +static inline bool sg_is_dma_use_swiotlb(struct scatterlist *sg)
>>> +{
>>> +	return false;
>>> +}
>>> +static inline void sg_dma_mark_use_swiotlb(struct scatterlist *sg)
>>> +{
>>> +}
>>> +#endif
>
> And I wonder whether we should keep these accessors in the scatterlist.h
> file. They are only used by the dma-iommu.c code.

Nah, logically they belong with the definition of the flag itself, which
certainly should be here. Also once this does land, dma-direct and
possibly others will be able to take advantage of it too (as a small win
over repeating the is_swiotlb_buffer() check a lot).

Thanks,
Robin.
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index db98c3f86e8c..670eff7a8e11 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -152,6 +152,7 @@ config IOMMU_DMA
 	select IOMMU_IOVA
 	select IRQ_MSI_IOMMU
 	select NEED_SG_DMA_LENGTH
+	select NEED_SG_DMA_FLAGS if SWIOTLB
 
 # Shared Virtual Addressing
 config IOMMU_SVA
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 7a9f0b0bddbd..24a8b8c2368c 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -520,9 +520,38 @@ static bool dev_is_untrusted(struct device *dev)
 	return dev_is_pci(dev) && to_pci_dev(dev)->untrusted;
 }
 
-static bool dev_use_swiotlb(struct device *dev)
+static bool dev_use_swiotlb(struct device *dev, size_t size,
+			    enum dma_data_direction dir)
 {
-	return IS_ENABLED(CONFIG_SWIOTLB) && dev_is_untrusted(dev);
+	return IS_ENABLED(CONFIG_SWIOTLB) &&
+		(dev_is_untrusted(dev) ||
+		 dma_kmalloc_needs_bounce(dev, size, dir));
+}
+
+static bool dev_use_sg_swiotlb(struct device *dev, struct scatterlist *sg,
+			       int nents, enum dma_data_direction dir)
+{
+	struct scatterlist *s;
+	int i;
+
+	if (!IS_ENABLED(CONFIG_SWIOTLB))
+		return false;
+
+	if (dev_is_untrusted(dev))
+		return true;
+
+	/*
+	 * If kmalloc() buffers are not DMA-safe for this device and
+	 * direction, check the individual lengths in the sg list. If any
+	 * element is deemed unsafe, use the swiotlb for bouncing.
+	 */
+	if (!dma_kmalloc_safe(dev, dir)) {
+		for_each_sg(sg, s, nents, i)
+			if (!dma_kmalloc_size_aligned(s->length))
+				return true;
+	}
+
+	return false;
 }
 
 /**
@@ -922,7 +951,7 @@ static void iommu_dma_sync_single_for_cpu(struct device *dev,
 {
 	phys_addr_t phys;
 
-	if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev))
+	if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir))
 		return;
 
 	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
@@ -938,7 +967,7 @@ static void iommu_dma_sync_single_for_device(struct device *dev,
 {
 	phys_addr_t phys;
 
-	if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev))
+	if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir))
 		return;
 
 	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
@@ -956,7 +985,7 @@ static void iommu_dma_sync_sg_for_cpu(struct device *dev,
 	struct scatterlist *sg;
 	int i;
 
-	if (dev_use_swiotlb(dev))
+	if (sg_is_dma_use_swiotlb(sgl))
 		for_each_sg(sgl, sg, nelems, i)
 			iommu_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
 						      sg->length, dir);
@@ -972,7 +1001,7 @@ static void iommu_dma_sync_sg_for_device(struct device *dev,
 	struct scatterlist *sg;
 	int i;
 
-	if (dev_use_swiotlb(dev))
+	if (sg_is_dma_use_swiotlb(sgl))
 		for_each_sg(sgl, sg, nelems, i)
 			iommu_dma_sync_single_for_device(dev,
 							 sg_dma_address(sg),
@@ -998,7 +1027,8 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 	 * If both the physical buffer start address and size are
 	 * page aligned, we don't need to use a bounce page.
 	 */
-	if (dev_use_swiotlb(dev) && iova_offset(iovad, phys | size)) {
+	if (dev_use_swiotlb(dev, size, dir) &&
+	    iova_offset(iovad, phys | size)) {
 		void *padding_start;
 		size_t padding_size, aligned_size;
 
@@ -1166,6 +1196,8 @@ static int iommu_dma_map_sg_swiotlb(struct device *dev, struct scatterlist *sg,
 	struct scatterlist *s;
 	int i;
 
+	sg_dma_mark_use_swiotlb(sg);
+
 	for_each_sg(sg, s, nents, i) {
 		sg_dma_address(s) = iommu_dma_map_page(dev, sg_page(s),
 				s->offset, s->length, dir, attrs);
@@ -1210,7 +1242,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 		goto out;
 	}
 
-	if (dev_use_swiotlb(dev))
+	if (dev_use_sg_swiotlb(dev, sg, nents, dir))
 		return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);
 
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
@@ -1315,7 +1347,7 @@ static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
 	struct scatterlist *tmp;
 	int i;
 
-	if (dev_use_swiotlb(dev)) {
+	if (sg_is_dma_use_swiotlb(sg)) {
 		iommu_dma_unmap_sg_swiotlb(dev, sg, nents, dir, attrs);
 		return;
 	}
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 87aaf8b5cdb4..330a157c5501 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -248,6 +248,29 @@ static inline void sg_unmark_end(struct scatterlist *sg)
 	sg->page_link &= ~SG_END;
 }
 
+#define SG_DMA_BUS_ADDRESS	(1 << 0)
+#define SG_DMA_USE_SWIOTLB	(1 << 1)
+
+#ifdef CONFIG_SWIOTLB
+static inline bool sg_is_dma_use_swiotlb(struct scatterlist *sg)
+{
+	return sg->dma_flags & SG_DMA_USE_SWIOTLB;
+}
+
+static inline void sg_dma_mark_use_swiotlb(struct scatterlist *sg)
+{
+	sg->dma_flags |= SG_DMA_USE_SWIOTLB;
+}
+#else
+static inline bool sg_is_dma_use_swiotlb(struct scatterlist *sg)
+{
+	return false;
+}
+static inline void sg_dma_mark_use_swiotlb(struct scatterlist *sg)
+{
+}
+#endif
+
 /*
  * CONFIG_PCI_P2PDMA depends on CONFIG_64BIT which means there is 4 bytes
  * in struct scatterlist (assuming also CONFIG_NEED_SG_DMA_LENGTH is set).
@@ -256,8 +279,6 @@ static inline void sg_unmark_end(struct scatterlist *sg)
  */
 #ifdef CONFIG_PCI_P2PDMA
 
-#define SG_DMA_BUS_ADDRESS	(1 << 0)
-
 /**
  * sg_dma_is_bus address - Return whether a given segment was marked
  * as a bus address
Similarly to the direct DMA, bounce small allocations as they may have
originated from a kmalloc() cache not safe for DMA. Unlike the direct
DMA, iommu_dma_map_sg() cannot call iommu_dma_map_sg_swiotlb() for all
non-coherent devices as this would break some cases where the iova is
expected to be contiguous (dmabuf). Instead, scan the scatterlist for
any small sizes and only go the swiotlb path if any element of the list
needs bouncing (note that iommu_dma_map_page() would still only bounce
those buffers which are not DMA-aligned).

To avoid scanning the scatterlist on the 'sync' operations, introduce an
SG_DMA_USE_SWIOTLB flag set by iommu_dma_map_sg_swiotlb(). The
dev_use_swiotlb() function together with the newly added
dev_use_sg_swiotlb() now check for both untrusted devices and unaligned
kmalloc() buffers (suggested by Robin Murphy).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/Kconfig       |  1 +
 drivers/iommu/dma-iommu.c   | 50 ++++++++++++++++++++++++++++++-------
 include/linux/scatterlist.h | 25 +++++++++++++++++--
 3 files changed, 65 insertions(+), 11 deletions(-)