Message ID | 20221106220143.2129263-14-catalin.marinas@arm.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 | expand |
On 2022-11-06 22:01, Catalin Marinas wrote: > With all the infrastructure in place for bouncing small kmalloc() > buffers, add the corresponding Kconfig entry and select it for arm64. AFAICS we're missing the crucial part to ensure that SWIOTLB is available even when max_pfn <= arm64_dma_phys_limit, which is very likely to be true on low-memory systems that care most about kmalloc wastage. The only way to override that currently is with "swiotlb=force", but bouncing *everything* is not desirable either. Thanks, Robin. > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> > Cc: Will Deacon <will@kernel.org> > Cc: Christoph Hellwig <hch@lst.de> > Cc: Robin Murphy <robin.murphy@arm.com> > --- > arch/arm64/Kconfig | 1 + > kernel/dma/Kconfig | 8 ++++++++ > 2 files changed, 9 insertions(+) > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > index 3991cb7b8a33..f889cf16e6ab 100644 > --- a/arch/arm64/Kconfig > +++ b/arch/arm64/Kconfig > @@ -100,6 +100,7 @@ config ARM64 > select ARCH_WANT_FRAME_POINTERS > select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36) > select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP > + select ARCH_WANT_KMALLOC_DMA_BOUNCE > select ARCH_WANT_LD_ORPHAN_WARN > select ARCH_WANTS_NO_INSTR > select ARCH_WANTS_THP_SWAP if ARM64_4K_PAGES > diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig > index d6fab8e3cbae..b56e76371023 100644 > --- a/kernel/dma/Kconfig > +++ b/kernel/dma/Kconfig > @@ -86,6 +86,14 @@ config SWIOTLB > bool > select NEED_DMA_MAP_STATE > > +config ARCH_WANT_KMALLOC_DMA_BOUNCE > + bool > + > +config DMA_BOUNCE_UNALIGNED_KMALLOC > + def_bool y > + depends on ARCH_WANT_KMALLOC_DMA_BOUNCE > + depends on SWIOTLB && !SLOB > + > config DMA_RESTRICTED_POOL > bool "DMA Restricted Pool" > depends on OF && OF_RESERVED_MEM && SWIOTLB
On Mon, Nov 07, 2022 at 01:03:31PM +0000, Robin Murphy wrote: > On 2022-11-06 22:01, Catalin Marinas wrote: >> With all the infrastructure in place for bouncing small kmalloc() >> buffers, add the corresponding Kconfig entry and select it for arm64. > > AFAICS we're missing the crucial part to ensure that SWIOTLB is available > even when max_pfn <= arm64_dma_phys_limit, which is very likely to be true > on low-memory systems that care most about kmalloc wastage. The only way to > override that currently is with "swiotlb=force", but bouncing *everything* > is not desirable either. FYI, one of the reasons for the swiotlb_init refactor that passes flags and a boolean a while ago is that we can trivially just either pass another flag or check a condition in swiotlb_init to allocate the buffer. There's actually another case for which we need the unconditional allocation, and that is the bouncing for untrusted external devices with dma-iommu.
On 2022-11-07 14:38, Christoph Hellwig wrote: > On Mon, Nov 07, 2022 at 01:03:31PM +0000, Robin Murphy wrote: >> On 2022-11-06 22:01, Catalin Marinas wrote: >>> With all the infrastructure in place for bouncing small kmalloc() >>> buffers, add the corresponding Kconfig entry and select it for arm64. >> >> AFAICS we're missing the crucial part to ensure that SWIOTLB is available >> even when max_pfn <= arm64_dma_phys_limit, which is very likely to be true >> on low-memory systems that care most about kmalloc wastage. The only way to >> override that currently is with "swiotlb=force", but bouncing *everything* >> is not desirable either. > > FYI, one of the reasons for the swiotlb_init refactor that passes > flags and a boolean a while ago is that we can trivially just either > pass another flag or check a condition in swiotlb_init to allocate the > buffer. There's actually another case for which we need the > unconditional allocation, and that is the bouncing for untrusted > external devices with dma-iommu. Right, I guess machines with Thunderbolt and all the firmware annotations but less than 4GB of RAM are unlikely to exist in the wild, so the untrusted bouncing logic has been getting lucky so far. There are however plenty of arm64 systems with small amounts of RAM and non-coherent USB so in this case someone's likely to fall over it pretty much right away. I know it's easy to add a new condition, but it still has to actually *be* added. Thanks, Robin.
On Mon, Nov 07, 2022 at 01:03:31PM +0000, Robin Murphy wrote: > On 2022-11-06 22:01, Catalin Marinas wrote: > > With all the infrastructure in place for bouncing small kmalloc() > > buffers, add the corresponding Kconfig entry and select it for arm64. > > AFAICS we're missing the crucial part to ensure that SWIOTLB is available > even when max_pfn <= arm64_dma_phys_limit, which is very likely to be true > on low-memory systems that care most about kmalloc wastage. This was a deliberate decision for this version. Patch 4 mitigates it a bit by raising the kmalloc() minimum alignment to the cache line size (typically 64). It's still an improvement from the current 128-byte alignment. Since it's hard to guess the optimal swiotlb buffer for such platforms, I think a follow-up step would be to use the DMA coherent pool for bouncing if no swiotlb buffer is available. At least the pool can grow dynamically. Yet another option would be to increase the swiotlb buffer at run-time but it has an overhead for is_swiotlb_buffer().
On Tue, Nov 08, 2022 at 09:52:15AM +0000, Catalin Marinas wrote: > Since it's hard to guess the optimal swiotlb buffer for such platforms, > I think a follow-up step would be to use the DMA coherent pool for > bouncing if no swiotlb buffer is available. At least the pool can grow > dynamically. Yet another option would be to increase the swiotlb buffer > at run-time but it has an overhead for is_swiotlb_buffer(). Alex said he wanted to look into growing the swiotlb buffer on demand for other reason, so adding him to Cc to check if there has been any progress on that.
On Tue, Nov 08, 2022 at 11:03:31AM +0100, Christoph Hellwig wrote: > On Tue, Nov 08, 2022 at 09:52:15AM +0000, Catalin Marinas wrote: > > Since it's hard to guess the optimal swiotlb buffer for such platforms, > > I think a follow-up step would be to use the DMA coherent pool for > > bouncing if no swiotlb buffer is available. At least the pool can grow > > dynamically. Yet another option would be to increase the swiotlb buffer > > at run-time but it has an overhead for is_swiotlb_buffer(). > > Alex said he wanted to look into growing the swiotlb buffer on demand > for other reason, so adding him to Cc to check if there has been any > progress on that. Hi Alex, Did you get a chance to look into this? If so, have you been able to make progress on being able to grow the SWIOTLB buffer on demand? Thanks, Isaac
Hi Isaac, On 30.11.22 19:48, Isaac Manjarres wrote: > On Tue, Nov 08, 2022 at 11:03:31AM +0100, Christoph Hellwig wrote: >> On Tue, Nov 08, 2022 at 09:52:15AM +0000, Catalin Marinas wrote: >>> Since it's hard to guess the optimal swiotlb buffer for such platforms, >>> I think a follow-up step would be to use the DMA coherent pool for >>> bouncing if no swiotlb buffer is available. At least the pool can grow >>> dynamically. Yet another option would be to increase the swiotlb buffer >>> at run-time but it has an overhead for is_swiotlb_buffer(). >> Alex said he wanted to look into growing the swiotlb buffer on demand >> for other reason, so adding him to Cc to check if there has been any >> progress on that. > Hi Alex, > > Did you get a chance to look into this? If so, have you been able to > make progress on being able to grow the SWIOTLB buffer on demand? I've been slightly under water and haven't been able to look at this yet :). It's on my list, but will probably be a while until I get to it. Would you be interested in having a first try? Thanks, Alex
Hi Alex! Nice to meet you again... On Thu, 1 Dec 2022 00:32:07 +0100 Alexander Graf <agraf@csgraf.de> wrote: > Hi Isaac, > > On 30.11.22 19:48, Isaac Manjarres wrote: > > On Tue, Nov 08, 2022 at 11:03:31AM +0100, Christoph Hellwig wrote: > >> On Tue, Nov 08, 2022 at 09:52:15AM +0000, Catalin Marinas wrote: > >>> Since it's hard to guess the optimal swiotlb buffer for such platforms, > >>> I think a follow-up step would be to use the DMA coherent pool for > >>> bouncing if no swiotlb buffer is available. At least the pool can grow > >>> dynamically. Yet another option would be to increase the swiotlb buffer > >>> at run-time but it has an overhead for is_swiotlb_buffer(). > >> Alex said he wanted to look into growing the swiotlb buffer on demand > >> for other reason, so adding him to Cc to check if there has been any > >> progress on that. > > Hi Alex, > > > > Did you get a chance to look into this? If so, have you been able to > > make progress on being able to grow the SWIOTLB buffer on demand? > > > I've been slightly under water and haven't been able to look at this yet > :). It's on my list, but will probably be a while until I get to it. > Would you be interested in having a first try? All right, I have just found this thread now after having sent my own patch series to make SWIOTLB dynamic. I hope you don't mind. I didn't want to "steal" the project from you. Petr T
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 3991cb7b8a33..f889cf16e6ab 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -100,6 +100,7 @@ config ARM64 select ARCH_WANT_FRAME_POINTERS select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36) select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP + select ARCH_WANT_KMALLOC_DMA_BOUNCE select ARCH_WANT_LD_ORPHAN_WARN select ARCH_WANTS_NO_INSTR select ARCH_WANTS_THP_SWAP if ARM64_4K_PAGES diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig index d6fab8e3cbae..b56e76371023 100644 --- a/kernel/dma/Kconfig +++ b/kernel/dma/Kconfig @@ -86,6 +86,14 @@ config SWIOTLB bool select NEED_DMA_MAP_STATE +config ARCH_WANT_KMALLOC_DMA_BOUNCE + bool + +config DMA_BOUNCE_UNALIGNED_KMALLOC + def_bool y + depends on ARCH_WANT_KMALLOC_DMA_BOUNCE + depends on SWIOTLB && !SLOB + config DMA_RESTRICTED_POOL bool "DMA Restricted Pool" depends on OF && OF_RESERVED_MEM && SWIOTLB
With all the infrastructure in place for bouncing small kmalloc() buffers, add the corresponding Kconfig entry and select it for arm64. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Robin Murphy <robin.murphy@arm.com> --- arch/arm64/Kconfig | 1 + kernel/dma/Kconfig | 8 ++++++++ 2 files changed, 9 insertions(+)