mbox series

[v7,00/17] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8

Message ID 20230612153201.554742-1-catalin.marinas@arm.com (mailing list archive)
Headers show
Series mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 | expand

Message

Catalin Marinas June 12, 2023, 3:31 p.m. UTC
Hi,

That's v7 of the series reducing the kmalloc() minimum alignment on
arm64 to 8 (from 128). There's no new/different functionality, mostly
cosmetic changes and acks/tested-bys.

Andrew, if there are no further comments or objections to this version,
are you ok to take the series through the mm tree? The arm64 changes are
fairly small. Alternatively, I can push it into linux-next now to give
it some wider exposure and decide whether to upstream it when the
merging window opens. Thanks.

The updated patches are also available on this branch:

git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux devel/kmalloc-minalign

Changes since v6
(https://lore.kernel.org/r/20230531154836.1366225-1-catalin.marinas@arm.com):

- Moved ARCH_DMA_MINALIGN to include/linux/cache.h from slab.h (it
  matches the asm/cache.h where most architectures defining this macro
  place it).

- Single body for __kmalloc_minalign() with an #ifdef.

- Renamed the sg_dma_use_swiotlb() to sg_dma_is_swiotlb() (Robin's
  suggestion).

- Added acks/tested-by.

Catalin Marinas (15):
  mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN
  dma: Allow dma_get_cache_alignment() to be overridden by the arch code
  mm/slab: Simplify create_kmalloc_cache() args and make it static
  mm/slab: Limit kmalloc() minimum alignment to
    dma_get_cache_alignment()
  drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/gpu: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/usb: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/spi: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  dm-crypt: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  iio: core: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  arm64: Allow kmalloc() caches aligned to the smaller cache_line_size()
  dma-mapping: Force bouncing if the kmalloc() size is not
    cache-line-aligned
  iommu/dma: Force bouncing if the size is not cacheline-aligned
  mm: slab: Reduce the kmalloc() minimum alignment if DMA bouncing
    possible
  arm64: Enable ARCH_WANT_KMALLOC_DMA_BOUNCE for arm64

Robin Murphy (2):
  scatterlist: Add dedicated config for DMA flags
  dma-mapping: Name SG DMA flag helpers consistently

 arch/arm64/Kconfig             |  1 +
 arch/arm64/include/asm/cache.h |  3 ++
 arch/arm64/mm/init.c           |  7 +++-
 drivers/base/devres.c          |  6 ++--
 drivers/gpu/drm/drm_managed.c  |  6 ++--
 drivers/iommu/Kconfig          |  1 +
 drivers/iommu/dma-iommu.c      | 58 ++++++++++++++++++++++++--------
 drivers/iommu/iommu.c          |  2 +-
 drivers/md/dm-crypt.c          |  2 +-
 drivers/pci/Kconfig            |  1 +
 drivers/spi/spidev.c           |  2 +-
 drivers/usb/core/buffer.c      |  8 ++---
 include/linux/cache.h          |  6 ++++
 include/linux/dma-map-ops.h    | 61 ++++++++++++++++++++++++++++++++++
 include/linux/dma-mapping.h    |  5 ++-
 include/linux/iio/iio.h        |  2 +-
 include/linux/scatterlist.h    | 60 ++++++++++++++++++++++++++-------
 include/linux/slab.h           | 14 +++++---
 kernel/dma/Kconfig             |  7 ++++
 kernel/dma/direct.c            |  2 +-
 kernel/dma/direct.h            |  3 +-
 mm/slab.c                      |  6 +---
 mm/slab.h                      |  5 ++-
 mm/slab_common.c               | 41 +++++++++++++++++------
 24 files changed, 244 insertions(+), 65 deletions(-)

Comments

Amit Pundir July 5, 2023, 1:40 p.m. UTC | #1
Hi Catalin,

On Wed, 5 Jul 2023 at 18:17, Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> Hi,
>
> That's v7 of the series reducing the kmalloc() minimum alignment on
> arm64 to 8 (from 128). There's no new/different functionality, mostly
> cosmetic changes and acks/tested-bys.
>
> Andrew, if there are no further comments or objections to this version,
> are you ok to take the series through the mm tree? The arm64 changes are
> fairly small. Alternatively, I can push it into linux-next now to give
> it some wider exposure and decide whether to upstream it when the
> merging window opens. Thanks.

This patch series broke Dragonboard 845c (SDM845) running AOSP.
With this series I run into random oops at __kmem_cache_alloc_node().
Here is one such boot log https://bugs.linaro.org/attachment.cgi?id=1146

Reverting "arm64: Enable ARCH_WANT_KMALLOC_DMA_BOUNCE for arm64"
fixes it for the time being.

Regards,
Amit Pundir


>
> The updated patches are also available on this branch:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux devel/kmalloc-minalign
>
> Changes since v6
> (https://lore.kernel.org/r/20230531154836.1366225-1-catalin.marinas@arm.com):
>
> - Moved ARCH_DMA_MINALIGN to include/linux/cache.h from slab.h (it
>   matches the asm/cache.h where most architectures defining this macro
>   place it).
>
> - Single body for __kmalloc_minalign() with an #ifdef.
>
> - Renamed the sg_dma_use_swiotlb() to sg_dma_is_swiotlb() (Robin's
>   suggestion).
>
> - Added acks/tested-by.
>
> Catalin Marinas (15):
>   mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN
>   dma: Allow dma_get_cache_alignment() to be overridden by the arch code
>   mm/slab: Simplify create_kmalloc_cache() args and make it static
>   mm/slab: Limit kmalloc() minimum alignment to
>     dma_get_cache_alignment()
>   drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
>   drivers/gpu: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
>   drivers/usb: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
>   drivers/spi: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
>   dm-crypt: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
>   iio: core: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
>   arm64: Allow kmalloc() caches aligned to the smaller cache_line_size()
>   dma-mapping: Force bouncing if the kmalloc() size is not
>     cache-line-aligned
>   iommu/dma: Force bouncing if the size is not cacheline-aligned
>   mm: slab: Reduce the kmalloc() minimum alignment if DMA bouncing
>     possible
>   arm64: Enable ARCH_WANT_KMALLOC_DMA_BOUNCE for arm64
>
> Robin Murphy (2):
>   scatterlist: Add dedicated config for DMA flags
>   dma-mapping: Name SG DMA flag helpers consistently
>
>  arch/arm64/Kconfig             |  1 +
>  arch/arm64/include/asm/cache.h |  3 ++
>  arch/arm64/mm/init.c           |  7 +++-
>  drivers/base/devres.c          |  6 ++--
>  drivers/gpu/drm/drm_managed.c  |  6 ++--
>  drivers/iommu/Kconfig          |  1 +
>  drivers/iommu/dma-iommu.c      | 58 ++++++++++++++++++++++++--------
>  drivers/iommu/iommu.c          |  2 +-
>  drivers/md/dm-crypt.c          |  2 +-
>  drivers/pci/Kconfig            |  1 +
>  drivers/spi/spidev.c           |  2 +-
>  drivers/usb/core/buffer.c      |  8 ++---
>  include/linux/cache.h          |  6 ++++
>  include/linux/dma-map-ops.h    | 61 ++++++++++++++++++++++++++++++++++
>  include/linux/dma-mapping.h    |  5 ++-
>  include/linux/iio/iio.h        |  2 +-
>  include/linux/scatterlist.h    | 60 ++++++++++++++++++++++++++-------
>  include/linux/slab.h           | 14 +++++---
>  kernel/dma/Kconfig             |  7 ++++
>  kernel/dma/direct.c            |  2 +-
>  kernel/dma/direct.h            |  3 +-
>  mm/slab.c                      |  6 +---
>  mm/slab.h                      |  5 ++-
>  mm/slab_common.c               | 41 +++++++++++++++++------
>  24 files changed, 244 insertions(+), 65 deletions(-)
>
Isaac Manjarres July 7, 2023, 12:41 a.m. UTC | #2
On Wed, Jul 05, 2023 at 07:10:02PM +0530, Amit Pundir wrote:
> Hi Catalin,
> 
> On Wed, 5 Jul 2023 at 18:17, Catalin Marinas <catalin.marinas@arm.com> wrote:
> >
> > Hi,
> >
> > That's v7 of the series reducing the kmalloc() minimum alignment on
> > arm64 to 8 (from 128). There's no new/different functionality, mostly
> > cosmetic changes and acks/tested-bys.
> >
> > Andrew, if there are no further comments or objections to this version,
> > are you ok to take the series through the mm tree? The arm64 changes are
> > fairly small. Alternatively, I can push it into linux-next now to give
> > it some wider exposure and decide whether to upstream it when the
> > merging window opens. Thanks.
> 
> This patch series broke Dragonboard 845c (SDM845) running AOSP.
> With this series I run into random oops at __kmem_cache_alloc_node().
> Here is one such boot log https://bugs.linaro.org/attachment.cgi?id=1146

Hey Amit,

From the log that you linked, this looks like there's corruption within
the kmalloc slab caches. Can you please try enabling slub debug? That
might give us more information about who is causing the corruption
you're seeing here.

To enable slub debug, you can add the following to your kernel config:

CONFIG_SLUB_DEBUG=y

and CONFIG_SLUB_DEBUG_ON=y or add slub_debug to the kernel command line.

--Isaac
Amit Pundir July 8, 2023, 1:02 p.m. UTC | #3
On Fri, 7 Jul 2023 at 06:12, Isaac Manjarres <isaacmanjarres@google.com> wrote:
>
> On Wed, Jul 05, 2023 at 07:10:02PM +0530, Amit Pundir wrote:
> > Hi Catalin,
> >
> > On Wed, 5 Jul 2023 at 18:17, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > >
> > > Hi,
> > >
> > > That's v7 of the series reducing the kmalloc() minimum alignment on
> > > arm64 to 8 (from 128). There's no new/different functionality, mostly
> > > cosmetic changes and acks/tested-bys.
> > >
> > > Andrew, if there are no further comments or objections to this version,
> > > are you ok to take the series through the mm tree? The arm64 changes are
> > > fairly small. Alternatively, I can push it into linux-next now to give
> > > it some wider exposure and decide whether to upstream it when the
> > > merging window opens. Thanks.
> >
> > This patch series broke Dragonboard 845c (SDM845) running AOSP.
> > With this series I run into random oops at __kmem_cache_alloc_node().
> > Here is one such boot log https://bugs.linaro.org/attachment.cgi?id=1146
>
> Hey Amit,
>
> From the log that you linked, this looks like there's corruption within
> the kmalloc slab caches. Can you please try enabling slub debug? That
> might give us more information about who is causing the corruption
> you're seeing here.
>
> To enable slub debug, you can add the following to your kernel config:
>
> CONFIG_SLUB_DEBUG=y
>
> and CONFIG_SLUB_DEBUG_ON=y or add slub_debug to the kernel command line.

Hi Issac, I can't reproduce this crash if I have slub_debug enabled
https://bugs.linaro.org/attachment.cgi?id=1149.

But I can still reproduce the crash, with the same build, if I just
remove the slub_debug bootarg
https://bugs.linaro.org/attachment.cgi?id=1150

Is there any specific (slub_debug=) option(s) you want me to try?
May be I can reproduce this bug just with those options.

Regards,
Amit Pundir

>
> --Isaac
Catalin Marinas July 9, 2023, 3:27 a.m. UTC | #4
Hi Amit,

On Wed, Jul 05, 2023 at 07:10:02PM +0530, Amit Pundir wrote:
> On Wed, 5 Jul 2023 at 18:17, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > That's v7 of the series reducing the kmalloc() minimum alignment on
> > arm64 to 8 (from 128). There's no new/different functionality, mostly
> > cosmetic changes and acks/tested-bys.
> >
> > Andrew, if there are no further comments or objections to this version,
> > are you ok to take the series through the mm tree? The arm64 changes are
> > fairly small. Alternatively, I can push it into linux-next now to give
> > it some wider exposure and decide whether to upstream it when the
> > merging window opens. Thanks.
> 
> This patch series broke Dragonboard 845c (SDM845) running AOSP.
> With this series I run into random oops at __kmem_cache_alloc_node().
> Here is one such boot log https://bugs.linaro.org/attachment.cgi?id=1146
> 
> Reverting "arm64: Enable ARCH_WANT_KMALLOC_DMA_BOUNCE for arm64"
> fixes it for the time being.

I'm on holiday until the 24th, away from my computer and I don't have
much time to dig into this. Are the DMA-capable devices on this platform
non-coherent? We'd need to check whether the bouncing logic somehow
overflows and corrupts adjacent memory. If the devices are coherent,
maybe the DMA spills over the kmalloc() boundaries, now that we
allocate buffers smaller than a cacheline even for DMA (no bouncing if
the DMA is coherent). Also, are these devices behind an IOMMU?

Yet another possibility is a not fully transparent system level cache
with a line size larger than what's reported by the CTR_EL0.CWG. If you
boot the platform with "swiotlb=noforce", do you still see issues? Or
there's no swiotlb buffer allocated anyway.

In addition to what Isaac suggests on slab debugging, it may be worth
enabling DMA debugging as well (I don't remember the config name).
Isaac Manjarres July 11, 2023, 7:44 p.m. UTC | #5
On Sat, Jul 8, 2023 at 6:03 AM Amit Pundir <amit.pundir@linaro.org> wrote:
> Hi Issac, I can't reproduce this crash if I have slub_debug enabled
> https://bugs.linaro.org/attachment.cgi?id=1149.
>
> But I can still reproduce the crash, with the same build, if I just
> remove the slub_debug bootarg
> https://bugs.linaro.org/attachment.cgi?id=1150
>

Hey Amit,

Thanks for trying this out. John Stultz and I tried this out as well, and we
couldn't reproduce the issue with slub_debug enabled either.

However, we were able to reproduce the issue you reported along with
KASAN enabled, which also helped us catch an out-of-bounds access
in the regmap-irq code. Once we fixed it, the memory corruption you
reported was also fixed. I've sent the patch [1] for review. Can you please
test it out and see if it works for you too?

[1]: https://lore.kernel.org/all/20230711193059.2480971-1-isaacmanjarres@google.com/

Thanks,
Isaac
Amit Pundir July 12, 2023, 4:57 a.m. UTC | #6
On Wed, 12 Jul 2023 at 01:14, Isaac Manjarres <isaacmanjarres@google.com> wrote:
>
> On Sat, Jul 8, 2023 at 6:03 AM Amit Pundir <amit.pundir@linaro.org> wrote:
> > Hi Issac, I can't reproduce this crash if I have slub_debug enabled
> > https://bugs.linaro.org/attachment.cgi?id=1149.
> >
> > But I can still reproduce the crash, with the same build, if I just
> > remove the slub_debug bootarg
> > https://bugs.linaro.org/attachment.cgi?id=1150
> >
>
> Hey Amit,
>
> Thanks for trying this out. John Stultz and I tried this out as well, and we
> couldn't reproduce the issue with slub_debug enabled either.
>
> However, we were able to reproduce the issue you reported along with
> KASAN enabled, which also helped us catch an out-of-bounds access
> in the regmap-irq code. Once we fixed it, the memory corruption you
> reported was also fixed. I've sent the patch [1] for review. Can you please
> test it out and see if it works for you too?

It does fix the v6.5-rc1 boot regression I reported above on
Dragonboard 845c running AOSP. Thank you.

Regards,
Amit Pundir

>
> [1]: https://lore.kernel.org/all/20230711193059.2480971-1-isaacmanjarres@google.com/
>
> Thanks,
> Isaac