mbox series

[0/9] Input: Fix insufficent DMA alignment.

Message ID 20221127144116.1418083-1-jic23@kernel.org (mailing list archive)
Headers show
Series Input: Fix insufficent DMA alignment. | expand

Message

Jonathan Cameron Nov. 27, 2022, 2:41 p.m. UTC
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

This problem was discovered in IIO as a side effect of the discussions about
relaxing kmalloc alignment on arm64 and resulted in a series of large
patch sets.

https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@kernel.org/

Unsurprisingly there are cases of it in other subsystems.

The short version of this is that there are a few known arm64 chips where
___cacheline_aligned enforces 64 byte alignment which is what we typically
want for performance optimization as the size of the L1 cache lines.
However, further out in the cache hierarchy we have caches with 128 byte
lines.  Those are the ones that matter for DMA safety.
So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which
in this case is 128 bytes.

There is one other use of ____cacheline_aligned in input:
joystick/iforce/iforce-usb.c

Whilst suspicious I'm not sure enough of the requirements of USB to
know if they are there for DMA safety or some other constraint.

Jonathan

Cc: Daniel Mack <daniel@zonque.org>
Cc: Michael Hennerich <michael.hennerich@analog.com>
Cc: Tomohiro Yoshidomi <sylph23k@gmail.com>
Cc: Javier Martinez Canillas <javier@dowhile0.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: Lauri Kasanen <cand@gmx.com>
Cc: Daniel Hung-yu Wu <hywu@google.com>

Jonathan Cameron (9):
  Input: psxpad - Fix padding for DMA safe buffers.
  Input: ad714x - Fix padding for DMA safe buffers.
  Input: ad7887 - Fix padding for DMA safe buffers.
  Input: ads7846 - Fix padding for DMA safe buffers.
  Input: cyttsp - Fix padding for DMA safe buffers.
  Input: surface3 - Fix padding for DMA safe buffers.
  Input: n64joy - Fix DMA buffer alignment.
  Input: atmel_captouch - Avoid suspect DMA buffer alignment.
  Input: elants - Fix suspect DMA buffer alignment

 drivers/input/joystick/n64joy.c          | 6 +++---
 drivers/input/joystick/psxpad-spi.c      | 4 ++--
 drivers/input/misc/ad714x.h              | 2 +-
 drivers/input/misc/atmel_captouch.c      | 2 +-
 drivers/input/touchscreen/ad7877.c       | 4 ++--
 drivers/input/touchscreen/ads7846.c      | 4 ++--
 drivers/input/touchscreen/cyttsp_core.h  | 2 +-
 drivers/input/touchscreen/elants_i2c.c   | 2 +-
 drivers/input/touchscreen/surface3_spi.c | 2 +-
 9 files changed, 14 insertions(+), 14 deletions(-)

Comments

Dmitry Torokhov Nov. 28, 2022, 6:16 p.m. UTC | #1
Hi Jonathan,

On Sun, Nov 27, 2022 at 02:41:07PM +0000, Jonathan Cameron wrote:
> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> This problem was discovered in IIO as a side effect of the discussions about
> relaxing kmalloc alignment on arm64 and resulted in a series of large
> patch sets.
> 
> https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@kernel.org/
> 
> Unsurprisingly there are cases of it in other subsystems.
> 
> The short version of this is that there are a few known arm64 chips where
> ___cacheline_aligned enforces 64 byte alignment which is what we typically
> want for performance optimization as the size of the L1 cache lines.
> However, further out in the cache hierarchy we have caches with 128 byte
> lines.  Those are the ones that matter for DMA safety.
> So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which
> in this case is 128 bytes.

I wonder if we could have something like ____dmasafe_aligned instead of
sprinkling ARCH_KMALLOC_MINALIGN around?

> 
> There is one other use of ____cacheline_aligned in input:
> joystick/iforce/iforce-usb.c
> 
> Whilst suspicious I'm not sure enough of the requirements of USB to
> know if they are there for DMA safety or some other constraint.

Yes, USB has requirements similar to SPI.

Thanks.
Jonathan Cameron Nov. 29, 2022, 9:18 a.m. UTC | #2
On Mon, 28 Nov 2022 10:16:31 -0800
Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:

> Hi Jonathan,
> 
> On Sun, Nov 27, 2022 at 02:41:07PM +0000, Jonathan Cameron wrote:
> > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > 
> > This problem was discovered in IIO as a side effect of the discussions about
> > relaxing kmalloc alignment on arm64 and resulted in a series of large
> > patch sets.
> > 
> > https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@kernel.org/
> > 
> > Unsurprisingly there are cases of it in other subsystems.
> > 
> > The short version of this is that there are a few known arm64 chips where
> > ___cacheline_aligned enforces 64 byte alignment which is what we typically
> > want for performance optimization as the size of the L1 cache lines.
> > However, further out in the cache hierarchy we have caches with 128 byte
> > lines.  Those are the ones that matter for DMA safety.
> > So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which
> > in this case is 128 bytes.  
> 
> I wonder if we could have something like ____dmasafe_aligned instead of
> sprinkling ARCH_KMALLOC_MINALIGN around?

I agree in principle and eventually that will be ARCH_DMA_MINALIGN.
But it isn't useable yet for backports.
Catalin has worked on several approaches to reducing the alignment of kmalloc.
I may well be lagging on the current plan...
https://lore.kernel.org/linux-arm-kernel/20221106220143.2129263-1-catalin.marinas@arm.com/#r

The way crypto and IIO handled this was to add a local define
IIO_DMA_MINALIGN, CRYPTO_MINALIGN

Catalin, if you do forwards with making ARCH_DMA_MINALIGN global available, make sure
to include IIO_DMA_MINALIGN.  I'm fine with the approach you used for crypto of changing
the local define.  Note that in IIO this typically only bloats a single structure
per device instance, so it's not worth the complexity of dynamic alignment.
As we are marking structure elements with it, in all cases they are multiples of
IIO_DMA_MINALIGN in size, so everything is easy.

On the other side of things, we might be able to relax most of these alignment tricks
in IIO (and input) if swiotlb is the approach eventually used.  I'm personally not that
keen on that transition but meh, this stuff usually involves a serial bus, so the extra
bounce isnt' too painful. Note that we regularly do fun things like setting tx
buffer to be ARCH_KMALLOC_MINALIGN and the rx buffer to be that +n bytes on the basis
that a given serial bus master won't trash itself and hence this is safe. I think your
approach will bounce all of those which is somewhat unfortunate.  Other fun things
include DMA mapping a series of small buffers in one allocation for a series of serial
messages.  Again, that is guaranteed to be safe.

Jonathan






> 
> > 
> > There is one other use of ____cacheline_aligned in input:
> > joystick/iforce/iforce-usb.c
> > 
> > Whilst suspicious I'm not sure enough of the requirements of USB to
> > know if they are there for DMA safety or some other constraint.  
> 
> Yes, USB has requirements similar to SPI.
> 
> Thanks.
>
Dmitry Torokhov Nov. 30, 2022, 1:23 a.m. UTC | #3
On Tue, Nov 29, 2022 at 09:18:28AM +0000, Jonathan Cameron wrote:
> On Mon, 28 Nov 2022 10:16:31 -0800
> Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:
> 
> > Hi Jonathan,
> > 
> > On Sun, Nov 27, 2022 at 02:41:07PM +0000, Jonathan Cameron wrote:
> > > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > 
> > > This problem was discovered in IIO as a side effect of the discussions about
> > > relaxing kmalloc alignment on arm64 and resulted in a series of large
> > > patch sets.
> > > 
> > > https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@kernel.org/
> > > 
> > > Unsurprisingly there are cases of it in other subsystems.
> > > 
> > > The short version of this is that there are a few known arm64 chips where
> > > ___cacheline_aligned enforces 64 byte alignment which is what we typically
> > > want for performance optimization as the size of the L1 cache lines.
> > > However, further out in the cache hierarchy we have caches with 128 byte
> > > lines.  Those are the ones that matter for DMA safety.
> > > So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which
> > > in this case is 128 bytes.  
> > 
> > I wonder if we could have something like ____dmasafe_aligned instead of
> > sprinkling ARCH_KMALLOC_MINALIGN around?
> 
> I agree in principle and eventually that will be ARCH_DMA_MINALIGN.
> But it isn't useable yet for backports.

Sorry, I do not follow. Are talking about backports because the code is
broken in the mainline right now, or it will become broken when
Catalin's changes land? And even if it is broken right now why can't we
add

#define ____dmasafe_aligned __attribute__((__aligned__(ARCH_KMALLOC_MINALIGN)))

somewhere in include/linux/cache.h? Then you can tweak it as needed
independently of kmalloc alignment and without need to touch drivers
again and it should be easy to backport still.

Thanks.