mbox series

[PATCHv2,0/5] fix direct io device mapper errors

Message ID 20221110184501.2451620-1-kbusch@meta.com (mailing list archive)
Headers show
Series fix direct io device mapper errors | expand

Message

Keith Busch Nov. 10, 2022, 6:44 p.m. UTC
From: Keith Busch <kbusch@kernel.org>

The 6.0 kernel made some changes to the direct io interface to allow
offsets in user addresses. This based on the hardware's capabilities
reported in the request_queue's dma_alignment attribute.

dm-crypt, -log-writes and -integrity require direct io be aligned to the
block size. Since it was only ever using the default 511 dma mask, this
requirement may fail if formatted to something larger, like 4k, which
will result in unexpected behavior with direct-io.

Changes since v1: Added the same fix for -integrity and -log-writes

The first three were reported successfully tested by Dmitrii Tcvetkov,
but I don't have an official Tested-by: tag.

  https://lore.kernel.org/linux-block/20221103194140.06ce3d36@xps.demsh.org/T/#mba1d0b13374541cdad3b669ec4257a11301d1860

Keitio errors on Busch (5):
  block: make dma_alignment a stacking queue_limit
  dm-crypt: provide dma_alignment limit in io_hints
  block: make blk_set_default_limits() private
  dm-integrity: set dma_alignment limit in io_hints
  dm-log-writes: set dma_alignment limit in io_hints

 block/blk-core.c           |  1 -
 block/blk-settings.c       |  9 +++++----
 block/blk.h                |  1 +
 drivers/md/dm-crypt.c      |  1 +
 drivers/md/dm-integrity.c  |  1 +
 drivers/md/dm-log-writes.c |  1 +
 include/linux/blkdev.h     | 16 ++++++++--------
 7 files changed, 17 insertions(+), 13 deletions(-)

Comments

Mike Snitzer Nov. 11, 2022, 6:07 p.m. UTC | #1
On Thu, Nov 10 2022 at  1:44P -0500,
Keith Busch <kbusch@meta.com> wrote:

> From: Keith Busch <kbusch@kernel.org>
> 
> The 6.0 kernel made some changes to the direct io interface to allow
> offsets in user addresses. This based on the hardware's capabilities
> reported in the request_queue's dma_alignment attribute.
> 
> dm-crypt, -log-writes and -integrity require direct io be aligned to the
> block size. Since it was only ever using the default 511 dma mask, this
> requirement may fail if formatted to something larger, like 4k, which
> will result in unexpected behavior with direct-io.
> 
> Changes since v1: Added the same fix for -integrity and -log-writes
> 
> The first three were reported successfully tested by Dmitrii Tcvetkov,
> but I don't have an official Tested-by: tag.
> 
>   https://lore.kernel.org/linux-block/20221103194140.06ce3d36@xps.demsh.org/T/#mba1d0b13374541cdad3b669ec4257a11301d1860
> 
> Keitio errors on Busch (5):
>   block: make dma_alignment a stacking queue_limit
>   dm-crypt: provide dma_alignment limit in io_hints
>   block: make blk_set_default_limits() private
>   dm-integrity: set dma_alignment limit in io_hints
>   dm-log-writes: set dma_alignment limit in io_hints
> 
>  block/blk-core.c           |  1 -
>  block/blk-settings.c       |  9 +++++----
>  block/blk.h                |  1 +
>  drivers/md/dm-crypt.c      |  1 +
>  drivers/md/dm-integrity.c  |  1 +
>  drivers/md/dm-log-writes.c |  1 +
>  include/linux/blkdev.h     | 16 ++++++++--------
>  7 files changed, 17 insertions(+), 13 deletions(-)
> 
> -- 
> 2.30.2
> 

There are other DM targets that override logical_block_size in their
.io_hints hook (writecache, ebs, zoned). Have you reasoned through why
those do _not_ need updating too?

Is there any risk of just introducing a finalization method in block
core (that DM's .io_hints would call) that would ensure limits that
are a funtion of another are always in sync?  Would avoid whack-a-mole
issues in the future.

Thanks,
Mike
Keith Busch Nov. 11, 2022, 6:31 p.m. UTC | #2
On Fri, Nov 11, 2022 at 01:07:05PM -0500, Mike Snitzer wrote:
> On Thu, Nov 10 2022 at  1:44P -0500,
> Keith Busch <kbusch@meta.com> wrote:
> 
> > From: Keith Busch <kbusch@kernel.org>
> > 
> > The 6.0 kernel made some changes to the direct io interface to allow
> > offsets in user addresses. This based on the hardware's capabilities
> > reported in the request_queue's dma_alignment attribute.
> > 
> > dm-crypt, -log-writes and -integrity require direct io be aligned to the
> > block size. Since it was only ever using the default 511 dma mask, this
> > requirement may fail if formatted to something larger, like 4k, which
> > will result in unexpected behavior with direct-io.
> > 
> > Changes since v1: Added the same fix for -integrity and -log-writes
> > 
> > The first three were reported successfully tested by Dmitrii Tcvetkov,
> > but I don't have an official Tested-by: tag.
> > 
> >   https://lore.kernel.org/linux-block/20221103194140.06ce3d36@xps.demsh.org/T/#mba1d0b13374541cdad3b669ec4257a11301d1860
> > 
> > Keitio errors on Busch (5):
> >   block: make dma_alignment a stacking queue_limit
> >   dm-crypt: provide dma_alignment limit in io_hints
> >   block: make blk_set_default_limits() private
> >   dm-integrity: set dma_alignment limit in io_hints
> >   dm-log-writes: set dma_alignment limit in io_hints
> > 
> >  block/blk-core.c           |  1 -
> >  block/blk-settings.c       |  9 +++++----
> >  block/blk.h                |  1 +
> >  drivers/md/dm-crypt.c      |  1 +
> >  drivers/md/dm-integrity.c  |  1 +
> >  drivers/md/dm-log-writes.c |  1 +
> >  include/linux/blkdev.h     | 16 ++++++++--------
> >  7 files changed, 17 insertions(+), 13 deletions(-)
> > 
> > -- 
> > 2.30.2
> > 
> 
> There are other DM targets that override logical_block_size in their
> .io_hints hook (writecache, ebs, zoned). Have you reasoned through why
> those do _not_ need updating too?

Yeah, that's a good question. The ones that have a problem all make
assumptions about a bio's bv_offset being logical block size aligned,
and each of those is accounted for here. Everything else looks fine with
respect to handling offsets.

> Is there any risk of just introducing a finalization method in block
> core (that DM's .io_hints would call) that would ensure limits that
> are a funtion of another are always in sync?  Would avoid whack-a-mole
> issues in the future.

I don't think we can this particular limit issue. A lot of drivers and
devices can use a smaller dma_alignment than the logical block size, so
the generic block layer wouldn't really know the driver's intentions for
the relationship of these limits.
Mikulas Patocka Nov. 14, 2022, 11:31 a.m. UTC | #3
On Fri, 11 Nov 2022, Keith Busch wrote:

> > There are other DM targets that override logical_block_size in their
> > .io_hints hook (writecache, ebs, zoned). Have you reasoned through why
> > those do _not_ need updating too?
> 
> Yeah, that's a good question. The ones that have a problem all make
> assumptions about a bio's bv_offset being logical block size aligned,
> and each of those is accounted for here. Everything else looks fine with
> respect to handling offsets.

Unaligned bv_offset should work - because XFS is sending such bios. If you 
compile the kernel with memory debugging, kmalloc returns unaligned 
memory. XFS will allocate a buffer with kmalloc, test if it crosses a page 
boundary, if not, use the buffer, if yes, free the buffer and allocate a 
full page.

There have been device mapper problems about unaligned bv_offset in the 
past and I have fixed them.

Unaligned bv_length is a problem for the affected targets.

Mikulas
Keith Busch Nov. 14, 2022, 6:12 p.m. UTC | #4
On Mon, Nov 14, 2022 at 06:31:36AM -0500, Mikulas Patocka wrote:
> 
> 
> On Fri, 11 Nov 2022, Keith Busch wrote:
> 
> > > There are other DM targets that override logical_block_size in their
> > > .io_hints hook (writecache, ebs, zoned). Have you reasoned through why
> > > those do _not_ need updating too?
> > 
> > Yeah, that's a good question. The ones that have a problem all make
> > assumptions about a bio's bv_offset being logical block size aligned,
> > and each of those is accounted for here. Everything else looks fine with
> > respect to handling offsets.
> 
> Unaligned bv_offset should work - because XFS is sending such bios. If you 
> compile the kernel with memory debugging, kmalloc returns unaligned 
> memory. XFS will allocate a buffer with kmalloc, test if it crosses a page 
> boundary, if not, use the buffer, if yes, free the buffer and allocate a 
> full page.
> 
> There have been device mapper problems about unaligned bv_offset in the 
> past and I have fixed them.
> 
> Unaligned bv_length is a problem for the affected targets.

kmalloc is physically contiguous, and bio's support multi-page bvecs for
these, so the bv_len is always aligned if the kmalloc is sized
correctly. The unaligned offsets become a problem with virtually
contiguous buffers since individual bv lengths might not be block size
aligned when bv offsets exist.
Jens Axboe Nov. 16, 2022, 10:58 p.m. UTC | #5
On Thu, 10 Nov 2022 10:44:56 -0800, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> The 6.0 kernel made some changes to the direct io interface to allow
> offsets in user addresses. This based on the hardware's capabilities
> reported in the request_queue's dma_alignment attribute.
> 
> dm-crypt, -log-writes and -integrity require direct io be aligned to the
> block size. Since it was only ever using the default 511 dma mask, this
> requirement may fail if formatted to something larger, like 4k, which
> will result in unexpected behavior with direct-io.
> 
> [...]

Applied, thanks!

[1/5] block: make dma_alignment a stacking queue_limit
      commit: c964d62f5cab7b43dd0534f22a96eab386c6ec5d
[2/5] dm-crypt: provide dma_alignment limit in io_hints
      commit: 86e4d3e8d1838ca88fb9267e669c36f6c8f7c6cd
[3/5] block: make blk_set_default_limits() private
      commit: b3228254bb6e91e57f920227f72a1a7d81925d81
[4/5] dm-integrity: set dma_alignment limit in io_hints
      commit: 29aa778bb66795e6a78b1c99beadc83887827868
[5/5] dm-log-writes: set dma_alignment limit in io_hints
      commit: 50a893359cd2643ee1afc96eedc9e7084cab49fa

Best regards,