[v9,00/10] block atomic writes

Message ID 20240620125359.2684798-1-john.g.garry@oracle.com (mailing list archive)

Message

John Garry June 20, 2024, 12:53 p.m. UTC
This series introduces a proposal for implementing atomic writes in the
kernel for torn-write protection.

This series takes the approach of adding a new "atomic" flag to each of
pwritev2() and iocb->ki_flags - RWF_ATOMIC and IOCB_ATOMIC, respectively.
When set, these indicate that we want the write issued "atomically".

Only direct I/O is supported here, and only for block devices. This
requires atomic write HW support, like SCSI ATOMIC WRITE (16).

XFS FS support has previously been posted at:
https://lore.kernel.org/linux-xfs/20240607143919.2622319-1-john.g.garry@oracle.com/T/#t

Updated man pages have been posted at:
https://lore.kernel.org/lkml/20240124112731.28579-1-john.g.garry@oracle.com/T/#m520dca97a9748de352b5a723d3155a4bb1e46456

The goal here is to provide an interface that allows applications to use
application-specific block sizes larger than the logical block size
reported by the storage device or larger than the filesystem block size
as reported by stat().

With this new interface, application blocks will never be torn or
fractured when written. On a power failure, for each individual
application block, either all or none of the data is written. A racing
atomic write and read will mean that the read sees all the old data or
all the new data, but never a mix of old and new.

Three new fields are added to struct statx - atomic_write_unit_min,
atomic_write_unit_max, and atomic_write_segments_max. For each individual
atomic write, the total length of the write must be between
atomic_write_unit_min and atomic_write_unit_max, inclusive, and a
power-of-2. The write must also be at a naturally-aligned offset in the
file wrt the write length. For pwritev2, iovcnt is limited by
atomic_write_segments_max.
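
The rules above can be sketched as a small userspace pre-check. This is
only an illustration under stated assumptions, not code from the series:
struct atomic_write_limits and atomic_write_args_ok() are hypothetical
names, and the limits are assumed to have been read from the three statx
fields beforehand.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three statx fields named above. */
struct atomic_write_limits {
	uint32_t unit_min;     /* atomic_write_unit_min, in bytes */
	uint32_t unit_max;     /* atomic_write_unit_max, in bytes */
	uint32_t segments_max; /* atomic_write_segments_max */
};

static bool is_power_of_2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Hypothetical check mirroring the rules in the cover letter:
 * length within [unit_min, unit_max], length a power-of-2, offset
 * naturally aligned to the length, iovcnt within segments_max.
 */
static bool atomic_write_args_ok(const struct atomic_write_limits *lim,
				 uint64_t offset, uint64_t len,
				 unsigned int iovcnt)
{
	if (len < lim->unit_min || len > lim->unit_max)
		return false;
	if (!is_power_of_2(len))
		return false;
	if (offset & (len - 1)) /* offset must be a multiple of len */
		return false;
	if (iovcnt == 0 || iovcnt > lim->segments_max)
		return false;
	return true;
}
```

For example, with limits of 4096 and 65536 bytes, a 16384-byte write at
offset 16384 passes, while an 8192-byte write at offset 4096 is rejected
because the offset is not a multiple of the length.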

There has been some discussion on untorn buffered writes support at:
https://lore.kernel.org/linux-fsdevel/20240601093325.GC247052@mit.edu/T/#t

That conversation continues.

Kernel support is added for SCSI (sd.c and scsi_debug) and NVMe.

This series is based on Jens' for-6.11/block-limits branch at commit
339d3948c07b ("block: move the bounce flag into the features field").

Patches can be found at:
https://github.com/johnpgarry/linux/commits/atomic-writes-v6.10-v9

Changes since v8:
- Rebase
- Update comment on nvme_valid_atomic_write()
- Update chunk sectors vs atomic boundary support
- Add Martin and Darrick's review tags (thanks!)

Changes since v7:
- Generalize block chunk_sectors support (Hannes)
- Relocate and reorder args for generic_atomic_write_valid (Christoph)
- Drop rq_straddles_atomic_write_boundary()

Alan Adamson (1):
  nvme: Atomic write support

John Garry (6):
  block: Pass blk_queue_get_max_sectors() a request pointer
  block: Generalize chunk_sectors support as boundary support
  block: Add core atomic write support
  block: Add fops atomic write support
  scsi: sd: Atomic write support
  scsi: scsi_debug: Atomic write support

Prasad Singamsetty (3):
  fs: Initial atomic write support
  fs: Add initial atomic write support info to statx
  block: Add atomic write support for statx

 Documentation/ABI/stable/sysfs-block |  53 +++
 block/bdev.c                         |  36 +-
 block/blk-core.c                     |  19 +
 block/blk-merge.c                    |  67 ++-
 block/blk-mq.c                       |   2 +-
 block/blk-settings.c                 |  88 ++++
 block/blk-sysfs.c                    |  33 ++
 block/blk.h                          |   9 +-
 block/fops.c                         |  20 +-
 drivers/md/dm.c                      |   2 +-
 drivers/nvme/host/core.c             |  52 +++
 drivers/scsi/scsi_debug.c            | 588 +++++++++++++++++++++------
 drivers/scsi/scsi_trace.c            |  22 +
 drivers/scsi/sd.c                    |  93 ++++-
 drivers/scsi/sd.h                    |   8 +
 fs/aio.c                             |   8 +-
 fs/btrfs/ioctl.c                     |   2 +-
 fs/read_write.c                      |  18 +-
 fs/stat.c                            |  50 ++-
 include/linux/blk_types.h            |   8 +-
 include/linux/blkdev.h               |  74 +++-
 include/linux/fs.h                   |  20 +-
 include/linux/stat.h                 |   3 +
 include/scsi/scsi_proto.h            |   1 +
 include/trace/events/scsi.h          |   1 +
 include/uapi/linux/fs.h              |   5 +-
 include/uapi/linux/stat.h            |  12 +-
 io_uring/rw.c                        |   9 +-
 28 files changed, 1111 insertions(+), 192 deletions(-)

Comments

Jens Axboe June 20, 2024, 9:23 p.m. UTC | #1
On Thu, 20 Jun 2024 12:53:49 +0000, John Garry wrote:
> This series introduces a proposal to implementing atomic writes in the
> kernel for torn-write protection.
> 
> This series takes the approach of adding a new "atomic" flag to each of
> pwritev2() and iocb->ki_flags - RWF_ATOMIC and IOCB_ATOMIC, respectively.
> When set, these indicate that we want the write issued "atomically".
> 
> [...]

Applied, thanks!

[01/10] block: Pass blk_queue_get_max_sectors() a request pointer
        commit: 8d1dfd51c84e202df05a999ce82cb27554f7d152
[02/10] block: Generalize chunk_sectors support as boundary support
        commit: f70167a7a6e7e8a6911f3a216dc044cbfe7c1983
[03/10] fs: Initial atomic write support
        commit: c34fc6f26ab86d03a2d47446f42b6cd492dfdc56
[04/10] fs: Add initial atomic write support info to statx
        commit: 0f9ca80fa4f9670ba09721e4e36b8baf086a500c
[05/10] block: Add core atomic write support
        commit: 9da3d1e912f3953196e66991d75208cde3e845e1
[06/10] block: Add atomic write support for statx
        commit: 9abcfbd235f59fb5b6379e5bc0231dad831ebace
[07/10] block: Add fops atomic write support
        commit: caf336f81b3a3ca744e335972e86ec7244512d4a
[08/10] scsi: sd: Atomic write support
        commit: bf4ae8f2e6407a779c0368eb0f3e047a8333be17
[09/10] scsi: scsi_debug: Atomic write support
        commit: 84f3a3c01d70efba736bc42155cf32722067b327
[10/10] nvme: Atomic write support
        commit: 5f9bbea02f06110ec5cf95a3327019b3194b2d80

Best regards,
John Garry June 21, 2024, 7:59 a.m. UTC | #2
On 20/06/2024 22:23, Jens Axboe wrote:
> On Thu, 20 Jun 2024 12:53:49 +0000, John Garry wrote:
>> This series introduces a proposal to implementing atomic writes in the
>> kernel for torn-write protection.
>>
>> This series takes the approach of adding a new "atomic" flag to each of
>> pwritev2() and iocb->ki_flags - RWF_ATOMIC and IOCB_ATOMIC, respectively.
>> When set, these indicate that we want the write issued "atomically".
>>
>> [...]
> Applied, thanks!

Thanks Jens.

JFYI, we will probably notice a trivial conflict in
include/uapi/linux/stat.h when merging, as I fixed a comment there which
went into v6.10-rc4. To resolve, the version in this series can be
used, as it also fixes that comment.
Jens Axboe June 21, 2024, 2:28 p.m. UTC | #3
On 6/21/24 1:59 AM, John Garry wrote:
> On 20/06/2024 22:23, Jens Axboe wrote:
>> On Thu, 20 Jun 2024 12:53:49 +0000, John Garry wrote:
>>> This series introduces a proposal to implementing atomic writes in the
>>> kernel for torn-write protection.
>>>
>>> This series takes the approach of adding a new "atomic" flag to each of
>>> pwritev2() and iocb->ki_flags - RWF_ATOMIC and IOCB_ATOMIC, respectively.
>>> When set, these indicate that we want the write issued "atomically".
>>>
>>> [...]
>> Applied, thanks!
> 
> Thanks Jens.
> 
> JFYI, we will probably notice a trivial conflict in
> include/uapi/linux/stat.h when merging, as I fixed a comment there
> which went into v6.10-rc4 . To resolve, the version in this series can
> be used, as it also fixes that comment.

I did notice and resolved it when I merged it into my for-next branch.
And then was kind of annoyed when I noticed it was caused by a patch
from yourself as well, surely that should either have been part of the
series, just ignored for -git, or done after the fact. Kind of pointless
to cause conflicts with your own series right when it needs to be ready to go
into the for-next tree.
John Garry June 21, 2024, 2:41 p.m. UTC | #4
On 21/06/2024 15:28, Jens Axboe wrote:
>> JFYI, we will probably notice a trivial conflict in
>> include/uapi/linux/stat.h when merging, as I fixed a comment there
>> which went into v6.10-rc4 . To resolve, the version in this series can
>> be used, as it also fixes that comment.
> I did notice and resolved it when I merged it into my for-next branch.
> And then was kind of annoyed when I noticed it was caused by a patch
> from yourself as well, surely that should either have been part of the
> series, just ignored for -git, or done after the fact. Kind of pointless
> to cause conflicts with your own series right when it needs ready to go
> into the for-next tree.

ok, I will co-ordinate things better in future.

Thanks again,
John