[v7,00/19] Pass data lifetime information to SCSI disk devices

Message ID 20231218185705.2002516-1-bvanassche@acm.org (mailing list archive)

Bart Van Assche Dec. 18, 2023, 6:56 p.m. UTC
Hi Martin,

UFS vendors need the data lifetime information to achieve good performance.
Providing data lifetime information to UFS devices can result in up to 40%
lower write amplification. Hence this patch series that adds support in F2FS
and also in the block layer for data lifetime information. The SCSI disk (sd)
driver is modified such that it passes write hint information to SCSI devices
via the GROUP NUMBER field.

Please consider this patch series for the next merge window.

Thank you,

Bart.

Changes compared to v6:
 - Dropped patch "fs: Restore F_[GS]ET_FILE_RW_HINT support".

Changes compared to v5:
 - Added compile-time tests that compare the WRITE_LIFE_* and RWH_* constants.
 - Split the F_[GS]ET_RW_HINT handlers.
 - Removed the structure member kiocb.ki_hint again. Instead, copy the data
   lifetime information directly from struct file into a bio.
 - Together with Doug Gilbert, fixed multiple bugs in the scsi_debug patches.
   Added Doug's Tested-by.
 - Changed the type of "rscs:1" from bool into unsigned.
 - Added unit tests for the new SCSI protocol data structures.
 - Improved multiple patch descriptions.
 
Changes compared to v4:
 - Dropped the patch that renames the WRITE_LIFE_* constants.
 - Added a fix for an argument check in fcntl_rw_hint().
 - Reordered the patches that restore data lifetime support.
 - Included a fix for data lifetime support for buffered I/O to raw block
   devices.

Changes compared to v3:
 - Renamed the data lifetime constants (WRITE_LIFE_*).
 - Fixed a checkpatch complaint by changing "unsigned" into "unsigned int".
 - Rebased this patch series on top of kernel v6.7-rc1.
 
Changes compared to v2:
 - Instead of storing data lifetime information in bi_ioprio, introduce the
   new struct bio member bi_lifetime and also the struct request member
   'lifetime'.
 - Removed the bio_set_data_lifetime() and bio_get_data_lifetime() functions
   and replaced these with direct assignments.
 - Dropped all changes related to I/O priority.
 - Improved patch descriptions.

Changes compared to v1:
 - Use six bits from the ioprio field for data lifetime information. The
   bio->bi_write_hint / req->write_hint / iocb->ki_hint members that were
   introduced in v1 have been removed again.
 - The F_GET_FILE_RW_HINT and F_SET_FILE_RW_HINT fcntls have been removed.
 - In the SCSI disk (sd) driver, query the stream status and check the PERM bit.
 - The GET STREAM STATUS command has been implemented in the scsi_debug driver.

Bart Van Assche (19):
  fs: Fix rw_hint validation
  fs: Verify write lifetime constants at compile time
  fs: Split fcntl_rw_hint()
  fs: Move enum rw_hint into a new header file
  block, fs: Restore the per-bio/request data lifetime fields
  block, fs: Propagate write hints to the block device inode
  fs/f2fs: Restore the whint_mode mount option
  fs/f2fs: Restore support for tracing data lifetimes
  scsi: core: Query the Block Limits Extension VPD page
  scsi: scsi_proto: Add structures and constants related to I/O groups
    and streams
  scsi: sd: Translate data lifetime information
  scsi: scsi_debug: Reduce code duplication
  scsi: scsi_debug: Support the block limits extension VPD page
  scsi: scsi_debug: Rework page code error handling
  scsi: scsi_debug: Rework subpage code error handling
  scsi: scsi_debug: Allocate the MODE SENSE response from the heap
  scsi: scsi_debug: Implement the IO Advice Hints Grouping mode page
  scsi: scsi_debug: Implement GET STREAM STATUS
  scsi: scsi_debug: Maintain write statistics per group number

 Documentation/filesystems/f2fs.rst |  70 +++++++
 block/bio.c                        |   2 +
 block/blk-crypto-fallback.c        |   1 +
 block/blk-merge.c                  |   8 +
 block/blk-mq.c                     |   2 +
 block/bounce.c                     |   1 +
 block/fops.c                       |  14 ++
 drivers/scsi/Kconfig               |   5 +
 drivers/scsi/Makefile              |   2 +
 drivers/scsi/scsi.c                |   2 +
 drivers/scsi/scsi_debug.c          | 293 ++++++++++++++++++++++-------
 drivers/scsi/scsi_proto_test.c     |  56 ++++++
 drivers/scsi/scsi_sysfs.c          |  10 +
 drivers/scsi/sd.c                  | 111 ++++++++++-
 drivers/scsi/sd.h                  |   3 +
 fs/buffer.c                        |  12 +-
 fs/direct-io.c                     |   2 +
 fs/f2fs/data.c                     |   2 +
 fs/f2fs/f2fs.h                     |  10 +
 fs/f2fs/segment.c                  |  95 ++++++++++
 fs/f2fs/super.c                    |  32 +++-
 fs/fcntl.c                         |  63 ++++---
 fs/inode.c                         |   1 +
 fs/iomap/buffered-io.c             |   2 +
 fs/iomap/direct-io.c               |   2 +
 fs/mpage.c                         |   1 +
 include/linux/blk-mq.h             |   2 +
 include/linux/blk_types.h          |   2 +
 include/linux/fs.h                 |  17 +-
 include/linux/rw_hint.h            |  21 +++
 include/scsi/scsi_device.h         |   1 +
 include/scsi/scsi_proto.h          |  78 ++++++++
 include/trace/events/f2fs.h        |   6 +-
 33 files changed, 814 insertions(+), 115 deletions(-)
 create mode 100644 drivers/scsi/scsi_proto_test.c
 create mode 100644 include/linux/rw_hint.h
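
The cover letter says the sd driver passes write hints to the device via the
GROUP NUMBER field. The following is a minimal user-space sketch of that idea,
not the code from the patch series. It assumes a WRITE(10) CDB whose byte 6
carries a six-bit group number, and it assumes the hint is clamped to the
number of permanent streams the device reports. The names
hint_to_group_number() and build_write10() are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define WRITE_10          0x2a  /* WRITE(10) operation code */
#define GROUP_NUMBER_MAX  0x3f  /* assumption: six-bit GROUP NUMBER field */

/*
 * Hypothetical helper: map a write hint to a group number, clamped to the
 * number of permanent streams the device reports and to the field width.
 */
static uint8_t hint_to_group_number(unsigned int write_hint,
				    unsigned int permanent_streams)
{
	unsigned int group = write_hint;

	if (group > permanent_streams)
		group = permanent_streams;
	if (group > GROUP_NUMBER_MAX)
		group = GROUP_NUMBER_MAX;
	return (uint8_t)group;
}

/* Fill a 10-byte WRITE(10) CDB; multi-byte fields are big-endian. */
static void build_write10(uint8_t cdb[10], uint32_t lba, uint16_t nblocks,
			  uint8_t group)
{
	memset(cdb, 0, 10);
	cdb[0] = WRITE_10;
	cdb[2] = (uint8_t)(lba >> 24);
	cdb[3] = (uint8_t)(lba >> 16);
	cdb[4] = (uint8_t)(lba >> 8);
	cdb[5] = (uint8_t)lba;
	cdb[6] = group & GROUP_NUMBER_MAX;	/* GROUP NUMBER field */
	cdb[7] = (uint8_t)(nblocks >> 8);
	cdb[8] = (uint8_t)nblocks;
}
```

With these helpers, build_write10(cdb, lba, 8, hint_to_group_number(2, 5))
stores 2 in byte 6 of the CDB, while a hint larger than the stream count is
clamped to that count.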

Douglas Gilbert Dec. 18, 2023, 11:08 p.m. UTC | #1
On 12/18/23 13:56, Bart Van Assche wrote:
> Hi Martin,
> 
> UFS vendors need the data lifetime information to achieve good performance.
> Providing data lifetime information to UFS devices can result in up to 40%
> lower write amplification. Hence this patch series that adds support in F2FS
> and also in the block layer for data lifetime information. The SCSI disk (sd)
> driver is modified such that it passes write hint information to SCSI devices
> via the GROUP NUMBER field.
> 
> Please consider this patch series for the next merge window.
> 
> Thank you,
> 
> Bart.
> 
> Changes compared to v6:
>   - Dropped patch "fs: Restore F_[GS]ET_FILE_RW_HINT support".

That leaves us with F_SET_RW_HINT and F_GET_RW_HINT ioctls. Could you please
explain, perhaps with an example, what functionality is lost and what we still
have?


I built the v6 patchset atop Martin's 6.8/scsi-queue branch and it built clean.
My experience with "rc1" branches on my working laptop has been a bit less than
ideal, so I also built the v6 patchset atop linux-stable around last Monday and
have been running that without issues on my laptop for a week. I haven't updated
my linux-stable tree to lk 6.7.0-rc6 yet, but I don't expect issues with the v7
patchset.

Doug Gilbert


<snip>

Bart Van Assche Dec. 18, 2023, 11:29 p.m. UTC | #2
On 12/18/23 15:08, Douglas Gilbert wrote:
> On 12/18/23 13:56, Bart Van Assche wrote:
>> Changes compared to v6:
>>   - Dropped patch "fs: Restore F_[GS]ET_FILE_RW_HINT support".
> 
> That leaves us with F_SET_RW_HINT and F_GET_RW_HINT ioctls. Could you please
> explain, perhaps with an example, what functionality is lost and what we still
> have?

Hmm ... what information are you looking for that is not available in the
fcntl man page? From that man page:

        F_SET_RW_HINT (uint64_t *; since Linux 4.13)
               Sets the read/write hint value associated with the underlying inode
               referred to by fd.  This hint persists until either it is explicitly
               modified or the underlying filesystem is unmounted.

        F_SET_FILE_RW_HINT (uint64_t *; since Linux 4.13)
               Sets the read/write hint value associated with the open file
               description referred to by fd.

The functionality that is lost is the ability to open a file multiple times and
to call F_SET_FILE_RW_HINT with a different value for each open file description.
Ceph used this approach to specify R/W hints per LBA range in combination with
direct I/O.

Bart.
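
As an illustration of what remains available, here is a minimal sketch of
setting and reading back the per-inode hint with F_SET_RW_HINT and
F_GET_RW_HINT, as described in the man page excerpt above. The wrapper names
are hypothetical. The fallback definitions are only there for C libraries whose
headers lack these constants, and their numeric values are assumed to match the
Linux UAPI headers.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>

/* Fallbacks for older headers; values assumed to match <linux/fcntl.h>. */
#ifndef F_GET_RW_HINT
#define F_GET_RW_HINT (1024 + 11)
#endif
#ifndef F_SET_RW_HINT
#define F_SET_RW_HINT (1024 + 12)
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_SHORT 2
#endif

/*
 * Hypothetical wrapper: set the write hint on the inode behind fd.
 * The hint applies to the inode, so every descriptor that refers to the
 * same file shares it. Returns 0 on success, -1 with errno set on failure.
 */
static int set_inode_write_hint(int fd, uint64_t hint)
{
	return fcntl(fd, F_SET_RW_HINT, &hint);
}

/* Hypothetical wrapper: read the inode's current write hint into *hint. */
static int get_inode_write_hint(int fd, uint64_t *hint)
{
	return fcntl(fd, F_GET_RW_HINT, hint);
}
```

Because the hint is stored per inode, calling
set_inode_write_hint(fd, RWH_WRITE_LIFE_SHORT) on one descriptor also changes
what get_inode_write_hint() reports through any other descriptor for the same
file. That is the difference from the per-open-file-description behaviour of
F_SET_FILE_RW_HINT discussed above.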