Message ID | 20241113094727.1497722-9-mcgrof@kernel.org (mailing list archive) |
---|---|
State | New |
Series | enable bs > ps for block devices |
On 11/13/24 10:47, Luis Chamberlain wrote:
> [...]
>  block/bdev.c | 1 +
>  fs/stat.c    | 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
diff --git a/block/bdev.c b/block/bdev.c
index 3a5fd65f6c8e..4dcc501ed953 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1306,6 +1306,7 @@ void bdev_statx(struct path *path, struct kstat *stat,
 			queue_atomic_write_unit_max_bytes(bd_queue));
 	}
 
+	stat->blksize = (unsigned int) bdev_io_min(bdev);
 	blkdev_put_no_open(bdev);
 }
 
diff --git a/fs/stat.c b/fs/stat.c
index 41e598376d7e..9b579c0b5153 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -268,7 +268,7 @@ static int vfs_statx_path(struct path *path, int flags, struct kstat *stat,
 	 * obtained from the bdev backing inode.
 	 */
 	if (S_ISBLK(stat->mode))
-		bdev_statx(path, stat, request_mask);
+		bdev_statx(path, stat, request_mask | STATX_DIOALIGN);
 
 	return error;
 }
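For illustration only: the value the first hunk stores in stat->blksize is what userspace reads back as st_blksize from stat(2). The helper below is a minimal sketch of such a query. The name preferred_io_size() is hypothetical and not part of the patch, and the value only reflects the device min-io on a kernel that carries this change.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper, not part of the patch: return the st_blksize
 * that stat(2) reports for `path`, or -1 if stat() fails.
 *
 * Assumption: on a kernel with this patch applied, calling it on a
 * block device node yields the device min-io instead of a fixed 4096.
 */
long preferred_io_size(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;

	return (long)st.st_blksize;
}
```

Calling preferred_io_size("/dev/nvme0n1") from a small main() and printing the result would be one way to compare the value against the MIN-IO column of lsblk on the same device.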
You can use lsblk to query a block device's block size:

  lsblk -o MIN-IO /dev/nvme0n1
  MIN-IO
    4096

The min-io is the minimum IO the block device prefers for optimal
performance. In turn we map this to the block device block size.
The block size currently exposed is 4k, even for block devices with an
LBA format of 16k. Likewise, devices which support a 4k LBA format
but have a larger Indirection Unit of 16k have an exposed block size
of 4k.

This incurs read-modify-writes on direct IO against devices with a
min-io larger than the page size. To fix this, use the block device
min-io, which is the minimal optimal IO the device prefers.

With this we now get:

  lsblk -o MIN-IO /dev/nvme0n1
  MIN-IO
   16384

And so userspace gets the appropriate information it needs for optimal
performance. This is verified by running blkalgn against mkfs on a
device with an LBA format of 4k but an NPWG of 16k (min io size):

  mkfs.xfs -f -b size=16k /dev/nvme3n1
  blkalgn -d nvme3n1 --ops Write

     Block size          : count    distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 0        |                                        |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 66       |****************************************|
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 2        |*                                       |
  Block size: 14 - 66
  Block size: 17 - 2

     Algn size           : count    distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 0        |                                        |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 66       |****************************************|
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 2        |*                                       |
  Algn size: 14 - 66
  Algn size: 17 - 2

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 block/bdev.c | 1 +
 fs/stat.c    | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
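As a reading aid for the histograms: the summary lines such as "Block size: 14 - 66" appear to pair a power-of-two bucket index with a count. This is an assumption based on 2^14 = 16384 and 2^17 = 131072 matching the two populated rows. The sketch below shows that arithmetic, along with the alignment condition under which a direct IO would avoid a read-modify-write on a device with a 16k min-io. Both function names are hypothetical and are not taken from blkalgn or the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if both the offset and the length of an IO
 * are multiples of io_min. An IO that fails this check against a device
 * whose min-io is larger than the IO may need a read-modify-write.
 */
bool io_is_aligned(uint64_t offset, uint64_t len, uint32_t io_min)
{
	if (io_min == 0)
		return false;

	return (offset % io_min == 0) && (len % io_min == 0);
}

/*
 * Hypothetical helper: floor(log2(v)) for v > 0, i.e. the index of the
 * power-of-two histogram bucket that v falls into.
 */
unsigned int log2_bucket(uint64_t v)
{
	unsigned int bucket = 0;

	while (v >>= 1)
		bucket++;

	return bucket;
}
```

With these definitions, a 16384-byte write at offset 0 is aligned to a 16k min-io and lands in bucket 14, while a 4096-byte write at offset 4096 is not aligned to it.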