[v2,8/8] bdev: use bdev_io_min() for statx block size

Message ID 20250204231209.429356-9-mcgrof@kernel.org (mailing list archive)
State New
Series enable bs > ps for block devices

Commit Message

Luis Chamberlain Feb. 4, 2025, 11:12 p.m. UTC
You can use lsblk to query a block device's block size:

lsblk -o MIN-IO /dev/nvme0n1
MIN-IO
 4096

The min-io is the minimum IO size the block device prefers for optimal
performance. In turn we map this to the block device block size.
Currently the exposed block size is 4k even for block devices with an
LBA format of 16k. Likewise, devices which support a 4k LBA format
but have a larger Indirection Unit of 16k expose a block size of 4k.

This incurs read-modify-writes on direct IO against devices with a
min-io larger than the page size. To fix this, use the block device
min io, which is the minimal optimal IO the device prefers.
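
As a minimal sketch of the arithmetic behind the read-modify-write
concern: a write avoids a partial update only when both its offset and
its length cover whole min-io units. The helper name io_needs_rmw() is
hypothetical and is not part of this patch or of the kernel; it only
models the check, assuming min_io is a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not kernel code).
 *
 * Returns true when a write of `len` bytes at byte offset `off` does not
 * cover whole min-io units, i.e. when its start or its length is not a
 * multiple of `min_io`. `min_io` is assumed to be a power of two.
 */
static inline bool io_needs_rmw(uint64_t off, uint64_t len, uint64_t min_io)
{
	uint64_t mask = min_io - 1;

	return ((off & mask) != 0) || ((len & mask) != 0);
}
```

With a 16k min-io, a 4k write at offset 0 is flagged by this check,
while a 16k write at offset 16k is not.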

With this we now get:

lsblk -o MIN-IO /dev/nvme0n1
MIN-IO
 16384

And so userspace gets the appropriate information it needs for optimal
performance. This was verified by running blkalgn while mkfs.xfs ran
against a device with a 4k LBA format but an NPWG of 16k (the min io
size):

mkfs.xfs -f -b size=16k  /dev/nvme3n1
blkalgn -d nvme3n1 --ops Write

     Block size          : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 0        |                                        |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 66       |****************************************|
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 2        |*                                       |
Block size: 14 - 66
Block size: 17 - 2

     Algn size           : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 0        |                                        |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 66       |****************************************|
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 2        |*                                       |
Algn size: 14 - 66
Algn size: 17 - 2
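
For reading the summary lines above: assuming the histograms use
power-of-two buckets, as the row ranges suggest, "14 - 66" can be read
as 66 IOs in bucket 14 (16384 -> 32767) and "17 - 2" as 2 IOs in bucket
17 (131072 -> 262143). The sketch below shows that mapping; the function
name log2_bucket() is hypothetical and is not taken from blkalgn.

```c
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): index of the power-of-two
 * bucket that `bytes` falls into, i.e. floor(log2(bytes)).
 * A value of 0 is placed in bucket 0, matching the "0 -> 1" row.
 */
static inline unsigned int log2_bucket(uint64_t bytes)
{
	unsigned int bucket = 0;

	while (bytes >>= 1)
		bucket++;

	return bucket;
}
```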

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 block/bdev.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

Hannes Reinecke Feb. 5, 2025, 4:22 p.m. UTC | #1
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes

Patch

diff --git a/block/bdev.c b/block/bdev.c
index 22806ce11e1d..3bd948e6438d 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1276,9 +1276,6 @@  void bdev_statx(struct path *path, struct kstat *stat,
 	struct inode *backing_inode;
 	struct block_device *bdev;
 
-	if (!(request_mask & (STATX_DIOALIGN | STATX_WRITE_ATOMIC)))
-		return;
-
 	backing_inode = d_backing_inode(path->dentry);
 
 	/*
@@ -1305,6 +1302,8 @@  void bdev_statx(struct path *path, struct kstat *stat,
 			queue_atomic_write_unit_max_bytes(bd_queue));
 	}
 
+	stat->blksize = bdev_io_min(bdev);
+
 	blkdev_put_no_open(bdev);
 }
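
The value assigned to stat->blksize above is what userspace sees as
st_blksize from stat(2). A minimal userspace sketch for reading it
follows; the helper name preferred_io_size() is hypothetical and exists
only for illustration.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper (illustration only): returns the st_blksize that
 * stat(2) reports for `path`, or -1 if stat(2) fails (errno is set).
 */
static inline long preferred_io_size(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;

	return (long)st.st_blksize;
}
```

Called on a block device node such as /dev/nvme0n1, this would be
expected to return the device's min io (for example 16384 on the device
described in the commit message) once the patch is applied. That
expectation follows from the diff and has not been verified here.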