diff mbox series

[RFC,8/8] bdev: use bdev_io_min() for statx block size

Message ID 20241113094727.1497722-9-mcgrof@kernel.org (mailing list archive)
State New
Headers show
Series enable bs > ps for block devices | expand

Commit Message

Luis Chamberlain Nov. 13, 2024, 9:47 a.m. UTC
You can use lsblk to query for a block device block device block size:

lsblk -o MIN-IO /dev/nvme0n1
MIN-IO
 4096

The min-io is the minimum IO the block device prefers for optimal
performance. In turn we map this to the block device block size.
The current block size exposed even for block devices with an
LBA format of 16k is 4k. Likewise devices which support 4k LBA format
but have a larger Indirection Unit of 16k have an exposed block size
of 4k.

This incurs read-modify-writes on direct IO against devices with a
min-io larger than the page size. To fix this, use the block device
min io, which is the minimal optimal IO the device prefers.

With this we now get:

lsblk -o MIN-IO /dev/nvme0n1
MIN-IO
 16384

And so userspace gets the appropriate information it needs for optimal
performance. This is verified with blkalgn against mkfs against a
device with LBA format of 4k but an NPWG of 16k (min io size)

mkfs.xfs -f -b size=16k  /dev/nvme3n1
blkalgn -d nvme3n1 --ops Write

     Block size          : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 0        |                                        |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 66       |****************************************|
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 2        |*                                       |
Block size: 14 - 66
Block size: 17 - 2

     Algn size           : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 0        |                                        |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 66       |****************************************|
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 2        |*                                       |
Algn size: 14 - 66
Algn size: 17 - 2

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 block/bdev.c | 1 +
 fs/stat.c    | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

Comments

Hannes Reinecke Nov. 13, 2024, 9:59 a.m. UTC | #1
On 11/13/24 10:47, Luis Chamberlain wrote:
> You can use lsblk to query for a block device block device block size:
> 
> lsblk -o MIN-IO /dev/nvme0n1
> MIN-IO
>   4096
> 
> The min-io is the minimum IO the block device prefers for optimal
> performance. In turn we map this to the block device block size.
> The current block size exposed even for block devices with an
> LBA format of 16k is 4k. Likewise devices which support 4k LBA format
> but have a larger Indirection Unit of 16k have an exposed block size
> of 4k.
> 
> This incurs read-modify-writes on direct IO against devices with a
> min-io larger than the page size. To fix this, use the block device
> min io, which is the minimal optimal IO the device prefers.
> 
> With this we now get:
> 
> lsblk -o MIN-IO /dev/nvme0n1
> MIN-IO
>   16384
> 
> And so userspace gets the appropriate information it needs for optimal
> performance. This is verified with blkalgn against mkfs against a
> device with LBA format of 4k but an NPWG of 16k (min io size)
> 
> mkfs.xfs -f -b size=16k  /dev/nvme3n1
> blkalgn -d nvme3n1 --ops Write
> 
>       Block size          : count     distribution
>           0 -> 1          : 0        |                                        |
>           2 -> 3          : 0        |                                        |
>           4 -> 7          : 0        |                                        |
>           8 -> 15         : 0        |                                        |
>          16 -> 31         : 0        |                                        |
>          32 -> 63         : 0        |                                        |
>          64 -> 127        : 0        |                                        |
>         128 -> 255        : 0        |                                        |
>         256 -> 511        : 0        |                                        |
>         512 -> 1023       : 0        |                                        |
>        1024 -> 2047       : 0        |                                        |
>        2048 -> 4095       : 0        |                                        |
>        4096 -> 8191       : 0        |                                        |
>        8192 -> 16383      : 0        |                                        |
>       16384 -> 32767      : 66       |****************************************|
>       32768 -> 65535      : 0        |                                        |
>       65536 -> 131071     : 0        |                                        |
>      131072 -> 262143     : 2        |*                                       |
> Block size: 14 - 66
> Block size: 17 - 2
> 
>       Algn size           : count     distribution
>           0 -> 1          : 0        |                                        |
>           2 -> 3          : 0        |                                        |
>           4 -> 7          : 0        |                                        |
>           8 -> 15         : 0        |                                        |
>          16 -> 31         : 0        |                                        |
>          32 -> 63         : 0        |                                        |
>          64 -> 127        : 0        |                                        |
>         128 -> 255        : 0        |                                        |
>         256 -> 511        : 0        |                                        |
>         512 -> 1023       : 0        |                                        |
>        1024 -> 2047       : 0        |                                        |
>        2048 -> 4095       : 0        |                                        |
>        4096 -> 8191       : 0        |                                        |
>        8192 -> 16383      : 0        |                                        |
>       16384 -> 32767      : 66       |****************************************|
>       32768 -> 65535      : 0        |                                        |
>       65536 -> 131071     : 0        |                                        |
>      131072 -> 262143     : 2        |*                                       |
> Algn size: 14 - 66
> Algn size: 17 - 2
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>   block/bdev.c | 1 +
>   fs/stat.c    | 2 +-
>   2 files changed, 2 insertions(+), 1 deletion(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
diff mbox series

Patch

diff --git a/block/bdev.c b/block/bdev.c
index 3a5fd65f6c8e..4dcc501ed953 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1306,6 +1306,7 @@  void bdev_statx(struct path *path, struct kstat *stat,
 			queue_atomic_write_unit_max_bytes(bd_queue));
 	}
 
+	stat->blksize = (unsigned int) bdev_io_min(bdev);
 	blkdev_put_no_open(bdev);
 }
 
diff --git a/fs/stat.c b/fs/stat.c
index 41e598376d7e..9b579c0b5153 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -268,7 +268,7 @@  static int vfs_statx_path(struct path *path, int flags, struct kstat *stat,
 	 * obtained from the bdev backing inode.
 	 */
 	if (S_ISBLK(stat->mode))
-		bdev_statx(path, stat, request_mask);
+		bdev_statx(path, stat, request_mask | STATX_DIOALIGN);
 
 	return error;
 }