[RFC,v5,00/11] Add support for zoned device

Message ID: 20220801012015.4902-1-faithilikerun@gmail.com

Message

Sam Li Aug. 1, 2022, 1:20 a.m. UTC
Zoned Block Devices (ZBDs) divide the LBA space into block regions called zones
that are larger than the LBA size. Zones can only be written sequentially, which
reduces write amplification in SSDs, leading to higher throughput and increased
capacity. More details about ZBDs can be found at:

https://zonedstorage.io/docs/introduction/zoned-storage

The zoned device support aims to let guests (virtual machines) access zoned
storage devices on the host (hypervisor) through a virtio-blk device. This
involves extending QEMU's block layer and virtio-blk emulation code. In its
current state, the virtio-blk device is not aware of ZBDs; the guest sees
host-managed drives as regular drives that run correctly under the most
common write workloads.

This patch series extends the block layer APIs and the virtio-blk emulation
with the minimum set of zoned commands necessary to support zoned devices.
The commands are: Report Zones, the four zone management operations (open,
close, finish, reset), and Zone Append (still in development).

The series can be tested on a null_blk device with qemu-io, qemu-iotests, or
the blkzone(8) command in the guest OS. For example, the command line for a
zone report using qemu-io is:

$ path/to/qemu-io --image-opts driver=zoned_host_device,filename=/dev/nullb0 -c
"zrp offset nr_zones"

To enable zoned device support in the guest OS, the guest kernel must have a
virtio-blk driver with ZBD support. Such kernel patches are available at:
https://github.com/dmitry-fomichev/virtblk-zbd

Then, add the following options to the QEMU command line:
-blockdev node-name=drive0,driver=zoned_host_device,filename=/dev/nullb0
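
To expose that block node to the guest, also attach a virtio-blk device that
refers to it, for example (the PCI variant is one option):
-device virtio-blk-pci,drive=drive0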

After the guest OS boots, use blkzone(8) to test zone operations:
blkzone report -o offset -c nr_zones /dev/vda
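
The four zone management operations can be exercised from the guest in the
same way, assuming a util-linux recent enough to provide the open/close/finish
sub-commands (offset and nr_zones are placeholders, as above):

blkzone open   -o offset -c nr_zones /dev/vda
blkzone close  -o offset -c nr_zones /dev/vda
blkzone finish -o offset -c nr_zones /dev/vda
blkzone reset  -o offset -c nr_zones /dev/vda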

v5:
- add zoned storage emulation to virtio-blk device
- add documentation for zoned storage
- address review comments
  * fix qemu-iotests
  * fix check to block layer
  * modify interfaces of sysfs helper functions
  * rename zoned device structs according to QEMU styles
  * reorder patches

v4:
- add virtio-blk headers for zoned device
- add configurations for zoned host device
- add zone operations for raw-format
- address review comments
  * fix memory leak bug in zone_report
  * add checks to block layers
  * fix qemu-iotests format
  * fix sysfs helper functions

v3:
- add helper functions to get sysfs attributes
- address review comments
  * fix zone report bugs
  * fix the qemu-io code path
  * use thread pool to avoid blocking ioctl() calls

v2:
- add qemu-io sub-commands
- address review comments
  * modify interfaces of APIs

v1:
- add block layer APIs resembling Linux ZoneBlockDevice ioctls

Sam Li (11):
  include: add zoned device structs
  include: import virtio_blk headers from linux with zoned storage
    support
  file-posix: introduce get_sysfs_long_val for the long sysfs attribute
  file-posix: introduce get_sysfs_str_val for device zoned model
  block: add block layer APIs resembling Linux ZonedBlockDevice ioctls
  raw-format: add zone operations to pass through requests
  config: add check to block layer
  virtio-blk: add zoned storage APIs for zoned devices
  qemu-io: add zoned block device operations.
  qemu-iotests: test new zone operations
  docs/zoned-storage: add zoned device documentation

 block.c                                     |  13 +
 block/block-backend.c                       | 139 +++++++
 block/coroutines.h                          |   6 +
 block/file-posix.c                          | 383 +++++++++++++++++++-
 block/io.c                                  |  41 +++
 block/raw-format.c                          |  14 +
 docs/devel/zoned-storage.rst                |  68 ++++
 docs/system/qemu-block-drivers.rst.inc      |   6 +
 hw/block/virtio-blk.c                       | 172 ++++++++-
 include/block/block-common.h                |  44 ++-
 include/block/block-io.h                    |  13 +
 include/block/block_int-common.h            |  35 +-
 include/block/raw-aio.h                     |   6 +-
 include/standard-headers/linux/virtio_blk.h | 118 ++++++
 include/sysemu/block-backend-io.h           |   6 +
 meson.build                                 |   1 +
 qapi/block-core.json                        |   7 +-
 qemu-io-cmds.c                              | 144 ++++++++
 tests/qemu-iotests/tests/zoned.out          |  53 +++
 tests/qemu-iotests/tests/zoned.sh           |  86 +++++
 20 files changed, 1340 insertions(+), 15 deletions(-)
 create mode 100644 docs/devel/zoned-storage.rst
 create mode 100644 tests/qemu-iotests/tests/zoned.out
 create mode 100755 tests/qemu-iotests/tests/zoned.sh

Comments

Stefan Hajnoczi Aug. 1, 2022, 4:16 p.m. UTC
Hi Hannes, Damien, and Dmitry,

This patch series introduces zoned_host_device for passing through
host zoned storage devices.

How can one host zoned storage device be split up for multiple VMs?

For NVMe it may be possible to allocate multiple Namespaces on the
device using management tools. Then Linux sees individual /dev/nvme0nX
block device nodes and QEMU uses them with zoned_host_device.

For other types of devices, can dm-linear create separate device-mapper
targets? How do max open/active zones, etc. work when multiple untrusted
users are sharing a device?

I'm asking because splitting up a single physical device for multiple
VMs is a common virtualization use case that we should document.

Thanks,
Stefan