diff mbox series

[v11,02/10] block/raw: add persistent reservation in/out driver

Message ID 20240909113453.64527-3-luchangqi.123@bytedance.com (mailing list archive)
State New, archived
Series Support persistent reservation operations

Commit Message

Changqi Lu Sept. 9, 2024, 11:34 a.m. UTC
Add persistent reservation in/out operations for raw driver.
The following methods are implemented: bdrv_co_pr_read_keys,
bdrv_co_pr_read_reservation, bdrv_co_pr_register, bdrv_co_pr_reserve,
bdrv_co_pr_release, bdrv_co_pr_clear and bdrv_co_pr_preempt.

Signed-off-by: Changqi Lu <luchangqi.123@bytedance.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/raw-format.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

Comments

Keith Busch Sept. 9, 2024, 8:18 p.m. UTC | #1
On Mon, Sep 09, 2024 at 07:34:45PM +0800, Changqi Lu wrote:
> +static int coroutine_fn GRAPH_RDLOCK
> +raw_co_pr_register(BlockDriverState *bs, uint64_t old_key,
> +                   uint64_t new_key, BlockPrType type,
> +                   bool ptpl, bool ignore_key)
> +{
> +    return bdrv_co_pr_register(bs->file->bs, old_key, new_key,
> +                               type, ptpl, ignore_key);
> +}

The nvme parts look fine, but could you help me understand how this all
works? I was looking for something utilizing ioctl's, like
IOC_PR_REGISTER for this one, chained through the file-posix block
driver. Is this only supposed to work with iscsi?

zhenwei pi Sept. 10, 2024, 1:59 a.m. UTC | #2
On 9/10/24 04:18, Keith Busch wrote:
> On Mon, Sep 09, 2024 at 07:34:45PM +0800, Changqi Lu wrote:
>> +static int coroutine_fn GRAPH_RDLOCK
>> +raw_co_pr_register(BlockDriverState *bs, uint64_t old_key,
>> +                   uint64_t new_key, BlockPrType type,
>> +                   bool ptpl, bool ignore_key)
>> +{
>> +    return bdrv_co_pr_register(bs->file->bs, old_key, new_key,
>> +                               type, ptpl, ignore_key);
>> +}
> 
> The nvme parts look fine, but could you help me understand how this all
> works? I was looking for something utilizing ioctl's, like
> IOC_PR_REGISTER for this one, chained through the file-posix block
> driver. Is this only supposed to work with iscsi?

Hi Keith,

The IOC_PR_REGISTER ioctl family in the Linux kernel supports only the PR
OUT direction, so the `blkpr` command (from util-linux since v2.39)
supports only PR OUT too.

- In a guest:
* `blkpr` command can be used to test PR OUT commands.
* `sg_persist` command (from sg3-utils) works fine on a SCSI device
* `nvme` command (from nvme-cli) works fine on an NVMe device

- On a host:
* libiscsi supports PR, and LIO/SPDK support PR on the target side;
tgt supports PR only partially (it lacks PTPL). So the QEMU libiscsi
driver works fine.
* `iscsi-pr` command (from libiscsi-bin since v1.20.0) supports the full
PR command family.
* Because the Linux block layer lacks PR IN commands, the QEMU posix
block driver can't support PR currently. Once this series is merged into
QEMU, I think we will have a use case for a posix block PR IN family,
which is a hint to promote it for the Linux block layer.
* I wrote a user space nvme-of library `libnvmf`
(https://github.com/bytedance/libnvmf). It does not support the PR command
family, but I don't think it would be difficult to add if necessary.
* As far as I know, several private vendor block drivers support the PR
family; this QEMU block framework will make those private drivers easy to
integrate.
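
For reference, the PR OUT ioctl path mentioned above could look roughly
like the sketch below. It is not part of this series. It only shows how
IOC_PR_REGISTER from <linux/pr.h> is called on an open block device fd.
The helper names pr_fill_registration() and pr_register_fd() are
hypothetical, and the `type` and `ptpl` arguments of raw_co_pr_register()
are deliberately not mapped here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/pr.h>

/*
 * Hypothetical helper (not in QEMU): fill a struct pr_registration
 * for IOC_PR_REGISTER. Only the ignore_key flag is translated.
 */
void pr_fill_registration(struct pr_registration *reg,
                          uint64_t old_key, uint64_t new_key,
                          bool ignore_key)
{
    assert(reg != NULL);
    memset(reg, 0, sizeof(*reg));       /* also clears the padding field */
    reg->old_key = old_key;
    reg->new_key = new_key;
    reg->flags = ignore_key ? PR_FL_IGNORE_KEY : 0;
}

/*
 * Hypothetical helper (not in QEMU): issue the registration on an
 * already opened block device fd and pass back the raw ioctl() result.
 */
int pr_register_fd(int fd, uint64_t old_key, uint64_t new_key,
                   bool ignore_key)
{
    struct pr_registration reg;

    pr_fill_registration(&reg, old_key, new_key, ignore_key);
    return ioctl(fd, IOC_PR_REGISTER, &reg);
}
```

A caller would open the block device, call pr_register_fd(fd, 0, key,
false) to register a new key, and check the return value and errno. There
is no matching ioctl for the PR IN side (read keys, read reservation),
which is the gap described above.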

Patch

diff --git a/block/raw-format.c b/block/raw-format.c
index ac7e8495f6..3746bc1bd3 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -454,6 +454,55 @@  raw_co_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
     return bdrv_co_ioctl(bs->file->bs, req, buf);
 }
 
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_read_keys(BlockDriverState *bs, uint32_t *generation,
+                    uint32_t num_keys, uint64_t *keys)
+{
+
+    return bdrv_co_pr_read_keys(bs->file->bs, generation, num_keys, keys);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_read_reservation(BlockDriverState *bs, uint32_t *generation,
+                           uint64_t *key, BlockPrType *type)
+{
+    return bdrv_co_pr_read_reservation(bs->file->bs, generation, key, type);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_register(BlockDriverState *bs, uint64_t old_key,
+                   uint64_t new_key, BlockPrType type,
+                   bool ptpl, bool ignore_key)
+{
+    return bdrv_co_pr_register(bs->file->bs, old_key, new_key,
+                               type, ptpl, ignore_key);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_reserve(BlockDriverState *bs, uint64_t key, BlockPrType type)
+{
+    return bdrv_co_pr_reserve(bs->file->bs, key, type);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_release(BlockDriverState *bs, uint64_t key, BlockPrType type)
+{
+    return bdrv_co_pr_release(bs->file->bs, key, type);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_clear(BlockDriverState *bs, uint64_t key)
+{
+    return bdrv_co_pr_clear(bs->file->bs, key);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_preempt(BlockDriverState *bs, uint64_t old_key,
+                  uint64_t new_key, BlockPrType type, bool abort)
+{
+    return bdrv_co_pr_preempt(bs->file->bs, old_key, new_key, type, abort);
+}
+
 static int GRAPH_RDLOCK raw_has_zero_init(BlockDriverState *bs)
 {
     return bdrv_has_zero_init(bs->file->bs);
@@ -672,6 +721,13 @@  BlockDriver bdrv_raw = {
     .strong_runtime_opts  = raw_strong_runtime_opts,
     .mutable_opts         = mutable_opts,
     .bdrv_cancel_in_flight = raw_cancel_in_flight,
+    .bdrv_co_pr_read_keys    = raw_co_pr_read_keys,
+    .bdrv_co_pr_read_reservation = raw_co_pr_read_reservation,
+    .bdrv_co_pr_register     = raw_co_pr_register,
+    .bdrv_co_pr_reserve      = raw_co_pr_reserve,
+    .bdrv_co_pr_release      = raw_co_pr_release,
+    .bdrv_co_pr_clear        = raw_co_pr_clear,
+    .bdrv_co_pr_preempt      = raw_co_pr_preempt,
 };
 
 static void bdrv_raw_init(void)