
[v7,12/12] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS

Message ID 20230925192229.3186470-26-eblake@redhat.com (mailing list archive)
State New, archived
Series NBD 64-bit extensions for qemu

Commit Message

Eric Blake Sept. 25, 2023, 7:22 p.m. UTC
Allow a client to request a subset of negotiated meta contexts.  For
example, a client may ask to use a single connection to learn about
both block status and dirty bitmaps, but where the dirty bitmap
queries only need to be performed on a subset of the disk; forcing the
server to compute that information on block status queries in the rest
of the disk is wasted effort (both at the server, and on the amount of
traffic sent over the wire to be parsed and ignored by the client).

Qemu as an NBD client never requests to use more than one meta
context, so it has no need to use block status payloads.  Testing this
instead requires support from libnbd, which CAN access multiple meta
contexts in parallel from a single NBD connection; an interop test
submitted to the libnbd project at the same time as this patch
demonstrates the feature working, as well as testing some corner cases
(for example, when the payload length is longer than the export
length), although other corner cases (like passing the same id
duplicated) require a protocol fuzzer because libnbd is not wired up
to break the protocol that badly.

This also includes tweaks to 'qemu-nbd --list' to show when a server
is advertising the capability, and to the testsuite to reflect the
addition to that output.

Of note: qemu will always advertise the new feature bit during
NBD_OPT_INFO if extended headers have already been negotiated
(regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has
occurred); but for NBD_OPT_GO, qemu only advertises the feature if
block status is also enabled (that is, if the client does not
negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so
the feature is not advertised).

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v5: factor out 'id - NBD_META_ID_DIRTY_BITMAP' [Vladimir], rework logic
on zero-length requests to be clearer [Vladimir], rebase to earlier
changes
---
 docs/interop/nbd.txt                          |   2 +-
 nbd/server.c                                  | 114 ++++++++++++++++--
 qemu-nbd.c                                    |   1 +
 nbd/trace-events                              |   1 +
 tests/qemu-iotests/223.out                    |  12 +-
 tests/qemu-iotests/307.out                    |  10 +-
 .../tests/nbd-qemu-allocation.out             |   2 +-
 7 files changed, 122 insertions(+), 20 deletions(-)
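
For reference, the payload carried by NBD_CMD_BLOCK_STATUS when
NBD_CMD_FLAG_PAYLOAD_LEN is set is a 64-bit big-endian effect length
followed by one 32-bit big-endian metadata context id per requested
context; that is the layout the server code in this patch parses with
ldq_be_p()/ldl_be_p().  A minimal client-side sketch of the encoding
follows; the helper name and the htobe64()/htonl() byte-order calls are
illustrative assumptions, not qemu or libnbd API.

/*
 * Illustrative encoder for the NBD_CMD_BLOCK_STATUS payload: 64-bit
 * big-endian effect length, then one 32-bit big-endian context id per
 * requested meta context.
 */
#include <endian.h>      /* htobe64() on glibc */
#include <arpa/inet.h>   /* htonl() */
#include <stdint.h>
#include <string.h>

static size_t encode_block_status_payload(uint8_t *buf, uint64_t effect_length,
                                          const uint32_t *ids, size_t nr_ids)
{
    uint64_t be_len = htobe64(effect_length);
    size_t off = 0;

    memcpy(buf + off, &be_len, sizeof(be_len));
    off += sizeof(be_len);
    for (size_t i = 0; i < nr_ids; i++) {
        /* ids as returned during NBD_OPT_SET_META_CONTEXT negotiation */
        uint32_t be_id = htonl(ids[i]);
        memcpy(buf + off, &be_id, sizeof(be_id));
        off += sizeof(be_id);
    }
    return off;   /* payload length to advertise in the request header */
}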

Comments

Vladimir Sementsov-Ogievskiy Sept. 30, 2023, 1:24 p.m. UTC | #1
On 25.09.23 22:22, Eric Blake wrote:
> Allow a client to request a subset of negotiated meta contexts.  For
> example, a client may ask to use a single connection to learn about
> both block status and dirty bitmaps, but where the dirty bitmap
> queries only need to be performed on a subset of the disk; forcing the
> server to compute that information on block status queries in the rest
> of the disk is wasted effort (both at the server, and on the amount of
> traffic sent over the wire to be parsed and ignored by the client).
> 
> Qemu as an NBD client never requests to use more than one meta
> context, so it has no need to use block status payloads.  Testing this
> instead requires support from libnbd, which CAN access multiple meta
> contexts in parallel from a single NBD connection; an interop test
> submitted to the libnbd project at the same time as this patch
> demonstrates the feature working, as well as testing some corner cases
> (for example, when the payload length is longer than the export
> length), although other corner cases (like passing the same id
> duplicated) require a protocol fuzzer because libnbd is not wired up
> to break the protocol that badly.
> 
> This also includes tweaks to 'qemu-nbd --list' to show when a server
> is advertising the capability, and to the testsuite to reflect the
> addition to that output.
> 
> Of note: qemu will always advertise the new feature bit during
> NBD_OPT_INFO if extended headers have already been negotiated
> (regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has
> occurred); but for NBD_OPT_GO, qemu only advertises the feature if
> block status is also enabled (that is, if the client does not
> negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so
> the feature is not advertised).
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 

[..]

> 
> +/*
> + * nbd_co_block_status_payload_read
> + * Called when a client wants a subset of negotiated contexts via a
> + * BLOCK_STATUS payload.  Check the payload for valid length and
> + * contents.  On success, return 0 with request updated to effective
> + * length.  If request was invalid but all payload consumed, return 0
> + * with request->len and request->contexts->count set to 0 (which will
> + * trigger an appropriate NBD_EINVAL response later on).  Return
> + * negative errno if the payload was not fully consumed.
> + */
> +static int
> +nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request,
> +                                 Error **errp)

[..]

> +        payload_len > (sizeof(NBDBlockStatusPayload) +
> +                       sizeof(id) * client->contexts.count)) {
> +        goto skip;
> +    }
> +
> +    buf = g_malloc(payload_len);
> +    if (nbd_read(client->ioc, buf, payload_len,
> +                 "CMD_BLOCK_STATUS data", errp) < 0) {
> +        return -EIO;
> +    }
> +    trace_nbd_co_receive_request_payload_received(request->cookie,
> +                                                  payload_len);
> +    request->contexts->bitmaps = g_new0(bool, nr_bitmaps);
> +    count = (payload_len - sizeof(NBDBlockStatusPayload)) / sizeof(id);
> +    payload_len = 0;
> +
> +    for (i = 0; i < count; i++) {
> +        id = ldl_be_p(buf + sizeof(NBDBlockStatusPayload) + sizeof(id) * i);
> +        if (id == NBD_META_ID_BASE_ALLOCATION) {
> +            if (request->contexts->base_allocation) {
> +                goto skip;
> +            }

should we also check that base_allocation is negotiated?

> +            request->contexts->base_allocation = true;
> +        } else if (id == NBD_META_ID_ALLOCATION_DEPTH) {
> +            if (request->contexts->allocation_depth) {
> +                goto skip;
> +            }

same here

> +            request->contexts->allocation_depth = true;
> +        } else {
> +            int idx = id - NBD_META_ID_DIRTY_BITMAP;
> +

I think we should also check that idx >= 0 after this operation.

> +            if (idx > nr_bitmaps || request->contexts->bitmaps[idx]) {
> +                goto skip;
> +            }
> +            request->contexts->bitmaps[idx] = true;
> +        }
> +    }
> +
> +    request->len = ldq_be_p(buf);
> +    request->contexts->count = count;
> +    return 0;
> +
> + skip:
> +    trace_nbd_co_receive_block_status_payload_compliance(request->from,
> +                                                         request->len);
> +    request->len = request->contexts->count = 0;
> +    return nbd_drop(client->ioc, payload_len, errp);
> +}
> +

[..]

> diff --git a/nbd/trace-events b/nbd/trace-events
> index 8f4e20ee9f2..ac186c19ec0 100644
> --- a/nbd/trace-events
> +++ b/nbd/trace-events
> @@ -70,6 +70,7 @@ nbd_co_send_chunk_read(uint64_t cookie, uint64_t offset, void *data, uint64_t si
>   nbd_co_send_chunk_read_hole(uint64_t cookie, uint64_t offset, uint64_t size) "Send structured read hole reply: cookie = %" PRIu64 ", offset = %" PRIu64 ", len = %" PRIu64
>   nbd_co_send_extents(uint64_t cookie, unsigned int extents, uint32_t id, uint64_t length, int last) "Send block status reply: cookie = %" PRIu64 ", extents = %u, context = %d (extents cover %" PRIu64 " bytes, last chunk = %d)"
>   nbd_co_send_chunk_error(uint64_t cookie, int err, const char *errname, const char *msg) "Send structured error reply: cookie = %" PRIu64 ", error = %d (%s), msg = '%s'"
> +nbd_co_receive_block_status_payload_compliance(uint64_t from, int len) "client sent unusable block status payload: from=0x%" PRIx64 ", len=0x%x"

both passed parameters, request->from and request->len, are actually uint64_t

>   nbd_co_receive_request_decode_type(uint64_t cookie, uint16_t type, const char *name) "Decoding type: cookie = %" PRIu64 ", type = %" PRIu16 " (%s)"
>   nbd_co_receive_request_payload_received(uint64_t cookie, uint64_t len) "Payload received: cookie = %" PRIu64 ", len = %" PRIu64
>   nbd_co_receive_ext_payload_compliance(uint64_t from, uint64_t len) "client sent non-compliant write without payload flag: from=0x%" PRIx64 ", len=0x%" PRIx64

[..]
Eric Blake Oct. 4, 2023, 9:55 p.m. UTC | #2
On Sat, Sep 30, 2023 at 04:24:02PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 25.09.23 22:22, Eric Blake wrote:
> > Allow a client to request a subset of negotiated meta contexts.  For
> > example, a client may ask to use a single connection to learn about
> > both block status and dirty bitmaps, but where the dirty bitmap
> > queries only need to be performed on a subset of the disk; forcing the
> > server to compute that information on block status queries in the rest
> > of the disk is wasted effort (both at the server, and on the amount of
> > traffic sent over the wire to be parsed and ignored by the client).
> > 
> > Qemu as an NBD client never requests to use more than one meta
> > context, so it has no need to use block status payloads.  Testing this
> > instead requires support from libnbd, which CAN access multiple meta
> > contexts in parallel from a single NBD connection; an interop test
> > submitted to the libnbd project at the same time as this patch
> > demonstrates the feature working, as well as testing some corner cases
> > (for example, when the payload length is longer than the export
> > length), although other corner cases (like passing the same id
> > duplicated) require a protocol fuzzer because libnbd is not wired up
> > to break the protocol that badly.
> > 
> > This also includes tweaks to 'qemu-nbd --list' to show when a server
> > is advertising the capability, and to the testsuite to reflect the
> > addition to that output.
> > 
> > Of note: qemu will always advertise the new feature bit during
> > NBD_OPT_INFO if extended headers have already been negotiated
> > (regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has
> > occurred); but for NBD_OPT_GO, qemu only advertises the feature if
> > block status is also enabled (that is, if the client does not
> > negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so
> > the feature is not advertised).
> > 
> > Signed-off-by: Eric Blake <eblake@redhat.com>
> > ---
> > 
> 
> [..]
> 
> > 
> > +/*
> > + * nbd_co_block_status_payload_read
> > + * Called when a client wants a subset of negotiated contexts via a
> > + * BLOCK_STATUS payload.  Check the payload for valid length and
> > + * contents.  On success, return 0 with request updated to effective
> > + * length.  If request was invalid but all payload consumed, return 0
> > + * with request->len and request->contexts->count set to 0 (which will
> > + * trigger an appropriate NBD_EINVAL response later on).  Return
> > + * negative errno if the payload was not fully consumed.
> > + */
> > +static int
> > +nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request,
> > +                                 Error **errp)
> 
> [..]
> 
> > +        payload_len > (sizeof(NBDBlockStatusPayload) +
> > +                       sizeof(id) * client->contexts.count)) {
> > +        goto skip;
> > +    }
> > +
> > +    buf = g_malloc(payload_len);
> > +    if (nbd_read(client->ioc, buf, payload_len,
> > +                 "CMD_BLOCK_STATUS data", errp) < 0) {
> > +        return -EIO;
> > +    }
> > +    trace_nbd_co_receive_request_payload_received(request->cookie,
> > +                                                  payload_len);
> > +    request->contexts->bitmaps = g_new0(bool, nr_bitmaps);
> > +    count = (payload_len - sizeof(NBDBlockStatusPayload)) / sizeof(id);
> > +    payload_len = 0;
> > +
> > +    for (i = 0; i < count; i++) {
> > +        id = ldl_be_p(buf + sizeof(NBDBlockStatusPayload) + sizeof(id) * i);
> > +        if (id == NBD_META_ID_BASE_ALLOCATION) {
> > +            if (request->contexts->base_allocation) {
> > +                goto skip;
> > +            }
> 
> should we also check that base_allocation is negotiated?

Oh, good point.  Without that check, the client can pass in random id
numbers that it never negotiated.  I've queued 1-11 and will probably
send a pull request for those this week, while respinning this patch
to fix the remaining issues you pointed out.

> 
> > +            request->contexts->base_allocation = true;
> > +        } else if (id == NBD_META_ID_ALLOCATION_DEPTH) {
> > +            if (request->contexts->allocation_depth) {
> > +                goto skip;
> > +            }
> 
> same here
> 
> > +            request->contexts->allocation_depth = true;
> > +        } else {
> > +            int idx = id - NBD_META_ID_DIRTY_BITMAP;
> > +
> 
> I think we should also check that idx >= 0 after this operation.
> 
> > +            if (idx > nr_bitmaps || request->contexts->bitmaps[idx]) {

Or else make idx an unsigned value, instead of signed.  Also a good catch.

> > +                goto skip;
> > +            }
> > +            request->contexts->bitmaps[idx] = true;
> > +        }
> > +    }
> > +
> > +    request->len = ldq_be_p(buf);
> > +    request->contexts->count = count;
> > +    return 0;
> > +
> > + skip:
> > +    trace_nbd_co_receive_block_status_payload_compliance(request->from,
> > +                                                         request->len);
> > +    request->len = request->contexts->count = 0;
> > +    return nbd_drop(client->ioc, payload_len, errp);
> > +}
> > +
> 
> [..]
> 
> > diff --git a/nbd/trace-events b/nbd/trace-events
> > index 8f4e20ee9f2..ac186c19ec0 100644
> > --- a/nbd/trace-events
> > +++ b/nbd/trace-events
> > @@ -70,6 +70,7 @@ nbd_co_send_chunk_read(uint64_t cookie, uint64_t offset, void *data, uint64_t si
> >   nbd_co_send_chunk_read_hole(uint64_t cookie, uint64_t offset, uint64_t size) "Send structured read hole reply: cookie = %" PRIu64 ", offset = %" PRIu64 ", len = %" PRIu64
> >   nbd_co_send_extents(uint64_t cookie, unsigned int extents, uint32_t id, uint64_t length, int last) "Send block status reply: cookie = %" PRIu64 ", extents = %u, context = %d (extents cover %" PRIu64 " bytes, last chunk = %d)"
> >   nbd_co_send_chunk_error(uint64_t cookie, int err, const char *errname, const char *msg) "Send structured error reply: cookie = %" PRIu64 ", error = %d (%s), msg = '%s'"
> > +nbd_co_receive_block_status_payload_compliance(uint64_t from, int len) "client sent unusable block status payload: from=0x%" PRIx64 ", len=0x%x"
> 
> both passed parameters, request->from and request->len, are actually uint64_t

Also a good catch.
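
The unsigned conversion mentioned above pairs with the bound check: for
any id outside the negotiated dirty-bitmap range, including one that
would make a signed index negative, the subtraction wraps to a large
unsigned value that the idx >= nr_bitmaps test (with the s/>/>=/ fix
discussed below) rejects, so no separate idx >= 0 check is needed.  A
standalone sketch, assuming NBD_META_ID_DIRTY_BITMAP is 2 as in qemu's
server.c; the program and its values are illustrative only.

/*
 * Standalone illustration of why 'unsigned idx' plus a single bound
 * check suffices.  In the real server, ids 0 and 1 take the earlier
 * base-allocation/allocation-depth branches; id 1 is kept here only to
 * show the would-be-negative case wrapping around.
 */
#include <stdio.h>

#define NBD_META_ID_DIRTY_BITMAP 2u   /* first dirty-bitmap context id (assumed) */

int main(void)
{
    unsigned nr_bitmaps = 3;                       /* bitmap ids would be 2..4 */
    unsigned ids[] = { 1, 2, 4, 5, 0xdeadbeefu };  /* only 2 and 4 are in range */

    for (size_t i = 0; i < sizeof(ids) / sizeof(ids[0]); i++) {
        unsigned idx = ids[i] - NBD_META_ID_DIRTY_BITMAP;   /* wraps when ids[i] < 2 */
        printf("id %#x -> idx %#x: %s\n", ids[i], idx,
               idx >= nr_bitmaps ? "rejected" : "accepted");
    }
    return 0;
}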
Eric Blake Oct. 5, 2023, 1:49 p.m. UTC | #3
On Wed, Oct 04, 2023 at 04:55:02PM -0500, Eric Blake wrote:
> > > +static int
> > > +nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request,
> > > +                                 Error **errp)
> > 
> > [..]

> > > +    for (i = 0; i < count; i++) {
> > > +        id = ldl_be_p(buf + sizeof(NBDBlockStatusPayload) + sizeof(id) * i);
> > > +        if (id == NBD_META_ID_BASE_ALLOCATION) {
> > > +            if (request->contexts->base_allocation) {
> > > +                goto skip;
> > > +            }
> > 
> > should we also check that base_allocation is negotiated?
> 
> Oh, good point.  Without that check, the client can pass in random id
> numbers that it never negotiated.  I've queued 1-11 and will probably
> send a pull request for those this week, while respinning this patch
> to fix the remaining issues you pointed out.

I'm squashing in the following. If you can review it today, I'll
include it in my pull request this afternoon; if not, we still have
time before soft freeze to get it in the next batch.

diff --git i/nbd/server.c w/nbd/server.c
index 30816b42386..62654579cbc 100644
--- i/nbd/server.c
+++ w/nbd/server.c
@@ -2478,19 +2478,22 @@ nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request,
     for (i = 0; i < count; i++) {
         id = ldl_be_p(buf + sizeof(NBDBlockStatusPayload) + sizeof(id) * i);
         if (id == NBD_META_ID_BASE_ALLOCATION) {
-            if (request->contexts->base_allocation) {
+            if (!client->contexts.base_allocation ||
+                request->contexts->base_allocation) {
                 goto skip;
             }
             request->contexts->base_allocation = true;
         } else if (id == NBD_META_ID_ALLOCATION_DEPTH) {
-            if (request->contexts->allocation_depth) {
+            if (!client->contexts.allocation_depth ||
+                request->contexts->allocation_depth) {
                 goto skip;
             }
             request->contexts->allocation_depth = true;
         } else {
-            int idx = id - NBD_META_ID_DIRTY_BITMAP;
+            unsigned idx = id - NBD_META_ID_DIRTY_BITMAP;

-            if (idx > nr_bitmaps || request->contexts->bitmaps[idx]) {
+            if (idx > nr_bitmaps || !client->contexts.bitmaps[idx] ||
+                request->contexts->bitmaps[idx]) {
                 goto skip;
             }
             request->contexts->bitmaps[idx] = true;
diff --git i/nbd/trace-events w/nbd/trace-events
index 3cf2d00e458..00ae3216a11 100644
--- i/nbd/trace-events
+++ w/nbd/trace-events
@@ -70,7 +70,7 @@ nbd_co_send_chunk_read(uint64_t cookie, uint64_t offset, void *data, uint64_t si
 nbd_co_send_chunk_read_hole(uint64_t cookie, uint64_t offset, uint64_t size) "Send structured read hole reply: cookie = %" PRIu64 ", offset = %" PRIu64 ", len = %" PRIu64
 nbd_co_send_extents(uint64_t cookie, unsigned int extents, uint32_t id, uint64_t length, int last) "Send block status reply: cookie = %" PRIu64 ", extents = %u, context = %d (extents cover %" PRIu64 " bytes, last chunk = %d)"
 nbd_co_send_chunk_error(uint64_t cookie, int err, const char *errname, const char *msg) "Send structured error reply: cookie = %" PRIu64 ", error = %d (%s), msg = '%s'"
-nbd_co_receive_block_status_payload_compliance(uint64_t from, int len) "client sent unusable block status payload: from=0x%" PRIx64 ", len=0x%x"
+nbd_co_receive_block_status_payload_compliance(uint64_t from, uint64_t len) "client sent unusable block status payload: from=0x%" PRIx64 ", len=0x%" PRIx64
 nbd_co_receive_request_decode_type(uint64_t cookie, uint16_t type, const char *name) "Decoding type: cookie = %" PRIu64 ", type = %" PRIu16 " (%s)"
 nbd_co_receive_request_payload_received(uint64_t cookie, uint64_t len) "Payload received: cookie = %" PRIu64 ", len = %" PRIu64
 nbd_co_receive_ext_payload_compliance(uint64_t from, uint64_t len) "client sent non-compliant write without payload flag: from=0x%" PRIx64 ", len=0x%" PRIx64
Vladimir Sementsov-Ogievskiy Oct. 5, 2023, 2:26 p.m. UTC | #4
On 05.10.23 16:49, Eric Blake wrote:
> On Wed, Oct 04, 2023 at 04:55:02PM -0500, Eric Blake wrote:
>>>> +static int
>>>> +nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request,
>>>> +                                 Error **errp)
>>>
>>> [..]
> 
>>>> +    for (i = 0; i < count; i++) {
>>>> +        id = ldl_be_p(buf + sizeof(NBDBlockStatusPayload) + sizeof(id) * i);
>>>> +        if (id == NBD_META_ID_BASE_ALLOCATION) {
>>>> +            if (request->contexts->base_allocation) {
>>>> +                goto skip;
>>>> +            }
>>>
>>> should we also check that base_allocation is negotiated?
>>
>> Oh, good point.  Without that check, the client can pass in random id
>> numbers that it never negotiated.  I've queued 1-11 and will probably
>> send a pull request for those this week, while respinning this patch
>> to fix the remaining issues you pointed out.
> 
> I'm squashing in the following. If you can review it today, I'll
> include it in my pull request this afternoon; if not, we still have
> time before soft freeze to get it in the next batch.
> 
> diff --git i/nbd/server.c w/nbd/server.c
> index 30816b42386..62654579cbc 100644
> --- i/nbd/server.c
> +++ w/nbd/server.c
> @@ -2478,19 +2478,22 @@ nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request,
>       for (i = 0; i < count; i++) {
>           id = ldl_be_p(buf + sizeof(NBDBlockStatusPayload) + sizeof(id) * i);
>           if (id == NBD_META_ID_BASE_ALLOCATION) {
> -            if (request->contexts->base_allocation) {
> +            if (!client->contexts.base_allocation ||
> +                request->contexts->base_allocation) {
>                   goto skip;
>               }
>               request->contexts->base_allocation = true;
>           } else if (id == NBD_META_ID_ALLOCATION_DEPTH) {
> -            if (request->contexts->allocation_depth) {
> +            if (!client->contexts.allocation_depth ||
> +                request->contexts->allocation_depth) {
>                   goto skip;
>               }
>               request->contexts->allocation_depth = true;
>           } else {
> -            int idx = id - NBD_META_ID_DIRTY_BITMAP;
> +            unsigned idx = id - NBD_META_ID_DIRTY_BITMAP;
> 
> -            if (idx > nr_bitmaps || request->contexts->bitmaps[idx]) {
> +            if (idx > nr_bitmaps || !client->contexts.bitmaps[idx] ||

Oops, I didn't notice: s/>/>=/, as nr_bitmaps is the length of the array.

> +                request->contexts->bitmaps[idx]) {
>                   goto skip;
>               }
>               request->contexts->bitmaps[idx] = true;
> diff --git i/nbd/trace-events w/nbd/trace-events
> index 3cf2d00e458..00ae3216a11 100644
> --- i/nbd/trace-events
> +++ w/nbd/trace-events
> @@ -70,7 +70,7 @@ nbd_co_send_chunk_read(uint64_t cookie, uint64_t offset, void *data, uint64_t si
>   nbd_co_send_chunk_read_hole(uint64_t cookie, uint64_t offset, uint64_t size) "Send structured read hole reply: cookie = %" PRIu64 ", offset = %" PRIu64 ", len = %" PRIu64
>   nbd_co_send_extents(uint64_t cookie, unsigned int extents, uint32_t id, uint64_t length, int last) "Send block status reply: cookie = %" PRIu64 ", extents = %u, context = %d (extents cover %" PRIu64 " bytes, last chunk = %d)"
>   nbd_co_send_chunk_error(uint64_t cookie, int err, const char *errname, const char *msg) "Send structured error reply: cookie = %" PRIu64 ", error = %d (%s), msg = '%s'"
> -nbd_co_receive_block_status_payload_compliance(uint64_t from, int len) "client sent unusable block status payload: from=0x%" PRIx64 ", len=0x%x"
> +nbd_co_receive_block_status_payload_compliance(uint64_t from, uint64_t len) "client sent unusable block status payload: from=0x%" PRIx64 ", len=0x%" PRIx64
>   nbd_co_receive_request_decode_type(uint64_t cookie, uint16_t type, const char *name) "Decoding type: cookie = %" PRIu64 ", type = %" PRIu16 " (%s)"
>   nbd_co_receive_request_payload_received(uint64_t cookie, uint64_t len) "Payload received: cookie = %" PRIu64 ", len = %" PRIu64
>   nbd_co_receive_ext_payload_compliance(uint64_t from, uint64_t len) "client sent non-compliant write without payload flag: from=0x%" PRIx64 ", len=0x%" PRIx64
> 
> 
>
Vladimir Sementsov-Ogievskiy Oct. 5, 2023, 2:33 p.m. UTC | #5
On 25.09.23 22:22, Eric Blake wrote:
> Allow a client to request a subset of negotiated meta contexts.  For
> example, a client may ask to use a single connection to learn about
> both block status and dirty bitmaps, but where the dirty bitmap
> queries only need to be performed on a subset of the disk; forcing the
> server to compute that information on block status queries in the rest
> of the disk is wasted effort (both at the server, and on the amount of
> traffic sent over the wire to be parsed and ignored by the client).
> 
> Qemu as an NBD client never requests to use more than one meta
> context, so it has no need to use block status payloads.  Testing this
> instead requires support from libnbd, which CAN access multiple meta
> contexts in parallel from a single NBD connection; an interop test
> submitted to the libnbd project at the same time as this patch
> demonstrates the feature working, as well as testing some corner cases
> (for example, when the payload length is longer than the export
> length), although other corner cases (like passing the same id
> duplicated) require a protocol fuzzer because libnbd is not wired up
> to break the protocol that badly.
> 
> This also includes tweaks to 'qemu-nbd --list' to show when a server
> is advertising the capability, and to the testsuite to reflect the
> addition to that output.
> 
> Of note: qemu will always advertise the new feature bit during
> NBD_OPT_INFO if extended headers have already been negotiated
> (regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has
> occurred); but for NBD_OPT_GO, qemu only advertises the feature if
> block status is also enabled (that is, if the client does not
> negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so
> the feature is not advertised).
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 
> v5: factor out 'id - NBD_META_ID_DIRTY_BITMAP' [Vladimir], rework logic
> on zero-length requests to be clearer [Vladimir], rebase to earlier
> changes

[..]

> +/*
> + * nbd_co_block_status_payload_read
> + * Called when a client wants a subset of negotiated contexts via a
> + * BLOCK_STATUS payload.  Check the payload for valid length and
> + * contents.  On success, return 0 with request updated to effective
> + * length.  If request was invalid but all payload consumed, return 0
> + * with request->len and request->contexts->count set to 0 (which will
> + * trigger an appropriate NBD_EINVAL response later on).  Return
> + * negative errno if the payload was not fully consumed.
> + */
> +static int
> +nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request,
> +                                 Error **errp)
> +{
> +    int payload_len = request->len;

payload_len should be uint64_t

> +    g_autofree char *buf = NULL;
> +    size_t count, i, nr_bitmaps;
> +    uint32_t id;
> +

otherwise, we may do something unexpected here when request->len is too big for an int:

> +    if (payload_len > NBD_MAX_BUFFER_SIZE) {
> +        error_setg(errp, "len (%" PRIu64 ") is larger than max len (%u)",
> +                   request->len, NBD_MAX_BUFFER_SIZE);
> +        return -EINVAL;
> +    }
> +
> +    assert(client->contexts.exp == client->exp);
> +    nr_bitmaps = client->exp->nr_export_bitmaps;
> +    request->contexts = g_new0(NBDMetaContexts, 1);
> +    request->contexts->exp = client->exp;
> +
> +    if (payload_len % sizeof(uint32_t) ||
> +        payload_len < sizeof(NBDBlockStatusPayload) ||
> +        payload_len > (sizeof(NBDBlockStatusPayload) +
> +                       sizeof(id) * client->contexts.count)) {
> +        goto skip;
> +    }

[..]

>    * connection right away, -EAGAIN to indicate we were interrupted and the
> @@ -2505,7 +2593,18 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req,
>           break;
> 
>       case NBD_CMD_BLOCK_STATUS:
> -        request->contexts = &client->contexts;
> +        if (extended_with_payload) {
> +            ret = nbd_co_block_status_payload_read(client, request, errp);
> +            if (ret < 0) {
> +                return ret;
> +            }
> +            /* payload now consumed */
> +            check_length = extended_with_payload = false;

why set extended_with_payload to false? It's a bit misleading, and you don't do this for the WRITE request.

> +            payload_len = 0;
> +            valid_flags |= NBD_CMD_FLAG_PAYLOAD_LEN;
> +        } else {
> +            request->contexts = &client->contexts;
> +        }
>           valid_flags |= NBD_CMD_FLAG_REQ_ONE;
>           break;
> 

[..]


with payload_len changed to uint64_t, your squash-in applied and s/>/>=/ fixed:
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Eric Blake Oct. 5, 2023, 3:22 p.m. UTC | #6
On Thu, Oct 05, 2023 at 05:33:26PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > +static int
> > +nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request,
> > +                                 Error **errp)
> > +{
> > +    int payload_len = request->len;
> 
> payload_len should be uint64_t
> 
> > +    g_autofree char *buf = NULL;
> > +    size_t count, i, nr_bitmaps;
> > +    uint32_t id;
> > +
> 
> otherwise, we may do something unexpected here when request->len is too big for an int:
> 
> > +    if (payload_len > NBD_MAX_BUFFER_SIZE) {
> > +        error_setg(errp, "len (%" PRIu64 ") is larger than max len (%u)",
> > +                   request->len, NBD_MAX_BUFFER_SIZE);
> > +        return -EINVAL;
> > +    }

Oh, it looks like I introduced that same type mismatch in commit
8db7e2d6 as well, although it remains latent until this series enables
request->len to actually exceed 32 bits.  I'll reply on 1/12 with
another squash I'm making there.

> > +
> > +    assert(client->contexts.exp == client->exp);
> > +    nr_bitmaps = client->exp->nr_export_bitmaps;
> > +    request->contexts = g_new0(NBDMetaContexts, 1);
> > +    request->contexts->exp = client->exp;
> > +
> > +    if (payload_len % sizeof(uint32_t) ||
> > +        payload_len < sizeof(NBDBlockStatusPayload) ||
> > +        payload_len > (sizeof(NBDBlockStatusPayload) +
> > +                       sizeof(id) * client->contexts.count)) {
> > +        goto skip;
> > +    }
> 
> [..]
> 
> >    * connection right away, -EAGAIN to indicate we were interrupted and the
> > @@ -2505,7 +2593,18 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req,
> >           break;
> > 
> >       case NBD_CMD_BLOCK_STATUS:
> > -        request->contexts = &client->contexts;
> > +        if (extended_with_payload) {
> > +            ret = nbd_co_block_status_payload_read(client, request, errp);
> > +            if (ret < 0) {
> > +                return ret;
> > +            }
> > +            /* payload now consumed */
> > +            check_length = extended_with_payload = false;
> 
> why set extended_with_payload to false? It's a bit misleading, and you don't do this for the WRITE request.

Indeed; it doesn't make any difference later in the function.  Will drop.

> 
> > +            payload_len = 0;
> > +            valid_flags |= NBD_CMD_FLAG_PAYLOAD_LEN;
> > +        } else {
> > +            request->contexts = &client->contexts;
> > +        }
> >           valid_flags |= NBD_CMD_FLAG_REQ_ONE;
> >           break;
> > 
> 
> [..]
> 
> 
> with payload_len changed to uint64_t, your squash-in applied and s/>/>=/ fixed:
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

Thanks for the careful review.
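
To make the payload_len type concern concrete: with extended headers a
client may legally send a 64-bit request->len, and storing it in an int
before the NBD_MAX_BUFFER_SIZE comparison can truncate it so that the
guard passes.  A minimal standalone sketch, assuming NBD_MAX_BUFFER_SIZE
is 32 MiB as in qemu; the values here are illustrative.

/*
 * Truncation hazard fixed by widening payload_len to uint64_t: a 4 GiB
 * length truncated into an int can sail past the buffer-size check.
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NBD_MAX_BUFFER_SIZE (32 * 1024 * 1024)   /* assumed to match qemu */

int main(void)
{
    uint64_t request_len = UINT64_C(1) << 32;    /* possible with extended headers */
    int as_int = request_len;                    /* implementation-defined; often 0 */
    uint64_t as_u64 = request_len;               /* the fix: keep the full width */

    printf("int:      %d -> %s\n", as_int,
           as_int > NBD_MAX_BUFFER_SIZE ? "rejected" : "NOT rejected");
    printf("uint64_t: %" PRIu64 " -> %s\n", as_u64,
           as_u64 > NBD_MAX_BUFFER_SIZE ? "rejected" : "NOT rejected");
    return 0;
}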

Patch

diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
index 9aae5e1f294..18efb251de9 100644
--- a/docs/interop/nbd.txt
+++ b/docs/interop/nbd.txt
@@ -69,4 +69,4 @@  NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
 NBD_CMD_FLAG_FAST_ZERO
 * 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"
 * 7.1: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports
-* 8.2: NBD_OPT_EXTENDED_HEADERS
+* 8.2: NBD_OPT_EXTENDED_HEADERS, NBD_FLAG_BLOCK_STATUS_PAYLOAD
diff --git a/nbd/server.c b/nbd/server.c
index 2d4cec75a49..898580a9b0b 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -512,6 +512,9 @@  static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes,
     if (client->mode >= NBD_MODE_STRUCTURED) {
         myflags |= NBD_FLAG_SEND_DF;
     }
+    if (client->mode >= NBD_MODE_EXTENDED && client->contexts.count) {
+        myflags |= NBD_FLAG_BLOCK_STAT_PAYLOAD;
+    }
     trace_nbd_negotiate_new_style_size_flags(client->exp->size, myflags);
     stq_be_p(buf, client->exp->size);
     stw_be_p(buf + 8, myflags);
@@ -699,6 +702,10 @@  static int nbd_negotiate_handle_info(NBDClient *client, Error **errp)
     if (client->mode >= NBD_MODE_STRUCTURED) {
         myflags |= NBD_FLAG_SEND_DF;
     }
+    if (client->mode >= NBD_MODE_EXTENDED &&
+        (client->contexts.count || client->opt == NBD_OPT_INFO)) {
+        myflags |= NBD_FLAG_BLOCK_STAT_PAYLOAD;
+    }
     trace_nbd_negotiate_new_style_size_flags(exp->size, myflags);
     stq_be_p(buf, exp->size);
     stw_be_p(buf + 8, myflags);
@@ -2420,6 +2427,87 @@  static int coroutine_fn nbd_co_send_bitmap(NBDClient *client,
     return nbd_co_send_extents(client, request, ea, last, context_id, errp);
 }

+/*
+ * nbd_co_block_status_payload_read
+ * Called when a client wants a subset of negotiated contexts via a
+ * BLOCK_STATUS payload.  Check the payload for valid length and
+ * contents.  On success, return 0 with request updated to effective
+ * length.  If request was invalid but all payload consumed, return 0
+ * with request->len and request->contexts->count set to 0 (which will
+ * trigger an appropriate NBD_EINVAL response later on).  Return
+ * negative errno if the payload was not fully consumed.
+ */
+static int
+nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request,
+                                 Error **errp)
+{
+    int payload_len = request->len;
+    g_autofree char *buf = NULL;
+    size_t count, i, nr_bitmaps;
+    uint32_t id;
+
+    if (payload_len > NBD_MAX_BUFFER_SIZE) {
+        error_setg(errp, "len (%" PRIu64 ") is larger than max len (%u)",
+                   request->len, NBD_MAX_BUFFER_SIZE);
+        return -EINVAL;
+    }
+
+    assert(client->contexts.exp == client->exp);
+    nr_bitmaps = client->exp->nr_export_bitmaps;
+    request->contexts = g_new0(NBDMetaContexts, 1);
+    request->contexts->exp = client->exp;
+
+    if (payload_len % sizeof(uint32_t) ||
+        payload_len < sizeof(NBDBlockStatusPayload) ||
+        payload_len > (sizeof(NBDBlockStatusPayload) +
+                       sizeof(id) * client->contexts.count)) {
+        goto skip;
+    }
+
+    buf = g_malloc(payload_len);
+    if (nbd_read(client->ioc, buf, payload_len,
+                 "CMD_BLOCK_STATUS data", errp) < 0) {
+        return -EIO;
+    }
+    trace_nbd_co_receive_request_payload_received(request->cookie,
+                                                  payload_len);
+    request->contexts->bitmaps = g_new0(bool, nr_bitmaps);
+    count = (payload_len - sizeof(NBDBlockStatusPayload)) / sizeof(id);
+    payload_len = 0;
+
+    for (i = 0; i < count; i++) {
+        id = ldl_be_p(buf + sizeof(NBDBlockStatusPayload) + sizeof(id) * i);
+        if (id == NBD_META_ID_BASE_ALLOCATION) {
+            if (request->contexts->base_allocation) {
+                goto skip;
+            }
+            request->contexts->base_allocation = true;
+        } else if (id == NBD_META_ID_ALLOCATION_DEPTH) {
+            if (request->contexts->allocation_depth) {
+                goto skip;
+            }
+            request->contexts->allocation_depth = true;
+        } else {
+            int idx = id - NBD_META_ID_DIRTY_BITMAP;
+
+            if (idx > nr_bitmaps || request->contexts->bitmaps[idx]) {
+                goto skip;
+            }
+            request->contexts->bitmaps[idx] = true;
+        }
+    }
+
+    request->len = ldq_be_p(buf);
+    request->contexts->count = count;
+    return 0;
+
+ skip:
+    trace_nbd_co_receive_block_status_payload_compliance(request->from,
+                                                         request->len);
+    request->len = request->contexts->count = 0;
+    return nbd_drop(client->ioc, payload_len, errp);
+}
+
 /* nbd_co_receive_request
  * Collect a client request. Return 0 if request looks valid, -EIO to drop
  * connection right away, -EAGAIN to indicate we were interrupted and the
@@ -2505,7 +2593,18 @@  static int coroutine_fn nbd_co_receive_request(NBDRequestData *req,
         break;

     case NBD_CMD_BLOCK_STATUS:
-        request->contexts = &client->contexts;
+        if (extended_with_payload) {
+            ret = nbd_co_block_status_payload_read(client, request, errp);
+            if (ret < 0) {
+                return ret;
+            }
+            /* payload now consumed */
+            check_length = extended_with_payload = false;
+            payload_len = 0;
+            valid_flags |= NBD_CMD_FLAG_PAYLOAD_LEN;
+        } else {
+            request->contexts = &client->contexts;
+        }
         valid_flags |= NBD_CMD_FLAG_REQ_ONE;
         break;

@@ -2747,16 +2846,16 @@  static coroutine_fn int nbd_handle_request(NBDClient *client,

     case NBD_CMD_BLOCK_STATUS:
         assert(request->contexts);
-        if (!request->len) {
-            return nbd_send_generic_reply(client, request, -EINVAL,
-                                          "need non-zero length", errp);
-        }
         assert(client->mode >= NBD_MODE_EXTENDED ||
                request->len <= UINT32_MAX);
         if (request->contexts->count) {
             bool dont_fragment = request->flags & NBD_CMD_FLAG_REQ_ONE;
             int contexts_remaining = request->contexts->count;

+            if (!request->len) {
+                return nbd_send_generic_reply(client, request, -EINVAL,
+                                              "need non-zero length", errp);
+            }
             if (request->contexts->base_allocation) {
                 ret = nbd_co_send_block_status(client, request,
                                                exp->common.blk,
@@ -2893,8 +2992,9 @@  static coroutine_fn void nbd_trip(void *opaque)
         goto disconnect;
     }

-    /* We must disconnect after NBD_CMD_WRITE if we did not
-     * read the payload.
+    /*
+     * We must disconnect after NBD_CMD_WRITE or BLOCK_STATUS with
+     * payload if we did not read the payload.
      */
     if (!req->complete) {
         error_setg(&local_err, "Request handling failed in intermediate state");
diff --git a/qemu-nbd.c b/qemu-nbd.c
index e6681450cfe..7f1734cfccc 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -219,6 +219,7 @@  static int qemu_nbd_client_list(SocketAddress *saddr, QCryptoTLSCreds *tls,
                 [NBD_FLAG_SEND_RESIZE_BIT]          = "resize",
                 [NBD_FLAG_SEND_CACHE_BIT]           = "cache",
                 [NBD_FLAG_SEND_FAST_ZERO_BIT]       = "fast-zero",
+                [NBD_FLAG_BLOCK_STAT_PAYLOAD_BIT]   = "block-status-payload",
             };

             printf("  size:  %" PRIu64 "\n", list[i].size);
diff --git a/nbd/trace-events b/nbd/trace-events
index 8f4e20ee9f2..ac186c19ec0 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -70,6 +70,7 @@  nbd_co_send_chunk_read(uint64_t cookie, uint64_t offset, void *data, uint64_t si
 nbd_co_send_chunk_read_hole(uint64_t cookie, uint64_t offset, uint64_t size) "Send structured read hole reply: cookie = %" PRIu64 ", offset = %" PRIu64 ", len = %" PRIu64
 nbd_co_send_extents(uint64_t cookie, unsigned int extents, uint32_t id, uint64_t length, int last) "Send block status reply: cookie = %" PRIu64 ", extents = %u, context = %d (extents cover %" PRIu64 " bytes, last chunk = %d)"
 nbd_co_send_chunk_error(uint64_t cookie, int err, const char *errname, const char *msg) "Send structured error reply: cookie = %" PRIu64 ", error = %d (%s), msg = '%s'"
+nbd_co_receive_block_status_payload_compliance(uint64_t from, int len) "client sent unusable block status payload: from=0x%" PRIx64 ", len=0x%x"
 nbd_co_receive_request_decode_type(uint64_t cookie, uint16_t type, const char *name) "Decoding type: cookie = %" PRIu64 ", type = %" PRIu16 " (%s)"
 nbd_co_receive_request_payload_received(uint64_t cookie, uint64_t len) "Payload received: cookie = %" PRIu64 ", len = %" PRIu64
 nbd_co_receive_ext_payload_compliance(uint64_t from, uint64_t len) "client sent non-compliant write without payload flag: from=0x%" PRIx64 ", len=0x%" PRIx64
diff --git a/tests/qemu-iotests/223.out b/tests/qemu-iotests/223.out
index 32f05f1c9af..e5e7f42caac 100644
--- a/tests/qemu-iotests/223.out
+++ b/tests/qemu-iotests/223.out
@@ -83,7 +83,7 @@  exports available: 0
 exports available: 3
  export: 'n'
   size:  4194304
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
@@ -94,7 +94,7 @@  exports available: 3
  export: 'n2'
   description: some text
   size:  4194304
-  flags: 0xded ( flush fua trim zeroes df multi cache fast-zero )
+  flags: 0x1ded ( flush fua trim zeroes df multi cache fast-zero block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
@@ -104,7 +104,7 @@  exports available: 3
    qemu:dirty-bitmap:b2
  export: 'n3'
   size:  4194304
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
@@ -205,7 +205,7 @@  exports available: 0
 exports available: 3
  export: 'n'
   size:  4194304
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
@@ -216,7 +216,7 @@  exports available: 3
  export: 'n2'
   description: some text
   size:  4194304
-  flags: 0xded ( flush fua trim zeroes df multi cache fast-zero )
+  flags: 0x1ded ( flush fua trim zeroes df multi cache fast-zero block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
@@ -226,7 +226,7 @@  exports available: 3
    qemu:dirty-bitmap:b2
  export: 'n3'
   size:  4194304
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
diff --git a/tests/qemu-iotests/307.out b/tests/qemu-iotests/307.out
index 2b9a6a67a1a..f645f3315f8 100644
--- a/tests/qemu-iotests/307.out
+++ b/tests/qemu-iotests/307.out
@@ -15,7 +15,7 @@  wrote 4096/4096 bytes at offset 0
 exports available: 1
  export: 'fmt'
   size:  67108864
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: XXX
   opt block: XXX
   max block: XXX
@@ -44,7 +44,7 @@  exports available: 1
 exports available: 1
  export: 'fmt'
   size:  67108864
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: XXX
   opt block: XXX
   max block: XXX
@@ -76,7 +76,7 @@  exports available: 1
 exports available: 2
  export: 'fmt'
   size:  67108864
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: XXX
   opt block: XXX
   max block: XXX
@@ -86,7 +86,7 @@  exports available: 2
  export: 'export1'
   description: This is the writable second export
   size:  67108864
-  flags: 0xded ( flush fua trim zeroes df multi cache fast-zero )
+  flags: 0x1ded ( flush fua trim zeroes df multi cache fast-zero block-status-payload )
   min block: XXX
   opt block: XXX
   max block: XXX
@@ -113,7 +113,7 @@  exports available: 1
  export: 'export1'
   description: This is the writable second export
   size:  67108864
-  flags: 0xded ( flush fua trim zeroes df multi cache fast-zero )
+  flags: 0x1ded ( flush fua trim zeroes df multi cache fast-zero block-status-payload )
   min block: XXX
   opt block: XXX
   max block: XXX
diff --git a/tests/qemu-iotests/tests/nbd-qemu-allocation.out b/tests/qemu-iotests/tests/nbd-qemu-allocation.out
index 1db16926ed4..56b57c69ed8 100644
--- a/tests/qemu-iotests/tests/nbd-qemu-allocation.out
+++ b/tests/qemu-iotests/tests/nbd-qemu-allocation.out
@@ -17,7 +17,7 @@  wrote 2097152/2097152 bytes at offset 1048576
 exports available: 1
  export: ''
   size:  4194304
-  flags: 0x48f ( readonly flush fua df cache )
+  flags: 0x148f ( readonly flush fua df cache block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432