[Nbd,v5] doc: Add NBD_CMD_BLOCK_STATUS extension

On Tue, Dec 13, 2016 at 10:37:00AM +0000, Stefan Hajnoczi wrote:
> On Mon, Dec 12, 2016 at 09:43:13PM +0100, Wouter Verhelst wrote:
> > +- `NBD_OPT_LIST_META_CONTEXT` (10)
> > +
> > +    Return a list of `NBD_REP_META_CONTEXT` replies, one per context,
> > +    followed by an `NBD_REP_ACK`. If a server replies to such a request
> > +    with no error message, clients MAY send NBD_CMD_BLOCK_STATUS
> > +    commands during the transmission phase.
> > +
> > +    If the query string is syntactically invalid, the server SHOULD send
> > +    `NBD_REP_ERR_INVALID`. If the query string is syntactically valid
> > +    but finds no metadata contexts, the server MUST send a single
> > +    reply of type `NBD_REP_ACK`.
> > +
> > +    This option MUST NOT be requested unless structured replies have
> > +    been negotiated first. If a client attempts to do so, a server
> > +    SHOULD send `NBD_REP_ERR_INVALID`.
> > +
> > +    Data:
> > +    - 32 bits, length of export name
> > +    - String, name of export for which we wish to list or select metadata
> > +      contexts.
> > +    - 32 bits, length of query
> > +    - String, query to select a subset of the available metadata
> > +      contexts. If this is not specified (i.e., the "length of query"
> > +      field is 0 and no query is sent), then the server MUST send all
> > +      the metadata contexts it knows about. If specified, this query
> > +      string MUST start with a name that uniquely identifies a server
> > +      implementation; e.g., the reference implementation that
> > +      accompanies this document would support query strings starting
> > +      with 'nbd-server:'
> 
> What about querying "base:" if the NBD spec adds more standard metadata
> contexts in the future?  "base:" does not "uniquely identify a server
> implementation" but it should still be possible to query it so that
> additional contexts can be added to this spec in the future.

If that happens, we can update the spec to define how you can select
more than one context.

> Is the query matching algorithm defined anywhere?  I guess it is
> strncmp(query, context, strlen(query)) but I'm not sure from this text.

It was my intent to allow namespaces to define their own matching
algorithm. A simple strncmp could certainly work, but I could imagine
that some more complicated scheme could work too (e.g., a syntax to ask
"all snapshots created between date X and date Y")

> Another common syntax would use an asterisk wildcard ('*') so that it's
> possible to differentiate between full matches and (wildcard) partial
> matches.

Yes.

> > +    The server MUST reply with a list of `NBD_REP_META_CONTEXT` replies,
> > +    followed by `NBD_REP_ACK`. The metadata context ID in these replies
> > +    is reserved and SHOULD be set to zero; clients SHOULD disregard it.
> > +
> > +- `NBD_OPT_SET_META_CONTEXT` (11)
> > +
> > +    Change the set of active metadata contexts. Issuing this command
> > +    replaces all previously-set metadata contexts; clients must ensure
> > +    that all metadata contexts they're interested in are selected with
> > +    the final query that they sent.
> > +
> > +    Data:
> > +    - 32 bits, length of query
> > +    - String, query to select metadata contexts. The syntax of this
> > +      query is implementation-defined, except that it MUST start with a
> > +      namespace. This namespace may be one of the following:
> > +        - `base:`, for metadata contexts defined by this document;
> > +        - `nbd-server:`, for metadata contexts defined by the
> > +          implementation that accompanies this document (none
> > +          currently);
> > +        - `x-*:`, where `*` can be replaced by any random string not
> > +          containing colons, for local experiments. This SHOULD NOT be
> > +          used by metadata contexts that are expected to e widely used.
> 
> s/ e / be /

Thanks, fixed that.

> > +        - third-party implementations can register additional
> > +          namespaces by simple request to the mailinglist.
> 
> Does this mean it is possible to activate multiple contexts but only if
> their namespace is identical?  That seems like an arbitrary limitation.
>
> In other words, the spec suggests you can activate nbd-server:a and
> nbd-server:b but you cannot activate base:a and nbd-server:a.

As written, it does. It was my intent (hence the addition of a field
"length of query") to add the ability to send multiple queries in one
SET_META_CONTEXT, and then this wouldn't be a problem. I now see that I
forgot to add the wording to that effect.

I'm adding the following to remedy this oversight:

> I'd prefer a programming model where the contexts don't need to be set
> for the lifetime of the connection.  Instead the client passes a bitmap
> (64-bits?) of contexts along with each BLOCK_STATUS command.  That way
> the client can select contexts on a per-command basis.  It's unlikely
> that more than 64 contexts need to be available at once but I admit it's
> an arbitrary limitation...
> 
> I guess you've considered this model and decided it's better to
> negotiate upfront, it's wasteful to enable multiple contexts and then
> discard the response data on the client side because only a subset is
> needed for this command invocation.

I do agree that it might be nice to be able to enable or disable
particular metadata contexts for the lifetime of one BLOCK_STATUS
command. However, the problem then becomes one of "where do we do that".
The request header has a fixed size, and there is no space for such data
in the request header. This could be worked around in ways that do not
break compatibility with older implementations (not in the least because
we're defining a new command that needs to be negotiated first, and so
we could say that the server needs to understand a new request format),
but I haven't found a way to do so that doesn't strike me as "very
ugly".

In addition, most use cases that we've been presented with seem to
require only one or (at most) a handful of metadata contexts. In that
case, the ability to select which metadata contexts are to be used in
transmission doesn't strike me as a very useful possibility, which would
be used often. Given that, and given the problems described in the
previous paragraph, I'm not convinced it's worth complicating things
much over.

However, I could be convinced otherwise by strong arguments and by a
suggested spec for how to pass the required information in a clean way.

> > +    The server MUST reply with a number of `NBD_REP_META_CONTEXT`
> > +    replies, one for each selected metadata context, each with a unique
> > +    metadata context ID. It is not an error if a
> > +    `NBD_OPT_SET_META_CONTEXT` option does not select any metadata
> > +    context, provided the client then does not attempt to issue
> > +    `NBD_CMD_BLOCK_STATUS` commands.
> > +
> >  #### Option reply types
> >  
> >  These values are used in the "reply type" field, sent by the server
> > @@ -882,7 +989,7 @@ during option haggling in the fixed newstyle negotiation.
> >      information is available, or when sending data related to the option
> >      (in the case of `NBD_OPT_LIST`) has finished. No data.
> >  
> > -* `NBD_REP_SERVER` (2)
> > +- `NBD_REP_SERVER` (2)
> >  
> >      A description of an export. Data:
> >  
> > @@ -897,10 +1004,18 @@ during option haggling in the fixed newstyle negotiation.
> >        particular client request, this field is defined to be a string
> >        suitable for direct display to a human being.
> >  
> > -* `NBD_REP_INFO` (3)
> > +- `NBD_REP_INFO` (3)
> >  
> >      Defined by the experimental `INFO` [extension](https://github.com/NetworkBlockDevice/nbd/blob/extension-info/doc/proto.md).
> >  
> > +- `NBD_REP_META_CONTEXT` (4)
> > +
> > +    A description of a metadata context. Data:
> > +
> > +    - 32 bits, NBD metadata context ID.
> > +    - String, name of the metadata context. This is not required to be
> > +      a human-readable string, but it MUST be valid UTF-8 data.
> > +
> >  There are a number of error reply types, all of which are denoted by
> >  having bit 31 set. All error replies MAY have some data set, in which
> >  case that data is an error message string suitable for display to the user.
> > @@ -938,15 +1053,62 @@ case that data is an error message string suitable for display to the user.
> >  
> >      Defined by the experimental `INFO` [extension](https://github.com/NetworkBlockDevice/nbd/blob/extension-info/doc/proto.md).
> >  
> > -* `NBD_REP_ERR_SHUTDOWN` (2^32 + 7)
> > +* `NBD_REP_ERR_SHUTDOWN` (2^31 + 7)
> >  
> >      The server is unwilling to continue negotiation as it is in the
> >      process of being shut down.
> >  
> > -* `NBD_REP_ERR_BLOCK_SIZE_REQD` (2^32 + 8)
> > +* `NBD_REP_ERR_BLOCK_SIZE_REQD` (2^31 + 8)
> 
> Are these separate fixes that slipped into the patch?  Not directly
> relevant to BLOCK_STATUS.

Yeah, like I said, they were. I should probably update master (and
structured reply) to fix this, but it's not been as important ;-)

[Nbd,v5] doc: Add NBD_CMD_BLOCK_STATUS extension

Commit Message

Comments

Patch