[1/1] nbd: fix NBD_CMD_CACHE negitiation according to the NBD specification
diff mbox series

Message ID 20181004100313.4253-1-den@openvz.org
State New
Headers show
Series
  • [1/1] nbd: fix NBD_CMD_CACHE negitiation according to the NBD specification
Related show

Commit Message

Denis V. Lunev Oct. 4, 2018, 10:03 a.m. UTC
Commit bc37b06a5 was made very bad thing, it has been added
NBD_FLAG_SEND_CACHE flag for negotiation. The problem is that the value
of the flag was taken wrong and directly violates NBD specification.
This value (bit 8) is used at least in the Linux kernel as a part of
stable userspace-kernel API since 4.10 as NBD_FLAG_CAN_MULTI_CONN
as defined in the specification:

"bit 8, NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates
entirely without cache, or that the cache it uses is shared among all
connections to the given device. In particular, if this flag is
present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA
MUST be visible across all connections when the server sends its reply
to that command to the client. In the absense of this flag, clients
SHOULD NOT multiplex their commands over more than one connection to
the export.
...
bit 10, NBD_FLAG_SEND_CACHE: documents that the server understands
NBD_CMD_CACHE; however, note that server implementations exist
which support the command without advertising this bit, and
conversely that this bit does not guarantee that the command will
succeed or have an impact."

Personally I do not see any option that we will be allowed to fix the
specification in favor of QEMU and apply the fix to the kernel. This
is a big-big problem. Let us fix this in QEMU as soon as possible.

Thus the commit fixes negotiation flag in QEMU accoring to the
specification. Fortunately enough the bit has been added by Virtuozzo
and there are no released products in the field with this bit used
so far.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
CC: Valery Vdovin <valery.vdovin@acronis.com>
CC: Eric Blake <eblake@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
---
 include/block/nbd.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eric Blake Oct. 4, 2018, 12:27 p.m. UTC | #1
On 10/4/18 5:03 AM, Denis V. Lunev wrote:
> Commit bc37b06a5 was made very bad thing, it has been added
> NBD_FLAG_SEND_CACHE flag for negotiation.

Oof. Probably my fault for not doing a careful review against the 
upstream spec.

> The problem is that the value
> of the flag was taken wrong and directly violates NBD specification.
> This value (bit 8) is used at least in the Linux kernel as a part of
> stable userspace-kernel API since 4.10 as NBD_FLAG_CAN_MULTI_CONN
> as defined in the specification:

And a kernel that honors that bit can cause qemu to misbehave. Yeah, 
that's definitely undesirable.

> 
> "bit 8, NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates
> entirely without cache, or that the cache it uses is shared among all
> connections to the given device. In particular, if this flag is
> present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA
> MUST be visible across all connections when the server sends its reply
> to that command to the client. In the absense of this flag, clients

Oh fun - a typo in the NBD spec (should be absence). I'll fix that 
separately.

> SHOULD NOT multiplex their commands over more than one connection to
> the export.
> ...
> bit 10, NBD_FLAG_SEND_CACHE: documents that the server understands
> NBD_CMD_CACHE; however, note that server implementations exist
> which support the command without advertising this bit, and
> conversely that this bit does not guarantee that the command will
> succeed or have an impact."
> 
> Personally I do not see any option that we will be allowed to fix the
> specification in favor of QEMU and apply the fix to the kernel. This
> is a big-big problem. Let us fix this in QEMU as soon as possible.

Correct. Since the kernel is already one client that supports 
CAN_MULTI_CONN, qemu 3.0 was wrong for declaring the wrong bit value.

> 
> Thus the commit fixes negotiation flag in QEMU accoring to the

s/according/according/

> specification. Fortunately enough the bit has been added by Virtuozzo
> and there are no released products in the field with this bit used
> so far.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> CC: Valery Vdovin <valery.vdovin@acronis.com>
> CC: Eric Blake <eblake@redhat.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   include/block/nbd.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index 4638c839f5..4377fa502c 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -135,7 +135,7 @@ typedef struct NBDExtent {
>   #define NBD_FLAG_SEND_TRIM         (1 << 5) /* Send TRIM (discard) */
>   #define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) /* Send WRITE_ZEROES */
>   #define NBD_FLAG_SEND_DF           (1 << 7) /* Send DF (Do not Fragment) */
> -#define NBD_FLAG_SEND_CACHE        (1 << 8) /* Send CACHE (prefetch) */
> +#define NBD_FLAG_SEND_CACHE        (1 << 10) /* Send CACHE (prefetch) */

I'll probably amend this to list all NBD_FLAG_ values in the spec (even 
if qemu doesn't implement them yet), just to make it easier to avoid 
making this mistake in the future.

Reviewed-by: Eric Blake <eblake@redhat.com>

Will queue through my NBD tree.
Vladimir Sementsov-Ogievskiy Oct. 4, 2018, 12:51 p.m. UTC | #2
04.10.2018 15:27, Eric Blake wrote:
> On 10/4/18 5:03 AM, Denis V. Lunev wrote:
>> Commit bc37b06a5 was made very bad thing, it has been added
>> NBD_FLAG_SEND_CACHE flag for negotiation.
>
> Oof. Probably my fault for not doing a careful review against the 
> upstream spec.

mostly my, to introduce this =(. really too bad

>
>> The problem is that the value
>> of the flag was taken wrong and directly violates NBD specification.
>> This value (bit 8) is used at least in the Linux kernel as a part of
>> stable userspace-kernel API since 4.10 as NBD_FLAG_CAN_MULTI_CONN
>> as defined in the specification:
>
> And a kernel that honors that bit can cause qemu to misbehave. Yeah, 
> that's definitely undesirable.

As I understand opposite: kernel client will assume support for 
multi_conn feature which declares some guarantees about FLUSH, but qemu 
don't give these guarantees.

>
>>
>> "bit 8, NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates
>> entirely without cache, or that the cache it uses is shared among all
>> connections to the given device. In particular, if this flag is
>> present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA
>> MUST be visible across all connections when the server sends its reply
>> to that command to the client. In the absense of this flag, clients
>
> Oh fun - a typo in the NBD spec (should be absence). I'll fix that 
> separately.
>
>> SHOULD NOT multiplex their commands over more than one connection to
>> the export.
>> ...
>> bit 10, NBD_FLAG_SEND_CACHE: documents that the server understands
>> NBD_CMD_CACHE; however, note that server implementations exist
>> which support the command without advertising this bit, and
>> conversely that this bit does not guarantee that the command will
>> succeed or have an impact."
>>
>> Personally I do not see any option that we will be allowed to fix the
>> specification in favor of QEMU and apply the fix to the kernel. This
>> is a big-big problem. Let us fix this in QEMU as soon as possible.
>
> Correct. Since the kernel is already one client that supports 
> CAN_MULTI_CONN, qemu 3.0 was wrong for declaring the wrong bit value.
>
>>
>> Thus the commit fixes negotiation flag in QEMU accoring to the
>
> s/according/according/
>
>> specification. Fortunately enough the bit has been added by Virtuozzo
>> and there are no released products in the field with this bit used
>> so far.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> CC: Valery Vdovin <valery.vdovin@acronis.com>
>> CC: Eric Blake <eblake@redhat.com>
>> CC: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   include/block/nbd.h | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/block/nbd.h b/include/block/nbd.h
>> index 4638c839f5..4377fa502c 100644
>> --- a/include/block/nbd.h
>> +++ b/include/block/nbd.h
>> @@ -135,7 +135,7 @@ typedef struct NBDExtent {
>>   #define NBD_FLAG_SEND_TRIM         (1 << 5) /* Send TRIM (discard) */
>>   #define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) /* Send WRITE_ZEROES */
>>   #define NBD_FLAG_SEND_DF           (1 << 7) /* Send DF (Do not 
>> Fragment) */
>> -#define NBD_FLAG_SEND_CACHE        (1 << 8) /* Send CACHE (prefetch) */
>> +#define NBD_FLAG_SEND_CACHE        (1 << 10) /* Send CACHE 
>> (prefetch) */
>
> I'll probably amend this to list all NBD_FLAG_ values in the spec 
> (even if qemu doesn't implement them yet), just to make it easier to 
> avoid making this mistake in the future.
>
> Reviewed-by: Eric Blake <eblake@redhat.com>
>
> Will queue through my NBD tree.
>

I vote for adding all values at least as a comment

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Denis V. Lunev Oct. 4, 2018, 1:02 p.m. UTC | #3
s/negitiation/negotiation/

On 10/04/2018 01:03 PM, Denis V. Lunev wrote:
> Commit bc37b06a5 was made very bad thing, it has been added
> NBD_FLAG_SEND_CACHE flag for negotiation. The problem is that the value
> of the flag was taken wrong and directly violates NBD specification.
> This value (bit 8) is used at least in the Linux kernel as a part of
> stable userspace-kernel API since 4.10 as NBD_FLAG_CAN_MULTI_CONN
> as defined in the specification:
>
> "bit 8, NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates
> entirely without cache, or that the cache it uses is shared among all
> connections to the given device. In particular, if this flag is
> present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA
> MUST be visible across all connections when the server sends its reply
> to that command to the client. In the absense of this flag, clients
> SHOULD NOT multiplex their commands over more than one connection to
> the export.
> ...
> bit 10, NBD_FLAG_SEND_CACHE: documents that the server understands
> NBD_CMD_CACHE; however, note that server implementations exist
> which support the command without advertising this bit, and
> conversely that this bit does not guarantee that the command will
> succeed or have an impact."
>
> Personally I do not see any option that we will be allowed to fix the
> specification in favor of QEMU and apply the fix to the kernel. This
> is a big-big problem. Let us fix this in QEMU as soon as possible.
>
> Thus the commit fixes negotiation flag in QEMU accoring to the
> specification. Fortunately enough the bit has been added by Virtuozzo
> and there are no released products in the field with this bit used
> so far.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> CC: Valery Vdovin <valery.vdovin@acronis.com>
> CC: Eric Blake <eblake@redhat.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/block/nbd.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index 4638c839f5..4377fa502c 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -135,7 +135,7 @@ typedef struct NBDExtent {
>  #define NBD_FLAG_SEND_TRIM         (1 << 5) /* Send TRIM (discard) */
>  #define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) /* Send WRITE_ZEROES */
>  #define NBD_FLAG_SEND_DF           (1 << 7) /* Send DF (Do not Fragment) */
> -#define NBD_FLAG_SEND_CACHE        (1 << 8) /* Send CACHE (prefetch) */
> +#define NBD_FLAG_SEND_CACHE        (1 << 10) /* Send CACHE (prefetch) */
>  
>  /* New-style handshake (global) flags, sent from server to client, and
>     control what will happen during handshake phase. */
Eric Blake Oct. 4, 2018, 1:49 p.m. UTC | #4
On 10/4/18 7:51 AM, Vladimir Sementsov-Ogievskiy wrote:
> 04.10.2018 15:27, Eric Blake wrote:
>> On 10/4/18 5:03 AM, Denis V. Lunev wrote:
>>> Commit bc37b06a5 was made very bad thing, it has been added
>>> NBD_FLAG_SEND_CACHE flag for negotiation.
>>
>> Oof. Probably my fault for not doing a careful review against the
>> upstream spec.
> 
> mostly my, to introduce this =(. really too bad
> 
>>
>>> The problem is that the value
>>> of the flag was taken wrong and directly violates NBD specification.
>>> This value (bit 8) is used at least in the Linux kernel as a part of
>>> stable userspace-kernel API since 4.10 as NBD_FLAG_CAN_MULTI_CONN
>>> as defined in the specification:
>>
>> And a kernel that honors that bit can cause qemu to misbehave. Yeah,
>> that's definitely undesirable.
> 
> As I understand opposite: kernel client will assume support for
> multi_conn feature which declares some guarantees about FLUSH, but qemu
> don't give these guarantees.

Yeah, that's my biggest worry - data corruption when the kernel assumes 
it can make multiple connections with consistent caching, but the 
caching is not consistent, leading to data corruption when connected to 
an unpatched qemu 3.0 server.  I'm rewording the commit message along 
those lines.


>> I'll probably amend this to list all NBD_FLAG_ values in the spec
>> (even if qemu doesn't implement them yet), just to make it easier to
>> avoid making this mistake in the future.
>>
>> Reviewed-by: Eric Blake <eblake@redhat.com>
>>
>> Will queue through my NBD tree.
>>
> 
> I vote for adding all values at least as a comment
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
>

Patch
diff mbox series

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 4638c839f5..4377fa502c 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -135,7 +135,7 @@  typedef struct NBDExtent {
 #define NBD_FLAG_SEND_TRIM         (1 << 5) /* Send TRIM (discard) */
 #define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) /* Send WRITE_ZEROES */
 #define NBD_FLAG_SEND_DF           (1 << 7) /* Send DF (Do not Fragment) */
-#define NBD_FLAG_SEND_CACHE        (1 << 8) /* Send CACHE (prefetch) */
+#define NBD_FLAG_SEND_CACHE        (1 << 10) /* Send CACHE (prefetch) */
 
 /* New-style handshake (global) flags, sent from server to client, and
    control what will happen during handshake phase. */