diff mbox series

[12/14] nbd/client: Work around 3.0 bug for listing meta contexts

Message ID 20181130220344.3350618-13-eblake@redhat.com (mailing list archive)
State New, archived
Headers show
Series nbd: add qemu-nbd --list | expand

Commit Message

Eric Blake Nov. 30, 2018, 10:03 p.m. UTC
Commit 3d068aff forgot to advertise available qemu: contexts
when the client requests a list with 0 queries. Furthermore,
3.0 shipped with a qemu-img hack of x-dirty-bitmap (commit
216ee365) that _silently_ acts as though the entire image is
clean if a requested bitmap is not present.  Both bugs have
been recently fixed to give full output from the start, and
to refuse a connection if a requested x-dirty-bitmap was not
found.

Still, it is likely that there will be users that have to
work with a mix of old and new qemu versions, depending on
which features get backported where, at which point being
able to rely on 'qemu-img --list' output to know for sure
whether a given NBD export has the desired dirty bitmap is
much nicer than blindly connecting and risking that the
entire image may appear clean.  We can make our --list code
smart enough to work around buggy servers by tracking
whether we've seen any qemu: replies in the original 0-query
list; if not, recurse to a single query on "qemu:" (which
may still have no replies, but then we know for sure we
didn't trip up on the server bug).

Signed-off-by: Eric Blake <eblake@redhat.com>

---
Done as a separate patch to make it easier to revert when we no
longer care about 3.0 servers
---
 nbd/client.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Comments

Vladimir Sementsov-Ogievskiy Dec. 7, 2018, 11:21 a.m. UTC | #1
01.12.2018 1:03, Eric Blake wrote:
> Commit 3d068aff forgot to advertise available qemu: contexts
> when the client requests a list with 0 queries. Furthermore,
> 3.0 shipped with a qemu-img hack of x-dirty-bitmap (commit
> 216ee365) that _silently_ acts as though the entire image is
> clean if a requested bitmap is not present.  Both bugs have
> been recently fixed to give full output from the start, and
> to refuse a connection if a requested x-dirty-bitmap was not
> found.
> 
> Still, it is likely that there will be users that have to
> work with a mix of old and new qemu versions, depending on
> which features get backported where, at which point being
> able to rely on 'qemu-img --list' output to know for sure
> whether a given NBD export has the desired dirty bitmap is
> much nicer than blindly connecting and risking that the
> entire image may appear clean.  We can make our --list code
> smart enough to work around buggy servers by tracking
> whether we've seen any qemu: replies in the original 0-query
> list; if not, recurse to a single query on "qemu:" (which
> may still have no replies, but then we know for sure we
> didn't trip up on the server bug).

related thing:
should we document these bugs with corresponding version numbers in docs/interop/nbd.txt?


> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> Done as a separate patch to make it easier to revert when we no
> longer care about 3.0 servers
> ---
>   nbd/client.c | 11 ++++++++++-
>   1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/nbd/client.c b/nbd/client.c
> index 6292de560ee..928ecabd420 100644
> --- a/nbd/client.c
> +++ b/nbd/client.c
> @@ -21,6 +21,7 @@
>   #include "qapi/error.h"
>   #include "trace.h"
>   #include "nbd-internal.h"
> +#include "qemu/cutils.h"
> 
>   /* Definitions for opaque data types */
> 
> @@ -736,12 +737,13 @@ static int nbd_negotiate_simple_meta_context(QIOChannel *ioc,
>                   return -1;
>               }
>               g_free(name);
> +            received = true;
>           } else {
>               info->contexts = g_renew(char *, info->contexts,
>                                        ++info->n_contexts);
>               info->contexts[info->n_contexts - 1] = name;
> +            received |= strstart(name, "qemu:", NULL);

so, for _LIST_, it is actually received_qemu var. It has taken some time for me
to understand that it's ok (turns out, that this variable actually isn't used for
_LIST_ case).. Personally I don't like it, for me it's too tricky, but anyway, you
said, you'll refactor this function somehow.

>           }
> -        received = true;
> 
>           /* receive NBD_REP_ACK */
>           if (nbd_receive_option_reply(ioc, opt, &reply, errp) < 0) {
> @@ -771,6 +773,13 @@ static int nbd_negotiate_simple_meta_context(QIOChannel *ioc,
>           info->meta_base_allocation_id = received_id;
>       }
> 
> +    /* Recurse to work around qemu 3.0 bug - the server forgot to send
> +     * "qemu:" replies to 0 queries. */
> +    if (!context && !received) {
> +        return nbd_negotiate_simple_meta_context(ioc, opt, "qemu:", info,
> +                                                 errp);
> +    }
> +
>       return received || opt == NBD_OPT_LIST_META_CONTEXT;
>   }
>
Eric Blake Dec. 7, 2018, 3:21 p.m. UTC | #2
On 12/7/18 5:21 AM, Vladimir Sementsov-Ogievskiy wrote:
> 01.12.2018 1:03, Eric Blake wrote:
>> Commit 3d068aff forgot to advertise available qemu: contexts
>> when the client requests a list with 0 queries. Furthermore,
>> 3.0 shipped with a qemu-img hack of x-dirty-bitmap (commit
>> 216ee365) that _silently_ acts as though the entire image is
>> clean if a requested bitmap is not present.  Both bugs have
>> been recently fixed to give full output from the start, and
>> to refuse a connection if a requested x-dirty-bitmap was not
>> found.
>>
>> Still, it is likely that there will be users that have to
>> work with a mix of old and new qemu versions, depending on
>> which features get backported where, at which point being
>> able to rely on 'qemu-img --list' output to know for sure
>> whether a given NBD export has the desired dirty bitmap is
>> much nicer than blindly connecting and risking that the
>> entire image may appear clean.  We can make our --list code
>> smart enough to work around buggy servers by tracking
>> whether we've seen any qemu: replies in the original 0-query
>> list; if not, recurse to a single query on "qemu:" (which
>> may still have no replies, but then we know for sure we
>> didn't trip up on the server bug).
> 
> related thing:
> should we document these bugs with corresponding version numbers in docs/interop/nbd.txt?

Might be useful.  I'll add that as another patch in my v2 series.
diff mbox series

Patch

diff --git a/nbd/client.c b/nbd/client.c
index 6292de560ee..928ecabd420 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -21,6 +21,7 @@ 
 #include "qapi/error.h"
 #include "trace.h"
 #include "nbd-internal.h"
+#include "qemu/cutils.h"

 /* Definitions for opaque data types */

@@ -736,12 +737,13 @@  static int nbd_negotiate_simple_meta_context(QIOChannel *ioc,
                 return -1;
             }
             g_free(name);
+            received = true;
         } else {
             info->contexts = g_renew(char *, info->contexts,
                                      ++info->n_contexts);
             info->contexts[info->n_contexts - 1] = name;
+            received |= strstart(name, "qemu:", NULL);
         }
-        received = true;

         /* receive NBD_REP_ACK */
         if (nbd_receive_option_reply(ioc, opt, &reply, errp) < 0) {
@@ -771,6 +773,13 @@  static int nbd_negotiate_simple_meta_context(QIOChannel *ioc,
         info->meta_base_allocation_id = received_id;
     }

+    /* Recurse to work around qemu 3.0 bug - the server forgot to send
+     * "qemu:" replies to 0 queries. */
+    if (!context && !received) {
+        return nbd_negotiate_simple_meta_context(ioc, opt, "qemu:", info,
+                                                 errp);
+    }
+
     return received || opt == NBD_OPT_LIST_META_CONTEXT;
 }