diff mbox

nbd: Fix regression on resiliency to port scan

Message ID 20170608222617.20376-1-eblake@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Eric Blake June 8, 2017, 10:26 p.m. UTC
Back in qemu 2.5, qemu-nbd was immune to port probes (a transient
server would not quit, regardless of how many probe connections
came and went, until a connection actually negotiated).  But we
broke that in commit ee7d7aa when removing the return value to
nbd_client_new(), although that patch also introduced a bug causing
an assertion failure on a client that fails negotiation.  We then
made it worse during refactoring in commit 1a6245a (a segfault
before we could even assert); the (masked) assertion was cleaned
up in d3780c2 (still in 2.6), and just recently we finally fixed
the segfault ("nbd: Fully intialize client in case of failed
negotiation").  But that still means that ever since we added
TLS support to qemu-nbd, we have been vulnerable to an ill-timed
port-scan being able to cause a denial of service by taking down
qemu-nbd before a real client has a chance to connect.

Since negotiation is now handled asynchronously via coroutines,
we no longer have a synchronous point of return by re-adding a
return value to nbd_client_new().  So this patch instead wires
things up to pass the negotiation status through the close_fn
callback function.

Simple test across two terminals:
$ qemu-nbd -f raw -p 30001 file
$ nmap 127.0.0.1 -p 30001 && \
  qemu-io -c 'r 0 512' -f raw nbd://localhost:30001

Note that this patch does not change what constitutes successful
negotiation (thus, a client must enter transmission phase before
that client can be considered as a reason to terminate the server
when the connection ends).  Perhaps we may want to tweak things
in a later patch to also treat a client that uses NBD_OPT_ABORT
as being a 'successful' negotiation (the client correctly talked
the NBD protocol, and informed us it was not going to use our
export after all), but that's a discussion for another day.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451614

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 include/block/nbd.h |  2 +-
 blockdev-nbd.c      |  6 +++++-
 nbd/server.c        | 24 +++++++++++++++---------
 qemu-nbd.c          |  4 ++--
 4 files changed, 23 insertions(+), 13 deletions(-)

Comments

Eric Blake June 9, 2017, 11:17 a.m. UTC | #1
On 06/08/2017 05:26 PM, Eric Blake wrote:
> Back in qemu 2.5, qemu-nbd was immune to port probes (a transient
> server would not quit, regardless of how many probe connections
> came and went, until a connection actually negotiated).  But we
> broke that in commit ee7d7aa

> Simple test across two terminals:
> $ qemu-nbd -f raw -p 30001 file
> $ nmap 127.0.0.1 -p 30001 && \
>   qemu-io -c 'r 0 512' -f raw nbd://localhost:30001

This is now being assigned a CVE.
Eric Blake June 9, 2017, 11:58 a.m. UTC | #2
adding qemu-stable in cc

On 06/08/2017 05:26 PM, Eric Blake wrote:
> Back in qemu 2.5, qemu-nbd was immune to port probes (a transient
> server would not quit, regardless of how many probe connections
> came and went, until a connection actually negotiated).  But we
> broke that in commit ee7d7aa when removing the return value to
> nbd_client_new(), although that patch also introduced a bug causing
> an assertion failure on a client that fails negotiation.  We then
> made it worse during refactoring in commit 1a6245a (a segfault
> before we could even assert); the (masked) assertion was cleaned
> up in d3780c2 (still in 2.6), and just recently we finally fixed
> the segfault ("nbd: Fully intialize client in case of failed
> negotiation").  But that still means that ever since we added
> TLS support to qemu-nbd, we have been vulnerable to an ill-timed
> port-scan being able to cause a denial of service by taking down
> qemu-nbd before a real client has a chance to connect.
>
Max Reitz June 11, 2017, 11:50 a.m. UTC | #3
On 2017-06-09 00:26, Eric Blake wrote:
> Back in qemu 2.5, qemu-nbd was immune to port probes (a transient
> server would not quit, regardless of how many probe connections
> came and went, until a connection actually negotiated).  But we
> broke that in commit ee7d7aa when removing the return value to
> nbd_client_new(), although that patch also introduced a bug causing
> an assertion failure on a client that fails negotiation.  We then
> made it worse during refactoring in commit 1a6245a (a segfault
> before we could even assert); the (masked) assertion was cleaned
> up in d3780c2 (still in 2.6), and just recently we finally fixed
> the segfault ("nbd: Fully intialize client in case of failed
> negotiation").  But that still means that ever since we added
> TLS support to qemu-nbd, we have been vulnerable to an ill-timed
> port-scan being able to cause a denial of service by taking down
> qemu-nbd before a real client has a chance to connect.

Maybe it would be worthy to note explicitly that this does not affect
named exports (i.e. everything done with the system emulator).

> Since negotiation is now handled asynchronously via coroutines,
> we no longer have a synchronous point of return by re-adding a
> return value to nbd_client_new().  So this patch instead wires
> things up to pass the negotiation status through the close_fn
> callback function.
> 
> Simple test across two terminals:
> $ qemu-nbd -f raw -p 30001 file

Maybe this command line should contain a -t, because otherwise I don't
find it very CVE-worthy.

First, sure, you can kill this NBD server with the below nmap
invocation, or with an "ncat localhost 30001". But you can also do so
with a plain "qemu-io -c quit nbd://localhost:30001".

With -t you actually cannot kill it using ncat or qemu-io, but you can
with nmap. So this is what the actual issue is.

And secondly, note that even after this patch you can make the NBD
server quit with "ncat localhost 30001" (just like with qemu-io).

> $ nmap 127.0.0.1 -p 30001 && \
>   qemu-io -c 'r 0 512' -f raw nbd://localhost:30001
> 
> Note that this patch does not change what constitutes successful
> negotiation (thus, a client must enter transmission phase before
> that client can be considered as a reason to terminate the server
> when the connection ends).

Well, the thing is that transmission phase starts immediately for the
old-style protocol:

$ ./qemu-nbd -f raw null-co:// &
[1] 24581
$ ncat --send-only localhost 10809 < /dev/null
nbd/server.c:nbd_receive_request():L706: read failed
[1]  + 24581 done       ./qemu-nbd -f raw null-co://

I'm not sure whether this is intended, but then again I'm not sure
whether we can or have to do anything about this (as I said above, you
can easily make the server quit with a qemu-io invocation if you're so
inclined.

But the thing is that this really shows there should be a -t in the
qemu-nbd invocation in this commit message. The merit of this patch is
that it fixes a *crash* in qemu-nbd if someone closes the connection
immediately after opening it, not that it allows qemu-nbd to continue to
run when someone connects who is not an NBD client -- because it really
doesn't.

(With the new style, option negotiation fails so we do not go to
transmission phase and the server stays alive:

$ ./qemu-nbd -x foo -f raw null-co:// &
[1] 24609
$ ncat --send-only localhost 10809 < /dev/null
nbd/server.c:nbd_negotiate_options():L442: read failed
nbd/server.c:nbd_negotiate():L673: option negotiation failed
$ kill %1
[1]  + 24609 done       ./qemu-nbd -x foo -f raw null-co://

)

>                             Perhaps we may want to tweak things
> in a later patch to also treat a client that uses NBD_OPT_ABORT
> as being a 'successful' negotiation (the client correctly talked
> the NBD protocol, and informed us it was not going to use our
> export after all), but that's a discussion for another day.

Well, about that...

$ ./qemu-nbd -x foo -f raw null-co:// &
[1] 24389
$ ./qemu-io -c quit nbd://localhost/bar
can't open device nbd://localhost/bar: No export with name 'bar' available
[1]  + 24389 broken pipe  ./qemu-nbd -x foo -f raw null-co://

Even worse, the same happens with -t, both before and after this patch.

Having spent probably too much time for investigating, this happens
because the client just closes the pipe which results in a SIGPIPE sent
to qemu-nbd when it tries to sendmsg() (in qio_channel_socket_writev())
the ACK.

qemu-io and qemu-img use signal(SIGPIPE, SIG_IGN) (and os-posix.c has a
sigaction() call for the same). qemu-nbd should probably, too.
...Considering you're on vacation next week, maybe I should just write
that patch.

> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451614
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  include/block/nbd.h |  2 +-
>  blockdev-nbd.c      |  6 +++++-
>  nbd/server.c        | 24 +++++++++++++++---------
>  qemu-nbd.c          |  4 ++--
>  4 files changed, 23 insertions(+), 13 deletions(-)

With a -t in the qemu-nbd command line:

Reviewed-by: Max Reitz <mreitz@redhat.com>
Eric Blake June 27, 2017, 5:06 p.m. UTC | #4
On 06/11/2017 06:50 AM, Max Reitz wrote:
> On 2017-06-09 00:26, Eric Blake wrote:
>> Back in qemu 2.5, qemu-nbd was immune to port probes (a transient
>> server would not quit, regardless of how many probe connections
>> came and went, until a connection actually negotiated).  But we
>> broke that in commit ee7d7aa

>> Simple test across two terminals:
>> $ qemu-nbd -f raw -p 30001 file
> 
> Maybe this command line should contain a -t, because otherwise I don't
> find it very CVE-worthy.

I was on vacation when this patch got merged, so it's too late to modify
the commit message now.  However, I'm following up now; first to point
out that we now have the actual CVE number (it was not yet assigned when
I originally posted - and if we hadn't merged yet, I would also have
tweaked the commit message to call the id out): CVE-2017-9524

> 
> First, sure, you can kill this NBD server with the below nmap
> invocation, or with an "ncat localhost 30001". But you can also do so
> with a plain "qemu-io -c quit nbd://localhost:30001".

But in that case, without -t, you specifically wanted the server to go
away after the first successful client, and that was indeed a successful
client.

> 
> With -t you actually cannot kill it using ncat or qemu-io, but you can
> with nmap. So this is what the actual issue is.

You do make a point, though.  The command line example I gave is the
old-style NBD, where there is no named export, and where any client is
first in line for the given export - which is rather weak, and why
upstream NBD discourages the old protocol in favor of the new where the
client has to specify WHICH named export it wants.  And in the new
protocol, if you are not using -t, then you want the server to ignore
clients that aren't requesting the export that the server is willing to
serve, rather than hanging up on the first random client that requests
the wrong name.  And you properly investigated that scenario below...

> 
> And secondly, note that even after this patch you can make the NBD
> server quit with "ncat localhost 30001" (just like with qemu-io).
> 
>> $ nmap 127.0.0.1 -p 30001 && \
>>   qemu-io -c 'r 0 512' -f raw nbd://localhost:30001
>>
>> Note that this patch does not change what constitutes successful
>> negotiation (thus, a client must enter transmission phase before
>> that client can be considered as a reason to terminate the server
>> when the connection ends).
> 
> Well, the thing is that transmission phase starts immediately for the
> old-style protocol:
> 
> $ ./qemu-nbd -f raw null-co:// &
> [1] 24581
> $ ncat --send-only localhost 10809 < /dev/null
> nbd/server.c:nbd_receive_request():L706: read failed
> [1]  + 24581 done       ./qemu-nbd -f raw null-co://
> 
> I'm not sure whether this is intended, but then again I'm not sure
> whether we can or have to do anything about this (as I said above, you
> can easily make the server quit with a qemu-io invocation if you're so
> inclined.

Old-style negotiation is not as important these days as new-style where
a handshake occurs (in fact, upstream NBD documents but no longer
implements old-style negotiation, so it has definitely moved into
deprecated usage).

> 
> But the thing is that this really shows there should be a -t in the
> qemu-nbd invocation in this commit message. The merit of this patch is
> that it fixes a *crash* in qemu-nbd if someone closes the connection
> immediately after opening it, not that it allows qemu-nbd to continue to
> run when someone connects who is not an NBD client -- because it really
> doesn't.

So yes, your arguments are good that the REAL denial-of-service is not
against a server that expects only one client (or, with new-style, one
client that knows about the specific export), but against a server
running with -t that is supposed to be persistent regardless of how many
clients consecutively connect to it.

> 
> (With the new style, option negotiation fails so we do not go to
> transmission phase and the server stays alive:
> 
> $ ./qemu-nbd -x foo -f raw null-co:// &
> [1] 24609
> $ ncat --send-only localhost 10809 < /dev/null
> nbd/server.c:nbd_negotiate_options():L442: read failed
> nbd/server.c:nbd_negotiate():L673: option negotiation failed
> $ kill %1
> [1]  + 24609 done       ./qemu-nbd -x foo -f raw null-co://
> 
> )
> 
>>                             Perhaps we may want to tweak things
>> in a later patch to also treat a client that uses NBD_OPT_ABORT
>> as being a 'successful' negotiation (the client correctly talked
>> the NBD protocol, and informed us it was not going to use our
>> export after all), but that's a discussion for another day.
> 
> Well, about that...
> 
> $ ./qemu-nbd -x foo -f raw null-co:// &
> [1] 24389
> $ ./qemu-io -c quit nbd://localhost/bar
> can't open device nbd://localhost/bar: No export with name 'bar' available
> [1]  + 24389 broken pipe  ./qemu-nbd -x foo -f raw null-co://
> 
> Even worse, the same happens with -t, both before and after this patch.

You later submitted the fix for this (ignoring SIGPIPE); what I'm now
trying to figure out is whether your later patch is part of the same
CVE-2017-9524 fix, or whether we need a second CVE assigned for this
second denial-of-service scenario.

> 
> Having spent probably too much time for investigating, this happens
> because the client just closes the pipe which results in a SIGPIPE sent
> to qemu-nbd when it tries to sendmsg() (in qio_channel_socket_writev())
> the ACK.
> 
> qemu-io and qemu-img use signal(SIGPIPE, SIG_IGN) (and os-posix.c has a
> sigaction() call for the same). qemu-nbd should probably, too.
> ...Considering you're on vacation next week, maybe I should just write
> that patch.

Thanks for your investigations and followup patch.  Now, the task at
hand is to figure out how important your followup patch is to the CVE
question.

> 
>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451614
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---
>>  include/block/nbd.h |  2 +-
>>  blockdev-nbd.c      |  6 +++++-
>>  nbd/server.c        | 24 +++++++++++++++---------
>>  qemu-nbd.c          |  4 ++--
>>  4 files changed, 23 insertions(+), 13 deletions(-)
> 
> With a -t in the qemu-nbd command line:
> 
> Reviewed-by: Max Reitz <mreitz@redhat.com>
>
diff mbox

Patch

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 416257a..8fa5ce5 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -162,7 +162,7 @@  void nbd_client_new(NBDExport *exp,
                     QIOChannelSocket *sioc,
                     QCryptoTLSCreds *tlscreds,
                     const char *tlsaclname,
-                    void (*close)(NBDClient *));
+                    void (*close_fn)(NBDClient *, bool));
 void nbd_client_get(NBDClient *client);
 void nbd_client_put(NBDClient *client);

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index dd0860f..28f551a 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -27,6 +27,10 @@  typedef struct NBDServerData {

 static NBDServerData *nbd_server;

+static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
+{
+    nbd_client_put(client);
+}

 static gboolean nbd_accept(QIOChannel *ioc, GIOCondition condition,
                            gpointer opaque)
@@ -46,7 +50,7 @@  static gboolean nbd_accept(QIOChannel *ioc, GIOCondition condition,
     qio_channel_set_name(QIO_CHANNEL(cioc), "nbd-server");
     nbd_client_new(NULL, cioc,
                    nbd_server->tlscreds, NULL,
-                   nbd_client_put);
+                   nbd_blockdev_client_closed);
     object_unref(OBJECT(cioc));
     return TRUE;
 }
diff --git a/nbd/server.c b/nbd/server.c
index 49b55f6..f2b1aa4 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -81,7 +81,7 @@  static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);

 struct NBDClient {
     int refcount;
-    void (*close)(NBDClient *client);
+    void (*close_fn)(NBDClient *client, bool negotiated);

     bool no_zeroes;
     NBDExport *exp;
@@ -778,7 +778,7 @@  void nbd_client_put(NBDClient *client)
     }
 }

-static void client_close(NBDClient *client)
+static void client_close(NBDClient *client, bool negotiated)
 {
     if (client->closing) {
         return;
@@ -793,8 +793,8 @@  static void client_close(NBDClient *client)
                          NULL);

     /* Also tell the client, so that they release their reference.  */
-    if (client->close) {
-        client->close(client);
+    if (client->close_fn) {
+        client->close_fn(client, negotiated);
     }
 }

@@ -975,7 +975,7 @@  void nbd_export_close(NBDExport *exp)

     nbd_export_get(exp);
     QTAILQ_FOREACH_SAFE(client, &exp->clients, next, next) {
-        client_close(client);
+        client_close(client, true);
     }
     nbd_export_set_name(exp, NULL);
     nbd_export_set_description(exp, NULL);
@@ -1337,7 +1337,7 @@  done:

 out:
     nbd_request_put(req);
-    client_close(client);
+    client_close(client, true);
     nbd_client_put(client);
 }

@@ -1363,7 +1363,7 @@  static coroutine_fn void nbd_co_client_start(void *opaque)
     qemu_co_mutex_init(&client->send_lock);

     if (nbd_negotiate(data)) {
-        client_close(client);
+        client_close(client, false);
         goto out;
     }

@@ -1373,11 +1373,17 @@  out:
     g_free(data);
 }

+/*
+ * Create a new client listener on the given export @exp, using the
+ * given channel @sioc.  Begin servicing it in a coroutine.  When the
+ * connection closes, call @close_fn with an indication of whether the
+ * client completed negotiation.
+ */
 void nbd_client_new(NBDExport *exp,
                     QIOChannelSocket *sioc,
                     QCryptoTLSCreds *tlscreds,
                     const char *tlsaclname,
-                    void (*close_fn)(NBDClient *))
+                    void (*close_fn)(NBDClient *, bool))
 {
     NBDClient *client;
     NBDClientNewData *data = g_new(NBDClientNewData, 1);
@@ -1394,7 +1400,7 @@  void nbd_client_new(NBDExport *exp,
     object_ref(OBJECT(client->sioc));
     client->ioc = QIO_CHANNEL(sioc);
     object_ref(OBJECT(client->ioc));
-    client->close = close_fn;
+    client->close_fn = close_fn;

     data->client = client;
     data->co = qemu_coroutine_create(nbd_co_client_start, data);
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 651f85e..9464a04 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -336,10 +336,10 @@  static void nbd_export_closed(NBDExport *exp)

 static void nbd_update_server_watch(void);

-static void nbd_client_closed(NBDClient *client)
+static void nbd_client_closed(NBDClient *client, bool negotiated)
 {
     nb_fds--;
-    if (nb_fds == 0 && !persistent && state == RUNNING) {
+    if (negotiated && nb_fds == 0 && !persistent && state == RUNNING) {
         state = TERMINATE;
     }
     nbd_update_server_watch();