diff mbox

[Nbd] NBD_CMD_DISC

Message ID 20160412103406.GA28370@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Daniel P. Berrangé April 12, 2016, 10:34 a.m. UTC
On Tue, Apr 12, 2016 at 10:48:20AM +0100, Daniel P. Berrange wrote:
> On Sun, Apr 10, 2016 at 10:49:00AM +0100, Alex Bligh wrote:
> > (Daniel: if you want to replicate the issue, just run qemu-img info
> > against my gonbdserver with TLS. Every fifth NBD_CMD_DISC doesn't
> > get through before the TCP session closes).
> 
> Hmm, I'll have a look at this - I'm not sure its caused by
> the lack of gnutls_bye, as opposed to incorrect handling
> of EAGAIN when the block layer sends CMD_DISC

I have tried to reproduce with your server yet, but I did look at the
code and identified one potential flaw that might cause the behaviour
you mention. Can you try testing with the following change applied
to QEMU:



The current code will cause it to silently not send CMD_DISC if the socket
returns EAGAIN on send(). This change will cause it to do a poll to wait
and retry the send(). That said I'd be pretty surprised if we do actually
get EAGAIN in this scenario, as I'd expect the outgoing TCP buffers to be
empty at the point in which we close the client connection, but this fix
is worth a try.

Regards,
Daniel

Comments

Alex Bligh April 12, 2016, 11:14 a.m. UTC | #1
On 12 Apr 2016, at 11:34, Daniel P. Berrange <berrange@redhat.com> wrote:

> On Tue, Apr 12, 2016 at 10:48:20AM +0100, Daniel P. Berrange wrote:
>> On Sun, Apr 10, 2016 at 10:49:00AM +0100, Alex Bligh wrote:
>>> (Daniel: if you want to replicate the issue, just run qemu-img info
>>> against my gonbdserver with TLS. Every fifth NBD_CMD_DISC doesn't
>>> get through before the TCP session closes).
>> 
>> Hmm, I'll have a look at this - I'm not sure its caused by
>> the lack of gnutls_bye, as opposed to incorrect handling
>> of EAGAIN when the block layer sends CMD_DISC
> 
> I have tried to reproduce with your server yet, but I did look at the
> code and identified one potential flaw that might cause the behaviour
> you mention. Can you try testing with the following change applied
> to QEMU:
> 
> diff --git a/nbd/common.c b/nbd/common.c
> index 8ddb2dd..47757b6 100644
> --- a/nbd/common.c
> +++ b/nbd/common.c
> @@ -50,7 +50,7 @@ ssize_t nbd_wr_syncv(QIOChannel *ioc,
>                  * qio_channel_yield() that works with AIO contexts
>                  * and consider using that in this branch */
>                 qemu_coroutine_yield();
> -            } else if (done) {
> +            } else if (done || !do_read) {
>                 /* XXX this is needed by nbd_reply_ready.  */
>                 qio_channel_wait(ioc,
>                                  do_read ? G_IO_IN : G_IO_OUT);
> 
> 
> The current code will cause it to silently not send CMD_DISC if the socket
> returns EAGAIN on send(). This change will cause it to do a poll to wait
> and retry the send(). That said I'd be pretty surprised if we do actually
> get EAGAIN in this scenario, as I'd expect the outgoing TCP buffers to be
> empty at the point in which we close the client connection, but this fix
> is worth a try.

Well, what with pulling to the latest qemu master I now can't reproduce
the original problem against my server :-/ However, I have verified that
against my server it works with your patch (as well as without). I have
verified each time that I am receiving NBD_CMD_DISC.

It's also now working (with and without your patch) against nbd-server.c
(reference implementation) with my GnuTLS patches. This should be exactly
the same as before.

I suspect the issue may be timing related. Unless there's something else
fixed in master, the main change I've made is rebooted my MBP, which
was running very slowly.

I think you should probably still do a gnutls_bye() though.
diff mbox

Patch

diff --git a/nbd/common.c b/nbd/common.c
index 8ddb2dd..47757b6 100644
--- a/nbd/common.c
+++ b/nbd/common.c
@@ -50,7 +50,7 @@  ssize_t nbd_wr_syncv(QIOChannel *ioc,
                  * qio_channel_yield() that works with AIO contexts
                  * and consider using that in this branch */
                 qemu_coroutine_yield();
-            } else if (done) {
+            } else if (done || !do_read) {
                 /* XXX this is needed by nbd_reply_ready.  */
                 qio_channel_wait(ioc,
                                  do_read ? G_IO_IN : G_IO_OUT);