Message ID | 20160412103406.GA28370@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 12 Apr 2016, at 11:34, Daniel P. Berrange <berrange@redhat.com> wrote: > On Tue, Apr 12, 2016 at 10:48:20AM +0100, Daniel P. Berrange wrote: >> On Sun, Apr 10, 2016 at 10:49:00AM +0100, Alex Bligh wrote: >>> (Daniel: if you want to replicate the issue, just run qemu-img info >>> against my gonbdserver with TLS. Every fifth NBD_CMD_DISC doesn't >>> get through before the TCP session closes). >> >> Hmm, I'll have a look at this - I'm not sure its caused by >> the lack of gnutls_bye, as opposed to incorrect handling >> of EAGAIN when the block layer sends CMD_DISC > > I have tried to reproduce with your server yet, but I did look at the > code and identified one potential flaw that might cause the behaviour > you mention. Can you try testing with the following change applied > to QEMU: > > diff --git a/nbd/common.c b/nbd/common.c > index 8ddb2dd..47757b6 100644 > --- a/nbd/common.c > +++ b/nbd/common.c > @@ -50,7 +50,7 @@ ssize_t nbd_wr_syncv(QIOChannel *ioc, > * qio_channel_yield() that works with AIO contexts > * and consider using that in this branch */ > qemu_coroutine_yield(); > - } else if (done) { > + } else if (done || !do_read) { > /* XXX this is needed by nbd_reply_ready. */ > qio_channel_wait(ioc, > do_read ? G_IO_IN : G_IO_OUT); > > > The current code will cause it to silently not send CMD_DISC if the socket > returns EAGAIN on send(). This change will cause it to do a poll to wait > and retry the send(). That said I'd be pretty surprised if we do actually > get EAGAIN in this scenario, as I'd expect the outgoing TCP buffers to be > empty at the point in which we close the client connection, but this fix > is worth a try. Well, what with pulling to the latest qemu master I now can't reproduce the original problem against my server :-/ However, I have verified that against my server it works with your patch (as well as without). I have verified each time that I am receiving NBD_CMD_DISC. It's also now working (with and without your patch) against nbd-server.c (reference implementation) with my GnuTLS patches. This should be exactly the same as before. I suspect the issue may be timing related. Unless there's something else fixed in master, the main change I've made is rebooted my MBP, which was running very slowly. I think you should probably still do a gnutls_bye() though.
diff --git a/nbd/common.c b/nbd/common.c index 8ddb2dd..47757b6 100644 --- a/nbd/common.c +++ b/nbd/common.c @@ -50,7 +50,7 @@ ssize_t nbd_wr_syncv(QIOChannel *ioc, * qio_channel_yield() that works with AIO contexts * and consider using that in this branch */ qemu_coroutine_yield(); - } else if (done) { + } else if (done || !do_read) { /* XXX this is needed by nbd_reply_ready. */ qio_channel_wait(ioc, do_read ? G_IO_IN : G_IO_OUT);