Message ID | 1466775608-31052-1-git-send-email-roman.penyaev@profitbricks.com (mailing list archive) |
---|---|
State | New, archived |
On 24.06.2016 at 15:40, Roman Pen wrote:
> This reverts commit ccb9dc10129954d0bcd7814298ed445e684d5a2a,
> which causes MQ to get stuck while doing I/O through virtio_blk.

It would be good to have a theory why this happens.

> diff --git a/block/linux-aio.c b/block/linux-aio.c
> index e468960..fe7cece 100644
> --- a/block/linux-aio.c
> +++ b/block/linux-aio.c
> @@ -149,8 +149,6 @@ static void qemu_laio_completion_bh(void *opaque)
>      if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
>          ioq_submit(s);
>      }
> -
> -    qemu_bh_cancel(s->completion_bh);
>  }

Maybe if a nested event loop cancels the BH, it's missing on the next
loop iteration. Before my patch, the nested callback happened to leave
an additional BH around which the outer one actually needs.

I find this a bit ugly, but if we're okay with this mechanism we could
add a counter for the nesting level and only cancel on the top level. If
you find it as ugly as I do, a cleaner solution would be to schedule the
BH inside the loop.

> @@ -158,7 +156,7 @@ static void qemu_laio_completion_cb(EventNotifier *e)
>      LinuxAioState *s = container_of(e, LinuxAioState, e);
>
>      if (event_notifier_test_and_clear(&s->e)) {
> -        qemu_laio_completion_bh(s);
> +        qemu_bh_schedule(s->completion_bh);
>      }
>  }

I can't see how this hunk would make a difference. Can you confirm that
just the first hunk is enough to fix the problem?

Kevin
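The nesting-counter idea suggested above can be sketched with a small toy model. This is only an illustration under stated assumptions, not QEMU code: `struct toy_laio`, `toy_completion_bh()` and `toy_nested_cb()` are hypothetical names, a plain flag stands in for the scheduled state of `s->completion_bh`, and a callback that re-enters the handler stands in for a nested event loop.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name in this block is hypothetical. */
struct toy_laio {
    bool bh_scheduled;         /* stands in for completion_bh being scheduled */
    int nesting;               /* number of active toy_completion_bh() frames */
    int pending;               /* completed events not yet processed */
    int processed;             /* events handed to callbacks so far */
    bool bh_seen_after_nested; /* recorded by toy_nested_cb() for inspection */
};

typedef void toy_cb(struct toy_laio *s);

/* Process all pending events; cancel the BH only in the outermost frame. */
void toy_completion_bh(struct toy_laio *s, toy_cb *cb)
{
    s->nesting++;
    s->bh_scheduled = true;    /* keep the BH around in case a callback nests */

    while (s->pending > 0) {
        s->pending--;
        s->processed++;
        if (cb != NULL) {
            cb(s);             /* may re-enter toy_completion_bh() */
        }
    }

    s->nesting--;
    if (s->nesting == 0) {
        s->bh_scheduled = false;   /* top level: now it is safe to cancel */
    }
}

/* Callback that re-enters the handler, as a nested event loop would. */
void toy_nested_cb(struct toy_laio *s)
{
    toy_completion_bh(s, NULL);
    /* Back in the outer frame: record whether the BH survived the nesting. */
    s->bh_seen_after_nested = s->bh_scheduled;
}
```

In this model, calling `toy_completion_bh(&s, toy_nested_cb)` with a few pending events leaves the flag set while the outer frame is still running and clears it only once the outermost call returns. Whether the real handler should use such a counter is exactly the open question in the message above.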
On Fri, Jun 24, 2016 at 3:46 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> On 24.06.2016 at 15:40, Roman Pen wrote:
>> This reverts commit ccb9dc10129954d0bcd7814298ed445e684d5a2a,
>> which causes MQ to get stuck while doing I/O through virtio_blk.
>
> It would be good to have a theory why this happens.

It's worth taking the batch notify BH out of the equation in
virtio_blk_data_plane_notify():

-    set_bit(virtio_get_queue_index(vq), s->batch_notify_vqs);
-    qemu_bh_schedule(s->bh);
+    if (virtio_should_notify(s->vdev, vq)) {
+        event_notifier_set(virtio_queue_get_guest_notifier(vq));
+    }

I wonder if that makes any difference?

I don't have a concrete theory why batch notify interferes with Kevin's
patch though.

Stefan
On Fri, Jun 24, 2016 at 2:40 PM, Roman Pen
<roman.penyaev@profitbricks.com> wrote:
> diff --git a/block/linux-aio.c b/block/linux-aio.c
> index e468960..fe7cece 100644
> --- a/block/linux-aio.c
> +++ b/block/linux-aio.c
> @@ -149,8 +149,6 @@ static void qemu_laio_completion_bh(void *opaque)
>      if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
>          ioq_submit(s);
>      }
> -
> -    qemu_bh_cancel(s->completion_bh);

This was the cause. I've found the root cause and will send a patch.

Stefan
On Fri, Jun 24, 2016 at 3:46 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> diff --git a/block/linux-aio.c b/block/linux-aio.c
>> index e468960..fe7cece 100644
>> --- a/block/linux-aio.c
>> +++ b/block/linux-aio.c
>> @@ -149,8 +149,6 @@ static void qemu_laio_completion_bh(void *opaque)
>>      if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
>>          ioq_submit(s);
>>      }
>> -
>> -    qemu_bh_cancel(s->completion_bh);
>>  }
>
> Maybe if a nested event loop cancels the BH, it's missing on the next
> loop iteration. Before my patch, the nested callback happened to leave
> an additional BH around which the outer one actually needs.

The scenario you described is:

qemu_laio_completion_bh()
-> cb1()
  -> aio_poll()
    -> qemu_laio_completion_bh()
    <- qemu_laio_completion_bh() (cancel BH)
  <- aio_poll()
<- cb1()
-> cb2()
  -> aio_poll() (hang!)

This hang seems impossible because the qemu_laio_completion_bh() loop
processes all pending events. Therefore cb1() consumes all pending
events and cb2() will not poll.

If new I/O was submitted during cb1() and cb2() waits for it, then the
eventfd will become readable upon completion and cb2() does not hang in
that case either.

If, instead of the original scenario, cb1() nests deeper, then the BH is
still scheduled and events will be processed without a hang.

In summary, the job of scheduling the BH is not to force all nested
callbacks to call qemu_laio_completion_bh(). Only the first nested
callback needs the BH so that all pending events will be processed.

Stefan
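The argument above can be checked against a small toy state machine. This is a sketch under stated assumptions, not QEMU code and not a reproduction of the reported hang: `struct toy_aio` and all `toy_*` functions are hypothetical, two flags stand in for the scheduled BH and the completion eventfd, and a counter stands in for completed events that have not been processed yet.

```c
#include <stdbool.h>

/* Toy model only: every name in this block is hypothetical. */
struct toy_aio {
    int completed;         /* events completed but not yet processed */
    bool bh_scheduled;     /* stands in for completion_bh being scheduled */
    bool eventfd_readable; /* stands in for the completion eventfd */
    int processed;         /* events processed so far */
};

/* A request completes: the event is queued and the eventfd is signalled. */
void toy_complete_io(struct toy_aio *s)
{
    s->completed++;
    s->eventfd_readable = true;
}

/* Drain every completed event, then cancel the BH (the behaviour under
 * discussion in this thread). */
void toy_completion_bh(struct toy_aio *s)
{
    while (s->completed > 0) {
        s->completed--;
        s->processed++;
    }
    s->bh_scheduled = false;
}

/* One pass of a nested event loop: run the handler if the BH is scheduled
 * or the eventfd is readable. */
void toy_aio_poll(struct toy_aio *s)
{
    if (s->bh_scheduled || s->eventfd_readable) {
        s->eventfd_readable = false;
        toy_completion_bh(s);
    }
}

/* In this model a blocking poll can only hang if events are waiting while
 * neither the BH nor the eventfd would wake the loop up. */
bool toy_would_hang(const struct toy_aio *s)
{
    return s->completed > 0 && !s->bh_scheduled && !s->eventfd_readable;
}
```

Starting from a state with events waiting and the BH scheduled, one `toy_aio_poll()` (the nested poll inside cb1()) drains everything and cancels the BH, so `toy_would_hang()` is false when cb2() runs. A later `toy_complete_io()` makes the eventfd readable, so the predicate stays false in that case as well. The model only restates the reasoning; it says nothing about interactions outside these three pieces of state.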
diff --git a/block/linux-aio.c b/block/linux-aio.c
index e468960..fe7cece 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -149,8 +149,6 @@ static void qemu_laio_completion_bh(void *opaque)
     if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
         ioq_submit(s);
     }
-
-    qemu_bh_cancel(s->completion_bh);
 }
 
 static void qemu_laio_completion_cb(EventNotifier *e)
@@ -158,7 +156,7 @@ static void qemu_laio_completion_cb(EventNotifier *e)
     LinuxAioState *s = container_of(e, LinuxAioState, e);
 
     if (event_notifier_test_and_clear(&s->e)) {
-        qemu_laio_completion_bh(s);
+        qemu_bh_schedule(s->completion_bh);
     }
 }
This reverts commit ccb9dc10129954d0bcd7814298ed445e684d5a2a,
which causes MQ to get stuck while doing I/O through virtio_blk.

I can very easily reproduce this hang on Stefan's recent v4 series
using num-queues=4:

  "[PATCH v4 0/7] virtio-blk: multiqueue support"
  https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg05999.html

Some debug output from guest:
-----------------------------

[root@andbd-vm ~]# cat /sys/block/vda/inflight
     106       98

[root@andbd-vm ~]# cat /sys/block/vda/mq/*/tags
nr_tags=128, reserved_tags=0, bits_per_word=5
nr_free=89, nr_reserved=0
active_queues=0
nr_tags=128, reserved_tags=0, bits_per_word=5
nr_free=83, nr_reserved=0
active_queues=0
nr_tags=128, reserved_tags=0, bits_per_word=5
nr_free=31, nr_reserved=0
active_queues=0
nr_tags=128, reserved_tags=0, bits_per_word=5
nr_free=105, nr_reserved=0
active_queues=0

Fio configuration:
------------------
[global]
description=Emulation of Storage Server Access Pattern
bssplit=512/20:1k/16:2k/9:4k/12:8k/19:16k/10:32k/8:64k/4
fadvise_hint=0
rw=randrw:2
direct=1

ioengine=libaio
iodepth=64
iodepth_batch_submit=64
iodepth_batch_complete=64
numjobs=8
gtod_reduce=1
group_reporting=1

time_based=1
runtime=30

[job]
filename=/dev/vda

VM configuration:
-----------------
-object iothread,id=t0 \
-drive if=none,id=d0,file=/dev/nullb0,format=raw,snapshot=off,cache=none,aio=native \
-device virtio-blk-pci,drive=d0,iothread=t0,num-queues=4,disable-modern=off,disable-legacy=on \

Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-devel@nongnu.org
---
 block/linux-aio.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)