Message ID | 1455470231-5223-6-git-send-email-pbonzini@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sun, 14 Feb 2016 18:17:08 +0100 Paolo Bonzini <pbonzini@redhat.com> wrote: > In disabled mode, virtio-blk dataplane seems to be enabled, but flow > actually goes through the normal virtio path. This patch simplifies a bit > the handling of disabled mode. In disabled mode, virtio_blk_handle_output > might be called even if s->dataplane is not NULL. > > This is a bit tricky, because the current check for s->dataplane will > always trigger, causing a continuous stream of calls to > virtio_blk_data_plane_start. Unfortunately, these calls will not > do anything. > To fix this, set the "started" flag even in disabled > mode, and skip virtio_blk_data_plane_start if the started flag is true. > The resulting changes also prepare the code for the next patch, were > virtio-blk dataplane will reuse the same virtio_blk_handle_output function > as "regular" virtio-blk. > > Because struct VirtIOBlockDataPlane is opaque in virtio-blk.c, we have > to move s->dataplane->started inside struct VirtIOBlock. It seems a bit odd to me that ->started is the only state that is not inside the dataplane struct... this approach saves a function call for an accessor, though. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > hw/block/dataplane/virtio-blk.c | 21 +++++++++------------ > hw/block/virtio-blk.c | 2 +- > include/hw/virtio/virtio-blk.h | 1 + > 3 files changed, 11 insertions(+), 13 deletions(-) > > @@ -319,9 +314,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) > k->set_guest_notifiers(qbus->parent, 1, false); > fail_guest_notifiers: > vring_teardown(&s->vring, s->vdev, 0); > - s->disabled = true; > fail_vring: > + s->disabled = true; I originally introduced ->disabled to cover (possibly temporary) failures to set up notifiers, but I guess this doesn't hurt. > s->starting = false; > + vblk->dataplane_started = true; > } > > /* Context: QEMU global mutex held */ After spending some time wrapping my head around it again, this looks sane to me. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
On Sun, 02/14 18:17, Paolo Bonzini wrote: > In disabled mode, virtio-blk dataplane seems to be enabled, but flow > actually goes through the normal virtio path. This patch simplifies a bit > the handling of disabled mode. In disabled mode, virtio_blk_handle_output > might be called even if s->dataplane is not NULL. > > This is a bit tricky, because the current check for s->dataplane will > always trigger, causing a continuous stream of calls to > virtio_blk_data_plane_start. Unfortunately, these calls will not > do anything. To fix this, set the "started" flag even in disabled > mode, and skip virtio_blk_data_plane_start if the started flag is true. > The resulting changes also prepare the code for the next patch, were > virtio-blk dataplane will reuse the same virtio_blk_handle_output function > as "regular" virtio-blk. > > Because struct VirtIOBlockDataPlane is opaque in virtio-blk.c, we have > to move s->dataplane->started inside struct VirtIOBlock. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > hw/block/dataplane/virtio-blk.c | 21 +++++++++------------ > hw/block/virtio-blk.c | 2 +- > include/hw/virtio/virtio-blk.h | 1 + > 3 files changed, 11 insertions(+), 13 deletions(-) > > diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c > index 03b81bc..cc521c1 100644 > --- a/hw/block/dataplane/virtio-blk.c > +++ b/hw/block/dataplane/virtio-blk.c > @@ -28,7 +28,6 @@ > #include "qom/object_interfaces.h" > > struct VirtIOBlockDataPlane { > - bool started; > bool starting; > bool stopping; > bool disabled; > @@ -264,11 +263,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) > VirtQueue *vq; > int r; > > - if (s->started || s->disabled) { > - return; > - } > - > - if (s->starting) { > + if (vblk->dataplane_started || s->starting) { > return; > } > > @@ -300,7 +295,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) > vblk->complete_request = complete_request_vring; > > s->starting = false; > - s->started = true; > + vblk->dataplane_started = true; > trace_virtio_blk_data_plane_start(s); > > blk_set_aio_context(s->conf->conf.blk, s->ctx); > @@ -319,9 +314,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) > k->set_guest_notifiers(qbus->parent, 1, false); > fail_guest_notifiers: > vring_teardown(&s->vring, s->vdev, 0); > - s->disabled = true; > fail_vring: > + s->disabled = true; > s->starting = false; Worth a comment here, or at definition of dataplane_started, explaining the trick said in the commit message? Reviewed-by: Fam Zheng <famz@redhat.com> > + vblk->dataplane_started = true; > } > <...>
On 02/14/2016 06:17 PM, Paolo Bonzini wrote: > In disabled mode, virtio-blk dataplane seems to be enabled, but flow > actually goes through the normal virtio path. This patch simplifies a bit > the handling of disabled mode. In disabled mode, virtio_blk_handle_output > might be called even if s->dataplane is not NULL. > > This is a bit tricky, because the current check for s->dataplane will > always trigger, causing a continuous stream of calls to > virtio_blk_data_plane_start. Unfortunately, these calls will not > do anything. To fix this, set the "started" flag even in disabled > mode, and skip virtio_blk_data_plane_start if the started flag is true. > The resulting changes also prepare the code for the next patch, were > virtio-blk dataplane will reuse the same virtio_blk_handle_output function > as "regular" virtio-blk. > > Because struct VirtIOBlockDataPlane is opaque in virtio-blk.c, we have > to move s->dataplane->started inside struct VirtIOBlock. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> I have some random crashes at startup Stack trace of thread 48326: #0 0x000002aa2e0cce46 bdrv_co_do_rw (qemu-system-s390x) #1 0x000002aa2e159e8e coroutine_trampoline (qemu-system-s390x) #2 0x000003ffbc35150a __makecontext_ret (libc.so.6) that I was able to bisect. commit 2906cddfecff21af20eedab43288b485a679f9ac does crash regularly, 2906cddfecff21af20eedab43288b485a679f9ac^ does not. I will try to find somebody that looks into that - unless you have an idea. Christian > --- > hw/block/dataplane/virtio-blk.c | 21 +++++++++------------ > hw/block/virtio-blk.c | 2 +- > include/hw/virtio/virtio-blk.h | 1 + > 3 files changed, 11 insertions(+), 13 deletions(-) > > diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c > index 03b81bc..cc521c1 100644 > --- a/hw/block/dataplane/virtio-blk.c > +++ b/hw/block/dataplane/virtio-blk.c > @@ -28,7 +28,6 @@ > #include "qom/object_interfaces.h" > > struct VirtIOBlockDataPlane { > - bool started; > bool starting; > bool stopping; > bool disabled; > @@ -264,11 +263,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) > VirtQueue *vq; > int r; > > - if (s->started || s->disabled) { > - return; > - } > - > - if (s->starting) { > + if (vblk->dataplane_started || s->starting) { > return; > } > > @@ -300,7 +295,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) > vblk->complete_request = complete_request_vring; > > s->starting = false; > - s->started = true; > + vblk->dataplane_started = true; > trace_virtio_blk_data_plane_start(s); > > blk_set_aio_context(s->conf->conf.blk, s->ctx); > @@ -319,9 +314,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) > k->set_guest_notifiers(qbus->parent, 1, false); > fail_guest_notifiers: > vring_teardown(&s->vring, s->vdev, 0); > - s->disabled = true; > fail_vring: > + s->disabled = true; > s->starting = false; > + vblk->dataplane_started = true; > } > > /* Context: QEMU global mutex held */ > @@ -331,13 +327,14 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s) > VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); > VirtIOBlock *vblk = VIRTIO_BLK(s->vdev); > > + if (!vblk->dataplane_started || s->stopping) { > + return; > + } > > /* Better luck next time. */ > if (s->disabled) { > s->disabled = false; > - return; > - } > - if (!s->started || s->stopping) { > + vblk->dataplane_started = false; > return; > } > s->stopping = true; > @@ -364,6 +361,6 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s) > /* Clean up guest notifier (irq) */ > k->set_guest_notifiers(qbus->parent, 1, false); > > - s->started = false; > + vblk->dataplane_started = false; > s->stopping = false; > } > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c > index c427698..e04c8f5 100644 > --- a/hw/block/virtio-blk.c > +++ b/hw/block/virtio-blk.c > @@ -589,7 +589,7 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq) > /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start > * dataplane here instead of waiting for .set_status(). > */ > - if (s->dataplane) { > + if (s->dataplane && !s->dataplane_started) { > virtio_blk_data_plane_start(s->dataplane); > return; > } > diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h > index 199bb0e..781969d 100644 > --- a/include/hw/virtio/virtio-blk.h > +++ b/include/hw/virtio/virtio-blk.h > @@ -56,6 +56,7 @@ typedef struct VirtIOBlock { > /* Function to push to vq and notify guest */ > void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status); > Notifier migration_state_notifier; > + bool dataplane_started; > struct VirtIOBlockDataPlane *dataplane; > } VirtIOBlock; >
On 09/03/2016 13:21, Christian Borntraeger wrote: > I have some random crashes at startup > > Stack trace of thread 48326: > #0 0x000002aa2e0cce46 bdrv_co_do_rw (qemu-system-s390x) > #1 0x000002aa2e159e8e coroutine_trampoline (qemu-system-s390x) > #2 0x000003ffbc35150a __makecontext_ret (libc.so.6) > > > that I was able to bisect. > commit 2906cddfecff21af20eedab43288b485a679f9ac does crash regularly, > 2906cddfecff21af20eedab43288b485a679f9ac^ does not. > > I will try to find somebody that looks into that - unless you have an idea. The only random idea is to move vblk->dataplane_started = true to the beginning of virtio_blk_data_plane_start rather than the end. Paolo
On 03/09/2016 02:02 PM, Christian Borntraeger wrote: > On 03/09/2016 01:55 PM, Paolo Bonzini wrote: >> >> >> On 09/03/2016 13:21, Christian Borntraeger wrote: >>> I have some random crashes at startup >>> >>> Stack trace of thread 48326: >>> #0 0x000002aa2e0cce46 bdrv_co_do_rw (qemu-system-s390x) >>> #1 0x000002aa2e159e8e coroutine_trampoline (qemu-system-s390x) >>> #2 0x000003ffbc35150a __makecontext_ret (libc.so.6) >>> >>> >>> that I was able to bisect. >>> commit 2906cddfecff21af20eedab43288b485a679f9ac does crash regularly, >>> 2906cddfecff21af20eedab43288b485a679f9ac^ does not. >>> >>> I will try to find somebody that looks into that - unless you have an idea. >> >> The only random idea is to move >> >> vblk->dataplane_started = true >> >> to the beginning of virtio_blk_data_plane_start rather than the end. >> >> Paolo >> > > Indeed > > diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c > index 36f3d2b..1908d59 100644 > --- a/hw/block/dataplane/virtio-blk.c > +++ b/hw/block/dataplane/virtio-blk.c > @@ -195,6 +195,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) > if (vblk->dataplane_started || s->starting) { > return; > } > + vblk->dataplane_started = true; > > s->starting = true; > s->vq = virtio_get_queue(s->vdev, 0); > @@ -235,7 +236,6 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) > fail_guest_notifiers: > s->disabled = true; > s->starting = false; > - vblk->dataplane_started = true; > } > > /* Context: QEMU global mutex held */ > > seems to fix the issue. Hmmm, no another crash, just seems to happen less often.
On Wed, 9 Mar 2016 14:05:20 +0100 Christian Borntraeger <borntraeger@de.ibm.com> wrote: > On 03/09/2016 02:02 PM, Christian Borntraeger wrote: > > On 03/09/2016 01:55 PM, Paolo Bonzini wrote: > >> > >> > >> On 09/03/2016 13:21, Christian Borntraeger wrote: > >>> I have some random crashes at startup > >>> > >>> Stack trace of thread 48326: > >>> #0 0x000002aa2e0cce46 bdrv_co_do_rw (qemu-system-s390x) > >>> #1 0x000002aa2e159e8e coroutine_trampoline (qemu-system-s390x) > >>> #2 0x000003ffbc35150a __makecontext_ret (libc.so.6) > >>> > >>> > >>> that I was able to bisect. > >>> commit 2906cddfecff21af20eedab43288b485a679f9ac does crash regularly, > >>> 2906cddfecff21af20eedab43288b485a679f9ac^ does not. > >>> > >>> I will try to find somebody that looks into that - unless you have an idea. > >> > >> The only random idea is to move > >> > >> vblk->dataplane_started = true > >> > >> to the beginning of virtio_blk_data_plane_start rather than the end. > >> > >> Paolo > >> > > > > Indeed > > > > diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c > > index 36f3d2b..1908d59 100644 > > --- a/hw/block/dataplane/virtio-blk.c > > +++ b/hw/block/dataplane/virtio-blk.c > > @@ -195,6 +195,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) > > if (vblk->dataplane_started || s->starting) { > > return; > > } > > + vblk->dataplane_started = true; > > > > s->starting = true; > > s->vq = virtio_get_queue(s->vdev, 0); > > @@ -235,7 +236,6 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) > > fail_guest_notifiers: > > s->disabled = true; > > s->starting = false; > > - vblk->dataplane_started = true; > > } > > > > /* Context: QEMU global mutex held */ > > > > seems to fix the issue. > > Hmmm, no another crash, just seems to happen less often. What about the proposal in <56C34414.90809@redhat.com>, i.e. move setting the started flag out of dataplane entirely?
On 03/09/2016 01:55 PM, Paolo Bonzini wrote: > > > On 09/03/2016 13:21, Christian Borntraeger wrote: >> I have some random crashes at startup >> >> Stack trace of thread 48326: >> #0 0x000002aa2e0cce46 bdrv_co_do_rw (qemu-system-s390x) >> #1 0x000002aa2e159e8e coroutine_trampoline (qemu-system-s390x) >> #2 0x000003ffbc35150a __makecontext_ret (libc.so.6) >> >> >> that I was able to bisect. >> commit 2906cddfecff21af20eedab43288b485a679f9ac does crash regularly, >> 2906cddfecff21af20eedab43288b485a679f9ac^ does not. >> >> I will try to find somebody that looks into that - unless you have an idea. > > The only random idea is to move > > vblk->dataplane_started = true > > to the beginning of virtio_blk_data_plane_start rather than the end. > > Paolo FWIW, it seems that this patch triggers this error, the "tracked_request_begin" that I reported yesterday and / or some early read issues from the bootloader in a random fashion. Using 2906cddfecff21af20eedab43288b485a679f9ac^ seems to work all the time, moving around vblk->dataplane_started = true also triggers all 3 types of bugs, e.g. Thread 1 (Thread 0x3ffaabff910 (LWP 32782)): #0 0x0000000010329a70 in bdrv_co_do_rw (opaque=0x0) at /home/cborntra/REPOS/qemu/block/io.c:2170 #1 0x00000000103b2e7a in coroutine_trampoline (i0=1023, i1=-2147470992) at /home/cborntra/REPOS/qemu/util/coroutine-ucontext.c:79 #2 0x000003ffac85150a in __makecontext_ret () from /lib64/libc.so.6 (gdb) list 2165 2166 /* Invoke bdrv_co_do_readv/bdrv_co_do_writev */ 2167 static void coroutine_fn bdrv_co_do_rw(void *opaque) 2168 { 2169 BlockAIOCBCoroutine *acb = opaque; 2170 BlockDriverState *bs = acb->common.bs; 2171 2172 if (!acb->is_write) { 2173 acb->req.error = bdrv_co_do_readv(bs, acb->req.sector, 2174 acb->req.nb_sectors, acb->req.qiov, acb->req.flags); I will try to find somebody to work on this.
On 09/03/2016 15:29, Christian Borntraeger wrote: > FWIW, it seems that this patch triggers this error, the "tracked_request_begin" > that I reported yesterday and / or some early read issues from the bootloader > in a random fashion. > Using 2906cddfecff21af20eedab43288b485a679f9ac^ seems to work all the time, > moving around vblk->dataplane_started = true also triggers all 3 types > of bugs In all likelihood, the bug is that virtio_blk_handle_output is being called in two threads. It's not clear to me how that's possible, though. Paolo
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 03b81bc..cc521c1 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -28,7 +28,6 @@ #include "qom/object_interfaces.h" struct VirtIOBlockDataPlane { - bool started; bool starting; bool stopping; bool disabled; @@ -264,11 +263,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) VirtQueue *vq; int r; - if (s->started || s->disabled) { - return; - } - - if (s->starting) { + if (vblk->dataplane_started || s->starting) { return; } @@ -300,7 +295,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) vblk->complete_request = complete_request_vring; s->starting = false; - s->started = true; + vblk->dataplane_started = true; trace_virtio_blk_data_plane_start(s); blk_set_aio_context(s->conf->conf.blk, s->ctx); @@ -319,9 +314,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) k->set_guest_notifiers(qbus->parent, 1, false); fail_guest_notifiers: vring_teardown(&s->vring, s->vdev, 0); - s->disabled = true; fail_vring: + s->disabled = true; s->starting = false; + vblk->dataplane_started = true; } /* Context: QEMU global mutex held */ @@ -331,13 +327,14 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s) VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); VirtIOBlock *vblk = VIRTIO_BLK(s->vdev); + if (!vblk->dataplane_started || s->stopping) { + return; + } /* Better luck next time. */ if (s->disabled) { s->disabled = false; - return; - } - if (!s->started || s->stopping) { + vblk->dataplane_started = false; return; } s->stopping = true; @@ -364,6 +361,6 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s) /* Clean up guest notifier (irq) */ k->set_guest_notifiers(qbus->parent, 1, false); - s->started = false; + vblk->dataplane_started = false; s->stopping = false; } diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index c427698..e04c8f5 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -589,7 +589,7 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq) /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start * dataplane here instead of waiting for .set_status(). */ - if (s->dataplane) { + if (s->dataplane && !s->dataplane_started) { virtio_blk_data_plane_start(s->dataplane); return; } diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h index 199bb0e..781969d 100644 --- a/include/hw/virtio/virtio-blk.h +++ b/include/hw/virtio/virtio-blk.h @@ -56,6 +56,7 @@ typedef struct VirtIOBlock { /* Function to push to vq and notify guest */ void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status); Notifier migration_state_notifier; + bool dataplane_started; struct VirtIOBlockDataPlane *dataplane; } VirtIOBlock;
In disabled mode, virtio-blk dataplane seems to be enabled, but flow actually goes through the normal virtio path. This patch simplifies a bit the handling of disabled mode. In disabled mode, virtio_blk_handle_output might be called even if s->dataplane is not NULL. This is a bit tricky, because the current check for s->dataplane will always trigger, causing a continuous stream of calls to virtio_blk_data_plane_start. Unfortunately, these calls will not do anything. To fix this, set the "started" flag even in disabled mode, and skip virtio_blk_data_plane_start if the started flag is true. The resulting changes also prepare the code for the next patch, were virtio-blk dataplane will reuse the same virtio_blk_handle_output function as "regular" virtio-blk. Because struct VirtIOBlockDataPlane is opaque in virtio-blk.c, we have to move s->dataplane->started inside struct VirtIOBlock. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/block/dataplane/virtio-blk.c | 21 +++++++++------------ hw/block/virtio-blk.c | 2 +- include/hw/virtio/virtio-blk.h | 1 + 3 files changed, 11 insertions(+), 13 deletions(-)