Message ID | 1455645388-32401-6-git-send-email-pbonzini@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, 02/16 18:56, Paolo Bonzini wrote: > mirror is calling bdrv_drain from an AIO callback---more precisely, > the bdrv_drain happens far away from the AIO callback, in the coroutine that > the AIO callback enters. > > This used to be okay because bdrv_drain more or less tried to guess > when all AIO callbacks were done; however it will cause a deadlock once > bdrv_drain really checks for AIO callbacks to be complete. The situation > here is admittedly underdefined, and Stefan pointed out that the same > solution is found in many other places in the QEMU block layer, therefore > I think this workaround is acceptable. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > block/mirror.c | 23 +++++++++++++++++++---- > 1 file changed, 19 insertions(+), 4 deletions(-) > > diff --git a/block/mirror.c b/block/mirror.c > index 2c0edfa..793c20c 100644 > --- a/block/mirror.c > +++ b/block/mirror.c > @@ -71,6 +71,7 @@ typedef struct MirrorOp { > QEMUIOVector qiov; > int64_t sector_num; > int nb_sectors; > + QEMUBH *co_enter_bh; > } MirrorOp; > > static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read, > @@ -86,6 +87,18 @@ static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read, > } > } > > +static void mirror_bh_cb(void *opaque) > +{ > + MirrorOp *op = opaque; > + MirrorBlockJob *s = op->s; > + > + qemu_bh_delete(op->co_enter_bh); > + g_free(op); > + if (s->waiting_for_io) { > + qemu_coroutine_enter(s->common.co, NULL); > + } > +} > + > static void mirror_iteration_done(MirrorOp *op, int ret) > { > MirrorBlockJob *s = op->s; > @@ -116,11 +129,13 @@ static void mirror_iteration_done(MirrorOp *op, int ret) > } > > qemu_iovec_destroy(&op->qiov); > - g_free(op); > > - if (s->waiting_for_io) { > - qemu_coroutine_enter(s->common.co, NULL); > - } > + /* The I/O operation is not finished until the callback returns. > + * If we call qemu_coroutine_enter here, there is the possibility > + * of a deadlock when the coroutine calls bdrv_drained_begin. > + */ > + op->co_enter_bh = qemu_bh_new(mirror_bh_cb, op); Shouldn't this be aio_bh_new()? Fam > + qemu_bh_schedule(op->co_enter_bh); > } > > static void mirror_write_complete(void *opaque, int ret) > -- > 2.5.0 > > >
On 09/03/2016 04:19, Fam Zheng wrote: >> > + /* The I/O operation is not finished until the callback returns. >> > + * If we call qemu_coroutine_enter here, there is the possibility >> > + * of a deadlock when the coroutine calls bdrv_drained_begin. >> > + */ >> > + op->co_enter_bh = qemu_bh_new(mirror_bh_cb, op); > Shouldn't this be aio_bh_new()? Yes, of course. Paolo
diff --git a/block/mirror.c b/block/mirror.c index 2c0edfa..793c20c 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -71,6 +71,7 @@ typedef struct MirrorOp { QEMUIOVector qiov; int64_t sector_num; int nb_sectors; + QEMUBH *co_enter_bh; } MirrorOp; static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read, @@ -86,6 +87,18 @@ static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read, } } +static void mirror_bh_cb(void *opaque) +{ + MirrorOp *op = opaque; + MirrorBlockJob *s = op->s; + + qemu_bh_delete(op->co_enter_bh); + g_free(op); + if (s->waiting_for_io) { + qemu_coroutine_enter(s->common.co, NULL); + } +} + static void mirror_iteration_done(MirrorOp *op, int ret) { MirrorBlockJob *s = op->s; @@ -116,11 +129,13 @@ static void mirror_iteration_done(MirrorOp *op, int ret) } qemu_iovec_destroy(&op->qiov); - g_free(op); - if (s->waiting_for_io) { - qemu_coroutine_enter(s->common.co, NULL); - } + /* The I/O operation is not finished until the callback returns. + * If we call qemu_coroutine_enter here, there is the possibility + * of a deadlock when the coroutine calls bdrv_drained_begin. + */ + op->co_enter_bh = qemu_bh_new(mirror_bh_cb, op); + qemu_bh_schedule(op->co_enter_bh); } static void mirror_write_complete(void *opaque, int ret)
mirror is calling bdrv_drain from an AIO callback---more precisely, the bdrv_drain happens far away from the AIO callback, in the coroutine that the AIO callback enters. This used to be okay because bdrv_drain more or less tried to guess when all AIO callbacks were done; however it will cause a deadlock once bdrv_drain really checks for AIO callbacks to be complete. The situation here is admittedly underdefined, and Stefan pointed out that the same solution is found in many other places in the QEMU block layer, therefore I think this workaround is acceptable. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- block/mirror.c | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-)