| Message ID | 20180920161958.27453-14-kwolf@redhat.com |
|---|---|
| State | New, archived |
| Series | Fix some jobs/drain/aio_poll related hangs |
On 20.09.18 18:19, Kevin Wolf wrote:
> bdrv_drain_poll_top_level() was buggy because it didn't release the
> AioContext lock of the node to be drained before calling aio_poll().
> This way, callbacks called by aio_poll() would possibly take the lock a
> second time and run into a deadlock with a nested AIO_WAIT_WHILE() call.
>
> However, it turns out that the aio_poll() call isn't actually needed any
> more. It was introduced in commit 91af091f923, which is effectively
> reverted by this patch. The cases it was supposed to fix are now covered
> by bdrv_drain_poll(), which waits for block jobs to reach a quiescent
> state.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> Reviewed-by: Fam Zheng <famz@redhat.com>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/io.c | 8 --------
>  1 file changed, 8 deletions(-)

Hm... While looking at iotest 129 (which I think is broken because it
tries to use BB-level throttling which doesn't do anything for the
mirror job), I noticed this:

$ x86_64-softmmu/qemu-system-x86_64 \
    -object throttle-group,id=tg0 \
    -drive node-name=node0,driver=throttle,\
throttle-group=tg0,file.driver=qcow2,file.file.driver=file,\
file.file.filename=/tmp/src.qcow2 -qmp stdio \
<<EOF
{"execute":"qmp_capabilities"}
{"execute":"drive-mirror","arguments":{"device":"node0",
 "target":"/tmp/tgt.qcow2","sync":"full","format":"qcow2",
 "mode":"absolute-paths","job-id":"mirror-job0"}}
{"execute":"block-job-cancel",
 "arguments":{"device":"mirror-job0","force":true}}
{"execute":"quit"}
EOF
[...]
qemu-system-x86_64: block/block-backend.c:2211: blk_root_drained_end:
Assertion `blk->quiesce_counter' failed.
[1]    2722 abort (core dumped)  x86_64-softmmu/qemu-system-x86_64 -object throttle-group,id=tg0 -drive -qmp

(Which worked before commit 4cf077b59fc73eec29f8b7d082919dbb278bdc86,
i.e. this one.)

Max
diff --git a/block/io.c b/block/io.c
index 914ba78f1a..8b81ff3913 100644
--- a/block/io.c
+++ b/block/io.c
@@ -268,10 +268,6 @@ bool bdrv_drain_poll(BlockDriverState *bs, bool recursive,
 static bool bdrv_drain_poll_top_level(BlockDriverState *bs, bool recursive,
                                       BdrvChild *ignore_parent)
 {
-    /* Execute pending BHs first and check everything else only after the BHs
-     * have executed. */
-    while (aio_poll(bs->aio_context, false));
-
     return bdrv_drain_poll(bs, recursive, ignore_parent, false);
 }
 
@@ -511,10 +507,6 @@ static bool bdrv_drain_all_poll(void)
     BlockDriverState *bs = NULL;
     bool result = false;
 
-    /* Execute pending BHs first (may modify the graph) and check everything
-     * else only after the BHs have executed. */
-    while (aio_poll(qemu_get_aio_context(), false));
-
     /* bdrv_drain_poll() can't make changes to the graph and we are holding the
      * main AioContext lock, so iterating bdrv_next_all_states() is safe. */
     while ((bs = bdrv_next_all_states(bs))) {