Message ID | 1461306907-2837-3-git-send-email-famz@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, 04/22 14:35, Fam Zheng wrote: > As a bandage, this patch undoes the change of 5a7e7a0bad17 in > non-dataplane case, and calls mirror_replace directly in mirror_run. mirror_exit minus mirror_replace was kept in the BH because it's not safe to call in coroutine. After discussing with kwolf on IRC, now it turns out that mirror_replace isn't either, due to the bdrv_drain_all in bdrv_reopen. So, NACK The only feasible fix for 2.6 is with the previous direction, but we need is_external check added to the main loop. Patches are being worked on. Fam
diff --git a/block/mirror.c b/block/mirror.c index 6c3fe43..bc77773 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -726,6 +726,13 @@ immediate_exit: /* Before we switch to target in mirror_exit, make sure data doesn't * change. */ bdrv_drained_begin(s->common.bs); + if (qemu_get_aio_context() == bdrv_get_aio_context(bs)) { + /* FIXME: if we are in the main loop, the iohandler doesn't honor + * bdrv_drained_begin yet, and guest requests can sneak in before BH + * callback runs. Do the replace now to avoid it. */ + mirror_replace(s, &data->ret); + data->should_replace = false; + } block_job_defer_to_main_loop(&s->common, mirror_exit, data); }
Commit 5a7e7a0ba moved mirror_exit to a BH handler but didn't add any protection against guest requests that could sneak in before the BH is dispatched. For example, this could happen (assuming a code base at that commit): main_loop_wait # 1 os_host_main_loop_wait g_main_context_dispatch aio_ctx_dispatch aio_dispatch ... mirror_run bdrv_drain (a) block_job_defer_to_main_loop qemu_iohandler_poll virtio_queue_host_notifier_read ... virtio_submit_multiwrite (b) blk_aio_multiwrite main_loop_wait # 2 <snip> aio_dispatch aio_bh_poll (c) mirror_exit At (a) we know the BDS has no pending request. However, the same main_loop_wait call is going to dispatch iohandlers (EventNotifier events), which may lead to a new I/O from guest. So the invariant is already broken at (c). Data loss. Commit f3926945c8 made iohandler to use aio API. The order of virtio_queue_host_notifier_read and block_job_defer_to_main_loop within a main_loop_wait becomes unpredictable, and even worse, if the host notifier event arrives at the next main_loop_wait call, the unpredictable order between mirror_exit and virtio_queue_host_notifier_read is also a trouble. As shown below, this commit made the bug easier to trigger: - Bug case 1: main_loop_wait # 1 os_host_main_loop_wait g_main_context_dispatch aio_ctx_dispatch (qemu_aio_context) ... mirror_run bdrv_drain (a) block_job_defer_to_main_loop aio_ctx_dispatch (iohandler_ctx) virtio_queue_host_notifier_read ... virtio_submit_multiwrite (b) blk_aio_multiwrite main_loop_wait # 2 ... aio_dispatch aio_bh_poll (c) mirror_exit - Bug case 2: main_loop_wait # 1 os_host_main_loop_wait g_main_context_dispatch aio_ctx_dispatch (qemu_aio_context) ... mirror_run bdrv_drain (a) block_job_defer_to_main_loop main_loop_wait # 2 ... aio_ctx_dispatch (iohandler_ctx) virtio_queue_host_notifier_read ... virtio_submit_multiwrite (b) blk_aio_multiwrite aio_dispatch aio_bh_poll (c) mirror_exit In both cases, (b) breaks the invariant wanted by (a) and (c). Unfortunately, the request loss has been silent, until 3f09bfbc7be added an assertion at (c) to check the invariant in bdrv_replace_in_backing_chain. Max reproduced an assertion failure after that commit, by doing active committing while the guest is running bonnie++. 2.5 added bdrv_drained_begin at (a) to protect the dataplane case from similar problems, but we never realize the main loop bug until now. As a bandage, this patch undoes the change of 5a7e7a0bad17 in non-dataplane case, and calls mirror_replace directly in mirror_run. The next step after 2.6 is to complete bdrv_drained_begin complete (including both fixing the dispatching code in the main loop and marking host event notifiers etc. as external) and then this special casing is not necessary. Launchpad Bug: 1570134 Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> --- block/mirror.c | 7 +++++++ 1 file changed, 7 insertions(+)