From patchwork Fri Apr 22 10:56:04 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fam Zheng X-Patchwork-Id: 8909701 Return-Path: X-Original-To: patchwork-qemu-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 27E4BBF29F for ; Fri, 22 Apr 2016 11:01:01 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 73D1E20149 for ; Fri, 22 Apr 2016 11:01:00 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 761A320145 for ; Fri, 22 Apr 2016 11:00:59 +0000 (UTC) Received: from localhost ([::1]:59221 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1atYpW-000706-LV for patchwork-qemu-devel@patchwork.kernel.org; Fri, 22 Apr 2016 07:00:58 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52581) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1atYl3-0006W0-Qn for qemu-devel@nongnu.org; Fri, 22 Apr 2016 06:56:23 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1atYl2-0005yD-JJ for qemu-devel@nongnu.org; Fri, 22 Apr 2016 06:56:21 -0400 Received: from mx1.redhat.com ([209.132.183.28]:41654) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1atYl0-0005xk-9Q; Fri, 22 Apr 2016 06:56:18 -0400 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D662262655; Fri, 22 Apr 2016 10:56:17 +0000 (UTC) Received: from ad.usersys.redhat.com (dhcp-15-133.nay.redhat.com [10.66.15.133]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u3MAu0WG018089; Fri, 22 Apr 2016 06:56:15 -0400 From: Fam Zheng To: qemu-devel@nongnu.org Date: Fri, 22 Apr 2016 18:56:04 +0800 Message-Id: <1461322564-17440-6-git-send-email-famz@redhat.com> In-Reply-To: <1461322564-17440-1-git-send-email-famz@redhat.com> References: <1461322564-17440-1-git-send-email-famz@redhat.com> X-Scanned-By: MIMEDefang 2.68 on 10.5.11.24 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Fri, 22 Apr 2016 10:56:17 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v2 for-2.6 5/5] mirror: Workaround for unexpected iohandler events during completion X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , qemu-block@nongnu.org, "Michael S. Tsirkin" , Jeff Cody , Max Reitz , Stefan Hajnoczi , Paolo Bonzini Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Commit 5a7e7a0ba moved mirror_exit to a BH handler but didn't add any protection against new requests that could sneak in just before the BH is dispatched. For example (assuming a code base at that commit): main_loop_wait # 1 os_host_main_loop_wait g_main_context_dispatch aio_ctx_dispatch aio_dispatch ... mirror_run bdrv_drain (a) block_job_defer_to_main_loop qemu_iohandler_poll virtio_queue_host_notifier_read ... virtio_submit_multiwrite (b) blk_aio_multiwrite main_loop_wait # 2 aio_dispatch aio_bh_poll (c) mirror_exit At (a) we know the BDS has no pending request. However, the same main_loop_wait call is going to dispatch iohandlers (EventNotifier events), which may lead to a new I/O from guest. So the invariant is already broken at (c). Data loss. Commit f3926945c8 made iohandler to use aio API. The order of virtio_queue_host_notifier_read and block_job_defer_to_main_loop within a main_loop_wait becomes unpredictable, and even worse, if the host notifier event arrives at the next main_loop_wait call, the unpredictable order between mirror_exit and virtio_queue_host_notifier_read is also a trouble. As shown below, this commit made the bug easier to trigger: - Bug case 1: main_loop_wait # 1 os_host_main_loop_wait g_main_context_dispatch aio_ctx_dispatch (qemu_aio_context) ... mirror_run bdrv_drain (a) block_job_defer_to_main_loop aio_ctx_dispatch (iohandler_ctx) virtio_queue_host_notifier_read ... virtio_submit_multiwrite (b) blk_aio_multiwrite main_loop_wait # 2 ... aio_dispatch aio_bh_poll (c) mirror_exit - Bug case 2: main_loop_wait # 1 os_host_main_loop_wait g_main_context_dispatch aio_ctx_dispatch (qemu_aio_context) ... mirror_run bdrv_drain (a) block_job_defer_to_main_loop main_loop_wait # 2 ... aio_ctx_dispatch (iohandler_ctx) virtio_queue_host_notifier_read ... virtio_submit_multiwrite (b) blk_aio_multiwrite aio_dispatch aio_bh_poll (c) mirror_exit In both cases, (b) breaks the invariant wanted by (a) and (c). Until then, the request loss has been silent. Later, 3f09bfbc7be added asserts at (c) to check the invariant (in bdrv_replace_in_backing_chain), and Max reported an assertion failure first visible there, by doing active committing while the guest is running bonnie++. 2.5 added bdrv_drained_begin at (a) to protect the dataplane case from similar problems, but we never realize the main loop bug until now. As a bandage, this patch disables iohandler's external events temporarily together with bs->ctx. Launchpad Bug: 1570134 Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng --- block/mirror.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/block/mirror.c b/block/mirror.c index d56e30e..039f481 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -495,6 +495,9 @@ out: block_job_completed(&s->common, data->ret); g_free(data); bdrv_drained_end(src); + if (qemu_get_aio_context() == bdrv_get_aio_context(src)) { + aio_enable_external(iohandler_get_aio_context()); + } bdrv_unref(src); } @@ -716,6 +719,12 @@ immediate_exit: /* Before we switch to target in mirror_exit, make sure data doesn't * change. */ bdrv_drained_begin(s->common.bs); + if (qemu_get_aio_context() == bdrv_get_aio_context(bs)) { + /* FIXME: virtio host notifiers run on iohandler_ctx, therefore the + * above bdrv_drained_end isn't enough to quiesce it. This is ugly, we + * need a block layer API change to achieve this. */ + aio_disable_external(iohandler_get_aio_context()); + } block_job_defer_to_main_loop(&s->common, mirror_exit, data); }