From patchwork Fri Apr 22 06:35:06 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Fam Zheng
X-Patchwork-Id: 8906081
From: Fam Zheng
To: qemu-devel@nongnu.org
Date: Fri, 22 Apr 2016 14:35:06 +0800
Message-Id: <1461306907-2837-3-git-send-email-famz@redhat.com>
In-Reply-To: <1461306907-2837-1-git-send-email-famz@redhat.com>
References: <1461306907-2837-1-git-send-email-famz@redhat.com>
Subject: [Qemu-devel] [PATCH for-2.6 2/3] mirror: Skip BH for mirror_exit if in main loop
Cc: Kevin Wolf, qemu-block@nongnu.org, Jeff Cody, Max Reitz, stefanha@redhat.com, pbonzini@redhat.com

Commit 5a7e7a0ba moved mirror_exit to a BH handler but didn't add any
protection against guest requests that could sneak in before the BH is
dispatched. For example, this could happen (assuming a code base at that
commit):

    main_loop_wait # 1
      os_host_main_loop_wait
        g_main_context_dispatch
          aio_ctx_dispatch
            aio_dispatch
              ...
                mirror_run
                  bdrv_drain
                  (a) block_job_defer_to_main_loop
      qemu_iohandler_poll
        virtio_queue_host_notifier_read
          ...
            virtio_submit_multiwrite
            (b) blk_aio_multiwrite

    main_loop_wait # 2
      aio_dispatch
        aio_bh_poll
        (c) mirror_exit

At (a) we know the BDS has no pending request.
However, the same main_loop_wait call is going to dispatch iohandlers
(EventNotifier events), which may lead to new I/O from the guest. So the
invariant is already broken at (c). Data loss.

Commit f3926945c8 made iohandler use the aio API. The order of
virtio_queue_host_notifier_read and block_job_defer_to_main_loop within
a main_loop_wait becomes unpredictable, and even worse, if the host
notifier event arrives at the next main_loop_wait call, the
unpredictable order between mirror_exit and
virtio_queue_host_notifier_read is also a problem. As shown below, this
commit made the bug easier to trigger:

- Bug case 1:

    main_loop_wait # 1
      os_host_main_loop_wait
        g_main_context_dispatch
          aio_ctx_dispatch (qemu_aio_context)
            ...
              mirror_run
                bdrv_drain
                (a) block_job_defer_to_main_loop
          aio_ctx_dispatch (iohandler_ctx)
            virtio_queue_host_notifier_read
              ...
                virtio_submit_multiwrite
                (b) blk_aio_multiwrite

    main_loop_wait # 2
      ...
        aio_dispatch
          aio_bh_poll
          (c) mirror_exit

- Bug case 2:

    main_loop_wait # 1
      os_host_main_loop_wait
        g_main_context_dispatch
          aio_ctx_dispatch (qemu_aio_context)
            ...
              mirror_run
                bdrv_drain
                (a) block_job_defer_to_main_loop

    main_loop_wait # 2
      ...
        aio_ctx_dispatch (iohandler_ctx)
          virtio_queue_host_notifier_read
            ...
              virtio_submit_multiwrite
              (b) blk_aio_multiwrite
        aio_dispatch
          aio_bh_poll
          (c) mirror_exit

In both cases, (b) breaks the invariant wanted by (a) and (c).

Unfortunately, the request loss was silent until 3f09bfbc7be added an
assertion at (c) to check the invariant in
bdrv_replace_in_backing_chain. Max reproduced an assertion failure
after that commit, by doing an active commit while the guest is running
bonnie++.

2.5 added bdrv_drained_begin at (a) to protect the dataplane case from
similar problems, but we never realized the main loop bug until now.

As a bandage, this patch undoes the change of 5a7e7a0bad17 in the
non-dataplane case, and calls mirror_replace directly in mirror_run.
The next step after 2.6 is to make bdrv_drained_begin complete
(including both fixing the dispatching code in the main loop and marking
host event notifiers etc. as external), and then this special casing
will not be necessary.

Launchpad Bug: 1570134
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng
---
 block/mirror.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/mirror.c b/block/mirror.c
index 6c3fe43..bc77773 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -726,6 +726,13 @@ immediate_exit:
     /* Before we switch to target in mirror_exit, make sure data doesn't
      * change. */
     bdrv_drained_begin(s->common.bs);
+    if (qemu_get_aio_context() == bdrv_get_aio_context(bs)) {
+        /* FIXME: if we are in the main loop, the iohandler doesn't honor
+         * bdrv_drained_begin yet, and guest requests can sneak in before BH
+         * callback runs. Do the replace now to avoid it. */
+        mirror_replace(s, &data->ret);
+        data->should_replace = false;
+    }
     block_job_defer_to_main_loop(&s->common, mirror_exit, data);
 }
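
For illustration only, the ordering problem in the traces above can be
reduced to a toy model. This is a sketch, not QEMU code: ToyLoop,
toy_job_finish, toy_iohandler, toy_bh, toy_replace and toy_run are
hypothetical names invented here, and the model assumes a single pending
guest notification and a single deferred callback. It only shows why
doing the replace at (a) sees no in-flight request, while deferring it
to (c) can see the request submitted at (b).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not QEMU APIs. */
typedef struct ToyLoop {
    int in_flight;         /* guest requests submitted since the last drain */
    bool notifier_pending; /* a guest notification is waiting to be handled */
    bool bh_scheduled;     /* completion was deferred to a later iteration */
    bool replaced;         /* the switch to the target has been done */
    int seen_at_replace;   /* in_flight observed when the switch happened */
} ToyLoop;

/* Switch to the target and record how many requests were in flight. */
static void toy_replace(ToyLoop *l)
{
    assert(!l->replaced); /* the switch must happen exactly once */
    l->seen_at_replace = l->in_flight;
    l->replaced = true;
}

/* Step (a): the job drains, optionally replaces at once, then defers. */
static void toy_job_finish(ToyLoop *l, bool replace_inline)
{
    l->in_flight = 0;
    if (replace_inline) {
        toy_replace(l);
    }
    l->bh_scheduled = true;
}

/* Step (b): a pending guest notification turns into a new request. */
static void toy_iohandler(ToyLoop *l)
{
    if (l->notifier_pending) {
        l->notifier_pending = false;
        l->in_flight++;
    }
}

/* Step (c): the deferred callback replaces only if nobody did so yet. */
static void toy_bh(ToyLoop *l)
{
    if (l->bh_scheduled) {
        l->bh_scheduled = false;
        if (!l->replaced) {
            toy_replace(l);
        }
    }
}

/* Run (a), (b), (c) in trace order; return in_flight seen at the switch. */
int toy_run(bool replace_inline)
{
    ToyLoop l = { .notifier_pending = true };

    toy_job_finish(&l, replace_inline);
    toy_iohandler(&l);
    toy_bh(&l);
    return l.seen_at_replace;
}
```

In this model toy_run(false) returns 1 (the deferred callback runs with
one request in flight, the broken invariant), and toy_run(true) returns
0 (the replace is done right after the drain, as the patch does in the
main loop case). What happens to the request that arrives after an
inline replace is outside the scope of this sketch.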