From patchwork Mon Nov 27 23:29:09 2017
X-Patchwork-Submitter: Kevin Wolf
X-Patchwork-Id: 10078361
Date: Tue, 28 Nov 2017 00:29:09 +0100
From: Kevin Wolf
To: Fam Zheng
Cc: qemu-block@nongnu.org, jcody@redhat.com, qemu-devel@nongnu.org,
    Max Reitz, Stefan Hajnoczi, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH 0/1] block: Workaround for the iotests errors
Message-ID: <20171127232909.GF4903@localhost.localdomain>
In-Reply-To: <20171123175747.2309-1-famz@redhat.com>
References: <20171123175747.2309-1-famz@redhat.com>
User-Agent: Mutt/1.9.1 (2017-09-22)

On 23.11.2017 at 18:57, Fam Zheng wrote:
> Jeff's block job patch made the latent drain bug visible, and I found that
> this patch, which by itself also makes some sense, can hide it again. :)
> With it applied we are at least back to the point where patchew's iotests
> (make docker-test-block@fedora) can pass.
>
> The real bug is that in the middle of bdrv_parent_drained_end(), bs's
> parent list changes. One drained_end call before mirror_exit() has already
> done one blk_root_drained_end(); a second drained_end on an updated parent
> node can then do another blk_root_drained_end(), making it unbalanced with
> blk_root_drained_begin(). This is shown by the following three backtraces
> as captured by rr with a crashed "qemu-img commit", essentially the same
> as in the failed iotest 020:

My conclusion about what really happens in 020 is that we have a graph like
this:

                        mirror target BB --+
                                           |
                                           v
qemu-img BB -> mirror_top_bs -> overlay -> base

bdrv_drained_end(base) results in it being available for requests again, so
it calls bdrv_parent_drained_end() for overlay. While that drained_end is
still in progress, the mirror job completes and changes the BdrvChild
between mirror_top_bs and overlay (which is currently being drained) to
point to base instead. After returning, QLIST_FOREACH() continues to
iterate the parents of base instead of those of overlay, resulting in a
second blk_root_drained_end() for the mirror target BB.

This instance can be fixed relatively easily (see below) by using
QLIST_FOREACH_SAFE() instead.

However, I'm not sure whether all problems with the graph change can be
solved this way, and whether we can really allow graph changes while
iterating the graph for bdrv_drained_begin/end. Not allowing them would
require some more serious changes to the block jobs that delay their
completion until after bdrv_drained_end() has finished (and I'm not sure
how to even get a callback at that point...).

And the test cases that Jeff mentions still fail with this patch, too. But
at least it doesn't just make failure less likely by narrowing the window
for a race condition; it seems to attack a real problem.

Kevin

Tested-by: John Snow

diff --git a/block/io.c b/block/io.c
index 4fdf93a014..6773926fc1 100644
--- a/block/io.c
+++ b/block/io.c
@@ -42,9 +42,9 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
 
 void bdrv_parent_drained_begin(BlockDriverState *bs)
 {
-    BdrvChild *c;
+    BdrvChild *c, *next;
 
-    QLIST_FOREACH(c, &bs->parents, next_parent) {
+    QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
         if (c->role->drained_begin) {
             c->role->drained_begin(c);
         }
@@ -53,9 +53,9 @@ void bdrv_parent_drained_begin(BlockDriverState *bs)
 
 void bdrv_parent_drained_end(BlockDriverState *bs)
 {
-    BdrvChild *c;
+    BdrvChild *c, *next;
 
-    QLIST_FOREACH(c, &bs->parents, next_parent) {
+    QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
         if (c->role->drained_end) {
             c->role->drained_end(c);
         }
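
As an aside, here is a minimal standalone sketch of the iteration hazard the
patch works around. It is not QEMU code: the hand-rolled list, the node names
and drained_end_cb() are invented for the example and only stand in for
QEMU's QLIST macros and the BdrvChild drained_end callbacks. It demonstrates
that remembering the successor before invoking the callback (which is what
QLIST_FOREACH_SAFE() does) keeps the walk on the original parent list even
when the callback re-parents the current child:

#include <stdbool.h>
#include <stdio.h>

typedef struct Child {
    const char *name;
    struct Child *next;
} Child;

static Child mirror_child = { "mirror_top_bs child (gets re-parented)", NULL };
static Child other_child  = { "another parent of the drained node", NULL };
static Child target_child = { "mirror target BB child", NULL };

static Child *drained_parents;  /* parent list of the node being drained */
static Child *base_parents;     /* parent list of base */
static bool moved;

static void reset(void)
{
    mirror_child.next = &other_child;
    other_child.next = NULL;
    drained_parents = &mirror_child;

    target_child.next = NULL;
    base_parents = &target_child;
    moved = false;
}

static void drained_end_cb(Child *c)
{
    printf("  drained_end on '%s'\n", c->name);
    if (c == &mirror_child && !moved) {
        /* Simulate the mirror job completing in the middle of the walk:
         * the child is detached from the drained node's parent list and
         * attached to base's parent list instead. */
        moved = true;
        c->next = base_parents;
        base_parents = c;
    }
}

int main(void)
{
    Child *c, *next;

    reset();
    printf("unsafe walk (QLIST_FOREACH style):\n");
    for (c = drained_parents; c; c = c->next) {
        drained_end_cb(c);              /* may rewrite c->next */
    }
    /* The walk crossed over into base's parent list: the target child got
     * an extra drained_end and other_child was never visited. */

    reset();
    printf("safe walk (QLIST_FOREACH_SAFE style):\n");
    for (c = drained_parents; c; c = next) {
        next = c->next;                 /* remember the successor first */
        drained_end_cb(c);
    }
    return 0;
}

With the plain walk, the invented "mirror target BB child" receives a
drained_end that has no matching drained_begin and the second real parent is
never visited; the safe walk visits exactly the parents the drained node had
when the loop started.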