From patchwork Thu Jul 25 16:27:01 2019
From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, vsementsov@virtuozzo.com, den@virtuozzo.com,
    qemu-devel@nongnu.org, mreitz@redhat.com, dplotnikov@virtuozzo.com
Date: Thu, 25 Jul 2019 18:27:01 +0200
Message-Id: <20190725162704.12622-2-kwolf@redhat.com>
In-Reply-To: <20190725162704.12622-1-kwolf@redhat.com>
References: <20190725162704.12622-1-kwolf@redhat.com>
Subject: [Qemu-devel] [PATCH 1/4] block: Remove blk_pread_unthrottled()

The functionality offered by blk_pread_unthrottled() goes back to
commit 498e386c584. Back then, timers were not executed in polling
loops, so I/O throttling could not work for synchronous requests, and
that commit automatically disabled I/O throttling as soon as a
synchronous request was issued.

However, geometry detection during disk initialisation has always used
(and still uses) synchronous requests even if guest requests use AIO
later. Geometry detection was not supposed to disable I/O throttling
permanently, so bdrv_pread_unthrottled() was introduced to disable it
only temporarily.

All of this isn't necessary any more: We do run timers in polling loops
now, and even synchronous requests internally use the coroutine
infrastructure. For this reason, commit 90c78624f already removed the
automatic disabling of I/O throttling. It's time to get rid of the
workaround for the removed code, along with its abuse of
blk_root_drained_begin()/end().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
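As background for reviewers: the sync-over-coroutine pattern that makes
this safe works roughly as sketched below. This is a simplified sketch
with made-up helper names (SyncRwCo, sync_pread), not the actual
blk_prw() code:

/* Simplified sketch: a synchronous read spawns a coroutine and polls
 * the AioContext until it completes. */
typedef struct SyncRwCo {
    BlockBackend *blk;
    int64_t offset;
    QEMUIOVector *qiov;
    int ret;
    bool done;
} SyncRwCo;

static void coroutine_fn sync_read_entry(void *opaque)
{
    SyncRwCo *rwco = opaque;

    rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, rwco->qiov->size,
                              rwco->qiov, 0);
    rwco->done = true;
}

static int sync_pread(BlockBackend *blk, int64_t offset, QEMUIOVector *qiov)
{
    SyncRwCo rwco = { .blk = blk, .offset = offset, .qiov = qiov };
    Coroutine *co = qemu_coroutine_create(sync_read_entry, &rwco);

    bdrv_coroutine_enter(blk_bs(blk), co);
    while (!rwco.done) {
        /* Timers (including throttling timers) run in this polling loop */
        aio_poll(blk_get_aio_context(blk), true);
    }
    return rwco.ret;
}

Because aio_poll() runs timers, a throttled request can be woken up
even while the caller is blocked in the synchronous wrapper, so there
is no need to disable throttling any more.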
 include/sysemu/block-backend.h |  2 --
 block/block-backend.c          | 16 ----------------
 hw/block/hd-geometry.c         |  7 +------
 3 files changed, 1 insertion(+), 24 deletions(-)

diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 733c4957eb..7320b58467 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -117,8 +117,6 @@ char *blk_get_attached_dev_id(BlockBackend *blk);
 BlockBackend *blk_by_dev(void *dev);
 BlockBackend *blk_by_qdev_id(const char *id, Error **errp);
 void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops, void *opaque);
-int blk_pread_unthrottled(BlockBackend *blk, int64_t offset, uint8_t *buf,
-                          int bytes);
 int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
                                unsigned int bytes, QEMUIOVector *qiov,
                                BdrvRequestFlags flags);
diff --git a/block/block-backend.c b/block/block-backend.c
index 0056b526b8..fdd6b01ecf 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1237,22 +1237,6 @@ static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
     return rwco.ret;
 }
 
-int blk_pread_unthrottled(BlockBackend *blk, int64_t offset, uint8_t *buf,
-                          int count)
-{
-    int ret;
-
-    ret = blk_check_byte_request(blk, offset, count);
-    if (ret < 0) {
-        return ret;
-    }
-
-    blk_root_drained_begin(blk->root);
-    ret = blk_pread(blk, offset, buf, count);
-    blk_root_drained_end(blk->root, NULL);
-    return ret;
-}
-
 int blk_pwrite_zeroes(BlockBackend *blk, int64_t offset,
                       int bytes, BdrvRequestFlags flags)
 {
diff --git a/hw/block/hd-geometry.c b/hw/block/hd-geometry.c
index 79384a2b0a..dcbccee294 100644
--- a/hw/block/hd-geometry.c
+++ b/hw/block/hd-geometry.c
@@ -63,12 +63,7 @@ static int guess_disk_lchs(BlockBackend *blk,
 
     blk_get_geometry(blk, &nb_sectors);
 
-    /**
-     * The function will be invoked during startup not only in sync I/O mode,
-     * but also in async I/O mode. So the I/O throttling function has to
-     * be disabled temporarily here, not permanently.
-     */
-    if (blk_pread_unthrottled(blk, 0, buf, BDRV_SECTOR_SIZE) < 0) {
+    if (blk_pread(blk, 0, buf, BDRV_SECTOR_SIZE) < 0) {
         return -1;
     }
     /* test msdos magic */

From patchwork Thu Jul 25 16:27:02 2019
From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, vsementsov@virtuozzo.com, den@virtuozzo.com,
    qemu-devel@nongnu.org, mreitz@redhat.com, dplotnikov@virtuozzo.com
Date: Thu, 25 Jul 2019 18:27:02 +0200
Message-Id: <20190725162704.12622-3-kwolf@redhat.com>
In-Reply-To: <20190725162704.12622-1-kwolf@redhat.com>
References: <20190725162704.12622-1-kwolf@redhat.com>
Subject: [Qemu-devel] [PATCH 2/4] block: Reduce (un)drains when replacing a child
From: Max Reitz <mreitz@redhat.com>

Currently, bdrv_replace_child_noperm() undrains the parent until it is
completely undrained, then re-drains it after attaching the new child
node.

This is a problem with bdrv_drop_intermediate(): We want to keep the
whole subtree drained, including parents, while the operation is under
way. bdrv_replace_child_noperm() breaks this by allowing every parent
to become unquiesced briefly, and then redraining it.

In fact, there is no reason why the parent should become unquiesced and
be allowed to submit requests to the new child node if that new node is
supposed to be kept drained. So if anything, we have to drain the
parent before detaching the old child node. Conversely, we have to
undrain it only after attaching the new child node.

Thus, change the whole drain algorithm here: Calculate the number of
times we have to drain/undrain the parent before replacing the child
node, then drain it (if necessary), replace the child node, and then
undrain it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
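To make the balancing arithmetic concrete, here is a small
self-contained toy model (plain integers instead of real nodes; the
counter values are made up):

#include <stdio.h>

/* Toy model: the parent's quiesce counter is a plain int, and
 * drain/undrain just increment/decrement it. */
static int parent_quiesce_counter = 1;  /* parent drained once via old child */

static void drained_begin(void) { parent_quiesce_counter++; }
static void drained_end(void)   { parent_quiesce_counter--; }

int main(void)
{
    int new_bs_quiesce_counter = 3;     /* new child sits in 3 drain sections */
    int drain_saldo = new_bs_quiesce_counter - parent_quiesce_counter;

    /* saldo > 0: drain more *before* detaching the old child */
    while (drain_saldo > 0) {
        drained_begin();
        drain_saldo--;
    }

    /* ... detach old child, attach new child ... */

    /* saldo < 0: undrain only *after* attaching the new child */
    while (drain_saldo < 0) {
        drained_end();
        drain_saldo++;
    }

    printf("parent quiesce counter: %d\n", parent_quiesce_counter); /* 3 */
    return 0;
}

With these numbers, the parent is drained twice before the switch and
not undrained afterwards, so it ends up matching the new child's three
drain sections; at no point in between may it submit requests.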
 block.c | 49 +++++++++++++++++++++++++++++++++----------------
 1 file changed, 33 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index cbd8da5f3b..e1595bd058 100644
--- a/block.c
+++ b/block.c
@@ -2238,13 +2238,27 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
                                       BlockDriverState *new_bs)
 {
     BlockDriverState *old_bs = child->bs;
-    int i;
+    int new_bs_quiesce_counter;
+    int drain_saldo;
 
     assert(!child->frozen);
 
     if (old_bs && new_bs) {
         assert(bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs));
     }
+
+    new_bs_quiesce_counter = (new_bs ? new_bs->quiesce_counter : 0);
+    drain_saldo = new_bs_quiesce_counter - child->parent_quiesce_counter;
+
+    /*
+     * If the new child node is drained but the old one was not, flush
+     * all outstanding requests to the old child node.
+     */
+    while (drain_saldo > 0 && child->role->drained_begin) {
+        bdrv_parent_drained_begin_single(child, true);
+        drain_saldo--;
+    }
 
     if (old_bs) {
         /* Detach first so that the recursive drain sections coming from @child
          * are already gone and we only end the drain sections that came from
@@ -2252,28 +2266,22 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
         if (child->role->detach) {
             child->role->detach(child);
         }
-        while (child->parent_quiesce_counter) {
-            bdrv_parent_drained_end_single(child);
-        }
         QLIST_REMOVE(child, next_parent);
-    } else {
-        assert(child->parent_quiesce_counter == 0);
     }
 
     child->bs = new_bs;
 
     if (new_bs) {
         QLIST_INSERT_HEAD(&new_bs->parents, child, next_parent);
-        if (new_bs->quiesce_counter) {
-            int num = new_bs->quiesce_counter;
-            if (child->role->parent_is_bds) {
-                num -= bdrv_drain_all_count;
-            }
-            assert(num >= 0);
-            for (i = 0; i < num; i++) {
-                bdrv_parent_drained_begin_single(child, true);
-            }
-        }
+
+        /*
+         * Detaching the old node may have led to the new node's
+         * quiesce_counter having been decreased. Not a problem, we
+         * just need to recognize this here and then invoke
+         * drained_end appropriately more often.
+         */
+        assert(new_bs->quiesce_counter <= new_bs_quiesce_counter);
+        drain_saldo += new_bs->quiesce_counter - new_bs_quiesce_counter;
 
         /* Attach only after starting new drained sections, so that recursive
          * drain sections coming from @child don't get an extra .drained_begin
@@ -2282,6 +2290,15 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
             child->role->attach(child);
         }
     }
+
+    /*
+     * If the old child node was drained but the new one is not, allow
+     * requests to come in only after the new node has been attached.
+     */
+    while (drain_saldo < 0 && child->role->drained_end) {
+        bdrv_parent_drained_end_single(child);
+        drain_saldo++;
+    }
 }
 
 /*

From patchwork Thu Jul 25 16:27:03 2019
From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, vsementsov@virtuozzo.com, den@virtuozzo.com,
    qemu-devel@nongnu.org, mreitz@redhat.com, dplotnikov@virtuozzo.com
Date: Thu, 25 Jul 2019 18:27:03 +0200
Message-Id: <20190725162704.12622-4-kwolf@redhat.com>
In-Reply-To: <20190725162704.12622-1-kwolf@redhat.com>
References: <20190725162704.12622-1-kwolf@redhat.com>
Subject: [Qemu-devel] [PATCH 3/4] mirror: Keep target drained until graph changes are done

Calling bdrv_drained_end() for target_bs can restart requests too
early, so that they would be executed on mirror_top_bs, which, however,
has already dropped all permissions.

Keep the target node drained until all graph changes have completed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
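Condensed from the diff below, the ordering in mirror_exit_common()
after this patch looks like this (sketch only, error handling and the
other steps omitted):

assert(s->in_drain);
bdrv_ref(target_bs);               /* keep the node alive for the section */
bdrv_drained_begin(target_bs);     /* moved up: before any graph change */

/* ... drop mirror_top_bs permissions, bdrv_replace_node(), ... */

bdrv_drained_end(src);
bdrv_drained_end(target_bs);       /* moved down: after all graph changes */
s->in_drain = false;
bdrv_unref(target_bs);

The drained section now brackets every graph manipulation, so no
request can be restarted while mirror_top_bs is still reachable.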
 block/mirror.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 8cb75fb409..7483051f8d 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -644,6 +644,11 @@ static int mirror_exit_common(Job *job)
     bdrv_ref(mirror_top_bs);
     bdrv_ref(target_bs);
 
+    /* The mirror job has no requests in flight any more, but we need to
+     * drain potential other users of the BDS before changing the graph. */
+    assert(s->in_drain);
+    bdrv_drained_begin(target_bs);
+
     /* Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
      * inserting target_bs at s->to_replace, where we might not be able to get
      * these permissions.
@@ -684,12 +689,7 @@ static int mirror_exit_common(Job *job)
             bdrv_reopen_set_read_only(target_bs, ro, NULL);
         }
 
-        /* The mirror job has no requests in flight any more, but we need to
-         * drain potential other users of the BDS before changing the graph. */
-        assert(s->in_drain);
-        bdrv_drained_begin(target_bs);
         bdrv_replace_node(to_replace, target_bs, &local_err);
-        bdrv_drained_end(target_bs);
         if (local_err) {
             error_report_err(local_err);
             ret = -EPERM;
@@ -704,7 +704,6 @@ static int mirror_exit_common(Job *job)
         aio_context_release(replace_aio_context);
     }
     g_free(s->replaces);
-    bdrv_unref(target_bs);
 
     /*
      * Remove the mirror filter driver from the graph. Before this, get rid of
@@ -724,9 +723,12 @@ static int mirror_exit_common(Job *job)
     bs_opaque->job = NULL;
 
     bdrv_drained_end(src);
+    bdrv_drained_end(target_bs);
+    s->in_drain = false;
     bdrv_unref(mirror_top_bs);
     bdrv_unref(src);
+    bdrv_unref(target_bs);
 
     return ret;
 }

From patchwork Thu Jul 25 16:27:04 2019
From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, vsementsov@virtuozzo.com, den@virtuozzo.com,
    qemu-devel@nongnu.org, mreitz@redhat.com, dplotnikov@virtuozzo.com
Date: Thu, 25 Jul 2019 18:27:04 +0200
Message-Id: <20190725162704.12622-5-kwolf@redhat.com>
In-Reply-To: <20190725162704.12622-1-kwolf@redhat.com>
References: <20190725162704.12622-1-kwolf@redhat.com>
Subject: [Qemu-devel] [PATCH 4/4] block-backend: Queue requests while drained

This fixes devices like IDE that can still start new requests from I/O
handlers in the CPU thread while the block backend is drained.

The basic assumption is that in a drain section, no new requests should
be allowed through a BlockBackend (blk_drained_begin/end don't exist,
we get drain sections only on the node level).

However, there are two special cases where requests should not be
queued:

1. Block jobs: We already make sure that block jobs are paused in a
   drain section, so they won't start new requests. However, if
   drain_begin is called on the job's BlockBackend first, it can happen
   that we deadlock because the job stays busy until it reaches a pause
   point - which it can't if its requests aren't processed any more.

   The proper solution here would be to make all requests through the
   job's filter node instead of using a BlockBackend. For now, just
   disabling request queuing on the job BlockBackend is simpler.

2. Test cases where making requests through bdrv_* would be cumbersome
   because we'd need a BdrvChild. As we already have the functionality
   to disable request queuing from 1., use it in tests, too, for
   convenience.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
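The queuing mechanism itself consists of three small pieces that are
spread over several hunks below; condensed here for review (sketch,
not a standalone compilation unit):

/* In blk_new(): initialise the queue of held-back requests */
qemu_co_queue_init(&blk->queued_requests);

/* On the request paths: park the calling coroutine while drained */
static void blk_wait_while_drained(BlockBackend *blk)
{
    if (blk->quiesce_counter && !blk->disable_request_queuing) {
        qemu_co_queue_wait(&blk->queued_requests, NULL);
    }
}

/* In blk_root_drained_end(), once the last drain section ends */
while (qemu_co_enter_next(&blk->queued_requests, NULL)) {
    /* Resume all queued requests */
}

The new wait_while_drained parameter of blk_co_preadv/pwritev exists so
that the AIO entry points, which already waited themselves (and dropped
their in-flight reference while doing so), don't wait a second time.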
 include/sysemu/block-backend.h | 11 +++---
 block/backup.c                 |  1 +
 block/block-backend.c          | 69 +++++++++++++++++++++++++++++-----
 block/commit.c                 |  2 +
 block/mirror.c                 |  6 ++-
 blockjob.c                     |  3 ++
 tests/test-bdrv-drain.c        |  1 +
 7 files changed, 76 insertions(+), 17 deletions(-)

diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 7320b58467..d453a4e9a1 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -104,6 +104,7 @@ void blk_get_perm(BlockBackend *blk, uint64_t *perm, uint64_t *shared_perm);
 
 void blk_set_allow_write_beyond_eof(BlockBackend *blk, bool allow);
 void blk_set_allow_aio_context_change(BlockBackend *blk, bool allow);
+void blk_set_disable_request_queuing(BlockBackend *blk, bool disable);
 void blk_iostatus_enable(BlockBackend *blk);
 bool blk_iostatus_is_enabled(const BlockBackend *blk);
 BlockDeviceIoStatus blk_iostatus(const BlockBackend *blk);
@@ -119,10 +120,10 @@ BlockBackend *blk_by_qdev_id(const char *id, Error **errp);
 void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops, void *opaque);
 int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
                                unsigned int bytes, QEMUIOVector *qiov,
-                               BdrvRequestFlags flags);
+                               BdrvRequestFlags flags, bool wait_while_drained);
 int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
-                                unsigned int bytes, QEMUIOVector *qiov,
-                                BdrvRequestFlags flags);
+                                unsigned int bytes, QEMUIOVector *qiov,
+                                BdrvRequestFlags flags, bool wait_while_drained);
 
 static inline int coroutine_fn blk_co_pread(BlockBackend *blk, int64_t offset,
                                             unsigned int bytes, void *buf,
@@ -130,7 +131,7 @@ static inline int coroutine_fn blk_co_pread(BlockBackend *blk, int64_t offset,
 {
     QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, buf, bytes);
 
-    return blk_co_preadv(blk, offset, bytes, &qiov, flags);
+    return blk_co_preadv(blk, offset, bytes, &qiov, flags, true);
 }
 
 static inline int coroutine_fn blk_co_pwrite(BlockBackend *blk, int64_t offset,
@@ -139,7 +140,7 @@ static inline int coroutine_fn blk_co_pwrite(BlockBackend *blk, int64_t offset,
 {
     QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, buf, bytes);
 
-    return blk_co_pwritev(blk, offset, bytes, &qiov, flags);
+    return blk_co_pwritev(blk, offset, bytes, &qiov, flags, true);
 }
 
 int blk_pwrite_zeroes(BlockBackend *blk, int64_t offset,
diff --git a/block/backup.c b/block/backup.c
index 715e1d3be8..f66b2f4ee7 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -635,6 +635,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     if (ret < 0) {
         goto error;
     }
+    blk_set_disable_request_queuing(job->target, true);
 
     job->on_source_error = on_source_error;
     job->on_target_error = on_target_error;
diff --git a/block/block-backend.c b/block/block-backend.c
index fdd6b01ecf..603b281743 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -79,6 +79,9 @@ struct BlockBackend {
     QLIST_HEAD(, BlockBackendAioNotifier) aio_notifiers;
 
     int quiesce_counter;
+    CoQueue queued_requests;
+    bool disable_request_queuing;
+
     VMChangeStateEntry *vmsh;
     bool force_allow_inactivate;
@@ -339,6 +342,7 @@ BlockBackend *blk_new(AioContext *ctx, uint64_t perm, uint64_t shared_perm)
 
     block_acct_init(&blk->stats);
 
+    qemu_co_queue_init(&blk->queued_requests);
     notifier_list_init(&blk->remove_bs_notifiers);
     notifier_list_init(&blk->insert_bs_notifiers);
     QLIST_INIT(&blk->aio_notifiers);
@@ -1096,6 +1100,11 @@ void blk_set_allow_aio_context_change(BlockBackend *blk, bool allow)
     blk->allow_aio_context_change = allow;
 }
 
+void blk_set_disable_request_queuing(BlockBackend *blk, bool disable)
+{
+    blk->disable_request_queuing = disable;
+}
+
 static int blk_check_byte_request(BlockBackend *blk, int64_t offset,
                                   size_t size)
 {
@@ -1127,13 +1136,26 @@ static int blk_check_byte_request(BlockBackend *blk, int64_t offset,
     return 0;
 }
 
+static void blk_wait_while_drained(BlockBackend *blk)
+{
+    if (blk->quiesce_counter && !blk->disable_request_queuing) {
+        qemu_co_queue_wait(&blk->queued_requests, NULL);
+    }
+}
+
 int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
                                unsigned int bytes, QEMUIOVector *qiov,
-                               BdrvRequestFlags flags)
+                               BdrvRequestFlags flags, bool wait_while_drained)
 {
     int ret;
-    BlockDriverState *bs = blk_bs(blk);
+    BlockDriverState *bs;
 
+    if (wait_while_drained) {
+        blk_wait_while_drained(blk);
+    }
+
+    /* Call blk_bs() only after waiting, the graph may have changed */
+    bs = blk_bs(blk);
     trace_blk_co_preadv(blk, bs, offset, bytes, flags);
 
     ret = blk_check_byte_request(blk, offset, bytes);
@@ -1156,11 +1178,17 @@ int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
 
 int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
                                 unsigned int bytes, QEMUIOVector *qiov,
-                                BdrvRequestFlags flags)
+                                BdrvRequestFlags flags, bool wait_while_drained)
 {
     int ret;
-    BlockDriverState *bs = blk_bs(blk);
+    BlockDriverState *bs;
+
+    if (wait_while_drained) {
+        blk_wait_while_drained(blk);
+    }
 
+    /* Call blk_bs() only after waiting, the graph may have changed */
+    bs = blk_bs(blk);
     trace_blk_co_pwritev(blk, bs, offset, bytes, flags);
 
     ret = blk_check_byte_request(blk, offset, bytes);
@@ -1198,7 +1226,7 @@ static void blk_read_entry(void *opaque)
     QEMUIOVector *qiov = rwco->iobuf;
 
     rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, qiov->size,
-                              qiov, rwco->flags);
+                              qiov, rwco->flags, true);
     aio_wait_kick();
 }
 
@@ -1208,7 +1236,7 @@ static void blk_write_entry(void *opaque)
     QEMUIOVector *qiov = rwco->iobuf;
 
     rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, qiov->size,
-                               qiov, rwco->flags);
+                               qiov, rwco->flags, true);
     aio_wait_kick();
 }
 
@@ -1349,9 +1377,15 @@ static void blk_aio_read_entry(void *opaque)
     BlkRwCo *rwco = &acb->rwco;
     QEMUIOVector *qiov = rwco->iobuf;
 
+    if (rwco->blk->quiesce_counter) {
+        blk_dec_in_flight(rwco->blk);
+        blk_wait_while_drained(rwco->blk);
+        blk_inc_in_flight(rwco->blk);
+    }
+
     assert(qiov->size == acb->bytes);
     rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, acb->bytes,
-                              qiov, rwco->flags);
+                              qiov, rwco->flags, false);
     blk_aio_complete(acb);
 }
 
@@ -1361,9 +1395,15 @@ static void blk_aio_write_entry(void *opaque)
     BlkRwCo *rwco = &acb->rwco;
     QEMUIOVector *qiov = rwco->iobuf;
 
+    if (rwco->blk->quiesce_counter) {
+        blk_dec_in_flight(rwco->blk);
+        blk_wait_while_drained(rwco->blk);
+        blk_inc_in_flight(rwco->blk);
+    }
+
     assert(!qiov || qiov->size == acb->bytes);
     rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, acb->bytes,
-                               qiov, rwco->flags);
+                               qiov, rwco->flags, false);
     blk_aio_complete(acb);
 }
 
@@ -1482,6 +1522,8 @@ void blk_aio_cancel_async(BlockAIOCB *acb)
 
 int blk_co_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
 {
+    blk_wait_while_drained(blk);
+
     if (!blk_is_available(blk)) {
         return -ENOMEDIUM;
     }
@@ -1522,7 +1564,11 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
 
 int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
 {
-    int ret = blk_check_byte_request(blk, offset, bytes);
+    int ret;
+
+    blk_wait_while_drained(blk);
+
+    ret = blk_check_byte_request(blk, offset, bytes);
     if (ret < 0) {
         return ret;
     }
@@ -2004,7 +2050,7 @@ int coroutine_fn blk_co_pwrite_zeroes(BlockBackend *blk, int64_t offset,
                                       int bytes, BdrvRequestFlags flags)
 {
     return blk_co_pwritev(blk, offset, bytes, NULL,
-                          flags | BDRV_REQ_ZERO_WRITE);
+                          flags | BDRV_REQ_ZERO_WRITE, true);
 }
 
 int blk_pwrite_compressed(BlockBackend *blk, int64_t offset, const void *buf,
@@ -2232,6 +2278,9 @@ static void blk_root_drained_end(BdrvChild *child, int *drained_end_counter)
         if (blk->dev_ops && blk->dev_ops->drained_end) {
             blk->dev_ops->drained_end(blk->dev_opaque);
         }
+        while (qemu_co_enter_next(&blk->queued_requests, NULL)) {
+            /* Resume all queued requests */
+        }
     }
 }
 
diff --git a/block/commit.c b/block/commit.c
index 2c5a6d4ebc..408ae15389 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -350,6 +350,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
     if (ret < 0) {
         goto fail;
     }
+    blk_set_disable_request_queuing(s->base, true);
     s->base_bs = base;
 
     /* Required permissions are already taken with block_job_add_bdrv() */
@@ -358,6 +359,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
     if (ret < 0) {
         goto fail;
     }
+    blk_set_disable_request_queuing(s->top, true);
 
     s->backing_file_str = g_strdup(backing_file_str);
     s->on_error = on_error;
diff --git a/block/mirror.c b/block/mirror.c
index 7483051f8d..8d0a3a987d 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -231,7 +231,8 @@ static void coroutine_fn mirror_read_complete(MirrorOp *op, int ret)
         return;
     }
 
-    ret = blk_co_pwritev(s->target, op->offset, op->qiov.size, &op->qiov, 0);
+    ret = blk_co_pwritev(s->target, op->offset, op->qiov.size, &op->qiov, 0,
+                         false);
     mirror_write_complete(op, ret);
 }
 
@@ -1237,7 +1238,7 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
     switch (method) {
     case MIRROR_METHOD_COPY:
         ret = blk_co_pwritev(job->target, dirty_offset, dirty_bytes,
-                             qiov ? &target_qiov : NULL, flags);
+                             qiov ? &target_qiov : NULL, flags, false);
         break;
 
     case MIRROR_METHOD_ZERO:
@@ -1624,6 +1625,7 @@ static BlockJob *mirror_start_job(
         blk_set_force_allow_inactivate(s->target);
     }
     blk_set_allow_aio_context_change(s->target, true);
+    blk_set_disable_request_queuing(s->target, true);
 
     s->replaces = g_strdup(replaces);
     s->on_source_error = on_source_error;
diff --git a/blockjob.c b/blockjob.c
index 20b7f557da..73d9f1ba2b 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -445,6 +445,9 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
 
     bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
 
+    /* Disable request queuing in the BlockBackend to avoid deadlocks on drain:
+     * The job reports that it's busy until it reaches a pause point. */
+    blk_set_disable_request_queuing(blk, true);
     blk_set_allow_aio_context_change(blk, true);
 
     /* Only set speed when necessary to avoid NotSupported error */
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index 03fa1142a1..3fcf7c1c95 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -677,6 +677,7 @@ static void test_iothread_common(enum drain_type drain_type, int drain_thread)
                               &error_abort);
     s = bs->opaque;
     blk_insert_bs(blk, bs, &error_abort);
+    blk_set_disable_request_queuing(blk, true);
 
     blk_set_aio_context(blk, ctx_a, &error_abort);
     aio_context_acquire(ctx_a);