From patchwork Fri Jul 3 16:11:24 2020
X-Patchwork-Submitter: "Denis V. Lunev"
X-Patchwork-Id: 11642251
From: "Denis V. Lunev"
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 1/7] migration/savevm: respect qemu_fclose() error code in save_snapshot()
Date: Fri, 3 Jul 2020 19:11:24 +0300
Message-Id: <20200703161130.23772-2-den@openvz.org>
In-Reply-To: <20200703161130.23772-1-den@openvz.org>
References: <20200703161130.23772-1-den@openvz.org>

qemu_fclose() can return an error, e.g. if bdrv_co_flush() returns one.
This validation will become more important once we start waiting for
asynchronous I/O operations initiated from bdrv_write_vmstate(), which
are coming soon.

Signed-off-by: Denis V. Lunev
Reviewed-by: "Dr.
David Alan Gilbert"
Reviewed-by: Vladimir Sementsov-Ogievskiy
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: Denis Plotnikov
Reviewed-by: Juan Quintela
---
 migration/savevm.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index b979ea6e7f..da3dead4e9 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2628,7 +2628,7 @@ int save_snapshot(const char *name, Error **errp)
 {
     BlockDriverState *bs, *bs1;
     QEMUSnapshotInfo sn1, *sn = &sn1, old_sn1, *old_sn = &old_sn1;
-    int ret = -1;
+    int ret = -1, ret2;
     QEMUFile *f;
     int saved_vm_running;
     uint64_t vm_state_size;
@@ -2712,10 +2712,14 @@ int save_snapshot(const char *name, Error **errp)
     }
     ret = qemu_savevm_state(f, errp);
     vm_state_size = qemu_ftell(f);
-    qemu_fclose(f);
+    ret2 = qemu_fclose(f);
     if (ret < 0) {
         goto the_end;
     }
+    if (ret2 < 0) {
+        ret = ret2;
+        goto the_end;
+    }
 
     /* The bdrv_all_create_snapshot() call that follows acquires the AioContext
      * for itself.  BDRV_POLL_WHILE() does not support nested locking because

From patchwork Fri Jul 3 16:11:25 2020
X-Patchwork-Id: 11642261
X-Patchwork-Submitter: "Denis V.
Lunev"
From: "Denis V. Lunev"
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 2/7] block/aio_task: allow start/wait task from any coroutine
Date: Fri, 3 Jul 2020 19:11:25 +0300
Message-Id: <20200703161130.23772-3-den@openvz.org>
In-Reply-To: <20200703161130.23772-1-den@openvz.org>
References: <20200703161130.23772-1-den@openvz.org>

From: Vladimir Sementsov-Ogievskiy

Currently, the aio task pool assumes that there is one main coroutine,
which creates tasks and waits for them. Let's remove that restriction by
using a CoQueue. The code becomes clearer and the interface more obvious.

Signed-off-by: Vladimir Sementsov-Ogievskiy
Signed-off-by: Denis V. Lunev
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: "Dr.
David Alan Gilbert"
CC: Vladimir Sementsov-Ogievskiy
CC: Denis Plotnikov
---
 block/aio_task.c | 21 ++++++---------------
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/block/aio_task.c b/block/aio_task.c
index 88989fa248..cf62e5c58b 100644
--- a/block/aio_task.c
+++ b/block/aio_task.c
@@ -27,11 +27,10 @@
 #include "block/aio_task.h"
 
 struct AioTaskPool {
-    Coroutine *main_co;
     int status;
     int max_busy_tasks;
     int busy_tasks;
-    bool waiting;
+    CoQueue waiters;
 };
 
 static void coroutine_fn aio_task_co(void *opaque)
@@ -52,31 +51,23 @@ static void coroutine_fn aio_task_co(void *opaque)
 
     g_free(task);
 
-    if (pool->waiting) {
-        pool->waiting = false;
-        aio_co_wake(pool->main_co);
-    }
+    qemu_co_queue_restart_all(&pool->waiters);
 }
 
 void coroutine_fn aio_task_pool_wait_one(AioTaskPool *pool)
 {
     assert(pool->busy_tasks > 0);
-    assert(qemu_coroutine_self() == pool->main_co);
 
-    pool->waiting = true;
-    qemu_coroutine_yield();
+    qemu_co_queue_wait(&pool->waiters, NULL);
 
-    assert(!pool->waiting);
     assert(pool->busy_tasks < pool->max_busy_tasks);
 }
 
 void coroutine_fn aio_task_pool_wait_slot(AioTaskPool *pool)
 {
-    if (pool->busy_tasks < pool->max_busy_tasks) {
-        return;
+    while (pool->busy_tasks >= pool->max_busy_tasks) {
+        aio_task_pool_wait_one(pool);
     }
-
-    aio_task_pool_wait_one(pool);
 }
 
 void coroutine_fn aio_task_pool_wait_all(AioTaskPool *pool)
@@ -98,8 +89,8 @@ AioTaskPool *coroutine_fn aio_task_pool_new(int max_busy_tasks)
 {
     AioTaskPool *pool = g_new0(AioTaskPool, 1);
 
-    pool->main_co = qemu_coroutine_self();
     pool->max_busy_tasks = max_busy_tasks;
+    qemu_co_queue_init(&pool->waiters);
 
     return pool;
 }

From patchwork Fri Jul 3 16:11:26 2020
X-Patchwork-Id: 11642255
X-Patchwork-Submitter: "Denis V.
Lunev"
From: "Denis V. Lunev"
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 3/7] block/aio_task: drop aio_task_pool_wait_one() helper
Date: Fri, 3 Jul 2020 19:11:26 +0300
Message-Id: <20200703161130.23772-4-den@openvz.org>
In-Reply-To: <20200703161130.23772-1-den@openvz.org>
References: <20200703161130.23772-1-den@openvz.org>

aio_task_pool_wait_one() is not used outside the module. Actually, there
are two kinds of waiters:
- for a free slot, and
- for all tasks to finish.
This patch limits the external API to these two types.

Signed-off-by: Denis V. Lunev
Suggested-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: Vladimir Sementsov-Ogievskiy
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: "Dr.
David Alan Gilbert"
CC: Denis Plotnikov
---
 block/aio_task.c         | 13 ++-----------
 include/block/aio_task.h |  1 -
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/block/aio_task.c b/block/aio_task.c
index cf62e5c58b..7ba15ff41f 100644
--- a/block/aio_task.c
+++ b/block/aio_task.c
@@ -54,26 +54,17 @@ static void coroutine_fn aio_task_co(void *opaque)
     qemu_co_queue_restart_all(&pool->waiters);
 }
 
-void coroutine_fn aio_task_pool_wait_one(AioTaskPool *pool)
-{
-    assert(pool->busy_tasks > 0);
-
-    qemu_co_queue_wait(&pool->waiters, NULL);
-
-    assert(pool->busy_tasks < pool->max_busy_tasks);
-}
-
 void coroutine_fn aio_task_pool_wait_slot(AioTaskPool *pool)
 {
     while (pool->busy_tasks >= pool->max_busy_tasks) {
-        aio_task_pool_wait_one(pool);
+        qemu_co_queue_wait(&pool->waiters, NULL);
     }
 }
 
 void coroutine_fn aio_task_pool_wait_all(AioTaskPool *pool)
 {
     while (pool->busy_tasks > 0) {
-        aio_task_pool_wait_one(pool);
+        qemu_co_queue_wait(&pool->waiters, NULL);
     }
 }
 
diff --git a/include/block/aio_task.h b/include/block/aio_task.h
index 50bc1e1817..50b1c036c5 100644
--- a/include/block/aio_task.h
+++ b/include/block/aio_task.h
@@ -48,7 +48,6 @@ bool aio_task_pool_empty(AioTaskPool *pool);
 void coroutine_fn aio_task_pool_start_task(AioTaskPool *pool, AioTask *task);
 
 void coroutine_fn aio_task_pool_wait_slot(AioTaskPool *pool);
-void coroutine_fn aio_task_pool_wait_one(AioTaskPool *pool);
 void coroutine_fn aio_task_pool_wait_all(AioTaskPool *pool);
 
 #endif /* BLOCK_AIO_TASK_H */

From patchwork Fri Jul 3 16:11:27 2020
X-Patchwork-Id: 11642253
X-Patchwork-Submitter: "Denis V.
Lunev"
From: "Denis V. Lunev"
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 4/7] block/block-backend: remove always true check from blk_save_vmstate
Date: Fri, 3 Jul 2020 19:11:27 +0300
Message-Id: <20200703161130.23772-5-den@openvz.org>
In-Reply-To: <20200703161130.23772-1-den@openvz.org>
References: <20200703161130.23772-1-den@openvz.org>

bdrv_save_vmstate() returns either an error (a negative value) or the
size that was written. Thus the "ret == size" part of this check is
always true and therefore useless.

Signed-off-by: Denis V. Lunev
Suggested-by: Eric Blake
Reviewed-by: Vladimir Sementsov-Ogievskiy
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: "Dr.
David Alan Gilbert"
CC: Denis Plotnikov
---
 block/block-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 6936b25c83..1c6e53bbde 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2188,7 +2188,7 @@ int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
         return ret;
     }
 
-    if (ret == size && !blk->enable_write_cache) {
+    if (!blk->enable_write_cache) {
         ret = bdrv_flush(blk_bs(blk));
     }

From patchwork Fri Jul 3 16:11:28 2020
X-Patchwork-Submitter: "Denis V. Lunev"
X-Patchwork-Id: 11642263
From: "Denis V. Lunev"
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 5/7] block, migration: add bdrv_finalize_vmstate helper
Date: Fri, 3 Jul 2020 19:11:28 +0300
Message-Id: <20200703161130.23772-6-den@openvz.org>
In-Reply-To: <20200703161130.23772-1-den@openvz.org>
References: <20200703161130.23772-1-den@openvz.org>

Right now bdrv_fclose() just calls bdrv_flush(). The problem is that the
migration code works inefficiently in block-layer terms: block-layer
operations are frequently invoked for very small pieces of unaligned
data. The block layer can work this way, but it is very slow. This patch
prepares for the introduction of an intermediate buffer in the block
driver state. It is beneficial to separate the conventional bdrv_flush()
from the closing of the QEMU file in the migration code.
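The close-path ordering this separation enables — first commit any pending vmstate data, then do the conventional flush, reporting the first failure — can be sketched outside QEMU with hypothetical stand-ins (MockFile and the mock_* functions below are illustrative, not the QEMU API):

```c
#include <assert.h>

/*
 * Toy model of the close path: outstanding vmstate data is committed
 * first; only if that succeeds does the conventional flush run, and
 * the first failure is the one reported.
 */
typedef struct MockFile {
    int finalize_ret;   /* result of committing pending vmstate writes */
    int flush_ret;      /* result of the conventional flush */
    int flush_called;   /* records whether the flush step ran */
} MockFile;

static int mock_finalize_vmstate(MockFile *f)
{
    return f->finalize_ret;
}

static int mock_flush(MockFile *f)
{
    f->flush_called = 1;
    return f->flush_ret;
}

/* Finalize first; a finalize error short-circuits before the flush. */
static int mock_fclose(MockFile *f)
{
    int err = mock_finalize_vmstate(f);

    if (err < 0) {
        return err;     /* pending writes failed: skip the flush */
    }
    return mock_flush(f);
}
```

A failed finalize step thus skips the flush entirely, and its error code is what the caller sees.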
The patch also forces a bdrv_finalize_vmstate() call inside the
synchronous blk_save_vmstate() operation. This helper is used only from
qemu-io.

Signed-off-by: Denis V. Lunev
Reviewed-by: Vladimir Sementsov-Ogievskiy
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: "Dr. David Alan Gilbert"
CC: Denis Plotnikov
---
 block/block-backend.c |  6 +++++-
 block/io.c            | 15 +++++++++++++++
 include/block/block.h |  5 +++++
 migration/savevm.c    |  4 ++++
 4 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 1c6e53bbde..5bb11c8e01 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2177,16 +2177,20 @@ int blk_truncate(BlockBackend *blk, int64_t offset, bool exact,
 int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
                      int64_t pos, int size)
 {
-    int ret;
+    int ret, ret2;
 
     if (!blk_is_available(blk)) {
         return -ENOMEDIUM;
     }
 
     ret = bdrv_save_vmstate(blk_bs(blk), buf, pos, size);
+    ret2 = bdrv_finalize_vmstate(blk_bs(blk));
     if (ret < 0) {
         return ret;
     }
+    if (ret2 < 0) {
+        return ret2;
+    }
 
     if (!blk->enable_write_cache) {
         ret = bdrv_flush(blk_bs(blk));
diff --git a/block/io.c b/block/io.c
index df8f2a98d4..1f69268361 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2724,6 +2724,21 @@ int bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
     return bdrv_rw_vmstate(bs, qiov, pos, true);
 }
 
+static int coroutine_fn bdrv_co_finalize_vmstate(BlockDriverState *bs)
+{
+    return 0;
+}
+
+static int coroutine_fn bdrv_finalize_vmstate_co_entry(void *opaque)
+{
+    return bdrv_co_finalize_vmstate(opaque);
+}
+
+int bdrv_finalize_vmstate(BlockDriverState *bs)
+{
+    return bdrv_run_co(bs, bdrv_finalize_vmstate_co_entry, bs);
+}
+
 /**************************************************************/
 /* async I/Os */
 
diff --git a/include/block/block.h b/include/block/block.h
index e8fc814996..0119b68505 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -572,6 +572,11 @@ int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
 int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
                       int64_t pos, int size);
 
+/*
+ * bdrv_finalize_vmstate() is mandatory to commit vmstate changes if
+ * bdrv_save_vmstate() was ever called.
+ */
+int bdrv_finalize_vmstate(BlockDriverState *bs);
 
 void bdrv_img_create(const char *filename, const char *fmt,
                      const char *base_filename, const char *base_fmt,
diff --git a/migration/savevm.c b/migration/savevm.c
index da3dead4e9..798a4cb402 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -150,6 +150,10 @@ static ssize_t block_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
 
 static int bdrv_fclose(void *opaque, Error **errp)
 {
+    int err = bdrv_finalize_vmstate(opaque);
+    if (err < 0) {
+        return err;
+    }
     return bdrv_flush(opaque);
 }

From patchwork Fri Jul 3 16:11:29 2020
X-Patchwork-Submitter: "Denis V. Lunev"
X-Patchwork-Id: 11642259
From: "Denis V. Lunev"
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 6/7] block/io: improve savevm performance
Date: Fri, 3 Jul 2020 19:11:29 +0300
Message-Id: <20200703161130.23772-7-den@openvz.org>
In-Reply-To: <20200703161130.23772-1-den@openvz.org>
References: <20200703161130.23772-1-den@openvz.org>

This patch does two basic things:
- it creates an intermediate buffer for all writes from the QEMU
  migration code to the block driver;
- this buffer is sent to disk asynchronously, allowing several writes to
  run in parallel.
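The first of the two steps can be sketched outside QEMU as a toy write-coalescing buffer (CoalesceBuf, backend_write and friends are illustrative names, not the QEMU API; the real series submits each full buffer asynchronously through the aio task pool, while this sketch issues it synchronously for simplicity):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Small writes accumulate in a fixed intermediate buffer; every full
 * buffer is handed to the backend as one large write.
 */
#define COALESCE_BUF_SIZE 1024

typedef struct CoalesceBuf {
    unsigned char buf[COALESCE_BUF_SIZE];
    size_t filled;          /* bytes accumulated so far */
    int backend_writes;     /* number of large writes issued */
} CoalesceBuf;

/* Stand-in for the expensive backend write (real disk I/O in QEMU). */
static void backend_write(CoalesceBuf *c, const unsigned char *p, size_t n)
{
    (void)p;
    (void)n;
    c->backend_writes++;
}

/* Append n bytes, issuing one backend write per filled buffer. */
static void coalesce_write(CoalesceBuf *c, const unsigned char *p, size_t n)
{
    while (n > 0) {
        size_t to_copy = COALESCE_BUF_SIZE - c->filled;

        if (to_copy > n) {
            to_copy = n;
        }
        memcpy(c->buf + c->filled, p, to_copy);
        c->filled += to_copy;
        p += to_copy;
        n -= to_copy;
        if (c->filled == COALESCE_BUF_SIZE) {
            backend_write(c, c->buf, c->filled);
            c->filled = 0;
        }
    }
}

/* Flush the final partial buffer (the finalize step). */
static void coalesce_finalize(CoalesceBuf *c)
{
    if (c->filled > 0) {
        backend_write(c, c->buf, c->filled);
        c->filled = 0;
    }
}
```

With a 1024-byte buffer, for instance, 25 writes of 100 bytes each turn into two full-buffer backend writes plus one final partial flush, instead of 25 small unaligned ones.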
Thus bdrv_vmstate_write() becomes asynchronous. Completion of all pending
operations is performed in the newly introduced bdrv_finalize_vmstate().

In general, the migration code is (by observation) strikingly
inefficient: buffers are not aligned and are sent in arbitrary pieces,
often less than 100 bytes per chunk, which results in read-modify-write
operations if the target file descriptor is opened with O_DIRECT. It
should also be noted that all operations go to unallocated image blocks,
which additionally suffer from partial writes to such new clusters, even
on cached file descriptors.

Snapshot creation time (2 GB Fedora-31 VM running over NVMe storage):

                 original   fixed
    cached:      1.79s      1.27s
    non-cached:  3.29s      0.81s

The difference over HDD would be more significant :)

Signed-off-by: Denis V. Lunev
Reviewed-by: Vladimir Sementsov-Ogievskiy
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: "Dr. David Alan Gilbert"
CC: Denis Plotnikov
---
 block/io.c                | 126 +++++++++++++++++++++++++++++++++++++-
 include/block/block_int.h |   8 +++
 2 files changed, 132 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index 1f69268361..71a696deb7 100644
--- a/block/io.c
+++ b/block/io.c
@@ -26,6 +26,7 @@
 #include "trace.h"
 #include "sysemu/block-backend.h"
 #include "block/aio-wait.h"
+#include "block/aio_task.h"
 #include "block/blockjob.h"
 #include "block/blockjob_int.h"
 #include "block/block_int.h"
@@ -33,6 +34,7 @@
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "qemu/main-loop.h"
+#include "qemu/units.h"
 #include "sysemu/replay.h"
 
 /* Maximum bounce buffer for copy-on-read and write zeroes, in bytes */
@@ -2640,6 +2642,103 @@ typedef struct BdrvVmstateCo {
     bool is_read;
 } BdrvVmstateCo;
 
+typedef struct BdrvVMStateTask {
+    AioTask task;
+
+    BlockDriverState *bs;
+    int64_t offset;
+    void *buf;
+    size_t bytes;
+} BdrvVMStateTask;
+
+typedef struct BdrvSaveVMState {
+    AioTaskPool *pool;
+    BdrvVMStateTask *t;
+} BdrvSaveVMState;
+
+
+static coroutine_fn int bdrv_co_vmstate_save_task_entry(AioTask *task)
+{
+    int err = 0;
+    BdrvVMStateTask *t = container_of(task, BdrvVMStateTask, task);
+
+    if (t->bytes != 0) {
+        QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, t->buf, t->bytes);
+
+        bdrv_inc_in_flight(t->bs);
+        err = t->bs->drv->bdrv_save_vmstate(t->bs, &qiov, t->offset);
+        bdrv_dec_in_flight(t->bs);
+    }
+
+    qemu_vfree(t->buf);
+    return err;
+}
+
+static BdrvVMStateTask *bdrv_vmstate_task_create(BlockDriverState *bs,
+                                                int64_t pos, size_t size)
+{
+    BdrvVMStateTask *t = g_new(BdrvVMStateTask, 1);
+
+    *t = (BdrvVMStateTask) {
+        .task.func = bdrv_co_vmstate_save_task_entry,
+        .buf = qemu_blockalign(bs, size),
+        .offset = pos,
+        .bs = bs,
+    };
+
+    return t;
+}
+
+static int bdrv_co_do_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
+                                   int64_t pos)
+{
+    BdrvSaveVMState *state = bs->savevm_state;
+    BdrvVMStateTask *t;
+    size_t buf_size = MAX(bdrv_get_cluster_size(bs), 1 * MiB);
+    size_t to_copy, off;
+
+    if (state == NULL) {
+        state = g_new(BdrvSaveVMState, 1);
+        *state = (BdrvSaveVMState) {
+            .pool = aio_task_pool_new(BDRV_VMSTATE_WORKERS_MAX),
+            .t = bdrv_vmstate_task_create(bs, pos, buf_size),
+        };
+
+        bs->savevm_state = state;
+    }
+
+    if (aio_task_pool_status(state->pool) < 0) {
+        /*
+         * The operation as a whole is unsuccessful. Prohibit all further
+         * operations. If we clean here, new useless ops will come again.
+         * Thus we rely on caller for cleanup here.
+         */
+        return aio_task_pool_status(state->pool);
+    }
+
+    t = state->t;
+    if (t->offset + t->bytes != pos) {
+        /* Normally this branch is not reachable from migration */
+        return bs->drv->bdrv_save_vmstate(bs, qiov, pos);
+    }
+
+    off = 0;
+    while (1) {
+        to_copy = MIN(qiov->size - off, buf_size - t->bytes);
+        qemu_iovec_to_buf(qiov, off, t->buf + t->bytes, to_copy);
+        t->bytes += to_copy;
+        if (t->bytes < buf_size) {
+            return 0;
+        }
+
+        aio_task_pool_start_task(state->pool, &t->task);
+
+        pos += to_copy;
+        off += to_copy;
+        state->t = t = bdrv_vmstate_task_create(bs, pos, buf_size);
+    }
+}
+
 static int coroutine_fn
 bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
                    bool is_read)
@@ -2655,7 +2754,7 @@ bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
         if (is_read) {
             ret = drv->bdrv_load_vmstate(bs, qiov, pos);
         } else {
-            ret = drv->bdrv_save_vmstate(bs, qiov, pos);
+            ret = bdrv_co_do_save_vmstate(bs, qiov, pos);
         }
     } else if (bs->file) {
         ret = bdrv_co_rw_vmstate(bs->file->bs, qiov, pos, is_read);
@@ -2726,7 +2825,30 @@ int bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
 
 static int coroutine_fn bdrv_co_finalize_vmstate(BlockDriverState *bs)
 {
-    return 0;
+    int err;
+    BdrvSaveVMState *state = bs->savevm_state;
+
+    if (bs->drv->bdrv_save_vmstate == NULL && bs->file != NULL) {
+        return bdrv_co_finalize_vmstate(bs->file->bs);
+    }
+    if (state == NULL) {
+        return 0;
+    }
+
+    if (aio_task_pool_status(state->pool) >= 0) {
+        /* We are on success path, commit last chunk if possible */
+        aio_task_pool_start_task(state->pool, &state->t->task);
+    }
+
+    aio_task_pool_wait_all(state->pool);
+    err = aio_task_pool_status(state->pool);
+
+    aio_task_pool_free(state->pool);
+    g_free(state);
+
+    bs->savevm_state = NULL;
+
+    return err;
 }
 
 static int coroutine_fn bdrv_finalize_vmstate_co_entry(void *opaque)
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 791de6a59c..f90f0e8b6a 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -61,6 +61,8 @@
 
 #define BLOCK_PROBE_BUF_SIZE        512
 
+#define BDRV_VMSTATE_WORKERS_MAX    8
+
 enum BdrvTrackedRequestType {
     BDRV_TRACKED_READ,
     BDRV_TRACKED_WRITE,
@@ -784,6 +786,9 @@ struct BdrvChild {
     QLIST_ENTRY(BdrvChild) next_parent;
 };
 
+
+typedef struct BdrvSaveVMState BdrvSaveVMState;
+
 /*
  * Note: the function bdrv_append() copies and swaps contents of
  * BlockDriverStates, so if you add new fields to this struct, please
@@ -947,6 +952,9 @@ struct BlockDriverState {
 
     /* BdrvChild links to this node may never be frozen */
     bool never_freeze;
+
+    /* Intermediate buffer for VM state saving from snapshot creation code */
+    BdrvSaveVMState *savevm_state;
 };
 
 struct BlockBackendRootState {

From patchwork Fri Jul 3 16:11:30 2020
X-Patchwork-Submitter: "Denis V. Lunev"
X-Patchwork-Id: 11642257
lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jrOHt-0001ME-Tz; Fri, 03 Jul 2020 12:11:41 -0400 Received: from relay.sw.ru ([185.231.240.75]:48376 helo=relay3.sw.ru) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jrOHq-0002cw-Ee; Fri, 03 Jul 2020 12:11:41 -0400 Received: from [192.168.15.23] (helo=iris.sw.ru) by relay3.sw.ru with esmtp (Exim 4.93) (envelope-from ) id 1jrOHg-0005Rg-43; Fri, 03 Jul 2020 19:11:28 +0300 From: "Denis V. Lunev" To: qemu-block@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH 7/7] block/io: improve loadvm performance Date: Fri, 3 Jul 2020 19:11:30 +0300 Message-Id: <20200703161130.23772-8-den@openvz.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200703161130.23772-1-den@openvz.org> References: <20200703161130.23772-1-den@openvz.org> Received-SPF: pass client-ip=185.231.240.75; envelope-from=den@openvz.org; helo=relay3.sw.ru X-detected-operating-system: by eggs.gnu.org: First seen = 2020/07/03 12:11:34 X-ACL-Warn: Detected OS = Linux 3.11 and newer X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , Vladimir Sementsov-Ogievskiy , Juan Quintela , Max Reitz , Denis Plotnikov , Stefan Hajnoczi , "Denis V. Lunev" Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" This patch creates intermediate buffer for reading from block driver state and performs read-ahead to this buffer. Snapshot code performs reads sequentially and thus we know what offsets will be required and when they will become not needed. Results are fantastic. 
Switch-to-snapshot times for a 2 GB Fedora 31 VM on NVMe storage are as
follows:
                original   fixed
cached:           1.84s    1.16s
non-cached:      12.74s    1.27s
The difference over HDD would be even more significant :)

Signed-off-by: Denis V. Lunev
CC: Vladimir Sementsov-Ogievskiy
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: Denis Plotnikov
---
 block/io.c                | 254 +++++++++++++++++++++++++++++++++++++-
 include/block/block_int.h |   3 +
 2 files changed, 254 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index 71a696deb7..decb850f1a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2739,6 +2739,209 @@ static int bdrv_co_do_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
     }
 }
+
+typedef struct BdrvLoadVMChunk {
+    void *buf;
+    uint64_t offset;
+    ssize_t bytes;
+
+    QLIST_ENTRY(BdrvLoadVMChunk) list;
+} BdrvLoadVMChunk;
+
+typedef struct BdrvLoadVMState {
+    AioTaskPool *pool;
+
+    int64_t offset;
+    int64_t last_loaded;
+
+    int chunk_count;
+    QLIST_HEAD(, BdrvLoadVMChunk) chunks;
+    QLIST_HEAD(, BdrvLoadVMChunk) loading;
+    CoMutex lock;
+    CoQueue waiters;
+} BdrvLoadVMState;
+
+typedef struct BdrvLoadVMStateTask {
+    AioTask task;
+
+    BlockDriverState *bs;
+    BdrvLoadVMChunk *chunk;
+} BdrvLoadVMStateTask;
+
+static BdrvLoadVMChunk *bdrv_co_find_loadvmstate_chunk(int64_t pos,
+                                                       BdrvLoadVMChunk *c)
+{
+    for (; c != NULL; c = QLIST_NEXT(c, list)) {
+        if (c->offset <= pos && c->offset + c->bytes > pos) {
+            return c;
+        }
+    }
+
+    return NULL;
+}
+
+static void bdrv_free_loadvm_chunk(BdrvLoadVMChunk *c)
+{
+    qemu_vfree(c->buf);
+    g_free(c);
+}
+
+static coroutine_fn int bdrv_co_vmstate_load_task_entry(AioTask *task)
+{
+    int err = 0;
+    BdrvLoadVMStateTask *t = container_of(task, BdrvLoadVMStateTask, task);
+    BdrvLoadVMChunk *c = t->chunk;
+    BdrvLoadVMState *state = t->bs->loadvm_state;
+    QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, c->buf, c->bytes);
+
+    bdrv_inc_in_flight(t->bs);
+    err = t->bs->drv->bdrv_load_vmstate(t->bs, &qiov, c->offset);
+    bdrv_dec_in_flight(t->bs);
+
+    qemu_co_mutex_lock(&state->lock);
+    QLIST_REMOVE(c, list);
+    if (err == 0) {
+        QLIST_INSERT_HEAD(&state->chunks, c, list);
+    } else {
+        bdrv_free_loadvm_chunk(c);
+    }
+    qemu_co_mutex_unlock(&state->lock);
+    qemu_co_queue_restart_all(&state->waiters);
+
+    return err;
+}
+
+
+static void bdrv_co_loadvmstate_next(BlockDriverState *bs, BdrvLoadVMChunk *c)
+{
+    BdrvLoadVMStateTask *t = g_new(BdrvLoadVMStateTask, 1);
+    BdrvLoadVMState *state = bs->loadvm_state;
+
+    qemu_co_mutex_assert_locked(&state->lock);
+
+    c->offset = state->last_loaded;
+
+    *t = (BdrvLoadVMStateTask) {
+        .task.func = bdrv_co_vmstate_load_task_entry,
+        .bs = bs,
+        .chunk = c,
+    };
+
+    QLIST_INSERT_HEAD(&state->loading, t->chunk, list);
+    state->chunk_count++;
+    state->last_loaded += c->bytes;
+
+    qemu_co_mutex_unlock(&state->lock);
+    aio_task_pool_start_task(state->pool, &t->task);
+    qemu_co_mutex_lock(&state->lock);
+}
+
+
+static void bdrv_co_loadvmstate_start(BlockDriverState *bs)
+{
+    int i;
+    size_t buf_size = MAX(bdrv_get_cluster_size(bs), 1 * MiB);
+    BdrvLoadVMState *state = bs->loadvm_state;
+
+    qemu_co_mutex_lock(&state->lock);
+    for (i = 0; i < BDRV_VMSTATE_WORKERS_MAX; i++) {
+        BdrvLoadVMChunk *c = g_new0(BdrvLoadVMChunk, 1);
+
+        c->buf = qemu_blockalign(bs, buf_size);
+        c->bytes = buf_size;
+
+        bdrv_co_loadvmstate_next(bs, c);
+    }
+    qemu_co_mutex_unlock(&state->lock);
+}
+
+static int bdrv_co_do_load_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
+                                   int64_t pos)
+{
+    BdrvLoadVMState *state = bs->loadvm_state;
+    BdrvLoadVMChunk *c;
+    size_t off;
+    int64_t start_pos = pos;
+
+    if (state == NULL) {
+        if (pos != 0) {
+            goto slow_path;
+        }
+
+        state = g_new(BdrvLoadVMState, 1);
+        *state = (BdrvLoadVMState) {
+            .pool = aio_task_pool_new(BDRV_VMSTATE_WORKERS_MAX),
+            .chunks = QLIST_HEAD_INITIALIZER(state->chunks),
+            .loading = QLIST_HEAD_INITIALIZER(state->loading),
+        };
+        qemu_co_mutex_init(&state->lock);
+        qemu_co_queue_init(&state->waiters);
+
+        bs->loadvm_state = state;
+
+        bdrv_co_loadvmstate_start(bs);
+    }
+
+    if (state->offset != pos) {
+        goto slow_path;
+    }
+
+    off = 0;
+
+    qemu_co_mutex_lock(&state->lock);
+    while (off < qiov->size && aio_task_pool_status(state->pool) == 0) {
+        c = bdrv_co_find_loadvmstate_chunk(pos, QLIST_FIRST(&state->chunks));
+        if (c != NULL) {
+            ssize_t chunk_off = pos - c->offset;
+            ssize_t to_copy = MIN(qiov->size - off, c->bytes - chunk_off);
+
+            qemu_iovec_from_buf(qiov, off, c->buf + chunk_off, to_copy);
+
+            off += to_copy;
+            pos += to_copy;
+
+            if (pos == c->offset + c->bytes) {
+                state->chunk_count--;
+                /* End of buffer, discard it from the list */
+                QLIST_REMOVE(c, list);
+
+                /*
+                 * Start loading the next chunk. The slot in the pool should
+                 * always be free for this purpose at the moment.
+                 *
+                 * There is a problem with the end of the stream. This code
+                 * starts to read data beyond the end of the saved state. The
+                 * low-level code should be ready for such behavior, but we
+                 * will have unnecessary BDRV_VMSTATE_WORKERS_MAX chunks
+                 * fully zeroed. This is not good, but acceptable.
+                 */
+                bdrv_co_loadvmstate_next(bs, c);
+            }
+
+            state->offset += to_copy;
+            continue;
+        }
+
+        c = bdrv_co_find_loadvmstate_chunk(pos, QLIST_FIRST(&state->loading));
+        if (c != NULL) {
+            qemu_co_queue_wait(&state->waiters, &state->lock);
+            continue;
+        }
+
+        /*
+         * This should not happen normally. This point could be reached only
+         * if we have had some parallel requests. Fall back to slow load.
+         */
+        qemu_co_mutex_unlock(&state->lock);
+
+slow_path:
+        return bs->drv->bdrv_load_vmstate(bs, qiov, start_pos);
+    }
+    qemu_co_mutex_unlock(&state->lock);
+
+    return aio_task_pool_status(state->pool);
+}
+
 static int coroutine_fn
 bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
                    bool is_read)
@@ -2752,7 +2955,7 @@ bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
         ret = -ENOMEDIUM;
     } else if (drv->bdrv_load_vmstate) {
         if (is_read) {
-            ret = drv->bdrv_load_vmstate(bs, qiov, pos);
+            ret = bdrv_co_do_load_vmstate(bs, qiov, pos);
         } else {
             ret = bdrv_co_do_save_vmstate(bs, qiov, pos);
         }
@@ -2823,13 +3026,13 @@ int bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
     return bdrv_rw_vmstate(bs, qiov, pos, true);
 }
 
-static int coroutine_fn bdrv_co_finalize_vmstate(BlockDriverState *bs)
+static int coroutine_fn bdrv_co_finalize_save_vmstate(BlockDriverState *bs)
 {
     int err;
     BdrvSaveVMState *state = bs->savevm_state;
 
     if (bs->drv->bdrv_save_vmstate == NULL && bs->file != NULL) {
-        return bdrv_co_finalize_vmstate(bs->file->bs);
+        return bdrv_co_finalize_save_vmstate(bs->file->bs);
     }
     if (state == NULL) {
         return 0;
@@ -2851,6 +3054,51 @@ static int coroutine_fn bdrv_co_finalize_vmstate(BlockDriverState *bs)
     return err;
 }
 
+static int coroutine_fn bdrv_co_finalize_load_vmstate(BlockDriverState *bs)
+{
+    int err;
+    BdrvLoadVMState *state = bs->loadvm_state;
+    BdrvLoadVMChunk *c, *tmp;
+
+    if (bs->drv->bdrv_load_vmstate == NULL && bs->file != NULL) {
+        return bdrv_co_finalize_load_vmstate(bs->file->bs);
+    }
+    if (state == NULL) {
+        return 0;
+    }
+
+    aio_task_pool_wait_all(state->pool);
+    err = aio_task_pool_status(state->pool);
+    aio_task_pool_free(state->pool);
+
+    QLIST_FOREACH(c, &state->loading, list) {
+        assert(0); /* this list must be empty as all tasks are committed */
+    }
+    QLIST_FOREACH_SAFE(c, &state->chunks, list, tmp) {
+        QLIST_REMOVE(c, list);
+        bdrv_free_loadvm_chunk(c);
+    }
+
+    g_free(state);
+
+    bs->loadvm_state = NULL;
+
+    return err;
+}
+
+static int coroutine_fn bdrv_co_finalize_vmstate(BlockDriverState *bs)
+{
+    int err1 = bdrv_co_finalize_save_vmstate(bs);
+    int err2 = bdrv_co_finalize_load_vmstate(bs);
+    if (err1 < 0) {
+        return err1;
+    }
+    if (err2 < 0) {
+        return err2;
+    }
+    return 0;
+}
+
 static int coroutine_fn bdrv_finalize_vmstate_co_entry(void *opaque)
 {
     return bdrv_co_finalize_vmstate(opaque);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f90f0e8b6a..0942578a74 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -788,6 +788,7 @@ struct BdrvChild {
 
 typedef struct BdrvSaveVMState BdrvSaveVMState;
+typedef struct BdrvLoadVMState BdrvLoadVMState;
 
 /*
  * Note: the function bdrv_append() copies and swaps contents of
@@ -955,6 +956,8 @@ struct BlockDriverState {
 
     /* Intermediate buffer for VM state saving from snapshot creation code */
     BdrvSaveVMState *savevm_state;
+    /* Intermediate buffer for VM state loading */
+    BdrvLoadVMState *loadvm_state;
 };
 
 struct BlockBackendRootState {