From patchwork Fri Jun 19 10:07:03 2020
X-Patchwork-Submitter: "Denis V. Lunev" <den@openvz.org>
X-Patchwork-Id: 11613669
From: "Denis V. Lunev" <den@openvz.org>
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: Kevin Wolf, Fam Zheng, Juan Quintela, Max Reitz, Denis Plotnikov,
    Stefan Hajnoczi, "Denis V. Lunev"
Subject: [PATCH 1/6] migration/savevm: respect qemu_fclose() error code in save_snapshot()
Date: Fri, 19 Jun 2020 13:07:03 +0300
Message-Id: <20200619100708.30440-2-den@openvz.org>
In-Reply-To: <20200619100708.30440-1-den@openvz.org>
References: <20200619100708.30440-1-den@openvz.org>

qemu_fclose() can return an error, e.g. when bdrv_co_flush() fails. This
validation will become more important once we start waiting for the
asynchronous I/O operations started from bdrv_write_vmstate(), which is
coming soon.

Signed-off-by: Denis V. Lunev
Reviewed-by: "Dr. David Alan Gilbert"
Reviewed-by: Vladimir Sementsov-Ogievskiy
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: Denis Plotnikov
---
 migration/savevm.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index b979ea6e7f..da3dead4e9 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2628,7 +2628,7 @@ int save_snapshot(const char *name, Error **errp)
 {
     BlockDriverState *bs, *bs1;
     QEMUSnapshotInfo sn1, *sn = &sn1, old_sn1, *old_sn = &old_sn1;
-    int ret = -1;
+    int ret = -1, ret2;
     QEMUFile *f;
     int saved_vm_running;
     uint64_t vm_state_size;
@@ -2712,10 +2712,14 @@ int save_snapshot(const char *name, Error **errp)
     }
     ret = qemu_savevm_state(f, errp);
     vm_state_size = qemu_ftell(f);
-    qemu_fclose(f);
+    ret2 = qemu_fclose(f);
     if (ret < 0) {
         goto the_end;
     }
+    if (ret2 < 0) {
+        ret = ret2;
+        goto the_end;
+    }
 
     /* The bdrv_all_create_snapshot() call that follows acquires the AioContext
      * for itself.  BDRV_POLL_WHILE() does not support nested locking because

From patchwork Fri Jun 19 10:07:04 2020
X-Patchwork-Submitter: "Denis V. Lunev" <den@openvz.org>
X-Patchwork-Id: 11613673
From: "Denis V. Lunev" <den@openvz.org>
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: Kevin Wolf, Fam Zheng, Vladimir Sementsov-Ogievskiy, Juan Quintela,
    "Dr. David Alan Gilbert", Max Reitz, Denis Plotnikov, Stefan Hajnoczi,
    "Denis V. Lunev"
Subject: [PATCH 2/6] block/aio_task: allow start/wait task from any coroutine
Date: Fri, 19 Jun 2020 13:07:04 +0300
Message-Id: <20200619100708.30440-3-den@openvz.org>
In-Reply-To: <20200619100708.30440-1-den@openvz.org>
References: <20200619100708.30440-1-den@openvz.org>

From: Vladimir Sementsov-Ogievskiy

Currently, the AIO task pool assumes that there is a main coroutine which
creates tasks and waits for them. Remove this restriction by using a
CoQueue. The code becomes clearer and the interface more obvious.

Signed-off-by: Vladimir Sementsov-Ogievskiy
Signed-off-by: Denis V. Lunev
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: "Dr. David Alan Gilbert"
CC: Vladimir Sementsov-Ogievskiy
CC: Denis Plotnikov
---
 block/aio_task.c | 21 ++++++---------------
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/block/aio_task.c b/block/aio_task.c
index 88989fa248..cf62e5c58b 100644
--- a/block/aio_task.c
+++ b/block/aio_task.c
@@ -27,11 +27,10 @@
 #include "block/aio_task.h"
 
 struct AioTaskPool {
-    Coroutine *main_co;
     int status;
     int max_busy_tasks;
     int busy_tasks;
-    bool waiting;
+    CoQueue waiters;
 };
 
 static void coroutine_fn aio_task_co(void *opaque)
@@ -52,31 +51,23 @@ static void coroutine_fn aio_task_co(void *opaque)
 
     g_free(task);
 
-    if (pool->waiting) {
-        pool->waiting = false;
-        aio_co_wake(pool->main_co);
-    }
+    qemu_co_queue_restart_all(&pool->waiters);
 }
 
 void coroutine_fn aio_task_pool_wait_one(AioTaskPool *pool)
 {
     assert(pool->busy_tasks > 0);
-    assert(qemu_coroutine_self() == pool->main_co);
 
-    pool->waiting = true;
-    qemu_coroutine_yield();
+    qemu_co_queue_wait(&pool->waiters, NULL);
 
-    assert(!pool->waiting);
     assert(pool->busy_tasks < pool->max_busy_tasks);
 }
 
 void coroutine_fn aio_task_pool_wait_slot(AioTaskPool *pool)
 {
-    if (pool->busy_tasks < pool->max_busy_tasks) {
-        return;
+    while (pool->busy_tasks >= pool->max_busy_tasks) {
+        aio_task_pool_wait_one(pool);
     }
-
-    aio_task_pool_wait_one(pool);
 }
 
 void coroutine_fn aio_task_pool_wait_all(AioTaskPool *pool)
@@ -98,8 +89,8 @@ AioTaskPool *coroutine_fn aio_task_pool_new(int max_busy_tasks)
 {
     AioTaskPool *pool = g_new0(AioTaskPool, 1);
 
-    pool->main_co = qemu_coroutine_self();
     pool->max_busy_tasks = max_busy_tasks;
+    qemu_co_queue_init(&pool->waiters);
 
     return pool;
 }

From patchwork Fri Jun 19 10:07:05 2020
X-Patchwork-Submitter: "Denis V. Lunev" <den@openvz.org>
X-Patchwork-Id: 11613667
From: "Denis V. Lunev" <den@openvz.org>
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: Kevin Wolf, Fam Zheng, Juan Quintela, "Dr. David Alan Gilbert",
    Max Reitz, Denis Plotnikov, Stefan Hajnoczi, "Denis V. Lunev"
Subject: [PATCH 3/6] block/aio_task: drop aio_task_pool_wait_one() helper
Date: Fri, 19 Jun 2020 13:07:05 +0300
Message-Id: <20200619100708.30440-4-den@openvz.org>
In-Reply-To: <20200619100708.30440-1-den@openvz.org>
References: <20200619100708.30440-1-den@openvz.org>

It is not used outside the module. There are actually two kinds of
waiters: those waiting for a free slot and those waiting for all tasks
to finish. This patch limits the external API to these two types.

Signed-off-by: Denis V. Lunev
Suggested-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: Vladimir Sementsov-Ogievskiy
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: "Dr. David Alan Gilbert"
CC: Denis Plotnikov
---
 block/aio_task.c         | 13 ++-----------
 include/block/aio_task.h |  1 -
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/block/aio_task.c b/block/aio_task.c
index cf62e5c58b..7ba15ff41f 100644
--- a/block/aio_task.c
+++ b/block/aio_task.c
@@ -54,26 +54,17 @@ static void coroutine_fn aio_task_co(void *opaque)
     qemu_co_queue_restart_all(&pool->waiters);
 }
 
-void coroutine_fn aio_task_pool_wait_one(AioTaskPool *pool)
-{
-    assert(pool->busy_tasks > 0);
-
-    qemu_co_queue_wait(&pool->waiters, NULL);
-
-    assert(pool->busy_tasks < pool->max_busy_tasks);
-}
-
 void coroutine_fn aio_task_pool_wait_slot(AioTaskPool *pool)
 {
     while (pool->busy_tasks >= pool->max_busy_tasks) {
-        aio_task_pool_wait_one(pool);
+        qemu_co_queue_wait(&pool->waiters, NULL);
     }
 }
 
 void coroutine_fn aio_task_pool_wait_all(AioTaskPool *pool)
 {
     while (pool->busy_tasks > 0) {
-        aio_task_pool_wait_one(pool);
+        qemu_co_queue_wait(&pool->waiters, NULL);
     }
 }
 
diff --git a/include/block/aio_task.h b/include/block/aio_task.h
index 50bc1e1817..50b1c036c5 100644
--- a/include/block/aio_task.h
+++ b/include/block/aio_task.h
@@ -48,7 +48,6 @@ bool aio_task_pool_empty(AioTaskPool *pool);
 void coroutine_fn aio_task_pool_start_task(AioTaskPool *pool, AioTask *task);
 
 void coroutine_fn aio_task_pool_wait_slot(AioTaskPool *pool);
-void coroutine_fn aio_task_pool_wait_one(AioTaskPool *pool);
 void coroutine_fn aio_task_pool_wait_all(AioTaskPool *pool);
 
 #endif /* BLOCK_AIO_TASK_H */

From patchwork Fri Jun 19 10:07:06 2020
X-Patchwork-Submitter: "Denis V. Lunev" <den@openvz.org>
X-Patchwork-Id: 11613665
From: "Denis V. Lunev" <den@openvz.org>
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: Kevin Wolf, Fam Zheng, Vladimir Sementsov-Ogievskiy, Juan Quintela,
    "Dr. David Alan Gilbert", Max Reitz, Denis Plotnikov, Stefan Hajnoczi,
    "Denis V. Lunev"
Subject: [PATCH 4/6] block/block-backend: remove always true check from blk_save_vmstate
Date: Fri, 19 Jun 2020 13:07:06 +0300
Message-Id: <20200619100708.30440-5-den@openvz.org>
In-Reply-To: <20200619100708.30440-1-den@openvz.org>
References: <20200619100708.30440-1-den@openvz.org>

bdrv_save_vmstate() returns either a negative error value or the size,
so the `ret == size` part of this check is always true once the error
path has been taken. Thus the check is useless.

Signed-off-by: Denis V. Lunev
Suggested-by: Eric Blake
Reviewed-by: Vladimir Sementsov-Ogievskiy
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: "Dr. David Alan Gilbert"
CC: Vladimir Sementsov-Ogievskiy
CC: Denis Plotnikov
---
 block/block-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 6936b25c83..1c6e53bbde 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2188,7 +2188,7 @@ int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
         return ret;
     }
 
-    if (ret == size && !blk->enable_write_cache) {
+    if (!blk->enable_write_cache) {
         ret = bdrv_flush(blk_bs(blk));
     }

From patchwork Fri Jun 19 10:07:07 2020
X-Patchwork-Submitter: "Denis V. Lunev" <den@openvz.org>
X-Patchwork-Id: 11613671
From: "Denis V. Lunev" <den@openvz.org>
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: Kevin Wolf, Fam Zheng, Juan Quintela, "Dr. David Alan Gilbert",
    Max Reitz, Denis Plotnikov, Stefan Hajnoczi, "Denis V. Lunev"
Subject: [PATCH 5/6] block, migration: add bdrv_finalize_vmstate helper
Date: Fri, 19 Jun 2020 13:07:07 +0300
Message-Id: <20200619100708.30440-6-den@openvz.org>
In-Reply-To: <20200619100708.30440-1-den@openvz.org>
References: <20200619100708.30440-1-den@openvz.org>

Right now bdrv_fclose() just calls bdrv_flush(). The problem is that,
from the block layer's point of view, the migration code works
inefficiently: it is frequently called with very small pieces of
unaligned data. The block layer is capable of working this way, but it
is very slow. This patch is a preparation for the introduction of an
intermediate buffer in the block driver state.
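The payoff of such an intermediate buffer is easy to model outside QEMU. The toy sketch below (purely illustrative; none of these names are QEMU API, and the chunk size is an arbitrary assumption) coalesces a stream of small sequential writes into fixed-size chunks and counts how many chunk-sized backend writes are actually issued:

```c
#include <stdlib.h>
#include <string.h>

/* Toy model: coalesce small sequential writes into CHUNK-sized buffers.
 * Each full buffer counts as one backend write, mimicking the intermediate
 * buffer introduced by this series (hypothetical names, not QEMU code). */
#define CHUNK 4096

typedef struct {
    char buf[CHUNK];
    size_t filled;           /* bytes accumulated in the current chunk */
    unsigned backend_writes; /* how many CHUNK-sized writes were issued */
} VMStateBuf;

static void vmstate_write(VMStateBuf *s, const void *data, size_t len)
{
    const char *p = data;

    while (len > 0) {
        size_t n = CHUNK - s->filled;
        if (n > len) {
            n = len;
        }
        memcpy(s->buf + s->filled, p, n);
        s->filled += n;
        p += n;
        len -= n;
        if (s->filled == CHUNK) { /* chunk full: submit one backend write */
            s->backend_writes++;
            s->filled = 0;
        }
    }
}

unsigned count_backend_writes(size_t nwrites, size_t each)
{
    VMStateBuf s = {0};
    char *junk = calloc(1, each ? each : 1);

    for (size_t i = 0; i < nwrites; i++) {
        vmstate_write(&s, junk, each);
    }
    free(junk);
    /* A trailing partial chunk would be flushed by a "finalize" step. */
    return s.backend_writes + (s.filled ? 1 : 0);
}
```

For 100000 writes of 100 bytes each, this model issues 2442 aligned chunk-sized backend writes instead of 100000 small unaligned ones, which is the effect the intermediate buffer is after.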
It would be beneficial to separate the conventional bdrv_flush() from
the closing of the QEMU file in the migration code. The patch also
forces the bdrv_finalize_vmstate() operation inside the synchronous
blk_save_vmstate() operation; this helper is used from qemu-io only.

Signed-off-by: Denis V. Lunev
Reviewed-by: Vladimir Sementsov-Ogievskiy
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: "Dr. David Alan Gilbert"
CC: Denis Plotnikov
---
 block/block-backend.c |  6 +++++-
 block/io.c            | 15 +++++++++++++++
 include/block/block.h |  5 +++++
 migration/savevm.c    |  4 ++++
 4 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 1c6e53bbde..5bb11c8e01 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2177,16 +2177,20 @@ int blk_truncate(BlockBackend *blk, int64_t offset, bool exact,
 int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
                      int64_t pos, int size)
 {
-    int ret;
+    int ret, ret2;
 
     if (!blk_is_available(blk)) {
         return -ENOMEDIUM;
     }
 
     ret = bdrv_save_vmstate(blk_bs(blk), buf, pos, size);
+    ret2 = bdrv_finalize_vmstate(blk_bs(blk));
     if (ret < 0) {
         return ret;
     }
+    if (ret2 < 0) {
+        return ret2;
+    }
 
     if (!blk->enable_write_cache) {
         ret = bdrv_flush(blk_bs(blk));

diff --git a/block/io.c b/block/io.c
index df8f2a98d4..1f69268361 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2724,6 +2724,21 @@ int bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
     return bdrv_rw_vmstate(bs, qiov, pos, true);
 }
 
+static int coroutine_fn bdrv_co_finalize_vmstate(BlockDriverState *bs)
+{
+    return 0;
+}
+
+static int coroutine_fn bdrv_finalize_vmstate_co_entry(void *opaque)
+{
+    return bdrv_co_finalize_vmstate(opaque);
+}
+
+int bdrv_finalize_vmstate(BlockDriverState *bs)
+{
+    return bdrv_run_co(bs, bdrv_finalize_vmstate_co_entry, bs);
+}
+
 /**************************************************************/
 /* async I/Os */
 
diff --git a/include/block/block.h b/include/block/block.h
index 25e299605e..ab2c962094 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -572,6 +572,11 @@ int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
 int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
                       int64_t pos, int size);
+/*
+ * bdrv_finalize_vmstate() is mandatory to commit vmstate changes if
+ * bdrv_save_vmstate() was ever called.
+ */
+int bdrv_finalize_vmstate(BlockDriverState *bs);
 
 void bdrv_img_create(const char *filename, const char *fmt,
                      const char *base_filename, const char *base_fmt,

diff --git a/migration/savevm.c b/migration/savevm.c
index da3dead4e9..798a4cb402 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -150,6 +150,10 @@ static ssize_t block_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
 
 static int bdrv_fclose(void *opaque, Error **errp)
 {
+    int err = bdrv_finalize_vmstate(opaque);
+    if (err < 0) {
+        return err;
+    }
     return bdrv_flush(opaque);
 }

From patchwork Fri Jun 19 10:07:08 2020
X-Patchwork-Submitter: "Denis V. Lunev" <den@openvz.org>
X-Patchwork-Id: 11613677
From: "Denis V. Lunev" <den@openvz.org>
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: Kevin Wolf, Fam Zheng, Juan Quintela, "Dr. David Alan Gilbert",
    Max Reitz, Denis Plotnikov, Stefan Hajnoczi, "Denis V. Lunev"
Subject: [PATCH 6/6] block/io: improve savevm performance
Date: Fri, 19 Jun 2020 13:07:08 +0300
Message-Id: <20200619100708.30440-7-den@openvz.org>
In-Reply-To: <20200619100708.30440-1-den@openvz.org>
References: <20200619100708.30440-1-den@openvz.org>

This patch does two standard, basic things:
- it creates an intermediate buffer for all writes from the QEMU
  migration code to the block driver,
- this buffer is sent to disk asynchronously, allowing several writes
  to run in parallel.

Thus bdrv_vmstate_write() becomes asynchronous. Completion of all
pending operations is performed in the newly introduced
bdrv_finalize_vmstate().

In general, the migration code is remarkably inefficient (by
observation): buffers are not aligned and are sent in arbitrary pieces,
often less than 100 bytes at a time, which results in read-modify-write
operations if the target file descriptor is opened with O_DIRECT. It
should also be noted that all operations are performed on unallocated
image blocks, which also suffer from partial writes to such new
clusters, even on cached file descriptors.
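The bounded-parallelism machinery this series relies on (the AioTaskPool converted to a CoQueue in the earlier patches) behaves like a condition variable with predicate rechecking. The pthread sketch below is only an analogy, not QEMU code: waiters loop on their predicate and every task completion broadcasts, mirroring the shape of aio_task_pool_wait_slot()/aio_task_pool_wait_all() after the CoQueue conversion.

```c
#include <pthread.h>

/* Toy analogue of an AioTaskPool using pthreads: the CoQueue maps onto a
 * condition variable, and waiters recheck their predicate in a loop.
 * All names are illustrative; this is not the QEMU API. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t waiters;   /* analogue of the CoQueue of waiters */
    int busy_tasks;
    int max_busy_tasks;
    int completed;
} ToyPool;

static void *toy_task(void *opaque)
{
    ToyPool *pool = opaque;

    pthread_mutex_lock(&pool->lock);
    pool->busy_tasks--;
    pool->completed++;
    pthread_cond_broadcast(&pool->waiters); /* "restart all" waiters */
    pthread_mutex_unlock(&pool->lock);
    return NULL;
}

/* Called with pool->lock held: wait until a slot is free. */
static void toy_pool_wait_slot(ToyPool *pool)
{
    while (pool->busy_tasks >= pool->max_busy_tasks) {
        pthread_cond_wait(&pool->waiters, &pool->lock);
    }
}

int toy_run(int ntasks, int max_busy)
{
    ToyPool pool = {
        .lock = PTHREAD_MUTEX_INITIALIZER,
        .waiters = PTHREAD_COND_INITIALIZER,
        .max_busy_tasks = max_busy,
    };
    pthread_t tid;

    pthread_mutex_lock(&pool.lock);
    for (int i = 0; i < ntasks; i++) {
        toy_pool_wait_slot(&pool);          /* bound the in-flight count */
        pool.busy_tasks++;
        pthread_create(&tid, NULL, toy_task, &pool);
        pthread_detach(tid);
    }
    while (pool.busy_tasks > 0) {           /* "wait all" */
        pthread_cond_wait(&pool.waiters, &pool.lock);
    }
    pthread_mutex_unlock(&pool.lock);
    return pool.completed;
}
```

toy_run(100, 8) starts 100 tasks with at most 8 in flight and returns the number completed; the broadcast-plus-recheck loop is what lets any number of waiters, on any thread, share one pool.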
Snapshot creation time (2 GB Fedora-31 VM running over NVME storage): original fixed cached: 1.79s 1.27s non-cached: 3.29s 0.81s The difference over HDD would be more significant :) Signed-off-by: Denis V. Lunev Reviewed-by: Vladimir Sementsov-Ogievskiy CC: Kevin Wolf CC: Max Reitz CC: Stefan Hajnoczi CC: Fam Zheng CC: Juan Quintela CC: "Dr. David Alan Gilbert" CC: Denis Plotnikov --- block/io.c | 126 +++++++++++++++++++++++++++++++++++++- include/block/block_int.h | 8 +++ 2 files changed, 132 insertions(+), 2 deletions(-) diff --git a/block/io.c b/block/io.c index 1f69268361..71a696deb7 100644 --- a/block/io.c +++ b/block/io.c @@ -26,6 +26,7 @@ #include "trace.h" #include "sysemu/block-backend.h" #include "block/aio-wait.h" +#include "block/aio_task.h" #include "block/blockjob.h" #include "block/blockjob_int.h" #include "block/block_int.h" @@ -33,6 +34,7 @@ #include "qapi/error.h" #include "qemu/error-report.h" #include "qemu/main-loop.h" +#include "qemu/units.h" #include "sysemu/replay.h" /* Maximum bounce buffer for copy-on-read and write zeroes, in bytes */ @@ -2640,6 +2642,103 @@ typedef struct BdrvVmstateCo { bool is_read; } BdrvVmstateCo; +typedef struct BdrvVMStateTask { + AioTask task; + + BlockDriverState *bs; + int64_t offset; + void *buf; + size_t bytes; +} BdrvVMStateTask; + +typedef struct BdrvSaveVMState { + AioTaskPool *pool; + BdrvVMStateTask *t; +} BdrvSaveVMState; + + +static coroutine_fn int bdrv_co_vmstate_save_task_entry(AioTask *task) +{ + int err = 0; + BdrvVMStateTask *t = container_of(task, BdrvVMStateTask, task); + + if (t->bytes != 0) { + QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, t->buf, t->bytes); + + bdrv_inc_in_flight(t->bs); + err = t->bs->drv->bdrv_save_vmstate(t->bs, &qiov, t->offset); + bdrv_dec_in_flight(t->bs); + } + + qemu_vfree(t->buf); + return err; +} + +static BdrvVMStateTask *bdrv_vmstate_task_create(BlockDriverState *bs, + int64_t pos, size_t size) +{ + BdrvVMStateTask *t = g_new(BdrvVMStateTask, 1); + + *t = 
(BdrvVMStateTask) { + .task.func = bdrv_co_vmstate_save_task_entry, + .buf = qemu_blockalign(bs, size), + .offset = pos, + .bs = bs, + }; + + return t; +} + +static int bdrv_co_do_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, + int64_t pos) +{ + BdrvSaveVMState *state = bs->savevm_state; + BdrvVMStateTask *t; + size_t buf_size = MAX(bdrv_get_cluster_size(bs), 1 * MiB); + size_t to_copy, off; + + if (state == NULL) { + state = g_new(BdrvSaveVMState, 1); + *state = (BdrvSaveVMState) { + .pool = aio_task_pool_new(BDRV_VMSTATE_WORKERS_MAX), + .t = bdrv_vmstate_task_create(bs, pos, buf_size), + }; + + bs->savevm_state = state; + } + + if (aio_task_pool_status(state->pool) < 0) { + /* + * The operation as a whole is unsuccessful. Prohibit all futher + * operations. If we clean here, new useless ops will come again. + * Thus we rely on caller for cleanup here. + */ + return aio_task_pool_status(state->pool); + } + + t = state->t; + if (t->offset + t->bytes != pos) { + /* Normally this branch is not reachable from migration */ + return bs->drv->bdrv_save_vmstate(bs, qiov, pos); + } + + off = 0; + while (1) { + to_copy = MIN(qiov->size - off, buf_size - t->bytes); + qemu_iovec_to_buf(qiov, off, t->buf + t->bytes, to_copy); + t->bytes += to_copy; + if (t->bytes < buf_size) { + return 0; + } + + aio_task_pool_start_task(state->pool, &t->task); + + pos += to_copy; + off += to_copy; + state->t = t = bdrv_vmstate_task_create(bs, pos, buf_size); + } +} + static int coroutine_fn bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos, bool is_read) @@ -2655,7 +2754,7 @@ bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos, if (is_read) { ret = drv->bdrv_load_vmstate(bs, qiov, pos); } else { - ret = drv->bdrv_save_vmstate(bs, qiov, pos); + ret = bdrv_co_do_save_vmstate(bs, qiov, pos); } } else if (bs->file) { ret = bdrv_co_rw_vmstate(bs->file->bs, qiov, pos, is_read); @@ -2726,7 +2825,30 @@ int bdrv_readv_vmstate(BlockDriverState *bs, 
                        QEMUIOVector *qiov, int64_t pos)
 
 static int coroutine_fn bdrv_co_finalize_vmstate(BlockDriverState *bs)
 {
-    return 0;
+    int err;
+    BdrvSaveVMState *state = bs->savevm_state;
+
+    if (bs->drv->bdrv_save_vmstate == NULL && bs->file != NULL) {
+        return bdrv_co_finalize_vmstate(bs->file->bs);
+    }
+    if (state == NULL) {
+        return 0;
+    }
+
+    if (aio_task_pool_status(state->pool) >= 0) {
+        /* We are on success path, commit last chunk if possible */
+        aio_task_pool_start_task(state->pool, &state->t->task);
+    }
+
+    aio_task_pool_wait_all(state->pool);
+    err = aio_task_pool_status(state->pool);
+
+    aio_task_pool_free(state->pool);
+    g_free(state);
+
+    bs->savevm_state = NULL;
+
+    return err;
 }
 
 static int coroutine_fn bdrv_finalize_vmstate_co_entry(void *opaque)
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 791de6a59c..f90f0e8b6a 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -61,6 +61,8 @@
 
 #define BLOCK_PROBE_BUF_SIZE        512
 
+#define BDRV_VMSTATE_WORKERS_MAX 8
+
 enum BdrvTrackedRequestType {
     BDRV_TRACKED_READ,
     BDRV_TRACKED_WRITE,
@@ -784,6 +786,9 @@ struct BdrvChild {
     QLIST_ENTRY(BdrvChild) next_parent;
 };
 
+
+typedef struct BdrvSaveVMState BdrvSaveVMState;
+
 /*
  * Note: the function bdrv_append() copies and swaps contents of
  * BlockDriverStates, so if you add new fields to this struct, please
@@ -947,6 +952,9 @@ struct BlockDriverState {
 
     /* BdrvChild links to this node may never be frozen */
     bool never_freeze;
+
+    /* Intermediate buffer for VM state saving from snapshot creation code */
+    BdrvSaveVMState *savevm_state;
 };
 
 struct BlockBackendRootState {

From patchwork Mon Jun 22 08:33:03 2020
X-Patchwork-Submitter: "Denis V.
Lunev"
X-Patchwork-Id: 11617269
From: "Denis V.
Lunev"
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 7/6] block/io: improve loadvm performance
Date: Mon, 22 Jun 2020 11:33:03 +0300
Message-Id: <20200622083303.18665-1-den@openvz.org>
In-Reply-To: <20200619100708.30440-1-den@openvz.org>
References: <20200619100708.30440-1-den@openvz.org>
Cc: Kevin Wolf, Fam Zheng, Vladimir Sementsov-Ogievskiy, Juan Quintela,
    "Dr. David Alan Gilbert", Max Reitz, Denis Plotnikov, Stefan Hajnoczi,
    "Denis V. Lunev"

This patch creates an intermediate buffer for reading from the block driver
state and performs read-ahead into this buffer. Snapshot code performs reads
sequentially, so we know in advance which offsets will be required and when
they will no longer be needed.

Results are fantastic. Switch-to-snapshot times of a 2 GB Fedora 31 VM over
NVME storage are the following:

                original     fixed
cached:           1.84s      1.16s
non-cached:      12.74s      1.27s

The difference over HDD would be more significant :)

Signed-off-by: Denis V. Lunev
CC: Vladimir Sementsov-Ogievskiy
CC: Kevin Wolf
CC: Max Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Juan Quintela
CC: "Dr.
David Alan Gilbert"
CC: Denis Plotnikov
---
 block/io.c                | 225 +++++++++++++++++++++++++++++++++++++-
 include/block/block_int.h |   3 +
 2 files changed, 225 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index 71a696deb7..bb06f750d8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2739,6 +2739,180 @@ static int bdrv_co_do_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
     }
 }
 
+
+typedef struct BdrvLoadVMChunk {
+    void *buf;
+    uint64_t offset;
+    ssize_t bytes;
+
+    QLIST_ENTRY(BdrvLoadVMChunk) list;
+} BdrvLoadVMChunk;
+
+typedef struct BdrvLoadVMState {
+    AioTaskPool *pool;
+
+    int64_t offset;
+    int64_t last_loaded;
+
+    int chunk_count;
+    QLIST_HEAD(, BdrvLoadVMChunk) chunks;
+    QLIST_HEAD(, BdrvLoadVMChunk) loading;
+    CoMutex lock;
+    CoQueue waiters;
+} BdrvLoadVMState;
+
+typedef struct BdrvLoadVMStateTask {
+    AioTask task;
+
+    BlockDriverState *bs;
+    BdrvLoadVMChunk *chunk;
+} BdrvLoadVMStateTask;
+
+static BdrvLoadVMChunk *bdrv_co_find_loadvmstate_chunk(int64_t pos,
+                                                       BdrvLoadVMChunk *c)
+{
+    for (; c != NULL; c = QLIST_NEXT(c, list)) {
+        if (c->offset <= pos && c->offset + c->bytes > pos) {
+            return c;
+        }
+    }
+
+    return NULL;
+}
+
+static void bdrv_free_loadvm_chunk(BdrvLoadVMChunk *c)
+{
+    qemu_vfree(c->buf);
+    g_free(c);
+}
+
+static coroutine_fn int bdrv_co_vmstate_load_task_entry(AioTask *task)
+{
+    int err = 0;
+    BdrvLoadVMStateTask *t = container_of(task, BdrvLoadVMStateTask, task);
+    BdrvLoadVMChunk *c = t->chunk;
+    BdrvLoadVMState *state = t->bs->loadvm_state;
+    QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, c->buf, c->bytes);
+
+    bdrv_inc_in_flight(t->bs);
+    err = t->bs->drv->bdrv_load_vmstate(t->bs, &qiov, c->offset);
+    bdrv_dec_in_flight(t->bs);
+
+    qemu_co_mutex_lock(&state->lock);
+    QLIST_REMOVE(c, list);
+    if (err == 0) {
+        QLIST_INSERT_HEAD(&state->chunks, c, list);
+    } else {
+        bdrv_free_loadvm_chunk(c);
+    }
+    qemu_co_mutex_unlock(&state->lock);
+    qemu_co_queue_restart_all(&state->waiters);
+
+    return err;
+}
+
+static void
+bdrv_co_start_loadvmstate(BlockDriverState *bs,
+                          BdrvLoadVMState *state)
+{
+    int i;
+    size_t buf_size = MAX(bdrv_get_cluster_size(bs), 1 * MiB);
+
+    qemu_co_mutex_assert_locked(&state->lock);
+    for (i = state->chunk_count; i < BDRV_VMSTATE_WORKERS_MAX; i++) {
+        BdrvLoadVMStateTask *t = g_new(BdrvLoadVMStateTask, 1);
+
+        *t = (BdrvLoadVMStateTask) {
+            .task.func = bdrv_co_vmstate_load_task_entry,
+            .bs = bs,
+            .chunk = g_new(BdrvLoadVMChunk, 1),
+        };
+
+        *t->chunk = (BdrvLoadVMChunk) {
+            .buf = qemu_blockalign(bs, buf_size),
+            .offset = state->last_loaded,
+            .bytes = buf_size,
+        };
+        /* FIXME: tail of stream */
+
+        QLIST_INSERT_HEAD(&state->loading, t->chunk, list);
+        state->chunk_count++;
+        state->last_loaded += buf_size;
+
+        qemu_co_mutex_unlock(&state->lock);
+        aio_task_pool_start_task(state->pool, &t->task);
+        qemu_co_mutex_lock(&state->lock);
+    }
+}
+
+static int bdrv_co_do_load_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
+                                   int64_t pos)
+{
+    BdrvLoadVMState *state = bs->loadvm_state;
+    BdrvLoadVMChunk *c;
+    size_t off;
+
+    if (state == NULL) {
+        if (pos != 0) {
+            /* Normally this branch is not reachable from migration */
+            return bs->drv->bdrv_load_vmstate(bs, qiov, pos);
+        }
+
+        state = g_new(BdrvLoadVMState, 1);
+        *state = (BdrvLoadVMState) {
+            .pool = aio_task_pool_new(BDRV_VMSTATE_WORKERS_MAX),
+            .chunks = QLIST_HEAD_INITIALIZER(state->chunks),
+            .loading = QLIST_HEAD_INITIALIZER(state->loading),
+        };
+        qemu_co_mutex_init(&state->lock);
+        qemu_co_queue_init(&state->waiters);
+
+        bs->loadvm_state = state;
+    }
+
+    if (state->offset != pos) {
+        /* Normally this branch is not reachable from migration */
+        return bs->drv->bdrv_load_vmstate(bs, qiov, pos);
+    }
+
+    off = 0;
+    qemu_co_mutex_lock(&state->lock);
+    bdrv_co_start_loadvmstate(bs, state);
+
+    while (off < qiov->size && aio_task_pool_status(state->pool) == 0) {
+        c = bdrv_co_find_loadvmstate_chunk(pos, QLIST_FIRST(&state->chunks));
+        if (c != NULL) {
+            ssize_t chunk_off = pos - c->offset;
+            ssize_t to_copy = MIN(qiov->size - off, c->bytes - chunk_off);
+
+            qemu_iovec_from_buf(qiov, off, c->buf + chunk_off, to_copy);
+
+            off += to_copy;
+            pos += to_copy;
+
+            if (pos == c->offset + c->bytes) {
+                state->chunk_count--;
+                /* End of buffer, discard it from the list */
+                QLIST_REMOVE(c, list);
+                bdrv_free_loadvm_chunk(c);
+            }
+
+            state->offset += to_copy;
+            continue;
+        }
+
+        c = bdrv_co_find_loadvmstate_chunk(pos, QLIST_FIRST(&state->loading));
+        if (c != NULL) {
+            qemu_co_queue_wait(&state->waiters, &state->lock);
+            continue;
+        }
+
+        bdrv_co_start_loadvmstate(bs, state);
+    }
+    qemu_co_mutex_unlock(&state->lock);
+
+    return aio_task_pool_status(state->pool);
+}
+
 static int coroutine_fn
 bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
                    bool is_read)
@@ -2752,7 +2926,7 @@ bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
         ret = -ENOMEDIUM;
     } else if (drv->bdrv_load_vmstate) {
         if (is_read) {
-            ret = drv->bdrv_load_vmstate(bs, qiov, pos);
+            ret = bdrv_co_do_load_vmstate(bs, qiov, pos);
         } else {
             ret = bdrv_co_do_save_vmstate(bs, qiov, pos);
         }
@@ -2823,13 +2997,13 @@ int bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
 {
     return bdrv_rw_vmstate(bs, qiov, pos, true);
 }
 
-static int coroutine_fn bdrv_co_finalize_vmstate(BlockDriverState *bs)
+static int coroutine_fn bdrv_co_finalize_save_vmstate(BlockDriverState *bs)
 {
     int err;
     BdrvSaveVMState *state = bs->savevm_state;
 
     if (bs->drv->bdrv_save_vmstate == NULL && bs->file != NULL) {
-        return bdrv_co_finalize_vmstate(bs->file->bs);
+        return bdrv_co_finalize_save_vmstate(bs->file->bs);
     }
     if (state == NULL) {
         return 0;
@@ -2851,6 +3025,51 @@ static int coroutine_fn bdrv_co_finalize_vmstate(BlockDriverState *bs)
     return err;
 }
 
+static int coroutine_fn bdrv_co_finalize_load_vmstate(BlockDriverState *bs)
+{
+    int err;
+    BdrvLoadVMState *state = bs->loadvm_state;
+    BdrvLoadVMChunk *c, *tmp;
+
+    if (bs->drv->bdrv_load_vmstate == NULL && bs->file != NULL) {
+        return bdrv_co_finalize_load_vmstate(bs->file->bs);
+    }
+    if (state == NULL) {
+        return 0;
+    }
+
+    aio_task_pool_wait_all(state->pool);
+    err = aio_task_pool_status(state->pool);
+    aio_task_pool_free(state->pool);
+
+    QLIST_FOREACH(c, &state->loading, list) {
+        assert(0); /* this list must be empty as all tasks are committed */
+    }
+    QLIST_FOREACH_SAFE(c, &state->chunks, list, tmp) {
+        QLIST_REMOVE(c, list);
+        bdrv_free_loadvm_chunk(c);
+    }
+
+    g_free(state);
+
+    bs->loadvm_state = NULL;
+
+    return err;
+}
+
+static int coroutine_fn bdrv_co_finalize_vmstate(BlockDriverState *bs)
+{
+    int err1 = bdrv_co_finalize_save_vmstate(bs);
+    int err2 = bdrv_co_finalize_load_vmstate(bs);
+    if (err1 < 0) {
+        return err1;
+    }
+    if (err2 < 0) {
+        return err2;
+    }
+    return 0;
+}
+
 static int coroutine_fn bdrv_finalize_vmstate_co_entry(void *opaque)
 {
     return bdrv_co_finalize_vmstate(opaque);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f90f0e8b6a..0942578a74 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -788,6 +788,7 @@ struct BdrvChild {
 
 typedef struct BdrvSaveVMState BdrvSaveVMState;
+typedef struct BdrvLoadVMState BdrvLoadVMState;
 
 /*
  * Note: the function bdrv_append() copies and swaps contents of
@@ -955,6 +956,8 @@ struct BlockDriverState {
 
     /* Intermediate buffer for VM state saving from snapshot creation code */
     BdrvSaveVMState *savevm_state;
+    /* Intermediate buffer for VM state loading */
+    BdrvLoadVMState *loadvm_state;
 };
 
 struct BlockBackendRootState {