From patchwork Mon Sep 11 07:49:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?=
X-Patchwork-Id: 13378927
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id E4DACEE7FF4
 for ; Mon, 11 Sep 2023 07:50:49 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qfbgk-0003qc-Ub; Mon, 11 Sep 2023 03:50:31 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qfbgh-0003pc-JX
 for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:27 -0400
Received: from gandalf.ozlabs.org ([150.107.74.76])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qfbgd-000819-OA
 for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:26 -0400
Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76])
 by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5L72tdz4xF0;
 Mon, 11 Sep 2023 17:50:14 +1000 (AEST)
Received: from authenticated.ozlabs.org (localhost [127.0.0.1])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
 (No client certificate requested)
 by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5K1VJqz4x5q;
 Mon, 11 Sep 2023 17:50:12 +1000 (AEST)
From: =?utf-8?q?C=C3=A9dric_Le_Goater?=
To: qemu-devel@nongnu.org
Cc: Alex Williamson , Avihai Horon , YangHang Liu , =?utf-8?q?C=C3=A9dric_Le_Goater?=
Subject: [PULL 01/13] vfio/migration: Move from STOP_COPY to STOP in vfio_save_cleanup()
Date: Mon, 11 Sep 2023 09:49:56 +0200
Message-ID: <20230911075008.462712-2-clg@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230911075008.462712-1-clg@redhat.com>
References: <20230911075008.462712-1-clg@redhat.com>
MIME-Version: 1.0
Received-SPF: pass client-ip=150.107.74.76; envelope-from=SRS0=GLJ6=E3=redhat.com=clg@ozlabs.org; helo=gandalf.ozlabs.org
X-Spam_score_int: -16
X-Spam_score: -1.7
X-Spam_bar: -
X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

From: Avihai Horon

Changing the device state from STOP_COPY to STOP can take time as the
device may need to free resources and do other operations as part of the
transition. Currently, this is done in vfio_save_complete_precopy() and
therefore it is counted in the migration downtime.

To avoid this, change the device state from STOP_COPY to STOP in
vfio_save_cleanup(), which is called after migration has completed and
thus is not part of migration downtime.

Signed-off-by: Avihai Horon
Tested-by: YangHang Liu
Signed-off-by: Cédric Le Goater
---
 hw/vfio/migration.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 2674f4bc472d16989910300958f697f26fefa442..8acd182a8bf3fcd0eb0368816ff3093242b103f5 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -383,6 +383,19 @@ static void vfio_save_cleanup(void *opaque)
     VFIODevice *vbasedev = opaque;
     VFIOMigration *migration = vbasedev->migration;
 
+    /*
+     * Changing device state from STOP_COPY to STOP can take time. Do it here,
+     * after migration has completed, so it won't increase downtime.
+     */
+    if (migration->device_state == VFIO_DEVICE_STATE_STOP_COPY) {
+        /*
+         * If setting the device in STOP state fails, the device should be
+         * reset. To do so, use ERROR state as a recover state.
+         */
+        vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP,
+                                 VFIO_DEVICE_STATE_ERROR);
+    }
+
     g_free(migration->data_buffer);
     migration->data_buffer = NULL;
     migration->precopy_init_size = 0;
@@ -508,12 +521,6 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
         return ret;
     }
 
-    /*
-     * If setting the device in STOP state fails, the device should be reset.
-     * To do so, use ERROR state as a recover state.
-     */
-    ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP,
-                                   VFIO_DEVICE_STATE_ERROR);
     trace_vfio_save_complete_precopy(vbasedev->name, ret);
 
     return ret;

From patchwork Mon Sep 11 07:49:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?=
X-Patchwork-Id: 13378932
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id D7030EE57DF
 for ; Mon, 11 Sep 2023 07:51:26 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qfbgn-0003tZ-6i; Mon, 11 Sep 2023 03:50:33 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qfbgj-0003qK-5v
 for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:29 -0400
Received: from mail.ozlabs.org ([2404:9400:2221:ea00::3] helo=gandalf.ozlabs.org)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qfbgd-00081W-PT
 for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:28 -0400
Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3])
 by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5P1rpLz4xF3;
 Mon, 11 Sep 2023 17:50:17 +1000 (AEST)
Received: from authenticated.ozlabs.org (localhost [127.0.0.1])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
 (No client certificate requested)
 by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5M3Qfmz4x5q;
 Mon, 11 Sep 2023 17:50:15 +1000 (AEST)
From: =?utf-8?q?C=C3=A9dric_Le_Goater?=
To: qemu-devel@nongnu.org
Cc: Alex Williamson , Avihai Horon , =?utf-8?q?C=C3=A9dric_Le_Goater?= , YangHang Liu
Subject: [PULL 02/13] sysemu: Add prepare callback to struct VMChangeStateEntry
Date: Mon, 11 Sep 2023 09:49:57 +0200
Message-ID: <20230911075008.462712-3-clg@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230911075008.462712-1-clg@redhat.com>
References: <20230911075008.462712-1-clg@redhat.com>
MIME-Version: 1.0
Received-SPF: pass client-ip=2404:9400:2221:ea00::3; envelope-from=SRS0=GLJ6=E3=redhat.com=clg@ozlabs.org; helo=gandalf.ozlabs.org
X-Spam_score_int: -39
X-Spam_score: -4.0
X-Spam_bar: ----
X-Spam_report: (-4.0 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

From: Avihai Horon

Add prepare callback to struct VMChangeStateEntry.
The prepare callback is optional and can be set by the new function
qemu_add_vm_change_state_handler_prio_full() that allows setting this
callback in addition to the main callback.

The prepare callbacks and main callbacks are called in two separate
phases: First all prepare callbacks are called and only then all main
callbacks are called.

The purpose of the new prepare callback is to allow all devices to run a
preliminary task before calling the devices' main callbacks.

This will facilitate adding P2P support for VFIO migration where all
VFIO devices need to be put in an intermediate P2P quiescent state
before being stopped or started by the main callback.

Signed-off-by: Avihai Horon
Reviewed-by: Cédric Le Goater
Tested-by: YangHang Liu
Signed-off-by: Cédric Le Goater
---
 include/sysemu/runstate.h |  4 ++++
 softmmu/runstate.c        | 40 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index 7beb29c2e2ac564bb002d208b125ab6269e097de..764a0fc6a4554857bcff339c668b48193b40c3a4 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -16,6 +16,10 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler(VMChangeStateHandler *cb,
                                                      void *opaque);
 VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
         VMChangeStateHandler *cb, void *opaque, int priority);
+VMChangeStateEntry *
+qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
+                                           VMChangeStateHandler *prepare_cb,
+                                           void *opaque, int priority);
 VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
                                                      VMChangeStateHandler *cb,
                                                      void *opaque);
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index f3bd8628181303792629fa4079f09abf63fd9787..1652ed0439b4d39e5719d5b7caa002aa297789b6 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -271,6 +271,7 @@ void qemu_system_vmstop_request(RunState state)
 }
 struct VMChangeStateEntry {
     VMChangeStateHandler *cb;
+    VMChangeStateHandler *prepare_cb;
     void *opaque;
     QTAILQ_ENTRY(VMChangeStateEntry) entries;
     int priority;
@@ -293,12 +294,39 @@ static QTAILQ_HEAD(, VMChangeStateEntry) vm_change_state_head =
  */
 VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
         VMChangeStateHandler *cb, void *opaque, int priority)
+{
+    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, opaque,
+                                                      priority);
+}
+
+/**
+ * qemu_add_vm_change_state_handler_prio_full:
+ * @cb: the main callback to invoke
+ * @prepare_cb: a callback to invoke before the main callback
+ * @opaque: user data passed to the callbacks
+ * @priority: low priorities execute first when the vm runs and the reverse is
+ *            true when the vm stops
+ *
+ * Register a main callback function and an optional prepare callback function
+ * that are invoked when the vm starts or stops running. The main callback and
+ * the prepare callback are called in two separate phases: First all prepare
+ * callbacks are called and only then all main callbacks are called. As its
+ * name suggests, the prepare callback can be used to do some preparatory work
+ * before invoking the main callback.
+ *
+ * Returns: an entry to be freed using qemu_del_vm_change_state_handler()
+ */
+VMChangeStateEntry *
+qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
+                                           VMChangeStateHandler *prepare_cb,
+                                           void *opaque, int priority)
 {
     VMChangeStateEntry *e;
     VMChangeStateEntry *other;
 
     e = g_malloc0(sizeof(*e));
     e->cb = cb;
+    e->prepare_cb = prepare_cb;
     e->opaque = opaque;
     e->priority = priority;
 
@@ -333,10 +361,22 @@ void vm_state_notify(bool running, RunState state)
     trace_vm_state_notify(running, state, RunState_str(state));
 
     if (running) {
+        QTAILQ_FOREACH_SAFE(e, &vm_change_state_head, entries, next) {
+            if (e->prepare_cb) {
+                e->prepare_cb(e->opaque, running, state);
+            }
+        }
+
         QTAILQ_FOREACH_SAFE(e, &vm_change_state_head, entries, next) {
             e->cb(e->opaque, running, state);
         }
     } else {
+        QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
+            if (e->prepare_cb) {
+                e->prepare_cb(e->opaque, running, state);
+            }
+        }
+
         QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
             e->cb(e->opaque, running, state);
         }

From patchwork Mon Sep 11 07:49:58 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?=
X-Patchwork-Id: 13378940
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id EE303EE801F
 for ; Mon, 11 Sep 2023 07:52:31 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qfbgs-0003xz-2F; Mon, 11 Sep 2023 03:50:38 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qfbgj-0003qJ-5u
 for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:29 -0400
Received: from gandalf.ozlabs.org ([150.107.74.76])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qfbge-00081e-CZ
 for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:28 -0400
Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76])
 by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5S1Jvgz4xFD;
 Mon, 11 Sep 2023 17:50:20 +1000 (AEST)
Received: from authenticated.ozlabs.org (localhost [127.0.0.1])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
 (No client certificate requested)
 by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5P5JS1z4x5q;
 Mon, 11 Sep 2023 17:50:17 +1000 (AEST)
From: =?utf-8?q?C=C3=A9dric_Le_Goater?=
To: qemu-devel@nongnu.org
Cc: Alex Williamson , Avihai Horon , Joao Martins , =?utf-8?q?C=C3=A9dric_Le_Goater?= , YangHang Liu
Subject: [PULL 03/13] qdev: Add qdev_add_vm_change_state_handler_full()
Date: Mon, 11 Sep 2023 09:49:58 +0200
Message-ID: <20230911075008.462712-4-clg@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230911075008.462712-1-clg@redhat.com>
References: <20230911075008.462712-1-clg@redhat.com>
MIME-Version: 1.0
Received-SPF: pass client-ip=150.107.74.76; envelope-from=SRS0=GLJ6=E3=redhat.com=clg@ozlabs.org; helo=gandalf.ozlabs.org
X-Spam_score_int: -16
X-Spam_score: -1.7
X-Spam_bar: -
X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

From: Avihai Horon

Add qdev_add_vm_change_state_handler_full() variant that allows setting
a prepare callback in addition to the main callback.

This will facilitate adding P2P support for VFIO migration in the
following patches.

Signed-off-by: Avihai Horon
Signed-off-by: Joao Martins
Reviewed-by: Cédric Le Goater
Tested-by: YangHang Liu
Signed-off-by: Cédric Le Goater
---
 include/sysemu/runstate.h         |  3 +++
 hw/core/vm-change-state-handler.c | 14 +++++++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index 764a0fc6a4554857bcff339c668b48193b40c3a4..08afb97695bd6a7c7f1fb852f475be710eb4ac35 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -23,6 +23,9 @@ qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
 VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
                                                      VMChangeStateHandler *cb,
                                                      void *opaque);
+VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
+    DeviceState *dev, VMChangeStateHandler *cb,
+    VMChangeStateHandler *prepare_cb, void *opaque);
 void qemu_del_vm_change_state_handler(VMChangeStateEntry *e);
 /**
  * vm_state_notify: Notify the state of the VM
diff --git a/hw/core/vm-change-state-handler.c b/hw/core/vm-change-state-handler.c
index 1f3630986d54e1fc70ac7d62f646296af3f7f3cf..8e2639224e7572b77be0f3c44e12d7321f18b535 100644
--- a/hw/core/vm-change-state-handler.c
+++ b/hw/core/vm-change-state-handler.c
@@ -55,8 +55,20 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
 VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
                                                      VMChangeStateHandler *cb,
                                                      void *opaque)
+{
+    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, opaque);
+}
+
+/*
+ * Exactly like qdev_add_vm_change_state_handler() but passes a prepare_cb
+ * argument too.
+ */
+VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
+    DeviceState *dev, VMChangeStateHandler *cb,
+    VMChangeStateHandler *prepare_cb, void *opaque)
 {
     int depth = qdev_get_dev_tree_depth(dev);
 
-    return qemu_add_vm_change_state_handler_prio(cb, opaque, depth);
+    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, opaque,
+                                                      depth);
 }

From patchwork Mon Sep 11 07:49:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?=
X-Patchwork-Id: 13378939
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id 81659EE57DF
 for ; Mon, 11 Sep 2023 07:52:29 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qfbgx-00047b-IC; Mon, 11 Sep 2023 03:50:43 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qfbgt-00042W-1Y
 for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:39 -0400
Received: from gandalf.ozlabs.org ([150.107.74.76])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qfbgf-00081n-Dx
 for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:38 -0400
Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3])
 by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5V51qWz4xFQ;
 Mon, 11 Sep 2023 17:50:22 +1000 (AEST)
Received: from authenticated.ozlabs.org (localhost [127.0.0.1])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
 (No client certificate requested)
 by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5S4sf6z4x5q;
 Mon, 11 Sep 2023 17:50:20 +1000 (AEST)
From: =?utf-8?q?C=C3=A9dric_Le_Goater?=
To: qemu-devel@nongnu.org
Cc: Alex Williamson , Joao Martins , Avihai Horon , =?utf-8?q?C=C3=A9dric_Le_Goater?= , YangHang Liu
Subject: [PULL 04/13] vfio/migration: Refactor PRE_COPY and RUNNING state checks
Date: Mon, 11 Sep 2023 09:49:59 +0200
Message-ID: <20230911075008.462712-5-clg@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230911075008.462712-1-clg@redhat.com>
References: <20230911075008.462712-1-clg@redhat.com>
MIME-Version: 1.0
Received-SPF: pass client-ip=150.107.74.76; envelope-from=SRS0=GLJ6=E3=redhat.com=clg@ozlabs.org; helo=gandalf.ozlabs.org
X-Spam_score_int: -16
X-Spam_score: -1.7
X-Spam_bar: -
X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

From: Joao Martins

Move the PRE_COPY and RUNNING state checks to helper functions.

This is in preparation for adding P2P VFIO migration support, where
these helpers will also test for PRE_COPY_P2P and RUNNING_P2P states.
Signed-off-by: Joao Martins
Signed-off-by: Avihai Horon
Reviewed-by: Cédric Le Goater
Tested-by: YangHang Liu
Signed-off-by: Cédric Le Goater
---
 include/hw/vfio/vfio-common.h |  2 ++
 hw/vfio/common.c              | 22 ++++++++++++++++++----
 hw/vfio/migration.c           | 10 ++++------
 3 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index da43d273524ec441c13194b363008ab27a72839d..e9b895459534d7873445f865ef0e5f8f5c53882a 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -230,6 +230,8 @@ void vfio_unblock_multiple_devices_migration(void);
 bool vfio_viommu_preset(VFIODevice *vbasedev);
 int64_t vfio_mig_bytes_transferred(void);
 void vfio_reset_bytes_transferred(void);
+bool vfio_device_state_is_running(VFIODevice *vbasedev);
+bool vfio_device_state_is_precopy(VFIODevice *vbasedev);
 
 #ifdef CONFIG_LINUX
 int vfio_get_region_info(VFIODevice *vbasedev, int index,
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 9aac21abb76ef7d1abb54428e9a173a33ce16073..16cf79a76c845d8eb19498e8c6bf1f3b2b8d2fd8 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -437,6 +437,20 @@ static void vfio_set_migration_error(int err)
     }
 }
 
+bool vfio_device_state_is_running(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+
+    return migration->device_state == VFIO_DEVICE_STATE_RUNNING;
+}
+
+bool vfio_device_state_is_precopy(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+
+    return migration->device_state == VFIO_DEVICE_STATE_PRE_COPY;
+}
+
 static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
 {
     VFIOGroup *group;
@@ -457,8 +471,8 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
             }
 
             if (vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF &&
-                (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
-                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY)) {
+                (vfio_device_state_is_running(vbasedev) ||
+                 vfio_device_state_is_precopy(vbasedev))) {
                 return false;
             }
         }
@@ -503,8 +517,8 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
                 return false;
             }
 
-            if (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
-                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY) {
+            if (vfio_device_state_is_running(vbasedev) ||
+                vfio_device_state_is_precopy(vbasedev)) {
                 continue;
             } else {
                 return false;
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 8acd182a8bf3fcd0eb0368816ff3093242b103f5..48f9c23cbe3ac720ef252d699636e4a572bec762 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -411,7 +411,7 @@ static void vfio_state_pending_estimate(void *opaque, uint64_t *must_precopy,
     VFIODevice *vbasedev = opaque;
     VFIOMigration *migration = vbasedev->migration;
 
-    if (migration->device_state != VFIO_DEVICE_STATE_PRE_COPY) {
+    if (!vfio_device_state_is_precopy(vbasedev)) {
         return;
     }
 
@@ -444,7 +444,7 @@ static void vfio_state_pending_exact(void *opaque, uint64_t *must_precopy,
     vfio_query_stop_copy_size(vbasedev, &stop_copy_size);
     *must_precopy += stop_copy_size;
 
-    if (migration->device_state == VFIO_DEVICE_STATE_PRE_COPY) {
+    if (vfio_device_state_is_precopy(vbasedev)) {
         vfio_query_precopy_size(migration);
 
         *must_precopy +=
@@ -459,9 +459,8 @@ static void vfio_state_pending_exact(void *opaque, uint64_t *must_precopy,
 static bool vfio_is_active_iterate(void *opaque)
 {
     VFIODevice *vbasedev = opaque;
-    VFIOMigration *migration = vbasedev->migration;
 
-    return migration->device_state == VFIO_DEVICE_STATE_PRE_COPY;
+    return vfio_device_state_is_precopy(vbasedev);
 }
 
 static int vfio_save_iterate(QEMUFile *f, void *opaque)
@@ -656,7 +655,6 @@ static const SaveVMHandlers savevm_vfio_handlers = {
 static void vfio_vmstate_change(void *opaque, bool running, RunState state)
 {
     VFIODevice *vbasedev = opaque;
-    VFIOMigration *migration = vbasedev->migration;
     enum vfio_device_mig_state new_state;
     int ret;
 
@@ -664,7 +662,7 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
         new_state = VFIO_DEVICE_STATE_RUNNING;
     } else {
         new_state =
-            (migration->device_state == VFIO_DEVICE_STATE_PRE_COPY &&
+            (vfio_device_state_is_precopy(vbasedev) &&
              (state == RUN_STATE_FINISH_MIGRATE || state == RUN_STATE_PAUSED)) ?
                 VFIO_DEVICE_STATE_STOP_COPY :
                 VFIO_DEVICE_STATE_STOP;

From patchwork Mon Sep 11 07:50:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?=
X-Patchwork-Id: 13378937
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id C9344EE7FF4
 for ; Mon, 11 Sep 2023 07:52:25 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qfbgs-00042Q-Qk; Mon, 11 Sep 2023 03:50:38 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qfbgr-0003vy-4C
 for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:37 -0400
Received: from gandalf.ozlabs.org ([150.107.74.76])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qfbgl-0008Cl-4k
 for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:35 -0400
Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3])
 by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5X6wYqz4xGM;
 Mon, 11 Sep 2023 17:50:24 +1000 (AEST)
Received: from authenticated.ozlabs.org (localhost [127.0.0.1])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
 (No client certificate requested)
 by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5W1STjz4x5q;
 Mon, 11 Sep 2023 17:50:22 +1000 (AEST)
From: =?utf-8?q?C=C3=A9dric_Le_Goater?=
To: qemu-devel@nongnu.org
Cc: Alex Williamson , Avihai Horon , YangHang Liu , =?utf-8?q?C=C3=A9dric_Le_Goater?=
Subject: [PULL 05/13] vfio/migration: Add P2P support for VFIO migration
Date: Mon, 11 Sep 2023 09:50:00 +0200
Message-ID: <20230911075008.462712-6-clg@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230911075008.462712-1-clg@redhat.com>
References: <20230911075008.462712-1-clg@redhat.com>
MIME-Version: 1.0
Received-SPF: pass client-ip=150.107.74.76; envelope-from=SRS0=GLJ6=E3=redhat.com=clg@ozlabs.org; helo=gandalf.ozlabs.org
X-Spam_score_int: -16
X-Spam_score: -1.7
X-Spam_bar: -
X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

From: Avihai Horon

VFIO migration uAPI defines an optional intermediate P2P quiescent
state. While in the P2P quiescent state, P2P DMA transactions cannot be
initiated by the device, but the device can respond to incoming ones.
Additionally, all outstanding P2P transactions are guaranteed to have
been completed by the time the device enters this state.

The purpose of this state is to support migration of multiple devices
that might do P2P transactions between themselves.

Add support for P2P migration by transitioning all the devices to the
P2P quiescent state before stopping or starting the devices. Use the new
VMChangeStateHandler prepare_cb to achieve that behavior.
This will allow migration of multiple VFIO devices if all of them
support P2P migration.

Signed-off-by: Avihai Horon
Tested-by: YangHang Liu
Reviewed-by: Cédric Le Goater
Signed-off-by: Cédric Le Goater
---
 docs/devel/vfio-migration.rst | 93 +++++++++++++++++++++--------------
 hw/vfio/common.c              |  6 ++-
 hw/vfio/migration.c           | 46 +++++++++++++++--
 hw/vfio/trace-events          |  1 +
 4 files changed, 105 insertions(+), 41 deletions(-)

diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
index b433cb5bb2c8caa703c6063efeb382bebfe7aed6..605fe60e9695a50813b1294bb970ce2c39d2ba07 100644
--- a/docs/devel/vfio-migration.rst
+++ b/docs/devel/vfio-migration.rst
@@ -23,9 +23,21 @@ and recommends that the initial bytes are sent and loaded in the destination
 before stopping the source VM. Enabling this migration capability will
 guarantee that and thus, can potentially reduce downtime even further.
 
-Note that currently VFIO migration is supported only for a single device. This
-is due to VFIO migration's lack of P2P support. However, P2P support is planned
-to be added later on.
+To support migration of multiple devices that might do P2P transactions between
+themselves, VFIO migration uAPI defines an intermediate P2P quiescent state.
+While in the P2P quiescent state, P2P DMA transactions cannot be initiated by
+the device, but the device can respond to incoming ones. Additionally, all
+outstanding P2P transactions are guaranteed to have been completed by the time
+the device enters this state.
+
+All the devices that support P2P migration are first transitioned to the P2P
+quiescent state and only then are they stopped or started. This makes migration
+safe P2P-wise, since starting and stopping the devices is not done atomically
+for all the devices together.
+
+Thus, multiple VFIO devices migration is allowed only if all the devices
+support P2P migration. Single VFIO device migration is allowed regardless of
+P2P migration support.
 
 A detailed description of the UAPI for VFIO device migration can be found in
 the comment for the ``vfio_device_mig_state`` structure in the header file
@@ -132,54 +144,63 @@ will be blocked.
 Flow of state changes during Live migration
 ===========================================
 
-Below is the flow of state change during live migration.
+Below is the state change flow during live migration for a VFIO device that
+supports both precopy and P2P migration. The flow for devices that don't
+support it is similar, except that the relevant states for precopy and P2P are
+skipped.
 The values in the parentheses represent the VM state, the migration state, and
 the VFIO device state, respectively.
-The text in the square brackets represents the flow if the VFIO device supports
-pre-copy.
 
 Live migration save path
 ------------------------
 
 ::
 
-                        QEMU normal running state
-                        (RUNNING, _NONE, _RUNNING)
-                                    |
+                          QEMU normal running state
+                          (RUNNING, _NONE, _RUNNING)
+                                      |
                      migrate_init spawns migration_thread
-         Migration thread then calls each device's .save_setup()
-                 (RUNNING, _SETUP, _RUNNING [_PRE_COPY])
-                                    |
-                 (RUNNING, _ACTIVE, _RUNNING [_PRE_COPY])
-  If device is active, get pending_bytes by .state_pending_{estimate,exact}()
-   If total pending_bytes >= threshold_size, call .save_live_iterate()
-            [Data of VFIO device for pre-copy phase is copied]
-  Iterate till total pending bytes converge and are less than threshold
-                                    |
-  On migration completion, vCPU stops and calls .save_live_complete_precopy for
-  each active device. The VFIO device is then transitioned into _STOP_COPY state
-                  (FINISH_MIGRATE, _DEVICE, _STOP_COPY)
-                                    |
-    For the VFIO device, iterate in .save_live_complete_precopy until
-                            pending data is 0
-                     (FINISH_MIGRATE, _DEVICE, _STOP)
-                                    |
-                   (FINISH_MIGRATE, _COMPLETED, _STOP)
-         Migraton thread schedules cleanup bottom half and exits
+           Migration thread then calls each device's .save_setup()
+                         (RUNNING, _SETUP, _PRE_COPY)
+                                      |
+                        (RUNNING, _ACTIVE, _PRE_COPY)
+ If device is active, get pending_bytes by .state_pending_{estimate,exact}()
+     If total pending_bytes >= threshold_size, call .save_live_iterate()
+               Data of VFIO device for pre-copy phase is copied
+    Iterate till total pending bytes converge and are less than threshold
+                                      |
+      On migration completion, the vCPUs and the VFIO device are stopped
+             The VFIO device is first put in P2P quiescent state
+                   (FINISH_MIGRATE, _ACTIVE, _PRE_COPY_P2P)
+                                      |
+               Then the VFIO device is put in _STOP_COPY state
+                    (FINISH_MIGRATE, _ACTIVE, _STOP_COPY)
+        .save_live_complete_precopy() is called for each active device
+     For the VFIO device, iterate in .save_live_complete_precopy() until
+                              pending data is 0
+                                      |
+                    (POSTMIGRATE, _COMPLETED, _STOP_COPY)
+           Migraton thread schedules cleanup bottom half and exits
+                                      |
+                          .save_cleanup() is called
+                       (POSTMIGRATE, _COMPLETED, _STOP)
 
 Live migration resume path
 --------------------------
 
 ::
 
-           Incoming migration calls .load_setup for each device
-                       (RESTORE_VM, _ACTIVE, _STOP)
-                                    |
-   For each device, .load_state is called for that device section data
-                     (RESTORE_VM, _ACTIVE, _RESUMING)
-                                    |
-  At the end, .load_cleanup is called for each device and vCPUs are started
-                        (RUNNING, _NONE, _RUNNING)
+            Incoming migration calls .load_setup() for each device
+                         (RESTORE_VM, _ACTIVE, _STOP)
+                                      |
+    For each device, .load_state() is called for that device section data
+                       (RESTORE_VM, _ACTIVE, _RESUMING)
+                                      |
+ At the end, .load_cleanup() is called for each device and vCPUs are started
+             The VFIO device is first put in P2P quiescent state
+                       (RUNNING, _ACTIVE, _RUNNING_P2P)
+                                      |
+                          (RUNNING, _NONE, _RUNNING)
 
 Postcopy
 ========
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 16cf79a76c845d8eb19498e8c6bf1f3b2b8d2fd8..7c3d636025695641299f306c2afe12fa3e990736 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -441,14 +441,16 @@ bool vfio_device_state_is_running(VFIODevice *vbasedev)
 {
     VFIOMigration *migration = vbasedev->migration;
 
-    return migration->device_state == VFIO_DEVICE_STATE_RUNNING;
+    return migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
+           migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P;
 }
 
 bool vfio_device_state_is_precopy(VFIODevice *vbasedev)
 {
     VFIOMigration *migration = vbasedev->migration;
 
-    return migration->device_state == VFIO_DEVICE_STATE_PRE_COPY;
+    return migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
+           migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P;
 }
 
 static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 48f9c23cbe3ac720ef252d699636e4a572bec762..71855468fe985291e2d009b81c6efd29abcbe755 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -71,8 +71,12 @@ static const char *mig_state_to_str(enum vfio_device_mig_state state)
         return "STOP_COPY";
     case VFIO_DEVICE_STATE_RESUMING:
         return "RESUMING";
+    case VFIO_DEVICE_STATE_RUNNING_P2P:
+        return "RUNNING_P2P";
     case VFIO_DEVICE_STATE_PRE_COPY:
         return "PRE_COPY";
+    case VFIO_DEVICE_STATE_PRE_COPY_P2P:
+        return "PRE_COPY_P2P";
     default:
         return "UNKNOWN STATE";
     }
@@ -652,6 +656,39 @@ static const SaveVMHandlers savevm_vfio_handlers = {
 
 /* ---------------------------------------------------------------------- */
 
+static void vfio_vmstate_change_prepare(void *opaque, bool running,
+                                        RunState state)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+    enum vfio_device_mig_state new_state;
+    int ret;
+
+    new_state = migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ?
+ VFIO_DEVICE_STATE_PRE_COPY_P2P : + VFIO_DEVICE_STATE_RUNNING_P2P; + + /* + * If setting the device in new_state fails, the device should be reset. + * To do so, use ERROR state as a recover state. + */ + ret = vfio_migration_set_state(vbasedev, new_state, + VFIO_DEVICE_STATE_ERROR); + if (ret) { + /* + * Migration should be aborted in this case, but vm_state_notify() + * currently does not support reporting failures. + */ + if (migrate_get_current()->to_dst_file) { + qemu_file_set_error(migrate_get_current()->to_dst_file, ret); + } + } + + trace_vfio_vmstate_change_prepare(vbasedev->name, running, + RunState_str(state), + mig_state_to_str(new_state)); +} + static void vfio_vmstate_change(void *opaque, bool running, RunState state) { VFIODevice *vbasedev = opaque; @@ -758,6 +795,7 @@ static int vfio_migration_init(VFIODevice *vbasedev) char id[256] = ""; g_autofree char *path = NULL, *oid = NULL; uint64_t mig_flags = 0; + VMChangeStateHandler *prepare_cb; if (!vbasedev->ops->vfio_get_object) { return -EINVAL; @@ -798,9 +836,11 @@ static int vfio_migration_init(VFIODevice *vbasedev) register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, &savevm_vfio_handlers, vbasedev); - migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev, - vfio_vmstate_change, - vbasedev); + prepare_cb = migration->mig_flags & VFIO_MIGRATION_P2P ? 
+ vfio_vmstate_change_prepare : + NULL; + migration->vm_state = qdev_add_vm_change_state_handler_full( + vbasedev->dev, vfio_vmstate_change, prepare_cb, vbasedev); migration->migration_state.notify = vfio_migration_state_notifier; add_migration_state_change_notifier(&migration->migration_state); diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index ee7509e68e4fc1953934bafa6a5c1e1981e4c6ca..329736a738d32ab006c3621cecfb704c84a513b7 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -167,3 +167,4 @@ vfio_save_setup(const char *name, uint64_t data_buffer_size) " (%s) data buffer vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64 vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" stopcopy size 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64 vfio_vmstate_change(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s" +vfio_vmstate_change_prepare(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s" From patchwork Mon Sep 11 07:50:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?= X-Patchwork-Id: 13378935 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BAE96EE7FF4 
for ; Mon, 11 Sep 2023 07:52:12 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qfbgu-000441-2I; Mon, 11 Sep 2023 03:50:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgp-0003vL-7K for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:35 -0400 Received: from mail.ozlabs.org ([2404:9400:2221:ea00::3] helo=gandalf.ozlabs.org) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgl-0008Co-4a for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:34 -0400 Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5b3WBZz4xGN; Mon, 11 Sep 2023 17:50:27 +1000 (AEST) Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5Y3Nfzz4x5q; Mon, 11 Sep 2023 17:50:25 +1000 (AEST) From: =?utf-8?q?C=C3=A9dric_Le_Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , Avihai Horon , Joao Martins , =?utf-8?q?C=C3=A9dric_Le_Goater?= , YangHang Liu Subject: [PULL 06/13] vfio/migration: Allow migration of multiple P2P supporting devices Date: Mon, 11 Sep 2023 09:50:01 +0200 Message-ID: <20230911075008.462712-7-clg@redhat.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230911075008.462712-1-clg@redhat.com> References: <20230911075008.462712-1-clg@redhat.com> MIME-Version: 1.0 
From: Avihai Horon Now that P2P support has been added to VFIO migration, allow migration of multiple devices if all of them support P2P migration. Single device migration is allowed regardless of P2P migration support. Signed-off-by: Avihai Horon Signed-off-by: Joao Martins Reviewed-by: Cédric Le Goater Tested-by: YangHang Liu Signed-off-by: Cédric Le Goater --- hw/vfio/common.c | 26 ++++++++++++++++++-------- 1 file changed, 18 insertions(+), 8 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 7c3d636025695641299f306c2afe12fa3e990736..8a8d074e1863ec40b00a424bbe50494ce8391301 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -363,21 +363,31 @@ bool vfio_mig_active(void) static Error *multiple_devices_migration_blocker; -static unsigned int vfio_migratable_device_num(void) +/* + * Multiple devices migration is allowed only if all devices support P2P + * migration. Single device migration is allowed regardless of P2P migration + * support. 
+ */ +static bool vfio_multiple_devices_migration_is_supported(void) { VFIOGroup *group; VFIODevice *vbasedev; unsigned int device_num = 0; + bool all_support_p2p = true; QLIST_FOREACH(group, &vfio_group_list, next) { QLIST_FOREACH(vbasedev, &group->device_list, next) { if (vbasedev->migration) { device_num++; + + if (!(vbasedev->migration->mig_flags & VFIO_MIGRATION_P2P)) { + all_support_p2p = false; + } } } } - return device_num; + return all_support_p2p || device_num <= 1; } int vfio_block_multiple_devices_migration(VFIODevice *vbasedev, Error **errp) @@ -385,19 +395,19 @@ int vfio_block_multiple_devices_migration(VFIODevice *vbasedev, Error **errp) int ret; if (multiple_devices_migration_blocker || - vfio_migratable_device_num() <= 1) { + vfio_multiple_devices_migration_is_supported()) { return 0; } if (vbasedev->enable_migration == ON_OFF_AUTO_ON) { - error_setg(errp, "Migration is currently not supported with multiple " - "VFIO devices"); + error_setg(errp, "Multiple VFIO devices migration is supported only if " + "all of them support P2P migration"); return -EINVAL; } error_setg(&multiple_devices_migration_blocker, - "Migration is currently not supported with multiple " - "VFIO devices"); + "Multiple VFIO devices migration is supported only if all of " + "them support P2P migration"); ret = migrate_add_blocker(multiple_devices_migration_blocker, errp); if (ret < 0) { error_free(multiple_devices_migration_blocker); @@ -410,7 +420,7 @@ int vfio_block_multiple_devices_migration(VFIODevice *vbasedev, Error **errp) void vfio_unblock_multiple_devices_migration(void) { if (!multiple_devices_migration_blocker || - vfio_migratable_device_num() > 1) { + !vfio_multiple_devices_migration_is_supported()) { return; } From patchwork Mon Sep 11 07:50:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?= X-Patchwork-Id: 13378930 Return-Path: X-Spam-Checker-Version: 
SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D706BEE7FF4 for ; Mon, 11 Sep 2023 07:51:16 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qfbgu-00046g-O7; Mon, 11 Sep 2023 03:50:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgr-0003vz-6q for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:37 -0400 Received: from mail.ozlabs.org ([2404:9400:2221:ea00::3] helo=gandalf.ozlabs.org) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgn-0008D6-2P for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:36 -0400 Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5d3dh8z4xKR; Mon, 11 Sep 2023 17:50:29 +1000 (AEST) Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5b75CQz4x5q; Mon, 11 Sep 2023 17:50:27 +1000 (AEST) From: =?utf-8?q?C=C3=A9dric_Le_Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , Avihai Horon , =?utf-8?q?C=C3=A9dric_Le_Goater?= Subject: [PULL 07/13] migration: Add migration prefix to functions in target.c Date: Mon, 11 Sep 2023 09:50:02 +0200 Message-ID: <20230911075008.462712-8-clg@redhat.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230911075008.462712-1-clg@redhat.com> References: <20230911075008.462712-1-clg@redhat.com> MIME-Version: 1.0 
From: Avihai Horon The functions in target.c are not static, yet they don't have a proper migration prefix. Add such a prefix. Signed-off-by: Avihai Horon Reviewed-by: Cédric Le Goater Signed-off-by: Cédric Le Goater --- migration/migration.h | 4 ++-- migration/migration.c | 6 +++--- migration/savevm.c | 2 +- migration/target.c | 8 ++++---- 4 files changed, 10 insertions(+), 10 deletions(-) diff --git a/migration/migration.h b/migration/migration.h index 6eea18db367585e6f08aada9d14fc37e02e50977..c5695de214965dfddd854779e4da8d09f04d35ba 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -512,8 +512,8 @@ void migration_consume_urgent_request(void); bool migration_rate_limit(void); void migration_cancel(const Error *error); -void populate_vfio_info(MigrationInfo *info); -void reset_vfio_bytes_transferred(void); +void migration_populate_vfio_info(MigrationInfo *info); +void migration_reset_vfio_bytes_transferred(void); void postcopy_temp_page_reset(PostcopyTmpPage *tmp_page); #endif diff --git a/migration/migration.c b/migration/migration.c index 5528acb65e0f7b84d528ee8e8c477975cb8a7dad..92866a8f49d3a7d24028198defb15c5d4d86726e 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1039,7 +1039,7 @@ static void 
fill_source_migration_info(MigrationInfo *info) populate_time_info(info, s); populate_ram_info(info, s); populate_disk_info(info); - populate_vfio_info(info); + migration_populate_vfio_info(info); break; case MIGRATION_STATUS_COLO: info->has_status = true; @@ -1048,7 +1048,7 @@ static void fill_source_migration_info(MigrationInfo *info) case MIGRATION_STATUS_COMPLETED: populate_time_info(info, s); populate_ram_info(info, s); - populate_vfio_info(info); + migration_populate_vfio_info(info); break; case MIGRATION_STATUS_FAILED: info->has_status = true; @@ -1641,7 +1641,7 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc, */ memset(&mig_stats, 0, sizeof(mig_stats)); memset(&compression_counters, 0, sizeof(compression_counters)); - reset_vfio_bytes_transferred(); + migration_reset_vfio_bytes_transferred(); return true; } diff --git a/migration/savevm.c b/migration/savevm.c index a2cb8855e29547d2b66e6bddfc3363466c7c3bab..5bf8b59a7dfc243eb353674bdef8083d441797e3 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1622,7 +1622,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp) migrate_init(ms); memset(&mig_stats, 0, sizeof(mig_stats)); memset(&compression_counters, 0, sizeof(compression_counters)); - reset_vfio_bytes_transferred(); + migration_reset_vfio_bytes_transferred(); ms->to_dst_file = f; qemu_mutex_unlock_iothread(); diff --git a/migration/target.c b/migration/target.c index f39c9a8d88775648816d46113843ef58198c86fd..a6ffa9a5ce312d1e64157b650827aa726eb4d364 100644 --- a/migration/target.c +++ b/migration/target.c @@ -15,7 +15,7 @@ #endif #ifdef CONFIG_VFIO -void populate_vfio_info(MigrationInfo *info) +void migration_populate_vfio_info(MigrationInfo *info) { if (vfio_mig_active()) { info->vfio = g_malloc0(sizeof(*info->vfio)); @@ -23,16 +23,16 @@ void populate_vfio_info(MigrationInfo *info) } } -void reset_vfio_bytes_transferred(void) +void migration_reset_vfio_bytes_transferred(void) { vfio_reset_bytes_transferred(); } 
#else -void populate_vfio_info(MigrationInfo *info) +void migration_populate_vfio_info(MigrationInfo *info) { } -void reset_vfio_bytes_transferred(void) +void migration_reset_vfio_bytes_transferred(void) { } #endif From patchwork Mon Sep 11 07:50:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?= X-Patchwork-Id: 13378938 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EC15BEE57DF for ; Mon, 11 Sep 2023 07:52:26 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qfbgt-00042l-Cu; Mon, 11 Sep 2023 03:50:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgr-0003wZ-Hz for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:37 -0400 Received: from mail.ozlabs.org ([2404:9400:2221:ea00::3] helo=gandalf.ozlabs.org) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgo-0008DH-Tw for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:37 -0400 Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5g3lfCz4xM3; Mon, 11 Sep 2023 17:50:31 +1000 (AEST) Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5f04pZz4xM1; Mon, 11 Sep 2023 17:50:29 +1000 (AEST) From: 
=?utf-8?q?C=C3=A9dric_Le_Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , Avihai Horon , =?utf-8?q?C=C3=A9dric_Le_Goater?= Subject: [PULL 08/13] vfio/migration: Fail adding device with enable-migration=on and existing blocker Date: Mon, 11 Sep 2023 09:50:03 +0200 Message-ID: <20230911075008.462712-9-clg@redhat.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230911075008.462712-1-clg@redhat.com> References: <20230911075008.462712-1-clg@redhat.com> MIME-Version: 1.0 From: Avihai Horon If a device with enable-migration=on is added and it causes a migration blocker, adding the device should fail with a proper error. This is not the case for the multiple devices migration blocker when the blocker already exists: if a device with enable-migration=on that causes a migration blocker is added, adding the device succeeds. Fix it by making the device addition fail in such a case. 
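The check ordering described above can be sketched in isolation. The following is a simplified standalone model, not the QEMU code: the names model_on_off_auto, model_state and model_block_multiple_devices are hypothetical, and the real function additionally sets an Error and registers the blocker through migrate_add_blocker(). It only illustrates that the enable-migration=on check has to run before the "blocker already exists" shortcut.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the tri-state enable-migration property. */
enum model_on_off_auto { MODEL_AUTO, MODEL_ON, MODEL_OFF };

/* Hypothetical model state (not a QEMU structure). */
struct model_state {
    bool multi_dev_supported; /* current device set can be migrated */
    bool blocker_installed;   /* global blocker has already been added */
};

/*
 * Mirrors the order of checks after the fix:
 *   1. supported configuration  -> nothing to do
 *   2. enable-migration=on      -> fail, even if a blocker already exists
 *   3. blocker already present  -> nothing more to add
 *   4. otherwise                -> install the blocker
 */
int model_block_multiple_devices(struct model_state *st,
                                 enum model_on_off_auto enable_migration)
{
    assert(st != NULL);

    if (st->multi_dev_supported) {
        return 0;
    }
    if (enable_migration == MODEL_ON) {
        return -EINVAL;
    }
    if (st->blocker_installed) {
        return 0;
    }
    st->blocker_installed = true;
    return 0;
}
```

With the pre-fix ordering, step 3 ran together with step 1, so a device with enable-migration=on was accepted whenever the blocker was already installed; in this model that corresponds to returning 0 instead of -EINVAL for an unsupported configuration.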
Fixes: 8bbcb64a71d8 ("vfio/migration: Make VFIO migration non-experimental") Signed-off-by: Avihai Horon Reviewed-by: Cédric Le Goater Signed-off-by: Cédric Le Goater --- hw/vfio/common.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 8a8d074e1863ec40b00a424bbe50494ce8391301..237101d03844273f653d98b6d053a1ae9c05a247 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -394,8 +394,7 @@ int vfio_block_multiple_devices_migration(VFIODevice *vbasedev, Error **errp) { int ret; - if (multiple_devices_migration_blocker || - vfio_multiple_devices_migration_is_supported()) { + if (vfio_multiple_devices_migration_is_supported()) { return 0; } @@ -405,6 +404,10 @@ int vfio_block_multiple_devices_migration(VFIODevice *vbasedev, Error **errp) return -EINVAL; } + if (multiple_devices_migration_blocker) { + return 0; + } + error_setg(&multiple_devices_migration_blocker, "Multiple VFIO devices migration is supported only if all of " "them support P2P migration"); From patchwork Mon Sep 11 07:50:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?= X-Patchwork-Id: 13378928 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E12BCEE7FF4 for ; Mon, 11 Sep 2023 07:51:04 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qfbgz-00048R-KF; Mon, 11 Sep 2023 03:50:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgt-000438-JA for 
qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:39 -0400 Received: from gandalf.ozlabs.org ([150.107.74.76]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgq-0008Ej-T3 for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:39 -0400 Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5j44F7z4xFd; Mon, 11 Sep 2023 17:50:33 +1000 (AEST) Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5h09zrz4xM4; Mon, 11 Sep 2023 17:50:31 +1000 (AEST) From: =?utf-8?q?C=C3=A9dric_Le_Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , Avihai Horon , =?utf-8?q?C=C3=A9dric_Le_Goater?= Subject: [PULL 09/13] migration: Move more initializations to migrate_init() Date: Mon, 11 Sep 2023 09:50:04 +0200 Message-ID: <20230911075008.462712-10-clg@redhat.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230911075008.462712-1-clg@redhat.com> References: <20230911075008.462712-1-clg@redhat.com> MIME-Version: 1.0 From: Avihai Horon Initialization of mig_stats, 
compression_counters and VFIO bytes transferred is hard-coded in migration code path and snapshot code path. Make the code cleaner by initializing them in migrate_init(). Suggested-by: Cédric Le Goater Signed-off-by: Avihai Horon Reviewed-by: Cédric Le Goater Signed-off-by: Cédric Le Goater --- migration/migration.c | 14 +++++++------- migration/savevm.c | 3 --- 2 files changed, 7 insertions(+), 10 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 92866a8f49d3a7d24028198defb15c5d4d86726e..ce01a3ba6af72aa35063f88355349ec739708d4a 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1425,6 +1425,13 @@ void migrate_init(MigrationState *s) s->iteration_initial_bytes = 0; s->threshold_size = 0; s->switchover_acked = false; + /* + * set mig_stats compression_counters memory to zero for a + * new migration + */ + memset(&mig_stats, 0, sizeof(mig_stats)); + memset(&compression_counters, 0, sizeof(compression_counters)); + migration_reset_vfio_bytes_transferred(); } int migrate_add_blocker_internal(Error *reason, Error **errp) @@ -1635,13 +1642,6 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc, } migrate_init(s); - /* - * set mig_stats compression_counters memory to zero for a - * new migration - */ - memset(&mig_stats, 0, sizeof(mig_stats)); - memset(&compression_counters, 0, sizeof(compression_counters)); - migration_reset_vfio_bytes_transferred(); return true; } diff --git a/migration/savevm.c b/migration/savevm.c index 5bf8b59a7dfc243eb353674bdef8083d441797e3..e14efeced0fb8e4b2dc2b7799f5612799f185170 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1620,9 +1620,6 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp) } migrate_init(ms); - memset(&mig_stats, 0, sizeof(mig_stats)); - memset(&compression_counters, 0, sizeof(compression_counters)); - migration_reset_vfio_bytes_transferred(); ms->to_dst_file = f; qemu_mutex_unlock_iothread(); From patchwork Mon Sep 11 07:50:05 2023 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?= X-Patchwork-Id: 13378933 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1C9DAEE57DF for ; Mon, 11 Sep 2023 07:51:34 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qfbgv-000478-De; Mon, 11 Sep 2023 03:50:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgt-00043K-RZ for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:39 -0400 Received: from mail.ozlabs.org ([2404:9400:2221:ea00::3] helo=gandalf.ozlabs.org) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgq-0008Co-TQ for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:39 -0400 Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5l63pkz4xM7; Mon, 11 Sep 2023 17:50:35 +1000 (AEST) Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5k0X96z4xM5; Mon, 11 Sep 2023 17:50:33 +1000 (AEST) From: =?utf-8?q?C=C3=A9dric_Le_Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , Avihai Horon , Peter Xu , =?utf-8?q?C=C3=A9dric_Le_Goater?= Subject: [PULL 10/13] migration: Add .save_prepare() handler to struct SaveVMHandlers Date: Mon, 11 
Sep 2023 09:50:05 +0200 Message-ID: <20230911075008.462712-11-clg@redhat.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230911075008.462712-1-clg@redhat.com> References: <20230911075008.462712-1-clg@redhat.com> MIME-Version: 1.0 From: Avihai Horon Add a new .save_prepare() handler to struct SaveVMHandlers. This handler is called early, even before migration starts, and can be used by devices to perform early checks. Refactor migrate_init() to be able to return errors and call .save_prepare() from there. Suggested-by: Peter Xu Signed-off-by: Avihai Horon Reviewed-by: Peter Xu Reviewed-by: Cédric Le Goater Signed-off-by: Cédric Le Goater --- include/migration/register.h | 5 +++++ migration/migration.h | 2 +- migration/savevm.h | 1 + migration/migration.c | 15 +++++++++++++-- migration/savevm.c | 29 ++++++++++++++++++++++++++++- 5 files changed, 48 insertions(+), 4 deletions(-) diff --git a/include/migration/register.h b/include/migration/register.h index 90914f32f50cab9fc6b7797fbf3fb509a0860291..2b12c6adeca7ce5c7282e8df3c2def0068d44c18 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -20,6 +20,11 @@ typedef struct SaveVMHandlers { /* This runs inside the iothread lock. 
*/ SaveStateHandler *save_state; + /* + * save_prepare is called early, even before migration starts, and can be + * used to perform early checks. + */ + int (*save_prepare)(void *opaque, Error **errp); void (*save_cleanup)(void *opaque); int (*save_live_complete_postcopy)(QEMUFile *f, void *opaque); int (*save_live_complete_precopy)(QEMUFile *f, void *opaque); diff --git a/migration/migration.h b/migration/migration.h index c5695de214965dfddd854779e4da8d09f04d35ba..c390500604b6db50eb1bb1dbb8ee56f5e1bd8610 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -472,7 +472,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in); bool migration_is_setup_or_active(int state); bool migration_is_running(int state); -void migrate_init(MigrationState *s); +int migrate_init(MigrationState *s, Error **errp); bool migration_is_blocked(Error **errp); /* True if outgoing migration has entered postcopy phase */ bool migration_in_postcopy(void); diff --git a/migration/savevm.h b/migration/savevm.h index e894bbc143313a34c5c573f0839d5c13567cc521..74669733dd63a080b765866c703234a5c4939223 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -31,6 +31,7 @@ bool qemu_savevm_state_blocked(Error **errp); void qemu_savevm_non_migratable_list(strList **reasons); +int qemu_savevm_state_prepare(Error **errp); void qemu_savevm_state_setup(QEMUFile *f); bool qemu_savevm_state_guest_unplug_pending(void); int qemu_savevm_state_resume_prepare(MigrationState *s); diff --git a/migration/migration.c b/migration/migration.c index ce01a3ba6af72aa35063f88355349ec739708d4a..d61e5727429aef92b129f04208ee82eff414bd97 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1392,8 +1392,15 @@ bool migration_is_active(MigrationState *s) s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE); } -void migrate_init(MigrationState *s) +int migrate_init(MigrationState *s, Error **errp) { + int ret; + + ret = qemu_savevm_state_prepare(errp); + if (ret) { + return ret; + } + /* * 
Reinitialise all migration state, except
      * parameters/capabilities that the user set, and
@@ -1432,6 +1439,8 @@ void migrate_init(MigrationState *s)
     memset(&mig_stats, 0, sizeof(mig_stats));
     memset(&compression_counters, 0, sizeof(compression_counters));
     migration_reset_vfio_bytes_transferred();
+
+    return 0;
 }
 
 int migrate_add_blocker_internal(Error *reason, Error **errp)
@@ -1641,7 +1650,9 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
         migrate_set_block_incremental(true);
     }
 
-    migrate_init(s);
+    if (migrate_init(s, errp)) {
+        return false;
+    }
 
     return true;
 }
diff --git a/migration/savevm.c b/migration/savevm.c
index e14efeced0fb8e4b2dc2b7799f5612799f185170..bb3e99194c608d4a08fe46182cfa67d94635d3c9 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1233,6 +1233,30 @@ bool qemu_savevm_state_guest_unplug_pending(void)
     return false;
 }
 
+int qemu_savevm_state_prepare(Error **errp)
+{
+    SaveStateEntry *se;
+    int ret;
+
+    QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+        if (!se->ops || !se->ops->save_prepare) {
+            continue;
+        }
+        if (se->ops->is_active) {
+            if (!se->ops->is_active(se->opaque)) {
+                continue;
+            }
+        }
+
+        ret = se->ops->save_prepare(se->opaque, errp);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
 void qemu_savevm_state_setup(QEMUFile *f)
 {
     MigrationState *ms = migrate_get_current();
@@ -1619,7 +1643,10 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
         return -EINVAL;
     }
 
-    migrate_init(ms);
+    ret = migrate_init(ms, errp);
+    if (ret) {
+        return ret;
+    }
     ms->to_dst_file = f;
 
     qemu_mutex_unlock_iothread();

From patchwork Mon Sep 11 07:50:06 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?=
X-Patchwork-Id: 13378931
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using
 TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 61A14EE7FF4 for ; Mon, 11 Sep 2023 07:51:26 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qfbh3-0004Em-5D; Mon, 11 Sep 2023 03:50:50 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgx-00047X-Bl for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:43 -0400
Received: from gandalf.ozlabs.org ([150.107.74.76]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgu-0008Cl-2W for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:43 -0400
Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5p2kbdz4xMC; Mon, 11 Sep 2023 17:50:38 +1000 (AEST)
Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5m2XPPz4xM5; Mon, 11 Sep 2023 17:50:36 +1000 (AEST)
From: =?utf-8?q?C=C3=A9dric_Le_Goater?=
To: qemu-devel@nongnu.org
Cc: Alex Williamson , Avihai Horon , Yanghang Liu , Peter Xu , =?utf-8?q?C=C3=A9dric_Le_Goater?=
Subject: [PULL 11/13] vfio/migration: Block VFIO migration with postcopy migration
Date: Mon, 11 Sep 2023 09:50:06 +0200
Message-ID: <20230911075008.462712-12-clg@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230911075008.462712-1-clg@redhat.com>
References: <20230911075008.462712-1-clg@redhat.com>
MIME-Version: 1.0
Received-SPF: pass client-ip=150.107.74.76; envelope-from=SRS0=GLJ6=E3=redhat.com=clg@ozlabs.org; helo=gandalf.ozlabs.org
X-Spam_score_int: -16
X-Spam_score: -1.7
X-Spam_bar: -
X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

From: Avihai Horon

VFIO migration is not compatible with postcopy migration. A VFIO device
in the destination can't handle page faults for pages that have not been
sent yet.

Doing such migration will cause the VM to crash in the destination:

qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
qemu: hardware error: vfio: DMA mapping failed, unable to continue

To prevent this, block VFIO migration with postcopy migration.

Reported-by: Yanghang Liu
Signed-off-by: Avihai Horon
Tested-by: Yanghang Liu
Reviewed-by: Peter Xu
Signed-off-by: Cédric Le Goater
---
 hw/vfio/migration.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 71855468fe985291e2d009b81c6efd29abcbe755..20994dc1d60b1606728415fec17c19cfd00c4dee 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)
 
 /* ---------------------------------------------------------------------- */
 
+static int vfio_save_prepare(void *opaque, Error **errp)
+{
+    VFIODevice *vbasedev = opaque;
+
+    /*
+     * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
+     */
+    if (runstate_check(RUN_STATE_SAVE_VM)) {
+        return 0;
+    }
+
+    if (migrate_postcopy_ram()) {
+        error_setg(
+            errp, "%s: VFIO migration is not supported with postcopy migration",
+            vbasedev->name);
+        return -EOPNOTSUPP;
+    }
+
+    return 0;
+}
+
 static int vfio_save_setup(QEMUFile *f, void *opaque)
 {
     VFIODevice *vbasedev = opaque;
@@ -640,6 +661,7 @@ static bool vfio_switchover_ack_needed(void *opaque)
 }
 
 static const SaveVMHandlers savevm_vfio_handlers = {
+    .save_prepare = vfio_save_prepare,
     .save_setup = vfio_save_setup,
     .save_cleanup = vfio_save_cleanup,
     .state_pending_estimate = vfio_state_pending_estimate,

From patchwork Mon Sep 11 07:50:07 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?=
X-Patchwork-Id: 13378934
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 43AC4EE7FF4 for ; Mon, 11 Sep 2023 07:51:52 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qfbh8-0004Ox-Mw; Mon, 11 Sep 2023 03:50:54 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgy-00047y-5F for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:45 -0400
Received: from gandalf.ozlabs.org ([150.107.74.76]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgv-0008Ej-Sh for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:43 -0400
Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5r4dzjz4xNG; Mon, 11 Sep
 2023 17:50:40 +1000 (AEST)
Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5p6JKtz4xM5; Mon, 11 Sep 2023 17:50:38 +1000 (AEST)
From: =?utf-8?q?C=C3=A9dric_Le_Goater?=
To: qemu-devel@nongnu.org
Cc: Alex Williamson , Avihai Horon , Peter Xu , =?utf-8?q?C=C3=A9dric_Le_Goater?=
Subject: [PULL 12/13] vfio/migration: Block VFIO migration with background snapshot
Date: Mon, 11 Sep 2023 09:50:07 +0200
Message-ID: <20230911075008.462712-13-clg@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230911075008.462712-1-clg@redhat.com>
References: <20230911075008.462712-1-clg@redhat.com>
MIME-Version: 1.0
Received-SPF: pass client-ip=150.107.74.76; envelope-from=SRS0=GLJ6=E3=redhat.com=clg@ozlabs.org; helo=gandalf.ozlabs.org
X-Spam_score_int: -16
X-Spam_score: -1.7
X-Spam_bar: -
X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

From: Avihai Horon

Background snapshot allows creating a snapshot of the VM while it's
running and keeping it small by not including dirty RAM pages.

The way it works is by first stopping the VM, saving the non-iterable
devices' state and then starting the VM and saving the RAM while write
protecting it with UFFD. The resulting snapshot represents the VM state
at snapshot start.

VFIO migration is not compatible with background snapshot.
First of all, VFIO device state is not even saved in background snapshot
because only non-iterable device state is saved. But even if it was
saved, after starting the VM, a VFIO device could dirty pages without it
being detected by UFFD write protection. This would corrupt the
snapshot, as the RAM in it would not represent the RAM at snapshot
start.

To prevent this, block VFIO migration with background snapshot.

Signed-off-by: Avihai Horon
Reviewed-by: Peter Xu
Signed-off-by: Cédric Le Goater
---
 hw/vfio/migration.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 20994dc1d60b1606728415fec17c19cfd00c4dee..da43dcd2fe0734091960e3eb2ffa26750073be72 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -340,7 +340,8 @@ static int vfio_save_prepare(void *opaque, Error **errp)
     VFIODevice *vbasedev = opaque;
 
     /*
-     * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
+     * Snapshot doesn't use postcopy nor background snapshot, so allow snapshot
+     * even if they are on.
      */
     if (runstate_check(RUN_STATE_SAVE_VM)) {
         return 0;
@@ -353,6 +354,14 @@ static int vfio_save_prepare(void *opaque, Error **errp)
         return -EOPNOTSUPP;
     }
 
+    if (migrate_background_snapshot()) {
+        error_setg(
+            errp,
+            "%s: VFIO migration is not supported with background snapshot",
+            vbasedev->name);
+        return -EOPNOTSUPP;
+    }
+
     return 0;
 }
 

From patchwork Mon Sep 11 07:50:08 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?C=C3=A9dric_Le_Goater?=
X-Patchwork-Id: 13378929
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 04D7FEE7FF4 for ; Mon, 11 Sep 2023 07:51:08 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qfbh8-0004OX-4j; Mon, 11 Sep 2023 03:50:54 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbh1-0004EE-I8 for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:49 -0400
Received: from mail.ozlabs.org ([2404:9400:2221:ea00::3] helo=gandalf.ozlabs.org) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qfbgy-0008Ft-Km for qemu-devel@nongnu.org; Mon, 11 Sep 2023 03:50:47 -0400
Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4Rkf5t4l2Sz4xNg; Mon, 11 Sep 2023 17:50:42 +1000 (AEST)
Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client
 certificate requested) by mail.ozlabs.org (Postfix) with ESMTPSA id 4Rkf5s1557z4xM5; Mon, 11 Sep 2023 17:50:40 +1000 (AEST)
From: =?utf-8?q?C=C3=A9dric_Le_Goater?=
To: qemu-devel@nongnu.org
Cc: Alex Williamson , Joao Martins , =?utf-8?q?C=C3=A9dric_Le_Goater?=
Subject: [PULL 13/13] vfio/common: Separate vfio-pci ranges
Date: Mon, 11 Sep 2023 09:50:08 +0200
Message-ID: <20230911075008.462712-14-clg@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230911075008.462712-1-clg@redhat.com>
References: <20230911075008.462712-1-clg@redhat.com>
MIME-Version: 1.0
Received-SPF: pass client-ip=2404:9400:2221:ea00::3; envelope-from=SRS0=GLJ6=E3=redhat.com=clg@ozlabs.org; helo=gandalf.ozlabs.org
X-Spam_score_int: -39
X-Spam_score: -4.0
X-Spam_bar: ----
X-Spam_report: (-4.0 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

From: Joao Martins

QEMU computes the DMA logging ranges for two predefined ranges: 32-bit
and 64-bit. In the OVMF case, when the dynamic MMIO window is enabled,
QEMU includes in the 64-bit range the RAM regions at the lower part
and vfio-pci device RAM regions which are at the top of the address
space. This range contains a large gap and the size can be bigger than
the dirty tracking HW limits of some devices (MLX5 has a 2^42 limit).

To avoid such large ranges, introduce a new PCI range covering the
vfio-pci device RAM regions, this only if the addresses are above 4GB
to avoid breaking potential SeaBIOS guests.
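For readers skimming the series, the bucketing described above can be sketched outside of QEMU as follows. This is only an illustration under stated assumptions: DirtyRanges, LogRange, ranges_init(), ranges_update() and ranges_pack() are hypothetical names, not QEMU types or functions, and the is_vfio_pci flag stands in for the memory-region owner check that the patch performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's VFIODirtyRanges (not the real type). */
typedef struct {
    uint64_t min32, max32;       /* sections ending at or below 4G */
    uint64_t min64, max64;       /* remaining sections above 4G */
    uint64_t minpci64, maxpci64; /* vfio-pci regions above 4G */
} DirtyRanges;

/* Hypothetical stand-in for the iova/length pair of a DMA logging range. */
typedef struct {
    uint64_t iova;
    uint64_t length;
} LogRange;

/* Minimums start at their ceiling and maximums at 0, so max == 0 means unused. */
static void ranges_init(DirtyRanges *r)
{
    *r = (DirtyRanges){
        .min32 = UINT32_MAX,
        .min64 = UINT64_MAX,
        .minpci64 = UINT64_MAX,
    };
}

/*
 * Fold the section [iova, end] into one bucket. is_vfio_pci is an assumed
 * input here; the patch derives it from the owner of the memory region.
 */
static void ranges_update(DirtyRanges *r, uint64_t iova, uint64_t end,
                          bool is_vfio_pci)
{
    uint64_t *min, *max;

    if (is_vfio_pci && iova >= UINT32_MAX) {
        min = &r->minpci64;
        max = &r->maxpci64;
    } else if (end <= UINT32_MAX) {
        min = &r->min32;
        max = &r->max32;
    } else {
        min = &r->min64;
        max = &r->max64;
    }
    if (*min > iova) {
        *min = iova;
    }
    if (*max < end) {
        *max = end;
    }
}

/* Emit only the buckets that were touched; returns how many were written. */
static size_t ranges_pack(const DirtyRanges *r, LogRange out[3])
{
    size_t n = 0;

    if (r->max32) {
        out[n++] = (LogRange){ r->min32, r->max32 - r->min32 + 1 };
    }
    if (r->max64) {
        out[n++] = (LogRange){ r->min64, r->max64 - r->min64 + 1 };
    }
    if (r->maxpci64) {
        out[n++] = (LogRange){ r->minpci64, r->maxpci64 - r->minpci64 + 1 };
    }
    return n;
}
```

With made-up addresses, say low RAM at [0x0, 0x7fffffff], high RAM at [0x100000000, 0x27fffffff] and a vfio-pci BAR at [0x380000000000, 0x380000ffffff], ranges_pack() yields three compact ranges. Without the PCI bucket the 64-bit range would stretch from 4G up to the BAR, roughly 56 TiB, which is above the 2^42 limit mentioned above.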

[ clg: - wrote commit log
       - fixed overlapping 32-bit and PCI ranges when using SeaBIOS ]

Signed-off-by: Joao Martins
Signed-off-by: Cédric Le Goater
Fixes: 5255bbf4ec16 ("vfio/common: Add device dirty page tracking start/stop")
Signed-off-by: Cédric Le Goater
---
 hw/vfio/common.c     | 71 +++++++++++++++++++++++++++++++++++++-------
 hw/vfio/trace-events |  2 +-
 2 files changed, 61 insertions(+), 12 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 237101d03844273f653d98b6d053a1ae9c05a247..134649226d4333f648ca751291003316a5f3b4a9 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -27,6 +27,7 @@
 
 #include "hw/vfio/vfio-common.h"
 #include "hw/vfio/vfio.h"
+#include "hw/vfio/pci.h"
 #include "exec/address-spaces.h"
 #include "exec/memory.h"
 #include "exec/ram_addr.h"
@@ -1400,6 +1401,8 @@ typedef struct VFIODirtyRanges {
     hwaddr max32;
     hwaddr min64;
     hwaddr max64;
+    hwaddr minpci64;
+    hwaddr maxpci64;
 } VFIODirtyRanges;
 
 typedef struct VFIODirtyRangesListener {
@@ -1408,6 +1411,31 @@ typedef struct VFIODirtyRangesListener {
     MemoryListener listener;
 } VFIODirtyRangesListener;
 
+static bool vfio_section_is_vfio_pci(MemoryRegionSection *section,
+                                     VFIOContainer *container)
+{
+    VFIOPCIDevice *pcidev;
+    VFIODevice *vbasedev;
+    VFIOGroup *group;
+    Object *owner;
+
+    owner = memory_region_owner(section->mr);
+
+    QLIST_FOREACH(group, &container->group_list, container_next) {
+        QLIST_FOREACH(vbasedev, &group->device_list, next) {
+            if (vbasedev->type != VFIO_DEVICE_TYPE_PCI) {
+                continue;
+            }
+            pcidev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+            if (OBJECT(pcidev) == owner) {
+                return true;
+            }
+        }
+    }
+
+    return false;
+}
+
 static void vfio_dirty_tracking_update(MemoryListener *listener,
                                        MemoryRegionSection *section)
 {
@@ -1424,19 +1452,32 @@ static void vfio_dirty_tracking_update(MemoryListener *listener,
     }
 
     /*
-     * The address space passed to the dirty tracker is reduced to two ranges:
-     * one for 32-bit DMA ranges, and another one for 64-bit DMA ranges.
+     * The address space passed to the dirty tracker is reduced to three ranges:
+     * one for 32-bit DMA ranges, one for 64-bit DMA ranges and one for the
+     * PCI 64-bit hole.
+     *
      * The underlying reports of dirty will query a sub-interval of each of
      * these ranges.
      *
-     * The purpose of the dual range handling is to handle known cases of big
-     * holes in the address space, like the x86 AMD 1T hole. The alternative
-     * would be an IOVATree but that has a much bigger runtime overhead and
-     * unnecessary complexity.
+     * The purpose of the three range handling is to handle known cases of big
+     * holes in the address space, like the x86 AMD 1T hole, and firmware (like
+     * OVMF) which may relocate the pci-hole64 to the end of the address space.
+     * The latter would otherwise generate large ranges for tracking, stressing
+     * the limits of supported hardware. The pci-hole32 will always be below 4G
+     * (overlapping or not) so it doesn't need special handling and is part of
+     * the 32-bit range.
+     *
+     * The alternative would be an IOVATree but that has a much bigger runtime
+     * overhead and unnecessary complexity.
      */
-    min = (end <= UINT32_MAX) ? &range->min32 : &range->min64;
-    max = (end <= UINT32_MAX) ? &range->max32 : &range->max64;
-
+    if (vfio_section_is_vfio_pci(section, dirty->container) &&
+        iova >= UINT32_MAX) {
+        min = &range->minpci64;
+        max = &range->maxpci64;
+    } else {
+        min = (end <= UINT32_MAX) ? &range->min32 : &range->min64;
+        max = (end <= UINT32_MAX) ?
              &range->max32 : &range->max64;
+    }
     if (*min > iova) {
         *min = iova;
     }
@@ -1461,6 +1502,7 @@ static void vfio_dirty_tracking_init(VFIOContainer *container,
     memset(&dirty, 0, sizeof(dirty));
     dirty.ranges.min32 = UINT32_MAX;
     dirty.ranges.min64 = UINT64_MAX;
+    dirty.ranges.minpci64 = UINT64_MAX;
     dirty.listener = vfio_dirty_tracking_listener;
     dirty.container = container;
 
@@ -1531,7 +1573,8 @@ vfio_device_feature_dma_logging_start_create(VFIOContainer *container,
      * DMA logging uAPI guarantees to support at least a number of ranges that
      * fits into a single host kernel base page.
      */
-    control->num_ranges = !!tracking->max32 + !!tracking->max64;
+    control->num_ranges = !!tracking->max32 + !!tracking->max64 +
+        !!tracking->maxpci64;
     ranges = g_try_new0(struct vfio_device_feature_dma_logging_range,
                         control->num_ranges);
     if (!ranges) {
@@ -1550,11 +1593,17 @@ vfio_device_feature_dma_logging_start_create(VFIOContainer *container,
     if (tracking->max64) {
         ranges->iova = tracking->min64;
         ranges->length = (tracking->max64 - tracking->min64) + 1;
+        ranges++;
+    }
+    if (tracking->maxpci64) {
+        ranges->iova = tracking->minpci64;
+        ranges->length = (tracking->maxpci64 - tracking->minpci64) + 1;
     }
 
     trace_vfio_device_dirty_tracking_start(control->num_ranges,
                                            tracking->min32, tracking->max32,
-                                           tracking->min64, tracking->max64);
+                                           tracking->min64, tracking->max64,
+                                           tracking->minpci64, tracking->maxpci64);
 
     return feature;
 }
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 329736a738d32ab006c3621cecfb704c84a513b7..81ec7c7a958b890686865900e37157a373892048 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -104,7 +104,7 @@ vfio_known_safe_misalignment(const char *name, uint64_t iova, uint64_t offset_wi
 vfio_listener_region_add_no_dma_map(const char *name, uint64_t iova, uint64_t size, uint64_t page_size) "Region \"%s\" 0x%"PRIx64" size=0x%"PRIx64" is not aligned to 0x%"PRIx64" and cannot be mapped for DMA"
 vfio_listener_region_del(uint64_t start, uint64_t end)
"region_del 0x%"PRIx64" - 0x%"PRIx64 vfio_device_dirty_tracking_update(uint64_t start, uint64_t end, uint64_t min, uint64_t max) "section 0x%"PRIx64" - 0x%"PRIx64" -> update [0x%"PRIx64" - 0x%"PRIx64"]" -vfio_device_dirty_tracking_start(int nr_ranges, uint64_t min32, uint64_t max32, uint64_t min64, uint64_t max64) "nr_ranges %d 32:[0x%"PRIx64" - 0x%"PRIx64"], 64:[0x%"PRIx64" - 0x%"PRIx64"]" +vfio_device_dirty_tracking_start(int nr_ranges, uint64_t min32, uint64_t max32, uint64_t min64, uint64_t max64, uint64_t minpci, uint64_t maxpci) "nr_ranges %d 32:[0x%"PRIx64" - 0x%"PRIx64"], 64:[0x%"PRIx64" - 0x%"PRIx64"], pci64:[0x%"PRIx64" - 0x%"PRIx64"]" vfio_disconnect_container(int fd) "close container->fd=%d" vfio_put_group(int fd) "close group->fd=%d" vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"