From patchwork Wed Jan 17 09:15:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13521508 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 24C37C47422 for ; Wed, 17 Jan 2024 09:17:24 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rQ21x-0008PC-27; Wed, 17 Jan 2024 04:16:17 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rQ21v-0008P0-JJ for qemu-devel@nongnu.org; Wed, 17 Jan 2024 04:16:15 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rQ21t-0006Yz-RK for qemu-devel@nongnu.org; Wed, 17 Jan 2024 04:16:15 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1705482972; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oMbVFPYQsHSfFzRUkbmyU6bwmA69o3lKHeITiDBmUpM=; b=CIQr4UApYPVIYFcb2/ArXkRuPLDiXELjKOkczi4gqvft+QGEQvZTl7SrvyvCZs/bDEDczH D+x+wItNEpwqbjSIeuuXvkLhjYxsipbsdGWPgpVFyZKy4RBQoyC8Sk5zuh7u5T9tG1ihU4 +sO6BDlTNPXH2lJJLc5HRJfap77999g= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-628-q8vuVMoWMai_ExTome83Dw-1; Wed, 17 Jan 2024 04:16:10 -0500 X-MC-Unique: q8vuVMoWMai_ExTome83Dw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 2818D82DFE6 for ; Wed, 17 Jan 2024 09:16:10 +0000 (UTC) Received: from x1n.redhat.com (unknown [10.72.116.92]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4DE2E2166B32; Wed, 17 Jan 2024 09:16:05 +0000 (UTC) From: peterx@redhat.com To: qemu-devel@nongnu.org Cc: Eric Auger , "Michael S . Tsirkin" , Jason Wang , Alex Williamson , Igor Mammedov , peterx@redhat.com Subject: [PATCH 1/4] reset: qemu_register_reset_one() Date: Wed, 17 Jan 2024 17:15:56 +0800 Message-ID: <20240117091559.144730-2-peterx@redhat.com> In-Reply-To: <20240117091559.144730-1-peterx@redhat.com> References: <20240117091559.144730-1-peterx@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.133.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -38 X-Spam_score: -3.9 X-Spam_bar: --- X-Spam_report: (-3.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.806, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Peter Xu Cleanup the code to use a single entrance on register reset hooks. Signed-off-by: Peter Xu Reviewed-by: Eric Auger --- hw/core/reset.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/hw/core/reset.c b/hw/core/reset.c index d3263b613e..8cf60b2b09 100644 --- a/hw/core/reset.c +++ b/hw/core/reset.c @@ -39,23 +39,26 @@ typedef struct QEMUResetEntry { static QTAILQ_HEAD(, QEMUResetEntry) reset_handlers = QTAILQ_HEAD_INITIALIZER(reset_handlers); -void qemu_register_reset(QEMUResetHandler *func, void *opaque) +static void qemu_register_reset_one(QEMUResetHandler *func, void *opaque, + bool skip_snap) { QEMUResetEntry *re = g_new0(QEMUResetEntry, 1); re->func = func; re->opaque = opaque; + re->skip_on_snapshot_load = skip_snap; QTAILQ_INSERT_TAIL(&reset_handlers, re, entry); } -void qemu_register_reset_nosnapshotload(QEMUResetHandler *func, void *opaque) +void qemu_register_reset(QEMUResetHandler *func, void *opaque) { - QEMUResetEntry *re = g_new0(QEMUResetEntry, 1); + /* By default, do not skip during load of a snapshot */ + qemu_register_reset_one(func, opaque, false); +} - re->func = func; - re->opaque = opaque; - re->skip_on_snapshot_load = true; - QTAILQ_INSERT_TAIL(&reset_handlers, re, entry); +void qemu_register_reset_nosnapshotload(QEMUResetHandler *func, void *opaque) +{ + qemu_register_reset_one(func, opaque, true); } void qemu_unregister_reset(QEMUResetHandler *func, void *opaque) From patchwork Wed Jan 17 09:15:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13521505 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 84E2DC47258 for ; Wed, 17 Jan 2024 09:16:59 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rQ220-0008Rs-N4; Wed, 17 Jan 2024 04:16:20 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rQ21y-0008PU-Lw for qemu-devel@nongnu.org; Wed, 17 Jan 2024 04:16:18 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rQ21x-0006aa-5r for qemu-devel@nongnu.org; Wed, 17 Jan 2024 04:16:18 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1705482976; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zGbgHPbepSaUtEc8A7xDBzFgtQfCe0LSdeWdpVuOkY8=; b=EHCA0dtNsw7zHCtj98GChewP862j5WW32hOiMIzuR2IGEaeyqyExxLeZeZjCX3onGwm87w QUPFnrukubs/Ks4uzZ2IxfEbbH9ryPLBa1rPjR4uVfEqyUtbp0UyeoxLKvkIY4Xra2FhnC TVaUnt3NSuOqt6Mi6+1AAEQa0vckZ0w= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-674-aAEol0csOGOoVurQ_Y8tiQ-1; Wed, 17 Jan 2024 04:16:14 -0500 X-MC-Unique: aAEol0csOGOoVurQ_Y8tiQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 99DF4280FED2 for ; Wed, 17 Jan 2024 09:16:14 +0000 (UTC) Received: from x1n.redhat.com (unknown [10.72.116.92]) by smtp.corp.redhat.com (Postfix) with ESMTP id 227E22166B32; Wed, 17 Jan 2024 09:16:10 +0000 (UTC) From: peterx@redhat.com To: qemu-devel@nongnu.org Cc: Eric Auger , "Michael S . Tsirkin" , Jason Wang , Alex Williamson , Igor Mammedov , peterx@redhat.com Subject: [PATCH 2/4] reset: Allow multiple stages of system resets Date: Wed, 17 Jan 2024 17:15:57 +0800 Message-ID: <20240117091559.144730-3-peterx@redhat.com> In-Reply-To: <20240117091559.144730-1-peterx@redhat.com> References: <20240117091559.144730-1-peterx@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.133.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -38 X-Spam_score: -3.9 X-Spam_bar: --- X-Spam_report: (-3.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.806, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Peter Xu QEMU resets do not have a way to order reset hooks. Add one coarse grained reset stage so that some devices can be reset later than some others. Signed-off-by: Peter Xu --- include/sysemu/reset.h | 5 ++++ hw/core/reset.c | 60 +++++++++++++++++++++++++++++++----------- 2 files changed, 49 insertions(+), 16 deletions(-) diff --git a/include/sysemu/reset.h b/include/sysemu/reset.h index 609e4d50c2..0de697ce9f 100644 --- a/include/sysemu/reset.h +++ b/include/sysemu/reset.h @@ -5,9 +5,14 @@ typedef void QEMUResetHandler(void *opaque); +#define QEMU_RESET_STAGES_N 2 + void qemu_register_reset(QEMUResetHandler *func, void *opaque); +void qemu_register_reset_one(QEMUResetHandler *func, void *opaque, + bool skip_snap, int stage); void qemu_register_reset_nosnapshotload(QEMUResetHandler *func, void *opaque); void qemu_unregister_reset(QEMUResetHandler *func, void *opaque); +void qemu_unregister_reset_one(QEMUResetHandler *func, void *opaque, int stage); void qemu_devices_reset(ShutdownCause reason); #endif diff --git a/hw/core/reset.c b/hw/core/reset.c index 8cf60b2b09..a84c9bee84 100644 --- a/hw/core/reset.c +++ b/hw/core/reset.c @@ -36,55 +36,83 @@ typedef struct QEMUResetEntry { bool skip_on_snapshot_load; } QEMUResetEntry; -static QTAILQ_HEAD(, QEMUResetEntry) reset_handlers = - QTAILQ_HEAD_INITIALIZER(reset_handlers); +typedef QTAILQ_HEAD(QEMUResetList, QEMUResetEntry) QEMUResetList; +static QEMUResetList reset_handlers[QEMU_RESET_STAGES_N]; -static void qemu_register_reset_one(QEMUResetHandler *func, void *opaque, - bool skip_snap) +static void __attribute__((__constructor__)) qemu_reset_handlers_init(void) +{ + QEMUResetList *head; + int i = 0; + + for (i = 0; i < QEMU_RESET_STAGES_N; i++) { + head = &reset_handlers[i]; + QTAILQ_INIT(head); + } +} + +void qemu_register_reset_one(QEMUResetHandler *func, void *opaque, + bool skip_snap, int stage) { QEMUResetEntry *re = g_new0(QEMUResetEntry, 1); + QEMUResetList *head; + + assert(stage >= 0 && stage < QEMU_RESET_STAGES_N); + head = &reset_handlers[stage]; re->func = func; re->opaque = opaque; re->skip_on_snapshot_load = skip_snap; - QTAILQ_INSERT_TAIL(&reset_handlers, re, entry); + QTAILQ_INSERT_TAIL(head, re, entry); } void qemu_register_reset(QEMUResetHandler *func, void *opaque) { - /* By default, do not skip during load of a snapshot */ - qemu_register_reset_one(func, opaque, false); + qemu_register_reset_one(func, opaque, false, 0); } void qemu_register_reset_nosnapshotload(QEMUResetHandler *func, void *opaque) { - qemu_register_reset_one(func, opaque, true); + qemu_register_reset_one(func, opaque, true, 0); } -void qemu_unregister_reset(QEMUResetHandler *func, void *opaque) +void qemu_unregister_reset_one(QEMUResetHandler *func, void *opaque, int stage) { + QEMUResetList *head; QEMUResetEntry *re; - QTAILQ_FOREACH(re, &reset_handlers, entry) { + assert(stage >= 0 && stage < QEMU_RESET_STAGES_N); + head = &reset_handlers[stage]; + + QTAILQ_FOREACH(re, head, entry) { if (re->func == func && re->opaque == opaque) { - QTAILQ_REMOVE(&reset_handlers, re, entry); + QTAILQ_REMOVE(head, re, entry); g_free(re); return; } } } +void qemu_unregister_reset(QEMUResetHandler *func, void *opaque) +{ + qemu_unregister_reset_one(func, opaque, 0); +} + void qemu_devices_reset(ShutdownCause reason) { QEMUResetEntry *re, *nre; + QEMUResetList *head; + int stage; /* reset all devices */ - QTAILQ_FOREACH_SAFE(re, &reset_handlers, entry, nre) { - if (reason == SHUTDOWN_CAUSE_SNAPSHOT_LOAD && - re->skip_on_snapshot_load) { - continue; + for (stage = 0; stage < QEMU_RESET_STAGES_N; stage++) { + head = &reset_handlers[stage]; + QTAILQ_FOREACH_SAFE(re, head, entry, nre) { + if (reason == SHUTDOWN_CAUSE_SNAPSHOT_LOAD && + re->skip_on_snapshot_load) { + continue; + } + re->func(re->opaque); } - re->func(re->opaque); } } From patchwork Wed Jan 17 09:15:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13521509 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A779CC47258 for ; Wed, 17 Jan 2024 09:17:25 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rQ225-0008SU-A9; Wed, 17 Jan 2024 04:16:25 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rQ223-0008SI-55 for qemu-devel@nongnu.org; Wed, 17 Jan 2024 04:16:23 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rQ221-0006az-OM for qemu-devel@nongnu.org; Wed, 17 Jan 2024 04:16:22 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1705482980; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pFdoHAJ+lkGOTCUL+ieZy9Kht+Sp7cO/n/p8tGSPvWM=; b=YtDCL0ODgrehggU1KAlm7jFts5r+Y8LB+8FVYjeF3+rAp24dX3XTOELfc19bXKin5pNfA4 mDFdinxnmWC3L+x+gNtsy+bAzBRBaVWxuBqSHdy+V0JSfUz7cyhJa8TPjrCiV5Y32YtJG5 5opy1CRfV62nGPZv9VOA9I+6J3ewrPU= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-659-lHz3sYrWNfmgrUBKKdIVrg-1; Wed, 17 Jan 2024 04:16:19 -0500 X-MC-Unique: lHz3sYrWNfmgrUBKKdIVrg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E419F280FEC8 for ; Wed, 17 Jan 2024 09:16:18 +0000 (UTC) Received: from x1n.redhat.com (unknown [10.72.116.92]) by smtp.corp.redhat.com (Postfix) with ESMTP id 52AE92166B32; Wed, 17 Jan 2024 09:16:15 +0000 (UTC) From: peterx@redhat.com To: qemu-devel@nongnu.org Cc: Eric Auger , "Michael S . Tsirkin" , Jason Wang , Alex Williamson , Igor Mammedov , peterx@redhat.com Subject: [PATCH 3/4] intel_iommu: Tear down address spaces before IOMMU reset Date: Wed, 17 Jan 2024 17:15:58 +0800 Message-ID: <20240117091559.144730-4-peterx@redhat.com> In-Reply-To: <20240117091559.144730-1-peterx@redhat.com> References: <20240117091559.144730-1-peterx@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -38 X-Spam_score: -3.9 X-Spam_bar: --- X-Spam_report: (-3.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.806, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Peter Xu No bug report for this, but logically tearing down of existing address space should happen before reset of IOMMU state / registers, because the current address spaces may still rely on those information. Signed-off-by: Peter Xu --- hw/i386/intel_iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 1a07faddb4..8b467cbbd2 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -4090,8 +4090,8 @@ static void vtd_reset(DeviceState *dev) { IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev); - vtd_init(s); vtd_address_space_refresh_all(s); + vtd_init(s); } static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn) From patchwork Wed Jan 17 09:15:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13521506 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6E8DFC47422 for ; Wed, 17 Jan 2024 09:16:59 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rQ22B-0008TH-Nx; Wed, 17 Jan 2024 04:16:31 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rQ22A-0008T7-Cb for qemu-devel@nongnu.org; Wed, 17 Jan 2024 04:16:30 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rQ228-0006d1-Cl for qemu-devel@nongnu.org; Wed, 17 Jan 2024 04:16:30 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1705482987; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WnWfoHy7+zLZw2Fi8Yxm9dlFqdV0ueomxrtZBCoFXMs=; b=A48WnUenL7JrfevIIYrhIxY44Z7TMoi4l0mwKY9sKyuPlDzL8AySpSklHPlsB4EQzn6qAV yFTFtE+x1Z2HEyYCnmMzpiBS9Ied0+sRCyJRxKSalwAu6/HkPj/KrvzKoDfPL0+BI248FL TQvdGM02sgwOd5hj0OCslQHIsesTd38= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-9-m77RcFeeOYmcmRH6b4PzxA-1; Wed, 17 Jan 2024 04:16:24 -0500 X-MC-Unique: m77RcFeeOYmcmRH6b4PzxA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CE99D85A588 for ; Wed, 17 Jan 2024 09:16:23 +0000 (UTC) Received: from x1n.redhat.com (unknown [10.72.116.92]) by smtp.corp.redhat.com (Postfix) with ESMTP id B31F72166B32; Wed, 17 Jan 2024 09:16:19 +0000 (UTC) From: peterx@redhat.com To: qemu-devel@nongnu.org Cc: Eric Auger , "Michael S . Tsirkin" , Jason Wang , Alex Williamson , Igor Mammedov , peterx@redhat.com, YangHang Liu Subject: [PATCH 4/4] intel_iommu: Reset vIOMMU at the last stage of system reset Date: Wed, 17 Jan 2024 17:15:59 +0800 Message-ID: <20240117091559.144730-5-peterx@redhat.com> In-Reply-To: <20240117091559.144730-1-peterx@redhat.com> References: <20240117091559.144730-1-peterx@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -38 X-Spam_score: -3.9 X-Spam_bar: --- X-Spam_report: (-3.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.806, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Peter Xu We got report from Yanghang Liu on an unexpected host DMA error when system resets with VFIO attached to vIOMMU in the VM context. Alex Williamson quickly spot that there can be ordering issues on resets. A further test verified that the issue is indeed caused by such wrong ordering. vIOMMU is a fundamentally infrustructural device, its reset is currently problematic because no ordering is guaranteed against other PCI devices which may DMA through the vIOMMU device. The reset order is tricky, not only because it's current representation as a normal "-device" (so it kind of follow the qdev tree depth-first reset, but at a wrong place in the qtree; ideally it should be the parent somewhere for all pci buses, or just part of pci host bridge), but also because customized device reset hooks registered over the system reset framework, so that the ordering of the vIOMMU reset is not guaranteed. For example, VFIO can register its reset hook with vfio_reset_handler() if some device does not support FLR. That will not so far follow the depth-first travelsal reset mechanism provided by QEMU reset framework. To remedy both of the issues with limited code changes, leverage the newly introduced reset stage framework to reset vIOMMUs at the last stage of the rest devices. More information can be found in the comments in the patch, which I decided to persist even with the code to make the problem even clearer (with potential TODOs for the future, if possible). Buglink: https://issues.redhat.com/browse/RHEL-7188 Analyzed-by: Alex Williamson Reported-by: YangHang Liu Signed-off-by: Peter Xu --- hw/i386/intel_iommu.c | 54 +++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 52 insertions(+), 2 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 8b467cbbd2..5a8fbcad7a 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -35,6 +35,7 @@ #include "sysemu/kvm.h" #include "sysemu/dma.h" #include "sysemu/sysemu.h" +#include "sysemu/reset.h" #include "hw/i386/apic_internal.h" #include "kvm/kvm_i386.h" #include "migration/vmstate.h" @@ -4086,7 +4087,7 @@ static void vtd_init(IntelIOMMUState *s) /* Should not reset address_spaces when reset because devices will still use * the address space they got at first (won't ask the bus again). */ -static void vtd_reset(DeviceState *dev) +static void vtd_reset(void *dev) { IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev); @@ -4242,7 +4243,6 @@ static void vtd_class_init(ObjectClass *klass, void *data) DeviceClass *dc = DEVICE_CLASS(klass); X86IOMMUClass *x86_class = X86_IOMMU_DEVICE_CLASS(klass); - dc->reset = vtd_reset; dc->vmsd = &vtd_vmstate; device_class_set_props(dc, vtd_properties); dc->hotpluggable = false; @@ -4254,10 +4254,60 @@ static void vtd_class_init(ObjectClass *klass, void *data) dc->desc = "Intel IOMMU (VT-d) DMA Remapping device"; } +static void vtd_instance_init(Object *obj) +{ + IntelIOMMUState *s = INTEL_IOMMU_DEVICE(obj); + + /* + * vIOMMU reset may require proper ordering with other devices. There + * are two complexities so that normal DeviceState.reset() may not + * work properly for vIOMMUs: + * + * (1) Device depth-first reset hierachy doesn't yet work for vIOMMUs + * (reference: resettable_cold_reset_fn()) + * + * Currently, vIOMMU devices are created as normal '-device' + * cmdlines. It means in many ways it has the same attributes with + * most of the rest devices, even if the rest devices should + * logically be under control of the vIOMMU unit. + * + * One side effect of it is vIOMMU devices will be currently put + * randomly under qdev tree hierarchy, which is the source of + * device reset ordering in current QEMU (depth-first traversal). + * It means vIOMMU now can be reset before some devices. For fully + * emulated devices that's not a problem, because the traversal + * holds BQL for the whole process. However it is a problem if DMA + * can happen without BQL, like VFIO, vDPA or remote device process. + * + * TODO: one ideal solution can be that we make vIOMMU the parent + * of the whole pci host bridge. Hence vIOMMU can be reset after + * all the devices are reset and quiesced. + * + * (2) Some devices register its own reset functions + * + * Even if above issue solved, if devices register its own reset + * functions for some reason via QEMU reset hooks, vIOMMU can still + * be reset before the device. One example is vfio_reset_handler() + * where FLR is not supported on the device. + * + * TODO: merge relevant reset functions into the device tree reset + * framework. + * + * Neither of the above TODO may be trivial. To make it work for now, + * leverage reset stages and reset vIOMMU always at latter stage of the + * default. It means it needs to be reset after at least: + * + * - resettable_cold_reset_fn(): machine qdev tree reset + * - vfio_reset_handler(): VFIO reset for !FLR + */ + qemu_register_reset_one(vtd_reset, s, false, 1); +} + static const TypeInfo vtd_info = { .name = TYPE_INTEL_IOMMU_DEVICE, .parent = TYPE_X86_IOMMU_DEVICE, .instance_size = sizeof(IntelIOMMUState), + .instance_init = vtd_instance_init, .class_init = vtd_class_init, };