From patchwork Wed Feb 28 09:44:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13575161 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9BE58C54E41 for ; Wed, 28 Feb 2024 09:47:52 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rfGWe-0005Ec-Il; Wed, 28 Feb 2024 04:46:56 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rfGWa-0005AW-QC for qemu-devel@nongnu.org; Wed, 28 Feb 2024 04:46:53 -0500 Received: from mgamail.intel.com ([198.175.65.9]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rfGWX-00029q-9d for qemu-devel@nongnu.org; Wed, 28 Feb 2024 04:46:52 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709113609; x=1740649609; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=jLiQvP98+mmxdOUEX7dj0HvLoYoLYvqDA5bLEgUKsYs=; b=QaSr0nFB9Rf+2GSpAYVdismkJsK3FgexCkit7DMmolin6V503gwDZUZ4 GR5Wbcuy0R9qyb+PpyYwWqZoitGBPK9D+LYQ1oh0j+rl5sKEc0L1M/Z/e SrhusfR/HCTdFHShM/OohW+Ijiw/vZMWyj7V2iF1Evi1R1tYTPiNbtmdO 8TI76MkweJW5N72LYe0CWKo3Bl0NSGVWWpB30CDFTT0yPdaSMa6UdKrWw BLOYhewXvzQj0K1lo3f6+guUr48cz3wqPSWtzBZ7ZZuv0T+Rhzd3qs4VC qoyK/4Q8Ma4huFi4A9rv6LEAuUYE0IdqbcB+T1YVeqil/ZDxOtfXeAawx Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10996"; a="25969964" X-IronPort-AV: E=Sophos;i="6.06,190,1705392000"; d="scan'208";a="25969964" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 01:46:47 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,190,1705392000"; d="scan'208";a="7809944" Received: from spr-s2600bt.bj.intel.com ([10.240.192.124]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 01:46:43 -0800 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan , Marcel Apfelbaum , Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v1 1/6] intel_iommu: Add set/unset_iommu_device callback Date: Wed, 28 Feb 2024 17:44:27 +0800 Message-Id: <20240228094432.1092748-2-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240228094432.1092748-1-zhenzhong.duan@intel.com> References: <20240228094432.1092748-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=198.175.65.9; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.088, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Yi Liu This adds set/unset_iommu_device() implementation in Intel vIOMMU. In set call, a pointer to host IOMMU device info is stored in hash table indexed by PCI BDF. Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- hw/i386/intel_iommu_internal.h | 8 ++++ include/hw/i386/intel_iommu.h | 2 + hw/i386/intel_iommu.c | 74 ++++++++++++++++++++++++++++++++++ 3 files changed, 84 insertions(+) diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h index f8cf99bddf..becafd03c1 100644 --- a/hw/i386/intel_iommu_internal.h +++ b/hw/i386/intel_iommu_internal.h @@ -537,4 +537,12 @@ typedef struct VTDRootEntry VTDRootEntry; #define VTD_SL_IGN_COM 0xbff0000000000000ULL #define VTD_SL_TM (1ULL << 62) + +typedef struct VTDHostIOMMUDevice { + IntelIOMMUState *iommu_state; + PCIBus *bus; + uint8_t devfn; + HostIOMMUDevice *dev; + QLIST_ENTRY(VTDHostIOMMUDevice) next; +} VTDHostIOMMUDevice; #endif diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index 7fa0a695c8..bbc7b96add 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -292,6 +292,8 @@ struct IntelIOMMUState { /* list of registered notifiers */ QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers; + GHashTable *vtd_host_iommu_dev; /* VTDHostIOMMUDevice */ + /* interrupt remapping */ bool intr_enabled; /* Whether guest enabled IR */ dma_addr_t intr_root; /* Interrupt remapping table pointer */ diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 1a07faddb4..9b62441439 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1, gconstpointer v2) (key1->pasid == key2->pasid); } +static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2) +{ + const struct vtd_as_key *key1 = v1; + const struct vtd_as_key *key2 = v2; + + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn); +} /* * Note that we use pointer to PCIBus as the key, so hashing/shifting * based on the pointer value is intended. Note that we deal with @@ -3812,6 +3819,68 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, return vtd_dev_as; } +static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn, + HostIOMMUDevice *base_dev, Error **errp) +{ + IntelIOMMUState *s = opaque; + VTDHostIOMMUDevice *vtd_hdev; + struct vtd_as_key key = { + .bus = bus, + .devfn = devfn, + }; + struct vtd_as_key *new_key; + + assert(base_dev); + + vtd_iommu_lock(s); + + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key); + + if (vtd_hdev) { + error_setg(errp, "IOMMUFD device already exist"); + vtd_iommu_unlock(s); + return -EEXIST; + } + + vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice)); + vtd_hdev->bus = bus; + vtd_hdev->devfn = (uint8_t)devfn; + vtd_hdev->iommu_state = s; + vtd_hdev->dev = base_dev; + + new_key = g_malloc(sizeof(*new_key)); + new_key->bus = bus; + new_key->devfn = devfn; + + g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev); + + vtd_iommu_unlock(s); + + return 0; +} + +static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn) +{ + IntelIOMMUState *s = opaque; + VTDHostIOMMUDevice *vtd_hdev; + struct vtd_as_key key = { + .bus = bus, + .devfn = devfn, + }; + + vtd_iommu_lock(s); + + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key); + if (!vtd_hdev) { + vtd_iommu_unlock(s); + return; + } + + g_hash_table_remove(s->vtd_host_iommu_dev, &key); + + vtd_iommu_unlock(s); +} + /* Unmap the whole range in the notifier's scope. */ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n) { @@ -4107,6 +4176,8 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn) static PCIIOMMUOps vtd_iommu_ops = { .get_address_space = vtd_host_dma_iommu, + .set_iommu_device = vtd_dev_set_iommu_device, + .unset_iommu_device = vtd_dev_unset_iommu_device, }; static bool vtd_decide_config(IntelIOMMUState *s, Error **errp) @@ -4230,6 +4301,9 @@ static void vtd_realize(DeviceState *dev, Error **errp) g_free, g_free); s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash, vtd_as_equal, g_free, g_free); + s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_as_hash, + vtd_as_idev_equal, + g_free, g_free); vtd_init(s); pci_setup_iommu(bus, &vtd_iommu_ops, dev); /* Pseudo address space under root PCI bus. */ From patchwork Wed Feb 28 09:44:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13575160 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CB1D3C47DD9 for ; Wed, 28 Feb 2024 09:47:17 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rfGWh-0005Gm-VU; Wed, 28 Feb 2024 04:46:59 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rfGWg-0005GU-Au for qemu-devel@nongnu.org; Wed, 28 Feb 2024 04:46:58 -0500 Received: from mgamail.intel.com ([198.175.65.9]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rfGWc-0002AQ-K6 for qemu-devel@nongnu.org; Wed, 28 Feb 2024 04:46:57 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709113615; x=1740649615; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=648gzLYMico2GjbX5DwPYwkXtPATDWt90fsopq2YixU=; b=Pb7CrixIE258buytrgu1ewEzKCcSgIfRC4MDQSCWgNHvT0Iavu0n8zAN 9EzTBu0vNqG3oi2oAjcYPYrOVyHFLvFcOLFiEg5djgOCDHYd6sQ19k7em 3WWqVQkp20b2K7yxN/gzVPPCew52q6X/Vv1mQK6tR/sHFygNPOy8tDY6C c75t0nGKSF5Fs2UpYMiZ0ZbYDVU2rv3mBkjhe+xzad2JCSAfUSkYdAkRx uKBPJD8kNNBKpQf8mSFjv0jNkCVE36Wo81PgmmxEr7CnKla2fKdNtSs3J UJVyIBAG7zheS+VZscj0b6GsDcp0PEIzLfhFlVRuSjUElleHvZWjYgUIq Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10996"; a="25969984" X-IronPort-AV: E=Sophos;i="6.06,190,1705392000"; d="scan'208";a="25969984" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 01:46:53 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,190,1705392000"; d="scan'208";a="7809976" Received: from spr-s2600bt.bj.intel.com ([10.240.192.124]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 01:46:48 -0800 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Marcel Apfelbaum , Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v1 2/6] intel_iommu: Extract out vtd_cap_init to initialize cap/ecap Date: Wed, 28 Feb 2024 17:44:28 +0800 Message-Id: <20240228094432.1092748-3-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240228094432.1092748-1-zhenzhong.duan@intel.com> References: <20240228094432.1092748-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=198.175.65.9; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.088, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This is a prerequisite for host cap/ecap sync. No functional change intended. Reviewed-by: Eric Auger Signed-off-by: Zhenzhong Duan --- hw/i386/intel_iommu.c | 93 ++++++++++++++++++++++++------------------- 1 file changed, 51 insertions(+), 42 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 9b62441439..ffa1ad6429 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -4003,30 +4003,10 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) return; } -/* Do the initialization. It will also be called when reset, so pay - * attention when adding new initialization stuff. - */ -static void vtd_init(IntelIOMMUState *s) +static void vtd_cap_init(IntelIOMMUState *s) { X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); - memset(s->csr, 0, DMAR_REG_SIZE); - memset(s->wmask, 0, DMAR_REG_SIZE); - memset(s->w1cmask, 0, DMAR_REG_SIZE); - memset(s->womask, 0, DMAR_REG_SIZE); - - s->root = 0; - s->root_scalable = false; - s->dmar_enabled = false; - s->intr_enabled = false; - s->iq_head = 0; - s->iq_tail = 0; - s->iq = 0; - s->iq_size = 0; - s->qi_enabled = false; - s->iq_last_desc_type = VTD_INV_DESC_NONE; - s->iq_dw = false; - s->next_frcd_reg = 0; s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | VTD_CAP_MGAW(s->aw_bits); @@ -4043,27 +4023,6 @@ static void vtd_init(IntelIOMMUState *s) } s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; - /* - * Rsvd field masks for spte - */ - vtd_spte_rsvd[0] = ~0ULL; - vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits, - x86_iommu->dt_supported); - vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits); - vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits); - vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits); - - vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits, - x86_iommu->dt_supported); - vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits, - x86_iommu->dt_supported); - - if (s->scalable_mode || s->snoop_control) { - vtd_spte_rsvd[1] &= ~VTD_SPTE_SNP; - vtd_spte_rsvd_large[2] &= ~VTD_SPTE_SNP; - vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP; - } - if (x86_iommu_ir_supported(x86_iommu)) { s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV; if (s->intr_eim == ON_OFF_AUTO_ON) { @@ -4096,6 +4055,56 @@ static void vtd_init(IntelIOMMUState *s) if (s->pasid) { s->ecap |= VTD_ECAP_PASID; } +} + +/* + * Do the initialization. It will also be called when reset, so pay + * attention when adding new initialization stuff. + */ +static void vtd_init(IntelIOMMUState *s) +{ + X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); + + memset(s->csr, 0, DMAR_REG_SIZE); + memset(s->wmask, 0, DMAR_REG_SIZE); + memset(s->w1cmask, 0, DMAR_REG_SIZE); + memset(s->womask, 0, DMAR_REG_SIZE); + + s->root = 0; + s->root_scalable = false; + s->dmar_enabled = false; + s->intr_enabled = false; + s->iq_head = 0; + s->iq_tail = 0; + s->iq = 0; + s->iq_size = 0; + s->qi_enabled = false; + s->iq_last_desc_type = VTD_INV_DESC_NONE; + s->iq_dw = false; + s->next_frcd_reg = 0; + + vtd_cap_init(s); + + /* + * Rsvd field masks for spte + */ + vtd_spte_rsvd[0] = ~0ULL; + vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits, + x86_iommu->dt_supported); + vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits); + vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits); + vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits); + + vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits, + x86_iommu->dt_supported); + vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits, + x86_iommu->dt_supported); + + if (s->scalable_mode || s->snoop_control) { + vtd_spte_rsvd[1] &= ~VTD_SPTE_SNP; + vtd_spte_rsvd_large[2] &= ~VTD_SPTE_SNP; + vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP; + } vtd_reset_caches(s); From patchwork Wed Feb 28 09:44:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13575163 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D893BC47DD9 for ; Wed, 28 Feb 2024 09:48:01 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rfGWl-0005Re-C7; Wed, 28 Feb 2024 04:47:03 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rfGWj-0005LO-NY for qemu-devel@nongnu.org; Wed, 28 Feb 2024 04:47:01 -0500 Received: from mgamail.intel.com ([198.175.65.9]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rfGWh-0002AQ-3N for qemu-devel@nongnu.org; Wed, 28 Feb 2024 04:47:00 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709113619; x=1740649619; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zzQtmneALoan6h7kPxfiyycxQTue6ulWrRA2sEohjFA=; b=YTzrWl5oWGxzLrAcBfI95tDrkq6sRfFGkn3minAORhsU0TKcT5OQ1g1D 9ly9mltOaS8M1OfOQHRdYQPcHyNFO//YvuaEuBvECaqkKIoc6yUFdtB8w plhCFdSIAoz3PWHFzpIgiUQ5WLflL1i5JSwsM/eavhCeCQsA/7yCAyBUQ X07U/1J+r1MwsYvzIWqZxBajENEIvRExlti5ztTab86XUUDHagsNbLbHh l9dPbuCr4AVZrx1o7Z4y2rXTvO3wqHToZ5BwBjcIpLxxCtjt8FP4IVfm9 fjZWZYjH9mER9vbioJxyktVB1c5gbZDcAxi2MZJ5iP17ub60FXytSLr8q A==; X-IronPort-AV: E=McAfee;i="6600,9927,10996"; a="25969996" X-IronPort-AV: E=Sophos;i="6.06,190,1705392000"; d="scan'208";a="25969996" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 01:46:58 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,190,1705392000"; d="scan'208";a="7810012" Received: from spr-s2600bt.bj.intel.com ([10.240.192.124]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 01:46:53 -0800 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan , Paolo Bonzini , Richard Henderson , Eduardo Habkost , Marcel Apfelbaum Subject: [PATCH v1 3/6] intel_iommu: Add a framework to check and sync host IOMMU cap/ecap Date: Wed, 28 Feb 2024 17:44:29 +0800 Message-Id: <20240228094432.1092748-4-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240228094432.1092748-1-zhenzhong.duan@intel.com> References: <20240228094432.1092748-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=198.175.65.9; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.088, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Yi Liu Add a framework to check and synchronize host IOMMU cap/ecap with vIOMMU cap/ecap. The sequence will be: vtd_cap_init() initializes iommu->cap/ecap. vtd_check_hdev() update iommu->cap/ecap based on host cap/ecap. iommu->cap_frozen set when machine create done, iommu->cap/ecap become readonly. Implementation details for different backends will be in following patches. Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/i386/intel_iommu.h | 1 + hw/i386/intel_iommu.c | 50 ++++++++++++++++++++++++++++++++++- 2 files changed, 50 insertions(+), 1 deletion(-) diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index bbc7b96add..c71a133820 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -283,6 +283,7 @@ struct IntelIOMMUState { uint64_t cap; /* The value of capability reg */ uint64_t ecap; /* The value of extended capability reg */ + bool cap_frozen; /* cap/ecap become read-only after frozen */ uint32_t context_cache_gen; /* Should be in [1,MAX] */ GHashTable *iotlb; /* IOTLB */ diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index ffa1ad6429..a9f9dfd6a7 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -35,6 +35,8 @@ #include "sysemu/kvm.h" #include "sysemu/dma.h" #include "sysemu/sysemu.h" +#include "hw/vfio/vfio-common.h" +#include "sysemu/iommufd.h" #include "hw/i386/apic_internal.h" #include "kvm/kvm_i386.h" #include "migration/vmstate.h" @@ -3819,6 +3821,38 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, return vtd_dev_as; } +static int vtd_check_legacy_hdev(IntelIOMMUState *s, + IOMMULegacyDevice *ldev, + Error **errp) +{ + return 0; +} + +static int vtd_check_iommufd_hdev(IntelIOMMUState *s, + IOMMUFDDevice *idev, + Error **errp) +{ + return 0; +} + +static int vtd_check_hdev(IntelIOMMUState *s, VTDHostIOMMUDevice *vtd_hdev, + Error **errp) +{ + HostIOMMUDevice *base_dev = vtd_hdev->dev; + IOMMUFDDevice *idev; + + if (base_dev->type == HID_LEGACY) { + IOMMULegacyDevice *ldev = container_of(base_dev, + IOMMULegacyDevice, base); + + return vtd_check_legacy_hdev(s, ldev, errp); + } + + idev = container_of(base_dev, IOMMUFDDevice, base); + + return vtd_check_iommufd_hdev(s, idev, errp); +} + static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn, HostIOMMUDevice *base_dev, Error **errp) { @@ -3829,6 +3863,7 @@ static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn, .devfn = devfn, }; struct vtd_as_key *new_key; + int ret; assert(base_dev); @@ -3848,6 +3883,13 @@ static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn, vtd_hdev->iommu_state = s; vtd_hdev->dev = base_dev; + ret = vtd_check_hdev(s, vtd_hdev, errp); + if (ret) { + g_free(vtd_hdev); + vtd_iommu_unlock(s); + return ret; + } + new_key = g_malloc(sizeof(*new_key)); new_key->bus = bus; new_key->devfn = devfn; @@ -4083,7 +4125,9 @@ static void vtd_init(IntelIOMMUState *s) s->iq_dw = false; s->next_frcd_reg = 0; - vtd_cap_init(s); + if (!s->cap_frozen) { + vtd_cap_init(s); + } /* * Rsvd field masks for spte @@ -4254,6 +4298,10 @@ static int vtd_machine_done_notify_one(Object *child, void *unused) static void vtd_machine_done_hook(Notifier *notifier, void *unused) { + IntelIOMMUState *iommu = INTEL_IOMMU_DEVICE(x86_iommu_get_default()); + + iommu->cap_frozen = true; + object_child_foreach_recursive(object_get_root(), vtd_machine_done_notify_one, NULL); } From patchwork Wed Feb 28 09:44:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13575165 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 181C4C54E4A for ; Wed, 28 Feb 2024 09:48:10 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rfGWq-0005fz-Od; Wed, 28 Feb 2024 04:47:08 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rfGWo-0005e2-BU for qemu-devel@nongnu.org; Wed, 28 Feb 2024 04:47:06 -0500 Received: from mgamail.intel.com ([198.175.65.9]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rfGWm-0002AQ-J5 for qemu-devel@nongnu.org; Wed, 28 Feb 2024 04:47:06 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709113625; x=1740649625; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WvaK/emGowZQWybU1JwxE4pUbqjBI6c4mZf490UPtRw=; b=XOo4GNQtV9SkboyEsrkaVmRGNbZRhyeNLQDga+jzTLhw6sir98wShDQw PIbH4qM1urkVauU2TFqIx/xb/uLK+0VpROUfGemzxVwPepiKBldjMbtQJ EO3ail/eFZc0vTG2Hy+JJDyc9N4R+qaf2wfqNa4+KA2RfiCqno2z4f/Fn vXGlqPHRmnlAapjGni/Ur6naSzG5hQ/9Nqch89TmTM/ntMg+D3YgNskWj kUC8tq0sOlt8r/8jtSfK+/pECyIZKZUgnSatZD4bMhH6Vc/OK24X0hKlM ILmGC0qaxZiQTH143mBemYMobRB9P3Is6mvV8HLAzQruSKTcJ1DQ/GgbB g==; X-IronPort-AV: E=McAfee;i="6600,9927,10996"; a="25970008" X-IronPort-AV: E=Sophos;i="6.06,190,1705392000"; d="scan'208";a="25970008" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 01:47:04 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,190,1705392000"; d="scan'208";a="7810042" Received: from spr-s2600bt.bj.intel.com ([10.240.192.124]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 01:46:59 -0800 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Yi Sun , Paolo Bonzini , Richard Henderson , Eduardo Habkost , Marcel Apfelbaum Subject: [PATCH v1 4/6] intel_iommu: Implement check and sync mechanism in iommufd mode Date: Wed, 28 Feb 2024 17:44:30 +0800 Message-Id: <20240228094432.1092748-5-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240228094432.1092748-1-zhenzhong.duan@intel.com> References: <20240228094432.1092748-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=198.175.65.9; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.088, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org We use cap_frozen to mark cap/ecap read/writable or read-only, At init stage, we allow to update cap/ecap based on host IOMMU cap/ecap, but when machine create done, cap_frozen is set and we only allow checking cap/ecap for compatibility. Currently only stage-2 translation is supported which is backed by shadow page table on host side. So we don't need exact matching of each bit of cap/ecap between vIOMMU and host. However, we can still ensure compatibility of host and vIOMMU's address width at least, i.e., vIOMMU's mgaw <= host IOMMU mgaw, which is missed before. When stage-1 translation is supported in future, a.k.a. scalable modern mode, this mechanism will be further extended to check more bits. Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- hw/i386/intel_iommu_internal.h | 1 + include/hw/i386/intel_iommu.h | 1 + hw/i386/intel_iommu.c | 28 ++++++++++++++++++++++++++++ 3 files changed, 30 insertions(+) diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h index becafd03c1..72a5cb0859 100644 --- a/hw/i386/intel_iommu_internal.h +++ b/hw/i386/intel_iommu_internal.h @@ -204,6 +204,7 @@ #define VTD_DOMAIN_ID_MASK ((1UL << VTD_DOMAIN_ID_SHIFT) - 1) #define VTD_CAP_ND (((VTD_DOMAIN_ID_SHIFT - 4) / 2) & 7ULL) #define VTD_ADDRESS_SIZE(aw) (1ULL << (aw)) +#define VTD_CAP_MGAW_MASK (0x3fULL << 16) #define VTD_CAP_MGAW(aw) ((((aw) - 1) & 0x3fULL) << 16) #define VTD_MAMV 18ULL #define VTD_CAP_MAMV (VTD_MAMV << 48) diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index c71a133820..a0b530ebc6 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -47,6 +47,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(IntelIOMMUState, INTEL_IOMMU_DEVICE) #define VTD_HOST_AW_48BIT 48 #define VTD_HOST_ADDRESS_WIDTH VTD_HOST_AW_39BIT #define VTD_HAW_MASK(aw) ((1ULL << (aw)) - 1) +#define VTD_MGAW_FROM_CAP(cap) (((cap >> 16) & 0x3fULL) + 1) #define DMAR_REPORT_F_INTR (1) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index a9f9dfd6a7..2a55268538 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -3832,6 +3832,34 @@ static int vtd_check_iommufd_hdev(IntelIOMMUState *s, IOMMUFDDevice *idev, Error **errp) { + struct iommu_hw_info_vtd vtd; + enum iommu_hw_info_type type = IOMMU_HW_INFO_TYPE_INTEL_VTD; + long host_mgaw, viommu_mgaw = VTD_MGAW_FROM_CAP(s->cap); + uint64_t tmp_cap = s->cap; + int ret; + + ret = iommufd_device_get_info(idev, &type, sizeof(vtd), &vtd, errp); + if (ret) { + return ret; + } + + if (type != IOMMU_HW_INFO_TYPE_INTEL_VTD) { + error_setg(errp, "IOMMU hardware is not compatible"); + return -EINVAL; + } + + host_mgaw = VTD_MGAW_FROM_CAP(vtd.cap_reg); + if (viommu_mgaw > host_mgaw) { + if (s->cap_frozen) { + error_setg(errp, "mgaw %" PRId64 " > host mgaw %" PRId64, + viommu_mgaw, host_mgaw); + return -EINVAL; + } + tmp_cap &= ~VTD_CAP_MGAW_MASK; + tmp_cap |= VTD_CAP_MGAW(host_mgaw + 1); + } + + s->cap = tmp_cap; return 0; } From patchwork Wed Feb 28 09:44:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13575162 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 27BF5C47DD9 for ; Wed, 28 Feb 2024 09:47:54 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rfGWx-00062Z-GB; Wed, 28 Feb 2024 04:47:15 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rfGWv-0005ud-2j for qemu-devel@nongnu.org; Wed, 28 Feb 2024 04:47:14 -0500 Received: from mgamail.intel.com ([198.175.65.9]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rfGWs-0002AQ-QG for qemu-devel@nongnu.org; Wed, 28 Feb 2024 04:47:12 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709113631; x=1740649631; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yx9WnW6FzZkOMX5scTMftIZD9XRWJ/2axLjB+XskyEI=; b=XRWid4vpS1vKFbYw7WtpCvhLbGo2IxvKD3lsmj0iZqLi6in84vz48/At ekz9saQC9vUwLHf2LHg6NVrRkWOwmLOpRxZXEiuf80sqyhFoC6x56uEIz HomlQM3BtP/8OvriwGOg/vuOxygfsVgrLTUtrO8JzMS8DFA08Zm72wVds 2qv+IzMVvKVDkDtzD+CWgk/5wdvPQtwtB9gCQNceiPcbghYGxW9PYFfLt m0XCPGFAJYD8Z7HRtAZAIH1WpoN5BMsZw/caYTyvZbOWkv2MAKJn3Ncz0 CvylBey8fELtC/e4TnttmXoS18xi9hUMYiI08IbpfnP74xHYBOo/PERpd w==; X-IronPort-AV: E=McAfee;i="6600,9927,10996"; a="25970016" X-IronPort-AV: E=Sophos;i="6.06,190,1705392000"; d="scan'208";a="25970016" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 01:47:10 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,190,1705392000"; d="scan'208";a="7810092" Received: from spr-s2600bt.bj.intel.com ([10.240.192.124]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 01:47:05 -0800 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Igor Mammedov , Ani Sinha , Marcel Apfelbaum , Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v1 5/6] intel_iommu: Use mgaw instead of s->aw_bits Date: Wed, 28 Feb 2024 17:44:31 +0800 Message-Id: <20240228094432.1092748-6-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240228094432.1092748-1-zhenzhong.duan@intel.com> References: <20240228094432.1092748-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=198.175.65.9; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.088, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Because vIOMMU mgaw can be updated based on host IOMMU mgaw, s->aw_bits does't necessarily represent the final mgaw now but the mgaw field in s->cap does. Replace reference to s->aw_bits with a MACRO S_AW_BITS to fetch mgaw from s->cap. There are two exceptions on this, aw_bits value sanity check and s->cap initialization. ACPI DMAR table is also updated with right mgaw value. Signed-off-by: Zhenzhong Duan --- hw/i386/acpi-build.c | 3 ++- hw/i386/intel_iommu.c | 44 ++++++++++++++++++++++--------------------- 2 files changed, 25 insertions(+), 22 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index edc979379c..6467157686 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -2159,7 +2159,8 @@ build_dmar_q35(GArray *table_data, BIOSLinker *linker, const char *oem_id, acpi_table_begin(&table, table_data); /* Host Address Width */ - build_append_int_noprefix(table_data, intel_iommu->aw_bits - 1, 1); + build_append_int_noprefix(table_data, + VTD_MGAW_FROM_CAP(intel_iommu->cap), 1); build_append_int_noprefix(table_data, dmar_flags, 1); /* Flags */ g_array_append_vals(table_data, rsvd10, sizeof(rsvd10)); /* Reserved */ diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 2a55268538..e474284e43 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -42,6 +42,8 @@ #include "migration/vmstate.h" #include "trace.h" +#define S_AW_BITS (VTD_MGAW_FROM_CAP(s->cap) + 1) + /* context entry operations */ #define VTD_CE_GET_RID2PASID(ce) \ ((ce)->val[1] & VTD_SM_CONTEXT_ENTRY_RID2PASID_MASK) @@ -1410,13 +1412,13 @@ static int vtd_root_entry_rsvd_bits_check(IntelIOMMUState *s, { /* Legacy Mode reserved bits check */ if (!s->root_scalable && - (re->hi || (re->lo & VTD_ROOT_ENTRY_RSVD(s->aw_bits)))) + (re->hi || (re->lo & VTD_ROOT_ENTRY_RSVD(S_AW_BITS)))) goto rsvd_err; /* Scalable Mode reserved bits check */ if (s->root_scalable && - ((re->lo & VTD_ROOT_ENTRY_RSVD(s->aw_bits)) || - (re->hi & VTD_ROOT_ENTRY_RSVD(s->aw_bits)))) + ((re->lo & VTD_ROOT_ENTRY_RSVD(S_AW_BITS)) || + (re->hi & VTD_ROOT_ENTRY_RSVD(S_AW_BITS)))) goto rsvd_err; return 0; @@ -1433,7 +1435,7 @@ static inline int vtd_context_entry_rsvd_bits_check(IntelIOMMUState *s, { if (!s->root_scalable && (ce->hi & VTD_CONTEXT_ENTRY_RSVD_HI || - ce->lo & VTD_CONTEXT_ENTRY_RSVD_LO(s->aw_bits))) { + ce->lo & VTD_CONTEXT_ENTRY_RSVD_LO(S_AW_BITS))) { error_report_once("%s: invalid context entry: hi=%"PRIx64 ", lo=%"PRIx64" (reserved nonzero)", __func__, ce->hi, ce->lo); @@ -1441,7 +1443,7 @@ static inline int vtd_context_entry_rsvd_bits_check(IntelIOMMUState *s, } if (s->root_scalable && - (ce->val[0] & VTD_SM_CONTEXT_ENTRY_RSVD_VAL0(s->aw_bits) || + (ce->val[0] & VTD_SM_CONTEXT_ENTRY_RSVD_VAL0(S_AW_BITS) || ce->val[1] & VTD_SM_CONTEXT_ENTRY_RSVD_VAL1 || ce->val[2] || ce->val[3])) { @@ -1572,7 +1574,7 @@ static int vtd_sync_shadow_page_table_range(VTDAddressSpace *vtd_as, .hook_fn = vtd_sync_shadow_page_hook, .private = (void *)&vtd_as->iommu, .notify_unmap = true, - .aw = s->aw_bits, + .aw = S_AW_BITS, .as = vtd_as, .domain_id = vtd_get_domain_id(s, ce, vtd_as->pasid), }; @@ -1991,7 +1993,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus, } ret_fr = vtd_iova_to_slpte(s, &ce, addr, is_write, &slpte, &level, - &reads, &writes, s->aw_bits, pasid); + &reads, &writes, S_AW_BITS, pasid); if (ret_fr) { vtd_report_fault(s, -ret_fr, is_fpd_set, source_id, addr, is_write, pasid != PCI_NO_PASID, pasid); @@ -2005,7 +2007,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus, out: vtd_iommu_unlock(s); entry->iova = addr & page_mask; - entry->translated_addr = vtd_get_slpte_addr(slpte, s->aw_bits) & page_mask; + entry->translated_addr = vtd_get_slpte_addr(slpte, S_AW_BITS) & page_mask; entry->addr_mask = ~page_mask; entry->perm = access_flags; return true; @@ -2022,7 +2024,7 @@ error: static void vtd_root_table_setup(IntelIOMMUState *s) { s->root = vtd_get_quad_raw(s, DMAR_RTADDR_REG); - s->root &= VTD_RTADDR_ADDR_MASK(s->aw_bits); + s->root &= VTD_RTADDR_ADDR_MASK(S_AW_BITS); vtd_update_scalable_state(s); @@ -2040,7 +2042,7 @@ static void vtd_interrupt_remap_table_setup(IntelIOMMUState *s) uint64_t value = 0; value = vtd_get_quad_raw(s, DMAR_IRTA_REG); s->intr_size = 1UL << ((value & VTD_IRTA_SIZE_MASK) + 1); - s->intr_root = value & VTD_IRTA_ADDR_MASK(s->aw_bits); + s->intr_root = value & VTD_IRTA_ADDR_MASK(S_AW_BITS); s->intr_eime = value & VTD_IRTA_EIME; /* Notify global invalidation */ @@ -2323,7 +2325,7 @@ static void vtd_handle_gcmd_qie(IntelIOMMUState *s, bool en) trace_vtd_inv_qi_enable(en); if (en) { - s->iq = iqa_val & VTD_IQA_IQA_MASK(s->aw_bits); + s->iq = iqa_val & VTD_IQA_IQA_MASK(S_AW_BITS); /* 2^(x+8) entries */ s->iq_size = 1UL << ((iqa_val & VTD_IQA_QS) + 8 - (s->iq_dw ? 1 : 0)); s->qi_enabled = true; @@ -3966,12 +3968,12 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n) * VT-d spec), otherwise we need to consider overflow of 64 bits. */ - if (end > VTD_ADDRESS_SIZE(s->aw_bits) - 1) { + if (end > VTD_ADDRESS_SIZE(S_AW_BITS) - 1) { /* * Don't need to unmap regions that is bigger than the whole * VT-d supported address space size */ - end = VTD_ADDRESS_SIZE(s->aw_bits) - 1; + end = VTD_ADDRESS_SIZE(S_AW_BITS) - 1; } assert(start <= end); @@ -3979,7 +3981,7 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n) while (remain >= VTD_PAGE_SIZE) { IOMMUTLBEvent event; - uint64_t mask = dma_aligned_pow2_mask(start, end, s->aw_bits); + uint64_t mask = dma_aligned_pow2_mask(start, end, S_AW_BITS); uint64_t size = mask + 1; assert(size); @@ -4058,7 +4060,7 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) .hook_fn = vtd_replay_hook, .private = (void *)n, .notify_unmap = false, - .aw = s->aw_bits, + .aw = S_AW_BITS, .as = vtd_as, .domain_id = vtd_get_domain_id(s, &ce, vtd_as->pasid), }; @@ -4161,15 +4163,15 @@ static void vtd_init(IntelIOMMUState *s) * Rsvd field masks for spte */ vtd_spte_rsvd[0] = ~0ULL; - vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits, + vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(S_AW_BITS, x86_iommu->dt_supported); - vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits); - vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits); - vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits); + vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(S_AW_BITS); + vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(S_AW_BITS); + vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(S_AW_BITS); - vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits, + vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(S_AW_BITS, x86_iommu->dt_supported); - vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits, + vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(S_AW_BITS, x86_iommu->dt_supported); if (s->scalable_mode || s->snoop_control) { From patchwork Wed Feb 28 09:44:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13575164 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0BA18C54E41 for ; Wed, 28 Feb 2024 09:48:10 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rfGX1-0006FK-RP; Wed, 28 Feb 2024 04:47:19 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rfGWz-0006Bb-FE for qemu-devel@nongnu.org; Wed, 28 Feb 2024 04:47:17 -0500 Received: from mgamail.intel.com ([198.175.65.9]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rfGWx-0002AQ-Vm for qemu-devel@nongnu.org; Wed, 28 Feb 2024 04:47:17 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709113636; x=1740649636; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=klAYBbP2OdBAI+vH8MMNk4trXTEnoMk+ZzXZROcs7UA=; b=bO+BNLg42d2OH60aUhRpPBhsySLitqKoJrTQnlcFLtf2mSD/UDuZr1BU ni7lZAR9fvcT7bHBkqYITfgJwcQMWCTQi/3ElzfVO94mJBhmdVwfN8V91 C8L1Gua84f5XpCNKXTWiYncSmFtr0eW9n6CJL6xemlf4vWOHXKr9YWYUR YjFzB/5cl9S9H40HRcMtZFvyfAmt2PjaNRSXvwdWP7GGNrGw+QL1PGFoH jndvCSVMBl/OE4eadDR0iputaXtQjGrJfEWV/aEP9Exfu7ZUay/xFiDOq //5f8Jj5YGVsys3gqsj+YMZ4c8sL3ebNVeWKMfELDQoV3OPdK2P8ggw2f w==; X-IronPort-AV: E=McAfee;i="6600,9927,10996"; a="25970028" X-IronPort-AV: E=Sophos;i="6.06,190,1705392000"; d="scan'208";a="25970028" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 01:47:15 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,190,1705392000"; d="scan'208";a="7810109" Received: from spr-s2600bt.bj.intel.com ([10.240.192.124]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 01:47:11 -0800 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Marcel Apfelbaum , Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v1 6/6] intel_iommu: Block migration if cap is updated Date: Wed, 28 Feb 2024 17:44:32 +0800 Message-Id: <20240228094432.1092748-7-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240228094432.1092748-1-zhenzhong.duan@intel.com> References: <20240228094432.1092748-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=198.175.65.9; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.088, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org When there is VFIO device and vIOMMU cap/ecap is updated based on host IOMMU cap/ecap, migration should be blocked. Signed-off-by: Zhenzhong Duan --- hw/i386/intel_iommu.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index e474284e43..9ca47dbf9a 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -40,6 +40,7 @@ #include "hw/i386/apic_internal.h" #include "kvm/kvm_i386.h" #include "migration/vmstate.h" +#include "migration/blocker.h" #include "trace.h" #define S_AW_BITS (VTD_MGAW_FROM_CAP(s->cap) + 1) @@ -3830,6 +3831,8 @@ static int vtd_check_legacy_hdev(IntelIOMMUState *s, return 0; } +static Error *vtd_mig_blocker; + static int vtd_check_iommufd_hdev(IntelIOMMUState *s, IOMMUFDDevice *idev, Error **errp) @@ -3861,8 +3864,17 @@ static int vtd_check_iommufd_hdev(IntelIOMMUState *s, tmp_cap |= VTD_CAP_MGAW(host_mgaw + 1); } - s->cap = tmp_cap; - return 0; + if (s->cap != tmp_cap) { + if (vtd_mig_blocker == NULL) { + error_setg(&vtd_mig_blocker, + "cap/ecap update from host IOMMU block migration"); + ret = migrate_add_blocker(&vtd_mig_blocker, errp); + } + if (!ret) { + s->cap = tmp_cap; + } + } + return ret; } static int vtd_check_hdev(IntelIOMMUState *s, VTDHostIOMMUDevice *vtd_hdev,