From patchwork Thu Mar 9 08:22:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13166963 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A797C64EC4 for ; Thu, 9 Mar 2023 08:25:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230150AbjCIIYg (ORCPT ); Thu, 9 Mar 2023 03:24:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50158 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230427AbjCIIYD (ORCPT ); Thu, 9 Mar 2023 03:24:03 -0500 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1ECA8B9521; Thu, 9 Mar 2023 00:22:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678350168; x=1709886168; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=eosmom8m8jHTqT2C2nazDZTU+HJNVXBxSxneXzpiKIc=; b=LDDuv9eO4ME/7gfcJHfd1FQwcWy7hEN0OjeSUFkELIUkEYnY+hEgsUEX Jlz7DhST17d8dSNRu8wRK/cjZYkgmIA8wri21bejU/IUBlnKtHuV/4OBO sY8Wxl1qDruxcv89Wo0vIoDTOfh505UtcrV04NJsbs2ADdshodtW7hypE o5wRQ4TUtXj57KSfVfyILcP2y4/ZzYO+GAlkHeH2+/T59+lLU0L+uDf2Q VexyIwpR5gZcNPLc835Yi99GLwI4usuJ2hEScgY2dtr8lLcKBFIg5QRca NL1+L2VJiB/AvBe4oah3jvSgUJbs4JOqd/jsss6GxWpZnPXuu27nZzHdU Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10643"; a="364026180" X-IronPort-AV: E=Sophos;i="5.98,245,1673942400"; d="scan'208";a="364026180" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Mar 2023 00:22:45 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10643"; a="787474150" X-IronPort-AV: E=Sophos;i="5.98,245,1673942400"; d="scan'208";a="787474150" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga002.fm.intel.com with ESMTP; 09 Mar 2023 00:22:11 -0800 From: Yi Liu To: joro@8bytes.org, alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com, robin.murphy@arm.com, baolu.lu@linux.intel.com Cc: cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [PATCH v2 1/5] iommufd: Add nesting related data structures for Intel VT-d Date: Thu, 9 Mar 2023 00:22:03 -0800 Message-Id: <20230309082207.612346-2-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230309082207.612346-1-yi.l.liu@intel.com> References: <20230309082207.612346-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Add the following data structures for corresponding ioctls: iommu_hwpt_intel_vtd => IOMMU_HWPT_ALLOC iommu_hw_info_vtd => IOMMU_DEVICE_GET_HW_INFO iommu_hwpt_invalidate_intel_vtd => IOMMU_HWPT_INVALIDATE Also, add IOMMU_HW_INFO_TYPE_INTEL_VTD and IOMMU_HWPT_TYPE_VTD_S1 to the header and corresponding type/size arrays. Signed-off-by: Yi Liu --- drivers/iommu/iommufd/hw_pagetable.c | 7 +- drivers/iommu/iommufd/main.c | 5 + include/uapi/linux/iommufd.h | 136 +++++++++++++++++++++++++++ 3 files changed, 147 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 76ff228dfc1f..36e79db8a17d 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -172,6 +172,7 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, */ static const size_t iommufd_hwpt_alloc_data_size[] = { [IOMMU_HWPT_TYPE_DEFAULT] = 0, + [IOMMU_HWPT_TYPE_VTD_S1] = sizeof(struct iommu_hwpt_intel_vtd), }; /* @@ -180,6 +181,8 @@ static const size_t iommufd_hwpt_alloc_data_size[] = { */ const u64 iommufd_hwpt_type_bitmaps[] = { [IOMMU_HW_INFO_TYPE_DEFAULT] = BIT_ULL(IOMMU_HWPT_TYPE_DEFAULT), + [IOMMU_HW_INFO_TYPE_INTEL_VTD] = BIT_ULL(IOMMU_HWPT_TYPE_DEFAULT) | + BIT_ULL(IOMMU_HWPT_TYPE_VTD_S1), }; /* Return true if type is supported, otherwise false */ @@ -324,7 +327,9 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) * size of page table type specific invalidate_info, indexed by * enum iommu_hwpt_type. */ -static const size_t iommufd_hwpt_invalidate_info_size[] = {}; +static const size_t iommufd_hwpt_invalidate_info_size[] = { + [IOMMU_HWPT_TYPE_VTD_S1] = sizeof(struct iommu_hwpt_invalidate_intel_vtd), +}; int iommufd_hwpt_invalidate(struct iommufd_ucmd *ucmd) { diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index 7ec3ceac01b3..514db4c26927 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -275,6 +275,11 @@ union ucmd_buffer { #ifdef CONFIG_IOMMUFD_TEST struct iommu_test_cmd test; #endif + /* + * data_type specific structure used in the cache invalidation + * path. + */ + struct iommu_hwpt_invalidate_intel_vtd vtd; }; struct iommufd_ioctl_op { diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index e2eff9c56ab3..2a6c326391b2 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -351,9 +351,64 @@ struct iommu_vfio_ioas { /** * enum iommu_hwpt_type - IOMMU HWPT Type * @IOMMU_HWPT_TYPE_DEFAULT: default + * @IOMMU_HWPT_TYPE_VTD_S1: Intel VT-d stage-1 page table */ enum iommu_hwpt_type { IOMMU_HWPT_TYPE_DEFAULT, + IOMMU_HWPT_TYPE_VTD_S1, +}; + +/** + * enum iommu_hwpt_intel_vtd_flags - Intel VT-d stage-1 page + * table entry attributes + * @IOMMU_VTD_PGTBL_SRE: Supervisor request + * @IOMMU_VTD_PGTBL_EAFE: Extended access enable + * @IOMMU_VTD_PGTBL_PCD: Page-level cache disable + * @IOMMU_VTD_PGTBL_PWT: Page-level write through + * @IOMMU_VTD_PGTBL_EMTE: Extended mem type enable + * @IOMMU_VTD_PGTBL_CD: PASID-level cache disable + * @IOMMU_VTD_PGTBL_WPE: Write protect enable + */ +enum iommu_hwpt_intel_vtd_flags { + IOMMU_VTD_PGTBL_SRE = 1 << 0, + IOMMU_VTD_PGTBL_EAFE = 1 << 1, + IOMMU_VTD_PGTBL_PCD = 1 << 2, + IOMMU_VTD_PGTBL_PWT = 1 << 3, + IOMMU_VTD_PGTBL_EMTE = 1 << 4, + IOMMU_VTD_PGTBL_CD = 1 << 5, + IOMMU_VTD_PGTBL_WPE = 1 << 6, + IOMMU_VTD_PGTBL_LAST = 1 << 7, +}; + +/** + * struct iommu_hwpt_intel_vtd - Intel VT-d specific user-managed + * stage-1 page table info + * @flags: Combination of enum iommu_hwpt_intel_vtd_flags + * @pgtbl_addr: The base address of the user-managed stage-1 page table. + * @pat: Page attribute table data to compute effective memory type + * @emt: Extended memory type + * @addr_width: The address width of the untranslated addresses that are + * subjected to the user-managed stage-1 page table. + * @__reserved: Must be 0 + * + * The Intel VT-d specific data for creating hw_pagetable to represent + * the user-managed stage-1 page table that is used in nested translation. + * + * In nested translation, the stage-1 page table locates in the address + * space that defined by the corresponding stage-2 page table. Hence the + * stage-1 page table base address value should not be higher than the + * maximum untranslated address of stage-2 page table. + * + * The paging level of the stage-1 page table should be compataible with + * the hardware iommu. Otherwise, the allocation would be failed. + */ +struct iommu_hwpt_intel_vtd { + __u64 flags; + __u64 pgtbl_addr; + __u32 pat; + __u32 emt; + __u32 addr_width; + __u32 __reserved; }; /** @@ -389,6 +444,8 @@ enum iommu_hwpt_type { * +------------------------------+-------------------------------------+-----------+ * | IOMMU_HWPT_TYPE_DEFAULT | N/A | IOAS | * +------------------------------+-------------------------------------+-----------+ + * | IOMMU_HWPT_TYPE_VTD_S1 | struct iommu_hwpt_intel_vtd | HWPT | + * +------------------------------+-------------------------------------+-----------+ */ struct iommu_hwpt_alloc { __u32 size; @@ -405,9 +462,32 @@ struct iommu_hwpt_alloc { /** * enum iommu_hw_info_type - IOMMU Hardware Info Types + * @IOMMU_HW_INFO_TYPE_INTEL_VTD: Intel VT-d iommu info type */ enum iommu_hw_info_type { IOMMU_HW_INFO_TYPE_DEFAULT, + IOMMU_HW_INFO_TYPE_INTEL_VTD, +}; + +/** + * struct iommu_hw_info_vtd - Intel VT-d hardware information + * + * @flags: Must be set to 0 + * @__reserved: Must be 0 + * + * @cap_reg: Value of Intel VT-d capability register defined in VT-d spec + * section 11.4.2 Capability Register. + * @ecap_reg: Value of Intel VT-d capability register defined in VT-d spec + * section 11.4.3 Extended Capability Register. + * + * User needs to understand the Intel VT-d specification to decode the + * register value. + */ +struct iommu_hw_info_vtd { + __u32 flags; + __u32 __reserved; + __aligned_u64 cap_reg; + __aligned_u64 ecap_reg; }; /** @@ -457,6 +537,60 @@ struct iommu_hw_info { }; #define IOMMU_DEVICE_GET_HW_INFO _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_GET_HW_INFO) +/** + * enum iommu_vtd_qi_granularity - Intel VT-d specific granularity of + * queued invalidation + * @IOMMU_VTD_QI_GRAN_DOMAIN: domain-selective invalidation + * @IOMMU_VTD_QI_GRAN_ADDR: page-selective invalidation + */ +enum iommu_vtd_qi_granularity { + IOMMU_VTD_QI_GRAN_DOMAIN, + IOMMU_VTD_QI_GRAN_ADDR, +}; + +/** + * enum iommu_hwpt_intel_vtd_invalidate_flags - Flags for Intel VT-d + * stage-1 page table cache + * invalidation + * @IOMMU_VTD_QI_FLAGS_LEAF: The LEAF flag indicates whether only the + * leaf PTE caching needs to be invalidated + * and other paging structure caches can be + * preserved. + */ +enum iommu_hwpt_intel_vtd_invalidate_flags { + IOMMU_VTD_QI_FLAGS_LEAF = 1 << 0, +}; + +/** + * struct iommu_hwpt_invalidate_intel_vtd - Intel VT-d cache invalidation info + * @granularity: One of enum iommu_vtd_qi_granularity. + * @flags: Combination of enum iommu_hwpt_intel_vtd_invalidate_flags + * @__reserved: Must be 0 + * @addr: The start address of the addresses to be invalidated. + * @granule_size: Page/block size of the mapping in bytes. It is used to + * compute the invalidation range togehter with @nb_granules. + * @nb_granules: Number of contiguous granules to be invalidated. + * + * The Intel VT-d specific invalidation data for user-managed stage-1 cache + * invalidation under nested translation. Userspace uses this structure to + * tell host about the impacted caches after modifying the stage-1 page table. + * + * @addr, @granule_size and @nb_granules are meaningful when + * @granularity==IOMMU_VTD_QI_GRAN_ADDR. Intel VT-d currently only supports + * 4kB page size, so @granule_size should be 4KB. @addr should be aligned + * with @granule_size * @nb_granules, otherwise invalidation won't take + * effect. + */ +struct iommu_hwpt_invalidate_intel_vtd { + __u8 granularity; + __u8 padding[7]; + __u32 flags; + __u32 __reserved; + __u64 addr; + __u64 granule_size; + __u64 nb_granules; +}; + /** * struct iommu_hwpt_invalidate - ioctl(IOMMU_HWPT_INVALIDATE) * @size: sizeof(struct iommu_hwpt_invalidate) @@ -473,6 +607,8 @@ struct iommu_hw_info { * +==============================+========================================+ * | @data_type | Data structure in @data_uptr | * +------------------------------+----------------------------------------+ + * | IOMMU_HWPT_TYPE_VTD_S1 | struct iommu_hwpt_invalidate_intel_vtd | + * +------------------------------+----------------------------------------+ */ struct iommu_hwpt_invalidate { __u32 size; From patchwork Thu Mar 9 08:22:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13166967 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4722CC61DA4 for ; Thu, 9 Mar 2023 08:25:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229906AbjCIIZJ (ORCPT ); Thu, 9 Mar 2023 03:25:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50450 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229852AbjCIIYS (ORCPT ); Thu, 9 Mar 2023 03:24:18 -0500 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 36CD514EB0; Thu, 9 Mar 2023 00:23:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678350183; x=1709886183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8FKzjXidbvZaREDXMrJoGHp6nNM9UNsEUhh+2h9xdEU=; b=XIXig3ooZb2MMzfG0jTf82bP75lFMg7lxkaYwr0o4/HCJClgpSHn0L8A Rjzj/w23G1AaUbKlde2wgC0Aumi13wv4GCSqTw5wkWh0Tv/d8Zu2Je1uA jV30KBXPDWm/zS26cgrsvzdCSCcgd++GkQioK/omttmFead8IRYV1encZ 3RgsnqQuWV08MznmydJRl+4QygwpTNtJUR3JZiW2cSD+Hth7MuNHbWs7T xWX1DjhSG/ppnXf5MkLp3IXOa58852vEQHfT66mcfP5lqZRBxccP7bYwF Z0yW7hGyo3VnQQTuaQnsIY/J+5qoxm23B7T2MvroCvUayTU7QCiLx1Se8 Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10643"; a="364026207" X-IronPort-AV: E=Sophos;i="5.98,245,1673942400"; d="scan'208";a="364026207" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Mar 2023 00:22:45 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10643"; a="787474152" X-IronPort-AV: E=Sophos;i="5.98,245,1673942400"; d="scan'208";a="787474152" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga002.fm.intel.com with ESMTP; 09 Mar 2023 00:22:12 -0800 From: Yi Liu To: joro@8bytes.org, alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com, robin.murphy@arm.com, baolu.lu@linux.intel.com Cc: cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [PATCH v2 2/5] iommu/vt-d: Implement hw_info for iommu capability query Date: Thu, 9 Mar 2023 00:22:04 -0800 Message-Id: <20230309082207.612346-3-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230309082207.612346-1-yi.l.liu@intel.com> References: <20230309082207.612346-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org From: Lu Baolu Add intel_iommu_hw_info() to report cap_reg and ecap_reg information. Signed-off-by: Lu Baolu Signed-off-by: Yi Liu Signed-off-by: Nicolin Chen Signed-off-by: Yi Sun --- drivers/iommu/intel/iommu.c | 19 +++++++++++++++++++ drivers/iommu/intel/iommu.h | 1 + 2 files changed, 20 insertions(+) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 7c2f4bd33582..004406209793 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4780,8 +4780,26 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid) intel_pasid_tear_down_entry(iommu, dev, pasid, false); } +static void *intel_iommu_hw_info(struct device *dev, u32 *length) +{ + struct device_domain_info *info = dev_iommu_priv_get(dev); + struct intel_iommu *iommu = info->iommu; + struct iommu_hw_info_vtd *vtd; + + vtd = kzalloc(sizeof(*vtd), GFP_KERNEL); + if (!vtd) + return ERR_PTR(-ENOMEM); + + vtd->cap_reg = iommu->cap; + vtd->ecap_reg = iommu->ecap; + *length = sizeof(*vtd); + + return vtd; +} + const struct iommu_ops intel_iommu_ops = { .capable = intel_iommu_capable, + .hw_info = intel_iommu_hw_info, .domain_alloc = intel_iommu_domain_alloc, .probe_device = intel_iommu_probe_device, .probe_finalize = intel_iommu_probe_finalize, @@ -4794,6 +4812,7 @@ const struct iommu_ops intel_iommu_ops = { .def_domain_type = device_def_domain_type, .remove_dev_pasid = intel_iommu_remove_dev_pasid, .pgsize_bitmap = SZ_4K, + .driver_type = IOMMU_HW_INFO_TYPE_INTEL_VTD, #ifdef CONFIG_INTEL_IOMMU_SVM .page_response = intel_svm_page_response, #endif diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index d6df3b865812..9871de2170eb 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -23,6 +23,7 @@ #include #include #include +#include #include #include From patchwork Thu Mar 9 08:22:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13166965 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0F0E4C6FD1F for ; Thu, 9 Mar 2023 08:25:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229772AbjCIIYl (ORCPT ); Thu, 9 Mar 2023 03:24:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50282 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229770AbjCIIYR (ORCPT ); Thu, 9 Mar 2023 03:24:17 -0500 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E317C10242; Thu, 9 Mar 2023 00:22:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678350178; x=1709886178; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=uVvRKgPxKSHuG1EtBteAi9d95M9LHVWKu0TnY2OCClA=; b=b+pW3pVeSZvaQIbw+aoe92CEG4ajrDZYNn9yfwAZzQslw0ojNFIEMSJJ NnCobNuytxWLGJxBLBWBaF/a+1HGw8dhHHL2Ad7p25TB8z9JG/nMHORVN xIXQg4xpxZhBgHn8VRhR3ZJ39rtXNrJU+nc1/0vt+R5VnCiul/Qnnmcru pSmLQRuxzl0JLf1XHT5j7WZCUfq8BqorxkTgrfOBViwnCFvk6VsXiImIj UkTkfBPR/pKf6I1dKoVFIulLwFjwWSxzaSRyuhvDS46E3EV7FNKCuD5WO sZhpw75tMUpNm+Xu11gRCZfOa7WYkD0RtyI31XaA+zmOYlQpYs2AOfh0P w==; X-IronPort-AV: E=McAfee;i="6500,9779,10643"; a="364026221" X-IronPort-AV: E=Sophos;i="5.98,245,1673942400"; d="scan'208";a="364026221" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Mar 2023 00:22:46 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10643"; a="787474154" X-IronPort-AV: E=Sophos;i="5.98,245,1673942400"; d="scan'208";a="787474154" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga002.fm.intel.com with ESMTP; 09 Mar 2023 00:22:13 -0800 From: Yi Liu To: joro@8bytes.org, alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com, robin.murphy@arm.com, baolu.lu@linux.intel.com Cc: cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [PATCH v2 3/5] iommu/vt-d: Extend dmar_domain to support nested domain Date: Thu, 9 Mar 2023 00:22:05 -0800 Message-Id: <20230309082207.612346-4-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230309082207.612346-1-yi.l.liu@intel.com> References: <20230309082207.612346-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org From: Lu Baolu The nested domain fields are exclusive to those that used for a DMA remapping domain. Use union to avoid memory waste. Signed-off-by: Lu Baolu Signed-off-by: Yi Liu --- drivers/iommu/intel/iommu.h | 35 +++++++++++++++++++++++++++++------ 1 file changed, 29 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index 9871de2170eb..9446e17dd202 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -599,15 +599,38 @@ struct dmar_domain { spinlock_t lock; /* Protect device tracking lists */ struct list_head devices; /* all devices' list */ - struct dma_pte *pgd; /* virtual address */ - int gaw; /* max guest address width */ - - /* adjusted guest address width, 0 is level 2 30-bit */ - int agaw; int iommu_superpage;/* Level of superpages supported: 0 == 4KiB (no superpages), 1 == 2MiB, 2 == 1GiB, 3 == 512GiB, 4 == 1TiB */ - u64 max_addr; /* maximum mapped address */ + union { + /* DMA remapping domain */ + struct { + /* virtual address */ + struct dma_pte *pgd; + /* max guest address width */ + int gaw; + /* + * adjusted guest address width: + * 0: level 2 30-bit + * 1: level 3 39-bit + * 2: level 4 48-bit + * 3: level 5 57-bit + */ + int agaw; + /* maximum mapped address */ + u64 max_addr; + }; + + /* Nested user domain */ + struct { + /* 2-level page table the user domain nested */ + struct dmar_domain *s2_domain; + /* user page table pointer (in GPA) */ + unsigned long s1_pgtbl; + /* page table attributes */ + struct iommu_hwpt_intel_vtd s1_cfg; + }; + }; struct iommu_domain domain; /* generic domain data structure for iommu core */ From patchwork Thu Mar 9 08:22:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13166964 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEF1AC6FD19 for ; Thu, 9 Mar 2023 08:25:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229725AbjCIIYh (ORCPT ); Thu, 9 Mar 2023 03:24:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50646 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230431AbjCIIYE (ORCPT ); Thu, 9 Mar 2023 03:24:04 -0500 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B5CF8B9BCF; Thu, 9 Mar 2023 00:22:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678350169; x=1709886169; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Baw+/7QBeZ0YQuTYvP7GMRvLkvX1hO3sRq7+Dy8YvqQ=; b=FsCcgLYsW5z6Cbm2D/9jbTL+ZPBcczGIY6rPb4tJEpLHTvALNwcf9vc8 RGPY3nUdHaaU3FQRy/fjuVYyaP8aBpAA7P/pgB1D+RF2n+kevCr0cO2aH uqAQG3LZHAtLhGzIeGne13yPpXc/HiP33N1IGj802yUHlA8wSKk4UQfWh cFqbf3nXXjKBGS8lFh8oBbnJ5nfbNT+r4ByWhe4iT0givUV/77ACGnn6n WxC/VYcEx04u89yxD8kKuc3uVuGg0XuucKqBa8KcvRBJrwwhx7fGjF6mU R9qrX5v6bq4VtPSNoYpetAnsQK1DJCOiy2/966UY8dQUUVrlKUkCmhn2C w==; X-IronPort-AV: E=McAfee;i="6500,9779,10643"; a="364026195" X-IronPort-AV: E=Sophos;i="5.98,245,1673942400"; d="scan'208";a="364026195" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Mar 2023 00:22:45 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10643"; a="787474157" X-IronPort-AV: E=Sophos;i="5.98,245,1673942400"; d="scan'208";a="787474157" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga002.fm.intel.com with ESMTP; 09 Mar 2023 00:22:14 -0800 From: Yi Liu To: joro@8bytes.org, alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com, robin.murphy@arm.com, baolu.lu@linux.intel.com Cc: cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, Jacob Pan Subject: [PATCH v2 4/5] iommu/vt-d: Add helper to setup pasid nested translation Date: Thu, 9 Mar 2023 00:22:06 -0800 Message-Id: <20230309082207.612346-5-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230309082207.612346-1-yi.l.liu@intel.com> References: <20230309082207.612346-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org From: Lu Baolu The configurations are passed in from the user when the user domain is allocated. This helper interprets these configurations according to the data structure defined in uapi/linux/iommufd.h. The EINVAL error will be returned if any of configurations are not compatible with the hardware capabilities. The caller can retry with another compatible user domain. The encoding of fields of each pasid entry is defined in section 9.6 of the VT-d spec. Signed-off-by: Jacob Pan Signed-off-by: Lu Baolu Signed-off-by: Yi Liu --- drivers/iommu/intel/pasid.c | 142 ++++++++++++++++++++++++++++++++++++ drivers/iommu/intel/pasid.h | 2 + 2 files changed, 144 insertions(+) diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c index 633e0a4a01e7..2e8a912b1499 100644 --- a/drivers/iommu/intel/pasid.c +++ b/drivers/iommu/intel/pasid.c @@ -21,6 +21,11 @@ #include "iommu.h" #include "pasid.h" +#define IOMMU_VTD_PGTBL_MTS_MASK (IOMMU_VTD_PGTBL_CD | \ + IOMMU_VTD_PGTBL_EMTE | \ + IOMMU_VTD_PGTBL_PCD | \ + IOMMU_VTD_PGTBL_PWT) + /* * Intel IOMMU system wide PASID name space: */ @@ -411,6 +416,15 @@ pasid_set_flpm(struct pasid_entry *pe, u64 value) pasid_set_bits(&pe->val[2], GENMASK_ULL(3, 2), value << 2); } +/* + * Setup the Extended Access Flag Enable (EAFE) field (Bit 135) + * of a scalable mode PASID entry. + */ +static inline void pasid_set_eafe(struct pasid_entry *pe) +{ + pasid_set_bits(&pe->val[2], 1 << 7, 1 << 7); +} + static void pasid_cache_invalidation_with_pasid(struct intel_iommu *iommu, u16 did, u32 pasid) @@ -756,3 +770,131 @@ void intel_pasid_setup_page_snoop_control(struct intel_iommu *iommu, if (!cap_caching_mode(iommu->cap)) devtlb_invalidation_with_pasid(iommu, dev, pasid); } + +/** + * intel_pasid_setup_nested() - Set up PASID entry for nested translation. + * This could be used for guest shared virtual address. In this case, the + * first level page tables are used for GVA-GPA translation in the guest, + * second level page tables are used for GPA-HPA translation. + * + * @iommu: IOMMU which the device belong to + * @dev: Device to be set up for translation + * @pasid: PASID to be programmed in the device PASID table + * @domain: User domain nested on a s2 domain + */ +int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev, + u32 pasid, struct dmar_domain *domain) +{ + struct iommu_hwpt_intel_vtd *s1_cfg = &domain->s1_cfg; + pgd_t *s1_gpgd = (pgd_t *)(uintptr_t)domain->s1_pgtbl; + struct dmar_domain *s2_domain = domain->s2_domain; + u16 did = domain_id_iommu(domain, iommu); + struct dma_pte *pgd = s2_domain->pgd; + struct pasid_entry *pte; + int agaw; + + if (!ecap_nest(iommu->ecap)) { + pr_err_ratelimited("%s: No nested translation support\n", + iommu->name); + return -ENODEV; + } + + /* + * Sanity checking performed by caller to make sure address width + * matching in two dimensions: CPU vs. IOMMU, guest vs. host. + */ + switch (s1_cfg->addr_width) { + case ADDR_WIDTH_4LEVEL: + break; +#ifdef CONFIG_X86 + case ADDR_WIDTH_5LEVEL: + if (!cpu_feature_enabled(X86_FEATURE_LA57) || + !cap_fl5lp_support(iommu->cap)) { + dev_err_ratelimited(dev, + "5-level paging not supported\n"); + return -EINVAL; + } + break; +#endif + default: + dev_err_ratelimited(dev, "Invalid guest address width %d\n", + s1_cfg->addr_width); + return -EINVAL; + } + + if ((s1_cfg->flags & IOMMU_VTD_PGTBL_SRE) && !ecap_srs(iommu->ecap)) { + pr_err_ratelimited("No supervisor request support on %s\n", + iommu->name); + return -EINVAL; + } + + if ((s1_cfg->flags & IOMMU_VTD_PGTBL_EAFE) && !ecap_eafs(iommu->ecap)) { + pr_err_ratelimited("No extended access flag support on %s\n", + iommu->name); + return -EINVAL; + } + + /* + * Memory type is only applicable to devices inside processor coherent + * domain. Will add MTS support once coherent devices are available. + */ + if (s1_cfg->flags & IOMMU_VTD_PGTBL_MTS_MASK) { + pr_warn_ratelimited("No memory type support %s\n", + iommu->name); + return -EINVAL; + } + + agaw = iommu_skip_agaw(s2_domain, iommu, &pgd); + if (agaw < 0) { + dev_err_ratelimited(dev, "Invalid domain page table\n"); + return -EINVAL; + } + + /* First level PGD (in GPA) must be supported by the second level. */ + if ((uintptr_t)s1_gpgd > (1ULL << s2_domain->gaw)) { + dev_err_ratelimited(dev, + "Guest PGD %lx not supported, max %llx\n", + (uintptr_t)s1_gpgd, s2_domain->max_addr); + return -EINVAL; + } + + spin_lock(&iommu->lock); + pte = intel_pasid_get_entry(dev, pasid); + if (!pte) { + spin_unlock(&iommu->lock); + return -ENODEV; + } + if (pasid_pte_is_present(pte)) { + spin_unlock(&iommu->lock); + return -EBUSY; + } + + pasid_clear_entry(pte); + + if (s1_cfg->addr_width == ADDR_WIDTH_5LEVEL) + pasid_set_flpm(pte, 1); + + pasid_set_flptr(pte, (uintptr_t)s1_gpgd); + + if (s1_cfg->flags & IOMMU_VTD_PGTBL_SRE) { + pasid_set_sre(pte); + if (s1_cfg->flags & IOMMU_VTD_PGTBL_WPE) + pasid_set_wpe(pte); + } + + if (s1_cfg->flags & IOMMU_VTD_PGTBL_EAFE) + pasid_set_eafe(pte); + + pasid_set_slptr(pte, virt_to_phys(pgd)); + pasid_set_fault_enable(pte); + pasid_set_domain_id(pte, did); + pasid_set_address_width(pte, agaw); + pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap)); + pasid_set_translation_type(pte, PASID_ENTRY_PGTT_NESTED); + pasid_set_present(pte); + spin_unlock(&iommu->lock); + + pasid_flush_caches(iommu, pte, pasid, did); + + return 0; +} diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h index 20c54e50f533..2a72bbc79532 100644 --- a/drivers/iommu/intel/pasid.h +++ b/drivers/iommu/intel/pasid.h @@ -118,6 +118,8 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu, int intel_pasid_setup_pass_through(struct intel_iommu *iommu, struct dmar_domain *domain, struct device *dev, u32 pasid); +int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev, + u32 pasid, struct dmar_domain *domain); void intel_pasid_tear_down_entry(struct intel_iommu *iommu, struct device *dev, u32 pasid, bool fault_ignore); From patchwork Thu Mar 9 08:22:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13166968 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E23DC64EC4 for ; Thu, 9 Mar 2023 08:25:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229717AbjCIIZJ (ORCPT ); Thu, 9 Mar 2023 03:25:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50478 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229919AbjCIIYU (ORCPT ); Thu, 9 Mar 2023 03:24:20 -0500 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 36BE214EAB; Thu, 9 Mar 2023 00:23:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678350183; x=1709886183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/mmYgR+89ByI+p2Zr/OCn4pE2TGeXtXXbNuoqCFUfCE=; b=YHz2A9Olouj6Zdn/uHh1CNiozWJ9aHR+/ysxavHGibpEqqpsScHkWezP GIJORlUs99UFYhP/NEsaMVozatW4Prabk5hjM6shGmvFYK7Soeem0tncn ssyL98RvKFIZekLqht1LkGx5fAazsBcWFrnMbTdtEeMp1DFjAzXgefv5F 5f1uzdplw3golqEh4c6vPDE2T+tLhpFtVfX1KFyBKuxrw4qjBYQi/IdGL 1g8sNLRfot61zplo8+vzuOLx/RNyENtqOi5nf+Y2BDcBKi0zkxQ3sUqN7 EcKqCpxZK5yYFTZXQalwFlGmYDkLlOjjMeirMWiWmoxjFaGtCVYD7afgI w==; X-IronPort-AV: E=McAfee;i="6500,9779,10643"; a="364026232" X-IronPort-AV: E=Sophos;i="5.98,245,1673942400"; d="scan'208";a="364026232" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Mar 2023 00:22:46 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10643"; a="787474161" X-IronPort-AV: E=Sophos;i="5.98,245,1673942400"; d="scan'208";a="787474161" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga002.fm.intel.com with ESMTP; 09 Mar 2023 00:22:14 -0800 From: Yi Liu To: joro@8bytes.org, alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com, robin.murphy@arm.com, baolu.lu@linux.intel.com Cc: cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, Jacob Pan Subject: [PATCH v2 5/5] iommu/vt-d: Add nested domain support Date: Thu, 9 Mar 2023 00:22:07 -0800 Message-Id: <20230309082207.612346-6-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230309082207.612346-1-yi.l.liu@intel.com> References: <20230309082207.612346-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org From: Lu Baolu This adds nested domain support in the Intel IOMMU driver. It allows to allocate and free a nested domain, set the nested domain to a device, and synchronize the caches when the userspace managed page tables are updated. Signed-off-by: Jacob Pan Signed-off-by: Lu Baolu Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu --- drivers/iommu/intel/Makefile | 2 +- drivers/iommu/intel/iommu.c | 38 ++++++---- drivers/iommu/intel/iommu.h | 15 ++++ drivers/iommu/intel/nested.c | 143 +++++++++++++++++++++++++++++++++++ 4 files changed, 182 insertions(+), 16 deletions(-) create mode 100644 drivers/iommu/intel/nested.c diff --git a/drivers/iommu/intel/Makefile b/drivers/iommu/intel/Makefile index 7af3b8a4f2a0..5dabf081a779 100644 --- a/drivers/iommu/intel/Makefile +++ b/drivers/iommu/intel/Makefile @@ -1,6 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_DMAR_TABLE) += dmar.o -obj-$(CONFIG_INTEL_IOMMU) += iommu.o pasid.o +obj-$(CONFIG_INTEL_IOMMU) += iommu.o pasid.o nested.o obj-$(CONFIG_DMAR_TABLE) += trace.o cap_audit.o obj-$(CONFIG_DMAR_PERF) += perf.o obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += debugfs.o diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 004406209793..89cc7095afed 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -277,7 +277,6 @@ static LIST_HEAD(dmar_satc_units); #define for_each_rmrr_units(rmrr) \ list_for_each_entry(rmrr, &dmar_rmrr_units, list) -static void device_block_translation(struct device *dev); static void intel_iommu_domain_free(struct iommu_domain *domain); int dmar_disabled = !IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_ON); @@ -555,7 +554,7 @@ static unsigned long domain_super_pgsize_bitmap(struct dmar_domain *domain) } /* Some capabilities may be different across iommus */ -static void domain_update_iommu_cap(struct dmar_domain *domain) +void domain_update_iommu_cap(struct dmar_domain *domain) { domain_update_iommu_coherency(domain); domain->iommu_superpage = domain_update_iommu_superpage(domain, NULL); @@ -1498,10 +1497,10 @@ static void iommu_flush_dev_iotlb(struct dmar_domain *domain, spin_unlock_irqrestore(&domain->lock, flags); } -static void iommu_flush_iotlb_psi(struct intel_iommu *iommu, - struct dmar_domain *domain, - unsigned long pfn, unsigned int pages, - int ih, int map) +void iommu_flush_iotlb_psi(struct intel_iommu *iommu, + struct dmar_domain *domain, + unsigned long pfn, unsigned int pages, + int ih, int map) { unsigned int aligned_pages = __roundup_pow_of_two(pages); unsigned int mask = ilog2(aligned_pages); @@ -1573,7 +1572,7 @@ static inline void __mapping_notify_one(struct intel_iommu *iommu, iommu_flush_write_buffer(iommu); } -static void intel_flush_iotlb_all(struct iommu_domain *domain) +void intel_flush_iotlb_all(struct iommu_domain *domain) { struct dmar_domain *dmar_domain = to_dmar_domain(domain); struct iommu_domain_info *info; @@ -1765,8 +1764,7 @@ static struct dmar_domain *alloc_domain(unsigned int type) return domain; } -static int domain_attach_iommu(struct dmar_domain *domain, - struct intel_iommu *iommu) +int domain_attach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu) { struct iommu_domain_info *info, *curr; unsigned long ndomains; @@ -1815,8 +1813,7 @@ static int domain_attach_iommu(struct dmar_domain *domain, return ret; } -static void domain_detach_iommu(struct dmar_domain *domain, - struct intel_iommu *iommu) +void domain_detach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu) { struct iommu_domain_info *info; @@ -4103,7 +4100,7 @@ static void dmar_remove_one_dev_info(struct device *dev) * all DMA requests without PASID from the device are blocked. If the page * table has been set, clean up the data structures. */ -static void device_block_translation(struct device *dev) +void device_block_translation(struct device *dev) { struct device_domain_info *info = dev_iommu_priv_get(dev); struct intel_iommu *iommu = info->iommu; @@ -4204,14 +4201,24 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type) return NULL; } +static struct iommu_domain * +intel_iommu_domain_alloc_user(struct device *dev, struct iommu_domain *parent, + const void *user_data) +{ + if (parent) + return intel_nested_domain_alloc(parent, user_data); + else + return iommu_domain_alloc(dev->bus); +} + static void intel_iommu_domain_free(struct iommu_domain *domain) { if (domain != &si_domain->domain && domain != &blocking_domain) domain_exit(to_dmar_domain(domain)); } -static int prepare_domain_attach_device(struct iommu_domain *domain, - struct device *dev) +int prepare_domain_attach_device(struct iommu_domain *domain, + struct device *dev) { struct dmar_domain *dmar_domain = to_dmar_domain(domain); struct intel_iommu *iommu; @@ -4450,7 +4457,7 @@ static void domain_set_force_snooping(struct dmar_domain *domain) PASID_RID2PASID); } -static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain) +bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain) { struct dmar_domain *dmar_domain = to_dmar_domain(domain); unsigned long flags; @@ -4801,6 +4808,7 @@ const struct iommu_ops intel_iommu_ops = { .capable = intel_iommu_capable, .hw_info = intel_iommu_hw_info, .domain_alloc = intel_iommu_domain_alloc, + .domain_alloc_user = intel_iommu_domain_alloc_user, .probe_device = intel_iommu_probe_device, .probe_finalize = intel_iommu_probe_finalize, .release_device = intel_iommu_release_device, diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index 9446e17dd202..661fb5050e2d 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -848,6 +848,19 @@ void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did, u64 granu, int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc, unsigned int count, unsigned long options); +int domain_attach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu); +void domain_detach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu); +void device_block_translation(struct device *dev); +int prepare_domain_attach_device(struct iommu_domain *domain, + struct device *dev); +bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain); +void iommu_flush_iotlb_psi(struct intel_iommu *iommu, + struct dmar_domain *domain, + unsigned long pfn, unsigned int pages, + int ih, int map); +void intel_flush_iotlb_all(struct iommu_domain *domain); +void domain_update_iommu_cap(struct dmar_domain *domain); + /* * Options used in qi_submit_sync: * QI_OPT_WAIT_DRAIN - Wait for PRQ drain completion, spec 6.5.2.8. @@ -860,6 +873,8 @@ void *alloc_pgtable_page(int node, gfp_t gfp); void free_pgtable_page(void *vaddr); void iommu_flush_write_buffer(struct intel_iommu *iommu); struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn); +struct iommu_domain *intel_nested_domain_alloc(struct iommu_domain *s2_domain, + const void *data); #ifdef CONFIG_INTEL_IOMMU_SVM extern void intel_svm_check(struct intel_iommu *iommu); diff --git a/drivers/iommu/intel/nested.c b/drivers/iommu/intel/nested.c new file mode 100644 index 000000000000..64441d11a84d --- /dev/null +++ b/drivers/iommu/intel/nested.c @@ -0,0 +1,143 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * nested.c - nested mode translation support + * + * Copyright (C) 2023 Intel Corporation + * + * Author: Lu Baolu + * Jacob Pan + */ + +#define pr_fmt(fmt) "DMAR: " fmt + +#include +#include +#include + +#include "iommu.h" +#include "pasid.h" + +static int intel_nested_attach_dev(struct iommu_domain *domain, + struct device *dev) +{ + struct device_domain_info *info = dev_iommu_priv_get(dev); + struct dmar_domain *dmar_domain = to_dmar_domain(domain); + struct intel_iommu *iommu = info->iommu; + unsigned long flags; + int ret = 0; + + if (info->domain) + device_block_translation(dev); + + /* Is s2_domain compatible with this IOMMU? */ + ret = prepare_domain_attach_device(&dmar_domain->s2_domain->domain, dev); + if (ret) { + dev_err_ratelimited(dev, "s2 domain is not compatible\n"); + return ret; + } + + ret = domain_attach_iommu(dmar_domain, iommu); + if (ret) { + dev_err_ratelimited(dev, "Failed to attach domain to iommu\n"); + return ret; + } + + ret = intel_pasid_setup_nested(iommu, dev, + PASID_RID2PASID, dmar_domain); + if (ret) { + domain_detach_iommu(dmar_domain, iommu); + dev_err_ratelimited(dev, "Failed to setup pasid entry\n"); + return ret; + } + + info->domain = dmar_domain; + spin_lock_irqsave(&dmar_domain->lock, flags); + list_add(&info->link, &dmar_domain->devices); + spin_unlock_irqrestore(&dmar_domain->lock, flags); + domain_update_iommu_cap(dmar_domain); + + return 0; +} + +static void intel_nested_domain_free(struct iommu_domain *domain) +{ + kfree(to_dmar_domain(domain)); +} + +static void intel_nested_invalidate(struct device *dev, + struct dmar_domain *domain, + void *user_data) +{ + struct iommu_hwpt_invalidate_intel_vtd *inv_info = user_data; + struct device_domain_info *info = dev_iommu_priv_get(dev); + struct intel_iommu *iommu = info->iommu; + + if (WARN_ON(!user_data)) + return; + + switch (inv_info->granularity) { + case IOMMU_VTD_QI_GRAN_ADDR: + if (inv_info->granule_size != VTD_PAGE_SIZE || + !IS_ALIGNED(inv_info->addr, VTD_PAGE_SIZE)) { + dev_err_ratelimited(dev, "Invalid invalidation address 0x%llx\n", + inv_info->addr); + return; + } + + iommu_flush_iotlb_psi(iommu, domain, + inv_info->addr >> VTD_PAGE_SHIFT, + inv_info->nb_granules, 1, 0); + break; + case IOMMU_VTD_QI_GRAN_DOMAIN: + intel_flush_iotlb_all(&domain->domain); + break; + default: + dev_err_ratelimited(dev, "Unsupported IOMMU invalidation type %d\n", + inv_info->granularity); + break; + } +} + +static void intel_nested_cache_invalidate_user(struct iommu_domain *domain, + void *user_data) +{ + struct dmar_domain *dmar_domain = to_dmar_domain(domain); + struct device_domain_info *info; + unsigned long flags; + + spin_lock_irqsave(&dmar_domain->lock, flags); + list_for_each_entry(info, &dmar_domain->devices, link) + intel_nested_invalidate(info->dev, dmar_domain, + user_data); + spin_unlock_irqrestore(&dmar_domain->lock, flags); +} + +static const struct iommu_domain_ops intel_nested_domain_ops = { + .attach_dev = intel_nested_attach_dev, + .cache_invalidate_user = intel_nested_cache_invalidate_user, + .free = intel_nested_domain_free, + .enforce_cache_coherency = intel_iommu_enforce_cache_coherency, +}; + +struct iommu_domain *intel_nested_domain_alloc(struct iommu_domain *s2_domain, + const void *user_data) +{ + const struct iommu_hwpt_intel_vtd *vtd = user_data; + struct dmar_domain *domain; + + domain = kzalloc(sizeof(*domain), GFP_KERNEL_ACCOUNT); + if (!domain) + return NULL; + + domain->use_first_level = true; + domain->s2_domain = to_dmar_domain(s2_domain); + domain->s1_pgtbl = vtd->pgtbl_addr; + domain->s1_cfg = *vtd; + domain->domain.ops = &intel_nested_domain_ops; + domain->domain.type = IOMMU_DOMAIN_NESTED; + INIT_LIST_HEAD(&domain->devices); + spin_lock_init(&domain->lock); + xa_init(&domain->iommu_array); + + return &domain->domain; +}