From patchwork Thu Jun 1 06:33:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13263084 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0C2C3C7EE23 for ; Thu, 1 Jun 2023 06:47:12 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1q4c4p-0004Hv-D8; Thu, 01 Jun 2023 02:46:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1q4c4n-0004H7-JD for qemu-devel@nongnu.org; Thu, 01 Jun 2023 02:46:25 -0400 Received: from mga07.intel.com ([134.134.136.100]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1q4c4l-000762-Sy for qemu-devel@nongnu.org; Thu, 01 Jun 2023 02:46:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685601983; x=1717137983; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=KihC+dgXoto8EsPZ7B7+rYy+MeCePUN/hrcJxPBQAUs=; b=XLkwc+GQR4k6bwZUi9qT7gFW00X1+AEJtpOvbH+30D4TJ53+kv7v347R oiHd7A3lq3co7uLGAkb/x7VmbMQPLW9vN+Ju+F7iwoY5zADBAYhFZYqVE g9yGhIqj6yoVoy/X5/iwqUEMAChO5WAYlf9lbjgdwkHpFk06pGSWFzoiB l0k+6g/3gG06+w0iS6lS4oK7p48Re1xradTyKnn9y3oAsAAeVHSOF1Lql tarYX6BoJ77EZEBe4Ztr/Dvj1QMKcrGNAFz3Alq86pWy1+09cNYWzxJDl 7kTy7GPHp6qaTx8dW1P7meu/3OXA0eKi9SlZaHmJns8hasEa74JmokbOt w==; X-IronPort-AV: E=McAfee;i="6600,9927,10727"; a="421249321" X-IronPort-AV: E=Sophos;i="6.00,209,1681196400"; d="scan'208";a="421249321" Received: from 
fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 May 2023 23:46:20 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=McAfee;i="6600,9927,10727"; a="953953058"
X-IronPort-AV: E=Sophos;i="6.00,209,1681196400"; d="scan'208";a="953953058"
Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 May 2023 23:46:17 -0700
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: mst@redhat.com, peterx@redhat.com, jasowang@redhat.com,
 pbonzini@redhat.com, richard.henderson@linaro.org, eduardo@habkost.net,
 marcel.apfelbaum@gmail.com, alex.williamson@redhat.com, clg@redhat.com,
 david@redhat.com, philmd@linaro.org, kwankhede@nvidia.com, cjia@nvidia.com,
 yi.l.liu@intel.com, chao.p.peng@intel.com
Subject: [PATCH v2 1/4] util: Add iova_tree_foreach_range_data
Date: Thu, 1 Jun 2023 14:33:17 +0800
Message-Id: <20230601063320.139308-2-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230601063320.139308-1-zhenzhong.duan@intel.com>
References: <20230601063320.139308-1-zhenzhong.duan@intel.com>
MIME-Version: 1.0
Received-SPF: pass client-ip=134.134.136.100; envelope-from=zhenzhong.duan@intel.com; helo=mga07.intel.com
X-Spam_score_int: -45
X-Spam_score: -4.6
X-Spam_bar: ----
X-Spam_report: (-4.6 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.163, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

This function is a variant of iova_tree_foreach that supports traversing a
range and invoking a callback with private data.

Signed-off-by: Zhenzhong Duan
Reviewed-by: Peter Xu
---
 include/qemu/iova-tree.h | 17 +++++++++++++++--
 util/iova-tree.c         | 31 +++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/qemu/iova-tree.h b/include/qemu/iova-tree.h
index 8528e5c98fbc..8bbf049dc3f7 100644
--- a/include/qemu/iova-tree.h
+++ b/include/qemu/iova-tree.h
@@ -39,6 +39,7 @@ typedef struct DMAMap {
     IOMMUAccessFlags perm;
 } QEMU_PACKED DMAMap;
 typedef gboolean (*iova_tree_iterator)(DMAMap *map);
+typedef gboolean (*iova_tree_iterator_2)(DMAMap *map, gpointer *private);
 
 /**
  * iova_tree_new:
@@ -131,11 +132,23 @@ const DMAMap *iova_tree_find_address(const IOVATree *tree, hwaddr iova);
  * @iterator: the interator for the mappings, return true to stop
  *
  * Iterate over the iova tree.
- *
- * Return: 1 if found any overlap, 0 if not, <0 if error.
  */
 void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator);
 
+/**
+ * iova_tree_foreach_range_data:
+ *
+ * @tree: the iova tree to iterate on
+ * @range: the iova range to iterate in
+ * @func: the iterator for the mappings, return true to stop
+ * @private: parameter passed to @func
+ *
+ * Iterate over an iova range in iova tree.
+ */
+void iova_tree_foreach_range_data(IOVATree *tree, DMAMap *range,
+                                  iova_tree_iterator_2 func,
+                                  gpointer *private);
+
 /**
  * iova_tree_alloc_map:
  *
diff --git a/util/iova-tree.c b/util/iova-tree.c
index 536789797e47..a3cbd5198410 100644
--- a/util/iova-tree.c
+++ b/util/iova-tree.c
@@ -42,6 +42,12 @@ typedef struct IOVATreeFindIOVAArgs {
     const DMAMap *result;
 } IOVATreeFindIOVAArgs;
 
+typedef struct IOVATreeIterator {
+    DMAMap *range;
+    iova_tree_iterator_2 func;
+    gpointer *private;
+} IOVATreeIterator;
+
 /**
  * Iterate args to the next hole
  *
@@ -164,6 +170,31 @@ void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator)
     g_tree_foreach(tree->tree, iova_tree_traverse, iterator);
 }
 
+static gboolean iova_tree_traverse_range(gpointer key, gpointer value,
+                                         gpointer data)
+{
+    DMAMap *map = key;
+    IOVATreeIterator *iterator = data;
+    DMAMap *range = iterator->range;
+
+    g_assert(key == value);
+
+    if (iova_tree_compare(map, range, NULL)) {
+        return false;
+    }
+
+    return iterator->func(map, iterator->private);
+}
+
+void iova_tree_foreach_range_data(IOVATree *tree, DMAMap *range,
+                                  iova_tree_iterator_2 func,
+                                  gpointer *private)
+{
+    IOVATreeIterator iterator = {range, func, private};
+
+    g_tree_foreach(tree->tree, iova_tree_traverse_range, &iterator);
+}
+
 void iova_tree_remove(IOVATree *tree, DMAMap map)
 {
     const DMAMap *overlap;

From patchwork Thu Jun 1 06:33:18 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13263083
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: mst@redhat.com, peterx@redhat.com, jasowang@redhat.com,
 pbonzini@redhat.com, richard.henderson@linaro.org, eduardo@habkost.net,
 marcel.apfelbaum@gmail.com, alex.williamson@redhat.com, clg@redhat.com,
 david@redhat.com, philmd@linaro.org, kwankhede@nvidia.com, cjia@nvidia.com,
 yi.l.liu@intel.com, chao.p.peng@intel.com
Subject: [PATCH v2 2/4] intel_iommu: Fix a potential issue in VFIO dirty page sync
Date: Thu, 1 Jun 2023 14:33:18 +0800
Message-Id: <20230601063320.139308-3-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230601063320.139308-1-zhenzhong.duan@intel.com>
References: <20230601063320.139308-1-zhenzhong.duan@intel.com>

Peter Xu found a potential issue:

"The other thing is when I am looking at the new code I found that we
actually extended the replay() to be used also in dirty tracking of vfio,
in vfio_sync_dirty_bitmap(). For that maybe it's already broken if
unmap_all() because afaiu log_sync() can be called in migration thread
anytime during DMA so I think it means the device is prone to DMA with the
IOMMU pgtable quickly erased and rebuilt here, which means the DMA could
fail unexpectedly. Copy Alex, Kirti and Neo."

To eliminate this small window with empty mapping, we should remove the
call to unmap_all().
Besides that, introduce a new notifier type called IOMMU_NOTIFIER_FULL_MAP
to get full mappings, as intel_iommu only notifies changed mappings while
VFIO dirty page sync needs full mappings. Thanks to the current
implementation of the iova tree, we can pick mappings from the iova tree
directly instead of walking through the guest IOMMU page table.

IOMMU_NOTIFIER_MAP is still used to get changed mappings for optimization
purposes. As long as the notification for IOMMU_NOTIFIER_MAP keeps the
shadow page table in sync, that is OK.

Signed-off-by: Zhenzhong Duan
---
 hw/i386/intel_iommu.c | 49 +++++++++++++++++++++++++++++++++++--------
 hw/vfio/common.c      |  2 +-
 include/exec/memory.h | 13 ++++++++++++
 softmmu/memory.c      |  4 ++++
 4 files changed, 58 insertions(+), 10 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 94d52f4205d2..061fcded0dfb 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3819,6 +3819,41 @@ static int vtd_replay_hook(IOMMUTLBEvent *event, void *private)
     return 0;
 }
 
+static gboolean vtd_replay_full_map(DMAMap *map, gpointer *private)
+{
+    IOMMUTLBEvent event;
+
+    event.type = IOMMU_NOTIFIER_MAP;
+    event.entry.iova = map->iova;
+    event.entry.addr_mask = map->size;
+    event.entry.target_as = &address_space_memory;
+    event.entry.perm = map->perm;
+    event.entry.translated_addr = map->translated_addr;
+
+    return vtd_replay_hook(&event, private);
+}
+
+/*
+ * This is a fast path to notify the full mappings falling in the scope
+ * of IOMMU notifier. The call site should ensure no iova tree update by
+ * taking necessary locks(e.x. BQL).
+ */
+static int vtd_page_walk_full_map_fast_path(IOVATree *iova_tree,
+                                            IOMMUNotifier *n)
+{
+    DMAMap map;
+
+    map.iova = n->start;
+    map.size = n->end - n->start;
+    if (!iova_tree_find(iova_tree, &map)) {
+        return 0;
+    }
+
+    iova_tree_foreach_range_data(iova_tree, &map, vtd_replay_full_map,
+                                 (gpointer *)n);
+    return 0;
+}
+
 static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
 {
     VTDAddressSpace *vtd_as = container_of(iommu_mr, VTDAddressSpace, iommu);
@@ -3826,13 +3861,6 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
     uint8_t bus_n = pci_bus_num(vtd_as->bus);
     VTDContextEntry ce;
 
-    /*
-     * The replay can be triggered by either a invalidation or a newly
-     * created entry. No matter what, we release existing mappings
-     * (it means flushing caches for UNMAP-only registers).
-     */
-    vtd_address_space_unmap(vtd_as, n);
-
     if (vtd_dev_to_context_entry(s, bus_n, vtd_as->devfn, &ce) == 0) {
         trace_vtd_replay_ce_valid(s->root_scalable ? "scalable mode" :
                                   "legacy mode",
@@ -3850,8 +3878,11 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
                 .as = vtd_as,
                 .domain_id = vtd_get_domain_id(s, &ce, vtd_as->pasid),
             };
-
-            vtd_page_walk(s, &ce, 0, ~0ULL, &info, vtd_as->pasid);
+            if (n->notifier_flags & IOMMU_NOTIFIER_FULL_MAP) {
+                vtd_page_walk_full_map_fast_path(vtd_as->iova_tree, n);
+            } else {
+                vtd_page_walk(s, &ce, 0, ~0ULL, &info, vtd_as->pasid);
+            }
         }
     } else {
         trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn),
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 78358ede2764..5dae4502b908 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1890,7 +1890,7 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *container,
 
             iommu_notifier_init(&gdn.n,
                                 vfio_iommu_map_dirty_notify,
-                                IOMMU_NOTIFIER_MAP,
+                                IOMMU_NOTIFIER_FULL_MAP,
                                 section->offset_within_region,
                                 int128_get64(llend),
                                 idx);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index c3661b2276c7..eecc3eec6702 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -142,6 +142,10 @@ struct IOMMUTLBEntry {
  *     events (e.g. VFIO). Both notifications must be accurate so that
  *     the shadow page table is fully in sync with the guest view.
  *
+ *     Besides MAP, there is a special use case called FULL_MAP which
+ *     requests notification for all the existent mappings (e.g. VFIO
+ *     dirty page sync).
+ *
  * (2) When the device doesn't need accurate synchronizations of the
  *     vIOMMU page tables, it needs to register only with UNMAP or
  *     DEVIOTLB_UNMAP notifies.
@@ -164,6 +168,8 @@ typedef enum {
     IOMMU_NOTIFIER_MAP = 0x2,
     /* Notify changes on device IOTLB entries */
     IOMMU_NOTIFIER_DEVIOTLB_UNMAP = 0x04,
+    /* Notify every existent entries */
+    IOMMU_NOTIFIER_FULL_MAP = 0x8,
 } IOMMUNotifierFlag;
 
 #define IOMMU_NOTIFIER_IOTLB_EVENTS (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
@@ -237,6 +243,13 @@ static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
                                        hwaddr start, hwaddr end,
                                        int iommu_idx)
 {
+    /*
+     * memory_region_notify_iommu_one() needs IOMMU_NOTIFIER_MAP set to
+     * trigger notifier.
+     */
+    if (flags & IOMMU_NOTIFIER_FULL_MAP) {
+        flags |= IOMMU_NOTIFIER_MAP;
+    }
     n->notify = fn;
     n->notifier_flags = flags;
     n->start = start;
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 7d9494ce7028..0a8465007c66 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1922,6 +1922,10 @@ int memory_region_register_iommu_notifier(MemoryRegion *mr,
     assert(n->iommu_idx >= 0 &&
            n->iommu_idx < memory_region_iommu_num_indexes(iommu_mr));
 
+    if (n->notifier_flags & IOMMU_NOTIFIER_FULL_MAP) {
+        error_setg(errp, "FULL_MAP could only be used in replay");
+    }
+
     QLIST_INSERT_HEAD(&iommu_mr->iommu_notify, n, node);
     ret = memory_region_update_iommu_notify_flags(iommu_mr, errp);
     if (ret) {

From patchwork Thu Jun 1 06:33:19 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13263081
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: mst@redhat.com, peterx@redhat.com, jasowang@redhat.com,
 pbonzini@redhat.com, richard.henderson@linaro.org, eduardo@habkost.net,
 marcel.apfelbaum@gmail.com, alex.williamson@redhat.com, clg@redhat.com,
 david@redhat.com, philmd@linaro.org, kwankhede@nvidia.com, cjia@nvidia.com,
 yi.l.liu@intel.com, chao.p.peng@intel.com
Subject: [PATCH v2 3/4] memory: Document update on replay()
Date: Thu, 1 Jun 2023 14:33:19 +0800
Message-Id: <20230601063320.139308-4-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230601063320.139308-1-zhenzhong.duan@intel.com>
References: <20230601063320.139308-1-zhenzhong.duan@intel.com>

Currently the replay() callback is documented as having exactly the same
semantics as memory_region_iommu_replay(). A custom replay() may provide
some degree of optimization depending on the notifier's type. E.g. in
intel_iommu, IOMMU_NOTIFIER_MAP is optimized to provide only changed
entries.

Clarify the semantics of replay() and provide more flexibility.

Signed-off-by: Zhenzhong Duan
---
 include/exec/memory.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index eecc3eec6702..9a523ef62a94 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -441,9 +441,9 @@ struct IOMMUMemoryRegionClass {
      * call the IOMMU translate method for every page in the address space
      * with flag == IOMMU_NONE and then call the notifier if translate
      * returns a valid mapping. If this method is implemented then it
-     * overrides the default behaviour, and must provide the full semantics
-     * of memory_region_iommu_replay(), by calling @notifier for every
-     * translation present in the IOMMU.
+     * overrides the default behavior, and must provide corresponding
+     * semantics depending on notifier's type, e.g. IOMMU_NOTIFIER_MAP,
+     * notify changed entries; IOMMU_NOTIFIER_FULL_MAP, notify full entries.
      *
      * Optional method -- an IOMMU only needs to provide this method
      * if the default is inefficient or produces undesirable side effects.
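
The flag-dependent replay() semantics documented above can be illustrated with a small standalone sketch. It is not QEMU code: the `Sketch*` and `sketch_*` names, the `changed` field, and the array standing in for the IOMMU's recorded mappings are all hypothetical. Only the two flag values (0x2 for MAP, 0x8 for FULL_MAP) are taken from this series, and the sketch assumes, as patch 2/4 does in iommu_notifier_init(), that FULL_MAP implies MAP.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values mirror the series; the SKETCH_ names are hypothetical. */
enum {
    SKETCH_NOTIFIER_MAP      = 0x2,
    SKETCH_NOTIFIER_FULL_MAP = 0x8,
};

/* Hypothetical recorded mapping. */
typedef struct {
    unsigned long iova;
    int changed;    /* assumption: nonzero if changed since the last sync */
} SketchEntry;

/* FULL_MAP implies MAP, as iommu_notifier_init() does in patch 2/4. */
static int sketch_normalize_flags(int flags)
{
    if (flags & SKETCH_NOTIFIER_FULL_MAP) {
        flags |= SKETCH_NOTIFIER_MAP;
    }
    return flags;
}

/*
 * Return how many entries a replay would notify: every entry for
 * FULL_MAP, only changed entries for plain MAP, none otherwise.
 */
static size_t sketch_replay(const SketchEntry *entries, size_t n, int flags)
{
    size_t notified = 0;

    flags = sketch_normalize_flags(flags);
    if (!(flags & SKETCH_NOTIFIER_MAP)) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        if ((flags & SKETCH_NOTIFIER_FULL_MAP) || entries[i].changed) {
            notified++;
        }
    }
    return notified;
}
```

With three recorded entries of which one has changed, a MAP-only replay in this sketch notifies one entry, while a FULL_MAP replay notifies all three.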
From patchwork Thu Jun 1 06:33:20 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13263085
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: mst@redhat.com, peterx@redhat.com, jasowang@redhat.com,
 pbonzini@redhat.com, richard.henderson@linaro.org, eduardo@habkost.net,
 marcel.apfelbaum@gmail.com, alex.williamson@redhat.com, clg@redhat.com,
 david@redhat.com, philmd@linaro.org, kwankhede@nvidia.com, cjia@nvidia.com,
 yi.l.liu@intel.com, chao.p.peng@intel.com
Subject: [PATCH v2 4/4] intel_iommu: Optimize out some unnecessary UNMAP calls
Date: Thu, 1 Jun 2023 14:33:20 +0800
Message-Id: <20230601063320.139308-5-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230601063320.139308-1-zhenzhong.duan@intel.com>
References: <20230601063320.139308-1-zhenzhong.duan@intel.com>

Commit 63b88968f1 ("intel-iommu: rework the page walk
logic") adds an IOVA tree to cache mapped ranges so we only need to send
MAP or UNMAP when there are changes. But there is still a corner case of
unnecessary UNMAP.

During invalidation, either domain or device selective, we only need to
unmap when there are recorded mapped IOVA ranges, presuming most OSes
allocate IOVA ranges contiguously, e.g. on x86, Linux sets up mappings
from 0xffffffff downwards.

Strace shows each UNMAP ioctl taking about 0.000014 s (14 us), and there
are 28 such ioctl() calls in one invalidation, as the two notifiers on x86
are split into power-of-2 pieces.

ioctl(48, VFIO_IOMMU_UNMAP_DMA, 0x7ffffd5c42f0) = 0 <0.000014>

The other purpose of this patch is to eliminate a noisy error log when
working with IOMMUFD. It appears that a duplicate UNMAP call fails with
IOMMUFD but always succeeds with the legacy container. This behavior
difference leads to the error log below for IOMMUFD:

IOMMU_IOAS_UNMAP failed: No such file or directory
vfio_container_dma_unmap(0x562012d6b6d0, 0x0, 0x80000000) = -2 (No such file or directory)
IOMMU_IOAS_UNMAP failed: No such file or directory
vfio_container_dma_unmap(0x562012d6b6d0, 0x80000000, 0x40000000) = -2 (No such file or directory)
...
Signed-off-by: Zhenzhong Duan
---
 hw/i386/intel_iommu.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 061fcded0dfb..a5fd144aa246 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3743,6 +3743,7 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
     hwaddr start = n->start;
     hwaddr end = n->end;
     IntelIOMMUState *s = as->iommu_state;
+    IOMMUTLBEvent event;
     DMAMap map;
 
     /*
@@ -3762,22 +3763,25 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
     assert(start <= end);
     size = remain = end - start + 1;
 
+    event.type = IOMMU_NOTIFIER_UNMAP;
+    event.entry.target_as = &address_space_memory;
+    event.entry.perm = IOMMU_NONE;
+    /* This field is meaningless for unmap */
+    event.entry.translated_addr = 0;
+
     while (remain >= VTD_PAGE_SIZE) {
-        IOMMUTLBEvent event;
         uint64_t mask = dma_aligned_pow2_mask(start, end, s->aw_bits);
         uint64_t size = mask + 1;
 
         assert(size);
 
-        event.type = IOMMU_NOTIFIER_UNMAP;
-        event.entry.iova = start;
-        event.entry.addr_mask = mask;
-        event.entry.target_as = &address_space_memory;
-        event.entry.perm = IOMMU_NONE;
-        /* This field is meaningless for unmap */
-        event.entry.translated_addr = 0;
-
-        memory_region_notify_iommu_one(n, &event);
+        map.iova = start;
+        map.size = mask;
+        if (iova_tree_find(as->iova_tree, &map)) {
+            event.entry.iova = start;
+            event.entry.addr_mask = mask;
+            memory_region_notify_iommu_one(n, &event);
+        }
 
         start += size;
         remain -= size;
@@ -3791,7 +3795,7 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
                              n->start, size);
 
     map.iova = n->start;
-    map.size = size;
+    map.size = size - 1; /* Inclusive */
     iova_tree_remove(as->iova_tree, map);
 }
 
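
The chunked UNMAP logic in this patch can be sketched outside QEMU as follows. This is a simplified illustration under stated assumptions, not the actual implementation: a linear scan over an array stands in for the IOVA tree lookup, `sketch_pow2_mask()` is a simplified stand-in for dma_aligned_pow2_mask() that ignores the address-width limit, and all `Sketch*` and `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a recorded mapping; both bounds are inclusive. */
typedef struct {
    uint64_t iova;
    uint64_t last;
} SketchRange;

/* Return true if [iova, last] overlaps any recorded range (linear scan). */
static bool sketch_overlaps(const SketchRange *maps, size_t n,
                            uint64_t iova, uint64_t last)
{
    for (size_t i = 0; i < n; i++) {
        if (maps[i].iova <= last && iova <= maps[i].last) {
            return true;
        }
    }
    return false;
}

/*
 * Mask of the largest naturally aligned power-of-2 chunk that starts at
 * @start and does not run past @end (inclusive).
 */
static uint64_t sketch_pow2_mask(uint64_t start, uint64_t end)
{
    uint64_t span = end - start;    /* inclusive length minus one */
    uint64_t mask = 0;

    /* Grow the mask while start stays aligned and the chunk still fits. */
    while (mask < span) {
        uint64_t next = (mask << 1) | 1;

        if ((start & next) != 0 || next > span) {
            break;
        }
        mask = next;
    }
    return mask;
}

/*
 * Walk [start, end] in aligned power-of-2 chunks and count how many would
 * be sent as UNMAP (chunks overlapping a recorded mapping) versus skipped.
 */
static void sketch_unmap_range(const SketchRange *maps, size_t n,
                               uint64_t start, uint64_t end,
                               unsigned *sent, unsigned *skipped)
{
    *sent = 0;
    *skipped = 0;
    assert(start <= end);

    for (;;) {
        uint64_t mask = sketch_pow2_mask(start, end);
        uint64_t last = start + mask;

        if (sketch_overlaps(maps, n, start, last)) {
            (*sent)++;
        } else {
            (*skipped)++;
        }
        if (last == end) {
            break;
        }
        start = last + 1;
    }
}
```

For the range [0x1000, 0xffff] with a single recorded mapping at [0xf000, 0xffff], the walk in this sketch produces four aligned chunks. Only the last chunk overlaps the mapping, so one UNMAP would be sent and three skipped.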