From patchwork Mon Oct 16 08:31:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13422726
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun, Zhenzhong Duan, Nicholas Piggin, Daniel Henrique Barboza, David Gibson, Harsh Prateek Bora, qemu-ppc@nongnu.org (open list:sPAPR (pseries))
Subject: [PATCH v2 01/27] vfio: Rename VFIOContainer into VFIOLegacyContainer
Date: Mon, 16 Oct 2023 16:31:57 +0800
Message-Id: <20231016083223.1519410-2-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com>
References: <20231016083223.1519410-1-zhenzhong.duan@intel.com>

From: Eric Auger

In the prospect to
introduce a base object for the VFIOContainer and derive into the existing legacy container and the iommufd based container, let's rename the existing one into VFIOLegacyContainer. This is just an incremental step to ease the migration. Soon there won't be any reference to the legacy container in the common.c code. Only the container.c should handle the VFIOLegacyContainer object. No functional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 46 ++++++++++++------------- hw/vfio/common.c | 63 ++++++++++++++++++++--------------- hw/vfio/container.c | 45 +++++++++++++------------ hw/vfio/spapr.c | 12 +++---- 4 files changed, 89 insertions(+), 77 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 7780b9073a..34648e518e 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -74,13 +74,13 @@ typedef struct VFIOMigration { typedef struct VFIOAddressSpace { AddressSpace *as; - QLIST_HEAD(, VFIOContainer) containers; + QLIST_HEAD(, VFIOLegacyContainer) containers; QLIST_ENTRY(VFIOAddressSpace) list; } VFIOAddressSpace; struct VFIOGroup; -typedef struct VFIOContainer { +typedef struct VFIOLegacyContainer { VFIOAddressSpace *space; int fd; /* /dev/vfio/vfio, empowered by the attached groups */ MemoryListener listener; @@ -97,12 +97,12 @@ typedef struct VFIOContainer { QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; - QLIST_ENTRY(VFIOContainer) next; + QLIST_ENTRY(VFIOLegacyContainer) next; QLIST_HEAD(, VFIODevice) device_list; -} VFIOContainer; +} VFIOLegacyContainer; typedef struct VFIOGuestIOMMU { - VFIOContainer *container; + VFIOLegacyContainer *container; IOMMUMemoryRegion *iommu_mr; hwaddr iommu_offset; IOMMUNotifier n; @@ -110,7 +110,7 @@ typedef struct VFIOGuestIOMMU { } VFIOGuestIOMMU; typedef struct 
VFIORamDiscardListener { - VFIOContainer *container; + VFIOLegacyContainer *container; MemoryRegion *mr; hwaddr offset_within_address_space; hwaddr size; @@ -133,7 +133,7 @@ typedef struct VFIODevice { QLIST_ENTRY(VFIODevice) container_next; QLIST_ENTRY(VFIODevice) global_next; struct VFIOGroup *group; - VFIOContainer *container; + VFIOLegacyContainer *container; char *sysfsdev; char *name; DeviceState *dev; @@ -167,7 +167,7 @@ struct VFIODeviceOps { typedef struct VFIOGroup { int fd; int groupid; - VFIOContainer *container; + VFIOLegacyContainer *container; QLIST_HEAD(, VFIODevice) device_list; QLIST_ENTRY(VFIOGroup) next; QLIST_ENTRY(VFIOGroup) container_next; @@ -206,28 +206,28 @@ typedef struct { hwaddr pages; } VFIOBitmap; -void vfio_host_win_add(VFIOContainer *container, +void vfio_host_win_add(VFIOLegacyContainer *container, hwaddr min_iova, hwaddr max_iova, uint64_t iova_pgsizes); -int vfio_host_win_del(VFIOContainer *container, hwaddr min_iova, +int vfio_host_win_del(VFIOLegacyContainer *container, hwaddr min_iova, hwaddr max_iova); VFIOAddressSpace *vfio_get_address_space(AddressSpace *as); void vfio_put_address_space(VFIOAddressSpace *space); -bool vfio_devices_all_running_and_saving(VFIOContainer *container); +bool vfio_devices_all_running_and_saving(VFIOLegacyContainer *container); /* container->fd */ -int vfio_dma_unmap(VFIOContainer *container, hwaddr iova, +int vfio_dma_unmap(VFIOLegacyContainer *container, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb); -int vfio_dma_map(VFIOContainer *container, hwaddr iova, +int vfio_dma_map(VFIOLegacyContainer *container, hwaddr iova, ram_addr_t size, void *vaddr, bool readonly); -int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start); -int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap, +int vfio_set_dirty_page_tracking(VFIOLegacyContainer *container, bool start); +int vfio_query_dirty_bitmap(VFIOLegacyContainer *container, VFIOBitmap *vbmap, hwaddr iova, hwaddr size); 
-int vfio_container_add_section_window(VFIOContainer *container, +int vfio_container_add_section_window(VFIOLegacyContainer *container, MemoryRegionSection *section, Error **errp); -void vfio_container_del_section_window(VFIOContainer *container, +void vfio_container_del_section_window(VFIOLegacyContainer *container, MemoryRegionSection *section); void vfio_disable_irqindex(VFIODevice *vbasedev, int index); @@ -290,21 +290,21 @@ vfio_get_cap(void *ptr, uint32_t cap_offset, uint16_t id); #endif extern const MemoryListener vfio_prereg_listener; -int vfio_spapr_create_window(VFIOContainer *container, +int vfio_spapr_create_window(VFIOLegacyContainer *container, MemoryRegionSection *section, hwaddr *pgsize); -int vfio_spapr_remove_window(VFIOContainer *container, +int vfio_spapr_remove_window(VFIOLegacyContainer *container, hwaddr offset_within_address_space); bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp); void vfio_migration_exit(VFIODevice *vbasedev); int vfio_bitmap_alloc(VFIOBitmap *vbmap, hwaddr size); -bool vfio_devices_all_running_and_mig_active(VFIOContainer *container); -bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container); -int vfio_devices_query_dirty_bitmap(VFIOContainer *container, +bool vfio_devices_all_running_and_mig_active(VFIOLegacyContainer *container); +bool vfio_devices_all_device_dirty_tracking(VFIOLegacyContainer *container); +int vfio_devices_query_dirty_bitmap(VFIOLegacyContainer *container, VFIOBitmap *vbmap, hwaddr iova, hwaddr size); -int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova, +int vfio_get_dirty_bitmap(VFIOLegacyContainer *container, uint64_t iova, uint64_t size, ram_addr_t ram_addr); #endif /* HW_VFIO_VFIO_COMMON_H */ diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 5ff5acf1d8..b51ef3a15a 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -184,7 +184,7 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev) migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P; } 
-static bool vfio_devices_all_dirty_tracking(VFIOContainer *container) +static bool vfio_devices_all_dirty_tracking(VFIOLegacyContainer *container) { VFIODevice *vbasedev; MigrationState *ms = migrate_get_current(); @@ -210,7 +210,7 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container) return true; } -bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container) +bool vfio_devices_all_device_dirty_tracking(VFIOLegacyContainer *container) { VFIODevice *vbasedev; @@ -227,7 +227,7 @@ bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container) * Check if all VFIO devices are running and migration is active, which is * essentially equivalent to the migration being in pre-copy phase. */ -bool vfio_devices_all_running_and_mig_active(VFIOContainer *container) +bool vfio_devices_all_running_and_mig_active(VFIOLegacyContainer *container) { VFIODevice *vbasedev; @@ -252,7 +252,7 @@ bool vfio_devices_all_running_and_mig_active(VFIOContainer *container) return true; } -void vfio_host_win_add(VFIOContainer *container, hwaddr min_iova, +void vfio_host_win_add(VFIOLegacyContainer *container, hwaddr min_iova, hwaddr max_iova, uint64_t iova_pgsizes) { VFIOHostDMAWindow *hostwin; @@ -274,7 +274,7 @@ void vfio_host_win_add(VFIOContainer *container, hwaddr min_iova, QLIST_INSERT_HEAD(&container->hostwin_list, hostwin, hostwin_next); } -int vfio_host_win_del(VFIOContainer *container, +int vfio_host_win_del(VFIOLegacyContainer *container, hwaddr min_iova, hwaddr max_iova) { VFIOHostDMAWindow *hostwin; @@ -337,7 +337,7 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) { VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n); - VFIOContainer *container = giommu->container; + VFIOLegacyContainer *container = giommu->container; hwaddr iova = iotlb->iova + giommu->iommu_offset; void *vaddr; int ret; @@ -441,7 +441,7 @@ static int 
vfio_ram_discard_notify_populate(RamDiscardListener *rdl, return 0; } -static void vfio_register_ram_discard_listener(VFIOContainer *container, +static void vfio_register_ram_discard_listener(VFIOLegacyContainer *container, MemoryRegionSection *section) { RamDiscardManager *rdm = memory_region_get_ram_discard_manager(section->mr); @@ -515,7 +515,7 @@ static void vfio_register_ram_discard_listener(VFIOContainer *container, } } -static void vfio_unregister_ram_discard_listener(VFIOContainer *container, +static void vfio_unregister_ram_discard_listener(VFIOLegacyContainer *container, MemoryRegionSection *section) { RamDiscardManager *rdm = memory_region_get_ram_discard_manager(section->mr); @@ -538,7 +538,7 @@ static void vfio_unregister_ram_discard_listener(VFIOContainer *container, g_free(vrdl); } -static VFIOHostDMAWindow *vfio_find_hostwin(VFIOContainer *container, +static VFIOHostDMAWindow *vfio_find_hostwin(VFIOLegacyContainer *container, hwaddr iova, hwaddr end) { VFIOHostDMAWindow *hostwin; @@ -599,7 +599,7 @@ static bool vfio_listener_valid_section(MemoryRegionSection *section, return true; } -static bool vfio_get_section_iova_range(VFIOContainer *container, +static bool vfio_get_section_iova_range(VFIOLegacyContainer *container, MemoryRegionSection *section, hwaddr *out_iova, hwaddr *out_end, Int128 *out_llend) @@ -627,7 +627,9 @@ static bool vfio_get_section_iova_range(VFIOContainer *container, static void vfio_listener_region_add(MemoryListener *listener, MemoryRegionSection *section) { - VFIOContainer *container = container_of(listener, VFIOContainer, listener); + VFIOLegacyContainer *container = container_of(listener, + VFIOLegacyContainer, + listener); hwaddr iova, end; Int128 llend, llsize; void *vaddr; @@ -788,7 +790,9 @@ fail: static void vfio_listener_region_del(MemoryListener *listener, MemoryRegionSection *section) { - VFIOContainer *container = container_of(listener, VFIOContainer, listener); + VFIOLegacyContainer *container = 
container_of(listener, + VFIOLegacyContainer, + listener); hwaddr iova, end; Int128 llend, llsize; int ret; @@ -881,13 +885,13 @@ typedef struct VFIODirtyRanges { } VFIODirtyRanges; typedef struct VFIODirtyRangesListener { - VFIOContainer *container; + VFIOLegacyContainer *container; VFIODirtyRanges ranges; MemoryListener listener; } VFIODirtyRangesListener; static bool vfio_section_is_vfio_pci(MemoryRegionSection *section, - VFIOContainer *container) + VFIOLegacyContainer *container) { VFIOPCIDevice *pcidev; VFIODevice *vbasedev; @@ -966,7 +970,7 @@ static const MemoryListener vfio_dirty_tracking_listener = { .region_add = vfio_dirty_tracking_update, }; -static void vfio_dirty_tracking_init(VFIOContainer *container, +static void vfio_dirty_tracking_init(VFIOLegacyContainer *container, VFIODirtyRanges *ranges) { VFIODirtyRangesListener dirty; @@ -991,7 +995,7 @@ static void vfio_dirty_tracking_init(VFIOContainer *container, memory_listener_unregister(&dirty.listener); } -static void vfio_devices_dma_logging_stop(VFIOContainer *container) +static void vfio_devices_dma_logging_stop(VFIOLegacyContainer *container) { uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature), sizeof(uint64_t))] = {}; @@ -1016,7 +1020,7 @@ static void vfio_devices_dma_logging_stop(VFIOContainer *container) } static struct vfio_device_feature * -vfio_device_feature_dma_logging_start_create(VFIOContainer *container, +vfio_device_feature_dma_logging_start_create(VFIOLegacyContainer *container, VFIODirtyRanges *tracking) { struct vfio_device_feature *feature; @@ -1089,7 +1093,7 @@ static void vfio_device_feature_dma_logging_start_destroy( g_free(feature); } -static int vfio_devices_dma_logging_start(VFIOContainer *container) +static int vfio_devices_dma_logging_start(VFIOLegacyContainer *container) { struct vfio_device_feature *feature; VFIODirtyRanges ranges; @@ -1130,7 +1134,9 @@ out: static void vfio_listener_log_global_start(MemoryListener *listener) { - VFIOContainer *container = 
container_of(listener, VFIOContainer, listener); + VFIOLegacyContainer *container = container_of(listener, + VFIOLegacyContainer, + listener); int ret; if (vfio_devices_all_device_dirty_tracking(container)) { @@ -1148,7 +1154,9 @@ static void vfio_listener_log_global_start(MemoryListener *listener) static void vfio_listener_log_global_stop(MemoryListener *listener) { - VFIOContainer *container = container_of(listener, VFIOContainer, listener); + VFIOLegacyContainer *container = container_of(listener, + VFIOLegacyContainer, + listener); int ret = 0; if (vfio_devices_all_device_dirty_tracking(container)) { @@ -1190,7 +1198,7 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova, return 0; } -int vfio_devices_query_dirty_bitmap(VFIOContainer *container, +int vfio_devices_query_dirty_bitmap(VFIOLegacyContainer *container, VFIOBitmap *vbmap, hwaddr iova, hwaddr size) { @@ -1213,7 +1221,7 @@ int vfio_devices_query_dirty_bitmap(VFIOContainer *container, return 0; } -int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova, +int vfio_get_dirty_bitmap(VFIOLegacyContainer *container, uint64_t iova, uint64_t size, ram_addr_t ram_addr) { bool all_device_dirty_tracking = @@ -1265,7 +1273,7 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) vfio_giommu_dirty_notifier *gdn = container_of(n, vfio_giommu_dirty_notifier, n); VFIOGuestIOMMU *giommu = gdn->giommu; - VFIOContainer *container = giommu->container; + VFIOLegacyContainer *container = giommu->container; hwaddr iova = iotlb->iova + giommu->iommu_offset; ram_addr_t translated_addr; int ret = -EINVAL; @@ -1313,7 +1321,8 @@ static int vfio_ram_discard_get_dirty_bitmap(MemoryRegionSection *section, return vfio_get_dirty_bitmap(vrdl->container, iova, size, ram_addr); } -static int vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainer *container, +static int +vfio_sync_ram_discard_listener_dirty_bitmap(VFIOLegacyContainer *container, MemoryRegionSection 
*section) { RamDiscardManager *rdm = memory_region_get_ram_discard_manager(section->mr); @@ -1340,7 +1349,7 @@ static int vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainer *container, &vrdl); } -static int vfio_sync_dirty_bitmap(VFIOContainer *container, +static int vfio_sync_dirty_bitmap(VFIOLegacyContainer *container, MemoryRegionSection *section) { ram_addr_t ram_addr; @@ -1386,7 +1395,9 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *container, static void vfio_listener_log_sync(MemoryListener *listener, MemoryRegionSection *section) { - VFIOContainer *container = container_of(listener, VFIOContainer, listener); + VFIOLegacyContainer *container = container_of(listener, + VFIOLegacyContainer, + listener); int ret; if (vfio_listener_skipped_section(section)) { diff --git a/hw/vfio/container.c b/hw/vfio/container.c index adc467210f..8fde302ae9 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -42,7 +42,8 @@ VFIOGroupList vfio_group_list = QLIST_HEAD_INITIALIZER(vfio_group_list); -static int vfio_ram_block_discard_disable(VFIOContainer *container, bool state) +static int vfio_ram_block_discard_disable(VFIOLegacyContainer *container, + bool state) { switch (container->iommu_type) { case VFIO_TYPE1v2_IOMMU: @@ -65,7 +66,7 @@ static int vfio_ram_block_discard_disable(VFIOContainer *container, bool state) } } -static int vfio_dma_unmap_bitmap(VFIOContainer *container, +static int vfio_dma_unmap_bitmap(VFIOLegacyContainer *container, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb) { @@ -120,7 +121,7 @@ unmap_exit: /* * DMA - Mapping and unmapping for the "type1" IOMMU interface used on x86 */ -int vfio_dma_unmap(VFIOContainer *container, hwaddr iova, +int vfio_dma_unmap(VFIOLegacyContainer *container, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb) { struct vfio_iommu_type1_dma_unmap unmap = { @@ -175,7 +176,7 @@ int vfio_dma_unmap(VFIOContainer *container, hwaddr iova, return 0; } -int vfio_dma_map(VFIOContainer *container, hwaddr iova, 
+int vfio_dma_map(VFIOLegacyContainer *container, hwaddr iova, ram_addr_t size, void *vaddr, bool readonly) { struct vfio_iommu_type1_dma_map map = { @@ -205,7 +206,7 @@ int vfio_dma_map(VFIOContainer *container, hwaddr iova, return -errno; } -int vfio_container_add_section_window(VFIOContainer *container, +int vfio_container_add_section_window(VFIOLegacyContainer *container, MemoryRegionSection *section, Error **errp) { @@ -273,7 +274,7 @@ int vfio_container_add_section_window(VFIOContainer *container, return 0; } -void vfio_container_del_section_window(VFIOContainer *container, +void vfio_container_del_section_window(VFIOLegacyContainer *container, MemoryRegionSection *section) { if (container->iommu_type != VFIO_SPAPR_TCE_v2_IOMMU) { @@ -291,7 +292,7 @@ void vfio_container_del_section_window(VFIOContainer *container, } } -int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start) +int vfio_set_dirty_page_tracking(VFIOLegacyContainer *container, bool start) { int ret; struct vfio_iommu_type1_dirty_bitmap dirty = { @@ -318,7 +319,7 @@ int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start) return ret; } -int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap, +int vfio_query_dirty_bitmap(VFIOLegacyContainer *container, VFIOBitmap *vbmap, hwaddr iova, hwaddr size) { struct vfio_iommu_type1_dirty_bitmap *dbitmap; @@ -355,7 +356,7 @@ int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap, return ret; } -static void vfio_listener_release(VFIOContainer *container) +static void vfio_listener_release(VFIOLegacyContainer *container) { memory_listener_unregister(&container->listener); if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) { @@ -415,7 +416,7 @@ static void vfio_kvm_device_del_group(VFIOGroup *group) /* * vfio_get_iommu_type - selects the richest iommu_type (v2 first) */ -static int vfio_get_iommu_type(VFIOContainer *container, +static int vfio_get_iommu_type(VFIOLegacyContainer *container, Error 
**errp) { int iommu_types[] = { VFIO_TYPE1v2_IOMMU, VFIO_TYPE1_IOMMU, @@ -431,7 +432,7 @@ static int vfio_get_iommu_type(VFIOContainer *container, return -EINVAL; } -static int vfio_init_container(VFIOContainer *container, int group_fd, +static int vfio_init_container(VFIOLegacyContainer *container, int group_fd, Error **errp) { int iommu_type, ret; @@ -466,7 +467,7 @@ static int vfio_init_container(VFIOContainer *container, int group_fd, return 0; } -static int vfio_get_iommu_info(VFIOContainer *container, +static int vfio_get_iommu_info(VFIOLegacyContainer *container, struct vfio_iommu_type1_info **info) { @@ -510,7 +511,7 @@ vfio_get_iommu_info_cap(struct vfio_iommu_type1_info *info, uint16_t id) return NULL; } -static void vfio_get_iommu_info_migration(VFIOContainer *container, +static void vfio_get_iommu_info_migration(VFIOLegacyContainer *container, struct vfio_iommu_type1_info *info) { struct vfio_info_cap_header *hdr; @@ -538,7 +539,7 @@ static void vfio_get_iommu_info_migration(VFIOContainer *container, static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, Error **errp) { - VFIOContainer *container; + VFIOLegacyContainer *container; int ret, fd; VFIOAddressSpace *space; @@ -778,7 +779,7 @@ put_space_exit: static void vfio_disconnect_container(VFIOGroup *group) { - VFIOContainer *container = group->container; + VFIOLegacyContainer *container = group->container; QLIST_REMOVE(group, container_next); group->container = NULL; @@ -978,7 +979,7 @@ static void vfio_put_base_device(VFIODevice *vbasedev) /* * Interfaces for IBM EEH (Enhanced Error Handling) */ -static bool vfio_eeh_container_ok(VFIOContainer *container) +static bool vfio_eeh_container_ok(VFIOLegacyContainer *container) { /* * As of 2016-03-04 (linux-4.5) the host kernel EEH/VFIO @@ -1006,7 +1007,7 @@ static bool vfio_eeh_container_ok(VFIOContainer *container) return true; } -static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op) +static int 
vfio_eeh_container_op(VFIOLegacyContainer *container, uint32_t op) { struct vfio_eeh_pe_op pe_op = { .argsz = sizeof(pe_op), @@ -1029,10 +1030,10 @@ static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op) return ret; } -static VFIOContainer *vfio_eeh_as_container(AddressSpace *as) +static VFIOLegacyContainer *vfio_eeh_as_container(AddressSpace *as) { VFIOAddressSpace *space = vfio_get_address_space(as); - VFIOContainer *container = NULL; + VFIOLegacyContainer *container = NULL; if (QLIST_EMPTY(&space->containers)) { /* No containers to act on */ @@ -1057,14 +1058,14 @@ out: bool vfio_eeh_as_ok(AddressSpace *as) { - VFIOContainer *container = vfio_eeh_as_container(as); + VFIOLegacyContainer *container = vfio_eeh_as_container(as); return (container != NULL) && vfio_eeh_container_ok(container); } int vfio_eeh_as_op(AddressSpace *as, uint32_t op) { - VFIOContainer *container = vfio_eeh_as_container(as); + VFIOLegacyContainer *container = vfio_eeh_as_container(as); if (!container) { return -ENODEV; @@ -1109,7 +1110,7 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev, int groupid = vfio_device_groupid(vbasedev, errp); VFIODevice *vbasedev_iter; VFIOGroup *group; - VFIOContainer *container; + VFIOLegacyContainer *container; int ret; if (groupid < 0) { diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index 9ec1e95f6d..683252c506 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -39,8 +39,8 @@ static void *vfio_prereg_gpa_to_vaddr(MemoryRegionSection *section, hwaddr gpa) static void vfio_prereg_listener_region_add(MemoryListener *listener, MemoryRegionSection *section) { - VFIOContainer *container = container_of(listener, VFIOContainer, - prereg_listener); + VFIOLegacyContainer *container = container_of(listener, VFIOLegacyContainer, + prereg_listener); const hwaddr gpa = section->offset_within_address_space; hwaddr end; int ret; @@ -97,8 +97,8 @@ static void vfio_prereg_listener_region_add(MemoryListener *listener, static void 
vfio_prereg_listener_region_del(MemoryListener *listener, MemoryRegionSection *section) { - VFIOContainer *container = container_of(listener, VFIOContainer, - prereg_listener); + VFIOLegacyContainer *container = container_of(listener, VFIOLegacyContainer, + prereg_listener); const hwaddr gpa = section->offset_within_address_space; hwaddr end; int ret; @@ -141,7 +141,7 @@ const MemoryListener vfio_prereg_listener = { .region_del = vfio_prereg_listener_region_del, }; -int vfio_spapr_create_window(VFIOContainer *container, +int vfio_spapr_create_window(VFIOLegacyContainer *container, MemoryRegionSection *section, hwaddr *pgsize) { @@ -233,7 +233,7 @@ int vfio_spapr_create_window(VFIOContainer *container, return 0; } -int vfio_spapr_remove_window(VFIOContainer *container, +int vfio_spapr_remove_window(VFIOLegacyContainer *container, hwaddr offset_within_address_space) { struct vfio_iommu_spapr_tce_remove remove = {

From patchwork Mon Oct 16 08:31:58 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13422728
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun, Zhenzhong Duan
Subject: [PATCH v2 02/27] vfio: Introduce base object for VFIOContainer and targetted interface
Date: Mon, 16 Oct 2023 16:31:58 +0800
Message-Id: <20231016083223.1519410-3-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To:
 <20231016083223.1519410-1-zhenzhong.duan@intel.com>
References: <20231016083223.1519410-1-zhenzhong.duan@intel.com>

From: Eric Auger

Introduce a dumb VFIOContainer base object and its targeted interface. This is deliberately not a QOM object because we don't want it to be visible from the user interface. The VFIOContainer will be populated incrementally in subsequent patches, as will its interface.

No functional change intended.
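For readers new to the pattern, the layering can be sketched roughly as follows. This is only an illustration under assumed names (BaseContainer, LegacyContainer, BackendOps and the helper functions are all hypothetical and are not the QEMU definitions): a backend embeds the base object as a member, common code dispatches through the ops table, and each callback recovers its own structure with container_of().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the QEMU API. */
typedef struct BaseContainer BaseContainer;

/* Backend callbacks: each backend supplies one table of these. */
typedef struct BackendOps {
    int (*dma_map)(BaseContainer *bc, uint64_t iova, uint64_t size);
    int (*dma_unmap)(BaseContainer *bc, uint64_t iova, uint64_t size);
} BackendOps;

/* Base object: holds only what every backend shares. */
struct BaseContainer {
    const BackendOps *ops;
};

/* A backend embeds the base object and adds its private state. */
typedef struct LegacyContainer {
    BaseContainer bcontainer;
    int fd;
    uint64_t mapped_bytes;   /* illustrative bookkeeping only */
} LegacyContainer;

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static int legacy_dma_map(BaseContainer *bc, uint64_t iova, uint64_t size)
{
    LegacyContainer *c = container_of(bc, LegacyContainer, bcontainer);

    (void)iova;              /* a real backend would issue an ioctl here */
    c->mapped_bytes += size;
    return 0;
}

static int legacy_dma_unmap(BaseContainer *bc, uint64_t iova, uint64_t size)
{
    LegacyContainer *c = container_of(bc, LegacyContainer, bcontainer);

    (void)iova;
    if (size > c->mapped_bytes) {
        return -1;           /* refuse to unmap more than was mapped */
    }
    c->mapped_bytes -= size;
    return 0;
}

static const BackendOps legacy_ops = {
    .dma_map = legacy_dma_map,
    .dma_unmap = legacy_dma_unmap,
};

/* Common code sees only the base object and dispatches through ops. */
int container_dma_map(BaseContainer *bc, uint64_t iova, uint64_t size)
{
    assert(bc->ops && bc->ops->dma_map);
    return bc->ops->dma_map(bc, iova, size);
}

int container_dma_unmap(BaseContainer *bc, uint64_t iova, uint64_t size)
{
    assert(bc->ops && bc->ops->dma_unmap);
    return bc->ops->dma_unmap(bc, iova, size);
}

void legacy_container_init(LegacyContainer *c, int fd)
{
    c->bcontainer.ops = &legacy_ops;
    c->fd = fd;
    c->mapped_bytes = 0;
}
```

A caller would initialise a LegacyContainer with legacy_container_init() and then pass &c.bcontainer to container_dma_map(); a second backend would only need its own structure and ops table, leaving the common helpers untouched.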
Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 8 +-- include/hw/vfio/vfio-container-base.h | 82 +++++++++++++++++++++++++++ 2 files changed, 84 insertions(+), 6 deletions(-) create mode 100644 include/hw/vfio/vfio-container-base.h diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 34648e518e..9651cf921c 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -30,6 +30,7 @@ #include #endif #include "sysemu/sysemu.h" +#include "hw/vfio/vfio-container-base.h" #define VFIO_MSG_PREFIX "vfio %s: " @@ -81,6 +82,7 @@ typedef struct VFIOAddressSpace { struct VFIOGroup; typedef struct VFIOLegacyContainer { + VFIOContainer bcontainer; VFIOAddressSpace *space; int fd; /* /dev/vfio/vfio, empowered by the attached groups */ MemoryListener listener; @@ -200,12 +202,6 @@ typedef struct VFIODisplay { } dmabuf; } VFIODisplay; -typedef struct { - unsigned long *bitmap; - hwaddr size; - hwaddr pages; -} VFIOBitmap; - void vfio_host_win_add(VFIOLegacyContainer *container, hwaddr min_iova, hwaddr max_iova, uint64_t iova_pgsizes); diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h new file mode 100644 index 0000000000..afc8543d22 --- /dev/null +++ b/include/hw/vfio/vfio-container-base.h @@ -0,0 +1,82 @@ +/* + * VFIO BASE CONTAINER + * + * Copyright (C) 2023 Intel Corporation. + * Copyright Red Hat, Inc. 2023 + * + * Authors: Yi Liu + * Eric Auger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the + * GNU General Public License for more details. + + * You should have received a copy of the GNU General Public License along + * with this program; if not, see . + */ + +#ifndef HW_VFIO_VFIO_BASE_CONTAINER_H +#define HW_VFIO_VFIO_BASE_CONTAINER_H + +#include "exec/memory.h" +#ifndef CONFIG_USER_ONLY +#include "exec/hwaddr.h" +#endif + +typedef struct VFIOContainer VFIOContainer; +typedef struct VFIODevice VFIODevice; +typedef struct VFIOIOMMUBackendOpsClass VFIOIOMMUBackendOpsClass; + +typedef struct { + unsigned long *bitmap; + hwaddr size; + hwaddr pages; +} VFIOBitmap; + +/* + * This is the base object for vfio container backends + */ +struct VFIOContainer { + VFIOIOMMUBackendOpsClass *ops; +}; + +#define TYPE_VFIO_IOMMU_BACKEND_OPS "vfio-iommu-backend-ops" + +DECLARE_CLASS_CHECKERS(VFIOIOMMUBackendOpsClass, + VFIO_IOMMU_BACKEND_OPS, TYPE_VFIO_IOMMU_BACKEND_OPS) + +struct VFIOIOMMUBackendOpsClass { + /*< private >*/ + ObjectClass parent_class; + + /*< public >*/ + /* required */ + int (*dma_map)(VFIOContainer *bcontainer, + hwaddr iova, ram_addr_t size, + void *vaddr, bool readonly); + int (*dma_unmap)(VFIOContainer *bcontainer, + hwaddr iova, ram_addr_t size, + IOMMUTLBEntry *iotlb); + int (*attach_device)(char *name, VFIODevice *vbasedev, + AddressSpace *as, Error **errp); + void (*detach_device)(VFIODevice *vbasedev); + /* migration feature */ + int (*set_dirty_page_tracking)(VFIOContainer *bcontainer, bool start); + int (*query_dirty_bitmap)(VFIOContainer *bcontainer, VFIOBitmap *vbmap, + hwaddr iova, hwaddr size); + + /* SPAPR specific */ + int (*add_window)(VFIOContainer *bcontainer, + MemoryRegionSection *section, + Error **errp); + void (*del_window)(VFIOContainer *bcontainer, + MemoryRegionSection *section); +}; + +#endif /* HW_VFIO_VFIO_BASE_CONTAINER_H */ From patchwork Mon Oct 16 08:31:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 
13422733 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6E476CDB482 for ; Mon, 16 Oct 2023 08:49:54 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJGD-0007Xo-2R; Mon, 16 Oct 2023 04:47:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJG7-0007Wo-PM for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:32 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJG5-00018h-5X for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446049; x=1728982049; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dRKZOMcB1np/FNbBw+R0wRE7DGXaXjC9Dz9CMIH3KBQ=; b=cbr0mnwSZp5VfnJEtYo5oLv1e6s5fVfpgWHCaMXd6eaIahiA+QPramwj qg8TyugKJeoNBoJZ3BmWDfBhT4TKnawro5qyFe0XzGjhe6+2bt9u5EXNY 7KG4Ul7boot7g/tpfhqsTsQdBcgckYAkPhKUDwEcZvicPGkPw74HNZzKX nrkGBDTJCp0sh5FyOHoD9cFeJqtVt1wnsoOVmSyYvr3uv4PEsKMJcYevc TTmGuN6Xy+MpBFOG3mHSGuiSyj6KgvqfoRGFeLrlwAzU17fEY9UWEg0rR tHenr/VTJUGteCJ7/k09XaiYr6JTZIUiZdGl4l0XIHO9j17KzcDP5ZyWH w==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737528" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737528" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:28 -0700 X-ExtLoop1: 1 X-IronPort-AV: 
E=McAfee;i="6600,9927,10863"; a="749222695" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749222695" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:24 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v2 03/27] VFIO/container: Introduce dummy VFIOContainerClass implementation Date: Mon, 16 Oct 2023 16:31:59 +0800 Message-Id: <20231016083223.1519410-4-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Let's instantiate a dummy VFIOContainerClass implementation whose functions are not yet implemented. No functional change intended. 
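As a rough illustration of what a dummy implementation means for callers, here is a standalone sketch of an ops table whose callbacks are all left unset, together with a dispatch helper that refuses to call through a NULL pointer. The names (DemoOps, DemoContainer, demo_dummy_ops, demo_dma_map, demo_dummy_map_result) are hypothetical and are not part of QEMU; the choice of -EINVAL for a missing callback is an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical types for illustration only, not the QEMU definitions. */
typedef struct DemoContainer DemoContainer;

typedef struct DemoOps {
    int (*dma_map)(DemoContainer *c, unsigned long iova, unsigned long size);
    int (*dma_unmap)(DemoContainer *c, unsigned long iova, unsigned long size);
} DemoOps;

struct DemoContainer {
    const DemoOps *ops;
};

/* Dummy implementation: the table exists, but no callback is wired up yet. */
static const DemoOps demo_dummy_ops = {
    .dma_map = NULL,
    .dma_unmap = NULL,
};

/* Dispatch helper: report an error instead of jumping through NULL. */
int demo_dma_map(DemoContainer *c, unsigned long iova, unsigned long size)
{
    if (!c->ops || !c->ops->dma_map) {
        return -EINVAL;
    }
    return c->ops->dma_map(c, iova, size);
}

/* Mapping through the dummy table fails cleanly with -EINVAL. */
int demo_dummy_map_result(void)
{
    DemoContainer c = { .ops = &demo_dummy_ops };

    return demo_dma_map(&c, 0x1000, 0x1000);
}
```

Registering the empty table first lets later patches fill in one callback at a time while every intermediate state still builds and behaves predictably.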
Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-container-base.h | 1 + hw/vfio/container-base.c | 40 +++++++++++++++++++++++++++ hw/vfio/container.c | 22 +++++++++++++++ hw/vfio/meson.build | 1 + 4 files changed, 64 insertions(+) create mode 100644 hw/vfio/container-base.c diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index afc8543d22..226e960fb5 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -46,6 +46,7 @@ struct VFIOContainer { VFIOIOMMUBackendOpsClass *ops; }; +#define TYPE_VFIO_IOMMU_BACKEND_LEGACY_OPS "vfio-iommu-backend-legacy-ops" #define TYPE_VFIO_IOMMU_BACKEND_OPS "vfio-iommu-backend-ops" DECLARE_CLASS_CHECKERS(VFIOIOMMUBackendOpsClass, diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c new file mode 100644 index 0000000000..0c21e77039 --- /dev/null +++ b/hw/vfio/container-base.c @@ -0,0 +1,40 @@ +/* + * VFIO BASE CONTAINER + * + * Copyright (C) 2023 Intel Corporation. + * Copyright Red Hat, Inc. 2023 + * + * Authors: Yi Liu + * Eric Auger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + + * You should have received a copy of the GNU General Public License along + * with this program; if not, see . 
+ */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "qemu/error-report.h" +#include "hw/vfio/vfio-container-base.h" + +static const TypeInfo vfio_iommu_backend_ops_type_info = { + .name = TYPE_VFIO_IOMMU_BACKEND_OPS, + .parent = TYPE_OBJECT, + .abstract = true, + .class_size = sizeof(VFIOIOMMUBackendOpsClass), +}; + +static void vfio_iommu_backend_ops_register_types(void) +{ + type_register_static(&vfio_iommu_backend_ops_type_info); +} +type_init(vfio_iommu_backend_ops_register_types); diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 8fde302ae9..acc4a6bf8a 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -539,6 +539,9 @@ static void vfio_get_iommu_info_migration(VFIOLegacyContainer *container, static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, Error **errp) { + VFIOIOMMUBackendOpsClass *ops = VFIO_IOMMU_BACKEND_OPS_CLASS( + object_class_by_name(TYPE_VFIO_IOMMU_BACKEND_LEGACY_OPS)); + VFIOContainer *bcontainer; VFIOLegacyContainer *container; int ret, fd; VFIOAddressSpace *space; @@ -620,6 +623,8 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, QLIST_INIT(&container->giommu_list); QLIST_INIT(&container->hostwin_list); QLIST_INIT(&container->vrdl_list); + bcontainer = &container->bcontainer; + bcontainer->ops = ops; ret = vfio_init_container(container, group->fd, errp); if (ret) { @@ -1160,3 +1165,20 @@ void vfio_detach_device(VFIODevice *vbasedev) vfio_put_base_device(vbasedev); vfio_put_group(group); } + +static void vfio_iommu_backend_legacy_ops_class_init(ObjectClass *oc, + void *data) { +} + +static const TypeInfo vfio_iommu_backend_legacy_ops_type = { + .name = TYPE_VFIO_IOMMU_BACKEND_LEGACY_OPS, + + .parent = TYPE_VFIO_IOMMU_BACKEND_OPS, + .class_init = vfio_iommu_backend_legacy_ops_class_init, + .abstract = true, +}; +static void vfio_iommu_backend_legacy_ops_register_types(void) +{ + type_register_static(&vfio_iommu_backend_legacy_ops_type); +} 
+type_init(vfio_iommu_backend_legacy_ops_register_types); diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build index 2a6912c940..eb6ce6229d 100644 --- a/hw/vfio/meson.build +++ b/hw/vfio/meson.build @@ -2,6 +2,7 @@ vfio_ss = ss.source_set() vfio_ss.add(files( 'helpers.c', 'common.c', + 'container-base.c', 'container.c', 'spapr.c', 'migration.c', From patchwork Mon Oct 16 08:32:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422745 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 261F3C41513 for ; Mon, 16 Oct 2023 08:51:42 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJGT-0007h0-HE; Mon, 16 Oct 2023 04:47:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGK-0007ab-Up for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:46 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGC-00019z-9O for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446056; x=1728982056; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5yEvxT2rXMhOpXnprf4FOirUqI+6sQBhYa/RMnVdfFM=; b=XUFEETqzoX2ymPJ1G6Fok0J6WoKbDc+u/yMKuG/MUSG2EnrHAWTPNW4e UPTy4f6uJSwXduat/6hxFhzDzH0KTEFV/u6QsR93MVwu2Tt/mSj9XbeBp 
Y3Zn4OQceriqwHTUBxAiVIUWNnPGKw6ntAOTq+UZ1+7ILzPBovw2ela4a P0ixhvMXA4yYn18/fKww23hUMJ7+riIkvajOCgC829A2cCtpkZCGfcVK/ AefJoltnVdMn8s5BChu1GeqSWCp0WOkUU0zL7sVIkwAReGZ49rwbPiBgb ftA7Th8RnU4jWffjM1jdnRyXlEhO4PKc+TDC/okK1y+Cswzz1ZA1qJZ5r w==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737535" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737535" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749222699" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749222699" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:28 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v2 04/27] vfio/container: Switch to dma_map|unmap API Date: Mon, 16 Oct 2023 16:32:00 +0800 Message-Id: <20231016083223.1519410-5-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Replace direct calls to vfio_dma_map/unmap with the vfio_container_dma_map/unmap wrappers, which dispatch through the backend ops. No functional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 4 --- include/hw/vfio/vfio-container-base.h | 7 +++++ hw/vfio/common.c | 45 +++++++++++++++------------ hw/vfio/container-base.c | 22 +++++++++++++ hw/vfio/container.c | 25 +++++++++++---- hw/vfio/trace-events | 2 +- 6 files changed, 74 insertions(+), 31 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 9651cf921c..f2aa122c47 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -212,10 +212,6 @@ void vfio_put_address_space(VFIOAddressSpace *space); bool vfio_devices_all_running_and_saving(VFIOLegacyContainer *container); /* container->fd */ -int vfio_dma_unmap(VFIOLegacyContainer *container, hwaddr iova, - ram_addr_t size, IOMMUTLBEntry *iotlb); -int vfio_dma_map(VFIOLegacyContainer *container, hwaddr iova, - ram_addr_t size, void *vaddr, bool readonly); int vfio_set_dirty_page_tracking(VFIOLegacyContainer *container, bool start); int vfio_query_dirty_bitmap(VFIOLegacyContainer *container, VFIOBitmap *vbmap, hwaddr iova, hwaddr size); diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 226e960fb5..1483e77441 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -46,6 +46,13 @@ struct VFIOContainer { VFIOIOMMUBackendOpsClass *ops; }; +int vfio_container_dma_map(VFIOContainer *bcontainer, + hwaddr iova, ram_addr_t size, + void *vaddr, bool 
readonly); +int vfio_container_dma_unmap(VFIOContainer *bcontainer, + hwaddr iova, ram_addr_t size, + IOMMUTLBEntry *iotlb); + 
#define TYPE_VFIO_IOMMU_BACKEND_LEGACY_OPS "vfio-iommu-backend-legacy-ops" #define TYPE_VFIO_IOMMU_BACKEND_OPS "vfio-iommu-backend-ops" diff --git a/hw/vfio/common.c b/hw/vfio/common.c index b51ef3a15a..6be1526d79 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -337,7 +337,7 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) { VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n); - VFIOLegacyContainer *container = giommu->container; + VFIOContainer *bcontainer = &giommu->container->bcontainer; hwaddr iova = iotlb->iova + giommu->iommu_offset; void *vaddr; int ret; @@ -367,21 +367,22 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) * of vaddr will always be there, even if the memory object is * destroyed and its backing memory munmap-ed. */ - ret = vfio_dma_map(container, iova, - iotlb->addr_mask + 1, vaddr, - read_only); + ret = vfio_container_dma_map(bcontainer, iova, + iotlb->addr_mask + 1, vaddr, + read_only); if (ret) { - error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", " + error_report("vfio_container_dma_map(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx", %p) = %d (%s)", - container, iova, + bcontainer, iova, iotlb->addr_mask + 1, vaddr, ret, strerror(-ret)); } } else { - ret = vfio_dma_unmap(container, iova, iotlb->addr_mask + 1, iotlb); + ret = vfio_container_dma_unmap(bcontainer, iova, + iotlb->addr_mask + 1, iotlb); if (ret) { - error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", " + error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", - container, iova, + bcontainer, iova, iotlb->addr_mask + 1, ret, strerror(-ret)); vfio_set_migration_error(ret); } @@ -400,9 +401,10 @@ static void vfio_ram_discard_notify_discard(RamDiscardListener *rdl, int ret; /* Unmap with a single call. 
*/ - ret = vfio_dma_unmap(vrdl->container, iova, size , NULL); + ret = vfio_container_dma_unmap(&vrdl->container->bcontainer, + iova, size , NULL); if (ret) { - error_report("%s: vfio_dma_unmap() failed: %s", __func__, + error_report("%s: vfio_container_dma_unmap() failed: %s", __func__, strerror(-ret)); } } @@ -430,8 +432,8 @@ static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl, section->offset_within_address_space; vaddr = memory_region_get_ram_ptr(section->mr) + start; - ret = vfio_dma_map(vrdl->container, iova, next - start, - vaddr, section->readonly); + ret = vfio_container_dma_map(&vrdl->container->bcontainer, iova, + next - start, vaddr, section->readonly); if (ret) { /* Rollback */ vfio_ram_discard_notify_discard(rdl, section); @@ -746,10 +748,11 @@ static void vfio_listener_region_add(MemoryListener *listener, } } - ret = vfio_dma_map(container, iova, int128_get64(llsize), - vaddr, section->readonly); + ret = vfio_container_dma_map(&container->bcontainer, + iova, int128_get64(llsize), vaddr, + section->readonly); if (ret) { - error_setg(&err, "vfio_dma_map(%p, 0x%"HWADDR_PRIx", " + error_setg(&err, "vfio_container_dma_map(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx", %p) = %d (%s)", container, iova, int128_get64(llsize), vaddr, ret, strerror(-ret)); @@ -852,18 +855,20 @@ static void vfio_listener_region_del(MemoryListener *listener, if (int128_eq(llsize, int128_2_64())) { /* The unmap ioctl doesn't accept a full 64-bit span. 
*/ llsize = int128_rshift(llsize, 1); - ret = vfio_dma_unmap(container, iova, int128_get64(llsize), NULL); + ret = vfio_container_dma_unmap(&container->bcontainer, iova, + int128_get64(llsize), NULL); if (ret) { - error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", " + error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", container, iova, int128_get64(llsize), ret, strerror(-ret)); } iova += int128_get64(llsize); } - ret = vfio_dma_unmap(container, iova, int128_get64(llsize), NULL); + ret = vfio_container_dma_unmap(&container->bcontainer, iova, + int128_get64(llsize), NULL); if (ret) { - error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", " + error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", container, iova, int128_get64(llsize), ret, strerror(-ret)); diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index 0c21e77039..78329935f6 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -26,6 +26,28 @@ #include "qemu/error-report.h" #include "hw/vfio/vfio-container-base.h" +int vfio_container_dma_map(VFIOContainer *bcontainer, + hwaddr iova, ram_addr_t size, + void *vaddr, bool readonly) +{ + if (!bcontainer->ops->dma_map) { + return -EINVAL; + } + + return bcontainer->ops->dma_map(bcontainer, iova, size, vaddr, readonly); +} + +int vfio_container_dma_unmap(VFIOContainer *bcontainer, + hwaddr iova, ram_addr_t size, + IOMMUTLBEntry *iotlb) +{ + if (!bcontainer->ops->dma_unmap) { + return -EINVAL; + } + + return bcontainer->ops->dma_unmap(bcontainer, iova, size, iotlb); +} + static const TypeInfo vfio_iommu_backend_ops_type_info = { .name = TYPE_VFIO_IOMMU_BACKEND_OPS, .parent = TYPE_OBJECT, diff --git a/hw/vfio/container.c b/hw/vfio/container.c index acc4a6bf8a..80aafa21ed 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -121,9 +121,13 @@ unmap_exit: /* * DMA - Mapping and unmapping for the "type1" IOMMU interface used on x86 */ -int 
vfio_dma_unmap(VFIOLegacyContainer *container, hwaddr iova, - ram_addr_t size, IOMMUTLBEntry *iotlb) +static int vfio_legacy_dma_unmap(VFIOContainer *bcontainer, hwaddr iova, + ram_addr_t size, IOMMUTLBEntry *iotlb) { + VFIOLegacyContainer *container = container_of(bcontainer, + VFIOLegacyContainer, + bcontainer); + struct vfio_iommu_type1_dma_unmap unmap = { .argsz = sizeof(unmap), .flags = 0, @@ -157,7 +161,7 @@ int vfio_dma_unmap(VFIOLegacyContainer *container, hwaddr iova, */ if (errno == EINVAL && unmap.size && !(unmap.iova + unmap.size) && container->iommu_type == VFIO_TYPE1v2_IOMMU) { - trace_vfio_dma_unmap_overflow_workaround(); + trace_vfio_legacy_dma_unmap_overflow_workaround(); unmap.size -= 1ULL << ctz64(container->pgsizes); continue; } @@ -176,9 +180,13 @@ int vfio_dma_unmap(VFIOLegacyContainer *container, hwaddr iova, return 0; } -int vfio_dma_map(VFIOLegacyContainer *container, hwaddr iova, - ram_addr_t size, void *vaddr, bool readonly) +static int vfio_legacy_dma_map(VFIOContainer *bcontainer, hwaddr iova, + ram_addr_t size, void *vaddr, bool readonly) { + VFIOLegacyContainer *container = container_of(bcontainer, + VFIOLegacyContainer, + bcontainer); + struct vfio_iommu_type1_dma_map map = { .argsz = sizeof(map), .flags = VFIO_DMA_MAP_FLAG_READ, @@ -197,7 +205,8 @@ int vfio_dma_map(VFIOLegacyContainer *container, hwaddr iova, * the VGA ROM space. 
*/ if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 || - (errno == EBUSY && vfio_dma_unmap(container, iova, size, NULL) == 0 && + (errno == EBUSY && + vfio_legacy_dma_unmap(bcontainer, iova, size, NULL) == 0 && ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) { return 0; } @@ -1168,6 +1177,10 @@ void vfio_detach_device(VFIODevice *vbasedev) static void vfio_iommu_backend_legacy_ops_class_init(ObjectClass *oc, void *data) { + VFIOIOMMUBackendOpsClass *ops = VFIO_IOMMU_BACKEND_OPS_CLASS(oc); + + ops->dma_map = vfio_legacy_dma_map; + ops->dma_unmap = vfio_legacy_dma_unmap; } static const TypeInfo vfio_iommu_backend_legacy_ops_type = { diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 0eb2387cf2..9f7fedee98 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -116,7 +116,7 @@ vfio_region_unmap(const char *name, unsigned long offset, unsigned long end) "Re vfio_region_sparse_mmap_header(const char *name, int index, int nr_areas) "Device %s region %d: %d sparse mmap entries" vfio_region_sparse_mmap_entry(int i, unsigned long start, unsigned long end) "sparse entry %d [0x%lx - 0x%lx]" vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t subtype) "%s index %d, %08x/%08x" -vfio_dma_unmap_overflow_workaround(void) "" +vfio_legacy_dma_unmap_overflow_workaround(void) "" vfio_get_dirty_bitmap(int fd, uint64_t iova, uint64_t size, uint64_t bitmap_size, uint64_t start, uint64_t dirty_pages) "container fd=%d, iova=0x%"PRIx64" size= 0x%"PRIx64" bitmap_size=0x%"PRIx64" start=0x%"PRIx64" dirty_pages=%"PRIu64 vfio_iommu_map_dirty_notify(uint64_t iova_start, uint64_t iova_end) "iommu dirty @ 0x%"PRIx64" - 0x%"PRIx64 From patchwork Mon Oct 16 08:32:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422723 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org 
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BE785C46CA1 for ; Mon, 16 Oct 2023 08:48:28 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJGS-0007as-CA; Mon, 16 Oct 2023 04:47:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGI-0007a8-BS for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:42 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGF-0001AB-Hj for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:41 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446059; x=1728982059; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=wwMa75z1K7uUrZlbf9iq+WuE2zAedNKoviONdhuiOVU=; b=XpLD+0XN8s3E+ljsoxQDHuZJeV8Lb3wSEMHH0h1MxngLQFM/psVK+IQ8 haXiDDRZkJzVLs5DG4bIeU65YAwVAAkwBr/dSAZX/h+rgXQibSN5nVJEJ lbyhyzvhUteJiD6ndS1iyvM1HPhsKlsEEcnCYTTZdZsOzzukUDyyl5wOG yxrVH5+ibSp6Ta419yPSOKUygRBRc2enPDDsDQ5qaR0VjoGla18u0Eflf 23awH3oQF9GFPF5qKZyorISguoXTIUmU0xOvyDN9VtFxr2o29hMXyPnE+ tcT2G81hpEpYwKxuf81TlvrcNgNTJqIbevO3BPd2jB+GV2MGqCCyH6EsX Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737538" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737538" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749222703" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749222703" Received: from 
duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:33 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v2 05/27] vfio/common: Move giommu_list in base container Date: Mon, 16 Oct 2023 16:32:01 +0800 Message-Id: <20231016083223.1519410-6-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Move the giommu_list field into the base object and store the base container in the VFIOGuestIOMMU. Introduce vfio_container_init/destroy helpers on the base container. No functional change intended. 
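The init/destroy pairing described above can be sketched in isolation as follows. This is a simplified standalone illustration that uses a hand-rolled singly linked list rather than QEMU's QLIST macros, and every name in it (DemoNode, DemoBase, demo_base_init, demo_base_add, demo_base_destroy, demo_lifecycle) is hypothetical. The point is only that the base object owns the list, so the base helpers are the natural place to initialise it and to free its entries with a removal-safe traversal.

```c
#include <stdlib.h>

/* Hypothetical list entry standing in for a per-region notifier record. */
typedef struct DemoNode {
    struct DemoNode *next;
    int id;
} DemoNode;

/* Hypothetical base object that owns the list. */
typedef struct DemoBase {
    const void *ops;
    DemoNode *head;
} DemoBase;

/* Set the ops pointer and start with an empty list. */
void demo_base_init(DemoBase *b, const void *ops)
{
    b->ops = ops;
    b->head = NULL;
}

/* Insert a new entry at the head; returns 0 on success, -1 on OOM. */
int demo_base_add(DemoBase *b, int id)
{
    DemoNode *n = malloc(sizeof(*n));

    if (!n) {
        return -1;
    }
    n->id = id;
    n->next = b->head;
    b->head = n;
    return 0;
}

/* Free every entry; 'next' is saved before each node is released. */
int demo_base_destroy(DemoBase *b)
{
    DemoNode *n = b->head;
    int freed = 0;

    while (n) {
        DemoNode *next = n->next;

        free(n);
        n = next;
        freed++;
    }
    b->head = NULL;
    return freed;
}

/* Adds two entries and returns how many the destroy helper released. */
int demo_lifecycle(void)
{
    DemoBase b;

    demo_base_init(&b, NULL);
    if (demo_base_add(&b, 1) || demo_base_add(&b, 2)) {
        demo_base_destroy(&b);
        return -1;
    }
    return demo_base_destroy(&b);
}
```

Saving the next pointer before freeing plays the same role as a removal-safe list iterator: the loop never reads from an entry that has already been released.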
Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 9 --------- include/hw/vfio/vfio-container-base.h | 13 +++++++++++++ hw/vfio/common.c | 18 ++++++++++++------ hw/vfio/container-base.c | 19 +++++++++++++++++++ hw/vfio/container.c | 13 +++---------- 5 files changed, 47 insertions(+), 25 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index f2aa122c47..884d1627f4 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -95,7 +95,6 @@ typedef struct VFIOLegacyContainer { uint64_t max_dirty_bitmap_size; unsigned long pgsizes; unsigned int dma_max_mappings; - QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; @@ -103,14 +102,6 @@ typedef struct VFIOLegacyContainer { QLIST_HEAD(, VFIODevice) device_list; } VFIOLegacyContainer; -typedef struct VFIOGuestIOMMU { - VFIOLegacyContainer *container; - IOMMUMemoryRegion *iommu_mr; - hwaddr iommu_offset; - IOMMUNotifier n; - QLIST_ENTRY(VFIOGuestIOMMU) giommu_next; -} VFIOGuestIOMMU; - typedef struct VFIORamDiscardListener { VFIOLegacyContainer *container; MemoryRegion *mr; diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 1483e77441..b6c8eb2313 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -33,6 +33,14 @@ typedef struct VFIOContainer VFIOContainer; typedef struct VFIODevice VFIODevice; typedef struct VFIOIOMMUBackendOpsClass VFIOIOMMUBackendOpsClass; +typedef struct VFIOGuestIOMMU { + VFIOContainer *bcontainer; + IOMMUMemoryRegion *iommu_mr; + hwaddr iommu_offset; + IOMMUNotifier n; + QLIST_ENTRY(VFIOGuestIOMMU) giommu_next; +} VFIOGuestIOMMU; + typedef struct { unsigned long *bitmap; hwaddr size; @@ -44,6 +52,7 @@ typedef struct { */ struct VFIOContainer { 
VFIOIOMMUBackendOpsClass *ops; + QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; }; int vfio_container_dma_map(VFIOContainer *bcontainer, @@ -53,6 +62,10 @@ int vfio_container_dma_unmap(VFIOContainer *bcontainer, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb); +void vfio_container_init(VFIOContainer *bcontainer, + struct VFIOIOMMUBackendOpsClass *ops); +void vfio_container_destroy(VFIOContainer *bcontainer); + #define TYPE_VFIO_IOMMU_BACKEND_LEGACY_OPS "vfio-iommu-backend-legacy-ops" #define TYPE_VFIO_IOMMU_BACKEND_OPS "vfio-iommu-backend-ops" diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 6be1526d79..1adfdca4f5 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -337,7 +337,7 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) { VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n); - VFIOContainer *bcontainer = &giommu->container->bcontainer; + VFIOContainer *bcontainer = giommu->bcontainer; hwaddr iova = iotlb->iova + giommu->iommu_offset; void *vaddr; int ret; @@ -632,6 +632,7 @@ static void vfio_listener_region_add(MemoryListener *listener, VFIOLegacyContainer *container = container_of(listener, VFIOLegacyContainer, listener); + VFIOContainer *bcontainer = &container->bcontainer; hwaddr iova, end; Int128 llend, llsize; void *vaddr; @@ -683,7 +684,7 @@ static void vfio_listener_region_add(MemoryListener *listener, giommu->iommu_mr = iommu_mr; giommu->iommu_offset = section->offset_within_address_space - section->offset_within_region; - giommu->container = container; + giommu->bcontainer = bcontainer; llend = int128_add(int128_make64(section->offset_within_region), section->size); llend = int128_sub(llend, int128_one()); @@ -709,7 +710,7 @@ static void vfio_listener_region_add(MemoryListener *listener, g_free(giommu); goto fail; } - QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next); + QLIST_INSERT_HEAD(&bcontainer->giommu_list, giommu, 
giommu_next); memory_region_iommu_replay(giommu->iommu_mr, &giommu->n); return; @@ -796,6 +797,7 @@ static void vfio_listener_region_del(MemoryListener *listener, VFIOLegacyContainer *container = container_of(listener, VFIOLegacyContainer, listener); + VFIOContainer *bcontainer = &container->bcontainer; hwaddr iova, end; Int128 llend, llsize; int ret; @@ -808,7 +810,7 @@ static void vfio_listener_region_del(MemoryListener *listener, if (memory_region_is_iommu(section->mr)) { VFIOGuestIOMMU *giommu; - QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) { + QLIST_FOREACH(giommu, &bcontainer->giommu_list, giommu_next) { if (MEMORY_REGION(giommu->iommu_mr) == section->mr && giommu->n.start == section->offset_within_region) { memory_region_unregister_iommu_notifier(section->mr, @@ -1278,7 +1280,10 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) vfio_giommu_dirty_notifier *gdn = container_of(n, vfio_giommu_dirty_notifier, n); VFIOGuestIOMMU *giommu = gdn->giommu; - VFIOLegacyContainer *container = giommu->container; + VFIOContainer *bcontainer = giommu->bcontainer; + VFIOLegacyContainer *container = container_of(bcontainer, + VFIOLegacyContainer, + bcontainer); hwaddr iova = iotlb->iova + giommu->iommu_offset; ram_addr_t translated_addr; int ret = -EINVAL; @@ -1357,12 +1362,13 @@ vfio_sync_ram_discard_listener_dirty_bitmap(VFIOLegacyContainer *container, static int vfio_sync_dirty_bitmap(VFIOLegacyContainer *container, MemoryRegionSection *section) { + VFIOContainer *bcontainer = &container->bcontainer; ram_addr_t ram_addr; if (memory_region_is_iommu(section->mr)) { VFIOGuestIOMMU *giommu; - QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) { + QLIST_FOREACH(giommu, &bcontainer->giommu_list, giommu_next) { if (MEMORY_REGION(giommu->iommu_mr) == section->mr && giommu->n.start == section->offset_within_region) { Int128 llend; diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index 78329935f6..6da50e8151 
100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -48,6 +48,25 @@ int vfio_container_dma_unmap(VFIOContainer *bcontainer, return bcontainer->ops->dma_unmap(bcontainer, iova, size, iotlb); } +void vfio_container_init(VFIOContainer *bcontainer, + struct VFIOIOMMUBackendOpsClass *ops) +{ + bcontainer->ops = ops; + QLIST_INIT(&bcontainer->giommu_list); +} + +void vfio_container_destroy(VFIOContainer *bcontainer) +{ + VFIOGuestIOMMU *giommu, *tmp; + + QLIST_FOREACH_SAFE(giommu, &bcontainer->giommu_list, giommu_next, tmp) { + memory_region_unregister_iommu_notifier( + MEMORY_REGION(giommu->iommu_mr), &giommu->n); + QLIST_REMOVE(giommu, giommu_next); + g_free(giommu); + } +} + static const TypeInfo vfio_iommu_backend_ops_type_info = { .name = TYPE_VFIO_IOMMU_BACKEND_OPS, .parent = TYPE_OBJECT, diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 80aafa21ed..de6b018eeb 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -629,11 +629,10 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container->error = NULL; container->dirty_pages_supported = false; container->dma_max_mappings = 0; - QLIST_INIT(&container->giommu_list); QLIST_INIT(&container->hostwin_list); QLIST_INIT(&container->vrdl_list); bcontainer = &container->bcontainer; - bcontainer->ops = ops; + vfio_container_init(bcontainer, ops); ret = vfio_init_container(container, group->fd, errp); if (ret) { @@ -794,6 +793,7 @@ put_space_exit: static void vfio_disconnect_container(VFIOGroup *group) { VFIOLegacyContainer *container = group->container; + VFIOContainer *bcontainer = &container->bcontainer; QLIST_REMOVE(group, container_next); group->container = NULL; @@ -814,17 +814,10 @@ static void vfio_disconnect_container(VFIOGroup *group) if (QLIST_EMPTY(&container->group_list)) { VFIOAddressSpace *space = container->space; - VFIOGuestIOMMU *giommu, *tmp; VFIOHostDMAWindow *hostwin, *next; QLIST_REMOVE(container, next); - - QLIST_FOREACH_SAFE(giommu, 
&container->giommu_list, giommu_next, tmp) { - memory_region_unregister_iommu_notifier( - MEMORY_REGION(giommu->iommu_mr), &giommu->n); - QLIST_REMOVE(giommu, giommu_next); - g_free(giommu); - } + vfio_container_destroy(bcontainer); QLIST_FOREACH_SAFE(hostwin, &container->hostwin_list, hostwin_next, next) { From patchwork Mon Oct 16 08:32:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422730 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5035BCDB474 for ; Mon, 16 Oct 2023 08:49:48 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJGU-0007hZ-25; Mon, 16 Oct 2023 04:47:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGM-0007an-Qt for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:52 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGJ-0001AB-U8 for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:46 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446063; x=1728982063; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+pVnJWeuf5IGR1SJs6p3ezEqVedUcS3tsyfw5pT1jVU=; b=jm7N93/PmxKXQ7xZ6YRe53BJZzg3fZixmhcCYpskAXoXJQj0PvALn3u5 bjoB6AkrmWqlPjM1sjTsYtXy85RUwuKpzT1gNlUpf0M3e5ZVNH6jOxWeM WUZUdvw0UqkqP24iyuf1Bb+V1ZiCd8TIMQ7fc1r2ChHZZ6+GBpcTlYpjt 
/MM/t9euaUt93FNuiuLZaIiXmqrxU3KcZz97ZGsC9pyvju4xx+yS8CQjC MPCLMCymYhUiL45k6tK1kx05CrVlrUSTgURJJLSZ1S97cph45vtGeT2aT ebyWiQUu4LS7x6ExaflnImbdWBYqoUEfW7w335z3vTxPkMaun/V4faGO4 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737546" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737546" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749222709" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749222709" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:38 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v2 06/27] vfio/container: Move space field to base container Date: Mon, 16 Oct 2023 16:32:02 +0800 Message-Id: <20231016083223.1519410-7-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Move the space field to the base object. The VFIOAddressSpace now also contains a list of base containers. No functional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 8 -------- include/hw/vfio/vfio-container-base.h | 9 +++++++++ hw/vfio/common.c | 4 ++-- hw/vfio/container-base.c | 4 ++++ hw/vfio/container.c | 28 +++++++++++++-------------- 5 files changed, 29 insertions(+), 24 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 884d1627f4..33f475957c 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -73,17 +73,10 @@ typedef struct VFIOMigration { bool initial_data_sent; } VFIOMigration; -typedef struct VFIOAddressSpace { - AddressSpace *as; - QLIST_HEAD(, VFIOLegacyContainer) containers; - QLIST_ENTRY(VFIOAddressSpace) list; -} VFIOAddressSpace; - struct VFIOGroup; typedef struct VFIOLegacyContainer { VFIOContainer bcontainer; - VFIOAddressSpace *space; int fd; /* /dev/vfio/vfio, empowered by the attached groups */ MemoryListener listener; MemoryListener prereg_listener; @@ -98,7 +91,6 @@ typedef struct VFIOLegacyContainer { QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; - QLIST_ENTRY(VFIOLegacyContainer) next; QLIST_HEAD(, VFIODevice) device_list; } VFIOLegacyContainer; diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index b6c8eb2313..9504564f4e 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -33,6 +33,12 @@ typedef struct VFIOContainer VFIOContainer; typedef struct VFIODevice 
VFIODevice; typedef struct VFIOIOMMUBackendOpsClass VFIOIOMMUBackendOpsClass; +typedef struct VFIOAddressSpace { + AddressSpace *as; + QLIST_HEAD(, VFIOContainer) containers; + QLIST_ENTRY(VFIOAddressSpace) list; +} VFIOAddressSpace; + typedef struct VFIOGuestIOMMU { VFIOContainer *bcontainer; IOMMUMemoryRegion *iommu_mr; @@ -52,7 +58,9 @@ typedef struct { */ struct VFIOContainer { VFIOIOMMUBackendOpsClass *ops; + VFIOAddressSpace *space; QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; + QLIST_ENTRY(VFIOContainer) next; }; int vfio_container_dma_map(VFIOContainer *bcontainer, @@ -63,6 +71,7 @@ int vfio_container_dma_unmap(VFIOContainer *bcontainer, IOMMUTLBEntry *iotlb); void vfio_container_init(VFIOContainer *bcontainer, + VFIOAddressSpace *space, struct VFIOIOMMUBackendOpsClass *ops); void vfio_container_destroy(VFIOContainer *bcontainer); diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 1adfdca4f5..c92af34eed 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -152,7 +152,7 @@ void vfio_unblock_multiple_devices_migration(void) bool vfio_viommu_preset(VFIODevice *vbasedev) { - return vbasedev->container->space->as != &address_space_memory; + return vbasedev->container->bcontainer.space->as != &address_space_memory; } static void vfio_set_migration_error(int err) @@ -990,7 +990,7 @@ static void vfio_dirty_tracking_init(VFIOLegacyContainer *container, dirty.container = container; memory_listener_register(&dirty.listener, - container->space->as); + container->bcontainer.space->as); *ranges = dirty.ranges; diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index 6da50e8151..e1056dd78e 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -49,9 +49,11 @@ int vfio_container_dma_unmap(VFIOContainer *bcontainer, } void vfio_container_init(VFIOContainer *bcontainer, + VFIOAddressSpace *space, struct VFIOIOMMUBackendOpsClass *ops) { bcontainer->ops = ops; + bcontainer->space = space; QLIST_INIT(&bcontainer->giommu_list); } @@ -59,6 +61,8 
@@ void vfio_container_destroy(VFIOContainer *bcontainer) { VFIOGuestIOMMU *giommu, *tmp; + QLIST_REMOVE(bcontainer, next); + QLIST_FOREACH_SAFE(giommu, &bcontainer->giommu_list, giommu_next, tmp) { memory_region_unregister_iommu_notifier( MEMORY_REGION(giommu->iommu_mr), &giommu->n); diff --git a/hw/vfio/container.c b/hw/vfio/container.c index de6b018eeb..fd2d602fb9 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -588,7 +588,8 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, * details once we know which type of IOMMU we are using. */ - QLIST_FOREACH(container, &space->containers, next) { + QLIST_FOREACH(bcontainer, &space->containers, next) { + container = container_of(bcontainer, VFIOLegacyContainer, bcontainer); if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) { ret = vfio_ram_block_discard_disable(container, true); if (ret) { @@ -624,7 +625,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, } container = g_malloc0(sizeof(*container)); - container->space = space; container->fd = fd; container->error = NULL; container->dirty_pages_supported = false; @@ -632,7 +632,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, QLIST_INIT(&container->hostwin_list); QLIST_INIT(&container->vrdl_list); bcontainer = &container->bcontainer; - vfio_container_init(bcontainer, ops); + vfio_container_init(bcontainer, space, ops); ret = vfio_init_container(container, group->fd, errp); if (ret) { @@ -750,14 +750,15 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, vfio_kvm_device_add_group(group); QLIST_INIT(&container->group_list); - QLIST_INSERT_HEAD(&space->containers, container, next); + QLIST_INSERT_HEAD(&space->containers, bcontainer, next); group->container = container; QLIST_INSERT_HEAD(&container->group_list, group, container_next); container->listener = vfio_memory_listener; - memory_listener_register(&container->listener, container->space->as); + 
memory_listener_register(&container->listener, + container->bcontainer.space->as); if (container->error) { ret = -1; @@ -771,7 +772,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, return 0; listener_release_exit: QLIST_REMOVE(group, container_next); - QLIST_REMOVE(container, next); + QLIST_REMOVE(bcontainer, next); vfio_kvm_device_del_group(group); vfio_listener_release(container); @@ -813,10 +814,9 @@ static void vfio_disconnect_container(VFIOGroup *group) } if (QLIST_EMPTY(&container->group_list)) { - VFIOAddressSpace *space = container->space; + VFIOAddressSpace *space = container->bcontainer.space; VFIOHostDMAWindow *hostwin, *next; - QLIST_REMOVE(container, next); vfio_container_destroy(bcontainer); QLIST_FOREACH_SAFE(hostwin, &container->hostwin_list, hostwin_next, @@ -842,7 +842,7 @@ static VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp) QLIST_FOREACH(group, &vfio_group_list, next) { if (group->groupid == groupid) { /* Found it. Now is it already in the right context? 
*/ - if (group->container->space->as == as) { + if (group->container->bcontainer.space->as == as) { return group; } else { error_setg(errp, "group %d used in multiple address spaces", @@ -1040,27 +1040,27 @@ static int vfio_eeh_container_op(VFIOLegacyContainer *container, uint32_t op) static VFIOLegacyContainer *vfio_eeh_as_container(AddressSpace *as) { VFIOAddressSpace *space = vfio_get_address_space(as); - VFIOLegacyContainer *container = NULL; + VFIOContainer *bcontainer = NULL; if (QLIST_EMPTY(&space->containers)) { /* No containers to act on */ goto out; } - container = QLIST_FIRST(&space->containers); + bcontainer = QLIST_FIRST(&space->containers); - if (QLIST_NEXT(container, next)) { + if (QLIST_NEXT(bcontainer, next)) { /* * We don't yet have logic to synchronize EEH state across * multiple containers */ - container = NULL; + bcontainer = NULL; goto out; } out: vfio_put_address_space(space); - return container; + return container_of(bcontainer, VFIOLegacyContainer, bcontainer); } bool vfio_eeh_as_ok(AddressSpace *as) From patchwork Mon Oct 16 08:32:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422735 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6446ECDB465 for ; Mon, 16 Oct 2023 08:50:01 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJGU-0007hX-0l; Mon, 16 Oct 2023 04:47:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGQ-0007dv-SD for qemu-devel@nongnu.org; Mon, 
16 Oct 2023 04:47:52 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGP-0001Al-0J for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:50 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446069; x=1728982069; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=uws+EGYfXV5EQWphM/+91nJNeLUFq7VcNsOP4PXGk/k=; b=a/AtunlTUPHRc5UmuV1TbVqACJpywwsqo3EpaOQS/jaViV+GJKAdbJhi AHkQ7Czq5FJxdx5f+jNWzumTmTJn6tkgUqHI3Ugo9BbLbBcmvXK8Km+UA HnlEMvKxznmEDssYVIeabhf3+0jFrwgz8ymMdHqyTRAxo/696+c5WXb5K K2lTwVL1GmqhBKZ8WF98TkNrdXaqirErMU2OpO+cYcc3x9XC2vtyi6XmA AM9hfgHW02yidD1I71aN7mUkYZjICRqptNvLfJCSN1NsPPIO5D6zfnmK6 PX1iAPdBMHIrr5zgHyeWdeM9SBeLloV1nfEmXN92WOpqFoFantNfYZtxJ g==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737563" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737563" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:47 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749222770" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749222770" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:42 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v2 07/27] vfio/container: switch to IOMMU BE add/del_section_window Date: Mon, 16 Oct 2023 16:32:03 +0800 Message-Id: 
<20231016083223.1519410-8-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger No functional change intended. 
Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 6 ------ include/hw/vfio/vfio-container-base.h | 5 +++++ hw/vfio/common.c | 4 ++-- hw/vfio/container-base.c | 21 +++++++++++++++++++++ hw/vfio/container.c | 19 ++++++++++++++----- 5 files changed, 42 insertions(+), 13 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 33f475957c..b83ae4b3b6 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -199,12 +199,6 @@ int vfio_set_dirty_page_tracking(VFIOLegacyContainer *container, bool start); int vfio_query_dirty_bitmap(VFIOLegacyContainer *container, VFIOBitmap *vbmap, hwaddr iova, hwaddr size); -int vfio_container_add_section_window(VFIOLegacyContainer *container, - MemoryRegionSection *section, - Error **errp); -void vfio_container_del_section_window(VFIOLegacyContainer *container, - MemoryRegionSection *section); - void vfio_disable_irqindex(VFIODevice *vbasedev, int index); void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index); void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index); diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 9504564f4e..1f6d5fd229 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -69,6 +69,11 @@ int vfio_container_dma_map(VFIOContainer *bcontainer, int vfio_container_dma_unmap(VFIOContainer *bcontainer, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb); +int vfio_container_add_section_window(VFIOContainer *bcontainer, + MemoryRegionSection *section, + Error **errp); +void vfio_container_del_section_window(VFIOContainer *bcontainer, + MemoryRegionSection *section); void vfio_container_init(VFIOContainer *bcontainer, VFIOAddressSpace *space, diff --git a/hw/vfio/common.c b/hw/vfio/common.c index c92af34eed..49cb5b6958 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ 
-655,7 +655,7 @@ static void vfio_listener_region_add(MemoryListener *listener, return; } - if (vfio_container_add_section_window(container, section, &err)) { + if (vfio_container_add_section_window(bcontainer, section, &err)) { goto fail; } @@ -879,7 +879,7 @@ static void vfio_listener_region_del(MemoryListener *listener, memory_region_unref(section->mr); - vfio_container_del_section_window(container, section); + vfio_container_del_section_window(&container->bcontainer, section); } typedef struct VFIODirtyRanges { diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index e1056dd78e..f2a9a33465 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -48,6 +48,27 @@ int vfio_container_dma_unmap(VFIOContainer *bcontainer, return bcontainer->ops->dma_unmap(bcontainer, iova, size, iotlb); } +int vfio_container_add_section_window(VFIOContainer *bcontainer, + MemoryRegionSection *section, + Error **errp) +{ + if (!bcontainer->ops->add_window) { + return 0; + } + + return bcontainer->ops->add_window(bcontainer, section, errp); +} + +void vfio_container_del_section_window(VFIOContainer *bcontainer, + MemoryRegionSection *section) +{ + if (!bcontainer->ops->del_window) { + return; + } + + return bcontainer->ops->del_window(bcontainer, section); +} + void vfio_container_init(VFIOContainer *bcontainer, VFIOAddressSpace *space, struct VFIOIOMMUBackendOpsClass *ops) diff --git a/hw/vfio/container.c b/hw/vfio/container.c index fd2d602fb9..7ca61a7d36 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -215,10 +215,13 @@ static int vfio_legacy_dma_map(VFIOContainer *bcontainer, hwaddr iova, return -errno; } -int vfio_container_add_section_window(VFIOLegacyContainer *container, - MemoryRegionSection *section, - Error **errp) +static int vfio_legacy_add_section_window(VFIOContainer *bcontainer, + MemoryRegionSection *section, + Error **errp) { + VFIOLegacyContainer *container = container_of(bcontainer, + VFIOLegacyContainer, + bcontainer); 
VFIOHostDMAWindow *hostwin; hwaddr pgsize = 0; int ret; @@ -283,9 +286,13 @@ int vfio_container_add_section_window(VFIOLegacyContainer *container, return 0; } -void vfio_container_del_section_window(VFIOLegacyContainer *container, - MemoryRegionSection *section) +static void vfio_legacy_del_section_window(VFIOContainer *bcontainer, + MemoryRegionSection *section) { + VFIOLegacyContainer *container = container_of(bcontainer, + VFIOLegacyContainer, + bcontainer); + if (container->iommu_type != VFIO_SPAPR_TCE_v2_IOMMU) { return; } @@ -1174,6 +1181,8 @@ static void vfio_iommu_backend_legacy_ops_class_init(ObjectClass *oc, ops->dma_map = vfio_legacy_dma_map; ops->dma_unmap = vfio_legacy_dma_unmap; + ops->add_window = vfio_legacy_add_section_window; + ops->del_window = vfio_legacy_del_section_window; } static const TypeInfo vfio_iommu_backend_legacy_ops_type = { From patchwork Mon Oct 16 08:32:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422727 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7B4DCCDB482 for ; Mon, 16 Oct 2023 08:49:23 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJGW-0007iZ-43; Mon, 16 Oct 2023 04:47:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGU-0007hu-Jg for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:54 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) 
(envelope-from ) id 1qsJGS-0001Al-E3 for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:54 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446072; x=1728982072; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yvpa3fvG/HrnClTGDpUIvnBZsd0pi9zd/hyMrUc3+q8=; b=FenwdPfN0wt/381/wO1eBZq0WDt8Fj9/rEV62VYSNaGZ4U67QVGkrelL Bb8VJKRIO9BR9XIwm1dYaVZpODIwskmJ59VOKjui+MFn3+9p4BvlJjtdV 55Ta0cQvugBlA0rUGw6r3NLfxwVEyoPIgxrLE9wZcHe02I5fteJ/xg3Sb A7sD06xwdHI+9t3ubprEtlgOq+uBwr5Jw0XDiyRIQXcLh9/PZSulEL6P+ mX2gp4o1/k5/Q/qLVy+s6w16wUxmBE3mNStR1EvIc7PpWTq2WZWaRilDi dz01S8J0jOpVumx/TfymUXyIuILzfHO4eDgxVDuTZKylBq8u9Whi9Uf3u g==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737577" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737577" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:51 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749222828" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749222828" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:46 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v2 08/27] vfio/container: Move hostwin_list in base container Date: Mon, 16 Oct 2023 16:32:04 +0800 Message-Id: <20231016083223.1519410-9-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: 
<20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Move hostwin_list into the base container. As a consequence, vfio_host_win_add/del and vfio_find_hostwin now take a base container. No functional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 12 ++---------- include/hw/vfio/vfio-container-base.h | 8 ++++++++ hw/vfio/common.c | 18 +++++++++--------- hw/vfio/container-base.c | 8 ++++++++ hw/vfio/container.c | 18 +++++------------- 5 files changed, 32 insertions(+), 32 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index b83ae4b3b6..85dbda296a 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -88,7 +88,6 @@ typedef struct VFIOLegacyContainer { uint64_t max_dirty_bitmap_size; unsigned long pgsizes; unsigned int dma_max_mappings; - QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; QLIST_HEAD(, VFIODevice) device_list; @@ -104,13 +103,6 @@ typedef struct VFIORamDiscardListener { QLIST_ENTRY(VFIORamDiscardListener) next; } VFIORamDiscardListener; -typedef struct 
VFIOHostDMAWindow { - hwaddr min_iova; - hwaddr max_iova; - uint64_t iova_pgsizes; - QLIST_ENTRY(VFIOHostDMAWindow) hostwin_next; -} VFIOHostDMAWindow; - typedef struct VFIODeviceOps VFIODeviceOps; typedef struct VFIODevice { @@ -185,10 +177,10 @@ typedef struct VFIODisplay { } dmabuf; } VFIODisplay; -void vfio_host_win_add(VFIOLegacyContainer *container, +void vfio_host_win_add(VFIOContainer *bcontainer, hwaddr min_iova, hwaddr max_iova, uint64_t iova_pgsizes); -int vfio_host_win_del(VFIOLegacyContainer *container, hwaddr min_iova, +int vfio_host_win_del(VFIOContainer *bcontainer, hwaddr min_iova, hwaddr max_iova); VFIOAddressSpace *vfio_get_address_space(AddressSpace *as); void vfio_put_address_space(VFIOAddressSpace *space); diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 1f6d5fd229..03bffbff73 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -47,6 +47,13 @@ typedef struct VFIOGuestIOMMU { QLIST_ENTRY(VFIOGuestIOMMU) giommu_next; } VFIOGuestIOMMU; +typedef struct VFIOHostDMAWindow { + hwaddr min_iova; + hwaddr max_iova; + uint64_t iova_pgsizes; + QLIST_ENTRY(VFIOHostDMAWindow) hostwin_next; +} VFIOHostDMAWindow; + typedef struct { unsigned long *bitmap; hwaddr size; @@ -60,6 +67,7 @@ struct VFIOContainer { VFIOIOMMUBackendOpsClass *ops; VFIOAddressSpace *space; QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; + QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_ENTRY(VFIOContainer) next; }; diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 49cb5b6958..511f538c00 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -252,12 +252,12 @@ bool vfio_devices_all_running_and_mig_active(VFIOLegacyContainer *container) return true; } -void vfio_host_win_add(VFIOLegacyContainer *container, hwaddr min_iova, +void vfio_host_win_add(VFIOContainer *bcontainer, hwaddr min_iova, hwaddr max_iova, uint64_t iova_pgsizes) { VFIOHostDMAWindow *hostwin; - QLIST_FOREACH(hostwin, 
&container->hostwin_list, hostwin_next) { + QLIST_FOREACH(hostwin, &bcontainer->hostwin_list, hostwin_next) { if (ranges_overlap(hostwin->min_iova, hostwin->max_iova - hostwin->min_iova + 1, min_iova, @@ -271,15 +271,15 @@ void vfio_host_win_add(VFIOLegacyContainer *container, hwaddr min_iova, hostwin->min_iova = min_iova; hostwin->max_iova = max_iova; hostwin->iova_pgsizes = iova_pgsizes; - QLIST_INSERT_HEAD(&container->hostwin_list, hostwin, hostwin_next); + QLIST_INSERT_HEAD(&bcontainer->hostwin_list, hostwin, hostwin_next); } -int vfio_host_win_del(VFIOLegacyContainer *container, +int vfio_host_win_del(VFIOContainer *bcontainer, hwaddr min_iova, hwaddr max_iova) { VFIOHostDMAWindow *hostwin; - QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { + QLIST_FOREACH(hostwin, &bcontainer->hostwin_list, hostwin_next) { if (hostwin->min_iova == min_iova && hostwin->max_iova == max_iova) { QLIST_REMOVE(hostwin, hostwin_next); g_free(hostwin); @@ -540,13 +540,13 @@ static void vfio_unregister_ram_discard_listener(VFIOLegacyContainer *container, g_free(vrdl); } -static VFIOHostDMAWindow *vfio_find_hostwin(VFIOLegacyContainer *container, +static VFIOHostDMAWindow *vfio_find_hostwin(VFIOContainer *bcontainer, hwaddr iova, hwaddr end) { VFIOHostDMAWindow *hostwin; bool hostwin_found = false; - QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { + QLIST_FOREACH(hostwin, &bcontainer->hostwin_list, hostwin_next) { if (hostwin->min_iova <= iova && end <= hostwin->max_iova) { hostwin_found = true; break; @@ -659,7 +659,7 @@ static void vfio_listener_region_add(MemoryListener *listener, goto fail; } - hostwin = vfio_find_hostwin(container, iova, end); + hostwin = vfio_find_hostwin(bcontainer, iova, end); if (!hostwin) { error_setg(&err, "Container %p can't map guest IOVA region" " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx, container, iova, end); @@ -842,7 +842,7 @@ static void vfio_listener_region_del(MemoryListener *listener, hwaddr pgmask; VFIOHostDMAWindow 
*hostwin; - hostwin = vfio_find_hostwin(container, iova, end); + hostwin = vfio_find_hostwin(bcontainer, iova, end); assert(hostwin); /* or region_add() would have failed */ pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1; diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index f2a9a33465..12b256c70e 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -76,11 +76,13 @@ void vfio_container_init(VFIOContainer *bcontainer, bcontainer->ops = ops; bcontainer->space = space; QLIST_INIT(&bcontainer->giommu_list); + QLIST_INIT(&bcontainer->hostwin_list); } void vfio_container_destroy(VFIOContainer *bcontainer) { VFIOGuestIOMMU *giommu, *tmp; + VFIOHostDMAWindow *hostwin, *next; QLIST_REMOVE(bcontainer, next); @@ -90,6 +92,12 @@ void vfio_container_destroy(VFIOContainer *bcontainer) QLIST_REMOVE(giommu, giommu_next); g_free(giommu); } + + QLIST_FOREACH_SAFE(hostwin, &bcontainer->hostwin_list, hostwin_next, + next) { + QLIST_REMOVE(hostwin, hostwin_next); + g_free(hostwin); + } } static const TypeInfo vfio_iommu_backend_ops_type_info = { diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 7ca61a7d36..5d111f69c9 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -231,7 +231,7 @@ static int vfio_legacy_add_section_window(VFIOContainer *bcontainer, } /* For now intersections are not allowed, we may relax this later */ - QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { + QLIST_FOREACH(hostwin, &bcontainer->hostwin_list, hostwin_next) { if (ranges_overlap(hostwin->min_iova, hostwin->max_iova - hostwin->min_iova + 1, section->offset_within_address_space, @@ -253,7 +253,7 @@ static int vfio_legacy_add_section_window(VFIOContainer *bcontainer, return ret; } - vfio_host_win_add(container, section->offset_within_address_space, + vfio_host_win_add(bcontainer, section->offset_within_address_space, section->offset_within_address_space + int128_get64(section->size) - 1, pgsize); #ifdef CONFIG_KVM @@ -299,7 +299,7 @@ 
static void vfio_legacy_del_section_window(VFIOContainer *bcontainer, vfio_spapr_remove_window(container, section->offset_within_address_space); - if (vfio_host_win_del(container, + if (vfio_host_win_del(bcontainer, section->offset_within_address_space, section->offset_within_address_space + int128_get64(section->size) - 1) < 0) { @@ -636,7 +636,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container->error = NULL; container->dirty_pages_supported = false; container->dma_max_mappings = 0; - QLIST_INIT(&container->hostwin_list); QLIST_INIT(&container->vrdl_list); bcontainer = &container->bcontainer; vfio_container_init(bcontainer, space, ops); @@ -681,7 +680,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, * information to get the actual window extent rather than assume * a 64-bit IOVA address space. */ - vfio_host_win_add(container, 0, (hwaddr)-1, container->pgsizes); + vfio_host_win_add(bcontainer, 0, (hwaddr)-1, container->pgsizes); break; } @@ -746,7 +745,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, } else { /* The default table uses 4K pages */ container->pgsizes = 0x1000; - vfio_host_win_add(container, info.dma32_window_start, + vfio_host_win_add(bcontainer, info.dma32_window_start, info.dma32_window_start + info.dma32_window_size - 1, 0x1000); @@ -822,16 +821,9 @@ static void vfio_disconnect_container(VFIOGroup *group) if (QLIST_EMPTY(&container->group_list)) { VFIOAddressSpace *space = container->bcontainer.space; - VFIOHostDMAWindow *hostwin, *next; vfio_container_destroy(bcontainer); - QLIST_FOREACH_SAFE(hostwin, &container->hostwin_list, hostwin_next, - next) { - QLIST_REMOVE(hostwin, hostwin_next); - g_free(hostwin); - } - trace_vfio_disconnect_container(container->fd); close(container->fd); g_free(container); From patchwork Mon Oct 16 08:32:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, 
Zhenzhong" X-Patchwork-Id: 13422738 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D5DD9CDB482 for ; Mon, 16 Oct 2023 08:50:59 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJGZ-0007jN-Qt; Mon, 16 Oct 2023 04:47:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGY-0007iu-8d for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:58 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGW-0001Al-6p for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:47:58 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446076; x=1728982076; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=b4N0W6TutyN/g+zWp0dJTUQb5rH+g3STsV4RonC02Ns=; b=Ouefi7Y+CpP7EB0oYomvYWgJFG8Ks3FMhWclKVBE0SrTfkoJLTFaQHlE NWlSOC7DWdAfwnIJ7T1LJN4fL7U5aIYFV5bCafehvs/d0PXs5pjdUlIsb bfWTpW99dz3scrQ4PjiJX8oSTTmJS66mDU8AN4O4TrY6BU5ASZMK4MwzW 2uGI8ItiphnSjLu2Z/QHp6dHvvL7sDTtbFUzWWC/39LkFUI9K1jxyu5Ut q61ceRvQOYC+dZlWF4WBuRW6ZQn7eC/nhtbGXPTCiXi7mOtW8MgMCLSMw /L271AO6ksBGXoEX09RKNxYXe5ygi8XIfJORZ4UCOGx46EqI5lij+PkR2 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737590" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737590" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:55 -0700 X-ExtLoop1: 1 X-IronPort-AV: 
E=McAfee;i="6600,9927,10863"; a="749222852" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749222852" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:51 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v2 09/27] vfio/container: Switch to IOMMU BE set_dirty_page_tracking/query_dirty_bitmap API Date: Mon, 16 Oct 2023 16:32:05 +0800 Message-Id: <20231016083223.1519410-10-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger The dirty_pages_supported field is also moved to the base container. No functional change intended. 
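For illustration only, the dispatch pattern used by the new wrappers can be sketched as below. This is a minimal sketch with simplified stand-in types: every Sketch*/sketch_* name is hypothetical and none of it is the actual QEMU code. It assumes a backend that embeds the base container and leaves one optional callback unset.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the QEMU types; hypothetical, not the real ones. */
typedef struct SketchContainer SketchContainer;

typedef struct SketchOps {
    /* Optional callbacks: a backend may leave either one NULL. */
    int (*set_dirty_page_tracking)(SketchContainer *bc, bool start);
    int (*query_dirty_bitmap)(SketchContainer *bc, unsigned long *bitmap,
                              unsigned long iova, unsigned long size);
} SketchOps;

struct SketchContainer {
    const SketchOps *ops;
    bool dirty_pages_supported;
};

/* A backend embeds the base container, as VFIOLegacyContainer does. */
typedef struct SketchLegacy {
    int tracking_enabled;
    SketchContainer bcontainer;
} SketchLegacy;

/* Recover the enclosing backend object from a pointer to its base member. */
#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static int sketch_legacy_set_tracking(SketchContainer *bc, bool start)
{
    SketchLegacy *legacy = sketch_container_of(bc, SketchLegacy, bcontainer);

    if (!bc->dirty_pages_supported) {
        return 0; /* nothing to toggle */
    }
    legacy->tracking_enabled = start;
    return 0;
}

static const SketchOps sketch_legacy_ops = {
    .set_dirty_page_tracking = sketch_legacy_set_tracking,
    /* .query_dirty_bitmap is left NULL on purpose in this sketch */
};

/* Wrapper: a missing callback means "fall back to all pages dirty". */
int sketch_set_dirty_page_tracking(SketchContainer *bc, bool start)
{
    if (!bc->ops->set_dirty_page_tracking) {
        return 0;
    }
    return bc->ops->set_dirty_page_tracking(bc, start);
}

/* Wrapper: a missing callback is reported as an error to the caller. */
int sketch_query_dirty_bitmap(SketchContainer *bc, unsigned long *bitmap,
                              unsigned long iova, unsigned long size)
{
    if (!bc->ops->query_dirty_bitmap) {
        return -EINVAL;
    }
    return bc->ops->query_dirty_bitmap(bc, bitmap, iova, size);
}
```

Callers only hold the base container pointer; the backend recovers its own state through the embedded member, so common code needs no knowledge of the legacy type.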
Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 6 ------ include/hw/vfio/vfio-container-base.h | 6 ++++++ hw/vfio/common.c | 12 ++++++++---- hw/vfio/container-base.c | 23 +++++++++++++++++++++++ hw/vfio/container.c | 23 ++++++++++++++++------- 5 files changed, 53 insertions(+), 17 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 85dbda296a..39bcc7ec33 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -83,7 +83,6 @@ typedef struct VFIOLegacyContainer { unsigned iommu_type; Error *error; bool initialized; - bool dirty_pages_supported; uint64_t dirty_pgsizes; uint64_t max_dirty_bitmap_size; unsigned long pgsizes; @@ -186,11 +185,6 @@ VFIOAddressSpace *vfio_get_address_space(AddressSpace *as); void vfio_put_address_space(VFIOAddressSpace *space); bool vfio_devices_all_running_and_saving(VFIOLegacyContainer *container); -/* container->fd */ -int vfio_set_dirty_page_tracking(VFIOLegacyContainer *container, bool start); -int vfio_query_dirty_bitmap(VFIOLegacyContainer *container, VFIOBitmap *vbmap, - hwaddr iova, hwaddr size); - void vfio_disable_irqindex(VFIODevice *vbasedev, int index); void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index); void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index); diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 03bffbff73..5ab52774b5 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -66,6 +66,7 @@ typedef struct { struct VFIOContainer { VFIOIOMMUBackendOpsClass *ops; VFIOAddressSpace *space; + bool dirty_pages_supported; QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_ENTRY(VFIOContainer) next; @@ -77,6 +78,11 @@ int vfio_container_dma_map(VFIOContainer *bcontainer, int vfio_container_dma_unmap(VFIOContainer 
*bcontainer, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb); +int vfio_container_set_dirty_page_tracking(VFIOContainer *bcontainer, + bool start); +int vfio_container_query_dirty_bitmap(VFIOContainer *bcontainer, + VFIOBitmap *vbmap, + hwaddr iova, hwaddr size); int vfio_container_add_section_window(VFIOContainer *bcontainer, MemoryRegionSection *section, Error **errp); diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 511f538c00..855d6d82d0 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1149,7 +1149,8 @@ static void vfio_listener_log_global_start(MemoryListener *listener) if (vfio_devices_all_device_dirty_tracking(container)) { ret = vfio_devices_dma_logging_start(container); } else { - ret = vfio_set_dirty_page_tracking(container, true); + ret = vfio_container_set_dirty_page_tracking(&container->bcontainer, + true); } if (ret) { @@ -1169,7 +1170,8 @@ static void vfio_listener_log_global_stop(MemoryListener *listener) if (vfio_devices_all_device_dirty_tracking(container)) { vfio_devices_dma_logging_stop(container); } else { - ret = vfio_set_dirty_page_tracking(container, false); + ret = vfio_container_set_dirty_page_tracking(&container->bcontainer, + false); } if (ret) { @@ -1237,7 +1239,8 @@ int vfio_get_dirty_bitmap(VFIOLegacyContainer *container, uint64_t iova, VFIOBitmap vbmap; int ret; - if (!container->dirty_pages_supported && !all_device_dirty_tracking) { + if (!container->bcontainer.dirty_pages_supported && + !all_device_dirty_tracking) { cpu_physical_memory_set_dirty_range(ram_addr, size, tcg_enabled() ? 
DIRTY_CLIENTS_ALL : DIRTY_CLIENTS_NOCODE); @@ -1252,7 +1255,8 @@ int vfio_get_dirty_bitmap(VFIOLegacyContainer *container, uint64_t iova, if (all_device_dirty_tracking) { ret = vfio_devices_query_dirty_bitmap(container, &vbmap, iova, size); } else { - ret = vfio_query_dirty_bitmap(container, &vbmap, iova, size); + ret = vfio_container_query_dirty_bitmap(&container->bcontainer, &vbmap, + iova, size); } if (ret) { diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index 12b256c70e..530ad42c0d 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -48,6 +48,28 @@ int vfio_container_dma_unmap(VFIOContainer *bcontainer, return bcontainer->ops->dma_unmap(bcontainer, iova, size, iotlb); } +int vfio_container_set_dirty_page_tracking(VFIOContainer *bcontainer, + bool start) +{ + /* Fallback to all pages dirty if dirty page sync isn't supported */ + if (!bcontainer->ops->set_dirty_page_tracking) { + return 0; + } + + return bcontainer->ops->set_dirty_page_tracking(bcontainer, start); +} + +int vfio_container_query_dirty_bitmap(VFIOContainer *bcontainer, + VFIOBitmap *vbmap, + hwaddr iova, hwaddr size) +{ + if (!bcontainer->ops->query_dirty_bitmap) { + return -EINVAL; + } + + return bcontainer->ops->query_dirty_bitmap(bcontainer, vbmap, iova, size); +} + int vfio_container_add_section_window(VFIOContainer *bcontainer, MemoryRegionSection *section, Error **errp) @@ -75,6 +97,7 @@ void vfio_container_init(VFIOContainer *bcontainer, { bcontainer->ops = ops; bcontainer->space = space; + bcontainer->dirty_pages_supported = false; QLIST_INIT(&bcontainer->giommu_list); QLIST_INIT(&bcontainer->hostwin_list); } diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 5d111f69c9..26617afaa9 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -139,7 +139,7 @@ static int vfio_legacy_dma_unmap(VFIOContainer *bcontainer, hwaddr iova, if (iotlb && vfio_devices_all_running_and_mig_active(container)) { if 
(!vfio_devices_all_device_dirty_tracking(container) && - container->dirty_pages_supported) { + container->bcontainer.dirty_pages_supported) { return vfio_dma_unmap_bitmap(container, iova, size, iotlb); } @@ -308,14 +308,18 @@ static void vfio_legacy_del_section_window(VFIOContainer *bcontainer, } } -int vfio_set_dirty_page_tracking(VFIOLegacyContainer *container, bool start) +static int vfio_legacy_set_dirty_page_tracking(VFIOContainer *bcontainer, + bool start) { + VFIOLegacyContainer *container = container_of(bcontainer, + VFIOLegacyContainer, + bcontainer); int ret; struct vfio_iommu_type1_dirty_bitmap dirty = { .argsz = sizeof(dirty), }; - if (!container->dirty_pages_supported) { + if (!bcontainer->dirty_pages_supported) { return 0; } @@ -335,9 +339,13 @@ int vfio_set_dirty_page_tracking(VFIOLegacyContainer *container, bool start) return ret; } -int vfio_query_dirty_bitmap(VFIOLegacyContainer *container, VFIOBitmap *vbmap, - hwaddr iova, hwaddr size) +static int vfio_legacy_query_dirty_bitmap(VFIOContainer *bcontainer, + VFIOBitmap *vbmap, + hwaddr iova, hwaddr size) { + VFIOLegacyContainer *container = container_of(bcontainer, + VFIOLegacyContainer, + bcontainer); struct vfio_iommu_type1_dirty_bitmap *dbitmap; struct vfio_iommu_type1_dirty_bitmap_get *range; int ret; @@ -546,7 +554,7 @@ static void vfio_get_iommu_info_migration(VFIOLegacyContainer *container, * qemu_real_host_page_size to mark those dirty. 
*/ if (cap_mig->pgsize_bitmap & qemu_real_host_page_size()) { - container->dirty_pages_supported = true; + container->bcontainer.dirty_pages_supported = true; container->max_dirty_bitmap_size = cap_mig->max_dirty_bitmap_size; container->dirty_pgsizes = cap_mig->pgsize_bitmap; } @@ -634,7 +642,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container = g_malloc0(sizeof(*container)); container->fd = fd; container->error = NULL; - container->dirty_pages_supported = false; container->dma_max_mappings = 0; QLIST_INIT(&container->vrdl_list); bcontainer = &container->bcontainer; @@ -1173,6 +1180,8 @@ static void vfio_iommu_backend_legacy_ops_class_init(ObjectClass *oc, ops->dma_map = vfio_legacy_dma_map; ops->dma_unmap = vfio_legacy_dma_unmap; + ops->set_dirty_page_tracking = vfio_legacy_set_dirty_page_tracking; + ops->query_dirty_bitmap = vfio_legacy_query_dirty_bitmap; ops->add_window = vfio_legacy_add_section_window; ops->del_window = vfio_legacy_del_section_window; } From patchwork Mon Oct 16 08:32:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422734 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C6004CDB482 for ; Mon, 16 Oct 2023 08:49:57 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJGe-0007jx-R8; Mon, 16 Oct 2023 04:48:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGc-0007jY-EC for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:48:02 -0400 Received: 
from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGa-0001Al-4W for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:48:02 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446080; x=1728982080; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TkN681/KObc4X+pDGx/3PsHpoCt2Br6pYChqRu2oqZ8=; b=AYqRMR2grnkJ47dBSC+sF+eMPdwkkYTO/6F7wpCm2F+MyXkzGkhTx2g8 1y+gj8gVprij3kAURhZFnqVdp7JjrYipaSUKacMraD43BFDx3Kwmz2RND NORtzcc64r5lexRNmsa6KkyAEbYNnvAmi5VBfc4GYcKdY5asOciTN4ijm FOWoXtTbTtAl6kQhdh826a8Cv9NsYOlsa5lZsiEI76zlTTFvHTzgJB1Pv cxWS1czVd3wkr3qhT0druuChSOKpaGwL7RiaMcQaGA9ef4xotRpKjhbbJ 4KzjGuBNvs8u48BsIXG+a6YZh+n66sORRcJw8dc9XOAdWWPiUGyUH9Gjk Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737600" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737600" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:59 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749222862" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749222862" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:55 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v2 10/27] vfio/container: Move per container device list in base container Date: Mon, 16 Oct 2023 16:32:06 +0800 Message-Id: <20231016083223.1519410-11-zhenzhong.duan@intel.com> X-Mailer: git-send-email 
2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org VFIODevice is also changed to point to the base container instead of the legacy container. No functional change intended. Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 3 +-- include/hw/vfio/vfio-container-base.h | 1 + hw/vfio/common.c | 23 +++++++++++++++-------- hw/vfio/container.c | 12 ++++++------ 4 files changed, 23 insertions(+), 16 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 39bcc7ec33..6979359457 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -89,7 +89,6 @@ typedef struct VFIOLegacyContainer { unsigned int dma_max_mappings; QLIST_HEAD(, VFIOGroup) group_list; QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; - QLIST_HEAD(, VFIODevice) device_list; } VFIOLegacyContainer; typedef struct VFIORamDiscardListener { @@ -109,7 +108,7 @@ typedef struct VFIODevice { QLIST_ENTRY(VFIODevice) container_next; QLIST_ENTRY(VFIODevice) global_next; struct VFIOGroup *group; - VFIOLegacyContainer *container; + VFIOContainer *bcontainer; char *sysfsdev; char *name; DeviceState *dev; diff --git 
a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 5ab52774b5..49637a1e6c 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -70,6 +70,7 @@ struct VFIOContainer { QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_ENTRY(VFIOContainer) next; + QLIST_HEAD(, VFIODevice) device_list; }; int vfio_container_dma_map(VFIOContainer *bcontainer, diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 855d6d82d0..7350af038a 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -152,7 +152,7 @@ void vfio_unblock_multiple_devices_migration(void) bool vfio_viommu_preset(VFIODevice *vbasedev) { - return vbasedev->container->bcontainer.space->as != &address_space_memory; + return vbasedev->bcontainer->space->as != &address_space_memory; } static void vfio_set_migration_error(int err) @@ -186,6 +186,7 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev) static bool vfio_devices_all_dirty_tracking(VFIOLegacyContainer *container) { + VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; MigrationState *ms = migrate_get_current(); @@ -194,7 +195,7 @@ static bool vfio_devices_all_dirty_tracking(VFIOLegacyContainer *container) return false; } - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { VFIOMigration *migration = vbasedev->migration; if (!migration) { @@ -212,9 +213,10 @@ static bool vfio_devices_all_dirty_tracking(VFIOLegacyContainer *container) bool vfio_devices_all_device_dirty_tracking(VFIOLegacyContainer *container) { + VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { if (!vbasedev->dirty_pages_supported) { return false; } @@ -229,13 +231,14 @@ bool 
vfio_devices_all_device_dirty_tracking(VFIOLegacyContainer *container) */ bool vfio_devices_all_running_and_mig_active(VFIOLegacyContainer *container) { + VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; if (!migration_is_active(migrate_get_current())) { return false; } - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { VFIOMigration *migration = vbasedev->migration; if (!migration) { @@ -901,12 +904,13 @@ static bool vfio_section_is_vfio_pci(MemoryRegionSection *section, VFIOLegacyContainer *container) { VFIOPCIDevice *pcidev; + VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; Object *owner; owner = memory_region_owner(section->mr); - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { if (vbasedev->type != VFIO_DEVICE_TYPE_PCI) { continue; } @@ -1007,13 +1011,14 @@ static void vfio_devices_dma_logging_stop(VFIOLegacyContainer *container) uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature), sizeof(uint64_t))] = {}; struct vfio_device_feature *feature = (struct vfio_device_feature *)buf; + VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; feature->argsz = sizeof(buf); feature->flags = VFIO_DEVICE_FEATURE_SET | VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP; - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { if (!vbasedev->dirty_tracking) { continue; } @@ -1104,6 +1109,7 @@ static int vfio_devices_dma_logging_start(VFIOLegacyContainer *container) { struct vfio_device_feature *feature; VFIODirtyRanges ranges; + VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; int ret = 0; @@ -1114,7 +1120,7 @@ static int vfio_devices_dma_logging_start(VFIOLegacyContainer *container) return -errno; } - QLIST_FOREACH(vbasedev, 
&container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { if (vbasedev->dirty_tracking) { continue; } @@ -1211,10 +1217,11 @@ int vfio_devices_query_dirty_bitmap(VFIOLegacyContainer *container, VFIOBitmap *vbmap, hwaddr iova, hwaddr size) { + VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; int ret; - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { ret = vfio_device_dma_logging_report(vbasedev, iova, size, vbmap->bitmap); if (ret) { diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 26617afaa9..edcdee2904 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -1123,7 +1123,7 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev, int groupid = vfio_device_groupid(vbasedev, errp); VFIODevice *vbasedev_iter; VFIOGroup *group; - VFIOLegacyContainer *container; + VFIOContainer *bcontainer; int ret; if (groupid < 0) { @@ -1150,9 +1150,9 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev, return ret; } - container = group->container; - vbasedev->container = container; - QLIST_INSERT_HEAD(&container->device_list, vbasedev, container_next); + bcontainer = &group->container->bcontainer; + vbasedev->bcontainer = bcontainer; + QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next); QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next); return ret; @@ -1162,13 +1162,13 @@ void vfio_detach_device(VFIODevice *vbasedev) { VFIOGroup *group = vbasedev->group; - if (!vbasedev->container) { + if (!vbasedev->bcontainer) { return; } QLIST_REMOVE(vbasedev, global_next); QLIST_REMOVE(vbasedev, container_next); - vbasedev->container = NULL; + vbasedev->bcontainer = NULL; trace_vfio_detach_device(vbasedev->name, group->groupid); vfio_put_base_device(vbasedev); vfio_put_group(group); From patchwork Mon Oct 16 08:32:07 2023 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422724 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AFD3FCDB465 for ; Mon, 16 Oct 2023 08:48:28 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJGl-0007vO-QW; Mon, 16 Oct 2023 04:48:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGj-0007sd-MH for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:48:09 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGe-0001Al-CZ for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:48:09 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446084; x=1728982084; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=LTBYouoNRt+WnKVqdCYLxEKl6lqVBo80Ak696VjMBaI=; b=LUUXsCJ388xjZA+AILglbYPAMsYJj4nM92Fx8fuXNAJEZRJV+4FeZfqm DtF8pO+X0HW7GAWlw8nyhXeI/KMVH1R3f4Rlc4IlqKH9+jxzRlVJpGEFy Y6eqo8E4/XHMSViTXHU3BLMJSMKLFPHaPY+QXltUJ4n5gzvLvNTgXhu2r 3YuNg1t/u68YgUwdYL+14dCIXeR45iRCy8K6+5mJrxBQ4hrWr+R2YIA/P 4sRwAj4HsLwq98a8S/fgskjo8LqMHkyrPy0qQW/d8uvTedphF+dYe4xOf wJoTcn8FvzOt9wfu2trW9/lRAb3VM1wbLPOJdsAJTG6qGP3RaXnZKtjZl g==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737605" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737605" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:48:03 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749222873" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749222873" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:47:59 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v2 11/27] vfio/container: Convert functions to base container Date: Mon, 16 Oct 2023 16:32:07 +0800 Message-Id: <20231016083223.1519410-12-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger To prepare for getting rid of VFIOLegacyContainer references in common.c, let's convert miscellaneous functions to use the base container object instead: vfio_devices_all_dirty_tracking 
vfio_devices_all_device_dirty_tracking vfio_devices_all_running_and_mig_active vfio_devices_query_dirty_bitmap vfio_get_dirty_bitmap Signed-off-by: Eric Auger Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 9 ++++---- hw/vfio/common.c | 42 +++++++++++++++-------------------- hw/vfio/container.c | 6 ++--- hw/vfio/trace-events | 2 +- 4 files changed, 26 insertions(+), 33 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 6979359457..7bb75bc7cd 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -182,7 +182,6 @@ int vfio_host_win_del(VFIOContainer *bcontainer, hwaddr min_iova, hwaddr max_iova); VFIOAddressSpace *vfio_get_address_space(AddressSpace *as); void vfio_put_address_space(VFIOAddressSpace *space); -bool vfio_devices_all_running_and_saving(VFIOLegacyContainer *container); void vfio_disable_irqindex(VFIODevice *vbasedev, int index); void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index); @@ -254,11 +253,11 @@ bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp); void vfio_migration_exit(VFIODevice *vbasedev); int vfio_bitmap_alloc(VFIOBitmap *vbmap, hwaddr size); -bool vfio_devices_all_running_and_mig_active(VFIOLegacyContainer *container); -bool vfio_devices_all_device_dirty_tracking(VFIOLegacyContainer *container); -int vfio_devices_query_dirty_bitmap(VFIOLegacyContainer *container, +bool vfio_devices_all_running_and_mig_active(VFIOContainer *bcontainer); +bool vfio_devices_all_device_dirty_tracking(VFIOContainer *bcontainer); +int vfio_devices_query_dirty_bitmap(VFIOContainer *bcontainer, VFIOBitmap *vbmap, hwaddr iova, hwaddr size); -int vfio_get_dirty_bitmap(VFIOLegacyContainer *container, uint64_t iova, +int vfio_get_dirty_bitmap(VFIOContainer *bcontainer, uint64_t iova, uint64_t size, ram_addr_t ram_addr); #endif /* HW_VFIO_VFIO_COMMON_H */ diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 7350af038a..1c47bcc478 100644 --- 
a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -184,9 +184,8 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev) migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P; } -static bool vfio_devices_all_dirty_tracking(VFIOLegacyContainer *container) +static bool vfio_devices_all_dirty_tracking(VFIOContainer *bcontainer) { - VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; MigrationState *ms = migrate_get_current(); @@ -211,9 +210,8 @@ static bool vfio_devices_all_dirty_tracking(VFIOLegacyContainer *container) return true; } -bool vfio_devices_all_device_dirty_tracking(VFIOLegacyContainer *container) +bool vfio_devices_all_device_dirty_tracking(VFIOContainer *bcontainer) { - VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { @@ -229,9 +227,8 @@ bool vfio_devices_all_device_dirty_tracking(VFIOLegacyContainer *container) * Check if all VFIO devices are running and migration is active, which is * essentially equivalent to the migration being in pre-copy phase. 
*/ -bool vfio_devices_all_running_and_mig_active(VFIOLegacyContainer *container) +bool vfio_devices_all_running_and_mig_active(VFIOContainer *bcontainer) { - VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; if (!migration_is_active(migrate_get_current())) { @@ -1152,7 +1149,7 @@ static void vfio_listener_log_global_start(MemoryListener *listener) listener); int ret; - if (vfio_devices_all_device_dirty_tracking(container)) { + if (vfio_devices_all_device_dirty_tracking(&container->bcontainer)) { ret = vfio_devices_dma_logging_start(container); } else { ret = vfio_container_set_dirty_page_tracking(&container->bcontainer, @@ -1173,7 +1170,7 @@ static void vfio_listener_log_global_stop(MemoryListener *listener) listener); int ret = 0; - if (vfio_devices_all_device_dirty_tracking(container)) { + if (vfio_devices_all_device_dirty_tracking(&container->bcontainer)) { vfio_devices_dma_logging_stop(container); } else { ret = vfio_container_set_dirty_page_tracking(&container->bcontainer, @@ -1213,11 +1210,10 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova, return 0; } -int vfio_devices_query_dirty_bitmap(VFIOLegacyContainer *container, +int vfio_devices_query_dirty_bitmap(VFIOContainer *bcontainer, VFIOBitmap *vbmap, hwaddr iova, hwaddr size) { - VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; int ret; @@ -1237,17 +1233,16 @@ int vfio_devices_query_dirty_bitmap(VFIOLegacyContainer *container, return 0; } -int vfio_get_dirty_bitmap(VFIOLegacyContainer *container, uint64_t iova, +int vfio_get_dirty_bitmap(VFIOContainer *bcontainer, uint64_t iova, uint64_t size, ram_addr_t ram_addr) { bool all_device_dirty_tracking = - vfio_devices_all_device_dirty_tracking(container); + vfio_devices_all_device_dirty_tracking(bcontainer); uint64_t dirty_pages; VFIOBitmap vbmap; int ret; - if (!container->bcontainer.dirty_pages_supported && - !all_device_dirty_tracking) { + if (!bcontainer->dirty_pages_supported && 
!all_device_dirty_tracking) { cpu_physical_memory_set_dirty_range(ram_addr, size, tcg_enabled() ? DIRTY_CLIENTS_ALL : DIRTY_CLIENTS_NOCODE); @@ -1260,10 +1255,9 @@ int vfio_get_dirty_bitmap(VFIOLegacyContainer *container, uint64_t iova, } if (all_device_dirty_tracking) { - ret = vfio_devices_query_dirty_bitmap(container, &vbmap, iova, size); + ret = vfio_devices_query_dirty_bitmap(bcontainer, &vbmap, iova, size); } else { - ret = vfio_container_query_dirty_bitmap(&container->bcontainer, &vbmap, - iova, size); + ret = vfio_container_query_dirty_bitmap(bcontainer, &vbmap, iova, size); } if (ret) { @@ -1273,8 +1267,7 @@ int vfio_get_dirty_bitmap(VFIOLegacyContainer *container, uint64_t iova, dirty_pages = cpu_physical_memory_set_dirty_lebitmap(vbmap.bitmap, ram_addr, vbmap.pages); - trace_vfio_get_dirty_bitmap(container->fd, iova, size, vbmap.size, - ram_addr, dirty_pages); + trace_vfio_get_dirty_bitmap(iova, size, vbmap.size, ram_addr, dirty_pages); out: g_free(vbmap.bitmap); @@ -1309,8 +1302,8 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) rcu_read_lock(); if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) { - ret = vfio_get_dirty_bitmap(container, iova, iotlb->addr_mask + 1, - translated_addr); + ret = vfio_get_dirty_bitmap(&container->bcontainer, iova, + iotlb->addr_mask + 1, translated_addr); if (ret) { error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", @@ -1339,7 +1332,8 @@ static int vfio_ram_discard_get_dirty_bitmap(MemoryRegionSection *section, * Sync the whole mapped region (spanning multiple individual mappings) * in one go. 
*/ - return vfio_get_dirty_bitmap(vrdl->container, iova, size, ram_addr); + return vfio_get_dirty_bitmap(&vrdl->container->bcontainer, iova, size, + ram_addr); } static int @@ -1409,7 +1403,7 @@ static int vfio_sync_dirty_bitmap(VFIOLegacyContainer *container, ram_addr = memory_region_get_ram_addr(section->mr) + section->offset_within_region; - return vfio_get_dirty_bitmap(container, + return vfio_get_dirty_bitmap(&container->bcontainer, REAL_HOST_PAGE_ALIGN(section->offset_within_address_space), int128_get64(section->size), ram_addr); } @@ -1426,7 +1420,7 @@ static void vfio_listener_log_sync(MemoryListener *listener, return; } - if (vfio_devices_all_dirty_tracking(container)) { + if (vfio_devices_all_dirty_tracking(&container->bcontainer)) { ret = vfio_sync_dirty_bitmap(container, section); if (ret) { error_report("vfio: Failed to sync dirty bitmap, err: %d (%s)", ret, diff --git a/hw/vfio/container.c b/hw/vfio/container.c index edcdee2904..e278321c0a 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -137,8 +137,8 @@ static int vfio_legacy_dma_unmap(VFIOContainer *bcontainer, hwaddr iova, bool need_dirty_sync = false; int ret; - if (iotlb && vfio_devices_all_running_and_mig_active(container)) { - if (!vfio_devices_all_device_dirty_tracking(container) && + if (iotlb && vfio_devices_all_running_and_mig_active(bcontainer)) { + if (!vfio_devices_all_device_dirty_tracking(bcontainer) && container->bcontainer.dirty_pages_supported) { return vfio_dma_unmap_bitmap(container, iova, size, iotlb); } @@ -170,7 +170,7 @@ static int vfio_legacy_dma_unmap(VFIOContainer *bcontainer, hwaddr iova, } if (need_dirty_sync) { - ret = vfio_get_dirty_bitmap(container, iova, size, + ret = vfio_get_dirty_bitmap(bcontainer, iova, size, iotlb->translated_addr); if (ret) { return ret; diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 9f7fedee98..08a1f9dfa4 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -117,7 +117,7 @@ 
vfio_region_sparse_mmap_header(const char *name, int index, int nr_areas) "Devic vfio_region_sparse_mmap_entry(int i, unsigned long start, unsigned long end) "sparse entry %d [0x%lx - 0x%lx]" vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t subtype) "%s index %d, %08x/%08x" vfio_legacy_dma_unmap_overflow_workaround(void) "" -vfio_get_dirty_bitmap(int fd, uint64_t iova, uint64_t size, uint64_t bitmap_size, uint64_t start, uint64_t dirty_pages) "container fd=%d, iova=0x%"PRIx64" size= 0x%"PRIx64" bitmap_size=0x%"PRIx64" start=0x%"PRIx64" dirty_pages=%"PRIu64 +vfio_get_dirty_bitmap(uint64_t iova, uint64_t size, uint64_t bitmap_size, uint64_t start, uint64_t dirty_pages) "iova=0x%"PRIx64" size= 0x%"PRIx64" bitmap_size=0x%"PRIx64" start=0x%"PRIx64" dirty_pages=%"PRIu64 vfio_iommu_map_dirty_notify(uint64_t iova_start, uint64_t iova_end) "iommu dirty @ 0x%"PRIx64" - 0x%"PRIx64 # platform.c From patchwork Mon Oct 16 08:32:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422725 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 72DC7CDB465 for ; Mon, 16 Oct 2023 08:48:45 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJGz-0008JX-6G; Mon, 16 Oct 2023 04:48:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGn-00081Q-Kz; Mon, 16 Oct 2023 04:48:15 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) 
(Exim 4.90_1) (envelope-from ) id 1qsJGk-0001Al-R6; Mon, 16 Oct 2023 04:48:13 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446090; x=1728982090; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Rj1lV+86wlvl4h1F503Bqf0fwK/5J5WBfjLXGPuAsnM=; b=LT+eMfXRDURn1UmXbU5du4DARXJSYefkUdU8gsS6awtwASRvMXCbDOC4 NZl890bMl6tEtsvLp8Hvk0KVxMYJlGmzLTrhdhkg3a7VAJngIB7HHksCS CtBnP4wMgWDSKiLu1NcJ7UCyiCNVw0pmZaosCg3BMj7HvuUQzkyhEjk9a /5wG97CvqtyUL89+SK2gdUlFopfUfP9+edtomldGomAZhP7AZoIbWOuBJ RmIr5wef+YNPBpeZH9Mu/JLzjHVNGx6af7WlnU+3dvwZFdtH/+rXPcmXk 8UqsdrTzVwWHo1d+crUxadqEeeuXdwnX/IzwsIFtmN/RofuVakBNgDXHK Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737627" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737627" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:48:09 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749222885" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749222885" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:48:03 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v2 12/27] vfio/container: Move vrdl_list, pgsizes and dma_max_mappings to base container Date: Mon, 16 Oct 2023 16:32:08 +0800 Message-Id: <20231016083223.1519410-13-zhenzhong.duan@intel.com> X-Mailer: 
git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Move vrdl_list, pgsizes and dma_max_mappings to the base container object. No functional change intended.
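For readers new to the series, the layout this patch moves towards can be sketched roughly as follows. This is an illustration only, under simplifying assumptions: BaseContainer, LegacyContainer, min_page_size() and legacy_container_setup() are hypothetical stand-ins, not QEMU types or functions. Shared fields live in the embedded base object, and backend-specific code reaches them through it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the base container: backend-agnostic state. */
typedef struct BaseContainer {
    uint64_t pgsizes;              /* bitmask of supported IOVA page sizes */
    unsigned int dma_max_mappings; /* upper bound on concurrent DMA mappings */
} BaseContainer;

/* Hypothetical stand-in for the legacy container: embeds the base object. */
typedef struct LegacyContainer {
    BaseContainer bcontainer;
    int fd;
} LegacyContainer;

/*
 * Smallest supported page size: the lowest set bit of the mask.
 * For a non-zero mask this yields the same value as the
 * "1ULL << ctz64(pgsizes)" expression used in the diff.
 */
uint64_t min_page_size(const BaseContainer *bc)
{
    assert(bc->pgsizes != 0);
    return bc->pgsizes & (~bc->pgsizes + 1);
}

/* Backend code fills in the shared fields through the embedded base. */
void legacy_container_setup(LegacyContainer *c, int fd, uint64_t pgsizes)
{
    c->fd = fd;
    c->bcontainer.pgsizes = pgsizes;
    c->bcontainer.dma_max_mappings = 65535; /* fallback value seen in the diff */
}
```

Generic code then only needs a pointer to the base object (&c->bcontainer in this sketch) and never has to know which backend owns it.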
Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 13 -------- include/hw/vfio/vfio-container-base.h | 13 ++++++++ hw/vfio/common.c | 48 +++++++++++++-------------- hw/vfio/container-base.c | 12 +++++++ hw/vfio/container.c | 18 +++++----- hw/vfio/spapr.c | 4 +-- 6 files changed, 59 insertions(+), 49 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 7bb75bc7cd..18dd676a2a 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -85,22 +85,9 @@ typedef struct VFIOLegacyContainer { bool initialized; uint64_t dirty_pgsizes; uint64_t max_dirty_bitmap_size; - unsigned long pgsizes; - unsigned int dma_max_mappings; QLIST_HEAD(, VFIOGroup) group_list; - QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; } VFIOLegacyContainer; -typedef struct VFIORamDiscardListener { - VFIOLegacyContainer *container; - MemoryRegion *mr; - hwaddr offset_within_address_space; - hwaddr size; - uint64_t granularity; - RamDiscardListener listener; - QLIST_ENTRY(VFIORamDiscardListener) next; -} VFIORamDiscardListener; - typedef struct VFIODeviceOps VFIODeviceOps; typedef struct VFIODevice { diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 49637a1e6c..d6ffd7efc4 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -47,6 +47,16 @@ typedef struct VFIOGuestIOMMU { QLIST_ENTRY(VFIOGuestIOMMU) giommu_next; } VFIOGuestIOMMU; +typedef struct VFIORamDiscardListener { + VFIOContainer *bcontainer; + MemoryRegion *mr; + hwaddr offset_within_address_space; + hwaddr size; + uint64_t granularity; + RamDiscardListener listener; + QLIST_ENTRY(VFIORamDiscardListener) next; +} VFIORamDiscardListener; + typedef struct VFIOHostDMAWindow { hwaddr min_iova; hwaddr max_iova; @@ -66,9 +76,12 @@ typedef struct { struct VFIOContainer { VFIOIOMMUBackendOpsClass *ops; VFIOAddressSpace 
*space; + unsigned long pgsizes; + unsigned int dma_max_mappings; bool dirty_pages_supported; QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; + QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; QLIST_ENTRY(VFIOContainer) next; QLIST_HEAD(, VFIODevice) device_list; }; diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 1c47bcc478..b833def682 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -396,13 +396,13 @@ static void vfio_ram_discard_notify_discard(RamDiscardListener *rdl, { VFIORamDiscardListener *vrdl = container_of(rdl, VFIORamDiscardListener, listener); + VFIOContainer *bcontainer = vrdl->bcontainer; const hwaddr size = int128_get64(section->size); const hwaddr iova = section->offset_within_address_space; int ret; /* Unmap with a single call. */ - ret = vfio_container_dma_unmap(&vrdl->container->bcontainer, - iova, size , NULL); + ret = vfio_container_dma_unmap(bcontainer, iova, size , NULL); if (ret) { error_report("%s: vfio_container_dma_unmap() failed: %s", __func__, strerror(-ret)); @@ -414,6 +414,7 @@ static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl, { VFIORamDiscardListener *vrdl = container_of(rdl, VFIORamDiscardListener, listener); + VFIOContainer *bcontainer = vrdl->bcontainer; const hwaddr end = section->offset_within_region + int128_get64(section->size); hwaddr start, next, iova; @@ -432,8 +433,8 @@ static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl, section->offset_within_address_space; vaddr = memory_region_get_ram_ptr(section->mr) + start; - ret = vfio_container_dma_map(&vrdl->container->bcontainer, iova, - next - start, vaddr, section->readonly); + ret = vfio_container_dma_map(bcontainer, iova, next - start, + vaddr, section->readonly); if (ret) { /* Rollback */ vfio_ram_discard_notify_discard(rdl, section); @@ -443,7 +444,7 @@ static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl, return 0; } -static void 
vfio_register_ram_discard_listener(VFIOLegacyContainer *container, +static void vfio_register_ram_discard_listener(VFIOContainer *bcontainer, MemoryRegionSection *section) { RamDiscardManager *rdm = memory_region_get_ram_discard_manager(section->mr); @@ -456,7 +457,7 @@ static void vfio_register_ram_discard_listener(VFIOLegacyContainer *container, g_assert(QEMU_IS_ALIGNED(int128_get64(section->size), TARGET_PAGE_SIZE)); vrdl = g_new0(VFIORamDiscardListener, 1); - vrdl->container = container; + vrdl->bcontainer = bcontainer; vrdl->mr = section->mr; vrdl->offset_within_address_space = section->offset_within_address_space; vrdl->size = int128_get64(section->size); @@ -464,14 +465,14 @@ static void vfio_register_ram_discard_listener(VFIOLegacyContainer *container, section->mr); g_assert(vrdl->granularity && is_power_of_2(vrdl->granularity)); - g_assert(container->pgsizes && - vrdl->granularity >= 1ULL << ctz64(container->pgsizes)); + g_assert(bcontainer->pgsizes && + vrdl->granularity >= 1ULL << ctz64(bcontainer->pgsizes)); ram_discard_listener_init(&vrdl->listener, vfio_ram_discard_notify_populate, vfio_ram_discard_notify_discard, true); ram_discard_manager_register_listener(rdm, &vrdl->listener, section); - QLIST_INSERT_HEAD(&container->vrdl_list, vrdl, next); + QLIST_INSERT_HEAD(&bcontainer->vrdl_list, vrdl, next); /* * Sanity-check if we have a theoretically problematic setup where we could @@ -486,7 +487,7 @@ static void vfio_register_ram_discard_listener(VFIOLegacyContainer *container, * number of sections in the address space we could have over time, * also consuming DMA mappings. 
*/ - if (container->dma_max_mappings) { + if (bcontainer->dma_max_mappings) { unsigned int vrdl_count = 0, vrdl_mappings = 0, max_memslots = 512; #ifdef CONFIG_KVM @@ -495,7 +496,7 @@ static void vfio_register_ram_discard_listener(VFIOLegacyContainer *container, } #endif - QLIST_FOREACH(vrdl, &container->vrdl_list, next) { + QLIST_FOREACH(vrdl, &bcontainer->vrdl_list, next) { hwaddr start, end; start = QEMU_ALIGN_DOWN(vrdl->offset_within_address_space, @@ -507,23 +508,23 @@ static void vfio_register_ram_discard_listener(VFIOLegacyContainer *container, } if (vrdl_mappings + max_memslots - vrdl_count > - container->dma_max_mappings) { + bcontainer->dma_max_mappings) { warn_report("%s: possibly running out of DMA mappings. E.g., try" " increasing the 'block-size' of virtio-mem devies." " Maximum possible DMA mappings: %d, Maximum possible" - " memslots: %d", __func__, container->dma_max_mappings, + " memslots: %d", __func__, bcontainer->dma_max_mappings, max_memslots); } } } -static void vfio_unregister_ram_discard_listener(VFIOLegacyContainer *container, +static void vfio_unregister_ram_discard_listener(VFIOContainer *bcontainer, MemoryRegionSection *section) { RamDiscardManager *rdm = memory_region_get_ram_discard_manager(section->mr); VFIORamDiscardListener *vrdl = NULL; - QLIST_FOREACH(vrdl, &container->vrdl_list, next) { + QLIST_FOREACH(vrdl, &bcontainer->vrdl_list, next) { if (vrdl->mr == section->mr && vrdl->offset_within_address_space == section->offset_within_address_space) { @@ -697,7 +698,7 @@ static void vfio_listener_region_add(MemoryListener *listener, iommu_idx); ret = memory_region_iommu_set_page_size_mask(giommu->iommu_mr, - container->pgsizes, + bcontainer->pgsizes, &err); if (ret) { g_free(giommu); @@ -724,7 +725,7 @@ static void vfio_listener_region_add(MemoryListener *listener, * about changes. 
*/ if (memory_region_has_ram_discard_manager(section->mr)) { - vfio_register_ram_discard_listener(container, section); + vfio_register_ram_discard_listener(bcontainer, section); return; } @@ -848,7 +849,7 @@ static void vfio_listener_region_del(MemoryListener *listener, pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1; try_unmap = !((iova & pgmask) || (int128_get64(llsize) & pgmask)); } else if (memory_region_has_ram_discard_manager(section->mr)) { - vfio_unregister_ram_discard_listener(container, section); + vfio_unregister_ram_discard_listener(bcontainer, section); /* Unregistering will trigger an unmap. */ try_unmap = false; } @@ -1332,18 +1333,17 @@ static int vfio_ram_discard_get_dirty_bitmap(MemoryRegionSection *section, * Sync the whole mapped region (spanning multiple individual mappings) * in one go. */ - return vfio_get_dirty_bitmap(&vrdl->container->bcontainer, iova, size, - ram_addr); + return vfio_get_dirty_bitmap(vrdl->bcontainer, iova, size, ram_addr); } static int -vfio_sync_ram_discard_listener_dirty_bitmap(VFIOLegacyContainer *container, - MemoryRegionSection *section) +vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainer *bcontainer, + MemoryRegionSection *section) { RamDiscardManager *rdm = memory_region_get_ram_discard_manager(section->mr); VFIORamDiscardListener *vrdl = NULL; - QLIST_FOREACH(vrdl, &container->vrdl_list, next) { + QLIST_FOREACH(vrdl, &bcontainer->vrdl_list, next) { if (vrdl->mr == section->mr && vrdl->offset_within_address_space == section->offset_within_address_space) { @@ -1397,7 +1397,7 @@ static int vfio_sync_dirty_bitmap(VFIOLegacyContainer *container, } return 0; } else if (memory_region_has_ram_discard_manager(section->mr)) { - return vfio_sync_ram_discard_listener_dirty_bitmap(container, section); + return vfio_sync_ram_discard_listener_dirty_bitmap(bcontainer, section); } ram_addr = memory_region_get_ram_addr(section->mr) + diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index 
530ad42c0d..c5a4c5afed 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -98,17 +98,29 @@ void vfio_container_init(VFIOContainer *bcontainer, bcontainer->ops = ops; bcontainer->space = space; bcontainer->dirty_pages_supported = false; + bcontainer->dma_max_mappings = 0; QLIST_INIT(&bcontainer->giommu_list); QLIST_INIT(&bcontainer->hostwin_list); + QLIST_INIT(&bcontainer->vrdl_list); } void vfio_container_destroy(VFIOContainer *bcontainer) { + VFIORamDiscardListener *vrdl, *vrdl_tmp; VFIOGuestIOMMU *giommu, *tmp; VFIOHostDMAWindow *hostwin, *next; QLIST_REMOVE(bcontainer, next); + QLIST_FOREACH_SAFE(vrdl, &bcontainer->vrdl_list, next, vrdl_tmp) { + RamDiscardManager *rdm; + + rdm = memory_region_get_ram_discard_manager(vrdl->mr); + ram_discard_manager_unregister_listener(rdm, &vrdl->listener); + QLIST_REMOVE(vrdl, next); + g_free(vrdl); + } + QLIST_FOREACH_SAFE(giommu, &bcontainer->giommu_list, giommu_next, tmp) { memory_region_unregister_iommu_notifier( MEMORY_REGION(giommu->iommu_mr), &giommu->n); diff --git a/hw/vfio/container.c b/hw/vfio/container.c index e278321c0a..66fad1c280 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -162,7 +162,7 @@ static int vfio_legacy_dma_unmap(VFIOContainer *bcontainer, hwaddr iova, if (errno == EINVAL && unmap.size && !(unmap.iova + unmap.size) && container->iommu_type == VFIO_TYPE1v2_IOMMU) { trace_vfio_legacy_dma_unmap_overflow_workaround(); - unmap.size -= 1ULL << ctz64(container->pgsizes); + unmap.size -= 1ULL << ctz64(container->bcontainer.pgsizes); continue; } error_report("VFIO_UNMAP_DMA failed: %s", strerror(errno)); @@ -642,8 +642,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container = g_malloc0(sizeof(*container)); container->fd = fd; container->error = NULL; - container->dma_max_mappings = 0; - QLIST_INIT(&container->vrdl_list); bcontainer = &container->bcontainer; vfio_container_init(bcontainer, space, ops); @@ -671,13 +669,13 @@ static int 
vfio_connect_container(VFIOGroup *group, AddressSpace *as, } if (info->flags & VFIO_IOMMU_INFO_PGSIZES) { - container->pgsizes = info->iova_pgsizes; + container->bcontainer.pgsizes = info->iova_pgsizes; } else { - container->pgsizes = qemu_real_host_page_size(); + container->bcontainer.pgsizes = qemu_real_host_page_size(); } - if (!vfio_get_info_dma_avail(info, &container->dma_max_mappings)) { - container->dma_max_mappings = 65535; + if (!vfio_get_info_dma_avail(info, &bcontainer->dma_max_mappings)) { + container->bcontainer.dma_max_mappings = 65535; } vfio_get_iommu_info_migration(container, info); g_free(info); @@ -687,7 +685,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, * information to get the actual window extent rather than assume * a 64-bit IOVA address space. */ - vfio_host_win_add(bcontainer, 0, (hwaddr)-1, container->pgsizes); + vfio_host_win_add(bcontainer, 0, (hwaddr)-1, bcontainer->pgsizes); break; } @@ -736,7 +734,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, } if (v2) { - container->pgsizes = info.ddw.pgsizes; + container->bcontainer.pgsizes = info.ddw.pgsizes; /* * There is a default window in just created container. 
* To make region_add/del simpler, we better remove this @@ -751,7 +749,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, } } else { /* The default table uses 4K pages */ - container->pgsizes = 0x1000; + container->bcontainer.pgsizes = 0x1000; vfio_host_win_add(bcontainer, info.dma32_window_start, info.dma32_window_start + info.dma32_window_size - 1, diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index 683252c506..3fdad9d227 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -159,13 +159,13 @@ int vfio_spapr_create_window(VFIOLegacyContainer *container, if (pagesize > rampagesize) { pagesize = rampagesize; } - pgmask = container->pgsizes & (pagesize | (pagesize - 1)); + pgmask = container->bcontainer.pgsizes & (pagesize | (pagesize - 1)); pagesize = pgmask ? (1ULL << (63 - clz64(pgmask))) : 0; if (!pagesize) { error_report("Host doesn't support page size 0x%"PRIx64 ", the supported mask is 0x%lx", memory_region_iommu_get_min_page_size(iommu_mr), - container->pgsizes); + container->bcontainer.pgsizes); return -EINVAL; } From patchwork Mon Oct 16 08:32:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422739 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B8046CDB465 for ; Mon, 16 Oct 2023 08:50:59 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJH7-00009h-1S; Mon, 16 Oct 2023 04:48:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGx-0008Ic-Ed; Mon, 16 Oct 
2023 04:48:25 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGr-0001Al-9i; Mon, 16 Oct 2023 04:48:23 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446097; x=1728982097; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=oPwi6eVw5K0v9Z9FjdMpzuizve7IfLaTuIh1dcsksdc=; b=jslN8OkMqc/lEZxkdCmcIyo14TFz59+WQDUXHnBd2IkNiM3BlVQ0Fbfn bUmBGhjaMdJmMZ6hPTAm1+ZA+4P4Eeic3AY5d0kWOH4vM26KDgghJGZEb VzFNFsjcJFMfetRm3LTLq33aL7ebiFoGxL3YPiT0WWP+Shq8uWtC4ykWM xg2kkBlc6MUczTdXSIEl/uXmQYs8AFp+tJZ9BO16sdZJ1JQIHi9bjyY61 KdijJLvqVHup7jnGQAUckUzoq4v5vnrBKo1uDa2XEomnv7bawvl1rK8+0 ubJoTFG3DrhPFObg9yTI8aU2HtqzSVRruJa2JBQPxCoGqIvrWN5nlu3Ke g==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737639" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737639" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:48:15 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749222913" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749222913" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:48:09 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v2 13/27] vfio/container: Move listener to base container 
Date: Mon, 16 Oct 2023 16:32:09 +0800 Message-Id: <20231016083223.1519410-14-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Move the listener to the base container. The error and initialized fields are moved at the same time. No functional change intended.
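Once the listener is embedded in the base container, a listener callback can recover the base object directly from the listener pointer. A minimal sketch of that pattern, assuming simplified stand-in types (Listener, BaseContainer, CONTAINER_OF, base_region_add() and base_container_init() are hypothetical names, not QEMU's definitions):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MemoryListener: a single callback. */
typedef struct Listener Listener;
struct Listener {
    void (*region_add)(Listener *l, unsigned long iova);
};

/* Hypothetical stand-in for the base container with an embedded listener. */
typedef struct BaseContainer {
    int initialized;          /* placed first so the listener offset is non-zero */
    Listener listener;
    unsigned long last_iova;  /* records the most recent callback argument */
} BaseContainer;

/* Map a pointer to an embedded member back to the enclosing structure. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The callback only receives the listener, yet reaches the container. */
void base_region_add(Listener *l, unsigned long iova)
{
    BaseContainer *bc;

    assert(l != NULL);
    bc = CONTAINER_OF(l, BaseContainer, listener);
    bc->last_iova = iova;
}

void base_container_init(BaseContainer *bc)
{
    bc->initialized = 0;
    bc->listener.region_add = base_region_add;
    bc->last_iova = 0;
}
```

With this layout the callback no longer needs to go through a backend-specific structure first, which is what lets the listener code in common.c work on the base container alone.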
Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 3 - include/hw/vfio/vfio-container-base.h | 3 + hw/vfio/common.c | 116 +++++++++++--------------- hw/vfio/container-base.c | 1 + hw/vfio/container.c | 31 +++---- hw/vfio/spapr.c | 7 +- 6 files changed, 72 insertions(+), 89 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 18dd676a2a..8771160849 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -78,11 +78,8 @@ struct VFIOGroup; typedef struct VFIOLegacyContainer { VFIOContainer bcontainer; int fd; /* /dev/vfio/vfio, empowered by the attached groups */ - MemoryListener listener; MemoryListener prereg_listener; unsigned iommu_type; - Error *error; - bool initialized; uint64_t dirty_pgsizes; uint64_t max_dirty_bitmap_size; QLIST_HEAD(, VFIOGroup) group_list; diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index d6ffd7efc4..96d33495c1 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -76,6 +76,9 @@ typedef struct { struct VFIOContainer { VFIOIOMMUBackendOpsClass *ops; VFIOAddressSpace *space; + MemoryListener listener; + Error *error; + bool initialized; unsigned long pgsizes; unsigned int dma_max_mappings; bool dirty_pages_supported; diff --git a/hw/vfio/common.c b/hw/vfio/common.c index b833def682..da1d64efca 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -602,7 +602,7 @@ static bool vfio_listener_valid_section(MemoryRegionSection *section, return true; } -static bool vfio_get_section_iova_range(VFIOLegacyContainer *container, +static bool vfio_get_section_iova_range(VFIOContainer *bcontainer, MemoryRegionSection *section, hwaddr *out_iova, hwaddr *out_end, Int128 *out_llend) @@ -630,10 +630,7 @@ static bool vfio_get_section_iova_range(VFIOLegacyContainer *container, static void 
vfio_listener_region_add(MemoryListener *listener, MemoryRegionSection *section) { - VFIOLegacyContainer *container = container_of(listener, - VFIOLegacyContainer, - listener); - VFIOContainer *bcontainer = &container->bcontainer; + VFIOContainer *bcontainer = container_of(listener, VFIOContainer, listener); hwaddr iova, end; Int128 llend, llsize; void *vaddr; @@ -645,7 +642,8 @@ static void vfio_listener_region_add(MemoryListener *listener, return; } - if (!vfio_get_section_iova_range(container, section, &iova, &end, &llend)) { + if (!vfio_get_section_iova_range(bcontainer, section, &iova, &end, + &llend)) { if (memory_region_is_ram_device(section->mr)) { trace_vfio_listener_region_add_no_dma_map( memory_region_name(section->mr), @@ -663,7 +661,7 @@ static void vfio_listener_region_add(MemoryListener *listener, hostwin = vfio_find_hostwin(bcontainer, iova, end); if (!hostwin) { error_setg(&err, "Container %p can't map guest IOVA region" - " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx, container, iova, end); + " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx, bcontainer, iova, end); goto fail; } @@ -750,13 +748,12 @@ static void vfio_listener_region_add(MemoryListener *listener, } } - ret = vfio_container_dma_map(&container->bcontainer, - iova, int128_get64(llsize), vaddr, - section->readonly); + ret = vfio_container_dma_map(bcontainer, iova, int128_get64(llsize), + vaddr, section->readonly); if (ret) { error_setg(&err, "vfio_container_dma_map(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx", %p) = %d (%s)", - container, iova, int128_get64(llsize), vaddr, ret, + bcontainer, iova, int128_get64(llsize), vaddr, ret, strerror(-ret)); if (memory_region_is_ram_device(section->mr)) { /* Allow unexpected mappings not to be fatal for RAM devices */ @@ -778,9 +775,9 @@ fail: * can gracefully fail. Runtime, there's not much we can do other * than throw a hardware error. 
*/ - if (!container->initialized) { - if (!container->error) { - error_propagate_prepend(&container->error, err, + if (!bcontainer->initialized) { + if (!bcontainer->error) { + error_propagate_prepend(&bcontainer->error, err, "Region %s: ", memory_region_name(section->mr)); } else { @@ -795,10 +792,7 @@ fail: static void vfio_listener_region_del(MemoryListener *listener, MemoryRegionSection *section) { - VFIOLegacyContainer *container = container_of(listener, - VFIOLegacyContainer, - listener); - VFIOContainer *bcontainer = &container->bcontainer; + VFIOContainer *bcontainer = container_of(listener, VFIOContainer, listener); hwaddr iova, end; Int128 llend, llsize; int ret; @@ -831,7 +825,8 @@ static void vfio_listener_region_del(MemoryListener *listener, */ } - if (!vfio_get_section_iova_range(container, section, &iova, &end, &llend)) { + if (!vfio_get_section_iova_range(bcontainer, section, &iova, &end, + &llend)) { return; } @@ -858,29 +853,29 @@ static void vfio_listener_region_del(MemoryListener *listener, if (int128_eq(llsize, int128_2_64())) { /* The unmap ioctl doesn't accept a full 64-bit span. 
*/ llsize = int128_rshift(llsize, 1); - ret = vfio_container_dma_unmap(&container->bcontainer, iova, + ret = vfio_container_dma_unmap(bcontainer, iova, int128_get64(llsize), NULL); if (ret) { error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", - container, iova, int128_get64(llsize), ret, + bcontainer, iova, int128_get64(llsize), ret, strerror(-ret)); } iova += int128_get64(llsize); } - ret = vfio_container_dma_unmap(&container->bcontainer, iova, + ret = vfio_container_dma_unmap(bcontainer, iova, int128_get64(llsize), NULL); if (ret) { error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", - container, iova, int128_get64(llsize), ret, + bcontainer, iova, int128_get64(llsize), ret, strerror(-ret)); } } memory_region_unref(section->mr); - vfio_container_del_section_window(&container->bcontainer, section); + vfio_container_del_section_window(bcontainer, section); } typedef struct VFIODirtyRanges { @@ -893,16 +888,15 @@ typedef struct VFIODirtyRanges { } VFIODirtyRanges; typedef struct VFIODirtyRangesListener { - VFIOLegacyContainer *container; + VFIOContainer *bcontainer; VFIODirtyRanges ranges; MemoryListener listener; } VFIODirtyRangesListener; static bool vfio_section_is_vfio_pci(MemoryRegionSection *section, - VFIOLegacyContainer *container) + VFIOContainer *bcontainer) { VFIOPCIDevice *pcidev; - VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; Object *owner; @@ -931,7 +925,7 @@ static void vfio_dirty_tracking_update(MemoryListener *listener, hwaddr iova, end, *min, *max; if (!vfio_listener_valid_section(section, "tracking_update") || - !vfio_get_section_iova_range(dirty->container, section, + !vfio_get_section_iova_range(dirty->bcontainer, section, &iova, &end, NULL)) { return; } @@ -955,7 +949,7 @@ static void vfio_dirty_tracking_update(MemoryListener *listener, * The alternative would be an IOVATree but that has a much bigger runtime * overhead and 
unnecessary complexity. */ - if (vfio_section_is_vfio_pci(section, dirty->container) && + if (vfio_section_is_vfio_pci(section, dirty->bcontainer) && iova >= UINT32_MAX) { min = &range->minpci64; max = &range->maxpci64; @@ -979,7 +973,7 @@ static const MemoryListener vfio_dirty_tracking_listener = { .region_add = vfio_dirty_tracking_update, }; -static void vfio_dirty_tracking_init(VFIOLegacyContainer *container, +static void vfio_dirty_tracking_init(VFIOContainer *bcontainer, VFIODirtyRanges *ranges) { VFIODirtyRangesListener dirty; @@ -989,10 +983,10 @@ static void vfio_dirty_tracking_init(VFIOLegacyContainer *container, dirty.ranges.min64 = UINT64_MAX; dirty.ranges.minpci64 = UINT64_MAX; dirty.listener = vfio_dirty_tracking_listener; - dirty.container = container; + dirty.bcontainer = bcontainer; memory_listener_register(&dirty.listener, - container->bcontainer.space->as); + bcontainer->space->as); *ranges = dirty.ranges; @@ -1004,12 +998,11 @@ static void vfio_dirty_tracking_init(VFIOLegacyContainer *container, memory_listener_unregister(&dirty.listener); } -static void vfio_devices_dma_logging_stop(VFIOLegacyContainer *container) +static void vfio_devices_dma_logging_stop(VFIOContainer *bcontainer) { uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature), sizeof(uint64_t))] = {}; struct vfio_device_feature *feature = (struct vfio_device_feature *)buf; - VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; feature->argsz = sizeof(buf); @@ -1030,7 +1023,7 @@ static void vfio_devices_dma_logging_stop(VFIOLegacyContainer *container) } static struct vfio_device_feature * -vfio_device_feature_dma_logging_start_create(VFIOLegacyContainer *container, +vfio_device_feature_dma_logging_start_create(VFIOContainer *bcontainer, VFIODirtyRanges *tracking) { struct vfio_device_feature *feature; @@ -1103,16 +1096,15 @@ static void vfio_device_feature_dma_logging_start_destroy( g_free(feature); } -static int 
vfio_devices_dma_logging_start(VFIOLegacyContainer *container) +static int vfio_devices_dma_logging_start(VFIOContainer *bcontainer) { struct vfio_device_feature *feature; VFIODirtyRanges ranges; - VFIOContainer *bcontainer = &container->bcontainer; VFIODevice *vbasedev; int ret = 0; - vfio_dirty_tracking_init(container, &ranges); - feature = vfio_device_feature_dma_logging_start_create(container, + vfio_dirty_tracking_init(bcontainer, &ranges); + feature = vfio_device_feature_dma_logging_start_create(bcontainer, &ranges); if (!feature) { return -errno; @@ -1135,7 +1127,7 @@ static int vfio_devices_dma_logging_start(VFIOLegacyContainer *container) out: if (ret) { - vfio_devices_dma_logging_stop(container); + vfio_devices_dma_logging_stop(bcontainer); } vfio_device_feature_dma_logging_start_destroy(feature); @@ -1145,16 +1137,13 @@ out: static void vfio_listener_log_global_start(MemoryListener *listener) { - VFIOLegacyContainer *container = container_of(listener, - VFIOLegacyContainer, - listener); + VFIOContainer *bcontainer = container_of(listener, VFIOContainer, listener); int ret; - if (vfio_devices_all_device_dirty_tracking(&container->bcontainer)) { - ret = vfio_devices_dma_logging_start(container); + if (vfio_devices_all_device_dirty_tracking(bcontainer)) { + ret = vfio_devices_dma_logging_start(bcontainer); } else { - ret = vfio_container_set_dirty_page_tracking(&container->bcontainer, - true); + ret = vfio_container_set_dirty_page_tracking(bcontainer, true); } if (ret) { @@ -1166,16 +1155,13 @@ static void vfio_listener_log_global_start(MemoryListener *listener) static void vfio_listener_log_global_stop(MemoryListener *listener) { - VFIOLegacyContainer *container = container_of(listener, - VFIOLegacyContainer, - listener); + VFIOContainer *bcontainer = container_of(listener, VFIOContainer, listener); int ret = 0; - if (vfio_devices_all_device_dirty_tracking(&container->bcontainer)) { - vfio_devices_dma_logging_stop(container); + if 
(vfio_devices_all_device_dirty_tracking(bcontainer)) { + vfio_devices_dma_logging_stop(bcontainer); } else { - ret = vfio_container_set_dirty_page_tracking(&container->bcontainer, - false); + ret = vfio_container_set_dirty_page_tracking(bcontainer, false); } if (ret) { @@ -1286,9 +1272,6 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) vfio_giommu_dirty_notifier, n); VFIOGuestIOMMU *giommu = gdn->giommu; VFIOContainer *bcontainer = giommu->bcontainer; - VFIOLegacyContainer *container = container_of(bcontainer, - VFIOLegacyContainer, - bcontainer); hwaddr iova = iotlb->iova + giommu->iommu_offset; ram_addr_t translated_addr; int ret = -EINVAL; @@ -1303,12 +1286,12 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) rcu_read_lock(); if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) { - ret = vfio_get_dirty_bitmap(&container->bcontainer, iova, - iotlb->addr_mask + 1, translated_addr); + ret = vfio_get_dirty_bitmap(bcontainer, iova, iotlb->addr_mask + 1, + translated_addr); if (ret) { error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", - container, iova, iotlb->addr_mask + 1, ret, + bcontainer, iova, iotlb->addr_mask + 1, ret, strerror(-ret)); } } @@ -1364,10 +1347,9 @@ vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainer *bcontainer, &vrdl); } -static int vfio_sync_dirty_bitmap(VFIOLegacyContainer *container, +static int vfio_sync_dirty_bitmap(VFIOContainer *bcontainer, MemoryRegionSection *section) { - VFIOContainer *bcontainer = &container->bcontainer; ram_addr_t ram_addr; if (memory_region_is_iommu(section->mr)) { @@ -1403,7 +1385,7 @@ static int vfio_sync_dirty_bitmap(VFIOLegacyContainer *container, ram_addr = memory_region_get_ram_addr(section->mr) + section->offset_within_region; - return vfio_get_dirty_bitmap(&container->bcontainer, + return vfio_get_dirty_bitmap(bcontainer, REAL_HOST_PAGE_ALIGN(section->offset_within_address_space), 
int128_get64(section->size), ram_addr); } @@ -1411,17 +1393,15 @@ static int vfio_sync_dirty_bitmap(VFIOLegacyContainer *container, static void vfio_listener_log_sync(MemoryListener *listener, MemoryRegionSection *section) { - VFIOLegacyContainer *container = container_of(listener, - VFIOLegacyContainer, - listener); + VFIOContainer *bcontainer = container_of(listener, VFIOContainer, listener); int ret; if (vfio_listener_skipped_section(section)) { return; } - if (vfio_devices_all_dirty_tracking(&container->bcontainer)) { - ret = vfio_sync_dirty_bitmap(container, section); + if (vfio_devices_all_dirty_tracking(bcontainer)) { + ret = vfio_sync_dirty_bitmap(bcontainer, section); if (ret) { error_report("vfio: Failed to sync dirty bitmap, err: %d (%s)", ret, strerror(-ret)); diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index c5a4c5afed..29cf954019 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -97,6 +97,7 @@ void vfio_container_init(VFIOContainer *bcontainer, { bcontainer->ops = ops; bcontainer->space = space; + bcontainer->error = NULL; bcontainer->dirty_pages_supported = false; bcontainer->dma_max_mappings = 0; QLIST_INIT(&bcontainer->giommu_list); diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 66fad1c280..5b14a9b307 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -382,7 +382,9 @@ static int vfio_legacy_query_dirty_bitmap(VFIOContainer *bcontainer, static void vfio_listener_release(VFIOLegacyContainer *container) { - memory_listener_unregister(&container->listener); + VFIOContainer *bcontainer = &container->bcontainer; + + memory_listener_unregister(&bcontainer->listener); if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) { memory_listener_unregister(&container->prereg_listener); } @@ -540,6 +542,7 @@ static void vfio_get_iommu_info_migration(VFIOLegacyContainer *container, { struct vfio_info_cap_header *hdr; struct vfio_iommu_type1_info_cap_migration *cap_mig; + VFIOContainer *bcontainer = 
&container->bcontainer; hdr = vfio_get_iommu_info_cap(info, VFIO_IOMMU_TYPE1_INFO_CAP_MIGRATION); if (!hdr) { @@ -554,7 +557,7 @@ static void vfio_get_iommu_info_migration(VFIOLegacyContainer *container, * qemu_real_host_page_size to mark those dirty. */ if (cap_mig->pgsize_bitmap & qemu_real_host_page_size()) { - container->bcontainer.dirty_pages_supported = true; + bcontainer->dirty_pages_supported = true; container->max_dirty_bitmap_size = cap_mig->max_dirty_bitmap_size; container->dirty_pgsizes = cap_mig->pgsize_bitmap; } @@ -641,7 +644,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container = g_malloc0(sizeof(*container)); container->fd = fd; - container->error = NULL; bcontainer = &container->bcontainer; vfio_container_init(bcontainer, space, ops); @@ -669,9 +671,9 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, } if (info->flags & VFIO_IOMMU_INFO_PGSIZES) { - container->bcontainer.pgsizes = info->iova_pgsizes; + bcontainer->pgsizes = info->iova_pgsizes; } else { - container->bcontainer.pgsizes = qemu_real_host_page_size(); + bcontainer->pgsizes = qemu_real_host_page_size(); } if (!vfio_get_info_dma_avail(info, &bcontainer->dma_max_mappings)) { @@ -712,10 +714,10 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, memory_listener_register(&container->prereg_listener, &address_space_memory); - if (container->error) { + if (bcontainer->error) { memory_listener_unregister(&container->prereg_listener); ret = -1; - error_propagate_prepend(errp, container->error, + error_propagate_prepend(errp, bcontainer->error, "RAM memory listener initialization failed: "); goto enable_discards_exit; } @@ -734,7 +736,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, } if (v2) { - container->bcontainer.pgsizes = info.ddw.pgsizes; + bcontainer->pgsizes = info.ddw.pgsizes; /* * There is a default window in just created container. 
* To make region_add/del simpler, we better remove this @@ -749,7 +751,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, } } else { /* The default table uses 4K pages */ - container->bcontainer.pgsizes = 0x1000; + bcontainer->pgsizes = 0x1000; vfio_host_win_add(bcontainer, info.dma32_window_start, info.dma32_window_start + info.dma32_window_size - 1, @@ -766,19 +768,18 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, group->container = container; QLIST_INSERT_HEAD(&container->group_list, group, container_next); - container->listener = vfio_memory_listener; + bcontainer->listener = vfio_memory_listener; - memory_listener_register(&container->listener, - container->bcontainer.space->as); + memory_listener_register(&bcontainer->listener, bcontainer->space->as); - if (container->error) { + if (bcontainer->error) { ret = -1; - error_propagate_prepend(errp, container->error, + error_propagate_prepend(errp, bcontainer->error, "memory listener initialization failed: "); goto listener_release_exit; } - container->initialized = true; + bcontainer->initialized = true; return 0; listener_release_exit: diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index 3fdad9d227..7df2a7a672 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -41,6 +41,7 @@ static void vfio_prereg_listener_region_add(MemoryListener *listener, { VFIOLegacyContainer *container = container_of(listener, VFIOLegacyContainer, prereg_listener); + VFIOContainer *bcontainer = &container->bcontainer; const hwaddr gpa = section->offset_within_address_space; hwaddr end; int ret; @@ -83,9 +84,9 @@ static void vfio_prereg_listener_region_add(MemoryListener *listener, * can gracefully fail. Runtime, there's not much we can do other * than throw a hardware error. 
*/ - if (!container->initialized) { - if (!container->error) { - error_setg_errno(&container->error, -ret, + if (!bcontainer->initialized) { + if (!bcontainer->error) { + error_setg_errno(&bcontainer->error, -ret, "Memory registering failed"); } } else { From patchwork Mon Oct 16 08:32:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422744 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2D312C46CA1 for ; Mon, 16 Oct 2023 08:51:42 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJH7-0000GZ-Kz; Mon, 16 Oct 2023 04:48:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGz-0008Lf-BU for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:48:28 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJGu-0001FN-W5 for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:48:23 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446101; x=1728982101; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Y+HkCVoou1khEr9z+ilkcNKYFw+WgtgDANUYD38SDLY=; b=aUJwlPXssAgDFeAoPes55AgJsg6FhFqd/F5Yqf2W1GNy4d2BZ7MC08ot J56JtEKgCxRBNkaAeL9P9DWLsMjPkksanMvJoZfO7f1G5koSKU7lqkXan fobEFewhEE13teAOdvywYJceXBVoWaTPO8B9FTOHID6i+jWPZwo177qPR zkIqa8fFJq0GZLLQmW5GFxKtzQIaofuF+wNTFytrzxWLwV6tuVTx54m16 
WO9Px23AFfLTU3Peu4g7O0E3z1gMDNgk9kMhVlrRiX7o1pc8tL4iHCCMq FoTg5mDUtSq1vgnTxnpT/YHlZaUx5yZ3VDU8CTIZPaXk0Yqs4sIVT7V+T Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737646" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737646" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:48:19 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749222939" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749222939" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:48:14 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v2 14/27] vfio/container: Move dirty_pgsizes and max_dirty_bitmap_size to base container Date: Mon, 16 Oct 2023 16:32:10 +0800 Message-Id: <20231016083223.1519410-15-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger No functional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 2 -- include/hw/vfio/vfio-container-base.h | 2 ++ hw/vfio/container.c | 11 ++++++----- 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 8771160849..9f2b86581b 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -80,8 +80,6 @@ typedef struct VFIOLegacyContainer { int fd; /* /dev/vfio/vfio, empowered by the attached groups */ MemoryListener prereg_listener; unsigned iommu_type; - uint64_t dirty_pgsizes; - uint64_t max_dirty_bitmap_size; QLIST_HEAD(, VFIOGroup) group_list; } VFIOLegacyContainer; diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 96d33495c1..9a5971a00a 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -79,6 +79,8 @@ struct VFIOContainer { MemoryListener listener; Error *error; bool initialized; + uint64_t dirty_pgsizes; + uint64_t max_dirty_bitmap_size; unsigned long pgsizes; unsigned int dma_max_mappings; bool dirty_pages_supported; diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 5b14a9b307..9d5be749c7 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -70,6 +70,7 @@ static int vfio_dma_unmap_bitmap(VFIOLegacyContainer *container, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb) { + VFIOContainer *bcontainer = &container->bcontainer; struct vfio_iommu_type1_dma_unmap *unmap; struct vfio_bitmap *bitmap; VFIOBitmap vbmap; @@ -97,7 +98,7 @@ static int vfio_dma_unmap_bitmap(VFIOLegacyContainer *container, bitmap->size = vbmap.size; bitmap->data = (__u64 *)vbmap.bitmap; 
- if (vbmap.size > container->max_dirty_bitmap_size) { + if (vbmap.size > bcontainer->max_dirty_bitmap_size) { error_report("UNMAP: Size of bitmap too big 0x%"PRIx64, vbmap.size); ret = -E2BIG; goto unmap_exit; @@ -139,7 +140,7 @@ static int vfio_legacy_dma_unmap(VFIOContainer *bcontainer, hwaddr iova, if (iotlb && vfio_devices_all_running_and_mig_active(bcontainer)) { if (!vfio_devices_all_device_dirty_tracking(bcontainer) && - container->bcontainer.dirty_pages_supported) { + bcontainer->dirty_pages_supported) { return vfio_dma_unmap_bitmap(container, iova, size, iotlb); } @@ -162,7 +163,7 @@ static int vfio_legacy_dma_unmap(VFIOContainer *bcontainer, hwaddr iova, if (errno == EINVAL && unmap.size && !(unmap.iova + unmap.size) && container->iommu_type == VFIO_TYPE1v2_IOMMU) { trace_vfio_legacy_dma_unmap_overflow_workaround(); - unmap.size -= 1ULL << ctz64(container->bcontainer.pgsizes); + unmap.size -= 1ULL << ctz64(bcontainer->pgsizes); continue; } error_report("VFIO_UNMAP_DMA failed: %s", strerror(errno)); @@ -558,8 +559,8 @@ static void vfio_get_iommu_info_migration(VFIOLegacyContainer *container, */ if (cap_mig->pgsize_bitmap & qemu_real_host_page_size()) { bcontainer->dirty_pages_supported = true; - container->max_dirty_bitmap_size = cap_mig->max_dirty_bitmap_size; - container->dirty_pgsizes = cap_mig->pgsize_bitmap; + bcontainer->max_dirty_bitmap_size = cap_mig->max_dirty_bitmap_size; + bcontainer->dirty_pgsizes = cap_mig->pgsize_bitmap; } } From patchwork Mon Oct 16 08:32:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422737 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS 
id B7865CDB482 for ; Mon, 16 Oct 2023 08:50:38 +0000 (UTC) From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc:
alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v2 15/27] vfio/container: Implement attach/detach_device Date: Mon, 16 Oct 2023 16:32:11 +0800 Message-Id: <20231016083223.1519410-16-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger No functional change intended.
Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- hw/vfio/common.c | 22 ++++++++++++++++++++++ hw/vfio/container.c | 12 +++++------- 2 files changed, 27 insertions(+), 7 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index da1d64efca..ee2ebf4be9 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1552,3 +1552,25 @@ retry: return info; } + +int vfio_attach_device(char *name, VFIODevice *vbasedev, + AddressSpace *as, Error **errp) +{ + const VFIOIOMMUBackendOpsClass *ops; + + ops = VFIO_IOMMU_BACKEND_OPS_CLASS( + object_class_by_name(TYPE_VFIO_IOMMU_BACKEND_LEGACY_OPS)); + if (!ops) { + error_setg(errp, "VFIO IOMMU Backend not found!"); + return -ENODEV; + } + return ops->attach_device(name, vbasedev, as, errp); +} + +void vfio_detach_device(VFIODevice *vbasedev) +{ + if (!vbasedev->bcontainer) { + return; + } + vbasedev->bcontainer->ops->detach_device(vbasedev); +} diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 9d5be749c7..c86accdb38 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -1117,8 +1117,8 @@ static int vfio_device_groupid(VFIODevice *vbasedev, Error **errp) * @name and @vbasedev->name are likely to be different depending * on the type of the device, hence the need for passing @name */ -int vfio_attach_device(char *name, VFIODevice *vbasedev, - AddressSpace *as, Error **errp) +static int vfio_legacy_attach_device(char *name, VFIODevice *vbasedev, + AddressSpace *as, Error **errp) { int groupid = vfio_device_groupid(vbasedev, errp); VFIODevice *vbasedev_iter; @@ -1158,14 +1158,10 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev, return ret; } -void vfio_detach_device(VFIODevice *vbasedev) +static void vfio_legacy_detach_device(VFIODevice *vbasedev) { VFIOGroup *group = vbasedev->group; - if (!vbasedev->bcontainer) { - return; - } - QLIST_REMOVE(vbasedev, global_next); QLIST_REMOVE(vbasedev, container_next); vbasedev->bcontainer = NULL; @@ -1180,6 
+1176,8 @@ static void vfio_iommu_backend_legacy_ops_class_init(ObjectClass *oc, ops->dma_map = vfio_legacy_dma_map; ops->dma_unmap = vfio_legacy_dma_unmap; + ops->attach_device = vfio_legacy_attach_device; + ops->detach_device = vfio_legacy_detach_device; ops->set_dirty_page_tracking = vfio_legacy_set_dirty_page_tracking; ops->query_dirty_bitmap = vfio_legacy_query_dirty_bitmap; ops->add_window = vfio_legacy_add_section_window;
From patchwork Mon Oct 16 08:32:12 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13422742
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan, Paolo Bonzini, =?utf-8?q?Marc-Andr=C3=A9_Lureau?=, =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?=, Thomas Huth, =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?=
Subject: [PATCH v2 16/27] Add iommufd configure option
Date: Mon, 16 Oct 2023 16:32:12 +0800
Message-Id: <20231016083223.1519410-17-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com>
References: <20231016083223.1519410-1-zhenzhong.duan@intel.com>
This adds "--enable-iommufd/--disable-iommufd" to enable or disable iommufd support, enabled by default.
Signed-off-by: Zhenzhong Duan
--- meson.build | 6 ++++++ meson_options.txt | 2 ++ scripts/meson-buildoptions.sh | 3 +++ 3 files changed, 11 insertions(+) diff --git a/meson.build b/meson.build index 79aef19bdc..e8d285aa5b 100644 --- a/meson.build +++ b/meson.build @@ -560,6 +560,10 @@ have_tpm = get_option('tpm') \ .require(targetos != 'windows', error_message: 'TPM emulation only available on POSIX systems') \ .allowed() +have_iommufd = get_option('iommufd') \ + .require(targetos == 'linux', error_message: 'iommufd is supported only on Linux') \ + .allowed() + # vhost have_vhost_user = get_option('vhost_user') \ .disable_auto_if(targetos != 'linux') \ @@ -2126,6 +2130,7 @@ if get_option('tcg').allowed() endif config_host_data.set('CONFIG_TPM', have_tpm) config_host_data.set('CONFIG_TSAN', get_option('tsan')) +config_host_data.set('CONFIG_IOMMUFD', have_iommufd) config_host_data.set('CONFIG_USB_LIBUSB', libusb.found()) config_host_data.set('CONFIG_VDE', vde.found()) config_host_data.set('CONFIG_VHOST_NET', have_vhost_net) @@ -4061,6 +4066,7 @@ summary_info += {'vhost-user-crypto support': have_vhost_user_crypto} summary_info += {'vhost-user-blk server support': have_vhost_user_blk_server} summary_info += {'vhost-vdpa support': have_vhost_vdpa} summary_info += {'build guest agent': have_ga} +summary_info += {'iommufd support': have_iommufd} summary(summary_info, bool_yn: true, section: 'Configurable features') # Compilation
information diff --git a/meson_options.txt b/meson_options.txt index 6a17b90968..62bd75284b 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -107,6 +107,8 @@ option('dbus_display', type: 'feature', value: 'auto', description: '-display dbus support') option('tpm', type : 'feature', value : 'auto', description: 'TPM support') +option('iommufd', type : 'feature', value : 'auto', + description: 'iommufd support') # Do not enable it by default even for Mingw32, because it doesn't # work on Wine. diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh index 2a74b0275b..86909dc2cc 100644 --- a/scripts/meson-buildoptions.sh +++ b/scripts/meson-buildoptions.sh @@ -114,6 +114,7 @@ meson_options_help() { printf "%s\n" ' guest-agent-msi Build MSI package for the QEMU Guest Agent' printf "%s\n" ' hvf HVF acceleration support' printf "%s\n" ' iconv Font glyph conversion support' + printf "%s\n" ' iommufd iommufd support' printf "%s\n" ' jack JACK sound support' printf "%s\n" ' keyring Linux keyring support' printf "%s\n" ' kvm KVM acceleration support' @@ -327,6 +328,8 @@ _meson_option_parse() { --enable-install-blobs) printf "%s" -Dinstall_blobs=true ;; --disable-install-blobs) printf "%s" -Dinstall_blobs=false ;; --interp-prefix=*) quote_sh "-Dinterp_prefix=$2" ;; + --enable-iommufd) printf "%s" -Diommufd=enabled ;; + --disable-iommufd) printf "%s" -Diommufd=disabled ;; --enable-jack) printf "%s" -Djack=enabled ;; --disable-jack) printf "%s" -Djack=disabled ;; --enable-keyring) printf "%s" -Dkeyring=enabled ;;
From patchwork Mon Oct 16 08:32:13 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13422731
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan, Paolo Bonzini, =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?=, Eduardo Habkost, Eric Blake, Markus Armbruster
Subject: [PATCH v2 17/27] backends/iommufd: Introduce the iommufd object
Date: Mon, 16 Oct 2023 16:32:13 +0800
Message-Id: <20231016083223.1519410-18-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com>
References: <20231016083223.1519410-1-zhenzhong.duan@intel.com>
From: Eric Auger
Introduce an iommufd object that allows interaction with the host /dev/iommu device. /dev/iommu may already have been opened outside of QEMU, in which case the fd can be passed directly along with the iommufd object. This allows the iommufd object to be shared across several subsystems (VFIO, VDPA, ...). For example, libvirt would open /dev/iommu once.
If no fd is passed along with the iommufd object, the /dev/iommu is opened by the qemu code. The CONFIG_IOMMUFD option must be set to compile this new object. Suggested-by: Alex Williamson Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Zhenzhong Duan --- MAINTAINERS | 7 + qapi/qom.json | 18 ++- include/sysemu/iommufd.h | 46 +++++++ backends/iommufd-stub.c | 59 +++++++++ backends/iommufd.c | 268 +++++++++++++++++++++++++++++++++++++++ backends/Kconfig | 4 + backends/meson.build | 5 + backends/trace-events | 12 ++ qemu-options.hx | 13 ++ 9 files changed, 431 insertions(+), 1 deletion(-) create mode 100644 include/sysemu/iommufd.h create mode 100644 backends/iommufd-stub.c create mode 100644 backends/iommufd.c diff --git a/MAINTAINERS b/MAINTAINERS index 9e7dec4a58..a7cdeb7825 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2081,6 +2081,13 @@ F: hw/vfio/ap.c F: docs/system/s390x/vfio-ap.rst L: qemu-s390x@nongnu.org +iommufd +M: Yi Liu +M: Eric Auger +S: Supported +F: backends/iommufd.c +F: include/sysemu/iommufd.h + vhost M: Michael S. Tsirkin S: Supported diff --git a/qapi/qom.json b/qapi/qom.json index c53ef978ff..3f964e57f5 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -794,6 +794,18 @@ { 'struct': 'VfioUserServerProperties', 'data': { 'socket': 'SocketAddress', 'device': 'str' } } +## +# @IOMMUFDProperties: +# +# Properties for IOMMUFDbackend objects. 
+# +# fd: file descriptor name +# +# Since: 7.2 +## +{ 'struct': 'IOMMUFDProperties', + 'data': { '*fd': 'str' } } + ## # @RngProperties: # @@ -948,6 +960,8 @@ 'qtest', 'rng-builtin', 'rng-egd', + { 'name': 'iommufd', + 'if': 'CONFIG_IOMMUFD' }, { 'name': 'rng-random', 'if': 'CONFIG_POSIX' }, 'secret', @@ -1029,7 +1043,9 @@ 'tls-creds-x509': 'TlsCredsX509Properties', 'tls-cipher-suites': 'TlsCredsProperties', 'x-remote-object': 'RemoteObjectProperties', - 'x-vfio-user-server': 'VfioUserServerProperties' + 'x-vfio-user-server': 'VfioUserServerProperties', + 'iommufd': { 'type': 'IOMMUFDProperties', + 'if': 'CONFIG_IOMMUFD' } } } ## diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h new file mode 100644 index 0000000000..f0e5c7eeb8 --- /dev/null +++ b/include/sysemu/iommufd.h @@ -0,0 +1,46 @@ +#ifndef SYSEMU_IOMMUFD_H +#define SYSEMU_IOMMUFD_H + +#include "qom/object.h" +#include "qemu/thread.h" +#include "exec/hwaddr.h" +#include "exec/cpu-common.h" + +#define TYPE_IOMMUFD_BACKEND "iommufd" +OBJECT_DECLARE_TYPE(IOMMUFDBackend, IOMMUFDBackendClass, + IOMMUFD_BACKEND) +#define IOMMUFD_BACKEND(obj) \ + OBJECT_CHECK(IOMMUFDBackend, (obj), TYPE_IOMMUFD_BACKEND) +#define IOMMUFD_BACKEND_GET_CLASS(obj) \ + OBJECT_GET_CLASS(IOMMUFDBackendClass, (obj), TYPE_IOMMUFD_BACKEND) +#define IOMMUFD_BACKEND_CLASS(klass) \ + OBJECT_CLASS_CHECK(IOMMUFDBackendClass, (klass), TYPE_IOMMUFD_BACKEND) +struct IOMMUFDBackendClass { + ObjectClass parent_class; +}; + +struct IOMMUFDBackend { + Object parent; + + /*< protected >*/ + int fd; /* /dev/iommu file descriptor */ + bool owned; /* is the /dev/iommu opened internally */ + QemuMutex lock; + uint32_t users; + + /*< public >*/ +}; + +int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp); +void iommufd_backend_disconnect(IOMMUFDBackend *be); + +int iommufd_backend_get_ioas(IOMMUFDBackend *be, uint32_t *ioas_id); +void iommufd_backend_put_ioas(IOMMUFDBackend *be, uint32_t ioas_id); +void iommufd_backend_free_id(int fd, 
uint32_t id); +int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova, + ram_addr_t size, void *vaddr, bool readonly); +int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id, + hwaddr iova, ram_addr_t size); +int iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, + uint32_t pt_id, uint32_t *out_hwpt); +#endif diff --git a/backends/iommufd-stub.c b/backends/iommufd-stub.c new file mode 100644 index 0000000000..cfb9a87859 --- /dev/null +++ b/backends/iommufd-stub.c @@ -0,0 +1,59 @@ +/* + * iommufd container backend stub + * + * Copyright (C) 2023 Intel Corporation. + * Copyright Red Hat, Inc. 2023 + * + * Authors: Yi Liu + * Eric Auger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + + * You should have received a copy of the GNU General Public License along + * with this program; if not, see . 
+ */ + +#include "qemu/osdep.h" +#include "sysemu/iommufd.h" +#include "qemu/error-report.h" + +int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp) +{ + return 0; +} +void iommufd_backend_disconnect(IOMMUFDBackend *be) +{ +} +void iommufd_backend_free_id(int fd, uint32_t id) +{ +} +int iommufd_backend_get_ioas(IOMMUFDBackend *be, uint32_t *ioas_id) +{ + return 0; +} +void iommufd_backend_put_ioas(IOMMUFDBackend *be, uint32_t ioas_id) +{ +} +int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova, + ram_addr_t size, void *vaddr, bool readonly) +{ + return 0; +} +int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id, + hwaddr iova, ram_addr_t size) +{ + return 0; +} +int iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, + uint32_t pt_id, uint32_t *out_hwpt) +{ + return 0; +} diff --git a/backends/iommufd.c b/backends/iommufd.c new file mode 100644 index 0000000000..3f0ed37847 --- /dev/null +++ b/backends/iommufd.c @@ -0,0 +1,268 @@ +/* + * iommufd container backend + * + * Copyright (C) 2023 Intel Corporation. + * Copyright Red Hat, Inc. 2023 + * + * Authors: Yi Liu + * Eric Auger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + + * You should have received a copy of the GNU General Public License along + * with this program; if not, see . 
+ */ + +#include "qemu/osdep.h" +#include "sysemu/iommufd.h" +#include "qapi/error.h" +#include "qapi/qmp/qerror.h" +#include "qemu/module.h" +#include "qom/object_interfaces.h" +#include "qemu/error-report.h" +#include "monitor/monitor.h" +#include "trace.h" +#include +#include + +static void iommufd_backend_init(Object *obj) +{ + IOMMUFDBackend *be = IOMMUFD_BACKEND(obj); + + be->fd = -1; + be->users = 0; + be->owned = true; + qemu_mutex_init(&be->lock); +} + +static void iommufd_backend_finalize(Object *obj) +{ + IOMMUFDBackend *be = IOMMUFD_BACKEND(obj); + + if (be->owned) { + close(be->fd); + be->fd = -1; + } +} + +static void iommufd_backend_set_fd(Object *obj, const char *str, Error **errp) +{ + IOMMUFDBackend *be = IOMMUFD_BACKEND(obj); + int fd = -1; + + fd = monitor_fd_param(monitor_cur(), str, errp); + if (fd == -1) { + error_prepend(errp, "Could not parse remote object fd %s:", str); + return; + } + qemu_mutex_lock(&be->lock); + be->fd = fd; + be->owned = false; + qemu_mutex_unlock(&be->lock); + trace_iommu_backend_set_fd(be->fd); +} + +static void iommufd_backend_class_init(ObjectClass *oc, void *data) +{ + object_class_property_add_str(oc, "fd", NULL, iommufd_backend_set_fd); +} + +int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp) +{ + int fd, ret = 0; + + qemu_mutex_lock(&be->lock); + if (be->users == UINT32_MAX) { + error_setg(errp, "too many connections"); + ret = -E2BIG; + goto out; + } + if (be->owned && !be->users) { + fd = qemu_open_old("/dev/iommu", O_RDWR); + if (fd < 0) { + error_setg_errno(errp, errno, "/dev/iommu opening failed"); + ret = fd; + goto out; + } + be->fd = fd; + } + be->users++; +out: + trace_iommufd_backend_connect(be->fd, be->owned, + be->users, ret); + qemu_mutex_unlock(&be->lock); + return ret; +} + +void iommufd_backend_disconnect(IOMMUFDBackend *be) +{ + qemu_mutex_lock(&be->lock); + if (!be->users) { + goto out; + } + be->users--; + if (!be->users && be->owned) { + close(be->fd); + be->fd = -1; + } +out: + 
trace_iommufd_backend_disconnect(be->fd, be->users); + qemu_mutex_unlock(&be->lock); +} + +static int iommufd_backend_alloc_ioas(int fd, uint32_t *ioas_id) +{ + int ret; + struct iommu_ioas_alloc alloc_data = { + .size = sizeof(alloc_data), + .flags = 0, + }; + + ret = ioctl(fd, IOMMU_IOAS_ALLOC, &alloc_data); + if (ret) { + error_report("Failed to allocate ioas %m"); + } + + *ioas_id = alloc_data.out_ioas_id; + trace_iommufd_backend_alloc_ioas(fd, *ioas_id, ret); + + return ret; +} + +void iommufd_backend_free_id(int fd, uint32_t id) +{ + int ret; + struct iommu_destroy des = { + .size = sizeof(des), + .id = id, + }; + + ret = ioctl(fd, IOMMU_DESTROY, &des); + trace_iommufd_backend_free_id(fd, id, ret); + if (ret) { + error_report("Failed to free id: %u %m", id); + } +} + +int iommufd_backend_get_ioas(IOMMUFDBackend *be, uint32_t *ioas_id) +{ + int ret; + + ret = iommufd_backend_alloc_ioas(be->fd, ioas_id); + trace_iommufd_backend_get_ioas(be->fd, *ioas_id, ret); + return ret; +} + +void iommufd_backend_put_ioas(IOMMUFDBackend *be, uint32_t ioas_id) +{ + iommufd_backend_free_id(be->fd, ioas_id); + trace_iommufd_backend_put_ioas(be->fd, ioas_id); +} + +int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova, + ram_addr_t size, void *vaddr, bool readonly) +{ + int ret; + struct iommu_ioas_map map = { + .size = sizeof(map), + .flags = IOMMU_IOAS_MAP_READABLE | + IOMMU_IOAS_MAP_FIXED_IOVA, + .ioas_id = ioas_id, + .__reserved = 0, + .user_va = (uintptr_t)vaddr, + .iova = iova, + .length = size, + }; + + if (!readonly) { + map.flags |= IOMMU_IOAS_MAP_WRITEABLE; + } + + ret = ioctl(be->fd, IOMMU_IOAS_MAP, &map); + trace_iommufd_backend_map_dma(be->fd, ioas_id, iova, size, + vaddr, readonly, ret); + if (ret) { + error_report("IOMMU_IOAS_MAP failed: %m"); + } + return !ret ? 
0 : -errno; +} + +int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id, + hwaddr iova, ram_addr_t size) +{ + int ret; + struct iommu_ioas_unmap unmap = { + .size = sizeof(unmap), + .ioas_id = ioas_id, + .iova = iova, + .length = size, + }; + + ret = ioctl(be->fd, IOMMU_IOAS_UNMAP, &unmap); + trace_iommufd_backend_unmap_dma(be->fd, ioas_id, iova, size, ret); + /* + * TODO: IOMMUFD doesn't support mapping PCI BARs for now. + * It's not a problem if there is no p2p dma, relax it here + * and avoid many noisy trigger from vIOMMU side. + */ + if (ret && errno == ENOENT) { + ret = 0; + } + if (ret) { + error_report("IOMMU_IOAS_UNMAP failed: %m"); + } + return !ret ? 0 : -errno; +} + +int iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, + uint32_t pt_id, uint32_t *out_hwpt) +{ + int ret; + struct iommu_hwpt_alloc alloc_hwpt = { + .size = sizeof(struct iommu_hwpt_alloc), + .flags = 0, + .dev_id = dev_id, + .pt_id = pt_id, + .__reserved = 0, + }; + + ret = ioctl(iommufd, IOMMU_HWPT_ALLOC, &alloc_hwpt); + trace_iommufd_backend_alloc_hwpt(iommufd, dev_id, pt_id, + alloc_hwpt.out_hwpt_id, ret); + + if (ret) { + error_report("IOMMU_HWPT_ALLOC failed: %m"); + } else { + *out_hwpt = alloc_hwpt.out_hwpt_id; + } + return !ret ? 
0 : -errno; +} + +static const TypeInfo iommufd_backend_info = { + .name = TYPE_IOMMUFD_BACKEND, + .parent = TYPE_OBJECT, + .instance_size = sizeof(IOMMUFDBackend), + .instance_init = iommufd_backend_init, + .instance_finalize = iommufd_backend_finalize, + .class_size = sizeof(IOMMUFDBackendClass), + .class_init = iommufd_backend_class_init, + .interfaces = (InterfaceInfo[]) { + { TYPE_USER_CREATABLE }, + { } + } +}; + +static void register_types(void) +{ + type_register_static(&iommufd_backend_info); +} + +type_init(register_types); diff --git a/backends/Kconfig b/backends/Kconfig index f35abc1609..2cb23f62fa 100644 --- a/backends/Kconfig +++ b/backends/Kconfig @@ -1 +1,5 @@ source tpm/Kconfig + +config IOMMUFD + bool + depends on VFIO diff --git a/backends/meson.build b/backends/meson.build index 914c7c4afb..05ac57ff15 100644 --- a/backends/meson.build +++ b/backends/meson.build @@ -20,6 +20,11 @@ if have_vhost_user system_ss.add(when: 'CONFIG_VIRTIO', if_true: files('vhost-user.c')) endif system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost.c')) +if have_iommufd + system_ss.add(files('iommufd.c')) +else + system_ss.add(files('iommufd-stub.c')) +endif if have_vhost_user_crypto system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost-user.c')) endif diff --git a/backends/trace-events b/backends/trace-events index 652eb76a57..e5f828bca2 100644 --- a/backends/trace-events +++ b/backends/trace-events @@ -5,3 +5,15 @@ dbus_vmstate_pre_save(void) dbus_vmstate_post_load(int version_id) "version_id: %d" dbus_vmstate_loading(const char *id) "id: %s" dbus_vmstate_saving(const char *id) "id: %s" + +# iommufd.c +iommufd_backend_connect(int fd, bool owned, uint32_t users, int ret) "fd=%d owned=%d users=%d (%d)" +iommufd_backend_disconnect(int fd, uint32_t users) "fd=%d users=%d" +iommu_backend_set_fd(int fd) "pre-opened /dev/iommu fd=%d" +iommufd_backend_get_ioas(int iommufd, uint32_t ioas, int ret) " iommufd=%d ioas=%d (%d)" 
+iommufd_backend_put_ioas(int iommufd, uint32_t ioas) " iommufd=%d ioas=%d" +iommufd_backend_map_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, void *vaddr, bool readonly, int ret) " iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" addr=%p readonly=%d (%d)" +iommufd_backend_unmap_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, int ret) " iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" (%d)" +iommufd_backend_alloc_ioas(int iommufd, uint32_t ioas, int ret) " iommufd=%d ioas=%d (%d)" +iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)" +iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_t out_hwpt_id, int ret) " iommufd=%d dev_id=%u pt_id=%u out_hwpt=%u (%d)" diff --git a/qemu-options.hx b/qemu-options.hx index 54a7e94970..0af0d379a6 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -5207,6 +5207,19 @@ SRST The ``share`` boolean option is on by default with memfd. +#ifdef CONFIG_IOMMUFD + ``-object iommufd,id=id[,fd=fd]`` + Creates an iommufd backend which allows control of DMA mapping + through the /dev/iommu device. + + The ``id`` parameter is a unique ID which frontends (such as + vfio-pci or vdpa) will use to connect with the iommufd backend. + + The ``fd`` parameter is an optional pre-opened file descriptor + resulting from /dev/iommu opening. Usually the iommufd is shared + across all subsystems, bringing the benefit of centralized + reference counting. +#endif ``-object rng-builtin,id=id`` Creates a random number generator backend which obtains entropy from QEMU builtin functions.
The ``id`` parameter is a unique ID
From patchwork Mon Oct 16 08:32:14 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13422736
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan
Subject: [PATCH v2 18/27] util/char_dev: Add open_cdev()
Date: Mon, 16 Oct 2023 16:32:14 +0800
Message-Id: <20231016083223.1519410-19-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com>
References: <20231016083223.1519410-1-zhenzhong.duan@intel.com>
From: Yi Liu
/dev/vfio/devices/vfioX may not exist.
In that case it is still possible to open /dev/char/$major:$minor instead. Add helper function to abstract the cdev open. Suggested-by: Jason Gunthorpe Signed-off-by: Yi Liu Signed-off-by: Zhenzhong Duan --- MAINTAINERS | 6 +++ include/qemu/chardev_open.h | 16 ++++++++ util/chardev_open.c | 81 +++++++++++++++++++++++++++++++++++++ util/meson.build | 1 + 4 files changed, 104 insertions(+) create mode 100644 include/qemu/chardev_open.h create mode 100644 util/chardev_open.c diff --git a/MAINTAINERS b/MAINTAINERS index a7cdeb7825..eb6b7d274c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3408,6 +3408,12 @@ S: Maintained F: include/qemu/iova-tree.h F: util/iova-tree.c +cdev Open +M: Yi Liu +S: Maintained +F: include/qemu/chardev_open.h +F: util/chardev_open.c + elf2dmp M: Viktor Prutyanov S: Maintained diff --git a/include/qemu/chardev_open.h b/include/qemu/chardev_open.h new file mode 100644 index 0000000000..6580d351c6 --- /dev/null +++ b/include/qemu/chardev_open.h @@ -0,0 +1,16 @@ +/* + * QEMU Chardev Helper + * + * Copyright (C) 2023 Intel Corporation. + * + * Authors: Yi Liu + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + */ + +#ifndef QEMU_CHARDEV_HELPERS_H +#define QEMU_CHARDEV_HELPERS_H + +int open_cdev(const char *devpath, dev_t cdev); +#endif diff --git a/util/chardev_open.c b/util/chardev_open.c new file mode 100644 index 0000000000..005d2b81bd --- /dev/null +++ b/util/chardev_open.c @@ -0,0 +1,81 @@ +/* + * Copyright (c) 2019, Mellanox Technologies. All rights reserved. + * Copyright (C) 2023 Intel Corporation. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Authors: Yi Liu + * + * Copied from + * https://github.com/linux-rdma/rdma-core/blob/master/util/open_cdev.c + * + */ + +#include "qemu/osdep.h" +#include "qemu/chardev_open.h" + +static int open_cdev_internal(const char *path, dev_t cdev) +{ + struct stat st; + int fd; + + fd = qemu_open_old(path, O_RDWR); + if (fd == -1) { + return -1; + } + if (fstat(fd, &st) || !S_ISCHR(st.st_mode) || + (cdev != 0 && st.st_rdev != cdev)) { + close(fd); + return -1; + } + return fd; +} + +static int open_cdev_robust(dev_t cdev) +{ + g_autofree char *devpath; + + /* + * This assumes that udev is being used and is creating the /dev/char/ + * symlinks. 
+ */ + devpath = g_strdup_printf("/dev/char/%u:%u", major(cdev), minor(cdev)); + return open_cdev_internal(devpath, cdev); +} + +int open_cdev(const char *devpath, dev_t cdev) +{ + int fd; + + fd = open_cdev_internal(devpath, cdev); + if (fd == -1 && cdev != 0) { + return open_cdev_robust(cdev); + } + return fd; +} diff --git a/util/meson.build b/util/meson.build index c4827fd70a..654f4528fb 100644 --- a/util/meson.build +++ b/util/meson.build @@ -106,6 +106,7 @@ if have_block util_ss.add(files('filemonitor-stub.c')) endif util_ss.add(when: 'CONFIG_LINUX', if_true: files('vfio-helpers.c')) + util_ss.add(when: 'CONFIG_LINUX', if_true: files('chardev_open.c')) endif if cpu == 'aarch64' From patchwork Mon Oct 16 08:32:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422732 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A60E7CDB483 for ; Mon, 16 Oct 2023 08:49:48 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJHa-0002vG-9I; Mon, 16 Oct 2023 04:49:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJHU-0002ku-32 for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:48:56 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJHQ-0001HS-Ru for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:48:55 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; 
From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v2 19/27] vfio/iommufd: Implement the iommufd backend Date: Mon, 16 Oct 2023 16:32:15 +0800 Message-Id: <20231016083223.1519410-20-zhenzhong.duan@intel.com> In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> X-Spam_report: 
(-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Yi Liu Add the iommufd backend. The IOMMUFD container class is implemented based on the new /dev/iommu user API. This backend obviously depends on CONFIG_IOMMUFD. So far, the iommufd backend doesn't support dirty page sync yet due to missing support in the host kernel. Co-authored-by: Eric Auger Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 22 ++ include/hw/vfio/vfio-container-base.h | 3 + hw/vfio/common.c | 19 +- hw/vfio/iommufd.c | 535 ++++++++++++++++++++++++++ hw/vfio/meson.build | 3 + hw/vfio/trace-events | 12 + 6 files changed, 590 insertions(+), 4 deletions(-) create mode 100644 hw/vfio/iommufd.c diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 9f2b86581b..e72f5962ee 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -83,6 +83,23 @@ typedef struct VFIOLegacyContainer { QLIST_HEAD(, VFIOGroup) group_list; } VFIOLegacyContainer; +#ifdef CONFIG_IOMMUFD +typedef struct VFIOIOASHwpt { + uint32_t hwpt_id; + QLIST_HEAD(, VFIODevice) device_list; + QLIST_ENTRY(VFIOIOASHwpt) next; +} VFIOIOASHwpt; + +typedef struct IOMMUFDBackend IOMMUFDBackend; + +typedef struct VFIOIOMMUFDContainer { + VFIOContainer bcontainer; + IOMMUFDBackend *be; + uint32_t ioas_id; + QLIST_HEAD(, VFIOIOASHwpt) hwpt_list; +} VFIOIOMMUFDContainer; +#endif + typedef struct VFIODeviceOps VFIODeviceOps; typedef struct 
VFIODevice { @@ -110,6 +127,11 @@ typedef struct VFIODevice { OnOffAuto pre_copy_dirty_page_tracking; bool dirty_pages_supported; bool dirty_tracking; +#ifdef CONFIG_IOMMUFD + int devid; + VFIOIOASHwpt *hwpt; + IOMMUFDBackend *iommufd; +#endif } VFIODevice; struct VFIODeviceOps { diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 9a5971a00a..5345986993 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -114,6 +114,9 @@ void vfio_container_init(VFIOContainer *bcontainer, void vfio_container_destroy(VFIOContainer *bcontainer); #define TYPE_VFIO_IOMMU_BACKEND_LEGACY_OPS "vfio-iommu-backend-legacy-ops" +#ifdef CONFIG_IOMMUFD +#define TYPE_VFIO_IOMMU_BACKEND_IOMMUFD_OPS "vfio-iommu-backend-iommufd-ops" +#endif #define TYPE_VFIO_IOMMU_BACKEND_OPS "vfio-iommu-backend-ops" DECLARE_CLASS_CHECKERS(VFIOIOMMUBackendOpsClass, diff --git a/hw/vfio/common.c b/hw/vfio/common.c index ee2ebf4be9..6901573c32 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1520,10 +1520,13 @@ VFIOAddressSpace *vfio_get_address_space(AddressSpace *as) void vfio_put_address_space(VFIOAddressSpace *space) { - if (QLIST_EMPTY(&space->containers)) { - QLIST_REMOVE(space, list); - g_free(space); + if (!QLIST_EMPTY(&space->containers)) { + return; } + + QLIST_REMOVE(space, list); + g_free(space); + if (QLIST_EMPTY(&vfio_address_spaces)) { qemu_unregister_reset(vfio_reset_handler, NULL); } @@ -1558,8 +1561,16 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev, { const VFIOIOMMUBackendOpsClass *ops; - ops = VFIO_IOMMU_BACKEND_OPS_CLASS( +#ifdef CONFIG_IOMMUFD + if (vbasedev->iommufd) { + ops = VFIO_IOMMU_BACKEND_OPS_CLASS( + object_class_by_name(TYPE_VFIO_IOMMU_BACKEND_IOMMUFD_OPS)); + } else +#endif + { + ops = VFIO_IOMMU_BACKEND_OPS_CLASS( object_class_by_name(TYPE_VFIO_IOMMU_BACKEND_LEGACY_OPS)); + } if (!ops) { error_setg(errp, "VFIO IOMMU Backend not found!"); return -ENODEV; diff --git 
a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c new file mode 100644 index 0000000000..ee8c4620b6 --- /dev/null +++ b/hw/vfio/iommufd.c @@ -0,0 +1,535 @@ +/* + * iommufd container backend + * + * Copyright (C) 2023 Intel Corporation. + * Copyright Red Hat, Inc. 2023 + * + * Authors: Yi Liu + * Eric Auger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + + * You should have received a copy of the GNU General Public License along + * with this program; if not, see . + */ + +#include "qemu/osdep.h" +#include +#include +#include + +#include "hw/vfio/vfio-common.h" +#include "qemu/error-report.h" +#include "trace.h" +#include "qapi/error.h" +#include "sysemu/iommufd.h" +#include "hw/qdev-core.h" +#include "sysemu/reset.h" +#include "qemu/cutils.h" +#include "qemu/chardev_open.h" + +static int iommufd_map(VFIOContainer *bcontainer, hwaddr iova, + ram_addr_t size, void *vaddr, bool readonly) +{ + VFIOIOMMUFDContainer *container = + container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer); + + return iommufd_backend_map_dma(container->be, + container->ioas_id, + iova, size, vaddr, readonly); +} + +static int iommufd_unmap(VFIOContainer *bcontainer, + hwaddr iova, ram_addr_t size, + IOMMUTLBEntry *iotlb) +{ + VFIOIOMMUFDContainer *container = + container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer); + + /* TODO: Handle dma_unmap_bitmap with iotlb args (migration) */ + return iommufd_backend_unmap_dma(container->be, + container->ioas_id, iova, size); +} + +static void vfio_kvm_device_add_device(VFIODevice *vbasedev) +{ + 
Error *err = NULL; + + if (vfio_kvm_device_add_fd(vbasedev->fd, &err)) { + error_report_err(err); + } +} + +static void vfio_kvm_device_del_device(VFIODevice *vbasedev) +{ + Error *err = NULL; + + if (vfio_kvm_device_del_fd(vbasedev->fd, &err)) { + error_report_err(err); + } +} + +static int iommufd_connect_and_bind(VFIODevice *vbasedev, Error **errp) +{ + IOMMUFDBackend *iommufd = vbasedev->iommufd; + struct vfio_device_bind_iommufd bind = { + .argsz = sizeof(bind), + .flags = 0, + }; + int ret; + + ret = iommufd_backend_connect(iommufd, errp); + if (ret) { + return ret; + } + + /* + * Add device to kvm-vfio to be prepared for the tracking + * in KVM. Especially for some emulated devices, it requires + * to have kvm information in the device open. + */ + vfio_kvm_device_add_device(vbasedev); + + /* Bind device to iommufd */ + bind.iommufd = iommufd->fd; + ret = ioctl(vbasedev->fd, VFIO_DEVICE_BIND_IOMMUFD, &bind); + if (ret) { + error_setg_errno(errp, errno, "error bind device fd=%d to iommufd=%d", + vbasedev->fd, bind.iommufd); + goto err_bind; + } + + vbasedev->devid = bind.out_devid; + trace_vfio_iommufd_bind_device(bind.iommufd, vbasedev->name, + vbasedev->fd, vbasedev->devid); + return ret; +err_bind: + vfio_kvm_device_del_device(vbasedev); + iommufd_backend_disconnect(iommufd); + return ret; +} + +static void iommufd_unbind_and_disconnect(VFIODevice *vbasedev) +{ + /* Unbind is automatically conducted when device fd is closed */ + vfio_kvm_device_del_device(vbasedev); + iommufd_backend_disconnect(vbasedev->iommufd); +} + +static int vfio_get_devicefd(const char *sysfs_path, Error **errp) +{ + long int ret = -ENOTTY; + char *path, *vfio_dev_path = NULL, *vfio_path = NULL; + DIR *dir = NULL; + struct dirent *dent; + gchar *contents; + struct stat st; + gsize length; + int major, minor; + dev_t vfio_devt; + + path = g_strdup_printf("%s/vfio-dev", sysfs_path); + if (stat(path, &st) < 0) { + error_setg_errno(errp, errno, "no such host device"); + goto 
out_free_path; + } + + dir = opendir(path); + if (!dir) { + error_setg_errno(errp, errno, "couldn't open directory %s", path); + goto out_free_path; + } + + while ((dent = readdir(dir))) { + if (!strncmp(dent->d_name, "vfio", 4)) { + vfio_dev_path = g_strdup_printf("%s/%s/dev", path, dent->d_name); + break; + } + } + + if (!vfio_dev_path) { + error_setg(errp, "failed to find vfio-dev/vfioX/dev"); + goto out_close_dir; + } + + if (!g_file_get_contents(vfio_dev_path, &contents, &length, NULL)) { + error_setg(errp, "failed to load \"%s\"", vfio_dev_path); + goto out_free_dev_path; + } + + if (sscanf(contents, "%d:%d", &major, &minor) != 2) { + error_setg(errp, "failed to get major:minor for \"%s\"", vfio_dev_path); + goto out_free_dev_path; + } + g_free(contents); + vfio_devt = makedev(major, minor); + + vfio_path = g_strdup_printf("/dev/vfio/devices/%s", dent->d_name); + ret = open_cdev(vfio_path, vfio_devt); + if (ret < 0) { + error_setg(errp, "Failed to open %s", vfio_path); + } + + trace_vfio_iommufd_get_devicefd(vfio_path, ret); + g_free(vfio_path); + +out_free_dev_path: + g_free(vfio_dev_path); +out_close_dir: + closedir(dir); +out_free_path: + if (*errp) { + error_prepend(errp, VFIO_MSG_PREFIX, path); + } + g_free(path); + + return ret; +} + +static VFIOIOASHwpt *vfio_container_get_hwpt(VFIOIOMMUFDContainer *container, + uint32_t hwpt_id) +{ + VFIOIOASHwpt *hwpt; + + QLIST_FOREACH(hwpt, &container->hwpt_list, next) { + if (hwpt->hwpt_id == hwpt_id) { + return hwpt; + } + } + + hwpt = g_malloc0(sizeof(*hwpt)); + + hwpt->hwpt_id = hwpt_id; + QLIST_INIT(&hwpt->device_list); + QLIST_INSERT_HEAD(&container->hwpt_list, hwpt, next); + + return hwpt; +} + +static void vfio_container_put_hwpt(IOMMUFDBackend *be, VFIOIOASHwpt *hwpt) +{ + QLIST_REMOVE(hwpt, next); + iommufd_backend_free_id(be->fd, hwpt->hwpt_id); + g_free(hwpt); +} + +static int __vfio_device_attach_hwpt(VFIODevice *vbasedev, uint32_t hwpt_id, + Error **errp) +{ + struct vfio_device_attach_iommufd_pt 
attach_data = { + .argsz = sizeof(attach_data), + .flags = 0, + .pt_id = hwpt_id, + }; + int ret; + + ret = ioctl(vbasedev->fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data); + if (ret) { + error_setg_errno(errp, errno, + "[iommufd=%d] error attach %s (%d) to hwpt_id=%d", + vbasedev->iommufd->fd, vbasedev->name, vbasedev->fd, + hwpt_id); + } + return ret; +} + +static int __vfio_device_detach_hwpt(VFIODevice *vbasedev, Error **errp) +{ + struct vfio_device_detach_iommufd_pt detach_data = { + .argsz = sizeof(detach_data), + .flags = 0, + }; + int ret; + + ret = ioctl(vbasedev->fd, VFIO_DEVICE_DETACH_IOMMUFD_PT, &detach_data); + if (ret) { + error_setg_errno(errp, errno, "detach %s from ioas failed", + vbasedev->name); + } + return ret; +} + +static int vfio_device_attach_container(VFIODevice *vbasedev, + VFIOIOMMUFDContainer *container, + Error **errp) +{ + int ret, iommufd = vbasedev->iommufd->fd; + VFIOIOASHwpt *hwpt; + uint32_t hwpt_id; + Error *err = NULL; + + /* try to attach to an existing hwpt in this container */ + QLIST_FOREACH(hwpt, &container->hwpt_list, next) { + ret = __vfio_device_attach_hwpt(vbasedev, hwpt->hwpt_id, &err); + if (ret) { + const char *msg = error_get_pretty(err); + + trace_vfio_iommufd_fail_attach_existing_hwpt(msg); + error_free(err); + err = NULL; + } else { + goto found_hwpt; + } + } + + ret = iommufd_backend_alloc_hwpt(iommufd, vbasedev->devid, + container->ioas_id, &hwpt_id); + + if (ret) { + error_setg_errno(errp, errno, "error alloc shadow hwpt"); + return ret; + } + + /* Attach device to an hwpt within iommufd */ + ret = __vfio_device_attach_hwpt(vbasedev, hwpt_id, errp); + if (ret) { + iommufd_backend_free_id(iommufd, hwpt_id); + return ret; + } + + hwpt = vfio_container_get_hwpt(container, hwpt_id); +found_hwpt: + QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, next); + vbasedev->hwpt = hwpt; + + trace_vfio_iommufd_attach_device(iommufd, vbasedev->name, vbasedev->fd, + container->ioas_id, hwpt->hwpt_id); + return ret; +} + 
+static void vfio_device_detach_container(VFIODevice *vbasedev, + VFIOIOMMUFDContainer *container) +{ + VFIOIOASHwpt *hwpt = vbasedev->hwpt; + Error *err = NULL; + int ret; + + ret = __vfio_device_detach_hwpt(vbasedev, &err); + if (ret) { + error_report_err(err); + } + + QLIST_REMOVE(vbasedev, next); + vbasedev->hwpt = NULL; + if (QLIST_EMPTY(&hwpt->device_list)) { + vfio_container_put_hwpt(vbasedev->iommufd, hwpt); + } + + trace_vfio_iommufd_detach_device(container->be->fd, vbasedev->name, + container->ioas_id); +} + +static void vfio_iommufd_container_destroy(VFIOIOMMUFDContainer *container) +{ + VFIOContainer *bcontainer = &container->bcontainer; + + if (!QLIST_EMPTY(&container->hwpt_list)) { + return; + } + memory_listener_unregister(&bcontainer->listener); + vfio_container_destroy(bcontainer); + iommufd_backend_put_ioas(container->be, container->ioas_id); + g_free(container); +} + +static int vfio_ram_block_discard_disable(bool state) +{ + /* + * We support coordinated discarding of RAM via the RamDiscardManager. 
+ */ + return ram_block_uncoordinated_discard_disable(state); +} + +static int iommufd_attach_device(char *name, VFIODevice *vbasedev, + AddressSpace *as, Error **errp) +{ + VFIOIOMMUBackendOpsClass *ops = VFIO_IOMMU_BACKEND_OPS_CLASS( + object_class_by_name(TYPE_VFIO_IOMMU_BACKEND_IOMMUFD_OPS)); + VFIOContainer *bcontainer; + VFIOIOMMUFDContainer *container; + VFIOAddressSpace *space; + struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) }; + int ret, devfd; + uint32_t ioas_id; + Error *err = NULL; + + devfd = vfio_get_devicefd(vbasedev->sysfsdev, errp); + if (devfd < 0) { + return devfd; + } + vbasedev->fd = devfd; + + ret = iommufd_connect_and_bind(vbasedev, errp); + if (ret) { + goto err_connect_bind; + } + + space = vfio_get_address_space(as); + + /* try to attach to an existing container in this space */ + QLIST_FOREACH(bcontainer, &space->containers, next) { + container = container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer); + if (bcontainer->ops != ops || vbasedev->iommufd != container->be) { + continue; + } + if (vfio_device_attach_container(vbasedev, container, &err)) { + const char *msg = error_get_pretty(err); + + trace_vfio_iommufd_fail_attach_existing_container(msg); + error_free(err); + err = NULL; + } else { + ret = vfio_ram_block_discard_disable(true); + if (ret) { + error_setg(errp, + "Cannot set discarding of RAM broken (%d)", ret); + goto err_discard_disable; + } + goto found_container; + } + } + + /* Need to allocate a new dedicated container */ + ret = iommufd_backend_get_ioas(vbasedev->iommufd, &ioas_id); + if (ret < 0) { + error_setg_errno(errp, errno, "Failed to alloc ioas"); + goto err_get_ioas; + } + + trace_vfio_iommufd_alloc_ioas(vbasedev->iommufd->fd, ioas_id); + + container = g_malloc0(sizeof(*container)); + container->be = vbasedev->iommufd; + container->ioas_id = ioas_id; + QLIST_INIT(&container->hwpt_list); + + bcontainer = &container->bcontainer; + vfio_container_init(bcontainer, space, ops); + 
QLIST_INSERT_HEAD(&space->containers, bcontainer, next); + + ret = vfio_device_attach_container(vbasedev, container, errp); + if (ret) { + goto err_attach_container; + } + + ret = vfio_ram_block_discard_disable(true); + if (ret) { + goto err_discard_disable; + } + + /* + * TODO: for now iommufd BE is on par with vfio iommu type1, so it's + * fine to add the whole range as window. For SPAPR, below code + * should be updated. + */ + vfio_host_win_add(bcontainer, 0, (hwaddr)-1, 4096); + bcontainer->pgsizes = 4096; + + bcontainer->listener = vfio_memory_listener; + + memory_listener_register(&bcontainer->listener, bcontainer->space->as); + + if (bcontainer->error) { + ret = -1; + error_propagate_prepend(errp, bcontainer->error, + "memory listener initialization failed: "); + goto err_listener_register; + } + + bcontainer->initialized = true; + +found_container: + ret = ioctl(devfd, VFIO_DEVICE_GET_INFO, &dev_info); + if (ret) { + error_setg_errno(errp, errno, "error getting device info"); + goto err_listener_register; + } + + /* + * TODO: examine RAM_BLOCK_DISCARD stuff, should we do group level + * for discarding incompatibility check as well? 
+ */ + if (vbasedev->ram_block_discard_allowed) { + vfio_ram_block_discard_disable(false); + } + + vbasedev->group = 0; + vbasedev->num_irqs = dev_info.num_irqs; + vbasedev->num_regions = dev_info.num_regions; + vbasedev->flags = dev_info.flags; + vbasedev->reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET); + vbasedev->bcontainer = bcontainer; + QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next); + QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next); + + trace_vfio_iommufd_device_info(vbasedev->name, devfd, vbasedev->num_irqs, + vbasedev->num_regions, vbasedev->flags); + return 0; + +err_listener_register: + vfio_ram_block_discard_disable(false); +err_discard_disable: + vfio_device_detach_container(vbasedev, container); +err_attach_container: + vfio_iommufd_container_destroy(container); +err_get_ioas: + vfio_put_address_space(space); + iommufd_unbind_and_disconnect(vbasedev); +err_connect_bind: + close(vbasedev->fd); + return ret; +} + +static void iommufd_detach_device(VFIODevice *vbasedev) +{ + VFIOContainer *bcontainer = vbasedev->bcontainer; + VFIOIOMMUFDContainer *container; + VFIOAddressSpace *space = bcontainer->space; + + QLIST_REMOVE(vbasedev, global_next); + QLIST_REMOVE(vbasedev, container_next); + vbasedev->bcontainer = NULL; + + if (!vbasedev->ram_block_discard_allowed) { + vfio_ram_block_discard_disable(false); + } + + container = container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer); + vfio_device_detach_container(vbasedev, container); + vfio_iommufd_container_destroy(container); + vfio_put_address_space(space); + + iommufd_unbind_and_disconnect(vbasedev); + close(vbasedev->fd); +} + +static void vfio_iommu_backend_iommufd_ops_class_init(ObjectClass *oc, + void *data) { + VFIOIOMMUBackendOpsClass *ops = VFIO_IOMMU_BACKEND_OPS_CLASS(oc); + + ops->dma_map = iommufd_map; + ops->dma_unmap = iommufd_unmap; + ops->attach_device = iommufd_attach_device; + ops->detach_device = iommufd_detach_device; +} + +static const 
TypeInfo vfio_iommu_backend_iommufd_ops_type = { + .name = TYPE_VFIO_IOMMU_BACKEND_IOMMUFD_OPS, + + .parent = TYPE_VFIO_IOMMU_BACKEND_OPS, + .class_init = vfio_iommu_backend_iommufd_ops_class_init, + .abstract = true, +}; +static void vfio_iommu_backend_iommufd_ops_register_types(void) +{ + type_register_static(&vfio_iommu_backend_iommufd_ops_type); +} +type_init(vfio_iommu_backend_iommufd_ops_register_types); diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build index eb6ce6229d..9cae2c9e21 100644 --- a/hw/vfio/meson.build +++ b/hw/vfio/meson.build @@ -7,6 +7,9 @@ vfio_ss.add(files( 'spapr.c', 'migration.c', )) +if have_iommufd + vfio_ss.add(files('iommufd.c')) +endif vfio_ss.add(when: 'CONFIG_VFIO_PCI', if_true: files( 'display.c', 'pci-quirks.c', diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 08a1f9dfa4..9b180cf77c 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -164,3 +164,15 @@ vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcop vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" stopcopy size 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64 vfio_vmstate_change(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s" vfio_vmstate_change_prepare(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s" + +#iommufd.c + +vfio_iommufd_get_devicefd(const char *dev, int devfd) " %s (fd=%d)" +vfio_iommufd_bind_device(int iommufd, const char *name, int devfd, int devid) " [iommufd=%d] Successfully bound device %s (fd=%d): output devid=%d" +vfio_iommufd_fail_attach_existing_hwpt(const char *msg) " %s" +vfio_iommufd_attach_device(int iommufd, const char *name, int devfd, int ioasid, int hwptid) " [iommufd=%d] 
Successfully attached device %s (%d) to ioasid=%d: output hwptd=%d" +vfio_iommufd_detach_device(int iommufd, const char *name, int ioasid) " [iommufd=%d] Detached %s from ioasid=%d" +vfio_iommufd_alloc_ioas(int iommufd, int ioas_id) " [iommufd=%d] new IOMMUFD container with ioasid=%d" +vfio_iommufd_device_info(char *name, int devfd, int num_irqs, int num_regions, int flags) " %s (%d) num_irqs=%d num_regions=%d flags=%d" +vfio_iommufd_fail_attach_existing_container(const char *msg) " %s" +vfio_iommufd_container_reset(char *name) " Successfully reset %s" From patchwork Mon Oct 16 08:32:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422729 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 79087C41513 for ; Mon, 16 Oct 2023 08:49:36 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJHl-0003BM-LC; Mon, 16 Oct 2023 04:49:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJHZ-0002va-7D for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:49:02 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJHW-0001LN-9g for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:49:00 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446138; x=1728982138; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=pRR4PipXHtJ9LbhLVu38kzyRCPbzt3WZeY50EyJlqBE=; b=eQ7zgheTgon8Sga36ylJQCPkiJd49rrKi48bO37obr3tAVLAW4ss+Kbk Yn/O68y73E90ncgWGugDSTYluuKbSmJv4FKfQ2BmkPKdV3FjzUokPuDDe uHhX+Lq2c0JRdJFXJ/FcYHWcyggWeW0/5n02G1Ct+kmR/dU6wdVKSeLi9 vOSnuy6FHmLJPwlSd1fjD8pAD0bDhzhFY72IGqqo6tZe3vp2Ceb3LD915 D99aH+zYIrZVfRjJmDMsaKaVMm+gXELNnnOE1ogOPHSb/owGwu6oqFo6o e3YKLTx+Lg2INLZMznd+603Gl5VLPmGatTZjYffZmam4e9CXbjT5gaYfQ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737696" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737696" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:48:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749223022" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749223022" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:48:41 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v2 20/27] vfio/container: Bypass EEH if iommufd backend Date: Mon, 16 Oct 2023 16:32:16 +0800 Message-Id: <20231016083223.1519410-21-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, 
DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org IBM EEH is only supported by legacy backend currently, bypass it for IOMMUFD backend. Signed-off-by: Zhenzhong Duan --- hw/vfio/container.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/hw/vfio/container.c b/hw/vfio/container.c index c86accdb38..dd9534afab 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -1047,6 +1047,8 @@ static VFIOLegacyContainer *vfio_eeh_as_container(AddressSpace *as) { VFIOAddressSpace *space = vfio_get_address_space(as); VFIOContainer *bcontainer = NULL; + const VFIOIOMMUBackendOpsClass *ops = VFIO_IOMMU_BACKEND_OPS_CLASS( + object_class_by_name(TYPE_VFIO_IOMMU_BACKEND_LEGACY_OPS)); if (QLIST_EMPTY(&space->containers)) { /* No containers to act on */ @@ -1055,7 +1057,7 @@ static VFIOLegacyContainer *vfio_eeh_as_container(AddressSpace *as) bcontainer = QLIST_FIRST(&space->containers); - if (QLIST_NEXT(bcontainer, next)) { + if (QLIST_NEXT(bcontainer, next) || bcontainer->ops != ops) { /* * We don't yet have logic to synchronize EEH state across * multiple containers From patchwork Mon Oct 16 08:32:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422746 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) 
by smtp.lore.kernel.org (Postfix) with ESMTPS id 0D671CDB465 for ; Mon, 16 Oct 2023 08:51:44 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJHn-0003Ql-70; Mon, 16 Oct 2023 04:49:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJHd-00036s-P0 for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:49:09 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJHa-0001Lg-AS for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:49:04 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446142; x=1728982142; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=b1vBWadFbefqeyoQVQBcGLcT/YMUhXspOdkxS8wof68=; b=BxvkvjSysRSDr2SRLqp5gCjhXP+pCy1nQ4/GhAp4ribfM7/b1dv7fRFO qXgdsJ+Ve5Y1CNJTywpv0AcCdb3nNW3RaefUoWX8r+xv01PSW78IlJvz6 zTA71Yp2ceu3SaXUftTJGv76LYzTtddcCfmNZ7xt+mj904NjtqTfVKLbA XgBsNuF/5g2JujNLNqf90W3KW3GXrgRunT5Vu1Ex8SBkhvPk3X8+vkjLG QEK5VT/GOcj8Tb0sPC391WPJrix1avYp5o7nY2iarcASiiv5Ao/vs1QTv WMe5FNuI0rBB3g0yWFUJWfkjmpN+xwolNINfnJGR+tPFI7DIDFTE3d6BK Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737706" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737706" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:48:49 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749223026" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749223026" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:48:45 -0700 From: Zhenzhong Duan To: 
qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v2 21/27] vfio/pci: Adapt vfio pci hot reset support with iommufd BE Date: Mon, 16 Oct 2023 16:32:17 +0800 Message-Id: <20231016083223.1519410-22-zhenzhong.duan@intel.com> In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> The PCI hot reset path needs to reference PCI-specific functions and data structures. Adding container-level callbacks for the legacy and iommufd BEs that reference those PCI-specific functions and data would be no better than implementing reset support for the iommufd BE directly in pci.c, so this patch does the latter. This way both BEs can also share the common bus reset and system reset paths. A helper function, vfio_pci_get_pci_hot_reset_info(), is extracted for use by both BEs. 
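For reviewers unfamiliar with the pattern, the extracted helper follows the usual two-pass query for a variable-length reply: call once with a header-sized buffer to learn the entry count (tolerating ENOSPC), grow the buffer, then call again. The sketch below is illustrative only and is not part of the patch; demo_info, demo_get_info() and demo_query() are hypothetical names, and demo_get_info() merely stands in for the VFIO_DEVICE_GET_PCI_HOT_RESET_INFO ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-length reply, shaped like vfio_pci_hot_reset_info. */
struct demo_info {
    uint32_t argsz;   /* in: size of the caller's buffer in bytes */
    uint32_t count;   /* out: number of entries the provider holds */
    uint32_t ids[];   /* out: 'count' entries, if the buffer is big enough */
};

#define DEMO_NUM_IDS 3

/* Stand-in for the ioctl: always reports the count, ENOSPC if too small. */
static int demo_get_info(struct demo_info *info)
{
    static const uint32_t ids[DEMO_NUM_IDS] = { 7, 11, 13 };
    size_t need = sizeof(*info) + sizeof(ids);

    info->count = DEMO_NUM_IDS;
    if (info->argsz < need) {
        errno = ENOSPC;
        return -1;
    }
    memcpy(info->ids, ids, sizeof(ids));
    return 0;
}

/* Two-pass query: probe for the count, grow the buffer, then fetch. */
static int demo_query(struct demo_info **out)
{
    struct demo_info *info, *bigger;
    size_t size;
    int ret;

    assert(out && !*out);

    info = calloc(1, sizeof(*info));
    if (!info) {
        return -ENOMEM;
    }
    info->argsz = (uint32_t)sizeof(*info);

    /* First pass: ENOSPC is expected and only tells us the count. */
    if (demo_get_info(info) && errno != ENOSPC) {
        ret = -errno;
        free(info);
        return ret;
    }

    size = sizeof(*info) + info->count * sizeof(info->ids[0]);
    bigger = realloc(info, size);
    if (!bigger) {
        free(info);
        return -ENOMEM;
    }
    info = bigger;
    info->argsz = (uint32_t)size;

    /* Second pass: the buffer is now large enough for every entry. */
    if (demo_get_info(info)) {
        ret = -errno;
        free(info);
        return ret;
    }

    *out = info;   /* ownership passes to the caller, who must free() it */
    return 0;
}
```

As in the helper, every failure path frees the buffer before returning, so a caller only owns memory when the function returns 0.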
Signed-off-by: Zhenzhong Duan --- hw/vfio/pci.c | 212 +++++++++++++++++++++++++++++++++++++++---- hw/vfio/trace-events | 1 + 2 files changed, 196 insertions(+), 17 deletions(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index b27011cee7..24fc047423 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -42,6 +42,7 @@ #include "qapi/error.h" #include "migration/blocker.h" #include "migration/qemu-file.h" +#include "linux/iommufd.h" #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug" @@ -2445,22 +2446,13 @@ static bool vfio_pci_host_match(PCIHostDeviceAddress *addr, const char *name) return (strcmp(tmp, name) == 0); } -static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single) +static int vfio_pci_get_pci_hot_reset_info(VFIOPCIDevice *vdev, + struct vfio_pci_hot_reset_info **info_p) { - VFIOGroup *group; struct vfio_pci_hot_reset_info *info; - struct vfio_pci_dependent_device *devices; - struct vfio_pci_hot_reset *reset; - int32_t *fds; - int ret, i, count; - bool multi = false; + int ret, count; - trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? 
"one" : "multi"); - - if (!single) { - vfio_pci_pre_reset(vdev); - } - vdev->vbasedev.needs_reset = false; + assert(info_p && !*info_p); info = g_malloc0(sizeof(*info)); info->argsz = sizeof(*info); @@ -2468,24 +2460,53 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single) ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info); if (ret && errno != ENOSPC) { ret = -errno; + g_free(info); if (!vdev->has_pm_reset) { error_report("vfio: Cannot reset device %s, " "no available reset mechanism.", vdev->vbasedev.name); } - goto out_single; + return ret; } count = info->count; - info = g_realloc(info, sizeof(*info) + (count * sizeof(*devices))); - info->argsz = sizeof(*info) + (count * sizeof(*devices)); - devices = &info->devices[0]; + info = g_realloc(info, sizeof(*info) + (count * sizeof(info->devices[0]))); + info->argsz = sizeof(*info) + (count * sizeof(info->devices[0])); ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info); if (ret) { ret = -errno; + g_free(info); error_report("vfio: hot reset info failed: %m"); + return ret; + } + + *info_p = info; + return 0; +} + +static int vfio_pci_hot_reset_legacy(VFIOPCIDevice *vdev, bool single) +{ + VFIOGroup *group; + struct vfio_pci_hot_reset_info *info = NULL; + struct vfio_pci_dependent_device *devices; + struct vfio_pci_hot_reset *reset; + int32_t *fds; + int ret, i, count; + bool multi = false; + + trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? 
"one" : "multi"); + + if (!single) { + vfio_pci_pre_reset(vdev); + } + vdev->vbasedev.needs_reset = false; + + ret = vfio_pci_get_pci_hot_reset_info(vdev, &info); + + if (ret) { goto out_single; } + devices = &info->devices[0]; trace_vfio_pci_hot_reset_has_dep_devices(vdev->vbasedev.name); @@ -2627,6 +2648,163 @@ out_single: return ret; } +#ifdef CONFIG_IOMMUFD +static VFIODevice *vfio_pci_find_by_iommufd_devid(__u32 devid) +{ + VFIODevice *vbasedev_iter; + VFIOIOMMUBackendOpsClass *ops = VFIO_IOMMU_BACKEND_OPS_CLASS( + object_class_by_name(TYPE_VFIO_IOMMU_BACKEND_IOMMUFD_OPS)); + + QLIST_FOREACH(vbasedev_iter, &vfio_device_list, global_next) { + if (vbasedev_iter->bcontainer->ops != ops) { + continue; + } + if (devid == vbasedev_iter->devid) { + return vbasedev_iter; + } + } + return NULL; +} + +static int vfio_pci_hot_reset_iommufd(VFIOPCIDevice *vdev, bool single) +{ + struct vfio_pci_hot_reset_info *info = NULL; + struct vfio_pci_dependent_device *devices; + struct vfio_pci_hot_reset *reset; + int ret, i; + bool multi = false; + + trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? 
"one" : "multi"); + + if (!single) { + vfio_pci_pre_reset(vdev); + } + vdev->vbasedev.needs_reset = false; + + ret = vfio_pci_get_pci_hot_reset_info(vdev, &info); + + if (ret) { + goto out_single; + } + + assert(info->flags & VFIO_PCI_HOT_RESET_FLAG_DEV_ID); + + devices = &info->devices[0]; + + if (!(info->flags & VFIO_PCI_HOT_RESET_FLAG_DEV_ID_OWNED)) { + if (!vdev->has_pm_reset) { + for (i = 0; i < info->count; i++) { + if (devices[i].devid == VFIO_PCI_DEVID_NOT_OWNED) { + error_report("vfio: Cannot reset device %s, " + "depends on device %04x:%02x:%02x.%x " + "which is not owned.", + vdev->vbasedev.name, devices[i].segment, + devices[i].bus, PCI_SLOT(devices[i].devfn), + PCI_FUNC(devices[i].devfn)); + } + } + } + ret = -EPERM; + goto out_single; + } + + trace_vfio_pci_hot_reset_has_dep_devices(vdev->vbasedev.name); + + for (i = 0; i < info->count; i++) { + VFIOPCIDevice *tmp; + VFIODevice *vbasedev_iter; + + trace_vfio_pci_hot_reset_dep_devices_iommufd(devices[i].segment, + devices[i].bus, + PCI_SLOT(devices[i].devfn), + PCI_FUNC(devices[i].devfn), + devices[i].devid); + + /* + * If a VFIO cdev device is resettable, all the dependent devices + * are either bound to same iommufd or within same iommu_groups as + * one of the iommufd bound devices. 
+ */ + assert(devices[i].devid != VFIO_PCI_DEVID_NOT_OWNED); + + if (devices[i].devid == vdev->vbasedev.devid || + devices[i].devid == VFIO_PCI_DEVID_OWNED) { + continue; + } + + vbasedev_iter = vfio_pci_find_by_iommufd_devid(devices[i].devid); + if (!vbasedev_iter || !vbasedev_iter->dev->realized || + vbasedev_iter->type != VFIO_DEVICE_TYPE_PCI) { + continue; + } + tmp = container_of(vbasedev_iter, VFIOPCIDevice, vbasedev); + if (single) { + ret = -EINVAL; + goto out_single; + } + vfio_pci_pre_reset(tmp); + tmp->vbasedev.needs_reset = false; + multi = true; + } + + if (!single && !multi) { + ret = -EINVAL; + goto out_single; + } + + /* Use zero length array for hot reset with iommufd backend */ + reset = g_malloc0(sizeof(*reset)); + reset->argsz = sizeof(*reset); + + /* Bus reset! */ + ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_PCI_HOT_RESET, reset); + g_free(reset); + + trace_vfio_pci_hot_reset_result(vdev->vbasedev.name, + ret ? strerror(errno) : "Success"); + + /* Re-enable INTx on affected devices */ + for (i = 0; i < info->count; i++) { + VFIOPCIDevice *tmp; + VFIODevice *vbasedev_iter; + + if (devices[i].devid == vdev->vbasedev.devid || + devices[i].devid == VFIO_PCI_DEVID_OWNED) { + continue; + } + + vbasedev_iter = vfio_pci_find_by_iommufd_devid(devices[i].devid); + if (!vbasedev_iter || !vbasedev_iter->dev->realized || + vbasedev_iter->type != VFIO_DEVICE_TYPE_PCI) { + continue; + } + tmp = container_of(vbasedev_iter, VFIOPCIDevice, vbasedev); + vfio_pci_post_reset(tmp); + } +out_single: + if (!single) { + vfio_pci_post_reset(vdev); + } + g_free(info); + + return ret; +} +#endif + +static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single) +{ +#ifdef CONFIG_IOMMUFD + if (vdev->vbasedev.iommufd) { + return vfio_pci_hot_reset_iommufd(vdev, single); + } else +#endif + { + return vfio_pci_hot_reset_legacy(vdev, single); + } +} + + + /* * We want to differentiate hot reset of multiple in-use devices vs hot reset * of a single in-use device. 
VFIO_DEVICE_RESET will already handle the case
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 9b180cf77c..71c5840636 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -34,6 +34,7 @@ vfio_check_af_flr(const char *name) "%s Supports FLR via AF cap"
 vfio_pci_hot_reset(const char *name, const char *type) " (%s) %s"
 vfio_pci_hot_reset_has_dep_devices(const char *name) "%s: hot reset dependent devices:"
 vfio_pci_hot_reset_dep_devices(int domain, int bus, int slot, int function, int group_id) "\t%04x:%02x:%02x.%x group %d"
+vfio_pci_hot_reset_dep_devices_iommufd(int domain, int bus, int slot, int function, int dev_id) "\t%04x:%02x:%02x.%x devid %d"
 vfio_pci_hot_reset_result(const char *name, const char *result) "%s hot reset: %s"
 vfio_populate_device_config(const char *name, unsigned long size, unsigned long offset, unsigned long flags) "Device %s config:\n size: 0x%lx, offset: 0x%lx, flags: 0x%lx"
 vfio_populate_device_get_irq_info_failure(const char *errstr) "VFIO_DEVICE_GET_IRQ_INFO failure: %s"

From patchwork Mon Oct 16 08:32:18 2023
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13422743
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan
Subject: [PATCH v2 22/27] vfio/pci: Allow the selection of a given iommu backend
Date: Mon, 16 Oct 2023 16:32:18 +0800
Message-Id: <20231016083223.1519410-23-zhenzhong.duan@intel.com>
In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com>
References: <20231016083223.1519410-1-zhenzhong.duan@intel.com>

From: Eric Auger

Now that two types of iommu backends are supported, add the capability
to select one of them. The selection depends on whether an iommufd
object has been linked with the vfio-pci device.

To use the legacy backend, the user must not link the vfio-pci device
with any iommufd object:

    -device vfio-pci,host=0000:02:00.0

This is called the legacy mode/backend.

To use the iommufd backend (/dev/iommu), the user must pass an iommufd
object id in the vfio-pci device options:

    -object iommufd,id=iommufd0
    -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0

Suggested-by: Alex Williamson
Signed-off-by: Eric Auger
Signed-off-by: Yi Liu
Signed-off-by: Zhenzhong Duan
---
 hw/vfio/pci.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 24fc047423..15e1b771b0 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -43,6 +43,7 @@
 #include "migration/blocker.h"
 #include "migration/qemu-file.h"
 #include "linux/iommufd.h"
+#include "sysemu/iommufd.h"

 #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug"

@@ -3700,6 +3701,10 @@ static Property vfio_pci_dev_properties[] = {
      * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),
      * DEFINE_PROP_STRING("vfiogroupfd, VFIOPCIDevice, vfiogroupfd_name),
      */
+#ifdef CONFIG_IOMMUFD
+    DEFINE_PROP_LINK("iommufd", VFIOPCIDevice, vbasedev.iommufd,
+                     TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *),
+#endif
     DEFINE_PROP_END_OF_LIST(),
 };

From patchwork Mon Oct 16 08:32:19 2023
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13422747
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan
Subject: [PATCH v2 23/27] vfio/pci: Make vfio cdev pre-openable by passing a file handle
Date: Mon, 16 Oct 2023 16:32:19 +0800
Message-Id: <20231016083223.1519410-24-zhenzhong.duan@intel.com>
In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com>
References: <20231016083223.1519410-1-zhenzhong.duan@intel.com>

This gives management tools such as libvirt a chance to open the vfio
cdev with privilege and pass the FD to QEMU, so QEMU never needs the
privilege to open a VFIO or iommu cdev node.

Together with the earlier support for pre-opening the /dev/iommu
device, this gives full support for a management tool passing a vfio
device to an unprivileged QEMU. This mode is no longer considered for
the legacy backend, so remove the "TODO" comment.

Add a helper function, vfio_device_get_name(), to check the fd and get
the device name; it will also be used by other vfio devices.

With FD passing there is no easy way to check whether a device is an
mdev, so fail the x-balloon-allowed check unconditionally in this case.
There is also no easy way to get the BDF as the name, so a fake name of
the form VFIO_FD[fd] is used instead.
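The naming rule described above can be illustrated with a small standalone sketch. This is not the QEMU helper: `pick_device_name()` is a hypothetical function written for illustration. It uses libc only, and it leaves out the stat() check, the iommufd requirement for FD passing and the "keep a user-supplied name" case that vfio_device_get_name() handles.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not QEMU code: choose a display name for a device.
 * fd < 0 means no fd was passed, so the name is the last component of the
 * sysfs path; otherwise a placeholder is built from the fd number.
 * Returns 0 on success, -1 if neither an fd nor a sysfs path is available.
 */
static int pick_device_name(int fd, const char *sysfsdev,
                            char *buf, size_t len)
{
    assert(buf && len > 0);

    if (fd < 0) {
        const char *slash;

        if (!sysfsdev || !*sysfsdev) {
            return -1;
        }
        /* Simplified basename: everything after the last '/'. */
        slash = strrchr(sysfsdev, '/');
        snprintf(buf, len, "%s", slash ? slash + 1 : sysfsdev);
    } else {
        /* No BDF is available with FD passing, so fake a name. */
        snprintf(buf, len, "VFIO_FD%d", fd);
    }
    return 0;
}
```

With a sysfs path such as /sys/bus/pci/devices/0000:02:00.0 and no fd, the sketch yields the BDF-style name 0000:02:00.0; with a passed fd of 23 it yields VFIO_FD23, so anything printing the name still has a non-NULL string to work with.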
Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 1 + hw/vfio/helpers.c | 33 +++++++++++++++++++++++++++++ hw/vfio/iommufd.c | 12 +++++++---- hw/vfio/pci.c | 40 ++++++++++++++++++++++++----------- 4 files changed, 70 insertions(+), 16 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index e72f5962ee..e6804baa6d 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -244,6 +244,7 @@ struct vfio_info_cap_header * vfio_get_device_info_cap(struct vfio_device_info *info, uint16_t id); struct vfio_info_cap_header * vfio_get_cap(void *ptr, uint32_t cap_offset, uint16_t id); +int vfio_device_get_name(VFIODevice *vbasedev, Error **errp); #endif extern const MemoryListener vfio_prereg_listener; diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c index 7e5da21b31..70c65cf71d 100644 --- a/hw/vfio/helpers.c +++ b/hw/vfio/helpers.c @@ -610,3 +610,36 @@ bool vfio_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_type) return ret; } + +int vfio_device_get_name(VFIODevice *vbasedev, Error **errp) +{ + struct stat st; + + if (vbasedev->fd < 0) { + if (stat(vbasedev->sysfsdev, &st) < 0) { + error_setg_errno(errp, errno, "no such host device"); + error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->sysfsdev); + return -errno; + } + /* User may specify a name, e.g: VFIO platform device */ + if (!vbasedev->name) { + vbasedev->name = g_path_get_basename(vbasedev->sysfsdev); + } + } +#ifdef CONFIG_IOMMUFD + else { + if (!vbasedev->iommufd) { + error_setg(errp, "Use FD passing only with iommufd backend"); + return -EINVAL; + } + /* + * Give a name with fd so any function printing out vbasedev->name + * will not break. 
+ */ + if (!vbasedev->name) { + vbasedev->name = g_strdup_printf("VFIO_FD%d", vbasedev->fd); + } + } +#endif + return 0; +} diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c index ee8c4620b6..aabc1d1024 100644 --- a/hw/vfio/iommufd.c +++ b/hw/vfio/iommufd.c @@ -361,11 +361,15 @@ static int iommufd_attach_device(char *name, VFIODevice *vbasedev, uint32_t ioas_id; Error *err = NULL; - devfd = vfio_get_devicefd(vbasedev->sysfsdev, errp); - if (devfd < 0) { - return devfd; + if (vbasedev->fd < 0) { + devfd = vfio_get_devicefd(vbasedev->sysfsdev, errp); + if (devfd < 0) { + return devfd; + } + vbasedev->fd = devfd; + } else { + devfd = vbasedev->fd; } - vbasedev->fd = devfd; ret = iommufd_connect_and_bind(vbasedev, errp); if (ret) { diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 15e1b771b0..edb787d3d1 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -44,6 +44,7 @@ #include "migration/qemu-file.h" #include "linux/iommufd.h" #include "sysemu/iommufd.h" +#include "monitor/monitor.h" #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug" @@ -3257,18 +3258,23 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) VFIODevice *vbasedev = &vdev->vbasedev; char *tmp, *subsys; Error *err = NULL; - struct stat st; int i, ret; bool is_mdev; char uuid[UUID_FMT_LEN]; char *name; - if (!vbasedev->sysfsdev) { + if (vbasedev->fd < 0 && !vbasedev->sysfsdev) { if (!(~vdev->host.domain || ~vdev->host.bus || ~vdev->host.slot || ~vdev->host.function)) { error_setg(errp, "No provided host device"); +#ifdef CONFIG_IOMMUFD + error_append_hint(errp, "Use -device vfio-pci,host=DDDD:BB:DD.F, " + "-device vfio-pci,sysfsdev=PATH_TO_DEVICE " + "or -device vfio-pci,fd=DEVICE_FD\n"); +#else error_append_hint(errp, "Use -device vfio-pci,host=DDDD:BB:DD.F " "or -device vfio-pci,sysfsdev=PATH_TO_DEVICE\n"); +#endif return; } vbasedev->sysfsdev = @@ -3277,13 +3283,9 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) vdev->host.slot, vdev->host.function); } - if (stat(vbasedev->sysfsdev, &st) 
< 0) { - error_setg_errno(errp, errno, "no such host device"); - error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->sysfsdev); + if (vfio_device_get_name(vbasedev, errp)) { return; } - - vbasedev->name = g_path_get_basename(vbasedev->sysfsdev); vbasedev->ops = &vfio_pci_ops; vbasedev->type = VFIO_DEVICE_TYPE_PCI; vbasedev->dev = DEVICE(vdev); @@ -3643,6 +3645,7 @@ static void vfio_instance_init(Object *obj) vdev->host.bus = ~0U; vdev->host.slot = ~0U; vdev->host.function = ~0U; + vdev->vbasedev.fd = -1; vdev->nv_gpudirect_clique = 0xFF; @@ -3696,11 +3699,6 @@ static Property vfio_pci_dev_properties[] = { qdev_prop_nv_gpudirect_clique, uint8_t), DEFINE_PROP_OFF_AUTO_PCIBAR("x-msix-relocation", VFIOPCIDevice, msix_relo, OFF_AUTOPCIBAR_OFF), - /* - * TODO - support passed fds... is this necessary? - * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name), - * DEFINE_PROP_STRING("vfiogroupfd, VFIOPCIDevice, vfiogroupfd_name), - */ #ifdef CONFIG_IOMMUFD DEFINE_PROP_LINK("iommufd", VFIOPCIDevice, vbasedev.iommufd, TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *), @@ -3708,6 +3706,21 @@ static Property vfio_pci_dev_properties[] = { DEFINE_PROP_END_OF_LIST(), }; +#ifdef CONFIG_IOMMUFD +static void vfio_pci_set_fd(Object *obj, const char *str, Error **errp) +{ + VFIOPCIDevice *vdev = VFIO_PCI(obj); + int fd = -1; + + fd = monitor_fd_param(monitor_cur(), str, errp); + if (fd == -1) { + error_prepend(errp, "Could not parse remote object fd %s:", str); + return; + } + vdev->vbasedev.fd = fd; +} +#endif + static void vfio_pci_dev_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); @@ -3715,6 +3728,9 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data) dc->reset = vfio_pci_reset; device_class_set_props(dc, vfio_pci_dev_properties); +#ifdef CONFIG_IOMMUFD + object_class_property_add_str(klass, "fd", NULL, vfio_pci_set_fd); +#endif dc->desc = "VFIO-based PCI device assignment"; set_bit(DEVICE_CATEGORY_MISC, dc->categories); pdc->realize = 
vfio_realize; From patchwork Mon Oct 16 08:32:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422749 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 03AF9CDB482 for ; Mon, 16 Oct 2023 08:51:50 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJII-0005MF-1t; Mon, 16 Oct 2023 04:49:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJI1-00056E-MV; Mon, 16 Oct 2023 04:49:31 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJHy-0001Lg-PF; Mon, 16 Oct 2023 04:49:29 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446166; x=1728982166; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=KLgmPGIMb19ZFDsOzcP2jTE87PGiogMPOIUBdyyKBYo=; b=Zrq8nyUe9mJWAbfr7l5XTN3mY4cU1MoEVCeoskxW2ILACDVz7ViuxGdd C+DUUWNIr019UvDnRiHw+Yadye/aJCBWxCw2PeSPFBnXQYccUyjI51u+4 4N/ysDjpCWofPetLO7ORVshiCHPODAHQ98k318XFDMpCAxVDrsS0X0R3B UBBBfG4cbmc/PLDKzKl7D8ehQ9XkIsk6/cTVIrdP0HddaExSTv/967mrV 4JPMwI44zRKRX5IRWG99UgKdd2hlq1zo86xE9WWWShvrSP1QTf5wyLmes 6IBo/CSBVEPWsHteQKuPlmOPIEczQ6lQTEbDSFXRI5wm7bLKqlNeOd8OY Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737736" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737736" Received: from orsmga007.jf.intel.com ([10.7.209.58]) 
by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:49:03 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749223051" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749223051" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:48:57 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Tony Krowiak , Halil Pasic , Jason Herne , Thomas Huth , Eric Farman , Matthew Rosato , qemu-s390x@nongnu.org (open list:vfio-ap) Subject: [PATCH v2 24/27] vfio: Allow the selection of a given iommu backend for platform ap and ccw Date: Mon, 16 Oct 2023 16:32:20 +0800 Message-Id: <20231016083223.1519410-25-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Previously we 
added support to select iommu backend for vfio pci device. Now we added others, E.g: platform, ap and ccw. Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-platform.h | 1 + hw/vfio/ap.c | 5 +++++ hw/vfio/ccw.c | 5 +++++ hw/vfio/platform.c | 4 ++++ 4 files changed, 15 insertions(+) diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h index c414c3dffc..f57f4276f2 100644 --- a/include/hw/vfio/vfio-platform.h +++ b/include/hw/vfio/vfio-platform.h @@ -18,6 +18,7 @@ #include "hw/sysbus.h" #include "hw/vfio/vfio-common.h" +#include "sysemu/iommufd.h" #include "qemu/event_notifier.h" #include "qemu/queue.h" #include "qom/object.h" diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c index 5f257bffb9..1f8e88aeb3 100644 --- a/hw/vfio/ap.c +++ b/hw/vfio/ap.c @@ -16,6 +16,7 @@ #include "qapi/error.h" #include "hw/vfio/vfio.h" #include "hw/vfio/vfio-common.h" +#include "sysemu/iommufd.h" #include "hw/s390x/ap-device.h" #include "qemu/error-report.h" #include "qemu/event_notifier.h" @@ -205,6 +206,10 @@ static void vfio_ap_unrealize(DeviceState *dev) static Property vfio_ap_properties[] = { DEFINE_PROP_STRING("sysfsdev", VFIOAPDevice, vdev.sysfsdev), +#ifdef CONFIG_IOMMUFD + DEFINE_PROP_LINK("iommufd", VFIOAPDevice, vdev.iommufd, + TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *), +#endif DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index 6623ae237b..c7f8e70783 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -22,6 +22,7 @@ #include "qapi/error.h" #include "hw/vfio/vfio.h" #include "hw/vfio/vfio-common.h" +#include "sysemu/iommufd.h" #include "hw/s390x/s390-ccw.h" #include "hw/s390x/vfio-ccw.h" #include "hw/qdev-properties.h" @@ -678,6 +679,10 @@ static void vfio_ccw_unrealize(DeviceState *dev) static Property vfio_ccw_properties[] = { DEFINE_PROP_STRING("sysfsdev", VFIOCCWDevice, vdev.sysfsdev), DEFINE_PROP_BOOL("force-orb-pfch", VFIOCCWDevice, force_orb_pfch, false), +#ifdef CONFIG_IOMMUFD + DEFINE_PROP_LINK("iommufd", VFIOCCWDevice, 
vdev.iommufd, + TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *), +#endif DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c index 8e3d4ac458..a1c25e0337 100644 --- a/hw/vfio/platform.c +++ b/hw/vfio/platform.c @@ -649,6 +649,10 @@ static Property vfio_platform_dev_properties[] = { DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice, mmap_timeout, 1100), DEFINE_PROP_BOOL("x-irqfd", VFIOPlatformDevice, irqfd_allowed, true), +#ifdef CONFIG_IOMMUFD + DEFINE_PROP_LINK("iommufd", VFIOPlatformDevice, vbasedev.iommufd, + TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *), +#endif DEFINE_PROP_END_OF_LIST(), }; From patchwork Mon Oct 16 08:32:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422740 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C7CE0CDB474 for ; Mon, 16 Oct 2023 08:51:15 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJHt-0004MY-OQ; Mon, 16 Oct 2023 04:49:21 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJHs-0004Ds-9e for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:49:20 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJHq-0001HS-Cw for qemu-devel@nongnu.org; Mon, 16 Oct 2023 04:49:19 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446158; x=1728982158; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=LmRhWqNN5MpdYLIWPBzUIr7um0fmtv6LZuOxukM3jx8=; b=lXHdcfveryR/+ssoxTIG1CxOj9qeElVS/e/zax+sJ0oaP0FJH4hQh8Nz ozZJeDXMmwIdPt6sg5MpkJeQun7SNpZ4yaOdcjPflYijo5DvgaWtXqAPV x/SeLlMxStTFPh6AAlYCPnNT48eYEXhvrvIS+e6wdFmZ+YyeX+OGhiWpa +ExShuHBuJb557pwTH6O/crYJvm3h4fCRoPII/cG0ObxqEJCKv3rAR3ix scjqAcMpWooj9QTB2OSjPJknC7YkclCvWy24OC+Mkk282grfXy+ze8ata i7mPfDJUCzkUDrJFqCjJ+YvblBzAZHLuqAu97/XmsEj1ONp+j077h/Z+c Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737750" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737750" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:49:06 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749223059" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749223059" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:49:03 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v2 25/27] vfio/platform: Make vfio cdev pre-openable by passing a file handle Date: Mon, 16 Oct 2023 16:32:21 +0800 Message-Id: <20231016083223.1519410-26-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 
requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass FD to qemu. This way qemu never needs
to have privilege to open a VFIO or iommu cdev node.

Signed-off-by: Zhenzhong Duan
---
 hw/vfio/platform.c | 41 +++++++++++++++++++++++++++++++++--------
 1 file changed, 33 insertions(+), 8 deletions(-)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index a1c25e0337..aa0b2b9583 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -35,6 +35,7 @@
 #include "hw/platform-bus.h"
 #include "hw/qdev-properties.h"
 #include "sysemu/kvm.h"
+#include "monitor/monitor.h"

 /*
  * Functions used whatever the injection method
@@ -529,14 +530,13 @@ static VFIODeviceOps vfio_platform_ops = {
  */
 static int vfio_base_device_init(VFIODevice *vbasedev, Error **errp)
 {
-    struct stat st;
     int ret;

-    /* @sysfsdev takes precedence over @host */
-    if (vbasedev->sysfsdev) {
+    /* @fd takes precedence over @sysfsdev which takes precedence over @host */
+    if (vbasedev->fd < 0 && vbasedev->sysfsdev) {
         g_free(vbasedev->name);
         vbasedev->name = g_path_get_basename(vbasedev->sysfsdev);
-    } else {
+    } else if (vbasedev->fd < 0) {
         if (!vbasedev->name || strchr(vbasedev->name, '/')) {
             error_setg(errp, "wrong host device name");
             return -EINVAL;
@@ -546,10 +546,9 @@ static int vfio_base_device_init(VFIODevice *vbasedev, Error **errp)
                                              vbasedev->name);
     }

-    if (stat(vbasedev->sysfsdev, &st) < 0) {
-        error_setg_errno(errp, errno,
-                         "failed to
get the sysfs host device file status");
-        return -errno;
+    ret = vfio_device_get_name(vbasedev, errp);
+    if (ret) {
+        return ret;
     }

     ret = vfio_attach_device(vbasedev->name, vbasedev,
@@ -656,6 +655,28 @@ static Property vfio_platform_dev_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };

+static void vfio_platform_instance_init(Object *obj)
+{
+    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(obj);
+
+    vdev->vbasedev.fd = -1;
+}
+
+#ifdef CONFIG_IOMMUFD
+static void vfio_platform_set_fd(Object *obj, const char *str, Error **errp)
+{
+    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(obj);
+    int fd = -1;
+
+    fd = monitor_fd_param(monitor_cur(), str, errp);
+    if (fd == -1) {
+        error_prepend(errp, "Could not parse remote object fd %s:", str);
+        return;
+    }
+    vdev->vbasedev.fd = fd;
+}
+#endif
+
 static void vfio_platform_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
@@ -663,6 +684,9 @@ static void vfio_platform_class_init(ObjectClass *klass, void *data)

     dc->realize = vfio_platform_realize;
     device_class_set_props(dc, vfio_platform_dev_properties);
+#ifdef CONFIG_IOMMUFD
+    object_class_property_add_str(klass, "fd", NULL, vfio_platform_set_fd);
+#endif
     dc->vmsd = &vfio_platform_vmstate;
     dc->desc = "VFIO-based platform device assignment";
     sbc->connect_irq_notifier = vfio_start_irqfd_injection;
@@ -675,6 +699,7 @@ static const TypeInfo vfio_platform_dev_info = {
     .name = TYPE_VFIO_PLATFORM,
     .parent = TYPE_SYS_BUS_DEVICE,
     .instance_size = sizeof(VFIOPlatformDevice),
+    .instance_init = vfio_platform_instance_init,
     .class_init = vfio_platform_class_init,
     .class_size = sizeof(VFIOPlatformDeviceClass),
 };
From patchwork Mon Oct 16 08:32:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422748 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org
(lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D8D21CDB465 for ; Mon, 16 Oct 2023 08:51:48 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJII-0005XM-L5; Mon, 16 Oct 2023 04:49:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJI9-0005Ke-Uc; Mon, 16 Oct 2023 04:49:38 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJI6-0001LN-Dg; Mon, 16 Oct 2023 04:49:36 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446174; x=1728982174; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5sTywzEaEW0RCTiGenM/26nMo0a4RpA9GhP1ZwXXWqg=; b=CSLw8Bofg+PJHee5P9D1RjBWPjBM1ggzj7Hu5G4A7d8w9GeCNYqdWAyB xauNoQevav2m48QM4K0ASF15WRSVtnvoIQ1iyj8IoHsQuupbWD8PzCtwX 2C1ZltKeaG6InjtQgVdkdZaNmEUNvf+YhcN7Qt4Cup5PUxsCzMDGlXDFD /D8Gm7bXs2hpvaaC6VG3sXEg7z+ye34Ix2FAi4hXYXUCVVXGYbUnZ+Wvz fkeLU3MKEdOXmJKK8QfkpKNB/NSmqNOjjeoC+nMVzeVKWYvWIGR6Wonh+ qWh0htFgmgPDMWTHvNaKIZt3ZAv8iZJe6/tQceXJFP9xyLk3Sh1CXdfTf w==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737756" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737756" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:49:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749223075" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749223075" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:49:07 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Thomas Huth , Tony Krowiak , Halil Pasic , Jason Herne , qemu-s390x@nongnu.org (open list:S390 general arch...) Subject: [PATCH v2 26/27] vfio/ap: Make vfio cdev pre-openable by passing a file handle Date: Mon, 16 Oct 2023 16:32:22 +0800 Message-Id: <20231016083223.1519410-27-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass FD to qemu. This way qemu never needs
to have privilege to open a VFIO or iommu cdev node.

Opportunistically, remove some unnecessary double-casts.
Signed-off-by: Zhenzhong Duan
---
 hw/vfio/ap.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index 1f8e88aeb3..a34cae31a2 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -30,6 +30,7 @@
 #include "hw/s390x/ap-bridge.h"
 #include "exec/address-spaces.h"
 #include "qom/object.h"
+#include "monitor/monitor.h"

 #define TYPE_VFIO_AP_DEVICE "vfio-ap"

@@ -160,7 +161,10 @@ static void vfio_ap_realize(DeviceState *dev, Error **errp)
     VFIOAPDevice *vapdev = VFIO_AP_DEVICE(dev);
     VFIODevice *vbasedev = &vapdev->vdev;

-    vbasedev->name = g_path_get_basename(vbasedev->sysfsdev);
+    if (vfio_device_get_name(vbasedev, errp)) {
+        return;
+    }
+
     vbasedev->ops = &vfio_ap_ops;
     vbasedev->type = VFIO_DEVICE_TYPE_AP;
     vbasedev->dev = dev;
@@ -230,11 +234,36 @@ static const VMStateDescription vfio_ap_vmstate = {
     .unmigratable = 1,
 };

+static void vfio_ap_instance_init(Object *obj)
+{
+    VFIOAPDevice *vapdev = VFIO_AP_DEVICE(obj);
+
+    vapdev->vdev.fd = -1;
+}
+
+#ifdef CONFIG_IOMMUFD
+static void vfio_ap_set_fd(Object *obj, const char *str, Error **errp)
+{
+    VFIOAPDevice *vapdev = VFIO_AP_DEVICE(obj);
+    int fd = -1;
+
+    fd = monitor_fd_param(monitor_cur(), str, errp);
+    if (fd == -1) {
+        error_prepend(errp, "Could not parse remote object fd %s:", str);
+        return;
+    }
+    vapdev->vdev.fd = fd;
+}
+#endif
+
 static void vfio_ap_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);

     device_class_set_props(dc, vfio_ap_properties);
+#ifdef CONFIG_IOMMUFD
+    object_class_property_add_str(klass, "fd", NULL, vfio_ap_set_fd);
+#endif
     dc->vmsd = &vfio_ap_vmstate;
     dc->desc = "VFIO-based AP device assignment";
     set_bit(DEVICE_CATEGORY_MISC, dc->categories);
@@ -249,6 +278,7 @@ static const TypeInfo vfio_ap_info = {
     .name = TYPE_VFIO_AP_DEVICE,
     .parent = TYPE_AP_DEVICE,
     .instance_size = sizeof(VFIOAPDevice),
+    .instance_init = vfio_ap_instance_init,
     .class_init = vfio_ap_class_init,
 };

From patchwork Mon Oct
16 08:32:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13422741 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 12D58CDB474 for ; Mon, 16 Oct 2023 08:51:21 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsJIM-00062p-E6; Mon, 16 Oct 2023 04:49:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJIF-0005Tw-GH; Mon, 16 Oct 2023 04:49:45 -0400 Received: from mgamail.intel.com ([192.55.52.151]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qsJID-0001HS-Pv; Mon, 16 Oct 2023 04:49:43 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697446181; x=1728982181; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=p2MFIpHYT4awebGkY68CJ2k7+kXfIV8jtABCfQUYiRo=; b=AdZbB/4HjZwvYNUv3TF1a7PUcsBZmyBB2jJNRo6SRZtLn8VAxerDzbww dO1HzvytsS18n1cWynk3uI8xZmJUh1PBa78dPKfl1AVW/xKyI0Aahbm1K NsJlYjUCdclDCTmbW+BVSqDFYBv2OCbw1Oj6yuCSvyZD7ubp2HC8S1FSn /VvqtmGUjy+v3ayA+PETWOduwf9fsYOKWOCnanOyrdPrBYFxpSPTN4vwq xwrcE7+1QQe2X58ryVwYFn0tNDdGbz37phud5JNqADMTdTTu+HfB2gVmb NfRSvH7xoicAkWZXYNRUeQ8L0+W9Q6TvpX4QUaRA0RA6k1XIdWYUU6VQt Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="365737766" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="365737766" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:49:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="749223099" X-IronPort-AV: E=Sophos;i="6.03,229,1694761200"; d="scan'208";a="749223099" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Oct 2023 01:49:12 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Thomas Huth , Eric Farman , Matthew Rosato , qemu-s390x@nongnu.org (open list:S390 general arch...) Subject: [PATCH v2 27/27] vfio/ccw: Make vfio cdev pre-openable by passing a file handle Date: Mon, 16 Oct 2023 16:32:23 +0800 Message-Id: <20231016083223.1519410-28-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231016083223.1519410-1-zhenzhong.duan@intel.com> References: <20231016083223.1519410-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=192.55.52.151; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This gives management tools like libvirt a chance to open the vfio cdev with privilege 
and pass FD to qemu. This way qemu never needs to have privilege
to open a VFIO or iommu cdev node.

Opportunistically, remove a redundant definition of TYPE_VFIO_CCW.

Signed-off-by: Zhenzhong Duan
---
 hw/vfio/ccw.c | 34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index c7f8e70783..f151652bc2 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -31,6 +31,7 @@
 #include "qemu/error-report.h"
 #include "qemu/main-loop.h"
 #include "qemu/module.h"
+#include "monitor/monitor.h"

 struct VFIOCCWDevice {
     S390CCWDevice cdev;
@@ -590,11 +591,12 @@ static void vfio_ccw_realize(DeviceState *dev, Error **errp)
         }
     }

+    if (vfio_device_get_name(vbasedev, errp)) {
+        return;
+    }
+
     vbasedev->ops = &vfio_ccw_ops;
     vbasedev->type = VFIO_DEVICE_TYPE_CCW;
-    vbasedev->name = g_strdup_printf("%x.%x.%04x", vcdev->cdev.hostid.cssid,
-                                     vcdev->cdev.hostid.ssid,
-                                     vcdev->cdev.hostid.devid);
     vbasedev->dev = dev;

     /*
@@ -691,12 +693,37 @@ static const VMStateDescription vfio_ccw_vmstate = {
     .unmigratable = 1,
 };

+static void vfio_ccw_instance_init(Object *obj)
+{
+    VFIOCCWDevice *vcdev = VFIO_CCW(obj);
+
+    vcdev->vdev.fd = -1;
+}
+
+#ifdef CONFIG_IOMMUFD
+static void vfio_ccw_set_fd(Object *obj, const char *str, Error **errp)
+{
+    VFIOCCWDevice *vcdev = VFIO_CCW(obj);
+    int fd = -1;
+
+    fd = monitor_fd_param(monitor_cur(), str, errp);
+    if (fd == -1) {
+        error_prepend(errp, "Could not parse remote object fd %s:", str);
+        return;
+    }
+    vcdev->vdev.fd = fd;
+}
+#endif
+
 static void vfio_ccw_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     S390CCWDeviceClass *cdc = S390_CCW_DEVICE_CLASS(klass);

     device_class_set_props(dc, vfio_ccw_properties);
+#ifdef CONFIG_IOMMUFD
+    object_class_property_add_str(klass, "fd", NULL, vfio_ccw_set_fd);
+#endif
     dc->vmsd = &vfio_ccw_vmstate;
     dc->desc = "VFIO-based subchannel assignment";
     set_bit(DEVICE_CATEGORY_MISC, dc->categories);
@@ -714,6 +741,7 @@ static const
TypeInfo vfio_ccw_info = {
     .name = TYPE_VFIO_CCW,
     .parent = TYPE_S390_CCW,
     .instance_size = sizeof(VFIOCCWDevice),
+    .instance_init = vfio_ccw_instance_init,
     .class_init = vfio_ccw_class_init,
 };
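
Note on the naming precedence used in patches 25-27: the platform patch states that @fd takes precedence over @sysfsdev, which takes precedence over @host. The standalone sketch below illustrates that ordering only. It is not the series' vfio_device_get_name(); DemoVFIODevice, demo_device_name() and the "fd:%d" placeholder name are hypothetical names invented for this illustration, and the basename handling is a simplification of what g_path_get_basename() does.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the VFIODevice fields involved. */
typedef struct {
    int fd;               /* -1 when no pre-opened fd was passed */
    const char *sysfsdev; /* NULL when unset */
    const char *host;     /* NULL when unset */
} DemoVFIODevice;

/*
 * Pick a device name following the precedence fd > sysfsdev > host.
 * Returns 0 on success and -1 when no usable name can be derived.
 */
static int demo_device_name(const DemoVFIODevice *d, char *buf, size_t len)
{
    if (d->fd >= 0) {
        /* Assumption: an invented placeholder, not the name QEMU uses. */
        snprintf(buf, len, "fd:%d", d->fd);
        return 0;
    }
    if (d->sysfsdev) {
        /* Simplified basename: text after the last '/', if any. */
        const char *slash = strrchr(d->sysfsdev, '/');
        snprintf(buf, len, "%s", slash ? slash + 1 : d->sysfsdev);
        return 0;
    }
    /* Mirrors the "wrong host device name" check in the platform patch. */
    if (!d->host || strchr(d->host, '/')) {
        return -1;
    }
    snprintf(buf, len, "%s", d->host);
    return 0;
}
```

With fd set to -1, as the new instance_init hooks do, the sketch falls back to the sysfsdev basename and then to the host name, which matches the behaviour before the series when no fd property is given.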