From patchwork Mon Jan 15 10:13:08 2024
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13519468
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com,
    peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, jgg@nvidia.com,
    nicolinc@nvidia.com, joao.m.martins@oracle.com, kevin.tian@intel.com,
    yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com,
    Zhenzhong Duan, Yi Sun
Subject: [PATCH rfcv1 1/6] backends/iommufd_device: introduce IOMMUFDDevice
Date: Mon, 15 Jan 2024 18:13:08 +0800
Message-Id: <20240115101313.131139-2-zhenzhong.duan@intel.com>
In-Reply-To: <20240115101313.131139-1-zhenzhong.duan@intel.com>
References: <20240115101313.131139-1-zhenzhong.duan@intel.com>

IOMMUFDDevice represents a device in iommufd and can be used as a
communication interface between devices (e.g., VFIO, VDPA) and vIOMMU.

Currently it includes the iommufd handle and device id information, which
vIOMMU can use to get hw IOMMU information.

In future nested translation support, vIOMMU is going to have more iommufd
related operations, such as allocating a hwpt for a device and
attaching/detaching a hwpt. So IOMMUFDDevice will be further expanded.

IOMMUFDDevice is intentionally not a QOM object because we don't want it to
be visible from the user interface.

Introduce a helper iommufd_device_init to initialize IOMMUFDDevice.

Originally-by: Yi Liu
Signed-off-by: Yi Sun
Signed-off-by: Zhenzhong Duan
---
 MAINTAINERS                     |  4 +--
 include/sysemu/iommufd_device.h | 31 ++++++++++++++++++++
 backends/iommufd_device.c       | 50 +++++++++++++++++++++++++++++++++
 backends/meson.build            |  2 +-
 4 files changed, 84 insertions(+), 3 deletions(-)
 create mode 100644 include/sysemu/iommufd_device.h
 create mode 100644 backends/iommufd_device.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 00ec1f7eca..606dfeb2b1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2171,8 +2171,8 @@ M: Yi Liu
 M: Eric Auger
 M: Zhenzhong Duan
 S: Supported
-F: backends/iommufd.c
-F: include/sysemu/iommufd.h
+F: backends/iommufd*.c
+F: include/sysemu/iommufd*.h
 F: include/qemu/chardev_open.h
 F: util/chardev_open.c
 F: docs/devel/vfio-iommufd.rst
diff --git a/include/sysemu/iommufd_device.h b/include/sysemu/iommufd_device.h
new file mode 100644
index 0000000000..795630324b
--- /dev/null
+++ b/include/sysemu/iommufd_device.h
@@ -0,0 +1,31 @@
+/*
+ * IOMMUFD Device
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Authors: Yi Liu
+ *          Zhenzhong Duan
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef SYSEMU_IOMMUFD_DEVICE_H
+#define SYSEMU_IOMMUFD_DEVICE_H
+
+#include
+#include "sysemu/iommufd.h"
+
+typedef struct IOMMUFDDevice IOMMUFDDevice;
+
+/* This is an abstraction of host IOMMUFD device */
+struct IOMMUFDDevice {
+    IOMMUFDBackend *iommufd;
+    uint32_t dev_id;
+};
+
+int iommufd_device_get_info(IOMMUFDDevice *idev,
+                            enum iommu_hw_info_type *type,
+                            uint32_t len, void *data);
+void iommufd_device_init(void *_idev, size_t instance_size,
+                         IOMMUFDBackend *iommufd, uint32_t dev_id);
+#endif
diff --git a/backends/iommufd_device.c b/backends/iommufd_device.c
new file mode 100644
index 0000000000..f6e7ca1dbf
--- /dev/null
+++ b/backends/iommufd_device.c
@@ -0,0 +1,50 @@
+/*
+ * QEMU abstract of Host IOMMU
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Authors: Yi Liu
+ *          Zhenzhong Duan
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "sysemu/iommufd_device.h"
+
+int iommufd_device_get_info(IOMMUFDDevice *idev,
+                            enum iommu_hw_info_type *type,
+                            uint32_t len, void *data)
+{
+    struct iommu_hw_info info = {
+        .size = sizeof(info),
+        .flags = 0,
+        .dev_id = idev->dev_id,
+        .data_len = len,
+        .__reserved = 0,
+        .data_uptr = (uintptr_t)data,
+    };
+    int ret;
+
+    ret = ioctl(idev->iommufd->fd, IOMMU_GET_HW_INFO, &info);
+    if (ret) {
+        error_report("Failed to get info %m");
+    } else {
+        *type = info.out_data_type;
+    }
+
+    return ret;
+}
+
+void iommufd_device_init(void *_idev, size_t instance_size,
+                         IOMMUFDBackend *iommufd, uint32_t dev_id)
+{
+    IOMMUFDDevice *idev = (IOMMUFDDevice *)_idev;
+
+    g_assert(sizeof(IOMMUFDDevice) <= instance_size);
+
+    idev->iommufd = iommufd;
+    idev->dev_id = dev_id;
+}
diff --git a/backends/meson.build b/backends/meson.build
index 8b2b111497..c437cdb363 100644
--- a/backends/meson.build
+++ b/backends/meson.build
@@ -24,7 +24,7 @@ if have_vhost_user
   system_ss.add(when: 'CONFIG_VIRTIO', if_true: files('vhost-user.c'))
 endif
 system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost.c'))
-system_ss.add(when: 'CONFIG_IOMMUFD', if_true: files('iommufd.c'))
+system_ss.add(when: 'CONFIG_IOMMUFD', if_true: files('iommufd.c', 'iommufd_device.c'))
 if have_vhost_user_crypto
   system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost-user.c'))
 endif

From patchwork Mon Jan 15 10:13:09 2024
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13519465
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com,
    peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, jgg@nvidia.com,
    nicolinc@nvidia.com, joao.m.martins@oracle.com, kevin.tian@intel.com,
    yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com,
    Yi Sun, Zhenzhong Duan, Marcel Apfelbaum
Subject: [PATCH rfcv1 2/6] hw/pci: introduce pci_device_set/unset_iommu_device()
Date: Mon, 15 Jan 2024 18:13:09 +0800
Message-Id: <20240115101313.131139-3-zhenzhong.duan@intel.com>
In-Reply-To: <20240115101313.131139-1-zhenzhong.duan@intel.com>
References: <20240115101313.131139-1-zhenzhong.duan@intel.com>

From: Yi Liu

This adds pci_device_set/unset_iommu_device() to set/unset IOMMUFDDevice for
a given PCIe device. The caller of the set operation should fail if the set
operation fails.

Extract out pci_device_get_iommu_bus_devfn() to facilitate the
implementation of pci_device_set/unset_iommu_device().

Signed-off-by: Yi Liu
Signed-off-by: Yi Sun
Signed-off-by: Nicolin Chen
Signed-off-by: Zhenzhong Duan
---
 include/hw/pci/pci.h | 39 ++++++++++++++++++++++++++++++++++-
 hw/pci/pci.c         | 49 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index fa6313aabc..a810c0ec74 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -7,6 +7,8 @@
 /* PCI includes legacy ISA access. */
 #include "hw/isa/isa.h"
 
+#include "sysemu/iommufd_device.h"
+
 extern bool pci_available;
 
 /* PCI bus */
@@ -384,10 +386,45 @@ typedef struct PCIIOMMUOps {
      *
      * @devfn: device and function number
      */
-    AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+    AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+    /**
+     * @set_iommu_device: set iommufd device for a PCI device to vIOMMU
+     *
+     * Optional callback, if not implemented in vIOMMU, then vIOMMU can't
+     * utilize iommufd specific features.
+     *
+     * Return 0 if the iommufd device is accepted, or a non-zero value with
+     * errp set otherwise.
+     *
+     * @bus: the #PCIBus of the PCI device.
+     *
+     * @opaque: the data passed to pci_setup_iommu().
+     *
+     * @devfn: device and function number of the PCI device.
+     *
+     * @idev: the data structure representing iommufd device.
+     *
+     */
+    int (*set_iommu_device)(PCIBus *bus, void *opaque, int32_t devfn,
+                            IOMMUFDDevice *idev, Error **errp);
+    /**
+     * @unset_iommu_device: unset iommufd device for a PCI device from vIOMMU
+     *
+     * Optional callback.
+     *
+     * @bus: the #PCIBus of the PCI device.
+     *
+     * @opaque: the data passed to pci_setup_iommu().
+     *
+     * @devfn: device and function number of the PCI device.
+     */
+    void (*unset_iommu_device)(PCIBus *bus, void *opaque, int32_t devfn);
 } PCIIOMMUOps;
 
 AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
+int pci_device_set_iommu_device(PCIDevice *dev, IOMMUFDDevice *idev,
+                                Error **errp);
+void pci_device_unset_iommu_device(PCIDevice *dev);
 
 /**
  * pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 76080af580..3848662f95 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2672,7 +2672,10 @@ static void pci_device_class_base_init(ObjectClass *klass, void *data)
     }
 }
 
-AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
+static void pci_device_get_iommu_bus_devfn(PCIDevice *dev,
+                                           PCIBus **aliased_pbus,
+                                           PCIBus **piommu_bus,
+                                           uint8_t *aliased_pdevfn)
 {
     PCIBus *bus = pci_get_bus(dev);
     PCIBus *iommu_bus = bus;
@@ -2717,6 +2720,18 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
         iommu_bus = parent_bus;
     }
 
+    *aliased_pbus = bus;
+    *piommu_bus = iommu_bus;
+    *aliased_pdevfn = devfn;
+}
+
+AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
+{
+    PCIBus *bus;
+    PCIBus *iommu_bus;
+    uint8_t devfn;
+
+    pci_device_get_iommu_bus_devfn(dev, &bus, &iommu_bus, &devfn);
     if (!pci_bus_bypass_iommu(bus) && iommu_bus->iommu_ops) {
         return iommu_bus->iommu_ops->get_address_space(bus,
                                  iommu_bus->iommu_opaque, devfn);
@@ -2724,6 +2739,38 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
     return &address_space_memory;
 }
 
+int pci_device_set_iommu_device(PCIDevice *dev, IOMMUFDDevice *idev,
+                                Error **errp)
+{
+    PCIBus *bus;
+    PCIBus *iommu_bus;
+    uint8_t devfn;
+
+    pci_device_get_iommu_bus_devfn(dev, &bus, &iommu_bus, &devfn);
+    if (!pci_bus_bypass_iommu(bus) && iommu_bus &&
+        iommu_bus->iommu_ops && iommu_bus->iommu_ops->set_iommu_device) {
+        return iommu_bus->iommu_ops->set_iommu_device(pci_get_bus(dev),
+                                                      iommu_bus->iommu_opaque,
+                                                      dev->devfn, idev, errp);
+    }
+    return 0;
+}
+
+void pci_device_unset_iommu_device(PCIDevice *dev)
+{
+    PCIBus *bus;
+    PCIBus *iommu_bus;
+    uint8_t devfn;
+
+    pci_device_get_iommu_bus_devfn(dev, &bus, &iommu_bus, &devfn);
+    if (!pci_bus_bypass_iommu(bus) && iommu_bus &&
+        iommu_bus->iommu_ops && iommu_bus->iommu_ops->unset_iommu_device) {
+        iommu_bus->iommu_ops->unset_iommu_device(pci_get_bus(dev),
+                                                 iommu_bus->iommu_opaque,
+                                                 dev->devfn);
+    }
+}
+
 void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
 {
     /*

From patchwork Mon Jan 15 10:13:10 2024
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13519467
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com,
    peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, jgg@nvidia.com,
    nicolinc@nvidia.com, joao.m.martins@oracle.com, kevin.tian@intel.com,
    yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com,
    Yi Sun, Zhenzhong Duan, Marcel Apfelbaum, Paolo Bonzini,
    Richard Henderson, Eduardo Habkost
Subject: [PATCH rfcv1 3/6] intel_iommu: add set/unset_iommu_device callback
Date: Mon, 15 Jan 2024 18:13:10 +0800
Message-Id: <20240115101313.131139-4-zhenzhong.duan@intel.com>
In-Reply-To: <20240115101313.131139-1-zhenzhong.duan@intel.com>
References: <20240115101313.131139-1-zhenzhong.duan@intel.com>

From: Yi Liu

This adds the set/unset_iommu_device() implementation for Intel vIOMMU. In
the set call, the IOMMUFDDevice is recorded in a hash table indexed by PCI
BDF.
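
For illustration only, the set/unset lifecycle described above can be
sketched without GLib or QEMU types. This is a minimal stand-alone sketch
and not the code in this patch: struct idev_stub, struct idev_table, the
table_* helpers and the fixed-size slot array are all hypothetical
stand-ins for IOMMUFDDevice and the GHashTable keyed by (bus, devfn).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEVFN_MAX   256  /* 32 devices x 8 functions per PCI bus */
#define TABLE_SLOTS 64   /* hypothetical capacity, enough for a sketch */

/* Hypothetical stand-in for IOMMUFDDevice. */
struct idev_stub {
    uint32_t dev_id;
};

struct slot {
    const void *bus;          /* bus identity; NULL marks a free slot */
    uint8_t devfn;
    struct idev_stub *idev;
};

struct idev_table {
    struct slot slots[TABLE_SLOTS];
};

/* Linear search for the (bus, devfn) key; bus must not be NULL. */
static struct slot *table_find(struct idev_table *t, const void *bus,
                               int devfn)
{
    assert(bus != NULL && devfn >= 0 && devfn < DEVFN_MAX);
    for (size_t i = 0; i < TABLE_SLOTS; i++) {
        if (t->slots[i].bus == bus && t->slots[i].devfn == devfn) {
            return &t->slots[i];
        }
    }
    return NULL;
}

/*
 * Record idev under (bus, devfn). A NULL idev is accepted and records
 * nothing. Returns 0 on success, -1 if the key already exists or the
 * table is full.
 */
static int table_set(struct idev_table *t, const void *bus, int devfn,
                     struct idev_stub *idev)
{
    if (!idev) {
        return 0;
    }
    if (table_find(t, bus, devfn)) {
        return -1;
    }
    for (size_t i = 0; i < TABLE_SLOTS; i++) {
        if (t->slots[i].bus == NULL) {
            t->slots[i].bus = bus;
            t->slots[i].devfn = (uint8_t)devfn;
            t->slots[i].idev = idev;
            return 0;
        }
    }
    return -1;
}

/* Forget the entry for (bus, devfn); a missing entry is not an error. */
static void table_unset(struct idev_table *t, const void *bus, int devfn)
{
    struct slot *s = table_find(t, bus, devfn);

    if (s) {
        s->bus = NULL;
        s->devfn = 0;
        s->idev = NULL;
    }
}

/* Return the recorded idev for (bus, devfn), or NULL if none. */
static struct idev_stub *table_lookup(struct idev_table *t, const void *bus,
                                      int devfn)
{
    struct slot *s = table_find(t, bus, devfn);

    return s ? s->idev : NULL;
}
```

The sketch rejects a second set for the same (bus, devfn) and treats unset
of an unknown key as a no-op, which is the behaviour the callbacks in this
patch aim for. Locking is left out on purpose.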

Signed-off-by: Yi Liu
Signed-off-by: Yi Sun
Signed-off-by: Zhenzhong Duan
---
 include/hw/i386/intel_iommu.h | 10 +++++
 hw/i386/intel_iommu.c         | 79 +++++++++++++++++++++++++++++++++++
 2 files changed, 89 insertions(+)

diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 7fa0a695c8..c65fdde56f 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -62,6 +62,7 @@ typedef union VTD_IR_TableEntry VTD_IR_TableEntry;
 typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress;
 typedef struct VTDPASIDDirEntry VTDPASIDDirEntry;
 typedef struct VTDPASIDEntry VTDPASIDEntry;
+typedef struct VTDIOMMUFDDevice VTDIOMMUFDDevice;
 
 /* Context-Entry */
 struct VTDContextEntry {
@@ -148,6 +149,13 @@ struct VTDAddressSpace {
     IOVATree *iova_tree;
 };
 
+struct VTDIOMMUFDDevice {
+    PCIBus *bus;
+    uint8_t devfn;
+    IOMMUFDDevice *idev;
+    IntelIOMMUState *iommu_state;
+};
+
 struct VTDIOTLBEntry {
     uint64_t gfn;
     uint16_t domain_id;
@@ -292,6 +300,8 @@ struct IntelIOMMUState {
     /* list of registered notifiers */
     QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
 
+    GHashTable *vtd_iommufd_dev;    /* VTDIOMMUFDDevice */
+
     /* interrupt remapping */
     bool intr_enabled;              /* Whether guest enabled IR */
     dma_addr_t intr_root;           /* Interrupt remapping table pointer */
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index ed5677c0ae..95faf697eb 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1, gconstpointer v2)
            (key1->pasid == key2->pasid);
 }
 
+static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
+{
+    const struct vtd_as_key *key1 = v1;
+    const struct vtd_as_key *key2 = v2;
+
+    return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
+}
 /*
  * Note that we use pointer to PCIBus as the key, so hashing/shifting
  * based on the pointer value is intended. Note that we deal with
@@ -3812,6 +3819,74 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
     return vtd_dev_as;
 }
 
+static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int32_t devfn,
+                                    IOMMUFDDevice *idev, Error **errp)
+{
+    IntelIOMMUState *s = opaque;
+    VTDIOMMUFDDevice *vtd_idev;
+    struct vtd_as_key key = {
+        .bus = bus,
+        .devfn = devfn,
+    };
+    struct vtd_as_key *new_key;
+
+    assert(0 <= devfn && devfn < PCI_DEVFN_MAX);
+
+    /* Non-IOMMUFD case */
+    if (!idev) {
+        return 0;
+    }
+
+    vtd_iommu_lock(s);
+
+    vtd_idev = g_hash_table_lookup(s->vtd_iommufd_dev, &key);
+    if (vtd_idev) {
+        error_setg(errp, "IOMMUFD device already exists");
+        vtd_iommu_unlock(s);
+        return -1;
+    }
+
+    new_key = g_malloc(sizeof(*new_key));
+    new_key->bus = bus;
+    new_key->devfn = devfn;
+
+    vtd_idev = g_malloc0(sizeof(VTDIOMMUFDDevice));
+    vtd_idev->bus = bus;
+    vtd_idev->devfn = (uint8_t)devfn;
+    vtd_idev->iommu_state = s;
+    vtd_idev->idev = idev;
+
+    g_hash_table_insert(s->vtd_iommufd_dev, new_key, vtd_idev);
+
+    vtd_iommu_unlock(s);
+
+    return 0;
+}
+
+static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int32_t devfn)
+{
+    IntelIOMMUState *s = opaque;
+    VTDIOMMUFDDevice *vtd_idev;
+    struct vtd_as_key key = {
+        .bus = bus,
+        .devfn = devfn,
+    };
+
+    assert(0 <= devfn && devfn < PCI_DEVFN_MAX);
+
+    vtd_iommu_lock(s);
+
+    vtd_idev = g_hash_table_lookup(s->vtd_iommufd_dev, &key);
+    if (!vtd_idev) {
+        vtd_iommu_unlock(s);
+        return;
+    }
+
+    g_hash_table_remove(s->vtd_iommufd_dev, &key);
+
+    vtd_iommu_unlock(s);
+}
+
 /* Unmap the whole range in the notifier's scope. */
 static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
 {
@@ -4107,6 +4182,8 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
 
 static PCIIOMMUOps vtd_iommu_ops = {
     .get_address_space = vtd_host_dma_iommu,
+    .set_iommu_device = vtd_dev_set_iommu_device,
+    .unset_iommu_device = vtd_dev_unset_iommu_device,
 };
 
 static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
@@ -4230,6 +4307,8 @@ static void vtd_realize(DeviceState *dev, Error **errp)
                                      g_free, g_free);
     s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash, vtd_as_equal,
                                       g_free, g_free);
+    s->vtd_iommufd_dev = g_hash_table_new_full(vtd_as_hash, vtd_as_idev_equal,
+                                               g_free, g_free);
     vtd_init(s);
     pci_setup_iommu(bus, &vtd_iommu_ops, dev);
     /* Pseudo address space under root PCI bus. */

From patchwork Mon Jan 15 10:13:11 2024
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13519469
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com,
    peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, jgg@nvidia.com,
    nicolinc@nvidia.com, joao.m.martins@oracle.com, kevin.tian@intel.com,
    yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com,
    Zhenzhong Duan, Yi Sun
Subject: [PATCH rfcv1 4/6] vfio: initialize IOMMUFDDevice and pass to vIOMMU
Date: Mon, 15 Jan 2024 18:13:11 +0800
Message-Id: <20240115101313.131139-5-zhenzhong.duan@intel.com>
In-Reply-To: <20240115101313.131139-1-zhenzhong.duan@intel.com>
References: <20240115101313.131139-1-zhenzhong.duan@intel.com>

Initialize IOMMUFDDevice in vfio and pass it to vIOMMU, so that vIOMMU can
get hw IOMMU information.

In VFIO legacy backend mode, we still pass a NULL IOMMUFDDevice to vIOMMU,
in case vIOMMU needs some processing for VFIO legacy backend mode.

Originally-by: Yi Liu
Signed-off-by: Nicolin Chen
Signed-off-by: Yi Sun
Signed-off-by: Zhenzhong Duan
---
 include/hw/vfio/vfio-common.h |  2 ++
 hw/vfio/iommufd.c             |  2 ++
 hw/vfio/pci.c                 | 24 +++++++++++++++++++-----
 3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 9b7ef7d02b..fde0d0ca60 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -31,6 +31,7 @@
 #endif
 #include "sysemu/sysemu.h"
 #include "hw/vfio/vfio-container-base.h"
+#include "sysemu/iommufd_device.h"
 
 #define VFIO_MSG_PREFIX "vfio %s: "
 
@@ -126,6 +127,7 @@ typedef struct VFIODevice {
     bool dirty_tracking;
     int devid;
     IOMMUFDBackend *iommufd;
+    IOMMUFDDevice idev;
 } VFIODevice;
 
 struct VFIODeviceOps {
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 9bfddc1360..cbd035f148 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -309,6 +309,7 @@ static int iommufd_cdev_attach(const char *name, VFIODevice *vbasedev,
     VFIOContainerBase *bcontainer;
     VFIOIOMMUFDContainer *container;
     VFIOAddressSpace *space;
+    IOMMUFDDevice *idev = &vbasedev->idev;
     struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
     int ret, devfd;
     uint32_t ioas_id;
@@ -428,6 +429,7 @@ found_container:
     QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next);
     QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next);
 
+    iommufd_device_init(idev, sizeof(*idev), container->be, vbasedev->devid);
     trace_iommufd_cdev_device_info(vbasedev->name, devfd, vbasedev->num_irqs,
                                    vbasedev->num_regions, vbasedev->flags);
     return 0;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index d7fe06715c..2c3a5d267b 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3107,11 +3107,21 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
 
     vfio_bars_register(vdev);
 
-    ret = vfio_add_capabilities(vdev, errp);
+    if (vbasedev->iommufd) {
+        ret = pci_device_set_iommu_device(pdev, &vbasedev->idev, errp);
+    } else {
+        ret = pci_device_set_iommu_device(pdev, NULL, errp);
+    }
     if (ret) {
+        error_prepend(errp, "Failed to set iommu_device: ");
         goto out_teardown;
     }
 
+    ret = vfio_add_capabilities(vdev, errp);
+    if (ret) {
+        goto out_unset_idev;
+    }
+
     if (vdev->vga) {
         vfio_vga_quirk_setup(vdev);
     }
@@ -3128,7 +3138,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
             error_setg(errp,
                        "cannot support IGD OpRegion feature on hotplugged "
                        "device");
-            goto out_teardown;
+            goto out_unset_idev;
         }
 
         ret = vfio_get_dev_region_info(vbasedev,
@@ -3137,13 +3147,13 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
         if (ret) {
             error_setg_errno(errp, -ret,
                              "does not support requested IGD OpRegion feature");
-            goto out_teardown;
+            goto out_unset_idev;
         }
 
         ret = vfio_pci_igd_opregion_init(vdev, opregion, errp);
         g_free(opregion);
         if (ret) {
-            goto out_teardown;
+            goto out_unset_idev;
         }
     }
 
@@ -3229,6 +3239,8 @@ out_deregister:
     if (vdev->intx.mmap_timer) {
         timer_free(vdev->intx.mmap_timer);
     }
+out_unset_idev:
+    pci_device_unset_iommu_device(pdev);
 out_teardown:
     vfio_teardown_msi(vdev);
     vfio_bars_exit(vdev);
@@ -3257,6 +3269,7 @@ static void vfio_instance_finalize(Object *obj)
 static void vfio_exitfn(PCIDevice *pdev)
 {
     VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
 
     vfio_unregister_req_notifier(vdev);
     vfio_unregister_err_notifier(vdev);
@@ -3271,7 +3284,8 @@ static void vfio_exitfn(PCIDevice *pdev)
     vfio_teardown_msi(vdev);
     vfio_pci_disable_rp_atomics(vdev);
     vfio_bars_exit(vdev);
-    vfio_migration_exit(&vdev->vbasedev);
+    vfio_migration_exit(vbasedev);
+    pci_device_unset_iommu_device(pdev);
 }
 
 static void vfio_pci_reset(DeviceState *dev)

From patchwork Mon Jan 15 10:13:12 2024
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13519464
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com,
    peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, jgg@nvidia.com,
    nicolinc@nvidia.com, joao.m.martins@oracle.com, kevin.tian@intel.com,
    yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com,
    Zhenzhong Duan, Paolo Bonzini, Richard Henderson, Eduardo Habkost,
    Marcel Apfelbaum
Subject: [PATCH rfcv1 5/6] intel_iommu: extract out vtd_cap_init to initialize cap/ecap
Date: Mon, 15 Jan 2024 18:13:12 +0800
Message-Id: <20240115101313.131139-6-zhenzhong.duan@intel.com>
In-Reply-To: <20240115101313.131139-1-zhenzhong.duan@intel.com>
References: <20240115101313.131139-1-zhenzhong.duan@intel.com>

This is a prerequisite for host cap/ecap sync. No functional change
intended.
Signed-off-by: Zhenzhong Duan
Reviewed-by: Eric Auger
---
 hw/i386/intel_iommu.c | 92 +++++++++++++++++++++++--------------------
 1 file changed, 50 insertions(+), 42 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 95faf697eb..4c1d058ebd 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4009,30 +4009,10 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
     return;
 }

-/* Do the initialization. It will also be called when reset, so pay
- * attention when adding new initialization stuff.
- */
-static void vtd_init(IntelIOMMUState *s)
+static void vtd_cap_init(IntelIOMMUState *s)
 {
     X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);

-    memset(s->csr, 0, DMAR_REG_SIZE);
-    memset(s->wmask, 0, DMAR_REG_SIZE);
-    memset(s->w1cmask, 0, DMAR_REG_SIZE);
-    memset(s->womask, 0, DMAR_REG_SIZE);
-
-    s->root = 0;
-    s->root_scalable = false;
-    s->dmar_enabled = false;
-    s->intr_enabled = false;
-    s->iq_head = 0;
-    s->iq_tail = 0;
-    s->iq = 0;
-    s->iq_size = 0;
-    s->qi_enabled = false;
-    s->iq_last_desc_type = VTD_INV_DESC_NONE;
-    s->iq_dw = false;
-    s->next_frcd_reg = 0;
     s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND |
              VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS |
              VTD_CAP_MGAW(s->aw_bits);
@@ -4049,27 +4029,6 @@ static void vtd_init(IntelIOMMUState *s)
     }
     s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;

-    /*
-     * Rsvd field masks for spte
-     */
-    vtd_spte_rsvd[0] = ~0ULL;
-    vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits,
-                                                  x86_iommu->dt_supported);
-    vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
-    vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
-    vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
-
-    vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits,
-                                                    x86_iommu->dt_supported);
-    vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits,
-                                                    x86_iommu->dt_supported);
-
-    if (s->scalable_mode || s->snoop_control) {
-        vtd_spte_rsvd[1] &= ~VTD_SPTE_SNP;
-        vtd_spte_rsvd_large[2] &= ~VTD_SPTE_SNP;
-        vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP;
-    }
-
     if (x86_iommu_ir_supported(x86_iommu)) {
         s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
         if (s->intr_eim == ON_OFF_AUTO_ON) {
@@ -4102,7 +4061,56 @@ static void vtd_init(IntelIOMMUState *s)
     if (s->pasid) {
         s->ecap |= VTD_ECAP_PASID;
     }
+}
+
+/*
+ * Do the initialization. It will also be called when reset, so pay
+ * attention when adding new initialization stuff.
+ */
+static void vtd_init(IntelIOMMUState *s)
+{
+    X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
+
+    memset(s->csr, 0, DMAR_REG_SIZE);
+    memset(s->wmask, 0, DMAR_REG_SIZE);
+    memset(s->w1cmask, 0, DMAR_REG_SIZE);
+    memset(s->womask, 0, DMAR_REG_SIZE);
+
+    s->root = 0;
+    s->root_scalable = false;
+    s->dmar_enabled = false;
+    s->intr_enabled = false;
+    s->iq_head = 0;
+    s->iq_tail = 0;
+    s->iq = 0;
+    s->iq_size = 0;
+    s->qi_enabled = false;
+    s->iq_last_desc_type = VTD_INV_DESC_NONE;
+    s->iq_dw = false;
+    s->next_frcd_reg = 0;
+
+    /*
+     * Rsvd field masks for spte
+     */
+    vtd_spte_rsvd[0] = ~0ULL;
+    vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits,
+                                                  x86_iommu->dt_supported);
+    vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
+    vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
+    vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
+
+    vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits,
+                                                    x86_iommu->dt_supported);
+    vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits,
+                                                    x86_iommu->dt_supported);
+
+    if (s->scalable_mode || s->snoop_control) {
+        vtd_spte_rsvd[1] &= ~VTD_SPTE_SNP;
+        vtd_spte_rsvd_large[2] &= ~VTD_SPTE_SNP;
+        vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP;
+    }

+    vtd_cap_init(s);
     vtd_reset_caches(s);

     /* Define registers with default values and bit semantics */

From patchwork Mon Jan 15 10:13:13 2024
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 13519470
From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com,
 peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, jgg@nvidia.com,
 nicolinc@nvidia.com, joao.m.martins@oracle.com, kevin.tian@intel.com,
 yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com,
 Yi Sun, Zhenzhong Duan, Paolo Bonzini, Richard Henderson,
 Eduardo Habkost, Marcel Apfelbaum
Subject: [PATCH rfcv1 6/6] intel_iommu: add a framework to check and sync host IOMMU cap/ecap
Date: Mon, 15 Jan 2024 18:13:13 +0800
Message-Id: <20240115101313.131139-7-zhenzhong.duan@intel.com>
In-Reply-To: <20240115101313.131139-1-zhenzhong.duan@intel.com>
References: <20240115101313.131139-1-zhenzhong.duan@intel.com>

From: Yi Liu

Add a framework to check and synchronize host IOMMU cap/ecap with
vIOMMU cap/ecap.

Currently only stage-2 translation is supported, and it is backed by a
shadow page table on the host side, so the cap/ecap bits of the vIOMMU
and the host do not need to match exactly. We can still use this
framework to ensure that at least the address widths are compatible,
i.e., vIOMMU's aw_bits <= host aw_bits, a check that was missing before.

When stage-1 translation (a.k.a. scalable modern mode) is supported in
the future, every bit will need to be compatible. Bits that are user
controllable should be checked against the host side. Bits that are not
user controllable should be synced into the vIOMMU cap/ecap.

The sequence will be:

iommu->cap/ecap is initialized. ---- vtd_cap_init()
iommu->host_cap/ecap is initialized from iommu->cap/ecap. ---- vtd_init()
iommu->host_cap/ecap is checked against the host cap/ecap and some bits
are updated from it. ---- vtd_sync_hw_info()
iommu->cap/ecap is finalized from iommu->host_cap/ecap. ---- vtd_machine_done_hook()

iommu->host_cap/ecap is temporary storage that holds the intermediate
value while the host cap/ecap is combined with the vIOMMU's initially
configured cap/ecap.

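The sequence above can be modelled in isolation as follows. This is only
a minimal sketch under stated assumptions, not the QEMU code: SketchIOMMU
and the sketch_* functions are hypothetical names, the cap encoding in
sketch_cap_init() is a placeholder, and the width comparison simply
copies the one in the patch as posted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for IntelIOMMUState (illustration only). */
typedef struct SketchIOMMU {
    uint8_t aw_bits;              /* user-configured vIOMMU address width */
    uint64_t cap, ecap;           /* values eventually exposed to the guest */
    uint64_t host_cap, host_ecap; /* scratch copies used while syncing */
    bool cap_finalized;           /* set once the machine-done step has run */
} SketchIOMMU;

/* Step 1: build cap/ecap from the configuration (placeholder encoding). */
static void sketch_cap_init(SketchIOMMU *s)
{
    assert(s->aw_bits > 0 && s->aw_bits <= 64);
    s->cap = ((uint64_t)(s->aw_bits - 1) & 0x3fULL) << 16;
    s->ecap = 0;
}

/* Step 2: seed the scratch copies; skipped once the values are finalized. */
static void sketch_init(SketchIOMMU *s)
{
    if (!s->cap_finalized) {
        sketch_cap_init(s);
        s->host_cap = s->cap;
        s->host_ecap = s->ecap;
    }
}

/*
 * Step 3: check one host IOMMU against the configuration. The comparison
 * uses the raw 6-bit field at bit 16 of the host cap register, as in the
 * patch; no per-bit sync happens here, matching the TODO in the patch.
 */
static bool sketch_sync_hw_info(const SketchIOMMU *s, uint64_t host_cap_reg)
{
    uint64_t addr_width = (host_cap_reg >> 16) & 0x3fULL;

    return s->aw_bits <= addr_width;
}

/* Step 4: publish the scratch copies and freeze them. */
static void sketch_machine_done(SketchIOMMU *s)
{
    s->cap = s->host_cap;
    s->ecap = s->host_ecap;
    s->cap_finalized = true;
}
```

A caller would run sketch_init() once, call sketch_sync_hw_info() for each
attached device, and then call sketch_machine_done(). A later sketch_init()
on reset leaves cap/ecap untouched because cap_finalized is already set.
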
Signed-off-by: Yi Liu
Signed-off-by: Yi Sun
Signed-off-by: Zhenzhong Duan
---
 include/hw/i386/intel_iommu.h |  4 ++
 hw/i386/intel_iommu.c         | 78 +++++++++++++++++++++++++++++++----
 2 files changed, 75 insertions(+), 7 deletions(-)

diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index c65fdde56f..b8abbcce12 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -292,6 +292,9 @@ struct IntelIOMMUState {
     uint64_t cap;                   /* The value of capability reg */
     uint64_t ecap;                  /* The value of extended capability reg */

+    uint64_t host_cap;              /* The value of host capability reg */
+    uint64_t host_ecap;             /* The value of host ext-capability reg */
+
     uint32_t context_cache_gen;     /* Should be in [1,MAX] */
     GHashTable *iotlb;              /* IOTLB */

@@ -314,6 +317,7 @@ struct IntelIOMMUState {
     bool dma_translation;           /* Whether DMA translation supported */
     bool pasid;                     /* Whether to support PASID */

+    bool cap_finalized;             /* Whether VTD capability finalized */
     /*
      * Protects IOMMU states in general. Currently it protects the
      * per-IOMMU IOTLB cache, and context entry cache in VTDAddressSpace.
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 4c1d058ebd..be03fcbf52 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3819,6 +3819,47 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
     return vtd_dev_as;
 }

+static bool vtd_sync_hw_info(IntelIOMMUState *s, struct iommu_hw_info_vtd *vtd,
+                             Error **errp)
+{
+    uint64_t addr_width;
+
+    addr_width = (vtd->cap_reg >> 16) & 0x3fULL;
+    if (s->aw_bits > addr_width) {
+        error_setg(errp, "User aw-bits: %u > host address width: %lu",
+                   s->aw_bits, addr_width);
+        return false;
+    }
+
+    /* TODO: check and sync host cap/ecap into vIOMMU cap/ecap */
+
+    return true;
+}
+
+/*
+ * virtual VT-d which wants nested needs to check the host IOMMU
+ * nesting cap info behind the assigned devices. Thus that vIOMMU
+ * could bind guest page table to host.
+ */
+static bool vtd_check_idev(IntelIOMMUState *s, IOMMUFDDevice *idev,
+                           Error **errp)
+{
+    struct iommu_hw_info_vtd vtd;
+    enum iommu_hw_info_type type = IOMMU_HW_INFO_TYPE_INTEL_VTD;
+
+    if (iommufd_device_get_info(idev, &type, sizeof(vtd), &vtd)) {
+        error_setg(errp, "Failed to get IOMMU capability!!!");
+        return false;
+    }
+
+    if (type != IOMMU_HW_INFO_TYPE_INTEL_VTD) {
+        error_setg(errp, "IOMMU hardware is not compatible!!!");
+        return false;
+    }
+
+    return vtd_sync_hw_info(s, &vtd, errp);
+}
+
 static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int32_t devfn,
                                     IOMMUFDDevice *idev, Error **errp)
 {
@@ -3837,6 +3878,10 @@ static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int32_t devfn,
         return 0;
     }

+    if (!vtd_check_idev(s, idev, errp)) {
+        return -1;
+    }
+
     vtd_iommu_lock(s);

     vtd_idev = g_hash_table_lookup(s->vtd_iommufd_dev, &key);
@@ -4071,10 +4116,11 @@ static void vtd_init(IntelIOMMUState *s)
 {
     X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);

-    memset(s->csr, 0, DMAR_REG_SIZE);
-    memset(s->wmask, 0, DMAR_REG_SIZE);
-    memset(s->w1cmask, 0, DMAR_REG_SIZE);
-    memset(s->womask, 0, DMAR_REG_SIZE);
+    /* CAP/ECAP are initialized in machine create done stage */
+    memset(s->csr + DMAR_GCMD_REG, 0, DMAR_REG_SIZE - DMAR_GCMD_REG);
+    memset(s->wmask + DMAR_GCMD_REG, 0, DMAR_REG_SIZE - DMAR_GCMD_REG);
+    memset(s->w1cmask + DMAR_GCMD_REG, 0, DMAR_REG_SIZE - DMAR_GCMD_REG);
+    memset(s->womask + DMAR_GCMD_REG, 0, DMAR_REG_SIZE - DMAR_GCMD_REG);

     s->root = 0;
     s->root_scalable = false;
@@ -4110,13 +4156,16 @@ static void vtd_init(IntelIOMMUState *s)
         vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP;
     }

-    vtd_cap_init(s);
+    if (!s->cap_finalized) {
+        vtd_cap_init(s);
+        s->host_cap = s->cap;
+        s->host_ecap = s->ecap;
+    }
+
     vtd_reset_caches(s);

     /* Define registers with default values and bit semantics */
     vtd_define_long(s, DMAR_VER_REG, 0x10UL, 0, 0);
-    vtd_define_quad(s, DMAR_CAP_REG, s->cap, 0, 0);
-    vtd_define_quad(s, DMAR_ECAP_REG, s->ecap, 0, 0);
     vtd_define_long(s, DMAR_GCMD_REG, 0, 0xff800000UL, 0);
     vtd_define_long_wo(s, DMAR_GCMD_REG, 0xff800000UL);
     vtd_define_long(s, DMAR_GSTS_REG, 0, 0, 0);
@@ -4241,6 +4290,12 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
     return true;
 }

+static void vtd_setup_capability_reg(IntelIOMMUState *s)
+{
+    vtd_define_quad(s, DMAR_CAP_REG, s->cap, 0, 0);
+    vtd_define_quad(s, DMAR_ECAP_REG, s->ecap, 0, 0);
+}
+
 static int vtd_machine_done_notify_one(Object *child, void *unused)
 {
     IntelIOMMUState *iommu = INTEL_IOMMU_DEVICE(x86_iommu_get_default());
@@ -4259,6 +4314,14 @@ static int vtd_machine_done_notify_one(Object *child, void *unused)

 static void vtd_machine_done_hook(Notifier *notifier, void *unused)
 {
+    IntelIOMMUState *iommu = INTEL_IOMMU_DEVICE(x86_iommu_get_default());
+
+    iommu->cap = iommu->host_cap;
+    iommu->ecap = iommu->host_ecap;
+    iommu->cap_finalized = true;
+
+    vtd_setup_capability_reg(iommu);
+
     object_child_foreach_recursive(object_get_root(),
                                    vtd_machine_done_notify_one, NULL);
 }
@@ -4292,6 +4355,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)

     QLIST_INIT(&s->vtd_as_with_notifiers);
     qemu_mutex_init(&s->iommu_lock);
+    s->cap_finalized = false;
     memory_region_init_io(&s->csrmem, OBJECT(s), &vtd_mem_ops, s,
                           "intel_iommu", DMAR_REG_SIZE);
     memory_region_add_subregion(get_system_memory(),