From patchwork Sat Jul 4 11:36:36 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 11643551
From: Liu Yi L
To: qemu-devel@nongnu.org, alex.williamson@redhat.com, peterx@redhat.com
Subject: [RFC v7 12/25] vfio: init HostIOMMUContext per-container
Date: Sat, 4 Jul 2020 04:36:36 -0700
Message-Id: <1593862609-36135-13-git-send-email-yi.l.liu@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1593862609-36135-1-git-send-email-yi.l.liu@intel.com>
References: <1593862609-36135-1-git-send-email-yi.l.liu@intel.com>
Cc: jean-philippe@linaro.org, kevin.tian@intel.com, yi.l.liu@intel.com,
 Yi Sun, kvm@vger.kernel.org, mst@redhat.com, jun.j.tian@intel.com,
 eric.auger@redhat.com, yi.y.sun@intel.com, Jacob Pan, pbonzini@redhat.com,
 hao.wu@intel.com, jasowang@redhat.com, david@gibson.dropbear.id.au

In this patch, QEMU first gets the IOMMU info from the kernel to check
the capabilities supported by a VFIO_IOMMU_TYPE1_NESTING
IOMMU. It then initializes a HostIOMMUContext instance. vfio-pci devices
can use pci_device_set/unset_iommu_context() to expose the host IOMMU
context to vIOMMU emulators, which can then use the methods provided by
the host IOMMU context, e.g. to propagate requests to the host IOMMU.

Cc: Kevin Tian
Cc: Jacob Pan
Cc: Peter Xu
Cc: Eric Auger
Cc: Yi Sun
Cc: David Gibson
Cc: Alex Williamson
Signed-off-by: Liu Yi L
---
 hw/vfio/common.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.c    |  17 +++++++++
 2 files changed, 128 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 7b92a58..2abd3d5 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1228,10 +1228,100 @@ static int vfio_host_iommu_ctx_pasid_free(HostIOMMUContext *iommu_ctx,
     return ret;
 }
 
+/**
+ * Get iommu info from host. The caller of this function should free
+ * the memory pointed to by the pointer returned in @info once it is
+ * no longer needed after a successful call.
+ */
+static int vfio_get_iommu_info(VFIOContainer *container,
+                               struct vfio_iommu_type1_info **info)
+{
+
+    size_t argsz = sizeof(struct vfio_iommu_type1_info);
+
+    *info = g_malloc0(argsz);
+
+retry:
+    (*info)->argsz = argsz;
+
+    if (ioctl(container->fd, VFIO_IOMMU_GET_INFO, *info)) {
+        g_free(*info);
+        *info = NULL;
+        return -errno;
+    }
+
+    if (((*info)->argsz > argsz)) {
+        argsz = (*info)->argsz;
+        *info = g_realloc(*info, argsz);
+        goto retry;
+    }
+
+    return 0;
+}
+
+static struct vfio_info_cap_header *
+vfio_get_iommu_info_cap(struct vfio_iommu_type1_info *info, uint16_t id)
+{
+    struct vfio_info_cap_header *hdr;
+    void *ptr = info;
+
+    if (!(info->flags & VFIO_IOMMU_INFO_CAPS)) {
+        return NULL;
+    }
+
+    for (hdr = ptr + info->cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
+        if (hdr->id == id) {
+            return hdr;
+        }
+    }
+
+    return NULL;
+}
+
+static int vfio_get_nesting_iommu_cap(VFIOContainer *container,
+                   struct vfio_iommu_type1_info_cap_nesting **cap_nesting)
+{
+    struct vfio_iommu_type1_info *info;
+    struct vfio_info_cap_header *hdr;
+    struct vfio_iommu_type1_info_cap_nesting *cap;
+    struct iommu_nesting_info *nest_info;
+    int ret;
+    uint32_t minsz1, minsz2;
+
+    ret = vfio_get_iommu_info(container, &info);
+    if (ret) {
+        return ret;
+    }
+
+    hdr = vfio_get_iommu_info_cap(info,
+                                  VFIO_IOMMU_TYPE1_INFO_CAP_NESTING);
+    if (!hdr) {
+        g_free(info);
+        return -EINVAL;
+    }
+
+    cap = container_of(hdr,
+                       struct vfio_iommu_type1_info_cap_nesting, header);
+
+    nest_info = (struct iommu_nesting_info *) &cap->info;
+    minsz1 = offsetof(struct iommu_nesting_info, data);
+    if (nest_info->size < minsz1) {
+        g_free(info);
+        return -EINVAL;
+    }
+    minsz2 = offsetof(struct vfio_iommu_type1_info_cap_nesting, info);
+    *cap_nesting = g_malloc0(minsz2 + nest_info->size);
+    memcpy(*cap_nesting, cap, minsz2 + nest_info->size);
+
+    g_free(info);
+    return 0;
+}
+
 static int vfio_init_container(VFIOContainer *container, int group_fd,
                                bool want_nested, Error **errp)
 {
     int iommu_type, ret;
+    uint64_t flags = 0;
 
     iommu_type = vfio_get_iommu_type(container, want_nested, errp);
     if (iommu_type < 0) {
@@ -1259,6 +1349,27 @@ static int vfio_init_container(VFIOContainer *container, int group_fd,
         return -errno;
     }
 
+    if (iommu_type == VFIO_TYPE1_NESTING_IOMMU) {
+        struct vfio_iommu_type1_info_cap_nesting *nesting = NULL;
+        struct iommu_nesting_info *nest_info;
+
+        ret = vfio_get_nesting_iommu_cap(container, &nesting);
+        if (ret) {
+            error_setg_errno(errp, -ret,
+                             "Failed to get nesting iommu cap");
+            return ret;
+        }
+
+        nest_info = (struct iommu_nesting_info *) &nesting->info;
+        flags |= (nest_info->features & IOMMU_NESTING_FEAT_SYSWIDE_PASID) ?
+                 HOST_IOMMU_PASID_REQUEST : 0;
+        host_iommu_ctx_init(&container->iommu_ctx,
+                            sizeof(container->iommu_ctx),
+                            TYPE_VFIO_HOST_IOMMU_CONTEXT,
+                            flags);
+        g_free(nesting);
+    }
+
     container->iommu_type = iommu_type;
     return 0;
 }
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 9d8d27f..b7045f0 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2710,6 +2710,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     VFIOPCIDevice *vdev = PCI_VFIO(pdev);
     VFIODevice *vbasedev_iter;
     VFIOGroup *group;
+    VFIOContainer *container;
     char *tmp, *subsys, group_path[PATH_MAX], *group_name;
     Error *err = NULL;
     ssize_t len;
@@ -2787,6 +2788,15 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
         goto error;
     }
 
+    container = group->container;
+    if (container->iommu_ctx.initialized &&
+        pci_device_set_iommu_context(pdev, &container->iommu_ctx)) {
+        error_setg(errp, "device attachment is denied by vIOMMU, "
+                   "please check host IOMMU nesting capability");
+        vfio_put_group(group);
+        goto error;
+    }
+
     QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
         if (strcmp(vbasedev_iter->name, vdev->vbasedev.name) == 0) {
             error_setg(errp, "device is already attached");
@@ -3072,9 +3082,16 @@ static void vfio_instance_finalize(Object *obj)
 static void vfio_exitfn(PCIDevice *pdev)
 {
     VFIOPCIDevice *vdev = PCI_VFIO(pdev);
+    VFIOContainer *container;
 
     vfio_unregister_req_notifier(vdev);
     vfio_unregister_err_notifier(vdev);
+
+    container = vdev->vbasedev.group->container;
+    if (container->iommu_ctx.initialized) {
+        pci_device_unset_iommu_context(pdev);
+    }
+
     pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
     if (vdev->irqchip_change_notifier.notify) {
         kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
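
Illustration only, not part of the patch: the capability lookup done by
vfio_get_iommu_info_cap() walks a chain of headers whose "next" fields are
byte offsets from the start of the info buffer, with 0 ending the chain.
Below is a minimal standalone sketch of that walk on a mock buffer. The
demo_* structures, the DEMO_INFO_CAPS flag and both helpers are hypothetical
stand-ins, not the kernel UAPI definitions. The bounds checks and the
"offsets must increase" rule are extra assumptions of this sketch and are
not present in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the info/capability layout (not the real UAPI). */
struct demo_cap_header {
    uint16_t id;
    uint16_t version;
    uint32_t next;       /* offset of next capability from buffer start, 0 = end */
};

struct demo_info {
    uint32_t argsz;      /* total size of the buffer contents */
    uint32_t flags;
    uint32_t cap_offset; /* offset of first capability, 0 = none */
};

#define DEMO_INFO_CAPS (1u << 0)

/*
 * Return the offset of the capability with the given id, or 0 if absent.
 * Every offset is checked against argsz, and (assumption of this sketch)
 * the chain must move forward, so a corrupted chain cannot loop forever.
 */
static uint32_t demo_find_cap(const void *buf, uint16_t id)
{
    const uint8_t *base = buf;
    struct demo_info info;
    struct demo_cap_header hdr;
    uint32_t off;

    memcpy(&info, base, sizeof(info));
    if (!(info.flags & DEMO_INFO_CAPS)) {
        return 0;
    }
    if (info.argsz < sizeof(info) + sizeof(hdr)) {
        return 0;
    }

    for (off = info.cap_offset; off != 0; off = hdr.next) {
        if (off < sizeof(info) || off > info.argsz - sizeof(hdr)) {
            return 0;                   /* header would fall outside the buffer */
        }
        memcpy(&hdr, base + off, sizeof(hdr));
        if (hdr.id == id) {
            return off;
        }
        if (hdr.next != 0 && hdr.next <= off) {
            return 0;                   /* chain does not move forward */
        }
    }
    return 0;
}

/*
 * Fill buf with an info header followed by two chained capabilities
 * (id 1 with an 8-byte payload, then id 3). Returns the number of bytes
 * used, or 0 if buf is too small.
 */
static uint32_t demo_build(uint8_t *buf, size_t len)
{
    struct demo_info info = { 0 };
    struct demo_cap_header a = { .id = 1, .version = 1, .next = 0 };
    struct demo_cap_header b = { .id = 3, .version = 1, .next = 0 };
    uint32_t off_a = sizeof(info);
    uint32_t off_b = off_a + sizeof(a) + 8;
    uint32_t total = off_b + sizeof(b);

    if (len < total) {
        return 0;
    }
    memset(buf, 0, total);
    a.next = off_b;
    info.argsz = total;
    info.flags = DEMO_INFO_CAPS;
    info.cap_offset = off_a;
    memcpy(buf, &info, sizeof(info));
    memcpy(buf + off_a, &a, sizeof(a));
    memcpy(buf + off_b, &b, sizeof(b));
    return total;
}
```

A caller would fill a buffer with demo_build(buf, sizeof(buf)) and then call
demo_find_cap(buf, 3) to locate the second capability. The sketch returns an
offset rather than a pointer so that it never dereferences a header it has
not bounds-checked first.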