From: Yi Liu
To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com
Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com,
    eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org,
    mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com,
    yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com,
    shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
    suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org,
    intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org,
    xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com,
    yanting.jiang@intel.com
Subject: [PATCH v2 05/10] vfio/pci: Allow passing zero-length fd array in VFIO_DEVICE_PCI_HOT_RESET
Date: Mon, 27 Mar 2023 02:34:53 -0700
Message-Id: <20230327093458.44939-6-yi.l.liu@intel.com>
In-Reply-To: <20230327093458.44939-1-yi.l.liu@intel.com>
References: <20230327093458.44939-1-yi.l.liu@intel.com>
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 13188906
X-Mailing-List: kvm@vger.kernel.org

A zero-length fd array serves as an alternative method for the ownership
check when iommufd is used. In this case, all opened devices in the
affected dev_set are verified to be bound to the same valid iommufd
before the reset is allowed. This is simpler and faster, as the user does
not need to pass a set of fds and the kernel does not need to search for
the device within the given fds.

A device in noiommu mode doesn't have a valid iommufd, so this method
should not be used in a dev_set that contains multiple devices when one
of them is in noiommu mode. The only allowed noiommu scenario is that the
calling device is noiommu and is in a singleton dev_set.

Suggested-by: Jason Gunthorpe
Signed-off-by: Jason Gunthorpe
Signed-off-by: Yi Liu
Reviewed-by: Jason Gunthorpe
---
 drivers/vfio/pci/vfio_pci_core.c | 42 +++++++++++++++++++++++++++-----
 include/uapi/linux/vfio.h        |  9 ++++++-
 2 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 3696b8e58445..b68fcba67a4b 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -180,7 +180,8 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_core_device *vdev)
 struct vfio_pci_group_info;
 static void vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set);
 static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set,
-				      struct vfio_pci_group_info *groups);
+				      struct vfio_pci_group_info *groups,
+				      struct iommufd_ctx *iommufd_ctx);
 
 /*
  * INTx masking requires the ability to disable INTx signaling via PCI_COMMAND
@@ -1277,7 +1278,7 @@ vfio_pci_ioctl_pci_hot_reset_groups(struct vfio_pci_core_device *vdev,
 		return ret;
 
 	/* Somewhere between 1 and count is OK */
-	if (!hdr->count || hdr->count > count)
+	if (hdr->count > count)
 		return -EINVAL;
 
 	group_fds = kcalloc(hdr->count, sizeof(*group_fds), GFP_KERNEL);
@@ -1326,7 +1327,7 @@ vfio_pci_ioctl_pci_hot_reset_groups(struct vfio_pci_core_device *vdev,
 	info.count = hdr->count;
 	info.files = files;
 
-	ret = vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, &info);
+	ret = vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, &info, NULL);
 
 hot_reset_release:
 	for (file_idx--; file_idx >= 0; file_idx--)
@@ -1341,6 +1342,7 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev,
 {
 	unsigned long minsz = offsetofend(struct vfio_pci_hot_reset, count);
 	struct vfio_pci_hot_reset hdr;
+	struct iommufd_ctx *iommufd;
 	bool slot = false;
 
 	if (copy_from_user(&hdr, arg, minsz))
@@ -1355,7 +1357,12 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev,
 	else if (pci_probe_reset_bus(vdev->pdev->bus))
 		return -ENODEV;
 
-	return vfio_pci_ioctl_pci_hot_reset_groups(vdev, &hdr, slot, arg);
+	if (hdr.count)
+		return vfio_pci_ioctl_pci_hot_reset_groups(vdev, &hdr, slot, arg);
+
+	iommufd = vfio_iommufd_physical_ictx(&vdev->vdev);
+
+	return vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, NULL, iommufd);
 }
 
 static int vfio_pci_ioctl_ioeventfd(struct vfio_pci_core_device *vdev,
@@ -2327,6 +2334,9 @@ static bool vfio_dev_in_groups(struct vfio_pci_core_device *vdev,
 {
 	unsigned int i;
 
+	if (!groups)
+		return false;
+
 	for (i = 0; i < groups->count; i++)
 		if (vfio_file_has_dev(groups->files[i], &vdev->vdev))
 			return true;
@@ -2402,13 +2412,25 @@ static int vfio_pci_dev_set_pm_runtime_get(struct vfio_device_set *dev_set)
 	return ret;
 }
 
+static bool vfio_dev_in_iommufd_ctx(struct vfio_pci_core_device *vdev,
+				    struct iommufd_ctx *iommufd_ctx)
+{
+	struct iommufd_ctx *iommufd = vfio_iommufd_physical_ictx(&vdev->vdev);
+
+	if (!iommufd)
+		return false;
+
+	return iommufd == iommufd_ctx;
+}
+
 /*
  * We need to get memory_lock for each device, but devices can share mmap_lock,
  * therefore we need to zap and hold the vma_lock for each device, and only then
  * get each memory_lock.
  */
 static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set,
-				      struct vfio_pci_group_info *groups)
+				      struct vfio_pci_group_info *groups,
+				      struct iommufd_ctx *iommufd_ctx)
 {
 	struct vfio_pci_core_device *cur_mem;
 	struct vfio_pci_core_device *cur_vma;
@@ -2448,9 +2470,17 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set,
 		 *
 		 * Otherwise all opened devices in the dev_set must be
 		 * contained by the set of groups provided by the user.
+		 *
+		 * If user provides a zero-length array, then all the
+		 * opened devices must be bound to a same iommufd_ctx.
+		 *
+		 * If all above checks are failed, reset is allowed only if
+		 * the calling device is in a singleton dev_set.
 		 */
 		if (cur_vma->vdev.open_count &&
-		    !vfio_dev_in_groups(cur_vma, groups)) {
+		    !vfio_dev_in_groups(cur_vma, groups) &&
+		    !vfio_dev_in_iommufd_ctx(cur_vma, iommufd_ctx) &&
+		    (dev_set->device_count > 1)) {
 			ret = -EINVAL;
 			goto err_undo;
 		}
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index f96e5689cffc..17aa5d09db41 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -679,7 +679,14 @@ struct vfio_pci_hot_reset_info {
  * the calling user must ensure all affected devices, if opened, are
  * owned by itself.
  *
- * The ownership is proved by an array of group fds.
+ * The ownership can be proved by:
+ *  - An array of group fds
+ *  - A zero-length array
+ *
+ * In the last case all affected devices which are opened by this user
+ * must have been bound to a same iommufd. If the calling device is in
+ * noiommu mode (no valid iommufd) then it can be reset only if the reset
+ * doesn't affect other devices.
  *
  * Return: 0 on success, -errno on failure.
  */
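
For readers following the ownership check, the combined condition in
vfio_pci_dev_set_hot_reset() can be modeled in plain userspace C. This is
only a sketch, not kernel code: struct model_dev, model_dev_in_iommufd_ctx()
and model_reset_allowed() are hypothetical names, the in_groups field stands
in for the result of vfio_dev_in_groups(), and opaque pointers stand in for
struct iommufd_ctx. Locking and the actual reset are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one device in a dev_set. */
struct model_dev {
	unsigned int open_count;	/* 0: nobody has the device open */
	bool in_groups;			/* stand-in for vfio_dev_in_groups() */
	const void *iommufd;		/* bound context, NULL in noiommu mode */
};

/* Mirrors vfio_dev_in_iommufd_ctx(): an unbound device never matches. */
static bool model_dev_in_iommufd_ctx(const struct model_dev *dev,
				     const void *iommufd_ctx)
{
	if (!dev->iommufd)
		return false;

	return dev->iommufd == iommufd_ctx;
}

/*
 * Returns true when every opened device passes the ownership check:
 * it is covered by the group fds, or bound to the caller's iommufd,
 * or the dev_set holds a single device.
 */
bool model_reset_allowed(const struct model_dev *devs, size_t device_count,
			 const void *caller_ctx)
{
	for (size_t i = 0; i < device_count; i++) {
		const struct model_dev *dev = &devs[i];

		if (dev->open_count &&
		    !dev->in_groups &&
		    !model_dev_in_iommufd_ctx(dev, caller_ctx) &&
		    device_count > 1)
			return false;
	}

	return true;
}
```

With a zero-length fd array, in_groups is false for every device, so the
outcome depends only on the iommufd comparison and the singleton rule: two
opened devices bound to the same context pass, two devices bound to
different contexts fail, and a noiommu caller passes only when it is alone
in its dev_set.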