From patchwork Mon May 22 11:57:51 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 13250250
From: Yi Liu
To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com
Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com,
 eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org,
 mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com,
 yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com,
 shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
 suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org,
 intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org,
 xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com,
 yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com
Subject: [PATCH v6 10/10] vfio/pci: Allow passing zero-length fd array in
 VFIO_DEVICE_PCI_HOT_RESET
Date: Mon, 22 May 2023 04:57:51 -0700
Message-Id: <20230522115751.326947-11-yi.l.liu@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230522115751.326947-1-yi.l.liu@intel.com>
References: <20230522115751.326947-1-yi.l.liu@intel.com>
X-Mailing-List: kvm@vger.kernel.org

This is how userspace invokes hot-reset for devices opened through the
cdev interface. Before doing a hot-reset on cdev devices, the user should
check that the VFIO_PCI_HOT_RESET_FLAG_DEV_ID_OWNED flag is set in the
output of the VFIO_DEVICE_GET_PCI_HOT_RESET_INFO ioctl.
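The userspace side of that flow might look roughly like the sketch below. It is only an illustration under stated assumptions: the helper names zero_length_reset_allowed() and cdev_hot_reset() are hypothetical, the flag values are assumed to match the definitions added earlier in this series (the #ifndef guards only cover installed headers that predate them), and the caller is assumed to have already obtained the flags from VFIO_DEVICE_GET_PCI_HOT_RESET_INFO.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Assumption: these values mirror the flag definitions introduced earlier in
 * this series. The guards only matter for older installed uapi headers.
 */
#ifndef VFIO_PCI_HOT_RESET_FLAG_DEV_ID
#define VFIO_PCI_HOT_RESET_FLAG_DEV_ID		(1 << 0)
#endif
#ifndef VFIO_PCI_HOT_RESET_FLAG_DEV_ID_OWNED
#define VFIO_PCI_HOT_RESET_FLAG_DEV_ID_OWNED	(1 << 1)
#endif

/*
 * Hypothetical helper: returns nonzero if the flags reported by
 * VFIO_DEVICE_GET_PCI_HOT_RESET_INFO permit a zero-length fd array.
 * DEV_ID_OWNED is only meaningful when DEV_ID is also set.
 */
static int zero_length_reset_allowed(uint32_t info_flags)
{
	const uint32_t need = VFIO_PCI_HOT_RESET_FLAG_DEV_ID |
			      VFIO_PCI_HOT_RESET_FLAG_DEV_ID_OWNED;

	return (info_flags & need) == need;
}

/*
 * Hypothetical helper: request a hot reset on a cdev-opened device fd with
 * count == 0, i.e. no group fds. Returns 0 on success or -errno on failure.
 */
static int cdev_hot_reset(int device_fd)
{
	struct vfio_pci_hot_reset reset = {
		.argsz = sizeof(reset),
		.flags = 0,
		.count = 0,
	};

	if (ioctl(device_fd, VFIO_DEVICE_PCI_HOT_RESET, &reset) < 0)
		return -errno;
	return 0;
}

/* Quick self-check of the flag logic; no device is needed for this part. */
static void flag_logic_examples(void)
{
	assert(zero_length_reset_allowed(VFIO_PCI_HOT_RESET_FLAG_DEV_ID |
					 VFIO_PCI_HOT_RESET_FLAG_DEV_ID_OWNED));
	assert(!zero_length_reset_allowed(VFIO_PCI_HOT_RESET_FLAG_DEV_ID));
}
```

A caller would run cdev_hot_reset() only after zero_length_reset_allowed() accepts the reported flags. If some affected device is not owned, the kernel is expected to reject the request with -EINVAL.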
Suggested-by: Jason Gunthorpe
Signed-off-by: Jason Gunthorpe
Reviewed-by: Jason Gunthorpe
Tested-by: Yanting Jiang
Signed-off-by: Yi Liu
---
 drivers/vfio/pci/vfio_pci_core.c | 56 +++++++++++++++++++++++++-------
 include/uapi/linux/vfio.h        | 14 ++++++++
 2 files changed, 59 insertions(+), 11 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 890065f846e4..67f1cb426505 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -181,7 +181,8 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_core_device *vdev)
 struct vfio_pci_group_info;
 static void vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set);
 static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set,
-				      struct vfio_pci_group_info *groups);
+				      struct vfio_pci_group_info *groups,
+				      struct iommufd_ctx *iommufd_ctx);
 
 /*
  * INTx masking requires the ability to disable INTx signaling via PCI_COMMAND
@@ -1301,8 +1302,7 @@ vfio_pci_ioctl_pci_hot_reset_groups(struct vfio_pci_core_device *vdev,
 	if (ret)
 		return ret;
 
-	/* Somewhere between 1 and count is OK */
-	if (!array_count || array_count > count)
+	if (array_count > count || vfio_device_cdev_opened(&vdev->vdev))
 		return -EINVAL;
 
 	group_fds = kcalloc(array_count, sizeof(*group_fds), GFP_KERNEL);
@@ -1351,7 +1351,7 @@ vfio_pci_ioctl_pci_hot_reset_groups(struct vfio_pci_core_device *vdev,
 	info.count = array_count;
 	info.files = files;
 
-	ret = vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, &info);
+	ret = vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, &info, NULL);
 
 hot_reset_release:
 	for (file_idx--; file_idx >= 0; file_idx--)
@@ -1380,7 +1380,11 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev,
 	else if (pci_probe_reset_bus(vdev->pdev->bus))
 		return -ENODEV;
 
-	return vfio_pci_ioctl_pci_hot_reset_groups(vdev, hdr.count, slot, arg);
+	if (hdr.count)
+		return vfio_pci_ioctl_pci_hot_reset_groups(vdev, hdr.count, slot, arg);
+
+	return vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, NULL,
+					  vfio_iommufd_device_ictx(&vdev->vdev));
 }
 
 static int vfio_pci_ioctl_ioeventfd(struct vfio_pci_core_device *vdev,
@@ -2347,13 +2351,16 @@ const struct pci_error_handlers vfio_pci_core_err_handlers = {
 };
 EXPORT_SYMBOL_GPL(vfio_pci_core_err_handlers);
 
-static bool vfio_dev_in_groups(struct vfio_pci_core_device *vdev,
+static bool vfio_dev_in_groups(struct vfio_device *vdev,
 			       struct vfio_pci_group_info *groups)
 {
 	unsigned int i;
 
+	if (!groups)
+		return false;
+
 	for (i = 0; i < groups->count; i++)
-		if (vfio_file_has_dev(groups->files[i], &vdev->vdev))
+		if (vfio_file_has_dev(groups->files[i], vdev))
 			return true;
 	return false;
 }
@@ -2429,7 +2436,8 @@ static int vfio_pci_dev_set_pm_runtime_get(struct vfio_device_set *dev_set)
  * get each memory_lock.
  */
 static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set,
-				      struct vfio_pci_group_info *groups)
+				      struct vfio_pci_group_info *groups,
+				      struct iommufd_ctx *iommufd_ctx)
 {
 	struct vfio_pci_core_device *cur_mem;
 	struct vfio_pci_core_device *cur_vma;
@@ -2459,11 +2467,37 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set,
 		goto err_unlock;
 
 	list_for_each_entry(cur_vma, &dev_set->device_list, vdev.dev_set_list) {
+		bool owned;
+
 		/*
-		 * Test whether all the affected devices are contained by the
-		 * set of groups provided by the user.
+		 * Test whether all the affected devices can be reset by the
+		 * user.
+		 *
+		 * If the user provides a set of groups, all the devices
+		 * in the dev_set should be contained by the set of groups
+		 * provided by the user.
+		 *
+		 * If the user provides a zero-length group fd array, then
+		 * all the devices in the dev_set must be bound to the same
+		 * iommufd_ctx as the input iommufd_ctx.  If there is any
+		 * device that has not been bound to iommufd_ctx yet, check
+		 * if its iommu_group has any device bound to the input
+		 * iommufd_ctx.  Such devices can be considered owned by
+		 * the input iommufd_ctx as the device cannot be owned
+		 * by another iommufd_ctx when its iommu_group is owned.
+		 *
+		 * Otherwise, reset is not allowed.
 		 */
-		if (!vfio_dev_in_groups(cur_vma, groups)) {
+		if (iommufd_ctx) {
+			int devid = vfio_iommufd_device_hot_reset_devid(&cur_vma->vdev,
+									iommufd_ctx);
+
+			owned = (devid != VFIO_PCI_DEVID_NOT_OWNED);
+		} else {
+			owned = vfio_dev_in_groups(&cur_vma->vdev, groups);
+		}
+
+		if (!owned) {
 			ret = -EINVAL;
 			goto err_undo;
 		}
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 01203215251a..24858b650562 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -686,6 +686,9 @@ enum {
  * Flag VFIO_PCI_HOT_RESET_FLAG_DEV_ID_OWNED would be set when all the
  * affected devices are owned by the user.  This flag is available only
  * when VFIO_PCI_HOT_RESET_FLAG_DEV_ID is set, otherwise reserved.
+ * When set, user could invoke VFIO_DEVICE_PCI_HOT_RESET with a zero
+ * length fd array on the calling device as the ownership is validated
+ * by iommufd_ctx.
  *
  * Return: 0 on success, -errno on failure:
  *	-ENOSPC = insufficient buffer, -ENODEV = unsupported for device.
@@ -717,6 +720,17 @@ struct vfio_pci_hot_reset_info {
  * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13,
  *				    struct vfio_pci_hot_reset)
  *
+ * Userspace requests hot reset for the devices it operates.  Due to the
+ * underlying topology, multiple devices can be affected in the reset
+ * while some might be opened by another user.  To avoid interference
+ * the calling user must ensure all affected devices are owned by itself.
+ *
+ * As the ownership described by VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, the
+ * cdev opened devices must exclusively provide a zero-length fd array and
+ * the group opened devices must exclusively use an array of group fds for
+ * proof of ownership.  Mixed access to devices between cdev and legacy
+ * groups is not supported by this interface.
+ *
  * Return: 0 on success, -errno on failure.
  */
 struct vfio_pci_hot_reset {
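
For reviewers, the ownership rule described in the vfio_pci_dev_set_hot_reset() comment for the zero-length case can be pictured with the small standalone model below. It is not kernel code: struct model_dev, the integer ids, and all three helper names are hypothetical. As a simplification, it assumes every member of an iommu_group is also in the dev_set, and it uses 0 to mean "not bound to any iommufd_ctx".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the kernel's structures. */
struct model_dev {
	int iommu_group;	/* id of the device's iommu_group */
	int bound_ctx;		/* id of the bound iommufd_ctx, 0 if unbound */
};

/* True if any modelled device in @group is bound to @ctx. */
static bool ctx_has_group(const struct model_dev *devs, size_t n,
			  int ctx, int group)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].iommu_group == group && devs[i].bound_ctx == ctx)
			return true;
	return false;
}

/*
 * A device counts as owned by @ctx if it is bound to @ctx, or if it is
 * unbound and its iommu_group already has a device bound to @ctx.
 */
static bool dev_owned_by_ctx(const struct model_dev *devs, size_t n,
			     size_t idx, int ctx)
{
	if (devs[idx].bound_ctx == ctx)
		return true;
	if (devs[idx].bound_ctx != 0)
		return false;	/* bound to a different context */
	return ctx_has_group(devs, n, ctx, devs[idx].iommu_group);
}

/* Reset is allowed only if every device in the set is owned by @ctx. */
static bool hot_reset_allowed(const struct model_dev *dev_set, size_t n,
			      int ctx)
{
	for (size_t i = 0; i < n; i++)
		if (!dev_owned_by_ctx(dev_set, n, i, ctx))
			return false;
	return true;
}
```

In this model, a set of two devices in the same group passes when one is bound to the caller's context and the other is unbound. An unbound device in a group the context does not hold, or a device bound to a different context, makes the whole request fail. This mirrors the -EINVAL path in the patch.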