From patchwork Tue Jul 18 13:55:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317266 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6A7FCEB64DD for ; Tue, 18 Jul 2023 13:56:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233004AbjGRN4H (ORCPT ); Tue, 18 Jul 2023 09:56:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53896 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232997AbjGRN4C (ORCPT ); Tue, 18 Jul 2023 09:56:02 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 13268197; Tue, 18 Jul 2023 06:55:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688555; x=1721224555; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TLkUJVZL5PL9obIz6zhR7FKZQyM+AvVpBIitWqax0xw=; b=Ha6dGTp6XBfp80dOlWAVwS8MW310372K1Yqy2aZ4KmA8rRC/i69rwHSz UNpbRtP9ro4+/14KYSPYNDL078TjAtDKcx1wQWCjkq7MTxbI72M0VYzom ioYoEpFTnUhAwNzpyBMBgKUvK3IQCnWBYhglJQeQlSFEdZGJo8xxXwGqe 5+nAhr2fPJonzNg5AusHRGPFxQ63+gLKnPndJMM+xssA587mmjmREvQlK 8oyN+dI5kvRredE7DrTjzqCutKGacesbKntgLBjHENMO8A7cP0zXCaXMS rgZ3YGf1HHerIDsFBlY5FfufpduYFI8qlgHkYlSEAK7AzX68EE0VBSpRP w==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590537" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590537" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:55:53 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970250924" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970250924" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:55:53 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 01/26] vfio: Allocate per device file structure Date: Tue, 18 Jul 2023 06:55:26 -0700 Message-Id: <20230718135551.6592-2-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This is preparation for adding vfio device cdev support. vfio device cdev requires: 1) A per device file memory to store the kvm pointer set by KVM. It will be propagated to vfio_device:kvm after the device cdev file is bound to an iommufd. 2) A mechanism to block device access through device cdev fd before it is bound to an iommufd. To address the above requirements, this adds a per device file structure named vfio_device_file. For now, it's only a wrapper of struct vfio_device pointer. Other fields will be added to this per file structure in future commits. Reviewed-by: Kevin Tian Reviewed-by: Eric Auger Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/group.c | 13 +++++++++++-- drivers/vfio/vfio.h | 6 ++++++ drivers/vfio/vfio_main.c | 31 ++++++++++++++++++++++++++----- 3 files changed, 43 insertions(+), 7 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index fc75c1000d74..fbba9fc15e57 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -218,19 +218,26 @@ void vfio_device_group_close(struct vfio_device *device) static struct file *vfio_device_open_file(struct vfio_device *device) { + struct vfio_device_file *df; struct file *filep; int ret; + df = vfio_allocate_device_file(device); + if (IS_ERR(df)) { + ret = PTR_ERR(df); + goto err_out; + } + ret = vfio_device_group_open(device); if (ret) - goto err_out; + goto err_free; /* * We can't use anon_inode_getfd() because we need to modify * the f_mode flags directly to allow more than just ioctls */ filep = anon_inode_getfile("[vfio-device]", &vfio_device_fops, - device, O_RDWR); + df, O_RDWR); if (IS_ERR(filep)) { ret = PTR_ERR(filep); goto err_close_device; @@ -254,6 +261,8 @@ static struct file *vfio_device_open_file(struct vfio_device *device) err_close_device: vfio_device_group_close(device); +err_free: + kfree(df); err_out: return ERR_PTR(ret); } diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 7b19c621e0e6..87d3dd6b9ef9 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -16,11 +16,17 @@ struct iommufd_ctx; struct iommu_group; struct vfio_container; +struct vfio_device_file { + struct vfio_device *device; +}; + void vfio_device_put_registration(struct vfio_device *device); bool vfio_device_try_get_registration(struct vfio_device *device); int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd); void vfio_device_close(struct vfio_device *device, struct iommufd_ctx *iommufd); +struct vfio_device_file * +vfio_allocate_device_file(struct vfio_device *device); extern const struct file_operations vfio_device_fops; diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index ab4f3a794f78..39c1158ffef0 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -419,6 +419,20 @@ static bool vfio_assert_device_open(struct vfio_device *device) return !WARN_ON_ONCE(!READ_ONCE(device->open_count)); } +struct vfio_device_file * +vfio_allocate_device_file(struct vfio_device *device) +{ + struct vfio_device_file *df; + + df = kzalloc(sizeof(*df), GFP_KERNEL_ACCOUNT); + if (!df) + return ERR_PTR(-ENOMEM); + + df->device = device; + + return df; +} + static int vfio_device_first_open(struct vfio_device *device, struct iommufd_ctx *iommufd) { @@ -532,12 +546,15 @@ static inline void vfio_device_pm_runtime_put(struct vfio_device *device) */ static int vfio_device_fops_release(struct inode *inode, struct file *filep) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; vfio_device_group_close(device); vfio_device_put_registration(device); + kfree(df); + return 0; } @@ -1102,7 +1119,8 @@ static int vfio_ioctl_device_feature(struct vfio_device *device, static long vfio_device_fops_unl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; int ret; ret = vfio_device_pm_runtime_get(device); @@ -1129,7 +1147,8 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf, size_t count, loff_t *ppos) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; if (unlikely(!device->ops->read)) return -EINVAL; @@ -1141,7 +1160,8 @@ static ssize_t vfio_device_fops_write(struct file *filep, const char __user *buf, size_t count, loff_t *ppos) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; if (unlikely(!device->ops->write)) return -EINVAL; @@ -1151,7 +1171,8 @@ static ssize_t vfio_device_fops_write(struct file *filep, static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; if (unlikely(!device->ops->mmap)) return -EINVAL; From patchwork Tue Jul 18 13:55:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317267 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 81E13EB64DC for ; Tue, 18 Jul 2023 13:56:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233028AbjGRN4J (ORCPT ); Tue, 18 Jul 2023 09:56:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53892 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232586AbjGRN4E (ORCPT ); Tue, 18 Jul 2023 09:56:04 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F3B8DE52; Tue, 18 Jul 2023 06:55:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688556; x=1721224556; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=rDJ4P+I4IB3h9bC/RQx8HmPKkha0YdsOaiuNrvOnXiQ=; b=ZzrJ1hkVvaMdtEOzC203hlyaqdya3Wwa31pHG0a4Zk/O+2ajfx8H9+Xw 8Yfr6BTnIcPA4fhY2dLzk4QlAok0m2LqnbvM3KLoBut8TzWOvrUIjCeNi wGPs6TRGkj/Iw9w7mbhm/mDJ3WCAfo4M284OKXjY/rSzIvF10XR9BSp2C Zy8FUOak5QvCYgs41c5yqXpwwtYY0GvRlvoaDSCHIUnipVRWnjAdWf76d cd76DVHugk7w0iGLH1t9acCf0j2VMita6+KxXBYYT9s7JTf/GEB6nEpS5 f5pZ+XfKFI5wjXoLDwNl08EFT4Pgep4trw0sZjbzYSWHRRYYDu5Q9Dqiu A==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590548" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590548" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:55:54 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970250933" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970250933" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:55:54 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 02/26] vfio: Refine vfio file kAPIs for KVM Date: Tue, 18 Jul 2023 06:55:27 -0700 Message-Id: <20230718135551.6592-3-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This prepares for making the below kAPIs to accept both group file and device file instead of only vfio group file. bool vfio_file_enforced_coherent(struct file *file); void vfio_file_set_kvm(struct file *file, struct kvm *kvm); Reviewed-by: Kevin Tian Reviewed-by: Eric Auger Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/group.c | 53 +++++++++++++--------------------------- drivers/vfio/vfio.h | 3 +++ drivers/vfio/vfio_main.c | 49 +++++++++++++++++++++++++++++++++++++ include/linux/vfio.h | 1 + virt/kvm/vfio.c | 10 ++++---- 5 files changed, 75 insertions(+), 41 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index fbba9fc15e57..b56e19d2a02d 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -754,6 +754,15 @@ bool vfio_device_has_container(struct vfio_device *device) return device->group->container; } +struct vfio_group *vfio_group_from_file(struct file *file) +{ + struct vfio_group *group = file->private_data; + + if (file->f_op != &vfio_group_fops) + return NULL; + return group; +} + /** * vfio_file_iommu_group - Return the struct iommu_group for the vfio group file * @file: VFIO group file @@ -764,13 +773,13 @@ bool vfio_device_has_container(struct vfio_device *device) */ struct iommu_group *vfio_file_iommu_group(struct file *file) { - struct vfio_group *group = file->private_data; + struct vfio_group *group = vfio_group_from_file(file); struct iommu_group *iommu_group = NULL; if (!IS_ENABLED(CONFIG_SPAPR_TCE_IOMMU)) return NULL; - if (!vfio_file_is_group(file)) + if (!group) return NULL; mutex_lock(&group->group_lock); @@ -784,33 +793,20 @@ struct iommu_group *vfio_file_iommu_group(struct file *file) EXPORT_SYMBOL_GPL(vfio_file_iommu_group); /** - * vfio_file_is_group - True if the file is usable with VFIO aPIS + * vfio_file_is_group - True if the file is a vfio group file * @file: VFIO group file */ bool vfio_file_is_group(struct file *file) { - return file->f_op == &vfio_group_fops; + return vfio_group_from_file(file); } EXPORT_SYMBOL_GPL(vfio_file_is_group); -/** - * vfio_file_enforced_coherent - True if the DMA associated with the VFIO file - * is always CPU cache coherent - * @file: VFIO group file - * - * Enforced coherency means that the IOMMU ignores things like the PCIe no-snoop - * bit in DMA transactions. A return of false indicates that the user has - * rights to access additional instructions such as wbinvd on x86. - */ -bool vfio_file_enforced_coherent(struct file *file) +bool vfio_group_enforced_coherent(struct vfio_group *group) { - struct vfio_group *group = file->private_data; struct vfio_device *device; bool ret = true; - if (!vfio_file_is_group(file)) - return true; - /* * If the device does not have IOMMU_CAP_ENFORCE_CACHE_COHERENCY then * any domain later attached to it will also not support it. If the cap @@ -828,28 +824,13 @@ bool vfio_file_enforced_coherent(struct file *file) mutex_unlock(&group->device_lock); return ret; } -EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); -/** - * vfio_file_set_kvm - Link a kvm with VFIO drivers - * @file: VFIO group file - * @kvm: KVM to link - * - * When a VFIO device is first opened the KVM will be available in - * device->kvm if one was associated with the group. - */ -void vfio_file_set_kvm(struct file *file, struct kvm *kvm) +void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm) { - struct vfio_group *group = file->private_data; - - if (!vfio_file_is_group(file)) - return; - spin_lock(&group->kvm_ref_lock); group->kvm = kvm; spin_unlock(&group->kvm_ref_lock); } -EXPORT_SYMBOL_GPL(vfio_file_set_kvm); /** * vfio_file_has_dev - True if the VFIO file is a handle for device @@ -860,9 +841,9 @@ EXPORT_SYMBOL_GPL(vfio_file_set_kvm); */ bool vfio_file_has_dev(struct file *file, struct vfio_device *device) { - struct vfio_group *group = file->private_data; + struct vfio_group *group = vfio_group_from_file(file); - if (!vfio_file_is_group(file)) + if (!group) return false; return group == device->group; diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 87d3dd6b9ef9..b1e327a85a32 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -90,6 +90,9 @@ void vfio_device_group_unregister(struct vfio_device *device); int vfio_device_group_use_iommu(struct vfio_device *device); void vfio_device_group_unuse_iommu(struct vfio_device *device); void vfio_device_group_close(struct vfio_device *device); +struct vfio_group *vfio_group_from_file(struct file *file); +bool vfio_group_enforced_coherent(struct vfio_group *group); +void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm); bool vfio_device_has_container(struct vfio_device *device); int __init vfio_group_init(void); void vfio_group_cleanup(void); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 39c1158ffef0..4665791aa2eb 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1190,6 +1190,55 @@ const struct file_operations vfio_device_fops = { .mmap = vfio_device_fops_mmap, }; +/** + * vfio_file_is_valid - True if the file is valid vfio file + * @file: VFIO group file or VFIO device file + */ +bool vfio_file_is_valid(struct file *file) +{ + return vfio_group_from_file(file); +} +EXPORT_SYMBOL_GPL(vfio_file_is_valid); + +/** + * vfio_file_enforced_coherent - True if the DMA associated with the VFIO file + * is always CPU cache coherent + * @file: VFIO group file or VFIO device file + * + * Enforced coherency means that the IOMMU ignores things like the PCIe no-snoop + * bit in DMA transactions. A return of false indicates that the user has + * rights to access additional instructions such as wbinvd on x86. + */ +bool vfio_file_enforced_coherent(struct file *file) +{ + struct vfio_group *group; + + group = vfio_group_from_file(file); + if (group) + return vfio_group_enforced_coherent(group); + + return true; +} +EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); + +/** + * vfio_file_set_kvm - Link a kvm with VFIO drivers + * @file: VFIO group file or VFIO device file + * @kvm: KVM to link + * + * When a VFIO device is first opened the KVM will be available in + * device->kvm if one was associated with the file. + */ +void vfio_file_set_kvm(struct file *file, struct kvm *kvm) +{ + struct vfio_group *group; + + group = vfio_group_from_file(file); + if (group) + vfio_group_set_kvm(group, kvm); +} +EXPORT_SYMBOL_GPL(vfio_file_set_kvm); + /* * Sub-module support */ diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 7079911edfb1..06a5221949c5 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -272,6 +272,7 @@ int vfio_mig_get_next_state(struct vfio_device *device, */ struct iommu_group *vfio_file_iommu_group(struct file *file); bool vfio_file_is_group(struct file *file); +bool vfio_file_is_valid(struct file *file); bool vfio_file_enforced_coherent(struct file *file); void vfio_file_set_kvm(struct file *file, struct kvm *kvm); bool vfio_file_has_dev(struct file *file, struct vfio_device *device); diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index 9584eb57e0ed..b33c7b8488b3 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -64,18 +64,18 @@ static bool kvm_vfio_file_enforced_coherent(struct file *file) return ret; } -static bool kvm_vfio_file_is_group(struct file *file) +static bool kvm_vfio_file_is_valid(struct file *file) { bool (*fn)(struct file *file); bool ret; - fn = symbol_get(vfio_file_is_group); + fn = symbol_get(vfio_file_is_valid); if (!fn) return false; ret = fn(file); - symbol_put(vfio_file_is_group); + symbol_put(vfio_file_is_valid); return ret; } @@ -154,8 +154,8 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd) if (!filp) return -EBADF; - /* Ensure the FD is a vfio group FD.*/ - if (!kvm_vfio_file_is_group(filp)) { + /* Ensure the FD is a vfio FD. */ + if (!kvm_vfio_file_is_valid(filp)) { ret = -EINVAL; goto err_fput; } From patchwork Tue Jul 18 13:55:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317268 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5AF5EB64DD for ; Tue, 18 Jul 2023 13:56:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232975AbjGRN4K (ORCPT ); Tue, 18 Jul 2023 09:56:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53896 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232925AbjGRN4E (ORCPT ); Tue, 18 Jul 2023 09:56:04 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D1A91170E; Tue, 18 Jul 2023 06:55:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688558; x=1721224558; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4oG2xApeqX8Foyu2rBLN9WcTPJ5+vsPt5f3HR9Hu99g=; b=FSOdt/4y3CnCihAxDACtnExUeftTXZV5eOpD+x64OnfQhf7hi9n3H1Xi EY4DgiymjWcjVV2dMJ5SRkBBPOKxuQNr2bdv9PIj5Q0FSr/6IZpgvkV3A T5ro0oL7WcdQKTAGPMCv4h44okZGwKT5wUxaoq08F9nzy2J8DbnYrd0h3 RNKF2QV1IrYEsgkpUWsZFWjI7EyuSuEaYdGDiS9aS26g1z/mVZwhobNyJ /6ybdxTkdTA44q3NTUdT6sQLHqM6rC8KvR/jilR2x3CjDKzE76+Yh3fOp nwFBkSSKeBlO0bnNhlL1dd1q7Mj85LhUKNZf4zvDGASaL9p7T47Wxpcro g==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590560" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590560" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:55:55 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970250939" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970250939" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:55:54 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 03/26] vfio: Accept vfio device file in the KVM facing kAPI Date: Tue, 18 Jul 2023 06:55:28 -0700 Message-Id: <20230718135551.6592-4-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This makes the vfio file kAPIs to accept vfio device files, also a preparation for vfio device cdev support. For the kvm set with vfio device file, kvm pointer is stored in struct vfio_device_file, and use kvm_ref_lock to protect kvm set and kvm pointer usage within VFIO. This kvm pointer will be set to vfio_device after device file is bound to iommufd in the cdev path. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/vfio.h | 3 +++ drivers/vfio/vfio_main.c | 36 +++++++++++++++++++++++++++++++++++- 2 files changed, 38 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index b1e327a85a32..332528af0846 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -18,6 +18,9 @@ struct vfio_container; struct vfio_device_file { struct vfio_device *device; + + spinlock_t kvm_ref_lock; /* protect kvm field */ + struct kvm *kvm; }; void vfio_device_put_registration(struct vfio_device *device); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 4665791aa2eb..8ef9210ad2aa 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -429,6 +429,7 @@ vfio_allocate_device_file(struct vfio_device *device) return ERR_PTR(-ENOMEM); df->device = device; + spin_lock_init(&df->kvm_ref_lock); return df; } @@ -1190,13 +1191,23 @@ const struct file_operations vfio_device_fops = { .mmap = vfio_device_fops_mmap, }; +static struct vfio_device *vfio_device_from_file(struct file *file) +{ + struct vfio_device_file *df = file->private_data; + + if (file->f_op != &vfio_device_fops) + return NULL; + return df->device; +} + /** * vfio_file_is_valid - True if the file is valid vfio file * @file: VFIO group file or VFIO device file */ bool vfio_file_is_valid(struct file *file) { - return vfio_group_from_file(file); + return vfio_group_from_file(file) || + vfio_device_from_file(file); } EXPORT_SYMBOL_GPL(vfio_file_is_valid); @@ -1211,16 +1222,36 @@ EXPORT_SYMBOL_GPL(vfio_file_is_valid); */ bool vfio_file_enforced_coherent(struct file *file) { + struct vfio_device *device; struct vfio_group *group; group = vfio_group_from_file(file); if (group) return vfio_group_enforced_coherent(group); + device = vfio_device_from_file(file); + if (device) + return device_iommu_capable(device->dev, + IOMMU_CAP_ENFORCE_CACHE_COHERENCY); + return true; } EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); +static void vfio_device_file_set_kvm(struct file *file, struct kvm *kvm) +{ + struct vfio_device_file *df = file->private_data; + + /* + * The kvm is first recorded in the vfio_device_file, and will + * be propagated to vfio_device::kvm when the file is bound to + * iommufd successfully in the vfio device cdev path. + */ + spin_lock(&df->kvm_ref_lock); + df->kvm = kvm; + spin_unlock(&df->kvm_ref_lock); +} + /** * vfio_file_set_kvm - Link a kvm with VFIO drivers * @file: VFIO group file or VFIO device file @@ -1236,6 +1267,9 @@ void vfio_file_set_kvm(struct file *file, struct kvm *kvm) group = vfio_group_from_file(file); if (group) vfio_group_set_kvm(group, kvm); + + if (vfio_device_from_file(file)) + vfio_device_file_set_kvm(file, kvm); } EXPORT_SYMBOL_GPL(vfio_file_set_kvm); From patchwork Tue Jul 18 13:55:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317269 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 51E6CEB64DC for ; Tue, 18 Jul 2023 13:56:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233018AbjGRN4N (ORCPT ); Tue, 18 Jul 2023 09:56:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53742 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232994AbjGRN4G (ORCPT ); Tue, 18 Jul 2023 09:56:06 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C74541722; Tue, 18 Jul 2023 06:55:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688559; x=1721224559; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9aWbJMv4eIbjmWozjpBiUlpb0XzIi2U7UEGv7aeoPoE=; b=O2OMWJklQtANgrAuxm/b0T7oT/e/w42Pk4/U6Fd3XBhxOC/eq4GjIqtC jkesVZ9BttRJZpFhJckB69vdfx1nC6NJdhyd7gBWHgoq2xbmAo/zM/JzB 9SE1AksSx6M4tGt+NQy95R37RVdpfX4yZZyNfvJKKMC33HoCAX4WqttVQ VtlXLrO/WbT3qXyvsunQJvcNeCICUvYyhpe0Yc3AeT5o/fgiEjmm0HWeV tIqjgXFnwalmKzVSiIWfZIsLiEeLZrgs8BnZG5lLiKeOuvvJs6ZUot079 8+oEJesnIIKLcdS9oTLBjTzOtGKmGdexAEjAMscW9Ko9lDhNTf4VSj9Nm w==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590574" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590574" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:55:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970250948" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970250948" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:55:55 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 04/26] kvm/vfio: Prepare for accepting vfio device fd Date: Tue, 18 Jul 2023 06:55:29 -0700 Message-Id: <20230718135551.6592-5-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This renames kvm_vfio_group related helpers to prepare for accepting vfio device fd. No functional change is intended. Reviewed-by: Kevin Tian Reviewed-by: Eric Auger Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- virt/kvm/vfio.c | 115 ++++++++++++++++++++++++------------------------ 1 file changed, 58 insertions(+), 57 deletions(-) diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index b33c7b8488b3..8f7fa07e8170 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -21,7 +21,7 @@ #include #endif -struct kvm_vfio_group { +struct kvm_vfio_file { struct list_head node; struct file *file; #ifdef CONFIG_SPAPR_TCE_IOMMU @@ -30,7 +30,7 @@ struct kvm_vfio_group { }; struct kvm_vfio { - struct list_head group_list; + struct list_head file_list; struct mutex lock; bool noncoherent; }; @@ -98,34 +98,35 @@ static struct iommu_group *kvm_vfio_file_iommu_group(struct file *file) } static void kvm_spapr_tce_release_vfio_group(struct kvm *kvm, - struct kvm_vfio_group *kvg) + struct kvm_vfio_file *kvf) { - if (WARN_ON_ONCE(!kvg->iommu_group)) + if (WARN_ON_ONCE(!kvf->iommu_group)) return; - kvm_spapr_tce_release_iommu_group(kvm, kvg->iommu_group); - iommu_group_put(kvg->iommu_group); - kvg->iommu_group = NULL; + kvm_spapr_tce_release_iommu_group(kvm, kvf->iommu_group); + iommu_group_put(kvf->iommu_group); + kvf->iommu_group = NULL; } #endif /* - * Groups can use the same or different IOMMU domains. If the same then - * adding a new group may change the coherency of groups we've previously - * been told about. We don't want to care about any of that so we retest - * each group and bail as soon as we find one that's noncoherent. This - * means we only ever [un]register_noncoherent_dma once for the whole device. + * Groups/devices can use the same or different IOMMU domains. If the same + * then adding a new group/device may change the coherency of groups/devices + * we've previously been told about. We don't want to care about any of + * that so we retest each group/device and bail as soon as we find one that's + * noncoherent. This means we only ever [un]register_noncoherent_dma once + * for the whole device. */ static void kvm_vfio_update_coherency(struct kvm_device *dev) { struct kvm_vfio *kv = dev->private; bool noncoherent = false; - struct kvm_vfio_group *kvg; + struct kvm_vfio_file *kvf; mutex_lock(&kv->lock); - list_for_each_entry(kvg, &kv->group_list, node) { - if (!kvm_vfio_file_enforced_coherent(kvg->file)) { + list_for_each_entry(kvf, &kv->file_list, node) { + if (!kvm_vfio_file_enforced_coherent(kvf->file)) { noncoherent = true; break; } @@ -143,10 +144,10 @@ static void kvm_vfio_update_coherency(struct kvm_device *dev) mutex_unlock(&kv->lock); } -static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd) +static int kvm_vfio_file_add(struct kvm_device *dev, unsigned int fd) { struct kvm_vfio *kv = dev->private; - struct kvm_vfio_group *kvg; + struct kvm_vfio_file *kvf; struct file *filp; int ret; @@ -162,27 +163,27 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd) mutex_lock(&kv->lock); - list_for_each_entry(kvg, &kv->group_list, node) { - if (kvg->file == filp) { + list_for_each_entry(kvf, &kv->file_list, node) { + if (kvf->file == filp) { ret = -EEXIST; goto err_unlock; } } - kvg = kzalloc(sizeof(*kvg), GFP_KERNEL_ACCOUNT); - if (!kvg) { + kvf = kzalloc(sizeof(*kvf), GFP_KERNEL_ACCOUNT); + if (!kvf) { ret = -ENOMEM; goto err_unlock; } - kvg->file = filp; - list_add_tail(&kvg->node, &kv->group_list); + kvf->file = filp; + list_add_tail(&kvf->node, &kv->file_list); kvm_arch_start_assignment(dev->kvm); mutex_unlock(&kv->lock); - kvm_vfio_file_set_kvm(kvg->file, dev->kvm); + kvm_vfio_file_set_kvm(kvf->file, dev->kvm); kvm_vfio_update_coherency(dev); return 0; @@ -193,10 +194,10 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd) return ret; } -static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd) +static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd) { struct kvm_vfio *kv = dev->private; - struct kvm_vfio_group *kvg; + struct kvm_vfio_file *kvf; struct fd f; int ret; @@ -208,18 +209,18 @@ static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd) mutex_lock(&kv->lock); - list_for_each_entry(kvg, &kv->group_list, node) { - if (kvg->file != f.file) + list_for_each_entry(kvf, &kv->file_list, node) { + if (kvf->file != f.file) continue; - list_del(&kvg->node); + list_del(&kvf->node); kvm_arch_end_assignment(dev->kvm); #ifdef CONFIG_SPAPR_TCE_IOMMU - kvm_spapr_tce_release_vfio_group(dev->kvm, kvg); + kvm_spapr_tce_release_vfio_group(dev->kvm, kvf); #endif - kvm_vfio_file_set_kvm(kvg->file, NULL); - fput(kvg->file); - kfree(kvg); + kvm_vfio_file_set_kvm(kvf->file, NULL); + fput(kvf->file); + kfree(kvf); ret = 0; break; } @@ -234,12 +235,12 @@ static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd) } #ifdef CONFIG_SPAPR_TCE_IOMMU -static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev, - void __user *arg) +static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev, + void __user *arg) { struct kvm_vfio_spapr_tce param; struct kvm_vfio *kv = dev->private; - struct kvm_vfio_group *kvg; + struct kvm_vfio_file *kvf; struct fd f; int ret; @@ -254,20 +255,20 @@ static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev, mutex_lock(&kv->lock); - list_for_each_entry(kvg, &kv->group_list, node) { - if (kvg->file != f.file) + list_for_each_entry(kvf, &kv->file_list, node) { + if (kvf->file != f.file) continue; - if (!kvg->iommu_group) { - kvg->iommu_group = kvm_vfio_file_iommu_group(kvg->file); - if (WARN_ON_ONCE(!kvg->iommu_group)) { + if (!kvf->iommu_group) { + kvf->iommu_group = kvm_vfio_file_iommu_group(kvf->file); + if (WARN_ON_ONCE(!kvf->iommu_group)) { ret = -EIO; goto err_fdput; } } ret = kvm_spapr_tce_attach_iommu_group(dev->kvm, param.tablefd, - kvg->iommu_group); + kvf->iommu_group); break; } @@ -278,8 +279,8 @@ static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev, } #endif -static int kvm_vfio_set_group(struct kvm_device *dev, long attr, - void __user *arg) +static int kvm_vfio_set_file(struct kvm_device *dev, long attr, + void __user *arg) { int32_t __user *argp = arg; int32_t fd; @@ -288,16 +289,16 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, case KVM_DEV_VFIO_GROUP_ADD: if (get_user(fd, argp)) return -EFAULT; - return kvm_vfio_group_add(dev, fd); + return kvm_vfio_file_add(dev, fd); case KVM_DEV_VFIO_GROUP_DEL: if (get_user(fd, argp)) return -EFAULT; - return kvm_vfio_group_del(dev, fd); + return kvm_vfio_file_del(dev, fd); #ifdef CONFIG_SPAPR_TCE_IOMMU case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: - return kvm_vfio_group_set_spapr_tce(dev, arg); + return kvm_vfio_file_set_spapr_tce(dev, arg); #endif } @@ -309,8 +310,8 @@ static int kvm_vfio_set_attr(struct kvm_device *dev, { switch (attr->group) { case KVM_DEV_VFIO_GROUP: - return kvm_vfio_set_group(dev, attr->attr, - u64_to_user_ptr(attr->addr)); + return kvm_vfio_set_file(dev, attr->attr, + u64_to_user_ptr(attr->addr)); } return -ENXIO; @@ -339,16 +340,16 @@ static int kvm_vfio_has_attr(struct kvm_device *dev, static void kvm_vfio_release(struct kvm_device *dev) { struct kvm_vfio *kv = dev->private; - struct kvm_vfio_group *kvg, *tmp; + struct kvm_vfio_file *kvf, *tmp; - list_for_each_entry_safe(kvg, tmp, &kv->group_list, node) { + list_for_each_entry_safe(kvf, tmp, &kv->file_list, node) { #ifdef CONFIG_SPAPR_TCE_IOMMU - kvm_spapr_tce_release_vfio_group(dev->kvm, kvg); + kvm_spapr_tce_release_vfio_group(dev->kvm, kvf); #endif - kvm_vfio_file_set_kvm(kvg->file, NULL); - fput(kvg->file); - list_del(&kvg->node); - kfree(kvg); + kvm_vfio_file_set_kvm(kvf->file, NULL); + fput(kvf->file); + list_del(&kvf->node); + kfree(kvf); kvm_arch_end_assignment(dev->kvm); } @@ -382,7 +383,7 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type) if (!kv) return -ENOMEM; - INIT_LIST_HEAD(&kv->group_list); + INIT_LIST_HEAD(&kv->file_list); mutex_init(&kv->lock); dev->private = kv; From patchwork Tue Jul 18 13:55:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317270 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 06A49C001DC for ; Tue, 18 Jul 2023 13:56:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233025AbjGRN4X (ORCPT ); Tue, 18 Jul 2023 09:56:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53924 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229986AbjGRN4L (ORCPT ); Tue, 18 Jul 2023 09:56:11 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 679B6FA; Tue, 18 Jul 2023 06:56:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688563; x=1721224563; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Iu5pYaxHkXarQYAiu1kR/7i11cMaVu8HqXmRYfKOJmQ=; b=FwXE/OzIqWVC91Z89/LxBcMKU6qN3mEBSY7TvUywUv6FxWZTnIeU/D6q F/cNBSu7vku8XAE0agvvPwnyIqLvG9i1K7JwUOtk8W/YdS+MQGbaruC20 SUKQ/aR+powNyp+baSlTWeOrQ/zr+hY0bcxOloDmrRntWkKV45DQ34xh1 Zn171aO6A4hs5zYVf6EeqLW1lqyohBMr1Qwju0CnIZtC8wSyXc5HNUx3/ CPpfmqiqk+BRMBFdwvilYgzSMxnTSrc6F7PasamEAtqqVwMvs1FGzr352 ICJgXvNYCqNOdcHVTlTxYXjib3RBNAzvtSuL4Z7STmx5YQif0vJ83Xg8z g==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590589" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590589" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:55:57 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970250954" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970250954" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:55:56 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 05/26] kvm/vfio: Accept vfio device file from userspace Date: Tue, 18 Jul 2023 06:55:30 -0700 Message-Id: <20230718135551.6592-6-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This defines KVM_DEV_VFIO_FILE* and make alias with KVM_DEV_VFIO_GROUP*. Old userspace uses KVM_DEV_VFIO_GROUP* works as well. Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- Documentation/virt/kvm/devices/vfio.rst | 47 ++++++++++++++++--------- include/uapi/linux/kvm.h | 13 +++++-- virt/kvm/vfio.c | 12 +++---- 3 files changed, 47 insertions(+), 25 deletions(-) diff --git a/Documentation/virt/kvm/devices/vfio.rst b/Documentation/virt/kvm/devices/vfio.rst index 08b544212638..c549143bb891 100644 --- a/Documentation/virt/kvm/devices/vfio.rst +++ b/Documentation/virt/kvm/devices/vfio.rst @@ -9,22 +9,34 @@ Device types supported: - KVM_DEV_TYPE_VFIO Only one VFIO instance may be created per VM. The created device -tracks VFIO groups in use by the VM and features of those groups -important to the correctness and acceleration of the VM. As groups -are enabled and disabled for use by the VM, KVM should be updated -about their presence. When registered with KVM, a reference to the -VFIO-group is held by KVM. +tracks VFIO files (group or device) in use by the VM and features +of those groups/devices important to the correctness and acceleration +of the VM. As groups/devices are enabled and disabled for use by the +VM, KVM should be updated about their presence. When registered with +KVM, a reference to the VFIO file is held by KVM. Groups: - KVM_DEV_VFIO_GROUP - -KVM_DEV_VFIO_GROUP attributes: - KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking - kvm_device_attr.addr points to an int32_t file descriptor - for the VFIO group. - KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking - kvm_device_attr.addr points to an int32_t file descriptor - for the VFIO group. + KVM_DEV_VFIO_FILE + alias: KVM_DEV_VFIO_GROUP + +KVM_DEV_VFIO_FILE attributes: + KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device + tracking + + kvm_device_attr.addr points to an int32_t file descriptor for the + VFIO file. + + KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM + device tracking + + kvm_device_attr.addr points to an int32_t file descriptor for the + VFIO file. + +KVM_DEV_VFIO_GROUP (legacy kvm device group restricted to the handling of VFIO group fd): + KVM_DEV_VFIO_GROUP_ADD: same as KVM_DEV_VFIO_FILE_ADD for group fd only + + KVM_DEV_VFIO_GROUP_DEL: same as KVM_DEV_VFIO_FILE_DEL for group fd only + KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table allocated by sPAPR KVM. kvm_device_attr.addr points to a struct:: @@ -40,7 +52,10 @@ KVM_DEV_VFIO_GROUP attributes: - @tablefd is a file descriptor for a TCE table allocated via KVM_CREATE_SPAPR_TCE. -The GROUP_ADD operation above should be invoked prior to accessing the +The FILE/GROUP_ADD operation above should be invoked prior to accessing the device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support drivers which require a kvm pointer to be set in their .open_device() -callback. +callback. It is the same for device file descriptor via character device +open which gets device access via VFIO_DEVICE_BIND_IOMMUFD. For such file +descriptors, FILE_ADD should be invoked before VFIO_DEVICE_BIND_IOMMUFD +to support the drivers mentioned in prior sentence as well. diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index f089ab290978..13065dd96132 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1418,9 +1418,16 @@ struct kvm_device_attr { __u64 addr; /* userspace address of attr data */ }; -#define KVM_DEV_VFIO_GROUP 1 -#define KVM_DEV_VFIO_GROUP_ADD 1 -#define KVM_DEV_VFIO_GROUP_DEL 2 +#define KVM_DEV_VFIO_FILE 1 + +#define KVM_DEV_VFIO_FILE_ADD 1 +#define KVM_DEV_VFIO_FILE_DEL 2 + +/* KVM_DEV_VFIO_GROUP aliases are for compile time uapi compatibility */ +#define KVM_DEV_VFIO_GROUP KVM_DEV_VFIO_FILE + +#define KVM_DEV_VFIO_GROUP_ADD KVM_DEV_VFIO_FILE_ADD +#define KVM_DEV_VFIO_GROUP_DEL KVM_DEV_VFIO_FILE_DEL #define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE 3 enum kvm_device_type { diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index 8f7fa07e8170..07cb5f44b2a2 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -286,12 +286,12 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long attr, int32_t fd; switch (attr) { - case KVM_DEV_VFIO_GROUP_ADD: + case KVM_DEV_VFIO_FILE_ADD: if (get_user(fd, argp)) return -EFAULT; return kvm_vfio_file_add(dev, fd); - case KVM_DEV_VFIO_GROUP_DEL: + case KVM_DEV_VFIO_FILE_DEL: if (get_user(fd, argp)) return -EFAULT; return kvm_vfio_file_del(dev, fd); @@ -309,7 +309,7 @@ static int kvm_vfio_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr) { switch (attr->group) { - case KVM_DEV_VFIO_GROUP: + case KVM_DEV_VFIO_FILE: return kvm_vfio_set_file(dev, attr->attr, u64_to_user_ptr(attr->addr)); } @@ -321,10 +321,10 @@ static int kvm_vfio_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr) { switch (attr->group) { - case KVM_DEV_VFIO_GROUP: + case KVM_DEV_VFIO_FILE: switch (attr->attr) { - case KVM_DEV_VFIO_GROUP_ADD: - case KVM_DEV_VFIO_GROUP_DEL: + case KVM_DEV_VFIO_FILE_ADD: + case KVM_DEV_VFIO_FILE_DEL: #ifdef CONFIG_SPAPR_TCE_IOMMU case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: #endif From patchwork Tue Jul 18 13:55:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317271 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76B4EC04A6A for ; Tue, 18 Jul 2023 13:56:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231162AbjGRN4Z (ORCPT ); Tue, 18 Jul 2023 09:56:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54256 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233006AbjGRN4O (ORCPT ); Tue, 18 Jul 2023 09:56:14 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 65545126; Tue, 18 Jul 2023 06:56:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688564; x=1721224564; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=t+ScppNZYuwSLk7VIjKma/BQtVE+JzoGlQKAEeQLAf4=; b=RBFAozRVlLVs8agQbN5JINrOLED6q9UkuWEC9iCN53+yy8LUEhGqg881 ZsS+C+QIt/v99cSLF3dUtzkapBric9aWFXvSxxQh4VW70XV3iiUNRgKAu Nye80YK/MEQhqD6Xvcrnbq/LecaeRMJ94jS5K1f9TyQFF16HcfHCEFs2c h1YJ3g7pFp7jVFqOgQz2Xb1DR00WzwWEn0WBDUPIE4/DwlNIbdnXMgfXl M7LO9UAY8abuiLKFIRxFR5D/p6vuRnilblEGDQI5O6LmnyeGUfBDub7ni D7AsHck8ybTdHqpefAYT+AXg5aFl2M6Trve+sCAds7KNCkFoee0e73FeA w==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590603" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590603" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:55:57 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970250966" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970250966" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:55:57 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 06/26] vfio: Pass struct vfio_device_file * to vfio_device_open/close() Date: Tue, 18 Jul 2023 06:55:31 -0700 Message-Id: <20230718135551.6592-7-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This avoids passing too much parameters in multiple functions. Per the input parameter change, rename the function to be vfio_df_open/close(). Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Reviewed-by: Eric Auger Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/group.c | 20 ++++++++++++++------ drivers/vfio/vfio.h | 8 ++++---- drivers/vfio/vfio_main.c | 25 +++++++++++++++---------- 3 files changed, 33 insertions(+), 20 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index b56e19d2a02d..caf53716ddb2 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -169,8 +169,9 @@ static void vfio_device_group_get_kvm_safe(struct vfio_device *device) spin_unlock(&device->group->kvm_ref_lock); } -static int vfio_device_group_open(struct vfio_device *device) +static int vfio_df_group_open(struct vfio_device_file *df) { + struct vfio_device *device = df->device; int ret; mutex_lock(&device->group->group_lock); @@ -190,7 +191,11 @@ static int vfio_device_group_open(struct vfio_device *device) if (device->open_count == 0) vfio_device_group_get_kvm_safe(device); - ret = vfio_device_open(device, device->group->iommufd); + df->iommufd = device->group->iommufd; + + ret = vfio_df_open(df); + if (ret) + df->iommufd = NULL; if (device->open_count == 0) vfio_device_put_kvm(device); @@ -202,12 +207,15 @@ static int vfio_device_group_open(struct vfio_device *device) return ret; } -void vfio_device_group_close(struct vfio_device *device) +void vfio_df_group_close(struct vfio_device_file *df) { + struct vfio_device *device = df->device; + mutex_lock(&device->group->group_lock); mutex_lock(&device->dev_set->lock); - vfio_device_close(device, device->group->iommufd); + vfio_df_close(df); + df->iommufd = NULL; if (device->open_count == 0) vfio_device_put_kvm(device); @@ -228,7 +236,7 @@ static struct file *vfio_device_open_file(struct vfio_device *device) goto err_out; } - ret = vfio_device_group_open(device); + ret = vfio_df_group_open(df); if (ret) goto err_free; @@ -260,7 +268,7 @@ static struct file *vfio_device_open_file(struct vfio_device *device) return filep; err_close_device: - vfio_device_group_close(device); + vfio_df_group_close(df); err_free: kfree(df); err_out: diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 332528af0846..2094f5a4ef04 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -21,13 +21,13 @@ struct vfio_device_file { spinlock_t kvm_ref_lock; /* protect kvm field */ struct kvm *kvm; + struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */ }; void vfio_device_put_registration(struct vfio_device *device); bool vfio_device_try_get_registration(struct vfio_device *device); -int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd); -void vfio_device_close(struct vfio_device *device, - struct iommufd_ctx *iommufd); +int vfio_df_open(struct vfio_device_file *df); +void vfio_df_close(struct vfio_device_file *df); struct vfio_device_file * vfio_allocate_device_file(struct vfio_device *device); @@ -92,7 +92,7 @@ void vfio_device_group_register(struct vfio_device *device); void vfio_device_group_unregister(struct vfio_device *device); int vfio_device_group_use_iommu(struct vfio_device *device); void vfio_device_group_unuse_iommu(struct vfio_device *device); -void vfio_device_group_close(struct vfio_device *device); +void vfio_df_group_close(struct vfio_device_file *df); struct vfio_group *vfio_group_from_file(struct file *file); bool vfio_group_enforced_coherent(struct vfio_group *group); void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 8ef9210ad2aa..825b1eeaebe2 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -434,9 +434,10 @@ vfio_allocate_device_file(struct vfio_device *device) return df; } -static int vfio_device_first_open(struct vfio_device *device, - struct iommufd_ctx *iommufd) +static int vfio_df_device_first_open(struct vfio_device_file *df) { + struct vfio_device *device = df->device; + struct iommufd_ctx *iommufd = df->iommufd; int ret; lockdep_assert_held(&device->dev_set->lock); @@ -468,9 +469,11 @@ static int vfio_device_first_open(struct vfio_device *device, return ret; } -static void vfio_device_last_close(struct vfio_device *device, - struct iommufd_ctx *iommufd) +static void vfio_df_device_last_close(struct vfio_device_file *df) { + struct vfio_device *device = df->device; + struct iommufd_ctx *iommufd = df->iommufd; + lockdep_assert_held(&device->dev_set->lock); if (device->ops->close_device) @@ -482,15 +485,16 @@ static void vfio_device_last_close(struct vfio_device *device, module_put(device->dev->driver->owner); } -int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd) +int vfio_df_open(struct vfio_device_file *df) { + struct vfio_device *device = df->device; int ret = 0; lockdep_assert_held(&device->dev_set->lock); device->open_count++; if (device->open_count == 1) { - ret = vfio_device_first_open(device, iommufd); + ret = vfio_df_device_first_open(df); if (ret) device->open_count--; } @@ -498,14 +502,15 @@ int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd) return ret; } -void vfio_device_close(struct vfio_device *device, - struct iommufd_ctx *iommufd) +void vfio_df_close(struct vfio_device_file *df) { + struct vfio_device *device = df->device; + lockdep_assert_held(&device->dev_set->lock); vfio_assert_device_open(device); if (device->open_count == 1) - vfio_device_last_close(device, iommufd); + vfio_df_device_last_close(df); device->open_count--; } @@ -550,7 +555,7 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep) struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; - vfio_device_group_close(device); + vfio_df_group_close(df); vfio_device_put_registration(device); From patchwork Tue Jul 18 13:55:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317272 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 677E9EB64DA for ; Tue, 18 Jul 2023 13:56:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232994AbjGRN43 (ORCPT ); Tue, 18 Jul 2023 09:56:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53748 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232999AbjGRN4W (ORCPT ); Tue, 18 Jul 2023 09:56:22 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 825E119A1; Tue, 18 Jul 2023 06:56:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688566; x=1721224566; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aOMvzl84305gOZ+h1Dfu8gnbceCEOq/1RbMTWdPfv4A=; b=dJBo5lxfd0VEl9mRAABpRrJpmB653btx7ORG9NWJbZ2g4GfEK5b7j7XZ 8Ar2bG0kGQPZPsyMOY5i4GiMg048yQyBfH3dOc8jQm+vmLAmCwswBRtiF aaRAIBaDKvsJPNqQGgS4K7WOk35g/PDuNJCOCnvkwnHulFvExvCZDaA5J I4s7SkqmzDHd4gK+wj/WAjCOJ5Hv3rbaUJ8RDt4vRCkGzItH5yMTz+0Gb JZuAnxQaLy6Q90sZeLhDmQIn4AFTmYW7KUK9az+2hrnRJN1I/mZTOcfQb it0+G42R/Fzwhahi6DdPhwOtPh54kF6e0s+W3l2bbDQvVQ2iF+3kXfk84 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590623" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590623" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970250978" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970250978" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:55:57 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 07/26] vfio: Block device access via device fd until device is opened Date: Tue, 18 Jul 2023 06:55:32 -0700 Message-Id: <20230718135551.6592-8-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Allow the vfio_device file to be in a state where the device FD is opened but the device cannot be used by userspace (i.e. its .open_device() hasn't been called). This inbetween state is not used when the device FD is spawned from the group FD, however when we create the device FD directly by opening a cdev it will be opened in the blocked state. The reason for the inbetween state is that userspace only gets a FD but doesn't gain access permission until binding the FD to an iommufd. So in the blocked state, only the bind operation is allowed. Completing bind will allow user to further access the device. This is implemented by adding a flag in struct vfio_device_file to mark the blocked state and using a simple smp_load_acquire() to obtain the flag value and serialize all the device setup with the thread accessing this device. Following this lockless scheme, it can safely handle the device FD unbound->bound but it cannot handle bound->unbound. To allow this we'd need to add a lock on all the vfio ioctls which seems costly. So once device FD is bound, it remains bound until the FD is closed. Suggested-by: Jason Gunthorpe Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Reviewed-by: Eric Auger Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/group.c | 11 ++++++++++- drivers/vfio/vfio.h | 1 + drivers/vfio/vfio_main.c | 16 ++++++++++++++++ 3 files changed, 27 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index caf53716ddb2..088dd34c8931 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -194,9 +194,18 @@ static int vfio_df_group_open(struct vfio_device_file *df) df->iommufd = device->group->iommufd; ret = vfio_df_open(df); - if (ret) + if (ret) { df->iommufd = NULL; + goto out_put_kvm; + } + + /* + * Paired with smp_load_acquire() in vfio_device_fops::ioctl/ + * read/write/mmap and vfio_file_has_device_access() + */ + smp_store_release(&df->access_granted, true); +out_put_kvm: if (device->open_count == 0) vfio_device_put_kvm(device); diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 2094f5a4ef04..4478a1e77a5e 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -19,6 +19,7 @@ struct vfio_container; struct vfio_device_file { struct vfio_device *device; + u8 access_granted; spinlock_t kvm_ref_lock; /* protect kvm field */ struct kvm *kvm; struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */ diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 825b1eeaebe2..c37fc14599d0 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1129,6 +1129,10 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, struct vfio_device *device = df->device; int ret; + /* Paired with smp_store_release() following vfio_df_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + ret = vfio_device_pm_runtime_get(device); if (ret) return ret; @@ -1156,6 +1160,10 @@ static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf, struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + /* Paired with smp_store_release() following vfio_df_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + if (unlikely(!device->ops->read)) return -EINVAL; @@ -1169,6 +1177,10 @@ static ssize_t vfio_device_fops_write(struct file *filep, struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + /* Paired with smp_store_release() following vfio_df_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + if (unlikely(!device->ops->write)) return -EINVAL; @@ -1180,6 +1192,10 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma) struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + /* Paired with smp_store_release() following vfio_df_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + if (unlikely(!device->ops->mmap)) return -EINVAL; From patchwork Tue Jul 18 13:55:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317300 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58F1AEB64DD for ; Tue, 18 Jul 2023 13:56:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232808AbjGRN4e (ORCPT ); Tue, 18 Jul 2023 09:56:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53986 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230501AbjGRN4c (ORCPT ); Tue, 18 Jul 2023 09:56:32 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 055B31722; Tue, 18 Jul 2023 06:56:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688571; x=1721224571; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=A76vR4lYWoloNaH7epJbD8C0JWP11tKB0bA+8n6KH2g=; b=jZBPbtZRUuAGhEsL3JZVlzyY4Xz8WMTss2vjo/zdybMuwHaKCOLd4Vhi 9iWNCtK20hT5OZvRlZXMrDWuuwCK6KZ1u9C4BQSVAxOyXYLHBs8j3lb3J /NdJ2/BCABezD4oeJhIuZRzj9kgfFEdUpgvUgdtdbaSwoqFLOsatydUq/ NBeyhjHDqs77MCT/sl/fhstuANhEF59MacEOCraAV/0qvo0IAv0etMjvX dep6Nb+fPqALmC6XpSaVdeIWKQ7mHLW5JTRuBynxol/tQwG6FlRmQ4EeV pSzhbZnuEUkPjR/pDF40j0xHal1iJTxhqBov4KXB/BNdv76zM+TaG8I2U w==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590634" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590634" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970250984" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970250984" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:00 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 08/26] vfio: Add cdev_device_open_cnt to vfio_group Date: Tue, 18 Jul 2023 06:55:33 -0700 Message-Id: <20230718135551.6592-9-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This is for counting the devices that are opened via the cdev path. This count is increased and decreased by the cdev path. The group path checks it to achieve exclusion with the cdev path. With this, only one path (group path or cdev path) will claim DMA ownership. This avoids scenarios in which devices within the same group may be opened via different paths. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Reviewed-by: Eric Auger Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/group.c | 33 +++++++++++++++++++++++++++++++++ drivers/vfio/vfio.h | 3 +++ 2 files changed, 36 insertions(+) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 088dd34c8931..2751d61689c4 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -383,6 +383,33 @@ static long vfio_group_fops_unl_ioctl(struct file *filep, } } +int vfio_device_block_group(struct vfio_device *device) +{ + struct vfio_group *group = device->group; + int ret = 0; + + mutex_lock(&group->group_lock); + if (group->opened_file) { + ret = -EBUSY; + goto out_unlock; + } + + group->cdev_device_open_cnt++; + +out_unlock: + mutex_unlock(&group->group_lock); + return ret; +} + +void vfio_device_unblock_group(struct vfio_device *device) +{ + struct vfio_group *group = device->group; + + mutex_lock(&group->group_lock); + group->cdev_device_open_cnt--; + mutex_unlock(&group->group_lock); +} + static int vfio_group_fops_open(struct inode *inode, struct file *filep) { struct vfio_group *group = @@ -405,6 +432,11 @@ static int vfio_group_fops_open(struct inode *inode, struct file *filep) goto out_unlock; } + if (group->cdev_device_open_cnt) { + ret = -EBUSY; + goto out_unlock; + } + /* * Do we need multiple instances of the group open? Seems not. */ @@ -479,6 +511,7 @@ static void vfio_group_release(struct device *dev) mutex_destroy(&group->device_lock); mutex_destroy(&group->group_lock); WARN_ON(group->iommu_group); + WARN_ON(group->cdev_device_open_cnt); ida_free(&vfio.group_ida, MINOR(group->dev.devt)); kfree(group); } diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 4478a1e77a5e..ae7dd2ca14b9 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -84,8 +84,11 @@ struct vfio_group { struct blocking_notifier_head notifier; struct iommufd_ctx *iommufd; spinlock_t kvm_ref_lock; + unsigned int cdev_device_open_cnt; }; +int vfio_device_block_group(struct vfio_device *device); +void vfio_device_unblock_group(struct vfio_device *device); int vfio_device_set_group(struct vfio_device *device, enum vfio_group_type type); void vfio_device_remove_group(struct vfio_device *device); From patchwork Tue Jul 18 13:55:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317301 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 97008EB64DD for ; Tue, 18 Jul 2023 13:56:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233030AbjGRN4p (ORCPT ); Tue, 18 Jul 2023 09:56:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54698 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233029AbjGRN4k (ORCPT ); Tue, 18 Jul 2023 09:56:40 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9920E1BC8; Tue, 18 Jul 2023 06:56:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688574; x=1721224574; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=R9bRtQdJml9+TitiHGxi0ya5VN9YQ3JVW0CnvpAiveo=; b=J2qXgHwsphEEELbYRzM+hZlOcrRZGNv62r/8Mo3drahfSa7awJ04aBv4 UZ8TK8J++VuW91s6uwrJE35IsMQxyfg5sQbJwO+Gxba95ZsX3srh9keiG Q3fLF6KvbB/taTRVWlC783cSP0PGsF3soLqdo5gQcYWAmAVHlwT8kZLlI BcwJtf5SQUHIpnQR3GF4bgXnGowO7NLEjbFBSn+NGB+dfigMg4djumAey W8eFmIP+1/lLfkNkKG4xYJOZizxgRSRFpWm+HEmbKJ8uFoSWThiP263GL VZw9XC2oClGMXtUi7t9Vt0KA+cBWAbtSj9xvafo7kavs8D7ljyaxmwRfN Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590642" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590642" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:02 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251002" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251002" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:01 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 09/26] vfio: Make vfio_df_open() single open for device cdev path Date: Tue, 18 Jul 2023 06:55:34 -0700 Message-Id: <20230718135551.6592-10-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org VFIO group has historically allowed multi-open of the device FD. This was made secure because the "open" was executed via an ioctl to the group FD which is itself only single open. However, no known use of multiple device FDs today. It is kind of a strange thing to do because new device FDs can naturally be created via dup(). When we implement the new device uAPI (only used in cdev path) there is no natural way to allow the device itself from being multi-opened in a secure manner. Without the group FD we cannot prove the security context of the opener. Thus, when moving to the new uAPI we block the ability of opening a device multiple times. Given old group path still allows it we store a vfio_group pointer in struct vfio_device_file to differentiate. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Reviewed-by: Eric Auger Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/group.c | 2 ++ drivers/vfio/vfio.h | 1 + drivers/vfio/vfio_main.c | 7 +++++++ 3 files changed, 10 insertions(+) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 2751d61689c4..4e6277191eb4 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -245,6 +245,8 @@ static struct file *vfio_device_open_file(struct vfio_device *device) goto err_out; } + df->group = device->group; + ret = vfio_df_group_open(df); if (ret) goto err_free; diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index ae7dd2ca14b9..85484a971a3e 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -18,6 +18,7 @@ struct vfio_container; struct vfio_device_file { struct vfio_device *device; + struct vfio_group *group; u8 access_granted; spinlock_t kvm_ref_lock; /* protect kvm field */ diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index c37fc14599d0..be5e4ddd5901 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -492,6 +492,13 @@ int vfio_df_open(struct vfio_device_file *df) lockdep_assert_held(&device->dev_set->lock); + /* + * Only the group path allows the device to be opened multiple + * times. The device cdev path doesn't have a secure way for it. + */ + if (device->open_count != 0 && !df->group) + return -EINVAL; + device->open_count++; if (device->open_count == 1) { ret = vfio_df_device_first_open(df); From patchwork Tue Jul 18 13:55:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317302 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 74AC8EB64DD for ; Tue, 18 Jul 2023 13:56:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232999AbjGRN4w (ORCPT ); Tue, 18 Jul 2023 09:56:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54754 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230170AbjGRN4u (ORCPT ); Tue, 18 Jul 2023 09:56:50 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DD211198A; Tue, 18 Jul 2023 06:56:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688582; x=1721224582; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yVuTouKiXuMSRVJgUDej1uAqeuOZs2O91YgAsb+rlYo=; b=LgU7Gxt020l/pXvWLxH76OMqFDQT/sSjTpKnXmxzB/tzv/lvScFuv538 iZ0GnWepJUwykhZESUelAGUBl7X11pgsEyQXCVuwwgB4HrtYm626s7wAf d/lU4svBE1sreHFcvRPSvPKC1tZPT1mRxo5zryqTMzmdt1zTFv7PVETZm H67AC3tLjMq1o+1yP/9KPyzlNVrC5w64TsbTGPqzgPYUF3X4fNfeP6En+ zUekaGTyEa7QCjBc41s8ODKgUhZvMNriq17tOCXG41u3aQJj5q4lWZghe VqBG1RL5SxuvyA8fngUGQa0JFVZN4kKGpKfvh3NchT7jopXVhodoxIL72 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590658" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590658" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:02 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251013" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251013" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:01 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 10/26] vfio-iommufd: Move noiommu compat validation out of vfio_iommufd_bind() Date: Tue, 18 Jul 2023 06:55:35 -0700 Message-Id: <20230718135551.6592-11-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This moves the noiommu compat validation logic into vfio_df_group_open(). This is more consistent with what will be done in vfio device cdev path. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/group.c | 13 +++++++++++++ drivers/vfio/iommufd.c | 22 ++++++++-------------- drivers/vfio/vfio.h | 9 +++++++++ 3 files changed, 30 insertions(+), 14 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 4e6277191eb4..b8b77daf7aa6 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -192,6 +192,19 @@ static int vfio_df_group_open(struct vfio_device_file *df) vfio_device_group_get_kvm_safe(device); df->iommufd = device->group->iommufd; + if (df->iommufd && vfio_device_is_noiommu(device) && device->open_count == 0) { + /* + * Require no compat ioas to be assigned to proceed. The basic + * statement is that the user cannot have done something that + * implies they expected translation to exist + */ + if (!capable(CAP_SYS_RAWIO) || + vfio_iommufd_device_has_compat_ioas(device, df->iommufd)) + ret = -EPERM; + else + ret = 0; + goto out_put_kvm; + } ret = vfio_df_open(df); if (ret) { diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index afda47ee9663..36f838dad084 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -10,6 +10,14 @@ MODULE_IMPORT_NS(IOMMUFD); MODULE_IMPORT_NS(IOMMUFD_VFIO); +bool vfio_iommufd_device_has_compat_ioas(struct vfio_device *vdev, + struct iommufd_ctx *ictx) +{ + u32 ioas_id; + + return !iommufd_vfio_compat_ioas_get_id(ictx, &ioas_id); +} + int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx) { u32 ioas_id; @@ -18,20 +26,6 @@ int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx) lockdep_assert_held(&vdev->dev_set->lock); - if (vfio_device_is_noiommu(vdev)) { - if (!capable(CAP_SYS_RAWIO)) - return -EPERM; - - /* - * Require no compat ioas to be assigned to proceed. The basic - * statement is that the user cannot have done something that - * implies they expected translation to exist - */ - if (!iommufd_vfio_compat_ioas_get_id(ictx, &ioas_id)) - return -EPERM; - return 0; - } - ret = vdev->ops->bind_iommufd(vdev, ictx, &device_id); if (ret) return ret; diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 85484a971a3e..300cab04f4e1 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -234,9 +234,18 @@ static inline void vfio_container_cleanup(void) #endif #if IS_ENABLED(CONFIG_IOMMUFD) +bool vfio_iommufd_device_has_compat_ioas(struct vfio_device *vdev, + struct iommufd_ctx *ictx); int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx); void vfio_iommufd_unbind(struct vfio_device *device); #else +static inline bool +vfio_iommufd_device_has_compat_ioas(struct vfio_device *vdev, + struct iommufd_ctx *ictx) +{ + return false; +} + static inline int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx) { From patchwork Tue Jul 18 13:55:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317303 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 96D12EB64DA for ; Tue, 18 Jul 2023 13:57:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232327AbjGRN5D (ORCPT ); Tue, 18 Jul 2023 09:57:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55048 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233006AbjGRN5C (ORCPT ); Tue, 18 Jul 2023 09:57:02 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B917110FF; Tue, 18 Jul 2023 06:56:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688592; x=1721224592; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=fOiSoEFzVAkdoi2qnetoh3ogEd9ev3AGUUZlZ3iHOnw=; b=SgI5U7VI2MzaSOKQq2ZWvJ3ogIQAe3a09z2TszAhhflFxkzzLpUW603i uZTYdCvC+nX87MECdawdL7rteZ8R7bvluOKXKjrg12P7bSaRvCW6VL7r5 cHoJRp9XkPC01bKFe7t1UiiHUbea7BFVLBxg22dr1BMbVS9SWgRc4C1B1 M3KBVLoeZ5s6vCDCpMxYsurXa5FSltAszHZta1hG8fq6dlSh8xXzAdivu qRGTaS+p4oqzOsiSWaLo690ijsIAUKgVdXH7eG3LmGHzBL/P6BMCaGQUa k4hYpXEzstFwblRECvEBki/rpQPRegEPxyX7mRvBaQ6rRaLmv1Klcto4w A==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590669" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590669" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:03 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251020" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251020" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:02 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 11/26] vfio-iommufd: Split bind/attach into two steps Date: Tue, 18 Jul 2023 06:55:36 -0700 Message-Id: <20230718135551.6592-12-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This aligns the bind/attach logic with the coming vfio device cdev support. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/group.c | 17 +++++++++++++---- drivers/vfio/iommufd.c | 35 +++++++++++++++++------------------ drivers/vfio/vfio.h | 9 +++++++++ 3 files changed, 39 insertions(+), 22 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index b8b77daf7aa6..41a09a2df690 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -207,9 +207,13 @@ static int vfio_df_group_open(struct vfio_device_file *df) } ret = vfio_df_open(df); - if (ret) { - df->iommufd = NULL; + if (ret) goto out_put_kvm; + + if (df->iommufd && device->open_count == 1) { + ret = vfio_iommufd_compat_attach_ioas(device, df->iommufd); + if (ret) + goto out_close_device; } /* @@ -218,12 +222,17 @@ static int vfio_df_group_open(struct vfio_device_file *df) */ smp_store_release(&df->access_granted, true); + mutex_unlock(&device->dev_set->lock); + mutex_unlock(&device->group->group_lock); + return 0; + +out_close_device: + vfio_df_close(df); out_put_kvm: + df->iommufd = NULL; if (device->open_count == 0) vfio_device_put_kvm(device); - mutex_unlock(&device->dev_set->lock); - out_unlock: mutex_unlock(&device->group->group_lock); return ret; diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 36f838dad084..91fdae69bb45 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -20,33 +20,32 @@ bool vfio_iommufd_device_has_compat_ioas(struct vfio_device *vdev, int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx) { - u32 ioas_id; u32 device_id; + + lockdep_assert_held(&vdev->dev_set->lock); + + /* The legacy path has no way to return the device id */ + return vdev->ops->bind_iommufd(vdev, ictx, &device_id); +} + +int vfio_iommufd_compat_attach_ioas(struct vfio_device *vdev, + struct iommufd_ctx *ictx) +{ + u32 ioas_id; int ret; lockdep_assert_held(&vdev->dev_set->lock); - ret = vdev->ops->bind_iommufd(vdev, ictx, &device_id); - if (ret) - return ret; + /* compat noiommu does not need to do ioas attach */ + if (vfio_device_is_noiommu(vdev)) + return 0; ret = iommufd_vfio_compat_ioas_get_id(ictx, &ioas_id); if (ret) - goto err_unbind; - ret = vdev->ops->attach_ioas(vdev, &ioas_id); - if (ret) - goto err_unbind; - - /* - * The legacy path has no way to return the device id or the selected - * pt_id - */ - return 0; + return ret; -err_unbind: - if (vdev->ops->unbind_iommufd) - vdev->ops->unbind_iommufd(vdev); - return ret; + /* The legacy path has no way to return the selected pt_id */ + return vdev->ops->attach_ioas(vdev, &ioas_id); } void vfio_iommufd_unbind(struct vfio_device *vdev) diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 300cab04f4e1..04755379940c 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -238,6 +238,8 @@ bool vfio_iommufd_device_has_compat_ioas(struct vfio_device *vdev, struct iommufd_ctx *ictx); int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx); void vfio_iommufd_unbind(struct vfio_device *device); +int vfio_iommufd_compat_attach_ioas(struct vfio_device *device, + struct iommufd_ctx *ictx); #else static inline bool vfio_iommufd_device_has_compat_ioas(struct vfio_device *vdev, @@ -255,6 +257,13 @@ static inline int vfio_iommufd_bind(struct vfio_device *device, static inline void vfio_iommufd_unbind(struct vfio_device *device) { } + +static inline int +vfio_iommufd_compat_attach_ioas(struct vfio_device *device, + struct iommufd_ctx *ictx) +{ + return -EOPNOTSUPP; +} #endif #if IS_ENABLED(CONFIG_VFIO_VIRQFD) From patchwork Tue Jul 18 13:55:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317304 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEC7DEB64DC for ; Tue, 18 Jul 2023 13:57:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233024AbjGRN5O (ORCPT ); Tue, 18 Jul 2023 09:57:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55178 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232746AbjGRN5L (ORCPT ); Tue, 18 Jul 2023 09:57:11 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B11E7171A; Tue, 18 Jul 2023 06:56:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688599; x=1721224599; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9Uwc9M62tc+ZUI5WOPAj33DqDZqRBK0/Z1nd+1nKVXA=; b=entGzxV/Jf9sYA/uSYg9ECFoWmmj3dS3kOzhWpFTJzdfMlSMp6bNQmoL xCgrkEpZdto1/nmSWJ3cUcToLOK1pBX1L3Zw+A79KOsjyJrEGDywbG9Qw KnmnwuKvkYJSkR6br3s5WHzzDyzd6vuFhmMu2hVuPGYU8Um+nhgk//iis jO0qV8ihjHWo5+B33d5Etm3sA/6NudY262++HT7iZw4/9GY3SPw7NXn2Z Hj2aebeeQtGYVHL/VlHe0/OaDO6O8AOGLMNBRcoaM8t2yvXsDkq7dXAD5 qJ/9RMmMsTlikBtnusMX8VcWqoYgad/68UuZXDvMMwlZdLOoj0qWUmmcL g==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590682" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590682" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251033" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251033" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:03 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 12/26] vfio: Record devid in vfio_device_file Date: Tue, 18 Jul 2023 06:55:37 -0700 Message-Id: <20230718135551.6592-13-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org .bind_iommufd() will generate an ID to represent this bond, which is needed by userspace for further usage. Store devid in vfio_device_file to avoid passing the pointer in multiple places. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/iommufd.c | 12 +++++++----- drivers/vfio/vfio.h | 10 +++++----- drivers/vfio/vfio_main.c | 6 +++--- 3 files changed, 15 insertions(+), 13 deletions(-) diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 91fdae69bb45..4fc674c01a05 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -18,14 +18,14 @@ bool vfio_iommufd_device_has_compat_ioas(struct vfio_device *vdev, return !iommufd_vfio_compat_ioas_get_id(ictx, &ioas_id); } -int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx) +int vfio_df_iommufd_bind(struct vfio_device_file *df) { - u32 device_id; + struct vfio_device *vdev = df->device; + struct iommufd_ctx *ictx = df->iommufd; lockdep_assert_held(&vdev->dev_set->lock); - /* The legacy path has no way to return the device id */ - return vdev->ops->bind_iommufd(vdev, ictx, &device_id); + return vdev->ops->bind_iommufd(vdev, ictx, &df->devid); } int vfio_iommufd_compat_attach_ioas(struct vfio_device *vdev, @@ -48,8 +48,10 @@ int vfio_iommufd_compat_attach_ioas(struct vfio_device *vdev, return vdev->ops->attach_ioas(vdev, &ioas_id); } -void vfio_iommufd_unbind(struct vfio_device *vdev) +void vfio_df_iommufd_unbind(struct vfio_device_file *df) { + struct vfio_device *vdev = df->device; + lockdep_assert_held(&vdev->dev_set->lock); if (vfio_device_is_noiommu(vdev)) diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 04755379940c..58801adc1a7e 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -21,6 +21,7 @@ struct vfio_device_file { struct vfio_group *group; u8 access_granted; + u32 devid; /* only valid when iommufd is valid */ spinlock_t kvm_ref_lock; /* protect kvm field */ struct kvm *kvm; struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */ @@ -236,8 +237,8 @@ static inline void vfio_container_cleanup(void) #if IS_ENABLED(CONFIG_IOMMUFD) bool vfio_iommufd_device_has_compat_ioas(struct vfio_device *vdev, struct iommufd_ctx *ictx); -int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx); -void vfio_iommufd_unbind(struct vfio_device *device); +int vfio_df_iommufd_bind(struct vfio_device_file *df); +void vfio_df_iommufd_unbind(struct vfio_device_file *df); int vfio_iommufd_compat_attach_ioas(struct vfio_device *device, struct iommufd_ctx *ictx); #else @@ -248,13 +249,12 @@ vfio_iommufd_device_has_compat_ioas(struct vfio_device *vdev, return false; } -static inline int vfio_iommufd_bind(struct vfio_device *device, - struct iommufd_ctx *ictx) +static inline int vfio_df_iommufd_bind(struct vfio_device_file *fd) { return -EOPNOTSUPP; } -static inline void vfio_iommufd_unbind(struct vfio_device *device) +static inline void vfio_df_iommufd_unbind(struct vfio_device_file *df) { } diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index be5e4ddd5901..3a4b7eb128df 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -446,7 +446,7 @@ static int vfio_df_device_first_open(struct vfio_device_file *df) return -ENODEV; if (iommufd) - ret = vfio_iommufd_bind(device, iommufd); + ret = vfio_df_iommufd_bind(df); else ret = vfio_device_group_use_iommu(device); if (ret) @@ -461,7 +461,7 @@ static int vfio_df_device_first_open(struct vfio_device_file *df) err_unuse_iommu: if (iommufd) - vfio_iommufd_unbind(device); + vfio_df_iommufd_unbind(df); else vfio_device_group_unuse_iommu(device); err_module_put: @@ -479,7 +479,7 @@ static void vfio_df_device_last_close(struct vfio_device_file *df) if (device->ops->close_device) device->ops->close_device(device); if (iommufd) - vfio_iommufd_unbind(device); + vfio_df_iommufd_unbind(df); else vfio_device_group_unuse_iommu(device); module_put(device->dev->driver->owner); From patchwork Tue Jul 18 13:55:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317305 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4020EB64DC for ; Tue, 18 Jul 2023 13:57:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231773AbjGRN5Y (ORCPT ); Tue, 18 Jul 2023 09:57:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55354 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232272AbjGRN5W (ORCPT ); Tue, 18 Jul 2023 09:57:22 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4EC0F10C; Tue, 18 Jul 2023 06:56:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688610; x=1721224610; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Pm5tVmP0Ott2buk248HK1c989OK4zQdMS9S99jdzyxo=; b=kSEa+pQ9ToOq8asUKjyWH6Y7+cPr4mCMcbv67+nELE3RK0dvh+4QGJTi SJF5eGnb1ZbCUq4Ponk3YU3tfxXPQh/ToVfvpGLT8BgCanh47AaAQUGMq XSQm49NypUtorMDMHqeHjN1HK9Va3XbDEkZZalTrOyrWCn1uo4n5y0cc3 2Ij4HQ9vMs5J9vSKs8BPW2KBLQyJxrxwedk6VnMu5V+oddoSIWj7hH0UC s9hnvRNQX9bRjp4gL63hIhd+zN6YKm1okohlloGIxpBrVe1UOLmavbn3P L/6UojCor2q2VWMus2DQDfhdmoocledLoE03qhO6zPSLVa0dwNPotQo1D w==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590695" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590695" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251038" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251038" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:03 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 13/26] vfio-iommufd: Add detach_ioas support for physical VFIO devices Date: Tue, 18 Jul 2023 06:55:38 -0700 Message-Id: <20230718135551.6592-14-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This prepares for adding DETACH ioctl for physical VFIO devices. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- Documentation/driver-api/vfio.rst | 8 +++++--- drivers/vfio/fsl-mc/vfio_fsl_mc.c | 1 + drivers/vfio/iommufd.c | 20 +++++++++++++++++++ .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 2 ++ drivers/vfio/pci/mlx5/main.c | 1 + drivers/vfio/pci/vfio_pci.c | 1 + drivers/vfio/platform/vfio_amba.c | 1 + drivers/vfio/platform/vfio_platform.c | 1 + drivers/vfio/vfio_main.c | 3 ++- include/linux/vfio.h | 8 +++++++- 10 files changed, 41 insertions(+), 5 deletions(-) diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst index 68abc089d6dd..363e12c90b87 100644 --- a/Documentation/driver-api/vfio.rst +++ b/Documentation/driver-api/vfio.rst @@ -279,6 +279,7 @@ similar to a file operations structure:: struct iommufd_ctx *ictx, u32 *out_device_id); void (*unbind_iommufd)(struct vfio_device *vdev); int (*attach_ioas)(struct vfio_device *vdev, u32 *pt_id); + void (*detach_ioas)(struct vfio_device *vdev); int (*open_device)(struct vfio_device *vdev); void (*close_device)(struct vfio_device *vdev); ssize_t (*read)(struct vfio_device *vdev, char __user *buf, @@ -315,9 +316,10 @@ container_of(). - The [un]bind_iommufd callbacks are issued when the device is bound to and unbound from iommufd. - - The attach_ioas callback is issued when the device is attached to an - IOAS managed by the bound iommufd. The attached IOAS is automatically - detached when the device is unbound from iommufd. + - The [de]attach_ioas callback is issued when the device is attached to + and detached from an IOAS managed by the bound iommufd. However, the + attached IOAS can also be automatically detached when the device is + unbound from iommufd. - The read/write/mmap callbacks implement the device region access defined by the device's own VFIO_DEVICE_GET_REGION_INFO ioctl. diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c index f2140e94d41e..116358a8f1cf 100644 --- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c +++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c @@ -593,6 +593,7 @@ static const struct vfio_device_ops vfio_fsl_mc_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static struct fsl_mc_driver vfio_fsl_mc_driver = { diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 4fc674c01a05..86df5415759a 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -140,6 +140,14 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id) { int rc; + lockdep_assert_held(&vdev->dev_set->lock); + + if (WARN_ON(!vdev->iommufd_device)) + return -EINVAL; + + if (vdev->iommufd_attached) + return -EBUSY; + rc = iommufd_device_attach(vdev->iommufd_device, pt_id); if (rc) return rc; @@ -148,6 +156,18 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id) } EXPORT_SYMBOL_GPL(vfio_iommufd_physical_attach_ioas); +void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev) +{ + lockdep_assert_held(&vdev->dev_set->lock); + + if (WARN_ON(!vdev->iommufd_device) || !vdev->iommufd_attached) + return; + + iommufd_device_detach(vdev->iommufd_device); + vdev->iommufd_attached = false; +} +EXPORT_SYMBOL_GPL(vfio_iommufd_physical_detach_ioas); + /* * The emulated standard ops mean that vfio_device is going to use the * "mdev path" and will call vfio_pin_pages()/vfio_dma_rw(). Drivers using this diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c index a117eaf21c14..b2f9778c8366 100644 --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c @@ -1373,6 +1373,7 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static const struct vfio_device_ops hisi_acc_vfio_pci_ops = { @@ -1391,6 +1392,7 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c index d95fd382814c..42ec574a8622 100644 --- a/drivers/vfio/pci/mlx5/main.c +++ b/drivers/vfio/pci/mlx5/main.c @@ -1320,6 +1320,7 @@ static const struct vfio_device_ops mlx5vf_pci_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static int mlx5vf_pci_probe(struct pci_dev *pdev, diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index 29091ee2e984..cb5b7f865d58 100644 --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -141,6 +141,7 @@ static const struct vfio_device_ops vfio_pci_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c index 83fe54015595..6464b3939ebc 100644 --- a/drivers/vfio/platform/vfio_amba.c +++ b/drivers/vfio/platform/vfio_amba.c @@ -119,6 +119,7 @@ static const struct vfio_device_ops vfio_amba_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static const struct amba_id pl330_ids[] = { diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c index 22a1efca32a8..8cf22fa65baa 100644 --- a/drivers/vfio/platform/vfio_platform.c +++ b/drivers/vfio/platform/vfio_platform.c @@ -108,6 +108,7 @@ static const struct vfio_device_ops vfio_platform_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static struct platform_driver vfio_platform_driver = { diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 3a4b7eb128df..c71c0d1a079f 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -273,7 +273,8 @@ static int __vfio_register_dev(struct vfio_device *device, if (WARN_ON(IS_ENABLED(CONFIG_IOMMUFD) && (!device->ops->bind_iommufd || !device->ops->unbind_iommufd || - !device->ops->attach_ioas))) + !device->ops->attach_ioas || + !device->ops->detach_ioas))) return -EINVAL; /* diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 06a5221949c5..f2f02273ece1 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -73,7 +73,9 @@ struct vfio_device { * @bind_iommufd: Called when binding the device to an iommufd * @unbind_iommufd: Opposite of bind_iommufd * @attach_ioas: Called when attaching device to an IOAS/HWPT managed by the - * bound iommufd. Undo in unbind_iommufd. + * bound iommufd. Undo in unbind_iommufd if @detach_ioas is not + * called. + * @detach_ioas: Opposite of attach_ioas * @open_device: Called when the first file descriptor is opened for this device * @close_device: Opposite of open_device * @read: Perform read(2) on device file descriptor @@ -97,6 +99,7 @@ struct vfio_device_ops { struct iommufd_ctx *ictx, u32 *out_device_id); void (*unbind_iommufd)(struct vfio_device *vdev); int (*attach_ioas)(struct vfio_device *vdev, u32 *pt_id); + void (*detach_ioas)(struct vfio_device *vdev); int (*open_device)(struct vfio_device *vdev); void (*close_device)(struct vfio_device *vdev); ssize_t (*read)(struct vfio_device *vdev, char __user *buf, @@ -120,6 +123,7 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx, u32 *out_device_id); void vfio_iommufd_physical_unbind(struct vfio_device *vdev); int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id); +void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev); int vfio_iommufd_emulated_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx, u32 *out_device_id); void vfio_iommufd_emulated_unbind(struct vfio_device *vdev); @@ -144,6 +148,8 @@ vfio_iommufd_get_dev_id(struct vfio_device *vdev, struct iommufd_ctx *ictx) ((void (*)(struct vfio_device *vdev)) NULL) #define vfio_iommufd_physical_attach_ioas \ ((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL) +#define vfio_iommufd_physical_detach_ioas \ + ((void (*)(struct vfio_device *vdev)) NULL) #define vfio_iommufd_emulated_bind \ ((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx, \ u32 *out_device_id)) NULL) From patchwork Tue Jul 18 13:55:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317306 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17490EB64DD for ; Tue, 18 Jul 2023 13:57:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232851AbjGRN5d (ORCPT ); Tue, 18 Jul 2023 09:57:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55528 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233033AbjGRN5b (ORCPT ); Tue, 18 Jul 2023 09:57:31 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0321610C2; Tue, 18 Jul 2023 06:57:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688621; x=1721224621; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=OCfSOnkUB8+3ZJ+7CRvQgAInJWCkh3Z8L/K/ny42OEk=; b=VYqgONvUyhjIZeUV3AtWfg89XVqNtQ+83W3668A10RfWxVDohOeo49Dv 8+uPb6ofnf5GrF5iX7kyyEsh/p6Ai/xkzaiwTSEoVCULnd9qH6Ol8NH7s JMWKV7jqztzFpoOTrlkrl9JoOOMiYZfxWIg+2WZRS+tl7klVM1tNpakvU 4SHriJo1zV1Jg9V/G0SvboFajEeraLHFudl47iVLLYbGUuu1NgSFsuEew SnSOAZfn//64jrYPMyWawTXOjAb6XsVwUlvWhaZ/rCMj4DJ64vpqlIV6X f6R/9PPzZ4Dw+uAPU8th2zf1Q/pRaFa+/g2vQ2rqLUCUtmVc7VOrywaX0 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590708" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590708" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:05 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251044" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251044" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:04 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 14/26] iommufd/device: Add iommufd_access_detach() API Date: Tue, 18 Jul 2023 06:55:39 -0700 Message-Id: <20230718135551.6592-15-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Nicolin Chen Previously, the detach routine is only done by the destroy(). And it was called by vfio_iommufd_emulated_unbind() when the device runs close(), so all the mappings in iopt were cleaned in that setup, when the call trace reaches this detach() routine. Now, there's a need of a detach uAPI, meaning that it does not only need a new iommufd_access_detach() API, but also requires access->ops->unmap() call as a cleanup. So add one. However, leaving that unprotected can introduce some potential of a race condition during the pin_/unpin_pages() call, where access->ioas->iopt is getting referenced. So, add an ioas_lock to protect the context of iopt referencings. Also, to allow the iommufd_access_unpin_pages() callback to happen via this unmap() call, add an ioas_unpin pointer, so the unpin routine won't be affected by the "access->ioas = NULL" trick. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu --- drivers/iommu/iommufd/device.c | 74 +++++++++++++++++++++++-- drivers/iommu/iommufd/iommufd_private.h | 2 + include/linux/iommufd.h | 1 + 3 files changed, 72 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index cd5d8ab907f9..59fec5783eb9 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -486,6 +486,7 @@ iommufd_access_create(struct iommufd_ctx *ictx, iommufd_ctx_get(ictx); iommufd_object_finalize(ictx, &access->obj); *id = access->obj.id; + mutex_init(&access->ioas_lock); return access; } EXPORT_SYMBOL_NS_GPL(iommufd_access_create, IOMMUFD); @@ -505,26 +506,60 @@ void iommufd_access_destroy(struct iommufd_access *access) } EXPORT_SYMBOL_NS_GPL(iommufd_access_destroy, IOMMUFD); +void iommufd_access_detach(struct iommufd_access *access) +{ + struct iommufd_ioas *cur_ioas = access->ioas; + + mutex_lock(&access->ioas_lock); + if (WARN_ON(!access->ioas)) + goto out; + /* + * Set ioas to NULL to block any further iommufd_access_pin_pages(). + * iommufd_access_unpin_pages() can continue using access->ioas_unpin. + */ + access->ioas = NULL; + + if (access->ops->unmap) { + mutex_unlock(&access->ioas_lock); + access->ops->unmap(access->data, 0, ULONG_MAX); + mutex_lock(&access->ioas_lock); + } + iopt_remove_access(&cur_ioas->iopt, access); + refcount_dec(&cur_ioas->obj.users); +out: + access->ioas_unpin = NULL; + mutex_unlock(&access->ioas_lock); +} +EXPORT_SYMBOL_NS_GPL(iommufd_access_detach, IOMMUFD); + int iommufd_access_attach(struct iommufd_access *access, u32 ioas_id) { struct iommufd_ioas *new_ioas; int rc = 0; - if (access->ioas) + mutex_lock(&access->ioas_lock); + if (WARN_ON(access->ioas || access->ioas_unpin)) { + mutex_unlock(&access->ioas_lock); return -EINVAL; + } new_ioas = iommufd_get_ioas(access->ictx, ioas_id); - if (IS_ERR(new_ioas)) + if (IS_ERR(new_ioas)) { + mutex_unlock(&access->ioas_lock); return PTR_ERR(new_ioas); + } rc = iopt_add_access(&new_ioas->iopt, access); if (rc) { + mutex_unlock(&access->ioas_lock); iommufd_put_object(&new_ioas->obj); return rc; } iommufd_ref_to_users(&new_ioas->obj); access->ioas = new_ioas; + access->ioas_unpin = new_ioas; + mutex_unlock(&access->ioas_lock); return 0; } EXPORT_SYMBOL_NS_GPL(iommufd_access_attach, IOMMUFD); @@ -579,8 +614,8 @@ void iommufd_access_notify_unmap(struct io_pagetable *iopt, unsigned long iova, void iommufd_access_unpin_pages(struct iommufd_access *access, unsigned long iova, unsigned long length) { - struct io_pagetable *iopt = &access->ioas->iopt; struct iopt_area_contig_iter iter; + struct io_pagetable *iopt; unsigned long last_iova; struct iopt_area *area; @@ -588,6 +623,17 @@ void iommufd_access_unpin_pages(struct iommufd_access *access, WARN_ON(check_add_overflow(iova, length - 1, &last_iova))) return; + mutex_lock(&access->ioas_lock); + /* + * The driver must be doing something wrong if it calls this before an + * iommufd_access_attach() or after an iommufd_access_detach(). + */ + if (WARN_ON(!access->ioas_unpin)) { + mutex_unlock(&access->ioas_lock); + return; + } + iopt = &access->ioas_unpin->iopt; + down_read(&iopt->iova_rwsem); iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) iopt_area_remove_access( @@ -597,6 +643,7 @@ void iommufd_access_unpin_pages(struct iommufd_access *access, min(last_iova, iopt_area_last_iova(area)))); WARN_ON(!iopt_area_contig_done(&iter)); up_read(&iopt->iova_rwsem); + mutex_unlock(&access->ioas_lock); } EXPORT_SYMBOL_NS_GPL(iommufd_access_unpin_pages, IOMMUFD); @@ -642,8 +689,8 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, unsigned long length, struct page **out_pages, unsigned int flags) { - struct io_pagetable *iopt = &access->ioas->iopt; struct iopt_area_contig_iter iter; + struct io_pagetable *iopt; unsigned long last_iova; struct iopt_area *area; int rc; @@ -658,6 +705,13 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, if (check_add_overflow(iova, length - 1, &last_iova)) return -EOVERFLOW; + mutex_lock(&access->ioas_lock); + if (!access->ioas) { + mutex_unlock(&access->ioas_lock); + return -ENOENT; + } + iopt = &access->ioas->iopt; + down_read(&iopt->iova_rwsem); iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) { unsigned long last = min(last_iova, iopt_area_last_iova(area)); @@ -688,6 +742,7 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, } up_read(&iopt->iova_rwsem); + mutex_unlock(&access->ioas_lock); return 0; err_remove: @@ -702,6 +757,7 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, iopt_area_last_iova(area)))); } up_read(&iopt->iova_rwsem); + mutex_unlock(&access->ioas_lock); return rc; } EXPORT_SYMBOL_NS_GPL(iommufd_access_pin_pages, IOMMUFD); @@ -721,8 +777,8 @@ EXPORT_SYMBOL_NS_GPL(iommufd_access_pin_pages, IOMMUFD); int iommufd_access_rw(struct iommufd_access *access, unsigned long iova, void *data, size_t length, unsigned int flags) { - struct io_pagetable *iopt = &access->ioas->iopt; struct iopt_area_contig_iter iter; + struct io_pagetable *iopt; struct iopt_area *area; unsigned long last_iova; int rc; @@ -732,6 +788,13 @@ int iommufd_access_rw(struct iommufd_access *access, unsigned long iova, if (check_add_overflow(iova, length - 1, &last_iova)) return -EOVERFLOW; + mutex_lock(&access->ioas_lock); + if (!access->ioas) { + mutex_unlock(&access->ioas_lock); + return -ENOENT; + } + iopt = &access->ioas->iopt; + down_read(&iopt->iova_rwsem); iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) { unsigned long last = min(last_iova, iopt_area_last_iova(area)); @@ -758,6 +821,7 @@ int iommufd_access_rw(struct iommufd_access *access, unsigned long iova, rc = -ENOENT; err_out: up_read(&iopt->iova_rwsem); + mutex_unlock(&access->ioas_lock); return rc; } EXPORT_SYMBOL_NS_GPL(iommufd_access_rw, IOMMUFD); diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index b38e67d1988b..3dcaf86aab97 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -285,6 +285,8 @@ struct iommufd_access { struct iommufd_object obj; struct iommufd_ctx *ictx; struct iommufd_ioas *ioas; + struct iommufd_ioas *ioas_unpin; + struct mutex ioas_lock; const struct iommufd_access_ops *ops; void *data; unsigned long iova_alignment; diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h index 68defed9ea48..3a3216cb9482 100644 --- a/include/linux/iommufd.h +++ b/include/linux/iommufd.h @@ -48,6 +48,7 @@ iommufd_access_create(struct iommufd_ctx *ictx, const struct iommufd_access_ops *ops, void *data, u32 *id); void iommufd_access_destroy(struct iommufd_access *access); int iommufd_access_attach(struct iommufd_access *access, u32 ioas_id); +void iommufd_access_detach(struct iommufd_access *access); void iommufd_ctx_get(struct iommufd_ctx *ictx); From patchwork Tue Jul 18 13:55:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317307 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 27ABBEB64DA for ; Tue, 18 Jul 2023 13:57:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233032AbjGRN5j (ORCPT ); Tue, 18 Jul 2023 09:57:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55636 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233043AbjGRN5h (ORCPT ); Tue, 18 Jul 2023 09:57:37 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 54498170F; Tue, 18 Jul 2023 06:57:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688631; x=1721224631; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WOkFeGXs+6If0diuH6vBq+ADaQma3j8AesjsEPtUQd4=; b=OUx/WRSexx6bzR38buBxeFPJ49VrBQg0ZWqtQsU3W9g2JTc4qmQeViQo ofgmebXcNf1x0N493L9zbS04DhtQ8/ROEPTlyTr8Lo/sNT1f2NkIf33Gy l1gWWn3LgjgkACr98txFEU5qI4tCBA4CCWwJF1XO216Bvk1SEDPsh7DQF YcTFiDT4dZ2RIIZ2klcEmXdi6/YT82wZbfR7Hv2nzo2ifMsTpEuWFp6gU yqSptgw8seULEZctF6N+/vVikvCHnVWw7qxVGv3AzAPyp7JUHdwNovmk2 kIy/oLC/4Zy0aBgfsJylUKSzPaJbj5UZThmzHlrz12xfKaw+AhViawtdL A==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590718" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590718" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:05 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251050" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251050" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:05 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 15/26] vfio-iommufd: Add detach_ioas support for emulated VFIO devices Date: Tue, 18 Jul 2023 06:55:40 -0700 Message-Id: <20230718135551.6592-16-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This prepares for adding DETACH ioctl for emulated VFIO devices. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/gpu/drm/i915/gvt/kvmgt.c | 1 + drivers/s390/cio/vfio_ccw_ops.c | 1 + drivers/s390/crypto/vfio_ap_ops.c | 1 + drivers/vfio/iommufd.c | 13 +++++++++++++ include/linux/vfio.h | 3 +++ samples/vfio-mdev/mbochs.c | 1 + samples/vfio-mdev/mdpy.c | 1 + samples/vfio-mdev/mtty.c | 1 + 8 files changed, 22 insertions(+) diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index de675d799c7d..9cd9e9da60dd 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1474,6 +1474,7 @@ static const struct vfio_device_ops intel_vgpu_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; static int intel_vgpu_probe(struct mdev_device *mdev) diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c index 5b53b94f13c7..cba4971618ff 100644 --- a/drivers/s390/cio/vfio_ccw_ops.c +++ b/drivers/s390/cio/vfio_ccw_ops.c @@ -632,6 +632,7 @@ static const struct vfio_device_ops vfio_ccw_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; struct mdev_driver vfio_ccw_mdev_driver = { diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index b441745b0418..2d3c3a79b687 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1975,6 +1975,7 @@ static const struct vfio_device_ops vfio_ap_matrix_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, .request = vfio_ap_mdev_request }; diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 86df5415759a..4d84904fd927 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -231,3 +231,16 @@ int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id) return 0; } EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_attach_ioas); + +void vfio_iommufd_emulated_detach_ioas(struct vfio_device *vdev) +{ + lockdep_assert_held(&vdev->dev_set->lock); + + if (WARN_ON(!vdev->iommufd_access) || + !vdev->iommufd_attached) + return; + + iommufd_access_detach(vdev->iommufd_access); + vdev->iommufd_attached = false; +} +EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_detach_ioas); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index f2f02273ece1..24091a7c7bdb 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -128,6 +128,7 @@ int vfio_iommufd_emulated_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx, u32 *out_device_id); void vfio_iommufd_emulated_unbind(struct vfio_device *vdev); int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id); +void vfio_iommufd_emulated_detach_ioas(struct vfio_device *vdev); #else static inline struct iommufd_ctx * vfio_iommufd_device_ictx(struct vfio_device *vdev) @@ -157,6 +158,8 @@ vfio_iommufd_get_dev_id(struct vfio_device *vdev, struct iommufd_ctx *ictx) ((void (*)(struct vfio_device *vdev)) NULL) #define vfio_iommufd_emulated_attach_ioas \ ((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL) +#define vfio_iommufd_emulated_detach_ioas \ + ((void (*)(struct vfio_device *vdev)) NULL) #endif static inline bool vfio_device_cdev_opened(struct vfio_device *device) diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c index c6c6b5d26670..3764d1911b51 100644 --- a/samples/vfio-mdev/mbochs.c +++ b/samples/vfio-mdev/mbochs.c @@ -1377,6 +1377,7 @@ static const struct vfio_device_ops mbochs_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; static struct mdev_driver mbochs_driver = { diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c index a62ea11e20ec..064e1c0a7aa8 100644 --- a/samples/vfio-mdev/mdpy.c +++ b/samples/vfio-mdev/mdpy.c @@ -666,6 +666,7 @@ static const struct vfio_device_ops mdpy_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; static struct mdev_driver mdpy_driver = { diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c index a60801fb8660..5af00387c519 100644 --- a/samples/vfio-mdev/mtty.c +++ b/samples/vfio-mdev/mtty.c @@ -1272,6 +1272,7 @@ static const struct vfio_device_ops mtty_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; static struct mdev_driver mtty_driver = { From patchwork Tue Jul 18 13:55:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317308 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E22FC0015E for ; Tue, 18 Jul 2023 13:57:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233064AbjGRN5q (ORCPT ); Tue, 18 Jul 2023 09:57:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55780 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233033AbjGRN5n (ORCPT ); Tue, 18 Jul 2023 09:57:43 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 40B9E1733; Tue, 18 Jul 2023 06:57:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688642; x=1721224642; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=raLxj7JAiIIimfMPOVGcHFV1CVJWAeSsKLNgzOJUOmA=; b=lXHC40IZyQPBO5BIwBmz27b8URxfwZFTJezKaDTQopJi1+vyo68MaP0O 9P8j5+ogJwrPT+R1ye1coBXzWQKcAqPeKzptmWyuDkNmVwCOSRyzloSUV Wo16IpqnpGg31ozZeKtYBjxkO6BTdi9tG1qzJv7upXT0BWySlDCQXg/yB JRG5yr4ZGr9vwG+GJO0E1Tv6e41SrYZuWlEHJaOHWmWX1Y0lhkS5wGPeF 4luWcIra4PLbqg8VzvU81Y55I8UXFJBWGTERlCKLytVA1TH7gA18eDAP+ bti3qpcwPFrif6A9g78XbU7JVA3J0kYTRl1UpEqW7R+XiVGUBfqTrnkLB Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590729" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590729" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:06 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251054" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251054" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:05 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 16/26] vfio: Move vfio_device_group_unregister() to be the first operation in unregister Date: Tue, 18 Jul 2023 06:55:41 -0700 Message-Id: <20230718135551.6592-17-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This avoids endless vfio_device refcount increment by userspace, which would keep blocking the vfio_unregister_group_dev(). Reviewed-by: Jason Gunthorpe Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Terrence Xu Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/vfio_main.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index c71c0d1a079f..6d45caa1f9a0 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -332,6 +332,12 @@ void vfio_unregister_group_dev(struct vfio_device *device) bool interrupted = false; long rc; + /* + * Prevent new device opened by userspace via the + * VFIO_GROUP_GET_DEVICE_FD in the group path. + */ + vfio_device_group_unregister(device); + vfio_device_put_registration(device); rc = try_wait_for_completion(&device->comp); while (rc <= 0) { @@ -355,8 +361,6 @@ void vfio_unregister_group_dev(struct vfio_device *device) } } - vfio_device_group_unregister(device); - /* Balances device_add in register path */ device_del(&device->device); From patchwork Tue Jul 18 13:55:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317309 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0842EB64DC for ; Tue, 18 Jul 2023 13:58:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233084AbjGRN5z (ORCPT ); Tue, 18 Jul 2023 09:57:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233059AbjGRN5t (ORCPT ); Tue, 18 Jul 2023 09:57:49 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0682712F; Tue, 18 Jul 2023 06:57:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688650; x=1721224650; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Tro7ldK39c/D+P73wAYl8xylrR+MAx9kksAZpQYgz6M=; b=fUYhLrKUN+FAtp1OYoDn386VTl/nEHvJOKg8p4h6MsmcRb27IGvCq6yY Ste1aQtWOSV0UWDaXag823yeOSpDmN+CsI2RpexgK+o0VmUU0fOdV3ryx +UkCxtNBRxW7YRk13TGYvTT/PWmY6gtWZCCjoLH0IAluuilUCk1IPx9AY Md7CRcNXmje8tcQvxZadXii2LEcGZzBBeckPdy4MoXGHonUvOhDRcgUe5 H0TyV5KUvF6a7eWjdDK/RF4G3mbEJrSfuO0zIZrg1S7CAOP7lhl0VCfAV SDFq0MlZCkie7znG6gY08ma7+2QotARER8CxFWugRt+DHXEnDCRexS0Io g==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590740" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590740" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:07 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251060" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251060" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:06 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 17/26] vfio: Move device_del() before waiting for the last vfio_device registration refcount Date: Tue, 18 Jul 2023 06:55:42 -0700 Message-Id: <20230718135551.6592-18-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org device_del() destroys the vfio-dev/vfioX under the sysfs for vfio_device. There is no reason to keep it while the device is going to be unregistered. This movement is also a preparation for adding vfio_device cdev. Kernel should remove the cdev node of the vfio_device to avoid new registration refcount increment while the device is going to be unregistered. Reviewed-by: Jason Gunthorpe Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/vfio_main.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 6d45caa1f9a0..294bb5ecfc1c 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -338,6 +338,9 @@ void vfio_unregister_group_dev(struct vfio_device *device) */ vfio_device_group_unregister(device); + /* Balances device_add in register path */ + device_del(&device->device); + vfio_device_put_registration(device); rc = try_wait_for_completion(&device->comp); while (rc <= 0) { @@ -361,9 +364,6 @@ void vfio_unregister_group_dev(struct vfio_device *device) } } - /* Balances device_add in register path */ - device_del(&device->device); - /* Balances vfio_device_set_group in register path */ vfio_device_remove_group(device); } From patchwork Tue Jul 18 13:55:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317310 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58530EB64DC for ; Tue, 18 Jul 2023 13:58:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233044AbjGRN6R (ORCPT ); Tue, 18 Jul 2023 09:58:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55872 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233094AbjGRN54 (ORCPT ); Tue, 18 Jul 2023 09:57:56 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6F4E71712; Tue, 18 Jul 2023 06:57:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688657; x=1721224657; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7/MGttU9wN35lZ6iBei+3z+TDduOsILJjnK/CTPEzDQ=; b=A0q6vIa1G75Yh9Uq9D+RFyboG5VuOA3LqXwm70E0o/YMm5oBK0Dz5b25 6bFINJ9IQ44JelGBVsS0DeXrunfg4SJxhUaYO/YojIRmdSPqSJNBJr1SJ GXCTuLd+HbxUBF3/1Voadbwy7yIZeW6DgKQ995Zqk+XppE7lkExZCtcQe 2oRhF3goJrHrCaRl0JBQnmWTIC8P1w9Y/w7wSLELo0kClx84vq9BH8k82 zOMxsJulQhMt8SakW2wdQRK+BHjBDYHfz3Al6AGdFW6pB6SycEbzFEd7+ HD+WigpiI+qPKjtec361GBaaAqIlbqIxwoeCdxPMpmK8jclBZuN4mfgpg A==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590756" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590756" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:08 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251068" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251068" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:07 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 18/26] vfio: Add cdev for vfio_device Date: Tue, 18 Jul 2023 06:55:43 -0700 Message-Id: <20230718135551.6592-19-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds cdev support for vfio_device. It allows the user to directly open a vfio device w/o using the legacy container/group interface, as a prerequisite for supporting new iommu features like nested translation and etc. The device fd opened in this manner doesn't have the capability to access the device as the fops open() doesn't open the device until the successful VFIO_DEVICE_BIND_IOMMUFD ioctl which will be added in a later patch. With this patch, devices registered to vfio core would have both the legacy group and the new device interfaces created. - group interface : /dev/vfio/$groupID - device interface: /dev/vfio/devices/vfioX - normal device ("X" is a unique number across vfio devices) For a given device, the user can identify the matching vfioX by searching the vfio-dev folder under the sysfs path of the device. Take PCI device (0000:6a:01.0) as an example, /sys/bus/pci/devices/0000\:6a\:01.0/vfio-dev/vfioX implies the matching vfioX under /dev/vfio/devices/, and vfio-dev/vfioX/dev contains the major:minor number of the matching /dev/vfio/devices/vfioX. The user can get device fd by opening the /dev/vfio/devices/vfioX. The vfio_device cdev logic in this patch: *) __vfio_register_dev() path ends up doing cdev_device_add() for each vfio_device if VFIO_DEVICE_CDEV configured. *) vfio_unregister_group_dev() path does cdev_device_del(); cdev interface does not support noiommu devices, so VFIO only creates the legacy group interface for the physical devices that do not have IOMMU. noiommu users should use the legacy group interface. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/Kconfig | 12 ++++++++ drivers/vfio/Makefile | 1 + drivers/vfio/device_cdev.c | 63 ++++++++++++++++++++++++++++++++++++++ drivers/vfio/vfio.h | 54 ++++++++++++++++++++++++++++++++ drivers/vfio/vfio_main.c | 21 ++++++++++--- include/linux/vfio.h | 4 +++ 6 files changed, 151 insertions(+), 4 deletions(-) create mode 100644 drivers/vfio/device_cdev.c diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index aba36f5be4ec..26f18d92eb97 100644 --- a/drivers/vfio/Kconfig +++ b/drivers/vfio/Kconfig @@ -12,6 +12,18 @@ menuconfig VFIO If you don't know what to do here, say N. if VFIO +config VFIO_DEVICE_CDEV + bool "Support for the VFIO cdev /dev/vfio/devices/vfioX" + depends on IOMMUFD && !SPAPR_TCE_IOMMU + help + The VFIO device cdev is another way for userspace to get device + access. Userspace gets device fd by opening device cdev under + /dev/vfio/devices/vfioX, and then bind the device fd with an iommufd + to set up secure DMA context for device access. This interface does + not support noiommu. + + If you don't know what to do here, say N. + config VFIO_CONTAINER bool "Support for the VFIO container /dev/vfio/vfio" select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64) diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index 66f418aef5a9..87ccb16fdd77 100644 --- a/drivers/vfio/Makefile +++ b/drivers/vfio/Makefile @@ -4,6 +4,7 @@ obj-$(CONFIG_VFIO) += vfio.o vfio-y += vfio_main.o \ group.o \ iova_bitmap.o +vfio-$(CONFIG_VFIO_DEVICE_CDEV) += device_cdev.o vfio-$(CONFIG_IOMMUFD) += iommufd.o vfio-$(CONFIG_VFIO_CONTAINER) += container.o vfio-$(CONFIG_VFIO_VIRQFD) += virqfd.o diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c new file mode 100644 index 000000000000..bf1032d00107 --- /dev/null +++ b/drivers/vfio/device_cdev.c @@ -0,0 +1,63 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (c) 2023 Intel Corporation. + */ +#include + +#include "vfio.h" + +static dev_t device_devt; + +void vfio_init_device_cdev(struct vfio_device *device) +{ + device->device.devt = MKDEV(MAJOR(device_devt), device->index); + cdev_init(&device->cdev, &vfio_device_fops); + device->cdev.owner = THIS_MODULE; +} + +/* + * device access via the fd opened by this function is blocked until + * .open_device() is called successfully during BIND_IOMMUFD. + */ +int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep) +{ + struct vfio_device *device = container_of(inode->i_cdev, + struct vfio_device, cdev); + struct vfio_device_file *df; + int ret; + + /* Paired with the put in vfio_device_fops_release() */ + if (!vfio_device_try_get_registration(device)) + return -ENODEV; + + df = vfio_allocate_device_file(device); + if (IS_ERR(df)) { + ret = PTR_ERR(df); + goto err_put_registration; + } + + filep->private_data = df; + + return 0; + +err_put_registration: + vfio_device_put_registration(device); + return ret; +} + +static char *vfio_device_devnode(const struct device *dev, umode_t *mode) +{ + return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev)); +} + +int vfio_cdev_init(struct class *device_class) +{ + device_class->devnode = vfio_device_devnode; + return alloc_chrdev_region(&device_devt, 0, + MINORMASK + 1, "vfio-dev"); +} + +void vfio_cdev_cleanup(void) +{ + unregister_chrdev_region(device_devt, MINORMASK + 1); +} diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 58801adc1a7e..fb8f2fac3d23 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -266,6 +266,60 @@ vfio_iommufd_compat_attach_ioas(struct vfio_device *device, } #endif +#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV) +void vfio_init_device_cdev(struct vfio_device *device); + +static inline int vfio_device_add(struct vfio_device *device) +{ + /* cdev does not support noiommu device */ + if (vfio_device_is_noiommu(device)) + return device_add(&device->device); + vfio_init_device_cdev(device); + return cdev_device_add(&device->cdev, &device->device); +} + +static inline void vfio_device_del(struct vfio_device *device) +{ + if (vfio_device_is_noiommu(device)) + device_del(&device->device); + else + cdev_device_del(&device->cdev, &device->device); +} + +int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep); +int vfio_cdev_init(struct class *device_class); +void vfio_cdev_cleanup(void); +#else +static inline void vfio_init_device_cdev(struct vfio_device *device) +{ +} + +static inline int vfio_device_add(struct vfio_device *device) +{ + return device_add(&device->device); +} + +static inline void vfio_device_del(struct vfio_device *device) +{ + device_del(&device->device); +} + +static inline int vfio_device_fops_cdev_open(struct inode *inode, + struct file *filep) +{ + return 0; +} + +static inline int vfio_cdev_init(struct class *device_class) +{ + return 0; +} + +static inline void vfio_cdev_cleanup(void) +{ +} +#endif /* CONFIG_VFIO_DEVICE_CDEV */ + #if IS_ENABLED(CONFIG_VFIO_VIRQFD) int __init vfio_virqfd_init(void); void vfio_virqfd_exit(void); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 294bb5ecfc1c..8a9ebcc6980b 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -292,7 +292,7 @@ static int __vfio_register_dev(struct vfio_device *device, if (ret) return ret; - ret = device_add(&device->device); + ret = vfio_device_add(device); if (ret) goto err_out; @@ -338,8 +338,11 @@ void vfio_unregister_group_dev(struct vfio_device *device) */ vfio_device_group_unregister(device); - /* Balances device_add in register path */ - device_del(&device->device); + /* + * Balances vfio_device_add() in register path, also prevents + * new device opened by userspace in the cdev path. + */ + vfio_device_del(device); vfio_device_put_registration(device); rc = try_wait_for_completion(&device->comp); @@ -567,7 +570,8 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep) struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; - vfio_df_group_close(df); + if (df->group) + vfio_df_group_close(df); vfio_device_put_registration(device); @@ -1216,6 +1220,7 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma) const struct file_operations vfio_device_fops = { .owner = THIS_MODULE, + .open = vfio_device_fops_cdev_open, .release = vfio_device_fops_release, .read = vfio_device_fops_read, .write = vfio_device_fops_write, @@ -1567,9 +1572,16 @@ static int __init vfio_init(void) goto err_dev_class; } + ret = vfio_cdev_init(vfio.device_class); + if (ret) + goto err_alloc_dev_chrdev; + pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n"); return 0; +err_alloc_dev_chrdev: + class_destroy(vfio.device_class); + vfio.device_class = NULL; err_dev_class: vfio_virqfd_exit(); err_virqfd: @@ -1580,6 +1592,7 @@ static int __init vfio_init(void) static void __exit vfio_cleanup(void) { ida_destroy(&vfio.device_ida); + vfio_cdev_cleanup(); class_destroy(vfio.device_class); vfio.device_class = NULL; vfio_virqfd_exit(); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 24091a7c7bdb..e0069f26488d 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -13,6 +13,7 @@ #include #include #include +#include #include #include @@ -51,6 +52,9 @@ struct vfio_device { /* Members below here are private, not for driver use */ unsigned int index; struct device device; /* device.kref covers object life circle */ +#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV) + struct cdev cdev; +#endif refcount_t refcount; /* user count on registered device*/ unsigned int open_count; struct completion comp; From patchwork Tue Jul 18 13:55:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317311 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC566C0015E for ; Tue, 18 Jul 2023 13:58:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233059AbjGRN6T (ORCPT ); Tue, 18 Jul 2023 09:58:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56236 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233156AbjGRN6D (ORCPT ); Tue, 18 Jul 2023 09:58:03 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6FB9519A0; Tue, 18 Jul 2023 06:57:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688663; x=1721224663; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=YaRGpjj7I+GMXdcc3KIUaOmLLrRW7d35dueJ3AuGdR4=; b=Dbe+DLt23zfnsD8yvXFVJTLWiTV2T7sKLob5j/s2jEG2J6+hsDvwOqXV gxg6nfZXw/vja64NXSkQX26TLO1H1NFnfQLVI9QtuaMY8/JqJkt1eN1WT 3smhSxJYDfwlHwITI0EXhUjnBz27ryEhfqVcdBzPK150q05XlYRIIHvaR vbPlMesfXftIb/BcyrPSybOTKUqzlSV2UuFQ28oR5oiqPBDTOaoEBq/VK tWewe7ghxiVBxGIbs/+ukIbg6OFwBO1tVtyw5YqOZvexweeZ8lLut3u+T 3bbox71kk8a1dw9LeCUkZ01DU3F5hQDNzPF469oMbqU+A0SCbMOtBE5Nd w==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590770" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590770" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:09 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251076" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251076" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:08 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 19/26] vfio: Test kvm pointer in _vfio_device_get_kvm_safe() Date: Tue, 18 Jul 2023 06:55:44 -0700 Message-Id: <20230718135551.6592-20-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This saves some lines when adding the kvm get logic for the vfio_device cdev path. This also renames _vfio_device_get_kvm_safe() to be vfio_device_get_kvm_safe(). Suggested-by: Jason Gunthorpe Reviewed-by: Jason Gunthorpe Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/group.c | 7 +------ drivers/vfio/vfio.h | 6 +++--- drivers/vfio/vfio_main.c | 5 ++++- 3 files changed, 8 insertions(+), 10 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 41a09a2df690..5c17ad812313 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -160,12 +160,7 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group, static void vfio_device_group_get_kvm_safe(struct vfio_device *device) { spin_lock(&device->group->kvm_ref_lock); - if (!device->group->kvm) - goto unlock; - - _vfio_device_get_kvm_safe(device, device->group->kvm); - -unlock: + vfio_device_get_kvm_safe(device, device->group->kvm); spin_unlock(&device->group->kvm_ref_lock); } diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index fb8f2fac3d23..c2aa65382592 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -340,11 +340,11 @@ enum { vfio_noiommu = false }; #endif #ifdef CONFIG_HAVE_KVM -void _vfio_device_get_kvm_safe(struct vfio_device *device, struct kvm *kvm); +void vfio_device_get_kvm_safe(struct vfio_device *device, struct kvm *kvm); void vfio_device_put_kvm(struct vfio_device *device); #else -static inline void _vfio_device_get_kvm_safe(struct vfio_device *device, - struct kvm *kvm) +static inline void vfio_device_get_kvm_safe(struct vfio_device *device, + struct kvm *kvm) { } diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 8a9ebcc6980b..5f7c3151d8c0 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -373,7 +373,7 @@ void vfio_unregister_group_dev(struct vfio_device *device) EXPORT_SYMBOL_GPL(vfio_unregister_group_dev); #ifdef CONFIG_HAVE_KVM -void _vfio_device_get_kvm_safe(struct vfio_device *device, struct kvm *kvm) +void vfio_device_get_kvm_safe(struct vfio_device *device, struct kvm *kvm) { void (*pfn)(struct kvm *kvm); bool (*fn)(struct kvm *kvm); @@ -381,6 +381,9 @@ void _vfio_device_get_kvm_safe(struct vfio_device *device, struct kvm *kvm) lockdep_assert_held(&device->dev_set->lock); + if (!kvm) + return; + pfn = symbol_get(kvm_put_kvm); if (WARN_ON(!pfn)) return; From patchwork Tue Jul 18 13:55:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317312 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C3063EB64DC for ; Tue, 18 Jul 2023 13:58:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232291AbjGRN6V (ORCPT ); Tue, 18 Jul 2023 09:58:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56260 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233222AbjGRN6L (ORCPT ); Tue, 18 Jul 2023 09:58:11 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DBC4197; Tue, 18 Jul 2023 06:57:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688669; x=1721224669; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7JlSkImz3ShAuDaiIZ+sh4zHfMa5cWhubnsyO0SRRR4=; b=Y7ivL4yyG0JFTPoI81YHAO6nbW0BIraFB0kT6NlX8KQ+qu0R6aRxAp12 8V0QfzwBNZfz2fDvH7JhI+7+T1hUplRFtrIdv0qjou+f8KacPxXNTh9TB jWwSpZm1YB3TFBAP2soRPZWLz5IlrTC01uuL7e9r8xWCQVrntsxmYF3uN CRhuPEayY7Gq5lTOpqMiNrBME1h6y8/9Y+8Wtnc7Wy+dbCaQ/432MrAGA Mk7T+nN4LdfyFHOi1f0ayA+xG0QH/43Qj4eBuncBE7wLUBKRJlisNNhoJ ZWK2JvZXtWy4RLnxzFS/0O8SfRwAGAdCgOIdGw7NnJ89bTnGSFyGrhrk9 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590787" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590787" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:10 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251096" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251096" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:09 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 20/26] iommufd: Add iommufd_ctx_from_fd() Date: Tue, 18 Jul 2023 06:55:45 -0700 Message-Id: <20230718135551.6592-21-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org It's common to get a reference to the iommufd context from a given file descriptor. So adds an API for it. Existing users of this API are compiled only when IOMMUFD is enabled, so no need to have a stub for the IOMMUFD disabled case. Signed-off-by: Yi Liu Reviewed-by: Jason Gunthorpe --- drivers/iommu/iommufd/main.c | 24 ++++++++++++++++++++++++ include/linux/iommufd.h | 1 + 2 files changed, 25 insertions(+) diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index 32ce7befc8dd..4bbb20dff430 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -377,6 +377,30 @@ struct iommufd_ctx *iommufd_ctx_from_file(struct file *file) } EXPORT_SYMBOL_NS_GPL(iommufd_ctx_from_file, IOMMUFD); +/** + * iommufd_ctx_from_fd - Acquires a reference to the iommufd context + * @fd: File descriptor to obtain the reference from + * + * Returns a pointer to the iommufd_ctx, otherwise ERR_PTR. On success + * the caller is responsible to call iommufd_ctx_put(). + */ +struct iommufd_ctx *iommufd_ctx_from_fd(int fd) +{ + struct file *file; + + file = fget(fd); + if (!file) + return ERR_PTR(-EBADF); + + if (file->f_op != &iommufd_fops) { + fput(file); + return ERR_PTR(-EBADFD); + } + /* fget is the same as iommufd_ctx_get() */ + return file->private_data; +} +EXPORT_SYMBOL_NS_GPL(iommufd_ctx_from_fd, IOMMUFD); + /** * iommufd_ctx_put - Put back a reference * @ictx: Context to put back diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h index 3a3216cb9482..9657c58813dc 100644 --- a/include/linux/iommufd.h +++ b/include/linux/iommufd.h @@ -54,6 +54,7 @@ void iommufd_ctx_get(struct iommufd_ctx *ictx); #if IS_ENABLED(CONFIG_IOMMUFD) struct iommufd_ctx *iommufd_ctx_from_file(struct file *file); +struct iommufd_ctx *iommufd_ctx_from_fd(int fd); void iommufd_ctx_put(struct iommufd_ctx *ictx); bool iommufd_ctx_has_group(struct iommufd_ctx *ictx, struct iommu_group *group); From patchwork Tue Jul 18 13:55:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317313 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 87E59C0015E for ; Tue, 18 Jul 2023 13:58:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232220AbjGRN6Y (ORCPT ); Tue, 18 Jul 2023 09:58:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233049AbjGRN6R (ORCPT ); Tue, 18 Jul 2023 09:58:17 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A33CB135; Tue, 18 Jul 2023 06:57:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688676; x=1721224676; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ehYluT/f0F+1shIL+ejEfudcxMv7lj3Qour9CZ02eSM=; b=KQJJRb32pVNs6t5Qx3FNa0nqz6w50+Y5hmkomhyLAyAeY7P+5kNCfeAq oCZErsEXSnkyES+R7batmA9AF/9x1sNs27LlRLMAYKuPc94xvpEenrH5H vl7cLUWScROlSwrONkL6PE5lN1hwsqdee2j4AT5WWRaUuVB/slWFXv82M GpMqngTGmHvBT0Tu2B9hLC7MBht+XBvCqQbj/sgX14S5xKqHUqRc/KUJz vK9vY/VbHwGrXpbSnIkkKo9ZsMBkBsXmHmO6IZZPuhlgrew4ieuMibsRw Y248yZxvjD32NflFMyzHZTr3QGpoWzHrna+xsGqcc2aLQRdxw1pFYvA5i A==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590803" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590803" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251118" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251118" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:10 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 21/26] vfio: Avoid repeated user pointer cast in vfio_device_fops_unl_ioctl() Date: Tue, 18 Jul 2023 06:55:46 -0700 Message-Id: <20230718135551.6592-22-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds a local variable to store the user pointer cast result from arg. It avoids the repeated casts in the code when more ioctls are added. Reviewed-by: Jason Gunthorpe Signed-off-by: Yi Liu --- drivers/vfio/vfio_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 5f7c3151d8c0..a2744cb64c6d 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1146,6 +1146,7 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, { struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + void __user *uptr = (void __user *)arg; int ret; /* Paired with smp_store_release() following vfio_df_open() */ @@ -1158,7 +1159,7 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, switch (cmd) { case VFIO_DEVICE_FEATURE: - ret = vfio_ioctl_device_feature(device, (void __user *)arg); + ret = vfio_ioctl_device_feature(device, uptr); break; default: From patchwork Tue Jul 18 13:55:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317314 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0A431C0015E for ; Tue, 18 Jul 2023 13:58:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233071AbjGRN6a (ORCPT ); Tue, 18 Jul 2023 09:58:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56652 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233048AbjGRN61 (ORCPT ); Tue, 18 Jul 2023 09:58:27 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 43A7B1990; Tue, 18 Jul 2023 06:58:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688683; x=1721224683; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZBh7bMFBrgvlWP2zmLNuugMeEYlXwlvKD68Pxx2Yb3o=; b=nNLYi/NbOxhuYIymxNSMK5ZkKYmRIMN08nZdI67c2Ga2jS1mbIxR14L+ DNS8u7PFSKydFvT/v8SvLQvzXHEjgbqL16G+Qt2MDqiXG+ghez/ODw//P daq5kHuqxdfmPglmuPSayFDbHFr2lTBAfa8nKcujRA2ILXWPau8lbiqQc 1OcX8AJWyzLy01qQAVFiCOb/mla5Xdt1VoP6FpM7vnIj3JWL2L2bR72UO nx2V0phZ161Cx1emJyudCSDC43Rf8+t7iXzV4cBVVAR7N7B7ROrM0kgP6 GHmpdSGjb9diOQz++OGMXuUSgRQfdKSKi0fYn7qjQOXWkKArOOpQ1A8Qs w==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590819" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590819" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:12 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251130" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251130" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:11 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 22/26] vfio: Add VFIO_DEVICE_BIND_IOMMUFD Date: Tue, 18 Jul 2023 06:55:47 -0700 Message-Id: <20230718135551.6592-23-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds ioctl for userspace to bind device cdev fd to iommufd. VFIO_DEVICE_BIND_IOMMUFD: bind device to an iommufd, hence gain DMA control provided by the iommufd. open_device op is called after bind_iommufd op. Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Terrence Xu Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu Reviewed-by: Jason Gunthorpe --- drivers/vfio/device_cdev.c | 107 +++++++++++++++++++++++++++++++++++++ drivers/vfio/vfio.h | 13 +++++ drivers/vfio/vfio_main.c | 5 ++ include/linux/vfio.h | 5 +- include/uapi/linux/vfio.h | 27 ++++++++++ 5 files changed, 155 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c index bf1032d00107..f40784dd5561 100644 --- a/drivers/vfio/device_cdev.c +++ b/drivers/vfio/device_cdev.c @@ -3,6 +3,7 @@ * Copyright (c) 2023 Intel Corporation. */ #include +#include #include "vfio.h" @@ -45,6 +46,112 @@ int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep) return ret; } +static void vfio_df_get_kvm_safe(struct vfio_device_file *df) +{ + spin_lock(&df->kvm_ref_lock); + vfio_device_get_kvm_safe(df->device, df->kvm); + spin_unlock(&df->kvm_ref_lock); +} + +long vfio_df_ioctl_bind_iommufd(struct vfio_device_file *df, + struct vfio_device_bind_iommufd __user *arg) +{ + struct vfio_device *device = df->device; + struct vfio_device_bind_iommufd bind; + unsigned long minsz; + int ret; + + static_assert(__same_type(arg->out_devid, df->devid)); + + minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid); + + if (copy_from_user(&bind, arg, minsz)) + return -EFAULT; + + if (bind.argsz < minsz || bind.flags || bind.iommufd < 0) + return -EINVAL; + + /* BIND_IOMMUFD only allowed for cdev fds */ + if (df->group) + return -EINVAL; + + ret = vfio_device_block_group(device); + if (ret) + return ret; + + mutex_lock(&device->dev_set->lock); + /* one device cannot be bound twice */ + if (df->access_granted) { + ret = -EINVAL; + goto out_unlock; + } + + df->iommufd = iommufd_ctx_from_fd(bind.iommufd); + if (IS_ERR(df->iommufd)) { + ret = PTR_ERR(df->iommufd); + df->iommufd = NULL; + goto out_unlock; + } + + /* + * Before the device open, get the KVM pointer currently + * associated with the device file (if there is) and obtain + * a reference. This reference is held until device closed. + * Save the pointer in the device for use by drivers. + */ + vfio_df_get_kvm_safe(df); + + ret = vfio_df_open(df); + if (ret) + goto out_put_kvm; + + ret = copy_to_user(&arg->out_devid, &df->devid, + sizeof(df->devid)) ? -EFAULT : 0; + if (ret) + goto out_close_device; + + device->cdev_opened = true; + /* + * Paired with smp_load_acquire() in vfio_device_fops::ioctl/ + * read/write/mmap + */ + smp_store_release(&df->access_granted, true); + mutex_unlock(&device->dev_set->lock); + return 0; + +out_close_device: + vfio_df_close(df); +out_put_kvm: + vfio_device_put_kvm(device); + iommufd_ctx_put(df->iommufd); + df->iommufd = NULL; +out_unlock: + mutex_unlock(&device->dev_set->lock); + vfio_device_unblock_group(device); + return ret; +} + +void vfio_df_unbind_iommufd(struct vfio_device_file *df) +{ + struct vfio_device *device = df->device; + + /* + * In the time of close, there is no contention with another one + * changing this flag. So read df->access_granted without lock + * and no smp_load_acquire() is ok. + */ + if (!df->access_granted) + return; + + mutex_lock(&device->dev_set->lock); + vfio_df_close(df); + vfio_device_put_kvm(device); + iommufd_ctx_put(df->iommufd); + device->cdev_opened = false; + mutex_unlock(&device->dev_set->lock); + vfio_device_unblock_group(device); +} + static char *vfio_device_devnode(const struct device *dev, umode_t *mode) { return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev)); diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index c2aa65382592..b6d4ba1ef2b8 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -287,6 +287,9 @@ static inline void vfio_device_del(struct vfio_device *device) } int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep); +long vfio_df_ioctl_bind_iommufd(struct vfio_device_file *df, + struct vfio_device_bind_iommufd __user *arg); +void vfio_df_unbind_iommufd(struct vfio_device_file *df); int vfio_cdev_init(struct class *device_class); void vfio_cdev_cleanup(void); #else @@ -310,6 +313,16 @@ static inline int vfio_device_fops_cdev_open(struct inode *inode, return 0; } +static inline long vfio_df_ioctl_bind_iommufd(struct vfio_device_file *df, + struct vfio_device_bind_iommufd __user *arg) +{ + return -ENOTTY; +} + +static inline void vfio_df_unbind_iommufd(struct vfio_device_file *df) +{ +} + static inline int vfio_cdev_init(struct class *device_class) { return 0; diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index a2744cb64c6d..9fdf93ff17cf 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -575,6 +575,8 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep) if (df->group) vfio_df_group_close(df); + else + vfio_df_unbind_iommufd(df); vfio_device_put_registration(device); @@ -1149,6 +1151,9 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, void __user *uptr = (void __user *)arg; int ret; + if (cmd == VFIO_DEVICE_BIND_IOMMUFD) + return vfio_df_ioctl_bind_iommufd(df, uptr); + /* Paired with smp_store_release() following vfio_df_open() */ if (!smp_load_acquire(&df->access_granted)) return -EINVAL; diff --git a/include/linux/vfio.h b/include/linux/vfio.h index e0069f26488d..d6228c839c44 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -64,8 +64,9 @@ struct vfio_device { void (*put_kvm)(struct kvm *kvm); #if IS_ENABLED(CONFIG_IOMMUFD) struct iommufd_device *iommufd_device; - bool iommufd_attached; + u8 iommufd_attached:1; #endif + u8 cdev_opened:1; }; /** @@ -168,7 +169,7 @@ vfio_iommufd_get_dev_id(struct vfio_device *vdev, struct iommufd_ctx *ictx) static inline bool vfio_device_cdev_opened(struct vfio_device *device) { - return false; + return device->cdev_opened; } /** diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 4c3d548e9c96..098946b23e86 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -897,6 +897,33 @@ struct vfio_device_feature { #define VFIO_DEVICE_FEATURE _IO(VFIO_TYPE, VFIO_BASE + 17) +/* + * VFIO_DEVICE_BIND_IOMMUFD - _IOR(VFIO_TYPE, VFIO_BASE + 18, + * struct vfio_device_bind_iommufd) + * @argsz: User filled size of this data. + * @flags: Must be 0. + * @iommufd: iommufd to bind. + * @out_devid: The device id generated by this bind. devid is a handle for + * this device/iommufd bond and can be used in IOMMUFD commands. + * + * Bind a vfio_device to the specified iommufd. + * + * User is restricted from accessing the device before the binding operation + * is completed. Only allowed on cdev fds. + * + * Unbind is automatically conducted when device fd is closed. + * + * Return: 0 on success, -errno on failure. + */ +struct vfio_device_bind_iommufd { + __u32 argsz; + __u32 flags; + __s32 iommufd; + __u32 out_devid; +}; + +#define VFIO_DEVICE_BIND_IOMMUFD _IO(VFIO_TYPE, VFIO_BASE + 18) + /* * Provide support for setting a PCI VF Token, which is used as a shared * secret between PF and VF drivers. This feature may only be set on a From patchwork Tue Jul 18 13:55:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317315 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 809BEEB64DD for ; Tue, 18 Jul 2023 13:58:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231687AbjGRN6f (ORCPT ); Tue, 18 Jul 2023 09:58:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56804 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229793AbjGRN6d (ORCPT ); Tue, 18 Jul 2023 09:58:33 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9C88E198A; Tue, 18 Jul 2023 06:58:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688691; x=1721224691; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Pxk1jc7Rm+F8hS5OJwZHRb9QQCQaPDvusSEoAXHzOAk=; b=DYyhwkwYiDrTdMCkWByuUHdP3JN/0KMVGqNnQo0hOoTCpvFo9hquKvwG i9R5CZWSEBF+xzT/5mVm05NE18Wug8byv33+UqaJCy4iGwqoLhLyOpZUg icOHIXrSKgc9tUVeoZQ/7h9jiPRDCzvojRSKKD0J97XawW5P5zP1oGPKa DuK2Mk4jV7LtO38bZBWaaKTirS6DZL77EmOcYBnDAz2SfGEI+jhv3rOok rzpyw8xBqGI2f9IfZLleUYDxCJceRuD/lciaOUKW2ZysRLMpWIvaxhnFZ 5uxs/JN1YbRxSYNCXbLIX5k5cmvJEnB7gXLG1fwG/21YEBrPDj+rHGfT/ A==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590831" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590831" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:13 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251149" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251149" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:12 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 23/26] vfio: Add VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT Date: Tue, 18 Jul 2023 06:55:48 -0700 Message-Id: <20230718135551.6592-24-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds ioctl for userspace to attach device cdev fd to and detach from IOAS/hw_pagetable managed by iommufd. VFIO_DEVICE_ATTACH_IOMMUFD_PT: attach vfio device to IOAS or hw_pagetable managed by iommufd. Attach can be undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close. VFIO_DEVICE_DETACH_IOMMUFD_PT: detach vfio device from the current attached IOAS or hw_pagetable managed by iommufd. Reviewed-by: Jason Gunthorpe Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Terrence Xu Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/device_cdev.c | 58 ++++++++++++++++++++++++++++++++++++++ drivers/vfio/vfio.h | 5 ++++ drivers/vfio/vfio_main.c | 15 +++++++++- include/uapi/linux/vfio.h | 44 +++++++++++++++++++++++++++++ 4 files changed, 121 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c index f40784dd5561..e75da0a70d1f 100644 --- a/drivers/vfio/device_cdev.c +++ b/drivers/vfio/device_cdev.c @@ -152,6 +152,64 @@ void vfio_df_unbind_iommufd(struct vfio_device_file *df) vfio_device_unblock_group(device); } +int vfio_df_ioctl_attach_pt(struct vfio_device_file *df, + struct vfio_device_attach_iommufd_pt __user *arg) +{ + struct vfio_device *device = df->device; + struct vfio_device_attach_iommufd_pt attach; + unsigned long minsz; + int ret; + + minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id); + + if (copy_from_user(&attach, arg, minsz)) + return -EFAULT; + + if (attach.argsz < minsz || attach.flags) + return -EINVAL; + + mutex_lock(&device->dev_set->lock); + ret = device->ops->attach_ioas(device, &attach.pt_id); + if (ret) + goto out_unlock; + + if (copy_to_user(&arg->pt_id, &attach.pt_id, sizeof(attach.pt_id))) { + ret = -EFAULT; + goto out_detach; + } + mutex_unlock(&device->dev_set->lock); + + return 0; + +out_detach: + device->ops->detach_ioas(device); +out_unlock: + mutex_unlock(&device->dev_set->lock); + return ret; +} + +int vfio_df_ioctl_detach_pt(struct vfio_device_file *df, + struct vfio_device_detach_iommufd_pt __user *arg) +{ + struct vfio_device *device = df->device; + struct vfio_device_detach_iommufd_pt detach; + unsigned long minsz; + + minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags); + + if (copy_from_user(&detach, arg, minsz)) + return -EFAULT; + + if (detach.argsz < minsz || detach.flags) + return -EINVAL; + + mutex_lock(&device->dev_set->lock); + device->ops->detach_ioas(device); + mutex_unlock(&device->dev_set->lock); + + return 0; +} + static char *vfio_device_devnode(const struct device *dev, umode_t *mode) { return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev)); diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index b6d4ba1ef2b8..8353774a5d07 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -266,6 +266,11 @@ vfio_iommufd_compat_attach_ioas(struct vfio_device *device, } #endif +int vfio_df_ioctl_attach_pt(struct vfio_device_file *df, + struct vfio_device_attach_iommufd_pt __user *arg); +int vfio_df_ioctl_detach_pt(struct vfio_device_file *df, + struct vfio_device_detach_iommufd_pt __user *arg); + #if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV) void vfio_init_device_cdev(struct vfio_device *device); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 9fdf93ff17cf..ba1d84afe081 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1162,6 +1162,19 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, if (ret) return ret; + /* cdev only ioctls */ + if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV) && !df->group) { + switch (cmd) { + case VFIO_DEVICE_ATTACH_IOMMUFD_PT: + ret = vfio_df_ioctl_attach_pt(df, uptr); + goto out; + + case VFIO_DEVICE_DETACH_IOMMUFD_PT: + ret = vfio_df_ioctl_detach_pt(df, uptr); + goto out; + } + } + switch (cmd) { case VFIO_DEVICE_FEATURE: ret = vfio_ioctl_device_feature(device, uptr); @@ -1174,7 +1187,7 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, ret = device->ops->ioctl(device, cmd, arg); break; } - +out: vfio_device_pm_runtime_put(device); return ret; } diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 098946b23e86..fa06e3eb4955 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -924,6 +924,50 @@ struct vfio_device_bind_iommufd { #define VFIO_DEVICE_BIND_IOMMUFD _IO(VFIO_TYPE, VFIO_BASE + 18) +/* + * VFIO_DEVICE_ATTACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 19, + * struct vfio_device_attach_iommufd_pt) + * @argsz: User filled size of this data. + * @flags: Must be 0. + * @pt_id: Input the target id which can represent an ioas or a hwpt + * allocated via iommufd subsystem. + * Output the input ioas id or the attached hwpt id which could + * be the specified hwpt itself or a hwpt automatically created + * for the specified ioas by kernel during the attachment. + * + * Associate the device with an address space within the bound iommufd. + * Undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close. This is only + * allowed on cdev fds. + * + * Return: 0 on success, -errno on failure. + */ +struct vfio_device_attach_iommufd_pt { + __u32 argsz; + __u32 flags; + __u32 pt_id; +}; + +#define VFIO_DEVICE_ATTACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 19) + +/* + * VFIO_DEVICE_DETACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 20, + * struct vfio_device_detach_iommufd_pt) + * @argsz: User filled size of this data. + * @flags: Must be 0. + * + * Remove the association of the device and its current associated address + * space. After it, the device should be in a blocking DMA state. This is only + * allowed on cdev fds. + * + * Return: 0 on success, -errno on failure. + */ +struct vfio_device_detach_iommufd_pt { + __u32 argsz; + __u32 flags; +}; + +#define VFIO_DEVICE_DETACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 20) + /* * Provide support for setting a PCI VF Token, which is used as a shared * secret between PF and VF drivers. This feature may only be set on a From patchwork Tue Jul 18 13:55:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317316 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB971EB64DD for ; Tue, 18 Jul 2023 13:58:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233088AbjGRN6o (ORCPT ); Tue, 18 Jul 2023 09:58:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56948 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233048AbjGRN6k (ORCPT ); Tue, 18 Jul 2023 09:58:40 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5BD89AC; Tue, 18 Jul 2023 06:58:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688697; x=1721224697; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9oSzt+Qjxbm5o8hEt23r35NFXF06gcMZujcuINafNKM=; b=mCmXQ93ZA/pkYYhc/Se0qLslUJI820lT2Zh1hZNTxQEEcucxaWzIxDRa HCtfGSELUnIkdIj1ZTvb7/pCOQ97z8N7gSy/NKqdEmHYgVC9WVpttIT8u DhnAEW/S8TMirAfVQirT/9+nMiN4+HMgErwu/jfPMRM5wnV8tlERs9N40 W3SbRjUJ3JPcMECefJ1fEcjRTRoEh9SGgv3P4sA8sTgogLqNgzzqnqIh0 ATYJW/95OUvZJoRMEg9vE3hvODc8IFl1APqS3u0hOQF/Dg/kHavbz8ic7 MpqenAti4Vc5yHaTfl+6bLvA0dUfm7QYqszPckh2a8HnLDihfcfyJ95PR Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590846" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590846" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:14 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251163" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251163" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:13 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 24/26] vfio: Move the IOMMU_CAP_CACHE_COHERENCY check in __vfio_register_dev() Date: Tue, 18 Jul 2023 06:55:49 -0700 Message-Id: <20230718135551.6592-25-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The IOMMU_CAP_CACHE_COHERENCY check only applies to the physical devices that are IOMMU-backed. But it is now in the group code. If want to compile vfio_group infrastructure out, this check needs to be moved out of the group code. Another reason for this change is to fail the device registration for the physical devices that do not have IOMMU if the group code is not compiled as the cdev interface does not support such devices. Suggested-by: Jason Gunthorpe Reviewed-by: Jason Gunthorpe Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/group.c | 10 ---------- drivers/vfio/vfio_main.c | 11 +++++++++++ 2 files changed, 11 insertions(+), 10 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 5c17ad812313..610a429c6191 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -682,16 +682,6 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev) if (!iommu_group) return ERR_PTR(-EINVAL); - /* - * VFIO always sets IOMMU_CACHE because we offer no way for userspace to - * restore cache coherency. It has to be checked here because it is only - * valid for cases where we are using iommu groups. - */ - if (!device_iommu_capable(dev, IOMMU_CAP_CACHE_COHERENCY)) { - iommu_group_put(iommu_group); - return ERR_PTR(-EINVAL); - } - mutex_lock(&vfio.group_lock); group = vfio_group_find_from_iommu(iommu_group); if (group) { diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index ba1d84afe081..902f06e52c48 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -292,6 +292,17 @@ static int __vfio_register_dev(struct vfio_device *device, if (ret) return ret; + /* + * VFIO always sets IOMMU_CACHE because we offer no way for userspace to + * restore cache coherency. It has to be checked here because it is only + * valid for cases where we are using iommu groups. + */ + if (type == VFIO_IOMMU && !vfio_device_is_noiommu(device) && + !device_iommu_capable(device->dev, IOMMU_CAP_CACHE_COHERENCY)) { + ret = -EINVAL; + goto err_out; + } + ret = vfio_device_add(device); if (ret) goto err_out; From patchwork Tue Jul 18 13:55:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317317 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4CFEDEB64DA for ; Tue, 18 Jul 2023 13:58:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232531AbjGRN6x (ORCPT ); Tue, 18 Jul 2023 09:58:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56946 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231154AbjGRN6s (ORCPT ); Tue, 18 Jul 2023 09:58:48 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2B0C910FE; Tue, 18 Jul 2023 06:58:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688707; x=1721224707; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/Mqc3P0T6TEZuZYlkUFzDxl9cmobu4G6ID006pRWBLE=; b=cWkDtns+zBgIT9+5Zwhn6M09pw2eiK4G3xDJ7T8R1osw1CU3nDRBRtjz UkXkVjP4uyc63nCiACqaghDGG2kY+/pVlaS7r5nG74iguFFUjfkP12T01 O3raMiCPzr4gBOWGj6bzvuTUXOI/1VWNOMPY7GNJdMlLK6bNQ7jogBWH3 zX0o7X0/i05K6mxoVfLhftp7M8y5eDsSgYcRjj3OWX5XY+UV+jScIU66b dAafNuQInk3yYezMegNK7cypardupCTICClY6eJhRrBXTh+cNI9B7T35g eU53O9da1Y7Tw5cKnMuSUOTaBd4CZiLCY5CN+YOu0FyYh3Z9TJXdM+9JZ w==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590858" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590858" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:15 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251176" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251176" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:14 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 25/26] vfio: Compile vfio_group infrastructure optionally Date: Tue, 18 Jul 2023 06:55:50 -0700 Message-Id: <20230718135551.6592-26-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org vfio_group is not needed for vfio device cdev, so with vfio device cdev introduced, the vfio_group infrastructures can be compiled out if only cdev is needed. Reviewed-by: Jason Gunthorpe Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Terrence Xu Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/iommu/iommufd/Kconfig | 4 +- drivers/vfio/Kconfig | 15 ++++++ drivers/vfio/Makefile | 2 +- drivers/vfio/vfio.h | 89 ++++++++++++++++++++++++++++++++--- include/linux/vfio.h | 25 ++++++++-- 5 files changed, 123 insertions(+), 12 deletions(-) diff --git a/drivers/iommu/iommufd/Kconfig b/drivers/iommu/iommufd/Kconfig index ada693ea51a7..99d4b075df49 100644 --- a/drivers/iommu/iommufd/Kconfig +++ b/drivers/iommu/iommufd/Kconfig @@ -14,8 +14,8 @@ config IOMMUFD if IOMMUFD config IOMMUFD_VFIO_CONTAINER bool "IOMMUFD provides the VFIO container /dev/vfio/vfio" - depends on VFIO && !VFIO_CONTAINER - default VFIO && !VFIO_CONTAINER + depends on VFIO_GROUP && !VFIO_CONTAINER + default VFIO_GROUP && !VFIO_CONTAINER help IOMMUFD will provide /dev/vfio/vfio instead of VFIO. This relies on IOMMUFD providing compatibility emulation to give the same ioctls. diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index 26f18d92eb97..6bda6dbb4878 100644 --- a/drivers/vfio/Kconfig +++ b/drivers/vfio/Kconfig @@ -4,6 +4,8 @@ menuconfig VFIO select IOMMU_API depends on IOMMUFD || !IOMMUFD select INTERVAL_TREE + select VFIO_GROUP if SPAPR_TCE_IOMMU || IOMMUFD=n + select VFIO_DEVICE_CDEV if !VFIO_GROUP select VFIO_CONTAINER if IOMMUFD=n help VFIO provides a framework for secure userspace device drivers. @@ -15,6 +17,7 @@ if VFIO config VFIO_DEVICE_CDEV bool "Support for the VFIO cdev /dev/vfio/devices/vfioX" depends on IOMMUFD && !SPAPR_TCE_IOMMU + default !VFIO_GROUP help The VFIO device cdev is another way for userspace to get device access. Userspace gets device fd by opening device cdev under @@ -24,9 +27,20 @@ config VFIO_DEVICE_CDEV If you don't know what to do here, say N. +config VFIO_GROUP + bool "Support for the VFIO group /dev/vfio/$group_id" + default y + help + VFIO group support provides the traditional model for accessing + devices through VFIO and is used by the majority of userspace + applications and drivers making use of VFIO. + + If you don't know what to do here, say Y. + config VFIO_CONTAINER bool "Support for the VFIO container /dev/vfio/vfio" select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64) + depends on VFIO_GROUP default y help The VFIO container is the classic interface to VFIO for establishing @@ -48,6 +62,7 @@ endif config VFIO_NOIOMMU bool "VFIO No-IOMMU support" + depends on VFIO_GROUP help VFIO is built on the ability to isolate devices using the IOMMU. Only with an IOMMU can userspace access to DMA capable devices be diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index 87ccb16fdd77..c82ea032d352 100644 --- a/drivers/vfio/Makefile +++ b/drivers/vfio/Makefile @@ -2,9 +2,9 @@ obj-$(CONFIG_VFIO) += vfio.o vfio-y += vfio_main.o \ - group.o \ iova_bitmap.o vfio-$(CONFIG_VFIO_DEVICE_CDEV) += device_cdev.o +vfio-$(CONFIG_VFIO_GROUP) += group.o vfio-$(CONFIG_IOMMUFD) += iommufd.o vfio-$(CONFIG_VFIO_CONTAINER) += container.o vfio-$(CONFIG_VFIO_VIRQFD) += virqfd.o diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 8353774a5d07..307e3f29b527 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -36,6 +36,12 @@ vfio_allocate_device_file(struct vfio_device *device); extern const struct file_operations vfio_device_fops; +#ifdef CONFIG_VFIO_NOIOMMU +extern bool vfio_noiommu __read_mostly; +#else +enum { vfio_noiommu = false }; +#endif + enum vfio_group_type { /* * Physical device with IOMMU backing. @@ -60,6 +66,7 @@ enum vfio_group_type { VFIO_NO_IOMMU, }; +#if IS_ENABLED(CONFIG_VFIO_GROUP) struct vfio_group { struct device dev; struct cdev cdev; @@ -111,6 +118,82 @@ static inline bool vfio_device_is_noiommu(struct vfio_device *vdev) return IS_ENABLED(CONFIG_VFIO_NOIOMMU) && vdev->group->type == VFIO_NO_IOMMU; } +#else +struct vfio_group; + +static inline int vfio_device_block_group(struct vfio_device *device) +{ + return 0; +} + +static inline void vfio_device_unblock_group(struct vfio_device *device) +{ +} + +static inline int vfio_device_set_group(struct vfio_device *device, + enum vfio_group_type type) +{ + return 0; +} + +static inline void vfio_device_remove_group(struct vfio_device *device) +{ +} + +static inline void vfio_device_group_register(struct vfio_device *device) +{ +} + +static inline void vfio_device_group_unregister(struct vfio_device *device) +{ +} + +static inline int vfio_device_group_use_iommu(struct vfio_device *device) +{ + return -EOPNOTSUPP; +} + +static inline void vfio_device_group_unuse_iommu(struct vfio_device *device) +{ +} + +static inline void vfio_df_group_close(struct vfio_device_file *df) +{ +} + +static inline struct vfio_group *vfio_group_from_file(struct file *file) +{ + return NULL; +} + +static inline bool vfio_group_enforced_coherent(struct vfio_group *group) +{ + return true; +} + +static inline void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm) +{ +} + +static inline bool vfio_device_has_container(struct vfio_device *device) +{ + return false; +} + +static inline int __init vfio_group_init(void) +{ + return 0; +} + +static inline void vfio_group_cleanup(void) +{ +} + +static inline bool vfio_device_is_noiommu(struct vfio_device *vdev) +{ + return false; +} +#endif /* CONFIG_VFIO_GROUP */ #if IS_ENABLED(CONFIG_VFIO_CONTAINER) /** @@ -351,12 +434,6 @@ static inline void vfio_virqfd_exit(void) } #endif -#ifdef CONFIG_VFIO_NOIOMMU -extern bool vfio_noiommu __read_mostly; -#else -enum { vfio_noiommu = false }; -#endif - #ifdef CONFIG_HAVE_KVM void vfio_device_get_kvm_safe(struct vfio_device *device, struct kvm *kvm); void vfio_device_put_kvm(struct vfio_device *device); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index d6228c839c44..5a1dee983f17 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -43,7 +43,11 @@ struct vfio_device { */ const struct vfio_migration_ops *mig_ops; const struct vfio_log_ops *log_ops; +#if IS_ENABLED(CONFIG_VFIO_GROUP) struct vfio_group *group; + struct list_head group_next; + struct list_head iommu_entry; +#endif struct vfio_device_set *dev_set; struct list_head dev_set_list; unsigned int migration_flags; @@ -58,8 +62,6 @@ struct vfio_device { refcount_t refcount; /* user count on registered device*/ unsigned int open_count; struct completion comp; - struct list_head group_next; - struct list_head iommu_entry; struct iommufd_access *iommufd_access; void (*put_kvm)(struct kvm *kvm); #if IS_ENABLED(CONFIG_IOMMUFD) @@ -284,12 +286,29 @@ int vfio_mig_get_next_state(struct vfio_device *device, /* * External user API */ +#if IS_ENABLED(CONFIG_VFIO_GROUP) struct iommu_group *vfio_file_iommu_group(struct file *file); bool vfio_file_is_group(struct file *file); +bool vfio_file_has_dev(struct file *file, struct vfio_device *device); +#else +static inline struct iommu_group *vfio_file_iommu_group(struct file *file) +{ + return NULL; +} + +static inline bool vfio_file_is_group(struct file *file) +{ + return false; +} + +static inline bool vfio_file_has_dev(struct file *file, struct vfio_device *device) +{ + return false; +} +#endif bool vfio_file_is_valid(struct file *file); bool vfio_file_enforced_coherent(struct file *file); void vfio_file_set_kvm(struct file *file, struct kvm *kvm); -bool vfio_file_has_dev(struct file *file, struct vfio_device *device); #define VFIO_PIN_PAGES_MAX_ENTRIES (PAGE_SIZE/sizeof(unsigned long)) From patchwork Tue Jul 18 13:55:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317318 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0FCA6EB64DD for ; Tue, 18 Jul 2023 13:59:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232369AbjGRN7A (ORCPT ); Tue, 18 Jul 2023 09:59:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56998 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231154AbjGRN6x (ORCPT ); Tue, 18 Jul 2023 09:58:53 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 89D0B1733; Tue, 18 Jul 2023 06:58:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688713; x=1721224713; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=p5YNVfcmU75vmVCJ0PVVuhw1lipPLjymLPKeSXtmq+I=; b=KFF9AK8VVc3HETs6Nce0mIM7MdwaPAaSzy+JWtG5fb4bFXMfHtddI1JG i/KPKD6EzpsRkzmwQJ9C5EniDZQAPQp2joFU0xD4wcuEGWFjIx9oqWl+j Ku634AAH/yjxuyt2zlILfz1XQk/gzUgoAO3rRmWPjBCifzPzycrz7o7ZN IjjKYZDYQPL6u+tIgZnhFHsWyfif3STD8SQK24045vaK7hmHvGeNhA3WV q/8/Oga5rDaoEl7YCFfU8/lI1AV1qRrzklG53dSP4e5Wl0Md0tjTk4FPR qQrkN9xR2nHsHZ20UR21QHE6wPycZJL2s4Ls3Rm740G3OhVQuJBrxWzA1 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590884" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590884" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970251187" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970251187" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:56:16 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 26/26] docs: vfio: Add vfio device cdev description Date: Tue, 18 Jul 2023 06:55:51 -0700 Message-Id: <20230718135551.6592-27-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This gives notes for userspace applications on device cdev usage. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Signed-off-by: Yi Liu --- Documentation/driver-api/vfio.rst | 139 ++++++++++++++++++++++++++++++ 1 file changed, 139 insertions(+) diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst index 363e12c90b87..633d11c7fa71 100644 --- a/Documentation/driver-api/vfio.rst +++ b/Documentation/driver-api/vfio.rst @@ -239,6 +239,137 @@ group and can access them as follows:: /* Gratuitous device reset and go... */ ioctl(device, VFIO_DEVICE_RESET); +IOMMUFD and vfio_iommu_type1 +---------------------------- + +IOMMUFD is the new user API to manage I/O page tables from userspace. +It intends to be the portal of delivering advanced userspace DMA +features (nested translation [5]_, PASID [6]_, etc.) while also providing +a backwards compatibility interface for existing VFIO_TYPE1v2_IOMMU use +cases. Eventually the vfio_iommu_type1 driver, as well as the legacy +vfio container and group model is intended to be deprecated. + +The IOMMUFD backwards compatibility interface can be enabled two ways. +In the first method, the kernel can be configured with +CONFIG_IOMMUFD_VFIO_CONTAINER, in which case the IOMMUFD subsystem +transparently provides the entire infrastructure for the VFIO +container and IOMMU backend interfaces. The compatibility mode can +also be accessed if the VFIO container interface, ie. /dev/vfio/vfio is +simply symlink'd to /dev/iommu. Note that at the time of writing, the +compatibility mode is not entirely feature complete relative to +VFIO_TYPE1v2_IOMMU (ex. DMA mapping MMIO) and does not attempt to +provide compatibility to the VFIO_SPAPR_TCE_IOMMU interface. Therefore +it is not generally advisable at this time to switch from native VFIO +implementations to the IOMMUFD compatibility interfaces. + +Long term, VFIO users should migrate to device access through the cdev +interface described below, and native access through the IOMMUFD +provided interfaces. + +VFIO Device cdev +---------------- + +Traditionally user acquires a device fd via VFIO_GROUP_GET_DEVICE_FD +in a VFIO group. + +With CONFIG_VFIO_DEVICE_CDEV=y the user can now acquire a device fd +by directly opening a character device /dev/vfio/devices/vfioX where +"X" is the number allocated uniquely by VFIO for registered devices. +cdev interface does not support noiommu devices, so user should use +the legacy group interface if noiommu is wanted. + +The cdev only works with IOMMUFD. Both VFIO drivers and applications +must adapt to the new cdev security model which requires using +VFIO_DEVICE_BIND_IOMMUFD to claim DMA ownership before starting to +actually use the device. Once BIND succeeds then a VFIO device can +be fully accessed by the user. + +VFIO device cdev doesn't rely on VFIO group/container/iommu drivers. +Hence those modules can be fully compiled out in an environment +where no legacy VFIO application exists. + +So far SPAPR does not support IOMMUFD yet. So it cannot support device +cdev either. + +vfio device cdev access is still bound by IOMMU group semantics, ie. there +can be only one DMA owner for the group. Devices belonging to the same +group can not be bound to multiple iommufd_ctx or shared between native +kernel and vfio bus driver or other driver supporting the driver_managed_dma +flag. A violation of this ownership requirement will fail at the +VFIO_DEVICE_BIND_IOMMUFD ioctl, which gates full device access. + +Device cdev Example +------------------- + +Assume user wants to access PCI device 0000:6a:01.0:: + + $ ls /sys/bus/pci/devices/0000:6a:01.0/vfio-dev/ + vfio0 + +This device is therefore represented as vfio0. The user can verify +its existence:: + + $ ls -l /dev/vfio/devices/vfio0 + crw------- 1 root root 511, 0 Feb 16 01:22 /dev/vfio/devices/vfio0 + $ cat /sys/bus/pci/devices/0000:6a:01.0/vfio-dev/vfio0/dev + 511:0 + $ ls -l /dev/char/511\:0 + lrwxrwxrwx 1 root root 21 Feb 16 01:22 /dev/char/511:0 -> ../vfio/devices/vfio0 + +Then provide the user with access to the device if unprivileged +operation is desired:: + + $ chown user:user /dev/vfio/devices/vfio0 + +Finally the user could get cdev fd by:: + + cdev_fd = open("/dev/vfio/devices/vfio0", O_RDWR); + +An opened cdev_fd doesn't give the user any permission of accessing +the device except binding the cdev_fd to an iommufd. After that point +then the device is fully accessible including attaching it to an +IOMMUFD IOAS/HWPT to enable userspace DMA:: + + struct vfio_device_bind_iommufd bind = { + .argsz = sizeof(bind), + .flags = 0, + }; + struct iommu_ioas_alloc alloc_data = { + .size = sizeof(alloc_data), + .flags = 0, + }; + struct vfio_device_attach_iommufd_pt attach_data = { + .argsz = sizeof(attach_data), + .flags = 0, + }; + struct iommu_ioas_map map = { + .size = sizeof(map), + .flags = IOMMU_IOAS_MAP_READABLE | + IOMMU_IOAS_MAP_WRITEABLE | + IOMMU_IOAS_MAP_FIXED_IOVA, + .__reserved = 0, + }; + + iommufd = open("/dev/iommu", O_RDWR); + + bind.iommufd = iommufd; + ioctl(cdev_fd, VFIO_DEVICE_BIND_IOMMUFD, &bind); + + ioctl(iommufd, IOMMU_IOAS_ALLOC, &alloc_data); + attach_data.pt_id = alloc_data.out_ioas_id; + ioctl(cdev_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data); + + /* Allocate some space and setup a DMA mapping */ + map.user_va = (int64_t)mmap(0, 1024 * 1024, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, 0, 0); + map.iova = 0; /* 1MB starting at 0x0 from device view */ + map.length = 1024 * 1024; + map.ioas_id = alloc_data.out_ioas_id;; + + ioctl(iommufd, IOMMU_IOAS_MAP, &map); + + /* Other device operations as stated in "VFIO Usage Example" */ + VFIO User API ------------------------------------------------------------------------------- @@ -566,3 +697,11 @@ This implementation has some specifics: \-0d.1 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90) + +.. [5] Nested translation is an IOMMU feature which supports two stage + address translations. This improves the address translation efficiency + in IOMMU virtualization. + +.. [6] PASID stands for Process Address Space ID, introduced by PCI + Express. It is a prerequisite for Shared Virtual Addressing (SVA) + and Scalable I/O Virtualization (Scalable IOV).