From patchwork Tue Jul 18 13:55:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13317272 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 677E9EB64DA for ; Tue, 18 Jul 2023 13:56:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232994AbjGRN43 (ORCPT ); Tue, 18 Jul 2023 09:56:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53748 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232999AbjGRN4W (ORCPT ); Tue, 18 Jul 2023 09:56:22 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 825E119A1; Tue, 18 Jul 2023 06:56:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689688566; x=1721224566; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aOMvzl84305gOZ+h1Dfu8gnbceCEOq/1RbMTWdPfv4A=; b=dJBo5lxfd0VEl9mRAABpRrJpmB653btx7ORG9NWJbZ2g4GfEK5b7j7XZ 8Ar2bG0kGQPZPsyMOY5i4GiMg048yQyBfH3dOc8jQm+vmLAmCwswBRtiF aaRAIBaDKvsJPNqQGgS4K7WOk35g/PDuNJCOCnvkwnHulFvExvCZDaA5J I4s7SkqmzDHd4gK+wj/WAjCOJ5Hv3rbaUJ8RDt4vRCkGzItH5yMTz+0Gb JZuAnxQaLy6Q90sZeLhDmQIn4AFTmYW7KUK9az+2hrnRJN1I/mZTOcfQb it0+G42R/Fzwhahi6DdPhwOtPh54kF6e0s+W3l2bbDQvVQ2iF+3kXfk84 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="452590623" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="452590623" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2023 06:56:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10775"; a="970250978" X-IronPort-AV: E=Sophos;i="6.01,214,1684825200"; d="scan'208";a="970250978" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2023 06:55:57 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org, xudong.hao@intel.com, yan.y.zhao@intel.com, terrence.xu@intel.com, yanting.jiang@intel.com, zhenzhong.duan@intel.com, clegoate@redhat.com Subject: [PATCH v15 07/26] vfio: Block device access via device fd until device is opened Date: Tue, 18 Jul 2023 06:55:32 -0700 Message-Id: <20230718135551.6592-8-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230718135551.6592-1-yi.l.liu@intel.com> References: <20230718135551.6592-1-yi.l.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Allow the vfio_device file to be in a state where the device FD is opened but the device cannot be used by userspace (i.e. its .open_device() hasn't been called). This inbetween state is not used when the device FD is spawned from the group FD, however when we create the device FD directly by opening a cdev it will be opened in the blocked state. The reason for the inbetween state is that userspace only gets a FD but doesn't gain access permission until binding the FD to an iommufd. So in the blocked state, only the bind operation is allowed. Completing bind will allow user to further access the device. This is implemented by adding a flag in struct vfio_device_file to mark the blocked state and using a simple smp_load_acquire() to obtain the flag value and serialize all the device setup with the thread accessing this device. Following this lockless scheme, it can safely handle the device FD unbound->bound but it cannot handle bound->unbound. To allow this we'd need to add a lock on all the vfio ioctls which seems costly. So once device FD is bound, it remains bound until the FD is closed. Suggested-by: Jason Gunthorpe Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Reviewed-by: Eric Auger Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Tested-by: Shameer Kolothum Tested-by: Zhenzhong Duan Signed-off-by: Yi Liu --- drivers/vfio/group.c | 11 ++++++++++- drivers/vfio/vfio.h | 1 + drivers/vfio/vfio_main.c | 16 ++++++++++++++++ 3 files changed, 27 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index caf53716ddb2..088dd34c8931 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -194,9 +194,18 @@ static int vfio_df_group_open(struct vfio_device_file *df) df->iommufd = device->group->iommufd; ret = vfio_df_open(df); - if (ret) + if (ret) { df->iommufd = NULL; + goto out_put_kvm; + } + + /* + * Paired with smp_load_acquire() in vfio_device_fops::ioctl/ + * read/write/mmap and vfio_file_has_device_access() + */ + smp_store_release(&df->access_granted, true); +out_put_kvm: if (device->open_count == 0) vfio_device_put_kvm(device); diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 2094f5a4ef04..4478a1e77a5e 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -19,6 +19,7 @@ struct vfio_container; struct vfio_device_file { struct vfio_device *device; + u8 access_granted; spinlock_t kvm_ref_lock; /* protect kvm field */ struct kvm *kvm; struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */ diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 825b1eeaebe2..c37fc14599d0 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1129,6 +1129,10 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, struct vfio_device *device = df->device; int ret; + /* Paired with smp_store_release() following vfio_df_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + ret = vfio_device_pm_runtime_get(device); if (ret) return ret; @@ -1156,6 +1160,10 @@ static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf, struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + /* Paired with smp_store_release() following vfio_df_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + if (unlikely(!device->ops->read)) return -EINVAL; @@ -1169,6 +1177,10 @@ static ssize_t vfio_device_fops_write(struct file *filep, struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + /* Paired with smp_store_release() following vfio_df_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + if (unlikely(!device->ops->write)) return -EINVAL; @@ -1180,6 +1192,10 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma) struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + /* Paired with smp_store_release() following vfio_df_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + if (unlikely(!device->ops->mmap)) return -EINVAL;