From patchwork Fri Apr 12 08:21:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 13627151
Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AD41E4F200 for ; Fri, 12 Apr 2024 08:21:24 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.9
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712910087; cv=none; b=OdSLbZQrNTaWIqVROcIlQ/vaC/3g1+6l3s0fdd6GwkgWBFltrvGNPcY089zD4qDiI//wsHedzJiDWv6P3x5y9jWsHwrCfz0i2YyE92sYjfcKpqea7dSG9KDaphUqHaPNGSaN0ccR81dYr2P/4j9bnKGzJ3DcJ2fHzx9Dd1+ofSU=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712910087; c=relaxed/simple; bh=J9o3x43TbGr2JUn0TLrAoMDW7Sz3u0xuDfohbhTmsXA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=biv5gOwNXjBNNNEztNN6VPvqtoxUvMdITSty0S0OoMmNHkcmGsw1H1IQULYemcvVUpwJm80x+HssBvfideU8KWNySH132Sgr3aqaCZ23NbWxujMmxLK7SyqOmvC6tz8WH+/0yKCuZkYwz/qBbKzXmCbZ9u5xCKMlCGjIgKeEHuU=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=CUAZmbE7; arc=none smtp.client-ip=192.198.163.9
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="CUAZmbE7"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1712910085; x=1744446085; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=J9o3x43TbGr2JUn0TLrAoMDW7Sz3u0xuDfohbhTmsXA=; b=CUAZmbE7RQqMLZSJ8Sx5Is0ZOIfVtx4MxDN3XUnFx//2/7QlgEljwOrq DLG4KM1eizvObDrwks27kKP+tUdN5zdwceiK1ICUqNsxQCo+JuklA/7Tl sGgynMYV8wKuP38oKIUSdlnxcJzCImPHByBd5JX21gcD8sYW6mHXVag17 mrKXZXbPcMA9uZBD/S9tS/RLbveQfk2MMPZ539lhoH3x1dMPDVg5kt5sU XXdOEZspISuQjFZp/TvfGef6HXbO86kDq34exxlMfNDHn/NI4qMbcE+Mo mo4IcNeutS1GbJQN2cIjuQO9XVYN79YpId4Prf3yYtAA8Ez+q1IChR5dC A==;
X-CSE-ConnectionGUID: cI0mlL3IRyePlMYsO196QQ==
X-CSE-MsgGUID: HdPPVtuiR62tZmkc3PpxGw==
X-IronPort-AV: E=McAfee;i="6600,9927,11041"; a="19069400"
X-IronPort-AV: E=Sophos;i="6.07,195,1708416000"; d="scan'208";a="19069400"
Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2024 01:21:23 -0700
X-CSE-ConnectionGUID: wqJn8sOjTB23aFVZoWCV1w==
X-CSE-MsgGUID: 4TYuzYl+SUqu4LzzqjBcxQ==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.07,195,1708416000"; d="scan'208";a="25836249"
Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orviesa003.jf.intel.com with ESMTP; 12 Apr 2024 01:21:23 -0700
From: Yi Liu
To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com
Cc: joro@8bytes.org, robin.murphy@arm.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, iommu@lists.linux.dev, baolu.lu@linux.intel.com, zhenzhong.duan@intel.com, jacob.jun.pan@intel.com, Matthew Wilcox
Subject: [PATCH v2 1/4] ida: Add ida_get_lowest()
Date: Fri, 12 Apr 2024 01:21:18 -0700
Message-Id: <20240412082121.33382-2-yi.l.liu@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240412082121.33382-1-yi.l.liu@intel.com>
References: <20240412082121.33382-1-yi.l.liu@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

There are no helpers for users to check whether a given ID is allocated,
nor a helper to loop over all the
allocated IDs in an IDA and clean them up. A helper that returns the
lowest allocated ID in a range covers both needs.

A caller can check whether a given ID is allocated:

	int id = 200, rc;

	rc = ida_get_lowest(&ida, id, id);
	if (rc == id) {
		/* ID 200 is in use */
	} else {
		/* ID 200 is not in use */
	}

A caller can iterate over, and free, all allocated IDs:

	int id = 0;

	while (!ida_is_empty(&pasid_ida)) {
		id = ida_get_lowest(&pasid_ida, id, INT_MAX);
		if (id < 0)
			break;
		/* do whatever is needed with the allocated ID */
		ida_free(&pasid_ida, id);
	}

Cc: Matthew Wilcox (Oracle)
Suggested-by: Jason Gunthorpe
Signed-off-by: Yi Liu
---
 include/linux/idr.h |  1 +
 lib/idr.c           | 67 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 68 insertions(+)

diff --git a/include/linux/idr.h b/include/linux/idr.h
index da5f5fa4a3a6..1dae71d4a75d 100644
--- a/include/linux/idr.h
+++ b/include/linux/idr.h
@@ -257,6 +257,7 @@ struct ida {
 int ida_alloc_range(struct ida *, unsigned int min, unsigned int max, gfp_t);
 void ida_free(struct ida *, unsigned int id);
 void ida_destroy(struct ida *ida);
+int ida_get_lowest(struct ida *ida, unsigned int min, unsigned int max);
 
 /**
  * ida_alloc() - Allocate an unused ID.
diff --git a/lib/idr.c b/lib/idr.c
index da36054c3ca0..03e461242fe2 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -476,6 +476,73 @@ int ida_alloc_range(struct ida *ida, unsigned int min, unsigned int max,
 }
 EXPORT_SYMBOL(ida_alloc_range);
 
+/**
+ * ida_get_lowest - Get the lowest used ID.
+ * @ida: IDA handle.
+ * @min: Lowest ID to get.
+ * @max: Highest ID to get.
+ *
+ * Get the lowest used ID between @min and @max, inclusive. The returned
+ * ID will not exceed %INT_MAX, even if @max is larger.
+ *
+ * Context: Any context. Takes and releases the xa_lock.
+ * Return: The lowest used ID, or a negative errno if no used ID is found.
+ */
+int ida_get_lowest(struct ida *ida, unsigned int min, unsigned int max)
+{
+	unsigned long index = min / IDA_BITMAP_BITS;
+	unsigned int offset = min % IDA_BITMAP_BITS;
+	unsigned long *addr, size, bit;
+	unsigned long tmp = 0;	/* must outlive the xa_is_value() block */
+	unsigned long flags;
+	void *entry;
+	int ret;
+
+	if (min >= INT_MAX)
+		return -EINVAL;
+	if (max >= INT_MAX)
+		max = INT_MAX;
+
+	xa_lock_irqsave(&ida->xa, flags);
+
+	entry = xa_find(&ida->xa, &index, max / IDA_BITMAP_BITS, XA_PRESENT);
+	if (!entry) {
+		ret = -ENOTTY;
+		goto err_unlock;
+	}
+
+	if (index > min / IDA_BITMAP_BITS)
+		offset = 0;
+	if (index * IDA_BITMAP_BITS + offset > max) {
+		ret = -ENOTTY;
+		goto err_unlock;
+	}
+
+	if (xa_is_value(entry)) {
+		tmp = xa_to_value(entry);
+		addr = &tmp;
+		size = BITS_PER_XA_VALUE;
+	} else {
+		addr = ((struct ida_bitmap *)entry)->bitmap;
+		size = IDA_BITMAP_BITS;
+	}
+
+	bit = find_next_bit(addr, size, offset);
+
+	xa_unlock_irqrestore(&ida->xa, flags);
+
+	if (bit == size ||
+	    index * IDA_BITMAP_BITS + bit > max)
+		return -ENOTTY;
+
+	return index * IDA_BITMAP_BITS + bit;
+
+err_unlock:
+	xa_unlock_irqrestore(&ida->xa, flags);
+	return ret;
+}
+EXPORT_SYMBOL(ida_get_lowest);
+
 /**
  * ida_free() - Release an allocated ID.
  * @ida: IDA handle.
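
For anyone who wants to sanity-check the intended semantics outside the
kernel, the following is a minimal userspace sketch, not the kernel
implementation. It assumes a flat 1024-bit bitmap in place of the
XArray-backed IDA, and every name in it (struct model_ida, model_set(),
model_clear(), model_get_lowest(), model_drain()) is hypothetical. It models
only "lowest allocated ID in [min, max]" and the drain loop from the commit
message, and it returns -1 instead of an errno.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical stand-in for an IDA: a flat bitmap covering 1024 IDs. */
#define MODEL_IDS	1024u
#define MODEL_WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)

struct model_ida {
	unsigned long bits[MODEL_IDS / (sizeof(unsigned long) * CHAR_BIT)];
};

/* Mark @id as allocated. */
void model_set(struct model_ida *m, unsigned int id)
{
	assert(id < MODEL_IDS);
	m->bits[id / MODEL_WORD_BITS] |= 1UL << (id % MODEL_WORD_BITS);
}

/* Mark @id as free. */
void model_clear(struct model_ida *m, unsigned int id)
{
	assert(id < MODEL_IDS);
	m->bits[id / MODEL_WORD_BITS] &= ~(1UL << (id % MODEL_WORD_BITS));
}

/* Lowest allocated ID in [min, max] inclusive, or -1 if there is none. */
int model_get_lowest(const struct model_ida *m, unsigned int min,
		     unsigned int max)
{
	unsigned int id;

	/* Clamp @max to the model's capacity, like the INT_MAX clamp. */
	if (max >= MODEL_IDS)
		max = MODEL_IDS - 1;

	for (id = min; id <= max; id++)
		if (m->bits[id / MODEL_WORD_BITS] &
		    (1UL << (id % MODEL_WORD_BITS)))
			return (int)id;

	return -1;
}

/*
 * Mirror of the cleanup loop: take the lowest allocated ID, free it, and
 * resume the search from that ID. Returns the number of IDs freed.
 */
int model_drain(struct model_ida *m)
{
	int id = 0, freed = 0;

	while ((id = model_get_lowest(m, (unsigned int)id,
				      MODEL_IDS - 1)) >= 0) {
		model_clear(m, (unsigned int)id);
		freed++;
	}

	return freed;
}
```

Passing the same value as min and max turns the lookup into an "is this ID
allocated" test, which is how the attach and detach paths in patch 2 use the
helper.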
From patchwork Fri Apr 12 08:21:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 13627152
From: Yi Liu
To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com
Cc: joro@8bytes.org, robin.murphy@arm.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, iommu@lists.linux.dev, baolu.lu@linux.intel.com, zhenzhong.duan@intel.com, jacob.jun.pan@intel.com, Matthew Wilcox
Subject: [PATCH v2 2/4] vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices
Date: Fri, 12 Apr 2024 01:21:19 -0700
Message-Id: <20240412082121.33382-3-yi.l.liu@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240412082121.33382-1-yi.l.liu@intel.com>
References: <20240412082121.33382-1-yi.l.liu@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

This adds pasid_[at|de]tach_ioas ops for attaching a hwpt to a pasid of a
device and the helpers for it. For now, only vfio-pci supports pasid
attach/detach.

Cc: Matthew Wilcox (Oracle)
Signed-off-by: Kevin Tian
Signed-off-by: Yi Liu
---
 drivers/vfio/iommufd.c      | 60 +++++++++++++++++++++++++++++++++++++
 drivers/vfio/pci/vfio_pci.c |  2 ++
 include/linux/vfio.h        | 11 +++++++
 3 files changed, 73 insertions(+)

diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index 82eba6966fa5..fc533416c75d 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -119,14 +119,26 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev,
 	if (IS_ERR(idev))
 		return PTR_ERR(idev);
 	vdev->iommufd_device = idev;
+	ida_init(&vdev->pasids);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_physical_bind);
 
 void vfio_iommufd_physical_unbind(struct vfio_device *vdev)
 {
+	int pasid = 0;
+
 	lockdep_assert_held(&vdev->dev_set->lock);
 
+	while (!ida_is_empty(&vdev->pasids)) {
+		pasid = ida_get_lowest(&vdev->pasids, pasid, INT_MAX);
+		if (pasid < 0)
+			break;
+
+		iommufd_device_pasid_detach(vdev->iommufd_device, pasid);
+		ida_free(&vdev->pasids, pasid);
+	}
+
 	if (vdev->iommufd_attached) {
 		iommufd_device_detach(vdev->iommufd_device);
 		vdev->iommufd_attached = false;
@@ -168,6 +180,54 @@ void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev)
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_physical_detach_ioas);
 
+int vfio_iommufd_physical_pasid_attach_ioas(struct vfio_device *vdev,
+					    u32 pasid, u32 *pt_id)
+{
+	int rc;
+
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (WARN_ON(!vdev->iommufd_device))
+		return -EINVAL;
+
+	rc = ida_get_lowest(&vdev->pasids, pasid, pasid);
+	if (rc == pasid)
+		return iommufd_device_pasid_replace(vdev->iommufd_device,
+						    pasid, pt_id);
+
+	rc = iommufd_device_pasid_attach(vdev->iommufd_device, pasid, pt_id);
+	if (rc)
+		return rc;
+
+	rc = ida_alloc_range(&vdev->pasids, pasid, pasid, GFP_KERNEL);
+	if (rc < 0) {
+		iommufd_device_pasid_detach(vdev->iommufd_device, pasid);
+		return rc;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_pasid_attach_ioas);
+
+void vfio_iommufd_physical_pasid_detach_ioas(struct vfio_device *vdev,
+					     u32 pasid)
+{
+	int rc;
+
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (WARN_ON(!vdev->iommufd_device))
+		return;
+
+	rc = ida_get_lowest(&vdev->pasids, pasid, pasid);
+	if (rc < 0)
+		return;
+
+	iommufd_device_pasid_detach(vdev->iommufd_device, pasid);
+	ida_free(&vdev->pasids, pasid);
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_pasid_detach_ioas);
+
 /*
  * The emulated standard ops mean that vfio_device is going to use the
  * "mdev path" and will call vfio_pin_pages()/vfio_dma_rw(). Drivers using this
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index cb5b7f865d58..e0198851ffd2 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -142,6 +142,8 @@ static const struct vfio_device_ops vfio_pci_ops = {
 	.unbind_iommufd	= vfio_iommufd_physical_unbind,
 	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
 	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
+	.pasid_attach_ioas	= vfio_iommufd_physical_pasid_attach_ioas,
+	.pasid_detach_ioas	= vfio_iommufd_physical_pasid_detach_ioas,
 };
 
 static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 8b1a29820409..8fd1db173e84 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -66,6 +66,7 @@ struct vfio_device {
 	void (*put_kvm)(struct kvm *kvm);
 #if IS_ENABLED(CONFIG_IOMMUFD)
 	struct iommufd_device *iommufd_device;
+	struct ida pasids;
 	u8 iommufd_attached:1;
 #endif
 	u8 cdev_opened:1;
@@ -90,6 +91,8 @@ struct vfio_device {
  *		 bound iommufd. Undo in unbind_iommufd if @detach_ioas is not
  *		 called.
  * @detach_ioas: Opposite of attach_ioas
+ * @pasid_attach_ioas: The pasid variation of attach_ioas
+ * @pasid_detach_ioas: Opposite of pasid_attach_ioas
  * @open_device: Called when the first file descriptor is opened for this device
  * @close_device: Opposite of open_device
  * @read: Perform read(2) on device file descriptor
@@ -114,6 +117,8 @@ struct vfio_device_ops {
 	void	(*unbind_iommufd)(struct vfio_device *vdev);
 	int	(*attach_ioas)(struct vfio_device *vdev, u32 *pt_id);
 	void	(*detach_ioas)(struct vfio_device *vdev);
+	int	(*pasid_attach_ioas)(struct vfio_device *vdev, u32 pasid, u32 *pt_id);
+	void	(*pasid_detach_ioas)(struct vfio_device *vdev, u32 pasid);
 	int	(*open_device)(struct vfio_device *vdev);
 	void	(*close_device)(struct vfio_device *vdev);
 	ssize_t	(*read)(struct vfio_device *vdev, char __user *buf,
@@ -138,6 +143,8 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev,
 void vfio_iommufd_physical_unbind(struct vfio_device *vdev);
 int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
 void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev);
+int vfio_iommufd_physical_pasid_attach_ioas(struct vfio_device *vdev, u32 pasid, u32 *pt_id);
+void vfio_iommufd_physical_pasid_detach_ioas(struct vfio_device *vdev, u32 pasid);
 int vfio_iommufd_emulated_bind(struct vfio_device *vdev,
 			       struct iommufd_ctx *ictx, u32 *out_device_id);
 void vfio_iommufd_emulated_unbind(struct vfio_device *vdev);
@@ -165,6 +172,10 @@ vfio_iommufd_get_dev_id(struct vfio_device *vdev, struct iommufd_ctx *ictx)
 	((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL)
 #define vfio_iommufd_physical_detach_ioas \
 	((void (*)(struct vfio_device *vdev)) NULL)
+#define vfio_iommufd_physical_pasid_attach_ioas \
+	((int (*)(struct vfio_device *vdev, u32 pasid, u32 *pt_id)) NULL)
+#define vfio_iommufd_physical_pasid_detach_ioas \
+	((void (*)(struct vfio_device *vdev, u32 pasid)) NULL)
 #define vfio_iommufd_emulated_bind \
 	((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx, \
 		  u32 *out_device_id)) NULL)

From patchwork Fri Apr 12 08:21:20 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 13627153
From: Yi Liu
To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com
Cc: joro@8bytes.org, robin.murphy@arm.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, iommu@lists.linux.dev, baolu.lu@linux.intel.com, zhenzhong.duan@intel.com, jacob.jun.pan@intel.com
Subject: [PATCH v2 3/4] vfio: Add VFIO_DEVICE_PASID_[AT|DE]TACH_IOMMUFD_PT
Date: Fri, 12 Apr 2024 01:21:20 -0700
Message-Id: <20240412082121.33382-4-yi.l.liu@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240412082121.33382-1-yi.l.liu@intel.com>
References: <20240412082121.33382-1-yi.l.liu@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

This adds ioctls for the userspace to attach/detach a
given pasid of a vfio device to/from an IOAS/HWPT.

Signed-off-by: Yi Liu
Reviewed-by: Jason Gunthorpe
---
 drivers/vfio/device_cdev.c | 51 +++++++++++++++++++++++++++++++++++
 drivers/vfio/vfio.h        |  4 +++
 drivers/vfio/vfio_main.c   |  8 ++++++
 include/uapi/linux/vfio.h  | 55 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 118 insertions(+)

diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
index e75da0a70d1f..5326f1608ace 100644
--- a/drivers/vfio/device_cdev.c
+++ b/drivers/vfio/device_cdev.c
@@ -210,6 +210,57 @@ int vfio_df_ioctl_detach_pt(struct vfio_device_file *df,
 	return 0;
 }
 
+int vfio_df_ioctl_pasid_attach_pt(struct vfio_device_file *df,
+				  struct vfio_device_pasid_attach_iommufd_pt __user *arg)
+{
+	struct vfio_device_pasid_attach_iommufd_pt attach;
+	struct vfio_device *device = df->device;
+	unsigned long minsz;
+	int ret;
+
+	minsz = offsetofend(struct vfio_device_pasid_attach_iommufd_pt, pt_id);
+
+	if (copy_from_user(&attach, arg, minsz))
+		return -EFAULT;
+
+	if (attach.argsz < minsz || attach.flags)
+		return -EINVAL;
+
+	if (!device->ops->pasid_attach_ioas)
+		return -EOPNOTSUPP;
+
+	mutex_lock(&device->dev_set->lock);
+	ret = device->ops->pasid_attach_ioas(device, attach.pasid, &attach.pt_id);
+	mutex_unlock(&device->dev_set->lock);
+
+	return ret;
+}
+
+int vfio_df_ioctl_pasid_detach_pt(struct vfio_device_file *df,
+				  struct vfio_device_pasid_detach_iommufd_pt __user *arg)
+{
+	struct vfio_device_pasid_detach_iommufd_pt detach;
+	struct vfio_device *device = df->device;
+	unsigned long minsz;
+
+	minsz = offsetofend(struct vfio_device_pasid_detach_iommufd_pt, pasid);
+
+	if (copy_from_user(&detach, arg, minsz))
+		return -EFAULT;
+
+	if (detach.argsz < minsz || detach.flags)
+		return -EINVAL;
+
+	if (!device->ops->pasid_detach_ioas)
+		return -EOPNOTSUPP;
+
+	mutex_lock(&device->dev_set->lock);
+	device->ops->pasid_detach_ioas(device, detach.pasid);
+	mutex_unlock(&device->dev_set->lock);
+
+	return 0;
+}
+
 static char *vfio_device_devnode(const struct device *dev, umode_t *mode)
 {
 	return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev));
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 50128da18bca..20d3cb283ba0 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -353,6 +353,10 @@ int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
 			    struct vfio_device_attach_iommufd_pt __user *arg);
 int vfio_df_ioctl_detach_pt(struct vfio_device_file *df,
 			    struct vfio_device_detach_iommufd_pt __user *arg);
+int vfio_df_ioctl_pasid_attach_pt(struct vfio_device_file *df,
+				  struct vfio_device_pasid_attach_iommufd_pt __user *arg);
+int vfio_df_ioctl_pasid_detach_pt(struct vfio_device_file *df,
+				  struct vfio_device_pasid_detach_iommufd_pt __user *arg);
 
 #if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)
 void vfio_init_device_cdev(struct vfio_device *device);
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index e97d796a54fb..a5cece9fff5e 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -1242,6 +1242,14 @@ static long vfio_device_fops_unl_ioctl(struct file *filep,
 		case VFIO_DEVICE_DETACH_IOMMUFD_PT:
 			ret = vfio_df_ioctl_detach_pt(df, uptr);
 			goto out;
+
+		case VFIO_DEVICE_PASID_ATTACH_IOMMUFD_PT:
+			ret = vfio_df_ioctl_pasid_attach_pt(df, uptr);
+			goto out;
+
+		case VFIO_DEVICE_PASID_DETACH_IOMMUFD_PT:
+			ret = vfio_df_ioctl_pasid_detach_pt(df, uptr);
+			goto out;
 		}
 	}
 
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 2b68e6cdf190..9591dc24b75c 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -977,6 +977,61 @@ struct vfio_device_detach_iommufd_pt {
 
 #define VFIO_DEVICE_DETACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 20)
 
+/*
+ * VFIO_DEVICE_PASID_ATTACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 21,
+ *					      struct vfio_device_pasid_attach_iommufd_pt)
+ * @argsz:	User filled size of this data.
+ * @flags:	Must be 0.
+ * @pasid:	The pasid to be attached.
+ * @pt_id:	Input the target id which can represent an ioas or a hwpt
+ *		allocated via iommufd subsystem.
+ *		Output the input ioas id or the attached hwpt id which could
+ *		be the specified hwpt itself or a hwpt automatically created
+ *		for the specified ioas by kernel during the attachment.
+ *
+ * Associate a pasid (of a cdev device) with an address space within the
+ * bound iommufd. Undo by VFIO_DEVICE_PASID_DETACH_IOMMUFD_PT or device fd
+ * close. This is only allowed on cdev fds.
+ *
+ * If a pasid is currently attached to a valid hw_pagetable (hwpt), without
+ * doing a VFIO_DEVICE_PASID_DETACH_IOMMUFD_PT, a second
+ * VFIO_DEVICE_PASID_ATTACH_IOMMUFD_PT ioctl passing in another hwpt id is
+ * allowed. This action, also known as a hwpt replacement, will replace the
+ * pasid's currently attached hwpt with a new hwpt corresponding to the given
+ * @pt_id.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+struct vfio_device_pasid_attach_iommufd_pt {
+	__u32	argsz;
+	__u32	flags;
+	__u32	pasid;
+	__u32	pt_id;
+};
+
+#define VFIO_DEVICE_PASID_ATTACH_IOMMUFD_PT	_IO(VFIO_TYPE, VFIO_BASE + 21)
+
+/*
+ * VFIO_DEVICE_PASID_DETACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 22,
+ *					      struct vfio_device_pasid_detach_iommufd_pt)
+ * @argsz:	User filled size of this data.
+ * @flags:	Must be 0.
+ * @pasid:	The pasid to be detached.
+ *
+ * Remove the association of a pasid (of a cdev device) and its current
+ * associated address space. After it, the pasid of the device should be
+ * in a blocking DMA state. This is only allowed on cdev fds.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+struct vfio_device_pasid_detach_iommufd_pt {
+	__u32	argsz;
+	__u32	flags;
+	__u32	pasid;
+};
+
+#define VFIO_DEVICE_PASID_DETACH_IOMMUFD_PT	_IO(VFIO_TYPE, VFIO_BASE + 22)
+
 /*
  * Provide support for setting a PCI VF Token, which is used as a shared
  * secret between PF and VF drivers. This feature may only be set on a

From patchwork Fri Apr 12 08:21:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 13627154
From: Yi Liu
To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com
Cc: joro@8bytes.org, robin.murphy@arm.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, iommu@lists.linux.dev, baolu.lu@linux.intel.com, zhenzhong.duan@intel.com, jacob.jun.pan@intel.com
Subject: [PATCH v2 4/4] vfio: Report PASID capability via VFIO_DEVICE_FEATURE ioctl
Date: Fri, 12 Apr 2024 01:21:21 -0700
Message-Id: <20240412082121.33382-5-yi.l.liu@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240412082121.33382-1-yi.l.liu@intel.com>
References: <20240412082121.33382-1-yi.l.liu@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

Today, vfio-pci hides the PASID capability
of devices from userspace. Unlike other PCI capabilities, which userspace
discovers by reading the device's PCI configuration space, the PASID
capability is reported through VFIO_DEVICE_FEATURE, and userspace probes
it with that ioctl.

There are two reasons for this.

 - First, userspace like QEMU by default exposes all the available PCI
   capabilities in vfio-pci config space to the guest as read-only, so
   adding the PASID capability to the vfio-pci config space would expose
   it to the guest automatically, even with an old QEMU that does not
   support it.

 - Second, the PASID capability does not exist on VFs (a VF instead shares
   the capability of its PF). Creating a virtual PASID capability in
   vfio-pci config space requires finding a hole to place it in, and doing
   so may require device-specific knowledge to avoid conflicts with
   device-specific registers such as hidden bits in the VF's config space.
   It's simpler to move this burden to the VMM instead of maintaining a
   quirk system in the kernel.

Suggested-by: Alex Williamson
Signed-off-by: Yi Liu
---
 drivers/vfio/pci/vfio_pci_core.c | 50 ++++++++++++++++++++++++++++++++
 include/uapi/linux/vfio.h        | 14 +++++++++
 2 files changed, 64 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index d94d61b92c1a..ca64e461d435 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1495,6 +1495,54 @@ static int vfio_pci_core_feature_token(struct vfio_device *device, u32 flags,
 	return 0;
 }
 
+static int vfio_pci_core_feature_pasid(struct vfio_device *device, u32 flags,
+				       struct vfio_device_feature_pasid __user *arg,
+				       size_t argsz)
+{
+	struct vfio_pci_core_device *vdev =
+		container_of(device, struct vfio_pci_core_device, vdev);
+	struct vfio_device_feature_pasid pasid = { 0 };
+	struct pci_dev *pdev = vdev->pdev;
+	u32 capabilities = 0;
+	u16 ctrl = 0;
+	int ret;
+
+	/*
+	 * Due to no PASID capability per VF, to be consistent, we do not
+	 * support SET of the PASID capability for both PF and VF.
+	 */
+	ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_GET,
+				 sizeof(pasid));
+	if (ret != 1)
+		return ret;
+
+	/* VF shares the PASID capability of its PF */
+	if (pdev->is_virtfn)
+		pdev = pci_physfn(pdev);
+
+	if (!pdev->pasid_enabled)
+		goto out;
+
+#ifdef CONFIG_PCI_PASID
+	pci_read_config_dword(pdev, pdev->pasid_cap + PCI_PASID_CAP,
+			      &capabilities);
+	pci_read_config_word(pdev, pdev->pasid_cap + PCI_PASID_CTRL,
+			     &ctrl);
+#endif
+
+	pasid.width = (capabilities >> 8) & 0x1f;
+
+	if (ctrl & PCI_PASID_CTRL_EXEC)
+		pasid.capabilities |= VFIO_DEVICE_PASID_CAP_EXEC;
+	if (ctrl & PCI_PASID_CTRL_PRIV)
+		pasid.capabilities |= VFIO_DEVICE_PASID_CAP_PRIV;
+
+out:
+	if (copy_to_user(arg, &pasid, sizeof(pasid)))
+		return -EFAULT;
+	return 0;
+}
+
 int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
 				void __user *arg, size_t argsz)
 {
@@ -1508,6 +1556,8 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
 		return vfio_pci_core_pm_exit(device, flags, arg, argsz);
 	case VFIO_DEVICE_FEATURE_PCI_VF_TOKEN:
 		return vfio_pci_core_feature_token(device, flags, arg, argsz);
+	case VFIO_DEVICE_FEATURE_PASID:
+		return vfio_pci_core_feature_pasid(device, flags, arg, argsz);
 	default:
 		return -ENOTTY;
 	}
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 9591dc24b75c..e50e55c67ab4 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1513,6 +1513,20 @@ struct vfio_device_feature_bus_master {
 };
 #define VFIO_DEVICE_FEATURE_BUS_MASTER 10
 
+/**
+ * Upon VFIO_DEVICE_FEATURE_GET, return the PASID capability for the device.
+ * Zero width means no support for PASID.
+ */
+struct vfio_device_feature_pasid {
+	__u16 capabilities;
+#define VFIO_DEVICE_PASID_CAP_EXEC	(1 << 0)
+#define VFIO_DEVICE_PASID_CAP_PRIV	(1 << 1)
+	__u8 width;
+	__u8 __reserved;
+};
+
+#define VFIO_DEVICE_FEATURE_PASID 11
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**
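
To make the register-to-uapi mapping in vfio_pci_core_feature_pasid() easier
to review, here is a small userspace sketch of the same decoding. It is an
illustration only: struct pasid_feature, pasid_feature_from_regs() and
pasid_count() are hypothetical names, the struct is a local copy assumed to
mirror struct vfio_device_feature_pasid from this patch, and the
REG_PASID_CTRL_* values are assumed to match PCI_PASID_CTRL_EXEC and
PCI_PASID_CTRL_PRIV in pci_regs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy, assumed to mirror struct vfio_device_feature_pasid. */
struct pasid_feature {
	uint16_t capabilities;
	uint8_t width;
	uint8_t reserved;
};

#define PASID_CAP_EXEC		(1u << 0)
#define PASID_CAP_PRIV		(1u << 1)

/* Assumed to match PCI_PASID_CTRL_EXEC / PCI_PASID_CTRL_PRIV. */
#define REG_PASID_CTRL_EXEC	0x0002u
#define REG_PASID_CTRL_PRIV	0x0004u

/*
 * Translate raw PASID capability and control register values into the
 * reported structure: width comes from bits 12:8 of the capability
 * register, the flags come from the control register.
 */
struct pasid_feature pasid_feature_from_regs(uint16_t cap, uint16_t ctrl)
{
	struct pasid_feature f = { 0, 0, 0 };

	f.width = (uint8_t)((cap >> 8) & 0x1f);

	if (ctrl & REG_PASID_CTRL_EXEC)
		f.capabilities |= PASID_CAP_EXEC;
	if (ctrl & REG_PASID_CTRL_PRIV)
		f.capabilities |= PASID_CAP_PRIV;

	return f;
}

/*
 * Number of PASIDs a consumer may use. Following the uapi comment, a zero
 * width is treated as "PASID not supported".
 */
uint32_t pasid_count(const struct pasid_feature *f)
{
	assert(f != 0);
	return f->width ? (1u << f->width) : 0;
}
```

A VMM would obtain the structure with VFIO_DEVICE_FEATURE_GET and then use
the width and flags to build the virtual PASID capability it exposes to the
guest.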