From patchwork Thu Dec 19 13:35:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13915099 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C568679F2 for ; Thu, 19 Dec 2024 13:35:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734615340; cv=none; b=G5R6l0zRCwAvaw47p2/IIlfACbVRgnawLs5vnNFUj2gbQ3MNBZxh8kxmvpvjgYMqpYyLlTYXD2RkZitBrN0MV9fdofH49PNOGYxg+O3rCdaprxVja9We6jV7sDgFWB3SyA2wsJUY4kPULXSulQN0uDYdEVfXez9rEaIg+G1gGtU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734615340; c=relaxed/simple; bh=DPzSxv9vMP7mIpXdsVlzuxzRhKS8p8nA3AZc3wGiDlM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=cGDac4l/iH5yZbHUq9l7DTv25IoMHJBDQxW79+/yAYzckMsY0UpAm6ii4ozBQYxlFMWXLclVjxjdIa5LKEQjSD94SfOHCC/EQ782gN5vKJn7rP+kvWCxwhitYf/qO9UHgycylgxEZ3z60IZxyLSptmwy0iSQ8ancFVMkoDRGWhI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=G/XiPXPE; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="G/XiPXPE" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1734615338; x=1766151338; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DPzSxv9vMP7mIpXdsVlzuxzRhKS8p8nA3AZc3wGiDlM=; b=G/XiPXPEfk25zZCaqv8+vfIzqlUldongzX4bAfapptmjq3EfHcVquWc7 kdLKItndXm+nSRV8qgJXqTDlw8ZsSZ5s/CYrNGMnCsMX7htphq1K4hTFv 6eat+A9/bBsh8T/ufA0AmpPwHzWEIl3XfOFh7mqA6RCNFn0xl0ric5dVS /KUFmLNMQAW5Rg45tygvuWudWXD60hjy5AM8aDfN2r0nN3MuIo7Akc4+w KasJhzfGXQPWvcbHk+iZvzaclw6m6/vDQdF0ee2XQtiyGpe8CKYyRABAb VSnXCCEboP1wrcoWJzfWMicuKUa2+DQ5taZTbh7y0ivyjvBBBJJo4jMIJ g==; X-CSE-ConnectionGUID: g0ZndZuySomSACGa/JNLCQ== X-CSE-MsgGUID: 5Z7RXJIgSxyXHBQKWxPUMQ== X-IronPort-AV: E=McAfee;i="6700,10204,11290"; a="60504298" X-IronPort-AV: E=Sophos;i="6.12,247,1728975600"; d="scan'208";a="60504298" Received: from orviesa005.jf.intel.com ([10.64.159.145]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2024 05:35:37 -0800 X-CSE-ConnectionGUID: xH0wpkPYSbSDnEByFQUdrg== X-CSE-MsgGUID: XCImA4HBQR6Ngs0T3whgXw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="103197877" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orviesa005.jf.intel.com with ESMTP; 19 Dec 2024 05:35:37 -0800 From: Yi Liu To: alex.williamson@redhat.com, kevin.tian@intel.com Cc: jgg@nvidia.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, yi.l.liu@intel.com, chao.p.peng@linux.intel.com, zhenzhong.duan@intel.com, willy@infradead.org, zhangfei.gao@linaro.org, vasant.hegde@amd.com Subject: [PATCH v6 1/5] ida: Add ida_find_first_range() Date: Thu, 19 Dec 2024 05:35:30 -0800 Message-Id: <20241219133534.16422-2-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241219133534.16422-1-yi.l.liu@intel.com> References: <20241219133534.16422-1-yi.l.liu@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 There is no helpers for user to check if a given ID is allocated or not, neither a helper to loop all the allocated IDs in an IDA and do something for cleanup. With the two needs, a helper to get the lowest allocated ID of a range and two variants based on it. Caller can check if a given ID is allocated or not by: bool ida_exists(struct ida *ida, unsigned int id) Caller can iterate all allocated IDs by: int id; while ((id = ida_find_first(&pasid_ida)) >= 0) { //anything to do with the allocated ID ida_free(pasid_ida, pasid); } Cc: Matthew Wilcox (Oracle) Suggested-by: Jason Gunthorpe Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Acked-by: Matthew Wilcox (Oracle) Signed-off-by: Yi Liu --- include/linux/idr.h | 11 +++++++ lib/idr.c | 67 +++++++++++++++++++++++++++++++++++++++++++ lib/test_ida.c | 70 +++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 148 insertions(+) diff --git a/include/linux/idr.h b/include/linux/idr.h index da5f5fa4a3a6..718f9b1b91af 100644 --- a/include/linux/idr.h +++ b/include/linux/idr.h @@ -257,6 +257,7 @@ struct ida { int ida_alloc_range(struct ida *, unsigned int min, unsigned int max, gfp_t); void ida_free(struct ida *, unsigned int id); void ida_destroy(struct ida *ida); +int ida_find_first_range(struct ida *ida, unsigned int min, unsigned int max); /** * ida_alloc() - Allocate an unused ID. @@ -328,4 +329,14 @@ static inline bool ida_is_empty(const struct ida *ida) { return xa_empty(&ida->xa); } + +static inline bool ida_exists(struct ida *ida, unsigned int id) +{ + return ida_find_first_range(ida, id, id) == id; +} + +static inline int ida_find_first(struct ida *ida) +{ + return ida_find_first_range(ida, 0, ~0); +} #endif /* __IDR_H__ */ diff --git a/lib/idr.c b/lib/idr.c index da36054c3ca0..e2adc457abb4 100644 --- a/lib/idr.c +++ b/lib/idr.c @@ -476,6 +476,73 @@ int ida_alloc_range(struct ida *ida, unsigned int min, unsigned int max, } EXPORT_SYMBOL(ida_alloc_range); +/** + * ida_find_first_range - Get the lowest used ID. + * @ida: IDA handle. + * @min: Lowest ID to get. + * @max: Highest ID to get. + * + * Get the lowest used ID between @min and @max, inclusive. The returned + * ID will not exceed %INT_MAX, even if @max is larger. + * + * Context: Any context. Takes and releases the xa_lock. + * Return: The lowest used ID, or errno if no used ID is found. + */ +int ida_find_first_range(struct ida *ida, unsigned int min, unsigned int max) +{ + unsigned long index = min / IDA_BITMAP_BITS; + unsigned int offset = min % IDA_BITMAP_BITS; + unsigned long *addr, size, bit; + unsigned long tmp = 0; + unsigned long flags; + void *entry; + int ret; + + if ((int)min < 0) + return -EINVAL; + if ((int)max < 0) + max = INT_MAX; + + xa_lock_irqsave(&ida->xa, flags); + + entry = xa_find(&ida->xa, &index, max / IDA_BITMAP_BITS, XA_PRESENT); + if (!entry) { + ret = -ENOENT; + goto err_unlock; + } + + if (index > min / IDA_BITMAP_BITS) + offset = 0; + if (index * IDA_BITMAP_BITS + offset > max) { + ret = -ENOENT; + goto err_unlock; + } + + if (xa_is_value(entry)) { + tmp = xa_to_value(entry); + addr = &tmp; + size = BITS_PER_XA_VALUE; + } else { + addr = ((struct ida_bitmap *)entry)->bitmap; + size = IDA_BITMAP_BITS; + } + + bit = find_next_bit(addr, size, offset); + + xa_unlock_irqrestore(&ida->xa, flags); + + if (bit == size || + index * IDA_BITMAP_BITS + bit > max) + return -ENOENT; + + return index * IDA_BITMAP_BITS + bit; + +err_unlock: + xa_unlock_irqrestore(&ida->xa, flags); + return ret; +} +EXPORT_SYMBOL(ida_find_first_range); + /** * ida_free() - Release an allocated ID. * @ida: IDA handle. diff --git a/lib/test_ida.c b/lib/test_ida.c index c80155a1956d..63078f8dc13f 100644 --- a/lib/test_ida.c +++ b/lib/test_ida.c @@ -189,6 +189,75 @@ static void ida_check_bad_free(struct ida *ida) IDA_BUG_ON(ida, !ida_is_empty(ida)); } +/* + * Check ida_find_first_range() and varriants. + */ +static void ida_check_find_first(struct ida *ida) +{ + /* IDA is empty; all of the below should be not exist */ + IDA_BUG_ON(ida, ida_exists(ida, 0)); + IDA_BUG_ON(ida, ida_exists(ida, 3)); + IDA_BUG_ON(ida, ida_exists(ida, 63)); + IDA_BUG_ON(ida, ida_exists(ida, 1023)); + IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1)); + + /* IDA contains a single value entry */ + IDA_BUG_ON(ida, ida_alloc_min(ida, 3, GFP_KERNEL) != 3); + IDA_BUG_ON(ida, ida_exists(ida, 0)); + IDA_BUG_ON(ida, !ida_exists(ida, 3)); + IDA_BUG_ON(ida, ida_exists(ida, 63)); + IDA_BUG_ON(ida, ida_exists(ida, 1023)); + IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1)); + + IDA_BUG_ON(ida, ida_alloc_min(ida, 63, GFP_KERNEL) != 63); + IDA_BUG_ON(ida, ida_exists(ida, 0)); + IDA_BUG_ON(ida, !ida_exists(ida, 3)); + IDA_BUG_ON(ida, !ida_exists(ida, 63)); + IDA_BUG_ON(ida, ida_exists(ida, 1023)); + IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1)); + + /* IDA contains a single bitmap */ + IDA_BUG_ON(ida, ida_alloc_min(ida, 1023, GFP_KERNEL) != 1023); + IDA_BUG_ON(ida, ida_exists(ida, 0)); + IDA_BUG_ON(ida, !ida_exists(ida, 3)); + IDA_BUG_ON(ida, !ida_exists(ida, 63)); + IDA_BUG_ON(ida, !ida_exists(ida, 1023)); + IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1)); + + /* IDA contains a tree */ + IDA_BUG_ON(ida, ida_alloc_min(ida, (1 << 20) - 1, GFP_KERNEL) != (1 << 20) - 1); + IDA_BUG_ON(ida, ida_exists(ida, 0)); + IDA_BUG_ON(ida, !ida_exists(ida, 3)); + IDA_BUG_ON(ida, !ida_exists(ida, 63)); + IDA_BUG_ON(ida, !ida_exists(ida, 1023)); + IDA_BUG_ON(ida, !ida_exists(ida, (1 << 20) - 1)); + + /* Now try to find first */ + IDA_BUG_ON(ida, ida_find_first(ida) != 3); + IDA_BUG_ON(ida, ida_find_first_range(ida, -1, 2) != -EINVAL); + IDA_BUG_ON(ida, ida_find_first_range(ida, 0, 2) != -ENOENT); // no used ID + IDA_BUG_ON(ida, ida_find_first_range(ida, 0, 3) != 3); + IDA_BUG_ON(ida, ida_find_first_range(ida, 1, 3) != 3); + IDA_BUG_ON(ida, ida_find_first_range(ida, 3, 3) != 3); + IDA_BUG_ON(ida, ida_find_first_range(ida, 2, 4) != 3); + IDA_BUG_ON(ida, ida_find_first_range(ida, 4, 3) != -ENOENT); // min > max, fail + IDA_BUG_ON(ida, ida_find_first_range(ida, 4, 60) != -ENOENT); // no used ID + IDA_BUG_ON(ida, ida_find_first_range(ida, 4, 64) != 63); + IDA_BUG_ON(ida, ida_find_first_range(ida, 63, 63) != 63); + IDA_BUG_ON(ida, ida_find_first_range(ida, 64, 1026) != 1023); + IDA_BUG_ON(ida, ida_find_first_range(ida, 1023, 1023) != 1023); + IDA_BUG_ON(ida, ida_find_first_range(ida, 1023, (1 << 20) - 1) != 1023); + IDA_BUG_ON(ida, ida_find_first_range(ida, 1024, (1 << 20) - 1) != (1 << 20) - 1); + IDA_BUG_ON(ida, ida_find_first_range(ida, (1 << 20), INT_MAX) != -ENOENT); + + ida_free(ida, 3); + ida_free(ida, 63); + ida_free(ida, 1023); + ida_free(ida, (1 << 20) - 1); + + IDA_BUG_ON(ida, !ida_is_empty(ida)); +} + static DEFINE_IDA(ida); static int ida_checks(void) @@ -202,6 +271,7 @@ static int ida_checks(void) ida_check_max(&ida); ida_check_conv(&ida); ida_check_bad_free(&ida); + ida_check_find_first(&ida); printk("IDA: %u of %u tests passed\n", tests_passed, tests_run); return (tests_run != tests_passed) ? 0 : -EINVAL; From patchwork Thu Dec 19 13:35:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13915100 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AAD6E14A630 for ; Thu, 19 Dec 2024 13:35:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734615341; cv=none; b=EDYEQ5kElooqPRUf8n4VK4XBay4NWjIn68No+fbpVLDqPi9NY2T/yYaKzGzaczaUcxodmxX+kzs6a+La09wPZXkC/7WzNPkmOp07E9odpjllrmIVglUV2Jo6Lw8oJwMjkV1lMwSgH3ljUogGImClv35tjDy49as/nDzDdLMgE+U= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734615341; c=relaxed/simple; bh=qt49UPHhbt1V4Ig/Bw2enu2OpFRGxh0bGRDltD2C7Es=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=fZCL8vDdzp3RnGhKYnlr1XibxrhRc7rNiJYklHvOdb5lT8iIIxpIXdsHU1xfacGPGdwU/Itb7VJbSINR4J+hEPeX3XKGr964dBy3BZvgrUZTWMc7gVf805OyfNsQpYNEwyce2/ZK/j9pGGD20ZF+nKfgEXPojdbTugtc3b2HsE8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=aa5Vzdjh; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="aa5Vzdjh" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1734615339; x=1766151339; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=qt49UPHhbt1V4Ig/Bw2enu2OpFRGxh0bGRDltD2C7Es=; b=aa5VzdjhHQS1IqGgHaD3XFjcuVvp9ziKEqyOx5d9C99nMwSDrYnZlRfo NHKP9/jR+X+Ul2NNL0TGdk/nL9EOlkDOmiMBl05DE4Y5+JoGQCTK9wbj2 CM5vp2OdY+WXmUzQnQ9/e6sql9Au4afOI1vWke1i/rIezrtO7zRO7pn3t uHW+YT9zxSK/NbgDi8HKhD0oD6VLAM+nnfmfqmKoxThga4TXaXZxJHVWh RtvNiDYPKr/JHqk7qOZqeUmE0r1FfKCl8EiFgv2LBVcb/KDp0kdWS9YFd jov+2ad3y9YHO5RBkxj9uoZE/VonW1QmWHOpQleiXd1jaIm9RxFtsP7mQ g==; X-CSE-ConnectionGUID: 7CG6fWELRuK5pr3J3XkcCQ== X-CSE-MsgGUID: jefgC1wqSpyOELNecPUZWA== X-IronPort-AV: E=McAfee;i="6700,10204,11290"; a="60504304" X-IronPort-AV: E=Sophos;i="6.12,247,1728975600"; d="scan'208";a="60504304" Received: from orviesa005.jf.intel.com ([10.64.159.145]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2024 05:35:37 -0800 X-CSE-ConnectionGUID: DpfK5JgPRcWQYqX1em5q4Q== X-CSE-MsgGUID: sciTJh7PSbuwxcYduefaJQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="103197882" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orviesa005.jf.intel.com with ESMTP; 19 Dec 2024 05:35:37 -0800 From: Yi Liu To: alex.williamson@redhat.com, kevin.tian@intel.com Cc: jgg@nvidia.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, yi.l.liu@intel.com, chao.p.peng@linux.intel.com, zhenzhong.duan@intel.com, willy@infradead.org, zhangfei.gao@linaro.org, vasant.hegde@amd.com Subject: [PATCH v6 2/5] vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices Date: Thu, 19 Dec 2024 05:35:31 -0800 Message-Id: <20241219133534.16422-3-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241219133534.16422-1-yi.l.liu@intel.com> References: <20241219133534.16422-1-yi.l.liu@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This adds pasid_at|de]tach_ioas ops for attaching hwpt to pasid of a device and the helpers for it. For now, only vfio-pci supports pasid attach/detach. Signed-off-by: Kevin Tian Reviewed-by: Jason Gunthorpe Signed-off-by: Yi Liu --- drivers/vfio/iommufd.c | 50 +++++++++++++++++++++++++++++++++++++ drivers/vfio/pci/vfio_pci.c | 2 ++ include/linux/vfio.h | 11 ++++++++ 3 files changed, 63 insertions(+) diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 516294fd901b..aa8cbee09599 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -119,14 +119,22 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev, if (IS_ERR(idev)) return PTR_ERR(idev); vdev->iommufd_device = idev; + ida_init(&vdev->pasids); return 0; } EXPORT_SYMBOL_GPL(vfio_iommufd_physical_bind); void vfio_iommufd_physical_unbind(struct vfio_device *vdev) { + int pasid; + lockdep_assert_held(&vdev->dev_set->lock); + while ((pasid = ida_find_first(&vdev->pasids)) >= 0) { + iommufd_device_pasid_detach(vdev->iommufd_device, pasid); + ida_free(&vdev->pasids, pasid); + } + if (vdev->iommufd_attached) { iommufd_device_detach(vdev->iommufd_device); vdev->iommufd_attached = false; @@ -168,6 +176,48 @@ void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev) } EXPORT_SYMBOL_GPL(vfio_iommufd_physical_detach_ioas); +int vfio_iommufd_physical_pasid_attach_ioas(struct vfio_device *vdev, + u32 pasid, u32 *pt_id) +{ + int rc; + + lockdep_assert_held(&vdev->dev_set->lock); + + if (WARN_ON(!vdev->iommufd_device)) + return -EINVAL; + + if (ida_exists(&vdev->pasids, pasid)) + return iommufd_device_pasid_replace(vdev->iommufd_device, + pasid, pt_id); + + rc = ida_alloc_range(&vdev->pasids, pasid, pasid, GFP_KERNEL); + if (rc < 0) + return rc; + + rc = iommufd_device_pasid_attach(vdev->iommufd_device, pasid, pt_id); + if (rc) + ida_free(&vdev->pasids, pasid); + + return rc; +} +EXPORT_SYMBOL_GPL(vfio_iommufd_physical_pasid_attach_ioas); + +void vfio_iommufd_physical_pasid_detach_ioas(struct vfio_device *vdev, + u32 pasid) +{ + lockdep_assert_held(&vdev->dev_set->lock); + + if (WARN_ON(!vdev->iommufd_device)) + return; + + if (!ida_exists(&vdev->pasids, pasid)) + return; + + iommufd_device_pasid_detach(vdev->iommufd_device, pasid); + ida_free(&vdev->pasids, pasid); +} +EXPORT_SYMBOL_GPL(vfio_iommufd_physical_pasid_detach_ioas); + /* * The emulated standard ops mean that vfio_device is going to use the * "mdev path" and will call vfio_pin_pages()/vfio_dma_rw(). Drivers using this diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index e727941f589d..6f7ae7e5b7b0 100644 --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -144,6 +144,8 @@ static const struct vfio_device_ops vfio_pci_ops = { .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, .detach_ioas = vfio_iommufd_physical_detach_ioas, + .pasid_attach_ioas = vfio_iommufd_physical_pasid_attach_ioas, + .pasid_detach_ioas = vfio_iommufd_physical_pasid_detach_ioas, }; static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 000a6cab2d31..11b3b453752e 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -67,6 +67,7 @@ struct vfio_device { struct inode *inode; #if IS_ENABLED(CONFIG_IOMMUFD) struct iommufd_device *iommufd_device; + struct ida pasids; u8 iommufd_attached:1; #endif u8 cdev_opened:1; @@ -91,6 +92,8 @@ struct vfio_device { * bound iommufd. Undo in unbind_iommufd if @detach_ioas is not * called. * @detach_ioas: Opposite of attach_ioas + * @pasid_attach_ioas: The pasid variation of attach_ioas + * @pasid_detach_ioas: Opposite of pasid_attach_ioas * @open_device: Called when the first file descriptor is opened for this device * @close_device: Opposite of open_device * @read: Perform read(2) on device file descriptor @@ -115,6 +118,8 @@ struct vfio_device_ops { void (*unbind_iommufd)(struct vfio_device *vdev); int (*attach_ioas)(struct vfio_device *vdev, u32 *pt_id); void (*detach_ioas)(struct vfio_device *vdev); + int (*pasid_attach_ioas)(struct vfio_device *vdev, u32 pasid, u32 *pt_id); + void (*pasid_detach_ioas)(struct vfio_device *vdev, u32 pasid); int (*open_device)(struct vfio_device *vdev); void (*close_device)(struct vfio_device *vdev); ssize_t (*read)(struct vfio_device *vdev, char __user *buf, @@ -139,6 +144,8 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev, void vfio_iommufd_physical_unbind(struct vfio_device *vdev); int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id); void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev); +int vfio_iommufd_physical_pasid_attach_ioas(struct vfio_device *vdev, u32 pasid, u32 *pt_id); +void vfio_iommufd_physical_pasid_detach_ioas(struct vfio_device *vdev, u32 pasid); int vfio_iommufd_emulated_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx, u32 *out_device_id); void vfio_iommufd_emulated_unbind(struct vfio_device *vdev); @@ -166,6 +173,10 @@ vfio_iommufd_get_dev_id(struct vfio_device *vdev, struct iommufd_ctx *ictx) ((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL) #define vfio_iommufd_physical_detach_ioas \ ((void (*)(struct vfio_device *vdev)) NULL) +#define vfio_iommufd_physical_pasid_attach_ioas \ + ((int (*)(struct vfio_device *vdev, u32 pasid, u32 *pt_id)) NULL) +#define vfio_iommufd_physical_pasid_detach_ioas \ + ((void (*)(struct vfio_device *vdev, u32 pasid)) NULL) #define vfio_iommufd_emulated_bind \ ((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx, \ u32 *out_device_id)) NULL) From patchwork Thu Dec 19 13:35:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13915101 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3141A21A945 for ; Thu, 19 Dec 2024 13:35:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734615341; cv=none; b=CDkmpQEJnCTS2rcg+AetVN2/RwrL7eDfzwNN5oydJGRZ5wBAlflUrCoF5jcaLJdzhXH9BvkCD41LItM3YK7kY/VUaoVKEDawybK83OVH1VQ+2eHDnQgDL2PyJZGTlqYSloZwzCS1NSpZu8h5414eqBXLYrHFcq5E3m8b3QEWsTw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734615341; c=relaxed/simple; bh=3anyELIJe46K0CHAtqHnFLaf0CmL6Zww7IVGm+tRQoo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=rhMi9CXMko1f65WnMsi+0CbahP68Uvck2BdWsqqEVHA1PqHhAprbBjoApPEegombMZ3cYgIov6LcIt6H2nGhN/uOA61acGICqKMIBb4iyNkZd2GSl01kUYDKvbArU4hOMZNvYPLwuwejvK0pnlAaaYENPEPxbBtGnKvoDGtmjnw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=M01joDh1; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="M01joDh1" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1734615340; x=1766151340; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3anyELIJe46K0CHAtqHnFLaf0CmL6Zww7IVGm+tRQoo=; b=M01joDh1gqOi953map11ql/9HHpnE25Kq6snm6beTxDQI9FIQvCX+zar 537LI7ZaV+kHfxeCQAj8hwz6KKRgx6A3JuNaa1YgFEeorTvtgkS6z1m2l bgLIbYzMD+6nMnf7zinkr4lHLM4iIlPKvhM9MKRSI/C/1Ck5vaKfxjLn1 3ZZSJjeDVhFuCJ74kZXd8tpeUQjEBTEtHJz8WLVvhA/G8eHJCToqBQXWj 2kVJp6cD/YHJQQpnshySAClNgwE5UzrNzYgl+untgEIv1Hl9CK5SiqICj LYs0fN9loxuYQjCRN9x7W3oSu3bEbD23Ah6vV2A2gsFamjJb+Io/yxyms w==; X-CSE-ConnectionGUID: dfTiF2ZLRU+VBUg/z4hsbA== X-CSE-MsgGUID: qkFhAo5FRWaShai3b/ET8Q== X-IronPort-AV: E=McAfee;i="6700,10204,11290"; a="60504310" X-IronPort-AV: E=Sophos;i="6.12,247,1728975600"; d="scan'208";a="60504310" Received: from orviesa005.jf.intel.com ([10.64.159.145]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2024 05:35:38 -0800 X-CSE-ConnectionGUID: TCLGpHgLRCWH4e+kaD9LUQ== X-CSE-MsgGUID: lusWw+W+T22fXZxvwwcMcQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="103197885" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orviesa005.jf.intel.com with ESMTP; 19 Dec 2024 05:35:37 -0800 From: Yi Liu To: alex.williamson@redhat.com, kevin.tian@intel.com Cc: jgg@nvidia.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, yi.l.liu@intel.com, chao.p.peng@linux.intel.com, zhenzhong.duan@intel.com, willy@infradead.org, zhangfei.gao@linaro.org, vasant.hegde@amd.com Subject: [PATCH v6 3/5] vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid Date: Thu, 19 Dec 2024 05:35:32 -0800 Message-Id: <20241219133534.16422-4-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241219133534.16422-1-yi.l.liu@intel.com> References: <20241219133534.16422-1-yi.l.liu@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This extends the VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT ioctls to attach/detach a given pasid of a vfio device to/from an IOAS/HWPT. Signed-off-by: Yi Liu --- drivers/vfio/device_cdev.c | 60 +++++++++++++++++++++++++++++++++----- include/uapi/linux/vfio.h | 29 +++++++++++------- 2 files changed, 71 insertions(+), 18 deletions(-) diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c index bb1817bd4ff3..6d436bee8207 100644 --- a/drivers/vfio/device_cdev.c +++ b/drivers/vfio/device_cdev.c @@ -162,9 +162,9 @@ void vfio_df_unbind_iommufd(struct vfio_device_file *df) int vfio_df_ioctl_attach_pt(struct vfio_device_file *df, struct vfio_device_attach_iommufd_pt __user *arg) { - struct vfio_device *device = df->device; struct vfio_device_attach_iommufd_pt attach; - unsigned long minsz; + struct vfio_device *device = df->device; + unsigned long minsz, xend = 0; int ret; minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id); @@ -172,11 +172,34 @@ int vfio_df_ioctl_attach_pt(struct vfio_device_file *df, if (copy_from_user(&attach, arg, minsz)) return -EFAULT; - if (attach.argsz < minsz || attach.flags) + if (attach.argsz < minsz) return -EINVAL; + if (attach.flags & (~VFIO_DEVICE_ATTACH_PASID)) + return -EINVAL; + + if (attach.flags & VFIO_DEVICE_ATTACH_PASID) { + if (!device->ops->pasid_attach_ioas) + return -EOPNOTSUPP; + xend = offsetofend(struct vfio_device_attach_iommufd_pt, pasid); + } + + if (xend) { + if (attach.argsz < xend) + return -EINVAL; + + if (copy_from_user((void *)&attach + minsz, + (void __user *)arg + minsz, xend - minsz)) + return -EFAULT; + } + mutex_lock(&device->dev_set->lock); - ret = device->ops->attach_ioas(device, &attach.pt_id); + if (attach.flags & VFIO_DEVICE_ATTACH_PASID) + ret = device->ops->pasid_attach_ioas(device, + attach.pasid, + &attach.pt_id); + else + ret = device->ops->attach_ioas(device, &attach.pt_id); if (ret) goto out_unlock; @@ -198,20 +221,41 @@ int vfio_df_ioctl_attach_pt(struct vfio_device_file *df, int vfio_df_ioctl_detach_pt(struct vfio_device_file *df, struct vfio_device_detach_iommufd_pt __user *arg) { - struct vfio_device *device = df->device; struct vfio_device_detach_iommufd_pt detach; - unsigned long minsz; + struct vfio_device *device = df->device; + unsigned long minsz, xend = 0; minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags); if (copy_from_user(&detach, arg, minsz)) return -EFAULT; - if (detach.argsz < minsz || detach.flags) + if (detach.argsz < minsz) return -EINVAL; + if (detach.flags & (~VFIO_DEVICE_DETACH_PASID)) + return -EINVAL; + + if (detach.flags & VFIO_DEVICE_DETACH_PASID) { + if (!device->ops->pasid_detach_ioas) + return -EOPNOTSUPP; + xend = offsetofend(struct vfio_device_detach_iommufd_pt, pasid); + } + + if (xend) { + if (detach.argsz < xend) + return -EINVAL; + + if (copy_from_user((void *)&detach + minsz, + (void __user *)arg + minsz, xend - minsz)) + return -EFAULT; + } + mutex_lock(&device->dev_set->lock); - device->ops->detach_ioas(device); + if (detach.flags & VFIO_DEVICE_DETACH_PASID) + device->ops->pasid_detach_ioas(device, detach.pasid); + else + device->ops->detach_ioas(device); mutex_unlock(&device->dev_set->lock); return 0; diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index c8dbf8219c4f..6899da70b929 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -931,29 +931,34 @@ struct vfio_device_bind_iommufd { * VFIO_DEVICE_ATTACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 19, * struct vfio_device_attach_iommufd_pt) * @argsz: User filled size of this data. - * @flags: Must be 0. + * @flags: Flags for attach. * @pt_id: Input the target id which can represent an ioas or a hwpt * allocated via iommufd subsystem. * Output the input ioas id or the attached hwpt id which could * be the specified hwpt itself or a hwpt automatically created * for the specified ioas by kernel during the attachment. + * @pasid: The pasid to be attached, only meaningful when + * VFIO_DEVICE_ATTACH_PASID is set in @flags * * Associate the device with an address space within the bound iommufd. * Undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close. This is only * allowed on cdev fds. * - * If a vfio device is currently attached to a valid hw_pagetable, without doing - * a VFIO_DEVICE_DETACH_IOMMUFD_PT, a second VFIO_DEVICE_ATTACH_IOMMUFD_PT ioctl - * passing in another hw_pagetable (hwpt) id is allowed. This action, also known - * as a hw_pagetable replacement, will replace the device's currently attached - * hw_pagetable with a new hw_pagetable corresponding to the given pt_id. + * If a vfio device or a pasid of this device is currently attached to a valid + * hw_pagetable (hwpt), without doing a VFIO_DEVICE_DETACH_IOMMUFD_PT, a second + * VFIO_DEVICE_ATTACH_IOMMUFD_PT ioctl passing in another hwpt id is allowed. + * This action, also known as a hw_pagetable replacement, will replace the + * currently attached hwpt of the device or the pasid of this device with a new + * hwpt corresponding to the given pt_id. * * Return: 0 on success, -errno on failure. */ struct vfio_device_attach_iommufd_pt { __u32 argsz; __u32 flags; +#define VFIO_DEVICE_ATTACH_PASID (1 << 0) __u32 pt_id; + __u32 pasid; }; #define VFIO_DEVICE_ATTACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 19) @@ -962,17 +967,21 @@ struct vfio_device_attach_iommufd_pt { * VFIO_DEVICE_DETACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 20, * struct vfio_device_detach_iommufd_pt) * @argsz: User filled size of this data. - * @flags: Must be 0. + * @flags: Flags for detach. + * @pasid: The pasid to be detached, only meaningful when + * VFIO_DEVICE_DETACH_PASID is set in @flags * - * Remove the association of the device and its current associated address - * space. After it, the device should be in a blocking DMA state. This is only - * allowed on cdev fds. + * Remove the association of the device or a pasid of the device and its current + * associated address space. After it, the device or the pasid should be in a + * blocking DMA state. This is only allowed on cdev fds. * * Return: 0 on success, -errno on failure. */ struct vfio_device_detach_iommufd_pt { __u32 argsz; __u32 flags; +#define VFIO_DEVICE_DETACH_PASID (1 << 0) + __u32 pasid; }; #define VFIO_DEVICE_DETACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 20) From patchwork Thu Dec 19 13:35:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13915102 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9E3252248A5 for ; Thu, 19 Dec 2024 13:35:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734615342; cv=none; b=pDiNDib9qMuLWdVs13JD4hFTWx9gUOAexlbXASkx+XCMCLD1ooT6GLYq8OXU/qmrD52KuGNlGuqCbc3+o4AneUOXMAbzo8jXMMJYEiXkJ+/3o67CGRAGTrTNuRtIW1uqHnpMJRCQUKmt4AFoKMri741qBaDxHWJToVBb98pWvsM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734615342; c=relaxed/simple; bh=p1isZQbN8LAq+nGm4U6wMsDUdGPhqb/C+dJWFMQ10iw=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=ETqdrw2D1qaFIZdod/jMgedaYlA1jOV/56nahlxAuaK35JNrq1psvSgJkIoTm/5ImPUU8HNVlD+6xPjZquRWYdB23hdmjbZ2q7ZoC53MYXhVVjH0AhrpNXW48zPR8BdXtiBpgLbGtDDtnKJ6zmwk4JDvxXX1GnORokDsVUqydVc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=dMsCZLz/; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="dMsCZLz/" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1734615340; x=1766151340; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=p1isZQbN8LAq+nGm4U6wMsDUdGPhqb/C+dJWFMQ10iw=; b=dMsCZLz/epxhVkV0oQ2hZp0kOnaLc61J5C3Up2XicvWr4ard/PjMCMco r2P2xafnzxIAi2o99LAoX/MOumHgters2pRfGt7Mc8/p5l3gmKsziX0Zz fAadySKk6rRWDkAn9PnUuh1xe5C4LkB6H6GSv4jTi0ZF5K3tKjBqbMiv0 cvHF5oNrxHY1qqZOOEqk/KhpGzzQFaYPl9AA3oCEcW8EjU0QWnRT35MPI ysW3sYHF5qQU6EZgM3q1vJNgEimKFKM77IuVMNbzRsGtkj2AnoEpjW675 si+h6+X21qqCW4VEUP+AqroNg0yH472H3k/dos9s6c3z9g87j1Rt+izCf A==; X-CSE-ConnectionGUID: XY/YTChoSQK/pOiQvDIPOA== X-CSE-MsgGUID: wd+FicxsR0G7vNpV8Xa1Yg== X-IronPort-AV: E=McAfee;i="6700,10204,11290"; a="60504316" X-IronPort-AV: E=Sophos;i="6.12,247,1728975600"; d="scan'208";a="60504316" Received: from orviesa005.jf.intel.com ([10.64.159.145]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2024 05:35:38 -0800 X-CSE-ConnectionGUID: aTojAzkbQQeFyRuzqz5mhw== X-CSE-MsgGUID: 05XzQN7YR1+aRDPxgcwklg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="103197889" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orviesa005.jf.intel.com with ESMTP; 19 Dec 2024 05:35:38 -0800 From: Yi Liu To: alex.williamson@redhat.com, kevin.tian@intel.com Cc: jgg@nvidia.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, yi.l.liu@intel.com, chao.p.peng@linux.intel.com, zhenzhong.duan@intel.com, willy@infradead.org, zhangfei.gao@linaro.org, vasant.hegde@amd.com Subject: [PATCH v6 4/5] iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability Date: Thu, 19 Dec 2024 05:35:33 -0800 Message-Id: <20241219133534.16422-5-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241219133534.16422-1-yi.l.liu@intel.com> References: <20241219133534.16422-1-yi.l.liu@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 PASID usage requires PASID support in both device and IOMMU. Since the iommu drivers always enable the PASID capability for the device if it is supported, so it is reasonable to extend the IOMMU_GET_HW_INFO to report the PASID capability to userspace. Also, enhance the selftest accordingly. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Zhangfei Gao #aarch64 platform Signed-off-by: Yi Liu --- drivers/iommu/iommufd/device.c | 33 ++++++++++++++++++++++++++++++++- drivers/pci/ats.c | 33 +++++++++++++++++++++++++++++++++ include/linux/pci-ats.h | 3 +++ include/uapi/linux/iommufd.h | 14 +++++++++++++- 4 files changed, 81 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 768eae6c3275..4d012e4e2e81 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -3,6 +3,8 @@ */ #include #include +#include +#include #include #include @@ -1260,7 +1262,8 @@ int iommufd_get_hw_info(struct iommufd_ucmd *ucmd) void *data; int rc; - if (cmd->flags || cmd->__reserved) + if (cmd->flags || cmd->__reserved[0] || cmd->__reserved[1] || + cmd->__reserved[2]) return -EOPNOTSUPP; idev = iommufd_get_device(ucmd, cmd->dev_id); @@ -1317,6 +1320,34 @@ int iommufd_get_hw_info(struct iommufd_ucmd *ucmd) if (device_iommu_capable(idev->dev, IOMMU_CAP_DIRTY_TRACKING)) cmd->out_capabilities |= IOMMU_HW_CAP_DIRTY_TRACKING; + /* + * Currently, major iommu drivers enable PASID in the probe_device() + * op if iommu and device supports it. So the max_pasids stored in + * dev->iommu indicates both PASID support and enable status. A + * non-zero dev->iommu->max_pasids means PASID is supported and + * enabled, The iommufd only reports PASID capability to userspace + * if it's enabled. + */ + if (idev->dev->iommu->max_pasids) { + cmd->out_max_pasid_log2 = ilog2(idev->dev->iommu->max_pasids); + + if (dev_is_pci(idev->dev)) { + struct pci_dev *pdev = to_pci_dev(idev->dev); + int ctrl; + + ctrl = pci_pasid_ctrl_status(pdev); + + WARN_ON_ONCE(!(ctrl & PCI_PASID_CTRL_ENABLE)); + + if (ctrl & PCI_PASID_CTRL_EXEC) + cmd->out_capabilities |= + IOMMU_HW_CAP_PCI_PASID_EXEC; + if (ctrl & PCI_PASID_CTRL_PRIV) + cmd->out_capabilities |= + IOMMU_HW_CAP_PCI_PASID_PRIV; + } + } + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); out_free: kfree(data); diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c index 6afff1f1b143..c35465120329 100644 --- a/drivers/pci/ats.c +++ b/drivers/pci/ats.c @@ -538,4 +538,37 @@ int pci_max_pasids(struct pci_dev *pdev) return (1 << FIELD_GET(PCI_PASID_CAP_WIDTH, supported)); } EXPORT_SYMBOL_GPL(pci_max_pasids); + +/** + * pci_pasid_ctrl_status - Check the PASID status + * @pdev: PCI device structure + * + * Returns a negative value when no PASID capability is present. + * Otherwise the value of the control register is returned. + * Status reported are: + * + * PCI_PASID_CTRL_ENABLE - PASID enabled + * PCI_PASID_CTRL_EXEC - Execute permission enabled + * PCI_PASID_CTRL_PRIV - Privileged mode enabled + */ +int pci_pasid_ctrl_status(struct pci_dev *pdev) +{ + int pasid; + u16 ctrl; + + if (pdev->is_virtfn) + pdev = pci_physfn(pdev); + + pasid = pdev->pasid_cap; + if (!pasid) + return -EINVAL; + + pci_read_config_word(pdev, pasid + PCI_PASID_CTRL, &ctrl); + + ctrl &= PCI_PASID_CTRL_ENABLE | PCI_PASID_CTRL_EXEC | + PCI_PASID_CTRL_PRIV; + + return ctrl; +} +EXPORT_SYMBOL_GPL(pci_pasid_ctrl_status); #endif /* CONFIG_PCI_PASID */ diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h index 0e8b74e63767..f4e1d2010287 100644 --- a/include/linux/pci-ats.h +++ b/include/linux/pci-ats.h @@ -42,6 +42,7 @@ int pci_enable_pasid(struct pci_dev *pdev, int features); void pci_disable_pasid(struct pci_dev *pdev); int pci_pasid_features(struct pci_dev *pdev); int pci_max_pasids(struct pci_dev *pdev); +int pci_pasid_ctrl_status(struct pci_dev *pdev); #else /* CONFIG_PCI_PASID */ static inline int pci_enable_pasid(struct pci_dev *pdev, int features) { return -EINVAL; } @@ -50,6 +51,8 @@ static inline int pci_pasid_features(struct pci_dev *pdev) { return -EINVAL; } static inline int pci_max_pasids(struct pci_dev *pdev) { return -EINVAL; } +static inline int pci_pasid_ctrl_status(struct pci_dev *pdev) +{ return -EINVAL; } #endif /* CONFIG_PCI_PASID */ #endif /* LINUX_PCI_ATS_H */ diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index 34810f6ae2b5..0e27557fb86b 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -608,9 +608,17 @@ enum iommu_hw_info_type { * IOMMU_HWPT_GET_DIRTY_BITMAP * IOMMU_HWPT_SET_DIRTY_TRACKING * + * @IOMMU_HW_CAP_PASID_EXEC: Execute Permission Supported, user ignores it + * when the struct iommu_hw_info::out_max_pasid_log2 + * is zero. + * @IOMMU_HW_CAP_PASID_PRIV: Privileged Mode Supported, user ignores it + * when the struct iommu_hw_info::out_max_pasid_log2 + * is zero. */ enum iommufd_hw_capabilities { IOMMU_HW_CAP_DIRTY_TRACKING = 1 << 0, + IOMMU_HW_CAP_PCI_PASID_EXEC = 1 << 1, + IOMMU_HW_CAP_PCI_PASID_PRIV = 1 << 2, }; /** @@ -626,6 +634,9 @@ enum iommufd_hw_capabilities { * iommu_hw_info_type. * @out_capabilities: Output the generic iommu capability info type as defined * in the enum iommu_hw_capabilities. + * @out_max_pasid_log2: Output the width of PASIDs. 0 means no PASID support. + * PCI devices turn to out_capabilities to check if the + * specific capabilities is supported or not. * @__reserved: Must be 0 * * Query an iommu type specific hardware information data from an iommu behind @@ -649,7 +660,8 @@ struct iommu_hw_info { __u32 data_len; __aligned_u64 data_uptr; __u32 out_data_type; - __u32 __reserved; + __u8 out_max_pasid_log2; + __u8 __reserved[3]; __aligned_u64 out_capabilities; }; #define IOMMU_GET_HW_INFO _IO(IOMMUFD_TYPE, IOMMUFD_CMD_GET_HW_INFO) From patchwork Thu Dec 19 13:35:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13915103 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6774E224AF5 for ; Thu, 19 Dec 2024 13:35:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734615343; cv=none; b=PrzfkAInivz+UC/MJ9yYxYA7iJSZ8clteRALnSsjsuruQKb6jOuKuFKMCrkEIx7oNkwy2TFSwp/POV2SW5BDr+ok4Q+BDXODSrAY69iG4/ZOAagUtiAx0fuIDqs5g9vZsJZUUPV9OXOML9roxbwlWX6bffTM1FcI2EWDIhk/LGU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734615343; c=relaxed/simple; bh=C+woQ1UO4duxlhwm6K+9elXNlVWQsXYV0gzPkf2aIFM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Yb2c33O4O0QNxaLYHsSMDSOScIfrBjCH8iEU22jqNik1kuRErtU32YEs2HbCID2TQxGQLcyCgORFNTom0q44n7LnEjigLV/pQtoTp6AzmVEF0dNVulwowOnGABuHmmuyAWgzDued/+IHxgj+ZBVXAQ0kZPov9XQoY3K0TBdYb84= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=KZctQYqK; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="KZctQYqK" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1734615341; x=1766151341; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=C+woQ1UO4duxlhwm6K+9elXNlVWQsXYV0gzPkf2aIFM=; b=KZctQYqKPIGFOLfYfKizAyFFV5h32Egyiq7F4jvvEUH4K476SOORDeMV 8arK5L+vMKsF0zMXp/7pPdyN9h8kHiOwiSKiFQil7cm4FNhWlJZuC8u8q Nim8AvOrz89lRdtURiPv/QTVPozUs0tNt9IN23119cKHUjLzlEsiiX5Ns 18hQPpSHnN0fBwdmzRoO2biRTEorEdrs1op0trIEK2vB6gwSRzsrCdSgw bypBDK6tOkQScWBWN9QP6Fg9sXgbRdcNY+ynqVS+YnHLrp+nx+DUX2MF1 w7+R7sahf+b1j7vlmxeo24fsIu5KD95HYaVQh4pCwZzVG9i+jXVVH8AiJ g==; X-CSE-ConnectionGUID: 7DBgIT/GSWaGF643611NgA== X-CSE-MsgGUID: 59DSxJ4AQ+uG+C+vv3L4Sg== X-IronPort-AV: E=McAfee;i="6700,10204,11290"; a="60504322" X-IronPort-AV: E=Sophos;i="6.12,247,1728975600"; d="scan'208";a="60504322" Received: from orviesa005.jf.intel.com ([10.64.159.145]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2024 05:35:38 -0800 X-CSE-ConnectionGUID: YHbcweSHQS6ZwHbNVDxFXQ== X-CSE-MsgGUID: B9LlH2bSQAyEArd/fA87QQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="103197894" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orviesa005.jf.intel.com with ESMTP; 19 Dec 2024 05:35:38 -0800 From: Yi Liu To: alex.williamson@redhat.com, kevin.tian@intel.com Cc: jgg@nvidia.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, yi.l.liu@intel.com, chao.p.peng@linux.intel.com, zhenzhong.duan@intel.com, willy@infradead.org, zhangfei.gao@linaro.org, vasant.hegde@amd.com Subject: [PATCH v6 5/5] iommufd/selftest: Add coverage for reporting max_pasid_log2 via IOMMU_HW_INFO Date: Thu, 19 Dec 2024 05:35:34 -0800 Message-Id: <20241219133534.16422-6-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241219133534.16422-1-yi.l.liu@intel.com> References: <20241219133534.16422-1-yi.l.liu@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 IOMMU_HW_INFO is extended to report max_pasid_log2, hence add coverage for it. Signed-off-by: Yi Liu --- tools/testing/selftests/iommu/iommufd.c | 18 ++++++++++++++++++ .../testing/selftests/iommu/iommufd_fail_nth.c | 3 ++- tools/testing/selftests/iommu/iommufd_utils.h | 17 +++++++++++++---- 3 files changed, 33 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 575a9289dfdb..9f35076878cc 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -342,12 +342,14 @@ FIXTURE(iommufd_ioas) uint32_t hwpt_id; uint32_t device_id; uint64_t base_iova; + uint32_t pasid_device_id; }; FIXTURE_VARIANT(iommufd_ioas) { unsigned int mock_domains; unsigned int memory_limit; + bool pasid_capable; }; FIXTURE_SETUP(iommufd_ioas) @@ -372,6 +374,12 @@ FIXTURE_SETUP(iommufd_ioas) IOMMU_TEST_DEV_CACHE_DEFAULT); self->base_iova = MOCK_APERTURE_START; } + + if (variant->pasid_capable) + test_cmd_mock_domain_flags(self->ioas_id, + MOCK_FLAGS_DEVICE_PASID, + NULL, NULL, + &self->pasid_device_id); } FIXTURE_TEARDOWN(iommufd_ioas) @@ -387,6 +395,7 @@ FIXTURE_VARIANT_ADD(iommufd_ioas, no_domain) FIXTURE_VARIANT_ADD(iommufd_ioas, mock_domain) { .mock_domains = 1, + .pasid_capable = true, }; FIXTURE_VARIANT_ADD(iommufd_ioas, two_mock_domain) @@ -748,6 +757,8 @@ TEST_F(iommufd_ioas, get_hw_info) } buffer_smaller; if (self->device_id) { + uint8_t max_pasid = 0; + /* Provide a zero-size user_buffer */ test_cmd_get_hw_info(self->device_id, NULL, 0); /* Provide a user_buffer with exact size */ @@ -762,6 +773,13 @@ TEST_F(iommufd_ioas, get_hw_info) * the fields within the size range still gets updated. */ test_cmd_get_hw_info(self->device_id, &buffer_smaller, sizeof(buffer_smaller)); + test_cmd_get_hw_info_pasid(self->device_id, &max_pasid); + ASSERT_EQ(0, max_pasid); + if (variant->pasid_capable) { + test_cmd_get_hw_info_pasid(self->pasid_device_id, + &max_pasid); + ASSERT_EQ(20, max_pasid); + } } else { test_err_get_hw_info(ENOENT, self->device_id, &buffer_exact, sizeof(buffer_exact)); diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c index c9580724ca9c..6837b6dc6d10 100644 --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c @@ -664,7 +664,8 @@ TEST_FAIL_NTH(basic_fail_nth, device) &self->stdev_id, NULL, &idev_id)) return -1; - if (_test_cmd_get_hw_info(self->fd, idev_id, &info, sizeof(info), NULL)) + if (_test_cmd_get_hw_info(self->fd, idev_id, &info, + sizeof(info), NULL, NULL)) return -1; if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index 523ff28e4bc9..8ed05838787d 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -757,7 +757,8 @@ static void teardown_iommufd(int fd, struct __test_metadata *_metadata) /* @data can be NULL */ static int _test_cmd_get_hw_info(int fd, __u32 device_id, void *data, - size_t data_len, uint32_t *capabilities) + size_t data_len, uint32_t *capabilities, + uint8_t *max_pasid) { struct iommu_test_hw_info *info = (struct iommu_test_hw_info *)data; struct iommu_hw_info cmd = { @@ -802,6 +803,9 @@ static int _test_cmd_get_hw_info(int fd, __u32 device_id, void *data, assert(!info->flags); } + if (max_pasid) + *max_pasid = cmd.out_max_pasid_log2; + if (capabilities) *capabilities = cmd.out_capabilities; @@ -810,14 +814,19 @@ static int _test_cmd_get_hw_info(int fd, __u32 device_id, void *data, #define test_cmd_get_hw_info(device_id, data, data_len) \ ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, data, \ - data_len, NULL)) + data_len, NULL, NULL)) #define test_err_get_hw_info(_errno, device_id, data, data_len) \ EXPECT_ERRNO(_errno, _test_cmd_get_hw_info(self->fd, device_id, data, \ - data_len, NULL)) + data_len, NULL, NULL)) #define test_cmd_get_hw_capabilities(device_id, caps, mask) \ - ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, 0, &caps)) + ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, \ + 0, &caps, NULL)) + +#define test_cmd_get_hw_info_pasid(device_id, max_pasid) \ + ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, \ + 0, NULL, max_pasid)) static int _test_ioctl_fault_alloc(int fd, __u32 *fault_id, __u32 *fault_fd) {