From patchwork Fri Mar 21 18:01:39 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 14025846
Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.10])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9E42822D794
	for ; Fri, 21 Mar 2025 18:01:47 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.10
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1742580109; cv=none;
	b=OUAe4UEnCrXc9hyOw9Il1M5eZZHuqbgsL4iMAIsaBww/BzIOIowtreI0+Hg07Z1iNrs1FQYt1XxHPi9eoLpFj/74cExv2Z2yB6D/N19J6DhLvvblOeJ29v3FrpPd++7LizZRRJ2AW5//DiIuSwzunsxHtRr9YTUmUn/lLZS71f4=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1742580109; c=relaxed/simple;
	bh=DPzSxv9vMP7mIpXdsVlzuxzRhKS8p8nA3AZc3wGiDlM=;
	h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References:
	MIME-Version;
	b=ahFy9pWtMl6UTYLe3hfZ7TvHaDffSW2PgetpiV58mDcfJNEc/new/ki18Z5zUVuU1omOpCpYOzqJOJ99gTaBV+hZiFYP8Bz8zqmF3hg3jMUTtZGG1xEgFc81Fv5TXhetw0RGpsWmthU4//RpWsgkQyperaVav9OGUib04+1+0mI=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dmarc=pass (p=none dis=none) header.from=intel.com;
	spf=pass smtp.mailfrom=intel.com;
	dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Plgu+8Sr;
	arc=none smtp.client-ip=192.198.163.10
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Plgu+8Sr"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com;
	q=dns/txt; s=Intel; t=1742580108; x=1774116108;
	h=from:to:cc:subject:date:message-id:in-reply-to:
	references:mime-version:content-transfer-encoding;
	bh=DPzSxv9vMP7mIpXdsVlzuxzRhKS8p8nA3AZc3wGiDlM=;
	b=Plgu+8SrPDuDcISrn8xs+Cku3eyMt/RvowPaVVQjUq2mY3fNQsdiUhKm
	gjZbKmED/0MDNCbQQNqoNvgbJkZu+HBJn6pSD9qpwz63zOnItisrJ2BXt
	T+pha6FZG6e5Jk2o+PrGNKlt2Oag6rHQxXO0OCTRyzoULJovZEbdcuDS7
	JGwmlDXBybA3+3ZYzc6/+UL7LtYs6cPf0EzTnTOvAQnxV/X7qXToIe3gU
	GRRRxcXpSgMF9c3qTdl6iytcQkLVpZ/Xeb/NytojROaaxy3fob3eorC1O
	a+A94uUNAUbEHr63nK3of1lS/c1C6EcSWAvLEgkcAtulJkBn5ug2yRkU2
	g==;
X-CSE-ConnectionGUID: 0sXyqpbRRWuAyvRiZivaZA==
X-CSE-MsgGUID: cP9bAxnlQ+CZU+N9UcqGoA==
X-IronPort-AV: E=McAfee;i="6700,10204,11380"; a="55234661"
X-IronPort-AV: E=Sophos;i="6.14,265,1736841600"; d="scan'208";a="55234661"
Received: from orviesa001.jf.intel.com ([10.64.159.141])
	by fmvoesa104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 11:01:45 -0700
X-CSE-ConnectionGUID: MtNJ5A5nS6yt7qweEUd4uA==
X-CSE-MsgGUID: Jxw7J6o/SxK6dwQ0hJ0fpA==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.14,265,1736841600"; d="scan'208";a="160694110"
Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231])
	by orviesa001.jf.intel.com with ESMTP; 21 Mar 2025 11:01:45 -0700
From: Yi Liu
To: alex.williamson@redhat.com
Cc: jgg@nvidia.com, yi.l.liu@intel.com, kevin.tian@intel.com,
	eric.auger@redhat.com, kvm@vger.kernel.org, chao.p.peng@linux.intel.com,
	zhenzhong.duan@intel.com, willy@infradead.org, zhangfei.gao@linaro.org,
	vasant.hegde@amd.com
Subject: [PATCH v9 1/5] ida: Add ida_find_first_range()
Date: Fri, 21 Mar 2025 11:01:39 -0700
Message-Id: <20250321180143.8468-2-yi.l.liu@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250321180143.8468-1-yi.l.liu@intel.com>
References: <20250321180143.8468-1-yi.l.liu@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

There is no helper for callers to check whether a given ID is allocated,
nor a helper to loop over all the allocated IDs in an IDA and do something
for cleanup. To cover both needs, add a helper that returns the lowest
allocated ID in a range, plus two variants based on it.

A caller can check whether a given ID is allocated with:

	bool ida_exists(struct ida *ida, unsigned int id)

A caller can iterate over all allocated IDs with:

	int id;
	while ((id = ida_find_first(&pasid_ida)) >= 0) {
		/* do anything needed with the allocated ID */
		ida_free(&pasid_ida, id);
	}

Cc: Matthew Wilcox (Oracle)
Suggested-by: Jason Gunthorpe
Reviewed-by: Jason Gunthorpe
Reviewed-by: Kevin Tian
Acked-by: Matthew Wilcox (Oracle)
Signed-off-by: Yi Liu
---
 include/linux/idr.h | 11 +++++++
 lib/idr.c           | 67 +++++++++++++++++++++++++++++++++++++++++++
 lib/test_ida.c      | 70 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 148 insertions(+)

diff --git a/include/linux/idr.h b/include/linux/idr.h
index da5f5fa4a3a6..718f9b1b91af 100644
--- a/include/linux/idr.h
+++ b/include/linux/idr.h
@@ -257,6 +257,7 @@ struct ida {
 int ida_alloc_range(struct ida *, unsigned int min, unsigned int max, gfp_t);
 void ida_free(struct ida *, unsigned int id);
 void ida_destroy(struct ida *ida);
+int ida_find_first_range(struct ida *ida, unsigned int min, unsigned int max);
 
 /**
  * ida_alloc() - Allocate an unused ID.
@@ -328,4 +329,14 @@ static inline bool ida_is_empty(const struct ida *ida)
 {
 	return xa_empty(&ida->xa);
 }
+
+static inline bool ida_exists(struct ida *ida, unsigned int id)
+{
+	return ida_find_first_range(ida, id, id) == id;
+}
+
+static inline int ida_find_first(struct ida *ida)
+{
+	return ida_find_first_range(ida, 0, ~0);
+}
 #endif /* __IDR_H__ */
diff --git a/lib/idr.c b/lib/idr.c
index da36054c3ca0..e2adc457abb4 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -476,6 +476,73 @@ int ida_alloc_range(struct ida *ida, unsigned int min, unsigned int max,
 }
 EXPORT_SYMBOL(ida_alloc_range);
 
+/**
+ * ida_find_first_range - Get the lowest used ID.
+ * @ida: IDA handle.
+ * @min: Lowest ID to get.
+ * @max: Highest ID to get.
+ *
+ * Get the lowest used ID between @min and @max, inclusive. The returned
+ * ID will not exceed %INT_MAX, even if @max is larger.
+ *
+ * Context: Any context. Takes and releases the xa_lock.
+ * Return: The lowest used ID, or errno if no used ID is found.
+ */
+int ida_find_first_range(struct ida *ida, unsigned int min, unsigned int max)
+{
+	unsigned long index = min / IDA_BITMAP_BITS;
+	unsigned int offset = min % IDA_BITMAP_BITS;
+	unsigned long *addr, size, bit;
+	unsigned long tmp = 0;
+	unsigned long flags;
+	void *entry;
+	int ret;
+
+	if ((int)min < 0)
+		return -EINVAL;
+	if ((int)max < 0)
+		max = INT_MAX;
+
+	xa_lock_irqsave(&ida->xa, flags);
+
+	entry = xa_find(&ida->xa, &index, max / IDA_BITMAP_BITS, XA_PRESENT);
+	if (!entry) {
+		ret = -ENOENT;
+		goto err_unlock;
+	}
+
+	if (index > min / IDA_BITMAP_BITS)
+		offset = 0;
+	if (index * IDA_BITMAP_BITS + offset > max) {
+		ret = -ENOENT;
+		goto err_unlock;
+	}
+
+	if (xa_is_value(entry)) {
+		tmp = xa_to_value(entry);
+		addr = &tmp;
+		size = BITS_PER_XA_VALUE;
+	} else {
+		addr = ((struct ida_bitmap *)entry)->bitmap;
+		size = IDA_BITMAP_BITS;
+	}
+
+	bit = find_next_bit(addr, size, offset);
+
+	xa_unlock_irqrestore(&ida->xa, flags);
+
+	if (bit == size ||
+	    index * IDA_BITMAP_BITS + bit > max)
+		return -ENOENT;
+
+	return index * IDA_BITMAP_BITS + bit;
+
+err_unlock:
+	xa_unlock_irqrestore(&ida->xa, flags);
+	return ret;
+}
+EXPORT_SYMBOL(ida_find_first_range);
+
 /**
  * ida_free() - Release an allocated ID.
  * @ida: IDA handle.
diff --git a/lib/test_ida.c b/lib/test_ida.c
index c80155a1956d..63078f8dc13f 100644
--- a/lib/test_ida.c
+++ b/lib/test_ida.c
@@ -189,6 +189,75 @@ static void ida_check_bad_free(struct ida *ida)
 	IDA_BUG_ON(ida, !ida_is_empty(ida));
 }
 
+/*
+ * Check ida_find_first_range() and variants.
+ */
+static void ida_check_find_first(struct ida *ida)
+{
+	/* IDA is empty; none of the IDs below should exist */
+	IDA_BUG_ON(ida, ida_exists(ida, 0));
+	IDA_BUG_ON(ida, ida_exists(ida, 3));
+	IDA_BUG_ON(ida, ida_exists(ida, 63));
+	IDA_BUG_ON(ida, ida_exists(ida, 1023));
+	IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1));
+
+	/* IDA contains a single value entry */
+	IDA_BUG_ON(ida, ida_alloc_min(ida, 3, GFP_KERNEL) != 3);
+	IDA_BUG_ON(ida, ida_exists(ida, 0));
+	IDA_BUG_ON(ida, !ida_exists(ida, 3));
+	IDA_BUG_ON(ida, ida_exists(ida, 63));
+	IDA_BUG_ON(ida, ida_exists(ida, 1023));
+	IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1));
+
+	IDA_BUG_ON(ida, ida_alloc_min(ida, 63, GFP_KERNEL) != 63);
+	IDA_BUG_ON(ida, ida_exists(ida, 0));
+	IDA_BUG_ON(ida, !ida_exists(ida, 3));
+	IDA_BUG_ON(ida, !ida_exists(ida, 63));
+	IDA_BUG_ON(ida, ida_exists(ida, 1023));
+	IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1));
+
+	/* IDA contains a single bitmap */
+	IDA_BUG_ON(ida, ida_alloc_min(ida, 1023, GFP_KERNEL) != 1023);
+	IDA_BUG_ON(ida, ida_exists(ida, 0));
+	IDA_BUG_ON(ida, !ida_exists(ida, 3));
+	IDA_BUG_ON(ida, !ida_exists(ida, 63));
+	IDA_BUG_ON(ida, !ida_exists(ida, 1023));
+	IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1));
+
+	/* IDA contains a tree */
+	IDA_BUG_ON(ida, ida_alloc_min(ida, (1 << 20) - 1, GFP_KERNEL) != (1 << 20) - 1);
+	IDA_BUG_ON(ida, ida_exists(ida, 0));
+	IDA_BUG_ON(ida, !ida_exists(ida, 3));
+	IDA_BUG_ON(ida, !ida_exists(ida, 63));
+	IDA_BUG_ON(ida, !ida_exists(ida, 1023));
+	IDA_BUG_ON(ida, !ida_exists(ida, (1 << 20) - 1));
+
+	/* Now try to find first */
+	IDA_BUG_ON(ida, ida_find_first(ida) != 3);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, -1, 2) != -EINVAL);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 0, 2) != -ENOENT); // no used ID
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 0, 3) != 3);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 1, 3) != 3);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 3, 3) != 3);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 2, 4) != 3);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 4, 3) != -ENOENT); // min > max, fail
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 4, 60) != -ENOENT); // no used ID
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 4, 64) != 63);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 63, 63) != 63);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 64, 1026) != 1023);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 1023, 1023) != 1023);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 1023, (1 << 20) - 1) != 1023);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 1024, (1 << 20) - 1) != (1 << 20) - 1);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, (1 << 20), INT_MAX) != -ENOENT);
+
+	ida_free(ida, 3);
+	ida_free(ida, 63);
+	ida_free(ida, 1023);
+	ida_free(ida, (1 << 20) - 1);
+
+	IDA_BUG_ON(ida, !ida_is_empty(ida));
+}
+
 static DEFINE_IDA(ida);
 
 static int ida_checks(void)
@@ -202,6 +271,7 @@ static int ida_checks(void)
 	ida_check_max(&ida);
 	ida_check_conv(&ida);
 	ida_check_bad_free(&ida);
+	ida_check_find_first(&ida);
 
 	printk("IDA: %u of %u tests passed\n", tests_passed, tests_run);
 	return (tests_run != tests_passed) ? 0 : -EINVAL;

From patchwork Fri Mar 21 18:01:40 2025
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 14025847
From: Yi Liu
To: alex.williamson@redhat.com
Cc: jgg@nvidia.com, yi.l.liu@intel.com, kevin.tian@intel.com,
	eric.auger@redhat.com, kvm@vger.kernel.org, chao.p.peng@linux.intel.com,
	zhenzhong.duan@intel.com, willy@infradead.org, zhangfei.gao@linaro.org,
	vasant.hegde@amd.com
Subject: [PATCH v9 2/5] vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices
Date: Fri, 21 Mar 2025 11:01:40 -0700
Message-Id: <20250321180143.8468-3-yi.l.liu@intel.com>
In-Reply-To: <20250321180143.8468-1-yi.l.liu@intel.com>
References: <20250321180143.8468-1-yi.l.liu@intel.com>

This adds pasid_[at|de]tach_ioas ops for attaching a hwpt to a pasid of a
device, along with the helpers for them.
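As a rough userspace illustration of the bookkeeping these helpers perform
(a sketch only, not the kernel code): a per-device set records which pasids
are attached, an attach on an already tracked pasid turns into a replace,
and a detach on an untracked pasid is a no-op. The names model_dev,
model_pasid_exists(), model_pasid_attach() and model_pasid_detach(), and the
64-pasid limit, are hypothetical and exist only for this model; the real
helpers track pasids in an IDA and call into iommufd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_PASID 64	/* hypothetical limit, for this model only */

/* What the model decided to do for a request. */
enum model_op { MODEL_NOP, MODEL_ATTACH, MODEL_REPLACE, MODEL_DETACH };

/* Stand-in for the per-device IDA: bit n set means pasid n is attached. */
struct model_dev {
	uint64_t pasids;
};

static bool model_pasid_exists(const struct model_dev *dev, uint32_t pasid)
{
	return pasid < MODEL_MAX_PASID && ((dev->pasids >> pasid) & 1);
}

/* Attach: a tracked pasid is replaced, an untracked one is recorded. */
static enum model_op model_pasid_attach(struct model_dev *dev, uint32_t pasid)
{
	if (pasid >= MODEL_MAX_PASID)
		return MODEL_NOP;	/* out of range for the model */
	if (model_pasid_exists(dev, pasid))
		return MODEL_REPLACE;
	dev->pasids |= UINT64_C(1) << pasid;
	return MODEL_ATTACH;
}

/* Detach: silently ignore a pasid that was never attached. */
static enum model_op model_pasid_detach(struct model_dev *dev, uint32_t pasid)
{
	if (!model_pasid_exists(dev, pasid))
		return MODEL_NOP;
	dev->pasids &= ~(UINT64_C(1) << pasid);
	return MODEL_DETACH;
}
```

The model leaves out the rollback that the real attach helper performs when
the underlying attach fails, as well as all locking.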
For now, only vfio-pci supports pasid attach/detach.

Signed-off-by: Kevin Tian
Reviewed-by: Jason Gunthorpe
Reviewed-by: Alex Williamson
Signed-off-by: Yi Liu
---
 drivers/vfio/iommufd.c      | 50 +++++++++++++++++++++++++++++++++++++
 drivers/vfio/pci/vfio_pci.c |  2 ++
 include/linux/vfio.h        | 14 +++++++++++
 3 files changed, 66 insertions(+)

diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index 37e1efa2c7bf..c8c3a2d53f86 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -119,14 +119,22 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev,
 	if (IS_ERR(idev))
 		return PTR_ERR(idev);
 	vdev->iommufd_device = idev;
+	ida_init(&vdev->pasids);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_physical_bind);
 
 void vfio_iommufd_physical_unbind(struct vfio_device *vdev)
 {
+	int pasid;
+
 	lockdep_assert_held(&vdev->dev_set->lock);
 
+	while ((pasid = ida_find_first(&vdev->pasids)) >= 0) {
+		iommufd_device_detach(vdev->iommufd_device, pasid);
+		ida_free(&vdev->pasids, pasid);
+	}
+
 	if (vdev->iommufd_attached) {
 		iommufd_device_detach(vdev->iommufd_device, IOMMU_NO_PASID);
 		vdev->iommufd_attached = false;
@@ -170,6 +178,48 @@ void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev)
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_physical_detach_ioas);
 
+int vfio_iommufd_physical_pasid_attach_ioas(struct vfio_device *vdev,
+					    u32 pasid, u32 *pt_id)
+{
+	int rc;
+
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (WARN_ON(!vdev->iommufd_device))
+		return -EINVAL;
+
+	if (ida_exists(&vdev->pasids, pasid))
+		return iommufd_device_replace(vdev->iommufd_device,
+					      pasid, pt_id);
+
+	rc = ida_alloc_range(&vdev->pasids, pasid, pasid, GFP_KERNEL);
+	if (rc < 0)
+		return rc;
+
+	rc = iommufd_device_attach(vdev->iommufd_device, pasid, pt_id);
+	if (rc)
+		ida_free(&vdev->pasids, pasid);
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_pasid_attach_ioas);
+
+void vfio_iommufd_physical_pasid_detach_ioas(struct vfio_device *vdev,
+					     u32 pasid)
+{
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (WARN_ON(!vdev->iommufd_device))
+		return;
+
+	if (!ida_exists(&vdev->pasids, pasid))
+		return;
+
+	iommufd_device_detach(vdev->iommufd_device, pasid);
+	ida_free(&vdev->pasids, pasid);
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_pasid_detach_ioas);
+
 /*
  * The emulated standard ops mean that vfio_device is going to use the
  * "mdev path" and will call vfio_pin_pages()/vfio_dma_rw(). Drivers using this
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index e727941f589d..6f7ae7e5b7b0 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -144,6 +144,8 @@ static const struct vfio_device_ops vfio_pci_ops = {
 	.unbind_iommufd	= vfio_iommufd_physical_unbind,
 	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
 	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
+	.pasid_attach_ioas	= vfio_iommufd_physical_pasid_attach_ioas,
+	.pasid_detach_ioas	= vfio_iommufd_physical_pasid_detach_ioas,
 };
 
 static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 000a6cab2d31..707b00772ce1 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -67,6 +67,7 @@ struct vfio_device {
 	struct inode *inode;
 #if IS_ENABLED(CONFIG_IOMMUFD)
 	struct iommufd_device *iommufd_device;
+	struct ida pasids;
 	u8 iommufd_attached:1;
 #endif
 	u8 cdev_opened:1;
@@ -91,6 +92,8 @@ struct vfio_device {
  *		 bound iommufd. Undo in unbind_iommufd if @detach_ioas is not
  *		 called.
  * @detach_ioas: Opposite of attach_ioas
+ * @pasid_attach_ioas: The pasid variation of attach_ioas
+ * @pasid_detach_ioas: Opposite of pasid_attach_ioas
  * @open_device: Called when the first file descriptor is opened for this device
  * @close_device: Opposite of open_device
  * @read: Perform read(2) on device file descriptor
@@ -115,6 +118,9 @@ struct vfio_device_ops {
 	void	(*unbind_iommufd)(struct vfio_device *vdev);
 	int	(*attach_ioas)(struct vfio_device *vdev, u32 *pt_id);
 	void	(*detach_ioas)(struct vfio_device *vdev);
+	int	(*pasid_attach_ioas)(struct vfio_device *vdev, u32 pasid,
+				     u32 *pt_id);
+	void	(*pasid_detach_ioas)(struct vfio_device *vdev, u32 pasid);
 	int	(*open_device)(struct vfio_device *vdev);
 	void	(*close_device)(struct vfio_device *vdev);
 	ssize_t	(*read)(struct vfio_device *vdev, char __user *buf,
@@ -139,6 +145,10 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev,
 void vfio_iommufd_physical_unbind(struct vfio_device *vdev);
 int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
 void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev);
+int vfio_iommufd_physical_pasid_attach_ioas(struct vfio_device *vdev,
+					    u32 pasid, u32 *pt_id);
+void vfio_iommufd_physical_pasid_detach_ioas(struct vfio_device *vdev,
+					     u32 pasid);
 int vfio_iommufd_emulated_bind(struct vfio_device *vdev,
 			       struct iommufd_ctx *ictx, u32 *out_device_id);
 void vfio_iommufd_emulated_unbind(struct vfio_device *vdev);
@@ -166,6 +176,10 @@ vfio_iommufd_get_dev_id(struct vfio_device *vdev, struct iommufd_ctx *ictx)
 	((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL)
 #define vfio_iommufd_physical_detach_ioas \
 	((void (*)(struct vfio_device *vdev)) NULL)
+#define vfio_iommufd_physical_pasid_attach_ioas \
+	((int (*)(struct vfio_device *vdev, u32 pasid, u32 *pt_id)) NULL)
+#define vfio_iommufd_physical_pasid_detach_ioas \
+	((void (*)(struct vfio_device *vdev, u32 pasid)) NULL)
 #define vfio_iommufd_emulated_bind \
 	((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx, \
 		  u32 *out_device_id)) NULL)

From patchwork Fri Mar 21 18:01:41 2025
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 14025848
From: Yi Liu
To: alex.williamson@redhat.com
Cc: jgg@nvidia.com, yi.l.liu@intel.com, kevin.tian@intel.com,
	eric.auger@redhat.com, kvm@vger.kernel.org, chao.p.peng@linux.intel.com,
	zhenzhong.duan@intel.com, willy@infradead.org, zhangfei.gao@linaro.org,
	vasant.hegde@amd.com, Nicolin Chen
Subject: [PATCH v9 3/5] vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid
Date: Fri, 21 Mar 2025 11:01:41 -0700
Message-Id: <20250321180143.8468-4-yi.l.liu@intel.com>
In-Reply-To: <20250321180143.8468-1-yi.l.liu@intel.com>
References: <20250321180143.8468-1-yi.l.liu@intel.com>

This extends the VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT ioctls to attach/detach a given
pasid of a vfio device to/from an IOAS/HWPT.

Reviewed-by: Alex Williamson
Reviewed-by: Kevin Tian
Reviewed-by: Nicolin Chen
Signed-off-by: Yi Liu
---
 drivers/vfio/device_cdev.c | 60 +++++++++++++++++++++++++++++++++-----
 include/uapi/linux/vfio.h  | 29 +++++++++++-------
 2 files changed, 71 insertions(+), 18 deletions(-)

diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
index bb1817bd4ff3..281a8dc3ed49 100644
--- a/drivers/vfio/device_cdev.c
+++ b/drivers/vfio/device_cdev.c
@@ -162,9 +162,9 @@ void vfio_df_unbind_iommufd(struct vfio_device_file *df)
 int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
 			    struct vfio_device_attach_iommufd_pt __user *arg)
 {
-	struct vfio_device *device = df->device;
 	struct vfio_device_attach_iommufd_pt attach;
-	unsigned long minsz;
+	struct vfio_device *device = df->device;
+	unsigned long minsz, xend = 0;
 	int ret;
 
 	minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id);
@@ -172,11 +172,34 @@ int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
 	if (copy_from_user(&attach, arg, minsz))
 		return -EFAULT;
 
-	if (attach.argsz < minsz || attach.flags)
+	if (attach.argsz < minsz)
 		return -EINVAL;
 
+	if (attach.flags & ~VFIO_DEVICE_ATTACH_PASID)
+		return -EINVAL;
+
+	if (attach.flags & VFIO_DEVICE_ATTACH_PASID) {
+		if (!device->ops->pasid_attach_ioas)
+			return -EOPNOTSUPP;
+		xend = offsetofend(struct vfio_device_attach_iommufd_pt, pasid);
+	}
+
+	if (xend) {
+		if (attach.argsz < xend)
+			return -EINVAL;
+
+		if (copy_from_user((void *)&attach + minsz,
+				   (void __user *)arg + minsz, xend - minsz))
+			return -EFAULT;
+	}
+
 	mutex_lock(&device->dev_set->lock);
-	ret = device->ops->attach_ioas(device, &attach.pt_id);
+	if (attach.flags & VFIO_DEVICE_ATTACH_PASID)
+		ret = device->ops->pasid_attach_ioas(device,
+						     attach.pasid,
+						     &attach.pt_id);
+	else
+		ret = device->ops->attach_ioas(device, &attach.pt_id);
 	if (ret)
 		goto out_unlock;
 
@@ -198,20 +221,41 @@ int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
 int vfio_df_ioctl_detach_pt(struct vfio_device_file *df,
 			    struct vfio_device_detach_iommufd_pt __user *arg)
 {
-	struct vfio_device *device = df->device;
 	struct vfio_device_detach_iommufd_pt detach;
-	unsigned long minsz;
+	struct vfio_device *device = df->device;
+	unsigned long minsz, xend = 0;
 
 	minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags);
 
 	if (copy_from_user(&detach, arg, minsz))
 		return -EFAULT;
 
-	if (detach.argsz < minsz || detach.flags)
+	if (detach.argsz < minsz)
 		return -EINVAL;
 
+	if (detach.flags & ~VFIO_DEVICE_DETACH_PASID)
+		return -EINVAL;
+
+	if (detach.flags & VFIO_DEVICE_DETACH_PASID) {
+		if (!device->ops->pasid_detach_ioas)
+			return -EOPNOTSUPP;
+		xend = offsetofend(struct vfio_device_detach_iommufd_pt, pasid);
+	}
+
+	if (xend) {
+		if (detach.argsz < xend)
+			return -EINVAL;
+
+		if (copy_from_user((void *)&detach + minsz,
+				   (void __user *)arg + minsz, xend - minsz))
+			return -EFAULT;
+	}
+
 	mutex_lock(&device->dev_set->lock);
-	device->ops->detach_ioas(device);
+	if (detach.flags & VFIO_DEVICE_DETACH_PASID)
+		device->ops->pasid_detach_ioas(device, detach.pasid);
+	else
+		device->ops->detach_ioas(device);
 	mutex_unlock(&device->dev_set->lock);
 
 	return 0;
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index c8dbf8219c4f..6899da70b929 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -931,29 +931,34 @@ struct vfio_device_bind_iommufd {
  * VFIO_DEVICE_ATTACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 19,
  *					struct vfio_device_attach_iommufd_pt)
  * @argsz:	User filled size of this data.
- * @flags:	Must be 0.
+ * @flags:	Flags for attach.
  * @pt_id:	Input the target id which can represent an ioas or a hwpt
  *		allocated via iommufd subsystem.
  *		Output the input ioas id or the attached hwpt id which could
  *		be the specified hwpt itself or a hwpt automatically created
  *		for the specified ioas by kernel during the attachment.
+ * @pasid:	The pasid to be attached, only meaningful when
+ *		VFIO_DEVICE_ATTACH_PASID is set in @flags
  *
  * Associate the device with an address space within the bound iommufd.
  * Undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close. This is only
  * allowed on cdev fds.
  *
- * If a vfio device is currently attached to a valid hw_pagetable, without doing
- * a VFIO_DEVICE_DETACH_IOMMUFD_PT, a second VFIO_DEVICE_ATTACH_IOMMUFD_PT ioctl
- * passing in another hw_pagetable (hwpt) id is allowed. This action, also known
- * as a hw_pagetable replacement, will replace the device's currently attached
- * hw_pagetable with a new hw_pagetable corresponding to the given pt_id.
+ * If a vfio device or a pasid of this device is currently attached to a valid
+ * hw_pagetable (hwpt), without doing a VFIO_DEVICE_DETACH_IOMMUFD_PT, a second
+ * VFIO_DEVICE_ATTACH_IOMMUFD_PT ioctl passing in another hwpt id is allowed.
+ * This action, also known as a hw_pagetable replacement, will replace the
+ * currently attached hwpt of the device or the pasid of this device with a new
+ * hwpt corresponding to the given pt_id.
  *
  * Return: 0 on success, -errno on failure.
  */
 struct vfio_device_attach_iommufd_pt {
 	__u32	argsz;
 	__u32	flags;
+#define VFIO_DEVICE_ATTACH_PASID	(1 << 0)
 	__u32	pt_id;
+	__u32	pasid;
 };
 
 #define VFIO_DEVICE_ATTACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 19)
@@ -962,17 +967,21 @@ struct vfio_device_attach_iommufd_pt {
  * VFIO_DEVICE_DETACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 20,
  *					struct vfio_device_detach_iommufd_pt)
  * @argsz:	User filled size of this data.
- * @flags:	Must be 0.
+ * @flags:	Flags for detach.
+ * @pasid:	The pasid to be detached, only meaningful when
+ *		VFIO_DEVICE_DETACH_PASID is set in @flags
  *
- * Remove the association of the device and its current associated address
- * space. After it, the device should be in a blocking DMA state. This is only
- * allowed on cdev fds.
+ * Remove the association of the device or a pasid of the device and its current
+ * associated address space. After it, the device or the pasid should be in a
+ * blocking DMA state. This is only allowed on cdev fds.
  *
  * Return: 0 on success, -errno on failure.
  */
 struct vfio_device_detach_iommufd_pt {
 	__u32	argsz;
 	__u32	flags;
+#define VFIO_DEVICE_DETACH_PASID	(1 << 0)
+	__u32	pasid;
 };

 #define VFIO_DEVICE_DETACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 20)

From patchwork Fri Mar 21 18:01:42 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 14025849
From: Yi Liu
To: alex.williamson@redhat.com
Cc: jgg@nvidia.com, yi.l.liu@intel.com, kevin.tian@intel.com,
 eric.auger@redhat.com, kvm@vger.kernel.org, chao.p.peng@linux.intel.com,
 zhenzhong.duan@intel.com, willy@infradead.org, zhangfei.gao@linaro.org,
 vasant.hegde@amd.com, Bjorn Helgaas
Subject: [PATCH v9 4/5] iommufd: Extend IOMMU_GET_HW_INFO to report PASID
 capability
Date: Fri, 21 Mar 2025 11:01:42 -0700
Message-Id: <20250321180143.8468-5-yi.l.liu@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250321180143.8468-1-yi.l.liu@intel.com>
References: <20250321180143.8468-1-yi.l.liu@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

PASID usage requires PASID support in both the device and the IOMMU.
Since the iommu drivers always enable the PASID capability for the
device if it is supported, extend IOMMU_GET_HW_INFO to report the PASID
capability to userspace. Also, enhance the selftest accordingly.

Cc: Bjorn Helgaas
Reviewed-by: Kevin Tian
Reviewed-by: Jason Gunthorpe
Tested-by: Zhangfei Gao #aarch64 platform
Signed-off-by: Yi Liu
---
 drivers/iommu/iommufd/device.c | 34 +++++++++++++++++++++++++++++++++-
 drivers/pci/ats.c              | 33 +++++++++++++++++++++++++++++++++
 include/linux/pci-ats.h        |  3 +++
 include/uapi/linux/iommufd.h   | 14 +++++++++++++-
 4 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 1605f6c0e1ee..2307daad65c0 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -3,6 +3,7 @@
  */
 #include
 #include
+#include
 #include
 #include

@@ -1455,7 +1456,8 @@ int iommufd_get_hw_info(struct iommufd_ucmd *ucmd)
 	void *data;
 	int rc;

-	if (cmd->flags || cmd->__reserved)
+	if (cmd->flags || cmd->__reserved[0] || cmd->__reserved[1] ||
+	    cmd->__reserved[2])
 		return -EOPNOTSUPP;

 	idev = iommufd_get_device(ucmd, cmd->dev_id);
@@ -1512,6 +1514,36 @@ int iommufd_get_hw_info(struct iommufd_ucmd *ucmd)
 	if (device_iommu_capable(idev->dev, IOMMU_CAP_DIRTY_TRACKING))
 		cmd->out_capabilities |= IOMMU_HW_CAP_DIRTY_TRACKING;

+	cmd->out_max_pasid_log2 = 0;
+	/*
+	 * Currently, all iommu drivers enable PASID in the probe_device()
+	 * op if iommu and device support it. So the max_pasids stored in
+	 * dev->iommu indicates both PASID support and enable status. A
+	 * non-zero dev->iommu->max_pasids means PASID is supported and
+	 * enabled. The iommufd only reports PASID capability to userspace
+	 * if it's enabled.
+	 */
+	if (idev->dev->iommu->max_pasids) {
+		cmd->out_max_pasid_log2 = ilog2(idev->dev->iommu->max_pasids);
+
+		if (dev_is_pci(idev->dev)) {
+			struct pci_dev *pdev = to_pci_dev(idev->dev);
+			int ctrl;
+
+			ctrl = pci_pasid_status(pdev);
+
+			WARN_ON_ONCE(ctrl < 0 ||
+				     !(ctrl & PCI_PASID_CTRL_ENABLE));
+
+			if (ctrl & PCI_PASID_CTRL_EXEC)
+				cmd->out_capabilities |=
+						IOMMU_HW_CAP_PCI_PASID_EXEC;
+			if (ctrl & PCI_PASID_CTRL_PRIV)
+				cmd->out_capabilities |=
+						IOMMU_HW_CAP_PCI_PASID_PRIV;
+		}
+	}
+
 	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
 out_free:
 	kfree(data);
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index c6b266c772c8..ec6c8dbdc5e9 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -538,4 +538,37 @@ int pci_max_pasids(struct pci_dev *pdev)
 	return (1 << FIELD_GET(PCI_PASID_CAP_WIDTH, supported));
 }
 EXPORT_SYMBOL_GPL(pci_max_pasids);
+
+/**
+ * pci_pasid_status - Check the PASID status
+ * @pdev: PCI device structure
+ *
+ * Returns a negative value when no PASID capability is present.
+ * Otherwise the value of the control register is returned.
+ * The status bits reported are:
+ *
+ * PCI_PASID_CTRL_ENABLE - PASID enabled
+ * PCI_PASID_CTRL_EXEC - Execute permission enabled
+ * PCI_PASID_CTRL_PRIV - Privileged mode enabled
+ */
+int pci_pasid_status(struct pci_dev *pdev)
+{
+	int pasid;
+	u16 ctrl;
+
+	if (pdev->is_virtfn)
+		pdev = pci_physfn(pdev);
+
+	pasid = pdev->pasid_cap;
+	if (!pasid)
+		return -EINVAL;
+
+	pci_read_config_word(pdev, pasid + PCI_PASID_CTRL, &ctrl);
+
+	ctrl &= PCI_PASID_CTRL_ENABLE | PCI_PASID_CTRL_EXEC |
+		PCI_PASID_CTRL_PRIV;
+
+	return ctrl;
+}
+EXPORT_SYMBOL_GPL(pci_pasid_status);
 #endif /* CONFIG_PCI_PASID */
diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
index 0e8b74e63767..75c6c86cf09d 100644
--- a/include/linux/pci-ats.h
+++ b/include/linux/pci-ats.h
@@ -42,6 +42,7 @@ int pci_enable_pasid(struct pci_dev *pdev, int features);
 void pci_disable_pasid(struct pci_dev *pdev);
 int pci_pasid_features(struct pci_dev *pdev);
 int pci_max_pasids(struct pci_dev *pdev);
+int pci_pasid_status(struct pci_dev *pdev);
 #else /* CONFIG_PCI_PASID */
 static inline int pci_enable_pasid(struct pci_dev *pdev, int features)
 { return -EINVAL; }
@@ -50,6 +51,8 @@ static inline int pci_pasid_features(struct pci_dev *pdev)
 { return -EINVAL; }
 static inline int pci_max_pasids(struct pci_dev *pdev)
 { return -EINVAL; }
+static inline int pci_pasid_status(struct pci_dev *pdev)
+{ return -EINVAL; }
 #endif /* CONFIG_PCI_PASID */

 #endif /* LINUX_PCI_ATS_H */
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 6901804ec736..81c31a36e14a 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -612,9 +612,17 @@ enum iommu_hw_info_type {
  *                               IOMMU_HWPT_GET_DIRTY_BITMAP
  *                               IOMMU_HWPT_SET_DIRTY_TRACKING
  *
+ * @IOMMU_HW_CAP_PCI_PASID_EXEC: Execute Permission Supported, user ignores it
+ *                               when the struct
+ *                               iommu_hw_info::out_max_pasid_log2 is zero.
+ * @IOMMU_HW_CAP_PCI_PASID_PRIV: Privileged Mode Supported, user ignores it
+ *                               when the struct
+ *                               iommu_hw_info::out_max_pasid_log2 is zero.
  */
 enum iommufd_hw_capabilities {
 	IOMMU_HW_CAP_DIRTY_TRACKING = 1 << 0,
+	IOMMU_HW_CAP_PCI_PASID_EXEC = 1 << 1,
+	IOMMU_HW_CAP_PCI_PASID_PRIV = 1 << 2,
 };

 /**
@@ -630,6 +638,9 @@ enum iommufd_hw_capabilities {
  *                 iommu_hw_info_type.
  * @out_capabilities: Output the generic iommu capability info type as defined
  *                    in the enum iommu_hw_capabilities.
+ * @out_max_pasid_log2: Output the width of PASIDs. 0 means no PASID support.
+ *                      PCI devices turn to out_capabilities to check if the
+ *                      specific capabilities are supported or not.
  * @__reserved: Must be 0
  *
  * Query an iommu type specific hardware information data from an iommu behind
@@ -653,7 +664,8 @@ struct iommu_hw_info {
 	__u32 data_len;
 	__aligned_u64 data_uptr;
 	__u32 out_data_type;
-	__u32 __reserved;
+	__u8 out_max_pasid_log2;
+	__u8 __reserved[3];
 	__aligned_u64 out_capabilities;
 };
 #define IOMMU_GET_HW_INFO _IO(IOMMUFD_TYPE, IOMMUFD_CMD_GET_HW_INFO)

From patchwork Fri Mar 21 18:01:43 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 14025850
From: Yi Liu
To: alex.williamson@redhat.com
Cc: jgg@nvidia.com, yi.l.liu@intel.com, kevin.tian@intel.com,
 eric.auger@redhat.com, kvm@vger.kernel.org, chao.p.peng@linux.intel.com,
 zhenzhong.duan@intel.com, willy@infradead.org, zhangfei.gao@linaro.org,
 vasant.hegde@amd.com, Nicolin Chen
Subject: [PATCH v9 5/5] iommufd/selftest: Add coverage for reporting
 max_pasid_log2 via IOMMU_HW_INFO
Date: Fri, 21 Mar 2025 11:01:43 -0700
Message-Id: <20250321180143.8468-6-yi.l.liu@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250321180143.8468-1-yi.l.liu@intel.com>
References: <20250321180143.8468-1-yi.l.liu@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

IOMMU_HW_INFO is extended to report max_pasid_log2, hence add coverage
for it.

Reviewed-by: Nicolin Chen
Signed-off-by: Yi Liu
Tested-by: Nicolin Chen
---
 tools/testing/selftests/iommu/iommufd.c        | 18 ++++++++++++++++++
 .../testing/selftests/iommu/iommufd_fail_nth.c |  3 ++-
 tools/testing/selftests/iommu/iommufd_utils.h  | 17 +++++++++++++----
 3 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index c39222b9869b..7eb7ee149f2b 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -342,12 +342,14 @@ FIXTURE(iommufd_ioas)
 	uint32_t hwpt_id;
 	uint32_t device_id;
 	uint64_t base_iova;
+	uint32_t device_pasid_id;
 };

 FIXTURE_VARIANT(iommufd_ioas)
 {
 	unsigned int mock_domains;
 	unsigned int memory_limit;
+	bool pasid_capable;
 };

 FIXTURE_SETUP(iommufd_ioas)
@@ -372,6 +374,12 @@ FIXTURE_SETUP(iommufd_ioas)
 				     IOMMU_TEST_DEV_CACHE_DEFAULT);
 		self->base_iova = MOCK_APERTURE_START;
 	}
+
+	if (variant->pasid_capable)
+		test_cmd_mock_domain_flags(self->ioas_id,
+					   MOCK_FLAGS_DEVICE_PASID,
+					   NULL, NULL,
+					   &self->device_pasid_id);
 }

 FIXTURE_TEARDOWN(iommufd_ioas)
@@ -387,6 +395,7 @@ FIXTURE_VARIANT_ADD(iommufd_ioas, no_domain)
 FIXTURE_VARIANT_ADD(iommufd_ioas, mock_domain)
 {
 	.mock_domains = 1,
+	.pasid_capable = true,
 };

 FIXTURE_VARIANT_ADD(iommufd_ioas, two_mock_domain)
@@ -752,6 +761,8 @@ TEST_F(iommufd_ioas, get_hw_info)
 	} buffer_smaller;

 	if (self->device_id) {
+		uint8_t max_pasid = 0;
+
 		/* Provide a zero-size user_buffer */
 		test_cmd_get_hw_info(self->device_id, NULL, 0);
 		/* Provide a user_buffer with exact size */
@@ -766,6 +777,13 @@ TEST_F(iommufd_ioas, get_hw_info)
 		 * the fields within the size range still gets updated.
 		 */
 		test_cmd_get_hw_info(self->device_id, &buffer_smaller, sizeof(buffer_smaller));
+		test_cmd_get_hw_info_pasid(self->device_id, &max_pasid);
+		ASSERT_EQ(0, max_pasid);
+		if (variant->pasid_capable) {
+			test_cmd_get_hw_info_pasid(self->device_pasid_id,
+						   &max_pasid);
+			ASSERT_EQ(MOCK_PASID_WIDTH, max_pasid);
+		}
 	} else {
 		test_err_get_hw_info(ENOENT, self->device_id,
 				     &buffer_exact, sizeof(buffer_exact));
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
index 8fd6f4500090..e11ec4b121fc 100644
--- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
+++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
@@ -666,7 +666,8 @@ TEST_FAIL_NTH(basic_fail_nth, device)
 				  &self->stdev_id, NULL, &idev_id))
 		return -1;

-	if (_test_cmd_get_hw_info(self->fd, idev_id, &info, sizeof(info), NULL))
+	if (_test_cmd_get_hw_info(self->fd, idev_id, &info,
+				  sizeof(info), NULL, NULL))
 		return -1;

 	if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0,
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 27794b6f58fc..72f6636e5d90 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -758,7 +758,8 @@ static void teardown_iommufd(int fd, struct __test_metadata *_metadata)

 /* @data can be NULL */
 static int _test_cmd_get_hw_info(int fd, __u32 device_id, void *data,
-				 size_t data_len, uint32_t *capabilities)
+				 size_t data_len, uint32_t *capabilities,
+				 uint8_t *max_pasid)
 {
 	struct iommu_test_hw_info *info = (struct iommu_test_hw_info *)data;
 	struct iommu_hw_info cmd = {
@@ -803,6 +804,9 @@ static int _test_cmd_get_hw_info(int fd, __u32 device_id, void *data,
 		assert(!info->flags);
 	}

+	if (max_pasid)
+		*max_pasid = cmd.out_max_pasid_log2;
+
 	if (capabilities)
 		*capabilities = cmd.out_capabilities;

@@ -811,14 +815,19 @@ static int _test_cmd_get_hw_info(int fd, __u32 device_id, void *data,

 #define test_cmd_get_hw_info(device_id, data, data_len) \
 	ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, data, \
-					   data_len, NULL))
+					   data_len, NULL, NULL))

 #define test_err_get_hw_info(_errno, device_id, data, data_len) \
 	EXPECT_ERRNO(_errno, _test_cmd_get_hw_info(self->fd, device_id, data, \
-						   data_len, NULL))
+						   data_len, NULL, NULL))

 #define test_cmd_get_hw_capabilities(device_id, caps, mask) \
-	ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, 0, &caps))
+	ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, \
+					   0, &caps, NULL))
+
+#define test_cmd_get_hw_info_pasid(device_id, max_pasid) \
+	ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, \
+					   0, NULL, max_pasid))

 static int _test_ioctl_fault_alloc(int fd, __u32 *fault_id, __u32 *fault_fd)
 {