From patchwork Thu Oct 26 02:49:25 2023
X-Patchwork-Submitter: Baolu Lu
X-Patchwork-Id: 13437180
From: Lu Baolu
To: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy, Jean-Philippe Brucker, Nicolin Chen, Yi Liu, Jacob Pan
Cc: iommu@lists.linux.dev, linux-kselftest@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH v2 1/6] iommu: Add iommu page fault cookie helpers
Date: Thu, 26 Oct 2023 10:49:25 +0800
Message-Id: <20231026024930.382898-2-baolu.lu@linux.intel.com>
In-Reply-To: <20231026024930.382898-1-baolu.lu@linux.intel.com>
References: <20231026024930.382898-1-baolu.lu@linux.intel.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

Add an xarray to struct iommu_fault_param as a placeholder for the
per-{device, pasid} fault cookie. iommufd will use it to store iommufd
device pointers, which lets it quickly retrieve the device object ID for
a given {device, pasid} pair on the hot path of I/O page fault delivery.
Otherwise, iommufd would have to maintain its own data structures to map
{device, pasid} pairs to object IDs and then look up the object ID on the
critical path, which is not performance friendly.

iommufd is expected to set the cookie when a fault-capable domain is
attached to the physical device or pasid, and to clear the cookie when
the domain is removed.
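For clarity, the intended set/get pattern can be sketched with a minimal
userspace model. The fixed-size table below is an illustrative stand-in for
the per-device xarray, and cookie_set()/cookie_get() only model the
semantics of the iopf_pasid_cookie_set()/iopf_pasid_cookie_get() helpers
added by this patch (the old value is returned on set; NULL is a valid
"unset" result on get):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the per-device xarray keyed by PASID. */
#define MAX_PASID 16

struct pasid_cookie_store {
	void *cookie[MAX_PASID];
};

/* Models iopf_pasid_cookie_set(): store a cookie, return the old one. */
static void *cookie_set(struct pasid_cookie_store *s, uint32_t pasid,
			void *cookie)
{
	void *old;

	if (pasid >= MAX_PASID)
		return NULL;
	old = s->cookie[pasid];
	s->cookie[pasid] = cookie;
	return old;
}

/* Models iopf_pasid_cookie_get(): NULL is also a successful (unset) result. */
static void *cookie_get(struct pasid_cookie_store *s, uint32_t pasid)
{
	return pasid < MAX_PASID ? s->cookie[pasid] : NULL;
}
```

In these terms, iommufd stores its device object at attach time, looks it up
by {device, pasid} in the fault path, and stores NULL at detach.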
Signed-off-by: Lu Baolu --- include/linux/iommu.h | 3 +++ drivers/iommu/iommu-priv.h | 15 ++++++++++++ drivers/iommu/io-pgfault.c | 50 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 68 insertions(+) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 2ca3a3eda2e4..615d8a5f9dee 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -608,6 +608,8 @@ struct iommu_device { * @dev: the device that owns this param * @queue: IOPF queue * @queue_list: index into queue->devices + * @pasid_cookie: per-pasid fault cookie used by fault message consumers. + * This array is self-protected by xa_lock(). * @partial: faults that are part of a Page Request Group for which the last * request hasn't been submitted yet. * @faults: holds the pending faults which needs response @@ -619,6 +621,7 @@ struct iommu_fault_param { struct device *dev; struct iopf_queue *queue; struct list_head queue_list; + struct xarray pasid_cookie; struct list_head partial; struct list_head faults; diff --git a/drivers/iommu/iommu-priv.h b/drivers/iommu/iommu-priv.h index 2024a2313348..0dc5ad81cbb6 100644 --- a/drivers/iommu/iommu-priv.h +++ b/drivers/iommu/iommu-priv.h @@ -27,4 +27,19 @@ void iommu_device_unregister_bus(struct iommu_device *iommu, struct bus_type *bus, struct notifier_block *nb); +#ifdef CONFIG_IOMMU_IOPF +void *iopf_pasid_cookie_set(struct device *dev, ioasid_t pasid, void *cookie); +void *iopf_pasid_cookie_get(struct device *dev, ioasid_t pasid); +#else +static inline void *iopf_pasid_cookie_set(struct device *dev, ioasid_t pasid, void *cookie) +{ + return ERR_PTR(-ENODEV); +} + +static inline void *iopf_pasid_cookie_get(struct device *dev, ioasid_t pasid) +{ + return ERR_PTR(-ENODEV); +} +#endif /* CONFIG_IOMMU_IOPF */ + #endif /* __LINUX_IOMMU_PRIV_H */ diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index b288c73f2b22..6fa029538deb 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -454,6 +454,7 @@ int 
iopf_queue_add_device(struct iopf_queue *queue, struct device *dev) mutex_init(&fault_param->lock); INIT_LIST_HEAD(&fault_param->faults); INIT_LIST_HEAD(&fault_param->partial); + xa_init(&fault_param->pasid_cookie); fault_param->dev = dev; fault_param->users = 1; list_add(&fault_param->queue_list, &queue->devices); @@ -575,3 +576,52 @@ void iopf_queue_free(struct iopf_queue *queue) kfree(queue); } EXPORT_SYMBOL_GPL(iopf_queue_free); + +/** + * iopf_pasid_cookie_set - Set a fault cookie for per-{device, pasid} + * @dev: the device to set the cookie + * @pasid: the pasid on this device + * @cookie: the opaque data + * + * Return the old cookie on success, or ERR_PTR on failure. + */ +void *iopf_pasid_cookie_set(struct device *dev, ioasid_t pasid, void *cookie) +{ + struct iommu_fault_param *iopf_param = iopf_get_dev_fault_param(dev); + void *curr; + + if (!iopf_param) + return ERR_PTR(-ENODEV); + + curr = xa_store(&iopf_param->pasid_cookie, pasid, cookie, GFP_KERNEL); + iopf_put_dev_fault_param(iopf_param); + + return xa_is_err(curr) ? ERR_PTR(xa_err(curr)) : curr; +} +EXPORT_SYMBOL_NS_GPL(iopf_pasid_cookie_set, IOMMUFD_INTERNAL); + +/** + * iopf_pasid_cookie_get - Get the fault cookie for {device, pasid} + * @dev: the device where the cookie was set + * @pasid: the pasid on this device + * + * Return the cookie on success, or ERR_PTR on failure. Note that NULL is + * also a successful return. 
+ */
+void *iopf_pasid_cookie_get(struct device *dev, ioasid_t pasid)
+{
+	struct iommu_fault_param *iopf_param = iopf_get_dev_fault_param(dev);
+	void *curr;
+
+	if (!iopf_param)
+		return ERR_PTR(-ENODEV);
+
+	xa_lock(&iopf_param->pasid_cookie);
+	curr = xa_load(&iopf_param->pasid_cookie, pasid);
+	xa_unlock(&iopf_param->pasid_cookie);
+
+	iopf_put_dev_fault_param(iopf_param);
+
+	return curr;
+}
+EXPORT_SYMBOL_NS_GPL(iopf_pasid_cookie_get, IOMMUFD_INTERNAL);

From patchwork Thu Oct 26 02:49:26 2023
X-Patchwork-Submitter: Baolu Lu
X-Patchwork-Id: 13437181
From: Lu Baolu
To: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy, Jean-Philippe Brucker, Nicolin Chen, Yi Liu, Jacob Pan
Cc: iommu@lists.linux.dev, linux-kselftest@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH v2 2/6] iommufd: Add iommu page fault uapi data
Date: Thu, 26 Oct 2023 10:49:26 +0800
Message-Id: <20231026024930.382898-3-baolu.lu@linux.intel.com>
In-Reply-To: <20231026024930.382898-1-baolu.lu@linux.intel.com>
References: <20231026024930.382898-1-baolu.lu@linux.intel.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

Allow userspace to handle IO page faults generated by the IOMMU hardware
when walking an HWPT managed by userspace. One example use case is nested
translation, where the first-stage page table is managed by user space.

When allocating a user HWPT, the user can opt in to the
IOMMU_HWPT_ALLOC_IOPF_CAPABLE flag, which indicates that the user is
capable of handling IO page faults generated for this HWPT. On successful
return from HWPT allocation, the user can retrieve and respond to page
faults by reading from and writing to the fd returned in out_fault_fd.
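As a sketch of how userspace might consume such a record, the snippet below
mirrors the uapi layout added by this patch and decodes one fault record
from a buffer read from out_fault_fd. The pgfault_decode() helper and the
use of fixed-width stdint types in place of __u32/__u64 are illustrative
assumptions, not part of the uapi:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the uapi layout added by this patch (48 bytes on common ABIs). */
struct iommu_hwpt_pgfault {
	uint32_t size;
	uint32_t flags;
	uint32_t dev_id;
	uint32_t pasid;
	uint32_t grpid;
	uint32_t perm;
	uint64_t addr;
	uint64_t private_data[2];
};

/* Decode one fault record from a buffer filled by read() on the fault fd. */
static int pgfault_decode(const void *buf, size_t len,
			  struct iommu_hwpt_pgfault *out)
{
	if (len < sizeof(*out))
		return -1;
	memcpy(out, buf, sizeof(*out));
	/* @size carries sizeof(struct iommu_hwpt_pgfault) for sanity checks. */
	return out->size == sizeof(*out) ? 0 : -1;
}
```

Since reads on the fault fd must be whole multiples of the record size,
userspace would typically size its read buffer as
N * sizeof(struct iommu_hwpt_pgfault) and decode record by record.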
The page fault and response data are encoded in the formats defined by
struct iommu_hwpt_pgfault and struct iommu_hwpt_page_response.
iommu_hwpt_pgfault is largely modeled on iommu_fault, with new members
such as the fault data size and the object ID of the device from which
the page fault originated.

Signed-off-by: Lu Baolu
---
 include/uapi/linux/iommufd.h | 65 ++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index f9b8b95b36b2..0f00f1dfcded 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -355,9 +355,17 @@ struct iommu_vfio_ioas {
  * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve
  *                                as the parent domain in the nesting
  *                                configuration.
+ * @IOMMU_HWPT_ALLOC_IOPF_CAPABLE: User is capable of handling IO page faults.
+ *                                 On successful return, the user can retrieve
+ *                                 faults by reading @out_fault_fd and respond
+ *                                 to the faults by writing to it. The fault
+ *                                 data is encoded in the format defined by
+ *                                 iommu_hwpt_pgfault. The response data format
+ *                                 is defined by iommu_hwpt_page_response.
  */
 enum iommufd_hwpt_alloc_flags {
 	IOMMU_HWPT_ALLOC_NEST_PARENT = 1 << 0,
+	IOMMU_HWPT_ALLOC_IOPF_CAPABLE = 1 << 1,
 };

@@ -476,6 +484,7 @@ struct iommu_hwpt_alloc {
 	__u32 hwpt_type;
 	__u32 data_len;
 	__aligned_u64 data_uptr;
+	__u32 out_fault_fd;
 };
 #define IOMMU_HWPT_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_ALLOC)

@@ -679,6 +688,62 @@ struct iommu_dev_data_arm_smmuv3 {
 	__u32 sid;
 };

+/**
+ * struct iommu_hwpt_pgfault - iommu page fault data
+ * @size: sizeof(struct iommu_hwpt_pgfault)
+ * @flags: Combination of IOMMU_PGFAULT_FLAGS_ flags.
+ *         - PASID_VALID: @pasid field is valid
+ *         - LAST_PAGE: the last page fault in a group
+ *         - PRIV_DATA: @private_data field is valid
+ *         - RESP_NEEDS_PASID: the page response must have the same
+ *           PASID value as the page request.
+ * @dev_id: id of the device where the page fault originated
+ * @pasid: Process Address Space ID
+ * @grpid: Page Request Group Index
+ * @perm: requested page permissions (IOMMU_PGFAULT_PERM_* values)
+ * @addr: page address
+ * @private_data: device-specific private information
+ */
+struct iommu_hwpt_pgfault {
+	__u32 size;
+	__u32 flags;
+#define IOMMU_PGFAULT_FLAGS_PASID_VALID		(1 << 0)
+#define IOMMU_PGFAULT_FLAGS_LAST_PAGE		(1 << 1)
+#define IOMMU_PGFAULT_FLAGS_PRIV_DATA		(1 << 2)
+#define IOMMU_PGFAULT_FLAGS_RESP_NEEDS_PASID	(1 << 3)
+	__u32 dev_id;
+	__u32 pasid;
+	__u32 grpid;
+	__u32 perm;
+#define IOMMU_PGFAULT_PERM_READ			(1 << 0)
+#define IOMMU_PGFAULT_PERM_WRITE		(1 << 1)
+#define IOMMU_PGFAULT_PERM_EXEC			(1 << 2)
+#define IOMMU_PGFAULT_PERM_PRIV			(1 << 3)
+	__u64 addr;
+	__u64 private_data[2];
+};
+
+/**
+ * struct iommu_hwpt_page_response - IOMMU page fault response
+ * @size: sizeof(struct iommu_hwpt_page_response)
+ * @flags: Must be set to 0
+ * @hwpt_id: hwpt ID of the target hardware page table for the response
+ * @dev_id: device ID of the target device for the response
+ * @pasid: Process Address Space ID
+ * @grpid: Page Request Group Index
+ * @code: response code. The supported codes include:
+ *        0: Successful; 1: Response Failure; 2: Invalid Request.
+ */
+struct iommu_hwpt_page_response {
+	__u32 size;
+	__u32 flags;
+	__u32 hwpt_id;
+	__u32 dev_id;
+	__u32 pasid;
+	__u32 grpid;
+	__u32 code;
+};
+
 /**
  * struct iommu_set_dev_data - ioctl(IOMMU_SET_DEV_DATA)
  * @size: sizeof(struct iommu_set_dev_data)

From patchwork Thu Oct 26 02:49:27 2023
X-Patchwork-Submitter: Baolu Lu
X-Patchwork-Id: 13437182
From: Lu Baolu
To: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy, Jean-Philippe Brucker, Nicolin Chen, Yi Liu, Jacob Pan
Cc: iommu@lists.linux.dev, linux-kselftest@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH v2 3/6] iommufd: Initializing and releasing IO page fault data
Date: Thu, 26 Oct 2023 10:49:27 +0800
Message-Id: <20231026024930.382898-4-baolu.lu@linux.intel.com>
In-Reply-To: <20231026024930.382898-1-baolu.lu@linux.intel.com>
References: <20231026024930.382898-1-baolu.lu@linux.intel.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

Add some housekeeping code for IO page fault delivery. Add a fault field
to the iommufd_hw_pagetable structure to store pending IO page faults and
other related data.

The fault field is allocated and initialized when an IOPF-capable user
HWPT is allocated, indicated by the IOMMU_HWPT_ALLOC_IOPF_CAPABLE flag
being set in the allocation user data. The fault field exists until the
HWPT is destroyed, which also means that whether a HWPT is IOPF-capable
can be determined by checking its fault field.

When an IOPF-capable HWPT is attached to a device (or, in the future, a
PASID of a device), the iommufd device pointer is saved for the pasid of
the device. The pointer is cleared and all pending iopf groups are
discarded when the HWPT is detached from the device.
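The deliver/response bookkeeping described above can be sketched with a
minimal userspace model: faults are queued for delivery, moved to a
response list once read, and any leftovers are discarded on detach. The
singly linked lists and helpers below are illustrative stand-ins for the
kernel's list_head machinery, not the actual implementation:

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative stand-in for a queued iopf group. */
struct fault_group {
	int grpid;
	struct fault_group *next;
};

struct fault_queue {
	struct fault_group *deliver;  /* faults not yet read by userspace */
	struct fault_group *response; /* faults read, awaiting a response */
};

/* Userspace read: move the head of the deliver list onto response. */
static struct fault_group *queue_read_one(struct fault_queue *q)
{
	struct fault_group *g = q->deliver;

	if (!g)
		return NULL;
	q->deliver = g->next;
	g->next = q->response;
	q->response = g;
	return g;
}

/* Detach path: drop everything still pending on both lists. */
static int queue_discard_all(struct fault_queue *q)
{
	struct fault_group *g;
	int n = 0;

	while ((g = q->deliver)) { q->deliver = g->next; free(g); n++; }
	while ((g = q->response)) { q->response = g->next; free(g); n++; }
	return n;
}
```

This mirrors the invariant stated above: once the HWPT is detached,
neither list may hold a pending group.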
Signed-off-by: Lu Baolu --- include/linux/iommu.h | 6 +++ drivers/iommu/iommufd/iommufd_private.h | 10 ++++ drivers/iommu/iommufd/device.c | 69 +++++++++++++++++++++++-- drivers/iommu/iommufd/hw_pagetable.c | 56 +++++++++++++++++++- 4 files changed, 137 insertions(+), 4 deletions(-) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 615d8a5f9dee..600ca3842c8a 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -130,6 +130,12 @@ struct iopf_group { struct work_struct work; struct device *dev; struct iommu_domain *domain; + + /* + * Used by iopf handlers, like iommufd, to hook the iopf group + * on its own lists. + */ + struct list_head node; }; /** diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 1bd412cff2d6..0dbaa2dc5b22 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -230,6 +230,15 @@ int iommufd_option_rlimit_mode(struct iommu_option *cmd, int iommufd_vfio_ioas(struct iommufd_ucmd *ucmd); +struct hw_pgtable_fault { + struct iommufd_ctx *ictx; + struct iommufd_hw_pagetable *hwpt; + /* Protect below iopf lists. */ + struct mutex mutex; + struct list_head deliver; + struct list_head response; +}; + /* * A HW pagetable is called an iommu_domain inside the kernel. This user object * allows directly creating and inspecting the domains. 
Domains that have kernel @@ -239,6 +248,7 @@ int iommufd_vfio_ioas(struct iommufd_ucmd *ucmd); struct iommufd_hw_pagetable { struct iommufd_object obj; struct iommu_domain *domain; + struct hw_pgtable_fault *fault; void (*abort)(struct iommufd_object *obj); void (*destroy)(struct iommufd_object *obj); diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 645ab5d290fe..0a8e03d5e7c5 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -456,6 +456,16 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, if (rc) goto err_unlock; + if (hwpt->fault) { + void *curr; + + curr = iopf_pasid_cookie_set(idev->dev, IOMMU_NO_PASID, idev); + if (IS_ERR(curr)) { + rc = PTR_ERR(curr); + goto err_unresv; + } + } + /* * Only attach to the group once for the first device that is in the * group. All the other devices will follow this attachment. The user @@ -466,17 +476,20 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, if (list_empty(&idev->igroup->device_list)) { rc = iommufd_group_setup_msi(idev->igroup, hwpt); if (rc) - goto err_unresv; + goto err_unset; rc = iommu_attach_group(hwpt->domain, idev->igroup->group); if (rc) - goto err_unresv; + goto err_unset; idev->igroup->hwpt = hwpt; } refcount_inc(&hwpt->obj.users); list_add_tail(&idev->group_item, &idev->igroup->device_list); mutex_unlock(&idev->igroup->lock); return 0; +err_unset: + if (hwpt->fault) + iopf_pasid_cookie_set(idev->dev, IOMMU_NO_PASID, NULL); err_unresv: iommufd_device_remove_rr(idev, hwpt); err_unlock: @@ -484,6 +497,30 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, return rc; } +/* + * Discard all pending page faults. Called when a hw pagetable is detached + * from a device. The iommu core guarantees that all page faults have been + * responded, hence there's no need to respond it again. 
+ */ +static void iommufd_hw_pagetable_discard_iopf(struct iommufd_hw_pagetable *hwpt) +{ + struct iopf_group *group, *next; + + if (!hwpt->fault) + return; + + mutex_lock(&hwpt->fault->mutex); + list_for_each_entry_safe(group, next, &hwpt->fault->deliver, node) { + list_del(&group->node); + iopf_free_group(group); + } + list_for_each_entry_safe(group, next, &hwpt->fault->response, node) { + list_del(&group->node); + iopf_free_group(group); + } + mutex_unlock(&hwpt->fault->mutex); +} + struct iommufd_hw_pagetable * iommufd_hw_pagetable_detach(struct iommufd_device *idev) { @@ -491,6 +528,8 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev) mutex_lock(&idev->igroup->lock); list_del(&idev->group_item); + if (hwpt->fault) + iopf_pasid_cookie_set(idev->dev, IOMMU_NO_PASID, NULL); if (list_empty(&idev->igroup->device_list)) { iommu_detach_group(hwpt->domain, idev->igroup->group); idev->igroup->hwpt = NULL; @@ -498,6 +537,8 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev) iommufd_device_remove_rr(idev, hwpt); mutex_unlock(&idev->igroup->lock); + iommufd_hw_pagetable_discard_iopf(hwpt); + /* Caller must destroy hwpt */ return hwpt; } @@ -563,9 +604,24 @@ iommufd_device_do_replace(struct iommufd_device *idev, if (rc) goto err_unresv; + if (old_hwpt->fault) { + iommufd_hw_pagetable_discard_iopf(old_hwpt); + iopf_pasid_cookie_set(idev->dev, IOMMU_NO_PASID, NULL); + } + + if (hwpt->fault) { + void *curr; + + curr = iopf_pasid_cookie_set(idev->dev, IOMMU_NO_PASID, idev); + if (IS_ERR(curr)) { + rc = PTR_ERR(curr); + goto err_unresv; + } + } + rc = iommu_group_replace_domain(igroup->group, hwpt->domain); if (rc) - goto err_unresv; + goto err_unset; if (iommufd_hw_pagetable_compare_ioas(old_hwpt, hwpt)) { list_for_each_entry(cur, &igroup->device_list, group_item) @@ -583,8 +639,15 @@ iommufd_device_do_replace(struct iommufd_device *idev, &old_hwpt->obj.users)); mutex_unlock(&idev->igroup->lock); + iommufd_hw_pagetable_discard_iopf(old_hwpt); + /* Caller must 
destroy old_hwpt */ return old_hwpt; +err_unset: + if (hwpt->fault) + iopf_pasid_cookie_set(idev->dev, IOMMU_NO_PASID, NULL); + if (old_hwpt->fault) + iopf_pasid_cookie_set(idev->dev, IOMMU_NO_PASID, idev); err_unresv: if (iommufd_hw_pagetable_compare_ioas(old_hwpt, hwpt)) { list_for_each_entry(cur, &igroup->device_list, group_item) diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 72c46de1396b..9f94c824cf86 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -38,9 +38,38 @@ static void iommufd_kernel_managed_hwpt_destroy(struct iommufd_object *obj) refcount_dec(&hwpt->ioas->obj.users); } +static struct hw_pgtable_fault *hw_pagetable_fault_alloc(void) +{ + struct hw_pgtable_fault *fault; + + fault = kzalloc(sizeof(*fault), GFP_KERNEL); + if (!fault) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&fault->deliver); + INIT_LIST_HEAD(&fault->response); + mutex_init(&fault->mutex); + + return fault; +} + +static void hw_pagetable_fault_free(struct hw_pgtable_fault *fault) +{ + WARN_ON(!list_empty(&fault->deliver)); + WARN_ON(!list_empty(&fault->response)); + + kfree(fault); +} + void iommufd_hw_pagetable_destroy(struct iommufd_object *obj) { - container_of(obj, struct iommufd_hw_pagetable, obj)->destroy(obj); + struct iommufd_hw_pagetable *hwpt = + container_of(obj, struct iommufd_hw_pagetable, obj); + + if (hwpt->fault) + hw_pagetable_fault_free(hwpt->fault); + + hwpt->destroy(obj); } static void iommufd_user_managed_hwpt_abort(struct iommufd_object *obj) @@ -289,6 +318,17 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, return ERR_PTR(rc); } +static int iommufd_hw_pagetable_iopf_handler(struct iopf_group *group) +{ + struct iommufd_hw_pagetable *hwpt = group->domain->fault_data; + + mutex_lock(&hwpt->fault->mutex); + list_add_tail(&group->node, &hwpt->fault->deliver); + mutex_unlock(&hwpt->fault->mutex); + + return 0; +} + int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) { 
struct iommufd_hw_pagetable *(*alloc_fn)( @@ -364,6 +404,20 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) goto out_unlock; } + if (cmd->flags & IOMMU_HWPT_ALLOC_IOPF_CAPABLE) { + hwpt->fault = hw_pagetable_fault_alloc(); + if (IS_ERR(hwpt->fault)) { + rc = PTR_ERR(hwpt->fault); + hwpt->fault = NULL; + goto out_hwpt; + } + + hwpt->fault->ictx = ucmd->ictx; + hwpt->fault->hwpt = hwpt; + hwpt->domain->iopf_handler = iommufd_hw_pagetable_iopf_handler; + hwpt->domain->fault_data = hwpt; + } + cmd->out_hwpt_id = hwpt->obj.id; rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); if (rc) From patchwork Thu Oct 26 02:49:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Baolu Lu X-Patchwork-Id: 13437183 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73182C25B47 for ; Thu, 26 Oct 2023 02:53:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233222AbjJZCx5 (ORCPT ); Wed, 25 Oct 2023 22:53:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36094 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234602AbjJZCxz (ORCPT ); Wed, 25 Oct 2023 22:53:55 -0400 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.65]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 36C331AB; Wed, 25 Oct 2023 19:53:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698288831; x=1729824831; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SXBnjgZAYdAreHu7Ix6g3uSQoirCPPblBK4F+Kofnoo=; b=ebV8JrX/CiZAh9I3JYl1q/BDVZxzNeCHEtT5X3EmbB1b6QlelHJDExHj 2laAcu7dPDFeLmhL4ELg5BEnpCd+bC2YBJRaJ9ZingpX9Opb+HqvoUX5E 
From: Lu Baolu
To: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy, Jean-Philippe Brucker, Nicolin Chen, Yi Liu, Jacob Pan
Cc: iommu@lists.linux.dev, linux-kselftest@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH v2 4/6] iommufd: Deliver fault messages to user space
Date: Thu, 26 Oct 2023 10:49:28 +0800
Message-Id: <20231026024930.382898-5-baolu.lu@linux.intel.com>
In-Reply-To: <20231026024930.382898-1-baolu.lu@linux.intel.com>
References: <20231026024930.382898-1-baolu.lu@linux.intel.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

Add the file interface that provides a simple and efficient way for
userspace to handle page faults. The file interface allows userspace to
read fault messages sequentially and to report the handling results by
writing to the same file. Userspace applications are encouraged to use
io_uring to speed up reads and writes.

With this done, allow userspace applications to allocate a hw page table
with the IOMMU_HWPT_ALLOC_IOPF_CAPABLE flag set.
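A consumer loop over out_fault_fd might look like the sketch below. The
struct definitions are copied from the uapi patch earlier in this series
(with stdint types standing in for __u32/__u64); handle_one_fault() is a
hypothetical helper, and it takes separate read/write descriptors only so
the sketch can be exercised with pipes, whereas the real interface reads
faults from and writes responses to the same fd. A real consumer would
also fill @hwpt_id from the HWPT it allocated:

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

struct iommu_hwpt_pgfault {
	uint32_t size, flags, dev_id, pasid, grpid, perm;
	uint64_t addr;
	uint64_t private_data[2];
};

struct iommu_hwpt_page_response {
	uint32_t size, flags, hwpt_id, dev_id, pasid, grpid, code;
};

/* Read one fault record and emit a "success" (code 0) response. */
static int handle_one_fault(int fault_fd, int resp_fd)
{
	struct iommu_hwpt_pgfault fault;
	struct iommu_hwpt_page_response resp = { .size = sizeof(resp) };

	/* Reads and writes must be whole multiples of the record sizes. */
	if (read(fault_fd, &fault, sizeof(fault)) != (ssize_t)sizeof(fault))
		return -1;

	resp.dev_id = fault.dev_id;
	resp.pasid = fault.pasid;
	resp.grpid = fault.grpid;
	resp.code = 0; /* 0: Successful */

	if (write(resp_fd, &resp, sizeof(resp)) != (ssize_t)sizeof(resp))
		return -1;
	return 0;
}
```

In a real application, this loop would typically be driven by io_uring
read/write requests on the fault fd, as the commit message suggests.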
Signed-off-by: Lu Baolu --- drivers/iommu/iommufd/iommufd_private.h | 2 + drivers/iommu/iommufd/hw_pagetable.c | 204 +++++++++++++++++++++++- 2 files changed, 205 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 0dbaa2dc5b22..ff063bc48150 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -237,6 +237,8 @@ struct hw_pgtable_fault { struct mutex mutex; struct list_head deliver; struct list_head response; + struct file *fault_file; + int fault_fd; }; /* diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 9f94c824cf86..f0aac1bb2d2d 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -3,6 +3,8 @@ * Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES */ #include +#include +#include #include #include "../iommu-priv.h" @@ -38,9 +40,198 @@ static void iommufd_kernel_managed_hwpt_destroy(struct iommufd_object *obj) refcount_dec(&hwpt->ioas->obj.users); } +static int iommufd_compose_fault_message(struct iommu_fault *fault, + struct iommu_hwpt_pgfault *hwpt_fault, + struct device *dev) +{ + struct iommufd_device *idev = iopf_pasid_cookie_get(dev, IOMMU_NO_PASID); + + if (!idev) + return -ENODEV; + + if (IS_ERR(idev)) + return PTR_ERR(idev); + + hwpt_fault->size = sizeof(*hwpt_fault); + hwpt_fault->flags = fault->prm.flags; + hwpt_fault->dev_id = idev->obj.id; + hwpt_fault->pasid = fault->prm.pasid; + hwpt_fault->grpid = fault->prm.grpid; + hwpt_fault->perm = fault->prm.perm; + hwpt_fault->addr = fault->prm.addr; + hwpt_fault->private_data[0] = fault->prm.private_data[0]; + hwpt_fault->private_data[1] = fault->prm.private_data[1]; + + return 0; +} + +static ssize_t hwpt_fault_fops_read(struct file *filep, char __user *buf, + size_t count, loff_t *ppos) +{ + size_t fault_size = sizeof(struct iommu_hwpt_pgfault); + struct hw_pgtable_fault *fault = 
filep->private_data; + struct iommu_hwpt_pgfault data; + struct iopf_group *group; + struct iopf_fault *iopf; + size_t done = 0; + int rc; + + if (*ppos || count % fault_size) + return -ESPIPE; + + mutex_lock(&fault->mutex); + while (!list_empty(&fault->deliver) && count > done) { + group = list_first_entry(&fault->deliver, + struct iopf_group, node); + + if (list_count_nodes(&group->faults) * fault_size > count - done) + break; + + list_for_each_entry(iopf, &group->faults, list) { + rc = iommufd_compose_fault_message(&iopf->fault, + &data, group->dev); + if (rc) + goto err_unlock; + rc = copy_to_user(buf + done, &data, fault_size); + if (rc) + goto err_unlock; + done += fault_size; + } + + list_move_tail(&group->node, &fault->response); + } + mutex_unlock(&fault->mutex); + + return done; +err_unlock: + mutex_unlock(&fault->mutex); + return rc; +} + +static ssize_t hwpt_fault_fops_write(struct file *filep, + const char __user *buf, + size_t count, loff_t *ppos) +{ + size_t response_size = sizeof(struct iommu_hwpt_page_response); + struct hw_pgtable_fault *fault = filep->private_data; + struct iommu_hwpt_page_response response; + struct iommufd_hw_pagetable *hwpt; + struct iopf_group *iter, *group; + struct iommufd_device *idev; + size_t done = 0; + int rc = 0; + + if (*ppos || count % response_size) + return -ESPIPE; + + mutex_lock(&fault->mutex); + while (!list_empty(&fault->response) && count > done) { + rc = copy_from_user(&response, buf + done, response_size); + if (rc) + break; + + /* Get the device that this response targets at. */ + idev = container_of(iommufd_get_object(fault->ictx, + response.dev_id, + IOMMUFD_OBJ_DEVICE), + struct iommufd_device, obj); + if (IS_ERR(idev)) { + rc = PTR_ERR(idev); + break; + } + + /* + * Get the hw page table that this response was generated for. + * It must match the one stored in the fault data. 
+ */ + hwpt = container_of(iommufd_get_object(fault->ictx, + response.hwpt_id, + IOMMUFD_OBJ_HW_PAGETABLE), + struct iommufd_hw_pagetable, obj); + if (IS_ERR(hwpt)) { + iommufd_put_object(&idev->obj); + rc = PTR_ERR(hwpt); + break; + } + + if (hwpt != fault->hwpt) { + rc = -EINVAL; + goto put_obj; + } + + group = NULL; + list_for_each_entry(iter, &fault->response, node) { + if (response.grpid != iter->last_fault.fault.prm.grpid) + continue; + + if (idev->dev != iter->dev) + continue; + + if ((iter->last_fault.fault.prm.flags & + IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) && + response.pasid != iter->last_fault.fault.prm.pasid) + continue; + + group = iter; + break; + } + + if (!group) { + rc = -ENODEV; + goto put_obj; + } + + rc = iopf_group_response(group, response.code); + if (rc) + goto put_obj; + + list_del(&group->node); + iopf_free_group(group); + done += response_size; +put_obj: + iommufd_put_object(&hwpt->obj); + iommufd_put_object(&idev->obj); + if (rc) + break; + } + mutex_unlock(&fault->mutex); + + return (rc < 0) ? 
rc : done; +} + +static const struct file_operations hwpt_fault_fops = { + .owner = THIS_MODULE, + .read = hwpt_fault_fops_read, + .write = hwpt_fault_fops_write, +}; + +static int hw_pagetable_get_fault_fd(struct hw_pgtable_fault *fault) +{ + struct file *filep; + int fdno; + + fdno = get_unused_fd_flags(O_CLOEXEC); + if (fdno < 0) + return fdno; + + filep = anon_inode_getfile("[iommufd-pgfault]", &hwpt_fault_fops, + fault, O_RDWR); + if (IS_ERR(filep)) { + put_unused_fd(fdno); + return PTR_ERR(filep); + } + + fd_install(fdno, filep); + fault->fault_file = filep; + fault->fault_fd = fdno; + + return 0; +} + static struct hw_pgtable_fault *hw_pagetable_fault_alloc(void) { struct hw_pgtable_fault *fault; + int rc; fault = kzalloc(sizeof(*fault), GFP_KERNEL); if (!fault) @@ -50,6 +241,12 @@ static struct hw_pgtable_fault *hw_pagetable_fault_alloc(void) INIT_LIST_HEAD(&fault->response); mutex_init(&fault->mutex); + rc = hw_pagetable_get_fault_fd(fault); + if (rc) { + kfree(fault); + return ERR_PTR(rc); + } + return fault; } @@ -58,6 +255,8 @@ static void hw_pagetable_fault_free(struct hw_pgtable_fault *fault) WARN_ON(!list_empty(&fault->deliver)); WARN_ON(!list_empty(&fault->response)); + fput(fault->fault_file); + put_unused_fd(fault->fault_fd); kfree(fault); } @@ -347,7 +546,9 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) struct mutex *mutex; int rc; - if (cmd->flags & ~IOMMU_HWPT_ALLOC_NEST_PARENT || cmd->__reserved) + if ((cmd->flags & ~(IOMMU_HWPT_ALLOC_NEST_PARENT | + IOMMU_HWPT_ALLOC_IOPF_CAPABLE)) || + cmd->__reserved) return -EOPNOTSUPP; if (!cmd->data_len && cmd->hwpt_type != IOMMU_HWPT_TYPE_DEFAULT) return -EINVAL; @@ -416,6 +617,7 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) hwpt->fault->hwpt = hwpt; hwpt->domain->iopf_handler = iommufd_hw_pagetable_iopf_handler; hwpt->domain->fault_data = hwpt; + cmd->out_fault_fd = hwpt->fault->fault_fd; } cmd->out_hwpt_id = hwpt->obj.id; From patchwork Thu Oct 26 02:49:29 2023 Content-Type: text/plain; 
From: Lu Baolu Subject: [PATCH v2 5/6] iommufd/selftest: Add IOMMU_TEST_OP_TRIGGER_IOPF test support Date: Thu, 26 Oct 2023 10:49:29 +0800 Message-Id: <20231026024930.382898-6-baolu.lu@linux.intel.com> In-Reply-To: <20231026024930.382898-1-baolu.lu@linux.intel.com> Extend the selftest mock device to generate and respond to an IOPF, and add an ioctl interface that lets userspace applications trigger an IOPF on the mock device. This allows userspace to test IOMMUFD's IOPF handling without relying on any real hardware.
Signed-off-by: Lu Baolu --- drivers/iommu/iommufd/iommufd_test.h | 8 ++++ drivers/iommu/iommufd/selftest.c | 56 ++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+) diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h index 65a363f5e81e..98951a2af4bd 100644 --- a/drivers/iommu/iommufd/iommufd_test.h +++ b/drivers/iommu/iommufd/iommufd_test.h @@ -21,6 +21,7 @@ enum { IOMMU_TEST_OP_ACCESS_REPLACE_IOAS, IOMMU_TEST_OP_MD_CHECK_IOTLB, IOMMU_TEST_OP_DEV_CHECK_DATA, + IOMMU_TEST_OP_TRIGGER_IOPF, }; enum { @@ -109,6 +110,13 @@ struct iommu_test_cmd { struct { __u32 val; } check_dev_data; + struct { + __u32 dev_id; + __u32 pasid; + __u32 grpid; + __u32 perm; + __u64 addr; + } trigger_iopf; }; __u32 last; }; diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 117776d236dc..b2d2edc3d2d2 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -401,11 +401,21 @@ static void mock_domain_set_plaform_dma_ops(struct device *dev) */ } +static struct iopf_queue *mock_iommu_iopf_queue; + static struct iommu_device mock_iommu_device = { }; static struct iommu_device *mock_probe_device(struct device *dev) { + int rc; + + if (mock_iommu_iopf_queue) { + rc = iopf_queue_add_device(mock_iommu_iopf_queue, dev); + if (rc) + return ERR_PTR(-ENODEV); + } + return &mock_iommu_device; } @@ -431,6 +441,12 @@ static void mock_domain_unset_dev_user_data(struct device *dev) mdev->dev_data = 0; } +static int mock_domain_page_response(struct device *dev, struct iopf_fault *evt, + struct iommu_page_response *msg) +{ + return 0; +} + static const struct iommu_ops mock_ops = { .owner = THIS_MODULE, .pgsize_bitmap = MOCK_IO_PAGE_SIZE, @@ -443,6 +459,7 @@ static const struct iommu_ops mock_ops = { .probe_device = mock_probe_device, .set_dev_user_data = mock_domain_set_dev_user_data, .unset_dev_user_data = mock_domain_unset_dev_user_data, + .page_response = mock_domain_page_response, 
.default_domain_ops = &(struct iommu_domain_ops){ .free = mock_domain_free, @@ -542,6 +559,9 @@ static void mock_dev_release(struct device *dev) { struct mock_dev *mdev = container_of(dev, struct mock_dev, dev); + if (mock_iommu_iopf_queue) + iopf_queue_remove_device(mock_iommu_iopf_queue, dev); + atomic_dec(&mock_dev_num); kfree(mdev); } @@ -1200,6 +1220,32 @@ static_assert((unsigned int)MOCK_ACCESS_RW_WRITE == IOMMUFD_ACCESS_RW_WRITE); static_assert((unsigned int)MOCK_ACCESS_RW_SLOW_PATH == __IOMMUFD_ACCESS_RW_SLOW_PATH); +static int iommufd_test_trigger_iopf(struct iommufd_ucmd *ucmd, + struct iommu_test_cmd *cmd) +{ + struct iopf_fault event = { }; + struct iommufd_device *idev; + int rc; + + idev = iommufd_get_device(ucmd, cmd->trigger_iopf.dev_id); + if (IS_ERR(idev)) + return PTR_ERR(idev); + + event.fault.prm.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE; + if (cmd->trigger_iopf.pasid != IOMMU_NO_PASID) + event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID; + event.fault.type = IOMMU_FAULT_PAGE_REQ; + event.fault.prm.addr = cmd->trigger_iopf.addr; + event.fault.prm.pasid = cmd->trigger_iopf.pasid; + event.fault.prm.grpid = cmd->trigger_iopf.grpid; + event.fault.prm.perm = cmd->trigger_iopf.perm; + + rc = iommu_report_device_fault(idev->dev, &event); + iommufd_put_object(&idev->obj); + + return rc; +} + void iommufd_selftest_destroy(struct iommufd_object *obj) { struct selftest_obj *sobj = container_of(obj, struct selftest_obj, obj); @@ -1271,6 +1317,8 @@ int iommufd_test(struct iommufd_ucmd *ucmd) return -EINVAL; iommufd_test_memory_limit = cmd->memory_limit.limit; return 0; + case IOMMU_TEST_OP_TRIGGER_IOPF: + return iommufd_test_trigger_iopf(ucmd, cmd); default: return -EOPNOTSUPP; } @@ -1312,6 +1360,9 @@ int __init iommufd_test_init(void) &iommufd_mock_bus_type.nb); if (rc) goto err_sysfs; + + mock_iommu_iopf_queue = iopf_queue_alloc("mock-iopfq"); + return 0; err_sysfs: @@ -1327,6 +1378,11 @@ int __init iommufd_test_init(void) void 
iommufd_test_exit(void) { + if (mock_iommu_iopf_queue) { + iopf_queue_free(mock_iommu_iopf_queue); + mock_iommu_iopf_queue = NULL; + } + iommu_device_sysfs_remove(&mock_iommu_device); iommu_device_unregister_bus(&mock_iommu_device, &iommufd_mock_bus_type.bus, From patchwork Thu Oct 26 02:49:30 2023
From: Lu Baolu Subject: [PATCH v2 6/6] iommufd/selftest: Add coverage for IOMMU_TEST_OP_TRIGGER_IOPF Date: Thu, 26 Oct 2023 10:49:30 +0800 Message-Id: <20231026024930.382898-7-baolu.lu@linux.intel.com> In-Reply-To: <20231026024930.382898-1-baolu.lu@linux.intel.com> Extend the selftest tool to cover IOPF handling with the following tests: - Allocating and destroying an IOPF-capable HWPT. - Attaching/detaching/replacing an IOPF-capable HWPT on a device. - Triggering an IOPF on the mock device.
- Retrieving and responding to the IOPF through the IOPF FD Signed-off-by: Lu Baolu --- tools/testing/selftests/iommu/iommufd_utils.h | 66 +++++++++++++++++-- tools/testing/selftests/iommu/iommufd.c | 24 +++++-- .../selftests/iommu/iommufd_fail_nth.c | 2 +- 3 files changed, 81 insertions(+), 11 deletions(-) diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index b75f168fca46..df22c02af997 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -103,8 +103,8 @@ static int _test_cmd_mock_domain_replace(int fd, __u32 stdev_id, __u32 pt_id, pt_id, NULL)) static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, - __u32 flags, __u32 *hwpt_id, __u32 hwpt_type, - void *data, size_t data_len) + __u32 flags, __u32 *hwpt_id, __u32 *fault_fd, + __u32 hwpt_type, void *data, size_t data_len) { struct iommu_hwpt_alloc cmd = { .size = sizeof(cmd), @@ -122,28 +122,39 @@ static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, return ret; if (hwpt_id) *hwpt_id = cmd.out_hwpt_id; + if (fault_fd) + *fault_fd = cmd.out_fault_fd; + return 0; } #define test_cmd_hwpt_alloc(device_id, pt_id, flags, hwpt_id) \ ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ - hwpt_id, IOMMU_HWPT_TYPE_DEFAULT, \ + hwpt_id, NULL, \ + IOMMU_HWPT_TYPE_DEFAULT, \ NULL, 0)) #define test_err_hwpt_alloc(_errno, device_id, pt_id, flags, hwpt_id) \ EXPECT_ERRNO(_errno, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, \ - flags, hwpt_id, \ + flags, hwpt_id, NULL, \ IOMMU_HWPT_TYPE_DEFAULT, \ NULL, 0)) #define test_cmd_hwpt_alloc_nested(device_id, pt_id, flags, hwpt_id, \ hwpt_type, data, data_len) \ ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ - hwpt_id, hwpt_type, data, data_len)) + hwpt_id, NULL, hwpt_type, data, \ + data_len)) #define test_err_hwpt_alloc_nested(_errno, device_id, pt_id, flags, hwpt_id, \ hwpt_type, data, 
data_len) \ EXPECT_ERRNO(_errno, \ _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ - hwpt_id, hwpt_type, data, data_len)) + hwpt_id, NULL, hwpt_type, data, \ + data_len)) +#define test_cmd_hwpt_alloc_nested_iopf(device_id, pt_id, flags, hwpt_id, \ + fault_fd, hwpt_type, data, data_len) \ + ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ + hwpt_id, fault_fd, hwpt_type, data, \ + data_len)) #define test_cmd_hwpt_check_iotlb(hwpt_id, iotlb_id, expected) \ ({ \ @@ -551,3 +562,46 @@ static int _test_cmd_unset_dev_data(int fd, __u32 device_id) #define test_err_unset_dev_data(_errno, device_id) \ EXPECT_ERRNO(_errno, \ _test_cmd_unset_dev_data(self->fd, device_id)) + +static int _test_cmd_trigger_iopf(int fd, __u32 device_id, __u32 fault_fd, __u32 hwpt_id) +{ + struct iommu_test_cmd trigger_iopf_cmd = { + .size = sizeof(trigger_iopf_cmd), + .op = IOMMU_TEST_OP_TRIGGER_IOPF, + .trigger_iopf = { + .dev_id = device_id, + .pasid = 0x1, + .grpid = 0x2, + .perm = IOMMU_PGFAULT_PERM_READ | IOMMU_PGFAULT_PERM_WRITE, + .addr = 0xdeadbeaf, + }, + }; + struct iommu_hwpt_page_response response = { + .size = sizeof(struct iommu_hwpt_page_response), + .hwpt_id = hwpt_id, + .dev_id = device_id, + .pasid = 0x1, + .grpid = 0x2, + .code = 0, + }; + struct iommu_hwpt_pgfault fault = {}; + ssize_t bytes; + int ret; + + ret = ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_TRIGGER_IOPF), &trigger_iopf_cmd); + if (ret) + return ret; + + bytes = read(fault_fd, &fault, sizeof(fault)); + if (bytes < 0) + return bytes; + + bytes = write(fault_fd, &response, sizeof(response)); + if (bytes < 0) + return bytes; + + return 0; +} + +#define test_cmd_trigger_iopf(device_id, fault_fd, hwpt_id) \ + ASSERT_EQ(0, _test_cmd_trigger_iopf(self->fd, device_id, fault_fd, hwpt_id)) diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 7cf06a4635d8..b30b82a72785 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ 
b/tools/testing/selftests/iommu/iommufd.c @@ -275,11 +275,12 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) .iotlb = IOMMU_TEST_IOTLB_DEFAULT, }; struct iommu_hwpt_invalidate_selftest inv_reqs[2] = {0}; - uint32_t nested_hwpt_id[2] = {}; + uint32_t nested_hwpt_id[3] = {}; uint32_t num_inv, driver_error; uint32_t parent_hwpt_id = 0; uint32_t parent_hwpt_id_not_work = 0; uint32_t test_hwpt_id = 0; + uint32_t fault_fd; if (self->device_id) { /* Negative tests */ @@ -323,7 +324,7 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) IOMMU_HWPT_TYPE_SELFTEST, &data, sizeof(data)); - /* Allocate two nested hwpts sharing one common parent hwpt */ + /* Allocate nested hwpts sharing one common parent hwpt */ test_cmd_hwpt_alloc_nested(self->device_id, parent_hwpt_id, 0, &nested_hwpt_id[0], IOMMU_HWPT_TYPE_SELFTEST, @@ -332,6 +333,11 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) 0, &nested_hwpt_id[1], IOMMU_HWPT_TYPE_SELFTEST, &data, sizeof(data)); + test_cmd_hwpt_alloc_nested_iopf(self->device_id, parent_hwpt_id, + IOMMU_HWPT_ALLOC_IOPF_CAPABLE, + &nested_hwpt_id[2], &fault_fd, + IOMMU_HWPT_TYPE_SELFTEST, + &data, sizeof(data)); test_cmd_hwpt_check_iotlb_all(nested_hwpt_id[0], IOMMU_TEST_IOTLB_DEFAULT); test_cmd_hwpt_check_iotlb_all(nested_hwpt_id[1], @@ -418,10 +424,20 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) _test_ioctl_destroy(self->fd, nested_hwpt_id[1])); test_ioctl_destroy(nested_hwpt_id[0]); - /* Detach from nested_hwpt_id[1] and destroy it */ - test_cmd_mock_domain_replace(self->stdev_id, parent_hwpt_id); + /* Switch from nested_hwpt_id[1] to nested_hwpt_id[2] */ + test_cmd_mock_domain_replace(self->stdev_id, + nested_hwpt_id[2]); + EXPECT_ERRNO(EBUSY, + _test_ioctl_destroy(self->fd, nested_hwpt_id[2])); test_ioctl_destroy(nested_hwpt_id[1]); + /* Trigger an IOPF on the device */ + test_cmd_trigger_iopf(self->device_id, fault_fd, nested_hwpt_id[2]); + + /* Detach from nested_hwpt_id[2] and destroy it */ + test_cmd_mock_domain_replace(self->stdev_id, parent_hwpt_id); + 
test_ioctl_destroy(nested_hwpt_id[2]); + /* Detach from the parent hw_pagetable and destroy it */ test_cmd_mock_domain_replace(self->stdev_id, self->ioas_id); test_ioctl_destroy(parent_hwpt_id); diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c index d3f47f262c04..2b7b582c17c4 100644 --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c @@ -615,7 +615,7 @@ TEST_FAIL_NTH(basic_fail_nth, device) if (_test_cmd_get_hw_info(self->fd, idev_id, &info, sizeof(info))) return -1; - if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, &hwpt_id, + if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, &hwpt_id, NULL, IOMMU_HWPT_TYPE_DEFAULT, 0, 0)) return -1;