From patchwork Mon Feb 5 12:01:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Gowans, James"
X-Patchwork-Id: 13545359
From: James Gowans
To:
CC: Eric Biederman , , "Joerg Roedel" , Will Deacon , ,
 Alexander Viro , "Christian Brauner" , , Paolo Bonzini ,
 Sean Christopherson , , Andrew Morton , , Alexander Graf ,
 David Woodhouse , "Jan H . Schoenherr" , Usama Arif ,
 Anthony Yznaga , Stanislav Kinsburskii , , ,
Subject: [RFC 13/18] vfio: add ioctl to define persistent pgtables on container
Date: Mon, 5 Feb 2024 12:01:58 +0000
Message-ID: <20240205120203.60312-14-jgowans@amazon.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20240205120203.60312-1-jgowans@amazon.com>
References: <20240205120203.60312-1-jgowans@amazon.com>
Sender: owner-linux-mm@kvack.org

The previous commits added a file type in pkernfs for IOMMU persistent
page tables. Now support actually setting persistent page tables on an
IOMMU domain. This is done via a VFIO ioctl on a VFIO container.

Userspace needs to create and open an IOMMU persistent page tables file
and then supply that fd to the new VFIO_CONTAINER_SET_PERSISTENT_PGTABLES
ioctl. That ioctl stores the supplied struct file on the struct
vfio_container. The container hands the file to the IOMMU driver in
VFIO_SET_IOMMU, so the new ioctl must be issued before VFIO_SET_IOMMU.
Later, when VFIO allocates the IOMMU domain, it checks whether persistent
page tables have been set. If they have, it uses the
iommu_domain_alloc_persistent API, introduced in the previous commit, to
pass the struct file down to the IOMMU, which will actually use it for
page tables.

After kexec, userspace needs to open the same IOMMU page table file and
set it again via the same ioctl so that the IOMMU continues to use the
same memory region for its page tables for that domain.
---
 drivers/vfio/container.c        | 27 +++++++++++++++++++++++++++
 drivers/vfio/vfio.h             |  2 ++
 drivers/vfio/vfio_iommu_type1.c | 27 +++++++++++++++++++++++++--
 include/uapi/linux/vfio.h       |  9 +++++++++
 4 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/container.c b/drivers/vfio/container.c
index d53d08f16973..b60fcbf7bad0 100644
--- a/drivers/vfio/container.c
+++ b/drivers/vfio/container.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "vfio.h"
@@ -21,6 +22,7 @@ struct vfio_container {
 	struct rw_semaphore		group_lock;
 	struct vfio_iommu_driver	*iommu_driver;
 	void				*iommu_data;
+	struct file			*persistent_pgtables;
 	bool				noiommu;
 };
 
@@ -306,6 +308,8 @@ static long vfio_ioctl_set_iommu(struct vfio_container *container,
 			continue;
 		}
 
+		driver->ops->set_persistent_pgtables(data, container->persistent_pgtables);
+
 		ret = __vfio_container_attach_groups(container, driver, data);
 		if (ret) {
 			driver->ops->release(data);
@@ -324,6 +328,26 @@ static long vfio_ioctl_set_iommu(struct vfio_container *container,
 	return ret;
 }
 
+static int vfio_ioctl_set_persistent_pgtables(struct vfio_container *container,
+	unsigned long arg)
+{
+	struct vfio_set_persistent_pgtables set_ppts;
+	struct file *ppts;
+
+	if (copy_from_user(&set_ppts, (void __user *)arg, sizeof(set_ppts)))
+		return -EFAULT;
+
+	ppts = fget(set_ppts.persistent_pgtables_fd);
+	if (!ppts)
+		return -EBADF;
+	if (!pkernfs_is_iommu_domain_pgtables(ppts)) {
+		fput(ppts);
+		return -EBADF;
+	}
+	container->persistent_pgtables = ppts;
+	return 0;
+}
+
 static long vfio_fops_unl_ioctl(struct file *filep,
 				unsigned int cmd, unsigned long arg)
 {
@@ -345,6 +369,9 @@ static long vfio_fops_unl_ioctl(struct file *filep,
 	case VFIO_SET_IOMMU:
 		ret = vfio_ioctl_set_iommu(container, arg);
 		break;
+	case VFIO_CONTAINER_SET_PERSISTENT_PGTABLES:
+		ret = vfio_ioctl_set_persistent_pgtables(container, arg);
+		break;
 	default:
 		driver = container->iommu_driver;
 		data = container->iommu_data;
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 307e3f29b527..6fa301bf6474 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -226,6 +226,8 @@ struct vfio_iommu_driver_ops {
 				  void *data, size_t count, bool write);
 	struct iommu_domain *(*group_iommu_domain)(void *iommu_data,
 						   struct iommu_group *group);
+	int		(*set_persistent_pgtables)(void *iommu_data,
+						   struct file *ppts);
 };
 
 struct vfio_iommu_driver {
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index eacd6ec04de5..b36edfc5c9ef 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -75,6 +75,7 @@ struct vfio_iommu {
 	bool			nesting;
 	bool			dirty_page_tracking;
 	struct list_head	emulated_iommu_groups;
+	struct file		*persistent_pgtables;
 };
 
 struct vfio_domain {
@@ -2143,9 +2144,14 @@ static void vfio_iommu_iova_insert_copy(struct vfio_iommu *iommu,
 
 static int vfio_iommu_domain_alloc(struct device *dev, void *data)
 {
+	/* data is an in pointer to PPTs, and an out to the new domain. */
+	struct file *ppts = *(struct file **) data;
 	struct iommu_domain **domain = data;
 
-	*domain = iommu_domain_alloc(dev->bus);
+	if (ppts)
+		*domain = iommu_domain_alloc_persistent(dev->bus, ppts);
+	else
+		*domain = iommu_domain_alloc(dev->bus);
 	return 1; /* Don't iterate */
 }
 
@@ -2156,6 +2162,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	struct vfio_iommu_group *group;
 	struct vfio_domain *domain, *d;
 	bool resv_msi;
+	/* In/out ptr to iommu_domain_alloc. */
+	void *domain_alloc_data;
 	phys_addr_t resv_msi_base = 0;
 	struct iommu_domain_geometry *geo;
 	LIST_HEAD(iova_copy);
@@ -2203,8 +2211,12 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	 * want to iterate beyond the first device (if any).
 	 */
 	ret = -EIO;
-	iommu_group_for_each_dev(iommu_group, &domain->domain,
+	/* Smuggle the PPTs in the data field; it will be clobbered with the new domain */
+	domain_alloc_data = iommu->persistent_pgtables;
+	iommu_group_for_each_dev(iommu_group, &domain_alloc_data,
 				 vfio_iommu_domain_alloc);
+	domain->domain = domain_alloc_data;
+
 	if (!domain->domain)
 		goto out_free_domain;
 
@@ -3165,6 +3177,16 @@ vfio_iommu_type1_group_iommu_domain(void *iommu_data,
 	return domain;
 }
 
+int vfio_iommu_type1_set_persistent_pgtables(void *iommu_data,
+		struct file *ppts)
+{
+
+	struct vfio_iommu *iommu = iommu_data;
+
+	iommu->persistent_pgtables = ppts;
+	return 0;
+}
+
 static const struct vfio_iommu_driver_ops vfio_iommu_driver_ops_type1 = {
 	.name			= "vfio-iommu-type1",
 	.owner			= THIS_MODULE,
@@ -3179,6 +3201,7 @@ static const struct vfio_iommu_driver_ops vfio_iommu_driver_ops_type1 = {
 	.unregister_device	= vfio_iommu_type1_unregister_device,
 	.dma_rw			= vfio_iommu_type1_dma_rw,
 	.group_iommu_domain	= vfio_iommu_type1_group_iommu_domain,
+	.set_persistent_pgtables = vfio_iommu_type1_set_persistent_pgtables,
 };
 
 static int __init vfio_iommu_type1_init(void)
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index afc1369216d9..fa9676bb4b26 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1797,6 +1797,15 @@ struct vfio_iommu_spapr_tce_remove {
 };
 #define VFIO_IOMMU_SPAPR_TCE_REMOVE	_IO(VFIO_TYPE, VFIO_BASE + 20)
 
+struct vfio_set_persistent_pgtables {
+	/*
+	 * File descriptor for a pkernfs IOMMU pgtables
+	 * file to be used for persistence.
+	 */
+	__u32	persistent_pgtables_fd;
+};
+#define VFIO_CONTAINER_SET_PERSISTENT_PGTABLES	_IO(VFIO_TYPE, VFIO_BASE + 21)
+
 /* ***************************************************************** */
 
 #endif /* _UAPIVFIO_H */
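
For illustration only, the userspace side of the flow described in the commit message might look roughly like the sketch below. It is not part of the patch. Because the RFC's uapi header is not assumed to be installed, the struct and ioctl number are mirrored locally from the hunk above, and the VFIO_TYPE and VFIO_BASE values are copied from the existing include/uapi/linux/vfio.h. The helper name vfio_set_persistent_pgtables() is hypothetical.

```c
/*
 * Hypothetical userspace sketch, not part of this patch.
 * Assumes a Linux toolchain where <sys/ioctl.h> provides _IO().
 */
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Values mirrored from the existing include/uapi/linux/vfio.h. */
#ifndef VFIO_TYPE
#define VFIO_TYPE (';')
#endif
#ifndef VFIO_BASE
#define VFIO_BASE 100
#endif

/* Local mirror of the uapi struct added by this patch (__u32 field). */
struct vfio_set_persistent_pgtables {
	uint32_t persistent_pgtables_fd;
};
#define VFIO_CONTAINER_SET_PERSISTENT_PGTABLES _IO(VFIO_TYPE, VFIO_BASE + 21)

/*
 * Hand an already-open pkernfs IOMMU page table file to a VFIO container.
 * Returns 0 on success or a negative errno value on failure.
 */
int vfio_set_persistent_pgtables(int container_fd, int pgtables_fd)
{
	struct vfio_set_persistent_pgtables arg = {
		.persistent_pgtables_fd = (uint32_t)pgtables_fd,
	};

	if (ioctl(container_fd, VFIO_CONTAINER_SET_PERSISTENT_PGTABLES, &arg) < 0)
		return -errno;
	return 0;
}
```

A VMM would presumably call this on the container fd before issuing VFIO_SET_IOMMU, since that is where the patch passes the file to the IOMMU driver, and would repeat the call after kexec with the same page table file reopened. How the pkernfs file is created and where it lives is defined by earlier patches in the series and is left out here.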