From patchwork Thu Sep 20 17:00:37 2018
X-Patchwork-Submitter: Jean-Philippe Brucker
X-Patchwork-Id: 10608303
X-Patchwork-Delegate: bhelgaas@google.com
From: Jean-Philippe Brucker
To: iommu@lists.linux-foundation.org
Cc: joro@8bytes.org, linux-pci@vger.kernel.org, jcrouse@codeaurora.org,
    alex.williamson@redhat.com, Jonathan.Cameron@huawei.com,
    jacob.jun.pan@linux.intel.com, christian.koenig@amd.com,
    eric.auger@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com,
    andrew.murray@arm.com, will.deacon@arm.com, robin.murphy@arm.com,
    ashok.raj@intel.com, baolu.lu@linux.intel.com, xuzaibo@huawei.com,
    liguozhu@hisilicon.com, okaya@codeaurora.org, bharatku@xilinx.com,
    ilias.apalodimas@linaro.org, shunyong.yang@hxt-semitech.com
Subject: [PATCH v3 01/10] iommu: Introduce Shared Virtual Addressing API
Date: Thu, 20 Sep 2018 18:00:37 +0100
Message-Id: <20180920170046.20154-2-jean-philippe.brucker@arm.com>
X-Mailer: git-send-email 2.18.0
In-Reply-To: <20180920170046.20154-1-jean-philippe.brucker@arm.com>
References: <20180920170046.20154-1-jean-philippe.brucker@arm.com>

Shared Virtual Addressing (SVA) provides a way for device drivers to bind process address spaces to devices. This requires the IOMMU to support page table format and features compatible with the CPUs, and usually requires the system to support I/O Page Faults (IOPF) and Process Address Space ID (PASID). When all of these are available, DMA can access virtual addresses of a process. A PASID is allocated for each process, and the device driver programs it into the device in an implementation-specific way. Add a new API for sharing process page tables with devices.
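As a rough illustration of the driver-side flow this series enables (bind()/unbind() are only added by later patches in the series, and a later patch also adds an mm_exit callback argument to iommu_sva_init_device()), a device driver could do something like the sketch below. struct foo_device, foo->dev and foo_set_pasid() are hypothetical, and error handling is abbreviated:

#include <linux/iommu.h>
#include <linux/sched.h>

static int foo_enable_sva(struct foo_device *foo)
{
	int ret, pasid;

	/* No optional features yet; keep the IOMMU's default PASID limits */
	ret = iommu_sva_init_device(foo->dev, 0, 0, 0);
	if (ret)
		return ret;

	/* Called from an ioctl, so current->mm is the caller's address space */
	ret = iommu_sva_bind_device(foo->dev, current->mm, &pasid, 0, foo);
	if (ret) {
		iommu_sva_shutdown_device(foo->dev);
		return ret;
	}

	/* Tell the device to tag its DMA with this PASID (device-specific) */
	foo_set_pasid(foo, pasid);
	return 0;
}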
Introduce two IOMMU operations, sva_init_device() and sva_shutdown_device(), that prepare the IOMMU driver for SVA. For example allocate PASID tables and fault queues. Subsequent patches will implement the bind() and unbind() operations. Introduce a new mutex sva_lock on the device's IOMMU param to serialize init(), shutdown(), bind() and unbind() operations. Using the existing lock isn't possible because the unbind() and shutdown() operations will have to wait while holding sva_lock for concurrent fault queue flushes to terminate. These flushes will take the existing lock. Support for I/O Page Faults will be added in a later patch using a new feature bit (IOMMU_SVA_FEAT_IOPF). With the current API users must pin down all shared mappings. Signed-off-by: Jean-Philippe Brucker --- v2->v3: * Add sva_lock to serialize init/bind/unbind/shutdown * Rename functions for consistency with the rest of the API --- drivers/iommu/Kconfig | 4 ++ drivers/iommu/Makefile | 1 + drivers/iommu/iommu-sva.c | 107 ++++++++++++++++++++++++++++++++++++++ drivers/iommu/iommu.c | 1 + include/linux/iommu.h | 34 ++++++++++++ 5 files changed, 147 insertions(+) create mode 100644 drivers/iommu/iommu-sva.c diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index c60395b7470f..884580401919 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -95,6 +95,10 @@ config IOMMU_DMA select IOMMU_IOVA select NEED_SG_DMA_LENGTH +config IOMMU_SVA + bool + select IOMMU_API + config FSL_PAMU bool "Freescale IOMMU support" depends on PCI diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index ab5eba6edf82..7d6332be5f0e 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -4,6 +4,7 @@ obj-$(CONFIG_IOMMU_API) += iommu-traces.o obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o obj-$(CONFIG_IOMMU_DEBUGFS) += iommu-debugfs.o obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o +obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c new file mode 100644 index 000000000000..85ef98efede8 --- /dev/null +++ b/drivers/iommu/iommu-sva.c @@ -0,0 +1,107 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Manage PASIDs and bind process address spaces to devices. + * + * Copyright (C) 2018 ARM Ltd. + */ + +#include +#include + +/** + * iommu_sva_init_device() - Initialize Shared Virtual Addressing for a device + * @dev: the device + * @features: bitmask of features that need to be initialized + * @min_pasid: min PASID value supported by the device + * @max_pasid: max PASID value supported by the device + * + * Users of the bind()/unbind() API must call this function to initialize all + * features required for SVA. + * + * The device must support multiple address spaces (e.g. PCI PASID). By default + * the PASID allocated during bind() is limited by the IOMMU capacity, and by + * the device PASID width defined in the PCI capability or in the firmware + * description. Setting @max_pasid to a non-zero value smaller than this limit + * overrides it. Similarly, @min_pasid overrides the lower PASID limit supported + * by the IOMMU. + * + * The device should not be performing any DMA while this function is running, + * otherwise the behavior is undefined. + * + * Return 0 if initialization succeeded, or an error. 
+ */ +int iommu_sva_init_device(struct device *dev, unsigned long features, + unsigned int min_pasid, unsigned int max_pasid) +{ + int ret; + struct iommu_sva_param *param; + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); + + if (!domain || !domain->ops->sva_init_device) + return -ENODEV; + + if (features) + return -EINVAL; + + param = kzalloc(sizeof(*param), GFP_KERNEL); + if (!param) + return -ENOMEM; + + param->features = features; + param->min_pasid = min_pasid; + param->max_pasid = max_pasid; + + mutex_lock(&dev->iommu_param->sva_lock); + if (dev->iommu_param->sva_param) { + ret = -EEXIST; + goto err_unlock; + } + + /* + * IOMMU driver updates the limits depending on the IOMMU and device + * capabilities. + */ + ret = domain->ops->sva_init_device(dev, param); + if (ret) + goto err_unlock; + + dev->iommu_param->sva_param = param; + mutex_unlock(&dev->iommu_param->sva_lock); + return 0; + +err_unlock: + mutex_unlock(&dev->iommu_param->sva_lock); + kfree(param); + return ret; +} +EXPORT_SYMBOL_GPL(iommu_sva_init_device); + +/** + * iommu_sva_shutdown_device() - Shutdown Shared Virtual Addressing for a device + * @dev: the device + * + * Disable SVA. Device driver should ensure that the device isn't performing any + * DMA while this function is running. + */ +void iommu_sva_shutdown_device(struct device *dev) +{ + struct iommu_sva_param *param; + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); + + if (!domain) + return; + + mutex_lock(&dev->iommu_param->sva_lock); + param = dev->iommu_param->sva_param; + if (!param) + goto out_unlock; + + if (domain->ops->sva_shutdown_device) + domain->ops->sva_shutdown_device(dev); + + kfree(param); + dev->iommu_param->sva_param = NULL; +out_unlock: + mutex_unlock(&dev->iommu_param->sva_lock); +} +EXPORT_SYMBOL_GPL(iommu_sva_shutdown_device); diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 58f3477f2993..fa0561ed006f 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -653,6 +653,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev) goto err_free_name; } mutex_init(&dev->iommu_param->lock); + mutex_init(&dev->iommu_param->sva_lock); kobject_get(group->devices_kobj); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 8177f7736fcd..4c27cb347770 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -197,6 +197,12 @@ struct page_response_msg { u64 private_data; }; +struct iommu_sva_param { + unsigned long features; + unsigned int min_pasid; + unsigned int max_pasid; +}; + /** * struct iommu_ops - iommu ops and capabilities * @capable: check capability @@ -204,6 +210,8 @@ struct page_response_msg { * @domain_free: free iommu domain * @attach_dev: attach device to an iommu domain * @detach_dev: detach device from an iommu domain + * @sva_init_device: initialize Shared Virtual Addressing for a device + * @sva_shutdown_device: shutdown Shared Virtual Addressing for a device * @map: map a physically contiguous memory region to an iommu domain * @unmap: unmap a physically contiguous memory region from an iommu domain * @flush_tlb_all: Synchronously flush all hardware TLBs for this domain @@ -239,6 +247,8 @@ struct iommu_ops { int (*attach_dev)(struct iommu_domain *domain, struct device *dev); void (*detach_dev)(struct iommu_domain *domain, struct device *dev); + int (*sva_init_device)(struct device *dev, struct iommu_sva_param *param); + void (*sva_shutdown_device)(struct device *dev); int (*map)(struct iommu_domain *domain, unsigned long iova, phys_addr_t 
paddr, size_t size, int prot); size_t (*unmap)(struct iommu_domain *domain, unsigned long iova, @@ -393,6 +403,9 @@ struct iommu_fault_param { * struct iommu_param - collection of per-device IOMMU data * * @fault_param: IOMMU detected device fault reporting data + * @lock: serializes accesses to fault_param + * @sva_param: SVA parameters + * @sva_lock: serializes accesses to sva_param * * TODO: migrate other per device data pointers under iommu_dev_data, e.g. * struct iommu_group *iommu_group; @@ -401,6 +414,8 @@ struct iommu_fault_param { struct iommu_param { struct mutex lock; struct iommu_fault_param *fault_param; + struct mutex sva_lock; + struct iommu_sva_param *sva_param; }; int iommu_device_register(struct iommu_device *iommu); @@ -904,4 +919,23 @@ void iommu_debugfs_setup(void); static inline void iommu_debugfs_setup(void) {} #endif +#ifdef CONFIG_IOMMU_SVA +extern int iommu_sva_init_device(struct device *dev, unsigned long features, + unsigned int min_pasid, + unsigned int max_pasid); +extern void iommu_sva_shutdown_device(struct device *dev); +#else /* CONFIG_IOMMU_SVA */ +static inline int iommu_sva_init_device(struct device *dev, + unsigned long features, + unsigned int min_pasid, + unsigned int max_pasid) +{ + return -ENODEV; +} + +static inline void iommu_sva_shutdown_device(struct device *dev) +{ +} +#endif /* CONFIG_IOMMU_SVA */ + #endif /* __LINUX_IOMMU_H */ From patchwork Thu Sep 20 17:00:38 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10608307 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 11A2515E8 for ; Thu, 20 Sep 2018 17:24:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id ED8B42E283 for ; Thu, 20 Sep 2018 17:24:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E07082E29A; Thu, 20 Sep 2018 17:24:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3D69A2E283 for ; Thu, 20 Sep 2018 17:24:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733291AbeITXJZ (ORCPT ); Thu, 20 Sep 2018 19:09:25 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:50094 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732761AbeITXJY (ORCPT ); Thu, 20 Sep 2018 19:09:24 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5A03A7A9; Thu, 20 Sep 2018 10:24:54 -0700 (PDT) Received: from ostrya.Emea.Arm.com (ostrya.emea.arm.com [10.4.12.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 961C63F557; Thu, 20 Sep 2018 10:24:50 -0700 (PDT) From: Jean-Philippe Brucker To: iommu@lists.linux-foundation.org Cc: joro@8bytes.org, linux-pci@vger.kernel.org, jcrouse@codeaurora.org, alex.williamson@redhat.com, Jonathan.Cameron@huawei.com, jacob.jun.pan@linux.intel.com, christian.koenig@amd.com, eric.auger@redhat.com, 
kevin.tian@intel.com, yi.l.liu@intel.com, andrew.murray@arm.com, will.deacon@arm.com, robin.murphy@arm.com, ashok.raj@intel.com, baolu.lu@linux.intel.com, xuzaibo@huawei.com, liguozhu@hisilicon.com, okaya@codeaurora.org, bharatku@xilinx.com, ilias.apalodimas@linaro.org, shunyong.yang@hxt-semitech.com Subject: [PATCH v3 02/10] iommu/sva: Bind process address spaces to devices Date: Thu, 20 Sep 2018 18:00:38 +0100 Message-Id: <20180920170046.20154-3-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180920170046.20154-1-jean-philippe.brucker@arm.com> References: <20180920170046.20154-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add bind() and unbind() operations to the IOMMU API. Bind() returns a PASID that drivers can program in hardware, to let their devices access an mm. This patch only adds skeletons for the device driver API, most of the implementation is still missing. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/iommu-sva.c | 34 +++++++++++++++ drivers/iommu/iommu.c | 90 +++++++++++++++++++++++++++++++++++++++ include/linux/iommu.h | 37 ++++++++++++++++ 3 files changed, 161 insertions(+) diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index 85ef98efede8..d60d4f0bb89e 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -8,6 +8,38 @@ #include #include +int __iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid, + unsigned long flags, void *drvdata) +{ + return -ENOSYS; /* TODO */ +} +EXPORT_SYMBOL_GPL(__iommu_sva_bind_device); + +int __iommu_sva_unbind_device(struct device *dev, int pasid) +{ + return -ENOSYS; /* TODO */ +} +EXPORT_SYMBOL_GPL(__iommu_sva_unbind_device); + +static void __iommu_sva_unbind_device_all(struct device *dev) +{ + /* TODO */ +} + +/** + * iommu_sva_unbind_device_all() - Detach all address spaces from this device + * @dev: the device + * + * When detaching @dev from a domain, IOMMU drivers should use this helper. + */ +void iommu_sva_unbind_device_all(struct device *dev) +{ + mutex_lock(&dev->iommu_param->sva_lock); + __iommu_sva_unbind_device_all(dev); + mutex_unlock(&dev->iommu_param->sva_lock); +} +EXPORT_SYMBOL_GPL(iommu_sva_unbind_device_all); + /** * iommu_sva_init_device() - Initialize Shared Virtual Addressing for a device * @dev: the device @@ -96,6 +128,8 @@ void iommu_sva_shutdown_device(struct device *dev) if (!param) goto out_unlock; + __iommu_sva_unbind_device_all(dev); + if (domain->ops->sva_shutdown_device) domain->ops->sva_shutdown_device(dev); diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index fa0561ed006f..aba3bf15d46c 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2325,3 +2325,93 @@ int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids) return 0; } EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids); + +/** + * iommu_sva_bind_device() - Bind a process address space to a device + * @dev: the device + * @mm: the mm to bind, caller must hold a reference to it + * @pasid: valid address where the PASID will be stored + * @flags: bond properties + * @drvdata: private data passed to the mm exit handler + * + * Create a bond between device and task, allowing the device to access the mm + * using the returned PASID. If unbind() isn't called first, a subsequent bind() + * for the same device and mm fails with -EEXIST. 
+ * + * iommu_sva_init_device() must be called first, to initialize the required SVA + * features. @flags must be a subset of these features. + * + * The caller must pin down using get_user_pages*() all mappings shared with the + * device. mlock() isn't sufficient, as it doesn't prevent minor page faults + * (e.g. copy-on-write). + * + * On success, 0 is returned and @pasid contains a valid ID. Otherwise, an error + * is returned. + */ +int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid, + unsigned long flags, void *drvdata) +{ + int ret = -EINVAL; + struct iommu_group *group; + + if (!pasid) + return -EINVAL; + + group = iommu_group_get(dev); + if (!group) + return -ENODEV; + + /* Ensure device count and domain don't change while we're binding */ + mutex_lock(&group->mutex); + + /* + * To keep things simple, SVA currently doesn't support IOMMU groups + * with more than one device. Existing SVA-capable systems are not + * affected by the problems that required IOMMU groups (lack of ACS + * isolation, device ID aliasing and other hardware issues). + */ + if (iommu_group_device_count(group) != 1) + goto out_unlock; + + ret = __iommu_sva_bind_device(dev, mm, pasid, flags, drvdata); + +out_unlock: + mutex_unlock(&group->mutex); + iommu_group_put(group); + + return ret; +} +EXPORT_SYMBOL_GPL(iommu_sva_bind_device); + +/** + * iommu_sva_unbind_device() - Remove a bond created with iommu_sva_bind_device + * @dev: the device + * @pasid: the pasid returned by bind() + * + * Remove bond between device and address space identified by @pasid. Users + * should not call unbind() if the corresponding mm exited (as the PASID might + * have been reallocated for another process). + * + * The device must not be issuing any more transaction for this PASID. All + * outstanding page requests for this PASID must have been flushed to the IOMMU. 
+ * + * Returns 0 on success, or an error value + */ +int iommu_sva_unbind_device(struct device *dev, int pasid) +{ + int ret = -EINVAL; + struct iommu_group *group; + + group = iommu_group_get(dev); + if (!group) + return -ENODEV; + + mutex_lock(&group->mutex); + ret = __iommu_sva_unbind_device(dev, pasid); + mutex_unlock(&group->mutex); + + iommu_group_put(group); + + return ret; +} +EXPORT_SYMBOL_GPL(iommu_sva_unbind_device); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 4c27cb347770..9c49877e37a5 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -586,6 +586,10 @@ void iommu_fwspec_free(struct device *dev); int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids); const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode); +extern int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, + int *pasid, unsigned long flags, void *drvdata); +extern int iommu_sva_unbind_device(struct device *dev, int pasid); + #else /* CONFIG_IOMMU_API */ struct iommu_ops {}; @@ -910,6 +914,18 @@ static inline int iommu_sva_invalidate(struct iommu_domain *domain, return -ENODEV; } +static inline int iommu_sva_bind_device(struct device *dev, + struct mm_struct *mm, int *pasid, + unsigned long flags, void *drvdata) +{ + return -ENODEV; +} + +static inline int iommu_sva_unbind_device(struct device *dev, int pasid) +{ + return -ENODEV; +} + #endif /* CONFIG_IOMMU_API */ #ifdef CONFIG_IOMMU_DEBUGFS @@ -924,6 +940,11 @@ extern int iommu_sva_init_device(struct device *dev, unsigned long features, unsigned int min_pasid, unsigned int max_pasid); extern void iommu_sva_shutdown_device(struct device *dev); +extern int __iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, + int *pasid, unsigned long flags, + void *drvdata); +extern int __iommu_sva_unbind_device(struct device *dev, int pasid); +extern void iommu_sva_unbind_device_all(struct device *dev); #else /* CONFIG_IOMMU_SVA */ static inline int iommu_sva_init_device(struct device *dev, unsigned long features, @@ -936,6 +957,22 @@ static inline int iommu_sva_init_device(struct device *dev, static inline void iommu_sva_shutdown_device(struct device *dev) { } + +static inline int __iommu_sva_bind_device(struct device *dev, + struct mm_struct *mm, int *pasid, + unsigned long flags, void *drvdata) +{ + return -ENODEV; +} + +static inline int __iommu_sva_unbind_device(struct device *dev, int pasid) +{ + return -ENODEV; +} + +static inline void iommu_sva_unbind_device_all(struct device *dev) +{ +} #endif /* CONFIG_IOMMU_SVA */ #endif /* __LINUX_IOMMU_H */ From patchwork Thu Sep 20 17:00:39 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10608311 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 45266913 for ; Thu, 20 Sep 2018 17:25:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 251912E283 for ; Thu, 20 Sep 2018 17:25:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 18FE52E29B; Thu, 20 Sep 2018 17:25:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, 
RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E835E2E283 for ; Thu, 20 Sep 2018 17:25:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733144AbeITXJ3 (ORCPT ); Thu, 20 Sep 2018 19:09:29 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:50110 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732761AbeITXJ3 (ORCPT ); Thu, 20 Sep 2018 19:09:29 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8094F15AD; Thu, 20 Sep 2018 10:24:58 -0700 (PDT) Received: from ostrya.Emea.Arm.com (ostrya.emea.arm.com [10.4.12.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 9B6BA3F73B; Thu, 20 Sep 2018 10:24:54 -0700 (PDT) From: Jean-Philippe Brucker To: iommu@lists.linux-foundation.org Cc: joro@8bytes.org, linux-pci@vger.kernel.org, jcrouse@codeaurora.org, alex.williamson@redhat.com, Jonathan.Cameron@huawei.com, jacob.jun.pan@linux.intel.com, christian.koenig@amd.com, eric.auger@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, andrew.murray@arm.com, will.deacon@arm.com, robin.murphy@arm.com, ashok.raj@intel.com, baolu.lu@linux.intel.com, xuzaibo@huawei.com, liguozhu@hisilicon.com, okaya@codeaurora.org, bharatku@xilinx.com, ilias.apalodimas@linaro.org, shunyong.yang@hxt-semitech.com Subject: [PATCH v3 03/10] iommu/sva: Manage process address spaces Date: Thu, 20 Sep 2018 18:00:39 +0100 Message-Id: <20180920170046.20154-4-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180920170046.20154-1-jean-philippe.brucker@arm.com> References: <20180920170046.20154-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Allocate IOMMU mm structures and bind them to devices. Four operations are added to IOMMU drivers: * mm_alloc(): to create an io_mm structure and perform architecture- specific operations required to grab the process (for instance on ARM, pin down the CPU ASID so that the process doesn't get assigned a new ASID on rollover). There is a single valid io_mm structure per Linux mm. Future extensions may also use io_mm for kernel-managed address spaces, populated with map()/unmap() calls instead of bound to process address spaces. This patch focuses on "shared" io_mm. * mm_attach(): attach an mm to a device. The IOMMU driver checks that the device is capable of sharing an address space, and writes the PASID table entry to install the pgd. Some IOMMU drivers will have a single PASID table per domain, for convenience. Other can implement it differently but to help these drivers, mm_attach and mm_detach take 'attach_domain' and 'detach_domain' parameters, that tell whether they need to set and clear the PASID entry or only send the required TLB invalidations. * mm_detach(): detach an mm from a device. The IOMMU driver removes the PASID table entry and invalidates the IOTLBs. * mm_free(): free a structure allocated by mm_alloc(), and let arch release the process. mm_attach and mm_detach operations are serialized with a spinlock. When trying to optimize this code, we should at least prevent concurrent attach()/detach() on the same domain (so multi-level PASID table code can allocate tables lazily). 
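As an illustration only, an IOMMU driver might back these four operations with something like the sketch below; struct my_io_mm and the my_*() helpers are hypothetical placeholders for driver-specific ASID, PASID table and TLB management:

#include <linux/iommu.h>
#include <linux/slab.h>

/* Driver-private container embedding the generic io_mm */
struct my_io_mm {
	struct io_mm	io_mm;
	u16		asid;		/* e.g. a pinned CPU ASID on Arm */
};

static struct io_mm *my_mm_alloc(struct iommu_domain *domain,
				 struct mm_struct *mm, unsigned long flags)
{
	struct my_io_mm *m = kzalloc(sizeof(*m), GFP_KERNEL);

	if (!m)
		return NULL;		/* the core turns NULL into -ENOMEM */
	m->asid = my_alloc_asid(mm);	/* grab the process; may sleep */
	return &m->io_mm;
}

static void my_mm_free(struct io_mm *io_mm)
{
	struct my_io_mm *m = container_of(io_mm, struct my_io_mm, io_mm);

	/* Must not sleep: called from an SRCU callback later in the series */
	my_free_asid(m->asid);
	kfree(m);
}

static int my_mm_attach(struct iommu_domain *domain, struct device *dev,
			struct io_mm *io_mm, bool attach_domain)
{
	/* Only the first bind in this domain installs the PASID table entry */
	if (attach_domain)
		return my_write_pasid_entry(domain, io_mm->pasid, io_mm->mm);
	return 0;
}

static void my_mm_detach(struct iommu_domain *domain, struct device *dev,
			 struct io_mm *io_mm, bool detach_domain)
{
	if (detach_domain)
		my_clear_pasid_entry(domain, io_mm->pasid);
	my_invalidate_tlbs(domain, dev, io_mm->pasid);	/* must not sleep */
}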
mm_alloc() can sleep, but mm_free must not (because we'll have to call it from call_srcu later on). At the moment we use an IDR for allocating PASIDs and retrieving contexts. We also use a single spinlock. These can be refined and optimized later (a custom allocator will be needed for top-down PASID allocation). Keeping track of address spaces requires the use of MMU notifiers. Handling process exit with regard to unbind() is tricky, so it is left for another patch and we explicitly fail mm_alloc() for the moment. Signed-off-by: Jean-Philippe Brucker --- v2->v3: use sva_lock, comment updates --- drivers/iommu/iommu-sva.c | 397 +++++++++++++++++++++++++++++++++++++- drivers/iommu/iommu.c | 1 + include/linux/iommu.h | 29 +++ 3 files changed, 424 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index d60d4f0bb89e..a486bc947335 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -5,25 +5,415 @@ * Copyright (C) 2018 ARM Ltd. */ +#include #include +#include #include +#include + +/** + * DOC: io_mm model + * + * The io_mm keeps track of process address spaces shared between CPU and IOMMU. + * The following example illustrates the relation between structures + * iommu_domain, io_mm and iommu_bond. An iommu_bond is a link between io_mm and + * device. A device can have multiple io_mm and an io_mm may be bound to + * multiple devices. + * ___________________________ + * | IOMMU domain A | + * | ________________ | + * | | IOMMU group | +------- io_pgtables + * | | | | + * | | dev 00:00.0 ----+------- bond --- io_mm X + * | |________________| \ | + * | '----- bond ---. + * |___________________________| \ + * ___________________________ \ + * | IOMMU domain B | io_mm Y + * | ________________ | / / + * | | IOMMU group | | / / + * | | | | / / + * | | dev 00:01.0 ------------ bond -' / + * | | dev 00:01.1 ------------ bond --' + * | |________________| | + * | +------- io_pgtables + * |___________________________| + * + * In this example, device 00:00.0 is in domain A, devices 00:01.* are in domain + * B. All devices within the same domain access the same address spaces. Device + * 00:00.0 accesses address spaces X and Y, each corresponding to an mm_struct. + * Devices 00:01.* only access address space Y. In addition each + * IOMMU_DOMAIN_DMA domain has a private address space, io_pgtable, that is + * managed with iommu_map()/iommu_unmap(), and isn't shared with the CPU MMU. + * + * To obtain the above configuration, users would for instance issue the + * following calls: + * + * iommu_sva_bind_device(dev 00:00.0, mm X, ...) -> PASID 1 + * iommu_sva_bind_device(dev 00:00.0, mm Y, ...) -> PASID 2 + * iommu_sva_bind_device(dev 00:01.0, mm Y, ...) -> PASID 2 + * iommu_sva_bind_device(dev 00:01.1, mm Y, ...) -> PASID 2 + * + * A single Process Address Space ID (PASID) is allocated for each mm. In the + * example, devices use PASID 1 to read/write into address space X and PASID 2 + * to read/write into address space Y. + * + * Hardware tables describing this configuration in the IOMMU would typically + * look like this: + * + * PASID tables + * of domain A + * .->+--------+ + * / 0 | |-------> io_pgtable + * / +--------+ + * Device tables / 1 | |-------> pgd X + * +--------+ / +--------+ + * 00:00.0 | A |-' 2 | |--. + * +--------+ +--------+ \ + * : : 3 | | \ + * +--------+ +--------+ --> pgd Y + * 00:01.0 | B |--. 
/ + * +--------+ \ | + * 00:01.1 | B |----+ PASID tables | + * +--------+ \ of domain B | + * '->+--------+ | + * 0 | |-- | --> io_pgtable + * +--------+ | + * 1 | | | + * +--------+ | + * 2 | |---' + * +--------+ + * 3 | | + * +--------+ + * + * With this model, a single call binds all devices in a given domain to an + * address space. Other devices in the domain will get the same bond implicitly. + * However, users must issue one bind() for each device, because IOMMUs may + * implement SVA differently. Furthermore, mandating one bind() per device + * allows the driver to perform sanity-checks on device capabilities. + * + * In some IOMMUs, one entry (typically the first one) of the PASID table can be + * used to hold non-PASID translations. In this case PASID #0 is reserved and + * the first entry points to the io_pgtable pointer. In other IOMMUs the + * io_pgtable pointer is held in the device table and PASID #0 is available to + * the allocator. + */ + +struct iommu_bond { + struct io_mm *io_mm; + struct device *dev; + struct iommu_domain *domain; + + struct list_head mm_head; + struct list_head dev_head; + struct list_head domain_head; + + void *drvdata; +}; + +/* + * Because we're using an IDR, PASIDs are limited to 31 bits (the sign bit is + * used for returning errors). In practice implementations will use at most 20 + * bits, which is the PCI limit. + */ +static DEFINE_IDR(iommu_pasid_idr); + +/* + * For the moment this is an all-purpose lock. It serializes + * access/modifications to bonds, access/modifications to the PASID IDR, and + * changes to io_mm refcount as well. + */ +static DEFINE_SPINLOCK(iommu_sva_lock); + +static struct io_mm * +io_mm_alloc(struct iommu_domain *domain, struct device *dev, + struct mm_struct *mm, unsigned long flags) +{ + int ret; + int pasid; + struct io_mm *io_mm; + struct iommu_sva_param *param = dev->iommu_param->sva_param; + + if (!domain->ops->mm_alloc || !domain->ops->mm_free) + return ERR_PTR(-ENODEV); + + io_mm = domain->ops->mm_alloc(domain, mm, flags); + if (IS_ERR(io_mm)) + return io_mm; + if (!io_mm) + return ERR_PTR(-ENOMEM); + + /* + * The mm must not be freed until after the driver frees the io_mm + * (which may involve unpinning the CPU ASID for instance, requiring a + * valid mm struct.) + */ + mmgrab(mm); + + io_mm->flags = flags; + io_mm->mm = mm; + io_mm->release = domain->ops->mm_free; + INIT_LIST_HEAD(&io_mm->devices); + /* Leave kref as zero until the io_mm is fully initialized */ + + idr_preload(GFP_KERNEL); + spin_lock(&iommu_sva_lock); + pasid = idr_alloc(&iommu_pasid_idr, io_mm, param->min_pasid, + param->max_pasid + 1, GFP_ATOMIC); + io_mm->pasid = pasid; + spin_unlock(&iommu_sva_lock); + idr_preload_end(); + + if (pasid < 0) { + ret = pasid; + goto err_free_mm; + } + + /* TODO: keep track of mm. For the moment, abort. */ + ret = -ENOSYS; + spin_lock(&iommu_sva_lock); + idr_remove(&iommu_pasid_idr, io_mm->pasid); + spin_unlock(&iommu_sva_lock); + +err_free_mm: + io_mm->release(io_mm); + mmdrop(mm); + + return ERR_PTR(ret); +} + +static void io_mm_free(struct io_mm *io_mm) +{ + struct mm_struct *mm = io_mm->mm; + + io_mm->release(io_mm); + mmdrop(mm); +} + +static void io_mm_release(struct kref *kref) +{ + struct io_mm *io_mm; + + io_mm = container_of(kref, struct io_mm, kref); + WARN_ON(!list_empty(&io_mm->devices)); + + /* The PASID can now be reallocated for another mm... */ + idr_remove(&iommu_pasid_idr, io_mm->pasid); + /* ... 
but this mm is freed after a grace period (TODO) */ + io_mm_free(io_mm); +} + +/* + * Returns non-zero if a reference to the io_mm was successfully taken. + * Returns zero if the io_mm is being freed and should not be used. + */ +static int io_mm_get_locked(struct io_mm *io_mm) +{ + if (io_mm) + return kref_get_unless_zero(&io_mm->kref); + + return 0; +} + +static void io_mm_put_locked(struct io_mm *io_mm) +{ + kref_put(&io_mm->kref, io_mm_release); +} + +static void io_mm_put(struct io_mm *io_mm) +{ + spin_lock(&iommu_sva_lock); + io_mm_put_locked(io_mm); + spin_unlock(&iommu_sva_lock); +} + +static int io_mm_attach(struct iommu_domain *domain, struct device *dev, + struct io_mm *io_mm, void *drvdata) +{ + int ret; + bool attach_domain = true; + int pasid = io_mm->pasid; + struct iommu_bond *bond, *tmp; + struct iommu_sva_param *param = dev->iommu_param->sva_param; + + if (!domain->ops->mm_attach || !domain->ops->mm_detach) + return -ENODEV; + + if (pasid > param->max_pasid || pasid < param->min_pasid) + return -ERANGE; + + bond = kzalloc(sizeof(*bond), GFP_KERNEL); + if (!bond) + return -ENOMEM; + + bond->domain = domain; + bond->io_mm = io_mm; + bond->dev = dev; + bond->drvdata = drvdata; + + spin_lock(&iommu_sva_lock); + /* + * Check if this io_mm is already bound to the domain. In which case the + * IOMMU driver doesn't have to install the PASID table entry. + */ + list_for_each_entry(tmp, &domain->mm_list, domain_head) { + if (tmp->io_mm == io_mm) { + attach_domain = false; + break; + } + } + + ret = domain->ops->mm_attach(domain, dev, io_mm, attach_domain); + if (ret) { + kfree(bond); + goto out_unlock; + } + + list_add(&bond->mm_head, &io_mm->devices); + list_add(&bond->domain_head, &domain->mm_list); + list_add(&bond->dev_head, ¶m->mm_list); + +out_unlock: + spin_unlock(&iommu_sva_lock); + return ret; +} + +static void io_mm_detach_locked(struct iommu_bond *bond) +{ + struct iommu_bond *tmp; + bool detach_domain = true; + struct iommu_domain *domain = bond->domain; + + list_for_each_entry(tmp, &domain->mm_list, domain_head) { + if (tmp->io_mm == bond->io_mm && tmp->dev != bond->dev) { + detach_domain = false; + break; + } + } + + list_del(&bond->mm_head); + list_del(&bond->domain_head); + list_del(&bond->dev_head); + + domain->ops->mm_detach(domain, bond->dev, bond->io_mm, detach_domain); + + io_mm_put_locked(bond->io_mm); + kfree(bond); +} int __iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid, unsigned long flags, void *drvdata) { - return -ENOSYS; /* TODO */ + int i; + int ret = 0; + struct iommu_bond *bond; + struct io_mm *io_mm = NULL; + struct iommu_domain *domain; + struct iommu_sva_param *param; + + domain = iommu_get_domain_for_dev(dev); + if (!domain) + return -EINVAL; + + mutex_lock(&dev->iommu_param->sva_lock); + param = dev->iommu_param->sva_param; + if (!param || (flags & ~param->features)) { + ret = -EINVAL; + goto out_unlock; + } + + /* If an io_mm already exists, use it */ + spin_lock(&iommu_sva_lock); + idr_for_each_entry(&iommu_pasid_idr, io_mm, i) { + if (io_mm->mm == mm && io_mm_get_locked(io_mm)) { + /* ... 
Unless it's already bound to this device */ + list_for_each_entry(bond, &io_mm->devices, mm_head) { + if (bond->dev == dev) { + ret = -EEXIST; + io_mm_put_locked(io_mm); + break; + } + } + break; + } + } + spin_unlock(&iommu_sva_lock); + if (ret) + goto out_unlock; + + /* Require identical features within an io_mm for now */ + if (io_mm && (flags != io_mm->flags)) { + io_mm_put(io_mm); + ret = -EDOM; + goto out_unlock; + } + + if (!io_mm) { + io_mm = io_mm_alloc(domain, dev, mm, flags); + if (IS_ERR(io_mm)) { + ret = PTR_ERR(io_mm); + goto out_unlock; + } + } + + ret = io_mm_attach(domain, dev, io_mm, drvdata); + if (ret) + io_mm_put(io_mm); + else + *pasid = io_mm->pasid; + +out_unlock: + mutex_unlock(&dev->iommu_param->sva_lock); + return ret; } EXPORT_SYMBOL_GPL(__iommu_sva_bind_device); int __iommu_sva_unbind_device(struct device *dev, int pasid) { - return -ENOSYS; /* TODO */ + int ret = -ESRCH; + struct iommu_domain *domain; + struct iommu_bond *bond = NULL; + struct iommu_sva_param *param; + + domain = iommu_get_domain_for_dev(dev); + if (!domain) + return -EINVAL; + + mutex_lock(&dev->iommu_param->sva_lock); + param = dev->iommu_param->sva_param; + if (!param) { + ret = -EINVAL; + goto out_unlock; + } + + spin_lock(&iommu_sva_lock); + list_for_each_entry(bond, ¶m->mm_list, dev_head) { + if (bond->io_mm->pasid == pasid) { + io_mm_detach_locked(bond); + ret = 0; + break; + } + } + spin_unlock(&iommu_sva_lock); + +out_unlock: + mutex_unlock(&dev->iommu_param->sva_lock); + return ret; } EXPORT_SYMBOL_GPL(__iommu_sva_unbind_device); static void __iommu_sva_unbind_device_all(struct device *dev) { - /* TODO */ + struct iommu_sva_param *param = dev->iommu_param->sva_param; + struct iommu_bond *bond, *next; + + if (!param) + return; + + spin_lock(&iommu_sva_lock); + list_for_each_entry_safe(bond, next, ¶m->mm_list, dev_head) + io_mm_detach_locked(bond); + spin_unlock(&iommu_sva_lock); } /** @@ -82,6 +472,7 @@ int iommu_sva_init_device(struct device *dev, unsigned long features, param->features = features; param->min_pasid = min_pasid; param->max_pasid = max_pasid; + INIT_LIST_HEAD(¶m->mm_list); mutex_lock(&dev->iommu_param->sva_lock); if (dev->iommu_param->sva_param) { diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index aba3bf15d46c..7113fe398b70 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1525,6 +1525,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus, domain->type = type; /* Assume all sizes by default; the driver may override this later */ domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap; + INIT_LIST_HEAD(&domain->mm_list); return domain; } diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 9c49877e37a5..6a3ced6a5aa1 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -99,6 +99,20 @@ struct iommu_domain { void *handler_token; struct iommu_domain_geometry geometry; void *iova_cookie; + + struct list_head mm_list; +}; + +struct io_mm { + int pasid; + /* IOMMU_SVA_FEAT_* */ + unsigned long flags; + struct list_head devices; + struct kref kref; + struct mm_struct *mm; + + /* Release callback for this mm */ + void (*release)(struct io_mm *io_mm); }; enum iommu_cap { @@ -201,6 +215,7 @@ struct iommu_sva_param { unsigned long features; unsigned int min_pasid; unsigned int max_pasid; + struct list_head mm_list; }; /** @@ -212,6 +227,12 @@ struct iommu_sva_param { * @detach_dev: detach device from an iommu domain * @sva_init_device: initialize Shared Virtual Addressing for a device * @sva_shutdown_device: 
shutdown Shared Virtual Addressing for a device + * @mm_alloc: allocate io_mm + * @mm_free: free io_mm + * @mm_attach: attach io_mm to a device. Install PASID entry if necessary. Must + * not sleep. + * @mm_detach: detach io_mm from a device. Remove PASID entry and + * flush associated TLB entries if necessary. Must not sleep. * @map: map a physically contiguous memory region to an iommu domain * @unmap: unmap a physically contiguous memory region from an iommu domain * @flush_tlb_all: Synchronously flush all hardware TLBs for this domain @@ -249,6 +270,14 @@ struct iommu_ops { void (*detach_dev)(struct iommu_domain *domain, struct device *dev); int (*sva_init_device)(struct device *dev, struct iommu_sva_param *param); void (*sva_shutdown_device)(struct device *dev); + struct io_mm *(*mm_alloc)(struct iommu_domain *domain, + struct mm_struct *mm, + unsigned long flags); + void (*mm_free)(struct io_mm *io_mm); + int (*mm_attach)(struct iommu_domain *domain, struct device *dev, + struct io_mm *io_mm, bool attach_domain); + void (*mm_detach)(struct iommu_domain *domain, struct device *dev, + struct io_mm *io_mm, bool detach_domain); int (*map)(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t size, int prot); size_t (*unmap)(struct iommu_domain *domain, unsigned long iova, From patchwork Thu Sep 20 17:00:40 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10608313 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CA6B015E8 for ; Thu, 20 Sep 2018 17:25:04 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B1DEF2E29A for ; Thu, 20 Sep 2018 17:25:04 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A53972E2A7; Thu, 20 Sep 2018 17:25:04 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 35A402E29A for ; Thu, 20 Sep 2018 17:25:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733270AbeITXJd (ORCPT ); Thu, 20 Sep 2018 19:09:33 -0400 Received: from foss.arm.com ([217.140.101.70]:50148 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733097AbeITXJd (ORCPT ); Thu, 20 Sep 2018 19:09:33 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8250E1596; Thu, 20 Sep 2018 10:25:02 -0700 (PDT) Received: from ostrya.Emea.Arm.com (ostrya.emea.arm.com [10.4.12.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id C17833F557; Thu, 20 Sep 2018 10:24:58 -0700 (PDT) From: Jean-Philippe Brucker To: iommu@lists.linux-foundation.org Cc: joro@8bytes.org, linux-pci@vger.kernel.org, jcrouse@codeaurora.org, alex.williamson@redhat.com, Jonathan.Cameron@huawei.com, jacob.jun.pan@linux.intel.com, christian.koenig@amd.com, eric.auger@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, andrew.murray@arm.com, will.deacon@arm.com, robin.murphy@arm.com, 
ashok.raj@intel.com, baolu.lu@linux.intel.com, xuzaibo@huawei.com, liguozhu@hisilicon.com, okaya@codeaurora.org, bharatku@xilinx.com, ilias.apalodimas@linaro.org, shunyong.yang@hxt-semitech.com Subject: [PATCH v3 04/10] iommu/sva: Add a mm_exit callback for device drivers Date: Thu, 20 Sep 2018 18:00:40 +0100 Message-Id: <20180920170046.20154-5-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180920170046.20154-1-jean-philippe.brucker@arm.com> References: <20180920170046.20154-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When an mm exits, devices that were bound to it must stop performing DMA on its PASID. Let device drivers register a callback to be notified on mm exit. Add the callback to the sva_param structure attached to struct device. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/iommu-sva.c | 10 +++++++++- include/linux/iommu.h | 8 ++++++-- 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index a486bc947335..08da479dad68 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -436,6 +436,7 @@ EXPORT_SYMBOL_GPL(iommu_sva_unbind_device_all); * @features: bitmask of features that need to be initialized * @min_pasid: min PASID value supported by the device * @max_pasid: max PASID value supported by the device + * @mm_exit: callback for process address space release * * Users of the bind()/unbind() API must call this function to initialize all * features required for SVA. @@ -447,13 +448,19 @@ EXPORT_SYMBOL_GPL(iommu_sva_unbind_device_all); * overrides it. Similarly, @min_pasid overrides the lower PASID limit supported * by the IOMMU. * + * @mm_exit is called when an address space bound to the device is about to be + * torn down by exit_mmap. After @mm_exit returns, the device must not issue any + * more transaction with the PASID given as argument. The handler gets an opaque + * pointer corresponding to the drvdata passed as argument to bind(). + * * The device should not be performing any DMA while this function is running, * otherwise the behavior is undefined. * * Return 0 if initialization succeeded, or an error. 
*/ int iommu_sva_init_device(struct device *dev, unsigned long features, - unsigned int min_pasid, unsigned int max_pasid) + unsigned int min_pasid, unsigned int max_pasid, + iommu_mm_exit_handler_t mm_exit) { int ret; struct iommu_sva_param *param; @@ -472,6 +479,7 @@ int iommu_sva_init_device(struct device *dev, unsigned long features, param->features = features; param->min_pasid = min_pasid; param->max_pasid = max_pasid; + param->mm_exit = mm_exit; INIT_LIST_HEAD(¶m->mm_list); mutex_lock(&dev->iommu_param->sva_lock); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 6a3ced6a5aa1..c95ff714ea66 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -60,6 +60,7 @@ struct iommu_fault_event; typedef int (*iommu_fault_handler_t)(struct iommu_domain *, struct device *, unsigned long, int, void *); typedef int (*iommu_dev_fault_handler_t)(struct iommu_fault_event *, void *); +typedef int (*iommu_mm_exit_handler_t)(struct device *dev, int pasid, void *); struct iommu_domain_geometry { dma_addr_t aperture_start; /* First address that can be mapped */ @@ -216,6 +217,7 @@ struct iommu_sva_param { unsigned int min_pasid; unsigned int max_pasid; struct list_head mm_list; + iommu_mm_exit_handler_t mm_exit; }; /** @@ -967,7 +969,8 @@ static inline void iommu_debugfs_setup(void) {} #ifdef CONFIG_IOMMU_SVA extern int iommu_sva_init_device(struct device *dev, unsigned long features, unsigned int min_pasid, - unsigned int max_pasid); + unsigned int max_pasid, + iommu_mm_exit_handler_t mm_exit); extern void iommu_sva_shutdown_device(struct device *dev); extern int __iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid, unsigned long flags, @@ -978,7 +981,8 @@ extern void iommu_sva_unbind_device_all(struct device *dev); static inline int iommu_sva_init_device(struct device *dev, unsigned long features, unsigned int min_pasid, - unsigned int max_pasid) + unsigned int max_pasid, + iommu_mm_exit_handler_t mm_exit) { return -ENODEV; } From patchwork Thu Sep 20 17:00:41 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10608321 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1D3C7913 for ; Thu, 20 Sep 2018 17:25:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 008422E283 for ; Thu, 20 Sep 2018 17:25:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E841D2E298; Thu, 20 Sep 2018 17:25:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BCA922E29A for ; Thu, 20 Sep 2018 17:25:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733097AbeITXJh (ORCPT ); Thu, 20 Sep 2018 19:09:37 -0400 Received: from foss.arm.com ([217.140.101.70]:50192 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730955AbeITXJh (ORCPT ); Thu, 20 Sep 2018 19:09:37 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown 
[10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 772A21596; Thu, 20 Sep 2018 10:25:06 -0700 (PDT) Received: from ostrya.Emea.Arm.com (ostrya.emea.arm.com [10.4.12.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id C42643F557; Thu, 20 Sep 2018 10:25:02 -0700 (PDT) From: Jean-Philippe Brucker To: iommu@lists.linux-foundation.org Cc: joro@8bytes.org, linux-pci@vger.kernel.org, jcrouse@codeaurora.org, alex.williamson@redhat.com, Jonathan.Cameron@huawei.com, jacob.jun.pan@linux.intel.com, christian.koenig@amd.com, eric.auger@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, andrew.murray@arm.com, will.deacon@arm.com, robin.murphy@arm.com, ashok.raj@intel.com, baolu.lu@linux.intel.com, xuzaibo@huawei.com, liguozhu@hisilicon.com, okaya@codeaurora.org, bharatku@xilinx.com, ilias.apalodimas@linaro.org, shunyong.yang@hxt-semitech.com Subject: [PATCH v3 05/10] iommu/sva: Track mm changes with an MMU notifier Date: Thu, 20 Sep 2018 18:00:41 +0100 Message-Id: <20180920170046.20154-6-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180920170046.20154-1-jean-philippe.brucker@arm.com> References: <20180920170046.20154-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When creating an io_mm structure, register an MMU notifier that informs us when the virtual address space changes and disappears. Add a new operation to the IOMMU driver, mm_invalidate, called when a range of addresses is unmapped to let the IOMMU driver send ATC invalidations. mm_invalidate cannot sleep. Adding the notifier complicates io_mm release. In one case device drivers free the io_mm explicitly by calling unbind (or detaching the device from its domain). In the other case the process could crash before unbind, in which case the release notifier has to do all the work. Signed-off-by: Jean-Philippe Brucker --- v2->v3: Add MMU_INVALIDATE_DOES_NOT_BLOCK flag to MMU notifier In v2 Christian pointed out that letting mm_exit() linger for too long (some devices could need minutes to flush a PASID context) might force the OOM killer to kill additional tasks, for example if the victim has mlocked all its memory, which the reaper thread cannot clean up. If this turns out to be problematic to users, we might need to add some complexity in IOMMU drivers in order to disable PASIDs and return to exit_mmap() while DMA is still running. While invasive on the IOMMU side, such change might not require modification of device drivers or the API, since iommu_notifier_release() could simply schedule a call to their mm_exit() instead of calling it synchronously. So we can tune this behavior in a later series. Note that some steps cannot be skipped: the ATC invalidation, which may take up to a minute according to the PCI spec, must be done from the MMU notifier context. The PCI stop PASID mechanism is an implicit ATC invalidation, but if we postpone it then we'll have to perform an explicit one. 
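To make the mm_exit() constraint above concrete, a device driver's handler (registered through iommu_sva_init_device() in the previous patch) might look like the sketch below; struct foo_device, foo_stop_queue() and foo_wait_for_idle() are hypothetical, and how long the drain takes is entirely device-specific:

#include <linux/iommu.h>

static int foo_mm_exit(struct device *dev, int pasid, void *drvdata)
{
	struct foo_device *foo = drvdata;

	/*
	 * May sleep, but must avoid locks that are also held while dropping
	 * mm references, and must not return before the device has stopped
	 * issuing DMA tagged with this PASID.
	 */
	foo_stop_queue(foo, pasid);	/* stop accepting new work */
	foo_wait_for_idle(foo, pasid);	/* drain outstanding transactions */
	return 0;
}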
--- drivers/iommu/Kconfig | 1 + drivers/iommu/iommu-sva.c | 246 +++++++++++++++++++++++++++++++++++--- include/linux/iommu.h | 10 ++ 3 files changed, 240 insertions(+), 17 deletions(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 884580401919..88d6c68284f3 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -98,6 +98,7 @@ config IOMMU_DMA config IOMMU_SVA bool select IOMMU_API + select MMU_NOTIFIER config FSL_PAMU bool "Freescale IOMMU support" diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index 08da479dad68..5ff8967cb213 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -7,6 +7,7 @@ #include #include +#include #include #include #include @@ -107,6 +108,9 @@ struct iommu_bond { struct list_head mm_head; struct list_head dev_head; struct list_head domain_head; + refcount_t refs; + struct wait_queue_head mm_exit_wq; + bool mm_exit_active; void *drvdata; }; @@ -125,6 +129,8 @@ static DEFINE_IDR(iommu_pasid_idr); */ static DEFINE_SPINLOCK(iommu_sva_lock); +static struct mmu_notifier_ops iommu_mmu_notifier; + static struct io_mm * io_mm_alloc(struct iommu_domain *domain, struct device *dev, struct mm_struct *mm, unsigned long flags) @@ -152,6 +158,7 @@ io_mm_alloc(struct iommu_domain *domain, struct device *dev, io_mm->flags = flags; io_mm->mm = mm; + io_mm->notifier.ops = &iommu_mmu_notifier; io_mm->release = domain->ops->mm_free; INIT_LIST_HEAD(&io_mm->devices); /* Leave kref as zero until the io_mm is fully initialized */ @@ -169,8 +176,29 @@ io_mm_alloc(struct iommu_domain *domain, struct device *dev, goto err_free_mm; } - /* TODO: keep track of mm. For the moment, abort. */ - ret = -ENOSYS; + ret = mmu_notifier_register(&io_mm->notifier, mm); + if (ret) + goto err_free_pasid; + + /* + * Now that the MMU notifier is valid, we can allow users to grab this + * io_mm by setting a valid refcount. Before that it was accessible in + * the IDR but invalid. + * + * The following barrier ensures that users, who obtain the io_mm with + * kref_get_unless_zero, don't read uninitialized fields in the + * structure. + */ + smp_wmb(); + kref_init(&io_mm->kref); + + return io_mm; + +err_free_pasid: + /* + * Even if the io_mm is accessible from the IDR at this point, kref is + * 0 so no user could get a reference to it. Free it manually. + */ spin_lock(&iommu_sva_lock); idr_remove(&iommu_pasid_idr, io_mm->pasid); spin_unlock(&iommu_sva_lock); @@ -182,9 +210,13 @@ io_mm_alloc(struct iommu_domain *domain, struct device *dev, return ERR_PTR(ret); } -static void io_mm_free(struct io_mm *io_mm) +static void io_mm_free(struct rcu_head *rcu) { - struct mm_struct *mm = io_mm->mm; + struct io_mm *io_mm; + struct mm_struct *mm; + + io_mm = container_of(rcu, struct io_mm, rcu); + mm = io_mm->mm; io_mm->release(io_mm); mmdrop(mm); @@ -197,10 +229,24 @@ static void io_mm_release(struct kref *kref) io_mm = container_of(kref, struct io_mm, kref); WARN_ON(!list_empty(&io_mm->devices)); - /* The PASID can now be reallocated for another mm... */ idr_remove(&iommu_pasid_idr, io_mm->pasid); - /* ... but this mm is freed after a grace period (TODO) */ - io_mm_free(io_mm); + + /* + * If we're being released from mm exit, the notifier callback ->release + * has already been called. Otherwise we don't need ->release, the io_mm + * isn't attached to anything anymore. Hence no_release. 
+ */ + mmu_notifier_unregister_no_release(&io_mm->notifier, io_mm->mm); + + /* + * We can't free the structure here, because if mm exits during + * unbind(), then ->release might be attempting to grab the io_mm + * concurrently. And in the other case, if ->release is calling + * io_mm_release, then __mmu_notifier_release expects to still have a + * valid mn when returning. So free the structure when it's safe, after + * the RCU grace period elapsed. + */ + mmu_notifier_call_srcu(&io_mm->rcu, io_mm_free); } /* @@ -209,8 +255,14 @@ static void io_mm_release(struct kref *kref) */ static int io_mm_get_locked(struct io_mm *io_mm) { - if (io_mm) - return kref_get_unless_zero(&io_mm->kref); + if (io_mm && kref_get_unless_zero(&io_mm->kref)) { + /* + * kref_get_unless_zero doesn't provide ordering for reads. This + * barrier pairs with the one in io_mm_alloc. + */ + smp_rmb(); + return 1; + } return 0; } @@ -236,7 +288,8 @@ static int io_mm_attach(struct iommu_domain *domain, struct device *dev, struct iommu_bond *bond, *tmp; struct iommu_sva_param *param = dev->iommu_param->sva_param; - if (!domain->ops->mm_attach || !domain->ops->mm_detach) + if (!domain->ops->mm_attach || !domain->ops->mm_detach || + !domain->ops->mm_invalidate) return -ENODEV; if (pasid > param->max_pasid || pasid < param->min_pasid) @@ -250,6 +303,8 @@ static int io_mm_attach(struct iommu_domain *domain, struct device *dev, bond->io_mm = io_mm; bond->dev = dev; bond->drvdata = drvdata; + refcount_set(&bond->refs, 1); + init_waitqueue_head(&bond->mm_exit_wq); spin_lock(&iommu_sva_lock); /* @@ -278,12 +333,37 @@ static int io_mm_attach(struct iommu_domain *domain, struct device *dev, return ret; } -static void io_mm_detach_locked(struct iommu_bond *bond) +static void io_mm_detach_locked(struct iommu_bond *bond, bool wait) { struct iommu_bond *tmp; bool detach_domain = true; struct iommu_domain *domain = bond->domain; + if (wait) { + bool do_detach = true; + /* + * If we're unbind() then we're deleting the bond no matter + * what. Tell the mm_exit thread that we're cleaning up, and + * wait until it finishes using the bond. + * + * refs is guaranteed to be one or more, otherwise it would + * already have been removed from the list. Check if someone is + * already waiting, in which case we wait but do not free. + */ + if (refcount_read(&bond->refs) > 1) + do_detach = false; + + refcount_inc(&bond->refs); + wait_event_lock_irq(bond->mm_exit_wq, !bond->mm_exit_active, + iommu_sva_lock); + if (!do_detach) + return; + + } else if (!refcount_dec_and_test(&bond->refs)) { + /* unbind() is waiting to free the bond */ + return; + } + list_for_each_entry(tmp, &domain->mm_list, domain_head) { if (tmp->io_mm == bond->io_mm && tmp->dev != bond->dev) { detach_domain = false; @@ -301,6 +381,130 @@ static void io_mm_detach_locked(struct iommu_bond *bond) kfree(bond); } +static int iommu_signal_mm_exit(struct iommu_bond *bond) +{ + struct device *dev = bond->dev; + struct io_mm *io_mm = bond->io_mm; + struct iommu_sva_param *param = dev->iommu_param->sva_param; + + /* + * We can't hold the device's sva_lock. 
If we did and the device driver + * used a global lock around io_mm, we would risk getting the following + * deadlock: + * + * exit_mm() | Shutdown SVA + * mutex_lock(sva_lock) | mutex_lock(glob lock) + * param->mm_exit() | sva_shutdown_device() + * mutex_lock(glob lock) | mutex_lock(sva_lock) + * + * Fortunately unbind() waits for us to finish, and sva_shutdown_device + * requires that any bond is removed, so we can safely access mm_exit + * and drvdata without taking the sva_lock. + */ + if (!param || !param->mm_exit) + return 0; + + return param->mm_exit(dev, io_mm->pasid, bond->drvdata); +} + +/* Called when the mm exits. Can race with unbind(). */ +static void iommu_notifier_release(struct mmu_notifier *mn, struct mm_struct *mm) +{ + struct iommu_bond *bond, *next; + struct io_mm *io_mm = container_of(mn, struct io_mm, notifier); + + /* + * If the mm is exiting then devices are still bound to the io_mm. + * A few things need to be done before it is safe to release: + * + * - As the mmu notifier doesn't hold any reference to the io_mm when + * calling ->release(), try to take a reference. + * - Tell the device driver to stop using this PASID. + * - Clear the PASID table and invalidate TLBs. + * - Drop all references to this io_mm by freeing the bonds. + */ + spin_lock(&iommu_sva_lock); + if (!io_mm_get_locked(io_mm)) { + /* Someone's already taking care of it. */ + spin_unlock(&iommu_sva_lock); + return; + } + + list_for_each_entry_safe(bond, next, &io_mm->devices, mm_head) { + /* + * Release the lock to let the handler sleep. We need to be + * careful about concurrent modifications to the list and to the + * bond. Tell unbind() not to free the bond until we're done. + */ + bond->mm_exit_active = true; + spin_unlock(&iommu_sva_lock); + + if (iommu_signal_mm_exit(bond)) + dev_WARN(bond->dev, "possible leak of PASID %u", + io_mm->pasid); + + spin_lock(&iommu_sva_lock); + next = list_next_entry(bond, mm_head); + + /* If someone is waiting, let them delete the bond now */ + bond->mm_exit_active = false; + wake_up_all(&bond->mm_exit_wq); + + /* Otherwise, do it ourselves */ + io_mm_detach_locked(bond, false); + } + spin_unlock(&iommu_sva_lock); + + /* + * We're now reasonably certain that no more fault is being handled for + * this io_mm, since we just flushed them all out of the fault queue. + * Release the last reference to free the io_mm. 
+ */ + io_mm_put(io_mm); +} + +static void iommu_notifier_invalidate_range(struct mmu_notifier *mn, + struct mm_struct *mm, + unsigned long start, + unsigned long end) +{ + struct iommu_bond *bond; + struct io_mm *io_mm = container_of(mn, struct io_mm, notifier); + + spin_lock(&iommu_sva_lock); + list_for_each_entry(bond, &io_mm->devices, mm_head) { + struct iommu_domain *domain = bond->domain; + + domain->ops->mm_invalidate(domain, bond->dev, io_mm, start, + end - start); + } + spin_unlock(&iommu_sva_lock); +} + +static int iommu_notifier_clear_flush_young(struct mmu_notifier *mn, + struct mm_struct *mm, + unsigned long start, + unsigned long end) +{ + iommu_notifier_invalidate_range(mn, mm, start, end); + return 0; +} + +static void iommu_notifier_change_pte(struct mmu_notifier *mn, + struct mm_struct *mm, + unsigned long address, pte_t pte) +{ + iommu_notifier_invalidate_range(mn, mm, address, address + PAGE_SIZE); +} + +static struct mmu_notifier_ops iommu_mmu_notifier = { + .flags = MMU_INVALIDATE_DOES_NOT_BLOCK, + .release = iommu_notifier_release, + .clear_flush_young = iommu_notifier_clear_flush_young, + .change_pte = iommu_notifier_change_pte, + .invalidate_range = iommu_notifier_invalidate_range, +}; + int __iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid, unsigned long flags, void *drvdata) { @@ -386,15 +590,16 @@ int __iommu_sva_unbind_device(struct device *dev, int pasid) goto out_unlock; } - spin_lock(&iommu_sva_lock); + /* spin_lock_irq matches the one in wait_event_lock_irq */ + spin_lock_irq(&iommu_sva_lock); list_for_each_entry(bond, ¶m->mm_list, dev_head) { if (bond->io_mm->pasid == pasid) { - io_mm_detach_locked(bond); + io_mm_detach_locked(bond, true); ret = 0; break; } } - spin_unlock(&iommu_sva_lock); + spin_unlock_irq(&iommu_sva_lock); out_unlock: mutex_unlock(&dev->iommu_param->sva_lock); @@ -410,10 +615,10 @@ static void __iommu_sva_unbind_device_all(struct device *dev) if (!param) return; - spin_lock(&iommu_sva_lock); + spin_lock_irq(&iommu_sva_lock); list_for_each_entry_safe(bond, next, ¶m->mm_list, dev_head) - io_mm_detach_locked(bond); - spin_unlock(&iommu_sva_lock); + io_mm_detach_locked(bond, true); + spin_unlock_irq(&iommu_sva_lock); } /** @@ -421,6 +626,7 @@ static void __iommu_sva_unbind_device_all(struct device *dev) * @dev: the device * * When detaching @dev from a domain, IOMMU drivers should use this helper. + * This function may sleep while waiting for bonds to be released. */ void iommu_sva_unbind_device_all(struct device *dev) { @@ -453,6 +659,12 @@ EXPORT_SYMBOL_GPL(iommu_sva_unbind_device_all); * more transaction with the PASID given as argument. The handler gets an opaque * pointer corresponding to the drvdata passed as argument to bind(). * + * The @mm_exit handler is allowed to sleep. Be careful about the locks taken in + * @mm_exit, because they might lead to deadlocks if they are also held when + * dropping references to the mm. Consider the following call chain: + * mutex_lock(A); mmput(mm) -> exit_mm() -> @mm_exit() -> mutex_lock(A) + * Using mmput_async() prevents this scenario. + * * The device should not be performing any DMA while this function is running, * otherwise the behavior is undefined. 
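 *
 * As a rough illustration (not part of this patch; the driver, context and
 * helper names below are made up), an @mm_exit handler matching
 * iommu_mm_exit_handler_t could look like:
 *
 *	static int mydrv_mm_exit(struct device *dev, int pasid, void *drvdata)
 *	{
 *		struct mydrv_ctx *ctx = drvdata;
 *
 *		mydrv_stop_pasid(ctx, pasid);
 *		return 0;
 *	}
 *
 * All the handler has to guarantee is that, once it returns, the device no
 * longer issues transactions tagged with this PASID; clearing the PASID table
 * entry, invalidating TLBs and flushing the fault queue are handled afterwards
 * by the IOMMU core and driver.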
* diff --git a/include/linux/iommu.h b/include/linux/iommu.h index c95ff714ea66..429f3dc37a35 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -24,6 +24,7 @@ #include #include #include +#include #include #include @@ -110,10 +111,15 @@ struct io_mm { unsigned long flags; struct list_head devices; struct kref kref; +#if defined(CONFIG_MMU_NOTIFIER) + struct mmu_notifier notifier; +#endif struct mm_struct *mm; /* Release callback for this mm */ void (*release)(struct io_mm *io_mm); + /* For postponed release */ + struct rcu_head rcu; }; enum iommu_cap { @@ -235,6 +241,7 @@ struct iommu_sva_param { * not sleep. * @mm_detach: detach io_mm from a device. Remove PASID entry and * flush associated TLB entries if necessary. Must not sleep. + * @mm_invalidate: Invalidate a range of mappings for an mm * @map: map a physically contiguous memory region to an iommu domain * @unmap: unmap a physically contiguous memory region from an iommu domain * @flush_tlb_all: Synchronously flush all hardware TLBs for this domain @@ -280,6 +287,9 @@ struct iommu_ops { struct io_mm *io_mm, bool attach_domain); void (*mm_detach)(struct iommu_domain *domain, struct device *dev, struct io_mm *io_mm, bool detach_domain); + void (*mm_invalidate)(struct iommu_domain *domain, struct device *dev, + struct io_mm *io_mm, unsigned long vaddr, + size_t size); int (*map)(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t size, int prot); size_t (*unmap)(struct iommu_domain *domain, unsigned long iova, From patchwork Thu Sep 20 17:00:42 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10608323 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 85EA9913 for ; Thu, 20 Sep 2018 17:25:12 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6D59E2E29A for ; Thu, 20 Sep 2018 17:25:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 616752E2A6; Thu, 20 Sep 2018 17:25:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 010D82E29A for ; Thu, 20 Sep 2018 17:25:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732761AbeITXJl (ORCPT ); Thu, 20 Sep 2018 19:09:41 -0400 Received: from foss.arm.com ([217.140.101.70]:50214 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726781AbeITXJl (ORCPT ); Thu, 20 Sep 2018 19:09:41 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5FE1B15AD; Thu, 20 Sep 2018 10:25:10 -0700 (PDT) Received: from ostrya.Emea.Arm.com (ostrya.emea.arm.com [10.4.12.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id B66D13F557; Thu, 20 Sep 2018 10:25:06 -0700 (PDT) From: Jean-Philippe Brucker To: iommu@lists.linux-foundation.org Cc: joro@8bytes.org, linux-pci@vger.kernel.org, jcrouse@codeaurora.org, alex.williamson@redhat.com, 
Jonathan.Cameron@huawei.com, jacob.jun.pan@linux.intel.com, christian.koenig@amd.com, eric.auger@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, andrew.murray@arm.com, will.deacon@arm.com, robin.murphy@arm.com, ashok.raj@intel.com, baolu.lu@linux.intel.com, xuzaibo@huawei.com, liguozhu@hisilicon.com, okaya@codeaurora.org, bharatku@xilinx.com, ilias.apalodimas@linaro.org, shunyong.yang@hxt-semitech.com Subject: [PATCH v3 06/10] iommu/sva: Search mm by PASID Date: Thu, 20 Sep 2018 18:00:42 +0100 Message-Id: <20180920170046.20154-7-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180920170046.20154-1-jean-philippe.brucker@arm.com> References: <20180920170046.20154-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The fault handler will need to find an mm given its PASID. This is the reason we have an IDR for storing address spaces, so hook it up. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/iommu-sva.c | 26 ++++++++++++++++++++++++++ include/linux/iommu.h | 7 +++++++ 2 files changed, 33 insertions(+) diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index 5ff8967cb213..ee86f00ee1b9 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -636,6 +636,32 @@ void iommu_sva_unbind_device_all(struct device *dev) } EXPORT_SYMBOL_GPL(iommu_sva_unbind_device_all); +/** + * iommu_sva_find() - Find mm associated to the given PASID + * @pasid: Process Address Space ID assigned to the mm + * + * Returns the mm corresponding to this PASID, or NULL if not found. A reference + * to the mm is taken, and must be released with mmput(). + */ +struct mm_struct *iommu_sva_find(int pasid) +{ + struct io_mm *io_mm; + struct mm_struct *mm = NULL; + + spin_lock(&iommu_sva_lock); + io_mm = idr_find(&iommu_pasid_idr, pasid); + if (io_mm && io_mm_get_locked(io_mm)) { + if (mmget_not_zero(io_mm->mm)) + mm = io_mm->mm; + + io_mm_put_locked(io_mm); + } + spin_unlock(&iommu_sva_lock); + + return mm; +} +EXPORT_SYMBOL_GPL(iommu_sva_find); + /** * iommu_sva_init_device() - Initialize Shared Virtual Addressing for a device * @dev: the device diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 429f3dc37a35..a457650b80de 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -987,6 +987,8 @@ extern int __iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata); extern int __iommu_sva_unbind_device(struct device *dev, int pasid); extern void iommu_sva_unbind_device_all(struct device *dev); +extern struct mm_struct *iommu_sva_find(int pasid); + #else /* CONFIG_IOMMU_SVA */ static inline int iommu_sva_init_device(struct device *dev, unsigned long features, @@ -1016,6 +1018,11 @@ static inline int __iommu_sva_unbind_device(struct device *dev, int pasid) static inline void iommu_sva_unbind_device_all(struct device *dev) { } + +static inline struct mm_struct *iommu_sva_find(int pasid) +{ + return NULL; +} #endif /* CONFIG_IOMMU_SVA */ #endif /* __LINUX_IOMMU_H */ From patchwork Thu Sep 20 17:00:43 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10608329 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C297915E8 
for ; Thu, 20 Sep 2018 17:25:17 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A9A592DEF4 for ; Thu, 20 Sep 2018 17:25:17 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9DA732E29A; Thu, 20 Sep 2018 17:25:17 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7FC2B2DEF4 for ; Thu, 20 Sep 2018 17:25:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727252AbeITXJq (ORCPT ); Thu, 20 Sep 2018 19:09:46 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:50260 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726781AbeITXJp (ORCPT ); Thu, 20 Sep 2018 19:09:45 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 462F71596; Thu, 20 Sep 2018 10:25:14 -0700 (PDT) Received: from ostrya.Emea.Arm.com (ostrya.emea.arm.com [10.4.12.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A08443F557; Thu, 20 Sep 2018 10:25:10 -0700 (PDT) From: Jean-Philippe Brucker To: iommu@lists.linux-foundation.org Cc: joro@8bytes.org, linux-pci@vger.kernel.org, jcrouse@codeaurora.org, alex.williamson@redhat.com, Jonathan.Cameron@huawei.com, jacob.jun.pan@linux.intel.com, christian.koenig@amd.com, eric.auger@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, andrew.murray@arm.com, will.deacon@arm.com, robin.murphy@arm.com, ashok.raj@intel.com, baolu.lu@linux.intel.com, xuzaibo@huawei.com, liguozhu@hisilicon.com, okaya@codeaurora.org, bharatku@xilinx.com, ilias.apalodimas@linaro.org, shunyong.yang@hxt-semitech.com Subject: [PATCH v3 07/10] iommu: Add a page fault handler Date: Thu, 20 Sep 2018 18:00:43 +0100 Message-Id: <20180920170046.20154-8-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180920170046.20154-1-jean-philippe.brucker@arm.com> References: <20180920170046.20154-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Some systems allow devices to handle I/O Page Faults in the core mm. For example systems implementing the PCI PRI extension or Arm SMMU stall model. Infrastructure for reporting these recoverable page faults was recently added to the IOMMU core for SVA virtualisation. Add a page fault handler for host SVA. IOMMU driver can now instantiate several fault workqueues and link them to IOPF-capable devices. Drivers can choose between a single global workqueue, one per IOMMU device, one per low-level fault queue, one per domain, etc. When it receives a fault event, supposedly in an IRQ handler, the IOMMU driver reports the fault using iommu_report_device_fault(), which calls the registered handler. The page fault handler then calls the mm fault handler, and reports either success or failure with iommu_page_response(). When the handler succeeded, the IOMMU retries the access. The iopf_param pointer could be embedded into iommu_fault_param. 
But putting iopf_param into the iommu_param structure allows us not to care about ordering between calls to iopf_queue_add_device() and iommu_register_device_fault_handler(). Signed-off-by: Jean-Philippe Brucker --- v2->v3: * queue_flush now removes pending partial faults * queue_flush now takes an optional PASID argument, allowing IOMMU drivers to selectively flush faults if possible * remove PAGE_RESP_HANDLED/PAGE_RESP_CONTINUE * rename iopf_context -> iopf_fault --- drivers/iommu/Kconfig | 4 + drivers/iommu/Makefile | 1 + drivers/iommu/io-pgfault.c | 382 +++++++++++++++++++++++++++++++++++++ include/linux/iommu.h | 56 +++++- 4 files changed, 442 insertions(+), 1 deletion(-) create mode 100644 drivers/iommu/io-pgfault.c diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 88d6c68284f3..27e9999ad980 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -100,6 +100,10 @@ config IOMMU_SVA select IOMMU_API select MMU_NOTIFIER +config IOMMU_PAGE_FAULT + bool + select IOMMU_API + config FSL_PAMU bool "Freescale IOMMU support" depends on PCI diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 7d6332be5f0e..1c4b0be5d44b 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -5,6 +5,7 @@ obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o obj-$(CONFIG_IOMMU_DEBUGFS) += iommu-debugfs.o obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o +obj-$(CONFIG_IOMMU_PAGE_FAULT) += io-pgfault.o obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c new file mode 100644 index 000000000000..29aa8c6ba459 --- /dev/null +++ b/drivers/iommu/io-pgfault.c @@ -0,0 +1,382 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Handle device page faults + * + * Copyright (C) 2018 ARM Ltd. + */ + +#include +#include +#include +#include + +/** + * struct iopf_queue - IO Page Fault queue + * @wq: the fault workqueue + * @flush: low-level flush callback + * @flush_arg: flush() argument + * @refs: references to this structure taken by producers + */ +struct iopf_queue { + struct workqueue_struct *wq; + iopf_queue_flush_t flush; + void *flush_arg; + refcount_t refs; +}; + +/** + * struct iopf_device_param - IO Page Fault data attached to a device + * @queue: IOPF queue + * @partial: faults that are part of a Page Request Group for which the last + * request hasn't been submitted yet. 
+ */ +struct iopf_device_param { + struct iopf_queue *queue; + struct list_head partial; +}; + +struct iopf_fault { + struct iommu_fault_event evt; + struct list_head head; +}; + +struct iopf_group { + struct iopf_fault last_fault; + struct list_head faults; + struct work_struct work; + struct device *dev; +}; + +static int iopf_complete(struct device *dev, struct iommu_fault_event *evt, + enum page_response_code status) +{ + struct page_response_msg resp = { + .addr = evt->addr, + .pasid = evt->pasid, + .pasid_present = evt->pasid_valid, + .page_req_group_id = evt->page_req_group_id, + .private_data = evt->iommu_private, + .resp_code = status, + }; + + return iommu_page_response(dev, &resp); +} + +static enum page_response_code +iopf_handle_single(struct iopf_fault *fault) +{ + /* TODO */ + return -ENODEV; +} + +static void iopf_handle_group(struct work_struct *work) +{ + struct iopf_group *group; + struct iopf_fault *fault, *next; + enum page_response_code status = IOMMU_PAGE_RESP_SUCCESS; + + group = container_of(work, struct iopf_group, work); + + list_for_each_entry_safe(fault, next, &group->faults, head) { + struct iommu_fault_event *evt = &fault->evt; + /* + * For the moment, errors are sticky: don't handle subsequent + * faults in the group if there is an error. + */ + if (status == IOMMU_PAGE_RESP_SUCCESS) + status = iopf_handle_single(fault); + + if (!evt->last_req) + kfree(fault); + } + + iopf_complete(group->dev, &group->last_fault.evt, status); + kfree(group); +} + +/** + * iommu_queue_iopf - IO Page Fault handler + * @evt: fault event + * @cookie: struct device, passed to iommu_register_device_fault_handler. + * + * Add a fault to the device workqueue, to be handled by mm. + */ +int iommu_queue_iopf(struct iommu_fault_event *evt, void *cookie) +{ + struct iopf_group *group; + struct iopf_fault *fault, *next; + struct iopf_device_param *iopf_param; + + struct device *dev = cookie; + struct iommu_param *param = dev->iommu_param; + + if (WARN_ON(!mutex_is_locked(¶m->lock))) + return -EINVAL; + + if (evt->type != IOMMU_FAULT_PAGE_REQ) + /* Not a recoverable page fault */ + return 0; + + /* + * As long as we're holding param->lock, the queue can't be unlinked + * from the device and therefore cannot disappear. + */ + iopf_param = param->iopf_param; + if (!iopf_param) + return -ENODEV; + + if (!evt->last_req) { + fault = kzalloc(sizeof(*fault), GFP_KERNEL); + if (!fault) + return -ENOMEM; + + fault->evt = *evt; + + /* Non-last request of a group. Postpone until the last one */ + list_add(&fault->head, &iopf_param->partial); + + return 0; + } + + group = kzalloc(sizeof(*group), GFP_KERNEL); + if (!group) + return -ENOMEM; + + group->dev = dev; + group->last_fault.evt = *evt; + INIT_LIST_HEAD(&group->faults); + list_add(&group->last_fault.head, &group->faults); + INIT_WORK(&group->work, iopf_handle_group); + + /* See if we have partial faults for this group */ + list_for_each_entry_safe(fault, next, &iopf_param->partial, head) { + if (fault->evt.page_req_group_id == evt->page_req_group_id) + /* Insert *before* the last fault */ + list_move(&fault->head, &group->faults); + } + + queue_work(iopf_param->queue->wq, &group->work); + + /* Postpone the fault completion */ + return 0; +} +EXPORT_SYMBOL_GPL(iommu_queue_iopf); + +/** + * iopf_queue_flush_dev - Ensure that all queued faults have been processed + * @dev: the endpoint whose faults need to be flushed. 
+ * @pasid: the PASID affected by this flush + * + * Users must call this function when releasing a PASID, to ensure that all + * pending faults for this PASID have been handled, and won't hit the address + * space of the next process that uses this PASID. + * + * This function can also be called before shutting down the device, in which + * case @pasid should be IOMMU_PASID_INVALID. + * + * Return 0 on success. + */ +int iopf_queue_flush_dev(struct device *dev, int pasid) +{ + int ret = 0; + struct iopf_queue *queue; + struct iopf_fault *fault, *next; + struct iommu_param *param = dev->iommu_param; + + if (!param) + return -ENODEV; + + /* + * It is incredibly easy to find ourselves in a deadlock situation if + * we're not careful, because we're taking the opposite path as + * iommu_queue_iopf: + * + * iopf_queue_flush_dev() | PRI queue handler + * lock(mutex) | iommu_queue_iopf() + * queue->flush() | lock(mutex) + * wait PRI queue empty | + * + * So we can't hold the device param lock while flushing. We don't have + * to, because the queue or the device won't disappear until all flush + * are finished. + */ + mutex_lock(¶m->lock); + if (param->iopf_param) + queue = param->iopf_param->queue; + else + ret = -ENODEV; + mutex_unlock(¶m->lock); + if (ret) + return ret; + + /* + * When removing a PASID, the device driver tells the device to stop + * using it, and flush any pending fault to the IOMMU. In this flush + * callback, the IOMMU driver makes sure that there are no such faults + * left in the low-level queue. + */ + queue->flush(queue->flush_arg, dev, pasid); + + /* + * If at some point the low-level fault queue overflowed and the IOMMU + * device had to auto-respond to a 'last' page fault, other faults from + * the same Page Request Group may still be stuck in the partial list. + * We need to make sure that the next address space using the PASID + * doesn't receive them. + */ + mutex_lock(¶m->lock); + list_for_each_entry_safe(fault, next, ¶m->iopf_param->partial, head) { + if (fault->evt.pasid == pasid || pasid == IOMMU_PASID_INVALID) { + list_del(&fault->head); + kfree(fault); + } + } + mutex_unlock(¶m->lock); + + flush_workqueue(queue->wq); + + return 0; +} +EXPORT_SYMBOL_GPL(iopf_queue_flush_dev); + +/** + * iopf_queue_add_device - Add producer to the fault queue + * @queue: IOPF queue + * @dev: device to add + */ +int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev) +{ + int ret = -EINVAL; + struct iopf_device_param *iopf_param; + struct iommu_param *param = dev->iommu_param; + + if (!param) + return -ENODEV; + + iopf_param = kzalloc(sizeof(*iopf_param), GFP_KERNEL); + if (!iopf_param) + return -ENOMEM; + + INIT_LIST_HEAD(&iopf_param->partial); + iopf_param->queue = queue; + + mutex_lock(¶m->lock); + if (!param->iopf_param) { + refcount_inc(&queue->refs); + param->iopf_param = iopf_param; + ret = 0; + } + mutex_unlock(¶m->lock); + + if (ret) + kfree(iopf_param); + + return ret; +} +EXPORT_SYMBOL_GPL(iopf_queue_add_device); + +/** + * iopf_queue_remove_device - Remove producer from fault queue + * @dev: device to remove + * + * Caller makes sure that no more fault is reported for this device, and no more + * flush is scheduled for this device. + * + * Note: safe to call unconditionally on a cleanup path, even if the device + * isn't registered to any IOPF queue. 
+ * + * Return 0 if the device was attached to the IOPF queue + */ +int iopf_queue_remove_device(struct device *dev) +{ + struct iopf_fault *fault, *next; + struct iopf_device_param *iopf_param; + struct iommu_param *param = dev->iommu_param; + + if (!param) + return -EINVAL; + + mutex_lock(¶m->lock); + iopf_param = param->iopf_param; + if (iopf_param) { + refcount_dec(&iopf_param->queue->refs); + param->iopf_param = NULL; + } + mutex_unlock(¶m->lock); + if (!iopf_param) + return -EINVAL; + + /* Just in case flush_dev() wasn't called */ + list_for_each_entry_safe(fault, next, &iopf_param->partial, head) + kfree(fault); + + /* + * No more flush is scheduled, and the caller removed all bonds from + * this device. unbind() waited until any concurrent mm_exit() finished, + * therefore there is no flush() running anymore and we can free the + * param. + */ + kfree(iopf_param); + + return 0; +} +EXPORT_SYMBOL_GPL(iopf_queue_remove_device); + +/** + * iopf_queue_alloc - Allocate and initialize a fault queue + * @name: a unique string identifying the queue (for workqueue) + * @flush: a callback that flushes the low-level queue + * @cookie: driver-private data passed to the flush callback + * + * The callback is called before the workqueue is flushed. The IOMMU driver must + * commit all faults that are pending in its low-level queues at the time of the + * call, into the IOPF queue (with iommu_report_device_fault). The callback + * takes a device pointer as argument, hinting what endpoint is causing the + * flush. When the device is NULL, all faults should be committed. + */ +struct iopf_queue * +iopf_queue_alloc(const char *name, iopf_queue_flush_t flush, void *cookie) +{ + struct iopf_queue *queue; + + queue = kzalloc(sizeof(*queue), GFP_KERNEL); + if (!queue) + return NULL; + + /* + * The WQ is unordered because the low-level handler enqueues faults by + * group. PRI requests within a group have to be ordered, but once + * that's dealt with, the high-level function can handle groups out of + * order. + */ + queue->wq = alloc_workqueue("iopf_queue/%s", WQ_UNBOUND, 0, name); + if (!queue->wq) { + kfree(queue); + return NULL; + } + + queue->flush = flush; + queue->flush_arg = cookie; + refcount_set(&queue->refs, 1); + + return queue; +} +EXPORT_SYMBOL_GPL(iopf_queue_alloc); + +/** + * iopf_queue_free - Free IOPF queue + * @queue: queue to free + * + * Counterpart to iopf_queue_alloc(). Caller must make sure that all producers + * have been removed. 
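+ *
+ * A typical (illustrative) teardown sequence in an IOMMU driver would be:
+ *
+ *	for each device still attached to the queue:
+ *		iopf_queue_remove_device(dev);
+ *	iopf_queue_free(queue);
+ *
+ * so that the references taken by iopf_queue_add_device() are dropped and only
+ * the initial reference from iopf_queue_alloc() remains when the queue is
+ * destroyed.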
+ */ +void iopf_queue_free(struct iopf_queue *queue) +{ + /* Caller should have removed all producers first */ + if (WARN_ON(!refcount_dec_and_test(&queue->refs))) + return; + + destroy_workqueue(queue->wq); + kfree(queue); +} +EXPORT_SYMBOL_GPL(iopf_queue_free); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index a457650b80de..b7cd00ae7358 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -63,6 +63,8 @@ typedef int (*iommu_fault_handler_t)(struct iommu_domain *, typedef int (*iommu_dev_fault_handler_t)(struct iommu_fault_event *, void *); typedef int (*iommu_mm_exit_handler_t)(struct device *dev, int pasid, void *); +#define IOMMU_PASID_INVALID (-1) + struct iommu_domain_geometry { dma_addr_t aperture_start; /* First address that can be mapped */ dma_addr_t aperture_end; /* Last address that can be mapped */ @@ -440,11 +442,20 @@ struct iommu_fault_param { void *data; }; +/** + * iopf_queue_flush_t - Flush low-level page fault queue + * + * Report all faults currently pending in the low-level page fault queue + */ +struct iopf_queue; +typedef int (*iopf_queue_flush_t)(void *cookie, struct device *dev, int pasid); + /** * struct iommu_param - collection of per-device IOMMU data * * @fault_param: IOMMU detected device fault reporting data - * @lock: serializes accesses to fault_param + * @iopf_param: I/O Page Fault queue and data + * @lock: serializes accesses to fault_param and iopf_param * @sva_param: SVA parameters * @sva_lock: serializes accesses to sva_param * @@ -455,6 +466,7 @@ struct iommu_fault_param { struct iommu_param { struct mutex lock; struct iommu_fault_param *fault_param; + struct iopf_device_param *iopf_param; struct mutex sva_lock; struct iommu_sva_param *sva_param; }; @@ -1025,4 +1037,46 @@ static inline struct mm_struct *iommu_sva_find(int pasid) } #endif /* CONFIG_IOMMU_SVA */ +#ifdef CONFIG_IOMMU_PAGE_FAULT +extern int iommu_queue_iopf(struct iommu_fault_event *evt, void *cookie); + +extern int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev); +extern int iopf_queue_remove_device(struct device *dev); +extern int iopf_queue_flush_dev(struct device *dev, int pasid); +extern struct iopf_queue * +iopf_queue_alloc(const char *name, iopf_queue_flush_t flush, void *cookie); +extern void iopf_queue_free(struct iopf_queue *queue); +#else /* CONFIG_IOMMU_PAGE_FAULT */ +static inline int iommu_queue_iopf(struct iommu_fault_event *evt, void *cookie) +{ + return -ENODEV; +} + +static inline int iopf_queue_add_device(struct iopf_queue *queue, + struct device *dev) +{ + return -ENODEV; +} + +static inline int iopf_queue_remove_device(struct device *dev) +{ + return -ENODEV; +} + +static inline int iopf_queue_flush_dev(struct device *dev, int pasid) +{ + return -ENODEV; +} + +static inline struct iopf_queue * +iopf_queue_alloc(const char *name, iopf_queue_flush_t flush, void *cookie) +{ + return NULL; +} + +static inline void iopf_queue_free(struct iopf_queue *queue) +{ +} +#endif /* CONFIG_IOMMU_PAGE_FAULT */ + #endif /* __LINUX_IOMMU_H */ From patchwork Thu Sep 20 17:00:44 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10608333 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 33CE415E8 for ; Thu, 20 Sep 2018 17:25:20 +0000 (UTC) Received: from 
mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1B18B2DEF4 for ; Thu, 20 Sep 2018 17:25:20 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0F2622E298; Thu, 20 Sep 2018 17:25:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 900E22E283 for ; Thu, 20 Sep 2018 17:25:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728444AbeITXJt (ORCPT ); Thu, 20 Sep 2018 19:09:49 -0400 Received: from foss.arm.com ([217.140.101.70]:50298 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726781AbeITXJt (ORCPT ); Thu, 20 Sep 2018 19:09:49 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2BB7115BE; Thu, 20 Sep 2018 10:25:18 -0700 (PDT) Received: from ostrya.Emea.Arm.com (ostrya.emea.arm.com [10.4.12.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 844CC3F557; Thu, 20 Sep 2018 10:25:14 -0700 (PDT) From: Jean-Philippe Brucker To: iommu@lists.linux-foundation.org Cc: joro@8bytes.org, linux-pci@vger.kernel.org, jcrouse@codeaurora.org, alex.williamson@redhat.com, Jonathan.Cameron@huawei.com, jacob.jun.pan@linux.intel.com, christian.koenig@amd.com, eric.auger@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, andrew.murray@arm.com, will.deacon@arm.com, robin.murphy@arm.com, ashok.raj@intel.com, baolu.lu@linux.intel.com, xuzaibo@huawei.com, liguozhu@hisilicon.com, okaya@codeaurora.org, bharatku@xilinx.com, ilias.apalodimas@linaro.org, shunyong.yang@hxt-semitech.com Subject: [PATCH v3 08/10] iommu/iopf: Handle mm faults Date: Thu, 20 Sep 2018 18:00:44 +0100 Message-Id: <20180920170046.20154-9-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180920170046.20154-1-jean-philippe.brucker@arm.com> References: <20180920170046.20154-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When a recoverable page fault is handled by the fault workqueue, find the associated mm and call handle_mm_fault. 
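For illustration, the path of a single recoverable fault then looks roughly
like this (the driver-side entry point and its exact arguments depend on the
fault reporting infrastructure and on the IOMMU driver):

    IOMMU driver, in its PRI/stall IRQ handler:
        iommu_report_device_fault(...)
            -> iommu_queue_iopf(evt, dev)           queue the work
    fault workqueue:
        iopf_handle_group()
            iopf_handle_single()
                mm = iommu_sva_find(evt->pasid)     takes a reference on the mm
                handle_mm_fault(vma, evt->addr, fault_flags)
                mmput_async(mm)
            iopf_complete() -> iommu_page_response(dev, &resp)

On success the IOMMU retries the access, otherwise the device receives an
invalid page response.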
Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/io-pgfault.c | 86 +++++++++++++++++++++++++++++++++++++- 1 file changed, 84 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index 29aa8c6ba459..f6d9f40b879b 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -7,6 +7,7 @@ #include #include +#include #include #include @@ -65,8 +66,65 @@ static int iopf_complete(struct device *dev, struct iommu_fault_event *evt, static enum page_response_code iopf_handle_single(struct iopf_fault *fault) { - /* TODO */ - return -ENODEV; + vm_fault_t ret; + struct mm_struct *mm; + struct vm_area_struct *vma; + unsigned int access_flags = 0; + unsigned int fault_flags = FAULT_FLAG_REMOTE; + struct iommu_fault_event *evt = &fault->evt; + enum page_response_code status = IOMMU_PAGE_RESP_INVALID; + + if (!evt->pasid_valid) + return status; + + mm = iommu_sva_find(evt->pasid); + if (!mm) + return status; + + down_read(&mm->mmap_sem); + + vma = find_extend_vma(mm, evt->addr); + if (!vma) + /* Unmapped area */ + goto out_put_mm; + + if (evt->prot & IOMMU_FAULT_READ) + access_flags |= VM_READ; + + if (evt->prot & IOMMU_FAULT_WRITE) { + access_flags |= VM_WRITE; + fault_flags |= FAULT_FLAG_WRITE; + } + + if (evt->prot & IOMMU_FAULT_EXEC) { + access_flags |= VM_EXEC; + fault_flags |= FAULT_FLAG_INSTRUCTION; + } + + if (!(evt->prot & IOMMU_FAULT_PRIV)) + fault_flags |= FAULT_FLAG_USER; + + if (access_flags & ~vma->vm_flags) + /* Access fault */ + goto out_put_mm; + + ret = handle_mm_fault(vma, evt->addr, fault_flags); + status = ret & VM_FAULT_ERROR ? IOMMU_PAGE_RESP_INVALID : + IOMMU_PAGE_RESP_SUCCESS; + +out_put_mm: + up_read(&mm->mmap_sem); + + /* + * If the process exits while we're handling the fault on its mm, we + * can't do mmput(). exit_mmap() would release the MMU notifier, calling + * iommu_notifier_release(), which has to flush the fault queue that + * we're executing on... So mmput_async() moves the release of the mm to + * another thread, if we're the last user. + */ + mmput_async(mm); + + return status; } static void iopf_handle_group(struct work_struct *work) @@ -100,6 +158,30 @@ static void iopf_handle_group(struct work_struct *work) * @cookie: struct device, passed to iommu_register_device_fault_handler. * * Add a fault to the device workqueue, to be handled by mm. + * + * This module doesn't handle PCI PASID Stop Marker; IOMMU drivers must discard + * them before reporting faults. A PASID Stop Marker (LRW = 0b100) doesn't + * expect a response. It may be generated when disabling a PASID (issuing a + * PASID stop request) by some PCI devices. + * + * The PASID stop request is triggered by the mm_exit() callback. When the + * callback returns from the device driver, no page request is generated for + * this PASID anymore and outstanding ones have been pushed to the IOMMU (as per + * PCIe 4.0r1.0 - 6.20.1 and 10.4.1.2 - Managing PASID TLP Prefix Usage). Some + * PCI devices will wait for all outstanding page requests to come back with a + * response before completing the PASID stop request. Others do not wait for + * page responses, and instead issue this Stop Marker that tells us when the + * PASID can be reallocated. + * + * It is safe to discard the Stop Marker because it is an optimization. + * a. Page requests, which are posted requests, have been flushed to the IOMMU + * when mm_exit() returns, + * b. We flush all fault queues after mm_exit() returns and before freeing the + * PASID. 
+ * + * So even though the Stop Marker might be issued by the device *after* the stop + * request completes, outstanding faults will have been dealt with by the time + * we free the PASID. */ int iommu_queue_iopf(struct iommu_fault_event *evt, void *cookie) { From patchwork Thu Sep 20 17:00:45 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10608335 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E552215E8 for ; Thu, 20 Sep 2018 17:25:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C9F3C2E283 for ; Thu, 20 Sep 2018 17:25:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BDFEC2E29B; Thu, 20 Sep 2018 17:25:23 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3687A2E283 for ; Thu, 20 Sep 2018 17:25:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728787AbeITXJw (ORCPT ); Thu, 20 Sep 2018 19:09:52 -0400 Received: from foss.arm.com ([217.140.101.70]:50320 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726781AbeITXJw (ORCPT ); Thu, 20 Sep 2018 19:09:52 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E7B301596; Thu, 20 Sep 2018 10:25:21 -0700 (PDT) Received: from ostrya.Emea.Arm.com (ostrya.emea.arm.com [10.4.12.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 692763F557; Thu, 20 Sep 2018 10:25:18 -0700 (PDT) From: Jean-Philippe Brucker To: iommu@lists.linux-foundation.org Cc: joro@8bytes.org, linux-pci@vger.kernel.org, jcrouse@codeaurora.org, alex.williamson@redhat.com, Jonathan.Cameron@huawei.com, jacob.jun.pan@linux.intel.com, christian.koenig@amd.com, eric.auger@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, andrew.murray@arm.com, will.deacon@arm.com, robin.murphy@arm.com, ashok.raj@intel.com, baolu.lu@linux.intel.com, xuzaibo@huawei.com, liguozhu@hisilicon.com, okaya@codeaurora.org, bharatku@xilinx.com, ilias.apalodimas@linaro.org, shunyong.yang@hxt-semitech.com Subject: [PATCH v3 09/10] iommu/sva: Register page fault handler Date: Thu, 20 Sep 2018 18:00:45 +0100 Message-Id: <20180920170046.20154-10-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180920170046.20154-1-jean-philippe.brucker@arm.com> References: <20180920170046.20154-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Let users call iommu_sva_init_device() with the IOMMU_SVA_FEAT_IOPF flag, that enables the I/O Page Fault queue. The IOMMU driver checks is the device supports a form of page fault, in which case they add the device to a fault queue. If the device doesn't support page faults, the IOMMU driver aborts iommu_sva_init_device(). 
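From the device driver's point of view, opting into recoverable faults is a
matter of requesting the feature at initialization time. A sketch, assuming
iommu_sva_init_device() takes the min/max PASID values and the mm-exit handler
described earlier, and using made-up driver arguments (mydrv_mm_exit, drvdata):

    /* 0 for min/max PASID keeps the IOMMU and device limits */
    ret = iommu_sva_init_device(dev, IOMMU_SVA_FEAT_IOPF, 0, 0, mydrv_mm_exit);
    if (ret)
        return ret;    /* e.g. the device cannot fault: pin pages instead */

    ret = iommu_sva_bind_device(dev, current->mm, &pasid, 0, drvdata);

The IOMMU driver, in its sva_init_device() callback, checks whether the device
can actually take page faults (PCI PRI, stall, ...) and attaches it with
iopf_queue_add_device() to one of the queues it created with
iopf_queue_alloc().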
The fault queue must be flushed before any io_mm is freed, to make sure that its PASID isn't used in any fault queue, and can be reallocated. Add iopf_queue_flush() calls in a few strategic locations. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/iommu-sva.c | 26 +++++++++++++++++++++++++- drivers/iommu/iommu.c | 6 +++--- include/linux/iommu.h | 2 ++ 3 files changed, 30 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index ee86f00ee1b9..1588a523a214 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -443,6 +443,8 @@ static void iommu_notifier_release(struct mmu_notifier *mn, struct mm_struct *mm dev_WARN(bond->dev, "possible leak of PASID %u", io_mm->pasid); + iopf_queue_flush_dev(bond->dev, io_mm->pasid); + spin_lock(&iommu_sva_lock); next = list_next_entry(bond, mm_head); @@ -590,6 +592,12 @@ int __iommu_sva_unbind_device(struct device *dev, int pasid) goto out_unlock; } + /* + * Caller stopped the device from issuing PASIDs, now make sure they are + * out of the fault queue. + */ + iopf_queue_flush_dev(dev, pasid); + /* spin_lock_irq matches the one in wait_event_lock_irq */ spin_lock_irq(&iommu_sva_lock); list_for_each_entry(bond, ¶m->mm_list, dev_head) { @@ -615,6 +623,8 @@ static void __iommu_sva_unbind_device_all(struct device *dev) if (!param) return; + iopf_queue_flush_dev(dev, IOMMU_PASID_INVALID); + spin_lock_irq(&iommu_sva_lock); list_for_each_entry_safe(bond, next, ¶m->mm_list, dev_head) io_mm_detach_locked(bond, true); @@ -680,6 +690,9 @@ EXPORT_SYMBOL_GPL(iommu_sva_find); * overrides it. Similarly, @min_pasid overrides the lower PASID limit supported * by the IOMMU. * + * If the device should support recoverable I/O Page Faults (e.g. PCI PRI), the + * IOMMU_SVA_FEAT_IOPF feature must be requested. + * * @mm_exit is called when an address space bound to the device is about to be * torn down by exit_mmap. After @mm_exit returns, the device must not issue any * more transaction with the PASID given as argument. The handler gets an opaque @@ -707,7 +720,7 @@ int iommu_sva_init_device(struct device *dev, unsigned long features, if (!domain || !domain->ops->sva_init_device) return -ENODEV; - if (features) + if (features & ~IOMMU_SVA_FEAT_IOPF) return -EINVAL; param = kzalloc(sizeof(*param), GFP_KERNEL); @@ -734,10 +747,20 @@ int iommu_sva_init_device(struct device *dev, unsigned long features, if (ret) goto err_unlock; + if (features & IOMMU_SVA_FEAT_IOPF) { + ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf, + dev); + if (ret) + goto err_shutdown; + } + dev->iommu_param->sva_param = param; mutex_unlock(&dev->iommu_param->sva_lock); return 0; +err_shutdown: + if (domain->ops->sva_shutdown_device) + domain->ops->sva_shutdown_device(dev); err_unlock: mutex_unlock(&dev->iommu_param->sva_lock); kfree(param); @@ -766,6 +789,7 @@ void iommu_sva_shutdown_device(struct device *dev) goto out_unlock; __iommu_sva_unbind_device_all(dev); + iommu_unregister_device_fault_handler(dev); if (domain->ops->sva_shutdown_device) domain->ops->sva_shutdown_device(dev); diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 7113fe398b70..b493f5c4fe64 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2342,9 +2342,9 @@ EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids); * iommu_sva_init_device() must be called first, to initialize the required SVA * features. @flags must be a subset of these features. * - * The caller must pin down using get_user_pages*() all mappings shared with the - * device. 
mlock() isn't sufficient, as it doesn't prevent minor page faults - * (e.g. copy-on-write). + * If IOMMU_SVA_FEAT_IOPF isn't requested, the caller must pin down using + * get_user_pages*() all mappings shared with the device. mlock() isn't + * sufficient, as it doesn't prevent minor page faults (e.g. copy-on-write). * * On success, 0 is returned and @pasid contains a valid ID. Otherwise, an error * is returned. diff --git a/include/linux/iommu.h b/include/linux/iommu.h index b7cd00ae7358..ad2b18883ae2 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -65,6 +65,8 @@ typedef int (*iommu_mm_exit_handler_t)(struct device *dev, int pasid, void *); #define IOMMU_PASID_INVALID (-1) +#define IOMMU_SVA_FEAT_IOPF (1 << 0) + struct iommu_domain_geometry { dma_addr_t aperture_start; /* First address that can be mapped */ dma_addr_t aperture_end; /* Last address that can be mapped */ From patchwork Thu Sep 20 17:00:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10608337 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C5DA9913 for ; Thu, 20 Sep 2018 17:25:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A9BAB2DEF4 for ; Thu, 20 Sep 2018 17:25:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9D7B12E298; Thu, 20 Sep 2018 17:25:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 384AA2DEF4 for ; Thu, 20 Sep 2018 17:25:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730955AbeITXJ5 (ORCPT ); Thu, 20 Sep 2018 19:09:57 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:50362 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726781AbeITXJ4 (ORCPT ); Thu, 20 Sep 2018 19:09:56 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id AFC427A9; Thu, 20 Sep 2018 10:25:25 -0700 (PDT) Received: from ostrya.Emea.Arm.com (ostrya.emea.arm.com [10.4.12.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 310AF3F557; Thu, 20 Sep 2018 10:25:22 -0700 (PDT) From: Jean-Philippe Brucker To: iommu@lists.linux-foundation.org Cc: joro@8bytes.org, linux-pci@vger.kernel.org, jcrouse@codeaurora.org, alex.williamson@redhat.com, Jonathan.Cameron@huawei.com, jacob.jun.pan@linux.intel.com, christian.koenig@amd.com, eric.auger@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, andrew.murray@arm.com, will.deacon@arm.com, robin.murphy@arm.com, ashok.raj@intel.com, baolu.lu@linux.intel.com, xuzaibo@huawei.com, liguozhu@hisilicon.com, okaya@codeaurora.org, bharatku@xilinx.com, ilias.apalodimas@linaro.org, shunyong.yang@hxt-semitech.com Subject: [RFC PATCH v3 10/10] iommu/sva: Add support for private PASIDs Date: Thu, 20 Sep 2018 18:00:46 +0100 Message-Id: <20180920170046.20154-11-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.18.0 
In-Reply-To: <20180920170046.20154-1-jean-philippe.brucker@arm.com> References: <20180920170046.20154-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Provide an API for allocating PASIDs and populating them manually. To ease cleanup and factor allocation code, reuse the io_mm structure for private PASID. Private io_mm has a NULL mm_struct pointer, and cannot be bound to multiple devices. The mm_alloc() IOMMU op must now check if the mm argument is NULL, in which case it should allocate io_pgtables instead of binding to an mm. Signed-off-by: Jordan Crouse Signed-off-by: Jean-Philippe Brucker --- Sadly this probably won't be the final thing. The API in this patch is used like this: iommu_sva_alloc_pasid(dev, &io_mm) -> PASID iommu_sva_map(io_mm, ...) iommu_sva_unmap(io_mm, ...) iommu_sva_free_pasid(dev, io_mm) The proposed API for auxiliary domains is in an early stage but might replace this patch and could be used like this: iommu_enable_aux_domain(dev) d = iommu_domain_alloc() iommu_attach_aux(dev, d) iommu_aux_id(d) -> PASID iommu_map(d, ...) iommu_unmap(d, ...) iommu_detach_aux(dev, d) iommu_domain_free(d) The advantage being that the driver doesn't have to use a special version of map/unmap/etc. --- drivers/iommu/iommu-sva.c | 209 ++++++++++++++++++++++++++++++++++---- drivers/iommu/iommu.c | 51 ++++++---- include/linux/iommu.h | 112 +++++++++++++++++++- 3 files changed, 331 insertions(+), 41 deletions(-) diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index 1588a523a214..029776f64e7d 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -15,11 +15,11 @@ /** * DOC: io_mm model * - * The io_mm keeps track of process address spaces shared between CPU and IOMMU. - * The following example illustrates the relation between structures - * iommu_domain, io_mm and iommu_bond. An iommu_bond is a link between io_mm and - * device. A device can have multiple io_mm and an io_mm may be bound to - * multiple devices. + * When used with the bind()/unbind() functions, the io_mm keeps track of + * process address spaces shared between CPU and IOMMU. The following example + * illustrates the relation between structures iommu_domain, io_mm and + * iommu_bond. An iommu_bond is a link between io_mm and device. A device can + * have multiple io_mm and an io_mm may be bound to multiple devices. * ___________________________ * | IOMMU domain A | * | ________________ | @@ -98,6 +98,12 @@ * the first entry points to the io_pgtable pointer. In other IOMMUs the * io_pgtable pointer is held in the device table and PASID #0 is available to * the allocator. + * + * The io_mm can also represent a private IOMMU address space, which isn't + * shared with a process. The device driver calls iommu_sva_alloc_pasid which + * returns an io_mm that can be populated with the iommu_sva_map/unmap + * functions. The principle is the same as shared io_mm, except that a private + * io_mm cannot be bound to multiple devices. 
*/ struct iommu_bond { @@ -131,6 +137,9 @@ static DEFINE_SPINLOCK(iommu_sva_lock); static struct mmu_notifier_ops iommu_mmu_notifier; +#define io_mm_is_private(io_mm) ((io_mm) != NULL && (io_mm)->mm == NULL) +#define io_mm_is_shared(io_mm) ((io_mm) != NULL && (io_mm)->mm != NULL) + static struct io_mm * io_mm_alloc(struct iommu_domain *domain, struct device *dev, struct mm_struct *mm, unsigned long flags) @@ -149,19 +158,10 @@ io_mm_alloc(struct iommu_domain *domain, struct device *dev, if (!io_mm) return ERR_PTR(-ENOMEM); - /* - * The mm must not be freed until after the driver frees the io_mm - * (which may involve unpinning the CPU ASID for instance, requiring a - * valid mm struct.) - */ - mmgrab(mm); - io_mm->flags = flags; io_mm->mm = mm; - io_mm->notifier.ops = &iommu_mmu_notifier; io_mm->release = domain->ops->mm_free; INIT_LIST_HEAD(&io_mm->devices); - /* Leave kref as zero until the io_mm is fully initialized */ idr_preload(GFP_KERNEL); spin_lock(&iommu_sva_lock); @@ -176,6 +176,32 @@ io_mm_alloc(struct iommu_domain *domain, struct device *dev, goto err_free_mm; } + return io_mm; + +err_free_mm: + io_mm->release(io_mm); + return ERR_PTR(ret); +} + +static struct io_mm * +io_mm_alloc_shared(struct iommu_domain *domain, struct device *dev, + struct mm_struct *mm, unsigned long flags) +{ + int ret; + struct io_mm *io_mm; + + io_mm = io_mm_alloc(domain, dev, mm, flags); + if (IS_ERR(io_mm)) + return io_mm; + + /* + * The mm must not be freed until after the driver frees the io_mm + * (which may involve unpinning the CPU ASID for instance, requiring a + * valid mm struct.) + */ + mmgrab(mm); + + io_mm->notifier.ops = &iommu_mmu_notifier; ret = mmu_notifier_register(&io_mm->notifier, mm); if (ret) goto err_free_pasid; @@ -203,7 +229,6 @@ io_mm_alloc(struct iommu_domain *domain, struct device *dev, idr_remove(&iommu_pasid_idr, io_mm->pasid); spin_unlock(&iommu_sva_lock); -err_free_mm: io_mm->release(io_mm); mmdrop(mm); @@ -231,6 +256,11 @@ static void io_mm_release(struct kref *kref) idr_remove(&iommu_pasid_idr, io_mm->pasid); + if (io_mm_is_private(io_mm)) { + io_mm->release(io_mm); + return; + } + /* * If we're being released from mm exit, the notifier callback ->release * has already been called. Otherwise we don't need ->release, the io_mm @@ -258,7 +288,7 @@ static int io_mm_get_locked(struct io_mm *io_mm) if (io_mm && kref_get_unless_zero(&io_mm->kref)) { /* * kref_get_unless_zero doesn't provide ordering for reads. This - * barrier pairs with the one in io_mm_alloc. + * barrier pairs with the one in io_mm_alloc_shared. 
*/ smp_rmb(); return 1; @@ -289,7 +319,7 @@ static int io_mm_attach(struct iommu_domain *domain, struct device *dev, struct iommu_sva_param *param = dev->iommu_param->sva_param; if (!domain->ops->mm_attach || !domain->ops->mm_detach || - !domain->ops->mm_invalidate) + (io_mm_is_shared(io_mm) && !domain->ops->mm_invalidate)) return -ENODEV; if (pasid > param->max_pasid || pasid < param->min_pasid) @@ -555,7 +585,7 @@ int __iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid } if (!io_mm) { - io_mm = io_mm_alloc(domain, dev, mm, flags); + io_mm = io_mm_alloc_shared(domain, dev, mm, flags); if (IS_ERR(io_mm)) { ret = PTR_ERR(io_mm); goto out_unlock; @@ -601,6 +631,9 @@ int __iommu_sva_unbind_device(struct device *dev, int pasid) /* spin_lock_irq matches the one in wait_event_lock_irq */ spin_lock_irq(&iommu_sva_lock); list_for_each_entry(bond, ¶m->mm_list, dev_head) { + if (io_mm_is_private(bond->io_mm)) + continue; + if (bond->io_mm->pasid == pasid) { io_mm_detach_locked(bond, true); ret = 0; @@ -672,6 +705,136 @@ struct mm_struct *iommu_sva_find(int pasid) } EXPORT_SYMBOL_GPL(iommu_sva_find); +/* + * iommu_sva_alloc_pasid - Allocate a private PASID + * + * Allocate a PASID for private map/unmap operations. Create a new I/O address + * space for this device, that isn't bound to any process. + * + * iommu_sva_init_device must have been called first. + */ +int iommu_sva_alloc_pasid(struct device *dev, struct io_mm **out) +{ + int ret; + struct io_mm *io_mm; + struct iommu_domain *domain; + struct iommu_sva_param *param = dev->iommu_param->sva_param; + + if (!out || !param) + return -EINVAL; + + domain = iommu_get_domain_for_dev(dev); + if (!domain) + return -EINVAL; + + io_mm = io_mm_alloc(domain, dev, NULL, 0); + if (IS_ERR(io_mm)) + return PTR_ERR(io_mm); + + kref_init(&io_mm->kref); + + ret = io_mm_attach(domain, dev, io_mm, NULL); + if (ret) { + io_mm_put(io_mm); + return ret; + } + + *out = io_mm; + return 0; +} +EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid); + +void iommu_sva_free_pasid(struct device *dev, struct io_mm *io_mm) +{ + struct iommu_bond *bond; + + if (WARN_ON(io_mm_is_shared(io_mm))) + return; + + spin_lock(&iommu_sva_lock); + list_for_each_entry(bond, &io_mm->devices, mm_head) { + if (bond->dev == dev) { + io_mm_detach_locked(bond, false); + break; + } + } + spin_unlock(&iommu_sva_lock); +} +EXPORT_SYMBOL_GPL(iommu_sva_free_pasid); + +int iommu_sva_map(struct iommu_domain *domain, struct io_mm *io_mm, + unsigned long iova, phys_addr_t paddr, size_t size, int prot) +{ + if (WARN_ON(io_mm_is_shared(io_mm))) + return -ENODEV; + + return __iommu_map(domain, io_mm, iova, paddr, size, prot); +} +EXPORT_SYMBOL_GPL(iommu_sva_map); + +size_t iommu_sva_map_sg(struct iommu_domain *domain, struct io_mm *io_mm, + unsigned long iova, struct scatterlist *sg, + unsigned int nents, int prot) +{ + if (WARN_ON(io_mm_is_shared(io_mm))) + return -ENODEV; + + return __iommu_map_sg(domain, io_mm, iova, sg, nents, prot); +} +EXPORT_SYMBOL_GPL(iommu_sva_map_sg); + +size_t iommu_sva_unmap(struct iommu_domain *domain, struct io_mm *io_mm, + unsigned long iova, size_t size) +{ + if (WARN_ON(io_mm_is_shared(io_mm))) + return 0; + + return __iommu_unmap(domain, io_mm, iova, size, true); +} +EXPORT_SYMBOL_GPL(iommu_sva_unmap); + +size_t iommu_sva_unmap_fast(struct iommu_domain *domain, struct io_mm *io_mm, + unsigned long iova, size_t size) +{ + if (WARN_ON(io_mm_is_shared(io_mm))) + return 0; + + return __iommu_unmap(domain, io_mm, iova, size, false); +} 
+EXPORT_SYMBOL_GPL(iommu_sva_unmap_fast);
+
+phys_addr_t iommu_sva_iova_to_phys(struct iommu_domain *domain,
+                                  struct io_mm *io_mm, dma_addr_t iova)
+{
+       if (!io_mm)
+               return iommu_iova_to_phys(domain, iova);
+
+       if (WARN_ON(io_mm_is_shared(io_mm)))
+               return 0;
+
+       if (unlikely(domain->ops->sva_iova_to_phys == NULL))
+               return 0;
+
+       return domain->ops->sva_iova_to_phys(domain, io_mm, iova);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_iova_to_phys);
+
+void iommu_sva_tlb_range_add(struct iommu_domain *domain, struct io_mm *io_mm,
+                            unsigned long iova, size_t size)
+{
+       if (!io_mm) {
+               iommu_tlb_range_add(domain, iova, size);
+               return;
+       }
+
+       if (WARN_ON(io_mm_is_shared(io_mm)))
+               return;
+
+       if (domain->ops->sva_iotlb_range_add != NULL)
+               domain->ops->sva_iotlb_range_add(domain, io_mm, iova, size);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_tlb_range_add);
+
 /**
  * iommu_sva_init_device() - Initialize Shared Virtual Addressing for a device
  * @dev: the device
@@ -693,10 +856,12 @@ EXPORT_SYMBOL_GPL(iommu_sva_find);
  * If the device should support recoverable I/O Page Faults (e.g. PCI PRI), the
  * IOMMU_SVA_FEAT_IOPF feature must be requested.
  *
- * @mm_exit is called when an address space bound to the device is about to be
- * torn down by exit_mmap. After @mm_exit returns, the device must not issue any
- * more transaction with the PASID given as argument. The handler gets an opaque
- * pointer corresponding to the drvdata passed as argument to bind().
+ * If the driver intends to share process address spaces with the device, it
+ * should pass a valid @mm_exit handler. @mm_exit is called when an address
+ * space bound to the device is about to be torn down by exit_mmap. After
+ * @mm_exit returns, the device must not issue any more transactions with the
+ * PASID given as argument. The handler gets an opaque pointer corresponding to
+ * the drvdata passed as argument to bind().
+ *
  * The @mm_exit handler is allowed to sleep.
Be careful about the locks taken in * @mm_exit, because they might lead to deadlocks if they are also held when diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index b493f5c4fe64..dd75c0a19c3a 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1854,8 +1854,8 @@ static size_t iommu_pgsize(struct iommu_domain *domain, return pgsize; } -int iommu_map(struct iommu_domain *domain, unsigned long iova, - phys_addr_t paddr, size_t size, int prot) +int __iommu_map(struct iommu_domain *domain, struct io_mm *io_mm, + unsigned long iova, phys_addr_t paddr, size_t size, int prot) { unsigned long orig_iova = iova; unsigned int min_pagesz; @@ -1863,7 +1863,8 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova, phys_addr_t orig_paddr = paddr; int ret = 0; - if (unlikely(domain->ops->map == NULL || + if (unlikely((!io_mm && domain->ops->map == NULL) || + (io_mm && domain->ops->sva_map == NULL) || domain->pgsize_bitmap == 0UL)) return -ENODEV; @@ -1892,7 +1893,12 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova, pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx\n", iova, &paddr, pgsize); - ret = domain->ops->map(domain, iova, paddr, pgsize, prot); + if (io_mm) + ret = domain->ops->sva_map(domain, io_mm, iova, paddr, + pgsize, prot); + else + ret = domain->ops->map(domain, iova, paddr, pgsize, + prot); if (ret) break; @@ -1903,24 +1909,30 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova, /* unroll mapping in case something went wrong */ if (ret) - iommu_unmap(domain, orig_iova, orig_size - size); + __iommu_unmap(domain, io_mm, orig_iova, orig_size - size, true); else trace_map(orig_iova, orig_paddr, orig_size); return ret; } + +int iommu_map(struct iommu_domain *domain, unsigned long iova, + phys_addr_t paddr, size_t size, int prot) +{ + return __iommu_map(domain, NULL, iova, paddr, size, prot); +} EXPORT_SYMBOL_GPL(iommu_map); -static size_t __iommu_unmap(struct iommu_domain *domain, - unsigned long iova, size_t size, - bool sync) +size_t __iommu_unmap(struct iommu_domain *domain, struct io_mm *io_mm, + unsigned long iova, size_t size, bool sync) { const struct iommu_ops *ops = domain->ops; size_t unmapped_page, unmapped = 0; unsigned long orig_iova = iova; unsigned int min_pagesz; - if (unlikely(ops->unmap == NULL || + if (unlikely((!io_mm && ops->unmap == NULL) || + (io_mm && ops->sva_unmap == NULL) || domain->pgsize_bitmap == 0UL)) return 0; @@ -1950,7 +1962,11 @@ static size_t __iommu_unmap(struct iommu_domain *domain, while (unmapped < size) { size_t pgsize = iommu_pgsize(domain, iova, size - unmapped); - unmapped_page = ops->unmap(domain, iova, pgsize); + if (io_mm) + unmapped_page = ops->sva_unmap(domain, io_mm, iova, + pgsize); + else + unmapped_page = ops->unmap(domain, iova, pgsize); if (!unmapped_page) break; @@ -1974,19 +1990,20 @@ static size_t __iommu_unmap(struct iommu_domain *domain, size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size) { - return __iommu_unmap(domain, iova, size, true); + return __iommu_unmap(domain, NULL, iova, size, true); } EXPORT_SYMBOL_GPL(iommu_unmap); size_t iommu_unmap_fast(struct iommu_domain *domain, unsigned long iova, size_t size) { - return __iommu_unmap(domain, iova, size, false); + return __iommu_unmap(domain, NULL, iova, size, false); } EXPORT_SYMBOL_GPL(iommu_unmap_fast); -size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova, - struct scatterlist *sg, unsigned int nents, int prot) +size_t __iommu_map_sg(struct iommu_domain *domain, struct io_mm 
*io_mm, + unsigned long iova, struct scatterlist *sg, + unsigned int nents, int prot) { struct scatterlist *s; size_t mapped = 0; @@ -2010,7 +2027,7 @@ size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova, if (!IS_ALIGNED(s->offset, min_pagesz)) goto out_err; - ret = iommu_map(domain, iova + mapped, phys, s->length, prot); + ret = __iommu_map(domain, io_mm, iova + mapped, phys, s->length, prot); if (ret) goto out_err; @@ -2021,12 +2038,12 @@ size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova, out_err: /* undo mappings already done */ - iommu_unmap(domain, iova, mapped); + __iommu_unmap(domain, io_mm, iova, mapped, true); return 0; } -EXPORT_SYMBOL_GPL(iommu_map_sg); +EXPORT_SYMBOL_GPL(__iommu_map_sg); int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr, phys_addr_t paddr, u64 size, int prot) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index ad2b18883ae2..0674fd983f81 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -248,11 +248,15 @@ struct iommu_sva_param { * @mm_invalidate: Invalidate a range of mappings for an mm * @map: map a physically contiguous memory region to an iommu domain * @unmap: unmap a physically contiguous memory region from an iommu domain + * @sva_map: map a physically contiguous memory region to an address space + * @sva_unmap: unmap a physically contiguous memory region from an address space * @flush_tlb_all: Synchronously flush all hardware TLBs for this domain * @tlb_range_add: Add a given iova range to the flush queue for this domain + * @sva_iotlb_range_add: Add a given iova range to the flush queue for this mm * @tlb_sync: Flush all queued ranges from the hardware TLBs and empty flush * queue * @iova_to_phys: translate iova to physical address + * @sva_iova_to_phys: translate iova to physical address * @add_device: add device to iommu grouping * @remove_device: remove device from iommu grouping * @device_group: find iommu group for a particular device @@ -298,11 +302,21 @@ struct iommu_ops { phys_addr_t paddr, size_t size, int prot); size_t (*unmap)(struct iommu_domain *domain, unsigned long iova, size_t size); + int (*sva_map)(struct iommu_domain *domain, struct io_mm *io_mm, + unsigned long iova, phys_addr_t paddr, size_t size, + int prot); + size_t (*sva_unmap)(struct iommu_domain *domain, struct io_mm *io_mm, + unsigned long iova, size_t size); void (*flush_iotlb_all)(struct iommu_domain *domain); void (*iotlb_range_add)(struct iommu_domain *domain, unsigned long iova, size_t size); + void (*sva_iotlb_range_add)(struct iommu_domain *domain, + struct io_mm *io_mm, unsigned long iova, + size_t size); void (*iotlb_sync)(struct iommu_domain *domain); phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t iova); + phys_addr_t (*sva_iova_to_phys)(struct iommu_domain *domain, + struct io_mm *io_mm, dma_addr_t iova); int (*add_device)(struct device *dev); void (*remove_device)(struct device *dev); struct iommu_group *(*device_group)(struct device *dev); @@ -525,14 +539,27 @@ extern int iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, struct tlb_invalidate_info *inv_info); extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev); +extern int __iommu_map(struct iommu_domain *domain, struct io_mm *io_mm, + unsigned long iova, phys_addr_t paddr, size_t size, + int prot); extern int iommu_map(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t size, int prot); +extern size_t __iommu_unmap(struct iommu_domain *domain, 
struct io_mm *io_mm,
+                           unsigned long iova, size_t size, bool sync);
 extern size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova,
                          size_t size);
 extern size_t iommu_unmap_fast(struct iommu_domain *domain,
                               unsigned long iova, size_t size);
-extern size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
-                          struct scatterlist *sg,unsigned int nents, int prot);
+extern size_t __iommu_map_sg(struct iommu_domain *domain, struct io_mm *io_mm,
+                            unsigned long iova, struct scatterlist *sg,
+                            unsigned int nents, int prot);
+static inline size_t iommu_map_sg(struct iommu_domain *domain,
+                                 unsigned long iova,
+                                 struct scatterlist *sg, unsigned int nents,
+                                 int prot)
+{
+       return __iommu_map_sg(domain, NULL, iova, sg, nents, prot);
+}
 extern phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova);
 extern void iommu_set_fault_handler(struct iommu_domain *domain,
                        iommu_fault_handler_t handler, void *token);
@@ -693,12 +720,25 @@ static inline struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
        return NULL;
 }
+static inline int __iommu_map(struct iommu_domain *domain, struct io_mm *io_mm,
+                             unsigned long iova, phys_addr_t paddr,
+                             size_t size, int prot)
+{
+       return -ENODEV;
+}
+
 static inline int iommu_map(struct iommu_domain *domain, unsigned long iova,
                            phys_addr_t paddr, size_t size, int prot)
 {
        return -ENODEV;
 }
+static inline size_t __iommu_unmap(struct iommu_domain *domain, struct io_mm *io_mm,
+                                  unsigned long iova, size_t size, bool sync)
+{
+       return 0;
+}
+
 static inline size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova,
                                 size_t size)
 {
@@ -1003,6 +1043,23 @@ extern int __iommu_sva_unbind_device(struct device *dev, int pasid);
 extern void iommu_sva_unbind_device_all(struct device *dev);
 extern struct mm_struct *iommu_sva_find(int pasid);
+int iommu_sva_alloc_pasid(struct device *dev, struct io_mm **io_mm);
+void iommu_sva_free_pasid(struct device *dev, struct io_mm *io_mm);
+
+int iommu_sva_map(struct iommu_domain *domain, struct io_mm *io_mm,
+                 unsigned long iova, phys_addr_t paddr, size_t size, int prot);
+size_t iommu_sva_map_sg(struct iommu_domain *domain, struct io_mm *io_mm,
+                       unsigned long iova, struct scatterlist *sg,
+                       unsigned int nents, int prot);
+size_t iommu_sva_unmap(struct iommu_domain *domain,
+                      struct io_mm *io_mm, unsigned long iova, size_t size);
+size_t iommu_sva_unmap_fast(struct iommu_domain *domain, struct io_mm *io_mm,
+                           unsigned long iova, size_t size);
+phys_addr_t iommu_sva_iova_to_phys(struct iommu_domain *domain,
+                                  struct io_mm *io_mm, dma_addr_t iova);
+void iommu_sva_tlb_range_add(struct iommu_domain *domain, struct io_mm *io_mm,
+                            unsigned long iova, size_t size);
+
 #else /* CONFIG_IOMMU_SVA */
 static inline int iommu_sva_init_device(struct device *dev,
                                        unsigned long features,
@@ -1037,6 +1094,57 @@ static inline struct mm_struct *iommu_sva_find(int pasid)
 {
        return NULL;
 }
+
+static inline int iommu_sva_alloc_pasid(struct device *dev, struct io_mm **io_mm)
+{
+       return -ENODEV;
+}
+
+static inline void iommu_sva_free_pasid(struct device *dev, struct io_mm *io_mm)
+{
+}
+
+static inline int iommu_sva_map(struct iommu_domain *domain,
+                               struct io_mm *io_mm, unsigned long iova,
+                               phys_addr_t paddr, size_t size, int prot)
+{
+       return -EINVAL;
+}
+
+static inline size_t iommu_sva_map_sg(struct iommu_domain *domain,
+                                     struct io_mm *io_mm, unsigned long iova,
+                                     struct scatterlist *sg,
+                                     unsigned int nents, int prot)
+{
+       return 0;
+}
+
+static inline size_t iommu_sva_unmap(struct iommu_domain *domain,
+ struct io_mm *io_mm, unsigned long iova, + size_t size) +{ + return 0; +} + +static inline size_t iommu_sva_unmap_fast(struct iommu_domain *domain, + struct io_mm *io_mm, + unsigned long iova, size_t size) +{ + return 0; +} + +static inline phys_addr_t iommu_sva_iova_to_phys(struct iommu_domain *domain, + struct io_mm *io_mm, + dma_addr_t iova) +{ + return 0; +} + +static inline void iommu_sva_tlb_range_add(struct iommu_domain *domain, + struct io_mm *io_mm, + unsigned long iova, size_t size) +{ +} #endif /* CONFIG_IOMMU_SVA */ #ifdef CONFIG_IOMMU_PAGE_FAULT
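/*
 * Illustrative sketch, not part of the patch: one way a driver might use the
 * private PASID interface introduced above. It assumes iommu_sva_init_device()
 * has already succeeded for the device; "my_dev", "buf_phys" and the IOVA and
 * size values are hypothetical placeholders.
 *
 *	struct io_mm *io_mm;
 *	struct iommu_domain *domain = iommu_get_domain_for_dev(my_dev);
 *	int ret;
 *
 *	ret = iommu_sva_alloc_pasid(my_dev, &io_mm);
 *	if (ret)
 *		return ret;
 *
 *	// Map a buffer into the new private address space, then program
 *	// io_mm->pasid into the device in a device-specific way.
 *	ret = iommu_sva_map(domain, io_mm, 0x100000, buf_phys, PAGE_SIZE,
 *			    IOMMU_READ | IOMMU_WRITE);
 *	...
 *	iommu_sva_unmap(domain, io_mm, 0x100000, PAGE_SIZE);
 *	iommu_sva_free_pasid(my_dev, io_mm);
 */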