From patchwork Wed Aug 9 12:17:43 2023
X-Patchwork-Submitter: Boris Brezillon
X-Patchwork-Id: 13347871
From: Boris Brezillon
To: Joerg Roedel, iommu@lists.linux.dev, Will Deacon, Robin Murphy,
 linux-arm-kernel@lists.infradead.org
Cc: Rob Clark, Boris Brezillon
Subject: [PATCH 1/2] iommu: Allow passing custom allocators to pgtable drivers
Date: Wed, 9 Aug 2023 14:17:43 +0200
Message-ID: <20230809121744.2341454-2-boris.brezillon@collabora.com>
In-Reply-To: <20230809121744.2341454-1-boris.brezillon@collabora.com>
References: <20230809121744.2341454-1-boris.brezillon@collabora.com>
This will be useful for GPU drivers that want to keep page tables in a
pool so they can:

- keep freed page tables in a free pool and speed up upcoming page
  table allocations
- batch page table allocation instead of allocating one page at a time
- pre-reserve pages for page tables needed for map/unmap operations,
  to ensure map/unmap operations don't try to allocate memory in paths
  where they're not allowed to block or fail

We will extend the Arm LPAE format to support custom allocators in a
separate commit. An illustrative sketch of a driver-side allocator
follows the diff below.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/iommu/io-pgtable.c | 19 +++++++++++++++++++
 include/linux/io-pgtable.h | 21 +++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
index b843fcd365d2..f4caf630638a 100644
--- a/drivers/iommu/io-pgtable.c
+++ b/drivers/iommu/io-pgtable.c
@@ -34,6 +34,22 @@ io_pgtable_init_table[IO_PGTABLE_NUM_FMTS] = {
 #endif
 };
 
+static int check_custom_allocator(enum io_pgtable_fmt fmt,
+				  struct io_pgtable_cfg *cfg)
+{
+	/* When passing a custom allocator, both the alloc and free
+	 * functions should be provided.
+	 */
+	if ((cfg->alloc != NULL) != (cfg->free != NULL))
+		return -EINVAL;
+
+	/* No custom allocator, no need to check the format. */
+	if (!cfg->alloc)
+		return 0;
+
+	return -EINVAL;
+}
+
 struct io_pgtable_ops *alloc_io_pgtable_ops(enum io_pgtable_fmt fmt,
					    struct io_pgtable_cfg *cfg,
					    void *cookie)
@@ -44,6 +60,9 @@ struct io_pgtable_ops *alloc_io_pgtable_ops(enum io_pgtable_fmt fmt,
 	if (fmt >= IO_PGTABLE_NUM_FMTS)
 		return NULL;
 
+	if (check_custom_allocator(fmt, cfg))
+		return NULL;
+
 	fns = io_pgtable_init_table[fmt];
 	if (!fns)
 		return NULL;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 1b7a44b35616..80d05dcb0f4c 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -100,6 +100,27 @@ struct io_pgtable_cfg {
 	const struct iommu_flush_ops	*tlb;
 	struct device			*iommu_dev;
 
+	/**
+	 * @alloc: Custom page allocator.
+	 *
+	 * Optional hook used to allocate page tables. If this function is
+	 * NULL, @free must be NULL too.
+	 *
+	 * Not all formats support custom page allocators. Before considering
+	 * passing a non-NULL value, make sure the chosen page format supports
+	 * this feature.
+	 */
+	void *(*alloc)(void *cookie, size_t size, gfp_t gfp);
+
+	/**
+	 * @free: Custom page de-allocator.
+	 *
+	 * Optional hook used to free page tables allocated with the @alloc
+	 * hook. Must be non-NULL if @alloc is not NULL, must be NULL
+	 * otherwise.
+	 */
+	void (*free)(void *cookie, void *pages, size_t size);
+
 	/* Low-level data specific to the table format */
 	union {
 		struct {
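
Note: as an illustration of how a driver could plug into these hooks,
here is a minimal sketch of a free-list backed allocator. It is not
part of this patch, and every my_* identifier is hypothetical. It
assumes the driver passes its pool as the io-pgtable cookie and only
caches single-page tables:

#include <linux/gfp.h>
#include <linux/io-pgtable.h>
#include <linux/list.h>
#include <linux/mm.h>
#include <linux/spinlock.h>
#include <linux/string.h>

struct my_pt_pool {
	spinlock_t lock;
	struct list_head free_pages;	/* previously freed page tables */
};

static void *my_pt_alloc(void *cookie, size_t size, gfp_t gfp)
{
	struct my_pt_pool *pool = cookie;
	struct page *p = NULL;

	/* Only single-page tables are cached in this sketch. */
	if (size == PAGE_SIZE) {
		spin_lock(&pool->lock);
		p = list_first_entry_or_null(&pool->free_pages,
					     struct page, lru);
		if (p)
			list_del(&p->lru);
		spin_unlock(&pool->lock);
	}

	if (p) {
		/* The io-pgtable code passes __GFP_ZERO, so cached
		 * pages must be re-zeroed before they are handed out.
		 */
		memset(page_address(p), 0, PAGE_SIZE);
		return page_address(p);
	}

	p = alloc_pages(gfp, get_order(size));
	return p ? page_address(p) : NULL;
}

static void my_pt_free(void *cookie, void *pages, size_t size)
{
	struct my_pt_pool *pool = cookie;

	if (size == PAGE_SIZE) {
		/* Keep the page around for the next table allocation. */
		spin_lock(&pool->lock);
		list_add(&virt_to_page(pages)->lru, &pool->free_pages);
		spin_unlock(&pool->lock);
		return;
	}

	free_pages((unsigned long)pages, get_order(size));
}

The hooks and the cookie would then be passed at page table creation
time, e.g.:

	cfg.alloc = my_pt_alloc;
	cfg.free = my_pt_free;
	ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1, &cfg, pool);

With this patch alone, every format rejects a non-NULL ->alloc; the
next patch opts the LPAE formats in.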
From patchwork Wed Aug 9 12:17:44 2023
X-Patchwork-Submitter: Boris Brezillon
X-Patchwork-Id: 13347873
From: Boris Brezillon
To: Joerg Roedel, iommu@lists.linux.dev, Will Deacon, Robin Murphy,
 linux-arm-kernel@lists.infradead.org
Cc: Rob Clark, Boris Brezillon
Subject: [PATCH 2/2] iommu: Extend LPAE page table format to support custom
 allocators
Date: Wed, 9 Aug 2023 14:17:44 +0200
Message-ID: <20230809121744.2341454-3-boris.brezillon@collabora.com>
In-Reply-To: <20230809121744.2341454-1-boris.brezillon@collabora.com>
References: <20230809121744.2341454-1-boris.brezillon@collabora.com>
We need this in order to implement the VM_BIND ioctl in the GPU driver
targeting new Mali GPUs. VM_BIND is about executing MMU map/unmap
requests asynchronously, possibly after waiting for external
dependencies encoded as dma_fences. We intend to use the drm_sched
framework to automate the dependency tracking and VM job dequeuing
logic, but this comes with its own set of constraints, one of them
being that we are not allowed to allocate memory in the
drm_sched_backend_ops::run_job() path, to avoid this sort of deadlock:

- a VM_BIND map job needs to allocate a page table to map some memory
  to the VM; no memory is available, so kswapd is kicked
- the GPU driver's shrinker backend ends up waiting on the fence
  attached to the VM map job, or on any other job fence depending on
  this VM operation

With custom allocators, we will be able to pre-reserve enough pages to
guarantee that the map/unmap operations we queued will take place
without going through the system allocator. But we can also optimize
allocation/reservation by not freeing pages immediately, so that
upcoming page table allocation requests can be serviced from a free
page table pool kept at the driver level. A sketch of the reservation
scheme follows the diff below.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/iommu/io-pgtable-arm.c | 50 +++++++++++++++++++++++-----------
 drivers/iommu/io-pgtable.c     | 12 ++++++++
 2 files changed, 46 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 72dcdd468cf3..c5c04f0106f3 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -188,20 +188,28 @@ static dma_addr_t __arm_lpae_dma_addr(void *pages)
 }
 
 static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
-				    struct io_pgtable_cfg *cfg)
+				    struct io_pgtable_cfg *cfg,
+				    void *cookie)
 {
 	struct device *dev = cfg->iommu_dev;
 	int order = get_order(size);
-	struct page *p;
 	dma_addr_t dma;
 	void *pages;
 
 	VM_BUG_ON((gfp & __GFP_HIGHMEM));
-	p = alloc_pages_node(dev_to_node(dev), gfp | __GFP_ZERO, order);
-	if (!p)
+
+	if (cfg->alloc) {
+		pages = cfg->alloc(cookie, size, gfp | __GFP_ZERO);
+	} else {
+		struct page *p;
+
+		p = alloc_pages_node(dev_to_node(dev), gfp | __GFP_ZERO, order);
+		pages = p ? page_address(p) : NULL;
+	}
+
+	if (!pages)
 		return NULL;
 
-	pages = page_address(p);
 	if (!cfg->coherent_walk) {
 		dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
 		if (dma_mapping_error(dev, dma))
@@ -220,18 +228,28 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
 out_unmap:
 	dev_err(dev, "Cannot accommodate DMA translation for IOMMU page tables\n");
 	dma_unmap_single(dev, dma, size, DMA_TO_DEVICE);
+
 out_free:
-	__free_pages(p, order);
+	if (cfg->free)
+		cfg->free(cookie, pages, size);
+	else
+		free_pages((unsigned long)pages, order);
+
 	return NULL;
 }
 
 static void __arm_lpae_free_pages(void *pages, size_t size,
-				  struct io_pgtable_cfg *cfg)
+				  struct io_pgtable_cfg *cfg,
+				  void *cookie)
 {
 	if (!cfg->coherent_walk)
 		dma_unmap_single(cfg->iommu_dev, __arm_lpae_dma_addr(pages),
 				 size, DMA_TO_DEVICE);
-	free_pages((unsigned long)pages, get_order(size));
+
+	if (cfg->free)
+		cfg->free(cookie, pages, size);
+	else
+		free_pages((unsigned long)pages, get_order(size));
 }
 
 static void __arm_lpae_sync_pte(arm_lpae_iopte *ptep, int num_entries,
@@ -373,13 +391,13 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
 	/* Grab a pointer to the next level */
 	pte = READ_ONCE(*ptep);
 	if (!pte) {
-		cptep = __arm_lpae_alloc_pages(tblsz, gfp, cfg);
+		cptep = __arm_lpae_alloc_pages(tblsz, gfp, cfg, data->iop.cookie);
 		if (!cptep)
 			return -ENOMEM;
 
 		pte = arm_lpae_install_table(cptep, ptep, 0, data);
 		if (pte)
-			__arm_lpae_free_pages(cptep, tblsz, cfg);
+			__arm_lpae_free_pages(cptep, tblsz, cfg, data->iop.cookie);
 	} else if (!cfg->coherent_walk && !(pte & ARM_LPAE_PTE_SW_SYNC)) {
 		__arm_lpae_sync_pte(ptep, 1, cfg);
 	}
@@ -524,7 +542,7 @@ static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl,
 		__arm_lpae_free_pgtable(data, lvl + 1, iopte_deref(pte, data));
 	}
 
-	__arm_lpae_free_pages(start, table_size, &data->iop.cfg);
+	__arm_lpae_free_pages(start, table_size, &data->iop.cfg, data->iop.cookie);
 }
 
 static void arm_lpae_free_pgtable(struct io_pgtable *iop)
@@ -552,7 +570,7 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
 	if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
 		return 0;
 
-	tablep = __arm_lpae_alloc_pages(tablesz, GFP_ATOMIC, cfg);
+	tablep = __arm_lpae_alloc_pages(tablesz, GFP_ATOMIC, cfg, data->iop.cookie);
 	if (!tablep)
 		return 0; /* Bytes unmapped */
 
@@ -575,7 +593,7 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
 
 	pte = arm_lpae_install_table(tablep, ptep, blk_pte, data);
 	if (pte != blk_pte) {
-		__arm_lpae_free_pages(tablep, tablesz, cfg);
+		__arm_lpae_free_pages(tablep, tablesz, cfg, data->iop.cookie);
 		/*
 		 * We may race against someone unmapping another part of this
 		 * block, but anything else is invalid. We can't misinterpret
@@ -882,7 +900,7 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 
 	/* Looking good; allocate a pgd */
 	data->pgd = __arm_lpae_alloc_pages(ARM_LPAE_PGD_SIZE(data),
-					   GFP_KERNEL, cfg);
+					   GFP_KERNEL, cfg, cookie);
 	if (!data->pgd)
 		goto out_free_data;
 
@@ -984,7 +1002,7 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 
 	/* Allocate pgd pages */
 	data->pgd = __arm_lpae_alloc_pages(ARM_LPAE_PGD_SIZE(data),
-					   GFP_KERNEL, cfg);
+					   GFP_KERNEL, cfg, cookie);
 	if (!data->pgd)
 		goto out_free_data;
 
@@ -1059,7 +1077,7 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
 			  << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_DEV));
 
 	data->pgd = __arm_lpae_alloc_pages(ARM_LPAE_PGD_SIZE(data), GFP_KERNEL,
-					   cfg);
+					   cfg, cookie);
 	if (!data->pgd)
 		goto out_free_data;
 
diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
index f4caf630638a..e273c18ae22b 100644
--- a/drivers/iommu/io-pgtable.c
+++ b/drivers/iommu/io-pgtable.c
@@ -47,6 +47,18 @@ static int check_custom_allocator(enum io_pgtable_fmt fmt,
 	if (!cfg->alloc)
 		return 0;
 
+	switch (fmt) {
+	case ARM_32_LPAE_S1:
+	case ARM_32_LPAE_S2:
+	case ARM_64_LPAE_S1:
+	case ARM_64_LPAE_S2:
+	case ARM_MALI_LPAE:
+		return 0;
+
+	default:
+		break;
+	}
+
 	return -EINVAL;
 }
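
Note: a minimal sketch of the pre-reservation scheme described in the
commit message, under the same caveats as the previous example: all
my_* names are hypothetical, locking is omitted for brevity, and the
my_vm object is assumed to be the io-pgtable cookie.

#include <linux/bug.h>
#include <linux/gfp.h>
#include <linux/list.h>
#include <linux/mm.h>

struct my_vm {
	struct list_head reserved_pages; /* pre-allocated page tables */
};

/* Called from the VM_BIND ioctl, where blocking and failing with
 * -ENOMEM are still allowed.
 */
static int my_vm_reserve_pts(struct my_vm *vm, unsigned int count)
{
	while (count--) {
		struct page *p = alloc_page(GFP_KERNEL | __GFP_ZERO);

		if (!p)
			return -ENOMEM;

		list_add(&p->lru, &vm->reserved_pages);
	}

	return 0;
}

/* io_pgtable_cfg::alloc hook, called from the job execution path: it
 * only consumes pre-reserved pages and never enters the system
 * allocator, so it cannot block or wake kswapd.
 */
static void *my_vm_pt_alloc(void *cookie, size_t size, gfp_t gfp)
{
	struct my_vm *vm = cookie;
	struct page *p;

	if (WARN_ON(size != PAGE_SIZE))
		return NULL;

	p = list_first_entry_or_null(&vm->reserved_pages, struct page, lru);
	if (WARN_ON(!p))
		return NULL;

	list_del(&p->lru);

	return page_address(p);
}

The reservation count would be derived from the worst-case number of
page table levels that can be instantiated for the VA range a job
touches, so it can be computed when the VM_BIND job is created, before
the job is queued to drm_sched.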