From patchwork Tue Jun 27 09:54:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Petr Tesarik X-Patchwork-Id: 13294249 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B221EEB64D9 for ; Tue, 27 Jun 2023 09:57:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230158AbjF0J5R (ORCPT ); Tue, 27 Jun 2023 05:57:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231542AbjF0J45 (ORCPT ); Tue, 27 Jun 2023 05:56:57 -0400 Received: from frasgout12.his.huawei.com (unknown [14.137.139.154]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A1E4E2D66; Tue, 27 Jun 2023 02:55:56 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.18.147.227]) by frasgout12.his.huawei.com (SkyGuard) with ESMTP id 4Qr0C342lXz9xGYm; Tue, 27 Jun 2023 17:43:27 +0800 (CST) Received: from A2101119013HW2.china.huawei.com (unknown [10.45.153.185]) by APP1 (Coremail) with SMTP id LxC2BwBnwdzesZpknnvHAw--.39096S3; Tue, 27 Jun 2023 10:55:17 +0100 (CET) From: Petr Tesarik To: Stefano Stabellini , Thomas Bogendoerfer , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)), "H. Peter Anvin" , Greg Kroah-Hartman , "Rafael J. Wysocki" , Juergen Gross , Oleksandr Tyshchenko , Christoph Hellwig , Marek Szyprowski , Robin Murphy , Andy Shevchenko , Hans de Goede , Jason Gunthorpe , Kees Cook , Saravana Kannan , xen-devel@lists.xenproject.org (moderated list:XEN HYPERVISOR ARM), linux-arm-kernel@lists.infradead.org (moderated list:ARM PORT), linux-kernel@vger.kernel.org (open list), linux-mips@vger.kernel.org (open list:MIPS), iommu@lists.linux.dev (open list:XEN SWIOTLB SUBSYSTEM) Cc: Roberto Sassu , Kefeng Wang , petr@tesarici.cz Subject: [PATCH v3 1/7] swiotlb: make io_tlb_default_mem local to swiotlb.c Date: Tue, 27 Jun 2023 11:54:23 +0200 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-CM-TRANSID: LxC2BwBnwdzesZpknnvHAw--.39096S3 X-Coremail-Antispam: 1UD129KBjvJXoW3WF1DXF45KFykCry5JFyxKrg_yoWxuw1fpr y7AFyrGF42grn7ArW3C3WkAF1Y93Z7K3yxCryS9w1Ygr15Cwn8Ja1DtrWYyr1rJrW8uF4f AFy0vryaka17Aw7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUma14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jr4l82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Jr0_JF4l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UM2 8EF7xvwVC2z280aVAFwI0_Gr0_Cr1l84ACjcxK6I8E87Iv6xkF7I0E14v26r4UJVWxJr1l e2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2IEw4CE5I8CrVC2j2WlYx0E2Ix0cI 8IcVAFwI0_Jr0_Jr4lYx0Ex4A2jsIE14v26r1j6r4UMcvjeVCFs4IE7xkEbVWUJVW8JwAC jcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I648v4I1lFIxGxcIEc7CjxVA2Y2ka0x kIwI1lc7CjxVAKzI0EY4vE52x082I5MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY 6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17 CEb7AF67AKxVWrXVW8Jr1lIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWxJrUvcSsG vfC2KfnxnUUI43ZEXa7VUjb_-PUUUUU== X-CM-SenderInfo: hshw23xhvd2x3n6k3tpzhluzxrxghudrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-mips@vger.kernel.org From: Petr Tesarik SWIOTLB implementation details should not be exposed to the rest of the kernel. This will allow to make changes to the implementation without modifying non-swiotlb code. To avoid breaking existing users, provide helper functions for the few required fields. Enhance is_swiotlb_active() to work with the default IO TLB pool when passed a NULL device. As a bonus, using a helper function to initialize struct device allows to get rid of an #ifdef in driver core. Signed-off-by: Petr Tesarik --- arch/arm/xen/mm.c | 2 +- arch/mips/pci/pci-octeon.c | 2 +- arch/x86/kernel/pci-dma.c | 2 +- drivers/base/core.c | 4 +--- drivers/xen/swiotlb-xen.c | 2 +- include/linux/swiotlb.h | 17 ++++++++++++++++- kernel/dma/swiotlb.c | 39 ++++++++++++++++++++++++++++++++++++-- 7 files changed, 58 insertions(+), 10 deletions(-) diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c index 3d826c0b5fee..529aabc1be38 100644 --- a/arch/arm/xen/mm.c +++ b/arch/arm/xen/mm.c @@ -125,7 +125,7 @@ static int __init xen_mm_init(void) return 0; /* we can work with the default swiotlb */ - if (!io_tlb_default_mem.nslabs) { + if (!is_swiotlb_active(NULL)) { rc = swiotlb_init_late(swiotlb_size_or_default(), xen_swiotlb_gfp(), NULL); if (rc < 0) diff --git a/arch/mips/pci/pci-octeon.c b/arch/mips/pci/pci-octeon.c index e457a18cbdc5..c5c4c1f7d5e4 100644 --- a/arch/mips/pci/pci-octeon.c +++ b/arch/mips/pci/pci-octeon.c @@ -664,7 +664,7 @@ static int __init octeon_pci_setup(void) /* BAR1 movable regions contiguous to cover the swiotlb */ octeon_bar1_pci_phys = - io_tlb_default_mem.start & ~((1ull << 22) - 1); + default_swiotlb_start() & ~((1ull << 22) - 1); for (index = 0; index < 32; index++) { union cvmx_pci_bar1_indexx bar1_index; diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c index de6be0a3965e..33e960672837 100644 --- a/arch/x86/kernel/pci-dma.c +++ b/arch/x86/kernel/pci-dma.c @@ -90,7 +90,7 @@ int pci_xen_swiotlb_init_late(void) return 0; /* we can work with the default swiotlb */ - if (!io_tlb_default_mem.nslabs) { + if (!is_swiotlb_active(NULL)) { int rc = swiotlb_init_late(swiotlb_size_or_default(), GFP_KERNEL, xen_swiotlb_fixup); if (rc < 0) diff --git a/drivers/base/core.c b/drivers/base/core.c index 3dff5037943e..46d1d78c5beb 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -3108,9 +3108,7 @@ void device_initialize(struct device *dev) defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) dev->dma_coherent = dma_default_coherent; #endif -#ifdef CONFIG_SWIOTLB - dev->dma_io_tlb_mem = &io_tlb_default_mem; -#endif + swiotlb_dev_init(dev); } EXPORT_SYMBOL_GPL(device_initialize); diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index 67aa74d20162..946bd56f0ac5 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -381,7 +381,7 @@ xen_swiotlb_sync_sg_for_device(struct device *dev, struct scatterlist *sgl, static int xen_swiotlb_dma_supported(struct device *hwdev, u64 mask) { - return xen_phys_to_dma(hwdev, io_tlb_default_mem.end - 1) <= mask; + return xen_phys_to_dma(hwdev, default_swiotlb_limit()) <= mask; } const struct dma_map_ops xen_swiotlb_dma_ops = { diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 7af2673b47ba..b38045be532d 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -112,7 +112,6 @@ struct io_tlb_mem { atomic_long_t used_hiwater; #endif }; -extern struct io_tlb_mem io_tlb_default_mem; static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr) { @@ -130,13 +129,19 @@ static inline bool is_swiotlb_force_bounce(struct device *dev) void swiotlb_init(bool addressing_limited, unsigned int flags); void __init swiotlb_exit(void); +void swiotlb_dev_init(struct device *dev); size_t swiotlb_max_mapping_size(struct device *dev); bool is_swiotlb_active(struct device *dev); void __init swiotlb_adjust_size(unsigned long size); +phys_addr_t default_swiotlb_start(void); +phys_addr_t default_swiotlb_limit(void); #else static inline void swiotlb_init(bool addressing_limited, unsigned int flags) { } +static inline void swiotlb_dev_init(struct device *dev) +{ +} static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr) { return false; @@ -161,6 +166,16 @@ static inline bool is_swiotlb_active(struct device *dev) static inline void swiotlb_adjust_size(unsigned long size) { } + +static inline phys_addr_t default_swiotlb_start(void) +{ + return 0; +} + +static inline phys_addr_t default_swiotlb_limit(void) +{ + return 0; +} #endif /* CONFIG_SWIOTLB */ extern void swiotlb_print_info(void); diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 079df5ad38d0..63d318059f27 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -71,7 +71,7 @@ struct io_tlb_slot { static bool swiotlb_force_bounce; static bool swiotlb_force_disable; -struct io_tlb_mem io_tlb_default_mem; +static struct io_tlb_mem io_tlb_default_mem; static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT; static unsigned long default_nareas; @@ -486,6 +486,15 @@ void __init swiotlb_exit(void) memset(mem, 0, sizeof(*mem)); } +/** + * swiotlb_dev_init() - initialize swiotlb fields in &struct device + * @dev: Device to be initialized. + */ +void swiotlb_dev_init(struct device *dev) +{ + dev->dma_io_tlb_mem = &io_tlb_default_mem; +} + /* * Return the offset into a iotlb slot required to keep the device happy. */ @@ -939,14 +948,40 @@ size_t swiotlb_max_mapping_size(struct device *dev) return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE - min_align; } +/** + * is_swiotlb_active() - check if the software IO TLB is initialized + * @dev: Device to check, or %NULL for the default IO TLB. + */ bool is_swiotlb_active(struct device *dev) { - struct io_tlb_mem *mem = dev->dma_io_tlb_mem; + struct io_tlb_mem *mem = dev + ? dev->dma_io_tlb_mem + : &io_tlb_default_mem; return mem && mem->nslabs; } EXPORT_SYMBOL_GPL(is_swiotlb_active); +/** + * default_swiotlb_start() - get the start of the default SWIOTLB + * + * Get the lowest physical address used by the default software IO TLB pool. + */ +phys_addr_t default_swiotlb_start(void) +{ + return io_tlb_default_mem.start; +} + +/** + * default_swiotlb_limit() - get the highest address in the default SWIOTLB + * + * Get the highest physical address used by the default software IO TLB pool. + */ +phys_addr_t default_swiotlb_limit(void) +{ + return io_tlb_default_mem.end - 1; +} + #ifdef CONFIG_DEBUG_FS static int io_tlb_used_get(void *data, u64 *val) From patchwork Tue Jun 27 09:54:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Petr Tesarik X-Patchwork-Id: 13294250 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A7BEDEB64DC for ; Tue, 27 Jun 2023 09:57:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231827AbjF0J5q (ORCPT ); Tue, 27 Jun 2023 05:57:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231685AbjF0J5N (ORCPT ); Tue, 27 Jun 2023 05:57:13 -0400 Received: from frasgout12.his.huawei.com (unknown [14.137.139.154]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1398D30E5; Tue, 27 Jun 2023 02:56:13 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.18.147.227]) by frasgout12.his.huawei.com (SkyGuard) with ESMTP id 4Qr0CM3Cphz9v7cZ; Tue, 27 Jun 2023 17:43:43 +0800 (CST) Received: from A2101119013HW2.china.huawei.com (unknown [10.45.153.185]) by APP1 (Coremail) with SMTP id LxC2BwBnwdzesZpknnvHAw--.39096S4; Tue, 27 Jun 2023 10:55:33 +0100 (CET) From: Petr Tesarik To: Stefano Stabellini , Thomas Bogendoerfer , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)), "H. Peter Anvin" , Greg Kroah-Hartman , "Rafael J. Wysocki" , Juergen Gross , Oleksandr Tyshchenko , Christoph Hellwig , Marek Szyprowski , Robin Murphy , Andy Shevchenko , Hans de Goede , Jason Gunthorpe , Kees Cook , Saravana Kannan , xen-devel@lists.xenproject.org (moderated list:XEN HYPERVISOR ARM), linux-arm-kernel@lists.infradead.org (moderated list:ARM PORT), linux-kernel@vger.kernel.org (open list), linux-mips@vger.kernel.org (open list:MIPS), iommu@lists.linux.dev (open list:XEN SWIOTLB SUBSYSTEM) Cc: Roberto Sassu , Kefeng Wang , petr@tesarici.cz Subject: [PATCH v3 2/7] swiotlb: add documentation and rename swiotlb_do_find_slots() Date: Tue, 27 Jun 2023 11:54:24 +0200 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-CM-TRANSID: LxC2BwBnwdzesZpknnvHAw--.39096S4 X-Coremail-Antispam: 1UD129KBjvJXoW3GrWfGF48Gr13WrW3tFWUArb_yoW7XF43pF 4rKFs0qFZ8Jr1xC3sFkw48uF1Ska1kK347uayS9ryF9rnxJr9agFyqyrW5Xa4rJrs7uF4U Ja4avrW8GF45Xr7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUma14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jryl82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Jr0_JF4l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UM2 8EF7xvwVC2z280aVAFwI0_Gr0_Cr1l84ACjcxK6I8E87Iv6xkF7I0E14v26r4UJVWxJr1l e2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2IEw4CE5I8CrVC2j2WlYx0E2Ix0cI 8IcVAFwI0_Jr0_Jr4lYx0Ex4A2jsIE14v26r1j6r4UMcvjeVCFs4IE7xkEbVWUJVW8JwAC jcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I648v4I1lFIxGxcIEc7CjxVA2Y2ka0x kIwI1lc7CjxVAKzI0EY4vE52x082I5MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY 6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17 CEb7AF67AKxVWrXVW8Jr1lIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVW8JVWxJwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWxJrUvcSsG vfC2KfnxnUUI43ZEXa7VUURwZ5UUUUU== X-CM-SenderInfo: hshw23xhvd2x3n6k3tpzhluzxrxghudrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-mips@vger.kernel.org From: Petr Tesarik Add some kernel-doc comments and move the existing documentation of struct io_tlb_slot to its correct location. The latter was forgotten in commit 942a8186eb445 ("swiotlb: move struct io_tlb_slot to swiotlb.c"). Use the opportunity to give swiotlb_do_find_slots() a more descriptive name, which makes it clear how it differs from swiotlb_find_slots(). Signed-off-by: Petr Tesarik --- include/linux/swiotlb.h | 15 ++++++++---- kernel/dma/swiotlb.c | 52 ++++++++++++++++++++++++++++++++++++----- 2 files changed, 57 insertions(+), 10 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index b38045be532d..9f486843a66a 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -77,10 +77,6 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys, * @end. For default swiotlb, this is command line adjustable via * setup_io_tlb_npages. * @used: The number of used IO TLB block. - * @list: The free list describing the number of free entries available - * from each index. - * @orig_addr: The original address corresponding to a mapped entry. - * @alloc_size: Size of the allocated buffer. * @debugfs: The dentry to debugfs. * @late_alloc: %true if allocated using the page allocator * @force_bounce: %true if swiotlb bouncing is forced @@ -113,6 +109,17 @@ struct io_tlb_mem { #endif }; +/** + * is_swiotlb_buffer() - check if a physical address belongs to a swiotlb + * @dev: Device which has mapped the buffer. + * @paddr: Physical address within the DMA buffer. + * + * Check if @paddr points into a bounce buffer. + * + * Return: + * * %true if @paddr points into a bounce buffer + * * %false otherwise + */ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr) { struct io_tlb_mem *mem = dev->dma_io_tlb_mem; diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 63d318059f27..888c25a9bfcc 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -62,6 +62,13 @@ #define INVALID_PHYS_ADDR (~(phys_addr_t)0) +/** + * struct io_tlb_slot - IO TLB slot descriptor + * @orig_addr: The original address corresponding to a mapped entry. + * @alloc_size: Size of the allocated buffer. + * @list: The free list describing the number of free entries available + * from each index. + */ struct io_tlb_slot { phys_addr_t orig_addr; size_t alloc_size; @@ -632,11 +639,22 @@ static void dec_used(struct io_tlb_mem *mem, unsigned int nslots) } #endif /* CONFIG_DEBUG_FS */ -/* - * Find a suitable number of IO TLB entries size that will fit this request and - * allocate a buffer from that IO TLB pool. +/** + * area_find_slots() - search for slots in one IO TLB memory area + * @dev: Device which maps the buffer. + * @area_index: Index of the IO TLB memory area to be searched. + * @orig_addr: Original (non-bounced) IO buffer address. + * @alloc_size: Total requested size of the bounce buffer, + * including initial alignment padding. + * @alloc_align_mask: Required alignment of the allocated buffer. + * + * Find a suitable sequence of IO TLB entries for the request and allocate + * a buffer from the given IO TLB memory area. + * This function takes care of locking. + * + * Return: Index of the first allocated slot, or -1 on error. */ -static int swiotlb_do_find_slots(struct device *dev, int area_index, +static int area_find_slots(struct device *dev, int area_index, phys_addr_t orig_addr, size_t alloc_size, unsigned int alloc_align_mask) { @@ -731,6 +749,19 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index, return slot_index; } +/** + * swiotlb_find_slots() - search for slots in the whole swiotlb + * @dev: Device which maps the buffer. + * @orig_addr: Original (non-bounced) IO buffer address. + * @alloc_size: Total requested size of the bounce buffer, + * including initial alignment padding. + * @alloc_align_mask: Required alignment of the allocated buffer. + * + * Search through the whole software IO TLB to find a sequence of slots that + * match the allocation constraints. + * + * Return: Index of the first allocated slot, or -1 on error. + */ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr, size_t alloc_size, unsigned int alloc_align_mask) { @@ -739,8 +770,8 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr, int i = start, index; do { - index = swiotlb_do_find_slots(dev, i, orig_addr, alloc_size, - alloc_align_mask); + index = area_find_slots(dev, i, orig_addr, alloc_size, + alloc_align_mask); if (index >= 0) return index; if (++i >= mem->nareas) @@ -750,6 +781,15 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr, return -1; } +/** + * mem_used() - get number of used slots in an allocator + * @mem: Software IO TLB allocator. + * + * The result is not accurate, because there is no locking of individual + * areas. + * + * Return: Approximate number of used slots. + */ static unsigned long mem_used(struct io_tlb_mem *mem) { int i; From patchwork Tue Jun 27 09:54:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Petr Tesarik X-Patchwork-Id: 13294251 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1922BEB64D9 for ; Tue, 27 Jun 2023 09:58:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231806AbjF0J60 (ORCPT ); Tue, 27 Jun 2023 05:58:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56274 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231910AbjF0J5m (ORCPT ); Tue, 27 Jun 2023 05:57:42 -0400 Received: from frasgout12.his.huawei.com (unknown [14.137.139.154]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CFDD426A3; Tue, 27 Jun 2023 02:56:28 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.18.147.228]) by frasgout12.his.huawei.com (SkyGuard) with ESMTP id 4Qr0Cg5Twwz9v7cZ; Tue, 27 Jun 2023 17:43:59 +0800 (CST) Received: from A2101119013HW2.china.huawei.com (unknown [10.45.153.185]) by APP1 (Coremail) with SMTP id LxC2BwBnwdzesZpknnvHAw--.39096S5; Tue, 27 Jun 2023 10:55:49 +0100 (CET) From: Petr Tesarik To: Stefano Stabellini , Thomas Bogendoerfer , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)), "H. Peter Anvin" , Greg Kroah-Hartman , "Rafael J. Wysocki" , Juergen Gross , Oleksandr Tyshchenko , Christoph Hellwig , Marek Szyprowski , Robin Murphy , Andy Shevchenko , Hans de Goede , Jason Gunthorpe , Kees Cook , Saravana Kannan , xen-devel@lists.xenproject.org (moderated list:XEN HYPERVISOR ARM), linux-arm-kernel@lists.infradead.org (moderated list:ARM PORT), linux-kernel@vger.kernel.org (open list), linux-mips@vger.kernel.org (open list:MIPS), iommu@lists.linux.dev (open list:XEN SWIOTLB SUBSYSTEM) Cc: Roberto Sassu , Kefeng Wang , petr@tesarici.cz Subject: [PATCH v3 3/7] swiotlb: separate memory pool data from other allocator data Date: Tue, 27 Jun 2023 11:54:25 +0200 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-CM-TRANSID: LxC2BwBnwdzesZpknnvHAw--.39096S5 X-Coremail-Antispam: 1UD129KBjvAXoWfCF45Kr47KF1ftryDJr1rtFb_yoW8ur4DCo W7GFnrXw4ftryjkFW2yan3Jr17uayqka1rAF1rZay5Wayjyr4Ygw1fKa15Xw13Wr1fGFWx AwnxWa48WF4kAr18n29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7v73VFW2AGmfu7bjvjm3 AaLaJ3UjIYCTnIWjp_UUUO37AC8VAFwI0_Wr0E3s1l1xkIjI8I6I8E6xAIw20EY4v20xva j40_Wr0E3s1l1IIY67AEw4v_Jr0_Jr4l82xGYIkIc2x26280x7IE14v26r1rM28IrcIa0x kI8VCY1x0267AKxVW5JVCq3wA2ocxC64kIII0Yj41l84x0c7CEw4AK67xGY2AK021l84AC jcxK6xIIjxv20xvE14v26r1j6r1xM28EF7xvwVC0I7IYx2IY6xkF7I0E14v26r4UJVWxJr 1l84ACjcxK6I8E87Iv67AKxVW8JVWxJwA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_Gr1j6F4U JwAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0264kExVAvwVAq07x20xyl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Y z7v_Jr0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zV AF1VAY17CE14v26rWY6r4UJwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r1j6r1x MIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8Jr0_Cr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJV WUCwCI42IY6I8E87Iv67AKxVW8JVWxJwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWxJrUv cSsGvfC2KfnxnUUI43ZEXa7VUj1xR3UUUUU== X-CM-SenderInfo: hshw23xhvd2x3n6k3tpzhluzxrxghudrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-mips@vger.kernel.org From: Petr Tesarik Carve out memory pool specific fields from struct io_tlb_mem. The original struct now contains shared data for the whole allocator, while the new struct io_tlb_pool contains data that is specific to one memory pool of (potentially) many. Allocate both structures together for restricted DMA pools to keep the error cleanup path simple. Signed-off-by: Petr Tesarik --- include/linux/device.h | 2 +- include/linux/swiotlb.h | 49 ++++++----- kernel/dma/swiotlb.c | 181 +++++++++++++++++++++++++--------------- 3 files changed, 147 insertions(+), 85 deletions(-) diff --git a/include/linux/device.h b/include/linux/device.h index 472dd24d4823..83081aa99e6a 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -509,7 +509,7 @@ struct device_physical_location { * @dma_pools: Dma pools (if dma'ble device). * @dma_mem: Internal for coherent mem override. * @cma_area: Contiguous memory area for dma allocations - * @dma_io_tlb_mem: Pointer to the swiotlb pool used. Not for driver use. + * @dma_io_tlb_mem: Software IO TLB allocator. Not for driver use. * @archdata: For arch-specific additions. * @of_node: Associated device tree node. * @fwnode: Associated device node supplied by platform firmware. diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 9f486843a66a..0aa6868cb68c 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -62,8 +62,7 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys, #ifdef CONFIG_SWIOTLB /** - * struct io_tlb_mem - IO TLB Memory Pool Descriptor - * + * struct io_tlb_pool - IO TLB memory pool descriptor * @start: The start address of the swiotlb memory pool. Used to do a quick * range check to see if the memory was in fact allocated by this * API. @@ -73,16 +72,36 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys, * @vaddr: The vaddr of the swiotlb memory pool. The swiotlb memory pool * may be remapped in the memory encrypted case and store virtual * address for bounce buffer operation. - * @nslabs: The number of IO TLB blocks (in groups of 64) between @start and - * @end. For default swiotlb, this is command line adjustable via - * setup_io_tlb_npages. - * @used: The number of used IO TLB block. + * @nslabs: The number of IO TLB slots between @start and @end. For the + * default swiotlb, this can be adjusted with a boot parameter, + * see setup_io_tlb_npages(). + * @used: The number of used IO TLB slots. + * @late_alloc: %true if allocated using the page allocator. + * @nareas: Number of areas in the pool. + * @area_nslabs: Number of slots in each area. + * @areas: Array of memory area descriptors. + * @slots: Array of slot descriptors. + */ +struct io_tlb_pool { + phys_addr_t start; + phys_addr_t end; + void *vaddr; + unsigned long nslabs; + unsigned long used; + bool late_alloc; + unsigned int nareas; + unsigned int area_nslabs; + struct io_tlb_area *areas; + struct io_tlb_slot *slots; +}; + +/** + * struct io_tlb_mem - Software IO TLB allocator + * @pool: IO TLB memory pool descriptor. + * @nslabs: Total number of IO TLB slabs in all pools. * @debugfs: The dentry to debugfs. - * @late_alloc: %true if allocated using the page allocator * @force_bounce: %true if swiotlb bouncing is forced * @for_alloc: %true if the pool is used for memory allocation - * @nareas: The area number in the pool. - * @area_nslabs: The slot number in the area. * @total_used: The total number of slots in the pool that are currently used * across all areas. Used only for calculating used_hiwater in * debugfs. @@ -90,19 +109,11 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys, * in debugfs. */ struct io_tlb_mem { - phys_addr_t start; - phys_addr_t end; - void *vaddr; + struct io_tlb_pool *pool; unsigned long nslabs; - unsigned long used; struct dentry *debugfs; - bool late_alloc; bool force_bounce; bool for_alloc; - unsigned int nareas; - unsigned int area_nslabs; - struct io_tlb_area *areas; - struct io_tlb_slot *slots; #ifdef CONFIG_DEBUG_FS atomic_long_t total_used; atomic_long_t used_hiwater; @@ -124,7 +135,7 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr) { struct io_tlb_mem *mem = dev->dma_io_tlb_mem; - return mem && paddr >= mem->start && paddr < mem->end; + return mem && paddr >= mem->pool->start && paddr < mem->pool->end; } static inline bool is_swiotlb_force_bounce(struct device *dev) diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 888c25a9bfcc..4c5de91bda57 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -78,7 +78,10 @@ struct io_tlb_slot { static bool swiotlb_force_bounce; static bool swiotlb_force_disable; -static struct io_tlb_mem io_tlb_default_mem; +static struct io_tlb_pool io_tlb_default_pool; +static struct io_tlb_mem io_tlb_default_mem = { + .pool = &io_tlb_default_pool, +}; static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT; static unsigned long default_nareas; @@ -209,7 +212,7 @@ void __init swiotlb_adjust_size(unsigned long size) void swiotlb_print_info(void) { - struct io_tlb_mem *mem = &io_tlb_default_mem; + struct io_tlb_pool *mem = &io_tlb_default_pool; if (!mem->nslabs) { pr_warn("No low mem\n"); @@ -238,7 +241,7 @@ static inline unsigned long nr_slots(u64 val) */ void __init swiotlb_update_mem_attributes(void) { - struct io_tlb_mem *mem = &io_tlb_default_mem; + struct io_tlb_pool *mem = &io_tlb_default_pool; unsigned long bytes; if (!mem->nslabs || mem->late_alloc) @@ -247,9 +250,8 @@ void __init swiotlb_update_mem_attributes(void) set_memory_decrypted((unsigned long)mem->vaddr, bytes >> PAGE_SHIFT); } -static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start, - unsigned long nslabs, unsigned int flags, - bool late_alloc, unsigned int nareas) +static void swiotlb_init_io_tlb_pool(struct io_tlb_pool *mem, phys_addr_t start, + unsigned long nslabs, bool late_alloc, unsigned int nareas) { void *vaddr = phys_to_virt(start); unsigned long bytes = nslabs << IO_TLB_SHIFT, i; @@ -261,8 +263,6 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start, mem->nareas = nareas; mem->area_nslabs = nslabs / mem->nareas; - mem->force_bounce = swiotlb_force_bounce || (flags & SWIOTLB_FORCE); - for (i = 0; i < mem->nareas; i++) { spin_lock_init(&mem->areas[i].lock); mem->areas[i].index = 0; @@ -319,7 +319,7 @@ static void __init *swiotlb_memblock_alloc(unsigned long nslabs, void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags, int (*remap)(void *tlb, unsigned long nslabs)) { - struct io_tlb_mem *mem = &io_tlb_default_mem; + struct io_tlb_pool *mem = &io_tlb_default_pool; unsigned long nslabs; unsigned int nareas; size_t alloc_size; @@ -330,6 +330,9 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags, if (swiotlb_force_disable) return; + io_tlb_default_mem.force_bounce = + swiotlb_force_bounce || (flags & SWIOTLB_FORCE); + if (!default_nareas) swiotlb_adjust_nareas(num_possible_cpus()); @@ -363,8 +366,9 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags, return; } - swiotlb_init_io_tlb_mem(mem, __pa(tlb), nslabs, flags, false, - default_nareas); + swiotlb_init_io_tlb_pool(mem, __pa(tlb), nslabs, false, + default_nareas); + io_tlb_default_mem.nslabs = nslabs; if (flags & SWIOTLB_VERBOSE) swiotlb_print_info(); @@ -383,7 +387,7 @@ void __init swiotlb_init(bool addressing_limit, unsigned int flags) int swiotlb_init_late(size_t size, gfp_t gfp_mask, int (*remap)(void *tlb, unsigned long nslabs)) { - struct io_tlb_mem *mem = &io_tlb_default_mem; + struct io_tlb_pool *mem = &io_tlb_default_pool; unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE); unsigned int nareas; unsigned char *vstart = NULL; @@ -394,6 +398,8 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, if (swiotlb_force_disable) return 0; + io_tlb_default_mem.force_bounce = swiotlb_force_bounce; + if (!default_nareas) swiotlb_adjust_nareas(num_possible_cpus()); @@ -445,8 +451,9 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, set_memory_decrypted((unsigned long)vstart, (nslabs << IO_TLB_SHIFT) >> PAGE_SHIFT); - swiotlb_init_io_tlb_mem(mem, virt_to_phys(vstart), nslabs, 0, true, - nareas); + swiotlb_init_io_tlb_pool(mem, virt_to_phys(vstart), nslabs, true, + nareas); + io_tlb_default_mem.nslabs = nslabs; swiotlb_print_info(); return 0; @@ -460,7 +467,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, void __init swiotlb_exit(void) { - struct io_tlb_mem *mem = &io_tlb_default_mem; + struct io_tlb_pool *mem = &io_tlb_default_pool; unsigned long tbl_vaddr; size_t tbl_size, slots_size; unsigned int area_order; @@ -516,7 +523,7 @@ static unsigned int swiotlb_align_offset(struct device *dev, u64 addr) static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size, enum dma_data_direction dir) { - struct io_tlb_mem *mem = dev->dma_io_tlb_mem; + struct io_tlb_pool *mem = dev->dma_io_tlb_mem->pool; int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT; phys_addr_t orig_addr = mem->slots[index].orig_addr; size_t alloc_size = mem->slots[index].alloc_size; @@ -598,7 +605,7 @@ static inline unsigned long get_max_slots(unsigned long boundary_mask) return nr_slots(boundary_mask + 1); } -static unsigned int wrap_area_index(struct io_tlb_mem *mem, unsigned int index) +static unsigned int wrap_area_index(struct io_tlb_pool *mem, unsigned int index) { if (index >= mem->area_nslabs) return 0; @@ -642,6 +649,7 @@ static void dec_used(struct io_tlb_mem *mem, unsigned int nslots) /** * area_find_slots() - search for slots in one IO TLB memory area * @dev: Device which maps the buffer. + * @pool: Memory pool to be searched. * @area_index: Index of the IO TLB memory area to be searched. * @orig_addr: Original (non-bounced) IO buffer address. * @alloc_size: Total requested size of the bounce buffer, @@ -654,15 +662,14 @@ static void dec_used(struct io_tlb_mem *mem, unsigned int nslots) * * Return: Index of the first allocated slot, or -1 on error. */ -static int area_find_slots(struct device *dev, int area_index, - phys_addr_t orig_addr, size_t alloc_size, +static int area_find_slots(struct device *dev, struct io_tlb_pool *pool, + int area_index, phys_addr_t orig_addr, size_t alloc_size, unsigned int alloc_align_mask) { - struct io_tlb_mem *mem = dev->dma_io_tlb_mem; - struct io_tlb_area *area = mem->areas + area_index; + struct io_tlb_area *area = pool->areas + area_index; unsigned long boundary_mask = dma_get_seg_boundary(dev); dma_addr_t tbl_dma_addr = - phys_to_dma_unencrypted(dev, mem->start) & boundary_mask; + phys_to_dma_unencrypted(dev, pool->start) & boundary_mask; unsigned long max_slots = get_max_slots(boundary_mask); unsigned int iotlb_align_mask = dma_get_min_align_mask(dev) | alloc_align_mask; @@ -674,7 +681,7 @@ static int area_find_slots(struct device *dev, int area_index, unsigned int slot_index; BUG_ON(!nslots); - BUG_ON(area_index >= mem->nareas); + BUG_ON(area_index >= pool->nareas); /* * For allocations of PAGE_SIZE or larger only look for page aligned @@ -691,19 +698,19 @@ static int area_find_slots(struct device *dev, int area_index, stride = (iotlb_align_mask >> IO_TLB_SHIFT) + 1; spin_lock_irqsave(&area->lock, flags); - if (unlikely(nslots > mem->area_nslabs - area->used)) + if (unlikely(nslots > pool->area_nslabs - area->used)) goto not_found; - slot_base = area_index * mem->area_nslabs; + slot_base = area_index * pool->area_nslabs; index = area->index; - for (slots_checked = 0; slots_checked < mem->area_nslabs; ) { + for (slots_checked = 0; slots_checked < pool->area_nslabs; ) { slot_index = slot_base + index; if (orig_addr && (slot_addr(tbl_dma_addr, slot_index) & iotlb_align_mask) != (orig_addr & iotlb_align_mask)) { - index = wrap_area_index(mem, index + 1); + index = wrap_area_index(pool, index + 1); slots_checked++; continue; } @@ -716,10 +723,10 @@ static int area_find_slots(struct device *dev, int area_index, if (!iommu_is_span_boundary(slot_index, nslots, nr_slots(tbl_dma_addr), max_slots)) { - if (mem->slots[slot_index].list >= nslots) + if (pool->slots[slot_index].list >= nslots) goto found; } - index = wrap_area_index(mem, index + stride); + index = wrap_area_index(pool, index + stride); slots_checked += stride; } @@ -729,58 +736,97 @@ static int area_find_slots(struct device *dev, int area_index, found: for (i = slot_index; i < slot_index + nslots; i++) { - mem->slots[i].list = 0; - mem->slots[i].alloc_size = alloc_size - (offset + + pool->slots[i].list = 0; + pool->slots[i].alloc_size = alloc_size - (offset + ((i - slot_index) << IO_TLB_SHIFT)); } for (i = slot_index - 1; io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 && - mem->slots[i].list; i--) - mem->slots[i].list = ++count; + pool->slots[i].list; i--) + pool->slots[i].list = ++count; /* * Update the indices to avoid searching in the next round. */ - area->index = wrap_area_index(mem, index + nslots); + area->index = wrap_area_index(pool, index + nslots); area->used += nslots; spin_unlock_irqrestore(&area->lock, flags); - inc_used_and_hiwater(mem, nslots); + inc_used_and_hiwater(dev->dma_io_tlb_mem, nslots); return slot_index; } /** - * swiotlb_find_slots() - search for slots in the whole swiotlb + * pool_find_slots() - search for slots in one memory pool * @dev: Device which maps the buffer. + * @pool: Memory pool to be searched. * @orig_addr: Original (non-bounced) IO buffer address. * @alloc_size: Total requested size of the bounce buffer, * including initial alignment padding. * @alloc_align_mask: Required alignment of the allocated buffer. * - * Search through the whole software IO TLB to find a sequence of slots that - * match the allocation constraints. + * Search through one memory pool to find a sequence of slots that match the + * allocation constraints. * * Return: Index of the first allocated slot, or -1 on error. */ -static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr, - size_t alloc_size, unsigned int alloc_align_mask) +static int pool_find_slots(struct device *dev, struct io_tlb_pool *pool, + phys_addr_t orig_addr, size_t alloc_size, + unsigned int alloc_align_mask) { - struct io_tlb_mem *mem = dev->dma_io_tlb_mem; - int start = raw_smp_processor_id() & (mem->nareas - 1); + int start = raw_smp_processor_id() & (pool->nareas - 1); int i = start, index; do { - index = area_find_slots(dev, i, orig_addr, alloc_size, + index = area_find_slots(dev, pool, i, orig_addr, alloc_size, alloc_align_mask); if (index >= 0) return index; - if (++i >= mem->nareas) + if (++i >= pool->nareas) i = 0; } while (i != start); return -1; } +/** + * swiotlb_find_slots() - search for slots in the whole swiotlb + * @dev: Device which maps the buffer. + * @orig_addr: Original (non-bounced) IO buffer address. + * @alloc_size: Total requested size of the bounce buffer, + * including initial alignment padding. + * @alloc_align_mask: Required alignment of the allocated buffer. + * + * Search through the whole software IO TLB to find a sequence of slots that + * match the allocation constraints. + * + * Return: Index of the first allocated slot, or -1 on error. + */ +static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr, + size_t alloc_size, unsigned int alloc_align_mask) +{ + return pool_find_slots(dev, dev->dma_io_tlb_mem->pool, orig_addr, + alloc_size, alloc_align_mask); +} + +/** + * mem_pool_used() - get number of used slots in a memory pool + * @pool: Software IO TLB memory pool. + * + * The result is not accurate, see mem_used(). + * + * Return: Approximate number of used slots. + */ +static unsigned long mem_pool_used(struct io_tlb_pool *pool) +{ + int i; + unsigned long used = 0; + + for (i = 0; i < pool->nareas; i++) + used += pool->areas[i].used; + return used; +} + /** * mem_used() - get number of used slots in an allocator * @mem: Software IO TLB allocator. @@ -792,12 +838,7 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr, */ static unsigned long mem_used(struct io_tlb_mem *mem) { - int i; - unsigned long used = 0; - - for (i = 0; i < mem->nareas; i++) - used += mem->areas[i].used; - return used; + return mem_pool_used(mem->pool); } phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, @@ -807,6 +848,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, { struct io_tlb_mem *mem = dev->dma_io_tlb_mem; unsigned int offset = swiotlb_align_offset(dev, orig_addr); + struct io_tlb_pool *pool; unsigned int i; int index; phys_addr_t tlb_addr; @@ -841,9 +883,10 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, * This is needed when we sync the memory. Then we sync the buffer if * needed. */ + pool = mem->pool; for (i = 0; i < nr_slots(alloc_size + offset); i++) - mem->slots[index + i].orig_addr = slot_addr(orig_addr, i); - tlb_addr = slot_addr(mem->start, index) + offset; + pool->slots[index + i].orig_addr = slot_addr(orig_addr, i); + tlb_addr = slot_addr(pool->start, index) + offset; /* * When dir == DMA_FROM_DEVICE we could omit the copy from the orig * to the tlb buffer, if we knew for sure the device will @@ -857,7 +900,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr) { - struct io_tlb_mem *mem = dev->dma_io_tlb_mem; + struct io_tlb_pool *mem = dev->dma_io_tlb_mem->pool; unsigned long flags; unsigned int offset = swiotlb_align_offset(dev, tlb_addr); int index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT; @@ -901,7 +944,7 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr) area->used -= nslots; spin_unlock_irqrestore(&area->lock, flags); - dec_used(mem, nslots); + dec_used(dev->dma_io_tlb_mem, nslots); } /* @@ -1009,7 +1052,7 @@ EXPORT_SYMBOL_GPL(is_swiotlb_active); */ phys_addr_t default_swiotlb_start(void) { - return io_tlb_default_mem.start; + return io_tlb_default_pool.start; } /** @@ -1019,7 +1062,7 @@ phys_addr_t default_swiotlb_start(void) */ phys_addr_t default_swiotlb_limit(void) { - return io_tlb_default_mem.end - 1; + return io_tlb_default_pool.end - 1; } #ifdef CONFIG_DEBUG_FS @@ -1095,6 +1138,7 @@ static inline void swiotlb_create_debugfs_files(struct io_tlb_mem *mem, struct page *swiotlb_alloc(struct device *dev, size_t size) { struct io_tlb_mem *mem = dev->dma_io_tlb_mem; + struct io_tlb_pool *pool; phys_addr_t tlb_addr; int index; @@ -1105,7 +1149,8 @@ struct page *swiotlb_alloc(struct device *dev, size_t size) if (index == -1) return NULL; - tlb_addr = slot_addr(mem->start, index); + pool = mem->pool; + tlb_addr = slot_addr(pool->start, index); return pfn_to_page(PFN_DOWN(tlb_addr)); } @@ -1142,29 +1187,35 @@ static int rmem_swiotlb_device_init(struct reserved_mem *rmem, * to it. */ if (!mem) { - mem = kzalloc(sizeof(*mem), GFP_KERNEL); + struct io_tlb_pool *pool; + + mem = kzalloc(sizeof(*mem) + sizeof(*pool), GFP_KERNEL); if (!mem) return -ENOMEM; + pool = (void*)mem + sizeof(*mem); - mem->slots = kcalloc(nslabs, sizeof(*mem->slots), GFP_KERNEL); - if (!mem->slots) { + pool->slots = kcalloc(nslabs, sizeof(*pool->slots), GFP_KERNEL); + if (!pool->slots) { kfree(mem); return -ENOMEM; } - mem->areas = kcalloc(nareas, sizeof(*mem->areas), + pool->areas = kcalloc(nareas, sizeof(*pool->areas), GFP_KERNEL); - if (!mem->areas) { - kfree(mem->slots); + if (!pool->areas) { + kfree(pool->slots); kfree(mem); return -ENOMEM; } set_memory_decrypted((unsigned long)phys_to_virt(rmem->base), rmem->size >> PAGE_SHIFT); - swiotlb_init_io_tlb_mem(mem, rmem->base, nslabs, SWIOTLB_FORCE, - false, nareas); + swiotlb_init_io_tlb_pool(pool, rmem->base, nslabs, + false, nareas); + mem->force_bounce = true; mem->for_alloc = true; + mem->pool = pool; + mem->nslabs = nslabs; rmem->priv = mem; From patchwork Tue Jun 27 09:54:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Petr Tesarik X-Patchwork-Id: 13294252 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E8490C0015E for ; Tue, 27 Jun 2023 09:58:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231937AbjF0J6n (ORCPT ); Tue, 27 Jun 2023 05:58:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56558 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231691AbjF0J6S (ORCPT ); Tue, 27 Jun 2023 05:58:18 -0400 Received: from frasgout11.his.huawei.com (unknown [14.137.139.23]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 558342D60; Tue, 27 Jun 2023 02:56:44 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.18.147.228]) by frasgout11.his.huawei.com (SkyGuard) with ESMTP id 4Qr0Fs2cCrz9v7HG; Tue, 27 Jun 2023 17:45:53 +0800 (CST) Received: from A2101119013HW2.china.huawei.com (unknown [10.45.153.185]) by APP1 (Coremail) with SMTP id LxC2BwBnwdzesZpknnvHAw--.39096S6; Tue, 27 Jun 2023 10:56:05 +0100 (CET) From: Petr Tesarik To: Stefano Stabellini , Thomas Bogendoerfer , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)), "H. Peter Anvin" , Greg Kroah-Hartman , "Rafael J. Wysocki" , Juergen Gross , Oleksandr Tyshchenko , Christoph Hellwig , Marek Szyprowski , Robin Murphy , Andy Shevchenko , Hans de Goede , Jason Gunthorpe , Kees Cook , Saravana Kannan , xen-devel@lists.xenproject.org (moderated list:XEN HYPERVISOR ARM), linux-arm-kernel@lists.infradead.org (moderated list:ARM PORT), linux-kernel@vger.kernel.org (open list), linux-mips@vger.kernel.org (open list:MIPS), iommu@lists.linux.dev (open list:XEN SWIOTLB SUBSYSTEM) Cc: Roberto Sassu , Kefeng Wang , petr@tesarici.cz Subject: [PATCH v3 4/7] swiotlb: if swiotlb is full, fall back to a transient memory pool Date: Tue, 27 Jun 2023 11:54:26 +0200 Message-Id: <34c2a1ba721a7bc496128aac5e20724e4077f1ab.1687859323.git.petr.tesarik.ext@huawei.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-CM-TRANSID: LxC2BwBnwdzesZpknnvHAw--.39096S6 X-Coremail-Antispam: 1UD129KBjvAXoW3Cr4UZr48ZrW8XFyfAry5urg_yoW8AryfXo W7CFn7Ww4fJryUCF45GFZ7GF47Za1kKF48AF4fuayqgayjvryYgrW3ta15X3sxur1xKFWx Ar1Sqa4fWF4xAry8n29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7v73VFW2AGmfu7bjvjm3 AaLaJ3UjIYCTnIWjp_UUUOT7AC8VAFwI0_Wr0E3s1l1xkIjI8I6I8E6xAIw20EY4v20xva j40_Wr0E3s1l1IIY67AEw4v_Jr0_Jr4l82xGYIkIc2x26280x7IE14v26r126s0DM28Irc Ia0xkI8VCY1x0267AKxVW5JVCq3wA2ocxC64kIII0Yj41l84x0c7CEw4AK67xGY2AK021l 84ACjcxK6xIIjxv20xvE14v26r1j6r1xM28EF7xvwVC0I7IYx2IY6xkF7I0E14v26r4UJV WxJr1l84ACjcxK6I8E87Iv67AKxVW8JVWxJwA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_Gr1j 6F4UJwAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7V C0I7IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j 6r4UM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x0262 8vn2kIc2xKxwCY1x0264kExVAvwVAq07x20xyl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC 6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWw C2zVAF1VAY17CE14v26rWY6r4UJwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r1j 6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8Jr0_Cr1UMIIF0xvE42xK8VAvwI8IcIk0rV WUJVWUCwCI42IY6I8E87Iv67AKxVW8JVWxJwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWx JrUvcSsGvfC2KfnxnUUI43ZEXa7VUU_Ma5UUUUU== X-CM-SenderInfo: hshw23xhvd2x3n6k3tpzhluzxrxghudrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-mips@vger.kernel.org From: Petr Tesarik Try to allocate a transient memory pool if no suitable slots can be found, except when allocating from a restricted pool. The transient pool is just enough big for this one bounce buffer. It is inserted into a per-device list of transient memory pools, and it is freed again when the bounce buffer is unmapped. Transient memory pools are kept in an RCU list. A memory barrier is required after adding a new entry, because any address within a transient buffer must be immediately recognized as belonging to the SWIOTLB, even if it is passed to another CPU. Deletion does not require any synchronization beyond RCU ordering guarantees. After a buffer is unmapped, its physical addresses may no longer be passed to the DMA API, so the memory range of the corresponding stale entry in the RCU list never matches. If the memory range gets allocated again, then it happens only after a RCU quiescent state. Since bounce buffers can now be allocated from different pools, add a parameter to swiotlb_alloc_pool() to let the caller know which memory pool is used. Add swiotlb_find_pool() to find the memory pool corresponding to an address. This function is now also used by is_swiotlb_buffer(), because a simple boundary check is no longer sufficient. The logic in swiotlb_alloc_tlb() is taken from __dma_direct_alloc_pages(), simplified and enhanced to use coherent memory pools if needed. Note that this is not the most efficient way to provide a bounce buffer, but when a DMA buffer can't be mapped, something may (and will) actually break. At that point it is better to make an allocation, even if it may be an expensive operation. Signed-off-by: Petr Tesarik --- include/linux/device.h | 4 + include/linux/dma-mapping.h | 2 + include/linux/swiotlb.h | 13 +- kernel/dma/direct.c | 2 +- kernel/dma/swiotlb.c | 265 ++++++++++++++++++++++++++++++++++-- 5 files changed, 272 insertions(+), 14 deletions(-) diff --git a/include/linux/device.h b/include/linux/device.h index 83081aa99e6a..a1ee4c5924b8 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -510,6 +510,8 @@ struct device_physical_location { * @dma_mem: Internal for coherent mem override. * @cma_area: Contiguous memory area for dma allocations * @dma_io_tlb_mem: Software IO TLB allocator. Not for driver use. + * @dma_io_tlb_pools: List of transient swiotlb memory pools. + * @dma_io_tlb_lock: Protects changes to the list of active pools. * @archdata: For arch-specific additions. * @of_node: Associated device tree node. * @fwnode: Associated device node supplied by platform firmware. @@ -615,6 +617,8 @@ struct device { #endif #ifdef CONFIG_SWIOTLB struct io_tlb_mem *dma_io_tlb_mem; + struct list_head dma_io_tlb_pools; + spinlock_t dma_io_tlb_lock; #endif /* arch specific additions */ struct dev_archdata archdata; diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 0ee20b764000..c36c5a546787 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -417,6 +417,8 @@ static inline void dma_sync_sgtable_for_device(struct device *dev, #define dma_get_sgtable(d, t, v, h, s) dma_get_sgtable_attrs(d, t, v, h, s, 0) #define dma_mmap_coherent(d, v, c, h, s) dma_mmap_attrs(d, v, c, h, s, 0) +bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size); + static inline void *dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t gfp) { diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 0aa6868cb68c..ae1688438850 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -63,6 +63,7 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys, /** * struct io_tlb_pool - IO TLB memory pool descriptor + * @node: Member of the IO TLB memory pool list. * @start: The start address of the swiotlb memory pool. Used to do a quick * range check to see if the memory was in fact allocated by this * API. @@ -77,22 +78,27 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys, * see setup_io_tlb_npages(). * @used: The number of used IO TLB slots. * @late_alloc: %true if allocated using the page allocator. + * @transient: %true if transient memory pool. * @nareas: Number of areas in the pool. * @area_nslabs: Number of slots in each area. * @areas: Array of memory area descriptors. * @slots: Array of slot descriptors. + * @rcu: RCU head for swiotlb_dyn_free(). */ struct io_tlb_pool { + struct list_head node; phys_addr_t start; phys_addr_t end; void *vaddr; unsigned long nslabs; unsigned long used; bool late_alloc; + bool transient; unsigned int nareas; unsigned int area_nslabs; struct io_tlb_area *areas; struct io_tlb_slot *slots; + struct rcu_head rcu; }; /** @@ -120,6 +126,8 @@ struct io_tlb_mem { #endif }; +struct io_tlb_pool *swiotlb_find_pool(struct device *dev, phys_addr_t paddr); + /** * is_swiotlb_buffer() - check if a physical address belongs to a swiotlb * @dev: Device which has mapped the buffer. @@ -133,9 +141,8 @@ struct io_tlb_mem { */ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr) { - struct io_tlb_mem *mem = dev->dma_io_tlb_mem; - - return mem && paddr >= mem->pool->start && paddr < mem->pool->end; + return dev->dma_io_tlb_mem && + !!swiotlb_find_pool(dev, paddr); } static inline bool is_swiotlb_force_bounce(struct device *dev) diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 5595d1d5cdcc..820561cab38d 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -66,7 +66,7 @@ static gfp_t dma_direct_optimal_gfp_mask(struct device *dev, u64 *phys_limit) return 0; } -static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size) +bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size) { dma_addr_t dma_addr = phys_to_dma_direct(dev, phys); diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 4c5de91bda57..06b4fa7c2e9b 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -35,6 +35,7 @@ #include #include #include +#include #include #include #include @@ -500,6 +501,157 @@ void __init swiotlb_exit(void) memset(mem, 0, sizeof(*mem)); } +/** + * alloc_dma_pages() - allocate pages to be used for DMA + * @gfp: GFP flags for the allocation. + * @bytes: Size of the buffer. + * + * Allocate pages from the buddy allocator. If successful, make the allocated + * pages decrypted that they can be used for DMA. + * + * Return: Decrypted pages, or %NULL on failure. + */ +static struct page *alloc_dma_pages(gfp_t gfp, size_t bytes) +{ + unsigned int order = get_order(bytes); + struct page *page; + void *vaddr; + + page = alloc_pages(gfp, order); + if (!page) + return NULL; + + vaddr = page_address(page); + if (set_memory_decrypted((unsigned long)vaddr, PFN_UP(bytes))) { + __free_pages(page, order); + return NULL; + } + + return page; +} + +/** + * swiotlb_alloc_tlb() - allocate a dynamic IO TLB buffer + * @dev: Device for which a memory pool is allocated. + * @bytes: Size of the buffer. + * @phys_limit: Maximum allowed physical address of the buffer. + * @gfp: GFP flags for the allocation. + * + * Return: Allocated pages, or %NULL on allocation failure. + */ +static struct page *swiotlb_alloc_tlb(struct device *dev, size_t bytes, + u64 phys_limit, gfp_t gfp) +{ + struct page *page; + + /* + * Allocate from the atomic pools if memory is encrypted and + * the allocation is atomic, because decrypting may block. + */ + if (dev && force_dma_unencrypted(dev) && !gfpflags_allow_blocking(gfp)) { + void *vaddr; + + return IS_ENABLED(CONFIG_DMA_COHERENT_POOL) + ? dma_alloc_from_pool(dev, bytes, &vaddr, gfp, + dma_coherent_ok) + : NULL; + } + + gfp &= ~GFP_ZONEMASK; + if (phys_limit <= DMA_BIT_MASK(zone_dma_bits)) + gfp |= __GFP_DMA; + else if (phys_limit <= DMA_BIT_MASK(32)) + gfp |= __GFP_DMA32; + + while ((page = alloc_dma_pages(gfp, bytes)) && + page_to_phys(page) + bytes - 1 > phys_limit) { + /* allocated, but too high */ + __free_pages(page, get_order(bytes)); + + if (IS_ENABLED(CONFIG_ZONE_DMA32) && + phys_limit < DMA_BIT_MASK(64) && + !(gfp & (__GFP_DMA32 | __GFP_DMA))) + gfp |= __GFP_DMA32; + else if (IS_ENABLED(CONFIG_ZONE_DMA) && + !(gfp & __GFP_DMA)) + gfp = (gfp & ~__GFP_DMA32) | __GFP_DMA; + else + return NULL; + } + + return page; +} + +/** + * swiotlb_free_tlb() - free a dynamically allocated IO TLB buffer + * @vaddr: Virtual address of the buffer. + * @bytes: Size of the buffer. + */ +static void swiotlb_free_tlb(void *vaddr, size_t bytes) +{ + if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL) && + dma_free_from_pool(NULL, vaddr, bytes)) + return; + + /* Intentional leak if pages cannot be encrypted again. */ + if (!set_memory_encrypted((unsigned long)vaddr, PFN_UP(bytes))) + __free_pages(virt_to_page(vaddr), get_order(bytes)); +} + +/** + * swiotlb_alloc_pool() - allocate a new IO TLB memory pool + * @dev: Device for which a memory pool is allocated. + * @nslabs: Desired number of slabs. + * @phys_limit: Maximum DMA buffer physical address. + * @gfp: GFP flags for the allocations. + * + * Allocate and initialize a new IO TLB memory pool. + * + * Return: New memory pool, or %NULL on allocation failure. + */ +static struct io_tlb_pool *swiotlb_alloc_pool(struct device *dev, + unsigned int nslabs, u64 phys_limit, gfp_t gfp) +{ + struct io_tlb_pool *pool; + struct page *tlb; + size_t pool_size; + size_t tlb_size; + + pool_size = sizeof(*pool) + array_size(sizeof(*pool->areas), 1) + + array_size(sizeof(*pool->slots), nslabs); + pool = kzalloc(pool_size, gfp); + if (!pool) + goto error; + pool->areas = (void *)pool + sizeof(*pool); + pool->slots = (void *)pool->areas + sizeof(*pool->areas); + + tlb_size = nslabs << IO_TLB_SHIFT; + tlb = swiotlb_alloc_tlb(dev, tlb_size, phys_limit, gfp); + if (!tlb) + goto error_tlb; + + swiotlb_init_io_tlb_pool(pool, page_to_phys(tlb), nslabs, true, 1); + return pool; + +error_tlb: + kfree(pool); +error: + return NULL; +} + +/** + * swiotlb_dyn_free() - RCU callback to free a memory pool + * @rcu: RCU head in the corresponding struct io_tlb_pool. + */ +static void swiotlb_dyn_free(struct rcu_head *rcu) +{ + struct io_tlb_pool *pool = container_of(rcu, struct io_tlb_pool, rcu); + size_t tlb_size = pool->end - pool->start; + + swiotlb_free_tlb(pool->vaddr, tlb_size); + kfree(pool); +} + /** * swiotlb_dev_init() - initialize swiotlb fields in &struct device * @dev: Device to be initialized. @@ -507,6 +659,56 @@ void __init swiotlb_exit(void) void swiotlb_dev_init(struct device *dev) { dev->dma_io_tlb_mem = &io_tlb_default_mem; + INIT_LIST_HEAD(&dev->dma_io_tlb_pools); + spin_lock_init(&dev->dma_io_tlb_lock); +} + +/** + * swiotlb_find_pool() - find the IO TLB pool for a physical address + * @dev: Device which has mapped the DMA buffer. + * @paddr: Physical address within the DMA buffer. + * + * Find the IO TLB memory pool descriptor which contains the given physical + * address, if any. + * + * Return: Memory pool which contains @paddr, or %NULL if none. + */ +struct io_tlb_pool *swiotlb_find_pool(struct device *dev, phys_addr_t paddr) +{ + struct io_tlb_mem *mem = dev->dma_io_tlb_mem; + struct io_tlb_pool *pool = mem->pool; + + if (paddr >= pool->start && paddr < pool->end) + return pool; + + /* Pairs with smp_wmb() in swiotlb_find_slots(). */ + smp_rmb(); + + rcu_read_lock(); + list_for_each_entry_rcu(pool, &dev->dma_io_tlb_pools, node) { + if (paddr >= pool->start && paddr < pool->end) + goto out; + } + pool = NULL; +out: + rcu_read_unlock(); + return pool; +} + +/** + * swiotlb_del_pool() - remove an IO TLB pool from a device + * @dev: Owning device. + * @pool: Memory pool to be removed. + */ +static void swiotlb_del_pool(struct device *dev, struct io_tlb_pool *pool) +{ + unsigned long flags; + + spin_lock_irqsave(&dev->dma_io_tlb_lock, flags); + list_del_rcu(&pool->node); + spin_unlock_irqrestore(&dev->dma_io_tlb_lock, flags); + + call_rcu(&pool->rcu, swiotlb_dyn_free); } /* @@ -523,7 +725,7 @@ static unsigned int swiotlb_align_offset(struct device *dev, u64 addr) static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size, enum dma_data_direction dir) { - struct io_tlb_pool *mem = dev->dma_io_tlb_mem->pool; + struct io_tlb_pool *mem = swiotlb_find_pool(dev, tlb_addr); int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT; phys_addr_t orig_addr = mem->slots[index].orig_addr; size_t alloc_size = mem->slots[index].alloc_size; @@ -796,6 +998,7 @@ static int pool_find_slots(struct device *dev, struct io_tlb_pool *pool, * @alloc_size: Total requested size of the bounce buffer, * including initial alignment padding. * @alloc_align_mask: Required alignment of the allocated buffer. + * @retpool: Used memory pool, updated on return. * * Search through the whole software IO TLB to find a sequence of slots that * match the allocation constraints. @@ -803,10 +1006,46 @@ static int pool_find_slots(struct device *dev, struct io_tlb_pool *pool, * Return: Index of the first allocated slot, or -1 on error. */ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr, - size_t alloc_size, unsigned int alloc_align_mask) + size_t alloc_size, unsigned int alloc_align_mask, + struct io_tlb_pool **retpool) { - return pool_find_slots(dev, dev->dma_io_tlb_mem->pool, orig_addr, - alloc_size, alloc_align_mask); + struct io_tlb_pool *pool; + unsigned long flags; + u64 phys_limit; + int index; + + pool = dev->dma_io_tlb_mem->pool; + index = pool_find_slots(dev, pool, orig_addr, + alloc_size, alloc_align_mask); + if (index >= 0) + goto found; + + if (dev->dma_io_tlb_mem->for_alloc) + return -1; + + phys_limit = min_not_zero(dev->coherent_dma_mask, dev->bus_dma_limit); + pool = swiotlb_alloc_pool(dev, nr_slots(alloc_size), phys_limit, + GFP_NOWAIT | __GFP_NOWARN); + if (!pool) + return -1; + + index = pool_find_slots(dev, pool, orig_addr, + alloc_size, alloc_align_mask); + if (index < 0) { + swiotlb_dyn_free(&pool->rcu); + return -1; + } + + pool->transient = true; + spin_lock_irqsave(&dev->dma_io_tlb_lock, flags); + list_add_rcu(&pool->node, &dev->dma_io_tlb_pools); + spin_unlock_irqrestore(&dev->dma_io_tlb_lock, flags); + + /* Pairs with smp_rmb() in swiotlb_find_pool(). */ + smp_wmb(); +found: + *retpool = pool; + return index; } /** @@ -869,7 +1108,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, } index = swiotlb_find_slots(dev, orig_addr, - alloc_size + offset, alloc_align_mask); + alloc_size + offset, alloc_align_mask, &pool); if (index == -1) { if (!(attrs & DMA_ATTR_NO_WARN)) dev_warn_ratelimited(dev, @@ -883,7 +1122,6 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, * This is needed when we sync the memory. Then we sync the buffer if * needed. */ - pool = mem->pool; for (i = 0; i < nr_slots(alloc_size + offset); i++) pool->slots[index + i].orig_addr = slot_addr(orig_addr, i); tlb_addr = slot_addr(pool->start, index) + offset; @@ -900,7 +1138,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr) { - struct io_tlb_pool *mem = dev->dma_io_tlb_mem->pool; + struct io_tlb_pool *mem = swiotlb_find_pool(dev, tlb_addr); unsigned long flags; unsigned int offset = swiotlb_align_offset(dev, tlb_addr); int index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT; @@ -954,6 +1192,8 @@ void swiotlb_tbl_unmap_single(struct device *dev, phys_addr_t tlb_addr, size_t mapping_size, enum dma_data_direction dir, unsigned long attrs) { + struct io_tlb_pool *pool; + /* * First, sync the memory before unmapping the entry */ @@ -961,7 +1201,13 @@ void swiotlb_tbl_unmap_single(struct device *dev, phys_addr_t tlb_addr, (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_FROM_DEVICE); - swiotlb_release_slots(dev, tlb_addr); + pool = swiotlb_find_pool(dev, tlb_addr); + if (pool->transient) { + dec_used(dev->dma_io_tlb_mem, pool->nslabs); + swiotlb_del_pool(dev, pool); + } else { + swiotlb_release_slots(dev, tlb_addr); + } } void swiotlb_sync_single_for_device(struct device *dev, phys_addr_t tlb_addr, @@ -1145,11 +1391,10 @@ struct page *swiotlb_alloc(struct device *dev, size_t size) if (!mem) return NULL; - index = swiotlb_find_slots(dev, 0, size, 0); + index = swiotlb_find_slots(dev, 0, size, 0, &pool); if (index == -1) return NULL; - pool = mem->pool; tlb_addr = slot_addr(pool->start, index); return pfn_to_page(PFN_DOWN(tlb_addr)); From patchwork Tue Jun 27 09:54:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Petr Tesarik X-Patchwork-Id: 13294253 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C03D4EB64D9 for ; Tue, 27 Jun 2023 09:59:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231511AbjF0J7V (ORCPT ); Tue, 27 Jun 2023 05:59:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57642 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231615AbjF0J6k (ORCPT ); Tue, 27 Jun 2023 05:58:40 -0400 Received: from frasgout12.his.huawei.com (unknown [14.137.139.154]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E78AF35AE; Tue, 27 Jun 2023 02:57:00 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.18.147.227]) by frasgout12.his.huawei.com (SkyGuard) with ESMTP id 4Qr0DH62DDz9xGXS; Tue, 27 Jun 2023 17:44:31 +0800 (CST) Received: from A2101119013HW2.china.huawei.com (unknown [10.45.153.185]) by APP1 (Coremail) with SMTP id LxC2BwBnwdzesZpknnvHAw--.39096S7; Tue, 27 Jun 2023 10:56:21 +0100 (CET) From: Petr Tesarik To: Stefano Stabellini , Thomas Bogendoerfer , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)), "H. Peter Anvin" , Greg Kroah-Hartman , "Rafael J. Wysocki" , Juergen Gross , Oleksandr Tyshchenko , Christoph Hellwig , Marek Szyprowski , Robin Murphy , Andy Shevchenko , Hans de Goede , Jason Gunthorpe , Kees Cook , Saravana Kannan , xen-devel@lists.xenproject.org (moderated list:XEN HYPERVISOR ARM), linux-arm-kernel@lists.infradead.org (moderated list:ARM PORT), linux-kernel@vger.kernel.org (open list), linux-mips@vger.kernel.org (open list:MIPS), iommu@lists.linux.dev (open list:XEN SWIOTLB SUBSYSTEM) Cc: Roberto Sassu , Kefeng Wang , petr@tesarici.cz Subject: [PATCH v3 5/7] swiotlb: determine potential physical address limit Date: Tue, 27 Jun 2023 11:54:27 +0200 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-CM-TRANSID: LxC2BwBnwdzesZpknnvHAw--.39096S7 X-Coremail-Antispam: 1UD129KBjvJXoWxXw4fJr1DCF4Dtr1kuF1DAwb_yoW5WF48pF y8Aa1rtFsFqF1xAryxC3W7AFn3Ka1kC3y7urWa9r15Ar13Gr1rKrsrKr4Yqr1FqF409F4I vFyFvayYvF4aqw7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUml14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_JFI_Gr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26r4j6F4UM28EF7xvwVC2z280aVCY1x0267AKxVW8Jr0_ Cr1UM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6x IIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_ Gr1lF7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8c xan2IY04v7MxkF7I0Ew4C26cxK6c8Ij28IcwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE 7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI 8E67AF67kF1VAFwI0_Wrv_Gr1UMIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_JFI_ Gr1lIxAIcVC0I7IYx2IY6xkF7I0E14v26r4UJVWxJr1lIxAIcVCF04k26cxKx2IYs7xG6r 1j6r1xMIIF0xvEx4A2jsIE14v26r4j6F4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4U JbIYCTnIWIevJa73UjIFyTuYvjfU1WlkDUUUU X-CM-SenderInfo: hshw23xhvd2x3n6k3tpzhluzxrxghudrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-mips@vger.kernel.org From: Petr Tesarik The value returned by default_swiotlb_limit() should not change, because it is used to decide whether DMA can be used. To allow allocating memory pools on the fly, use the maximum possible physical address rather than the highest address used by the default pool. For swiotlb_init_remap(), this is either an arch-specific limit used by memblock_alloc_low(), or the highest directly mapped physical address if the initialization flags include SWIOTLB_ANY. For swiotlb_init_late(), the highest address is determined by the GFP flags. Signed-off-by: Petr Tesarik --- include/linux/swiotlb.h | 2 ++ kernel/dma/swiotlb.c | 11 ++++++++++- 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index ae1688438850..4a3af1c216d0 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -105,6 +105,7 @@ struct io_tlb_pool { * struct io_tlb_mem - Software IO TLB allocator * @pool: IO TLB memory pool descriptor. * @nslabs: Total number of IO TLB slabs in all pools. + * @phys_limit: Maximum allowed physical address. * @debugfs: The dentry to debugfs. * @force_bounce: %true if swiotlb bouncing is forced * @for_alloc: %true if the pool is used for memory allocation @@ -117,6 +118,7 @@ struct io_tlb_pool { struct io_tlb_mem { struct io_tlb_pool *pool; unsigned long nslabs; + u64 phys_limit; struct dentry *debugfs; bool force_bounce; bool for_alloc; diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 06b4fa7c2e9b..5bb83097ade6 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -333,6 +333,9 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags, io_tlb_default_mem.force_bounce = swiotlb_force_bounce || (flags & SWIOTLB_FORCE); + io_tlb_default_mem.phys_limit = flags & SWIOTLB_ANY + ? virt_to_phys(high_memory - 1) + : ARCH_LOW_ADDRESS_LIMIT; if (!default_nareas) swiotlb_adjust_nareas(num_possible_cpus()); @@ -400,6 +403,12 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, return 0; io_tlb_default_mem.force_bounce = swiotlb_force_bounce; + io_tlb_default_mem.phys_limit = + IS_ENABLED(CONFIG_ZONE_DMA) && (gfp_mask & __GFP_DMA) + ? DMA_BIT_MASK(zone_dma_bits) + : (IS_ENABLED(CONFIG_ZONE_DMA32) && (gfp_mask & __GFP_DMA32) + ? DMA_BIT_MASK(32) + : virt_to_phys(high_memory - 1)); if (!default_nareas) swiotlb_adjust_nareas(num_possible_cpus()); @@ -1308,7 +1317,7 @@ phys_addr_t default_swiotlb_start(void) */ phys_addr_t default_swiotlb_limit(void) { - return io_tlb_default_pool.end - 1; + return io_tlb_default_mem.phys_limit; } #ifdef CONFIG_DEBUG_FS From patchwork Tue Jun 27 09:54:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Petr Tesarik X-Patchwork-Id: 13294254 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2523EB64D9 for ; Tue, 27 Jun 2023 09:59:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230522AbjF0J7a (ORCPT ); Tue, 27 Jun 2023 05:59:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55350 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231145AbjF0J7N (ORCPT ); Tue, 27 Jun 2023 05:59:13 -0400 Received: from frasgout11.his.huawei.com (unknown [14.137.139.23]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0FF712D59; Tue, 27 Jun 2023 02:57:15 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.18.147.228]) by frasgout11.his.huawei.com (SkyGuard) with ESMTP id 4Qr0GT2cBfz9v7Hd; Tue, 27 Jun 2023 17:46:25 +0800 (CST) Received: from A2101119013HW2.china.huawei.com (unknown [10.45.153.185]) by APP1 (Coremail) with SMTP id LxC2BwBnwdzesZpknnvHAw--.39096S8; Tue, 27 Jun 2023 10:56:37 +0100 (CET) From: Petr Tesarik To: Stefano Stabellini , Thomas Bogendoerfer , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)), "H. Peter Anvin" , Greg Kroah-Hartman , "Rafael J. Wysocki" , Juergen Gross , Oleksandr Tyshchenko , Christoph Hellwig , Marek Szyprowski , Robin Murphy , Andy Shevchenko , Hans de Goede , Jason Gunthorpe , Kees Cook , Saravana Kannan , xen-devel@lists.xenproject.org (moderated list:XEN HYPERVISOR ARM), linux-arm-kernel@lists.infradead.org (moderated list:ARM PORT), linux-kernel@vger.kernel.org (open list), linux-mips@vger.kernel.org (open list:MIPS), iommu@lists.linux.dev (open list:XEN SWIOTLB SUBSYSTEM) Cc: Roberto Sassu , Kefeng Wang , petr@tesarici.cz Subject: [PATCH v3 6/7] swiotlb: allocate a new memory pool when existing pools are full Date: Tue, 27 Jun 2023 11:54:28 +0200 Message-Id: <3d51c5c0f59d74e2f79b4ce06f7e00b8c09f6b3a.1687859323.git.petr.tesarik.ext@huawei.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-CM-TRANSID: LxC2BwBnwdzesZpknnvHAw--.39096S8 X-Coremail-Antispam: 1UD129KBjvJXoW3ZrWfGFyxAryxCr4UJFWfAFb_yoWkXF13pF y5Aa43KFWUXrnrCFy7C3WUAF1akw4vkr47Cryavr1rZrn7Wr95tFyvk3yYqrWrJFWkuF13 tFWFvF4ruw47Zr7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUml14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_JFI_Gr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26r4j6F4UM28EF7xvwVC2z280aVCY1x0267AKxVW8Jr0_ Cr1UM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6x IIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_ Gr1lF7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8c xan2IY04v7MxkF7I0Ew4C26cxK6c8Ij28IcwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE 7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI 8E67AF67kF1VAFwI0_Wrv_Gr1UMIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_JFI_ Gr1lIxAIcVC0I7IYx2IY6xkF7I0E14v26r4UJVWxJr1lIxAIcVCF04k26cxKx2IYs7xG6r 1j6r1xMIIF0xvEx4A2jsIE14v26r4j6F4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4U JbIYCTnIWIevJa73UjIFyTuYvjfU1WlkDUUUU X-CM-SenderInfo: hshw23xhvd2x3n6k3tpzhluzxrxghudrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-mips@vger.kernel.org From: Petr Tesarik When swiotlb_find_slots() cannot find suitable slots, schedule the allocation of a new memory pool. It is not possible to allocate the pool immediately, because this code may run in interrupt context, which is not suitable for large memory allocations. This means that the memory pool will be available too late for the currently requested mapping, but the stress on the software IO TLB allocator is likely to continue, and subsequent allocations will benefit from the additional pool eventually. Keep all memory pools for an allocator in an RCU list to avoid locking on the read side. For modifications, add a new spinlock to struct io_tlb_mem. The spinlock also protects updates to the total number of slabs (nslabs in struct io_tlb_mem), but not reads of the value. Readers may therefore encounter a stale value, but this is not an issue: - swiotlb_tbl_map_single() and is_swiotlb_active() only check for non-zero value. This is ensured by the existence of the default memory pool, allocated at boot. - The exact value is used only for non-critical purposes (debugfs, kernel messages). Signed-off-by: Petr Tesarik --- include/linux/swiotlb.h | 9 ++- kernel/dma/swiotlb.c | 141 +++++++++++++++++++++++++++++++--------- 2 files changed, 119 insertions(+), 31 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 4a3af1c216d0..ae402890ba41 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -8,6 +8,7 @@ #include #include #include +#include struct device; struct page; @@ -103,12 +104,14 @@ struct io_tlb_pool { /** * struct io_tlb_mem - Software IO TLB allocator - * @pool: IO TLB memory pool descriptor. + * @lock: Lock to synchronize changes to the list. + * @pools: List of IO TLB memory pool descriptors. * @nslabs: Total number of IO TLB slabs in all pools. * @phys_limit: Maximum allowed physical address. * @debugfs: The dentry to debugfs. * @force_bounce: %true if swiotlb bouncing is forced * @for_alloc: %true if the pool is used for memory allocation + * @dyn_alloc: Dynamic IO TLB pool allocation work. * @total_used: The total number of slots in the pool that are currently used * across all areas. Used only for calculating used_hiwater in * debugfs. @@ -116,12 +119,14 @@ struct io_tlb_pool { * in debugfs. */ struct io_tlb_mem { - struct io_tlb_pool *pool; + spinlock_t lock; + struct list_head pools; unsigned long nslabs; u64 phys_limit; struct dentry *debugfs; bool force_bounce; bool for_alloc; + struct work_struct dyn_alloc; #ifdef CONFIG_DEBUG_FS atomic_long_t total_used; atomic_long_t used_hiwater; diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 5bb83097ade6..7661a6402e80 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -79,9 +79,14 @@ struct io_tlb_slot { static bool swiotlb_force_bounce; static bool swiotlb_force_disable; +static void swiotlb_dyn_alloc(struct work_struct *work); + static struct io_tlb_pool io_tlb_default_pool; -static struct io_tlb_mem io_tlb_default_mem = { - .pool = &io_tlb_default_pool, +struct io_tlb_mem io_tlb_default_mem = { + .lock = __SPIN_LOCK_UNLOCKED(io_tlb_default_mem.lock), + .pools = LIST_HEAD_INIT(io_tlb_default_mem.pools), + .dyn_alloc = __WORK_INITIALIZER(io_tlb_default_mem.dyn_alloc, + swiotlb_dyn_alloc), }; static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT; @@ -281,6 +286,19 @@ static void swiotlb_init_io_tlb_pool(struct io_tlb_pool *mem, phys_addr_t start, return; } +/** + * add_mem_pool() - add a memory pool to the allocator + * @mem: Software IO TLB allocator. + * @pool: Memory pool to be added. + */ +static void add_mem_pool(struct io_tlb_mem *mem, struct io_tlb_pool *pool) +{ + spin_lock(&mem->lock); + list_add_rcu(&pool->node, &mem->pools); + mem->nslabs += pool->nslabs; + spin_unlock(&mem->lock); +} + static void __init *swiotlb_memblock_alloc(unsigned long nslabs, unsigned int flags, int (*remap)(void *tlb, unsigned long nslabs)) @@ -372,7 +390,7 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags, swiotlb_init_io_tlb_pool(mem, __pa(tlb), nslabs, false, default_nareas); - io_tlb_default_mem.nslabs = nslabs; + add_mem_pool(&io_tlb_default_mem, mem); if (flags & SWIOTLB_VERBOSE) swiotlb_print_info(); @@ -463,7 +481,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, (nslabs << IO_TLB_SHIFT) >> PAGE_SHIFT); swiotlb_init_io_tlb_pool(mem, virt_to_phys(vstart), nslabs, true, nareas); - io_tlb_default_mem.nslabs = nslabs; + add_mem_pool(&io_tlb_default_mem, mem); swiotlb_print_info(); return 0; @@ -610,44 +628,83 @@ static void swiotlb_free_tlb(void *vaddr, size_t bytes) /** * swiotlb_alloc_pool() - allocate a new IO TLB memory pool * @dev: Device for which a memory pool is allocated. - * @nslabs: Desired number of slabs. + * @minslabs: Minimum number of slabs. + * @nslabs: Desired (maximum) number of slabs. + * @nareas: Number of areas. * @phys_limit: Maximum DMA buffer physical address. * @gfp: GFP flags for the allocations. * - * Allocate and initialize a new IO TLB memory pool. + * Allocate and initialize a new IO TLB memory pool. The actual number of + * slabs may be reduced if allocation of @nslabs fails. If even + * @minslabs cannot be allocated, this function fails. * * Return: New memory pool, or %NULL on allocation failure. */ static struct io_tlb_pool *swiotlb_alloc_pool(struct device *dev, - unsigned int nslabs, u64 phys_limit, gfp_t gfp) + unsigned long minslabs, unsigned long nslabs, + unsigned int nareas, u64 phys_limit, gfp_t gfp) { struct io_tlb_pool *pool; + unsigned int slot_order; struct page *tlb; size_t pool_size; size_t tlb_size; - pool_size = sizeof(*pool) + array_size(sizeof(*pool->areas), 1) + - array_size(sizeof(*pool->slots), nslabs); + pool_size = sizeof(*pool) + array_size(sizeof(*pool->areas), nareas); pool = kzalloc(pool_size, gfp); if (!pool) goto error; pool->areas = (void *)pool + sizeof(*pool); - pool->slots = (void *)pool->areas + sizeof(*pool->areas); tlb_size = nslabs << IO_TLB_SHIFT; - tlb = swiotlb_alloc_tlb(dev, tlb_size, phys_limit, gfp); - if (!tlb) - goto error_tlb; + while (!(tlb = swiotlb_alloc_tlb(dev, tlb_size, phys_limit, gfp))) { + if (nslabs <= minslabs) + goto error_tlb; + nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE); + nareas = limit_nareas(nareas, nslabs); + tlb_size = nslabs << IO_TLB_SHIFT; + } - swiotlb_init_io_tlb_pool(pool, page_to_phys(tlb), nslabs, true, 1); + slot_order = get_order(array_size(sizeof(*pool->slots), nslabs)); + pool->slots = (struct io_tlb_slot *) + __get_free_pages(gfp, slot_order); + if (!pool->slots) + goto error_slots; + + swiotlb_init_io_tlb_pool(pool, page_to_phys(tlb), nslabs, true, nareas); return pool; +error_slots: + swiotlb_free_tlb(page_address(tlb), tlb_size); error_tlb: kfree(pool); error: return NULL; } +/** + * swiotlb_dyn_alloc() - dynamic memory pool allocation worker + * @work: Pointer to dyn_alloc in struct io_tlb_mem. + */ +static void swiotlb_dyn_alloc(struct work_struct *work) +{ + struct io_tlb_mem *mem = + container_of(work, struct io_tlb_mem, dyn_alloc); + struct io_tlb_pool *pool; + + pool = swiotlb_alloc_pool(NULL, IO_TLB_MIN_SLABS, default_nslabs, + default_nareas, mem->phys_limit, GFP_KERNEL); + if (!pool) { + pr_warn_ratelimited("Failed to allocate new pool"); + return; + } + + add_mem_pool(mem, pool); + + /* Pairs with smp_rmb() in swiotlb_find_pool(). */ + smp_wmb(); +} + /** * swiotlb_dyn_free() - RCU callback to free a memory pool * @rcu: RCU head in the corresponding struct io_tlb_pool. @@ -655,8 +712,10 @@ static struct io_tlb_pool *swiotlb_alloc_pool(struct device *dev, static void swiotlb_dyn_free(struct rcu_head *rcu) { struct io_tlb_pool *pool = container_of(rcu, struct io_tlb_pool, rcu); + size_t slots_size = array_size(sizeof(*pool->slots), pool->nslabs); size_t tlb_size = pool->end - pool->start; + free_pages((unsigned long)pool->slots, get_order(slots_size)); swiotlb_free_tlb(pool->vaddr, tlb_size); kfree(pool); } @@ -685,15 +744,19 @@ void swiotlb_dev_init(struct device *dev) struct io_tlb_pool *swiotlb_find_pool(struct device *dev, phys_addr_t paddr) { struct io_tlb_mem *mem = dev->dma_io_tlb_mem; - struct io_tlb_pool *pool = mem->pool; - - if (paddr >= pool->start && paddr < pool->end) - return pool; + struct io_tlb_pool *pool; - /* Pairs with smp_wmb() in swiotlb_find_slots(). */ + /* Pairs with smp_wmb() in swiotlb_find_slots() and + * swiotlb_dyn_alloc(), which modify the RCU lists. + */ smp_rmb(); rcu_read_lock(); + list_for_each_entry_rcu(pool, &mem->pools, node) { + if (paddr >= pool->start && paddr < pool->end) + goto out; + } + list_for_each_entry_rcu(pool, &dev->dma_io_tlb_pools, node) { if (paddr >= pool->start && paddr < pool->end) goto out; @@ -1019,22 +1082,34 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr, struct io_tlb_pool **retpool) { struct io_tlb_pool *pool; + struct io_tlb_mem *mem; + unsigned long nslabs; unsigned long flags; u64 phys_limit; int index; - pool = dev->dma_io_tlb_mem->pool; - index = pool_find_slots(dev, pool, orig_addr, - alloc_size, alloc_align_mask); - if (index >= 0) - goto found; + mem = dev->dma_io_tlb_mem; + + rcu_read_lock(); + list_for_each_entry_rcu(pool, &mem->pools, node) { + index = pool_find_slots(dev, pool, orig_addr, + alloc_size, alloc_align_mask); + if (index >= 0) { + rcu_read_unlock(); + goto found; + } + } + rcu_read_unlock(); if (dev->dma_io_tlb_mem->for_alloc) return -1; + schedule_work(&mem->dyn_alloc); + phys_limit = min_not_zero(dev->coherent_dma_mask, dev->bus_dma_limit); - pool = swiotlb_alloc_pool(dev, nr_slots(alloc_size), phys_limit, - GFP_NOWAIT | __GFP_NOWARN); + nslabs = nr_slots(alloc_size); + pool = swiotlb_alloc_pool(dev, nslabs, nslabs, 1, phys_limit, + GFP_NOWAIT | __GFP_NOWARN); if (!pool) return -1; @@ -1086,7 +1161,15 @@ static unsigned long mem_pool_used(struct io_tlb_pool *pool) */ static unsigned long mem_used(struct io_tlb_mem *mem) { - return mem_pool_used(mem->pool); + struct io_tlb_pool *pool; + unsigned long used = 0; + + rcu_read_lock(); + list_for_each_entry_rcu(pool, &mem->pools, node) + used += mem_pool_used(pool); + rcu_read_unlock(); + + return used; } phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, @@ -1468,8 +1551,8 @@ static int rmem_swiotlb_device_init(struct reserved_mem *rmem, false, nareas); mem->force_bounce = true; mem->for_alloc = true; - mem->pool = pool; - mem->nslabs = nslabs; + spin_lock_init(&mem->lock); + add_mem_pool(mem, pool); rmem->priv = mem; From patchwork Tue Jun 27 09:54:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Petr Tesarik X-Patchwork-Id: 13294255 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30727EB64D9 for ; Tue, 27 Jun 2023 10:00:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229966AbjF0KAG (ORCPT ); Tue, 27 Jun 2023 06:00:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55824 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231307AbjF0J71 (ORCPT ); Tue, 27 Jun 2023 05:59:27 -0400 Received: from frasgout11.his.huawei.com (unknown [14.137.139.23]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DA6342D73; Tue, 27 Jun 2023 02:57:31 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.18.147.229]) by frasgout11.his.huawei.com (SkyGuard) with ESMTP id 4Qr0Gn2dWKz9v7HG; Tue, 27 Jun 2023 17:46:41 +0800 (CST) Received: from A2101119013HW2.china.huawei.com (unknown [10.45.153.185]) by APP1 (Coremail) with SMTP id LxC2BwBnwdzesZpknnvHAw--.39096S9; Tue, 27 Jun 2023 10:56:53 +0100 (CET) From: Petr Tesarik To: Stefano Stabellini , Thomas Bogendoerfer , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)), "H. Peter Anvin" , Greg Kroah-Hartman , "Rafael J. Wysocki" , Juergen Gross , Oleksandr Tyshchenko , Christoph Hellwig , Marek Szyprowski , Robin Murphy , Andy Shevchenko , Hans de Goede , Jason Gunthorpe , Kees Cook , Saravana Kannan , xen-devel@lists.xenproject.org (moderated list:XEN HYPERVISOR ARM), linux-arm-kernel@lists.infradead.org (moderated list:ARM PORT), linux-kernel@vger.kernel.org (open list), linux-mips@vger.kernel.org (open list:MIPS), iommu@lists.linux.dev (open list:XEN SWIOTLB SUBSYSTEM) Cc: Roberto Sassu , Kefeng Wang , petr@tesarici.cz Subject: [PATCH v3 7/7] swiotlb: search the software IO TLB only if a device makes use of it Date: Tue, 27 Jun 2023 11:54:29 +0200 Message-Id: <34b9a218e9a152dd53eb0d4b0a819c6de987c8c1.1687859323.git.petr.tesarik.ext@huawei.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-CM-TRANSID: LxC2BwBnwdzesZpknnvHAw--.39096S9 X-Coremail-Antispam: 1UD129KBjvJXoWxZF15Ww18ZF1fWr1UtFW7XFb_yoWrXF1DpF 9xAFZ8KayDXr97Cr97CF4UZ3Wagw4vkw43CryagrnYkr1DJwnYqFyDKrWYv3s5Ar47ZFW7 Xryj939Ykw17Xr7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUml14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_JFI_Gr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26r4j6F4UM28EF7xvwVC2z280aVCY1x0267AKxVW8Jr0_ Cr1UM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6x IIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_ Gr1lF7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8c xan2IY04v7MxkF7I0Ew4C26cxK6c8Ij28IcwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE 7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI 8E67AF67kF1VAFwI0_Wrv_Gr1UMIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_JFI_ Gr1lIxAIcVC0I7IYx2IY6xkF7I0E14v26r4UJVWxJr1lIxAIcVCF04k26cxKx2IYs7xG6r 1j6r1xMIIF0xvEx4A2jsIE14v26r4j6F4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4U JbIYCTnIWIevJa73UjIFyTuYvjfU1WlkDUUUU X-CM-SenderInfo: hshw23xhvd2x3n6k3tpzhluzxrxghudrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-mips@vger.kernel.org From: Petr Tesarik Skip searching the software IO TLB if a device has never used it, making sure these devices are not affected by the introduction of multiple IO TLB memory pools. Additional memory barrier is required to ensure that the new value of the flag is visible to other CPUs after mapping a new bounce buffer. For efficiency, the flag check should be inlined, and then the memory barrier must be moved to is_swiotlb_buffer(). However, it can replace the existing barrier in swiotlb_find_pool(), because all callers use is_swiotlb_buffer() first to verify that the buffer address belongs to the software IO TLB. Signed-off-by: Petr Tesarik --- include/linux/device.h | 2 ++ include/linux/swiotlb.h | 6 +++++- kernel/dma/swiotlb.c | 14 ++++++-------- 3 files changed, 13 insertions(+), 9 deletions(-) diff --git a/include/linux/device.h b/include/linux/device.h index a1ee4c5924b8..d4b35925a6d1 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -512,6 +512,7 @@ struct device_physical_location { * @dma_io_tlb_mem: Software IO TLB allocator. Not for driver use. * @dma_io_tlb_pools: List of transient swiotlb memory pools. * @dma_io_tlb_lock: Protects changes to the list of active pools. + * @dma_uses_io_tlb: %true if device has used the software IO TLB. * @archdata: For arch-specific additions. * @of_node: Associated device tree node. * @fwnode: Associated device node supplied by platform firmware. @@ -619,6 +620,7 @@ struct device { struct io_tlb_mem *dma_io_tlb_mem; struct list_head dma_io_tlb_pools; spinlock_t dma_io_tlb_lock; + bool dma_uses_io_tlb; #endif /* arch specific additions */ struct dev_archdata archdata; diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index ae402890ba41..eb30b5c624f1 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -148,7 +148,11 @@ struct io_tlb_pool *swiotlb_find_pool(struct device *dev, phys_addr_t paddr); */ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr) { - return dev->dma_io_tlb_mem && + /* Pairs with smp_wmb() in swiotlb_find_slots() and + * swiotlb_dyn_alloc(), which modify the RCU lists. + */ + smp_rmb(); + return dev->dma_uses_io_tlb && !!swiotlb_find_pool(dev, paddr); } diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 7661a6402e80..26ab9ed2921b 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -701,7 +701,7 @@ static void swiotlb_dyn_alloc(struct work_struct *work) add_mem_pool(mem, pool); - /* Pairs with smp_rmb() in swiotlb_find_pool(). */ + /* Pairs with smp_rmb() in is_swiotlb_buffer(). */ smp_wmb(); } @@ -729,6 +729,7 @@ void swiotlb_dev_init(struct device *dev) dev->dma_io_tlb_mem = &io_tlb_default_mem; INIT_LIST_HEAD(&dev->dma_io_tlb_pools); spin_lock_init(&dev->dma_io_tlb_lock); + dev->dma_uses_io_tlb = false; } /** @@ -746,11 +747,6 @@ struct io_tlb_pool *swiotlb_find_pool(struct device *dev, phys_addr_t paddr) struct io_tlb_mem *mem = dev->dma_io_tlb_mem; struct io_tlb_pool *pool; - /* Pairs with smp_wmb() in swiotlb_find_slots() and - * swiotlb_dyn_alloc(), which modify the RCU lists. - */ - smp_rmb(); - rcu_read_lock(); list_for_each_entry_rcu(pool, &mem->pools, node) { if (paddr >= pool->start && paddr < pool->end) @@ -1125,9 +1121,11 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr, list_add_rcu(&pool->node, &dev->dma_io_tlb_pools); spin_unlock_irqrestore(&dev->dma_io_tlb_lock, flags); - /* Pairs with smp_rmb() in swiotlb_find_pool(). */ - smp_wmb(); found: + dev->dma_uses_io_tlb = true; + /* Pairs with smp_rmb() in is_swiotlb_buffer() */ + smp_wmb(); + *retpool = pool; return index; }