From patchwork Thu Aug 1 08:25:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Baruch Siach X-Patchwork-Id: 13749965 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 81633C3DA64 for ; Thu, 1 Aug 2024 08:26:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=QgiyLHcYGcRHt8rOlQVYCCzTAApoU8bv1AH2DRS6oZA=; b=bE8eFOjx/VKv+tPwpdq/frjfTT zokwFLwPRWTY0m4QZ5EL/xQPMKfl/Gruz/KKwvjOph+feGnIDhg0F2aWZgS84wXz+GeL9P9vT9tna 0BqkKEYkVe6fMrpV59bV3K2SrOpGPqXB3JA2qIZf5I29PkjeJaO9HMaKROo+GMLdRFsaeF/AAaTeI iFxg42PmABTr3cnKVK2PYNQK2PJNteTbsxDVscqmkRk2b8o15vu20SPBxd9xFiPx54q6YA1rk2Ap5 WPWoyNA29lQRLKUm9KyQJB1ePnGJd+W4Dg7bsitg2+pAFZqrngiZ8dts0XHHtbPQYJl9ATJea65R7 JuHOZjKg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sZR8o-00000004LaC-3Mpx; Thu, 01 Aug 2024 08:26:31 +0000 Received: from golan.tkos.co.il ([84.110.109.230] helo=mail.tkos.co.il) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sZR7s-00000004LGN-2LfB for linux-arm-kernel@lists.infradead.org; Thu, 01 Aug 2024 08:25:34 +0000 Received: from tarshish.tkos.co.il (unknown [10.0.8.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.tkos.co.il (Postfix) with ESMTPS id 864ED440932; Thu, 1 Aug 2024 11:24:07 +0300 (IDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tkos.co.il; s=default; t=1722500648; bh=Yaa82cqvK/7ZFcSlM9y57cf/s6CJpTcrZlbQ2R51MpE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HFyQUvTlTTY/eVqJOchHKkcdrdrGmNohf9og26uXHRe5MRgamACVy6P1EEO1MVAh1 II0T6z3ntd4j4LJPyPrQjkGF+KVwjbHDj6f1p31XD7ZDg4siEAAlAqy90hAE3AOKEQ kfDZc6BxfUE3DB1Ze04ejUdK82dx8Wuu01nPg3VFnD2z6gBadmBRwLLlR4AEFA8JPV FRAiKSo0+y6o3FQTn67gC63CII16EWCV7sAwsaCZfaXA6F0NrmAOJc7xQLK8POsxo0 aoykzw5fl8o8QJh5LduEATALVyA6vathNBxOzfVlGDILkvJmQOO8u8pCpXR56wm8Tt VHJdGhIcLRSPQ== From: Baruch Siach To: Christoph Hellwig , Marek Szyprowski , Catalin Marinas , Will Deacon Cc: Baruch Siach , Robin Murphy , iommu@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, =?utf-8?b?UGV0ciBUZXNhxZnDrWs=?= , Ramon Fried , Elad Nachman Subject: [PATCH v4 1/2] dma: improve DMA zone selection Date: Thu, 1 Aug 2024 11:25:06 +0300 Message-ID: X-Mailer: git-send-email 2.43.0 In-Reply-To: References: MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240801_012533_103137_B8DC8B03 X-CRM114-Status: GOOD ( 15.93 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org When device DMA limit does not fit in DMA32 zone it should use DMA zone, even when DMA zone is stricter than needed. Same goes for devices that can't allocate from the entire normal zone. Limit to DMA32 in that case. Reported-by: Catalin Marinas Signed-off-by: Baruch Siach Reviewed-by: Catalin Marinas --- kernel/dma/direct.c | 6 +++--- kernel/dma/swiotlb.c | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 4480a3cd92e0..3b4be4ca3b08 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -4,7 +4,7 @@ * * DMA operations that map physical memory directly without using an IOMMU. */ -#include /* for max_pfn */ +#include #include #include #include @@ -59,9 +59,9 @@ static gfp_t dma_direct_optimal_gfp_mask(struct device *dev, u64 *phys_limit) * zones. */ *phys_limit = dma_to_phys(dev, dma_limit); - if (*phys_limit <= DMA_BIT_MASK(zone_dma_bits)) + if (*phys_limit < DMA_BIT_MASK(32)) return GFP_DMA; - if (*phys_limit <= DMA_BIT_MASK(32)) + if (*phys_limit < memblock_end_of_DRAM()) return GFP_DMA32; return 0; } diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index df68d29740a0..043b0ecd3e8d 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -629,9 +629,9 @@ static struct page *swiotlb_alloc_tlb(struct device *dev, size_t bytes, } gfp &= ~GFP_ZONEMASK; - if (phys_limit <= DMA_BIT_MASK(zone_dma_bits)) + if (phys_limit < DMA_BIT_MASK(32)) gfp |= __GFP_DMA; - else if (phys_limit <= DMA_BIT_MASK(32)) + else if (phys_limit < memblock_end_of_DRAM()) gfp |= __GFP_DMA32; while (IS_ERR(page = alloc_dma_pages(gfp, bytes, phys_limit))) { From patchwork Thu Aug 1 08:25:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Baruch Siach X-Patchwork-Id: 13749968 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 53A89C3DA4A for ; Thu, 1 Aug 2024 08:27:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=FBeyxrWmDM2LvKjVVPqDhq/qN4Q4nS0fFhPTEx+3FpA=; b=se3yIEFWy0RUGb2kH+vrlAaehn hZprG9WgsIpVy/WOS6gjasxmezHOaW04URkCP8+klw4HD4K0JNzf9PXJDeHB5VwxQpTrdtncNlSng PDSPqcv6+o9ycFZ+/ILAgJHTZCkmEKzq51+nQiXp9G50bO+iExZtj9Uf0Bt7w/NClwg+7EtOj2BSY qNiFckgikfQqn5BWD4tf7+BokIVGHLKut5c2CV3Z5Rj1sp0NlE+j1Fz5bf9UXZiV0S3EG0Eb8/WZx KjTmU+seVhC36admDxjUacBdhh4M0fxRkoEPoPrBzDsvfbaKsYmkk74ORZ4RMEzrVjXUhg7qUiFW7 didZnh7Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sZR9F-00000004Lfu-2i2K; Thu, 01 Aug 2024 08:26:57 +0000 Received: from guitar.tkos.co.il ([84.110.109.230] helo=mail.tkos.co.il) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sZR7s-00000004LGT-3hRK for linux-arm-kernel@lists.infradead.org; Thu, 01 Aug 2024 08:25:35 +0000 Received: from tarshish.tkos.co.il (unknown [10.0.8.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.tkos.co.il (Postfix) with ESMTPS id 3A250440C81; Thu, 1 Aug 2024 11:24:08 +0300 (IDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tkos.co.il; s=default; t=1722500648; bh=VYVg6dcZoMgPVC5tuSCiZfCedBjhw7Yn+Yo8u4T9MuM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ssxJC2sJd1ijT27i1CV2M9jYOwBkr6JiQYHHngELXwKyu5I9JhzfU99bTyh06TtFO Srdqp5si6rY/clGAqYHZT1D29gf6+NM+pQitik8pY4S+v6OPowHT/08oLa1wMU/XAe JtwZLyQhv1GWdlo+ekjGN2D4tiC7FbN6BZVOXHzEYeL439+1Om6k7NJ0Mh+PqKpZ0M yLHdXOMxPUx+VLOiKTDLBsYyjx/5R+TE4s+U/ZGExJ8LEd9dHI5/8HK3NENUWGjhgu quovkSb1C42G3Ihaqs0FtgNmEwfw5QKO65Q7WPPaaK0Gz95ng8M2HQEqwbJV8FSWWh 2iglGBoKd9B9Q== From: Baruch Siach To: Christoph Hellwig , Marek Szyprowski , Catalin Marinas , Will Deacon Cc: Baruch Siach , Robin Murphy , iommu@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, =?utf-8?b?UGV0ciBUZXNhxZnDrWs=?= , Ramon Fried , Elad Nachman Subject: [PATCH v4 2/2] dma: replace zone_dma_bits by zone_dma_limit Date: Thu, 1 Aug 2024 11:25:07 +0300 Message-ID: X-Mailer: git-send-email 2.43.0 In-Reply-To: References: MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240801_012533_350752_2AD030B8 X-CRM114-Status: GOOD ( 22.86 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org From: Catalin Marinas Hardware DMA limit might not be power of 2. When RAM range starts above 0, say 4GB, DMA limit of 30 bits should end at 5GB. A single high bit can not encode this limit. Use direct phys_addr_t limit address for DMA zone limit. Current code using zone_dma_bits assume that all address range below limit is suitable for DMA. For some existing platforms this assumption is not correct. DMA range might have non zero lower limit. Commit 791ab8b2e3db ("arm64: Ignore any DMA offsets in the max_zone_phys() calculation") made arm64 DMA/DMA32 zones span the entire RAM when RAM starts above 32-bits. This breaks hardware with DMA area that start above 32-bits. But the commit log says that "we haven't noticed any such hardware". It turns out that such hardware does exist. One such platform has RAM starting at 32GB with an internal bus that has the following DMA limits: #address-cells = <2>; #size-cells = <2>; dma-ranges = <0x00 0xc0000000 0x08 0x00000000 0x00 0x40000000>; That is, devices under this bus see 1GB of DMA range between 3GB-4GB in their address space. This range is mapped to CPU memory at 32GB-33GB. With current code DMA allocations for devices under this bus are not limited to DMA area, leading to run-time allocation failure. With the move to zone_dma_limit DMA allocation for constrained devices becomes possible. The result is DMA zone that properly reflects the hardware constraints as follows: [ 0.000000] Zone ranges: [ 0.000000] DMA [mem 0x0000000800000000-0x000000083fffffff] [ 0.000000] DMA32 empty [ 0.000000] Normal [mem 0x0000000840000000-0x0000000bffffffff] Signed-off-by: Catalin Marinas Co-developed-by: Baruch Siach Signed-off-by: Baruch Siach --- arch/arm64/mm/init.c | 30 +++++++++++++++--------------- arch/powerpc/mm/mem.c | 9 ++++----- arch/s390/mm/init.c | 2 +- include/linux/dma-direct.h | 2 +- kernel/dma/direct.c | 4 ++-- kernel/dma/pool.c | 5 +++-- kernel/dma/swiotlb.c | 2 +- 7 files changed, 27 insertions(+), 27 deletions(-) diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 9b5ab6818f7f..c45e2152ca9e 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -115,35 +115,35 @@ static void __init arch_reserve_crashkernel(void) } /* - * Return the maximum physical address for a zone accessible by the given bits - * limit. If DRAM starts above 32-bit, expand the zone to the maximum + * Return the maximum physical address for a zone given its limit. + * If DRAM starts above 32-bit, expand the zone to the maximum * available memory, otherwise cap it at 32-bit. */ -static phys_addr_t __init max_zone_phys(unsigned int zone_bits) +static phys_addr_t __init max_zone_phys(phys_addr_t zone_limit) { - phys_addr_t zone_mask = DMA_BIT_MASK(zone_bits); phys_addr_t phys_start = memblock_start_of_DRAM(); if (phys_start > U32_MAX) - zone_mask = PHYS_ADDR_MAX; - else if (phys_start > zone_mask) - zone_mask = U32_MAX; + zone_limit = PHYS_ADDR_MAX; + else if (phys_start > zone_limit) + zone_limit = U32_MAX; - return min(zone_mask, memblock_end_of_DRAM() - 1) + 1; + return min(zone_limit, memblock_end_of_DRAM() - 1) + 1; } static void __init zone_sizes_init(void) { unsigned long max_zone_pfns[MAX_NR_ZONES] = {0}; - unsigned int __maybe_unused acpi_zone_dma_bits; - unsigned int __maybe_unused dt_zone_dma_bits; - phys_addr_t __maybe_unused dma32_phys_limit = max_zone_phys(32); + phys_addr_t __maybe_unused acpi_zone_dma_limit; + phys_addr_t __maybe_unused dt_zone_dma_limit; + phys_addr_t __maybe_unused dma32_phys_limit = + max_zone_phys(DMA_BIT_MASK(32)); #ifdef CONFIG_ZONE_DMA - acpi_zone_dma_bits = fls64(acpi_iort_dma_get_max_cpu_address()); - dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL)); - zone_dma_bits = min3(32U, dt_zone_dma_bits, acpi_zone_dma_bits); - arm64_dma_phys_limit = max_zone_phys(zone_dma_bits); + acpi_zone_dma_limit = acpi_iort_dma_get_max_cpu_address(); + dt_zone_dma_limit = of_dma_get_max_cpu_address(NULL); + zone_dma_limit = min(dt_zone_dma_limit, acpi_zone_dma_limit); + arm64_dma_phys_limit = max_zone_phys(zone_dma_limit); max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit); #endif #ifdef CONFIG_ZONE_DMA32 diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index d325217ab201..342c006cc1b8 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -216,7 +216,7 @@ static int __init mark_nonram_nosave(void) * everything else. GFP_DMA32 page allocations automatically fall back to * ZONE_DMA. * - * By using 31-bit unconditionally, we can exploit zone_dma_bits to inform the + * By using 31-bit unconditionally, we can exploit zone_dma_limit to inform the * generic DMA mapping code. 32-bit only devices (if not handled by an IOMMU * anyway) will take a first dip into ZONE_NORMAL and get otherwise served by * ZONE_DMA. @@ -252,13 +252,12 @@ void __init paging_init(void) * powerbooks. */ if (IS_ENABLED(CONFIG_PPC32)) - zone_dma_bits = 30; + zone_dma_limit = DMA_BIT_MASK(30); else - zone_dma_bits = 31; + zone_dma_limit = DMA_BIT_MASK(31); #ifdef CONFIG_ZONE_DMA - max_zone_pfns[ZONE_DMA] = min(max_low_pfn, - 1UL << (zone_dma_bits - PAGE_SHIFT)); + max_zone_pfns[ZONE_DMA] = min(max_low_pfn, zone_dma_limit >> PAGE_SHIFT); #endif max_zone_pfns[ZONE_NORMAL] = max_low_pfn; #ifdef CONFIG_HIGHMEM diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index ddcd39ef4346..91fc2b91adfc 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -97,7 +97,7 @@ void __init paging_init(void) vmem_map_init(); sparse_init(); - zone_dma_bits = 31; + zone_dma_limit = DMA_BIT_MASK(31); memset(max_zone_pfns, 0, sizeof(max_zone_pfns)); max_zone_pfns[ZONE_DMA] = virt_to_pfn(MAX_DMA_ADDRESS); max_zone_pfns[ZONE_NORMAL] = max_low_pfn; diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h index edbe13d00776..98b7d8015043 100644 --- a/include/linux/dma-direct.h +++ b/include/linux/dma-direct.h @@ -12,7 +12,7 @@ #include #include -extern unsigned int zone_dma_bits; +extern phys_addr_t zone_dma_limit; /* * Record the mapping of CPU physical to DMA addresses for a given region. diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 3b4be4ca3b08..3dbc0b89d6fb 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -20,7 +20,7 @@ * it for entirely different regions. In that case the arch code needs to * override the variable below for dma-direct to work properly. */ -unsigned int zone_dma_bits __ro_after_init = 24; +phys_addr_t zone_dma_limit __ro_after_init = DMA_BIT_MASK(24); static inline dma_addr_t phys_to_dma_direct(struct device *dev, phys_addr_t phys) @@ -580,7 +580,7 @@ int dma_direct_supported(struct device *dev, u64 mask) * part of the check. */ if (IS_ENABLED(CONFIG_ZONE_DMA)) - min_mask = min_t(u64, min_mask, DMA_BIT_MASK(zone_dma_bits)); + min_mask = min_t(u64, min_mask, zone_dma_limit); return mask >= phys_to_dma_unencrypted(dev, min_mask); } diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c index d10613eb0f63..a6e15db9d1e7 100644 --- a/kernel/dma/pool.c +++ b/kernel/dma/pool.c @@ -70,9 +70,10 @@ static bool cma_in_zone(gfp_t gfp) /* CMA can't cross zone boundaries, see cma_activate_area() */ end = cma_get_base(cma) + size - 1; if (IS_ENABLED(CONFIG_ZONE_DMA) && (gfp & GFP_DMA)) - return end <= DMA_BIT_MASK(zone_dma_bits); + return end <= zone_dma_limit; + /* Account for possible zone_dma_limit > DMA_BIT_MASK(32) */ if (IS_ENABLED(CONFIG_ZONE_DMA32) && (gfp & GFP_DMA32)) - return end <= DMA_BIT_MASK(32); + return end <= DMA_BIT_MASK(32) || end <= zone_dma_limit; return true; } diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 043b0ecd3e8d..53595eb41922 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -450,7 +450,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, if (!remap) io_tlb_default_mem.can_grow = true; if (IS_ENABLED(CONFIG_ZONE_DMA) && (gfp_mask & __GFP_DMA)) - io_tlb_default_mem.phys_limit = DMA_BIT_MASK(zone_dma_bits); + io_tlb_default_mem.phys_limit = zone_dma_limit; else if (IS_ENABLED(CONFIG_ZONE_DMA32) && (gfp_mask & __GFP_DMA32)) io_tlb_default_mem.phys_limit = DMA_BIT_MASK(32); else