From patchwork Mon Mar 27 12:12:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189148 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6102DC77B6D for ; Mon, 27 Mar 2023 12:14:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232718AbjC0MOT (ORCPT ); Mon, 27 Mar 2023 08:14:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33128 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232011AbjC0MOL (ORCPT ); Mon, 27 Mar 2023 08:14:11 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 459EF3A90; Mon, 27 Mar 2023 05:14:09 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id EA050B81151; Mon, 27 Mar 2023 12:14:07 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6F17DC433A1; Mon, 27 Mar 2023 12:13:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919246; bh=iOskfAliyJ2EhiqigbZrQ+9sb9TAldUUyoYPaRwEDOw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=W6MicPQL1BPAw1q2u16+yvNoyGVAcN9kWWP2q0yJFA8WaLlF3xLFP4qqYt27HzGvZ mYxFUqLxZu9g0vfRk0OkwpqBme7YFBXsTkGRImpamVFKboMg1ei5YSkU3H42SVCWJG H4nHFGqZZLZUxa/aPn/txUIU1mCwKXrPAyD3x4VQDrGJVJD55nysVZn54+BS+ffi/N ZJubl/TUz3/2ub3bBwZP3NXIhkqVNUL1hsEZh4C64w43FHfNUjNUNIpLrpsry8bakr G1L0PvdkaTbdJCAriKbg/QJs9DMAkZ6JBhr2hpVO04DL/gLRI07Z7LrzaJ0P3TgcaW D3F+FJCQte2dA== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd 
Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 01/21] openrisc: dma-mapping: flush bidirectional mappings Date: Mon, 27 Mar 2023 14:12:57 +0200 Message-Id: <20230327121317.4081816-2-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann The cache management operations on DMA are different from the other architectures: - on DMA_TO_DEVICE, Openrisc currently invalidates the cache after the writeback, where a simple writeback without invalidation should be sufficient. - on DMA_BIDIRECTIONAL, Openrisc does nothing, while most architectures either flush before DMA, or writeback before and invalidate after DMA. The separate invalidation for DMA_BIDIRECTIONAL/DMA_FROM_DEVICE is only required on CPUs that can do speculative prefetches. Change both to have the normal set of operations. 
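For illustration only, the per-direction choice described above can be sketched in plain C. The enum and function names are hypothetical stand-ins, not the kernel's definitions; the sketch only records which operation runs before the device accesses the buffer.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration, not the kernel's own types. */
enum sketch_dir { SK_TO_DEVICE, SK_FROM_DEVICE, SK_BIDIRECTIONAL };
enum sketch_op { SK_WRITEBACK, SK_INVALIDATE, SK_FLUSH };

/* Cache operation to run before the device touches the buffer. */
static enum sketch_op sketch_op_for_device(enum sketch_dir dir)
{
	switch (dir) {
	case SK_TO_DEVICE:
		/* Push dirty lines to memory; they may stay cached. */
		return SK_WRITEBACK;
	case SK_FROM_DEVICE:
		/* Drop cached lines so later reads fetch the device's data. */
		return SK_INVALIDATE;
	case SK_BIDIRECTIONAL:
		/* Write back and invalidate in a single pass. */
		return SK_FLUSH;
	}
	assert(!"invalid DMA direction");
	return SK_FLUSH;
}
```

On a CPU without speculative prefetch, no further invalidation should be needed once the DMA has completed, because nothing can pull the lines back into the cache in the meantime.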
Signed-off-by: Arnd Bergmann --- arch/openrisc/kernel/dma.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c index b3edbb33b621..91a00d09ffad 100644 --- a/arch/openrisc/kernel/dma.c +++ b/arch/openrisc/kernel/dma.c @@ -103,10 +103,10 @@ void arch_sync_dma_for_device(phys_addr_t addr, size_t size, switch (dir) { case DMA_TO_DEVICE: - /* Flush the dcache for the requested range */ + /* Write back the dcache for the requested range */ for (cl = addr; cl < addr + size; cl += cpuinfo->dcache_block_size) - mtspr(SPR_DCBFR, cl); + mtspr(SPR_DCBWR, cl); break; case DMA_FROM_DEVICE: /* Invalidate the dcache for the requested range */ @@ -114,12 +114,13 @@ void arch_sync_dma_for_device(phys_addr_t addr, size_t size, cl += cpuinfo->dcache_block_size) mtspr(SPR_DCBIR, cl); break; + case DMA_BIDIRECTIONAL: + /* Flush the dcache for the requested range */ + for (cl = addr; cl < addr + size; + cl += cpuinfo->dcache_block_size) + mtspr(SPR_DCBFR, cl); + break; default: - /* - * NOTE: If dir == DMA_BIDIRECTIONAL then there's no need to - * flush nor invalidate the cache here as the area will need - * to be manually synced anyway. 
- */ break; } } From patchwork Mon Mar 27 12:12:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189149 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 90C65C77B62 for ; Mon, 27 Mar 2023 12:14:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232750AbjC0MOb (ORCPT ); Mon, 27 Mar 2023 08:14:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33438 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232715AbjC0MOS (ORCPT ); Mon, 27 Mar 2023 08:14:18 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4D9F040C6; Mon, 27 Mar 2023 05:14:16 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E9F58611EA; Mon, 27 Mar 2023 12:14:15 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 22CE2C433A0; Mon, 27 Mar 2023 12:14:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919255; bh=5/cbd11rgdZqe98FqIk/+syngKp9Hy8AwsMD312+wVU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=fI64KM8c/Ekdiv2ZKC9IycZF7Z2nU1AzPzgqokBHDE9VzldOQyySG0AudPk5QPofj KAKyhzuHamSmO3zl8PCYGHrnv/203MoC5ecW0OjEqVJXXqlBlkvMLUmkWssApHxxv2 Nle0M09VxMvwHkFtDoEoZZpkFdiqCr9l1gU8/Flrgm0ruvonqEUiwE55/I56ExeNoX WEmup8vMT0HfEzIuq/Zd5NckCFP+2d9Yrt6tk6M/3Yd774PQBapWJuZm2ObMbJ+u0A XjUj222I9wl7wwTdyN+au9fLk6dq48OfUVC/8hF0sIbpcUutGH1P8vbg+6qI4iIj1V LMbvRTLfiHWVg== From: Arnd Bergmann To: 
linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 02/21] xtensa: dma-mapping: use normal cache invalidation rules Date: Mon, 27 Mar 2023 14:12:58 +0200 Message-Id: <20230327121317.4081816-3-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann xtensa is one of the platforms that has both write-back and write-through caches, and needs to account for both in its DMA mapping operations. It does this through a set of operations that is different from any other architecture. This is not a problem by itself, but it makes it rather hard to figure out whether this is correct or not, and to unify this implementation with the others. Change the semantics to the usual ones for non-speculating CPUs: - On DMA_TO_DEVICE, call __flush_dcache_range() to perform the writeback even on writethrough caches, where this is a nop. - On DMA_FROM_DEVICE, invalidate the mapping before the DMA rather than afterwards. 
- On DMA_BIDIRECTIONAL, combine the pre-writeback with the post-invalidate into a call to __flush_invalidate_dcache_range() that turns into a simple invalidate on writethrough caches. Signed-off-by: Arnd Bergmann Reviewed-by: Max Filippov --- arch/xtensa/Kconfig | 1 - arch/xtensa/include/asm/cacheflush.h | 6 +++--- arch/xtensa/kernel/pci-dma.c | 29 +++++----------------------- 3 files changed, 8 insertions(+), 28 deletions(-) diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig index bcb0c5d2abc2..b938bacbb9af 100644 --- a/arch/xtensa/Kconfig +++ b/arch/xtensa/Kconfig @@ -8,7 +8,6 @@ config XTENSA select ARCH_HAS_DMA_PREP_COHERENT if MMU select ARCH_HAS_GCOV_PROFILE_ALL select ARCH_HAS_KCOV - select ARCH_HAS_SYNC_DMA_FOR_CPU if MMU select ARCH_HAS_SYNC_DMA_FOR_DEVICE if MMU select ARCH_HAS_DMA_SET_UNCACHED if MMU select ARCH_HAS_STRNCPY_FROM_USER if !KASAN diff --git a/arch/xtensa/include/asm/cacheflush.h b/arch/xtensa/include/asm/cacheflush.h index 7b4359312c25..2f645d25565a 100644 --- a/arch/xtensa/include/asm/cacheflush.h +++ b/arch/xtensa/include/asm/cacheflush.h @@ -61,9 +61,9 @@ static inline void __flush_dcache_page(unsigned long va) static inline void __flush_dcache_range(unsigned long va, unsigned long sz) { } -# define __flush_invalidate_dcache_all() __invalidate_dcache_all() -# define __flush_invalidate_dcache_page(p) __invalidate_dcache_page(p) -# define __flush_invalidate_dcache_range(p,s) __invalidate_dcache_range(p,s) +# define __flush_invalidate_dcache_all __invalidate_dcache_all +# define __flush_invalidate_dcache_page __invalidate_dcache_page +# define __flush_invalidate_dcache_range __invalidate_dcache_range #endif #if defined(CONFIG_MMU) && (DCACHE_WAY_SIZE > PAGE_SIZE) diff --git a/arch/xtensa/kernel/pci-dma.c b/arch/xtensa/kernel/pci-dma.c index 94955caa4488..ff3bf015eca4 100644 --- a/arch/xtensa/kernel/pci-dma.c +++ b/arch/xtensa/kernel/pci-dma.c @@ -43,38 +43,19 @@ static void do_cache_op(phys_addr_t paddr, size_t size, } } -void 
arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, +void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, enum dma_data_direction dir) { switch (dir) { - case DMA_BIDIRECTIONAL: + case DMA_TO_DEVICE: + do_cache_op(paddr, size, __flush_dcache_range); + break; case DMA_FROM_DEVICE: do_cache_op(paddr, size, __invalidate_dcache_range); break; - - case DMA_NONE: - BUG(); - break; - - default: - break; - } -} - -void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) -{ - switch (dir) { case DMA_BIDIRECTIONAL: - case DMA_TO_DEVICE: - if (XCHAL_DCACHE_IS_WRITEBACK) - do_cache_op(paddr, size, __flush_dcache_range); + do_cache_op(paddr, size, __flush_invalidate_dcache_range); break; - - case DMA_NONE: - BUG(); - break; - default: break; } From patchwork Mon Mar 27 12:12:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189150 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC904C77B62 for ; Mon, 27 Mar 2023 12:14:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232770AbjC0MOq (ORCPT ); Mon, 27 Mar 2023 08:14:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33446 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232739AbjC0MOb (ORCPT ); Mon, 27 Mar 2023 08:14:31 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 97AFF3C2A; Mon, 27 Mar 2023 05:14:26 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) 
with ESMTPS id 4B661B8115E; Mon, 27 Mar 2023 12:14:25 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D461BC433A4; Mon, 27 Mar 2023 12:14:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919264; bh=/zMPyAlfGd5fpJsITtg1RxPyUHI2dqVyx+YefpHkWXQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=d1KUM73yCUytmr56IfxXPD1HZ5UAmAKM+0CBBY3u49w6bNCyrB2sEPtREvafbUkjb CFGZfI0FCfmknT0h+L+eXyaIpkkJwi4v3Z4EgaKViz+/z9QPPlnu3v9F3flYQ26Q7U q3EOb110zv0M3qi2mMyrlqEWXlVpt/VQQyKgll7QY7/XsuTxEM/ZKlPsmMjjXqmTMk uGUPoZTe4WHm4gicfHvkF72XI2puokOkW3TeQ/sk06+15hCStezZtavWZrwr2jQs/G iy1gH+XnYUCVyfsVu1+ZP4/ApT59WEmDcash0EKA5e1KL8ASTdihabMg2li4cEac/w 6iZEB7B78CFDw== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 03/21] sparc32: flush caches in dma_sync_*for_device Date: Mon, 27 Mar 2023 14:12:59 +0200 Message-Id: <20230327121317.4081816-4-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann Leon has a very minimalistic cache that has no range operations and requires being flushed entirely to deal with noncoherent DMA. Most in-order architectures do their cache management in the dma_sync_*for_device() operations rather than dma_sync_*for_cpu. Since the cache is write-through only, both should have the same effect, so change it for consistency with the other architectures. 
Signed-off-by: Arnd Bergmann --- arch/sparc/Kconfig | 2 +- arch/sparc/kernel/ioport.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index 84437a4c6545..637da50e236c 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -51,7 +51,7 @@ config SPARC config SPARC32 def_bool !64BIT select ARCH_32BIT_OFF_T - select ARCH_HAS_SYNC_DMA_FOR_CPU + select ARCH_HAS_SYNC_DMA_FOR_DEVICE select CLZ_TAB select DMA_DIRECT_REMAP select GENERIC_ATOMIC64 diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c index 4e4f3d3263e4..4f3d26066ec2 100644 --- a/arch/sparc/kernel/ioport.c +++ b/arch/sparc/kernel/ioport.c @@ -306,7 +306,7 @@ arch_initcall(sparc_register_ioport); * On LEON systems without cache snooping, the entire D-CACHE must be flushed to * make DMA to cacheable memory coherent. */ -void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, +void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, enum dma_data_direction dir) { if (dir != DMA_TO_DEVICE && From patchwork Mon Mar 27 12:13:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189151 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9CBCEC77B62 for ; Mon, 27 Mar 2023 12:14:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232797AbjC0MO5 (ORCPT ); Mon, 27 Mar 2023 08:14:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33328 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232726AbjC0MOm (ORCPT ); Mon, 27 Mar 2023 08:14:42 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) 
with ESMTPS id 56D3649E0; Mon, 27 Mar 2023 05:14:35 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id DE820B8115E; Mon, 27 Mar 2023 12:14:33 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7B9C0C433EF; Mon, 27 Mar 2023 12:14:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919272; bh=ZhYG8JNUwjljPv9OKiF3eMGP7VJHxmj1d7o6guSf4No=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PBdwMSfV6WhGkWO4GD6yfcHOXcUpefrDC/+txqBn1K1TXjh6vtxTQLHjgNJ0FEkt0 8kEYArEe0oN6TTz55A/ouqhYMm5G10akCJmXzW/muCfT8GBSGN5Uni+3VEARUH4Auc H3B65ricYeSXsGj53UrQTCcr95NZtYK/45G5qbsvcFbKhYxNglRYB3Jv/RHrFzHtks DVdr6HivllSdF2l5dgd1hATA9JZvKqwXHc1koDjA6e60zts4nydXjlN/RqoC6otSSa Tq0Qb2MNzDbHG7RwE+Edr/TKARawLeFdDNo+IyVrlawBJzvWe5xP0tuqDo6AiiQKTk Trtj2/d6VSosQ== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 04/21] microblaze: dma-mapping: skip extra DMA flushes Date: Mon, 27 Mar 2023 14:13:00 +0200 Message-Id: <20230327121317.4081816-5-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann The microblaze dma_sync_* implementation uses the same function for both _for_cpu() and _for_device(), which is inconsistent with other architectures and slightly more expensive. Split it up into separate functions and skip the parts that are not needed: - on dma_sync_*_for_cpu(..., DMA_TO_DEVICE), skip the second writeback, which does nothing. - on dma_sync_*_for_cpu(..., DMA_BIDIRECTIONAL), only invalidate the cache to clear out cache lines that got loaded speculatively, but skip the extraneous writeback. 
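As a rough sketch of the CPU-side decision after the split, assuming hypothetical names that are not part of the microblaze code, the only remaining question is whether the range has to be invalidated once the device is done:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the DMA direction, for illustration only. */
enum demo_dir { DEMO_TO_DEVICE, DEMO_FROM_DEVICE, DEMO_BIDIRECTIONAL };

/* Returns true if the CPU-side sync has to invalidate the range. */
static bool demo_cpu_sync_invalidates(enum demo_dir dir)
{
	switch (dir) {
	case DEMO_TO_DEVICE:
		/* The device only read memory; a second writeback does nothing. */
		return false;
	case DEMO_FROM_DEVICE:
	case DEMO_BIDIRECTIONAL:
		/* Discard lines that may have been loaded during the DMA. */
		return true;
	}
	assert(!"invalid DMA direction");
	return false;
}
```

No writeback appears on this side at all: dirty data was already pushed out before the transfer started, so repeating it afterwards would only cost time.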
Signed-off-by: Arnd Bergmann --- arch/microblaze/kernel/dma.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c index 04d091ade417..b4c4e45fd45e 100644 --- a/arch/microblaze/kernel/dma.c +++ b/arch/microblaze/kernel/dma.c @@ -14,8 +14,8 @@ #include #include -static void __dma_sync(phys_addr_t paddr, size_t size, - enum dma_data_direction direction) +void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, + enum dma_data_direction dir) { switch (direction) { case DMA_TO_DEVICE: @@ -30,14 +30,16 @@ static void __dma_sync(phys_addr_t paddr, size_t size, } } -void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) -{ - __dma_sync(paddr, size, dir); -} - void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, enum dma_data_direction dir) { - __dma_sync(paddr, size, dir); -} + switch (direction) { + case DMA_TO_DEVICE: + break; + case DMA_BIDIRECTIONAL: + case DMA_FROM_DEVICE: + invalidate_dcache_range(paddr, paddr + size); + break; + default: + BUG(); + }} From patchwork Mon Mar 27 12:13:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189152 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D8FFC761A6 for ; Mon, 27 Mar 2023 12:15:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232842AbjC0MPK (ORCPT ); Mon, 27 Mar 2023 08:15:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33390 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232508AbjC0MOu (ORCPT ); Mon, 27 Mar 2023 08:14:50 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org 
[IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1112A3AB0; Mon, 27 Mar 2023 05:14:44 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 99A8CB81151; Mon, 27 Mar 2023 12:14:42 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1C784C433A4; Mon, 27 Mar 2023 12:14:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919281; bh=hpHzcJ3T0vShL1lzlkbvi7pqgHNybQdWukxJR0vOP3E=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=mfm6R/WQZMh61DGU+9uuCWkv0/tyJ6gJGGUPX4zG+Mk/WG5WkNGo0dAhhcwKvkeNZ Z7uItq3PdOnRh7TPukjXUvNYAGRmmdeX8JS3mT3qo9kJY6F3XOZKksXOchdcM4Ox8m YIgytC8vdHgG83EpJ1D/64cYqPwsTsUhPZ6rhpeC62QxfG4lJkjHaP4sM0VmU9aHOb tvmNcya7wRhmDLi1iqOu44WhuiEB0TW0zubz6bYucmFVCHU+7L/S2wlSb9kE6kLicK 5hCctagI4UhCeLS7HyFjmcfLaXiJY5HRdWEL6v0y+dDF92dndyAz6eMDrhb79Jejef K08wGLvDTfBVg== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 05/21] powerpc: dma-mapping: split out cache operation logic Date: Mon, 27 Mar 2023 14:13:01 +0200 Message-Id: <20230327121317.4081816-6-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann The powerpc arch_sync_dma_for_device()/arch_sync_dma_for_cpu() functions behave differently from all other architectures, at least for some of the operations. As a preparation for making the behavior more consistent, reorder the logic in which they decide whether to flush, invalidate or clean the cache. No change in behavior is intended. Signed-off-by: Arnd Bergmann --- arch/powerpc/mm/dma-noncoherent.c | 91 +++++++++++++++++++++---------- 1 file changed, 63 insertions(+), 28 deletions(-) diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c index 30260b5d146d..f10869d27de5 100644 --- a/arch/powerpc/mm/dma-noncoherent.c +++ b/arch/powerpc/mm/dma-noncoherent.c @@ -16,31 +16,28 @@ #include #include +enum dma_cache_op { + DMA_CACHE_CLEAN, + DMA_CACHE_INVAL, + DMA_CACHE_FLUSH, +}; + /* * make an area consistent. 
*/ -static void __dma_sync(void *vaddr, size_t size, int direction) +static void __dma_op(void *vaddr, size_t size, enum dma_cache_op op) { unsigned long start = (unsigned long)vaddr; unsigned long end = start + size; - switch (direction) { - case DMA_NONE: - BUG(); - case DMA_FROM_DEVICE: - /* - * invalidate only when cache-line aligned otherwise there is - * the potential for discarding uncommitted data from the cache - */ - if ((start | end) & (L1_CACHE_BYTES - 1)) - flush_dcache_range(start, end); - else - invalidate_dcache_range(start, end); - break; - case DMA_TO_DEVICE: /* writeback only */ + switch (op) { + case DMA_CACHE_CLEAN: clean_dcache_range(start, end); break; - case DMA_BIDIRECTIONAL: /* writeback and invalidate */ + case DMA_CACHE_INVAL: + invalidate_dcache_range(start, end); + break; + case DMA_CACHE_FLUSH: flush_dcache_range(start, end); break; } @@ -48,16 +45,16 @@ static void __dma_sync(void *vaddr, size_t size, int direction) #ifdef CONFIG_HIGHMEM /* - * __dma_sync_page() implementation for systems using highmem. + * __dma_highmem_op() implementation for systems using highmem. * In this case, each page of a buffer must be kmapped/kunmapped - * in order to have a virtual address for __dma_sync(). This must + * in order to have a virtual address for __dma_op(). This must * not sleep so kmap_atomic()/kunmap_atomic() are used. * * Note: yes, it is possible and correct to have a buffer extend * beyond the first page. 
*/ -static inline void __dma_sync_page_highmem(struct page *page, - unsigned long offset, size_t size, int direction) +static inline void __dma_highmem_op(struct page *page, + unsigned long offset, size_t size, enum dma_cache_op op) { size_t seg_size = min((size_t)(PAGE_SIZE - offset), size); size_t cur_size = seg_size; @@ -71,7 +68,7 @@ static inline void __dma_sync_page_highmem(struct page *page, start = (unsigned long)kmap_atomic(page + seg_nr) + seg_offset; /* Sync this buffer segment */ - __dma_sync((void *)start, seg_size, direction); + __dma_op((void *)start, seg_size, op); kunmap_atomic((void *)start); seg_nr++; @@ -88,32 +85,70 @@ static inline void __dma_sync_page_highmem(struct page *page, #endif /* CONFIG_HIGHMEM */ /* - * __dma_sync_page makes memory consistent. identical to __dma_sync, but - * takes a struct page instead of a virtual address + * __dma_phys_op makes memory consistent. identical to __dma_op, but + * takes a phys_addr_t instead of a virtual address */ -static void __dma_sync_page(phys_addr_t paddr, size_t size, int dir) +static void __dma_phys_op(phys_addr_t paddr, size_t size, enum dma_cache_op op) { struct page *page = pfn_to_page(paddr >> PAGE_SHIFT); unsigned offset = paddr & ~PAGE_MASK; #ifdef CONFIG_HIGHMEM - __dma_sync_page_highmem(page, offset, size, dir); + __dma_highmem_op(page, offset, size, op); #else unsigned long start = (unsigned long)page_address(page) + offset; - __dma_sync((void *)start, size, dir); + __dma_op((void *)start, size, op); #endif } void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, enum dma_data_direction dir) { - __dma_sync_page(paddr, size, dir); + switch (direction) { + case DMA_NONE: + BUG(); + case DMA_FROM_DEVICE: + /* + * invalidate only when cache-line aligned otherwise there is + * the potential for discarding uncommitted data from the cache + */ + if ((start | end) & (L1_CACHE_BYTES - 1)) + __dma_phys_op(start, end, DMA_CACHE_FLUSH); + else + __dma_phys_op(start, end, DMA_CACHE_INVAL); 
+ break; + case DMA_TO_DEVICE: /* writeback only */ + __dma_phys_op(start, end, DMA_CACHE_CLEAN); + break; + case DMA_BIDIRECTIONAL: /* writeback and invalidate */ + __dma_phys_op(start, end, DMA_CACHE_FLUSH); + break; + } } void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, enum dma_data_direction dir) { - __dma_sync_page(paddr, size, dir); + switch (direction) { + case DMA_NONE: + BUG(); + case DMA_FROM_DEVICE: + /* + * invalidate only when cache-line aligned otherwise there is + * the potential for discarding uncommitted data from the cache + */ + if ((start | end) & (L1_CACHE_BYTES - 1)) + __dma_phys_op(start, end, DMA_CACHE_FLUSH); + else + __dma_phys_op(start, end, DMA_CACHE_INVAL); + break; + case DMA_TO_DEVICE: /* writeback only */ + __dma_phys_op(start, end, DMA_CACHE_CLEAN); + break; + case DMA_BIDIRECTIONAL: /* writeback and invalidate */ + __dma_phys_op(start, end, DMA_CACHE_FLUSH); + break; + } } void arch_dma_prep_coherent(struct page *page, size_t size) From patchwork Mon Mar 27 12:13:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189153 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 74002C77B6D for ; Mon, 27 Mar 2023 12:15:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232209AbjC0MP3 (ORCPT ); Mon, 27 Mar 2023 08:15:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35036 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232766AbjC0MPC (ORCPT ); Mon, 27 Mar 2023 08:15:02 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BAAB23C33; Mon, 27 Mar 2023 05:14:52 -0700 (PDT) 
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 2FCD1B81151; Mon, 27 Mar 2023 12:14:51 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B8E9EC433EF; Mon, 27 Mar 2023 12:14:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919289; bh=deueI+j9QpGrVEyOG4DDYuCvj98ECBO1BAKmt0h8mOk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XVuAqsjP7YxlTH/jfP4+hsuYEQziw3g24pme7hHONNL+tw53tUh634JeGsQ9qMsJb QiINm6FdyBSv5TDyEB7Icm81k8m52mJn/Oxz3tHbi/ytF+O96JNQpZ/Drh8XHi5GkJ DmnZaeE1/m3voIm0tXrGCkwwIwVqBzNYdHR+o9A0oJjYUFrIsmjSaMN81JqJ9bMdUz ALnkbRdTQARnjKcrPQsU1ISirc/GQsLgUD4/A9nuA1lqc+8ADpy1N7lxpgLiuI99q/ wavZSpV769B5sIjhDLnf+Uwl5ZpVrfQbbb6B1+ZErklaPLQABWTLh2J99jHRm4BWCH eD4ekbguj3pOg== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 06/21] powerpc: dma-mapping: minimize for_cpu flushing Date: Mon, 27 Mar 2023 14:13:02 +0200 Message-Id: <20230327121317.4081816-7-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann The powerpc dma_sync_*_for_cpu() variants do more flushes than on other architectures. Reduce them to what everyone else does: - No flush is needed after data has been sent to a device - When data has been received from a device, the cache only needs to be invalidated to clear out cache lines that were speculatively prefetched. In particular, the second flushing of partial cache lines of bidirectional buffers is actively harmful -- if a single cache line is written by both the CPU and the device, flushing it again does not maintain coherency but instead overwrites the data that was just received from the device. 
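To illustrate the last point, here is a minimal userspace sketch, not kernel code. It models a single cache line of a write-back cache; the toy_* names and the 8-byte line size are made up for this example. Under those assumptions it shows a flush (writeback plus invalidate) after the DMA overwriting what the device wrote, while a plain invalidate leaves the device data visible to the CPU.

```c
#include <assert.h>
#include <string.h>

#define TOY_LINE_BYTES 8	/* hypothetical line size, for illustration only */

/* One cache line plus the memory behind it; all names are made up. */
struct toy_line {
	unsigned char mem[TOY_LINE_BYTES];	/* what the device reads and writes */
	unsigned char cache[TOY_LINE_BYTES];	/* the CPU's cached copy */
	int valid;
	int dirty;
};

/* Load the line from memory on a miss. */
static void toy_fill(struct toy_line *l)
{
	if (!l->valid) {
		memcpy(l->cache, l->mem, TOY_LINE_BYTES);
		l->valid = 1;
		l->dirty = 0;
	}
}

static void toy_cpu_write(struct toy_line *l, int off, unsigned char val)
{
	assert(off >= 0 && off < TOY_LINE_BYTES);
	toy_fill(l);
	l->cache[off] = val;
	l->dirty = 1;
}

static unsigned char toy_cpu_read(struct toy_line *l, int off)
{
	assert(off >= 0 && off < TOY_LINE_BYTES);
	toy_fill(l);
	return l->cache[off];
}

/* DMA from the device bypasses the cache and lands in memory. */
static void toy_dev_write(struct toy_line *l, int off, unsigned char val)
{
	assert(off >= 0 && off < TOY_LINE_BYTES);
	l->mem[off] = val;
}

/* Invalidate: drop the cached copy without writing it back. */
static void toy_inval(struct toy_line *l)
{
	l->valid = 0;
	l->dirty = 0;
}

/* Flush: write a dirty line back to memory, then invalidate it. */
static void toy_flush(struct toy_line *l)
{
	if (l->valid && l->dirty)
		memcpy(l->mem, l->cache, TOY_LINE_BYTES);
	toy_inval(l);
}

/*
 * The CPU dirties byte 0, the device writes 0xab to byte 4 of the same
 * line, then the "for_cpu" step runs and the CPU reads byte 4 back.
 */
static unsigned char toy_read_after_dma(int flush_for_cpu)
{
	struct toy_line l;

	memset(&l, 0, sizeof(l));
	toy_cpu_write(&l, 0, 0x11);
	toy_dev_write(&l, 4, 0xab);
	if (flush_for_cpu)
		toy_flush(&l);
	else
		toy_inval(&l);
	return toy_cpu_read(&l, 4);
}
```

In this model toy_read_after_dma(1) returns 0 because the stale cached line was written back over the device data, and toy_read_after_dma(0) returns 0xab. The CPU's own write to byte 0 is lost in the invalidate case, but a driver that shares a cache line with the device like this is broken either way.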
Signed-off-by: Arnd Bergmann --- arch/powerpc/mm/dma-noncoherent.c | 18 ++++-------------- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c index f10869d27de5..e108cacf877f 100644 --- a/arch/powerpc/mm/dma-noncoherent.c +++ b/arch/powerpc/mm/dma-noncoherent.c @@ -132,21 +132,11 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, switch (direction) { case DMA_NONE: BUG(); - case DMA_FROM_DEVICE: - /* - * invalidate only when cache-line aligned otherwise there is - * the potential for discarding uncommitted data from the cache - */ - if ((start | end) & (L1_CACHE_BYTES - 1)) - __dma_phys_op(start, end, DMA_CACHE_FLUSH); - else - __dma_phys_op(start, end, DMA_CACHE_INVAL); - break; - case DMA_TO_DEVICE: /* writeback only */ - __dma_phys_op(start, end, DMA_CACHE_CLEAN); + case DMA_TO_DEVICE: break; - case DMA_BIDIRECTIONAL: /* writeback and invalidate */ - __dma_phys_op(start, end, DMA_CACHE_FLUSH); + case DMA_FROM_DEVICE: + case DMA_BIDIRECTIONAL: + __dma_phys_op(start, end, DMA_CACHE_INVAL); break; } } From patchwork Mon Mar 27 12:13:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189154 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C13E4C77B61 for ; Mon, 27 Mar 2023 12:15:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232873AbjC0MPe (ORCPT ); Mon, 27 Mar 2023 08:15:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34614 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232827AbjC0MPI (ORCPT ); Mon, 27 Mar 2023 08:15:08 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org 
[139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE77A3A84; Mon, 27 Mar 2023 05:14:59 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 26051611B6; Mon, 27 Mar 2023 12:14:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 660FDC4331E; Mon, 27 Mar 2023 12:14:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919298; bh=Y+wx0FiQgLjkqhPKxtczgrowoCTrPuzhrrVdIO/KSAA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=CPGniWUcV5VUAZiOWzIY5/EsUqYQgyEzxjQ8UMeF3Al20/Hh61j9pc8T9PS2mzH4N RSYd/J7YHPH5Nmfk7XsUSK79nWPo+dGlRYvhxWqnqg+N+CIjbuT/Wb2iRTjyp1l0YE q3AT+X8nNFF6xSpGl06sKKOZy0e/LCKNsDpUgC07vuwopI9ykgcvXa2bBFGYOjVaGl X5PV9Eydwk0GTrNzStCQUtT6ovd3xxwDRfs6NRHuZY4OjKijI2Z6KdM95ztBnGcDwK 5HPw8DEJ4VDkMTWgdZ8nt+6QlnfKBn6oiVjPRF9H4yYTieJdfpVbSHx8IkThq5gzbA lsa0aYW1Hchqw== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 07/21] powerpc: dma-mapping: always clean cache in _for_device() op Date: Mon, 27 Mar 2023 14:13:03 +0200 Message-Id: <20230327121317.4081816-8-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann The powerpc implementation of arch_sync_dma_for_device() is unique in that it sometimes performs a full flush for the arch_sync_dma_for_device(paddr, size, DMA_FROM_DEVICE) operation when the address is unaligned, but otherwise invalidates the caches. Since the _for_cpu() counterpart has to invalidate the cache already in order to avoid stale data from prefetching, this operation only really needs to ensure that there are no dirty cache lines, which can be done using either invalidation or cleaning the cache, but not necessarily both. Most architectures traditionally go for invalidation here, but as Will Deacon points out, this can leak old data to user space if a DMA is started but the device ends up not actually filling the entire buffer, see the link below. The same argument applies to DMA_BIDIRECTIONAL transfers. Using a cache-clean operation is the safe choice here, followed by invalidating the cache after the DMA to get rid of stale data that was prefetched before the completion of the DMA. 
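The information-leak argument can be sketched in userspace as follows. This is only a model under stated assumptions, not kernel code: the leak_* names, the 8-byte buffer and the fill patterns are made up for illustration. The CPU zeroes a buffer whose memory still holds old data, the for_device step either invalidates or cleans, and the device then fills only half of the buffer.

```c
#include <assert.h>
#include <string.h>

#define LEAK_BUF_BYTES 8	/* hypothetical buffer size, one cache line */

enum leak_pre_op { LEAK_PRE_INVAL, LEAK_PRE_CLEAN };

/*
 * Returns what the CPU reads from the last byte of the buffer after a
 * DMA that filled only the first half of it.
 */
static unsigned char leak_last_byte(enum leak_pre_op pre)
{
	unsigned char mem[LEAK_BUF_BYTES];	/* memory as seen by the device */
	unsigned char cache[LEAK_BUF_BYTES];	/* CPU's write-back cache */

	/* Memory still holds somebody else's old data. */
	memset(mem, 0x5a, sizeof(mem));

	/* The CPU zeroes the buffer; the zeroes sit dirty in the cache. */
	memset(cache, 0, sizeof(cache));

	/*
	 * Model of arch_sync_dma_for_device(..., DMA_FROM_DEVICE):
	 * a clean writes the zeroes back, an invalidate just drops them.
	 */
	if (pre == LEAK_PRE_CLEAN)
		memcpy(mem, cache, sizeof(mem));

	/* The device fills only the first half of the buffer. */
	memset(mem, 0xab, LEAK_BUF_BYTES / 2);

	/* Model of arch_sync_dma_for_cpu(): invalidate, then refill. */
	memcpy(cache, mem, sizeof(cache));

	/* The received half is visible either way. */
	assert(cache[0] == 0xab);
	return cache[LEAK_BUF_BYTES - 1];
}
```

With LEAK_PRE_INVAL the function returns the old 0x5a pattern, which is the stale data that could reach user space. With LEAK_PRE_CLEAN it returns the zero the CPU wrote.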
Link: https://lore.kernel.org/all/20220606152150.GA31568@willie-the-truck/ Signed-off-by: Arnd Bergmann --- arch/powerpc/mm/dma-noncoherent.c | 21 +-------------------- 1 file changed, 1 insertion(+), 20 deletions(-) diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c index e108cacf877f..00e59a4faa2b 100644 --- a/arch/powerpc/mm/dma-noncoherent.c +++ b/arch/powerpc/mm/dma-noncoherent.c @@ -104,26 +104,7 @@ static void __dma_phys_op(phys_addr_t paddr, size_t size, enum dma_cache_op op) void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, enum dma_data_direction dir) { - switch (direction) { - case DMA_NONE: - BUG(); - case DMA_FROM_DEVICE: - /* - * invalidate only when cache-line aligned otherwise there is - * the potential for discarding uncommitted data from the cache - */ - if ((start | end) & (L1_CACHE_BYTES - 1)) - __dma_phys_op(start, end, DMA_CACHE_FLUSH); - else - __dma_phys_op(start, end, DMA_CACHE_INVAL); - break; - case DMA_TO_DEVICE: /* writeback only */ - __dma_phys_op(start, end, DMA_CACHE_CLEAN); - break; - case DMA_BIDIRECTIONAL: /* writeback and invalidate */ - __dma_phys_op(start, end, DMA_CACHE_FLUSH); - break; - } + __dma_phys_op(paddr, size, DMA_CACHE_CLEAN); } void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, From patchwork Mon Mar 27 12:13:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189155 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA622C77B62 for ; Mon, 27 Mar 2023 12:16:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232885AbjC0MP7 (ORCPT ); Mon, 27 Mar 2023 08:15:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35082 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232734AbjC0MP2 (ORCPT ); Mon, 27 Mar 2023 08:15:28 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2A86F4220; Mon, 27 Mar 2023 05:15:09 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 6C199B8117B; Mon, 27 Mar 2023 12:15:08 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0A790C4339E; Mon, 27 Mar 2023 12:14:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919307; bh=p+eG9OtxIwlFEHuBl1BynARl6RQHBxG84PbOJSDaMSg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=k7yTj16MC4QC0lUkI+OlRgN70GEfUAXZsHUm34awwsfDq+uuBVpGn8XzMpAYzrKep rzfZf9IvuTrt/evwGGuCXfuT5QwwzcXZODB9YgjYpxv0iZgT9rwRfHG1vVFKHMnR80 WkjKXapkPoQqpX0z8bvvlpG0YOBt+p7Spw/IdU0XpPUFOYiASiKd39EeZPyxyEYUJn 9ZY055tF4c7BVKMQhNCmNKfo+NgnV/grwYyrPuAwvgV7tJNlhsepLbNs91InZGtrtI 0UOObEmqhXtHeujJTAybyrsno6jtIGCIZbfee+GXSUa8z2HvG7G/GO5AeAihPvul14 jCOgl+Q3LDa9w== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 08/21] riscv: dma-mapping: only invalidate after DMA, not flush Date: Mon, 27 Mar 2023 14:13:04 +0200 Message-Id: <20230327121317.4081816-9-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann No other architecture intentionally writes back dirty cache lines into a buffer that a device has just finished writing into. If the cache is clean, this has no effect at all, but if a cache line in the buffer has actually been written by the CPU, there is a driver bug that is likely made worse by overwriting that buffer. 
Signed-off-by: Arnd Bergmann Reviewed-by: Conor Dooley Reviewed-by: Lad Prabhakar Acked-by: Palmer Dabbelt --- arch/riscv/mm/dma-noncoherent.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c index d919efab6eba..640f4c496d26 100644 --- a/arch/riscv/mm/dma-noncoherent.c +++ b/arch/riscv/mm/dma-noncoherent.c @@ -42,7 +42,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, break; case DMA_FROM_DEVICE: case DMA_BIDIRECTIONAL: - ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size); + ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size); break; default: break; From patchwork Mon Mar 27 12:13:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189156 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 79DCBC77B6F for ; Mon, 27 Mar 2023 12:16:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232877AbjC0MQT (ORCPT ); Mon, 27 Mar 2023 08:16:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35036 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230264AbjC0MP5 (ORCPT ); Mon, 27 Mar 2023 08:15:57 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 32E3D5B9A; Mon, 27 Mar 2023 05:15:20 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 10CDAB81151; Mon, 27 Mar 2023 12:15:17 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 
A7D8DC4339B; Mon, 27 Mar 2023 12:15:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919315; bh=gK+eQNvNHVtgIlRiu3kF6HYH3yBsX0klGJL13i4k/aU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=sMy3k5gM2XS604J4ArMuSWvMXzgs2AlKyYx4fCZS7J/lWvOZ2Yw/Z/jrtMa5gIHU9 FjlooO1xQTngsA6hbEvqN4tEnshVSie7B+AEJaU1Cy1oTWuJpEfFFlaRXS3tq+GwRJ er3cljYGerTP/ivX+B9p913oVNSRI6mJdCnknZw/+Go9C3u3Axj27xDUpsuC4nBj8r bgnEtVHNKaiAki6GHJ1OFvO5x00a1gCx3DhHWTCMKP7Wv9fwGxJu8GpPhhiTKRkk1x zZRoP0hNRHHFBO8EoUedx3N9IqVzTMyc60Fp/OWdtgr6CAuRCyXfSdikQcmkzRIL2M hyE6lbMlfVtJQ== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 09/21] riscv: dma-mapping: skip invalidation before bidirectional DMA Date: Mon, 27 Mar 2023 14:13:05 +0200 Message-Id: <20230327121317.4081816-10-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann For a DMA_BIDIRECTIONAL transfer, the caches have to be cleaned first to let the device see data written by the CPU, and invalidated after the transfer to let the CPU see data written by the device. riscv also invalidates the caches before the transfer, which does not appear to serve any purpose. 
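The clean-before, invalidate-after pairing can be written down as a small table. The sketch below is plain userspace C, not kernel code: the pair_* enums and functions are hypothetical stand-ins for enum dma_data_direction and for the architecture's cache primitives. Only the bidirectional row comes from this patch; the TO_DEVICE and FROM_DEVICE rows are assumptions based on the other patches in the series.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's direction and cache-op types. */
enum pair_dir { PAIR_BIDIRECTIONAL, PAIR_TO_DEVICE, PAIR_FROM_DEVICE };
enum pair_op { PAIR_OP_NONE, PAIR_OP_CLEAN, PAIR_OP_INVAL };

/* Before the transfer: write dirty lines back so the device sees CPU data. */
static enum pair_op pair_op_for_device(enum pair_dir dir)
{
	assert(dir == PAIR_BIDIRECTIONAL || dir == PAIR_TO_DEVICE ||
	       dir == PAIR_FROM_DEVICE);
	return PAIR_OP_CLEAN;
}

/* After the transfer: drop cached lines so the CPU sees device data. */
static enum pair_op pair_op_for_cpu(enum pair_dir dir)
{
	switch (dir) {
	case PAIR_TO_DEVICE:
		/* the device only read the buffer, nothing to do */
		return PAIR_OP_NONE;
	case PAIR_FROM_DEVICE:
	case PAIR_BIDIRECTIONAL:
		return PAIR_OP_INVAL;
	}
	assert(0);
	return PAIR_OP_NONE;
}
```

No row of this table invalidates before the transfer, which is the step this patch drops for riscv.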
Signed-off-by: Arnd Bergmann Reviewed-by: Conor Dooley Reviewed-by: Lad Prabhakar Acked-by: Palmer Dabbelt Acked-by: Guo Ren --- arch/riscv/mm/dma-noncoherent.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c index 640f4c496d26..69c80b2155a1 100644 --- a/arch/riscv/mm/dma-noncoherent.c +++ b/arch/riscv/mm/dma-noncoherent.c @@ -25,7 +25,7 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size); break; case DMA_BIDIRECTIONAL: - ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size); + ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size); break; default: break; From patchwork Mon Mar 27 12:13:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189157 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 42EA3C77B62 for ; Mon, 27 Mar 2023 12:16:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232895AbjC0MQ1 (ORCPT ); Mon, 27 Mar 2023 08:16:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34618 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229892AbjC0MQE (ORCPT ); Mon, 27 Mar 2023 08:16:04 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 70F0E49F5; Mon, 27 Mar 2023 05:15:26 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 127FD611F2; Mon, 27 Mar 2023 12:15:25 +0000 
(UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 56A3DC433A4; Mon, 27 Mar 2023 12:15:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919324; bh=X7ixDT/eME8zr2M/ikt5bC5rQoS3uASy2IlC1ftIdSM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Y2ffntjZwCcvJ7ZjcddqfIzivLJJTiZqbCBMhicCNOoftlZSKr8SZk1bt+v6QBdJ3 AqWhwmcTeaS96y4RUjpqI9FGKTqNfAOz/7RBt+ymuWQLCwljPJr+H8JgdNwFdqMXwE J72tIuLXAnxcA7lDtjwg32UTtKQ0DkssHKQnBWYb3L+gXkApRayP0Mte6XTqD4QU75 zUBFdzaO9TiPGVooVFe2U07En2yoiRB9h1tj51608yH2BmWke6vfUVRIXO9jhZyAf0 Nifety/7HWh8t+jBWCaVsZT7PH2v1d96plBTbAk5HzBMivkiokDHTCIu3jr6mGkPTw a3+V9R9fKm3ag== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 10/21] csky: dma-mapping: skip invalidating before DMA from device Date: Mon, 27 Mar 2023 14:13:06 +0200 Message-Id: <20230327121317.4081816-11-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann csky is the only architecture that does a full flush for the dma_sync_*_for_device(..., DMA_FROM_DEVICE) operation. The requirement is only to make sure there are no dirty cache lines for the buffer, which can be done either through an invalidate operation (as on most architectures including arm32, mips and arc), or a writeback (as on arm64 and riscv). The cache also has to be invalidated eventually but csky already does that after the transfer. Use a 'clean' operation here for consistency with arm64 and riscv. 
Signed-off-by: Arnd Bergmann Reviewed-by: Guo Ren --- arch/csky/mm/dma-mapping.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/csky/mm/dma-mapping.c b/arch/csky/mm/dma-mapping.c index 82447029feb4..c90f912e2822 100644 --- a/arch/csky/mm/dma-mapping.c +++ b/arch/csky/mm/dma-mapping.c @@ -60,11 +60,9 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, { switch (dir) { case DMA_TO_DEVICE: - cache_op(paddr, size, dma_wb_range); - break; case DMA_FROM_DEVICE: case DMA_BIDIRECTIONAL: - cache_op(paddr, size, dma_wbinv_range); + cache_op(paddr, size, dma_wb_range); break; default: BUG(); From patchwork Mon Mar 27 12:13:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189158 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 57352C76195 for ; Mon, 27 Mar 2023 12:16:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232784AbjC0MQn (ORCPT ); Mon, 27 Mar 2023 08:16:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35686 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232700AbjC0MQM (ORCPT ); Mon, 27 Mar 2023 08:16:12 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 21FD244AC; Mon, 27 Mar 2023 05:15:34 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B1278611F5; Mon, 27 Mar 2023 12:15:33 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id ECBF0C433EF; Mon, 27 Mar 2023 12:15:24 
+0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919333; bh=1A9PPhpjfn5VxBSmEZUVZ96wOnXUnzHwhEr/+6UxMII=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ZTcyGdTSTyjUcBoLSVeACST/PtG6gZO3O0XJKQPppvSvNB4NDcy90mHMF4gJPiNS6 dNtpCw1Tg4EZtsG9wUbybuw0872LTI9FUQllw/7Q76rCvz6Crnc81KUI9Kj4chNtUZ csalhGjvFzfmW3DiLJ6yzIpVXX+f5lcRxoFKXzIm876A40pKei6+o+8NXVAEYBZqzo s4oPjpuA8RpmeGU6lv0FlQ6gA3tl8++a/WzT9oNtEmsVj2YTsrbo7DHFpEIVM4/NkV v133TD824709IDedq8WaO7uGv2weYLR4T+0csPE1290Z1QSEI0fkOubzH+lLl8W9RM VLW1WwaNQGF9g== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 11/21] mips: dma-mapping: skip invalidating before bidirectional DMA Date: Mon, 27 Mar 2023 14:13:07 +0200 Message-Id: <20230327121317.4081816-12-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann Some architectures that need to invalidate buffers after bidirectional DMA 
because of speculative prefetching only do a simpler writeback before that DMA, while architectures that don't need to do the second invalidate tend to have a combined writeback+invalidate before the DMA. The behavior on mips is slightly inconsistent, as it always does the invalidation before bidirectional DMA and conditionally does it a second time. In order to make the behavior the same as the rest, change it so that there is exactly one invalidation here, either before or after the DMA. Signed-off-by: Arnd Bergmann --- arch/mips/mm/dma-noncoherent.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c index 3c4fc97b9f39..b4350faf4f1e 100644 --- a/arch/mips/mm/dma-noncoherent.c +++ b/arch/mips/mm/dma-noncoherent.c @@ -65,7 +65,11 @@ static inline void dma_sync_virt_for_device(void *addr, size_t size, dma_cache_inv((unsigned long)addr, size); break; case DMA_BIDIRECTIONAL: - dma_cache_wback_inv((unsigned long)addr, size); + if (IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) && + cpu_needs_post_dma_flush()) + dma_cache_wback((unsigned long)addr, size); + else + dma_cache_wback_inv((unsigned long)addr, size); break; default: BUG(); From patchwork Mon Mar 27 12:13:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189159 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 27DD8C761A6 for ; Mon, 27 Mar 2023 12:17:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232763AbjC0MQ7 (ORCPT ); Mon, 27 Mar 2023 08:16:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33530 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
S232714AbjC0MQU (ORCPT ); Mon, 27 Mar 2023 08:16:20 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 67A72618B; Mon, 27 Mar 2023 05:15:44 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 11E21B8117B; Mon, 27 Mar 2023 12:15:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 97FBDC433A0; Mon, 27 Mar 2023 12:15:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919341; bh=uohG58os+YPGsAoIvHYZEyGIjYTpK/zcJikKrAX9FPU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hBanRUO40otZWpiIlwJ2ahlYI2/SU8zPpLT3Qw57jJE40woaJrRwcvRFQpLc/CmtF 0htU6ckgOgXVZH2DE+dJDcgxfjuUFxw2l8Pn5C7OdFGFsmKKCazqxQQraKNSB0q9Xu 8oHDYMJBaw55+tzbreH4Ndmiz3clXO5/H+FcYDQe1iQ34jySEQMALNvOHQCQFHA7Lk 5/v6eq8R2opwH5aOlgYkRDsOvBF62H+KLqxeGyv9U6ps8rklWRlmbETlf8xD0OymSe 8y75Z9XzlD5evRjJOnI7bZLIIgFEQA8+P2HCk/nt22ZXSqKgj7kPcjr6CG0psnqbOe EFkVwSkNAqLQw== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 12/21] mips: dma-mapping: split out cache operation logic Date: Mon, 27 Mar 2023 14:13:08 +0200 Message-Id: <20230327121317.4081816-13-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann The mips arch_sync_dma_for_device()/arch_sync_dma_for_cpu() functions behave the same way as on other architectures, but in order to unify the implementations, the code needs to be rearranged to pick the type of cache operation in the outermost function. 
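The rearrangement can be sketched in userspace: the outer function picks the cache operation and the range walker only applies it one page at a time. This is an illustration under stated assumptions, not the mips code. The sketch_* names and the 4096-byte page size are made up, every page is treated as if it needed per-page handling, and the kmap_atomic() mapping step is left out.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL	/* hypothetical page size */

/* Same shape as the callback the patch passes to dma_sync_phys(). */
typedef void (*sketch_cache_op)(unsigned long start, unsigned long size);

static unsigned long sketch_calls;	/* how often the callback ran */
static unsigned long sketch_bytes;	/* how many bytes it covered */

/* Stand-in for a real cache primitive; it only counts its invocations. */
static void sketch_count_op(unsigned long start, unsigned long size)
{
	(void)start;
	sketch_calls++;
	sketch_bytes += size;
}

/* Walk [paddr, paddr + size) and apply op to each page-bounded chunk. */
static void sketch_sync_phys(unsigned long paddr, unsigned long size,
			     sketch_cache_op op)
{
	unsigned long offset = paddr & (SKETCH_PAGE_SIZE - 1);
	unsigned long left = size;

	assert(op);
	while (left) {
		unsigned long len = left;

		/* never let one chunk cross a page boundary */
		if (offset + len > SKETCH_PAGE_SIZE)
			len = SKETCH_PAGE_SIZE - offset;

		op(paddr, len);

		paddr += len;
		left -= len;
		offset = 0;
	}
}
```

Calling sketch_sync_phys(4000, 200, sketch_count_op) splits the range at the page boundary and runs the callback twice, for 96 and then 104 bytes. The walker itself no longer needs to know the DMA direction.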
Signed-off-by: Arnd Bergmann --- arch/mips/mm/dma-noncoherent.c | 75 ++++++++++++++-------------------- 1 file changed, 30 insertions(+), 45 deletions(-) diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c index b4350faf4f1e..b9d68bcc5d53 100644 --- a/arch/mips/mm/dma-noncoherent.c +++ b/arch/mips/mm/dma-noncoherent.c @@ -54,50 +54,13 @@ void *arch_dma_set_uncached(void *addr, size_t size) return (void *)(__pa(addr) + UNCAC_BASE); } -static inline void dma_sync_virt_for_device(void *addr, size_t size, - enum dma_data_direction dir) -{ - switch (dir) { - case DMA_TO_DEVICE: - dma_cache_wback((unsigned long)addr, size); - break; - case DMA_FROM_DEVICE: - dma_cache_inv((unsigned long)addr, size); - break; - case DMA_BIDIRECTIONAL: - if (IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) && - cpu_needs_post_dma_flush()) - dma_cache_wback((unsigned long)addr, size); - else - dma_cache_wback_inv((unsigned long)addr, size); - break; - default: - BUG(); - } -} - -static inline void dma_sync_virt_for_cpu(void *addr, size_t size, - enum dma_data_direction dir) -{ - switch (dir) { - case DMA_TO_DEVICE: - break; - case DMA_FROM_DEVICE: - case DMA_BIDIRECTIONAL: - dma_cache_inv((unsigned long)addr, size); - break; - default: - BUG(); - } -} - /* * A single sg entry may refer to multiple physically contiguous pages. But * we still need to process highmem pages individually. If highmem is not * configured then the bulk of this loop gets optimized out. 
*/ static inline void dma_sync_phys(phys_addr_t paddr, size_t size, - enum dma_data_direction dir, bool for_device) + void(*cache_op)(unsigned long start, unsigned long size)) { struct page *page = pfn_to_page(paddr >> PAGE_SHIFT); unsigned long offset = paddr & ~PAGE_MASK; @@ -113,10 +76,7 @@ static inline void dma_sync_phys(phys_addr_t paddr, size_t size, } addr = kmap_atomic(page); - if (for_device) - dma_sync_virt_for_device(addr + offset, len, dir); - else - dma_sync_virt_for_cpu(addr + offset, len, dir); + cache_op((unsigned long)addr + offset, len); kunmap_atomic(addr); offset = 0; @@ -128,15 +88,40 @@ static inline void dma_sync_phys(phys_addr_t paddr, size_t size, void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, enum dma_data_direction dir) { - dma_sync_phys(paddr, size, dir, true); + switch (dir) { + case DMA_TO_DEVICE: + dma_sync_phys(paddr, size, _dma_cache_wback); + break; + case DMA_FROM_DEVICE: + dma_sync_phys(paddr, size, _dma_cache_inv); + break; + case DMA_BIDIRECTIONAL: + if (IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) && + cpu_needs_post_dma_flush()) + dma_sync_phys(paddr, size, _dma_cache_wback); + else + dma_sync_phys(paddr, size, _dma_cache_wback_inv); + break; + default: + break; + } } #ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, enum dma_data_direction dir) { - if (cpu_needs_post_dma_flush()) - dma_sync_phys(paddr, size, dir, false); + switch (dir) { + case DMA_TO_DEVICE: + break; + case DMA_FROM_DEVICE: + case DMA_BIDIRECTIONAL: + if (cpu_needs_post_dma_flush()) + dma_sync_phys(paddr, size, _dma_cache_inv); + break; + default: + break; + } } #endif From patchwork Mon Mar 27 12:13:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189160 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from 
vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FD62C77B6D for ; Mon, 27 Mar 2023 12:17:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232821AbjC0MRK (ORCPT ); Mon, 27 Mar 2023 08:17:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33440 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231970AbjC0MQ3 (ORCPT ); Mon, 27 Mar 2023 08:16:29 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE5A33C2D; Mon, 27 Mar 2023 05:15:52 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 96F3BB81151; Mon, 27 Mar 2023 12:15:51 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3C656C4339B; Mon, 27 Mar 2023 12:15:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919350; bh=WcyJUuclgspkZHSwVclV5Jksx+zy3hY89ahpztaj0hM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=gF/2rqKM1LHnyfHfzF1g693XZNfrbHBqd2oKlgoQ2jGFc8xTOws4qsH2njENSW1Si OGeqBCL1Lt3BSCm5fmGZhO1H6HKEoKzetUEYMk3Soh/1yhM8nZtCol16EHntqcT8Er pM+o9uzAgZL1iYBAWUy3z6yGzIeoUU0K+TXymgHYz7EqbCIvcrn8MMo6RRjBuEtKTx 6l+8vcM3FOZ1WkoHQ1Wlf6+gyjrHJP0Ci4xI/sPZptxVi/W3OXZggPcMTH3OeaOXte UVzqZh+or7dDyFm/ogv6Vz88DYJfSiZ34SmQ2duChP+VAO1veqEJjztWBBm3dd9c41 y9Yt0CcmM/sAw== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , 
John Paul Adrian Glaubitz , "David S. Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 13/21] arc: dma-mapping: skip invalidating before bidirectional DMA Date: Mon, 27 Mar 2023 14:13:09 +0200 Message-Id: <20230327121317.4081816-14-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann Some architectures that need to invalidate buffers after bidirectional DMA because of speculative prefetching only do a simpler writeback before that DMA, while architectures that don't need to do the second invalidate tend to have a combined writeback+invalidate before the DMA. arc is one of the architectures that does both, which seems unnecessary. Change it to behave like arm/arm64/xtensa instead, and use just a writeback before the DMA when we do the invalidate afterwards. 
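As a rough illustration only (not part of the patch), the resulting arc table can be written as two lookup functions. Every name below (toy_dir, toy_op, toy_op_for_device, toy_op_for_cpu) is a hypothetical stand-in rather than a kernel symbol; the real code switches on enum dma_data_direction and calls dma_cache_wback() and friends directly.

```c
#include <assert.h>

/* Hypothetical stand-ins for the three DMA directions handled by the patch. */
enum toy_dir { TOY_TO_DEVICE, TOY_FROM_DEVICE, TOY_BIDIRECTIONAL };

/* Cache maintenance operations, named as in the arc comment table. */
enum toy_op { TOY_NONE, TOY_WRITEBACK, TOY_INVALIDATE };

/* Operation issued before the device touches the buffer. */
static enum toy_op toy_op_for_device(enum toy_dir dir)
{
	switch (dir) {
	case TOY_TO_DEVICE:
	case TOY_BIDIRECTIONAL:
		/* Dirty lines must reach memory; no invalidate needed yet. */
		return TOY_WRITEBACK;
	case TOY_FROM_DEVICE:
		return TOY_INVALIDATE;
	}
	assert(!"unknown DMA direction");
	return TOY_NONE;
}

/* Operation issued after the device is done, before the CPU reads. */
static enum toy_op toy_op_for_cpu(enum toy_dir dir)
{
	switch (dir) {
	case TOY_TO_DEVICE:
		return TOY_NONE;
	case TOY_FROM_DEVICE:
	case TOY_BIDIRECTIONAL:
		/* Drops lines that speculative prefetches may have pulled in. */
		return TOY_INVALIDATE;
	}
	assert(!"unknown DMA direction");
	return TOY_NONE;
}
```

In this sketch the bidirectional case pays for one writeback before the transfer and one invalidate after it, instead of a writeback+invalidate followed by a second invalidate.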
Signed-off-by: Arnd Bergmann Reviewed-by: Vineet Gupta Tested-by: Shahab Vahedi --- arch/arc/mm/dma.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c index 2a7fbbb83b70..ddb96786f765 100644 --- a/arch/arc/mm/dma.c +++ b/arch/arc/mm/dma.c @@ -40,7 +40,7 @@ void arch_dma_prep_coherent(struct page *page, size_t size) * |---------------------------------------------------------------- * TO_DEV | writeback writeback | none none * FROM_DEV | invalidate invalidate | invalidate* invalidate* - * BIDIR | writeback+inv writeback+inv | invalidate invalidate + * BIDIR | writeback writeback | invalidate invalidate * * [*] needed for CPU speculative prefetches * @@ -61,7 +61,7 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, break; case DMA_BIDIRECTIONAL: - dma_cache_wback_inv(paddr, size); + dma_cache_wback(paddr, size); break; default: From patchwork Mon Mar 27 12:13:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189161 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7054AC7619A for ; Mon, 27 Mar 2023 12:17:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232848AbjC0MRQ (ORCPT ); Mon, 27 Mar 2023 08:17:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35694 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232690AbjC0MQi (ORCPT ); Mon, 27 Mar 2023 08:16:38 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 13BD53C0F; Mon, 27 Mar 2023 05:16:00 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) 
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id A2427611F0; Mon, 27 Mar 2023 12:15:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D68CCC433A8; Mon, 27 Mar 2023 12:15:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919359; bh=dXkVBSHFZo75+7tpVyAnh5gU/93Of01AyIY3GkfM5Xo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LxcGBomvsroLU/soGwnnpht+OmpwZ+CQiYPJxIKYSMNj00PvVbGR5doPLu7Ta51zU A4AjXgBFUoaToan99cxpW2tpzSfzmbx7CfucJpx/cZmN4J82wZvf5lGWjPr1irkWI4 eZp949vYHz5uXoUdFNeW62DfrFW7AYO5kKpo+/0YEhvvJIUHP4VPFskv3Eqs77XyrF Xs9EDnGZ3aZvKZmOae+ULTgqTWW5PJS6Wpz4iR9tqX40bYNm1DPxfkANwAHMeIL88M 53CkbdMXPh+FND6MhgUZQGkY9ycD9t2Azpib43BAHfEe1LeuLFcS8ozxudjpxtNHkD yYFa2ciLpv92A== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 14/21] parisc: dma-mapping: use regular flush/invalidate ops Date: Mon, 27 Mar 2023 14:13:10 +0200 Message-Id: <20230327121317.4081816-15-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann Non-coherent devices on parisc traditionally use a full flush+invalidate before and after each DMA, which is more expensive than what we do on other architectures. Before transfers to a device, the cache only has to be written back, but apparently there is no operation for this on parisc. There is no need to flush it again after the transfer though. After transfers from a device, the second writeback can be skipped because the CPU was not allowed to write to the buffer anyway; instead, a purge (invalidate without flush) can be used. The DMA_FROM_DEVICE case is handled differently across architectures: most use only an invalidate (purge) operation, but some have moved to flush in order to preserve dirty data when the device does not write to the buffer, see the link below. As parisc already did the full flush here, keep that behavior. 
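The point about preserving dirty data can be shown with a toy model, given here only as a sketch. It assumes a single cache line covering the whole buffer, and assumes "flush" means writeback plus invalidate and "purge" means invalidate without writeback, as the description above uses the terms. All names (toy_cache, toy_flush, toy_purge, ...) are hypothetical and unrelated to the real parisc helpers.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_BUF_SIZE 8

/* Toy model: one cache line covering the whole DMA buffer. */
struct toy_cache {
	unsigned char mem[TOY_BUF_SIZE];	/* memory, as seen by the device */
	unsigned char line[TOY_BUF_SIZE];	/* CPU-side cached copy */
	bool valid;
	bool dirty;
};

/* Fill the line from memory on a miss. */
static void toy_fill(struct toy_cache *c)
{
	if (!c->valid) {
		memcpy(c->line, c->mem, TOY_BUF_SIZE);
		c->valid = true;
		c->dirty = false;
	}
}

static void toy_cpu_write(struct toy_cache *c, int off, unsigned char val)
{
	toy_fill(c);
	c->line[off] = val;
	c->dirty = true;
}

static unsigned char toy_cpu_read(struct toy_cache *c, int off)
{
	toy_fill(c);
	return c->line[off];
}

/* The device writes straight to memory, bypassing the cache. */
static void toy_device_write(struct toy_cache *c, int off, unsigned char val)
{
	c->mem[off] = val;
}

/* "purge": drop the line without writing it back. */
static void toy_purge(struct toy_cache *c)
{
	c->valid = false;
	c->dirty = false;
}

/* "flush": write dirty data back to memory, then drop the line. */
static void toy_flush(struct toy_cache *c)
{
	if (c->valid && c->dirty)
		memcpy(c->mem, c->line, TOY_BUF_SIZE);
	toy_purge(c);
}
```

With a flush before the transfer and a purge after it, a byte the CPU dirtied but the device never overwrote is still visible afterwards; with a purge before the transfer, that byte is lost.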
Link: https://lore.kernel.org/all/20220606152150.GA31568@willie-the-truck/ Signed-off-by: Arnd Bergmann --- I'm not really sure I understand the semantics of the 'flush' and 'purge' operations on parisc correctly, please double-check that this makes sense in the context of this architecture. --- arch/parisc/include/asm/cacheflush.h | 6 +++++- arch/parisc/kernel/pci-dma.c | 25 +++++++++++++++++++++++-- 2 files changed, 28 insertions(+), 3 deletions(-) diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h index 0bdee6724132..a4c5042f1821 100644 --- a/arch/parisc/include/asm/cacheflush.h +++ b/arch/parisc/include/asm/cacheflush.h @@ -33,8 +33,12 @@ void flush_cache_mm(struct mm_struct *mm); void flush_kernel_dcache_page_addr(const void *addr); +#define clean_kernel_dcache_range(start,size) \ + flush_kernel_dcache_range((start), (size)) #define flush_kernel_dcache_range(start,size) \ - flush_kernel_dcache_range_asm((start), (start)+(size)); + flush_kernel_dcache_range_asm((start), (start)+(size)) +#define purge_kernel_dcache_range(start,size) \ + purge_kernel_dcache_range_asm((start), (start)+(size)) #define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1 void flush_kernel_vmap_range(void *vaddr, int size); diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c index ba87f791323b..6d3d3cffb316 100644 --- a/arch/parisc/kernel/pci-dma.c +++ b/arch/parisc/kernel/pci-dma.c @@ -446,11 +446,32 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr, void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, enum dma_data_direction dir) { - flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size); + unsigned long virt = (unsigned long)phys_to_virt(paddr); + + switch (dir) { + case DMA_TO_DEVICE: + clean_kernel_dcache_range(virt, size); + break; + case DMA_FROM_DEVICE: + clean_kernel_dcache_range(virt, size); + break; + case DMA_BIDIRECTIONAL: + flush_kernel_dcache_range(virt, size); + break; + } } void 
arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, enum dma_data_direction dir) { - flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size); + unsigned long virt = (unsigned long)phys_to_virt(paddr); + + switch (dir) { + case DMA_TO_DEVICE: + break; + case DMA_FROM_DEVICE: + case DMA_BIDIRECTIONAL: + purge_kernel_dcache_range(virt, size); + break; + } } From patchwork Mon Mar 27 12:13:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189162 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0BB1C77B6D for ; Mon, 27 Mar 2023 12:17:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232731AbjC0MRh (ORCPT ); Mon, 27 Mar 2023 08:17:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33384 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232673AbjC0MQ5 (ORCPT ); Mon, 27 Mar 2023 08:16:57 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5EFA859C9; Mon, 27 Mar 2023 05:16:10 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 01356B81183; Mon, 27 Mar 2023 12:16:09 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8F899C433EF; Mon, 27 Mar 2023 12:15:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919367; bh=YBtSO0RmdwE2AW1zN7HUJxa9NfwhIOXNCN8O+lzIRNw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=UN64uPXQtdeLkLg4dn09AzIarAchhb4d1x34/SjwyiZWrDrJBuSzm2pjxivirdXOH jDWvdVBlPURZiAbbZkwr+j/acZdOxZqq6cGIPFvZdjZpd4mtJpLB9c6+CCJIpq7WlT yKwLGhptYzLT9fUIHqk5I/wF2qbPA7Jz2AgpBslZNOD+sEJ3MsZmMDIhBSPTN1n5u3 xeD/9JePzosgXXiQzUg8ldJpqSjwUr9LKOTYJUPw5exVgHkbXy8JjWNj5CwfmVQvWQ Tf5G78NWWF9z4lJFW3lBVqt8Ks2rBF4G8TjU0+Ep1wg2uWq1/XkSNqoo9R0+XEHkMt ETlJDU59jGtbg== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 15/21] ARM: dma-mapping: always invalidate WT caches before DMA Date: Mon, 27 Mar 2023 14:13:11 +0200 Message-Id: <20230327121317.4081816-16-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann Most ARM CPUs can have write-back caches, which require cache management to be done in the dma_sync_*_for_device() operation. This is typically done in both writeback and writethrough mode. 
The cache-v4.S (arm720/740/7tdmi/9tdmi) and cache-v4wt.S (arm920t, arm940t) implementations are the exception here, and only do the cache management after the DMA is complete, in the dma_sync_*_for_cpu() operation. Change this for consistency with the other platforms. This should have no user visible effect. Signed-off-by: Arnd Bergmann Reviewed-by: Linus Walleij --- arch/arm/mm/cache-v4.S | 8 ++++---- arch/arm/mm/cache-v4wt.S | 8 ++++---- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/arm/mm/cache-v4.S b/arch/arm/mm/cache-v4.S index 7787057e4990..e2b104876340 100644 --- a/arch/arm/mm/cache-v4.S +++ b/arch/arm/mm/cache-v4.S @@ -117,23 +117,23 @@ ENTRY(v4_dma_flush_range) ret lr /* - * dma_unmap_area(start, size, dir) + * dma_map_area(start, size, dir) * - start - kernel virtual start address * - size - size of region * - dir - DMA direction */ -ENTRY(v4_dma_unmap_area) +ENTRY(v4_dma_map_area) teq r2, #DMA_TO_DEVICE bne v4_dma_flush_range /* FALLTHROUGH */ /* - * dma_map_area(start, size, dir) + * dma_unmap_area(start, size, dir) * - start - kernel virtual start address * - size - size of region * - dir - DMA direction */ -ENTRY(v4_dma_map_area) +ENTRY(v4_dma_unmap_area) ret lr ENDPROC(v4_dma_unmap_area) ENDPROC(v4_dma_map_area) diff --git a/arch/arm/mm/cache-v4wt.S b/arch/arm/mm/cache-v4wt.S index 0b290c25a99d..652218752f88 100644 --- a/arch/arm/mm/cache-v4wt.S +++ b/arch/arm/mm/cache-v4wt.S @@ -172,24 +172,24 @@ v4wt_dma_inv_range: .equ v4wt_dma_flush_range, v4wt_dma_inv_range /* - * dma_unmap_area(start, size, dir) + * dma_map_area(start, size, dir) * - start - kernel virtual start address * - size - size of region * - dir - DMA direction */ -ENTRY(v4wt_dma_unmap_area) +ENTRY(v4wt_dma_map_area) add r1, r1, r0 teq r2, #DMA_TO_DEVICE bne v4wt_dma_inv_range /* FALLTHROUGH */ /* - * dma_map_area(start, size, dir) + * dma_unmap_area(start, size, dir) * - start - kernel virtual start address * - size - size of region * - dir - DMA direction */ 
-ENTRY(v4wt_dma_map_area) +ENTRY(v4wt_dma_unmap_area) ret lr ENDPROC(v4wt_dma_unmap_area) ENDPROC(v4wt_dma_map_area) From patchwork Mon Mar 27 12:13:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189163 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 87278C77B62 for ; Mon, 27 Mar 2023 12:17:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232362AbjC0MRu (ORCPT ); Mon, 27 Mar 2023 08:17:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35620 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232786AbjC0MRK (ORCPT ); Mon, 27 Mar 2023 08:17:10 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B5ED344A3; Mon, 27 Mar 2023 05:16:17 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2C7F0611F3; Mon, 27 Mar 2023 12:16:17 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3D6ACC4339B; Mon, 27 Mar 2023 12:16:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919376; bh=kZ0whxymVbLS460yE7omdNABwzoFPMXv5cC9T62VHxM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=gbyhgTZgjGWI3oriwUtSiZyOWJ+ndxgmQzOH83pCswXdUkt6Lhnl1YglFRPAZXdyP IQZ4ax+myG2pMYp2j+0j+I6+1nWVCf8bVkhjue0wkEVNf9sg9ONOcDm9F0V1JfRV79 iFS6tHngpUfRbCZ84cnblXSqWEAG0sD0d+pFLtnU7X6zJVekWzUKOYNWgqkwtiSrTl pdTx/73pQmqmOkvaPTSZnlA1INFgJg6gytF+N1/ofFih7tP03nsskKQG4wdrnabV5a 
GEn1PscP/E8YaRmJ2E2QXPRxpOnFB3LZOFWS2Bit7jKcQ+GedIyvBqHaEpG21wx91/ 8nuPebpFkBNOw== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 16/21] ARM: dma-mapping: bring back dmac_{clean,inv}_range Date: Mon, 27 Mar 2023 14:13:12 +0200 Message-Id: <20230327121317.4081816-17-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann These were removed ages ago in commit 702b94bff3c5 ("ARM: dma-mapping: remove dmac_clean_range and dmac_inv_range") in an effort to sanitize the dma-mapping API. Now this logic is getting moved into the generic dma-mapping implementation in order to give architectures less control over it, which requires reverting that earlier work. 
Signed-off-by: Arnd Bergmann --- arch/arm/include/asm/cacheflush.h | 21 +++++++++++++++++++++ arch/arm/include/asm/glue-cache.h | 4 ++++ arch/arm/mm/cache-fa.S | 4 ++-- arch/arm/mm/cache-nop.S | 6 ++++++ arch/arm/mm/cache-v4.S | 5 +++++ arch/arm/mm/cache-v4wb.S | 4 ++-- arch/arm/mm/cache-v4wt.S | 14 +++++++++++++- arch/arm/mm/cache-v6.S | 4 ++-- arch/arm/mm/cache-v7.S | 6 ++++-- arch/arm/mm/cache-v7m.S | 4 ++-- arch/arm/mm/proc-arm1020.S | 4 ++-- arch/arm/mm/proc-arm1020e.S | 4 ++-- arch/arm/mm/proc-arm1022.S | 4 ++-- arch/arm/mm/proc-arm1026.S | 4 ++-- arch/arm/mm/proc-arm920.S | 4 ++-- arch/arm/mm/proc-arm922.S | 4 ++-- arch/arm/mm/proc-arm925.S | 4 ++-- arch/arm/mm/proc-arm926.S | 4 ++-- arch/arm/mm/proc-arm940.S | 4 ++-- arch/arm/mm/proc-arm946.S | 4 ++-- arch/arm/mm/proc-feroceon.S | 8 ++++---- arch/arm/mm/proc-macros.S | 2 ++ arch/arm/mm/proc-mohawk.S | 4 ++-- arch/arm/mm/proc-xsc3.S | 4 ++-- arch/arm/mm/proc-xscale.S | 6 ++++-- 25 files changed, 95 insertions(+), 41 deletions(-) diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h index a094f964c869..04462bfe9130 100644 --- a/arch/arm/include/asm/cacheflush.h +++ b/arch/arm/include/asm/cacheflush.h @@ -91,6 +91,21 @@ * DMA Cache Coherency * =================== * + * dma_inv_range(start, end) + * + * Invalidate (discard) the specified virtual address range. + * May not write back any entries. If 'start' or 'end' + * are not cache line aligned, those lines must be written + * back. + * - start - virtual start address + * - end - virtual end address + * + * dma_clean_range(start, end) + * + * Clean (write back) the specified virtual address range. + * - start - virtual start address + * - end - virtual end address + * * dma_flush_range(start, end) * * Clean and invalidate the specified virtual address range. 
@@ -112,6 +127,8 @@ struct cpu_cache_fns { void (*dma_map_area)(const void *, size_t, int); void (*dma_unmap_area)(const void *, size_t, int); + void (*dma_clean_range)(const void *, const void *); + void (*dma_inv_range)(const void *, const void *); void (*dma_flush_range)(const void *, const void *); } __no_randomize_layout; @@ -137,6 +154,8 @@ extern struct cpu_cache_fns cpu_cache; * is visible to DMA, or data written by DMA to system memory is * visible to the CPU. */ +#define dmac_clean_range cpu_cache.dma_clean_range +#define dmac_inv_range cpu_cache.dma_inv_range #define dmac_flush_range cpu_cache.dma_flush_range #else @@ -156,6 +175,8 @@ extern void __cpuc_flush_dcache_area(void *, size_t); * is visible to DMA, or data written by DMA to system memory is * visible to the CPU. */ +extern void dmac_clean_range(const void *, const void *); +extern void dmac_inv_range(const void *, const void *); extern void dmac_flush_range(const void *, const void *); #endif diff --git a/arch/arm/include/asm/glue-cache.h b/arch/arm/include/asm/glue-cache.h index 724f8dac1e5b..d8c93b483adf 100644 --- a/arch/arm/include/asm/glue-cache.h +++ b/arch/arm/include/asm/glue-cache.h @@ -139,6 +139,8 @@ static inline int nop_coherent_user_range(unsigned long a, unsigned long b) { return 0; } static inline void nop_flush_kern_dcache_area(void *a, size_t s) { } +static inline void nop_dma_clean_range(const void *a, const void *b) { } +static inline void nop_dma_inv_range(const void *a, const void *b) { } static inline void nop_dma_flush_range(const void *a, const void *b) { } static inline void nop_dma_map_area(const void *s, size_t l, int f) { } @@ -155,6 +157,8 @@ static inline void nop_dma_unmap_area(const void *s, size_t l, int f) { } #define __cpuc_coherent_user_range __glue(_CACHE,_coherent_user_range) #define __cpuc_flush_dcache_area __glue(_CACHE,_flush_kern_dcache_area) +#define dmac_clean_range __glue(_CACHE,_dma_clean_range) +#define dmac_inv_range __glue(_CACHE,_dma_inv_range) 
#define dmac_flush_range __glue(_CACHE,_dma_flush_range) #endif diff --git a/arch/arm/mm/cache-fa.S b/arch/arm/mm/cache-fa.S index 3a464d1649b4..abc3d58948dd 100644 --- a/arch/arm/mm/cache-fa.S +++ b/arch/arm/mm/cache-fa.S @@ -166,7 +166,7 @@ ENTRY(fa_flush_kern_dcache_area) * - start - virtual start address * - end - virtual end address */ -fa_dma_inv_range: +ENTRY(fa_dma_inv_range) tst r0, #CACHE_DLINESIZE - 1 bic r0, r0, #CACHE_DLINESIZE - 1 mcrne p15, 0, r0, c7, c14, 1 @ clean & invalidate D entry @@ -189,7 +189,7 @@ fa_dma_inv_range: * - start - virtual start address * - end - virtual end address */ -fa_dma_clean_range: +ENTRY(fa_dma_clean_range) bic r0, r0, #CACHE_DLINESIZE - 1 1: mcr p15, 0, r0, c7, c10, 1 @ clean D entry add r0, r0, #CACHE_DLINESIZE diff --git a/arch/arm/mm/cache-nop.S b/arch/arm/mm/cache-nop.S index 72d939ef8798..a058544d6c2b 100644 --- a/arch/arm/mm/cache-nop.S +++ b/arch/arm/mm/cache-nop.S @@ -32,6 +32,12 @@ ENDPROC(nop_coherent_user_range) .globl nop_flush_kern_dcache_area .equ nop_flush_kern_dcache_area, nop_flush_icache_all + .globl nop_dma_clean_range + .equ nop_dma_clean_range, nop_flush_icache_all + + .globl nop_dma_inv_range + .equ nop_dma_inv_range, nop_flush_icache_all + .globl nop_dma_flush_range .equ nop_dma_flush_range, nop_flush_icache_all diff --git a/arch/arm/mm/cache-v4.S b/arch/arm/mm/cache-v4.S index e2b104876340..b747e591109c 100644 --- a/arch/arm/mm/cache-v4.S +++ b/arch/arm/mm/cache-v4.S @@ -103,17 +103,22 @@ ENTRY(v4_flush_kern_dcache_area) /* * dma_flush_range(start, end) + * dma_inv_range(start, end) * * Clean and invalidate the specified virtual address range. + * As only write-through caches are supported here, this is the + * same as invalidate, while the clean operation does nothing. 
* * - start - virtual start address * - end - virtual end address */ +ENTRY(v4_dma_inv_range) ENTRY(v4_dma_flush_range) #ifdef CONFIG_CPU_CP15 mov r0, #0 mcr p15, 0, r0, c7, c7, 0 @ flush ID cache #endif +ENTRY(v4_dma_clean_range) ret lr /* diff --git a/arch/arm/mm/cache-v4wb.S b/arch/arm/mm/cache-v4wb.S index 905ac2fa2b1e..55f609eae38d 100644 --- a/arch/arm/mm/cache-v4wb.S +++ b/arch/arm/mm/cache-v4wb.S @@ -183,7 +183,7 @@ ENTRY(v4wb_coherent_user_range) * - start - virtual start address * - end - virtual end address */ -v4wb_dma_inv_range: +ENTRY(v4wb_dma_inv_range) tst r0, #CACHE_DLINESIZE - 1 bic r0, r0, #CACHE_DLINESIZE - 1 mcrne p15, 0, r0, c7, c10, 1 @ clean D entry @@ -204,7 +204,7 @@ v4wb_dma_inv_range: * - start - virtual start address * - end - virtual end address */ -v4wb_dma_clean_range: +ENTRY(v4wb_dma_clean_range) bic r0, r0, #CACHE_DLINESIZE - 1 1: mcr p15, 0, r0, c7, c10, 1 @ clean D entry add r0, r0, #CACHE_DLINESIZE diff --git a/arch/arm/mm/cache-v4wt.S b/arch/arm/mm/cache-v4wt.S index 652218752f88..1a88627ec09b 100644 --- a/arch/arm/mm/cache-v4wt.S +++ b/arch/arm/mm/cache-v4wt.S @@ -152,7 +152,7 @@ ENTRY(v4wt_flush_kern_dcache_area) * - start - virtual start address * - end - virtual end address */ -v4wt_dma_inv_range: +ENTRY(v4wt_dma_inv_range) bic r0, r0, #CACHE_DLINESIZE - 1 1: mcr p15, 0, r0, c7, c6, 1 @ invalidate D entry add r0, r0, #CACHE_DLINESIZE @@ -171,6 +171,18 @@ v4wt_dma_inv_range: .globl v4wt_dma_flush_range .equ v4wt_dma_flush_range, v4wt_dma_inv_range +/* + * dma_clean_range(start, end) + * + * Clean the specified virtual address range. + * Empty implementation for writethrough caches. 
+ * + * - start - virtual start address + * - end - virtual end address + */ + .globl v4wt_dma_clean_range + .equ v4wt_dma_clean_range, v4wt_dma_unmap_area + /* * dma_map_area(start, size, dir) * - start - kernel virtual start address diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S index 250c83bf7158..abae7ff5defc 100644 --- a/arch/arm/mm/cache-v6.S +++ b/arch/arm/mm/cache-v6.S @@ -200,7 +200,7 @@ ENTRY(v6_flush_kern_dcache_area) * - start - virtual start address of region * - end - virtual end address of region */ -v6_dma_inv_range: +ENTRY(v6_dma_inv_range) #ifdef CONFIG_DMA_CACHE_RWFO ldrb r2, [r0] @ read for ownership strb r2, [r0] @ write for ownership @@ -245,7 +245,7 @@ v6_dma_inv_range: * - start - virtual start address of region * - end - virtual end address of region */ -v6_dma_clean_range: +ENTRY(v6_dma_clean_range) bic r0, r0, #D_CACHE_LINE_SIZE - 1 1: #ifdef CONFIG_DMA_CACHE_RWFO diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S index 127afe2096ba..b16a0d2a7cce 100644 --- a/arch/arm/mm/cache-v7.S +++ b/arch/arm/mm/cache-v7.S @@ -361,7 +361,7 @@ ENDPROC(v7_flush_kern_dcache_area) * - start - virtual start address of region * - end - virtual end address of region */ -v7_dma_inv_range: +ENTRY(v7_dma_inv_range) dcache_line_size r2, r3 sub r3, r2, #1 tst r0, r3 @@ -391,7 +391,7 @@ ENDPROC(v7_dma_inv_range) * - start - virtual start address of region * - end - virtual end address of region */ -v7_dma_clean_range: +ENTRY(v7_dma_clean_range) dcache_line_size r2, r3 sub r3, r2, #1 bic r0, r0, r3 @@ -477,6 +477,8 @@ ENDPROC(v7_dma_unmap_area) globl_equ b15_dma_map_area, v7_dma_map_area globl_equ b15_dma_unmap_area, v7_dma_unmap_area + globl_equ b15_dma_clean_range, v7_dma_clean_range + globl_equ b15_dma_inv_range, v7_dma_inv_range globl_equ b15_dma_flush_range, v7_dma_flush_range define_cache_functions b15 diff --git a/arch/arm/mm/cache-v7m.S b/arch/arm/mm/cache-v7m.S index eb60b5e5e2ad..4fc6e0028e40 100644 --- a/arch/arm/mm/cache-v7m.S 
+++ b/arch/arm/mm/cache-v7m.S @@ -364,7 +364,7 @@ ENDPROC(v7m_flush_kern_dcache_area) * - start - virtual start address of region * - end - virtual end address of region */ -v7m_dma_inv_range: +ENTRY(v7m_dma_inv_range) dcache_line_size r2, r3 sub r3, r2, #1 tst r0, r3 @@ -390,7 +390,7 @@ ENDPROC(v7m_dma_inv_range) * - start - virtual start address of region * - end - virtual end address of region */ -v7m_dma_clean_range: +ENTRY(v7m_dma_clean_range) dcache_line_size r2, r3 sub r3, r2, #1 bic r0, r0, r3 diff --git a/arch/arm/mm/proc-arm1020.S b/arch/arm/mm/proc-arm1020.S index 6837cf7a4812..0089e366f4e8 100644 --- a/arch/arm/mm/proc-arm1020.S +++ b/arch/arm/mm/proc-arm1020.S @@ -263,7 +263,7 @@ ENTRY(arm1020_flush_kern_dcache_area) * * (same as v4wb) */ -arm1020_dma_inv_range: +ENTRY(arm1020_dma_inv_range) mov ip, #0 #ifndef CONFIG_CPU_DCACHE_DISABLE tst r0, #CACHE_DLINESIZE - 1 @@ -293,7 +293,7 @@ arm1020_dma_inv_range: * * (same as v4wb) */ -arm1020_dma_clean_range: +ENTRY(arm1020_dma_clean_range) mov ip, #0 #ifndef CONFIG_CPU_DCACHE_DISABLE bic r0, r0, #CACHE_DLINESIZE - 1 diff --git a/arch/arm/mm/proc-arm1020e.S b/arch/arm/mm/proc-arm1020e.S index df49b10250b8..c662e55a76fa 100644 --- a/arch/arm/mm/proc-arm1020e.S +++ b/arch/arm/mm/proc-arm1020e.S @@ -256,7 +256,7 @@ ENTRY(arm1020e_flush_kern_dcache_area) * * (same as v4wb) */ -arm1020e_dma_inv_range: +ENTRY(arm1020e_dma_inv_range) mov ip, #0 #ifndef CONFIG_CPU_DCACHE_DISABLE tst r0, #CACHE_DLINESIZE - 1 @@ -282,7 +282,7 @@ arm1020e_dma_inv_range: * * (same as v4wb) */ -arm1020e_dma_clean_range: +ENTRY(arm1020e_dma_clean_range) mov ip, #0 #ifndef CONFIG_CPU_DCACHE_DISABLE bic r0, r0, #CACHE_DLINESIZE - 1 diff --git a/arch/arm/mm/proc-arm1022.S b/arch/arm/mm/proc-arm1022.S index e89ce467f672..e77328906bc5 100644 --- a/arch/arm/mm/proc-arm1022.S +++ b/arch/arm/mm/proc-arm1022.S @@ -256,7 +256,7 @@ ENTRY(arm1022_flush_kern_dcache_area) * * (same as v4wb) */ -arm1022_dma_inv_range: +ENTRY(arm1022_dma_inv_range) mov 
ip, #0 #ifndef CONFIG_CPU_DCACHE_DISABLE tst r0, #CACHE_DLINESIZE - 1 @@ -282,7 +282,7 @@ arm1022_dma_inv_range: * * (same as v4wb) */ -arm1022_dma_clean_range: +ENTRY(arm1022_dma_clean_range) mov ip, #0 #ifndef CONFIG_CPU_DCACHE_DISABLE bic r0, r0, #CACHE_DLINESIZE - 1 diff --git a/arch/arm/mm/proc-arm1026.S b/arch/arm/mm/proc-arm1026.S index 7fdd1a205e8e..a23f9fa28d07 100644 --- a/arch/arm/mm/proc-arm1026.S +++ b/arch/arm/mm/proc-arm1026.S @@ -250,7 +250,7 @@ ENTRY(arm1026_flush_kern_dcache_area) * * (same as v4wb) */ -arm1026_dma_inv_range: +ENTRY(arm1026_dma_inv_range) mov ip, #0 #ifndef CONFIG_CPU_DCACHE_DISABLE tst r0, #CACHE_DLINESIZE - 1 @@ -276,7 +276,7 @@ arm1026_dma_inv_range: * * (same as v4wb) */ -arm1026_dma_clean_range: +ENTRY(arm1026_dma_clean_range) mov ip, #0 #ifndef CONFIG_CPU_DCACHE_DISABLE bic r0, r0, #CACHE_DLINESIZE - 1 diff --git a/arch/arm/mm/proc-arm920.S b/arch/arm/mm/proc-arm920.S index a234cd8ba5e6..4c918ab106f3 100644 --- a/arch/arm/mm/proc-arm920.S +++ b/arch/arm/mm/proc-arm920.S @@ -232,7 +232,7 @@ ENTRY(arm920_flush_kern_dcache_area) * * (same as v4wb) */ -arm920_dma_inv_range: +ENTRY(arm920_dma_inv_range) tst r0, #CACHE_DLINESIZE - 1 bic r0, r0, #CACHE_DLINESIZE - 1 mcrne p15, 0, r0, c7, c10, 1 @ clean D entry @@ -255,7 +255,7 @@ arm920_dma_inv_range: * * (same as v4wb) */ -arm920_dma_clean_range: +ENTRY(arm920_dma_clean_range) bic r0, r0, #CACHE_DLINESIZE - 1 1: mcr p15, 0, r0, c7, c10, 1 @ clean D entry add r0, r0, #CACHE_DLINESIZE diff --git a/arch/arm/mm/proc-arm922.S b/arch/arm/mm/proc-arm922.S index 53c029dcfd83..6ac7bb7d94a4 100644 --- a/arch/arm/mm/proc-arm922.S +++ b/arch/arm/mm/proc-arm922.S @@ -234,7 +234,7 @@ ENTRY(arm922_flush_kern_dcache_area) * * (same as v4wb) */ -arm922_dma_inv_range: +ENTRY(arm922_dma_inv_range) tst r0, #CACHE_DLINESIZE - 1 bic r0, r0, #CACHE_DLINESIZE - 1 mcrne p15, 0, r0, c7, c10, 1 @ clean D entry @@ -257,7 +257,7 @@ arm922_dma_inv_range: * * (same as v4wb) */ -arm922_dma_clean_range: 
+ENTRY(arm922_dma_clean_range) bic r0, r0, #CACHE_DLINESIZE - 1 1: mcr p15, 0, r0, c7, c10, 1 @ clean D entry add r0, r0, #CACHE_DLINESIZE diff --git a/arch/arm/mm/proc-arm925.S b/arch/arm/mm/proc-arm925.S index 0bfad62ea858..860f0074ff81 100644 --- a/arch/arm/mm/proc-arm925.S +++ b/arch/arm/mm/proc-arm925.S @@ -280,7 +280,7 @@ ENTRY(arm925_flush_kern_dcache_area) * * (same as v4wb) */ -arm925_dma_inv_range: +ENTRY(arm925_dma_inv_range) #ifndef CONFIG_CPU_DCACHE_WRITETHROUGH tst r0, #CACHE_DLINESIZE - 1 mcrne p15, 0, r0, c7, c10, 1 @ clean D entry @@ -305,7 +305,7 @@ arm925_dma_inv_range: * * (same as v4wb) */ -arm925_dma_clean_range: +ENTRY(arm925_dma_clean_range) #ifndef CONFIG_CPU_DCACHE_WRITETHROUGH bic r0, r0, #CACHE_DLINESIZE - 1 1: mcr p15, 0, r0, c7, c10, 1 @ clean D entry diff --git a/arch/arm/mm/proc-arm926.S b/arch/arm/mm/proc-arm926.S index 0487a2c3439b..519f62e023c5 100644 --- a/arch/arm/mm/proc-arm926.S +++ b/arch/arm/mm/proc-arm926.S @@ -243,7 +243,7 @@ ENTRY(arm926_flush_kern_dcache_area) * * (same as v4wb) */ -arm926_dma_inv_range: +ENTRY(arm926_dma_inv_range) #ifndef CONFIG_CPU_DCACHE_WRITETHROUGH tst r0, #CACHE_DLINESIZE - 1 mcrne p15, 0, r0, c7, c10, 1 @ clean D entry @@ -268,7 +268,7 @@ arm926_dma_inv_range: * * (same as v4wb) */ -arm926_dma_clean_range: +ENTRY(arm926_dma_clean_range) #ifndef CONFIG_CPU_DCACHE_WRITETHROUGH bic r0, r0, #CACHE_DLINESIZE - 1 1: mcr p15, 0, r0, c7, c10, 1 @ clean D entry diff --git a/arch/arm/mm/proc-arm940.S b/arch/arm/mm/proc-arm940.S index cf9bfcc825ca..14dda5c5ee4a 100644 --- a/arch/arm/mm/proc-arm940.S +++ b/arch/arm/mm/proc-arm940.S @@ -177,7 +177,7 @@ ENTRY(arm940_flush_kern_dcache_area) * - start - virtual start address * - end - virtual end address */ -arm940_dma_inv_range: +ENTRY(arm940_dma_inv_range) mov ip, #0 mov r1, #(CACHE_DSEGMENTS - 1) << 4 @ 4 segments 1: orr r3, r1, #(CACHE_DENTRIES - 1) << 26 @ 64 entries @@ -198,7 +198,7 @@ arm940_dma_inv_range: * - start - virtual start address * - end - 
virtual end address */ -arm940_dma_clean_range: +ENTRY(arm940_dma_clean_range) ENTRY(cpu_arm940_dcache_clean_area) mov ip, #0 #ifndef CONFIG_CPU_DCACHE_WRITETHROUGH diff --git a/arch/arm/mm/proc-arm946.S b/arch/arm/mm/proc-arm946.S index 6fb3898ad1cd..91f62a7d334b 100644 --- a/arch/arm/mm/proc-arm946.S +++ b/arch/arm/mm/proc-arm946.S @@ -222,7 +222,7 @@ ENTRY(arm946_flush_kern_dcache_area) * - end - virtual end address * (same as arm926) */ -arm946_dma_inv_range: +ENTRY(arm946_dma_inv_range) #ifndef CONFIG_CPU_DCACHE_WRITETHROUGH tst r0, #CACHE_DLINESIZE - 1 mcrne p15, 0, r0, c7, c10, 1 @ clean D entry @@ -247,7 +247,7 @@ arm946_dma_inv_range: * * (same as arm926) */ -arm946_dma_clean_range: +ENTRY(arm946_dma_clean_range) #ifndef CONFIG_CPU_DCACHE_WRITETHROUGH bic r0, r0, #CACHE_DLINESIZE - 1 1: mcr p15, 0, r0, c7, c10, 1 @ clean D entry diff --git a/arch/arm/mm/proc-feroceon.S b/arch/arm/mm/proc-feroceon.S index 61ce82aca6f0..86122bad6d9b 100644 --- a/arch/arm/mm/proc-feroceon.S +++ b/arch/arm/mm/proc-feroceon.S @@ -271,7 +271,7 @@ ENTRY(feroceon_range_flush_kern_dcache_area) * (same as v4wb) */ .align 5 -feroceon_dma_inv_range: +ENTRY(feroceon_dma_inv_range) tst r0, #CACHE_DLINESIZE - 1 bic r0, r0, #CACHE_DLINESIZE - 1 mcrne p15, 0, r0, c7, c10, 1 @ clean D entry @@ -285,7 +285,7 @@ feroceon_dma_inv_range: ret lr .align 5 -feroceon_range_dma_inv_range: +ENTRY(feroceon_range_dma_inv_range) mrs r2, cpsr tst r0, #CACHE_DLINESIZE - 1 mcrne p15, 0, r0, c7, c10, 1 @ clean D entry @@ -311,7 +311,7 @@ feroceon_range_dma_inv_range: * (same as v4wb) */ .align 5 -feroceon_dma_clean_range: +ENTRY(feroceon_dma_clean_range) bic r0, r0, #CACHE_DLINESIZE - 1 1: mcr p15, 0, r0, c7, c10, 1 @ clean D entry add r0, r0, #CACHE_DLINESIZE @@ -321,7 +321,7 @@ feroceon_dma_clean_range: ret lr .align 5 -feroceon_range_dma_clean_range: +ENTRY(feroceon_range_dma_clean_range) mrs r2, cpsr cmp r1, r0 subne r1, r1, #1 @ top address is inclusive diff --git a/arch/arm/mm/proc-macros.S 
b/arch/arm/mm/proc-macros.S index e43f6d716b4b..c1328955fd2a 100644 --- a/arch/arm/mm/proc-macros.S +++ b/arch/arm/mm/proc-macros.S @@ -334,6 +334,8 @@ ENTRY(\name\()_cache_fns) .long \name\()_flush_kern_dcache_area .long \name\()_dma_map_area .long \name\()_dma_unmap_area + .long \name\()_dma_clean_range + .long \name\()_dma_inv_range .long \name\()_dma_flush_range .size \name\()_cache_fns, . - \name\()_cache_fns .endm diff --git a/arch/arm/mm/proc-mohawk.S b/arch/arm/mm/proc-mohawk.S index 1645ccaffe96..db3a2f00372a 100644 --- a/arch/arm/mm/proc-mohawk.S +++ b/arch/arm/mm/proc-mohawk.S @@ -216,7 +216,7 @@ ENTRY(mohawk_flush_kern_dcache_area) * * (same as v4wb) */ -mohawk_dma_inv_range: +ENTRY(mohawk_dma_inv_range) tst r0, #CACHE_DLINESIZE - 1 mcrne p15, 0, r0, c7, c10, 1 @ clean D entry tst r1, #CACHE_DLINESIZE - 1 @@ -239,7 +239,7 @@ mohawk_dma_inv_range: * * (same as v4wb) */ -mohawk_dma_clean_range: +ENTRY(mohawk_dma_clean_range) bic r0, r0, #CACHE_DLINESIZE - 1 1: mcr p15, 0, r0, c7, c10, 1 @ clean D entry add r0, r0, #CACHE_DLINESIZE diff --git a/arch/arm/mm/proc-xsc3.S b/arch/arm/mm/proc-xsc3.S index a17afe7e195a..6db611a945f3 100644 --- a/arch/arm/mm/proc-xsc3.S +++ b/arch/arm/mm/proc-xsc3.S @@ -263,7 +263,7 @@ ENTRY(xsc3_flush_kern_dcache_area) * - start - virtual start address * - end - virtual end address */ -xsc3_dma_inv_range: +ENTRY(xsc3_dma_inv_range) tst r0, #CACHELINESIZE - 1 bic r0, r0, #CACHELINESIZE - 1 mcrne p15, 0, r0, c7, c10, 1 @ clean L1 D line @@ -284,7 +284,7 @@ xsc3_dma_inv_range: * - start - virtual start address * - end - virtual end address */ -xsc3_dma_clean_range: +ENTRY(xsc3_dma_clean_range) bic r0, r0, #CACHELINESIZE - 1 1: mcr p15, 0, r0, c7, c10, 1 @ clean L1 D line add r0, r0, #CACHELINESIZE diff --git a/arch/arm/mm/proc-xscale.S b/arch/arm/mm/proc-xscale.S index d82590aa71c0..291dec830714 100644 --- a/arch/arm/mm/proc-xscale.S +++ b/arch/arm/mm/proc-xscale.S @@ -323,7 +323,7 @@ ENTRY(xscale_flush_kern_dcache_area) * - start - 
virtual start address * - end - virtual end address */ -xscale_dma_inv_range: +ENTRY(xscale_dma_inv_range) tst r0, #CACHELINESIZE - 1 bic r0, r0, #CACHELINESIZE - 1 mcrne p15, 0, r0, c7, c10, 1 @ clean D entry @@ -344,7 +344,7 @@ xscale_dma_inv_range: * - start - virtual start address * - end - virtual end address */ -xscale_dma_clean_range: +ENTRY(xscale_dma_clean_range) bic r0, r0, #CACHELINESIZE - 1 1: mcr p15, 0, r0, c7, c10, 1 @ clean D entry add r0, r0, #CACHELINESIZE @@ -445,6 +445,8 @@ ENDPROC(xscale_dma_unmap_area) a0_alias coherent_kern_range a0_alias coherent_user_range a0_alias flush_kern_dcache_area + a0_alias dma_clean_range + a0_alias dma_inv_range a0_alias dma_flush_range a0_alias dma_unmap_area From patchwork Mon Mar 27 12:13:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189164 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E1FEDC77B62 for ; Mon, 27 Mar 2023 12:18:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232716AbjC0MSA (ORCPT ); Mon, 27 Mar 2023 08:18:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33824 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232836AbjC0MRP (ORCPT ); Mon, 27 Mar 2023 08:17:15 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 01E3E55BF; Mon, 27 Mar 2023 05:16:27 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 8F993B81183; Mon, 27 Mar 2023 12:16:26 
+0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 13EE3C433AC; Mon, 27 Mar 2023 12:16:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919385; bh=z4o74QJO2HbXUwmw9wKbmWzDWd6hXoo60uihfOQ+9AY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LeIWTWJl7na8lr1HDpXLEID7w40kD3jUGlqyEdLIjp28/W62Bg2Q01Di95UzrKvRG HJPQ86ajlp3iCoZ3CJayJ9d+7TXVGDKXmF4H9pFJrUDGFmaBQ1sxvyiJllRmrJgLiW kC23F7Igb+UcfCgn+A3+gD/4+YvQeXUupIIuvr0FfATyUyZ8zELrbCiAvyv4ru8qIH aoWdr4S3g5t8VzyZGuK2ajzBYoNZaX0/7e8Xyof9vCg5r1OYOOAEwSmcelSSXNNM+u 9342trVskeAxtOQIo4GHu8vegYfBrfg6AvF+3aPu6K84TrrA9KjWBg30daulsN+rpN khfEwXxa706kA== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 17/21] ARM: dma-mapping: use arch_sync_dma_for_{device,cpu}() internally Date: Mon, 27 Mar 2023 14:13:13 +0200 Message-Id: <20230327121317.4081816-18-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org

From: Arnd Bergmann

The arm specific iommu code in dma-mapping.c uses the page+offset based __dma_page_cpu_to_dev()/__dma_page_dev_to_cpu() helpers in place of the phys_addr_t based arch_sync_dma_for_device()/arch_sync_dma_for_cpu() wrappers around them.

In order to be able to move the latter set of functions into common code, change the iommu implementation to use them directly and remove the internal ones as a separate interface.

As page+offset and phys_addr_t are equivalent, but are used in different parts of the code here, this allows removing some of the conversions but adds others elsewhere.
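The equivalence claimed above can be checked outside the kernel with a small userspace model. This is only a sketch under stated assumptions: 4 KiB pages, local stand-ins for PAGE_SIZE, PFN_DOWN() and PFN_UP(), and hypothetical helper names (span_from_page, span_from_phys, first_whole_pfn_old, first_whole_pfn_new, forms_agree) that do not exist in the tree. It compares the old page+offset arithmetic with the phys_addr_t arithmetic used in the diff below, including the "first fully covered page" computation from arch_sync_dma_for_cpu().

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the kernel macros; 4 KiB pages are an assumption. */
#define PAGE_SHIFT  12
#define PAGE_SIZE   (1ULL << PAGE_SHIFT)
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)
#define PFN_UP(x)   (((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)

typedef uint64_t phys_addr_t;

/* Hypothetical helper type: a page frame number plus an in-page offset. */
struct span {
	uint64_t pfn;
	uint64_t offset;
};

/* Old style: start from a page's pfn and an offset that may exceed a page. */
static inline struct span span_from_page(uint64_t page_pfn, uint64_t offset)
{
	struct span s = { page_pfn + offset / PAGE_SIZE, offset % PAGE_SIZE };
	return s;
}

/* New style: derive both values directly from the physical address. */
static inline struct span span_from_phys(phys_addr_t paddr)
{
	struct span s = { PFN_DOWN(paddr), paddr % PAGE_SIZE };
	return s;
}

/* Old style: skip a partially covered first page with an explicit pfn++. */
static inline uint64_t first_whole_pfn_old(uint64_t page_pfn, uint64_t offset)
{
	struct span s = span_from_page(page_pfn, offset);
	return s.offset ? s.pfn + 1 : s.pfn;
}

/* New style: PFN_UP() rounds up, which folds the pfn++ into one step. */
static inline uint64_t first_whole_pfn_new(phys_addr_t paddr)
{
	return PFN_UP(paddr);
}

/* Returns 1 if both forms describe the same pfn, offset and first whole page. */
int forms_agree(uint64_t page_pfn, uint64_t offset)
{
	phys_addr_t paddr = ((phys_addr_t)page_pfn << PAGE_SHIFT) + offset;
	struct span a = span_from_page(page_pfn, offset);
	struct span b = span_from_phys(paddr);

	return a.pfn == b.pfn && a.offset == b.offset &&
	       first_whole_pfn_old(page_pfn, offset) == first_whole_pfn_new(paddr);
}

/* Small self-check covering aligned, in-page and multi-page offsets. */
void self_check(void)
{
	assert(forms_agree(0x80000, 0));
	assert(forms_agree(0x80000, 0x234));
	assert(forms_agree(0x80000, 0x1234));
}
```

Calling self_check() from a test program exercises the three interesting cases: a page-aligned start, an offset inside the first page, and an offset that spills into a later page.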
Signed-off-by: Arnd Bergmann Reviewed-by: Linus Walleij --- arch/arm/mm/dma-mapping.c | 93 ++++++++++++++------------------------- 1 file changed, 33 insertions(+), 60 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 8bc01071474a..ce4b74f34a58 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -622,16 +622,14 @@ static void __arm_dma_free(struct device *dev, size_t size, void *cpu_addr, kfree(buf); } -static void dma_cache_maint_page(struct page *page, unsigned long offset, +static void dma_cache_maint(phys_addr_t paddr, size_t size, enum dma_data_direction dir, void (*op)(const void *, size_t, int)) { - unsigned long pfn; + unsigned long pfn = PFN_DOWN(paddr); + unsigned long offset = paddr % PAGE_SIZE; size_t left = size; - pfn = page_to_pfn(page) + offset / PAGE_SIZE; - offset %= PAGE_SIZE; - /* * A single sg entry may refer to multiple physically contiguous * pages. But we still need to process highmem pages individually. @@ -641,8 +639,7 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset, do { size_t len = left; void *vaddr; - - page = pfn_to_page(pfn); + struct page *page = pfn_to_page(pfn); if (PageHighMem(page)) { if (len + offset > PAGE_SIZE) @@ -674,14 +671,11 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset, * Note: Drivers should NOT use this function directly. 
* Use the driver DMA support - see dma-mapping.h (dma_sync_*) */ -static void __dma_page_cpu_to_dev(struct page *page, unsigned long off, - size_t size, enum dma_data_direction dir) +void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, + enum dma_data_direction dir) { - phys_addr_t paddr; + dma_cache_maint(paddr, size, dir, dmac_map_area); - dma_cache_maint_page(page, off, size, dir, dmac_map_area); - - paddr = page_to_phys(page) + off; if (dir == DMA_FROM_DEVICE) { outer_inv_range(paddr, paddr + size); } else { @@ -690,34 +684,30 @@ static void __dma_page_cpu_to_dev(struct page *page, unsigned long off, /* FIXME: non-speculating: flush on bidirectional mappings? */ } -static void __dma_page_dev_to_cpu(struct page *page, unsigned long off, - size_t size, enum dma_data_direction dir) +void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, + enum dma_data_direction dir) { - phys_addr_t paddr = page_to_phys(page) + off; - /* FIXME: non-speculating: not required */ /* in any case, don't bother invalidating if DMA to device */ if (dir != DMA_TO_DEVICE) { outer_inv_range(paddr, paddr + size); - dma_cache_maint_page(page, off, size, dir, dmac_unmap_area); + dma_cache_maint(paddr, size, dir, dmac_unmap_area); } /* * Mark the D-cache clean for these pages to avoid extra flushing. 
*/ if (dir != DMA_TO_DEVICE && size >= PAGE_SIZE) { - unsigned long pfn; + unsigned long pfn = PFN_UP(paddr); + unsigned long off = paddr & (PAGE_SIZE - 1); size_t left = size; - pfn = page_to_pfn(page) + off / PAGE_SIZE; - off %= PAGE_SIZE; - if (off) { - pfn++; + if (off) left -= PAGE_SIZE - off; - } + while (left >= PAGE_SIZE) { - page = pfn_to_page(pfn++); + struct page *page = pfn_to_page(pfn++); set_bit(PG_dcache_clean, &page->flags); left -= PAGE_SIZE; } @@ -1204,7 +1194,7 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg, unsigned int len = PAGE_ALIGN(s->offset + s->length); if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) - __dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir); + arch_sync_dma_for_device(phys + s->offset, s->length, dir); prot = __dma_info_to_prot(dir, attrs); @@ -1306,8 +1296,7 @@ static void arm_iommu_unmap_sg(struct device *dev, __iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s)); if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) - __dma_page_dev_to_cpu(sg_page(s), s->offset, - s->length, dir); + arch_sync_dma_for_cpu(sg_phys(s), s->length, dir); } } @@ -1329,7 +1318,7 @@ static void arm_iommu_sync_sg_for_cpu(struct device *dev, return; for_each_sg(sg, s, nents, i) - __dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir); + arch_sync_dma_for_cpu(sg_phys(s), s->length, dir); } @@ -1351,7 +1340,8 @@ static void arm_iommu_sync_sg_for_device(struct device *dev, return; for_each_sg(sg, s, nents, i) - __dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir); + arch_sync_dma_for_device(page_to_phys(sg_page(s)) + s->offset, + s->length, dir); } /** @@ -1373,7 +1363,8 @@ static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page, int ret, prot, len = PAGE_ALIGN(size + offset); if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) - __dma_page_cpu_to_dev(page, offset, size, dir); + arch_sync_dma_for_device(page_to_phys(page) + offset, + size, dir); 
dma_addr = __alloc_iova(mapping, len); if (dma_addr == DMA_MAPPING_ERROR) @@ -1406,7 +1397,7 @@ static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle, { struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev); dma_addr_t iova = handle & PAGE_MASK; - struct page *page; + phys_addr_t phys; int offset = handle & ~PAGE_MASK; int len = PAGE_ALIGN(size + offset); @@ -1414,8 +1405,8 @@ static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle, return; if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) { - page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova)); - __dma_page_dev_to_cpu(page, offset, size, dir); + phys = iommu_iova_to_phys(mapping->domain, handle); + arch_sync_dma_for_cpu(phys, size, dir); } iommu_unmap(mapping->domain, iova, len); @@ -1483,30 +1474,26 @@ static void arm_iommu_sync_single_for_cpu(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir) { struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev); - dma_addr_t iova = handle & PAGE_MASK; - struct page *page; - unsigned int offset = handle & ~PAGE_MASK; + phys_addr_t phys; - if (dev->dma_coherent || !iova) + if (dev->dma_coherent || !(handle & PAGE_MASK)) return; - page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova)); - __dma_page_dev_to_cpu(page, offset, size, dir); + phys = iommu_iova_to_phys(mapping->domain, handle); + arch_sync_dma_for_cpu(phys, size, dir); } static void arm_iommu_sync_single_for_device(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir) { struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev); - dma_addr_t iova = handle & PAGE_MASK; - struct page *page; - unsigned int offset = handle & ~PAGE_MASK; + phys_addr_t phys; - if (dev->dma_coherent || !iova) + if (dev->dma_coherent || !(handle & PAGE_MASK)) return; - page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova)); - __dma_page_cpu_to_dev(page, offset, size, dir); + phys = 
iommu_iova_to_phys(mapping->domain, handle); + arch_sync_dma_for_device(phys, size, dir); } static const struct dma_map_ops iommu_ops = { @@ -1789,20 +1776,6 @@ void arch_teardown_dma_ops(struct device *dev) set_dma_ops(dev, NULL); } -void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) -{ - __dma_page_cpu_to_dev(phys_to_page(paddr), paddr & (PAGE_SIZE - 1), - size, dir); -} - -void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) -{ - __dma_page_dev_to_cpu(phys_to_page(paddr), paddr & (PAGE_SIZE - 1), - size, dir); -} - void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs) { From patchwork Mon Mar 27 12:13:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189165 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 389F4C77B6D for ; Mon, 27 Mar 2023 12:18:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232778AbjC0MSV (ORCPT ); Mon, 27 Mar 2023 08:18:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34616 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232718AbjC0MRg (ORCPT ); Mon, 27 Mar 2023 08:17:36 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5F5855B85; Mon, 27 Mar 2023 05:16:35 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id BE4CB61200; Mon, 27 Mar 2023 12:16:34 +0000 (UTC) 
Received: by smtp.kernel.org (Postfix) with ESMTPSA id B5CB3C433A7; Mon, 27 Mar 2023 12:16:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919394; bh=1MU8OJwkkHt5sEwmg8fMO+rrTE5E92A6N2HbXcY8t6o=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=QCmyJCl/uzmN1uz2WuiI28nvGTdQ3lWgo54Cvdn08zKuIbJa9uJcuJ8xzBRXQvews zKHW2YzyjTX3fmCevlSQv1CJi6qVy6a7mTohS+a+ZEr0cox2g66ZhqqI5Abxw/oto0 P/sEVO8x1ykQCyatBDMsdjeI/Pe2xX5XTT0o3XFFGqVARAM4XnGYvAnca6YV0ylyPC KcxwbXOMCWgYj2vBgK6jHoppx9FGJbLvjkaveAjYX/B70ILKZ2bbJKjWcUElbwSJwd 8CPaWdKfzxNXZElXaT68suXPSJw1GtUNGMnFvxRzugvDhkrpDk3xSbqnenBGWsGCvb enei6QONy/e1w== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org, Daniel Golle Subject: [PATCH 18/21] ARM: drop SMP support for ARM11MPCore Date: Mon, 27 Mar 2023 14:13:14 +0200 Message-Id: <20230327121317.4081816-19-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org

From: Arnd Bergmann

The cache management operations for noncoherent DMA on ARMv6 work in two different ways:

 * When CONFIG_DMA_CACHE_RWFO is set, speculative prefetches on in-flight DMA buffers lead to data corruption when the prefetched data is written back on top of data from the device.

 * When CONFIG_DMA_CACHE_RWFO is disabled, a cache flush on one CPU is not seen by the other core(s), leading to inconsistent contents across the system.

As a consequence, neither configuration is actually safe to use in a general-purpose kernel that is used on both MPCore systems and ARM1176 with prefetching enabled.

We could add further workarounds to make the behavior more dynamic based on the system, but realistically, there are close to zero remaining users on any ARM11MPCore anyway, and nobody seems too interested in it, compared to the more popular ARM1176 used in BCM2835 and AST2500.
The Oxnas platform has some minimal support in OpenWrt, but most of the drivers and dts files never made it into the mainline kernel, while the Arm Versatile/Realview platform mainly serves as a reference system and does not need to be kept working once all other ARM11MPCore platforms are gone.

Take the easy way out here and drop support for multiprocessing on ARMv6, along with the CONFIG_DMA_CACHE_RWFO option and the cache management implementation for it. This also helps with other ARMv6 issues, but for the moment leaves the ability to build a kernel that can run on both ARMv7 SMP and single-processor ARMv6, which we probably want to stop supporting as well, but not as part of this series.

Cc: Neil Armstrong
Cc: Daniel Golle
Cc: Linus Walleij
Cc: linux-oxnas@groups.io
Signed-off-by: Arnd Bergmann
Acked-by: Neil Armstrong
Acked-by: Linus Walleij
Acked-by: Ard Biesheuvel
Acked-by: Catalin Marinas
---
I could use some help clarifying the above changelog text to describe the exact problem, and how CONFIG_DMA_CACHE_RWFO actually works on MPCore. The TRMs for both 1176 and 11MPCore only describe prefetching into the instruction cache, not the data cache, but this can end up in the outer cache as a result. The 1176 has some extra control bits to control prefetching, but I found no reference that explains why an MPCore does not run into the problem.
--- arch/arm/mach-oxnas/Kconfig | 4 - arch/arm/mach-oxnas/Makefile | 1 - arch/arm/mach-oxnas/headsmp.S | 23 ------ arch/arm/mach-oxnas/platsmp.c | 96 ---------------------- arch/arm/mach-versatile/platsmp-realview.c | 4 - arch/arm/mm/Kconfig | 19 ----- arch/arm/mm/cache-v6.S | 31 ------- 7 files changed, 178 deletions(-) delete mode 100644 arch/arm/mach-oxnas/headsmp.S delete mode 100644 arch/arm/mach-oxnas/platsmp.c diff --git a/arch/arm/mach-oxnas/Kconfig b/arch/arm/mach-oxnas/Kconfig index a9ded7079268..a054235c3d6c 100644 --- a/arch/arm/mach-oxnas/Kconfig +++ b/arch/arm/mach-oxnas/Kconfig @@ -28,10 +28,6 @@ config MACH_OX820 bool "Support OX820 Based Products" depends on ARCH_MULTI_V6 select ARM_GIC - select DMA_CACHE_RWFO if SMP - select HAVE_SMP - select HAVE_ARM_SCU if SMP - select HAVE_ARM_TWD if SMP help Include Support for the Oxford Semiconductor OX820 SoC Based Products. diff --git a/arch/arm/mach-oxnas/Makefile b/arch/arm/mach-oxnas/Makefile index 0e78ecfe6c49..a4e40e534e6a 100644 --- a/arch/arm/mach-oxnas/Makefile +++ b/arch/arm/mach-oxnas/Makefile @@ -1,2 +1 @@ # SPDX-License-Identifier: GPL-2.0-only -obj-$(CONFIG_SMP) += platsmp.o headsmp.o diff --git a/arch/arm/mach-oxnas/headsmp.S b/arch/arm/mach-oxnas/headsmp.S deleted file mode 100644 index 9c0f1479f33a..000000000000 --- a/arch/arm/mach-oxnas/headsmp.S +++ /dev/null @@ -1,23 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0-only */ -/* - * Copyright (C) 2013 Ma Haijun - * Copyright (c) 2003 ARM Limited - * All Rights Reserved - */ -#include -#include - - __INIT - -/* - * OX820 specific entry point for secondary CPUs. 
- */ -ENTRY(ox820_secondary_startup) - mov r4, #0 - /* invalidate both caches and branch target cache */ - mcr p15, 0, r4, c7, c7, 0 - /* - * we've been released from the holding pen: secondary_stack - * should now contain the SVC stack for this core - */ - b secondary_startup diff --git a/arch/arm/mach-oxnas/platsmp.c b/arch/arm/mach-oxnas/platsmp.c deleted file mode 100644 index f0a50b9e61df..000000000000 --- a/arch/arm/mach-oxnas/platsmp.c +++ /dev/null @@ -1,96 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only -/* - * Copyright (C) 2016 Neil Armstrong - * Copyright (C) 2013 Ma Haijun - * Copyright (C) 2002 ARM Ltd. - * All Rights Reserved - */ -#include -#include -#include -#include - -#include -#include -#include -#include - -extern void ox820_secondary_startup(void); - -static void __iomem *cpu_ctrl; -static void __iomem *gic_cpu_ctrl; - -#define HOLDINGPEN_CPU_OFFSET 0xc8 -#define HOLDINGPEN_LOCATION_OFFSET 0xc4 - -#define GIC_NCPU_OFFSET(cpu) (0x100 + (cpu)*0x100) -#define GIC_CPU_CTRL 0x00 -#define GIC_CPU_CTRL_ENABLE 1 - -static int __init ox820_boot_secondary(unsigned int cpu, - struct task_struct *idle) -{ - /* - * Write the address of secondary startup into the - * system-wide flags register. The BootMonitor waits - * until it receives a soft interrupt, and then the - * secondary CPU branches to this address. - */ - writel(virt_to_phys(ox820_secondary_startup), - cpu_ctrl + HOLDINGPEN_LOCATION_OFFSET); - - writel(cpu, cpu_ctrl + HOLDINGPEN_CPU_OFFSET); - - /* - * Enable GIC cpu interface in CPU Interface Control Register - */ - writel(GIC_CPU_CTRL_ENABLE, - gic_cpu_ctrl + GIC_NCPU_OFFSET(cpu) + GIC_CPU_CTRL); - - /* - * Send the secondary CPU a soft interrupt, thereby causing - * the boot monitor to read the system wide flags register, - * and branch to the address found there. 
- */ - arch_send_wakeup_ipi_mask(cpumask_of(cpu)); - - return 0; -} - -static void __init ox820_smp_prepare_cpus(unsigned int max_cpus) -{ - struct device_node *np; - void __iomem *scu_base; - - np = of_find_compatible_node(NULL, NULL, "arm,arm11mp-scu"); - scu_base = of_iomap(np, 0); - of_node_put(np); - if (!scu_base) - return; - - /* Remap CPU Interrupt Interface Registers */ - np = of_find_compatible_node(NULL, NULL, "arm,arm11mp-gic"); - gic_cpu_ctrl = of_iomap(np, 1); - of_node_put(np); - if (!gic_cpu_ctrl) - goto unmap_scu; - - np = of_find_compatible_node(NULL, NULL, "oxsemi,ox820-sys-ctrl"); - cpu_ctrl = of_iomap(np, 0); - of_node_put(np); - if (!cpu_ctrl) - goto unmap_scu; - - scu_enable(scu_base); - flush_cache_all(); - -unmap_scu: - iounmap(scu_base); -} - -static const struct smp_operations ox820_smp_ops __initconst = { - .smp_prepare_cpus = ox820_smp_prepare_cpus, - .smp_boot_secondary = ox820_boot_secondary, -}; - -CPU_METHOD_OF_DECLARE(ox820_smp, "oxsemi,ox820-smp", &ox820_smp_ops); diff --git a/arch/arm/mach-versatile/platsmp-realview.c b/arch/arm/mach-versatile/platsmp-realview.c index 5d363385c801..fa31fd2d211d 100644 --- a/arch/arm/mach-versatile/platsmp-realview.c +++ b/arch/arm/mach-versatile/platsmp-realview.c @@ -18,16 +18,12 @@ #define REALVIEW_SYS_FLAGSSET_OFFSET 0x30 static const struct of_device_id realview_scu_match[] = { - { .compatible = "arm,arm11mp-scu", }, { .compatible = "arm,cortex-a9-scu", }, { .compatible = "arm,cortex-a5-scu", }, { } }; static const struct of_device_id realview_syscon_match[] = { - { .compatible = "arm,core-module-integrator", }, - { .compatible = "arm,realview-eb-syscon", }, - { .compatible = "arm,realview-pb11mp-syscon", }, { .compatible = "arm,realview-pbx-syscon", }, { }, }; diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig index c5bbae86f725..16b62bc0a970 100644 --- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -937,25 +937,6 @@ config VDSO You must have glibc 2.22 or later for programs to 
seamlessly take advantage of this. -config DMA_CACHE_RWFO - bool "Enable read/write for ownership DMA cache maintenance" - depends on CPU_V6K && SMP - default y - help - The Snoop Control Unit on ARM11MPCore does not detect the - cache maintenance operations and the dma_{map,unmap}_area() - functions may leave stale cache entries on other CPUs. By - enabling this option, Read or Write For Ownership in the ARMv6 - DMA cache maintenance functions is performed. These LDR/STR - instructions change the cache line state to shared or modified - so that the cache operation has the desired effect. - - Note that the workaround is only valid on processors that do - not perform speculative loads into the D-cache. For such - processors, if cache maintenance operations are not broadcast - in hardware, other workarounds are needed (e.g. cache - maintenance broadcasting in software via FIQ). - config OUTER_CACHE bool diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S index abae7ff5defc..f6ee53c1de20 100644 --- a/arch/arm/mm/cache-v6.S +++ b/arch/arm/mm/cache-v6.S @@ -201,10 +201,6 @@ ENTRY(v6_flush_kern_dcache_area) * - end - virtual end address of region */ ENTRY(v6_dma_inv_range) -#ifdef CONFIG_DMA_CACHE_RWFO - ldrb r2, [r0] @ read for ownership - strb r2, [r0] @ write for ownership -#endif tst r0, #D_CACHE_LINE_SIZE - 1 bic r0, r0, #D_CACHE_LINE_SIZE - 1 #ifdef HARVARD_CACHE @@ -213,10 +209,6 @@ ENTRY(v6_dma_inv_range) mcrne p15, 0, r0, c7, c11, 1 @ clean unified line #endif tst r1, #D_CACHE_LINE_SIZE - 1 -#ifdef CONFIG_DMA_CACHE_RWFO - ldrbne r2, [r1, #-1] @ read for ownership - strbne r2, [r1, #-1] @ write for ownership -#endif bic r1, r1, #D_CACHE_LINE_SIZE - 1 #ifdef HARVARD_CACHE mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D line @@ -231,10 +223,6 @@ ENTRY(v6_dma_inv_range) #endif add r0, r0, #D_CACHE_LINE_SIZE cmp r0, r1 -#ifdef CONFIG_DMA_CACHE_RWFO - ldrlo r2, [r0] @ read for ownership - strlo r2, [r0] @ write for ownership -#endif blo 1b mov r0, #0 mcr 
p15, 0, r0, c7, c10, 4 @ drain write buffer @@ -248,9 +236,6 @@ ENTRY(v6_dma_inv_range) ENTRY(v6_dma_clean_range) bic r0, r0, #D_CACHE_LINE_SIZE - 1 1: -#ifdef CONFIG_DMA_CACHE_RWFO - ldr r2, [r0] @ read for ownership -#endif #ifdef HARVARD_CACHE mcr p15, 0, r0, c7, c10, 1 @ clean D line #else @@ -269,10 +254,6 @@ ENTRY(v6_dma_clean_range) * - end - virtual end address of region */ ENTRY(v6_dma_flush_range) -#ifdef CONFIG_DMA_CACHE_RWFO - ldrb r2, [r0] @ read for ownership - strb r2, [r0] @ write for ownership -#endif bic r0, r0, #D_CACHE_LINE_SIZE - 1 1: #ifdef HARVARD_CACHE @@ -282,10 +263,6 @@ ENTRY(v6_dma_flush_range) #endif add r0, r0, #D_CACHE_LINE_SIZE cmp r0, r1 -#ifdef CONFIG_DMA_CACHE_RWFO - ldrblo r2, [r0] @ read for ownership - strblo r2, [r0] @ write for ownership -#endif blo 1b mov r0, #0 mcr p15, 0, r0, c7, c10, 4 @ drain write buffer @@ -301,13 +278,7 @@ ENTRY(v6_dma_map_area) add r1, r1, r0 teq r2, #DMA_FROM_DEVICE beq v6_dma_inv_range -#ifndef CONFIG_DMA_CACHE_RWFO b v6_dma_clean_range -#else - teq r2, #DMA_TO_DEVICE - beq v6_dma_clean_range - b v6_dma_flush_range -#endif ENDPROC(v6_dma_map_area) /* @@ -317,11 +288,9 @@ ENDPROC(v6_dma_map_area) * - dir - DMA direction */ ENTRY(v6_dma_unmap_area) -#ifndef CONFIG_DMA_CACHE_RWFO add r1, r1, r0 teq r2, #DMA_TO_DEVICE bne v6_dma_inv_range -#endif ret lr ENDPROC(v6_dma_unmap_area) From patchwork Mon Mar 27 12:13:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189166 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 683D9C77B61 for ; Mon, 27 Mar 2023 12:18:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232696AbjC0MSk (ORCPT ); Mon, 27 Mar 2023 08:18:40 -0400 Received: from 
lindbergh.monkeyblade.net ([23.128.96.19]:35712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232754AbjC0MRu (ORCPT ); Mon, 27 Mar 2023 08:17:50 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 70FB361B4; Mon, 27 Mar 2023 05:16:45 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 1900EB8118F; Mon, 27 Mar 2023 12:16:44 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A199BC433A4; Mon, 27 Mar 2023 12:16:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919402; bh=5fk8QyC/KB/jrsvrHp6V4LA83bUbandj0Wfn0eu1ajs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=j2km9ZzYuNDulQLsscEoaytNZ1gOapGTux+/qnSpRR7kgHRFAwDWDmiH52U7YUwS2 l25ig62lChXkRW9dVZHw0u6u831HBjAVfoB4gwIhsDu9Oly0WFytQgzyZsIKEt4ZjO ebigDsT9ROl6sWYKYTUDMLRjvwmuklMuOt84flr18xR/2kTbMG38tS/UXspKPu2ixh EuScV8pm+I4HHjlAfxf0LfOpZq853V+LM4kxrKbWNcmBeEWw/4N5QI2a8DvVG5mYHv K6lh//J1Yh7TkacN4sV4vOvpMMmOkrqnjch6bIyUT4KqwtxFWozh9BZ59rFdrnUb0v pxs4A/jmHxEIw== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org
Subject: [PATCH 19/21] ARM: dma-mapping: use generic form of arch_sync_dma_* helpers
Date: Mon, 27 Mar 2023 14:13:15 +0200
Message-Id: <20230327121317.4081816-20-arnd@kernel.org>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org>
References: <20230327121317.4081816-1-arnd@kernel.org>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-sh@vger.kernel.org

From: Arnd Bergmann

As the final step of the conversion to generic arch_sync_dma_*
helpers, change the Arm implementation to look the same as the
new generic version, by calling the dmac_{clean,inv,flush}_range
low-level functions instead of the abstracted dmac_{map,unmap}_area
version.

On ARMv6/v7, this invalidates the caches after a DMA transfer from
a device because of speculative prefetching, while on earlier
versions it only needs to do this before the transfer.

This should not change any of the current behavior.

FIXME: address CONFIG_DMA_CACHE_RWFO properly.

Signed-off-by: Arnd Bergmann
---
 arch/arm/mm/dma-mapping-nommu.c | 11 +++----
 arch/arm/mm/dma-mapping.c       | 53 +++++++++++++++++++++++----------
 2 files changed, 43 insertions(+), 21 deletions(-)

diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c
index cfd9c933d2f0..12b5c6ae93fc 100644
--- a/arch/arm/mm/dma-mapping-nommu.c
+++ b/arch/arm/mm/dma-mapping-nommu.c
@@ -16,12 +16,13 @@
 void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
 		enum dma_data_direction dir)
 {
-	dmac_map_area(__va(paddr), size, dir);
-
-	if (dir == DMA_FROM_DEVICE)
+	if (dir == DMA_FROM_DEVICE) {
+		dmac_inv_range(__va(paddr), __va(paddr + size));
 		outer_inv_range(paddr, paddr + size);
-	else
+	} else {
+		dmac_clean_range(__va(paddr), __va(paddr + size));
 		outer_clean_range(paddr, paddr + size);
+	}
 }
 
 void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
@@ -29,7 +30,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 {
 	if (dir != DMA_TO_DEVICE) {
 		outer_inv_range(paddr, paddr + size);
-		dmac_unmap_area(__va(paddr), size, dir);
+		dmac_inv_range(__va(paddr), __va(paddr + size));
 	}
 }
 
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index ce4b74f34a58..cc702cb27ae7 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -623,8 +623,7 @@ static void __arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
 }
 
 static void dma_cache_maint(phys_addr_t paddr,
-	size_t size, enum dma_data_direction dir,
-	void (*op)(const void *, size_t, int))
+	size_t size, void (*op)(const void *, const void *))
 {
 	unsigned long pfn = PFN_DOWN(paddr);
 	unsigned long offset = paddr % PAGE_SIZE;
@@ -647,18 +646,18 @@ static void dma_cache_maint(phys_addr_t paddr,
 
 			if (cache_is_vipt_nonaliasing()) {
 				vaddr = kmap_atomic(page);
-				op(vaddr + offset, len, dir);
+				op(vaddr + offset, vaddr + offset + len);
 				kunmap_atomic(vaddr);
 			} else {
 				vaddr = kmap_high_get(page);
 				if (vaddr) {
-					op(vaddr + offset, len, dir);
+					op(vaddr + offset, vaddr + offset + len);
 					kunmap_high(page);
 				}
 			}
 		} else {
 			vaddr = page_address(page) + offset;
-			op(vaddr, len, dir);
+			op(vaddr, vaddr + len);
 		}
 		offset = 0;
 		pfn++;
@@ -666,6 +665,18 @@ static void dma_cache_maint(phys_addr_t paddr,
 	} while (left);
 }
 
+static bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	if (IS_ENABLED(CONFIG_CPU_V6) ||
+	    IS_ENABLED(CONFIG_CPU_V6K) ||
+	    IS_ENABLED(CONFIG_CPU_V7) ||
+	    IS_ENABLED(CONFIG_CPU_V7M))
+		return true;
+
+	/* FIXME: runtime detection */
+	return false;
+}
+
 /*
  * Make an area consistent for devices.
  * Note: Drivers should NOT use this function directly.
@@ -674,25 +685,35 @@ static void dma_cache_maint(phys_addr_t paddr,
 void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
 		enum dma_data_direction dir)
 {
-	dma_cache_maint(paddr, size, dir, dmac_map_area);
-
-	if (dir == DMA_FROM_DEVICE) {
-		outer_inv_range(paddr, paddr + size);
-	} else {
+	switch (dir) {
+	case DMA_TO_DEVICE:
+		dma_cache_maint(paddr, size, dmac_clean_range);
 		outer_clean_range(paddr, paddr + size);
+		break;
+	case DMA_FROM_DEVICE:
+		dma_cache_maint(paddr, size, dmac_inv_range);
+		outer_inv_range(paddr, paddr + size);
+		break;
+	case DMA_BIDIRECTIONAL:
+		if (arch_sync_dma_cpu_needs_post_dma_flush()) {
+			dma_cache_maint(paddr, size, dmac_clean_range);
+			outer_clean_range(paddr, paddr + size);
+		} else {
+			dma_cache_maint(paddr, size, dmac_flush_range);
+			outer_flush_range(paddr, paddr + size);
+		}
+		break;
+	default:
+		break;
 	}
-	/* FIXME: non-speculating: flush on bidirectional mappings?
*/ } void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, enum dma_data_direction dir) { - /* FIXME: non-speculating: not required */ - /* in any case, don't bother invalidating if DMA to device */ - if (dir != DMA_TO_DEVICE) { + if (dir != DMA_TO_DEVICE && arch_sync_dma_cpu_needs_post_dma_flush()) { outer_inv_range(paddr, paddr + size); - - dma_cache_maint(paddr, size, dir, dmac_unmap_area); + dma_cache_maint(paddr, size, dmac_inv_range); } /* From patchwork Mon Mar 27 12:13:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189167 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 464D8C77B6F for ; Mon, 27 Mar 2023 12:18:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232879AbjC0MS6 (ORCPT ); Mon, 27 Mar 2023 08:18:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34158 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231932AbjC0MSR (ORCPT ); Mon, 27 Mar 2023 08:18:17 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6FF465FEE; Mon, 27 Mar 2023 05:16:54 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id C939EB80B7E; Mon, 27 Mar 2023 12:16:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 501AFC433EF; Mon, 27 Mar 2023 12:16:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919411; bh=Rv4KYy6sKfgGeqiWLGkVJXsqyPpN0yobcAkWx6hOhRA=; 
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=seVtQRQNuB5GWBdC8cr5KZ0W+Dgkdso6D+FfC2lHsM55FW+HCEqGCdYrQFb6Hvnnl bXHyqMcd3DyyV7hi1FueJFrSKcIZT1GcdjqwY+W+kWsNdEnWRPJylqhRRg5RsHevq7 WBFUd0m9uRKlCn7LN+gS0Oo+KRc8tPOfbCxZQI64X7AW5W6Ofpjnb7fmDoiT4RacVw vk4b1z+FUEyiJWAWMcn0G5UR2lUKTJzFuBhtLP8MwNCX/APqFQwsqhZT0tSSdZGUwv DnaR/fiB4vNkK6jouFd9Zrn6qoIastoqdb9WZidfdEvnyO6j1MKTBI641v/Mfj9Cpi 1m6kthf5XYK2Q== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 20/21] ARM: dma-mapping: split out arch_dma_mark_clean() helper Date: Mon, 27 Mar 2023 14:13:16 +0200 Message-Id: <20230327121317.4081816-21-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann The arm version of the arch_sync_dma_for_cpu() function annotates pages as PG_dcache_clean after a DMA, but no other architecture does this here. 
On ia64, the same thing is done in arch_dma_mark_clean(), so it
makes sense to use the same hook in order to have identical
arch_sync_dma_for_cpu() semantics as all other architectures.

Splitting this out has multiple effects:

 - For dma-direct, this now gets called after arch_sync_dma_for_cpu()
   for DMA_FROM_DEVICE mappings, but not for DMA_BIDIRECTIONAL. While
   it would not be harmful to keep doing it for bidirectional mappings,
   those are apparently not used in any callers that care about the
   flag.

 - Since arm has its own dma-iommu abstraction, this now also needs to
   call the same function, so the calls are added there to mirror the
   dma-direct version.

 - Like dma-direct, the dma-iommu version now marks the dcache clean
   for both coherent and noncoherent devices after a DMA, but it only
   does this for DMA_FROM_DEVICE, not DMA_BIDIRECTIONAL.

[ HELP NEEDED: can anyone confirm that it is a correct assumption
  on arm that a cache-coherent device writing to a page always results
  in it being in a PG_dcache_clean state like on ia64, or can a device
  write directly into the dcache? ]

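For readers checking which pages the split-out arch_dma_mark_clean() helper touches, here is a small user-space model of its page walk. It is only a sketch: the function name pages_marked_clean(), the SKETCH_PAGE_SIZE constant and the 4 KiB page size are assumptions made for illustration, and no page flags are actually set.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/*
 * Hypothetical model of the loop in arch_dma_mark_clean(): returns how
 * many pages would get PG_dcache_clean and stores the first such page
 * frame number in *first_pfn (left untouched if no page qualifies).
 */
static size_t pages_marked_clean(uint64_t paddr, size_t size,
				 uint64_t *first_pfn)
{
	/* same rounding as PFN_UP(): first page that starts inside the buffer */
	uint64_t pfn = (paddr + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	uint64_t off = paddr & (SKETCH_PAGE_SIZE - 1);
	size_t left = size;
	size_t marked = 0;

	if (size < SKETCH_PAGE_SIZE)
		return 0;

	if (off)
		left -= SKETCH_PAGE_SIZE - off;	/* skip the partial first page */

	/* only pages that the buffer covers completely are counted */
	while (left >= SKETCH_PAGE_SIZE) {
		if (marked == 0)
			*first_pfn = pfn;
		pfn++;
		marked++;
		left -= SKETCH_PAGE_SIZE;
	}
	return marked;
}
```

With a buffer at physical address 0x1800 and a length of 0x3000, only page frames 2 and 3 are covered completely, so the model reports two pages starting at frame 2; the partial pages at either end are left alone.
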
Signed-off-by: Arnd Bergmann
---
 arch/arm/Kconfig          |  1 +
 arch/arm/mm/dma-mapping.c | 71 +++++++++++++++++++++++----------------
 2 files changed, 43 insertions(+), 29 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e24a9820e12f..125d58c54ab1 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -7,6 +7,7 @@ config ARM
 	select ARCH_HAS_BINFMT_FLAT
 	select ARCH_HAS_CURRENT_STACK_POINTER
 	select ARCH_HAS_DEBUG_VIRTUAL if MMU
+	select ARCH_HAS_DMA_MARK_CLEAN if MMU
 	select ARCH_HAS_DMA_WRITE_COMBINE if !ARM_DMA_MEM_BUFFERABLE
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_FORTIFY_SOURCE
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index cc702cb27ae7..b703cb83d27e 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -665,6 +665,28 @@ static void dma_cache_maint(phys_addr_t paddr,
 	} while (left);
 }
 
+/*
+ * Mark the D-cache clean for these pages to avoid extra flushing.
+ */
+void arch_dma_mark_clean(phys_addr_t paddr, size_t size)
+{
+	unsigned long pfn = PFN_UP(paddr);
+	unsigned long off = paddr & (PAGE_SIZE - 1);
+	size_t left = size;
+
+	if (size < PAGE_SIZE)
+		return;
+
+	if (off)
+		left -= PAGE_SIZE - off;
+
+	while (left >= PAGE_SIZE) {
+		struct page *page = pfn_to_page(pfn++);
+		set_bit(PG_dcache_clean, &page->flags);
+		left -= PAGE_SIZE;
+	}
+}
+
 static bool arch_sync_dma_cpu_needs_post_dma_flush(void)
 {
 	if (IS_ENABLED(CONFIG_CPU_V6) ||
@@ -715,24 +737,6 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 		outer_inv_range(paddr, paddr + size);
 		dma_cache_maint(paddr, size, dmac_inv_range);
 	}
-
-	/*
-	 * Mark the D-cache clean for these pages to avoid extra flushing.
-	 */
-	if (dir != DMA_TO_DEVICE && size >= PAGE_SIZE) {
-		unsigned long pfn = PFN_UP(paddr);
-		unsigned long off = paddr & (PAGE_SIZE - 1);
-		size_t left = size;
-
-		if (off)
-			left -= PAGE_SIZE - off;
-
-		while (left >= PAGE_SIZE) {
-			struct page *page = pfn_to_page(pfn++);
-			set_bit(PG_dcache_clean, &page->flags);
-			left -= PAGE_SIZE;
-		}
-	}
 }
 
 #ifdef CONFIG_ARM_DMA_USE_IOMMU
@@ -1294,6 +1298,17 @@ static int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg,
 	return -EINVAL;
 }
 
+static void arm_iommu_sync_dma_for_cpu(phys_addr_t phys, size_t len,
+				       enum dma_data_direction dir,
+				       bool dma_coherent)
+{
+	if (!dma_coherent)
+		arch_sync_dma_for_cpu(phys, len, dir);
+
+	if (dir == DMA_FROM_DEVICE)
+		arch_dma_mark_clean(phys, len);
+}
+
 /**
  * arm_iommu_unmap_sg - unmap a set of SG buffers mapped by dma_map_sg
  * @dev: valid struct device pointer
@@ -1316,8 +1331,9 @@ static void arm_iommu_unmap_sg(struct device *dev,
 		if (sg_dma_len(s))
 			__iommu_remove_mapping(dev, sg_dma_address(s),
 					       sg_dma_len(s));
-		if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-			arch_sync_dma_for_cpu(sg_phys(s), s->length, dir);
+		if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+			arm_iommu_sync_dma_for_cpu(sg_phys(s), s->length, dir,
+						   dev->dma_coherent);
 	}
 }
 
@@ -1335,12 +1351,9 @@ static void arm_iommu_sync_sg_for_cpu(struct device *dev,
 	struct scatterlist *s;
 	int i;
 
-	if (dev->dma_coherent)
-		return;
-
 	for_each_sg(sg, s, nents, i)
-		arch_sync_dma_for_cpu(sg_phys(s), s->length, dir);
-
+		arm_iommu_sync_dma_for_cpu(sg_phys(s), s->length, dir,
+					   dev->dma_coherent);
 }
 
 /**
@@ -1425,9 +1438,9 @@ static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle,
 	if (!iova)
 		return;
 
-	if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
 		phys = iommu_iova_to_phys(mapping->domain, handle);
-		arch_sync_dma_for_cpu(phys, size, dir);
+		arm_iommu_sync_dma_for_cpu(phys, size, dir, dev->dma_coherent);
 	}
 
 	iommu_unmap(mapping->domain,
iova, len); @@ -1497,11 +1510,11 @@ static void arm_iommu_sync_single_for_cpu(struct device *dev, struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev); phys_addr_t phys; - if (dev->dma_coherent || !(handle & PAGE_MASK)) + if (!(handle & PAGE_MASK)) return; phys = iommu_iova_to_phys(mapping->domain, handle); - arch_sync_dma_for_cpu(phys, size, dir); + arm_iommu_sync_dma_for_cpu(phys, size, dir, dev->dma_coherent); } static void arm_iommu_sync_single_for_device(struct device *dev, From patchwork Mon Mar 27 12:13:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 13189217 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5AEDAC77B6F for ; Mon, 27 Mar 2023 12:19:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232912AbjC0MTL (ORCPT ); Mon, 27 Mar 2023 08:19:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33816 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232213AbjC0MSj (ORCPT ); Mon, 27 Mar 2023 08:18:39 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 18C7E6180; Mon, 27 Mar 2023 05:17:03 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 7FA62B81151; Mon, 27 Mar 2023 12:17:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 100F9C433D2; Mon, 27 Mar 2023 12:16:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679919420; 
bh=Ml5ZUxLRBcXZxDX8ISyxdZtftACTwtt/RJRGQgyDcA4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=syMJ7nxN1404wL8MSmBDBzMbmZHCdTm1Ngp/hp0OZKrTQv6VukcslNqnocgIYm0Pg z2nDiDT5g3WGM7PQyOsYmBSR9OWBLcuCQsrbgKlcQhgarGx+B1WnHMbIAicdtUX90I rv7MkL1vIAzNzi9mV+9HYRQdyax0NNIb8SsxYgbEwbxZ3T+9s7FySsaLMrv38UOthy rsnkD0oaYSIFFeNAIxmAdoDDC6b+8rN89kcPfR/lqO8SVhG1dbQNBMihp0dKI9sDYi S4OY7F19HPM4r+W9fx9IRFRNFVh1DnIy9BLSjbmduw1o6ookXnZJF7dHfVeo3epEYb LMkF5xk0Gxohg== From: Arnd Bergmann To: linux-kernel@vger.kernel.org Cc: Arnd Bergmann , Vineet Gupta , Russell King , Neil Armstrong , Linus Walleij , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Stafford Horne , Helge Deller , Michael Ellerman , Christophe Leroy , Paul Walmsley , Palmer Dabbelt , Rich Felker , John Paul Adrian Glaubitz , "David S. Miller" , Max Filippov , Christoph Hellwig , Robin Murphy , Lad Prabhakar , Conor Dooley , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org Subject: [PATCH 21/21] dma-mapping: replace custom code with generic implementation Date: Mon, 27 Mar 2023 14:13:17 +0200 Message-Id: <20230327121317.4081816-22-arnd@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> References: <20230327121317.4081816-1-arnd@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org From: Arnd Bergmann Now that all of these have consistent behavior, replace them with a single shared implementation of arch_sync_dma_for_device() and arch_sync_dma_for_cpu() and three parameters to 
pick how they should operate:

 - If the CPU has speculative prefetching, then the cache
   has to be invalidated after a transfer from the device.
   On the rarer CPUs without prefetching, this can be skipped,
   with all cache management happening before the transfer.
   This flag can be runtime detected, but is usually fixed
   per architecture.

 - Some architectures currently clean the caches before DMA
   from a device, while others invalidate them. There has not
   been a conclusion regarding whether we should change all
   architectures to use clean instead, so this adds an
   architecture-specific flag that we can change later on.

 - On 32-bit Arm, the arch_sync_dma_for_cpu() function keeps
   track of pages that are marked clean in the page cache, to
   avoid flushing them again. The implementation for this is
   generic enough to work on all architectures that use the
   PG_dcache_clean page flag, but a Kconfig symbol is used
   to only enable it on Arm to preserve the existing behavior.

For the function naming, I picked 'wback' over 'clean', and 'wback_inv'
over 'flush', to avoid any ambiguity of what the helper functions are
supposed to do.

Moving the global functions into a header file is usually a bad idea
as it prevents the header from being included more than once, but it
helps keep the behavior as close as possible to the previous state,
including the possibility of inlining most of it into these functions
where that was done before. This also helps keep the global namespace
clean, by hiding the new arch_dma_cache{_wback,_inv,_wback_inv} from
device drivers that might use them incorrectly.

It would be possible to do this one architecture at a time, but
as the change is the same everywhere, the combined patch helps
explain it better once.

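To make the interaction of the first two parameters concrete, the following user-space sketch models which cache operation each direction would select. It is not the contents of include/linux/dma-sync.h (not shown in this excerpt); the sketch_* names, the two global flags and the trace buffer are assumptions made for illustration, and the page-flag tracking from the third parameter is left out.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's enum dma_data_direction. */
enum sketch_dir { SKETCH_BIDIRECTIONAL, SKETCH_TO_DEVICE, SKETCH_FROM_DEVICE };

/* Stand-ins for the two per-architecture predicates used in the diff. */
static bool clean_before_fromdevice;	/* arch_sync_dma_clean_before_fromdevice() */
static bool cpu_needs_post_dma_flush;	/* arch_sync_dma_cpu_needs_post_dma_flush() */

/* Records which operation would run instead of touching a real cache. */
static char trace[64];

static void record(const char *op)
{
	strncat(trace, op, sizeof(trace) - strlen(trace) - 1);
}

/* Before the transfer: make memory consistent for the device. */
static void sketch_sync_for_device(enum sketch_dir dir)
{
	switch (dir) {
	case SKETCH_TO_DEVICE:
		record("wback;");
		break;
	case SKETCH_FROM_DEVICE:
		record(clean_before_fromdevice ? "wback;" : "inv;");
		break;
	case SKETCH_BIDIRECTIONAL:
		/* with a later invalidate pending, a writeback is enough */
		record(cpu_needs_post_dma_flush ? "wback;" : "wback_inv;");
		break;
	}
}

/* After the transfer: only speculating CPUs need a second invalidate. */
static void sketch_sync_for_cpu(enum sketch_dir dir)
{
	if (dir != SKETCH_TO_DEVICE && cpu_needs_post_dma_flush)
		record("inv;");
}
```

On a CPU that prefetches speculatively, a bidirectional mapping in this model gets a writeback before the transfer and an invalidate afterwards; without prefetching, a single writeback-plus-invalidate before the transfer covers both.
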
Signed-off-by: Arnd Bergmann Reviewed-by: Lad Prabhakar Tested-by: Lad Prabhakar --- arch/arc/mm/dma.c | 66 +++++------------- arch/arm/Kconfig | 3 + arch/arm/mm/dma-mapping-nommu.c | 39 ++++++----- arch/arm/mm/dma-mapping.c | 64 +++++++----------- arch/arm64/mm/dma-mapping.c | 28 +++++--- arch/csky/mm/dma-mapping.c | 44 ++++++------ arch/hexagon/kernel/dma.c | 44 ++++++------ arch/m68k/kernel/dma.c | 43 +++++++----- arch/microblaze/kernel/dma.c | 48 +++++++------- arch/mips/mm/dma-noncoherent.c | 60 +++++++---------- arch/nios2/mm/dma-mapping.c | 57 +++++++--------- arch/openrisc/kernel/dma.c | 63 +++++++++++------- arch/parisc/kernel/pci-dma.c | 46 ++++++------- arch/powerpc/mm/dma-noncoherent.c | 34 ++++++---- arch/riscv/mm/dma-noncoherent.c | 51 +++++++------- arch/sh/kernel/dma-coherent.c | 43 +++++++----- arch/sparc/kernel/ioport.c | 38 ++++++++--- arch/xtensa/kernel/pci-dma.c | 40 ++++++----- include/linux/dma-sync.h | 107 ++++++++++++++++++++++++++++++ 19 files changed, 527 insertions(+), 391 deletions(-) create mode 100644 include/linux/dma-sync.h diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c index ddb96786f765..61cd01646222 100644 --- a/arch/arc/mm/dma.c +++ b/arch/arc/mm/dma.c @@ -30,63 +30,33 @@ void arch_dma_prep_coherent(struct page *page, size_t size) dma_cache_wback_inv(page_to_phys(page), size); } -/* - * Cache operations depending on function and direction argument, inspired by - * https://lore.kernel.org/lkml/20180518175004.GF17671@n2100.armlinux.org.uk - * "dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] - * dma-mapping: provide a generic dma-noncoherent implementation)" - * - * | map == for_device | unmap == for_cpu - * |---------------------------------------------------------------- - * TO_DEV | writeback writeback | none none - * FROM_DEV | invalidate invalidate | invalidate* invalidate* - * BIDIR | writeback writeback | invalidate invalidate - * - * [*] needed for CPU speculative prefetches - * - * NOTE: we don't 
check the validity of direction argument as it is done in - * upper layer functions (in include/linux/dma-mapping.h) - */ - -void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) +static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size) { - switch (dir) { - case DMA_TO_DEVICE: - dma_cache_wback(paddr, size); - break; - - case DMA_FROM_DEVICE: - dma_cache_inv(paddr, size); - break; - - case DMA_BIDIRECTIONAL: - dma_cache_wback(paddr, size); - break; + dma_cache_wback(paddr, size); +} - default: - break; - } +static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size) +{ + dma_cache_inv(paddr, size); } -void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) +static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size) { - switch (dir) { - case DMA_TO_DEVICE: - break; + dma_cache_wback_inv(paddr, size); +} - /* FROM_DEVICE invalidate needed if speculative CPU prefetch only */ - case DMA_FROM_DEVICE: - case DMA_BIDIRECTIONAL: - dma_cache_inv(paddr, size); - break; +static inline bool arch_sync_dma_clean_before_fromdevice(void) +{ + return false; +} - default: - break; - } +static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void) +{ + return true; } +#include + /* * Plug in direct dma map ops. 
*/ diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 125d58c54ab1..0de84e861027 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -212,6 +212,9 @@ config LOCKDEP_SUPPORT bool default y +config ARCH_DMA_MARK_DCACHE_CLEAN + def_bool y + config ARCH_HAS_ILOG2_U32 bool diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c index 12b5c6ae93fc..0817274aed15 100644 --- a/arch/arm/mm/dma-mapping-nommu.c +++ b/arch/arm/mm/dma-mapping-nommu.c @@ -13,27 +13,36 @@ #include "dma.h" -void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) +static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size) { - if (dir == DMA_FROM_DEVICE) { - dmac_inv_range(__va(paddr), __va(paddr + size)); - outer_inv_range(paddr, paddr + size); - } else { - dmac_clean_range(__va(paddr), __va(paddr + size)); - outer_clean_range(paddr, paddr + size); - } + dmac_clean_range(__va(paddr), __va(paddr + size)); + outer_clean_range(paddr, paddr + size); } -void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) +static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size) { - if (dir != DMA_TO_DEVICE) { - outer_inv_range(paddr, paddr + size); - dmac_inv_range(__va(paddr), __va(paddr)); - } + dmac_inv_range(__va(paddr), __va(paddr + size)); + outer_inv_range(paddr, paddr + size); } +static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size) +{ + dmac_flush_range(__va(paddr), __va(paddr + size)); + outer_flush_range(paddr, paddr + size); +} + +static inline bool arch_sync_dma_clean_before_fromdevice(void) +{ + return false; +} + +static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void) +{ + return true; +} + +#include + void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size, const struct iommu_ops *iommu, bool coherent) { diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index b703cb83d27e..aa6ee820a0ab 100644 --- 
a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -687,6 +687,30 @@ void arch_dma_mark_clean(phys_addr_t paddr, size_t size) } } +static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size) +{ + dma_cache_maint(paddr, size, dmac_clean_range); + outer_clean_range(paddr, paddr + size); +} + + +static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size) +{ + dma_cache_maint(paddr, size, dmac_inv_range); + outer_inv_range(paddr, paddr + size); +} + +static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size) +{ + dma_cache_maint(paddr, size, dmac_flush_range); + outer_flush_range(paddr, paddr + size); +} + +static inline bool arch_sync_dma_clean_before_fromdevice(void) +{ + return false; +} + static bool arch_sync_dma_cpu_needs_post_dma_flush(void) { if (IS_ENABLED(CONFIG_CPU_V6) || @@ -699,45 +723,7 @@ static bool arch_sync_dma_cpu_needs_post_dma_flush(void) return false; } -/* - * Make an area consistent for devices. - * Note: Drivers should NOT use this function directly. 
- * Use the driver DMA support - see dma-mapping.h (dma_sync_*) - */ -void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) -{ - switch (dir) { - case DMA_TO_DEVICE: - dma_cache_maint(paddr, size, dmac_clean_range); - outer_clean_range(paddr, paddr + size); - break; - case DMA_FROM_DEVICE: - dma_cache_maint(paddr, size, dmac_inv_range); - outer_inv_range(paddr, paddr + size); - break; - case DMA_BIDIRECTIONAL: - if (arch_sync_dma_cpu_needs_post_dma_flush()) { - dma_cache_maint(paddr, size, dmac_clean_range); - outer_clean_range(paddr, paddr + size); - } else { - dma_cache_maint(paddr, size, dmac_flush_range); - outer_flush_range(paddr, paddr + size); - } - break; - default: - break; - } -} - -void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) -{ - if (dir != DMA_TO_DEVICE && arch_sync_dma_cpu_needs_post_dma_flush()) { - outer_inv_range(paddr, paddr + size); - dma_cache_maint(paddr, size, dmac_inv_range); - } -} +#include #ifdef CONFIG_ARM_DMA_USE_IOMMU diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index 5240f6acad64..bae741aa65e9 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -13,25 +13,33 @@ #include #include -void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) +static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size) { - unsigned long start = (unsigned long)phys_to_virt(paddr); + dcache_clean_poc(paddr, paddr + size); +} - dcache_clean_poc(start, start + size); +static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size) +{ + dcache_inval_poc(paddr, paddr + size); } -void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) +static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size) { - unsigned long start = (unsigned long)phys_to_virt(paddr); + dcache_clean_inval_poc(paddr, paddr + size); +} - if (dir == 
DMA_TO_DEVICE) - return; +static inline bool arch_sync_dma_clean_before_fromdevice(void) +{ + return true; +} - dcache_inval_poc(start, start + size); +static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void) +{ + return true; } +#include + void arch_dma_prep_coherent(struct page *page, size_t size) { unsigned long start = (unsigned long)page_address(page); diff --git a/arch/csky/mm/dma-mapping.c b/arch/csky/mm/dma-mapping.c index c90f912e2822..9402e101b363 100644 --- a/arch/csky/mm/dma-mapping.c +++ b/arch/csky/mm/dma-mapping.c @@ -55,31 +55,29 @@ void arch_dma_prep_coherent(struct page *page, size_t size) cache_op(page_to_phys(page), size, dma_wbinv_set_zero_range); } -void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) +static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size) { - switch (dir) { - case DMA_TO_DEVICE: - case DMA_FROM_DEVICE: - case DMA_BIDIRECTIONAL: - cache_op(paddr, size, dma_wb_range); - break; - default: - BUG(); - } + cache_op(paddr, size, dma_wb_range); } -void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, - enum dma_data_direction dir) +static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size) { - switch (dir) { - case DMA_TO_DEVICE: - return; - case DMA_FROM_DEVICE: - case DMA_BIDIRECTIONAL: - cache_op(paddr, size, dma_inv_range); - break; - default: - BUG(); - } + cache_op(paddr, size, dma_inv_range); } + +static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size) +{ + cache_op(paddr, size, dma_wbinv_range); +} + +static inline bool arch_sync_dma_clean_before_fromdevice(void) +{ + return true; +} + +static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void) +{ + return true; +} + +#include diff --git a/arch/hexagon/kernel/dma.c b/arch/hexagon/kernel/dma.c index 882680e81a30..e6538128a75b 100644 --- a/arch/hexagon/kernel/dma.c +++ b/arch/hexagon/kernel/dma.c @@ -9,29 +9,33 @@ #include #include -void arch_sync_dma_for_device(phys_addr_t 
paddr, size_t size, - enum dma_data_direction dir) +static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size) { - void *addr = phys_to_virt(paddr); - - switch (dir) { - case DMA_TO_DEVICE: - hexagon_clean_dcache_range((unsigned long) addr, - (unsigned long) addr + size); - break; - case DMA_FROM_DEVICE: - hexagon_inv_dcache_range((unsigned long) addr, - (unsigned long) addr + size); - break; - case DMA_BIDIRECTIONAL: - flush_dcache_range((unsigned long) addr, - (unsigned long) addr + size); - break; - default: - BUG(); - } + hexagon_clean_dcache_range(paddr, paddr + size); } +static inline void arch_dma_cache_inv(phys_addr_t start, size_t size) +{ + hexagon_inv_dcache_range(paddr, paddr + size); +} + +static inline void arch_dma_cache_wback_inv(phys_addr_t start, size_t size) +{ + hexagon_flush_dcache_range(paddr, paddr + size); +} + +static inline bool arch_sync_dma_clean_before_fromdevice(void) +{ + return false; +} + +static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void) +{ + return false; +} + +#include + /* * Our max_low_pfn should have been backed off by 16MB in mm/init.c to create * DMA coherent space. Use that for the pool. diff --git a/arch/m68k/kernel/dma.c b/arch/m68k/kernel/dma.c index 2e192a5df949..aa9b434e6df8 100644 --- a/arch/m68k/kernel/dma.c +++ b/arch/m68k/kernel/dma.c @@ -58,20 +58,33 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr, #endif /* CONFIG_MMU && !CONFIG_COLDFIRE */ -void arch_sync_dma_for_device(phys_addr_t handle, size_t size, - enum dma_data_direction dir) +static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size) { - switch (dir) { - case DMA_BIDIRECTIONAL: - case DMA_TO_DEVICE: - cache_push(handle, size); - break; - case DMA_FROM_DEVICE: - cache_clear(handle, size); - break; - default: - pr_err_ratelimited("dma_sync_single_for_device: unsupported dir %u\n", - dir); - break; - } + /* + * cache_push() always invalidates in addition to cleaning + * write-back caches. 
+	 */
+	cache_push(paddr, size);
+}
+
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+	cache_clear(paddr, size);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+	cache_push(paddr, size);
 }
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+	return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	return false;
+}
+
+#include
diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c
index b4c4e45fd45e..01110d4aa5b0 100644
--- a/arch/microblaze/kernel/dma.c
+++ b/arch/microblaze/kernel/dma.c
@@ -14,32 +14,30 @@
 #include
 #include

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-	switch (direction) {
-	case DMA_TO_DEVICE:
-	case DMA_BIDIRECTIONAL:
-		flush_dcache_range(paddr, paddr + size);
-		break;
-	case DMA_FROM_DEVICE:
-		invalidate_dcache_range(paddr, paddr + size);
-		break;
-	default:
-		BUG();
-	}
+	/* writeback plus invalidate, could be a nop on WT caches */
+	flush_dcache_range(paddr, paddr + size);
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
-	switch (direction) {
-	case DMA_TO_DEVICE:
-		break;
-	case DMA_BIDIRECTIONAL:
-	case DMA_FROM_DEVICE:
-		invalidate_dcache_range(paddr, paddr + size);
-		break;
-	default:
-		BUG();
-	}}
+	invalidate_dcache_range(paddr, paddr + size);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+	flush_dcache_range(paddr, paddr + size);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+	return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	return true;
+}
+
+#include
diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c
index b9d68bcc5d53..902d4b7c1f85 100644
--- a/arch/mips/mm/dma-noncoherent.c
+++ b/arch/mips/mm/dma-noncoherent.c
@@ -85,50 +85,38 @@ static inline void dma_sync_phys(phys_addr_t paddr, size_t size,
 	} while (left);
 }

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-	switch (dir) {
-	case DMA_TO_DEVICE:
-		dma_sync_phys(paddr, size, _dma_cache_wback);
-		break;
-	case DMA_FROM_DEVICE:
-		dma_sync_phys(paddr, size, _dma_cache_inv);
-		break;
-	case DMA_BIDIRECTIONAL:
-		if (IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
-		    cpu_needs_post_dma_flush())
-			dma_sync_phys(paddr, size, _dma_cache_wback);
-		else
-			dma_sync_phys(paddr, size, _dma_cache_wback_inv);
-		break;
-	default:
-		break;
-	}
+	dma_sync_phys(paddr, size, _dma_cache_wback);
 }

-#ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU
-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
-	switch (dir) {
-	case DMA_TO_DEVICE:
-		break;
-	case DMA_FROM_DEVICE:
-	case DMA_BIDIRECTIONAL:
-		if (cpu_needs_post_dma_flush())
-			dma_sync_phys(paddr, size, _dma_cache_inv);
-		break;
-	default:
-		break;
-	}
+	dma_sync_phys(paddr, size, _dma_cache_inv);
 }
-#endif
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+	dma_sync_phys(paddr, size, _dma_cache_wback_inv);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+	return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	return IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
+	       cpu_needs_post_dma_flush();
+}
+
+#include

 #ifdef CONFIG_ARCH_HAS_SETUP_DMA_OPS
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-		const struct iommu_ops *iommu, bool coherent)
+		const struct iommu_ops *iommu, bool coherent)
 {
-	dev->dma_coherent = coherent;
+	dev->dma_coherent = coherent;
 }
 #endif
diff --git a/arch/nios2/mm/dma-mapping.c b/arch/nios2/mm/dma-mapping.c
index fd887d5f3f9a..29978970955e 100644
--- a/arch/nios2/mm/dma-mapping.c
+++ b/arch/nios2/mm/dma-mapping.c
@@ -13,53 +13,46 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
 #include

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
+	/*
+	 * We just need to write back the caches here, but Nios2 flush
+	 * instruction will do both writeback and invalidate.
+	 */
 	void *vaddr = phys_to_virt(paddr);
+	flush_dcache_range((unsigned long)vaddr, (unsigned long)(vaddr + size));
+}

-	switch (dir) {
-	case DMA_FROM_DEVICE:
-		invalidate_dcache_range((unsigned long)vaddr,
-			(unsigned long)(vaddr + size));
-		break;
-	case DMA_TO_DEVICE:
-		/*
-		 * We just need to flush the caches here , but Nios2 flush
-		 * instruction will do both writeback and invalidate.
-		 */
-	case DMA_BIDIRECTIONAL: /* flush and invalidate */
-		flush_dcache_range((unsigned long)vaddr,
-			(unsigned long)(vaddr + size));
-		break;
-	default:
-		BUG();
-	}
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+	unsigned long vaddr = (unsigned long)phys_to_virt(paddr);
+	invalidate_dcache_range(vaddr, (unsigned long)(vaddr + size));
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);
+	flush_dcache_range((unsigned long)vaddr, (unsigned long)(vaddr + size));
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+	return false;
+}

-	switch (dir) {
-	case DMA_BIDIRECTIONAL:
-	case DMA_FROM_DEVICE:
-		invalidate_dcache_range((unsigned long)vaddr,
-			(unsigned long)(vaddr + size));
-		break;
-	case DMA_TO_DEVICE:
-		break;
-	default:
-		BUG();
-	}
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	return true;
 }

+#include
+
 void arch_dma_prep_coherent(struct page *page, size_t size)
 {
 	unsigned long start = (unsigned long)page_address(page);
diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
index 91a00d09ffad..aba2258e62eb 100644
--- a/arch/openrisc/kernel/dma.c
+++ b/arch/openrisc/kernel/dma.c
@@ -95,32 +95,47 @@ void arch_dma_clear_uncached(void *cpu_addr, size_t size)
 	mmap_write_unlock(&init_mm);
 }

-void arch_sync_dma_for_device(phys_addr_t addr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
 	unsigned long cl;
 	struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];

-	switch (dir) {
-	case DMA_TO_DEVICE:
-		/* Write back the dcache for the requested range */
-		for (cl = addr; cl < addr + size;
-		     cl += cpuinfo->dcache_block_size)
-			mtspr(SPR_DCBWR, cl);
-		break;
-	case DMA_FROM_DEVICE:
-		/* Invalidate the dcache for the requested range */
-		for (cl = addr; cl < addr + size;
-		     cl += cpuinfo->dcache_block_size)
-			mtspr(SPR_DCBIR, cl);
-		break;
-	case DMA_BIDIRECTIONAL:
-		/* Flush the dcache for the requested range */
-		for (cl = addr; cl < addr + size;
-		     cl += cpuinfo->dcache_block_size)
-			mtspr(SPR_DCBFR, cl);
-		break;
-	default:
-		break;
-	}
+	/* Write back the dcache for the requested range */
+	for (cl = paddr; cl < paddr + size;
+	     cl += cpuinfo->dcache_block_size)
+		mtspr(SPR_DCBWR, cl);
 }
+
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+	unsigned long cl;
+	struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
+
+	/* Invalidate the dcache for the requested range */
+	for (cl = paddr; cl < paddr + size;
+	     cl += cpuinfo->dcache_block_size)
+		mtspr(SPR_DCBIR, cl);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+	unsigned long cl;
+	struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
+
+	/* Flush the dcache for the requested range */
+	for (cl = paddr; cl < paddr + size;
+	     cl += cpuinfo->dcache_block_size)
+		mtspr(SPR_DCBFR, cl);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+	return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	return false;
+}
+
+#include
diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index 6d3d3cffb316..a7955aab8ce2 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -443,35 +443,35 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 	free_pages((unsigned long)__va(dma_handle), order);
 }

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
 	unsigned long virt = (unsigned long)phys_to_virt(paddr);

-	switch (dir) {
-	case DMA_TO_DEVICE:
-		clean_kernel_dcache_range(virt, size);
-		break;
-	case DMA_FROM_DEVICE:
-		clean_kernel_dcache_range(virt, size);
-		break;
-	case DMA_BIDIRECTIONAL:
-		flush_kernel_dcache_range(virt, size);
-		break;
-	}
+	clean_kernel_dcache_range(virt, size);
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
 	unsigned long virt = (unsigned long)phys_to_virt(paddr);

-	switch (dir) {
-	case DMA_TO_DEVICE:
-		break;
-	case DMA_FROM_DEVICE:
-	case DMA_BIDIRECTIONAL:
-		purge_kernel_dcache_range(virt, size);
-		break;
-	}
+	purge_kernel_dcache_range(virt, size);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+	unsigned long virt = (unsigned long)phys_to_virt(paddr);
+
+	flush_kernel_dcache_range(virt, size);
 }
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+	return true;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	return true;
+}
+
+#include
diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c
index 00e59a4faa2b..268510c71156 100644
--- a/arch/powerpc/mm/dma-noncoherent.c
+++ b/arch/powerpc/mm/dma-noncoherent.c
@@ -101,27 +101,33 @@ static void __dma_phys_op(phys_addr_t paddr, size_t size, enum dma_cache_op op)
 #endif
 }

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
 	__dma_phys_op(start, end, DMA_CACHE_CLEAN);
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
-	switch (direction) {
-	case DMA_NONE:
-		BUG();
-	case DMA_TO_DEVICE:
-		break;
-	case DMA_FROM_DEVICE:
-	case DMA_BIDIRECTIONAL:
-		__dma_phys_op(start, end, DMA_CACHE_INVAL);
-		break;
-	}
+	__dma_phys_op(start, end, DMA_CACHE_INVAL);
 }

+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+	__dma_phys_op(start, end, DMA_CACHE_FLUSH);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+	return true;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	return true;
+}
+
+#include
+
 void arch_dma_prep_coherent(struct page *page, size_t size)
 {
 	unsigned long kaddr = (unsigned long)page_address(page);
diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index 69c80b2155a1..b9a9f57e02be 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -12,43 +12,40 @@

 static bool noncoherent_supported;

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);

-	switch (dir) {
-	case DMA_TO_DEVICE:
-		ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
-		break;
-	case DMA_FROM_DEVICE:
-		ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
-		break;
-	case DMA_BIDIRECTIONAL:
-		ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
-		break;
-	default:
-		break;
-	}
+	ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);

-	switch (dir) {
-	case DMA_TO_DEVICE:
-		break;
-	case DMA_FROM_DEVICE:
-	case DMA_BIDIRECTIONAL:
-		ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
-		break;
-	default:
-		break;
-	}
+	ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
 }

+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+	void *vaddr = phys_to_virt(paddr);
+
+	ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+	return true;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	return true;
+}
+
+#include
+
+
 void arch_dma_prep_coherent(struct page *page, size_t size)
 {
 	void *flush_addr = page_address(page);
diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
index 6a44c0e7ba40..41f031ae7609 100644
--- a/arch/sh/kernel/dma-coherent.c
+++ b/arch/sh/kernel/dma-coherent.c
@@ -12,22 +12,35 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
 	__flush_purge_region(page_address(page), size);
 }

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
 	void *addr = sh_cacheop_vaddr(phys_to_virt(paddr));

-	switch (dir) {
-	case DMA_FROM_DEVICE: /* invalidate only */
-		__flush_invalidate_region(addr, size);
-		break;
-	case DMA_TO_DEVICE: /* writeback only */
-		__flush_wback_region(addr, size);
-		break;
-	case DMA_BIDIRECTIONAL: /* writeback and invalidate */
-		__flush_purge_region(addr, size);
-		break;
-	default:
-		BUG();
-	}
+	__flush_wback_region(addr, size);
 }
+
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+	void *addr = sh_cacheop_vaddr(phys_to_virt(paddr));
+
+	__flush_invalidate_region(addr, size);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+	void *addr = sh_cacheop_vaddr(phys_to_virt(paddr));
+
+	__flush_purge_region(addr, size);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+	return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	return false;
+}
+
+#include
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index 4f3d26066ec2..6926ead2f208 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -300,21 +300,39 @@ arch_initcall(sparc_register_ioport);

 #endif /* CONFIG_SBUS */

-/*
- * IIep is write-through, not flushing on cpu to device transfer.
- *
- * On LEON systems without cache snooping, the entire D-CACHE must be flushed to
- * make DMA to cacheable memory coherent.
- */
-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-	if (dir != DMA_TO_DEVICE &&
-	    sparc_cpu_model == sparc_leon &&
+	/* IIep is write-through, not flushing on cpu to device transfer. */
+}
+
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+	/*
+	 * On LEON systems without cache snooping, the entire D-CACHE must be
+	 * flushed to make DMA to cacheable memory coherent.
+	 */
+	if (sparc_cpu_model == sparc_leon &&
 	    !sparc_leon3_snooping_enabled())
 		leon_flush_dcache_all();
 }

+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+	arch_dma_cache_inv(paddr, size);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+	return true;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	return false;
+}
+
+#include
+
 #ifdef CONFIG_PROC_FS

 static int sparc_io_proc_show(struct seq_file *m, void *v)
diff --git a/arch/xtensa/kernel/pci-dma.c b/arch/xtensa/kernel/pci-dma.c
index ff3bf015eca4..d4ff96585545 100644
--- a/arch/xtensa/kernel/pci-dma.c
+++ b/arch/xtensa/kernel/pci-dma.c
@@ -43,24 +43,34 @@ static void do_cache_op(phys_addr_t paddr, size_t size,
 	}
 }

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-	switch (dir) {
-	case DMA_TO_DEVICE:
-		do_cache_op(paddr, size, __flush_dcache_range);
-		break;
-	case DMA_FROM_DEVICE:
-		do_cache_op(paddr, size, __invalidate_dcache_range);
-		break;
-	case DMA_BIDIRECTIONAL:
-		do_cache_op(paddr, size, __flush_invalidate_dcache_range);
-		break;
-	default:
-		break;
-	}
+	do_cache_op(paddr, size, __flush_dcache_range);
 }

+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+	do_cache_op(paddr, size, __invalidate_dcache_range);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+	do_cache_op(paddr, size, __flush_invalidate_dcache_range);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+	return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	return false;
+}
+
+#include
+
+
 void arch_dma_prep_coherent(struct page *page, size_t size)
 {
 	__invalidate_dcache_range((unsigned long)page_address(page), size);
diff --git a/include/linux/dma-sync.h b/include/linux/dma-sync.h
new file mode 100644
index 000000000000..18e33d5e8eaf
--- /dev/null
+++ b/include/linux/dma-sync.h
@@ -0,0 +1,107 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Cache operations depending on function and direction argument, inspired by
+ * https://lore.kernel.org/lkml/20180518175004.GF17671@n2100.armlinux.org.uk
+ * "dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20]
+ * dma-mapping: provide a generic dma-noncoherent implementation)"
+ *
+ *          |   map          ==  for_device     |   unmap     ==  for_cpu
+ *          |----------------------------------------------------------------
+ * TO_DEV   |   writeback        writeback      |   none          none
+ * FROM_DEV |   invalidate       invalidate     |   invalidate*   invalidate*
+ * BIDIR    |   writeback        writeback      |   invalidate    invalidate
+ *
+ *     [*] needed for CPU speculative prefetches
+ *
+ * NOTE: we don't check the validity of direction argument as it is done in
+ * upper layer functions (in include/linux/dma-mapping.h)
+ *
+ * This file can be included by arch/.../kernel/dma-noncoherent.c to provide
+ * the respective high-level operations without having to expose the
+ * cache management ops to drivers.
+ */
+
+void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
+		enum dma_data_direction dir)
+{
+	switch (dir) {
+	case DMA_TO_DEVICE:
+		/*
+		 * This may be an empty function on write-through caches,
+		 * and it might invalidate the cache if an architecture has
+		 * a write-back cache but no way to write it back without
+		 * invalidating
+		 */
+		arch_dma_cache_wback(paddr, size);
+		break;
+
+	case DMA_FROM_DEVICE:
+		/*
+		 * FIXME: this should be handled the same across all
+		 * architectures, see
+		 * https://lore.kernel.org/all/20220606152150.GA31568@willie-the-truck/
+		 */
+		if (!arch_sync_dma_clean_before_fromdevice()) {
+			arch_dma_cache_inv(paddr, size);
+			break;
+		}
+		fallthrough;
+
+	case DMA_BIDIRECTIONAL:
+		/* Skip the invalidate here if it's done later */
+		if (IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
+		    arch_sync_dma_cpu_needs_post_dma_flush())
+			arch_dma_cache_wback(paddr, size);
+		else
+			arch_dma_cache_wback_inv(paddr, size);
+		break;
+
+	default:
+		break;
+	}
+}
+
+#ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU
+/*
+ * Mark the D-cache clean for these pages to avoid extra flushing.
+ */
+static void arch_dma_mark_dcache_clean(phys_addr_t paddr, size_t size)
+{
+#ifdef CONFIG_ARCH_DMA_MARK_DCACHE_CLEAN
+	unsigned long pfn = PFN_UP(paddr);
+	unsigned long off = paddr & (PAGE_SIZE - 1);
+	size_t left = size;
+
+	if (off)
+		left -= PAGE_SIZE - off;
+
+	while (left >= PAGE_SIZE) {
+		struct page *page = pfn_to_page(pfn++);
+		set_bit(PG_dcache_clean, &page->flags);
+		left -= PAGE_SIZE;
+	}
+#endif
+}
+
+void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
+		enum dma_data_direction dir)
+{
+	switch (dir) {
+	case DMA_TO_DEVICE:
+		break;
+
+	case DMA_FROM_DEVICE:
+	case DMA_BIDIRECTIONAL:
+		/* The invalidate is only needed if the CPU prefetches speculatively */
+		if (arch_sync_dma_cpu_needs_post_dma_flush())
+			arch_dma_cache_inv(paddr, size);
+
+		if (size > PAGE_SIZE)
+			arch_dma_mark_dcache_clean(paddr, size);
+		break;
+
+	default:
+		break;
+	}
+}
+#endif
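
For anyone who wants to check the direction table in the dma-sync.h comment against the two per-architecture hooks, the selection logic can be modelled in userspace roughly as below. This is only a sketch and not part of the patch: the model_* names are made up for illustration and do not exist in the kernel, the returned strings stand in for the arch_dma_cache_wback(), arch_dma_cache_inv() and arch_dma_cache_wback_inv() calls, and CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU is assumed to be enabled.

```c
#include <assert.h>	/* for the example checks described below */
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for enum dma_data_direction (illustration only). */
enum model_dir { MODEL_BIDIRECTIONAL, MODEL_TO_DEVICE, MODEL_FROM_DEVICE };

/* The two per-architecture answers that the generic code asks for. */
struct model_arch {
	bool clean_before_fromdevice;	/* arch_sync_dma_clean_before_fromdevice() */
	bool post_dma_flush;		/* arch_sync_dma_cpu_needs_post_dma_flush() */
};

/* Cache operation picked when the buffer is handed to the device. */
static const char *model_for_device(struct model_arch a, enum model_dir dir)
{
	switch (dir) {
	case MODEL_TO_DEVICE:
		return "wback";
	case MODEL_FROM_DEVICE:
		if (!a.clean_before_fromdevice)
			return "inv";
		/* fall through */
	case MODEL_BIDIRECTIONAL:
		/* Skip the invalidate here if it is done after the transfer. */
		return a.post_dma_flush ? "wback" : "wback_inv";
	}
	return "none";
}

/* Cache operation picked when the buffer is handed back to the CPU. */
static const char *model_for_cpu(struct model_arch a, enum model_dir dir)
{
	if (dir == MODEL_TO_DEVICE)
		return "none";
	return a.post_dma_flush ? "inv" : "none";
}
```

With both hooks returning true (as in the riscv hunk), a FROM_DEVICE mapping gets a writeback before the transfer and an invalidate afterwards. With both returning false (as in the xtensa hunk), it gets a single invalidate up front and nothing afterwards.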