[06/21] powerpc: dma-mapping: minimize for_cpu flushing

Message ID 20230327121317.4081816-7-arnd@kernel.org (mailing list archive)
State Handled Elsewhere
Series dma-mapping: unify support for cache flushes

Checks

Context | Check | Description
conchuod/cover_letter | success | Series has a cover letter
conchuod/tree_selection | success | Guessed tree name to be for-next at HEAD e45d6a52fe2b
conchuod/fixes_present | success | Fixes tag not required for -next series
conchuod/maintainers_pattern | success | MAINTAINERS pattern errors before the patch: 1 and now 1
conchuod/verify_signedoff | success | Signed-off-by tag matches author and committer
conchuod/kdoc | success | Errors and warnings before: 0 this patch: 0
conchuod/build_rv64_clang_allmodconfig | success | Errors and warnings before: 18 this patch: 18
conchuod/module_param | success | Was 0 now: 0
conchuod/build_rv64_gcc_allmodconfig | success | Errors and warnings before: 18 this patch: 18
conchuod/build_rv32_defconfig | success | Build OK
conchuod/dtb_warn_rv64 | success | Errors and warnings before: 3 this patch: 3
conchuod/header_inline | success | No static functions without inline keyword in header files
conchuod/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 25 lines checked
conchuod/source_inline | success | Was 0 now: 0
conchuod/build_rv64_nommu_k210_defconfig | success | Build OK
conchuod/verify_fixes | success | No Fixes tag
conchuod/build_rv64_nommu_virt_defconfig | success | Build OK

Commit Message

Arnd Bergmann March 27, 2023, 12:13 p.m. UTC
From: Arnd Bergmann <arnd@arndb.de>

The powerpc dma_sync_*_for_cpu() variants do more flushes than those on
other architectures. Reduce this to what everyone else does:

 - No flush is needed after data has been sent to a device

 - When data has been received from a device, the cache only needs to
   be invalidated to clear out cache lines that were speculatively
   prefetched.

In particular, the second flushing of partial cache lines of bidirectional
buffers is actively harmful -- if a single cache line is written by both
the CPU and the device, flushing it again does not maintain coherency
but instead overwrites the data that was just received from the device.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/powerpc/mm/dma-noncoherent.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

Comments

Christophe Leroy March 27, 2023, 12:56 p.m. UTC | #1
On 27/03/2023 at 14:13, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> The powerpc dma_sync_*_for_cpu() variants do more flushes than on other
> architectures. Reduce it to what everyone else does:
> 
>   - No flush is needed after data has been sent to a device
> 
>   - When data has been received from a device, the cache only needs to
>     be invalidated to clear out cache lines that were speculatively
>     prefetched.
> 
> In particular, the second flushing of partial cache lines of bidirectional
> buffers is actively harmful -- if a single cache line is written by both
> the CPU and the device, flushing it again does not maintain coherency
> but instead overwrites the data that was just received from the device.

Hmm... who is right?

That behaviour was introduced by commit 03d70617b8a7 ("powerpc: Prevent 
memory corruption due to cache invalidation of unaligned DMA buffer")

I think your commit log should explain why that commit was wrong, and
maybe say that your patch is a revert of that commit?

Christophe


> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>   arch/powerpc/mm/dma-noncoherent.c | 18 ++++--------------
>   1 file changed, 4 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c
> index f10869d27de5..e108cacf877f 100644
> --- a/arch/powerpc/mm/dma-noncoherent.c
> +++ b/arch/powerpc/mm/dma-noncoherent.c
> @@ -132,21 +132,11 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
>   	switch (direction) {
>   	case DMA_NONE:
>   		BUG();
> -	case DMA_FROM_DEVICE:
> -		/*
> -		 * invalidate only when cache-line aligned otherwise there is
> -		 * the potential for discarding uncommitted data from the cache
> -		 */
> -		if ((start | end) & (L1_CACHE_BYTES - 1))
> -			__dma_phys_op(start, end, DMA_CACHE_FLUSH);
> -		else
> -			__dma_phys_op(start, end, DMA_CACHE_INVAL);
> -		break;
> -	case DMA_TO_DEVICE:		/* writeback only */
> -		__dma_phys_op(start, end, DMA_CACHE_CLEAN);
> +	case DMA_TO_DEVICE:
>   		break;
> -	case DMA_BIDIRECTIONAL:	/* writeback and invalidate */
> -		__dma_phys_op(start, end, DMA_CACHE_FLUSH);
> +	case DMA_FROM_DEVICE:
> +	case DMA_BIDIRECTIONAL:
> +		__dma_phys_op(start, end, DMA_CACHE_INVAL);
>   		break;
>   	}
>   }
Arnd Bergmann March 27, 2023, 1:02 p.m. UTC | #2
On Mon, Mar 27, 2023, at 14:56, Christophe Leroy wrote:
> On 27/03/2023 at 14:13, Arnd Bergmann wrote:
>> From: Arnd Bergmann <arnd@arndb.de>
>> 
>> The powerpc dma_sync_*_for_cpu() variants do more flushes than on other
>> architectures. Reduce it to what everyone else does:
>> 
>>   - No flush is needed after data has been sent to a device
>> 
>>   - When data has been received from a device, the cache only needs to
>>     be invalidated to clear out cache lines that were speculatively
>>     prefetched.
>> 
>> In particular, the second flushing of partial cache lines of bidirectional
>> buffers is actively harmful -- if a single cache line is written by both
>> the CPU and the device, flushing it again does not maintain coherency
>> but instead overwrites the data that was just received from the device.
>
> Hmm... who is right?
>
> That behaviour was introduced by commit 03d70617b8a7 ("powerpc: Prevent
> memory corruption due to cache invalidation of unaligned DMA buffer")
>
> I think your commit log should explain why that commit was wrong, and
> maybe say that your patch is a revert of that commit?

Ok, I'll try to explain this better. To clarify here: the __dma_sync()
function in commit 03d70617b8a7 is used both before and after a DMA
transfer. My patch 05/21 splits it in two, and patch 06/21 only changes
the part that gets called after a DMA-from-device transfer. The part
called before DMA-from-device, which is what Andrew's patch addressed,
is left unchanged.

As I mentioned in the cover letter, it is still an open question
whether we want to consider this the expected behavior, since the
documentation is ambiguous, but my series does not attempt to answer
that question.

     Arnd

Patch

diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c
index f10869d27de5..e108cacf877f 100644
--- a/arch/powerpc/mm/dma-noncoherent.c
+++ b/arch/powerpc/mm/dma-noncoherent.c
@@ -132,21 +132,11 @@  void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 	switch (direction) {
 	case DMA_NONE:
 		BUG();
-	case DMA_FROM_DEVICE:
-		/*
-		 * invalidate only when cache-line aligned otherwise there is
-		 * the potential for discarding uncommitted data from the cache
-		 */
-		if ((start | end) & (L1_CACHE_BYTES - 1))
-			__dma_phys_op(start, end, DMA_CACHE_FLUSH);
-		else
-			__dma_phys_op(start, end, DMA_CACHE_INVAL);
-		break;
-	case DMA_TO_DEVICE:		/* writeback only */
-		__dma_phys_op(start, end, DMA_CACHE_CLEAN);
+	case DMA_TO_DEVICE:
 		break;
-	case DMA_BIDIRECTIONAL:	/* writeback and invalidate */
-		__dma_phys_op(start, end, DMA_CACHE_FLUSH);
+	case DMA_FROM_DEVICE:
+	case DMA_BIDIRECTIONAL:
+		__dma_phys_op(start, end, DMA_CACHE_INVAL);
 		break;
 	}
 }