diff mbox

[RFC,2/2] spi: atmel: Fix DMA transfers data corruption

Message ID 1510763732-10151-3-git-send-email-radu.pirea@microchip.com (mailing list archive)
State New, archived
Headers show

Commit Message

Radu Pirea Nov. 15, 2017, 4:35 p.m. UTC
If the cache model is VIVT, DMA data transfers may not be valid and to
ensure the validity of the data cache must be flushed and invalidated.

Signed-off-by: Radu Pirea <radu.pirea@microchip.com>
---
 drivers/spi/spi-atmel.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Comments

Trent Piepho Nov. 15, 2017, 7:01 p.m. UTC | #1
On Wed, 2017-11-15 at 18:35 +0200, Radu Pirea wrote:
> If the cache model is VIVT, DMA data transfers may not be valid and to

> ensure the validity of the data cache must be flushed and invalidated.

> 

> Signed-off-by: Radu Pirea <radu.pirea@microchip.com>

>  

> +#ifdef CONFIG_SOC_SAM_V4_V5

> +	/*

> +	 * On Atmel SoCs based on ARM9 cores, the data cache follows the VIVT

> +	 * model, hence the cache aliases issue can occur when buffers are

> +	 * allocated from DMA-unsafe areas, by vmalloc() for instance, where

> +	 * cache coherency is not taken into account or at least not handled

> +	 * completely (cache lines of aliases are not flushed and invalidated).

> +	 * This is not a theorical issue: it was reproduced when trying to mount

> +	 * a UBI file-system on a at91sam9g35ek board.

> +	 */

> +	flush_kernel_vmap_range((void *)xfer->rx_buf, xfer->len);

> +#endif


Does this call need to be inside an ifdef for a specific SOC?  I
believe that flush_kernel_vmap_range should expand to a no-op if the
cache is not VIVT or aliasing VIPT.  So it should be safe to always
call it.

It also doesn't seem right that the SPI driver needs to know which SoCs
have what kind of cache.  If there are more socs with more cache
details, does the spi driver need to expand the list?  If another
driver does DMA, does it need the list too?


>  	/* Send both scatterlists */

>  	rxdesc = dmaengine_prep_slave_sg(rxchan,

>  					 xfer->rx_sg.sgl, xfer->rx_sg.nents,


Does this problem also possibly affect all other users of dmaengine on
these SoCs?  I2C, serial, crpto, etc. I couldn't find any other driver
that need to make flush_kernel_vmap_range().  Which seems odd, since
there are many drivers that use DMA and run on ARM9 cores.  E.g., I
believe spi-mxs runs on ARM9 based iMX.23 and iMX.28 which also have
VIVT caches.  It doesn't make flush_kernel_vmap_range calls.  Does it
have this same bug?
Radu Pirea Nov. 16, 2017, 10:26 a.m. UTC | #2
On 15.11.2017 21:01, Trent Piepho wrote:
> On Wed, 2017-11-15 at 18:35 +0200, Radu Pirea wrote:
>> If the cache model is VIVT, DMA data transfers may not be valid and to
>> ensure the validity of the data cache must be flushed and invalidated.
>>
>> Signed-off-by: Radu Pirea <radu.pirea@microchip.com>
>>   
>> +#ifdef CONFIG_SOC_SAM_V4_V5
>> +	/*
>> +	 * On Atmel SoCs based on ARM9 cores, the data cache follows the VIVT
>> +	 * model, hence the cache aliases issue can occur when buffers are
>> +	 * allocated from DMA-unsafe areas, by vmalloc() for instance, where
>> +	 * cache coherency is not taken into account or at least not handled
>> +	 * completely (cache lines of aliases are not flushed and invalidated).
>> +	 * This is not a theorical issue: it was reproduced when trying to mount
>> +	 * a UBI file-system on a at91sam9g35ek board.
>> +	 */
>> +	flush_kernel_vmap_range((void *)xfer->rx_buf, xfer->len);
>> +#endif
> 
> Does this call need to be inside an ifdef for a specific SOC?  I
> believe that flush_kernel_vmap_range should expand to a no-op if the
> cache is not VIVT or aliasing VIPT.  So it should be safe to always
> call it.
> 
> It also doesn't seem right that the SPI driver needs to know which SoCs
> have what kind of cache.  If there are more socs with more cache
> details, does the spi driver need to expand the list?  If another
> driver does DMA, does it need the list too?

Understood. I will find another way to call the function.

> 
> 
>>   	/* Send both scatterlists */
>>   	rxdesc = dmaengine_prep_slave_sg(rxchan,
>>   					 xfer->rx_sg.sgl, xfer->rx_sg.nents,
> 
> Does this problem also possibly affect all other users of dmaengine on
> these SoCs?  I2C, serial, crpto, etc. I couldn't find any other driver
> that need to make flush_kernel_vmap_range().  Which seems odd, since
> there are many drivers that use DMA and run on ARM9 cores.  E.g., I
> believe spi-mxs runs on ARM9 based iMX.23 and iMX.28 which also have
> VIVT caches.  It doesn't make flush_kernel_vmap_range calls.  Does it
> have this same bug?��칻�&�~�&���+-��ݶ��w��˛���m�b��l�(����ܨ}���Ơz�&j:+v�������zZ+��+zf���h���~����i���z��w���?����&�)ߢf
> 

Other users of dmaengine are not affected by this problem. I know nothing about 
spi-mxs, but i know spi-davinci have the same bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Brown Nov. 16, 2017, 10:45 a.m. UTC | #3
On Wed, Nov 15, 2017 at 06:35:32PM +0200, Radu Pirea wrote:

> +#ifdef CONFIG_SOC_SAM_V4_V5
> +	/*
> +	 * On Atmel SoCs based on ARM9 cores, the data cache follows the VIVT
> +	 * model, hence the cache aliases issue can occur when buffers are
> +	 * allocated from DMA-unsafe areas, by vmalloc() for instance, where
> +	 * cache coherency is not taken into account or at least not handled
> +	 * completely (cache lines of aliases are not flushed and invalidated).
> +	 * This is not a theorical issue: it was reproduced when trying to mount
> +	 * a UBI file-system on a at91sam9g35ek board.
> +	 */
> +	flush_kernel_vmap_range((void *)xfer->rx_buf, xfer->len);
> +#endif

Shouldn't we be fixing this in the DMA mapping operations for the SoC,
won't this affect everything that does DMA on this platform and not just
this driver?  I'd expect that dma_map_sg() and so on would do the right
thing.
Radu Pirea Dec. 8, 2017, 1:26 p.m. UTC | #4
On 16.11.2017 12:45, Mark Brown wrote:
> On Wed, Nov 15, 2017 at 06:35:32PM +0200, Radu Pirea wrote:
> 
>> +#ifdef CONFIG_SOC_SAM_V4_V5
>> +	/*
>> +	 * On Atmel SoCs based on ARM9 cores, the data cache follows the VIVT
>> +	 * model, hence the cache aliases issue can occur when buffers are
>> +	 * allocated from DMA-unsafe areas, by vmalloc() for instance, where
>> +	 * cache coherency is not taken into account or at least not handled
>> +	 * completely (cache lines of aliases are not flushed and invalidated).
>> +	 * This is not a theorical issue: it was reproduced when trying to mount
>> +	 * a UBI file-system on a at91sam9g35ek board.
>> +	 */
>> +	flush_kernel_vmap_range((void *)xfer->rx_buf, xfer->len);
>> +#endif
> 
> Shouldn't we be fixing this in the DMA mapping operations for the SoC,
> won't this affect everything that does DMA on this platform and not just
> this driver?  I'd expect that dma_map_sg() and so on would do the right
> thing.
> 

I didn't find a bug like this in other drivers and the only way I can reproduce 
the bug is with UBIFS on top of a spi-nor memory. dma_map_sg() does the right 
thing, but data from cache are not written-back. If I enable 
CONFIG_CPU_DCACHE_WRITETHROUGH bug disappears. Anyway, enabling 
CONFIG_CPU_DCACHE_WRITETHROUGH is not an option because performance will drop.

Fixing the bug in DMA driver is not an option because other DMA operations are 
not affected and the bug comes from the fact that UBIFS allocates memory with 
vmalloc.

Until now I have two solutions for this bug:
1. This one with cache flushing.
2. Another solution, based on ti-qspi driver, is to transfer the data whit a 
bounce buffer allocated with dma_alloc_coherent when rx_buf or tx_buf is 
allocated with vmalloc.

So, witch solution do you think is suitable to solve this bug?
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/spi/spi-atmel.c b/drivers/spi/spi-atmel.c
index 4e5e51f..cda6d0f 100644
--- a/drivers/spi/spi-atmel.c
+++ b/drivers/spi/spi-atmel.c
@@ -21,6 +21,8 @@ 
 #include <linux/slab.h>
 #include <linux/platform_data/dma-atmel.h>
 #include <linux/of.h>
+#include <asm/cacheflush.h>
+#include <asm/cachetype.h>
 
 #include <linux/io.h>
 #include <linux/gpio.h>
@@ -742,6 +744,18 @@  static int atmel_spi_next_xfer_dma_submit(struct spi_master *master,
 				       xfer->bits_per_word))
 		goto err_exit;
 
+#ifdef CONFIG_SOC_SAM_V4_V5
+	/*
+	 * On Atmel SoCs based on ARM9 cores, the data cache follows the VIVT
+	 * model, hence the cache aliases issue can occur when buffers are
+	 * allocated from DMA-unsafe areas, by vmalloc() for instance, where
+	 * cache coherency is not taken into account or at least not handled
+	 * completely (cache lines of aliases are not flushed and invalidated).
+	 * This is not a theorical issue: it was reproduced when trying to mount
+	 * a UBI file-system on a at91sam9g35ek board.
+	 */
+	flush_kernel_vmap_range((void *)xfer->rx_buf, xfer->len);
+#endif
 	/* Send both scatterlists */
 	rxdesc = dmaengine_prep_slave_sg(rxchan,
 					 xfer->rx_sg.sgl, xfer->rx_sg.nents,
@@ -750,6 +764,9 @@  static int atmel_spi_next_xfer_dma_submit(struct spi_master *master,
 	if (!rxdesc)
 		goto err_dma;
 
+#ifdef CONFIG_SOC_SAM_V4_V5
+	flush_kernel_vmap_range((void *)xfer->tx_buf, xfer->len);
+#endif
 	txdesc = dmaengine_prep_slave_sg(txchan,
 					 xfer->tx_sg.sgl, xfer->tx_sg.nents,
 					 DMA_TO_DEVICE,
@@ -779,6 +796,9 @@  static int atmel_spi_next_xfer_dma_submit(struct spi_master *master,
 	rxchan->device->device_issue_pending(rxchan);
 	txchan->device->device_issue_pending(txchan);
 
+#ifdef CONFIG_SOC_SAM_V4_V5
+	invalidate_kernel_vmap_range((void *)xfer->rx_buf, xfer->len);
+#endif
 	/* take back lock */
 	atmel_spi_lock(as);
 	return 0;