Message ID | 20170411115225.31709-3-vigneshr@ti.com (mailing list archive) |
---|---|
State | Accepted |
Commit | c687c46e9e4527c4b4d82bc3cca58c1b08bcfb83 |
On Tue, Apr 11, 2017 at 05:22:25PM +0530, Vignesh R wrote:

> Flash filesystems like JFFS2, UBIFS and MTD block layer can provide
> vmalloc'd or kmap'd buffers that cannot be mapped using dma_map_sg() and
> can potentially be in memory region above 32bit addressable region(ie
> buffers belonging to memory region backed by LPAE) of DMA, implement
> spi_flash_can_dma() interface to inform SPI core not to map such
> buffers.

I'll apply this since it fixes bugs for your systems but it feels like
something that we should be moving further into the core since LPAE
isn't specific to your devices. We should ideally have something
(possibly in the DMA mapping code even) which does the remapping without
the driver needing to know about it.

On Friday 21 April 2017 10:36 PM, Mark Brown wrote:
> On Tue, Apr 11, 2017 at 05:22:25PM +0530, Vignesh R wrote:
>> Flash filesystems like JFFS2, UBIFS and MTD block layer can provide
>> vmalloc'd or kmap'd buffers that cannot be mapped using dma_map_sg() and
>> can potentially be in memory region above 32bit addressable region(ie
>> buffers belonging to memory region backed by LPAE) of DMA, implement
>> spi_flash_can_dma() interface to inform SPI core not to map such
>> buffers.
>
> I'll apply this since it fixes bugs for your systems but it feels like
> something that we should be moving further into the core since LPAE
> isn't specific to your devices. We should ideally have something
> (possibly in the DMA mapping code even) which does the remapping without
> the driver needing to know about it.
>

I agree, there is a need to have generic remapping code. Also, I guess,
once UBIFS is moved to use kmalloc'd buffers SPI flash devices will not
have to worry much about vmalloc'd buffers.

Hi all,

+ Richard and Boris as MTD maintainers

On 25/04/2017 at 14:18, Vignesh R wrote:
>
>
> On Friday 21 April 2017 10:36 PM, Mark Brown wrote:
>> On Tue, Apr 11, 2017 at 05:22:25PM +0530, Vignesh R wrote:
>>> Flash filesystems like JFFS2, UBIFS and MTD block layer can provide
>>> vmalloc'd or kmap'd buffers that cannot be mapped using dma_map_sg() and
>>> can potentially be in memory region above 32bit addressable region(ie
>>> buffers belonging to memory region backed by LPAE) of DMA, implement
>>> spi_flash_can_dma() interface to inform SPI core not to map such
>>> buffers.
>>
>> I'll apply this since it fixes bugs for your systems but it feels like
>> something that we should be moving further into the core since LPAE
>> isn't specific to your devices. We should ideally have something
>> (possibly in the DMA mapping code even) which does the remapping without
>> the driver needing to know about it.
>>
>
> I agree, there is a need to have generic remapping code. Also, I guess,
> once UBIFS is moved to use kmalloc'd buffers SPI flash devices will not
> have to worry much about vmalloc'd buffers.
>

I've just discussed with Richard and Boris and AFAIK, nothing is planned
at the UBIFS side to replace vmalloc'd buffers by kmalloc'd buffers.
There are reasons for using vmalloc() but Richard can explain better
than me :)

Also, depending on the cache model used by Atmel SoCs, the spi-atmel.c
driver may suffer from the same issue too: using spi_map_buf() hence
mapping vmalloc'ed buffers for DMA usage will be OK with ARM Cortex A5
(PIPT data cache, so no cache aliasing issue at all) hence with SAMA5
series but is not OK for some older cores like ARM926 (VIVT data cache)
hence the SAM9 series.

So to fix the spi-atmel.c driver when used with SAM9 SoCs, we are
thinking about sending a first patch to simply disable the use of DMA
transfers on SAM9 SoCs in case of vmalloc'ed buffers and use CPU
transfers instead.
The code will be left unchanged for SAMA5 SoCs so there would be no
performance loss on those SoCs.
It won't be optimal on SAM9 SoCs but at least it would work.

Then in a new series, if nobody has started to work on this topic yet,
we could propose a generic solution using a bounce buffer at the SPI
core level. however we first need to think how we could do this.

Best regards,

Cyrille

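The rule described in the mail above can be written down as a small decision helper. This is only a sketch for illustration: `buf_kind`, `dcache_model` and `buf_dma_ok()` are hypothetical names, not kernel types or functions, and it assumes that kmalloc'ed buffers are always safe to map. It models only the cache aliasing concern raised for the Atmel SoCs, not the LPAE case handled by the TI patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of a transfer buffer (not a kernel type). */
enum buf_kind { BUF_KMALLOC, BUF_VMALLOC };

/* Hypothetical tag for the data cache model of the SoC. */
enum dcache_model { DCACHE_PIPT, DCACHE_VIVT };

/*
 * Returns true if the buffer may be mapped for DMA under the rule from
 * the mail: vmalloc'ed buffers are fine with a PIPT data cache
 * (Cortex A5, SAMA5) but not with a VIVT one (ARM926, SAM9).
 * Assumption: kmalloc'ed buffers are always mappable.
 */
bool buf_dma_ok(enum buf_kind kind, enum dcache_model cache)
{
	if (kind == BUF_KMALLOC)
		return true;
	return cache == DCACHE_PIPT;
}

/* Small self-check covering the four combinations. */
void buf_dma_ok_self_check(void)
{
	assert(buf_dma_ok(BUF_KMALLOC, DCACHE_PIPT));
	assert(buf_dma_ok(BUF_KMALLOC, DCACHE_VIVT));
	assert(buf_dma_ok(BUF_VMALLOC, DCACHE_PIPT));
	assert(!buf_dma_ok(BUF_VMALLOC, DCACHE_VIVT));
}
```

When `buf_dma_ok()` returns false, the first proposed patch would fall back to a CPU transfer; the later generic solution would route the transfer through a bounce buffer instead.
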
Hi,

On Friday 16 June 2017 09:24 PM, Cyrille Pitchen wrote:
> Hi all,
>
> + Richard and Boris as MTD maintainers
>
> On 25/04/2017 at 14:18, Vignesh R wrote:
>>
>>
>> On Friday 21 April 2017 10:36 PM, Mark Brown wrote:
>>> On Tue, Apr 11, 2017 at 05:22:25PM +0530, Vignesh R wrote:
>>>> Flash filesystems like JFFS2, UBIFS and MTD block layer can provide
>>>> vmalloc'd or kmap'd buffers that cannot be mapped using dma_map_sg() and
>>>> can potentially be in memory region above 32bit addressable region(ie
>>>> buffers belonging to memory region backed by LPAE) of DMA, implement
>>>> spi_flash_can_dma() interface to inform SPI core not to map such
>>>> buffers.
>>>
>>> I'll apply this since it fixes bugs for your systems but it feels like
>>> something that we should be moving further into the core since LPAE
>>> isn't specific to your devices. We should ideally have something
>>> (possibly in the DMA mapping code even) which does the remapping without
>>> the driver needing to know about it.
>>>
>>
>> I agree, there is a need to have generic remapping code. Also, I guess,
>> once UBIFS is moved to use kmalloc'd buffers SPI flash devices will not
>> have to worry much about vmalloc'd buffers.
>>
>
> I've just discussed with Richard and Boris and AFAIK, nothing is planned
> at the UBIFS side to replace vmalloc'd buffers by kmalloc'd buffers.
> There are reasons for using vmalloc() but Richard can explain better
> than me :)
>
> Also, depending on the cache model used by Atmel SoCs, the spi-atmel.c
> driver may suffer from the same issue too: using spi_map_buf() hence
> mapping vmalloc'ed buffers for DMA usage will be OK with ARM Cortex A5
> (PIPT data cache, so no cache aliasing issue at all) hence with SAMA5
> series but is not OK for some older cores like ARM926 (VIVT data cache)
> hence the SAM9 series.
>
> So to fix the spi-atmel.c driver when used with SAM9 SoCs, we are
> thinking about sending a first patch to simply disable the use of DMA
> transfers on SAM9 SoCs in case of vmalloc'ed buffers and use CPU
> transfers instead.
> The code will be left unchanged for SAMA5 SoCs so there would be no
> performance loss on those SoCs.
> It won't be optimal on SAM9 SoCs but at least it would work.
>
> Then in a new series, if nobody has started to work on this topic yet,
> we could propose a generic solution using a bounce buffer at the SPI
> core level. however we first need to think how we could do this.
>

One of the questions that was hovering around when this issue was
discussed last time around was where should the code to detect whether
or not to use bounce buffer reside? Some extension to generic DMA APIs
or SPI drivers or somewhere else?

On Tue, Jun 20, 2017 at 03:15:34PM +0530, Vignesh R wrote:
> On Friday 16 June 2017 09:24 PM, Cyrille Pitchen wrote:

> > Then in a new series, if nobody has started to work on this topic yet,
> > we could propose a generic solution using a bounce buffer at the SPI
> > core level. however we first need to think how we could do this.

> One of the questions that was hovering around when this issue was
> discussed last time around was where should the code to detect whether
> or not to use bounce buffer reside? Some extension to generic DMA APIs
> or SPI drivers or somewhere else?

It seems like it's a generic DMA thing - presumably it's going to be an
issue for other devices as well sometimes.

diff --git a/drivers/spi/spi-ti-qspi.c b/drivers/spi/spi-ti-qspi.c
index 7b39bc204a30..c24d9b45a27c 100644
--- a/drivers/spi/spi-ti-qspi.c
+++ b/drivers/spi/spi-ti-qspi.c
@@ -33,6 +33,7 @@
 #include <linux/pinctrl/consumer.h>
 #include <linux/mfd/syscon.h>
 #include <linux/regmap.h>
+#include <linux/sizes.h>
 
 #include <linux/spi/spi.h>
 
@@ -57,6 +58,8 @@ struct ti_qspi {
 	struct ti_qspi_regs	ctx_reg;
 
 	dma_addr_t		mmap_phys_base;
+	dma_addr_t		rx_bb_dma_addr;
+	void			*rx_bb_addr;
 	struct dma_chan		*rx_chan;
 
 	u32 spi_max_frequency;
@@ -126,6 +129,8 @@ struct ti_qspi {
 #define QSPI_SETUP_ADDR_SHIFT		8
 #define QSPI_SETUP_DUMMY_SHIFT		10
 
+#define QSPI_DMA_BUFFER_SIZE		SZ_64K
+
 static inline unsigned long ti_qspi_read(struct ti_qspi *qspi,
 		unsigned long reg)
 {
@@ -429,6 +434,35 @@ static int ti_qspi_dma_xfer(struct ti_qspi *qspi, dma_addr_t dma_dst,
 	return 0;
 }
 
+static int ti_qspi_dma_bounce_buffer(struct ti_qspi *qspi,
+				     struct spi_flash_read_message *msg)
+{
+	size_t readsize = msg->len;
+	void *to = msg->buf;
+	dma_addr_t dma_src = qspi->mmap_phys_base + msg->from;
+	int ret = 0;
+
+	/*
+	 * Use bounce buffer as FS like jffs2, ubifs may pass
+	 * buffers that does not belong to kernel lowmem region.
+	 */
+	while (readsize != 0) {
+		size_t xfer_len = min_t(size_t, QSPI_DMA_BUFFER_SIZE,
+					readsize);
+
+		ret = ti_qspi_dma_xfer(qspi, qspi->rx_bb_dma_addr,
+				       dma_src, xfer_len);
+		if (ret != 0)
+			return ret;
+		memcpy(to, qspi->rx_bb_addr, xfer_len);
+		readsize -= xfer_len;
+		dma_src += xfer_len;
+		to += xfer_len;
+	}
+
+	return ret;
+}
+
 static int ti_qspi_dma_xfer_sg(struct ti_qspi *qspi, struct sg_table rx_sg,
 			       loff_t from)
 {
@@ -496,6 +530,12 @@ static void ti_qspi_setup_mmap_read(struct spi_device *spi,
 		       QSPI_SPI_SETUP_REG(spi->chip_select));
 }
 
+static bool ti_qspi_spi_flash_can_dma(struct spi_device *spi,
+				      struct spi_flash_read_message *msg)
+{
+	return virt_addr_valid(msg->buf);
+}
+
 static int ti_qspi_spi_flash_read(struct spi_device *spi,
 				  struct spi_flash_read_message *msg)
 {
@@ -509,15 +549,12 @@ static int ti_qspi_spi_flash_read(struct spi_device *spi,
 	ti_qspi_setup_mmap_read(spi, msg);
 
 	if (qspi->rx_chan) {
-		if (msg->cur_msg_mapped) {
+		if (msg->cur_msg_mapped)
 			ret = ti_qspi_dma_xfer_sg(qspi, msg->rx_sg, msg->from);
-			if (ret)
-				goto err_unlock;
-		} else {
-			dev_err(qspi->dev, "Invalid address for DMA\n");
-			ret = -EIO;
+		else
+			ret = ti_qspi_dma_bounce_buffer(qspi, msg);
+		if (ret)
 			goto err_unlock;
-		}
 	} else {
 		memcpy_fromio(msg->buf, qspi->mmap_base + msg->from, msg->len);
 	}
@@ -723,6 +760,17 @@ static int ti_qspi_probe(struct platform_device *pdev)
 		ret = 0;
 		goto no_dma;
 	}
+	qspi->rx_bb_addr = dma_alloc_coherent(qspi->dev,
+					      QSPI_DMA_BUFFER_SIZE,
+					      &qspi->rx_bb_dma_addr,
+					      GFP_KERNEL | GFP_DMA);
+	if (!qspi->rx_bb_addr) {
+		dev_err(qspi->dev,
+			"dma_alloc_coherent failed, using PIO mode\n");
+		dma_release_channel(qspi->rx_chan);
+		goto no_dma;
+	}
+	master->spi_flash_can_dma = ti_qspi_spi_flash_can_dma;
 	master->dma_rx = qspi->rx_chan;
 	init_completion(&qspi->transfer_complete);
 	if (res_mmap)
@@ -763,6 +811,10 @@ static int ti_qspi_remove(struct platform_device *pdev)
 	pm_runtime_put_sync(&pdev->dev);
 	pm_runtime_disable(&pdev->dev);
 
+	if (qspi->rx_bb_addr)
+		dma_free_coherent(qspi->dev, QSPI_DMA_BUFFER_SIZE,
+				  qspi->rx_bb_addr,
+				  qspi->rx_bb_dma_addr);
 	if (qspi->rx_chan)
 		dma_release_channel(qspi->rx_chan);

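After this change, the read path in ti_qspi_spi_flash_read() has three outcomes, depending on whether an RX DMA channel exists and whether the SPI core mapped the buffer. The following user-space sketch restates that selection outside the driver. The names `read_path` and `pick_read_path()` are hypothetical and exist only for illustration; the two boolean inputs stand in for `qspi->rx_chan` and `msg->cur_msg_mapped`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three ways the driver can serve a read. */
enum read_path {
	READ_DMA_SG,      /* buffer was mapped by the SPI core: scatter-gather DMA */
	READ_DMA_BOUNCE,  /* buffer not mappable: DMA into the bounce buffer, then copy */
	READ_PIO_MMAP     /* no RX DMA channel: memcpy_fromio() from the mmap window */
};

/* Mirrors the if/else structure of the patched read function. */
enum read_path pick_read_path(bool have_rx_chan, bool cur_msg_mapped)
{
	if (!have_rx_chan)
		return READ_PIO_MMAP;
	return cur_msg_mapped ? READ_DMA_SG : READ_DMA_BOUNCE;
}

/* Small self-check covering all input combinations. */
void pick_read_path_self_check(void)
{
	assert(pick_read_path(true, true) == READ_DMA_SG);
	assert(pick_read_path(true, false) == READ_DMA_BOUNCE);
	assert(pick_read_path(false, true) == READ_PIO_MMAP);
	assert(pick_read_path(false, false) == READ_PIO_MMAP);
}
```

Before the patch, the unmapped case with a DMA channel present returned -EIO with "Invalid address for DMA"; the bounce path replaces that error.
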
Flash filesystems like JFFS2, UBIFS and the MTD block layer can provide
vmalloc'd or kmap'd buffers that cannot be mapped using dma_map_sg() and
can potentially be in a memory region above the 32-bit addressable region
of DMA (i.e. buffers belonging to a memory region backed by LPAE).
Implement the spi_flash_can_dma() interface to inform the SPI core not to
map such buffers.

When buffers are not mapped for DMA, use a pre-allocated bounce buffer
(64K = typical flash erase sector size) to read from flash and then copy
the data to the actual destination buffer. This approach is much faster
than reading with a CPU memcpy and also reduces CPU load.

With this patch, UBIFS read speed is ~18MB/s and CPU utilization <20% on
DRA74 Rev H EVM. Performance degradation is negligible when compared
with the non bounce buffer case while using UBIFS.

Signed-off-by: Vignesh R <vigneshr@ti.com>
---

v2: Fix compiler warnings and sparse warnings reported by Kbuild bot.

 drivers/spi/spi-ti-qspi.c | 66 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 59 insertions(+), 7 deletions(-)
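The chunked bounce-buffer read described in the commit message can be tried out in user space. The sketch below is not the driver code: `fake_dma_xfer()` is a hypothetical stand-in for ti_qspi_dma_xfer() that simply copies from an in-memory "flash" array, and `BOUNCE_SIZE` is a tiny assumed value in place of QSPI_DMA_BUFFER_SIZE (SZ_64K) so that the chunking is easy to observe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed tiny size so chunking is visible; the driver uses SZ_64K. */
#define BOUNCE_SIZE 8

static unsigned char bounce[BOUNCE_SIZE]; /* stands in for rx_bb_addr */
static unsigned int xfer_count;           /* number of simulated DMA transfers */

/* Hypothetical stand-in for ti_qspi_dma_xfer(): "DMA" flash -> bounce buffer. */
static int fake_dma_xfer(unsigned char *dst, const unsigned char *flash,
			 size_t off, size_t len)
{
	assert(len <= BOUNCE_SIZE);
	memcpy(dst, flash + off, len);
	xfer_count++;
	return 0;
}

/*
 * Read len bytes starting at flash offset from into to, at most
 * BOUNCE_SIZE bytes at a time, copying each chunk out of the bounce
 * buffer. Returns 0 on success or the first transfer error.
 */
int bounce_read(unsigned char *to, const unsigned char *flash,
		size_t from, size_t len)
{
	while (len != 0) {
		size_t xfer_len = len < BOUNCE_SIZE ? len : BOUNCE_SIZE;
		int ret = fake_dma_xfer(bounce, flash, from, xfer_len);

		if (ret != 0)
			return ret;
		memcpy(to, bounce, xfer_len);
		len -= xfer_len;
		from += xfer_len;
		to += xfer_len;
	}
	return 0;
}
```

A 20-byte read with an 8-byte bounce buffer is served by three transfers of 8, 8 and 4 bytes. In the driver the extra memcpy per chunk is the cost the commit message reports as negligible for UBIFS reads.
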