diff mbox

SPI: bcm2835: move to the transfer_one driver model

Message ID E13569A5-33E5-48ED-B25F-EC7374E144C9@martin.sperl.org (mailing list archive)
State New, archived
Headers show

Commit Message

Martin Sperl March 31, 2015, 2:49 p.m. UTC
> On 28.03.2015, at 05:09, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> As for testing: I have also tried to test with mmc_spi, but I have not
>> been able to make that driver work reliably in any recent kernel
>> versions.
>> Most of the time I see timeouts - and with lots of different SD-cards...
>> 
>> IIRC the last time I tested it successfully was with 3.12.
> 
> It'd be great if you could use "git bisect" to track down the change
> that broke this.

Here some info on my testing with the upstream kernels:
3.12.0 - 5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52 - WORKS
3.13.0 - d8ec26d7f8287f5788a494f56e8814210f0e64be - FAILS
which is what I had suspected.

Bisection between these shows the responsible commit is:

[sperl@build linux-upstream]$ git bisect bad
0589342c27944e50ebd7a54f5215002b6598b748 is the first bad commit
commit 0589342c27944e50ebd7a54f5215002b6598b748
Author: Rob Herring <rob.herring@calxeda.com>
Date:   Tue Oct 29 23:36:46 2013 -0500

    of: set dma_mask to point to coherent_dma_mask

    Platform devices created by DT code don't initialize dma_mask pointer to
    anything. Set it to coherent_dma_mask by default if the architecture
    code has not set it.

    Signed-off-by: Rob Herring <rob.herring@calxeda.com>

:040000 040000 805e59ffdb181a71299f354ee7cca727dad4778a 5445e765b6be497d45af3f108c3f38ef6aabb44a M	drivers

It is a surprising result and does not really explain the situation.

I did check the last working (0589342c2794) and non 
working (13ccacd5945a) commits again after cleaning the builds
and rebuilding from scratch.

I did not believe this so I reran the bisect just to make sure - 
this time also rebuilding the dtb every time.

Now I went back to 4.0.rc5+ and patched it as follows:


Note if you want to repeat this, then you have to boot with the
SD card removed, because otherwise mmc0 will be the sd card on
the SPI bus and not the "native" SD card...

I have now also patched out the changes introduces with the patch
in a 4.0rc and the result is 

here the bisect log:
git bisect start
# good: [5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52] Linux 3.12
git bisect good 5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52
# bad: [d8ec26d7f8287f5788a494f56e8814210f0e64be] Linux 3.13
git bisect bad d8ec26d7f8287f5788a494f56e8814210f0e64be
# bad: [42a2d923cc349583ebf6fdd52a7d35e1c2f7e6bd] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect bad 42a2d923cc349583ebf6fdd52a7d35e1c2f7e6bd
# good: [4b4d2b463461f1b86fd89353184e6f2938e7566b] Merge tag 'h8300-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
git bisect good 4b4d2b463461f1b86fd89353184e6f2938e7566b
# bad: [5cbb3d216e2041700231bcfc383ee5f8b7fc8b74] Merge branch 'akpm' (patches from Andrew Morton)
git bisect bad 5cbb3d216e2041700231bcfc383ee5f8b7fc8b74
# good: [eeab517b68beb9e044e869bee18e3bdfa60e5aca] Merge tag 'sound-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good eeab517b68beb9e044e869bee18e3bdfa60e5aca
# bad: [07eb6597c059d9276c8dc7cd0401083ed66ad714] MAINTAINERS: remove Richard Purdie as backlight maintainer
git bisect bad 07eb6597c059d9276c8dc7cd0401083ed66ad714
# good: [85b656cf1560e27a89354a23f2c10ba229d2f173] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds
git bisect good 85b656cf1560e27a89354a23f2c10ba229d2f173
# bad: [a8f70de37bbb991033208edff7758d997d68db37] ocfs2: use bitmap_weight()
git bisect bad a8f70de37bbb991033208edff7758d997d68db37
# good: [b5b4bb3f6a11f9c37b6d53138244f2ffe5bacd12] of: only include prom.h on sparc
git bisect good b5b4bb3f6a11f9c37b6d53138244f2ffe5bacd12
# good: [355e62f5ad12b005c862838156262eb2df2f8dff] of/irq: Fix potential buffer overflow
git bisect good 355e62f5ad12b005c862838156262eb2df2f8dff
# bad: [74dac2ed699cbe1dee0e4e7891619d53a5f2632f] of: irq: Fix interrupt-map entry matching
git bisect bad 74dac2ed699cbe1dee0e4e7891619d53a5f2632f
# bad: [f0e1bfbd2d1b0900a56c25cea35be2467a5cfc32] of: Add AU Optronics Corporation vendor prefix
git bisect bad f0e1bfbd2d1b0900a56c25cea35be2467a5cfc32
# good: [67aeb39a173b5c172c75c5b9f1844853aec16eb4] of: Add vendor prefix for Cadence
git bisect good 67aeb39a173b5c172c75c5b9f1844853aec16eb4
# good: [13ccacd5945aa5ce81d8566f57587e95fbd90742] of: add vendor prefix for PHYTEC Messtechnik GmbH
git bisect good 13ccacd5945aa5ce81d8566f57587e95fbd90742
# bad: [0589342c27944e50ebd7a54f5215002b6598b748] of: set dma_mask to point to coherent_dma_mask
git bisect bad 0589342c27944e50ebd7a54f5215002b6598b748
# first bad commit: [0589342c27944e50ebd7a54f5215002b6598b748] of: set dma_mask to point to coherent_dma_mask

--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Stephen Warren April 1, 2015, 3:20 p.m. UTC | #1
On 03/31/2015 08:49 AM, Martin Sperl wrote:
>
>> On 28.03.2015, at 05:09, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>> As for testing: I have also tried to test with mmc_spi, but I have not
>>> been able to make that driver work reliably in any recent kernel
>>> versions.
>>> Most of the time I see timeouts - and with lots of different SD-cards...
>>>
>>> IIRC the last time I tested it successfully was with 3.12.
>>
>> It'd be great if you could use "git bisect" to track down the change
>> that broke this.
...
> Bisection between these shows the responsible commit is:
...
> commit 0589342c27944e50ebd7a54f5215002b6598b748
> Author: Rob Herring <rob.herring@calxeda.com>
> Date:   Tue Oct 29 23:36:46 2013 -0500
>
>      of: set dma_mask to point to coherent_dma_mask
>
>      Platform devices created by DT code don't initialize dma_mask pointer to
>      anything. Set it to coherent_dma_mask by default if the architecture
>      code has not set it.
>
>      Signed-off-by: Rob Herring <rob.herring@calxeda.com>
...
> Now I went back to 4.0.rc5+ and patched it as follows:
> --- a/drivers/of/platform.c
> +++ b/drivers/of/platform.c
> @@ -179,9 +179,10 @@ static void of_dma_configure(struct device *dev)
>           * Set it to coherent_dma_mask by default if the architecture
>           * code has not set it.
>           */
> +/*
>          if (!dev->dma_mask)
>                  dev->dma_mask = &dev->coherent_dma_mask;
> -
> +*/
>          ret = of_dma_get_range(dev->of_node, &dma_addr, &paddr, &size);
>          if (ret < 0) {
>                  dma_addr = offset = 0;
>
> And now it works - the SD card gets detected immediately,
> I can mount it and everything...

Does that change cause the SDHCI core to use vs. not-use any of the 
SDHCI DMA modes? If so, I wonder if this is because the upstream kernel 
doesn't deal with the bcm2835's ARM physical address <-> SoC bus address 
conversions, which in turn can cause DMA and CPU access to not use the 
same caching settings and become incoherent, which in turn can cause DMA 
data corruption. See the following link for some background:

https://www.mail-archive.com/u-boot@lists.denx.de/msg167376.html
[PATCH 2/3] ARM: bcm2835: implement phys_to_bus/bus_to_phys

(you'll want to read all of patches 1-3 for the complete background I 
think, but patch 2 describes the HW issue)
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Martin Sperl April 3, 2015, 10:17 a.m. UTC | #2
> On 01.04.2015, at 17:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
> 
> Does that change cause the SDHCI core to use vs. not-use any of the SDHCI DMA modes? If so, I wonder if this is because the upstream kernel doesn't deal with the bcm2835's ARM physical address <-> SoC bus address conversions, which in turn can cause DMA and CPU access to not use the same caching settings and become incoherent, which in turn can cause DMA data corruption. See the following link for some background:
> 
> https://www.mail-archive.com/u-boot@lists.denx.de/msg167376.html
> [PATCH 2/3] ARM: bcm2835: implement phys_to_bus/bus_to_phys
> 
> (you'll want to read all of patches 1-3 for the complete background I think, but patch 2 describes the HW issue)

I can also just patch out the assignement to dev_dma here and it works as well:
	if (spi->master->dev.parent->dma_mask) {
		struct device   *dev = spi->master->dev.parent;

		host->dma_dev = dev;
		host->ones_dma = dma_map_single(dev, ones,
				MMC_SPI_BLOCKSIZE, DMA_TO_DEVICE);
		host->data_dma = dma_map_single(dev, host->data,
				sizeof(*host->data), DMA_BIDIRECTIONAL);
So DMA there is not an issue with SDHCI - it is just when using the
"normal" spi-bcm2835 that I can reproduce it.

If you read the mmc_spi driver there is a lot of places where we have 
constructs like this:
        if (host->dma_dev) {
                host->m.is_dma_mapped = 1;
                dma_sync_single_for_device(host->dma_dev,
                                host->data_dma, sizeof(*host->data),
                                DMA_BIDIRECTIONAL);
        }
        status = spi_sync_locked(host->spi, &host->m);

        if (host->dma_dev)
                dma_sync_single_for_cpu(host->dma_dev,
                                host->data_dma, sizeof(*host->data),
                                DMA_BIDIRECTIONAL);

Similar constructs exist for: dma_map_page and dma_unmap_page.

As far as I can tell these are the "typical" dma-related calls.

But the point is that the spi-bcm2835 driver does not implement DMA (yet),
so some things may not be fully set up for mapping inside 
spi->master->dev.parent and some mapping options may fail/not work as
expected. This may results in possibly "bad" behavior, just because dma_mask
 is set.

So i am not sure if this "mapping" issue you are mentioning is really the
root-cause - also I test on a RPI1, but have also tried on a RPI2 with the 
foundation kernel.

On top my reading/experience with DMA on the bcm2835 is that I was always
using 0xC0000000 as the bus-address, as the L1/L2 addresses in the
documentation are only related to VideoCore caches - and that is I guess
why the "L2 preload" is explicitly mentioned in the documentation - probably
to speed up the GPU by  prefetching the necessary data via DMA and thus 
avoiding to wait for data to get read from SRAM.

Unfortunately I can not test if the the mmc_spi issue exists on a
beaglebone, as the mmc_spi driver can not even get compiled there because
of CONFIG_HIGHMEM being set which conflicts with KConfig portion for
mmc_spi...


Ciao,
	Martin


--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Brown April 6, 2015, 5:26 p.m. UTC | #3
On Fri, Apr 03, 2015 at 12:17:31PM +0200, Martin Sperl wrote:
> > On 01.04.2015, at 17:20, Stephen Warren <swarren@wwwdotorg.org> wrote:

It looks like something rewrote Stephen's message to have no word
wrapping...

> If you read the mmc_spi driver there is a lot of places where we have 
> constructs like this:
>         if (host->dma_dev) {
>                 host->m.is_dma_mapped = 1;
>                 dma_sync_single_for_device(host->dma_dev,
>                                 host->data_dma, sizeof(*host->data),
>                                 DMA_BIDIRECTIONAL);
>         }
>         status = spi_sync_locked(host->spi, &host->m);

We probably need to have a good, hard look at this stuff - in general we
shouldn't be using this sort of mapping in clients, and for some systems
this will actually end up broken.

> Unfortunately I can not test if the the mmc_spi issue exists on a
> beaglebone, as the mmc_spi driver can not even get compiled there because
> of CONFIG_HIGHMEM being set which conflicts with KConfig portion for
> mmc_spi...

This is a bit alarming and probably also indicates that the code needs
looking at.
diff mbox

Patch

--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -179,9 +179,10 @@  static void of_dma_configure(struct device *dev)
         * Set it to coherent_dma_mask by default if the architecture
         * code has not set it.
         */
+/*
        if (!dev->dma_mask)
                dev->dma_mask = &dev->coherent_dma_mask;
-
+*/
        ret = of_dma_get_range(dev->of_node, &dma_addr, &paddr, &size);
        if (ret < 0) {
                dma_addr = offset = 0;

And now it works - the SD card gets detected immediately,
I can mount it and everything...

Is this something we need to fix in bcm2835 by using the "correct"
coherent_dma_mask?

Still I do not understand why we need dma_mask not set for the
mmc_spi driver and how it influences the behaviour...

Martin

P.s: Note that I have built like this:
make bcm2835_defconfig
echo "CONFIG_MMC_SPI=y" >> .config

using the following diff to enable mmc_spi in the device-tree:
--- a/arch/arm/boot/dts/bcm2835-rpi-b.dts
+++ b/arch/arm/boot/dts/bcm2835-rpi-b.dts
@@ -50,3 +50,14 @@ 
 	status = "okay";
 	bus-width = <4>;
 };
+
+&spi {
+	status = "okay";
+	sd1: sd@1 {
+		reg = <1>;
+		status = "okay";
+		compatible = "spi,mmc_spi";
+		voltage-ranges = <3200 3500>;
+		spi-max-frequency = <8000000>;
+	};
+};