diff mbox series

Onboard SD card doesn't work anymore after the 'mmc-v5.4-2' updates

Message ID 7b549219-a2e1-08c7-331b-9c3e4fdb8a8f@xenosoft.de
State New
Headers show
Series Onboard SD card doesn't work anymore after the 'mmc-v5.4-2' updates | expand

Commit Message

Christian Zigotzky Oct. 4, 2019, 10:45 p.m. UTC
Hello,

The onboard SD card of the Cyrus+ PowerPC board (A-EON AmigaOne X5000) 
[1] doesn't work anymore after the 'mmc-v5.4-2' updates [2].

Error messages:

[   12.118148] mmc0: SDHCI controller on ffe114000.sdhc [ffe114000.sdhc] 
using ADMA
[   12.232869] mmc0: ADMA error: 0x02000000
[   12.237977] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[   12.243137] mmc0: sdhci: Sys addr:  0x00000000 | Version: 0x00001301
[   12.248307] mmc0: sdhci: Blk size:  0x00000008 | Blk cnt: 0x00000001
[   12.253448] mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
[   12.258530] mmc0: sdhci: Present:   0x01fd020a | Host ctl: 0x00000030
[   12.263572] mmc0: sdhci: Power:     0x00000002 | Blk gap: 0x00000000
[   12.268574] mmc0: sdhci: Wake-up:   0x00000000 | Clock: 0x000020f8
[   12.273535] mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
[   12.278461] mmc0: sdhci: Int enab:  0x037f008f | Sig enab: 0x037f008b
[   12.283339] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00001301
[   12.288198] mmc0: sdhci: Caps:      0x05fa0000 | Caps_1: 0x00000000
[   12.293032] mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
[   12.297861] mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]: 0x001dd533
[   12.302615] mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]: 0x3f400e00
[   12.307312] mmc0: sdhci: Host ctl2: 0x00000000
[   12.311940] mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x7e04f208
[   12.316573] mmc0: sdhci: ============================================
[   12.321204] mmc0: sdhci: 7e04f200: DMA 0xfc000000, LEN 0x0008, Attr=0x23
[   12.326022] mmc0: error -5 whilst initialising SD card
[   12.507466] mmc0: invalid bus width
[   12.512154] mmc0: error -22 whilst initialising SD card
[   12.683457] mmc0: invalid bus width
[   12.688968] mmc0: error -22 whilst initialising SD card
[   12.774281] mmc0: invalid bus width
[   12.779910] mmc0: error -22 whilst initialising SD card

-------------------------

I have found the issue in the DMA changes of the 'mmc-v5.4-2' updates. 
It's the commit "121bd08b029e03404c451bb237729cdff76eafed (mmc: 
sdhci-of-esdhc: set DMA snooping based on DMA coherence)" [3].

Please find attached a patch. This patch solves the issue.

dmesg output after compiling with the attached patch:

[   12.117149] mmc0: SDHCI controller on ffe114000.sdhc [ffe114000.sdhc] 
using ADMA
[   12.208646] mmc0: new high speed SDHC card at address 59b4
[   12.230808] mmcblk0: mmc0:59b4 USD   3.73 GiB

-------------------------

Please check the commit "121bd08b029e03404c451bb237729cdff76eafed" [3].

Thanks,
Christian


[1] https://www.amigaos.net/hardware/133/amigaone-x5000 and 
http://wiki.amiga.org/index.php?title=X5000
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.4-rc1&id=c710364f78afdef8c2ed07556d0743c5a30ed429
[3] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=121bd08b029e03404c451bb237729cdff76eafed

Comments

Christian Zigotzky Oct. 4, 2019, 11:10 p.m. UTC | #1
On 05 October 2019 at 00:53 am, Russell King - ARM Linux admin wrote:
> On Sat, Oct 05, 2019 at 12:45:29AM +0200, Christian Zigotzky wrote:
>> Hello,
>>
>> The onboard SD card of the Cyrus+ PowerPC board (A-EON AmigaOne X5000) [1]
>> doesn't work anymore after the 'mmc-v5.4-2' updates [2].
> Hi Christian,
>
> Please can you help us understand what is going on with your setup.
> Obviously, your board wants the SDHCI controller to be DMA coherent,
> but I guess that DT on your platform does not specify the
> "dma-coherent" property.  Can you confirm that?
>
> If that's the case, the question is what we do about that - do we add
> it (and wait for further reports?)
>
> The only alternative I can see is that we split ARM usage from PowerPC
> usage of this driver; it seems PowerPC can get away with being more
> lenient wrt DT DMA coherency correctness than the ARM architecture.
>
> Thanks.
>
Hi Russell,

I didn't find the property "dma-coherent" in the dtb source files 
(attached).

Output of "fdtdump cyrus_p5020_eth_poweroff.dtb | grep dma":

dma0 = "/soc@ffe000000/dma@100300";
         dma1 = "/soc@ffe000000/dma@101300";
         dma@100300 {
             compatible = "fsl,eloplus-dma";
             dma-channel@0 {
                 compatible = "fsl,eloplus-dma-channel";
             dma-channel@80 {
                 compatible = "fsl,eloplus-dma-channel";
             dma-channel@100 {
                 compatible = "fsl,eloplus-dma-channel";
             dma-channel@180 {
                 compatible = "fsl,eloplus-dma-channel";
         dma@101300 {
             compatible = "fsl,eloplus-dma";
             dma-channel@0 {
                 compatible = "fsl,eloplus-dma-channel";
             dma-channel@80 {
                 compatible = "fsl,eloplus-dma-channel";
             dma-channel@100 {
                 compatible = "fsl,eloplus-dma-channel";
             dma-channel@180 {
                 compatible = "fsl,eloplus-dma-channel";


Thanks,
Christian
/*
 * P5020/P5010 Silicon/SoC Device Tree Source (pre include)
 *
 * Copyright 2011 - 2015 Freescale Semiconductor Inc.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *     * Redistributions of source code must retain the above copyright
 *       notice, this list of conditions and the following disclaimer.
 *     * Redistributions in binary form must reproduce the above copyright
 *       notice, this list of conditions and the following disclaimer in the
 *       documentation and/or other materials provided with the distribution.
 *     * Neither the name of Freescale Semiconductor nor the
 *       names of its contributors may be used to endorse or promote products
 *       derived from this software without specific prior written permission.
 *
 *
 * ALTERNATIVELY, this software may be distributed under the terms of the
 * GNU General Public License ("GPL") as published by the Free Software
 * Foundation, either version 2 of that License or (at your option) any
 * later version.
 *
 * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
 * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
 * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

/dts-v1/;

/include/ "e5500_power_isa.dtsi"

/ {
	compatible = "fsl,P5020";
	#address-cells = <2>;
	#size-cells = <2>;
	interrupt-parent = <&mpic>;

	aliases {
		ccsr = &soc;
		dcsr = &dcsr;

		serial0 = &serial0;
		serial1 = &serial1;
		serial2 = &serial2;
		serial3 = &serial3;
		pci0 = &pci0;
		pci1 = &pci1;
		pci2 = &pci2;
		pci3 = &pci3;
		usb0 = &usb0;
		usb1 = &usb1;
		dma0 = &dma0;
		dma1 = &dma1;
		sdhc = &sdhc;
		msi0 = &msi0;
		msi1 = &msi1;
		msi2 = &msi2;

		crypto = &crypto;
		sec_jr0 = &sec_jr0;
		sec_jr1 = &sec_jr1;
		sec_jr2 = &sec_jr2;
		sec_jr3 = &sec_jr3;
		rtic_a = &rtic_a;
		rtic_b = &rtic_b;
		rtic_c = &rtic_c;
		rtic_d = &rtic_d;
		sec_mon = &sec_mon;

		raideng = &raideng;
		raideng_jr0 = &raideng_jr0;
		raideng_jr1 = &raideng_jr1;
		raideng_jr2 = &raideng_jr2;
		raideng_jr3 = &raideng_jr3;

		fman0 = &fman0;
		ethernet0 = &enet3;
		ethernet1 = &enet4;
	};

	cpus {
		#address-cells = <1>;
		#size-cells = <0>;

		cpu0: PowerPC,e5500@0 {
			device_type = "cpu";
			reg = <0>;
			clocks = <&mux0>;
			next-level-cache = <&L2_0>;
			fsl,portid-mapping = <0x80000000>;
			L2_0: l2-cache {
				next-level-cache = <&cpc>;
			};
		};
		cpu1: PowerPC,e5500@1 {
			device_type = "cpu";
			reg = <1>;
			clocks = <&mux1>;
			next-level-cache = <&L2_1>;
			fsl,portid-mapping = <0x40000000>;
			L2_1: l2-cache {
				next-level-cache = <&cpc>;
			};
		};
	};
};
Christian Zigotzky Oct. 15, 2019, 12:27 p.m. UTC | #2
Hello,

We have patched the RC2 [1] and RC3 [2] of kernel 5.4 with the mmc3 
patch [3] and after that the onboard SD card works without any problems 
with them.

Could you please add this patch or do you have another solution?

Thanks,
Christian

[1] http://www.xenosoft.de/linux-image-5.4-rc2-X1000_X5000_A1222.tar.gz
[2] http://www.xenosoft.de/linux-image-5.4-rc3-X1000_X5000_A1222.tar.gz
[3] http://www.xenosoft.de/mmc3.patch


On 05 October 2019 at 00:45 am, Christian Zigotzky wrote:
> Hello,
>
> The onboard SD card of the Cyrus+ PowerPC board (A-EON AmigaOne X5000) 
> [1] doesn't work anymore after the 'mmc-v5.4-2' updates [2].
>
> Error messages:
>
> [   12.118148] mmc0: SDHCI controller on ffe114000.sdhc 
> [ffe114000.sdhc] using ADMA
> [   12.232869] mmc0: ADMA error: 0x02000000
> [   12.237977] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> [   12.243137] mmc0: sdhci: Sys addr:  0x00000000 | Version: 0x00001301
> [   12.248307] mmc0: sdhci: Blk size:  0x00000008 | Blk cnt: 0x00000001
> [   12.253448] mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> [   12.258530] mmc0: sdhci: Present:   0x01fd020a | Host ctl: 0x00000030
> [   12.263572] mmc0: sdhci: Power:     0x00000002 | Blk gap: 0x00000000
> [   12.268574] mmc0: sdhci: Wake-up:   0x00000000 | Clock: 0x000020f8
> [   12.273535] mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> [   12.278461] mmc0: sdhci: Int enab:  0x037f008f | Sig enab: 0x037f008b
> [   12.283339] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00001301
> [   12.288198] mmc0: sdhci: Caps:      0x05fa0000 | Caps_1: 0x00000000
> [   12.293032] mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> [   12.297861] mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]: 0x001dd533
> [   12.302615] mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]: 0x3f400e00
> [   12.307312] mmc0: sdhci: Host ctl2: 0x00000000
> [   12.311940] mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x7e04f208
> [   12.316573] mmc0: sdhci: ============================================
> [   12.321204] mmc0: sdhci: 7e04f200: DMA 0xfc000000, LEN 0x0008, 
> Attr=0x23
> [   12.326022] mmc0: error -5 whilst initialising SD card
> [   12.507466] mmc0: invalid bus width
> [   12.512154] mmc0: error -22 whilst initialising SD card
> [   12.683457] mmc0: invalid bus width
> [   12.688968] mmc0: error -22 whilst initialising SD card
> [   12.774281] mmc0: invalid bus width
> [   12.779910] mmc0: error -22 whilst initialising SD card
>
> -------------------------
>
> I have found the issue in the DMA changes of the 'mmc-v5.4-2' updates. 
> It's the commit "121bd08b029e03404c451bb237729cdff76eafed (mmc: 
> sdhci-of-esdhc: set DMA snooping based on DMA coherence)" [3].
>
> Please find attached a patch. This patch solves the issue.
>
> dmesg output after compiling with the attached patch:
>
> [   12.117149] mmc0: SDHCI controller on ffe114000.sdhc 
> [ffe114000.sdhc] using ADMA
> [   12.208646] mmc0: new high speed SDHC card at address 59b4
> [   12.230808] mmcblk0: mmc0:59b4 USD   3.73 GiB
>
> -------------------------
>
> Please check the commit "121bd08b029e03404c451bb237729cdff76eafed" [3].
>
> Thanks,
> Christian
>
>
> [1] https://www.amigaos.net/hardware/133/amigaone-x5000 and 
> http://wiki.amiga.org/index.php?title=X5000
> [2] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.4-rc1&id=c710364f78afdef8c2ed07556d0743c5a30ed429
> [3] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=121bd08b029e03404c451bb237729cdff76eafed
>
Russell King - ARM Linux admin Oct. 15, 2019, 12:51 p.m. UTC | #3
The patch which you are reverting fixes exactly the same error on
ARM64 boards.

I had tried to get some additional information from you, but I never
received a response to my query.

While it is regrettable that fixing that has caused a regression
for yours (which we need to fix) we really need to find a way to
solve this for both situations.

Merely reverting the patch and then sitting on our hands with no
possible solution for ARM64 isn't really an acceptable way forward.

The only solution I can see at the moment to this is to #ifdef in
the driver based on whether we are building for ARM64 or PowerPC.

Maybe someone else can come up with a better idea?

On Tue, Oct 15, 2019 at 02:27:44PM +0200, Christian Zigotzky wrote:
> Hello,
> 
> We have patched the RC2 [1] and RC3 [2] of kernel 5.4 with the mmc3 patch
> [3] and after that the onboard SD card works without any problems with them.
> 
> Could you please add this patch or do you have another solution?
> 
> Thanks,
> Christian
> 
> [1] http://www.xenosoft.de/linux-image-5.4-rc2-X1000_X5000_A1222.tar.gz
> [2] http://www.xenosoft.de/linux-image-5.4-rc3-X1000_X5000_A1222.tar.gz
> [3] http://www.xenosoft.de/mmc3.patch
> 
> 
> On 05 October 2019 at 00:45 am, Christian Zigotzky wrote:
> > Hello,
> > 
> > The onboard SD card of the Cyrus+ PowerPC board (A-EON AmigaOne X5000)
> > [1] doesn't work anymore after the 'mmc-v5.4-2' updates [2].
> > 
> > Error messages:
> > 
> > [   12.118148] mmc0: SDHCI controller on ffe114000.sdhc [ffe114000.sdhc]
> > using ADMA
> > [   12.232869] mmc0: ADMA error: 0x02000000
> > [   12.237977] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> > [   12.243137] mmc0: sdhci: Sys addr:  0x00000000 | Version: 0x00001301
> > [   12.248307] mmc0: sdhci: Blk size:  0x00000008 | Blk cnt: 0x00000001
> > [   12.253448] mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> > [   12.258530] mmc0: sdhci: Present:   0x01fd020a | Host ctl: 0x00000030
> > [   12.263572] mmc0: sdhci: Power:     0x00000002 | Blk gap: 0x00000000
> > [   12.268574] mmc0: sdhci: Wake-up:   0x00000000 | Clock: 0x000020f8
> > [   12.273535] mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> > [   12.278461] mmc0: sdhci: Int enab:  0x037f008f | Sig enab: 0x037f008b
> > [   12.283339] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00001301
> > [   12.288198] mmc0: sdhci: Caps:      0x05fa0000 | Caps_1: 0x00000000
> > [   12.293032] mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> > [   12.297861] mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]: 0x001dd533
> > [   12.302615] mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]: 0x3f400e00
> > [   12.307312] mmc0: sdhci: Host ctl2: 0x00000000
> > [   12.311940] mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x7e04f208
> > [   12.316573] mmc0: sdhci: ============================================
> > [   12.321204] mmc0: sdhci: 7e04f200: DMA 0xfc000000, LEN 0x0008,
> > Attr=0x23
> > [   12.326022] mmc0: error -5 whilst initialising SD card
> > [   12.507466] mmc0: invalid bus width
> > [   12.512154] mmc0: error -22 whilst initialising SD card
> > [   12.683457] mmc0: invalid bus width
> > [   12.688968] mmc0: error -22 whilst initialising SD card
> > [   12.774281] mmc0: invalid bus width
> > [   12.779910] mmc0: error -22 whilst initialising SD card
> > 
> > -------------------------
> > 
> > I have found the issue in the DMA changes of the 'mmc-v5.4-2' updates.
> > It's the commit "121bd08b029e03404c451bb237729cdff76eafed (mmc:
> > sdhci-of-esdhc: set DMA snooping based on DMA coherence)" [3].
> > 
> > Please find attached a patch. This patch solves the issue.
> > 
> > dmesg output after compiling with the attached patch:
> > 
> > [   12.117149] mmc0: SDHCI controller on ffe114000.sdhc [ffe114000.sdhc]
> > using ADMA
> > [   12.208646] mmc0: new high speed SDHC card at address 59b4
> > [   12.230808] mmcblk0: mmc0:59b4 USD   3.73 GiB
> > 
> > -------------------------
> > 
> > Please check the commit "121bd08b029e03404c451bb237729cdff76eafed" [3].
> > 
> > Thanks,
> > Christian
> > 
> > 
> > [1] https://www.amigaos.net/hardware/133/amigaone-x5000 and
> > http://wiki.amiga.org/index.php?title=X5000
> > [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.4-rc1&id=c710364f78afdef8c2ed07556d0743c5a30ed429
> > [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=121bd08b029e03404c451bb237729cdff76eafed
> > 
> 
>
Christian Zigotzky Oct. 15, 2019, 1:12 p.m. UTC | #4
Hello Russell,

You asked me about "dma-coherent" in the Cyrus device tree. 
Unfortunately I don't find the property "dma-coherent" in the dtb source 
files.

Output of "fdtdump cyrus_p5020_eth_poweroff.dtb | grep dma":

dma0 = "/soc@ffe000000/dma@100300";
         dma1 = "/soc@ffe000000/dma@101300";
         dma@100300 {
             compatible = "fsl,eloplus-dma";
             dma-channel@0 {
                 compatible = "fsl,eloplus-dma-channel";
             dma-channel@80 {
                 compatible = "fsl,eloplus-dma-channel";
             dma-channel@100 {
                 compatible = "fsl,eloplus-dma-channel";
             dma-channel@180 {
                 compatible = "fsl,eloplus-dma-channel";
         dma@101300 {
             compatible = "fsl,eloplus-dma";
             dma-channel@0 {
                 compatible = "fsl,eloplus-dma-channel";
             dma-channel@80 {
                 compatible = "fsl,eloplus-dma-channel";
             dma-channel@100 {
                 compatible = "fsl,eloplus-dma-channel";
             dma-channel@180 {
                 compatible = "fsl,eloplus-dma-channel";

Thanks,
Christian


On 15 October 2019 at 2:51 pm, Russell King - ARM Linux admin wrote:
> The patch which you are reverting fixes exactly the same error on
> ARM64 boards.
>
> I had tried to get some additional information from you, but I never
> received a response to my query.
>
> While it is regrettable that fixing that has caused a regression
> for yours (which we need to fix) we really need to find a way to
> solve this for both situations.
>
> Merely reverting the patch and then sitting on our hands with no
> possible solution for ARM64 isn't really an acceptable way forward.
>
> The only solution I can see at the moment to this is to #ifdef in
> the driver based on whether we are building for ARM64 or PowerPC.
>
> Maybe someone else can come up with a better idea?
>
> On Tue, Oct 15, 2019 at 02:27:44PM +0200, Christian Zigotzky wrote:
>> Hello,
>>
>> We have patched the RC2 [1] and RC3 [2] of kernel 5.4 with the mmc3 patch
>> [3] and after that the onboard SD card works without any problems with them.
>>
>> Could you please add this patch or do you have another solution?
>>
>> Thanks,
>> Christian
>>
>> [1] http://www.xenosoft.de/linux-image-5.4-rc2-X1000_X5000_A1222.tar.gz
>> [2] http://www.xenosoft.de/linux-image-5.4-rc3-X1000_X5000_A1222.tar.gz
>> [3] http://www.xenosoft.de/mmc3.patch
>>
>>
>> On 05 October 2019 at 00:45 am, Christian Zigotzky wrote:
>>> Hello,
>>>
>>> The onboard SD card of the Cyrus+ PowerPC board (A-EON AmigaOne X5000)
>>> [1] doesn't work anymore after the 'mmc-v5.4-2' updates [2].
>>>
>>> Error messages:
>>>
>>> [   12.118148] mmc0: SDHCI controller on ffe114000.sdhc [ffe114000.sdhc]
>>> using ADMA
>>> [   12.232869] mmc0: ADMA error: 0x02000000
>>> [   12.237977] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
>>> [   12.243137] mmc0: sdhci: Sys addr:  0x00000000 | Version: 0x00001301
>>> [   12.248307] mmc0: sdhci: Blk size:  0x00000008 | Blk cnt: 0x00000001
>>> [   12.253448] mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
>>> [   12.258530] mmc0: sdhci: Present:   0x01fd020a | Host ctl: 0x00000030
>>> [   12.263572] mmc0: sdhci: Power:     0x00000002 | Blk gap: 0x00000000
>>> [   12.268574] mmc0: sdhci: Wake-up:   0x00000000 | Clock: 0x000020f8
>>> [   12.273535] mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
>>> [   12.278461] mmc0: sdhci: Int enab:  0x037f008f | Sig enab: 0x037f008b
>>> [   12.283339] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00001301
>>> [   12.288198] mmc0: sdhci: Caps:      0x05fa0000 | Caps_1: 0x00000000
>>> [   12.293032] mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
>>> [   12.297861] mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]: 0x001dd533
>>> [   12.302615] mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]: 0x3f400e00
>>> [   12.307312] mmc0: sdhci: Host ctl2: 0x00000000
>>> [   12.311940] mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x7e04f208
>>> [   12.316573] mmc0: sdhci: ============================================
>>> [   12.321204] mmc0: sdhci: 7e04f200: DMA 0xfc000000, LEN 0x0008,
>>> Attr=0x23
>>> [   12.326022] mmc0: error -5 whilst initialising SD card
>>> [   12.507466] mmc0: invalid bus width
>>> [   12.512154] mmc0: error -22 whilst initialising SD card
>>> [   12.683457] mmc0: invalid bus width
>>> [   12.688968] mmc0: error -22 whilst initialising SD card
>>> [   12.774281] mmc0: invalid bus width
>>> [   12.779910] mmc0: error -22 whilst initialising SD card
>>>
>>> -------------------------
>>>
>>> I have found the issue in the DMA changes of the 'mmc-v5.4-2' updates.
>>> It's the commit "121bd08b029e03404c451bb237729cdff76eafed (mmc:
>>> sdhci-of-esdhc: set DMA snooping based on DMA coherence)" [3].
>>>
>>> Please find attached a patch. This patch solves the issue.
>>>
>>> dmesg output after compiling with the attached patch:
>>>
>>> [   12.117149] mmc0: SDHCI controller on ffe114000.sdhc [ffe114000.sdhc]
>>> using ADMA
>>> [   12.208646] mmc0: new high speed SDHC card at address 59b4
>>> [   12.230808] mmcblk0: mmc0:59b4 USD   3.73 GiB
>>>
>>> -------------------------
>>>
>>> Please check the commit "121bd08b029e03404c451bb237729cdff76eafed" [3].
>>>
>>> Thanks,
>>> Christian
>>>
>>>
>>> [1] https://www.amigaos.net/hardware/133/amigaone-x5000 and
>>> http://wiki.amiga.org/index.php?title=X5000
>>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.4-rc1&id=c710364f78afdef8c2ed07556d0743c5a30ed429
>>> [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=121bd08b029e03404c451bb237729cdff76eafed
>>>
>>
Russell King - ARM Linux admin Oct. 15, 2019, 1:17 p.m. UTC | #5
On Tue, Oct 15, 2019 at 03:12:49PM +0200, Christian Zigotzky wrote:
> Hello Russell,
> 
> You asked me about "dma-coherent" in the Cyrus device tree. Unfortunately I
> don't find the property "dma-coherent" in the dtb source files.
> 
> Output of "fdtdump cyrus_p5020_eth_poweroff.dtb | grep dma":
> 
> dma0 = "/soc@ffe000000/dma@100300";
>         dma1 = "/soc@ffe000000/dma@101300";
>         dma@100300 {
>             compatible = "fsl,eloplus-dma";
>             dma-channel@0 {
>                 compatible = "fsl,eloplus-dma-channel";
>             dma-channel@80 {
>                 compatible = "fsl,eloplus-dma-channel";
>             dma-channel@100 {
>                 compatible = "fsl,eloplus-dma-channel";
>             dma-channel@180 {
>                 compatible = "fsl,eloplus-dma-channel";
>         dma@101300 {
>             compatible = "fsl,eloplus-dma";
>             dma-channel@0 {
>                 compatible = "fsl,eloplus-dma-channel";
>             dma-channel@80 {
>                 compatible = "fsl,eloplus-dma-channel";
>             dma-channel@100 {
>                 compatible = "fsl,eloplus-dma-channel";
>             dma-channel@180 {
>                 compatible = "fsl,eloplus-dma-channel";

Hmm, so it looks like PowerPC doesn't mark devices that are dma
coherent with a property that describes them as such.

I think this opens a wider question - what should of_dma_is_coherent()
return for PowerPC?  It seems right now that it returns false for
devices that are DMA coherent, which seems to me to be a recipe for
future mistakes.

Any ideas from the PPC maintainers?

> 
> Thanks,
> Christian
> 
> 
> On 15 October 2019 at 2:51 pm, Russell King - ARM Linux admin wrote:
> > The patch which you are reverting fixes exactly the same error on
> > ARM64 boards.
> > 
> > I had tried to get some additional information from you, but I never
> > received a response to my query.
> > 
> > While it is regrettable that fixing that has caused a regression
> > for yours (which we need to fix) we really need to find a way to
> > solve this for both situations.
> > 
> > Merely reverting the patch and then sitting on our hands with no
> > possible solution for ARM64 isn't really an acceptable way forward.
> > 
> > The only solution I can see at the moment to this is to #ifdef in
> > the driver based on whether we are building for ARM64 or PowerPC.
> > 
> > Maybe someone else can come up with a better idea?
> > 
> > On Tue, Oct 15, 2019 at 02:27:44PM +0200, Christian Zigotzky wrote:
> > > Hello,
> > > 
> > > We have patched the RC2 [1] and RC3 [2] of kernel 5.4 with the mmc3 patch
> > > [3] and after that the onboard SD card works without any problems with them.
> > > 
> > > Could you please add this patch or do you have another solution?
> > > 
> > > Thanks,
> > > Christian
> > > 
> > > [1] http://www.xenosoft.de/linux-image-5.4-rc2-X1000_X5000_A1222.tar.gz
> > > [2] http://www.xenosoft.de/linux-image-5.4-rc3-X1000_X5000_A1222.tar.gz
> > > [3] http://www.xenosoft.de/mmc3.patch
> > > 
> > > 
> > > On 05 October 2019 at 00:45 am, Christian Zigotzky wrote:
> > > > Hello,
> > > > 
> > > > The onboard SD card of the Cyrus+ PowerPC board (A-EON AmigaOne X5000)
> > > > [1] doesn't work anymore after the 'mmc-v5.4-2' updates [2].
> > > > 
> > > > Error messages:
> > > > 
> > > > [   12.118148] mmc0: SDHCI controller on ffe114000.sdhc [ffe114000.sdhc]
> > > > using ADMA
> > > > [   12.232869] mmc0: ADMA error: 0x02000000
> > > > [   12.237977] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> > > > [   12.243137] mmc0: sdhci: Sys addr:  0x00000000 | Version: 0x00001301
> > > > [   12.248307] mmc0: sdhci: Blk size:  0x00000008 | Blk cnt: 0x00000001
> > > > [   12.253448] mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> > > > [   12.258530] mmc0: sdhci: Present:   0x01fd020a | Host ctl: 0x00000030
> > > > [   12.263572] mmc0: sdhci: Power:     0x00000002 | Blk gap: 0x00000000
> > > > [   12.268574] mmc0: sdhci: Wake-up:   0x00000000 | Clock: 0x000020f8
> > > > [   12.273535] mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> > > > [   12.278461] mmc0: sdhci: Int enab:  0x037f008f | Sig enab: 0x037f008b
> > > > [   12.283339] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00001301
> > > > [   12.288198] mmc0: sdhci: Caps:      0x05fa0000 | Caps_1: 0x00000000
> > > > [   12.293032] mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> > > > [   12.297861] mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]: 0x001dd533
> > > > [   12.302615] mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]: 0x3f400e00
> > > > [   12.307312] mmc0: sdhci: Host ctl2: 0x00000000
> > > > [   12.311940] mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x7e04f208
> > > > [   12.316573] mmc0: sdhci: ============================================
> > > > [   12.321204] mmc0: sdhci: 7e04f200: DMA 0xfc000000, LEN 0x0008,
> > > > Attr=0x23
> > > > [   12.326022] mmc0: error -5 whilst initialising SD card
> > > > [   12.507466] mmc0: invalid bus width
> > > > [   12.512154] mmc0: error -22 whilst initialising SD card
> > > > [   12.683457] mmc0: invalid bus width
> > > > [   12.688968] mmc0: error -22 whilst initialising SD card
> > > > [   12.774281] mmc0: invalid bus width
> > > > [   12.779910] mmc0: error -22 whilst initialising SD card
> > > > 
> > > > -------------------------
> > > > 
> > > > I have found the issue in the DMA changes of the 'mmc-v5.4-2' updates.
> > > > It's the commit "121bd08b029e03404c451bb237729cdff76eafed (mmc:
> > > > sdhci-of-esdhc: set DMA snooping based on DMA coherence)" [3].
> > > > 
> > > > Please find attached a patch. This patch solves the issue.
> > > > 
> > > > dmesg output after compiling with the attached patch:
> > > > 
> > > > [   12.117149] mmc0: SDHCI controller on ffe114000.sdhc [ffe114000.sdhc]
> > > > using ADMA
> > > > [   12.208646] mmc0: new high speed SDHC card at address 59b4
> > > > [   12.230808] mmcblk0: mmc0:59b4 USD   3.73 GiB
> > > > 
> > > > -------------------------
> > > > 
> > > > Please check the commit "121bd08b029e03404c451bb237729cdff76eafed" [3].
> > > > 
> > > > Thanks,
> > > > Christian
> > > > 
> > > > 
> > > > [1] https://www.amigaos.net/hardware/133/amigaone-x5000 and
> > > > http://wiki.amiga.org/index.php?title=X5000
> > > > [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.4-rc1&id=c710364f78afdef8c2ed07556d0743c5a30ed429
> > > > [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=121bd08b029e03404c451bb237729cdff76eafed
> > > > 
> > > 
> 
>
Ulf Hansson Oct. 15, 2019, 1:44 p.m. UTC | #6
On Tue, 15 Oct 2019 at 15:18, Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Tue, Oct 15, 2019 at 03:12:49PM +0200, Christian Zigotzky wrote:
> > Hello Russell,
> >
> > You asked me about "dma-coherent" in the Cyrus device tree. Unfortunately I
> > don't find the property "dma-coherent" in the dtb source files.
> >
> > Output of "fdtdump cyrus_p5020_eth_poweroff.dtb | grep dma":
> >
> > dma0 = "/soc@ffe000000/dma@100300";
> >         dma1 = "/soc@ffe000000/dma@101300";
> >         dma@100300 {
> >             compatible = "fsl,eloplus-dma";
> >             dma-channel@0 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >             dma-channel@80 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >             dma-channel@100 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >             dma-channel@180 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >         dma@101300 {
> >             compatible = "fsl,eloplus-dma";
> >             dma-channel@0 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >             dma-channel@80 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >             dma-channel@100 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >             dma-channel@180 {
> >                 compatible = "fsl,eloplus-dma-channel";
>
> Hmm, so it looks like PowerPC doesn't mark devices that are dma
> coherent with a property that describes them as such.
>
> I think this opens a wider question - what should of_dma_is_coherent()
> return for PowerPC?  It seems right now that it returns false for
> devices that are DMA coherent, which seems to me to be a recipe for
> future mistakes.

Perhaps implement the arch_setup_dma_ops() for PPC, that set
"dev->dma_coherent = true" could work?

[...]

Kind regards
Uffe
Benjamin Herrenschmidt Oct. 17, 2019, 2:15 a.m. UTC | #7
On Tue, 2019-10-15 at 14:17 +0100, Russell King - ARM Linux admin
wrote:
> On Tue, Oct 15, 2019 at 03:12:49PM +0200, Christian Zigotzky wrote:
> > Hello Russell,
> > 
> > You asked me about "dma-coherent" in the Cyrus device tree.
> > Unfortunately I
> > don't find the property "dma-coherent" in the dtb source files.
> > 
> > Output of "fdtdump cyrus_p5020_eth_poweroff.dtb | grep dma":
> > 
> > dma0 = "/soc@ffe000000/dma@100300";
> >         dma1 = "/soc@ffe000000/dma@101300";
> >         dma@100300 {
> >             compatible = "fsl,eloplus-dma";
> >             dma-channel@0 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >             dma-channel@80 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >             dma-channel@100 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >             dma-channel@180 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >         dma@101300 {
> >             compatible = "fsl,eloplus-dma";
> >             dma-channel@0 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >             dma-channel@80 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >             dma-channel@100 {
> >                 compatible = "fsl,eloplus-dma-channel";
> >             dma-channel@180 {
> >                 compatible = "fsl,eloplus-dma-channel";
> 
> Hmm, so it looks like PowerPC doesn't mark devices that are dma
> coherent with a property that describes them as such.
> 
> I think this opens a wider question - what should
> of_dma_is_coherent()
> return for PowerPC?  It seems right now that it returns false for
> devices that are DMA coherent, which seems to me to be a recipe for
> future mistakes.
> 
> Any ideas from the PPC maintainers?

The historical powerpc OF machines were all coherent. OF didn't define
such attributes iirc.

We could add some kind of arch/platform override for that, all the
native OF platforms are coherent at this point afaik, only embedded
with fdt aren't always.

Our *code* knows whether DMA is coherent, but it's not exposed in the
DT for those historical reasons.

Cheers,
Ben.

> > 
> > Thanks,
> > Christian
> > 
> > 
> > On 15 October 2019 at 2:51 pm, Russell King - ARM Linux admin
> > wrote:
> > > The patch which you are reverting fixes exactly the same error on
> > > ARM64 boards.
> > > 
> > > I had tried to get some additional information from you, but I
> > > never
> > > received a response to my query.
> > > 
> > > While it is regrettable that fixing that has caused a regression
> > > for yours (which we need to fix) we really need to find a way to
> > > solve this for both situations.
> > > 
> > > Merely reverting the patch and then sitting on our hands with no
> > > possible solution for ARM64 isn't really an acceptable way
> > > forward.
> > > 
> > > The only solution I can see at the moment to this is to #ifdef in
> > > the driver based on whether we are building for ARM64 or PowerPC.
> > > 
> > > Maybe someone else can come up with a better idea?
> > > 
> > > On Tue, Oct 15, 2019 at 02:27:44PM +0200, Christian Zigotzky
> > > wrote:
> > > > Hello,
> > > > 
> > > > We have patched the RC2 [1] and RC3 [2] of kernel 5.4 with the
> > > > mmc3 patch
> > > > [3] and after that the onboard SD card works without any
> > > > problems with them.
> > > > 
> > > > Could you please add this patch or do you have another
> > > > solution?
> > > > 
> > > > Thanks,
> > > > Christian
> > > > 
> > > > [1] 
> > > > http://www.xenosoft.de/linux-image-5.4-rc2-X1000_X5000_A1222.tar.gz
> > > > [2] 
> > > > http://www.xenosoft.de/linux-image-5.4-rc3-X1000_X5000_A1222.tar.gz
> > > > [3] http://www.xenosoft.de/mmc3.patch
> > > > 
> > > > 
> > > > On 05 October 2019 at 00:45 am, Christian Zigotzky wrote:
> > > > > Hello,
> > > > > 
> > > > > The onboard SD card of the Cyrus+ PowerPC board (A-EON
> > > > > AmigaOne X5000)
> > > > > [1] doesn't work anymore after the 'mmc-v5.4-2' updates [2].
> > > > > 
> > > > > Error messages:
> > > > > 
> > > > > [   12.118148] mmc0: SDHCI controller on ffe114000.sdhc
> > > > > [ffe114000.sdhc]
> > > > > using ADMA
> > > > > [   12.232869] mmc0: ADMA error: 0x02000000
> > > > > [   12.237977] mmc0: sdhci: ============ SDHCI REGISTER DUMP
> > > > > ===========
> > > > > [   12.243137] mmc0: sdhci: Sys addr:  0x00000000 | Version:
> > > > > 0x00001301
> > > > > [   12.248307] mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:
> > > > > 0x00000001
> > > > > [   12.253448] mmc0: sdhci: Argument:  0x00000000 | Trn mode:
> > > > > 0x00000013
> > > > > [   12.258530] mmc0: sdhci: Present:   0x01fd020a | Host ctl:
> > > > > 0x00000030
> > > > > [   12.263572] mmc0: sdhci: Power:     0x00000002 | Blk gap:
> > > > > 0x00000000
> > > > > [   12.268574] mmc0: sdhci: Wake-up:   0x00000000 | Clock:
> > > > > 0x000020f8
> > > > > [   12.273535] mmc0: sdhci: Timeout:   0x00000003 | Int stat:
> > > > > 0x00000001
> > > > > [   12.278461] mmc0: sdhci: Int enab:  0x037f008f | Sig enab:
> > > > > 0x037f008b
> > > > > [   12.283339] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int:
> > > > > 0x00001301
> > > > > [   12.288198] mmc0: sdhci: Caps:      0x05fa0000 | Caps_1:
> > > > > 0x00000000
> > > > > [   12.293032] mmc0: sdhci: Cmd:       0x0000333a | Max curr:
> > > > > 0x00000000
> > > > > [   12.297861] mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:
> > > > > 0x001dd533
> > > > > [   12.302615] mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:
> > > > > 0x3f400e00
> > > > > [   12.307312] mmc0: sdhci: Host ctl2: 0x00000000
> > > > > [   12.311940] mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr:
> > > > > 0x7e04f208
> > > > > [   12.316573] mmc0: sdhci:
> > > > > ============================================
> > > > > [   12.321204] mmc0: sdhci: 7e04f200: DMA 0xfc000000, LEN
> > > > > 0x0008,
> > > > > Attr=0x23
> > > > > [   12.326022] mmc0: error -5 whilst initialising SD card
> > > > > [   12.507466] mmc0: invalid bus width
> > > > > [   12.512154] mmc0: error -22 whilst initialising SD card
> > > > > [   12.683457] mmc0: invalid bus width
> > > > > [   12.688968] mmc0: error -22 whilst initialising SD card
> > > > > [   12.774281] mmc0: invalid bus width
> > > > > [   12.779910] mmc0: error -22 whilst initialising SD card
> > > > > 
> > > > > -------------------------
> > > > > 
> > > > > I have found the issue in the DMA changes of the 'mmc-v5.4-2' 
> > > > > updates.
> > > > > It's the commit "121bd08b029e03404c451bb237729cdff76eafed
> > > > > (mmc:
> > > > > sdhci-of-esdhc: set DMA snooping based on DMA coherence)"
> > > > > [3].
> > > > > 
> > > > > Please find attached a patch. This patch solves the issue.
> > > > > 
> > > > > dmesg output after compiling with the attached patch:
> > > > > 
> > > > > [   12.117149] mmc0: SDHCI controller on ffe114000.sdhc
> > > > > [ffe114000.sdhc]
> > > > > using ADMA
> > > > > [   12.208646] mmc0: new high speed SDHC card at address 59b4
> > > > > [   12.230808] mmcblk0: mmc0:59b4 USD   3.73 GiB
> > > > > 
> > > > > -------------------------
> > > > > 
> > > > > Please check the commit
> > > > > "121bd08b029e03404c451bb237729cdff76eafed" [3].
> > > > > 
> > > > > Thanks,
> > > > > Christian
> > > > > 
> > > > > 
> > > > > [1] https://www.amigaos.net/hardware/133/amigaone-x5000 and
> > > > > http://wiki.amiga.org/index.php?title=X5000
> > > > > [2] 
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.4-rc1&id=c710364f78afdef8c2ed07556d0743c5a30ed429
> > > > > [3] 
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=121bd08b029e03404c451bb237729cdff76eafed
> > > > > 
> > 
> > 
> 
>
Benjamin Herrenschmidt Oct. 17, 2019, 2:16 a.m. UTC | #8
On Tue, 2019-10-15 at 15:44 +0200, Ulf Hansson wrote:
> > Hmm, so it looks like PowerPC doesn't mark devices that are dma
> > coherent with a property that describes them as such.
> > 
> > I think this opens a wider question - what should
> > of_dma_is_coherent()
> > return for PowerPC?  It seems right now that it returns false for
> > devices that are DMA coherent, which seems to me to be a recipe for
> > future mistakes.
> 
> Perhaps implement the arch_setup_dma_ops() for PPC, that set
> "dev->dma_coherent = true" could work?

Only for coherent ops :)

Cheers,
Ben.
Russell King - ARM Linux admin Oct. 18, 2019, 10:13 a.m. UTC | #9
On Thu, Oct 17, 2019 at 01:16:00PM +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2019-10-15 at 15:44 +0200, Ulf Hansson wrote:
> > > Hmm, so it looks like PowerPC doesn't mark devices that are dma
> > > coherent with a property that describes them as such.
> > > 
> > > I think this opens a wider question - what should
> > > of_dma_is_coherent()
> > > return for PowerPC?  It seems right now that it returns false for
> > > devices that are DMA coherent, which seems to me to be a recipe for
> > > future mistakes.
> > 
> > Perhaps implement the arch_setup_dma_ops() for PPC, that set
> > "dev->dma_coherent = true" could work?
> 
> Only for coherent ops :)

For those of us who have never touched the PowerPC code before, any
suggestion where you would like to see that?  No file in arch/powerpc/mm
stands out as a place for DMA stuff (and dma-noncoherent.c is certainly
not correct.)

Thanks.
Benjamin Herrenschmidt Oct. 18, 2019, 11:40 p.m. UTC | #10
On Fri, 2019-10-18 at 11:13 +0100, Russell King - ARM Linux admin wrote:
> On Thu, Oct 17, 2019 at 01:16:00PM +1100, Benjamin Herrenschmidt wrote:
> > On Tue, 2019-10-15 at 15:44 +0200, Ulf Hansson wrote:
> > > > Hmm, so it looks like PowerPC doesn't mark devices that are dma
> > > > coherent with a property that describes them as such.
> > > > 
> > > > I think this opens a wider question - what should
> > > > of_dma_is_coherent()
> > > > return for PowerPC?  It seems right now that it returns false for
> > > > devices that are DMA coherent, which seems to me to be a recipe for
> > > > future mistakes.
> > > 
> > > Perhaps implement the arch_setup_dma_ops() for PPC, that set
> > > "dev->dma_coherent = true" could work?
> > 
> > Only for coherent ops :)
> 
> For those of us who have never touched the PowerPC code before, any
> suggestion where you would like to see that?  No file in arch/powerpc/mm
> stands out as a place for DMA stuff (and dma-noncoherent.c is certainly
> not correct.)

Anywhere is fine but I think it's easy. Coherent DMA is for historical
reasons a function of the processor generation, and as such a CONFIG
option.

So setting it according to CONFIG_NOT_COHERENT_CACHE will probably work
just fine.

Well at least I think so ... unless I'm missing some broken HW
somewhere I am not aware of.

Cheers,
Ben.
Christian Zigotzky Oct. 20, 2019, 12:04 p.m. UTC | #11
On 19 October 2019 at 01:40 am, Benjamin Herrenschmidt wrote:
> Anywhere is fine but I think it's easy. Coherent DMA is for historical
> reasons a function of the processor generation, and as such a CONFIG
> option.
>
> So setting it according to CONFIG_NOT_COHERENT_CACHE will probably work
> just fine.
>
> Well at least I think so ... unless I'm missing some broken HW
> somewhere I am not aware of.
>
> Cheers,
> Ben.
Hi Ben,

Thanks a lot for your suggestion!

I had to create a patch because I wasn't able to select 
"CONFIG_NOT_COHERENT_CACHE" in the kernel configuration (patch attached).

After patching I was able to select "CONFIG_NOT_COHERENT_CACHE" in the 
kernel config. I compiled a new RC3 of kernel 5.4 with this kernel 
config option yesterday.

The good news is, that the onboard SD card works! We successfully tested 
it on two AmigaOnes (X5000/20 and X5000/40) yesterday but we need a new 
patch
because of the possibility to select "CONFIG_NOT_COHERENT_CACHE" in the 
kernel config.

Cheers,
Christian
Benjamin Herrenschmidt Oct. 21, 2019, 12:06 a.m. UTC | #12
On Sun, 2019-10-20 at 14:04 +0200, Christian Zigotzky wrote:
> 
> Thanks a lot for your suggestion!
> 
> I had to create a patch because I wasn't able to select 
> "CONFIG_NOT_COHERENT_CACHE" in the kernel configuration (patch attached).
> 
> After patching I was able to select "CONFIG_NOT_COHERENT_CACHE" in the 
> kernel config. I compiled a new RC3 of kernel 5.4 with this kernel 
> config option yesterday.
> 
> The good news is, that the onboard SD card works! We successfully tested 
> it on two AmigaOnes (X5000/20 and X5000/40) yesterday but we need a new 
> patch
> because of the possibility to select "CONFIG_NOT_COHERENT_CACHE" in the 
> kernel config

What I find weird is why would it be non-coherent ? A system based on a
E500 should be coherent afaik...

Cheers,
Ben.
Christian Zigotzky Oct. 21, 2019, 6:39 a.m. UTC | #13
> On 21. Oct 2019, at 02:06, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
>> On Sun, 2019-10-20 at 14:04 +0200, Christian Zigotzky wrote:
>> 
>> Thanks a lot for your suggestion!
>> 
>> I had to create a patch because I wasn't able to select 
>> "CONFIG_NOT_COHERENT_CACHE" in the kernel configuration (patch attached).
>> 
>> After patching I was able to select "CONFIG_NOT_COHERENT_CACHE" in the 
>> kernel config. I compiled a new RC3 of kernel 5.4 with this kernel 
>> config option yesterday.
>> 
>> The good news is, that the onboard SD card works! We successfully tested 
>> it on two AmigaOnes (X5000/20 and X5000/40) yesterday but we need a new 
>> patch
>> because of the possibility to select "CONFIG_NOT_COHERENT_CACHE" in the 
>> kernel config
> 
> What I find weird is why would it be non-coherent ? A system based on a
> E500 should be coherent afaik...
> 
> Cheers,
> Ben.
> 

Hi Ben,

I think it isn’t good to use the kernel config option "CONFIG_NOT_COHERENT_CACHE" if the system is coherent, is it?

We tested the kernel with "CONFIG_NOT_COHERENT_CACHE" yesterday but we didn’t find any differences to the coherent kernel.

Could you please explain us the difference between the coherent and not coherent kernel?

Thanks,
Christian
Benjamin Herrenschmidt Oct. 23, 2019, 1:17 a.m. UTC | #14
On Mon, 2019-10-21 at 08:39 +0200, Christian Zigotzky wrote:
> > 
> Hi Ben,
> 
> I think it isn’t good to use the kernel config option
> "CONFIG_NOT_COHERENT_CACHE" if the system is coherent, is it?
> 
> We tested the kernel with "CONFIG_NOT_COHERENT_CACHE" yesterday but
> we didn’t find any differences to the coherent kernel.
> 
> Could you please explain us the difference between the coherent and
> not coherent kernel?

So first, ideally, we should make this a runtime mechanism...
historical crap.

Fundamentally when you set CONFIG_NOT_COHERENT_CACHE you tell the
kernel that your processor isn't snooping the bus for DMA accesses
colliding with the caches. Thus SW has to explicitly maintain cache
coherency around DMA operations by invalidating or flushing the cache
accordingly and by keeping pool(s) of non-cachable memory around
to use for the consistent allocator.

This slows things down.

Cheers,
Ben.
Michael Ellerman Oct. 23, 2019, 5:42 a.m. UTC | #15
Russell King - ARM Linux admin <linux@armlinux.org.uk> writes:
> On Tue, Oct 15, 2019 at 03:12:49PM +0200, Christian Zigotzky wrote:
>> Hello Russell,
>> 
>> You asked me about "dma-coherent" in the Cyrus device tree. Unfortunately I
>> don't find the property "dma-coherent" in the dtb source files.
>> 
>> Output of "fdtdump cyrus_p5020_eth_poweroff.dtb | grep dma":
>> 
>> dma0 = "/soc@ffe000000/dma@100300";
>>         dma1 = "/soc@ffe000000/dma@101300";
>>         dma@100300 {
>>             compatible = "fsl,eloplus-dma";
>>             dma-channel@0 {
>>                 compatible = "fsl,eloplus-dma-channel";
>>             dma-channel@80 {
>>                 compatible = "fsl,eloplus-dma-channel";
>>             dma-channel@100 {
>>                 compatible = "fsl,eloplus-dma-channel";
>>             dma-channel@180 {
>>                 compatible = "fsl,eloplus-dma-channel";
>>         dma@101300 {
>>             compatible = "fsl,eloplus-dma";
>>             dma-channel@0 {
>>                 compatible = "fsl,eloplus-dma-channel";
>>             dma-channel@80 {
>>                 compatible = "fsl,eloplus-dma-channel";
>>             dma-channel@100 {
>>                 compatible = "fsl,eloplus-dma-channel";
>>             dma-channel@180 {
>>                 compatible = "fsl,eloplus-dma-channel";
>
> Hmm, so it looks like PowerPC doesn't mark devices that are dma
> coherent with a property that describes them as such.
>
> I think this opens a wider question - what should of_dma_is_coherent()
> return for PowerPC?  It seems right now that it returns false for
> devices that are DMA coherent, which seems to me to be a recipe for
> future mistakes.

Right, it seems of_dma_is_coherent() has baked in the assumption that
devices are non-coherent unless explicitly marked as coherent.

Which is wrong on all or at least most existing powerpc systems
according to Ben.

> Any ideas from the PPC maintainers?

Fixing it at the source seems like the best option to prevent future
breakage.

So I guess that would mean making of_dma_is_coherent() return true/false
based on CONFIG_NOT_COHERENT_CACHE on powerpc.

We could do it like below, which would still allow the dma-coherent
property to work if it ever makes sense on a future powerpc platform.

I don't really know any of this embedded stuff well, so happy to take
other suggestions on how to handle this mess.

cheers


diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 25aaa3903000..b96c9010acb6 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -760,6 +760,22 @@ static int __init check_cache_coherency(void)
 late_initcall(check_cache_coherency);
 #endif /* CONFIG_CHECK_CACHE_COHERENCY */
 
+#ifndef CONFIG_NOT_COHERENT_CACHE
+/*
+ * For historical reasons powerpc kernels are built with hard wired knowledge of
+ * whether or not DMA accesses are cache coherent. Additionally device trees on
+ * powerpc do not typically support the dma-coherent property.
+ *
+ * So when we know that DMA is coherent, override arch_of_dma_is_coherent() to
+ * tell the drivers/of code that all devices are coherent regardless of whether
+ * they have a dma-coherent property.
+ */
+bool arch_of_dma_is_coherent(struct device_node *np)
+{
+	return true;
+}
+#endif
+
 #ifdef CONFIG_DEBUG_FS
 struct dentry *powerpc_debugfs_root;
 EXPORT_SYMBOL(powerpc_debugfs_root);
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 978427a9d5e6..3a4b2949a322 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -993,6 +993,14 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
 }
 EXPORT_SYMBOL_GPL(of_dma_get_range);
 
+/*
+ * arch_of_dma_is_coherent - Arch hook to determine if device is coherent for DMA
+ */
+bool __weak arch_of_dma_is_coherent(struct device_node *np)
+{
+	return false;
+}
+
 /**
  * of_dma_is_coherent - Check if device is coherent
  * @np:	device node
@@ -1002,8 +1010,12 @@ EXPORT_SYMBOL_GPL(of_dma_get_range);
  */
 bool of_dma_is_coherent(struct device_node *np)
 {
-	struct device_node *node = of_node_get(np);
+	struct device_node *node;
+
+	if (arch_of_dma_is_coherent(np))
+		return true;
 
+	np = of_node_get(np);
 	while (node) {
 		if (of_property_read_bool(node, "dma-coherent")) {
 			of_node_put(node);
Benjamin Herrenschmidt Oct. 23, 2019, 6:41 a.m. UTC | #16
On Wed, 2019-10-23 at 16:42 +1100, Michael Ellerman wrote:
> 
> Right, it seems of_dma_is_coherent() has baked in the assumption that
> devices are non-coherent unless explicitly marked as coherent.
> 
> Which is wrong on all or at least most existing powerpc systems
> according to Ben.

This is probably broken on sparc(64) as well and whatever else uses
DT and is an intrinsicly coherent architecture (did we ever have
DT enabled x86s ? Wasn't OLPC such a beast ?).

I think this should have been done the other way around and default to
coherent since most traditional OF platforms are coherent, and you
can't just require those DTs to change.

> > Any ideas from the PPC maintainers?
> 
> Fixing it at the source seems like the best option to prevent future
> breakage.
> 
> So I guess that would mean making of_dma_is_coherent() return true/false
> based on CONFIG_NOT_COHERENT_CACHE on powerpc.
> 
> We could do it like below, which would still allow the dma-coherent
> property to work if it ever makes sense on a future powerpc platform.
> 
> I don't really know any of this embedded stuff well, so happy to take
> other suggestions on how to handle this mess.
Rob Herring Oct. 23, 2019, 1:52 p.m. UTC | #17
On Wed, Oct 23, 2019 at 1:41 AM Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
>
> On Wed, 2019-10-23 at 16:42 +1100, Michael Ellerman wrote:
> >
> > Right, it seems of_dma_is_coherent() has baked in the assumption that
> > devices are non-coherent unless explicitly marked as coherent.
> >
> > Which is wrong on all or at least most existing powerpc systems
> > according to Ben.
>
> This is probably broken on sparc(64) as well and whatever else uses
> DT and is an intrinsicly coherent architecture (did we ever have
> DT enabled x86s ? Wasn't OLPC such a beast ?).

Only if those platforms use one of the 5 drivers that call this function:

drivers/ata/ahci_qoriq.c:       qoriq_priv->is_dmacoherent =
of_dma_is_coherent(np);
drivers/crypto/ccree/cc_driver.c:       new_drvdata->coherent =
of_dma_is_coherent(np);
drivers/iommu/arm-smmu-v3.c:    if (of_dma_is_coherent(dev->of_node))
drivers/iommu/arm-smmu.c:       if (of_dma_is_coherent(dev->of_node))
drivers/mmc/host/sdhci-of-esdhc.c:      if (of_dma_is_coherent(dev->of_node))

Curious that a PPC specific driver (ahci_qoriq) calls it...

Note that the value is also passed to arch_setup_dma_ops(), but only
arc, arm, arm64, and mips implement arch_setup_dma_ops.

> I think this should have been done the other way around and default to
> coherent since most traditional OF platforms are coherent, and you
> can't just require those DTs to change.

You can blame me. This was really only intended for cases where
coherency is configurable on a per peripheral basis and can't be
determined in other ways.

The simple solution here is simply to use the compatible string for
the device to determine coherent or not.

Rob
Russell King - ARM Linux admin Oct. 23, 2019, 2:12 p.m. UTC | #18
On Wed, Oct 23, 2019 at 08:52:33AM -0500, Rob Herring wrote:
> On Wed, Oct 23, 2019 at 1:41 AM Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> >
> > On Wed, 2019-10-23 at 16:42 +1100, Michael Ellerman wrote:
> > >
> > > Right, it seems of_dma_is_coherent() has baked in the assumption that
> > > devices are non-coherent unless explicitly marked as coherent.
> > >
> > > Which is wrong on all or at least most existing powerpc systems
> > > according to Ben.
> >
> > This is probably broken on sparc(64) as well and whatever else uses
> > DT and is an intrinsicly coherent architecture (did we ever have
> > DT enabled x86s ? Wasn't OLPC such a beast ?).
> 
> Only if those platforms use one of the 5 drivers that call this function:
> 
> drivers/ata/ahci_qoriq.c:       qoriq_priv->is_dmacoherent =
> of_dma_is_coherent(np);
> drivers/crypto/ccree/cc_driver.c:       new_drvdata->coherent =
> of_dma_is_coherent(np);
> drivers/iommu/arm-smmu-v3.c:    if (of_dma_is_coherent(dev->of_node))
> drivers/iommu/arm-smmu.c:       if (of_dma_is_coherent(dev->of_node))
> drivers/mmc/host/sdhci-of-esdhc.c:      if (of_dma_is_coherent(dev->of_node))
> 
> Curious that a PPC specific driver (ahci_qoriq) calls it...

Maybe because it is not PPC specific:

arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi:
compatible = "fsl,lx2160a-ahci";
arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi:
compatible = "fsl,lx2160a-ahci";
arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi:
compatible = "fsl,lx2160a-ahci";
arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi:
compatible = "fsl,lx2160a-ahci";

drivers/ata/ahci_qoriq.c:       { .compatible = "fsl,lx2160a-ahci", .data = (void *)AHCI_LX2160A},
Christian Zigotzky Oct. 23, 2019, 2:20 p.m. UTC | #19
Hello,

The patch below works. I compiled the RC4 of kernel 5.4 with this patch today and the onboard SD card works without any problems.

Thanks!

Christian

> On 23. Oct 2019, at 07:42, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Russell King - ARM Linux admin <linux@armlinux.org.uk> writes:
>>> On Tue, Oct 15, 2019 at 03:12:49PM +0200, Christian Zigotzky wrote:
>>> Hello Russell,
>>> 
>>> You asked me about "dma-coherent" in the Cyrus device tree. Unfortunately I
>>> don't find the property "dma-coherent" in the dtb source files.
>>> 
>>> Output of "fdtdump cyrus_p5020_eth_poweroff.dtb | grep dma":
>>> 
>>> dma0 = "/soc@ffe000000/dma@100300";
>>>         dma1 = "/soc@ffe000000/dma@101300";
>>>         dma@100300 {
>>>             compatible = "fsl,eloplus-dma";
>>>             dma-channel@0 {
>>>                 compatible = "fsl,eloplus-dma-channel";
>>>             dma-channel@80 {
>>>                 compatible = "fsl,eloplus-dma-channel";
>>>             dma-channel@100 {
>>>                 compatible = "fsl,eloplus-dma-channel";
>>>             dma-channel@180 {
>>>                 compatible = "fsl,eloplus-dma-channel";
>>>         dma@101300 {
>>>             compatible = "fsl,eloplus-dma";
>>>             dma-channel@0 {
>>>                 compatible = "fsl,eloplus-dma-channel";
>>>             dma-channel@80 {
>>>                 compatible = "fsl,eloplus-dma-channel";
>>>             dma-channel@100 {
>>>                 compatible = "fsl,eloplus-dma-channel";
>>>             dma-channel@180 {
>>>                 compatible = "fsl,eloplus-dma-channel";
>> 
>> Hmm, so it looks like PowerPC doesn't mark devices that are dma
>> coherent with a property that describes them as such.
>> 
>> I think this opens a wider question - what should of_dma_is_coherent()
>> return for PowerPC?  It seems right now that it returns false for
>> devices that are DMA coherent, which seems to me to be a recipe for
>> future mistakes.
> 
> Right, it seems of_dma_is_coherent() has baked in the assumption that
> devices are non-coherent unless explicitly marked as coherent.
> 
> Which is wrong on all or at least most existing powerpc systems
> according to Ben.
> 
>> Any ideas from the PPC maintainers?
> 
> Fixing it at the source seems like the best option to prevent future
> breakage.
> 
> So I guess that would mean making of_dma_is_coherent() return true/false
> based on CONFIG_NOT_COHERENT_CACHE on powerpc.
> 
> We could do it like below, which would still allow the dma-coherent
> property to work if it ever makes sense on a future powerpc platform.
> 
> I don't really know any of this embedded stuff well, so happy to take
> other suggestions on how to handle this mess.
> 
> cheers
> 
> 
> diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
> index 25aaa3903000..b96c9010acb6 100644
> --- a/arch/powerpc/kernel/setup-common.c
> +++ b/arch/powerpc/kernel/setup-common.c
> @@ -760,6 +760,22 @@ static int __init check_cache_coherency(void)
> late_initcall(check_cache_coherency);
> #endif /* CONFIG_CHECK_CACHE_COHERENCY */
> 
> +#ifndef CONFIG_NOT_COHERENT_CACHE
> +/*
> + * For historical reasons powerpc kernels are built with hard wired knowledge of
> + * whether or not DMA accesses are cache coherent. Additionally device trees on
> + * powerpc do not typically support the dma-coherent property.
> + *
> + * So when we know that DMA is coherent, override arch_of_dma_is_coherent() to
> + * tell the drivers/of code that all devices are coherent regardless of whether
> + * they have a dma-coherent property.
> + */
> +bool arch_of_dma_is_coherent(struct device_node *np)
> +{
> +    return true;
> +}
> +#endif
> +
> #ifdef CONFIG_DEBUG_FS
> struct dentry *powerpc_debugfs_root;
> EXPORT_SYMBOL(powerpc_debugfs_root);
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 978427a9d5e6..3a4b2949a322 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -993,6 +993,14 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
> }
> EXPORT_SYMBOL_GPL(of_dma_get_range);
> 
> +/*
> + * arch_of_dma_is_coherent - Arch hook to determine if device is coherent for DMA
> + */
> +bool __weak arch_of_dma_is_coherent(struct device_node *np)
> +{
> +    return false;
> +}
> +
> /**
>  * of_dma_is_coherent - Check if device is coherent
>  * @np:    device node
> @@ -1002,8 +1010,12 @@ EXPORT_SYMBOL_GPL(of_dma_get_range);
>  */
> bool of_dma_is_coherent(struct device_node *np)
> {
> -    struct device_node *node = of_node_get(np);
> +    struct device_node *node;
> +
> +    if (arch_of_dma_is_coherent(np))
> +        return true;
> 
> +    np = of_node_get(np);
>    while (node) {
>        if (of_property_read_bool(node, "dma-coherent")) {
>            of_node_put(node);
Russell King - ARM Linux admin Oct. 23, 2019, 2:31 p.m. UTC | #20
On Wed, Oct 23, 2019 at 08:52:33AM -0500, Rob Herring wrote:
> > I think this should have been done the other way around and default to
> > coherent since most traditional OF platforms are coherent, and you
> > can't just require those DTs to change.
> 
> You can blame me. This was really only intended for cases where
> coherency is configurable on a per peripheral basis and can't be
> determined in other ways.
> 
> The simple solution here is simply to use the compatible string for
> the device to determine coherent or not.

It really isn't that simple.

There are two aspects to coherency, both of which must match:

1) The configuration of the device
2) The configuration of the kernel's DMA API

(1) is controlled by the driver, which can make the decision any way
it pleases.

(2) on ARM64 is controlled depending on whether or not "dma-coherent"
is specified in the device tree, since ARM64 can have a mixture of
DMA coherent and non-coherent devices.

A mismatch between (1) and (2) results in data corruption, potentially
eating your filesystem.  So, it's very important that the two match.

These didn't match for the LX2160A, but, due to the way CMA was working,
we sort of got away with it, but it was very dangerous as far as data
safety went.

Then, a change to CMA happened which moved where it was located, which
caused a regression.  Reverting the CMA changes didn't seem to be an
option, so another solution had to be found.

I started a discussion on how best to solve this:

https://archive.armlinux.org.uk/lurker/thread/20190919.041320.1e53541f.en.html

and the solution that the discussion came out with was the one that has
been merged - which we now know caused a regression on PPC.

Using compatible strings doesn't solve the issue: there is no way to
tell the DMA API from the driver that the device is coherent.  The
only way to do that is via the "dma-coherent" property in DT on ARM64.

To say that this is a mess is an under-statement, but we seem to have
ended up here because of a series of piece-meal changes that don't seem
to have been thought through enough.

So, what's the right way to solve this, and ensure that the DMA API and
device match as far as their coherency expectations go?  Revert all the
changes for sdhci-of-esdhc and CMA back to 5.0 or 5.1 state?
Rob Herring Oct. 25, 2019, 10:28 p.m. UTC | #21
On Wed, Oct 23, 2019 at 9:32 AM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Wed, Oct 23, 2019 at 08:52:33AM -0500, Rob Herring wrote:
> > > I think this should have been done the other way around and default to
> > > coherent since most traditional OF platforms are coherent, and you
> > > can't just require those DTs to change.
> >
> > You can blame me. This was really only intended for cases where
> > coherency is configurable on a per peripheral basis and can't be
> > determined in other ways.
> >
> > The simple solution here is simply to use the compatible string for
> > the device to determine coherent or not.
>
> It really isn't that simple.

This doesn't work?:

        if (IS_ENABLED(CONFIG_PPC) || of_dma_is_coherent(dev->of_node))
                value |= ESDHC_DMA_SNOOP;
        else
                value &= ~ESDHC_DMA_SNOOP;

While I said use the compatibles, using the kconfig symbol is easier
than sorting out which compatibles are PPC SoCs. Though if that's
already done elsewhere in the driver, you could set a flag and use
that here. I'd be surprised if this was the only difference between
ARM and PPC SoCs for this block.

> There are two aspects to coherency, both of which must match:
>
> 1) The configuration of the device
> 2) The configuration of the kernel's DMA API
>
> (1) is controlled by the driver, which can make the decision any way
> it pleases.
>
> (2) on ARM64 is controlled depending on whether or not "dma-coherent"
> is specified in the device tree, since ARM64 can have a mixture of
> DMA coherent and non-coherent devices.
>
> A mismatch between (1) and (2) results in data corruption, potentially
> eating your filesystem.  So, it's very important that the two match.
>
> These didn't match for the LX2160A, but, due to the way CMA was working,
> we sort of got away with it, but it was very dangerous as far as data
> safety went.
>
> Then, a change to CMA happened which moved where it was located, which
> caused a regression.  Reverting the CMA changes didn't seem to be an
> option, so another solution had to be found.
>
> I started a discussion on how best to solve this:
>
> https://archive.armlinux.org.uk/lurker/thread/20190919.041320.1e53541f.en.html
>
> and the solution that the discussion came out with was the one that has
> been merged - which we now know caused a regression on PPC.
>
> Using compatible strings doesn't solve the issue: there is no way to
> tell the DMA API from the driver that the device is coherent.  The
> only way to do that is via the "dma-coherent" property in DT on ARM64.
>
> To say that this is a mess is an under-statement, but we seem to have
> ended up here because of a series of piece-meal changes that don't seem
> to have been thought through enough.
>
> So, what's the right way to solve this, and ensure that the DMA API and
> device match as far as their coherency expectations go?  Revert all the
> changes for sdhci-of-esdhc and CMA back to 5.0 or 5.1 state?

The other option is similar to earlier in the thread and just add to
of_dma_is_coherent():

/* Powerpc is normally cache coherent DMA */
if (IS_ENABLED(CONFIG_PPC) && !IS_ENABLED(CONFIG_NOT_COHERENT_CACHE))
    return true;

We could do the all the weak arch hooks, but that seems like overkill
to me at this point.

Rob
Christoph Hellwig Oct. 26, 2019, 6:39 a.m. UTC | #22
On Fri, Oct 25, 2019 at 05:28:45PM -0500, Rob Herring wrote:
> This doesn't work?:
> 
>         if (IS_ENABLED(CONFIG_PPC) || of_dma_is_coherent(dev->of_node))
>                 value |= ESDHC_DMA_SNOOP;
>         else
>                 value &= ~ESDHC_DMA_SNOOP;
> 
> While I said use the compatibles, using the kconfig symbol is easier
> than sorting out which compatibles are PPC SoCs. Though if that's
> already done elsewhere in the driver, you could set a flag and use
> that here. I'd be surprised if this was the only difference between
> ARM and PPC SoCs for this block.

I think the right thing is a Kconfig variable that the architectures
selects which says if OF is by default coherent or incoherent, and then
use that in of_dma_is_coherent.
Benjamin Herrenschmidt Oct. 28, 2019, 8:45 a.m. UTC | #23
On Fri, 2019-10-25 at 17:28 -0500, Rob Herring wrote:
> This doesn't work?:
> 
>         if (IS_ENABLED(CONFIG_PPC) || of_dma_is_coherent(dev-
> >of_node))
>                 value |= ESDHC_DMA_SNOOP;
>         else
>                 value &= ~ESDHC_DMA_SNOOP;

CONFIG_PPC is restrictive. What about sparc64 ? There could be others
.. .we can't suddenly requests people to add new properties for what
was implied behaviours before hand, esp since it's not in the base 1275
spec, no ?

I would suggest of_dma_is_coherent is *true* by default unless
overriden to do something else.

Cheers,
Ben.
Benjamin Herrenschmidt Oct. 28, 2019, 8:47 a.m. UTC | #24
On Fri, 2019-10-25 at 23:39 -0700, Christoph Hellwig wrote:
> On Fri, Oct 25, 2019 at 05:28:45PM -0500, Rob Herring wrote:
> > This doesn't work?:
> > 
> >         if (IS_ENABLED(CONFIG_PPC) || of_dma_is_coherent(dev-
> > >of_node))
> >                 value |= ESDHC_DMA_SNOOP;
> >         else
> >                 value &= ~ESDHC_DMA_SNOOP;
> > 
> > While I said use the compatibles, using the kconfig symbol is
> > easier
> > than sorting out which compatibles are PPC SoCs. Though if that's
> > already done elsewhere in the driver, you could set a flag and use
> > that here. I'd be surprised if this was the only difference between
> > ARM and PPC SoCs for this block.
> 
> I think the right thing is a Kconfig variable that the architectures
> selects which says if OF is by default coherent or incoherent, and
> then use that in of_dma_is_coherent.

That too. We could also define properties for both ways so we can
override the default either way.

Cheers,
Ben.
Christian Zigotzky Nov. 4, 2019, 2:44 p.m. UTC | #25
FYI: The onboard SD card works also with the RC6 of kernel 5.4 with the 
patch below.

-- Christian

On 23 October 2019 at 4:20pm, Christian Zigotzky wrote:
> Hello,
>
> The patch below works. I compiled the RC4 of kernel 5.4 with this patch today and the onboard SD card works without any problems.
>
> Thanks!
>
> Christian
>
>> On 23. Oct 2019, at 07:42, Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> Russell King - ARM Linux admin <linux@armlinux.org.uk> writes:
>>>> On Tue, Oct 15, 2019 at 03:12:49PM +0200, Christian Zigotzky wrote:
>>>> Hello Russell,
>>>>
>>>> You asked me about "dma-coherent" in the Cyrus device tree. Unfortunately I
>>>> don't find the property "dma-coherent" in the dtb source files.
>>>>
>>>> Output of "fdtdump cyrus_p5020_eth_poweroff.dtb | grep dma":
>>>>
>>>> dma0 = "/soc@ffe000000/dma@100300";
>>>>          dma1 = "/soc@ffe000000/dma@101300";
>>>>          dma@100300 {
>>>>              compatible = "fsl,eloplus-dma";
>>>>              dma-channel@0 {
>>>>                  compatible = "fsl,eloplus-dma-channel";
>>>>              dma-channel@80 {
>>>>                  compatible = "fsl,eloplus-dma-channel";
>>>>              dma-channel@100 {
>>>>                  compatible = "fsl,eloplus-dma-channel";
>>>>              dma-channel@180 {
>>>>                  compatible = "fsl,eloplus-dma-channel";
>>>>          dma@101300 {
>>>>              compatible = "fsl,eloplus-dma";
>>>>              dma-channel@0 {
>>>>                  compatible = "fsl,eloplus-dma-channel";
>>>>              dma-channel@80 {
>>>>                  compatible = "fsl,eloplus-dma-channel";
>>>>              dma-channel@100 {
>>>>                  compatible = "fsl,eloplus-dma-channel";
>>>>              dma-channel@180 {
>>>>                  compatible = "fsl,eloplus-dma-channel";
>>> Hmm, so it looks like PowerPC doesn't mark devices that are dma
>>> coherent with a property that describes them as such.
>>>
>>> I think this opens a wider question - what should of_dma_is_coherent()
>>> return for PowerPC?  It seems right now that it returns false for
>>> devices that are DMA coherent, which seems to me to be a recipe for
>>> future mistakes.
>> Right, it seems of_dma_is_coherent() has baked in the assumption that
>> devices are non-coherent unless explicitly marked as coherent.
>>
>> Which is wrong on all or at least most existing powerpc systems
>> according to Ben.
>>
>>> Any ideas from the PPC maintainers?
>> Fixing it at the source seems like the best option to prevent future
>> breakage.
>>
>> So I guess that would mean making of_dma_is_coherent() return true/false
>> based on CONFIG_NOT_COHERENT_CACHE on powerpc.
>>
>> We could do it like below, which would still allow the dma-coherent
>> property to work if it ever makes sense on a future powerpc platform.
>>
>> I don't really know any of this embedded stuff well, so happy to take
>> other suggestions on how to handle this mess.
>>
>> cheers
>>
>>
>> diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
>> index 25aaa3903000..b96c9010acb6 100644
>> --- a/arch/powerpc/kernel/setup-common.c
>> +++ b/arch/powerpc/kernel/setup-common.c
>> @@ -760,6 +760,22 @@ static int __init check_cache_coherency(void)
>> late_initcall(check_cache_coherency);
>> #endif /* CONFIG_CHECK_CACHE_COHERENCY */
>>
>> +#ifndef CONFIG_NOT_COHERENT_CACHE
>> +/*
>> + * For historical reasons powerpc kernels are built with hard wired knowledge of
>> + * whether or not DMA accesses are cache coherent. Additionally device trees on
>> + * powerpc do not typically support the dma-coherent property.
>> + *
>> + * So when we know that DMA is coherent, override arch_of_dma_is_coherent() to
>> + * tell the drivers/of code that all devices are coherent regardless of whether
>> + * they have a dma-coherent property.
>> + */
>> +bool arch_of_dma_is_coherent(struct device_node *np)
>> +{
>> +    return true;
>> +}
>> +#endif
>> +
>> #ifdef CONFIG_DEBUG_FS
>> struct dentry *powerpc_debugfs_root;
>> EXPORT_SYMBOL(powerpc_debugfs_root);
>> diff --git a/drivers/of/address.c b/drivers/of/address.c
>> index 978427a9d5e6..3a4b2949a322 100644
>> --- a/drivers/of/address.c
>> +++ b/drivers/of/address.c
>> @@ -993,6 +993,14 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
>> }
>> EXPORT_SYMBOL_GPL(of_dma_get_range);
>>
>> +/*
>> + * arch_of_dma_is_coherent - Arch hook to determine if device is coherent for DMA
>> + */
>> +bool __weak arch_of_dma_is_coherent(struct device_node *np)
>> +{
>> +    return false;
>> +}
>> +
>> /**
>>   * of_dma_is_coherent - Check if device is coherent
>>   * @np:    device node
>> @@ -1002,8 +1010,12 @@ EXPORT_SYMBOL_GPL(of_dma_get_range);
>>   */
>> bool of_dma_is_coherent(struct device_node *np)
>> {
>> -    struct device_node *node = of_node_get(np);
>> +    struct device_node *node;
>> +
>> +    if (arch_of_dma_is_coherent(np))
>> +        return true;
>>
>> +    np = of_node_get(np);
>>     while (node) {
>>         if (of_property_read_bool(node, "dma-coherent")) {
>>             of_node_put(node);
diff mbox series

Patch

diff -rupN a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
--- a/drivers/mmc/host/sdhci-of-esdhc.c	2019-10-04 12:40:23.736729250 +0200
+++ b/drivers/mmc/host/sdhci-of-esdhc.c	2019-10-04 12:40:46.476869118 +0200
@@ -495,12 +495,7 @@  static int esdhc_of_enable_dma(struct sd
 		dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
 
 	value = sdhci_readl(host, ESDHC_DMA_SYSCTL);
-
-	if (of_dma_is_coherent(dev->of_node))
-		value |= ESDHC_DMA_SNOOP;
-	else
-		value &= ~ESDHC_DMA_SNOOP;
-
+	value |= ESDHC_DMA_SNOOP;
 	sdhci_writel(host, value, ESDHC_DMA_SYSCTL);
 	return 0;
 }