diff mbox series

[2/3] mmc: sdhci-of-esdhc: set DMA snooping based on DMA coherence

Message ID E1iBz50-0008Mc-8K@rmk-PC.armlinux.org.uk (mailing list archive)
State New, archived
Headers show
Series Fix sdhci-of-esdhc DMA coherency | expand

Commit Message

Russell King (Oracle) Sept. 22, 2019, 10:26 a.m. UTC
We must not unconditionally set the DMA snoop bit; if the DMA API is
assuming that the device is not DMA coherent, and the device snoops the
CPU caches, the device can see stale cache lines brought in by
speculative prefetch.

This leads to the device seeing stale data, potentially resulting in
corrupted data transfers.  Commonly, this results in a descriptor fetch
error such as:

mmc0: ADMA error
mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002202
mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x00000038
mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000040d8
mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
mmc0: sdhci: Caps:      0x35fa0000 | Caps_1:   0x0000af00
mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x001d8a33
mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
mmc0: sdhci: Host ctl2: 0x00000000
mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x000000236d43820c
mmc0: sdhci: ============================================
mmc0: error -5 whilst initialising SD card

but can lead to other errors, and potentially direct the SDHCI
controller to read/write data to other memory locations (e.g. if a valid
descriptor is visible to the device in a stale cache line.)

Fix this by ensuring that the DMA snoop bit corresponds with the
behaviour of the DMA API.  Since the driver currently only supports DT,
use of_dma_is_coherent().  Note that device_get_dma_attr() can not be
used as that risks re-introducing this bug if/when the driver is
converted to ACPI.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/mmc/host/sdhci-of-esdhc.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Yangbo Lu Sept. 23, 2019, 2:51 a.m. UTC | #1
Hi Russell,

> -----Original Message-----
> From: Russell King <rmk@armlinux.org.uk> On Behalf Of Russell King
> Sent: Sunday, September 22, 2019 6:27 PM
> To: Robin Murphy <robin.murphy@arm.com>; dann frazier
> <dann.frazier@canonical.com>; Will Deacon <will.deacon@arm.com>; Nicolin
> Chen <nicoleotsuka@gmail.com>; Y.b. Lu <yangbo.lu@nxp.com>; Christoph
> Hellwig <hch@lst.de>
> Cc: Adrian Hunter <adrian.hunter@intel.com>; Ulf Hansson
> <ulf.hansson@linaro.org>; linux-mmc@vger.kernel.org
> Subject: [PATCH 2/3] mmc: sdhci-of-esdhc: set DMA snooping based on DMA
> coherence
> 
> We must not unconditionally set the DMA snoop bit; if the DMA API is
> assuming that the device is not DMA coherent, and the device snoops the CPU
> caches, the device can see stale cache lines brought in by speculative prefetch.
> 
> This leads to the device seeing stale data, potentially resulting in corrupted
> data transfers.  Commonly, this results in a descriptor fetch error such as:
> 
> mmc0: ADMA error
> mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002202
> mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
> mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x00000038
> mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
> mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000040d8
> mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
> mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
> mmc0: sdhci: Caps:      0x35fa0000 | Caps_1:   0x0000af00
> mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x001d8a33
> mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
> mmc0: sdhci: Host ctl2: 0x00000000
> mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x000000236d43820c
> mmc0: sdhci: ============================================
> mmc0: error -5 whilst initialising SD card
> 
> but can lead to other errors, and potentially direct the SDHCI controller to
> read/write data to other memory locations (e.g. if a valid descriptor is visible
> to the device in a stale cache line.)
> 
> Fix this by ensuring that the DMA snoop bit corresponds with the behaviour of
> the DMA API.  Since the driver currently only supports DT, use
> of_dma_is_coherent().  Note that device_get_dma_attr() can not be used as
> that risks re-introducing this bug if/when the driver is converted to ACPI.

[Y.b. Lu] May I have your suggestion how to check this condition with ACPI?
Although we didn’t support APCI on upstream for this driver now, as I know we had already had ACPI used internally and would be upstreamed.
Thanks.

> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> ---
>  drivers/mmc/host/sdhci-of-esdhc.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-of-esdhc.c
> b/drivers/mmc/host/sdhci-of-esdhc.c
> index 4dd43b1adf2c..74de5e8c45c8 100644
> --- a/drivers/mmc/host/sdhci-of-esdhc.c
> +++ b/drivers/mmc/host/sdhci-of-esdhc.c
> @@ -495,7 +495,12 @@ static int esdhc_of_enable_dma(struct sdhci_host
> *host)
>  		dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
> 
>  	value = sdhci_readl(host, ESDHC_DMA_SYSCTL);
> -	value |= ESDHC_DMA_SNOOP;
> +
> +	if (of_dma_is_coherent(dev->of_node))
> +		value |= ESDHC_DMA_SNOOP;
> +	else
> +		value &= ~ESDHC_DMA_SNOOP;
> +
>  	sdhci_writel(host, value, ESDHC_DMA_SYSCTL);
>  	return 0;
>  }
> --
> 2.7.4
Adrian Hunter Sept. 23, 2019, 6:59 a.m. UTC | #2
On 22/09/19 1:26 PM, Russell King wrote:
> We must not unconditionally set the DMA snoop bit; if the DMA API is
> assuming that the device is not DMA coherent, and the device snoops the
> CPU caches, the device can see stale cache lines brought in by
> speculative prefetch.
> 
> This leads to the device seeing stale data, potentially resulting in
> corrupted data transfers.  Commonly, this results in a descriptor fetch
> error such as:
> 
> mmc0: ADMA error
> mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002202
> mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
> mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x00000038
> mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
> mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000040d8
> mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
> mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
> mmc0: sdhci: Caps:      0x35fa0000 | Caps_1:   0x0000af00
> mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x001d8a33
> mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
> mmc0: sdhci: Host ctl2: 0x00000000
> mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x000000236d43820c
> mmc0: sdhci: ============================================
> mmc0: error -5 whilst initialising SD card
> 
> but can lead to other errors, and potentially direct the SDHCI
> controller to read/write data to other memory locations (e.g. if a valid
> descriptor is visible to the device in a stale cache line.)
> 
> Fix this by ensuring that the DMA snoop bit corresponds with the
> behaviour of the DMA API.  Since the driver currently only supports DT,
> use of_dma_is_coherent().  Note that device_get_dma_attr() can not be
> used as that risks re-introducing this bug if/when the driver is
> converted to ACPI.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Acked-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  drivers/mmc/host/sdhci-of-esdhc.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
> index 4dd43b1adf2c..74de5e8c45c8 100644
> --- a/drivers/mmc/host/sdhci-of-esdhc.c
> +++ b/drivers/mmc/host/sdhci-of-esdhc.c
> @@ -495,7 +495,12 @@ static int esdhc_of_enable_dma(struct sdhci_host *host)
>  		dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
>  
>  	value = sdhci_readl(host, ESDHC_DMA_SYSCTL);
> -	value |= ESDHC_DMA_SNOOP;
> +
> +	if (of_dma_is_coherent(dev->of_node))
> +		value |= ESDHC_DMA_SNOOP;
> +	else
> +		value &= ~ESDHC_DMA_SNOOP;
> +
>  	sdhci_writel(host, value, ESDHC_DMA_SYSCTL);
>  	return 0;
>  }
>
Russell King (Oracle) Sept. 23, 2019, 8:41 a.m. UTC | #3
On Mon, Sep 23, 2019 at 02:51:15AM +0000, Y.b. Lu wrote:
> Hi Russell,
> 
> > -----Original Message-----
> > From: Russell King <rmk@armlinux.org.uk> On Behalf Of Russell King
> > Sent: Sunday, September 22, 2019 6:27 PM
> > To: Robin Murphy <robin.murphy@arm.com>; dann frazier
> > <dann.frazier@canonical.com>; Will Deacon <will.deacon@arm.com>; Nicolin
> > Chen <nicoleotsuka@gmail.com>; Y.b. Lu <yangbo.lu@nxp.com>; Christoph
> > Hellwig <hch@lst.de>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>; Ulf Hansson
> > <ulf.hansson@linaro.org>; linux-mmc@vger.kernel.org
> > Subject: [PATCH 2/3] mmc: sdhci-of-esdhc: set DMA snooping based on DMA
> > coherence
> > 
> > We must not unconditionally set the DMA snoop bit; if the DMA API is
> > assuming that the device is not DMA coherent, and the device snoops the CPU
> > caches, the device can see stale cache lines brought in by speculative prefetch.
> > 
> > This leads to the device seeing stale data, potentially resulting in corrupted
> > data transfers.  Commonly, this results in a descriptor fetch error such as:
> > 
> > mmc0: ADMA error
> > mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> > mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002202
> > mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
> > mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> > mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x00000038
> > mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
> > mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000040d8
> > mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> > mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
> > mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
> > mmc0: sdhci: Caps:      0x35fa0000 | Caps_1:   0x0000af00
> > mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> > mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x001d8a33
> > mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
> > mmc0: sdhci: Host ctl2: 0x00000000
> > mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x000000236d43820c
> > mmc0: sdhci: ============================================
> > mmc0: error -5 whilst initialising SD card
> > 
> > but can lead to other errors, and potentially direct the SDHCI controller to
> > read/write data to other memory locations (e.g. if a valid descriptor is visible
> > to the device in a stale cache line.)
> > 
> > Fix this by ensuring that the DMA snoop bit corresponds with the behaviour of
> > the DMA API.  Since the driver currently only supports DT, use
> > of_dma_is_coherent().  Note that device_get_dma_attr() can not be used as
> > that risks re-introducing this bug if/when the driver is converted to ACPI.
> 
> [Y.b. Lu] May I have your suggestion how to check this condition with ACPI?
> Although we didn’t support APCI on upstream for this driver now, as I know we had already had ACPI used internally and would be upstreamed.
> Thanks.

The short answer is, I don't believe there is an equivalent for ACPI.
However, it's a question for ACPI people, not me, as I have little
knowedge about ACPI.

From what I've looked at so far, for ACPI, the decision used for
dev->dma_coherent() / dev_is_dma_coherent() (which controls the DMA
API behaviour) is not available - this is a decision by the SMMU code.

See:
	drivers/acpi/arm64/iort.c:arm_smmu_dma_configure()
	drivers/acpi/scan.c:acpi_dma_configure()
	arch/arm64/mm/dma-mapping.c:arch_setup_dma_ops()

The decision is made whether the SMMU is in coherent walk mode.

The proposed interface to use is device_get_dma_attr().  For OF,
this works as expected - it mirrors of_dma_is_coherent().  However,
for ACPI it does not - under ACPI, it uses acpi_get_dma_attr().

See:
	drivers/acpi/scan.c:acpi_get_dma_attr()

This can return one of three states.  Let's concentrate on the
coherent/non-coherent states.  This depends on
adev->flags.coherent_dma.  This gets set by acpi_init_coherency(),
which looks up the _CCA property in the device handle.

It looks to me like it is entirely possible for ACPI to say that, for
example, the SMMU is coherent, which will cause dev->dma_coherent to
be set, but the device (and it's parents) not to have _CCA indicating
that the device is coherent.

It seems that the only way this could be guaranteed is if the ESDHC
was a child of the SMMU - but I don't know whether that is the case
or not.  If it isn't, using device_get_dma_attr() will result in a
coherency disagreement between the DMA API and the device, which
will lead to ADMA errors and potential data corruption.

There may be some subtle reason this can't happen, but from merely
looking at the code, I'd say using device_get_dma_attr() here would
be dangerous.

> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > ---
> >  drivers/mmc/host/sdhci-of-esdhc.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/mmc/host/sdhci-of-esdhc.c
> > b/drivers/mmc/host/sdhci-of-esdhc.c
> > index 4dd43b1adf2c..74de5e8c45c8 100644
> > --- a/drivers/mmc/host/sdhci-of-esdhc.c
> > +++ b/drivers/mmc/host/sdhci-of-esdhc.c
> > @@ -495,7 +495,12 @@ static int esdhc_of_enable_dma(struct sdhci_host
> > *host)
> >  		dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
> > 
> >  	value = sdhci_readl(host, ESDHC_DMA_SYSCTL);
> > -	value |= ESDHC_DMA_SNOOP;
> > +
> > +	if (of_dma_is_coherent(dev->of_node))
> > +		value |= ESDHC_DMA_SNOOP;
> > +	else
> > +		value &= ~ESDHC_DMA_SNOOP;
> > +
> >  	sdhci_writel(host, value, ESDHC_DMA_SYSCTL);
> >  	return 0;
> >  }
> > --
> > 2.7.4
>
Russell King (Oracle) Sept. 23, 2019, 11:10 a.m. UTC | #4
On Mon, Sep 23, 2019 at 09:41:34AM +0100, Russell King - ARM Linux admin wrote:
> On Mon, Sep 23, 2019 at 02:51:15AM +0000, Y.b. Lu wrote:
> > Hi Russell,
> > 
> > > -----Original Message-----
> > > From: Russell King <rmk@armlinux.org.uk> On Behalf Of Russell King
> > > Sent: Sunday, September 22, 2019 6:27 PM
> > > To: Robin Murphy <robin.murphy@arm.com>; dann frazier
> > > <dann.frazier@canonical.com>; Will Deacon <will.deacon@arm.com>; Nicolin
> > > Chen <nicoleotsuka@gmail.com>; Y.b. Lu <yangbo.lu@nxp.com>; Christoph
> > > Hellwig <hch@lst.de>
> > > Cc: Adrian Hunter <adrian.hunter@intel.com>; Ulf Hansson
> > > <ulf.hansson@linaro.org>; linux-mmc@vger.kernel.org
> > > Subject: [PATCH 2/3] mmc: sdhci-of-esdhc: set DMA snooping based on DMA
> > > coherence
> > > 
> > > We must not unconditionally set the DMA snoop bit; if the DMA API is
> > > assuming that the device is not DMA coherent, and the device snoops the CPU
> > > caches, the device can see stale cache lines brought in by speculative prefetch.
> > > 
> > > This leads to the device seeing stale data, potentially resulting in corrupted
> > > data transfers.  Commonly, this results in a descriptor fetch error such as:
> > > 
> > > mmc0: ADMA error
> > > mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> > > mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002202
> > > mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
> > > mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> > > mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x00000038
> > > mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
> > > mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000040d8
> > > mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> > > mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
> > > mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
> > > mmc0: sdhci: Caps:      0x35fa0000 | Caps_1:   0x0000af00
> > > mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> > > mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x001d8a33
> > > mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
> > > mmc0: sdhci: Host ctl2: 0x00000000
> > > mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x000000236d43820c
> > > mmc0: sdhci: ============================================
> > > mmc0: error -5 whilst initialising SD card
> > > 
> > > but can lead to other errors, and potentially direct the SDHCI controller to
> > > read/write data to other memory locations (e.g. if a valid descriptor is visible
> > > to the device in a stale cache line.)
> > > 
> > > Fix this by ensuring that the DMA snoop bit corresponds with the behaviour of
> > > the DMA API.  Since the driver currently only supports DT, use
> > > of_dma_is_coherent().  Note that device_get_dma_attr() can not be used as
> > > that risks re-introducing this bug if/when the driver is converted to ACPI.
> > 
> > [Y.b. Lu] May I have your suggestion how to check this condition with ACPI?
> > Although we didn’t support APCI on upstream for this driver now, as I know we had already had ACPI used internally and would be upstreamed.
> > Thanks.
> 
> The short answer is, I don't believe there is an equivalent for ACPI.
> However, it's a question for ACPI people, not me, as I have little
> knowedge about ACPI.
> 
> From what I've looked at so far, for ACPI, the decision used for
> dev->dma_coherent() / dev_is_dma_coherent() (which controls the DMA
> API behaviour) is not available - this is a decision by the SMMU code.
> 
> See:
> 	drivers/acpi/arm64/iort.c:arm_smmu_dma_configure()
> 	drivers/acpi/scan.c:acpi_dma_configure()
> 	arch/arm64/mm/dma-mapping.c:arch_setup_dma_ops()
> 
> The decision is made whether the SMMU is in coherent walk mode.
> 
> The proposed interface to use is device_get_dma_attr().  For OF,
> this works as expected - it mirrors of_dma_is_coherent().  However,
> for ACPI it does not - under ACPI, it uses acpi_get_dma_attr().
> 
> See:
> 	drivers/acpi/scan.c:acpi_get_dma_attr()
> 
> This can return one of three states.  Let's concentrate on the
> coherent/non-coherent states.  This depends on
> adev->flags.coherent_dma.  This gets set by acpi_init_coherency(),
> which looks up the _CCA property in the device handle.
> 
> It looks to me like it is entirely possible for ACPI to say that, for
> example, the SMMU is coherent, which will cause dev->dma_coherent to
> be set, but the device (and it's parents) not to have _CCA indicating
> that the device is coherent.
> 
> It seems that the only way this could be guaranteed is if the ESDHC
> was a child of the SMMU - but I don't know whether that is the case
> or not.  If it isn't, using device_get_dma_attr() will result in a
> coherency disagreement between the DMA API and the device, which
> will lead to ADMA errors and potential data corruption.
> 
> There may be some subtle reason this can't happen, but from merely
> looking at the code, I'd say using device_get_dma_attr() here would
> be dangerous.

The best I can come up with is something like:

	enum dev_dma_attr attr, expected;

	attr = device_get_dma_attr(dev);
	expected = dev_is_dma_coherent(dev) ? DEV_DMA_COHERENT :
			DEV_DMA_NON_COHERENT;

	WARN_ON(attr != expected);

Maybe even aborting the probe if they don't agree.  However, it seems
dev_is_dma_coherent() is an IOMMU/DMA API implementation detail that
only IOMMU drivers and arch code are supposed to use.

What I'm hearing at the moment from Jon Nettleton is that the NXP ACPI
description does not include an IORT table and doesn't mention the
SMMU.  If I'm reading the code correctly, that means the Linux DMA API
will assume all devices are DMA non-coherent - and if we use
device_get_dma_attr() for this, and the ACPI device description has
_CCA=1, we'll end up with potentially data corrupting mismatched DMA
coherency expectations.
Russell King (Oracle) Sept. 23, 2019, 12:29 p.m. UTC | #5
On Mon, Sep 23, 2019 at 12:10:16PM +0100, Russell King - ARM Linux admin wrote:
> On Mon, Sep 23, 2019 at 09:41:34AM +0100, Russell King - ARM Linux admin wrote:
> > On Mon, Sep 23, 2019 at 02:51:15AM +0000, Y.b. Lu wrote:
> > > Hi Russell,
> > > 
> > > > -----Original Message-----
> > > > From: Russell King <rmk@armlinux.org.uk> On Behalf Of Russell King
> > > > Sent: Sunday, September 22, 2019 6:27 PM
> > > > To: Robin Murphy <robin.murphy@arm.com>; dann frazier
> > > > <dann.frazier@canonical.com>; Will Deacon <will.deacon@arm.com>; Nicolin
> > > > Chen <nicoleotsuka@gmail.com>; Y.b. Lu <yangbo.lu@nxp.com>; Christoph
> > > > Hellwig <hch@lst.de>
> > > > Cc: Adrian Hunter <adrian.hunter@intel.com>; Ulf Hansson
> > > > <ulf.hansson@linaro.org>; linux-mmc@vger.kernel.org
> > > > Subject: [PATCH 2/3] mmc: sdhci-of-esdhc: set DMA snooping based on DMA
> > > > coherence
> > > > 
> > > > We must not unconditionally set the DMA snoop bit; if the DMA API is
> > > > assuming that the device is not DMA coherent, and the device snoops the CPU
> > > > caches, the device can see stale cache lines brought in by speculative prefetch.
> > > > 
> > > > This leads to the device seeing stale data, potentially resulting in corrupted
> > > > data transfers.  Commonly, this results in a descriptor fetch error such as:
> > > > 
> > > > mmc0: ADMA error
> > > > mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> > > > mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002202
> > > > mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
> > > > mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> > > > mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x00000038
> > > > mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
> > > > mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000040d8
> > > > mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> > > > mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
> > > > mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
> > > > mmc0: sdhci: Caps:      0x35fa0000 | Caps_1:   0x0000af00
> > > > mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> > > > mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x001d8a33
> > > > mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
> > > > mmc0: sdhci: Host ctl2: 0x00000000
> > > > mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x000000236d43820c
> > > > mmc0: sdhci: ============================================
> > > > mmc0: error -5 whilst initialising SD card
> > > > 
> > > > but can lead to other errors, and potentially direct the SDHCI controller to
> > > > read/write data to other memory locations (e.g. if a valid descriptor is visible
> > > > to the device in a stale cache line.)
> > > > 
> > > > Fix this by ensuring that the DMA snoop bit corresponds with the behaviour of
> > > > the DMA API.  Since the driver currently only supports DT, use
> > > > of_dma_is_coherent().  Note that device_get_dma_attr() can not be used as
> > > > that risks re-introducing this bug if/when the driver is converted to ACPI.
> > > 
> > > [Y.b. Lu] May I have your suggestion how to check this condition with ACPI?
> > > Although we didn’t support APCI on upstream for this driver now, as I know we had already had ACPI used internally and would be upstreamed.
> > > Thanks.
> > 
> > The short answer is, I don't believe there is an equivalent for ACPI.
> > However, it's a question for ACPI people, not me, as I have little
> > knowedge about ACPI.
> > 
> > From what I've looked at so far, for ACPI, the decision used for
> > dev->dma_coherent() / dev_is_dma_coherent() (which controls the DMA
> > API behaviour) is not available - this is a decision by the SMMU code.
> > 
> > See:
> > 	drivers/acpi/arm64/iort.c:arm_smmu_dma_configure()
> > 	drivers/acpi/scan.c:acpi_dma_configure()
> > 	arch/arm64/mm/dma-mapping.c:arch_setup_dma_ops()
> > 
> > The decision is made whether the SMMU is in coherent walk mode.
> > 
> > The proposed interface to use is device_get_dma_attr().  For OF,
> > this works as expected - it mirrors of_dma_is_coherent().  However,
> > for ACPI it does not - under ACPI, it uses acpi_get_dma_attr().
> > 
> > See:
> > 	drivers/acpi/scan.c:acpi_get_dma_attr()
> > 
> > This can return one of three states.  Let's concentrate on the
> > coherent/non-coherent states.  This depends on
> > adev->flags.coherent_dma.  This gets set by acpi_init_coherency(),
> > which looks up the _CCA property in the device handle.
> > 
> > It looks to me like it is entirely possible for ACPI to say that, for
> > example, the SMMU is coherent, which will cause dev->dma_coherent to
> > be set, but the device (and it's parents) not to have _CCA indicating
> > that the device is coherent.
> > 
> > It seems that the only way this could be guaranteed is if the ESDHC
> > was a child of the SMMU - but I don't know whether that is the case
> > or not.  If it isn't, using device_get_dma_attr() will result in a
> > coherency disagreement between the DMA API and the device, which
> > will lead to ADMA errors and potential data corruption.
> > 
> > There may be some subtle reason this can't happen, but from merely
> > looking at the code, I'd say using device_get_dma_attr() here would
> > be dangerous.
> 
> The best I can come up with is something like:
> 
> 	enum dev_dma_attr attr, expected;
> 
> 	attr = device_get_dma_attr(dev);
> 	expected = dev_is_dma_coherent(dev) ? DEV_DMA_COHERENT :
> 			DEV_DMA_NON_COHERENT;
> 
> 	WARN_ON(attr != expected);
> 
> Maybe even aborting the probe if they don't agree.  However, it seems
> dev_is_dma_coherent() is an IOMMU/DMA API implementation detail that
> only IOMMU drivers and arch code are supposed to use.
> 
> What I'm hearing at the moment from Jon Nettleton is that the NXP ACPI
> description does not include an IORT table and doesn't mention the
> SMMU.  If I'm reading the code correctly, that means the Linux DMA API
> will assume all devices are DMA non-coherent - and if we use
> device_get_dma_attr() for this, and the ACPI device description has
> _CCA=1, we'll end up with potentially data corrupting mismatched DMA
> coherency expectations.

Okay, so digging in to the edk2-platforms code at:

https://source.codeaurora.org/external/qoriq/qoriq-components/edk2-platforms/

I find this:

20 Scope(_SB)
21 {
22   Device(SDC0) {
23     Name(_HID, "NXP0003")
24     Name(_CID, "PNP0D40")
25     Name(_CCA, 0)
...
50   Device(SDC1) {
51     Name(_HID, "NXP0003")
52     Name(_CID, "PNP0D40")
53     Name(_CCA, 0)

So, ACPI describes these devices as DMA _non-coherent_.

The AcpiTables.inf file has:

35  /* Iort.asl
...

So, my understanding is that as there is no IORT, the DMA API will
consider ACPI devices to be non-coherent, and will do cache maintenance
and for dma_alloc_coherent(), will remap memory uncached.

So, it seems that given the current ACPI description, DMA snooping
must *not* be enabled for the eSDHC controller.

Checking the SATA situation:

20 Scope(_SB)
21 {
22   Device(SAT0) {
23     Name(_HID, "NXP0004")
24     Name(_CCA, 1)

So, ACPI is saying that the device _is_ DMA coherent, but given the
lack of IORT, the DMA API will treat this as DMA non-coherent.  I don't
have the knowledge to know what the SATA driver does, whether the SATA
hardware is coherent or not, or whether it contains a bit to control
it, but this basically looks wrong to me given what I understand so
far.

From what I can gather, _CCA must not be explicitly set to 1 or
allowed to default to 1 on ARM64 for any ACPI device without an IORT
table being present to tell the DMA API that the ACPI devices are
coherent.
Ulf Hansson Sept. 23, 2019, 9:54 p.m. UTC | #6
On Sun, 22 Sep 2019 at 12:27, Russell King <rmk+kernel@armlinux.org.uk> wrote:
>
> We must not unconditionally set the DMA snoop bit; if the DMA API is
> assuming that the device is not DMA coherent, and the device snoops the
> CPU caches, the device can see stale cache lines brought in by
> speculative prefetch.
>
> This leads to the device seeing stale data, potentially resulting in
> corrupted data transfers.  Commonly, this results in a descriptor fetch
> error such as:
>
> mmc0: ADMA error
> mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002202
> mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
> mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x00000038
> mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
> mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000040d8
> mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
> mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
> mmc0: sdhci: Caps:      0x35fa0000 | Caps_1:   0x0000af00
> mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x001d8a33
> mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
> mmc0: sdhci: Host ctl2: 0x00000000
> mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x000000236d43820c
> mmc0: sdhci: ============================================
> mmc0: error -5 whilst initialising SD card
>
> but can lead to other errors, and potentially direct the SDHCI
> controller to read/write data to other memory locations (e.g. if a valid
> descriptor is visible to the device in a stale cache line.)
>
> Fix this by ensuring that the DMA snoop bit corresponds with the
> behaviour of the DMA API.  Since the driver currently only supports DT,
> use of_dma_is_coherent().  Note that device_get_dma_attr() can not be
> used as that risks re-introducing this bug if/when the driver is
> converted to ACPI.
>
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Applied for fixes and taged for stable, thanks!

Kind regards
Uffe


> ---
>  drivers/mmc/host/sdhci-of-esdhc.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
> index 4dd43b1adf2c..74de5e8c45c8 100644
> --- a/drivers/mmc/host/sdhci-of-esdhc.c
> +++ b/drivers/mmc/host/sdhci-of-esdhc.c
> @@ -495,7 +495,12 @@ static int esdhc_of_enable_dma(struct sdhci_host *host)
>                 dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
>
>         value = sdhci_readl(host, ESDHC_DMA_SYSCTL);
> -       value |= ESDHC_DMA_SNOOP;
> +
> +       if (of_dma_is_coherent(dev->of_node))
> +               value |= ESDHC_DMA_SNOOP;
> +       else
> +               value &= ~ESDHC_DMA_SNOOP;
> +
>         sdhci_writel(host, value, ESDHC_DMA_SYSCTL);
>         return 0;
>  }
> --
> 2.7.4
>
Yangbo Lu Sept. 24, 2019, 2:37 a.m. UTC | #7
Hi Russell,

> -----Original Message-----
> From: Russell King - ARM Linux admin <linux@armlinux.org.uk>
> Sent: Monday, September 23, 2019 8:30 PM
> To: Y.b. Lu <yangbo.lu@nxp.com>
> Cc: Robin Murphy <robin.murphy@arm.com>; dann frazier
> <dann.frazier@canonical.com>; Will Deacon <will.deacon@arm.com>; Nicolin
> Chen <nicoleotsuka@gmail.com>; Christoph Hellwig <hch@lst.de>; Meenakshi
> Aggarwal <meenakshi.aggarwal@nxp.com>; Adrian Hunter
> <adrian.hunter@intel.com>; Ulf Hansson <ulf.hansson@linaro.org>;
> linux-mmc@vger.kernel.org
> Subject: Re: [PATCH 2/3] mmc: sdhci-of-esdhc: set DMA snooping based on
> DMA coherence
> 
> On Mon, Sep 23, 2019 at 12:10:16PM +0100, Russell King - ARM Linux admin
> wrote:
> > On Mon, Sep 23, 2019 at 09:41:34AM +0100, Russell King - ARM Linux
> admin wrote:
> > > On Mon, Sep 23, 2019 at 02:51:15AM +0000, Y.b. Lu wrote:
> > > > Hi Russell,
> > > >
> > > > > -----Original Message-----
> > > > > From: Russell King <rmk@armlinux.org.uk> On Behalf Of Russell
> > > > > King
> > > > > Sent: Sunday, September 22, 2019 6:27 PM
> > > > > To: Robin Murphy <robin.murphy@arm.com>; dann frazier
> > > > > <dann.frazier@canonical.com>; Will Deacon <will.deacon@arm.com>;
> > > > > Nicolin Chen <nicoleotsuka@gmail.com>; Y.b. Lu
> > > > > <yangbo.lu@nxp.com>; Christoph Hellwig <hch@lst.de>
> > > > > Cc: Adrian Hunter <adrian.hunter@intel.com>; Ulf Hansson
> > > > > <ulf.hansson@linaro.org>; linux-mmc@vger.kernel.org
> > > > > Subject: [PATCH 2/3] mmc: sdhci-of-esdhc: set DMA snooping based
> > > > > on DMA coherence
> > > > >
> > > > > We must not unconditionally set the DMA snoop bit; if the DMA
> > > > > API is assuming that the device is not DMA coherent, and the
> > > > > device snoops the CPU caches, the device can see stale cache lines
> brought in by speculative prefetch.
> > > > >
> > > > > This leads to the device seeing stale data, potentially
> > > > > resulting in corrupted data transfers.  Commonly, this results in a
> descriptor fetch error such as:
> > > > >
> > > > > mmc0: ADMA error
> > > > > mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> > > > > mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002202
> > > > > mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
> > > > > mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> > > > > mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x00000038
> > > > > mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
> > > > > mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000040d8
> > > > > mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> > > > > mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
> > > > > mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
> > > > > mmc0: sdhci: Caps:      0x35fa0000 | Caps_1:   0x0000af00
> > > > > mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> > > > > mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x001d8a33
> > > > > mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
> > > > > mmc0: sdhci: Host ctl2: 0x00000000
> > > > > mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr:
> > > > > 0x000000236d43820c
> > > > > mmc0: sdhci: ============================================
> > > > > mmc0: error -5 whilst initialising SD card
> > > > >
> > > > > but can lead to other errors, and potentially direct the SDHCI
> > > > > controller to read/write data to other memory locations (e.g. if
> > > > > a valid descriptor is visible to the device in a stale cache
> > > > > line.)
> > > > >
> > > > > Fix this by ensuring that the DMA snoop bit corresponds with the
> > > > > behaviour of the DMA API.  Since the driver currently only
> > > > > supports DT, use of_dma_is_coherent().  Note that
> > > > > device_get_dma_attr() can not be used as that risks re-introducing this
> bug if/when the driver is converted to ACPI.
> > > >
> > > > [Y.b. Lu] May I have your suggestion how to check this condition with
> ACPI?
> > > > Although we didn’t support APCI on upstream for this driver now, as I
> know we had already had ACPI used internally and would be upstreamed.
> > > > Thanks.
> > >
> > > The short answer is, I don't believe there is an equivalent for ACPI.
> > > However, it's a question for ACPI people, not me, as I have little
> > > knowedge about ACPI.
> > >
> > > From what I've looked at so far, for ACPI, the decision used for
> > > dev->dma_coherent() / dev_is_dma_coherent() (which controls the DMA
> > > API behaviour) is not available - this is a decision by the SMMU code.
> > >
> > > See:
> > > 	drivers/acpi/arm64/iort.c:arm_smmu_dma_configure()
> > > 	drivers/acpi/scan.c:acpi_dma_configure()
> > > 	arch/arm64/mm/dma-mapping.c:arch_setup_dma_ops()
> > >
> > > The decision is made whether the SMMU is in coherent walk mode.
> > >
> > > The proposed interface to use is device_get_dma_attr().  For OF,
> > > this works as expected - it mirrors of_dma_is_coherent().  However,
> > > for ACPI it does not - under ACPI, it uses acpi_get_dma_attr().
> > >
> > > See:
> > > 	drivers/acpi/scan.c:acpi_get_dma_attr()
> > >
> > > This can return one of three states.  Let's concentrate on the
> > > coherent/non-coherent states.  This depends on
> > > adev->flags.coherent_dma.  This gets set by acpi_init_coherency(),
> > > which looks up the _CCA property in the device handle.
> > >
> > > It looks to me like it is entirely possible for ACPI to say that,
> > > for example, the SMMU is coherent, which will cause
> > > dev->dma_coherent to be set, but the device (and it's parents) not
> > > to have _CCA indicating that the device is coherent.
> > >
> > > It seems that the only way this could be guaranteed is if the ESDHC
> > > was a child of the SMMU - but I don't know whether that is the case
> > > or not.  If it isn't, using device_get_dma_attr() will result in a
> > > coherency disagreement between the DMA API and the device, which
> > > will lead to ADMA errors and potential data corruption.
> > >
> > > There may be some subtle reason this can't happen, but from merely
> > > looking at the code, I'd say using device_get_dma_attr() here would
> > > be dangerous.
> >
> > The best I can come up with is something like:
> >
> > 	enum dev_dma_attr attr, expected;
> >
> > 	attr = device_get_dma_attr(dev);
> > 	expected = dev_is_dma_coherent(dev) ? DEV_DMA_COHERENT :
> > 			DEV_DMA_NON_COHERENT;
> >
> > 	WARN_ON(attr != expected);
> >
> > Maybe even aborting the probe if they don't agree.  However, it seems
> > dev_is_dma_coherent() is an IOMMU/DMA API implementation detail that
> > only IOMMU drivers and arch code are supposed to use.
> >
> > What I'm hearing at the moment from Jon Nettleton is that the NXP ACPI
> > description does not include an IORT table and doesn't mention the
> > SMMU.  If I'm reading the code correctly, that means the Linux DMA API
> > will assume all devices are DMA non-coherent - and if we use
> > device_get_dma_attr() for this, and the ACPI device description has
> > _CCA=1, we'll end up with potentially data corrupting mismatched DMA
> > coherency expectations.
> 
> Okay, so digging in to the edk2-platforms code at:
> 
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsource.
> codeaurora.org%2Fexternal%2Fqoriq%2Fqoriq-components%2Fedk2-platform
> s%2F&amp;data=02%7C01%7Cyangbo.lu%40nxp.com%7Cb080efc2fa564fb98
> e2108d74021b99e%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C
> 637048385907777184&amp;sdata=5MQWxAdNHvMwQS21QaUzKjMZGTUUl
> DP7GxUKN1mEpSw%3D&amp;reserved=0
> 
> I find this:
> 
> 20 Scope(_SB)
> 21 {
> 22   Device(SDC0) {
> 23     Name(_HID, "NXP0003")
> 24     Name(_CID, "PNP0D40")
> 25     Name(_CCA, 0)
> ...
> 50   Device(SDC1) {
> 51     Name(_HID, "NXP0003")
> 52     Name(_CID, "PNP0D40")
> 53     Name(_CCA, 0)
> 
> So, ACPI describes these devices as DMA _non-coherent_.
> 
> The AcpiTables.inf file has:
> 
> 35  /* Iort.asl
> ...
> 
> So, my understanding is that as there is no IORT, the DMA API will consider
> ACPI devices to be non-coherent, and will do cache maintenance and for
> dma_alloc_coherent(), will remap memory uncached.
> 
> So, it seems that given the current ACPI description, DMA snooping must
> *not* be enabled for the eSDHC controller.
> 
> Checking the SATA situation:
> 
> 20 Scope(_SB)
> 21 {
> 22   Device(SAT0) {
> 23     Name(_HID, "NXP0004")
> 24     Name(_CCA, 1)
> 
> So, ACPI is saying that the device _is_ DMA coherent, but given the lack of
> IORT, the DMA API will treat this as DMA non-coherent.  I don't have the
> knowledge to know what the SATA driver does, whether the SATA hardware is
> coherent or not, or whether it contains a bit to control it, but this basically
> looks wrong to me given what I understand so far.
> 
> From what I can gather, _CCA must not be explicitly set to 1 or allowed to
> default to 1 on ARM64 for any ACPI device without an IORT table being
> present to tell the DMA API that the ACPI devices are coherent.
> 

[Y.b. Lu] Thank you very much for your fix-up and the analysis on the ACPI. That's very helpful for us.
I know little about ACPI either but I had added Meenakshi who worked on the ACPI in case she wants to know it.

Thanks again :)

> --
> RMK's Patch system:
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ar
> mlinux.org.uk%2Fdeveloper%2Fpatches%2F&amp;data=02%7C01%7Cyangbo.l
> u%40nxp.com%7Cb080efc2fa564fb98e2108d74021b99e%7C686ea1d3bc2b4
> c6fa92cd99c5c301635%7C0%7C0%7C637048385907777184&amp;sdata=Dtk
> M9lpicCv%2BYXdQyX6uIWsLKZBItJEwwt0ZKdyYsZo%3D&amp;reserved=0
> FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps
> up According to speedtest.net: 11.9Mbps down 500kbps up
Robin Murphy Sept. 24, 2019, 5:56 p.m. UTC | #8
On 23/09/2019 12:10, Russell King - ARM Linux admin wrote:
> On Mon, Sep 23, 2019 at 09:41:34AM +0100, Russell King - ARM Linux admin wrote:
>> On Mon, Sep 23, 2019 at 02:51:15AM +0000, Y.b. Lu wrote:
>>> Hi Russell,
>>>
>>>> -----Original Message-----
>>>> From: Russell King <rmk@armlinux.org.uk> On Behalf Of Russell King
>>>> Sent: Sunday, September 22, 2019 6:27 PM
>>>> To: Robin Murphy <robin.murphy@arm.com>; dann frazier
>>>> <dann.frazier@canonical.com>; Will Deacon <will.deacon@arm.com>; Nicolin
>>>> Chen <nicoleotsuka@gmail.com>; Y.b. Lu <yangbo.lu@nxp.com>; Christoph
>>>> Hellwig <hch@lst.de>
>>>> Cc: Adrian Hunter <adrian.hunter@intel.com>; Ulf Hansson
>>>> <ulf.hansson@linaro.org>; linux-mmc@vger.kernel.org
>>>> Subject: [PATCH 2/3] mmc: sdhci-of-esdhc: set DMA snooping based on DMA
>>>> coherence
>>>>
>>>> We must not unconditionally set the DMA snoop bit; if the DMA API is
>>>> assuming that the device is not DMA coherent, and the device snoops the CPU
>>>> caches, the device can see stale cache lines brought in by speculative prefetch.
>>>>
>>>> This leads to the device seeing stale data, potentially resulting in corrupted
>>>> data transfers.  Commonly, this results in a descriptor fetch error such as:
>>>>
>>>> mmc0: ADMA error
>>>> mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
>>>> mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002202
>>>> mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
>>>> mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
>>>> mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x00000038
>>>> mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
>>>> mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000040d8
>>>> mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
>>>> mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
>>>> mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
>>>> mmc0: sdhci: Caps:      0x35fa0000 | Caps_1:   0x0000af00
>>>> mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
>>>> mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x001d8a33
>>>> mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
>>>> mmc0: sdhci: Host ctl2: 0x00000000
>>>> mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x000000236d43820c
>>>> mmc0: sdhci: ============================================
>>>> mmc0: error -5 whilst initialising SD card
>>>>
>>>> but can lead to other errors, and potentially direct the SDHCI controller to
>>>> read/write data to other memory locations (e.g. if a valid descriptor is visible
>>>> to the device in a stale cache line.)
>>>>
>>>> Fix this by ensuring that the DMA snoop bit corresponds with the behaviour of
>>>> the DMA API.  Since the driver currently only supports DT, use
>>>> of_dma_is_coherent().  Note that device_get_dma_attr() can not be used as
>>>> that risks re-introducing this bug if/when the driver is converted to ACPI.
>>>
>>> [Y.b. Lu] May I have your suggestion how to check this condition with ACPI?
>>> Although we didn’t support APCI on upstream for this driver now, as I know we had already had ACPI used internally and would be upstreamed.
>>> Thanks.
>>
>> The short answer is, I don't believe there is an equivalent for ACPI.
>> However, it's a question for ACPI people, not me, as I have little
>> knowedge about ACPI.
>>
>>  From what I've looked at so far, for ACPI, the decision used for
>> dev->dma_coherent() / dev_is_dma_coherent() (which controls the DMA
>> API behaviour) is not available - this is a decision by the SMMU code.
>>
>> See:
>> 	drivers/acpi/arm64/iort.c:arm_smmu_dma_configure()
>> 	drivers/acpi/scan.c:acpi_dma_configure()
>> 	arch/arm64/mm/dma-mapping.c:arch_setup_dma_ops()
>>
>> The decision is made whether the SMMU is in coherent walk mode.
>>
>> The proposed interface to use is device_get_dma_attr().  For OF,
>> this works as expected - it mirrors of_dma_is_coherent().  However,
>> for ACPI it does not - under ACPI, it uses acpi_get_dma_attr().

Which is more or less the direct equivalent of of_dma_is_coherent(), 
just with an extra state and rather different-looking inheritance logic.

>> See:
>> 	drivers/acpi/scan.c:acpi_get_dma_attr()
>>
>> This can return one of three states.  Let's concentrate on the
>> coherent/non-coherent states.

(note that the third state is moot anyway because it will actively 
prevent the driver from doing DMA API operations - see dma_dummy_ops)

>>  This depends on
>> adev->flags.coherent_dma.  This gets set by acpi_init_coherency(),
>> which looks up the _CCA property in the device handle.
>>
>> It looks to me like it is entirely possible for ACPI to say that, for
>> example, the SMMU is coherent, which will cause dev->dma_coherent to
>> be set, but the device (and it's parents) not to have _CCA indicating
>> that the device is coherent.

It is indeed entirely possible to have an inherently non-coherent device 
translated by a coherent SMMU - see the USB controller in your Juno, for 
example - however in terms of these firmware properties those 
coherencies are orthogonal; the latter only describes the SMMU's 
pagetable walk interface, not the path(s) through its translation 
interface(s) which aren't even necessarily all equal.

>> It seems that the only way this could be guaranteed is if the ESDHC
>> was a child of the SMMU - but I don't know whether that is the case
>> or not.  If it isn't, using device_get_dma_attr() will result in a
>> coherency disagreement between the DMA API and the device, which
>> will lead to ADMA errors and potential data corruption.
>>
>> There may be some subtle reason this can't happen, but from merely
>> looking at the code, I'd say using device_get_dma_attr() here would
>> be dangerous.

This can't ever happen because SMMUs aren't represented in the ACPI 
namespace at all. And even if they were, such a topology would still be 
as bogus as it would be in DT - physically, it's perfectly possible for 
a single "device" like a PCI host bridge to have different sets of 
requester IDs translated by different SMMUs, so treating the SMMU as a 
hierarchical "bus" just doesn't work.

> The best I can come up with is something like:
> 
> 	enum dev_dma_attr attr, expected;
> 
> 	attr = device_get_dma_attr(dev);
> 	expected = dev_is_dma_coherent(dev) ? DEV_DMA_COHERENT :
> 			DEV_DMA_NON_COHERENT;
> 
> 	WARN_ON(attr != expected);

This would be a tautology given where dev->dma_coherent comes from in 
the first place - see {platform,pci}_dma_configure().

> Maybe even aborting the probe if they don't agree.  However, it seems
> dev_is_dma_coherent() is an IOMMU/DMA API implementation detail that
> only IOMMU drivers and arch code are supposed to use.
> 
> What I'm hearing at the moment from Jon Nettleton is that the NXP ACPI
> description does not include an IORT table and doesn't mention the
> SMMU.  If I'm reading the code correctly, that means the Linux DMA API
> will assume all devices are DMA non-coherent - and if we use
> device_get_dma_attr() for this, and the ACPI device description has
> _CCA=1, we'll end up with potentially data corrupting mismatched DMA
> coherency expectations.

In terms of DMA API coherency, unless you're working on the SMMU drivers 
you can consider IORT entirely irrelevant and just treat _CCA for 
namespace devices as the first and last word. Yes, technically IORT can 
describe weird stuff with the CPM attribute, but I have every intention 
to keep ignoring that in Linux.

Robin.
Russell King (Oracle) Sept. 24, 2019, 6:15 p.m. UTC | #9
On Tue, Sep 24, 2019 at 06:56:53PM +0100, Robin Murphy wrote:
> On 23/09/2019 12:10, Russell King - ARM Linux admin wrote:
> > On Mon, Sep 23, 2019 at 09:41:34AM +0100, Russell King - ARM Linux admin wrote:
> > > On Mon, Sep 23, 2019 at 02:51:15AM +0000, Y.b. Lu wrote:
> > > > Hi Russell,
> > > > 
> > > > > -----Original Message-----
> > > > > From: Russell King <rmk@armlinux.org.uk> On Behalf Of Russell King
> > > > > Sent: Sunday, September 22, 2019 6:27 PM
> > > > > To: Robin Murphy <robin.murphy@arm.com>; dann frazier
> > > > > <dann.frazier@canonical.com>; Will Deacon <will.deacon@arm.com>; Nicolin
> > > > > Chen <nicoleotsuka@gmail.com>; Y.b. Lu <yangbo.lu@nxp.com>; Christoph
> > > > > Hellwig <hch@lst.de>
> > > > > Cc: Adrian Hunter <adrian.hunter@intel.com>; Ulf Hansson
> > > > > <ulf.hansson@linaro.org>; linux-mmc@vger.kernel.org
> > > > > Subject: [PATCH 2/3] mmc: sdhci-of-esdhc: set DMA snooping based on DMA
> > > > > coherence
> > > > > 
> > > > > We must not unconditionally set the DMA snoop bit; if the DMA API is
> > > > > assuming that the device is not DMA coherent, and the device snoops the CPU
> > > > > caches, the device can see stale cache lines brought in by speculative prefetch.
> > > > > 
> > > > > This leads to the device seeing stale data, potentially resulting in corrupted
> > > > > data transfers.  Commonly, this results in a descriptor fetch error such as:
> > > > > 
> > > > > mmc0: ADMA error
> > > > > mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> > > > > mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002202
> > > > > mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
> > > > > mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> > > > > mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x00000038
> > > > > mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
> > > > > mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000040d8
> > > > > mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> > > > > mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
> > > > > mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
> > > > > mmc0: sdhci: Caps:      0x35fa0000 | Caps_1:   0x0000af00
> > > > > mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> > > > > mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x001d8a33
> > > > > mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
> > > > > mmc0: sdhci: Host ctl2: 0x00000000
> > > > > mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x000000236d43820c
> > > > > mmc0: sdhci: ============================================
> > > > > mmc0: error -5 whilst initialising SD card
> > > > > 
> > > > > but can lead to other errors, and potentially direct the SDHCI controller to
> > > > > read/write data to other memory locations (e.g. if a valid descriptor is visible
> > > > > to the device in a stale cache line.)
> > > > > 
> > > > > Fix this by ensuring that the DMA snoop bit corresponds with the behaviour of
> > > > > the DMA API.  Since the driver currently only supports DT, use
> > > > > of_dma_is_coherent().  Note that device_get_dma_attr() can not be used as
> > > > > that risks re-introducing this bug if/when the driver is converted to ACPI.
> > > > 
> > > > [Y.b. Lu] May I have your suggestion how to check this condition with ACPI?
> > > > Although we didn’t support APCI on upstream for this driver now, as I know we had already had ACPI used internally and would be upstreamed.
> > > > Thanks.
> > > 
> > > The short answer is, I don't believe there is an equivalent for ACPI.
> > > However, it's a question for ACPI people, not me, as I have little
> > > knowedge about ACPI.
> > > 
> > >  From what I've looked at so far, for ACPI, the decision used for
> > > dev->dma_coherent() / dev_is_dma_coherent() (which controls the DMA
> > > API behaviour) is not available - this is a decision by the SMMU code.
> > > 
> > > See:
> > > 	drivers/acpi/arm64/iort.c:arm_smmu_dma_configure()
> > > 	drivers/acpi/scan.c:acpi_dma_configure()
> > > 	arch/arm64/mm/dma-mapping.c:arch_setup_dma_ops()
> > > 
> > > The decision is made whether the SMMU is in coherent walk mode.
> > > 
> > > The proposed interface to use is device_get_dma_attr().  For OF,
> > > this works as expected - it mirrors of_dma_is_coherent().  However,
> > > for ACPI it does not - under ACPI, it uses acpi_get_dma_attr().
> 
> Which is more or less the direct equivalent of of_dma_is_coherent(), just
> with an extra state and rather different-looking inheritance logic.

Having read the code, I remain to be convinced of that equivalence.
Are you sure it's not an expectation on your part, rather than a
true equivalence (see below.)

> > > See:
> > > 	drivers/acpi/scan.c:acpi_get_dma_attr()
> > > 
> > > This can return one of three states.  Let's concentrate on the
> > > coherent/non-coherent states.
> 
> (note that the third state is moot anyway because it will actively prevent
> the driver from doing DMA API operations - see dma_dummy_ops)
> 
> > >  This depends on
> > > adev->flags.coherent_dma.  This gets set by acpi_init_coherency(),
> > > which looks up the _CCA property in the device handle.
> > > 
> > > It looks to me like it is entirely possible for ACPI to say that, for
> > > example, the SMMU is coherent, which will cause dev->dma_coherent to
> > > be set, but the device (and it's parents) not to have _CCA indicating
> > > that the device is coherent.
> 
> It is indeed entirely possible to have an inherently non-coherent device
> translated by a coherent SMMU - see the USB controller in your Juno, for
> example - however in terms of these firmware properties those coherencies
> are orthogonal; the latter only describes the SMMU's pagetable walk
> interface, not the path(s) through its translation interface(s) which aren't
> even necessarily all equal.
> 
> > > It seems that the only way this could be guaranteed is if the ESDHC
> > > was a child of the SMMU - but I don't know whether that is the case
> > > or not.  If it isn't, using device_get_dma_attr() will result in a
> > > coherency disagreement between the DMA API and the device, which
> > > will lead to ADMA errors and potential data corruption.
> > > 
> > > There may be some subtle reason this can't happen, but from merely
> > > looking at the code, I'd say using device_get_dma_attr() here would
> > > be dangerous.
> 
> This can't ever happen because SMMUs aren't represented in the ACPI
> namespace at all. And even if they were, such a topology would still be as
> bogus as it would be in DT - physically, it's perfectly possible for a
> single "device" like a PCI host bridge to have different sets of requester
> IDs translated by different SMMUs, so treating the SMMU as a hierarchical
> "bus" just doesn't work.

My comment reflects my investigations into all possible routes to find
an answer to the issue at hand.  I can't find in the code any
correlation between the DMA API's idea of coherency and the _CCA
attribute for ACPI devices.  As far as I can see, it's entirely
possible for the two to disagree - to the extent that I believe that
the DMA API will default to assuming non-coherence whereas the ACPI
code defaults to assuming coherence - and there is no path in the ACPI
code that sets dev->dma_coherent for ACPI devices.

> 
> > The best I can come up with is something like:
> > 
> > 	enum dev_dma_attr attr, expected;
> > 
> > 	attr = device_get_dma_attr(dev);
> > 	expected = dev_is_dma_coherent(dev) ? DEV_DMA_COHERENT :
> > 			DEV_DMA_NON_COHERENT;
> > 
> > 	WARN_ON(attr != expected);
> 
> This would be a tautology given where dev->dma_coherent comes from in the
> first place - see {platform,pci}_dma_configure().

For platform and PCI devices, yes, I agree. But for ACPI devices, it
seems dev->dma_coherent is left uninitialised - or I could not locate
where it gets initialised.  My conclusion is that for ACPI devices,
setting _CCA=1 or allowing it to default to 1 results in a mismatch
between what the DMA API believes and what device_get_dma_attr() would
return.  Given what I understand, the above seems to be far from a
tautology.

If you feel differently, please reference the code so I can get a
better understanding of the ACPI code - but my position that there is
a mismatch between the DMA API and devices is based on grepping the
code and following the various paths.

> 
> > Maybe even aborting the probe if they don't agree.  However, it seems
> > dev_is_dma_coherent() is an IOMMU/DMA API implementation detail that
> > only IOMMU drivers and arch code are supposed to use.
> > 
> > What I'm hearing at the moment from Jon Nettleton is that the NXP ACPI
> > description does not include an IORT table and doesn't mention the
> > SMMU.  If I'm reading the code correctly, that means the Linux DMA API
> > will assume all devices are DMA non-coherent - and if we use
> > device_get_dma_attr() for this, and the ACPI device description has
> > _CCA=1, we'll end up with potentially data corrupting mismatched DMA
> > coherency expectations.
> 
> In terms of DMA API coherency, unless you're working on the SMMU drivers you
> can consider IORT entirely irrelevant and just treat _CCA for namespace
> devices as the first and last word. Yes, technically IORT can describe weird
> stuff with the CPM attribute, but I have every intention to keep ignoring
> that in Linux.
> 
> Robin.
>
Yangbo Lu Jan. 20, 2020, 10:09 a.m. UTC | #10
Hi Russell,

Recently I got eSDHC ADMA issue on PowerPC T2080 platform. After checking, the issue is related to this patch.
This patch was to make eSDHC DMA SNOOP bit set per dma-coherent. That resolved issue on LX2160A ARM64 platform.
However on T2080, we are facing similar issue again. It didn't have dma-coherent in dts.
Adding dma-coherent in dts, or reverting the patch could resolve the problem.

Would you please help to have a look at it too? Thanks:)

mmc0: ADMA error: 0x02000000
mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002002
mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
NET: Registered protocol family 10
mmc0: sdhci: Present:   0x01fd020a | Host ctl: 0x00000038
mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000020b8
mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
mmc0: sdhci: Int enab:  0x037f100f | Sig enab: 0x037f100b
mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002002
mmc0: sdhci: Caps:      0x34fa0000 | Caps_1:   0x0000af00
mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x003b3733
mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
mmc0: sdhci: Host ctl2: 0x00000000
mmc0: sdhci: ADMA Err:  0x0000000d | ADMA Ptr: 0x00000000f35ad20c
mmc0: sdhci: ============================================
mmc0: sdhci: f35ad200: DMA 0x00000000f3587710, LEN 0x0008, Attr=0x23
mmc0: error -5 whilst initialising SD card

Best regards,
Yangbo Lu


> -----Original Message-----
> From: Russell King <rmk@armlinux.org.uk> On Behalf Of Russell King
> Sent: Sunday, September 22, 2019 6:27 PM
> To: Robin Murphy <robin.murphy@arm.com>; dann frazier
> <dann.frazier@canonical.com>; Will Deacon <will.deacon@arm.com>; Nicolin
> Chen <nicoleotsuka@gmail.com>; Y.b. Lu <yangbo.lu@nxp.com>; Christoph
> Hellwig <hch@lst.de>
> Cc: Adrian Hunter <adrian.hunter@intel.com>; Ulf Hansson
> <ulf.hansson@linaro.org>; linux-mmc@vger.kernel.org
> Subject: [PATCH 2/3] mmc: sdhci-of-esdhc: set DMA snooping based on DMA
> coherence
> 
> We must not unconditionally set the DMA snoop bit; if the DMA API is
> assuming that the device is not DMA coherent, and the device snoops the
> CPU caches, the device can see stale cache lines brought in by
> speculative prefetch.
> 
> This leads to the device seeing stale data, potentially resulting in
> corrupted data transfers.  Commonly, this results in a descriptor fetch
> error such as:
> 
> mmc0: ADMA error
> mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002202
> mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
> mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x00000038
> mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
> mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000040d8
> mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
> mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
> mmc0: sdhci: Caps:      0x35fa0000 | Caps_1:   0x0000af00
> mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x001d8a33
> mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
> mmc0: sdhci: Host ctl2: 0x00000000
> mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x000000236d43820c
> mmc0: sdhci: ============================================
> mmc0: error -5 whilst initialising SD card
> 
> but can lead to other errors, and potentially direct the SDHCI
> controller to read/write data to other memory locations (e.g. if a valid
> descriptor is visible to the device in a stale cache line.)
> 
> Fix this by ensuring that the DMA snoop bit corresponds with the
> behaviour of the DMA API.  Since the driver currently only supports DT,
> use of_dma_is_coherent().  Note that device_get_dma_attr() can not be
> used as that risks re-introducing this bug if/when the driver is
> converted to ACPI.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> ---
>  drivers/mmc/host/sdhci-of-esdhc.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-of-esdhc.c
> b/drivers/mmc/host/sdhci-of-esdhc.c
> index 4dd43b1adf2c..74de5e8c45c8 100644
> --- a/drivers/mmc/host/sdhci-of-esdhc.c
> +++ b/drivers/mmc/host/sdhci-of-esdhc.c
> @@ -495,7 +495,12 @@ static int esdhc_of_enable_dma(struct sdhci_host
> *host)
>  		dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
> 
>  	value = sdhci_readl(host, ESDHC_DMA_SYSCTL);
> -	value |= ESDHC_DMA_SNOOP;
> +
> +	if (of_dma_is_coherent(dev->of_node))
> +		value |= ESDHC_DMA_SNOOP;
> +	else
> +		value &= ~ESDHC_DMA_SNOOP;
> +
>  	sdhci_writel(host, value, ESDHC_DMA_SYSCTL);
>  	return 0;
>  }
> --
> 2.7.4
Russell King (Oracle) Jan. 20, 2020, 10:20 a.m. UTC | #11
On Mon, Jan 20, 2020 at 10:09:00AM +0000, Y.b. Lu wrote:
> Hi Russell,
> 
> Recently I got eSDHC ADMA issue on PowerPC T2080 platform. After checking, the issue is related to this patch.
> This patch was to make eSDHC DMA SNOOP bit set per dma-coherent. That resolved issue on LX2160A ARM64 platform.
> However on T2080, we are facing similar issue again. It didn't have dma-coherent in dts.
> Adding dma-coherent in dts, or reverting the patch could resolve the problem.
> 
> Would you please help to have a look at it too? Thanks:)

All aspects of this was discussed extensively by many parties, and as
far as I remember, no conclusion was reached - the discussion became
rather unproductive, so I walked away from it.

I've more or less washed my hands of this in disgust that no way
forward can be found, not even reverting the patch (and then I'll
just carry the patch locally, so at least my machines work - shame
for other ARM64 folk trying to use the LX2160A.)

> 
> mmc0: ADMA error: 0x02000000
> mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002002
> mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
> mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> NET: Registered protocol family 10
> mmc0: sdhci: Present:   0x01fd020a | Host ctl: 0x00000038
> mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
> mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000020b8
> mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> mmc0: sdhci: Int enab:  0x037f100f | Sig enab: 0x037f100b
> mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002002
> mmc0: sdhci: Caps:      0x34fa0000 | Caps_1:   0x0000af00
> mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x003b3733
> mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
> mmc0: sdhci: Host ctl2: 0x00000000
> mmc0: sdhci: ADMA Err:  0x0000000d | ADMA Ptr: 0x00000000f35ad20c
> mmc0: sdhci: ============================================
> mmc0: sdhci: f35ad200: DMA 0x00000000f3587710, LEN 0x0008, Attr=0x23
> mmc0: error -5 whilst initialising SD card
> 
> Best regards,
> Yangbo Lu
> 
> 
> > -----Original Message-----
> > From: Russell King <rmk@armlinux.org.uk> On Behalf Of Russell King
> > Sent: Sunday, September 22, 2019 6:27 PM
> > To: Robin Murphy <robin.murphy@arm.com>; dann frazier
> > <dann.frazier@canonical.com>; Will Deacon <will.deacon@arm.com>; Nicolin
> > Chen <nicoleotsuka@gmail.com>; Y.b. Lu <yangbo.lu@nxp.com>; Christoph
> > Hellwig <hch@lst.de>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>; Ulf Hansson
> > <ulf.hansson@linaro.org>; linux-mmc@vger.kernel.org
> > Subject: [PATCH 2/3] mmc: sdhci-of-esdhc: set DMA snooping based on DMA
> > coherence
> > 
> > We must not unconditionally set the DMA snoop bit; if the DMA API is
> > assuming that the device is not DMA coherent, and the device snoops the
> > CPU caches, the device can see stale cache lines brought in by
> > speculative prefetch.
> > 
> > This leads to the device seeing stale data, potentially resulting in
> > corrupted data transfers.  Commonly, this results in a descriptor fetch
> > error such as:
> > 
> > mmc0: ADMA error
> > mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> > mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00002202
> > mmc0: sdhci: Blk size:  0x00000008 | Blk cnt:  0x00000001
> > mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x00000013
> > mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x00000038
> > mmc0: sdhci: Power:     0x00000003 | Blk gap:  0x00000000
> > mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x000040d8
> > mmc0: sdhci: Timeout:   0x00000003 | Int stat: 0x00000001
> > mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
> > mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
> > mmc0: sdhci: Caps:      0x35fa0000 | Caps_1:   0x0000af00
> > mmc0: sdhci: Cmd:       0x0000333a | Max curr: 0x00000000
> > mmc0: sdhci: Resp[0]:   0x00000920 | Resp[1]:  0x001d8a33
> > mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
> > mmc0: sdhci: Host ctl2: 0x00000000
> > mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x000000236d43820c
> > mmc0: sdhci: ============================================
> > mmc0: error -5 whilst initialising SD card
> > 
> > but can lead to other errors, and potentially direct the SDHCI
> > controller to read/write data to other memory locations (e.g. if a valid
> > descriptor is visible to the device in a stale cache line.)
> > 
> > Fix this by ensuring that the DMA snoop bit corresponds with the
> > behaviour of the DMA API.  Since the driver currently only supports DT,
> > use of_dma_is_coherent().  Note that device_get_dma_attr() can not be
> > used as that risks re-introducing this bug if/when the driver is
> > converted to ACPI.
> > 
> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > ---
> >  drivers/mmc/host/sdhci-of-esdhc.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/mmc/host/sdhci-of-esdhc.c
> > b/drivers/mmc/host/sdhci-of-esdhc.c
> > index 4dd43b1adf2c..74de5e8c45c8 100644
> > --- a/drivers/mmc/host/sdhci-of-esdhc.c
> > +++ b/drivers/mmc/host/sdhci-of-esdhc.c
> > @@ -495,7 +495,12 @@ static int esdhc_of_enable_dma(struct sdhci_host
> > *host)
> >  		dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
> > 
> >  	value = sdhci_readl(host, ESDHC_DMA_SYSCTL);
> > -	value |= ESDHC_DMA_SNOOP;
> > +
> > +	if (of_dma_is_coherent(dev->of_node))
> > +		value |= ESDHC_DMA_SNOOP;
> > +	else
> > +		value &= ~ESDHC_DMA_SNOOP;
> > +
> >  	sdhci_writel(host, value, ESDHC_DMA_SYSCTL);
> >  	return 0;
> >  }
> > --
> > 2.7.4
>
Robin Murphy Jan. 20, 2020, 1:34 p.m. UTC | #12
On 20/01/2020 10:09 am, Y.b. Lu wrote:
> Hi Russell,
> 
> Recently I got eSDHC ADMA issue on PowerPC T2080 platform. After checking, the issue is related to this patch.
> This patch was to make eSDHC DMA SNOOP bit set per dma-coherent. That resolved issue on LX2160A ARM64 platform.
> However on T2080, we are facing similar issue again. It didn't have dma-coherent in dts.
> Adding dma-coherent in dts, or reverting the patch could resolve the problem.

Arguably updating the DTS would be the most accurate option, since it 
would be describing the hardware more correctly, however if there are 
reasons for that not being sufficient (e.g. DTBs baked into firmware, or 
worries of confusing some other DT consumer) then something like the 
below seems reasonable (albeit a little crude) IMO.

Robin.

----->8-----
 From fafad319893b4168fcccc5445543caf876a0be2d Mon Sep 17 00:00:00 2001
Message-Id: 
<fafad319893b4168fcccc5445543caf876a0be2d.1579526755.git.robin.murphy@arm.com>
From: Robin Murphy <robin.murphy@arm.com>
Date: Mon, 20 Jan 2020 13:11:59 +0000
Subject: [PATCH] mmc: sdhci-of-esdhc: Restore coherency for PPC platforms

Historically, not all PPC platforms have supported per-device coherency,
and some may rely on platform-level assumptions rather than explicitly
specifying the "dma-coherent" propert in their DT. Although the eSDHC
driver needs to tie in to per-device coherency to work correctly on
arm/arm64 platforms, this has apparently caused problems for PPC, so
restore the previous behaviour there with a special case.

Fixes: 121bd08b029e ("mmc: sdhci-of-esdhc: set DMA snooping based on DMA 
coherence")
Reported-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
  drivers/mmc/host/sdhci-of-esdhc.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-of-esdhc.c 
b/drivers/mmc/host/sdhci-of-esdhc.c
index 500f70a6ee42..a2599368b2bd 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -527,7 +527,8 @@ static int esdhc_of_enable_dma(struct sdhci_host *host)

  	value = sdhci_readl(host, ESDHC_DMA_SYSCTL);

-	if (of_dma_is_coherent(dev->of_node))
+	/* Historically, PPC has always assumed coherency here */
+	if (IS_ENABLED(CONFIG_PPC) || of_dma_is_coherent(dev->of_node))
  		value |= ESDHC_DMA_SNOOP;
  	else
  		value &= ~ESDHC_DMA_SNOOP;
Yangbo Lu Feb. 5, 2020, 5:56 a.m. UTC | #13
Hi Robin,

Thanks lot for your suggestion. I prefer your fix-up patch to treat PowerPC as a special case.

Hi Adrian and Uffe,

Do you think it is ok? Should I help to re-send that patch to mmc mailing list for reviewing.
Thanks.

Best regards,
Yangbo Lu

> -----Original Message-----
> From: Robin Murphy <robin.murphy@arm.com>
> Sent: Monday, January 20, 2020 9:34 PM
> To: Y.b. Lu <yangbo.lu@nxp.com>; Russell King <rmk+kernel@armlinux.org.uk>
> Cc: Adrian Hunter <adrian.hunter@intel.com>; Ulf Hansson
> <ulf.hansson@linaro.org>; linux-mmc@vger.kernel.org; dann frazier
> <dann.frazier@canonical.com>; Will Deacon <will.deacon@arm.com>; Nicolin
> Chen <nicoleotsuka@gmail.com>; Christoph Hellwig <hch@lst.de>
> Subject: Re: [PATCH 2/3] mmc: sdhci-of-esdhc: set DMA snooping based on
> DMA coherence
> 
> On 20/01/2020 10:09 am, Y.b. Lu wrote:
> > Hi Russell,
> >
> > Recently I got eSDHC ADMA issue on PowerPC T2080 platform. After
> checking, the issue is related to this patch.
> > This patch was to make eSDHC DMA SNOOP bit set per dma-coherent. That
> resolved issue on LX2160A ARM64 platform.
> > However on T2080, we are facing similar issue again. It didn't have
> dma-coherent in dts.
> > Adding dma-coherent in dts, or reverting the patch could resolve the
> problem.
> 
> Arguably updating the DTS would be the most accurate option, since it
> would be describing the hardware more correctly, however if there are
> reasons for that not being sufficient (e.g. DTBs baked into firmware, or
> worries of confusing some other DT consumer) then something like the
> below seems reasonable (albeit a little crude) IMO.
> 
> Robin.
> 
> ----->8-----
>  From fafad319893b4168fcccc5445543caf876a0be2d Mon Sep 17 00:00:00
> 2001
> Message-Id:
> <fafad319893b4168fcccc5445543caf876a0be2d.1579526755.git.robin.murph
> y@arm.com>
> From: Robin Murphy <robin.murphy@arm.com>
> Date: Mon, 20 Jan 2020 13:11:59 +0000
> Subject: [PATCH] mmc: sdhci-of-esdhc: Restore coherency for PPC platforms
> 
> Historically, not all PPC platforms have supported per-device coherency,
> and some may rely on platform-level assumptions rather than explicitly
> specifying the "dma-coherent" propert in their DT. Although the eSDHC
> driver needs to tie in to per-device coherency to work correctly on
> arm/arm64 platforms, this has apparently caused problems for PPC, so
> restore the previous behaviour there with a special case.
> 
> Fixes: 121bd08b029e ("mmc: sdhci-of-esdhc: set DMA snooping based on
> DMA
> coherence")
> Reported-by: Yangbo Lu <yangbo.lu@nxp.com>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>   drivers/mmc/host/sdhci-of-esdhc.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-of-esdhc.c
> b/drivers/mmc/host/sdhci-of-esdhc.c
> index 500f70a6ee42..a2599368b2bd 100644
> --- a/drivers/mmc/host/sdhci-of-esdhc.c
> +++ b/drivers/mmc/host/sdhci-of-esdhc.c
> @@ -527,7 +527,8 @@ static int esdhc_of_enable_dma(struct sdhci_host
> *host)
> 
>   	value = sdhci_readl(host, ESDHC_DMA_SYSCTL);
> 
> -	if (of_dma_is_coherent(dev->of_node))
> +	/* Historically, PPC has always assumed coherency here */
> +	if (IS_ENABLED(CONFIG_PPC) || of_dma_is_coherent(dev->of_node))
>   		value |= ESDHC_DMA_SNOOP;
>   	else
>   		value &= ~ESDHC_DMA_SNOOP;
> --
> 2.23.0.dirty
Ulf Hansson Feb. 5, 2020, 7:56 a.m. UTC | #14
On Wed, 5 Feb 2020 at 06:56, Y.b. Lu <yangbo.lu@nxp.com> wrote:
>
> Hi Robin,
>
> Thanks lot for your suggestion. I prefer your fix-up patch to treat PowerPC as a special case.
>
> Hi Adrian and Uffe,
>
> Do you think it is ok? Should I help to re-send that patch to mmc mailing list for reviewing.
> Thanks.

commit dabf6b36b83a18d57e3d4b9d50544ed040d86255
Author: Michael Ellerman <mpe@ellerman.id.au>
Date:   Sun Jan 26 22:52:47 2020 +1100
of: Add OF_DMA_DEFAULT_COHERENT & select it on powerpc

This one is already in Linus' tree, I suppose that is sufficient.

[...]

Kind regards
Uffe
diff mbox series

Patch

diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index 4dd43b1adf2c..74de5e8c45c8 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -495,7 +495,12 @@  static int esdhc_of_enable_dma(struct sdhci_host *host)
 		dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
 
 	value = sdhci_readl(host, ESDHC_DMA_SYSCTL);
-	value |= ESDHC_DMA_SNOOP;
+
+	if (of_dma_is_coherent(dev->of_node))
+		value |= ESDHC_DMA_SNOOP;
+	else
+		value &= ~ESDHC_DMA_SNOOP;
+
 	sdhci_writel(host, value, ESDHC_DMA_SYSCTL);
 	return 0;
 }