diff mbox series

[v3,3/3] mmc: renesas_sdhi: do hard reset if possible

Message ID 20210317091622.31890-4-wsa+renesas@sang-engineering.com (mailing list archive)
State Accepted
Delegated to: Geert Uytterhoeven
Headers show
Series mmc: renesas_sdhi: reset via reset controller | expand

Commit Message

Wolfram Sang March 17, 2021, 9:16 a.m. UTC
All recent SDHI instances can be reset via the reset controller. If one
is found, use it instead of the open coded reset. This is to get a
future-proof sane reset state.

Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---
 drivers/mmc/host/Kconfig             |  1 +
 drivers/mmc/host/renesas_sdhi.h      |  2 ++
 drivers/mmc/host/renesas_sdhi_core.c | 17 ++++++++++++++++-
 3 files changed, 19 insertions(+), 1 deletion(-)

Comments

Geert Uytterhoeven April 28, 2021, 2:36 p.m. UTC | #1
Hi Wolfram,

On Wed, Mar 17, 2021 at 10:17 AM Wolfram Sang
<wsa+renesas@sang-engineering.com> wrote:
> All recent SDHI instances can be reset via the reset controller. If one
> is found, use it instead of the open coded reset. This is to get a
> future-proof sane reset state.
>
> Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

Thanks for your patch, which is now commit b4d86f37eacb7246 ("mmc:
renesas_sdhi: do hard reset if possible") in mmc/next.

This breaks SDHI on koelsch (R-Car M2-W).
In v5.10, v5.11, and v512, it works fine:

    sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
    mmc0: new ultra high speed SDR104 SDHC card at address aaaa
    mmcblk0: mmc0:aaaa SM32G 29.7 GiB
      mmcblk0: p1

Note that I do see one timeout during identification.

After b4d86f37eacb7246 (bisected), it fails:

    sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
    sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
    [...]
    sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
    mmc0: tuning execution failed: -5
    mmc0: error -5 whilst initialising SD card
    sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
    sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
    [...]

Given the single timeout I see with older kernels, the issue may be that the
harder reset causes that timeout to repeat ad infinitum?

With renesas-drivers-2021-04-27-v5.12, I saw various different timeouts:

    sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
    sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD0)
    sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
    sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD52)
    sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
    sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD8)

(ee100000 is the SD104 slot, ee140000 is the SDR50 slot).

The card is a brand new SanDisk Extreme 32GB A1 microSD card in the
microSD adapter that came with the card (Conrad 1553726).

On R-Car H3 ES2.0 (Salvator-XS), the card works fine, without any timeouts:

    mmc1: new ultra high speed SDR104 SDHC card at address aaaa
    mmcblk1: mmc1:aaaa SM32G 29.7 GiB
     mmcblk1: p1

Gr{oetje,eeting}s,

                        Geert
Wolfram Sang June 18, 2021, 9:36 a.m. UTC | #2
Hi Geert,

> In v5.10, v5.11, and v512, it works fine:
> 
>     sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
>     mmc0: new ultra high speed SDR104 SDHC card at address aaaa
>     mmcblk0: mmc0:aaaa SM32G 29.7 GiB
>       mmcblk0: p1
> 
> Note that I do see one timeout during identification.
> 
> After b4d86f37eacb7246 (bisected), it fails:
> 
>     sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
>     sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
>     [...]
>     sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
>     mmc0: tuning execution failed: -5
>     mmc0: error -5 whilst initialising SD card
>     sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
>     sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
>     [...]
> 
> Given the single timeout I see with older kernels, the issue may be that the
> harder reset causes that timeout to repeat ad infinitum?

I can confirm this. I also found a SanDisk card which shows the same
issue on my Lager board. However, I wouldn't say this patch breaks
things in a way that a revert is a good solution.

The card does not really work "fine". During probe we get one timeout,
and when trying to read from the card, more follow. Already before this
patch. There seems to be a state where an initial command fails and only
the retry suceeds. The hard reset in deed seems to cause an endless loop
here. However, the proper fix is to find out why this first command
fails, especially only with some cards. My Samsung one works 100% fine.

And if we fix this, then the hard reset is still good for Gen2 as well.

Makes sense?

All the best,

   Wolfram
Geert Uytterhoeven June 18, 2021, 9:42 a.m. UTC | #3
Hi Wolfram,

On Fri, Jun 18, 2021 at 11:36 AM Wolfram Sang
<wsa+renesas@sang-engineering.com> wrote:
> > In v5.10, v5.11, and v512, it works fine:
> >
> >     sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
> >     mmc0: new ultra high speed SDR104 SDHC card at address aaaa
> >     mmcblk0: mmc0:aaaa SM32G 29.7 GiB
> >       mmcblk0: p1
> >
> > Note that I do see one timeout during identification.
> >
> > After b4d86f37eacb7246 (bisected), it fails:
> >
> >     sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
> >     sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
> >     [...]
> >     sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
> >     mmc0: tuning execution failed: -5
> >     mmc0: error -5 whilst initialising SD card
> >     sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
> >     sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
> >     [...]
> >
> > Given the single timeout I see with older kernels, the issue may be that the
> > harder reset causes that timeout to repeat ad infinitum?
>
> I can confirm this. I also found a SanDisk card which shows the same
> issue on my Lager board. However, I wouldn't say this patch breaks
> things in a way that a revert is a good solution.
>
> The card does not really work "fine". During probe we get one timeout,
> and when trying to read from the card, more follow. Already before this
> patch. There seems to be a state where an initial command fails and only
> the retry suceeds. The hard reset in deed seems to cause an endless loop
> here. However, the proper fix is to find out why this first command
> fails, especially only with some cards. My Samsung one works 100% fine.

BTW, does it work (really) fine on R-Car Gen3? Mine does.

> And if we fix this, then the hard reset is still good for Gen2 as well.
>
> Makes sense?

True.

Gr{oetje,eeting}s,

                        Geert
Wolfram Sang June 18, 2021, 9:57 a.m. UTC | #4
> BTW, does it work (really) fine on R-Car Gen3? Mine does.

Yes, mine does, too. So, I first wondered if the difference is an older
SCC variant. But when I let the timeouts run for a while, I get a DMA
problem. Which could also be the difference between Gen2 and Gen3 here:

[   53.328284] WARNING: CPU: 0 PID: 743 at kernel/dma/debug.c:498 add_dma_entry+0x158/0x180
[   53.336397] DMA-API: exceeded 7 overlapping mappings of cacheline 0x011807bc
[   53.343451] Modules linked in:
[   53.346512] CPU: 0 PID: 743 Comm: kworker/0:7 Tainted: G        W         5.13.0-rc6-00061-g639f0f876ccf-dirty #1152
[   53.357046] Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
[   53.363493] Workqueue: events_freezable mmc_rescan
[   53.368299] Backtrace: 
[   53.370749] [<c0523e08>] (dump_backtrace) from [<c05241d0>] (show_stack+0x20/0x24)
[   53.378337]  r7:c1da1aec r6:00000000 r5:00000009 r4:c0707df7
[   53.383998] [<c05241b0>] (show_stack) from [<c05262b4>] (dump_stack+0x28/0x30)
[   53.391232] [<c052628c>] (dump_stack) from [<c011754c>] (__warn+0xcc/0xf8)
[   53.398119]  r5:00000009 r4:c06fb973
[   53.401695] [<c0117480>] (__warn) from [<c05248c4>] (warn_slowpath_fmt+0x88/0xcc)
[   53.409191]  r9:ffffffff r8:c06fbc8b r7:00000009 r6:c0163e20 r5:000001f2 r4:c06fb973
[   53.416938] [<c0524840>] (warn_slowpath_fmt) from [<c0163e20>] (add_dma_entry+0x158/0x180)
[   53.425218]  r8:00000001 r7:4601ef3f r6:20000013 r5:011807bc r4:ffffffef
[   53.431921] [<c0163cc8>] (add_dma_entry) from [<c016541c>] (debug_dma_map_sg+0x214/0x330)
[   53.440115]  r6:c1135370 r5:c1b44410 r4:c1da1cd8
[   53.444732] [<c0165208>] (debug_dma_map_sg) from [<c015fe7c>] (dma_map_sg_attrs+0xb4/0xec)
[   53.453013]  r10:00000001 r9:00000000 r8:00000001 r7:c1da1cd8 r6:00000002 r5:c1b44410
[   53.460847]  r4:00000001
[   53.463380] [<c015fdc8>] (dma_map_sg_attrs) from [<c0490a60>] (renesas_sdhi_sys_dmac_start_dma+0xe0/0x3ec)
[   53.473055]  r8:c5e58d48 r7:c5dbf0ec r6:c1da1cd8 r5:00000001 r4:c1f73340
[   53.479758] [<c0490980>] (renesas_sdhi_sys_dmac_start_dma) from [<c048d7b4>] (tmio_process_mrq+0x134/0x274)
[   53.489519]  r10:c1da1d5c r9:00000001 r8:f082f014 r7:c1da1d20 r6:c1da1d5c r5:c1da1ce8
[   53.497353]  r4:c1f73340
[   53.499886] [<c048d680>] (tmio_process_mrq) from [<c048d9ac>] (tmio_mmc_request+0xb8/0xc4)
[   53.508168]  r9:c1da1ce8 r8:c06818f0 r7:60000013 r6:c1f73340 r5:c1da1d20 r4:c1f73000
[   53.515915] [<c048d8f4>] (tmio_mmc_request) from [<c0473c94>] (__mmc_start_request+0xb8/0x11c)
[   53.524544]  r7:00000040 r6:00000000 r5:c1da1d20 r4:c1f73000
[   53.530204] [<c0473bdc>] (__mmc_start_request) from [<c0473d8c>] (mmc_start_request+0x94/0xa8)
[   53.538831]  r7:00000040 r6:00000000 r5:c1da1d20 r4:c1f73000
[   53.544491] [<c0473cf8>] (mmc_start_request) from [<c0473eb4>] (mmc_wait_for_req+0x74/0xbc)
[   53.552858]  r7:00000040 r6:c1da1d30 r5:c1f73000 r4:c1da1d20
[   53.558518] [<c0473e40>] (mmc_wait_for_req) from [<c047a128>] (mmc_send_tuning+0x108/0x180)
[   53.566887]  r7:00000040 r6:c1da1dd8 r5:c601ef00 r4:c1f73000
[   53.572547] [<c047a020>] (mmc_send_tuning) from [<c0490448>] (renesas_sdhi_execute_tuning+0x2b4/0x440)
[   53.581874]  r10:c0738b11 r9:c5e58dd4 r8:00000013 r7:0000000b r6:c5e58d48 r5:c1f73340
[   53.589708]  r4:c1f73000
[   53.592241] [<c0490194>] (renesas_sdhi_execute_tuning) from [<c0474580>] (mmc_execute_tuning+0x70/0xa8)
[   53.601653]  r10:c0aaf1d4 r9:00000000 r8:00200000 r7:00000006 r6:c601b800 r5:c0490194
[   53.609487]  r4:c1f73000
[   53.612020] [<c0474510>] (mmc_execute_tuning) from [<c047c058>] (mmc_sd_init_uhs_card+0x310/0x3f4)
[   53.620995]  r7:00000006 r6:c601ec80 r5:00000000 r4:c601b800
[   53.626655] [<c047bd48>] (mmc_sd_init_uhs_card) from [<c047d0b8>] (mmc_sd_init_card+0x2a4/0x7e0)
[   53.635454]  r7:00000000 r6:c1f73000 r5:00000000 r4:c601b800

I haven't really an idea why SYS-DMAC here is different that internal
DMAC. This needs investigation.
Geert Uytterhoeven June 18, 2021, 10:53 a.m. UTC | #5
Hi Wolfram,

On Fri, Jun 18, 2021 at 11:57 AM Wolfram Sang
<wsa+renesas@sang-engineering.com> wrote:
> > BTW, does it work (really) fine on R-Car Gen3? Mine does.
>
> Yes, mine does, too. So, I first wondered if the difference is an older
> SCC variant. But when I let the timeouts run for a while, I get a DMA
> problem. Which could also be the difference between Gen2 and Gen3 here:
>
> [   53.328284] WARNING: CPU: 0 PID: 743 at kernel/dma/debug.c:498 add_dma_entry+0x158/0x180
> [   53.336397] DMA-API: exceeded 7 overlapping mappings of cacheline 0x011807bc

Someone's forgetting to clean up DMA mappings?

Gr{oetje,eeting}s,

                        Geert
Wolfram Sang Aug. 26, 2021, 8:16 a.m. UTC | #6
On Wed, Apr 28, 2021 at 04:36:17PM +0200, Geert Uytterhoeven wrote:
> Hi Wolfram,
> 
> On Wed, Mar 17, 2021 at 10:17 AM Wolfram Sang
> <wsa+renesas@sang-engineering.com> wrote:
> > All recent SDHI instances can be reset via the reset controller. If one
> > is found, use it instead of the open coded reset. This is to get a
> > future-proof sane reset state.
> >
> > Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
> 
> Thanks for your patch, which is now commit b4d86f37eacb7246 ("mmc:
> renesas_sdhi: do hard reset if possible") in mmc/next.
> 
> This breaks SDHI on koelsch (R-Car M2-W).
> In v5.10, v5.11, and v512, it works fine:

I think I have a fix and will send it out in some minutes.
diff mbox series

Patch

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index b236dfe2e879..2f6bc19d0e40 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -707,6 +707,7 @@  config MMC_SDHI
 	tristate "Renesas SDHI SD/SDIO controller support"
 	depends on SUPERH || ARCH_RENESAS || COMPILE_TEST
 	select MMC_TMIO_CORE
+	select RESET_CONTROLLER if ARCH_RENESAS
 	help
 	  This provides support for the SDHI SD/SDIO controller found in
 	  Renesas SuperH, ARM and ARM64 based SoCs
diff --git a/drivers/mmc/host/renesas_sdhi.h b/drivers/mmc/host/renesas_sdhi.h
index cb962c7883dc..53eded81a53e 100644
--- a/drivers/mmc/host/renesas_sdhi.h
+++ b/drivers/mmc/host/renesas_sdhi.h
@@ -70,6 +70,8 @@  struct renesas_sdhi {
 	DECLARE_BITMAP(smpcmp, BITS_PER_LONG);
 	unsigned int tap_num;
 	unsigned int tap_set;
+
+	struct reset_control *rstc;
 };
 
 #define host_to_priv(host) \
diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index db753829eaf6..d36181b6f687 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -20,6 +20,7 @@ 
 
 #include <linux/clk.h>
 #include <linux/delay.h>
+#include <linux/iopoll.h>
 #include <linux/kernel.h>
 #include <linux/mfd/tmio.h>
 #include <linux/mmc/host.h>
@@ -32,6 +33,7 @@ 
 #include <linux/platform_device.h>
 #include <linux/pm_domain.h>
 #include <linux/regulator/consumer.h>
+#include <linux/reset.h>
 #include <linux/sh_dma.h>
 #include <linux/slab.h>
 #include <linux/sys_soc.h>
@@ -572,10 +574,19 @@  static void renesas_sdhi_scc_reset(struct tmio_mmc_host *host, struct renesas_sd
 static void renesas_sdhi_reset(struct tmio_mmc_host *host)
 {
 	struct renesas_sdhi *priv = host_to_priv(host);
+	int ret;
 	u16 val;
 
-	if (priv->scc_ctl)
+	if (priv->rstc) {
+		reset_control_reset(priv->rstc);
+		/* Unknown why but without polling reset status, it will hang */
+		read_poll_timeout(reset_control_status, ret, ret == 0, 1, 100,
+				  false, priv->rstc);
+		priv->needs_adjust_hs400 = false;
+		renesas_sdhi_set_clock(host, host->clk_cache);
+	} else if (priv->scc_ctl) {
 		renesas_sdhi_scc_reset(host, priv);
+	}
 
 	sd_ctrl_write32_as_16_and_16(host, CTL_IRQ_MASK, TMIO_MASK_ALL_RCAR2);
 
@@ -1081,6 +1092,10 @@  int renesas_sdhi_probe(struct platform_device *pdev,
 	if (ret)
 		goto efree;
 
+	priv->rstc = devm_reset_control_get_optional_exclusive(&pdev->dev, NULL);
+	if (IS_ERR(priv->rstc))
+		return PTR_ERR(priv->rstc);
+
 	ver = sd_ctrl_read16(host, CTL_VERSION);
 	/* GEN2_SDR104 is first known SDHI to use 32bit block count */
 	if (ver < SDHI_VER_GEN2_SDR104 && mmc_data->max_blk_count > U16_MAX)