diff mbox

xHCI problem? [was Re: Erratic USB device behavior and device loss]

Message ID CAPDyKFpnCXhdoKgoG576teC=y38vbC1x=-ehC_9EWEeKr_K6BQ@mail.gmail.com
State New
Headers show

Commit Message

Ulf Hansson Sept. 6, 2016, 9:42 a.m. UTC
On 5 September 2016 at 17:58, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Mon, 5 Sep 2016, Ritesh Raj Sarraf wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA512
>>
>> On Sun, 2016-09-04 at 15:46 -0400, Alan Stern wrote:
>> >
>> > This is not the problem I was discussing with Ulf.  The problem was why
>> > the device kept going into and out of runtime suspend every three
>> > seconds.  The kernel log above does not say whether this was happening.
>> > One way to tell is to look at a usbmon trace (like we did before).
>>
>> https://people.debian.org/~rrs/tmp/usbmon.txt.gz
>>
>> This should have the logs you asked for, running on 4.8-rc4.
>
> Confirmed, the runtime suspends and resumes are still happening.
>
> Ulf, any insights?

Alan, Ritesh,

Yes, I am starting to understand more about what goes on here.
Although I need help to test as I don't have the HW.
As you already guessed, I suspect the problem is within the runtime PM
deployment in the drivers/mmc/host/rtsx_usb_sdmmc.c.

Let me start by first give you some background to how the mmc core
deals with runtime PM.

*)
The mmc core manages most of the calls to the pm_runtime_get|put*()
and pm_runtime_mark_last_busy() for the mmc host device. The gets/puts
are done when the core is about to access the mmc host device, via the
mmc host ops driver interface. You may search for calls to the
mmc_claim|release_host() functions to find out when the gets/puts are
done.

**)
The mmc core have also deployed runtime PM for the mmc *card* device
and which has the runtime PM autosuspend feature enabled with a 3s
default timeout. One important point is that the mmc card device has
the mmc host device assigned as being its parent device. I guessing
the reason to why you are encountering strange 3s intervals of runtime
PM suspend/resume is related to this.

Now, in this case, the rtsx_usb_sdmmc driver seems to need a bit of
special runtime PM deployment, as the calls to pm_runtime_get|put*()
also controls the power to the usb device and thus also the power to
the card. I am guessing that's done via the usb device being assigned
as parent for the mmc host's platform device!?

By reviewing the code of the rtsx_usb_sdmmc driver, particularly how
it calls pm_runtime_get|put() I am guessing those calls may not be
properly deployed. Perhaps rtsx_usb_sdmmc should convert to use the
usb_autopm_put|get_interface() and friends, although I didn't want to
make that change at this point so instead I have cooked a patch that
might fixes the behaviour.

Ritesh, can you please try it out to see what happens?

---
 drivers/mmc/host/rtsx_usb_sdmmc.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

        sd_set_timing(host, ios->timing, &host->ddr_mode);
@@ -1336,7 +1331,7 @@ static void rtsx_usb_init_host(struct
rtsx_usb_sdmmc *host)
                MMC_CAP_MMC_HIGHSPEED | MMC_CAP_BUS_WIDTH_TEST |
                MMC_CAP_UHS_SDR12 | MMC_CAP_UHS_SDR25 | MMC_CAP_UHS_SDR50 |
                MMC_CAP_NEEDS_POLL;
-       mmc->caps2 = MMC_CAP2_NO_PRESCAN_POWERUP | MMC_CAP2_FULL_PWR_CYCLE;
+       mmc->caps2 = MMC_CAP2_FULL_PWR_CYCLE;

        mmc->max_current_330 = 400;
        mmc->max_current_180 = 800;

Comments

Ritesh Raj Sarraf Sept. 6, 2016, 5:08 p.m. UTC | #1
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hello Ulf,

On Tue, 2016-09-06 at 11:42 +0200, Ulf Hansson wrote:
> 
> By reviewing the code of the rtsx_usb_sdmmc driver, particularly how
> it calls pm_runtime_get|put() I am guessing those calls may not be
> properly deployed. Perhaps rtsx_usb_sdmmc should convert to use the
> usb_autopm_put|get_interface() and friends, although I didn't want to
> make that change at this point so instead I have cooked a patch that
> might fixes the behaviour.
> 
> Ritesh, can you please try it out to see what happens?

I was able to hit the issue again, with your patch applied. I tried it on top of
the 4.8-rc5 kernel.

I ensured to capture usbmon trace.
https://people.debian.org/~rrs/tmp/usbmon-ulf.txt.gz


- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJXzvgmAAoJEKY6WKPy4XVpJ/oQAIR/bC82KdqD3xuQpq8r62RT
fV8DSduNhr394GIDO5PBDSKUAFDKCHL9SyCBx9I5zx1U7OF+IfvhgpRji+XcBsQ5
TOtDAhiiTatKfWl/zLGbVqukNS8QLzyU26w0HRabvK3RZ0lGxBugQ8KneUC8wA/Q
NYWAgio5gN5jpD6BgGEZXo6cdzcdI1HM5kyNxcs4VfhD0zDor4wlrW024458a3Fs
02QAkmyV1aBwR4w/Ntw7D9DXb8sVujGgeEOr+KukzdFOPW+w4JOANv3gregDnKSc
EgIYmtTE95d0jWyJuS5sbZlbf4QOcrv9eVwn86TOdhZHU1XlaqaIEbRPzu15ZtZF
B+zYbWg+TgTYFgTYJoaB8T+CUJMMnNh62IC72LXa5K/C0kjgjlHUzis04XdFmT2q
jYC767M1utXi7a4LcXxw8V8uoV0nnmuvxWzhrcdBxi6+a85BLL8bPKT+nrGxkxB0
Hg7Wmnc9moohpjYD3KlAYR3bW56dDYrGgsj8xRUnplBkUp10g2vhix7bk/q+eUwU
gJghnI/WgEE5hBQabtuqCTCsvMdTXmZVjzMg1k43V5v+UR0a7mKVW0uBWu5cU38p
Ga9ZrNmPEzxOVmeXH3XCB8fCXRgce8ieu/KEPHpvbegia0nX4X7BcCHhK2KGsqnX
Y4f+TvpJw2idUe/cvdtw
=/aP1
-----END PGP SIGNATURE-----

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern Sept. 7, 2016, 8:48 p.m. UTC | #2
On Tue, 6 Sep 2016, Ulf Hansson wrote:

> On 5 September 2016 at 17:58, Alan Stern <stern@rowland.harvard.edu> wrote:
> > On Mon, 5 Sep 2016, Ritesh Raj Sarraf wrote:
> >
> >> -----BEGIN PGP SIGNED MESSAGE-----
> >> Hash: SHA512
> >>
> >> On Sun, 2016-09-04 at 15:46 -0400, Alan Stern wrote:
> >> >
> >> > This is not the problem I was discussing with Ulf.  The problem was why
> >> > the device kept going into and out of runtime suspend every three
> >> > seconds.  The kernel log above does not say whether this was happening.
> >> > One way to tell is to look at a usbmon trace (like we did before).
> >>
> >> https://people.debian.org/~rrs/tmp/usbmon.txt.gz
> >>
> >> This should have the logs you asked for, running on 4.8-rc4.
> >
> > Confirmed, the runtime suspends and resumes are still happening.
> >
> > Ulf, any insights?
> 
> Alan, Ritesh,
> 
> Yes, I am starting to understand more about what goes on here.
> Although I need help to test as I don't have the HW.
> As you already guessed, I suspect the problem is within the runtime PM
> deployment in the drivers/mmc/host/rtsx_usb_sdmmc.c.
> 
> Let me start by first give you some background to how the mmc core
> deals with runtime PM.
> 
> *)
> The mmc core manages most of the calls to the pm_runtime_get|put*()
> and pm_runtime_mark_last_busy() for the mmc host device. The gets/puts
> are done when the core is about to access the mmc host device, via the
> mmc host ops driver interface. You may search for calls to the
> mmc_claim|release_host() functions to find out when the gets/puts are
> done.

Since mmc_claim_host() does call pm_runtime_get_sync(), these runtime
suspends would not occur if the host was claimed.  So we can conclude
that rtsx_usb_sdmmc.c must be doing I/O to the device without claiming
the host.

> **)
> The mmc core have also deployed runtime PM for the mmc *card* device
> and which has the runtime PM autosuspend feature enabled with a 3s
> default timeout. One important point is that the mmc card device has
> the mmc host device assigned as being its parent device. I guessing
> the reason to why you are encountering strange 3s intervals of runtime
> PM suspend/resume is related to this.

That sounds likely.

> Now, in this case, the rtsx_usb_sdmmc driver seems to need a bit of
> special runtime PM deployment, as the calls to pm_runtime_get|put*()
> also controls the power to the usb device and thus also the power to
> the card. I am guessing that's done via the usb device being assigned
> as parent for the mmc host's platform device!?

Yes, in drivers/mfd/mfd-core.c's mfd_add_device() routine, which is 
called indirectly by drivers/mfd/rtsx_usb.c's probe routine.

> By reviewing the code of the rtsx_usb_sdmmc driver, particularly how
> it calls pm_runtime_get|put() I am guessing those calls may not be
> properly deployed. Perhaps rtsx_usb_sdmmc should convert to use the
> usb_autopm_put|get_interface() and friends, although I didn't want to
> make that change at this point so instead I have cooked a patch that
> might fixes the behaviour.

The usb_autopm_* calls are mostly just convenience wrappers around the
pm_runtime_* functions, meant for use with USB drivers.  In fact, you
can't use them with a platform_device.

It seems odd that rtsx_usb_sdmmc calls pm_runtime_put() and
pm_runtime_get_sync() directly instead of using
mmc_claim|release_host().

> Ritesh, can you please try it out to see what happens?
> 
> ---
>  drivers/mmc/host/rtsx_usb_sdmmc.c | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/drivers/mmc/host/rtsx_usb_sdmmc.c
> b/drivers/mmc/host/rtsx_usb_sdmmc.c
> index 6c71fc9..3d6fe51 100644
> --- a/drivers/mmc/host/rtsx_usb_sdmmc.c
> +++ b/drivers/mmc/host/rtsx_usb_sdmmc.c
> @@ -1138,11 +1138,6 @@ static void sdmmc_set_ios(struct mmc_host *mmc,
> struct mmc_ios *ios)
>         dev_dbg(sdmmc_dev(host), "%s\n", __func__);
>         mutex_lock(&ucr->dev_mutex);
> 
> -       if (rtsx_usb_card_exclusive_check(ucr, RTSX_USB_SD_CARD)) {
> -               mutex_unlock(&ucr->dev_mutex);
> -               return;
> -       }
> -
>         sd_set_power_mode(host, ios->power_mode);
>         sd_set_bus_width(host, ios->bus_width);
>         sd_set_timing(host, ios->timing, &host->ddr_mode);
> @@ -1336,7 +1331,7 @@ static void rtsx_usb_init_host(struct
> rtsx_usb_sdmmc *host)
>                 MMC_CAP_MMC_HIGHSPEED | MMC_CAP_BUS_WIDTH_TEST |
>                 MMC_CAP_UHS_SDR12 | MMC_CAP_UHS_SDR25 | MMC_CAP_UHS_SDR50 |
>                 MMC_CAP_NEEDS_POLL;
> -       mmc->caps2 = MMC_CAP2_NO_PRESCAN_POWERUP | MMC_CAP2_FULL_PWR_CYCLE;
> +       mmc->caps2 = MMC_CAP2_FULL_PWR_CYCLE;
> 
>         mmc->max_current_330 = 400;
>         mmc->max_current_180 = 800;

These changes don't seem to affect the way rtsx_usb_sdmmc.c handles 
runtime PM.  In particular, the driver doesn't call 
pm_runtime_mark_last_busy() anywhere.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Sept. 9, 2016, 10:54 a.m. UTC | #3
On 7 September 2016 at 22:48, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Tue, 6 Sep 2016, Ulf Hansson wrote:
>
>> On 5 September 2016 at 17:58, Alan Stern <stern@rowland.harvard.edu> wrote:
>> > On Mon, 5 Sep 2016, Ritesh Raj Sarraf wrote:
>> >
>> >> -----BEGIN PGP SIGNED MESSAGE-----
>> >> Hash: SHA512
>> >>
>> >> On Sun, 2016-09-04 at 15:46 -0400, Alan Stern wrote:
>> >> >
>> >> > This is not the problem I was discussing with Ulf.  The problem was why
>> >> > the device kept going into and out of runtime suspend every three
>> >> > seconds.  The kernel log above does not say whether this was happening.
>> >> > One way to tell is to look at a usbmon trace (like we did before).
>> >>
>> >> https://people.debian.org/~rrs/tmp/usbmon.txt.gz
>> >>
>> >> This should have the logs you asked for, running on 4.8-rc4.
>> >
>> > Confirmed, the runtime suspends and resumes are still happening.
>> >
>> > Ulf, any insights?
>>
>> Alan, Ritesh,
>>
>> Yes, I am starting to understand more about what goes on here.
>> Although I need help to test as I don't have the HW.
>> As you already guessed, I suspect the problem is within the runtime PM
>> deployment in the drivers/mmc/host/rtsx_usb_sdmmc.c.
>>
>> Let me start by first give you some background to how the mmc core
>> deals with runtime PM.
>>
>> *)
>> The mmc core manages most of the calls to the pm_runtime_get|put*()
>> and pm_runtime_mark_last_busy() for the mmc host device. The gets/puts
>> are done when the core is about to access the mmc host device, via the
>> mmc host ops driver interface. You may search for calls to the
>> mmc_claim|release_host() functions to find out when the gets/puts are
>> done.
>
> Since mmc_claim_host() does call pm_runtime_get_sync(), these runtime
> suspends would not occur if the host was claimed.  So we can conclude
> that rtsx_usb_sdmmc.c must be doing I/O to the device without claiming
> the host.

Yes.

So to be clear, it's the responsible of the mmc core to deal with
mmc_claim_release() host, not the mmc host driver.

Although, under special circumstances the mmc host driver may still
need to access its device. In these cases it explicitly needs to deal
with pm_runtime_get|put*() itself.

Now, what puzzles me here, is that the rtsx_usb_sdmmc driver was
introduced in kernel v3.16.
At that point (and not until v4.1) the mmc core did *not* deal with
runtime PM for mmc host devices via mmc_claim_release_host.

So how could the driver work at that point? :-) Maybe runtime PM has
never worked for this driver!?

>
>> **)
>> The mmc core have also deployed runtime PM for the mmc *card* device
>> and which has the runtime PM autosuspend feature enabled with a 3s
>> default timeout. One important point is that the mmc card device has
>> the mmc host device assigned as being its parent device. I guessing
>> the reason to why you are encountering strange 3s intervals of runtime
>> PM suspend/resume is related to this.
>
> That sounds likely.
>
>> Now, in this case, the rtsx_usb_sdmmc driver seems to need a bit of
>> special runtime PM deployment, as the calls to pm_runtime_get|put*()
>> also controls the power to the usb device and thus also the power to
>> the card. I am guessing that's done via the usb device being assigned
>> as parent for the mmc host's platform device!?
>
> Yes, in drivers/mfd/mfd-core.c's mfd_add_device() routine, which is
> called indirectly by drivers/mfd/rtsx_usb.c's probe routine.
>
>> By reviewing the code of the rtsx_usb_sdmmc driver, particularly how
>> it calls pm_runtime_get|put() I am guessing those calls may not be
>> properly deployed. Perhaps rtsx_usb_sdmmc should convert to use the
>> usb_autopm_put|get_interface() and friends, although I didn't want to
>> make that change at this point so instead I have cooked a patch that
>> might fixes the behaviour.
>
> The usb_autopm_* calls are mostly just convenience wrappers around the
> pm_runtime_* functions, meant for use with USB drivers.  In fact, you
> can't use them with a platform_device.

Okay.

>
> It seems odd that rtsx_usb_sdmmc calls pm_runtime_put() and
> pm_runtime_get_sync() directly instead of using
> mmc_claim|release_host().

Those calls is done when the mmc core calls mmc host driver's
->set_ios() callback and to power up/off the mmc *card*. It's has
nothing to do with the mmc host device as such.

If I understand the original author's intent, was that because of
these runtime PM calls he wanted to control the power to the mmc card.

>
>> Ritesh, can you please try it out to see what happens?
>>
>> ---
>>  drivers/mmc/host/rtsx_usb_sdmmc.c | 7 +------
>>  1 file changed, 1 insertion(+), 6 deletions(-)
>>
>> diff --git a/drivers/mmc/host/rtsx_usb_sdmmc.c
>> b/drivers/mmc/host/rtsx_usb_sdmmc.c
>> index 6c71fc9..3d6fe51 100644
>> --- a/drivers/mmc/host/rtsx_usb_sdmmc.c
>> +++ b/drivers/mmc/host/rtsx_usb_sdmmc.c
>> @@ -1138,11 +1138,6 @@ static void sdmmc_set_ios(struct mmc_host *mmc,
>> struct mmc_ios *ios)
>>         dev_dbg(sdmmc_dev(host), "%s\n", __func__);
>>         mutex_lock(&ucr->dev_mutex);
>>
>> -       if (rtsx_usb_card_exclusive_check(ucr, RTSX_USB_SD_CARD)) {
>> -               mutex_unlock(&ucr->dev_mutex);
>> -               return;
>> -       }
>> -
>>         sd_set_power_mode(host, ios->power_mode);
>>         sd_set_bus_width(host, ios->bus_width);
>>         sd_set_timing(host, ios->timing, &host->ddr_mode);
>> @@ -1336,7 +1331,7 @@ static void rtsx_usb_init_host(struct
>> rtsx_usb_sdmmc *host)
>>                 MMC_CAP_MMC_HIGHSPEED | MMC_CAP_BUS_WIDTH_TEST |
>>                 MMC_CAP_UHS_SDR12 | MMC_CAP_UHS_SDR25 | MMC_CAP_UHS_SDR50 |
>>                 MMC_CAP_NEEDS_POLL;
>> -       mmc->caps2 = MMC_CAP2_NO_PRESCAN_POWERUP | MMC_CAP2_FULL_PWR_CYCLE;
>> +       mmc->caps2 = MMC_CAP2_FULL_PWR_CYCLE;
>>
>>         mmc->max_current_330 = 400;
>>         mmc->max_current_180 = 800;
>
> These changes don't seem to affect the way rtsx_usb_sdmmc.c handles
> runtime PM.  In particular, the driver doesn't call
> pm_runtime_mark_last_busy() anywhere.

This affects the way the core calls the host driver's ->set_ios()
callback. Earlier it was invoked first to do power off then power up.
With this change it starts with power up instead.
I wanted to try this because I suspected the initial state could be wrong.

So here are some other ideas on how to move forward.
1. Run with CONFIG_PM unset to see if we can reproduce the problem.
2. Revert back the state in the mmc core we had in 3.16 around how it
deals with runtime PM for host devices. That's actually very easy as
we only need to remove the pm_runtime_put|get() calls in
mmc_claim|release_host().

Ritesh, can you try these options?

Neither of the above will actually solve the problem, so I guess we
anyway need take a closer look to understand why the usb device is
accessed when it is actually runtime suspended.

BTW, Ritesh you could also run a git bisect to find out if/when this
became broken.

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ritesh Raj Sarraf Sept. 9, 2016, 1:14 p.m. UTC | #4
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Fri, 2016-09-09 at 12:54 +0200, Ulf Hansson wrote:
> This affects the way the core calls the host driver's ->set_ios()
> callback. Earlier it was invoked first to do power off then power up.
> With this change it starts with power up instead.
> I wanted to try this because I suspected the initial state could be wrong.
> 
> So here are some other ideas on how to move forward.
> 1. Run with CONFIG_PM unset to see if we can reproduce the problem.
> 2. Revert back the state in the mmc core we had in 3.16 around how it
> deals with runtime PM for host devices. That's actually very easy as
> we only need to remove the pm_runtime_put|get() calls in
> mmc_claim|release_host().
> 
> Ritesh, can you try these options?
> 

Yes. I can try the above ones now. I'm building the kernel for it.


> Neither of the above will actually solve the problem, so I guess we
> anyway need take a closer look to understand why the usb device is
> accessed when it is actually runtime suspended.
> 
> BTW, Ritesh you could also run a git bisect to find out if/when this
> became broken.

What starting point do you want me to use ? 4.0 ?

- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX0rWtAAoJEKY6WKPy4XVpgeYP/3FCabIWEOvKeVJK52B5gviF
+CFz00vNVddMDQjUlbN/JzaA8h5HiLMBaTp/7nBvaMYKYKZjjo8iiNQ/3jjwPUu8
S/XrWPvMUVJUhF42OMKTl4uUpVPo3fI18SYIw4XFt5pNaUFrmBPjZ6S23BhqfLN3
x2BSw9rwZrCmxH4aMpfWmDFzjWvkuJ6grpxPfbz2lx4t7BWGs4wWdTc0yfQgzWDR
SLD/CpPCkVFjnpjRnrYZei3nNo0tS0+qSv1SsRwy8GchRbOo9bK1W8YKnLH/AFLb
kCYcVKCc4mq8iq5e05vjPvdppXpbzKKlhpxxtlqzLjAYx9FbESRN89D9c7V1Ng0q
c40b+lQV7AGYUxaf+tOcErflDvzRquR1dcwcjJizgeO/pOI6uXGWE+/T5j+4q4UJ
cypkxjzPIz8SbipNEPGf2nSMppXC1RRcBGo4247vFZ0QYTC7RAViL6ErRS67kfM9
5eX2iBtO3qNZMxIMqBPu97rLoLHSZ3nFEwNbyyFnNHERZuvLUZrZY208tZYUfZKZ
X18QJKPrqfbb9n+KyyLNTaESm6jhW2yYFdiRs/4LQWniOlLLPa+6GOD7E81044nB
t8LAKtSSxK+qvFVcfqrZSfaivRKQZbYwuk+x3O4vAUDgTpTgmugNaDTYendN3v1B
R809X7cWC6V1kzTaBkK9
=o1rL
-----END PGP SIGNATURE-----

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ritesh Raj Sarraf Sept. 9, 2016, 2:04 p.m. UTC | #5
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Fri, 2016-09-09 at 18:44 +0530, Ritesh Raj Sarraf wrote:
> On Fri, 2016-09-09 at 12:54 +0200, Ulf Hansson wrote:
> > This affects the way the core calls the host driver's ->set_ios()
> > callback. Earlier it was invoked first to do power off then power up.
> > With this change it starts with power up instead.
> > I wanted to try this because I suspected the initial state could be wrong.
> > 
> > So here are some other ideas on how to move forward.
> > 1. Run with CONFIG_PM unset to see if we can reproduce the problem.
> > 2. Revert back the state in the mmc core we had in 3.16 around how it
> > deals with runtime PM for host devices. That's actually very easy as
> > we only need to remove the pm_runtime_put|get() calls in
> > mmc_claim|release_host().
> > 
> > Ritesh, can you try these options?
> > 
> 
> Yes. I can try the above ones now. I'm building the kernel for it.

For #1, menuconfig doesn't allow me to disable CONFIG_PM in 4.8. I checked it
back up till 4.0, and it still doesn't allow disabling CONFIG_PM.

For #2, I'm building the 4.8-rc5 kernel with the following change. This build
does not include the previous change you had suggested (related to POWER_CYCLE)

Date:   Fri Sep 9 19:28:03 2016 +0530

    Disable pm runtime in mmc core

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index e55cde6..32388d5 100644
- --- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -970,9 +970,6 @@ int __mmc_claim_host(struct mmc_host *host, atomic_t *abort)
        spin_unlock_irqrestore(&host->lock, flags);
        remove_wait_queue(&host->wq, &wait);
 
- -       if (pm)
- -               pm_runtime_get_sync(mmc_dev(host));
- -
        return stop;
 }
 EXPORT_SYMBOL(__mmc_claim_host);
@@ -1000,7 +997,6 @@ void mmc_release_host(struct mmc_host *host)
                spin_unlock_irqrestore(&host->lock, flags);
                wake_up(&host->wq);
                pm_runtime_mark_last_busy(mmc_dev(host));
- -               pm_runtime_put_autosuspend(mmc_dev(host));
        }
 }
 EXPORT_SYMBOL(mmc_release_host);


- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX0sF8AAoJEKY6WKPy4XVpmDQQAKPpzuw4QaaYdGuoEdZs9tvL
ZIVXOp81QVFg/VC+k8b5JmxcVmyaRrAmlKwKSBUrQqLsRIDROKHz7kAZmABvvmMo
8PC+haslv6o+M/xTd2kZMgYRk0Xj11+Ucr6mTd0BVbTqzD86WZhSmdufeiFWhzjB
aMloMDJ3cYABMIHqPQH5S/+knNhuffKqEEZ1O7jgcc10c/JpwpxaAlefNLh9Qotk
bsb+ptpBE0ggk8gD/tGSx6JZLNFy15JyzE8yuL8LfrZzzW2KU8M4kv94+6BNMqpE
sJ1mapW3zu52Hev9cDpUeTgyVVEOEXJKu9AM626voyxVYrCEwrE4usUcLVsdJH17
p7Rm5gBiEK/Wx+f10CBiFW2HwdE0KmeBgxweprv+E6VXaWFkjoXSJY5DDX5zuxlf
we9onx87IaGVTLN0I7dEcVse/3T3zT8URM/HwFyR6K+PWD0Ioiyoi4GIE96OCIJn
oUahrupOppgUZbr+qn+HULHLXJONWBslZmbS3gjQG282+koy00wquAou+4HznA5z
DBLPaljAaIuIPKxKrIDOcJ3nxBh5eBf0RHXsn9Ho6iWrcpPqYtN2XuDWNhOA/wJB
eEm1TQAMmi83FATZ0qf0n70E1/mv2wrV4nSzHPqeIgtexABWLqhS8fLCEuMW6ESt
zMRmOrou7dzaXfwkGzhC
=2dkl
-----END PGP SIGNATURE-----

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern Sept. 9, 2016, 4:15 p.m. UTC | #6
On Fri, 9 Sep 2016, Ritesh Raj Sarraf wrote:

> On Fri, 2016-09-09 at 18:44 +0530, Ritesh Raj Sarraf wrote:
> > On Fri, 2016-09-09 at 12:54 +0200, Ulf Hansson wrote:
> > > This affects the way the core calls the host driver's ->set_ios()
> > > callback. Earlier it was invoked first to do power off then power up.
> > > With this change it starts with power up instead.
> > > I wanted to try this because I suspected the initial state could be wrong.
> > > 
> > > So here are some other ideas on how to move forward.
> > > 1. Run with CONFIG_PM unset to see if we can reproduce the problem.
> > > 2. Revert back the state in the mmc core we had in 3.16 around how it
> > > deals with runtime PM for host devices. That's actually very easy as
> > > we only need to remove the pm_runtime_put|get() calls in
> > > mmc_claim|release_host().
> > > 
> > > Ritesh, can you try these options?
> > > 
> > 
> > Yes. I can try the above ones now. I'm building the kernel for it.
> 
> For #1, menuconfig doesn't allow me to disable CONFIG_PM in 4.8. I checked it
> back up till 4.0, and it still doesn't allow disabling CONFIG_PM.

You can do it, but it would require some pretty far-reaching changes.  
Besides, there's really no point.  If CONFIG_PM isn't enabled then the 
kernel doesn't do any runtime PM at all.

Alan Stern


> For #2, I'm building the 4.8-rc5 kernel with the following change. This build
> does not include the previous change you had suggested (related to POWER_CYCLE)
> 
> Date:   Fri Sep 9 19:28:03 2016 +0530
> 
>     Disable pm runtime in mmc core
> 
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index e55cde6..32388d5 100644
> - --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -970,9 +970,6 @@ int __mmc_claim_host(struct mmc_host *host, atomic_t *abort)
>         spin_unlock_irqrestore(&host->lock, flags);
>         remove_wait_queue(&host->wq, &wait);
>  
> - -       if (pm)
> - -               pm_runtime_get_sync(mmc_dev(host));
> - -
>         return stop;
>  }
>  EXPORT_SYMBOL(__mmc_claim_host);
> @@ -1000,7 +997,6 @@ void mmc_release_host(struct mmc_host *host)
>                 spin_unlock_irqrestore(&host->lock, flags);
>                 wake_up(&host->wq);
>                 pm_runtime_mark_last_busy(mmc_dev(host));
> - -               pm_runtime_put_autosuspend(mmc_dev(host));
>         }
>  }
>  EXPORT_SYMBOL(mmc_release_host);

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ritesh Raj Sarraf Sept. 14, 2016, 2:50 p.m. UTC | #7
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hello Ulf and Alan,

On Fri, 2016-09-09 at 19:34 +0530, Ritesh Raj Sarraf wrote:
> For #2, I'm building the 4.8-rc5 kernel with the following change. This build
> does not include the previous change you had suggested (related to
> POWER_CYCLE)
> 
> Date:   Fri Sep 9 19:28:03 2016 +0530
> 
>     Disable pm runtime in mmc core
> 
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index e55cde6..32388d5 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -970,9 +970,6 @@ int __mmc_claim_host(struct mmc_host *host, atomic_t
> *abort)
>         spin_unlock_irqrestore(&host->lock, flags);
>         remove_wait_queue(&host->wq, &wait);
>  
> -       if (pm)
> -               pm_runtime_get_sync(mmc_dev(host));
> -
>         return stop;
>  }
>  EXPORT_SYMBOL(__mmc_claim_host);
> @@ -1000,7 +997,6 @@ void mmc_release_host(struct mmc_host *host)
>                 spin_unlock_irqrestore(&host->lock, flags);
>                 wake_up(&host->wq);
>                 pm_runtime_mark_last_busy(mmc_dev(host));
> -               pm_runtime_put_autosuspend(mmc_dev(host));
>         }
>  }
>  EXPORT_SYMBOL(mmc_release_host);

I tried with these changes on 4.8-rc6 and I only saw 2 resets so far.
I captured the usb trace [1], just in case if you need it.

[1] https://people.debian.org/~rrs/tmp/4.8-rc6-ulf.txt.gz

- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX2WO6AAoJEKY6WKPy4XVpfx4P/3VOWgHw87iZxuHlw+vHUVOR
BB8cdQidh6nVba2UcDXb8uw/oYYYEJZ0FYvvdgKwt/14QNaL1L3jwrLpayUC7AAM
wZA8bhESANx/KoJiZH4GuasnkXfmjXVz2XPOIz/b8qfp4jfreFZfZQHgkIY7cEtE
gO9JArpM/e6FZY/5mEYy1bFnvuuyTfI4Wu5Cm6HmyidT1mfhTM7xvyMTLfM+QRUm
VT7pRhS6oujWO7K2KDMTrcZqgs2AVmjCqZW13F4AnMU/owxvRYTpGyW3hDbi9oIg
Z8ZJ5sHT7jpJ4I/FZJxwaSthIN5aF4wG8UjTFr2EIc+W5Xx/xzYpZP3Qm+HkFnej
n8Uyo0ZMN9+CV6VI2Qgzr+pB5LAfYHjIMobGAzaCzN81MWCVmb/GfiLQhpcgOnoK
TMHVqCqWlkF8W0V5ap+Tc4Ce4Wqj/V+RlQpE01GeOg15DUtgqWXCGbYKspnwwRD2
u2Ivso9G4xDd0VOp9x4zpIdfbAqaSP+DuZZ7EnGVfd30j4ENWsrRTDXdWf63uAx6
GMOpAephwUOZo7WFxwd/q19181YM/emSg+weOMcGxwHvpQ+vTEN+JELD0B0Z6PpI
OvaPROIL8bb5SuPKzKJ55ZRLpv40sjWOglZC1oOhtSg2IMFJwjPcqzaO4frIxl4t
3OCttIZYgpp5kIiW8qV1
=KNnj
-----END PGP SIGNATURE-----

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern Sept. 14, 2016, 3:19 p.m. UTC | #8
On Wed, 14 Sep 2016, Ritesh Raj Sarraf wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
> 
> Hello Ulf and Alan,
> 
> On Fri, 2016-09-09 at 19:34 +0530, Ritesh Raj Sarraf wrote:
> > For #2, I'm building the 4.8-rc5 kernel with the following change. This build
> > does not include the previous change you had suggested (related to
> > POWER_CYCLE)
> > 
> > Date:   Fri Sep 9 19:28:03 2016 +0530
> > 
> >     Disable pm runtime in mmc core
> > 
> > diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> > index e55cde6..32388d5 100644
> > --- a/drivers/mmc/core/core.c
> > +++ b/drivers/mmc/core/core.c
> > @@ -970,9 +970,6 @@ int __mmc_claim_host(struct mmc_host *host, atomic_t
> > *abort)
> >         spin_unlock_irqrestore(&host->lock, flags);
> >         remove_wait_queue(&host->wq, &wait);
> >  
> > -       if (pm)
> > -               pm_runtime_get_sync(mmc_dev(host));
> > -
> >         return stop;
> >  }
> >  EXPORT_SYMBOL(__mmc_claim_host);
> > @@ -1000,7 +997,6 @@ void mmc_release_host(struct mmc_host *host)
> >                 spin_unlock_irqrestore(&host->lock, flags);
> >                 wake_up(&host->wq);
> >                 pm_runtime_mark_last_busy(mmc_dev(host));
> > -               pm_runtime_put_autosuspend(mmc_dev(host));
> >         }
> >  }
> >  EXPORT_SYMBOL(mmc_release_host);
> 
> I tried with these changes on 4.8-rc6 and I only saw 2 resets so far.
> I captured the usb trace [1], just in case if you need it.
> 
> [1] https://people.debian.org/~rrs/tmp/4.8-rc6-ulf.txt.gz

The situation isn't any better.  At the start of the trace, 
the device is in runtime suspend but there are many attempts to 
communicate with it, all of which fail.

Then a little less than an hour after the trace started, the device was 
resumed.  At that point it started working okay.  Until there was a 
spontaneous disconnect.

The device reconnected, but after 3 seconds it was runtime suspended 
again -- and the I/O attempts continued.  Some time later there was 
another runtime resume, and the device began working again.  Until 
another spontaneous disconnect occurred.  And so on...

Ulf, we really need to figure out why the autosuspends are occurring 
and why the I/O doesn't stop while the device is suspended.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/mmc/host/rtsx_usb_sdmmc.c
b/drivers/mmc/host/rtsx_usb_sdmmc.c
index 6c71fc9..3d6fe51 100644
--- a/drivers/mmc/host/rtsx_usb_sdmmc.c
+++ b/drivers/mmc/host/rtsx_usb_sdmmc.c
@@ -1138,11 +1138,6 @@  static void sdmmc_set_ios(struct mmc_host *mmc,
struct mmc_ios *ios)
        dev_dbg(sdmmc_dev(host), "%s\n", __func__);
        mutex_lock(&ucr->dev_mutex);

-       if (rtsx_usb_card_exclusive_check(ucr, RTSX_USB_SD_CARD)) {
-               mutex_unlock(&ucr->dev_mutex);
-               return;
-       }
-
        sd_set_power_mode(host, ios->power_mode);
        sd_set_bus_width(host, ios->bus_width);