diff mbox

PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1

Message ID 20160126224804.GB19432@atomide.com (mailing list archive)
State New, archived
Headers show

Commit Message

Tony Lindgren Jan. 26, 2016, 10:48 p.m. UTC
Hi,

Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
PM states at probe error and driver unbind") broke PM on at least
omap3. It seems we now need to additionally also call
pm_runtime_dont_use_autosuspend() to get things working again?

The following fixes idling on omap3, but I'm wondering if we
should do something in pm_runtime_reinit() instead?

Regards,

Tony

8< ---------------------
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Tony Lindgren Jan. 26, 2016, 10:50 p.m. UTC | #1
* Tony Lindgren <tony@atomide.com> [160126 14:49]:
> Hi,
> 
> Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind") broke PM on at least
> omap3. It seems we now need to additionally also call
> pm_runtime_dont_use_autosuspend() to get things working again?
> 
> The following fixes idling on omap3, but I'm wondering if we
> should do something in pm_runtime_reinit() instead?

Oh one more thing, this happens as we get -EPROBE_DEFER with
the regulators initially.

> 8< ---------------------
> --- a/drivers/mmc/host/omap_hsmmc.c
> +++ b/drivers/mmc/host/omap_hsmmc.c
> @@ -2232,6 +2232,7 @@ err_irq:
>  		dma_release_channel(host->tx_chan);
>  	if (host->rx_chan)
>  		dma_release_channel(host->rx_chan);
> +	pm_runtime_dont_use_autosuspend(host->dev);
>  	pm_runtime_put_sync(host->dev);
>  	pm_runtime_disable(host->dev);
>  	if (host->dbclk)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Jan. 26, 2016, 11:14 p.m. UTC | #2
On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> Hi,
>
> Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind") broke PM on at least
> omap3. It seems we now need to additionally also call
> pm_runtime_dont_use_autosuspend() to get things working again?
>
> The following fixes idling on omap3, but I'm wondering if we
> should do something in pm_runtime_reinit() instead?

Well, does it actually help if you add
pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?

> 8< ---------------------
> --- a/drivers/mmc/host/omap_hsmmc.c
> +++ b/drivers/mmc/host/omap_hsmmc.c
> @@ -2232,6 +2232,7 @@ err_irq:
>                 dma_release_channel(host->tx_chan);
>         if (host->rx_chan)
>                 dma_release_channel(host->rx_chan);
> +       pm_runtime_dont_use_autosuspend(host->dev);
>         pm_runtime_put_sync(host->dev);
>         pm_runtime_disable(host->dev);
>         if (host->dbclk)
> --


Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Jan. 26, 2016, 11:22 p.m. UTC | #3
* Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> > Hi,
> >
> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> > PM states at probe error and driver unbind") broke PM on at least
> > omap3. It seems we now need to additionally also call
> > pm_runtime_dont_use_autosuspend() to get things working again?
> >
> > The following fixes idling on omap3, but I'm wondering if we
> > should do something in pm_runtime_reinit() instead?
> 
> Well, does it actually help if you add
> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?

No adding it to pm_runtime_reinit() does not help.

Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
gives any clues.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Jan. 26, 2016, 11:37 p.m. UTC | #4
On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
>> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
>> > Hi,
>> >
>> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> > PM states at probe error and driver unbind") broke PM on at least
>> > omap3. It seems we now need to additionally also call
>> > pm_runtime_dont_use_autosuspend() to get things working again?
>> >
>> > The following fixes idling on omap3, but I'm wondering if we
>> > should do something in pm_runtime_reinit() instead?
>>
>> Well, does it actually help if you add
>> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
>
> No adding it to pm_runtime_reinit() does not help.

Yes, I realized that it wouldn't help only after sending the previous
message, sorry about that.

The reason why it helps in the driver code seems to be that
autosuspend_delay happens to be negative, so update_autosuspend()
decrements the usage counter that would have stayed incremented
otherwise.  Or at least that's the only way it can help I see ATM.

> Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> gives any clues.

It looks like pm_runtime_reinit() should clear the usage counter too.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Jan. 26, 2016, 11:52 p.m. UTC | #5
* Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
> On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> >> > Hi,
> >> >
> >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> >> > PM states at probe error and driver unbind") broke PM on at least
> >> > omap3. It seems we now need to additionally also call
> >> > pm_runtime_dont_use_autosuspend() to get things working again?
> >> >
> >> > The following fixes idling on omap3, but I'm wondering if we
> >> > should do something in pm_runtime_reinit() instead?
> >>
> >> Well, does it actually help if you add
> >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
> >
> > No adding it to pm_runtime_reinit() does not help.
> 
> Yes, I realized that it wouldn't help only after sending the previous
> message, sorry about that.
>
> The reason why it helps in the driver code seems to be that
> autosuspend_delay happens to be negative, so update_autosuspend()
> decrements the usage counter that would have stayed incremented
> otherwise.  Or at least that's the only way it can help I see ATM.

Oh OK.

> > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> > gives any clues.
> 
> It looks like pm_runtime_reinit() should clear the usage counter too.

Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
safe to pretty much reset everything?

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Jan. 27, 2016, 7:54 a.m. UTC | #6
On Tuesday, January 26, 2016 03:52:23 PM Tony Lindgren wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
> > On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> > > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> > >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> > >> > Hi,
> > >> >
> > >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> > >> > PM states at probe error and driver unbind") broke PM on at least
> > >> > omap3. It seems we now need to additionally also call
> > >> > pm_runtime_dont_use_autosuspend() to get things working again?
> > >> >
> > >> > The following fixes idling on omap3, but I'm wondering if we
> > >> > should do something in pm_runtime_reinit() instead?
> > >>
> > >> Well, does it actually help if you add
> > >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
> > >
> > > No adding it to pm_runtime_reinit() does not help.
> > 
> > Yes, I realized that it wouldn't help only after sending the previous
> > message, sorry about that.
> >
> > The reason why it helps in the driver code seems to be that
> > autosuspend_delay happens to be negative, so update_autosuspend()
> > decrements the usage counter that would have stayed incremented
> > otherwise.  Or at least that's the only way it can help I see ATM.
> 
> Oh OK.
> 
> > > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> > > gives any clues.
> > 
> > It looks like pm_runtime_reinit() should clear the usage counter too.
> 
> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
> safe to pretty much reset everything?

Not only safe, but also a good idea apparently. :-)

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Jan. 27, 2016, 8:17 a.m. UTC | #7
On 27 January 2016 at 08:54, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Tuesday, January 26, 2016 03:52:23 PM Tony Lindgren wrote:
>> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
>> > On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
>> > > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
>> > >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
>> > >> > Hi,
>> > >> >
>> > >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> > >> > PM states at probe error and driver unbind") broke PM on at least
>> > >> > omap3. It seems we now need to additionally also call
>> > >> > pm_runtime_dont_use_autosuspend() to get things working again?
>> > >> >
>> > >> > The following fixes idling on omap3, but I'm wondering if we
>> > >> > should do something in pm_runtime_reinit() instead?
>> > >>
>> > >> Well, does it actually help if you add
>> > >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
>> > >
>> > > No adding it to pm_runtime_reinit() does not help.
>> >
>> > Yes, I realized that it wouldn't help only after sending the previous
>> > message, sorry about that.
>> >
>> > The reason why it helps in the driver code seems to be that
>> > autosuspend_delay happens to be negative, so update_autosuspend()
>> > decrements the usage counter that would have stayed incremented
>> > otherwise.  Or at least that's the only way it can help I see ATM.
>>
>> Oh OK.
>>
>> > > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
>> > > gives any clues.
>> >
>> > It looks like pm_runtime_reinit() should clear the usage counter too.
>>
>> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
>> safe to pretty much reset everything?
>
> Not only safe, but also a good idea apparently. :-)

I happy to send a patch, extending pm_runtime_reinit() with some more
data to be reset.

Or, you or Tony intend to send a patch?

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Jan. 27, 2016, 3:19 p.m. UTC | #8
* Ulf Hansson <ulf.hansson@linaro.org> [160127 00:18]:
> On 27 January 2016 at 08:54, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Tuesday, January 26, 2016 03:52:23 PM Tony Lindgren wrote:
> >> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
> >> safe to pretty much reset everything?
> >
> > Not only safe, but also a good idea apparently. :-)
> 
> I happy to send a patch, extending pm_runtime_reinit() with some more
> data to be reset.
> 
> Or, you or Tony intend to send a patch?

Please do, I'll be glad to test it!

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Jan. 27, 2016, 10:51 p.m. UTC | #9
On Wednesday, January 27, 2016 09:17:45 AM Ulf Hansson wrote:
> On 27 January 2016 at 08:54, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Tuesday, January 26, 2016 03:52:23 PM Tony Lindgren wrote:
> >> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
> >> > On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> >> > > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> >> > >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> >> > >> > Hi,
> >> > >> >
> >> > >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> >> > >> > PM states at probe error and driver unbind") broke PM on at least
> >> > >> > omap3. It seems we now need to additionally also call
> >> > >> > pm_runtime_dont_use_autosuspend() to get things working again?
> >> > >> >
> >> > >> > The following fixes idling on omap3, but I'm wondering if we
> >> > >> > should do something in pm_runtime_reinit() instead?
> >> > >>
> >> > >> Well, does it actually help if you add
> >> > >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
> >> > >
> >> > > No adding it to pm_runtime_reinit() does not help.
> >> >
> >> > Yes, I realized that it wouldn't help only after sending the previous
> >> > message, sorry about that.
> >> >
> >> > The reason why it helps in the driver code seems to be that
> >> > autosuspend_delay happens to be negative, so update_autosuspend()
> >> > decrements the usage counter that would have stayed incremented
> >> > otherwise.  Or at least that's the only way it can help I see ATM.
> >>
> >> Oh OK.
> >>
> >> > > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> >> > > gives any clues.
> >> >
> >> > It looks like pm_runtime_reinit() should clear the usage counter too.
> >>
> >> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
> >> safe to pretty much reset everything?
> >
> > Not only safe, but also a good idea apparently. :-)
> 
> I happy to send a patch, extending pm_runtime_reinit() with some more
> data to be reset.

Please do.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Jan. 28, 2016, 2:29 p.m. UTC | #10
On 27 January 2016 at 00:52, Tony Lindgren <tony@atomide.com> wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
>> On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
>> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
>> >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
>> >> > Hi,
>> >> >
>> >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> >> > PM states at probe error and driver unbind") broke PM on at least

I need to understand the issue here in a bit more detail, could you
please try to fill out some of my gaps!?

In *what way* did it break PM?

Did the driver not probe successfully the second try? If so, what happened.

>> >> > omap3. It seems we now need to additionally also call
>> >> > pm_runtime_dont_use_autosuspend() to get things working again?
>> >> >
>> >> > The following fixes idling on omap3, but I'm wondering if we
>> >> > should do something in pm_runtime_reinit() instead?

I understand this as the second (or third, forth, whatever) probing
attempt actually succeeds, right!?

Is the issue you are seeing, that the device never becomes runtime
suspended due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
PM states at probe error and driver unbind")?

>> >>
>> >> Well, does it actually help if you add
>> >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
>> >
>> > No adding it to pm_runtime_reinit() does not help.
>>
>> Yes, I realized that it wouldn't help only after sending the previous
>> message, sorry about that.
>>
>> The reason why it helps in the driver code seems to be that
>> autosuspend_delay happens to be negative, so update_autosuspend()
>> decrements the usage counter that would have stayed incremented
>> otherwise.  Or at least that's the only way it can help I see ATM.
>
> Oh OK.
>
>> > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
>> > gives any clues.

That's a good clue. Although, there could be several reasons to why.

Rafael has pointed out one valid potential case, but I want to be sure
that's really what happening here.

*If* the problem is that the device doesn't becomes runtime suspended,
that will anyway be prevented as long as the autosuspend_delay has
been set to a negative value. That's why I wonder whether this really
is the case here.

For omap3, I assume there's a PM domain (the so called hwmod) being
attached to the omap_hsmmc device at device registration point!?
In that path depending on a specific hwmod flag, the device will be
given the an initial *active* runtime PM status, via invoking
pm_runtime_set_active().

*If* that's the case, it affects the probing sequence, as the
pm_runtime_get_sync() won't trigger the ->runtime_resume() callbacks
to be invoked in the first probe attempt.

Moreover, according to the data I received in this regression report
so far, it seems like probing the second time has *always* been done
with the device in runtime PM active state. Could that be the reason
to why it "happens" to work?

>>
>> It looks like pm_runtime_reinit() should clear the usage counter too.
>
> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
> safe to pretty much reset everything?
>
> Regards,
>
> Tony

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Jan. 28, 2016, 4:58 p.m. UTC | #11
Hi,

* Ulf Hansson <ulf.hansson@linaro.org> [160128 06:30]:
> On 27 January 2016 at 00:52, Tony Lindgren <tony@atomide.com> wrote:
> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
> >> On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> >> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> >> >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> >> >> > Hi,
> >> >> >
> >> >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> >> >> > PM states at probe error and driver unbind") broke PM on at least
> 
> I need to understand the issue here in a bit more detail, could you
> please try to fill out some of my gaps!?
> 
> In *what way* did it break PM?

The MMC hardware will not get idled properly any longer blocking any
deeper idle states.

> Did the driver not probe successfully the second try? If so, what happened.

It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
But the PM runtime usecounts are wrong.

> >> >> > omap3. It seems we now need to additionally also call
> >> >> > pm_runtime_dont_use_autosuspend() to get things working again?
> >> >> >
> >> >> > The following fixes idling on omap3, but I'm wondering if we
> >> >> > should do something in pm_runtime_reinit() instead?
> 
> I understand this as the second (or third, forth, whatever) probing
> attempt actually succeeds, right!?

Right.

> Is the issue you are seeing, that the device never becomes runtime
> suspended due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind")?

Correct.

> >> > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> >> > gives any clues.
> 
> That's a good clue. Although, there could be several reasons to why.
> 
> Rafael has pointed out one valid potential case, but I want to be sure
> that's really what happening here.
> 
> *If* the problem is that the device doesn't becomes runtime suspended,
> that will anyway be prevented as long as the autosuspend_delay has
> been set to a negative value. That's why I wonder whether this really
> is the case here.

That seems to be the case here. In device driver probe, commenting out
pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY)
also makes things work again.

So one of the failing cases seems to be:

1. Device driver probe sets pm_runtime_set_autosuspend_delay()

2. Device driver probe initially fails with -EPROBE_DEFER

3. PM runtime usecounts get messed up

> For omap3, I assume there's a PM domain (the so called hwmod) being
> attached to the omap_hsmmc device at device registration point!?

Correct. It uses the notifiers like bus code does in general.

FYI, most SoCs don't have hardware based autoidle signaling between
the interconnect and the interconnect targets. So the hwmod code is
still needed until we have converted it into a proper interconnect +
module target drivers.

> In that path depending on a specific hwmod flag, the device will be
> given the an initial *active* runtime PM status, via invoking
> pm_runtime_set_active().
>
> *If* that's the case, it affects the probing sequence, as the
> pm_runtime_get_sync() won't trigger the ->runtime_resume() callbacks
> to be invoked in the first probe attempt.

It has worked since pm runtime. And it works with MMC as as a loadable
module just fine when no -EPROBE_DEFER happens.

> Moreover, according to the data I received in this regression report
> so far, it seems like probing the second time has *always* been done
> with the device in runtime PM active state. Could that be the reason
> to why it "happens" to work?

Not correct. I think that speculation is not related to the $subject
regression at all.

BTW, do you have some hardware to test with that has PM runtime
implemnted and actually working with mainline kernel?

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Feb. 1, 2016, 4:44 p.m. UTC | #12
On 28 January 2016 at 17:58, Tony Lindgren <tony@atomide.com> wrote:
> Hi,
>
> * Ulf Hansson <ulf.hansson@linaro.org> [160128 06:30]:
>> On 27 January 2016 at 00:52, Tony Lindgren <tony@atomide.com> wrote:
>> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
>> >> On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
>> >> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
>> >> >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> >> >> > PM states at probe error and driver unbind") broke PM on at least
>>
>> I need to understand the issue here in a bit more detail, could you
>> please try to fill out some of my gaps!?
>>
>> In *what way* did it break PM?
>
> The MMC hardware will not get idled properly any longer blocking any
> deeper idle states.
>
>> Did the driver not probe successfully the second try? If so, what happened.
>
> It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
> But the PM runtime usecounts are wrong.

Okay. How did you verify this?

I think the easiest way would be to make sure the usage count is
*one*, just before the omap_hsmmc calls mmc_add_host() in its
->probe() function.

If it's *two*, that confirms this theory.

>
>> >> >> > omap3. It seems we now need to additionally also call
>> >> >> > pm_runtime_dont_use_autosuspend() to get things working again?
>> >> >> >
>> >> >> > The following fixes idling on omap3, but I'm wondering if we
>> >> >> > should do something in pm_runtime_reinit() instead?
>>
>> I understand this as the second (or third, forth, whatever) probing
>> attempt actually succeeds, right!?
>
> Right.
>
>> Is the issue you are seeing, that the device never becomes runtime
>> suspended due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> PM states at probe error and driver unbind")?
>
> Correct.
>
>> >> > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
>> >> > gives any clues.
>>
>> That's a good clue. Although, there could be several reasons to why.
>>
>> Rafael has pointed out one valid potential case, but I want to be sure
>> that's really what happening here.
>>
>> *If* the problem is that the device doesn't becomes runtime suspended,
>> that will anyway be prevented as long as the autosuspend_delay has
>> been set to a negative value. That's why I wonder whether this really
>> is the case here.
>
> That seems to be the case here. In device driver probe, commenting out
> pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY)
> also makes things work again.

Okay. Interesting. :-)

>
> So one of the failing cases seems to be:
>
> 1. Device driver probe sets pm_runtime_set_autosuspend_delay()

Only when "someone" in-before has set the autosuspend delay to a
negative value, this will affect the usage count.

I didn't find any in-kernel user that would set this for the
omap_hsmmc device, so it needs to be done from userspace via sysfs. It
would be really great if you could help to confirm this.

But more importantly, I fail to understand why commit 5de85b9d57ab
("PM / runtime: Re-init runtime PM states at probe error and driver
unbind"), is changing the behaviour in this regards.
Potentially we might have a different runtime PM status than we had
before when probing the second try, but how could that mess up the
usage count...?

>
> 2. Device driver probe initially fails with -EPROBE_DEFER

So that happened before as well.

Although, we now get a different runtime PM status (suspended) the
second probe time.

>
> 3. PM runtime usecounts get messed up

Perhaps, but let's try to validate that as it should be rather easy.

>
>> For omap3, I assume there's a PM domain (the so called hwmod) being
>> attached to the omap_hsmmc device at device registration point!?
>
> Correct. It uses the notifiers like bus code does in general.

Yes, there perfectly okay.

Although, I wondering whether it could be that it's the PM domain
that's preventing the omap_hsmmc device from becoming runtime
suspended.

Perhaps the PM domain returns an error code from its
->runtime_suspend() callback?

>
> FYI, most SoCs don't have hardware based autoidle signaling between
> the interconnect and the interconnect targets. So the hwmod code is
> still needed until we have converted it into a proper interconnect +
> module target drivers.
>
>> In that path depending on a specific hwmod flag, the device will be
>> given the an initial *active* runtime PM status, via invoking
>> pm_runtime_set_active().
>>
>> *If* that's the case, it affects the probing sequence, as the
>> pm_runtime_get_sync() won't trigger the ->runtime_resume() callbacks
>> to be invoked in the first probe attempt.
>
> It has worked since pm runtime. And it works with MMC as as a loadable
> module just fine when no -EPROBE_DEFER happens.

Okay, so we definitely seem to have an issue with the changed runtime
PM status (suspended) the second probe time.

That's indeed caused by 5de85b9d57ab ("PM / runtime: Re-init runtime
PM states at probe error and driver unbind").

>
>> Moreover, according to the data I received in this regression report
>> so far, it seems like probing the second time has *always* been done
>> with the device in runtime PM active state. Could that be the reason
>> to why it "happens" to work?
>
> Not correct. I think that speculation is not related to the $subject
> regression at all.

I think it's important in this context, as it could affect how the PM
domain may treat the device. As I mentioned earlier.

For example due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
PM states at probe error and driver unbind"), the ->runtime_resume()
callback seems now to be called twice in a row, without in-between
having the ->runtime_suspend() callback being invoked. Does really the
PM domain cope with that correctly?

>
> BTW, do you have some hardware to test with that has PM runtime
> implemnted and actually working with mainline kernel?

Oh, yes. Although I don't have an omap3, I wish I had.

Also, I have locally a "runtime PM test driver", which helps me to
test various runtime PM sequences.

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Feb. 1, 2016, 6:11 p.m. UTC | #13
* Ulf Hansson <ulf.hansson@linaro.org> [160201 08:45]:
> On 28 January 2016 at 17:58, Tony Lindgren <tony@atomide.com> wrote:
> >
> > The MMC hardware will not get idled properly any longer blocking any
> > deeper idle states.
> >
> >> Did the driver not probe successfully the second try? If so, what happened.
> >
> > It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
> > But the PM runtime usecounts are wrong.
> 
> Okay. How did you verify this?

Well that was just based on what I see in the dmesg:

omap_device: omap_device_enable() called from invalid state 1

> I think the easiest way would be to make sure the usage count is
> *one*, just before the omap_hsmmc calls mmc_add_host() in its
> ->probe() function.
> 
> If it's *two*, that confirms this theory.

Here's with use count dumped for one MMC:

omap_hsmmc 4809c000.mmc: GPIO lookup for consumer cd
omap_hsmmc 4809c000.mmc: using device tree for GPIO lookup
omap_hsmmc 4809c000.mmc: Got CD GPIO
omap_hsmmc 4809c000.mmc: GPIO lookup for consumer wp
omap_hsmmc 4809c000.mmc: using device tree for GPIO lookup
omap_hsmmc 4809c000.mmc: using lookup tables for GPIO lookup
omap_hsmmc 4809c000.mmc: lookup for GPIO wp failed
omap_hsmmc 4809c000.mmc: GPIO lookup for consumer cd
omap_hsmmc 4809c000.mmc: using device tree for GPIO lookup
omap_hsmmc 4809c000.mmc: Got CD GPIO
omap_hsmmc 4809c000.mmc: GPIO lookup for consumer wp
omap_hsmmc 4809c000.mmc: using device tree for GPIO lookup
omap_hsmmc 4809c000.mmc: using lookup tables for GPIO lookup
omap_hsmmc 4809c000.mmc: lookup for GPIO wp failed
omap_hsmmc 4809c000.mmc: omap_device: omap_device_enable() called from invalid state 1
omap_hsmmc 4809c000.mmc: PM runtime use count: 0

So it seems you're right, it's the state not the count.

> > So one of the failing cases seems to be:
> >
> > 1. Device driver probe sets pm_runtime_set_autosuspend_delay()
> 
> Only when "someone" in-before has set the autosuspend delay to a
> negative value, this will affect the usage count.
> 
> I didn't find any in-kernel user that would set this for the
> omap_hsmmc device, so it needs to be done from userspace via sysfs. It
> would be really great if you could help to confirm this.

No nothing is setting that from userspace.

> But more importantly, I fail to understand why commit 5de85b9d57ab
> ("PM / runtime: Re-init runtime PM states at probe error and driver
> unbind"), is changing the behaviour in this regards.
> Potentially we might have a different runtime PM status than we had
> before when probing the second try, but how could that mess up the
> usage count...?

Yes I think you're right here, it's the state, not the count.

> Although, I wondering whether it could be that it's the PM domain
> that's preventing the omap_hsmmc device from becoming runtime
> suspended.
> 
> Perhaps the PM domain returns an error code from its
> ->runtime_suspend() callback?

We certainly see a warning there.

> Okay, so we definitely seem to have an issue with the changed runtime
> PM status (suspended) the second probe time.
> 
> That's indeed caused by 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind").

Yup.

> For example due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind"), the ->runtime_resume()
> callback seems now to be called twice in a row, without in-between
> having the ->runtime_suspend() callback being invoked. Does really the
> PM domain cope with that correctly?

That might explain the warning above.

> > BTW, do you have some hardware to test with that has PM runtime
> > implemnted and actually working with mainline kernel?
> 
> Oh, yes. Although I don't have an omap3, I wish I had.

OK good to hear. Anyways, getting an omap3 should be in tens of
whatever units if you need one to test with.

> Also, I have locally a "runtime PM test driver", which helps me to
> test various runtime PM sequences.

Now that would be good to have in the mainline kernel. Of course it
still is a very limited test.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -2232,6 +2232,7 @@  err_irq:
 		dma_release_channel(host->tx_chan);
 	if (host->rx_chan)
 		dma_release_channel(host->rx_chan);
+	pm_runtime_dont_use_autosuspend(host->dev);
 	pm_runtime_put_sync(host->dev);
 	pm_runtime_disable(host->dev);
 	if (host->dbclk)