diff mbox

[v2] PM / runtime: Rework pm_runtime_force_suspend/resume()

Message ID 2074722.H5Hd6orL2D@aspire.rjw.lan (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Rafael J. Wysocki Jan. 9, 2018, 4:28 p.m. UTC
On Tuesday, January 9, 2018 5:03:18 PM CET Rafael J. Wysocki wrote:
> On Tue, Jan 9, 2018 at 4:30 PM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > Hi Rafael,
> >
> > CC usb
> >
> > On Tue, Jan 9, 2018 at 4:00 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> >> On Tue, Jan 9, 2018 at 2:37 PM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >>> On Wed, Jan 3, 2018 at 12:06 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>>
> >>>> One of the limitations of pm_runtime_force_suspend/resume() is that
> >>>> if a parent driver wants to use these functions, all of its child
> >>>> drivers have to do that too because of the parent usage counter
> >>>> manipulations necessary to get the correct state of the parent during
> >>>> system-wide transitions to the working state (system resume).
> >>>> However, that limitation turns out to be artificial, so remove it.
> >>>>
> >>>> Namely, pm_runtime_force_suspend() only needs to update the children
> >>>> counter of its parent (if there's is a parent) when the device can
> >>>> stay in suspend after the subsequent system resume transition, as
> >>>> that counter is correct already otherwise.  Now, if the parent's
> >>>> children counter is not updated, it is not necessary to increment
> >>>> the parent's usage counter in that case any more, as long as the
> >>>> children counters of devices are checked along with their usage
> >>>> counters in order to decide whether or not the devices may be left
> >>>> in suspend after the subsequent system resume transition.
> >>>>
> >>>> Accordingly, modify pm_runtime_force_suspend() to only call
> >>>> pm_runtime_set_suspended() for devices whose usage and children
> >>>> counters are at the "no references" level (the runtime PM status
> >>>> of the device needs to be updated to "suspended" anyway in case
> >>>> this function is called once again for the same device during the
> >>>> transition under way), drop the parent usage counter incrementation
> >>>> from it and update pm_runtime_force_resume() to compensate for these
> >>>> changes.
> >>>>
> >>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>
> >>> This patch causes a regression during system resume on Renesas Salvator-XS
> >>> with R-Car H3 ES2.0:
> >>
> >> I have dropped it for now, but we need to address the underlying issue.
> >>
> >>>     SError Interrupt on CPU3, code 0xbf000002 -- SError
> >>>     SError Interrupt on CPU2, code 0xbf000002 -- SError
> >>>     CPU: 3 PID: 1769 Comm: kworker/u16:13 Not tainted
> >>> 4.15.0-rc7-arm64-renesas-05338-gf14cf570a813c9ca-dirty #97
> >>>     CPU: 2 PID: 1774 Comm: kworker/u16:18 Not tainted
> >>> 4.15.0-rc7-arm64-renesas-05338-gf14cf570a813c9ca-dirty #97
> >>>     Hardware name: Renesas Salvator-X 2nd version board based on
> >>> r8a7795 ES2.0+ (DT)
> >>>     Hardware name: Renesas Salvator-X 2nd version board based on
> >>> r8a7795 ES2.0+ (DT)
> >>>     Workqueue: events_unbound async_run_entry_fn
> >>>     Workqueue: events_unbound async_run_entry_fn
> >>>     pstate: 60000005 (nZCv daif -PAN -UAO)
> >>>     pstate: 60000005 (nZCv daif -PAN -UAO)
> >>>     pc : rcar_gen3_phy_usb2_init+0x34/0xf8
> >>>     pc : rcar_gen3_phy_usb2_init+0x34/0xf8
> >>>     lr : phy_init+0x64/0xcc
> >>>     lr : phy_init+0x64/0xcc
> >>>     ...
> >>>     Kernel panic - not syncing: Asynchronous SError Interrupt
> >>>
> >>> Note that before, it printed a warning instead:
> >>>
> >>>     Enabling runtime PM for inactive device (ee0a0200.usb-phy) with
> >>> active children
> >>>     WARNING: CPU: 0 PID: 1741 at drivers/base/power/runtime.c:1300
> >>> pm_runtime_enable+0x94/0xd8
> >>>
> >>> Reverting commit 0408584d580d4a2c ("PM / runtime: Rework
> >>> pm_runtime_force_suspend/resume()") fixes the crash.
> >>>
> >>> Note that applying Ulf's "[PATCH v2 0/3] phy: core: Re-work runtime PM
> >>> deployment and fix an issue"
> >>> (https://www.spinics.net/lists/linux-renesas-soc/msg21719.html) instead
> >>> does not fix the crash.
> >>
> >> OK
> >>
> >>> With more debug code added, it seems the EHCI module clocks (701-703) are
> >>> enabled earlier than before. I guess this triggers the workqueue to perform
> >>> an operation while another related device (HSUSB 704?) is still disabled?
> >>
> >> Probably.
> >>
> >> Likely a device that wasn't resumed before resumes now and that causes
> >> the issue to appear.
> >>
> >> I'm wondering if adding the ignore_children check to the patch helps.
> >> If not, there clearly is a resume ordering issue that is papered over
> >> by the current code.
> >
> > Something fishy is going on. Status of the USB PHYs differ before/after
> > system suspend, according to /sys/kernel/debug/pm_genpd/pm_genpd_summary:
> >
> > -    /devices/platform/soc/ee0a0200.usb-phy              active
> > -    /devices/platform/soc/ee0c0200.usb-phy              active
> > -    /devices/platform/soc/ee080200.usb-phy              active
> > +    /devices/platform/soc/ee0a0200.usb-phy              suspended
> > +    /devices/platform/soc/ee0c0200.usb-phy              suspended
> > +    /devices/platform/soc/ee080200.usb-phy              suspended
> 
> Yeah.
> 
> That's because of the pm_runtime_force_suspend/resume() called by
> genpd.  These functions generally may cause devices active before
> system suspend to be left in suspend after it.  That generally is a
> good idea if the device was not really in use before the system
> suspend, but there is a problem that the driver of it may not be
> prepared for that to happen (and also the way to determine the "not
> really in use" case may not be as expected by the driver).

So below is something to try (untested).

I know that Ulf will be protesting, but that's what I would do unless there
are really *really* good reasons not to.


---
 drivers/base/power/domain.c |   34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

Comments

Ulf Hansson Jan. 10, 2018, 9:26 a.m. UTC | #1
On 9 January 2018 at 17:28, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Tuesday, January 9, 2018 5:03:18 PM CET Rafael J. Wysocki wrote:
>> On Tue, Jan 9, 2018 at 4:30 PM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> > Hi Rafael,
>> >
>> > CC usb
>> >
>> > On Tue, Jan 9, 2018 at 4:00 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>> >> On Tue, Jan 9, 2018 at 2:37 PM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> >>> On Wed, Jan 3, 2018 at 12:06 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> >>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >>>>
>> >>>> One of the limitations of pm_runtime_force_suspend/resume() is that
>> >>>> if a parent driver wants to use these functions, all of its child
>> >>>> drivers have to do that too because of the parent usage counter
>> >>>> manipulations necessary to get the correct state of the parent during
>> >>>> system-wide transitions to the working state (system resume).
>> >>>> However, that limitation turns out to be artificial, so remove it.
>> >>>>
>> >>>> Namely, pm_runtime_force_suspend() only needs to update the children
>> >>>> counter of its parent (if there's is a parent) when the device can
>> >>>> stay in suspend after the subsequent system resume transition, as
>> >>>> that counter is correct already otherwise.  Now, if the parent's
>> >>>> children counter is not updated, it is not necessary to increment
>> >>>> the parent's usage counter in that case any more, as long as the
>> >>>> children counters of devices are checked along with their usage
>> >>>> counters in order to decide whether or not the devices may be left
>> >>>> in suspend after the subsequent system resume transition.
>> >>>>
>> >>>> Accordingly, modify pm_runtime_force_suspend() to only call
>> >>>> pm_runtime_set_suspended() for devices whose usage and children
>> >>>> counters are at the "no references" level (the runtime PM status
>> >>>> of the device needs to be updated to "suspended" anyway in case
>> >>>> this function is called once again for the same device during the
>> >>>> transition under way), drop the parent usage counter incrementation
>> >>>> from it and update pm_runtime_force_resume() to compensate for these
>> >>>> changes.
>> >>>>
>> >>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >>>
>> >>> This patch causes a regression during system resume on Renesas Salvator-XS
>> >>> with R-Car H3 ES2.0:
>> >>
>> >> I have dropped it for now, but we need to address the underlying issue.
>> >>
>> >>>     SError Interrupt on CPU3, code 0xbf000002 -- SError
>> >>>     SError Interrupt on CPU2, code 0xbf000002 -- SError
>> >>>     CPU: 3 PID: 1769 Comm: kworker/u16:13 Not tainted
>> >>> 4.15.0-rc7-arm64-renesas-05338-gf14cf570a813c9ca-dirty #97
>> >>>     CPU: 2 PID: 1774 Comm: kworker/u16:18 Not tainted
>> >>> 4.15.0-rc7-arm64-renesas-05338-gf14cf570a813c9ca-dirty #97
>> >>>     Hardware name: Renesas Salvator-X 2nd version board based on
>> >>> r8a7795 ES2.0+ (DT)
>> >>>     Hardware name: Renesas Salvator-X 2nd version board based on
>> >>> r8a7795 ES2.0+ (DT)
>> >>>     Workqueue: events_unbound async_run_entry_fn
>> >>>     Workqueue: events_unbound async_run_entry_fn
>> >>>     pstate: 60000005 (nZCv daif -PAN -UAO)
>> >>>     pstate: 60000005 (nZCv daif -PAN -UAO)
>> >>>     pc : rcar_gen3_phy_usb2_init+0x34/0xf8
>> >>>     pc : rcar_gen3_phy_usb2_init+0x34/0xf8
>> >>>     lr : phy_init+0x64/0xcc
>> >>>     lr : phy_init+0x64/0xcc
>> >>>     ...
>> >>>     Kernel panic - not syncing: Asynchronous SError Interrupt
>> >>>
>> >>> Note that before, it printed a warning instead:
>> >>>
>> >>>     Enabling runtime PM for inactive device (ee0a0200.usb-phy) with
>> >>> active children
>> >>>     WARNING: CPU: 0 PID: 1741 at drivers/base/power/runtime.c:1300
>> >>> pm_runtime_enable+0x94/0xd8
>> >>>
>> >>> Reverting commit 0408584d580d4a2c ("PM / runtime: Rework
>> >>> pm_runtime_force_suspend/resume()") fixes the crash.
>> >>>
>> >>> Note that applying Ulf's "[PATCH v2 0/3] phy: core: Re-work runtime PM
>> >>> deployment and fix an issue"
>> >>> (https://www.spinics.net/lists/linux-renesas-soc/msg21719.html) instead
>> >>> does not fix the crash.
>> >>
>> >> OK
>> >>
>> >>> With more debug code added, it seems the EHCI module clocks (701-703) are
>> >>> enabled earlier than before. I guess this triggers the workqueue to perform
>> >>> an operation while another related device (HSUSB 704?) is still disabled?
>> >>
>> >> Probably.
>> >>
>> >> Likely a device that wasn't resumed before resumes now and that causes
>> >> the issue to appear.
>> >>
>> >> I'm wondering if adding the ignore_children check to the patch helps.
>> >> If not, there clearly is a resume ordering issue that is papered over
>> >> by the current code.
>> >
>> > Something fishy is going on. Status of the USB PHYs differ before/after
>> > system suspend, according to /sys/kernel/debug/pm_genpd/pm_genpd_summary:
>> >
>> > -    /devices/platform/soc/ee0a0200.usb-phy              active
>> > -    /devices/platform/soc/ee0c0200.usb-phy              active
>> > -    /devices/platform/soc/ee080200.usb-phy              active
>> > +    /devices/platform/soc/ee0a0200.usb-phy              suspended
>> > +    /devices/platform/soc/ee0c0200.usb-phy              suspended
>> > +    /devices/platform/soc/ee080200.usb-phy              suspended
>>
>> Yeah.
>>
>> That's because of the pm_runtime_force_suspend/resume() called by
>> genpd.  These functions generally may cause devices active before
>> system suspend to be left in suspend after it.  That generally is a
>> good idea if the device was not really in use before the system
>> suspend, but there is a problem that the driver of it may not be
>> prepared for that to happen (and also the way to determine the "not
>> really in use" case may not be as expected by the driver).
>
> So below is something to try (untested).
>
> I know that Ulf will be protesting, but that's what I would do unless there
> are really *really* good reasons not to.

I am not protesting! :-)

Instead I think we are rather in agreement that we are concerned about
that genpd are invoking pm_runtime_force_suspend|resume().

>
>
> ---
>  drivers/base/power/domain.c |   34 +++++++++++++++++++++-------------
>  1 file changed, 21 insertions(+), 13 deletions(-)
>
> Index: linux-pm/drivers/base/power/domain.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/domain.c
> +++ linux-pm/drivers/base/power/domain.c
> @@ -1048,8 +1048,9 @@ static int genpd_finish_suspend(struct d
>         if (ret)
>                 return ret;
>
> -       if (genpd->dev_ops.stop && genpd->dev_ops.start) {
> -               ret = pm_runtime_force_suspend(dev);
> +       if (genpd->dev_ops.stop && genpd->dev_ops.start &&
> +           !pm_runtime_status_suspended(dev)) {
> +               ret = genpd_stop_dev(genpd, dev);

Something like this existed there before, but because of other
internal genpd reasons. It's an option but there are issues with it.

I think it's fragile because invoking the ->stop() callback may
trigger calls to "disable" resources (like clocks, pinctrls, etc) for
the device. Doing this at ->suspend_noirq() may be too late, as that
subsystem/driver for the resource(s) may already be system suspended.
This is the classic problem of suspending (and later resuming) devices
in in-correct order.

Today we deal with this, by drivers/subsystem assigning the right
level of callback, as well as making sure the "dpm_list" is sorted
correctly (yeah, we have device links as well).

The point is, we can go for this solution, but is it good enough?

Another option, is to simply to remove (and not replace with something
else) the calls to pm_runtime_force_ suspend|resume() from genpd. This
moves the entire responsibility to the driver, to put its device into
low power state during system suspend, but with the benefit of that it
can decide to use the correct level of suspend/resume callbacks.

No matter how we do it, changing this may introduce some potential
regressions from a power consumption point of view, however although
only for those SoCs that uses the ->start|stop() callbacks. To
mitigate these power regressions, some drivers needs to be fixed, but
that seems inevitable anyway, doesn't it?

[...]

Kind regards
Uffe
Rafael J. Wysocki Jan. 11, 2018, 12:46 a.m. UTC | #2
On Wed, Jan 10, 2018 at 10:26 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 9 January 2018 at 17:28, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> On Tuesday, January 9, 2018 5:03:18 PM CET Rafael J. Wysocki wrote:
>>> On Tue, Jan 9, 2018 at 4:30 PM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:

[cut]

>>> That's because of the pm_runtime_force_suspend/resume() called by
>>> genpd.  These functions generally may cause devices active before
>>> system suspend to be left in suspend after it.  That generally is a
>>> good idea if the device was not really in use before the system
>>> suspend, but there is a problem that the driver of it may not be
>>> prepared for that to happen (and also the way to determine the "not
>>> really in use" case may not be as expected by the driver).
>>
>> So below is something to try (untested).
>>
>> I know that Ulf will be protesting, but that's what I would do unless there
>> are really *really* good reasons not to.
>
> I am not protesting! :-)

OK

That makes things a lot easier in principle. :-)

> Instead I think we are rather in agreement that we are concerned about
> that genpd are invoking pm_runtime_force_suspend|resume().
>
>>
>>
>> ---
>>  drivers/base/power/domain.c |   34 +++++++++++++++++++++-------------
>>  1 file changed, 21 insertions(+), 13 deletions(-)
>>
>> Index: linux-pm/drivers/base/power/domain.c
>> ===================================================================
>> --- linux-pm.orig/drivers/base/power/domain.c
>> +++ linux-pm/drivers/base/power/domain.c
>> @@ -1048,8 +1048,9 @@ static int genpd_finish_suspend(struct d
>>         if (ret)
>>                 return ret;
>>
>> -       if (genpd->dev_ops.stop && genpd->dev_ops.start) {
>> -               ret = pm_runtime_force_suspend(dev);
>> +       if (genpd->dev_ops.stop && genpd->dev_ops.start &&
>> +           !pm_runtime_status_suspended(dev)) {
>> +               ret = genpd_stop_dev(genpd, dev);
>
> Something like this existed there before, but because of other
> internal genpd reasons. It's an option but there are issues with it.

OK

> I think it's fragile because invoking the ->stop() callback may
> trigger calls to "disable" resources (like clocks, pinctrls, etc) for
> the device. Doing this at ->suspend_noirq() may be too late, as that
> subsystem/driver for the resource(s) may already be system suspended.
> This is the classic problem of suspending (and later resuming) devices
> in in-correct order.

Well, this is what happens with the current code too.

I mean if unmodified genpd_finish_suspend() runs, it will call
pm_runtime_force_suspend() and that will check the device's runtime PM
status in the first place.  If that is "suspended", it will return
right away without doing anything (after disabling runtime PM, but
that is irrelevant here as runtime PM is already disabled).  In turn,
if the runtime PM status is "active", it will run
genpd_runtime_suspend() (as the callback) and that will run the
driver's runtime PM callback and the genpd_stop_dev() right after it.
Thus, if calling genpd_stop_dev() at that point is problematic, it is
problematic already today.

What the patch does is essentially to drop the driver's runtime PM
callback (should not be necessary) and the domain power off (certainly
not necessary) from that path, but to keep the genpd_stop_dev()
invocation that may be necessary in principle (the device is "active",
so genpd_runtime_suspend() has not run for it, so genpd_stop_dev() has
not been called for it yet, but it may be good to stop the clocks
before powering off the domain).

The resume part is basically running genpd_start_dev() for the devices
for which genpd_stop_dev() was run by genpd_finish_suspend(), which is
because the subsequent driver callbacks may expect the clocks to be
enabled.

> Today we deal with this, by drivers/subsystem assigning the right
> level of callback, as well as making sure the "dpm_list" is sorted
> correctly (yeah, we have device links as well).
>
> The point is, we can go for this solution, but is it good enough?

I would like to treat it as a step away from what is there today,
retaining some of the existing functionality.

From a quick look at the existing users of genpd that use the device
start/stop functionality, running genpd_stop_dev()/genpd_start_dev()
in the "noirq" phases should not be problematic for them at least.

> Another option, is to simply to remove (and not replace with something
> else) the calls to pm_runtime_force_ suspend|resume() from genpd.

Right.

> This moves the entire responsibility to the driver, to put its device into
> low power state during system suspend, but with the benefit of that it
> can decide to use the correct level of suspend/resume callbacks.

Drivers of the devices in the "stop/start" domains will have to use
pm_runtime_force_ suspend|resume() internally then to benefit from the
domain's management of clocks, but that just may be the only way to
avoid more problems.

> No matter how we do it, changing this may introduce some potential
> regressions from a power consumption point of view, however although
> only for those SoCs that uses the ->start|stop() callbacks. To
> mitigate these power regressions, some drivers needs to be fixed, but
> that seems inevitable anyway, doesn't it?

Yes, it does.

I would say let's try to go with this patch of mine as submitted and
see how far we can get with that.

That essentially doesn't prevent us from dropping the
genpd_stop_dev()/genpd_start_dev() from the "noirq" stages if need be
after all, but dropping them right away may cause the fallout to be
more dramatic, at least in theory.

Thanks,
Rafael
Ulf Hansson Jan. 11, 2018, 12:32 p.m. UTC | #3
[...]

>>> Index: linux-pm/drivers/base/power/domain.c
>>> ===================================================================
>>> --- linux-pm.orig/drivers/base/power/domain.c
>>> +++ linux-pm/drivers/base/power/domain.c
>>> @@ -1048,8 +1048,9 @@ static int genpd_finish_suspend(struct d
>>>         if (ret)
>>>                 return ret;
>>>
>>> -       if (genpd->dev_ops.stop && genpd->dev_ops.start) {
>>> -               ret = pm_runtime_force_suspend(dev);
>>> +       if (genpd->dev_ops.stop && genpd->dev_ops.start &&
>>> +           !pm_runtime_status_suspended(dev)) {
>>> +               ret = genpd_stop_dev(genpd, dev);
>>
>> Something like this existed there before, but because of other
>> internal genpd reasons. It's an option but there are issues with it.
>
> OK
>
>> I think it's fragile because invoking the ->stop() callback may
>> trigger calls to "disable" resources (like clocks, pinctrls, etc) for
>> the device. Doing this at ->suspend_noirq() may be too late, as that
>> subsystem/driver for the resource(s) may already be system suspended.
>> This is the classic problem of suspending (and later resuming) devices
>> in in-correct order.
>
> Well, this is what happens with the current code too.

Correct.

>
> I mean if unmodified genpd_finish_suspend() runs, it will call
> pm_runtime_force_suspend() and that will check the device's runtime PM
> status in the first place.  If that is "suspended", it will return
> right away without doing anything (after disabling runtime PM, but
> that is irrelevant here as runtime PM is already disabled).  In turn,
> if the runtime PM status is "active", it will run
> genpd_runtime_suspend() (as the callback) and that will run the
> driver's runtime PM callback and the genpd_stop_dev() right after it.
> Thus, if calling genpd_stop_dev() at that point is problematic, it is
> problematic already today.
>
> What the patch does is essentially to drop the driver's runtime PM
> callback (should not be necessary) and the domain power off (certainly
> not necessary) from that path, but to keep the genpd_stop_dev()
> invocation that may be necessary in principle (the device is "active",
> so genpd_runtime_suspend() has not run for it, so genpd_stop_dev() has
> not been called for it yet, but it may be good to stop the clocks
> before powering off the domain).

I understand your suggested approach and it may work.

>
> The resume part is basically running genpd_start_dev() for the devices
> for which genpd_stop_dev() was run by genpd_finish_suspend(), which is
> because the subsequent driver callbacks may expect the clocks to be
> enabled.
>
>> Today we deal with this, by drivers/subsystem assigning the right
>> level of callback, as well as making sure the "dpm_list" is sorted
>> correctly (yeah, we have device links as well).
>>
>> The point is, we can go for this solution, but is it good enough?
>
> I would like to treat it as a step away from what is there today,
> retaining some of the existing functionality.
>
> From a quick look at the existing users of genpd that use the device
> start/stop functionality, running genpd_stop_dev()/genpd_start_dev()
> in the "noirq" phases should not be problematic for them at least.

I guess so.

Still, the error report by Geert worries me, but I don't think it
worth to speculate, but rather test and see.

>
>> Another option, is to simply to remove (and not replace with something
>> else) the calls to pm_runtime_force_ suspend|resume() from genpd.
>
> Right.
>
>> This moves the entire responsibility to the driver, to put its device into
>> low power state during system suspend, but with the benefit of that it
>> can decide to use the correct level of suspend/resume callbacks.
>
> Drivers of the devices in the "stop/start" domains will have to use
> pm_runtime_force_ suspend|resume() internally then to benefit from the
> domain's management of clocks, but that just may be the only way to
> avoid more problems.

Agree.

>
>> No matter how we do it, changing this may introduce some potential
>> regressions from a power consumption point of view, however although
>> only for those SoCs that uses the ->start|stop() callbacks. To
>> mitigate these power regressions, some drivers needs to be fixed, but
>> that seems inevitable anyway, doesn't it?
>
> Yes, it does.
>
> I would say let's try to go with this patch of mine as submitted and
> see how far we can get with that.
>
> That essentially doesn't prevent us from dropping the
> genpd_stop_dev()/genpd_start_dev() from the "noirq" stages if need be
> after all, but dropping them right away may cause the fallout to be
> more dramatic, at least in theory.
>

Well, my guesses is that would be a step to make the behavior a bit
more deterministic and perhaps less fragile than today.

On the other hand, I am also concerned about the future users of the
->stop|start() callbacks, including related drivers dealing with the
affected devices. That in a sense, that converting the code in genpd
to what you suggest, would still not encourage related drivers to
behave correctly during system suspend/resume. Then down the road,
when additional new users of the ->stop|start() callbacks may have
been added, it may be even harder to drop calling them in from the
noirq phases.

So to conclude, I would prefer to drop calling
pm_runtime_force_suspend|resume() from genpd, without a replacement,
but I am willing to accept also your suggested alternative.

Kind regards
Uffe
Rafael J. Wysocki Jan. 11, 2018, 6:44 p.m. UTC | #4
On Thu, Jan 11, 2018 at 1:32 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:

[cut]

>>>
>>> The point is, we can go for this solution, but is it good enough?
>>
>> I would like to treat it as a step away from what is there today,
>> retaining some of the existing functionality.
>>
>> From a quick look at the existing users of genpd that use the device
>> start/stop functionality, running genpd_stop_dev()/genpd_start_dev()
>> in the "noirq" phases should not be problematic for them at least.
>
> I guess so.
>
> Still, the error report by Geert worries me, but I don't think it
> worth to speculate, but rather test and see.

Agreed.

>>> Another option, is to simply to remove (and not replace with something
>>> else) the calls to pm_runtime_force_ suspend|resume() from genpd.
>>
>> Right.
>>
>>> This moves the entire responsibility to the driver, to put its device into
>>> low power state during system suspend, but with the benefit of that it
>>> can decide to use the correct level of suspend/resume callbacks.
>>
>> Drivers of the devices in the "stop/start" domains will have to use
>> pm_runtime_force_ suspend|resume() internally then to benefit from the
>> domain's management of clocks, but that just may be the only way to
>> avoid more problems.
>
> Agree.

But this doesn't look particularly attractive to me, because it would
add a requirement for drivers to implement their system-wide PM
callbacks in a specific way and, in my view, no such requirements
should be imposed by code that advertises itself as "generic" (by the
name at least).  In particular, drivers written to this requirement
might not be able to work correctly with other middle-layer code.

If genpd was a bus type, that wouldn't be particularly objectionable,
even though it is not particularly straightforward either, because
drivers register for a particular bus type and so they need to take
its specifics into account.  Drivers usually don't register for a
specific PM domain, however, and at least some of them may need to
work with different PM domain code on different platforms.

>>> No matter how we do it, changing this may introduce some potential
>>> regressions from a power consumption point of view, however although
>>> only for those SoCs that uses the ->start|stop() callbacks. To
>>> mitigate these power regressions, some drivers needs to be fixed, but
>>> that seems inevitable anyway, doesn't it?
>>
>> Yes, it does.
>>
>> I would say let's try to go with this patch of mine as submitted and
>> see how far we can get with that.
>>
>> That essentially doesn't prevent us from dropping the
>> genpd_stop_dev()/genpd_start_dev() from the "noirq" stages if need be
>> after all, but dropping them right away may cause the fallout to be
>> more dramatic, at least in theory.
>>
>
> Well, my guesses is that would be a step to make the behavior a bit
> more deterministic and perhaps less fragile than today.
>
> On the other hand, I am also concerned about the future users of the
> ->stop|start() callbacks, including related drivers dealing with the
> affected devices. That in a sense, that converting the code in genpd
> to what you suggest, would still not encourage related drivers to
> behave correctly during system suspend/resume.

I'm not sure what you mean here.

Drivers in principle should not need to worry about the ->stop|start()
part, because that belongs to a different code layer.

They only need to be able to expect certain things (like clocks) to be
there (or in a specific state) when their code is running in certain
cases.

Now, I realize that this is pure theory and that in practice, if
->stop|start() are there, drivers working with genpd are probably
platform-specific anyway, more or less, so it is not likely that they
will ever have to work with any other middle-layer code.

>Then down the road, when additional new users of the ->stop|start() callbacks
> may have been added, it may be even harder to drop calling them in from the
> noirq phases.
>
> So to conclude, I would prefer to drop calling
> pm_runtime_force_suspend|resume() from genpd, without a replacement,
> but I am willing to accept also your suggested alternative.

OK

Thanks,
Rafael
Geert Uytterhoeven Jan. 12, 2018, 11:26 a.m. UTC | #5
Hi Rafael,

On Tue, Jan 9, 2018 at 5:28 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Tuesday, January 9, 2018 5:03:18 PM CET Rafael J. Wysocki wrote:
>> On Tue, Jan 9, 2018 at 4:30 PM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> > On Tue, Jan 9, 2018 at 4:00 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>> >> On Tue, Jan 9, 2018 at 2:37 PM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> >>> On Wed, Jan 3, 2018 at 12:06 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> >>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >>>>
>> >>>> One of the limitations of pm_runtime_force_suspend/resume() is that
>> >>>> if a parent driver wants to use these functions, all of its child
>> >>>> drivers have to do that too because of the parent usage counter
>> >>>> manipulations necessary to get the correct state of the parent during
>> >>>> system-wide transitions to the working state (system resume).
>> >>>> However, that limitation turns out to be artificial, so remove it.
>> >>>>
>> >>>> Namely, pm_runtime_force_suspend() only needs to update the children
>> >>>> counter of its parent (if there's is a parent) when the device can
>> >>>> stay in suspend after the subsequent system resume transition, as
>> >>>> that counter is correct already otherwise.  Now, if the parent's
>> >>>> children counter is not updated, it is not necessary to increment
>> >>>> the parent's usage counter in that case any more, as long as the
>> >>>> children counters of devices are checked along with their usage
>> >>>> counters in order to decide whether or not the devices may be left
>> >>>> in suspend after the subsequent system resume transition.
>> >>>>
>> >>>> Accordingly, modify pm_runtime_force_suspend() to only call
>> >>>> pm_runtime_set_suspended() for devices whose usage and children
>> >>>> counters are at the "no references" level (the runtime PM status
>> >>>> of the device needs to be updated to "suspended" anyway in case
>> >>>> this function is called once again for the same device during the
>> >>>> transition under way), drop the parent usage counter incrementation
>> >>>> from it and update pm_runtime_force_resume() to compensate for these
>> >>>> changes.
>> >>>>
>> >>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >>>
>> >>> This patch causes a regression during system resume on Renesas Salvator-XS
>> >>> with R-Car H3 ES2.0:
>> >>
>> >> I have dropped it for now, but we need to address the underlying issue.
>> >>
>> >>>     SError Interrupt on CPU3, code 0xbf000002 -- SError
>> >>>     SError Interrupt on CPU2, code 0xbf000002 -- SError
>> >>>     CPU: 3 PID: 1769 Comm: kworker/u16:13 Not tainted
>> >>> 4.15.0-rc7-arm64-renesas-05338-gf14cf570a813c9ca-dirty #97
>> >>>     CPU: 2 PID: 1774 Comm: kworker/u16:18 Not tainted
>> >>> 4.15.0-rc7-arm64-renesas-05338-gf14cf570a813c9ca-dirty #97
>> >>>     Hardware name: Renesas Salvator-X 2nd version board based on
>> >>> r8a7795 ES2.0+ (DT)
>> >>>     Hardware name: Renesas Salvator-X 2nd version board based on
>> >>> r8a7795 ES2.0+ (DT)
>> >>>     Workqueue: events_unbound async_run_entry_fn
>> >>>     Workqueue: events_unbound async_run_entry_fn
>> >>>     pstate: 60000005 (nZCv daif -PAN -UAO)
>> >>>     pstate: 60000005 (nZCv daif -PAN -UAO)
>> >>>     pc : rcar_gen3_phy_usb2_init+0x34/0xf8
>> >>>     pc : rcar_gen3_phy_usb2_init+0x34/0xf8
>> >>>     lr : phy_init+0x64/0xcc
>> >>>     lr : phy_init+0x64/0xcc
>> >>>     ...
>> >>>     Kernel panic - not syncing: Asynchronous SError Interrupt
>> >>>
>> >>> Note that before, it printed a warning instead:
>> >>>
>> >>>     Enabling runtime PM for inactive device (ee0a0200.usb-phy) with
>> >>> active children
>> >>>     WARNING: CPU: 0 PID: 1741 at drivers/base/power/runtime.c:1300
>> >>> pm_runtime_enable+0x94/0xd8
>> >>>
>> >>> Reverting commit 0408584d580d4a2c ("PM / runtime: Rework
>> >>> pm_runtime_force_suspend/resume()") fixes the crash.
>> >>>
>> >>> Note that applying Ulf's "[PATCH v2 0/3] phy: core: Re-work runtime PM
>> >>> deployment and fix an issue"
>> >>> (https://www.spinics.net/lists/linux-renesas-soc/msg21719.html) instead
>> >>> does not fix the crash.
>> >>
>> >> OK
>> >>
>> >>> With more debug code added, it seems the EHCI module clocks (701-703) are
>> >>> enabled earlier than before. I guess this triggers the workqueue to perform
>> >>> an operation while another related device (HSUSB 704?) is still disabled?
>> >>
>> >> Probably.
>> >>
>> >> Likely a device that wasn't resumed before resumes now and that causes
>> >> the issue to appear.
>> >>
>> >> I'm wondering if adding the ignore_children check to the patch helps.
>> >> If not, there clearly is a resume ordering issue that is papered over
>> >> by the current code.
>> >
>> > Something fishy is going on. Status of the USB PHYs differ before/after
>> > system suspend, according to /sys/kernel/debug/pm_genpd/pm_genpd_summary:
>> >
>> > -    /devices/platform/soc/ee0a0200.usb-phy              active
>> > -    /devices/platform/soc/ee0c0200.usb-phy              active
>> > -    /devices/platform/soc/ee080200.usb-phy              active
>> > +    /devices/platform/soc/ee0a0200.usb-phy              suspended
>> > +    /devices/platform/soc/ee0c0200.usb-phy              suspended
>> > +    /devices/platform/soc/ee080200.usb-phy              suspended
>>
>> Yeah.
>>
>> That's because of the pm_runtime_force_suspend/resume() called by
>> genpd.  These functions generally may cause devices active before
>> system suspend to be left in suspend after it.  That generally is a
>> good idea if the device was not really in use before the system
>> suspend, but there is a problem that the driver of it may not be
>> prepared for that to happen (and also the way to determine the "not
>> really in use" case may not be as expected by the driver).
>
> So below is something to try (untested).
>
> I know that Ulf will be protesting, but that's what I would do unless there
> are really *really* good reasons not to.

With the below, the issue is gone, both with and without the
dev->power.ignore_children check.

> --- linux-pm.orig/drivers/base/power/domain.c
> +++ linux-pm/drivers/base/power/domain.c
> @@ -1048,8 +1048,9 @@ static int genpd_finish_suspend(struct d
>         if (ret)
>                 return ret;
>
> -       if (genpd->dev_ops.stop && genpd->dev_ops.start) {
> -               ret = pm_runtime_force_suspend(dev);
> +       if (genpd->dev_ops.stop && genpd->dev_ops.start &&
> +           !pm_runtime_status_suspended(dev)) {
> +               ret = genpd_stop_dev(genpd, dev);
>                 if (ret)
>                         return ret;
>         }
> @@ -1101,8 +1102,12 @@ static int genpd_resume_noirq(struct dev
>         genpd->suspended_count--;
>         genpd_unlock(genpd);
>
> -       if (genpd->dev_ops.stop && genpd->dev_ops.start)
> -               ret = pm_runtime_force_resume(dev);
> +       if (genpd->dev_ops.stop && genpd->dev_ops.start &&
> +           !pm_runtime_status_suspended(dev)) {
> +               ret = genpd_start_dev(genpd, dev);
> +               if (ret)
> +                       return ret;
> +       }
>
>         ret = pm_generic_resume_noirq(dev);
>         if (ret)
> @@ -1123,7 +1128,7 @@ static int genpd_resume_noirq(struct dev
>  static int genpd_freeze_noirq(struct device *dev)
>  {
>         const struct generic_pm_domain *genpd;
> -       int ret = 0;
> +       int ret;
>
>         dev_dbg(dev, "%s()\n", __func__);
>
> @@ -1135,8 +1140,9 @@ static int genpd_freeze_noirq(struct dev
>         if (ret)
>                 return ret;
>
> -       if (genpd->dev_ops.stop && genpd->dev_ops.start)
> -               ret = pm_runtime_force_suspend(dev);
> +       if (genpd->dev_ops.stop && genpd->dev_ops.start &&
> +           !pm_runtime_status_suspended(dev))
> +               ret = genpd_stop_dev(genpd, dev);
>
>         return ret;
>  }
> @@ -1151,7 +1157,7 @@ static int genpd_freeze_noirq(struct dev
>  static int genpd_thaw_noirq(struct device *dev)
>  {
>         const struct generic_pm_domain *genpd;
> -       int ret = 0;
> +       int ret;
>
>         dev_dbg(dev, "%s()\n", __func__);
>
> @@ -1159,8 +1165,9 @@ static int genpd_thaw_noirq(struct devic
>         if (IS_ERR(genpd))
>                 return -EINVAL;
>
> -       if (genpd->dev_ops.stop && genpd->dev_ops.start) {
> -               ret = pm_runtime_force_resume(dev);
> +       if (genpd->dev_ops.stop && genpd->dev_ops.start &&
> +           !pm_runtime_status_suspended(dev)) {
> +               ret = genpd_start_dev(genpd, dev);
>                 if (ret)
>                         return ret;
>         }
> @@ -1193,7 +1200,7 @@ static int genpd_poweroff_noirq(struct d
>  static int genpd_restore_noirq(struct device *dev)
>  {
>         struct generic_pm_domain *genpd;
> -       int ret = 0;
> +       int ret;
>
>         dev_dbg(dev, "%s()\n", __func__);
>
> @@ -1217,8 +1224,9 @@ static int genpd_restore_noirq(struct de
>         genpd_sync_power_on(genpd, true, 0);
>         genpd_unlock(genpd);
>
> -       if (genpd->dev_ops.stop && genpd->dev_ops.start) {
> -               ret = pm_runtime_force_resume(dev);
> +       if (genpd->dev_ops.stop && genpd->dev_ops.start &&
> +           !pm_runtime_status_suspended(dev)) {
> +               ret = genpd_start_dev(genpd, dev);
>                 if (ret)
>                         return ret;
>         }

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Rafael J. Wysocki Jan. 12, 2018, 12:09 p.m. UTC | #6
On Friday, January 12, 2018 12:26:56 PM CET Geert Uytterhoeven wrote:
> Hi Rafael,
> 
> On Tue, Jan 9, 2018 at 5:28 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Tuesday, January 9, 2018 5:03:18 PM CET Rafael J. Wysocki wrote:
> >> On Tue, Jan 9, 2018 at 4:30 PM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >> > On Tue, Jan 9, 2018 at 4:00 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> >> >> On Tue, Jan 9, 2018 at 2:37 PM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >> >>> On Wed, Jan 3, 2018 at 12:06 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >> >>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> >>>>
> >> >>>> One of the limitations of pm_runtime_force_suspend/resume() is that
> >> >>>> if a parent driver wants to use these functions, all of its child
> >> >>>> drivers have to do that too because of the parent usage counter
> >> >>>> manipulations necessary to get the correct state of the parent during
> >> >>>> system-wide transitions to the working state (system resume).
> >> >>>> However, that limitation turns out to be artificial, so remove it.
> >> >>>>
> >> >>>> Namely, pm_runtime_force_suspend() only needs to update the children
> >> >>>> counter of its parent (if there's is a parent) when the device can
> >> >>>> stay in suspend after the subsequent system resume transition, as
> >> >>>> that counter is correct already otherwise.  Now, if the parent's
> >> >>>> children counter is not updated, it is not necessary to increment
> >> >>>> the parent's usage counter in that case any more, as long as the
> >> >>>> children counters of devices are checked along with their usage
> >> >>>> counters in order to decide whether or not the devices may be left
> >> >>>> in suspend after the subsequent system resume transition.
> >> >>>>
> >> >>>> Accordingly, modify pm_runtime_force_suspend() to only call
> >> >>>> pm_runtime_set_suspended() for devices whose usage and children
> >> >>>> counters are at the "no references" level (the runtime PM status
> >> >>>> of the device needs to be updated to "suspended" anyway in case
> >> >>>> this function is called once again for the same device during the
> >> >>>> transition under way), drop the parent usage counter incrementation
> >> >>>> from it and update pm_runtime_force_resume() to compensate for these
> >> >>>> changes.
> >> >>>>
> >> >>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> >>>
> >> >>> This patch causes a regression during system resume on Renesas Salvator-XS
> >> >>> with R-Car H3 ES2.0:
> >> >>
> >> >> I have dropped it for now, but we need to address the underlying issue.
> >> >>
> >> >>>     SError Interrupt on CPU3, code 0xbf000002 -- SError
> >> >>>     SError Interrupt on CPU2, code 0xbf000002 -- SError
> >> >>>     CPU: 3 PID: 1769 Comm: kworker/u16:13 Not tainted
> >> >>> 4.15.0-rc7-arm64-renesas-05338-gf14cf570a813c9ca-dirty #97
> >> >>>     CPU: 2 PID: 1774 Comm: kworker/u16:18 Not tainted
> >> >>> 4.15.0-rc7-arm64-renesas-05338-gf14cf570a813c9ca-dirty #97
> >> >>>     Hardware name: Renesas Salvator-X 2nd version board based on
> >> >>> r8a7795 ES2.0+ (DT)
> >> >>>     Hardware name: Renesas Salvator-X 2nd version board based on
> >> >>> r8a7795 ES2.0+ (DT)
> >> >>>     Workqueue: events_unbound async_run_entry_fn
> >> >>>     Workqueue: events_unbound async_run_entry_fn
> >> >>>     pstate: 60000005 (nZCv daif -PAN -UAO)
> >> >>>     pstate: 60000005 (nZCv daif -PAN -UAO)
> >> >>>     pc : rcar_gen3_phy_usb2_init+0x34/0xf8
> >> >>>     pc : rcar_gen3_phy_usb2_init+0x34/0xf8
> >> >>>     lr : phy_init+0x64/0xcc
> >> >>>     lr : phy_init+0x64/0xcc
> >> >>>     ...
> >> >>>     Kernel panic - not syncing: Asynchronous SError Interrupt
> >> >>>
> >> >>> Note that before, it printed a warning instead:
> >> >>>
> >> >>>     Enabling runtime PM for inactive device (ee0a0200.usb-phy) with
> >> >>> active children
> >> >>>     WARNING: CPU: 0 PID: 1741 at drivers/base/power/runtime.c:1300
> >> >>> pm_runtime_enable+0x94/0xd8
> >> >>>
> >> >>> Reverting commit 0408584d580d4a2c ("PM / runtime: Rework
> >> >>> pm_runtime_force_suspend/resume()") fixes the crash.
> >> >>>
> >> >>> Note that applying Ulf's "[PATCH v2 0/3] phy: core: Re-work runtime PM
> >> >>> deployment and fix an issue"
> >> >>> (https://www.spinics.net/lists/linux-renesas-soc/msg21719.html) instead
> >> >>> does not fix the crash.
> >> >>
> >> >> OK
> >> >>
> >> >>> With more debug code added, it seems the EHCI module clocks (701-703) are
> >> >>> enabled earlier than before. I guess this triggers the workqueue to perform
> >> >>> an operation while another related device (HSUSB 704?) is still disabled?
> >> >>
> >> >> Probably.
> >> >>
> >> >> Likely a device that wasn't resumed before resumes now and that causes
> >> >> the issue to appear.
> >> >>
> >> >> I'm wondering if adding the ignore_children check to the patch helps.
> >> >> If not, there clearly is a resume ordering issue that is papered over
> >> >> by the current code.
> >> >
> >> > Something fishy is going on. Status of the USB PHYs differ before/after
> >> > system suspend, according to /sys/kernel/debug/pm_genpd/pm_genpd_summary:
> >> >
> >> > -    /devices/platform/soc/ee0a0200.usb-phy              active
> >> > -    /devices/platform/soc/ee0c0200.usb-phy              active
> >> > -    /devices/platform/soc/ee080200.usb-phy              active
> >> > +    /devices/platform/soc/ee0a0200.usb-phy              suspended
> >> > +    /devices/platform/soc/ee0c0200.usb-phy              suspended
> >> > +    /devices/platform/soc/ee080200.usb-phy              suspended
> >>
> >> Yeah.
> >>
> >> That's because of the pm_runtime_force_suspend/resume() called by
> >> genpd.  These functions generally may cause devices active before
> >> system suspend to be left in suspend after it.  That generally is a
> >> good idea if the device was not really in use before the system
> >> suspend, but there is a problem that the driver of it may not be
> >> prepared for that to happen (and also the way to determine the "not
> >> really in use" case may not be as expected by the driver).
> >
> > So below is something to try (untested).
> >
> > I know that Ulf will be protesting, but that's what I would do unless there
> > are really *really* good reasons not to.
> 
> With the below, the issue is gone,

Cool, thanks for testing!

> both with and without the dev->power.ignore_children check.

Yes, that check is irrelevant here.

Thanks,
Rafael
diff mbox

Patch

Index: linux-pm/drivers/base/power/domain.c
===================================================================
--- linux-pm.orig/drivers/base/power/domain.c
+++ linux-pm/drivers/base/power/domain.c
@@ -1048,8 +1048,9 @@  static int genpd_finish_suspend(struct d
 	if (ret)
 		return ret;
 
-	if (genpd->dev_ops.stop && genpd->dev_ops.start) {
-		ret = pm_runtime_force_suspend(dev);
+	if (genpd->dev_ops.stop && genpd->dev_ops.start &&
+	    !pm_runtime_status_suspended(dev)) {
+		ret = genpd_stop_dev(genpd, dev);
 		if (ret)
 			return ret;
 	}
@@ -1101,8 +1102,12 @@  static int genpd_resume_noirq(struct dev
 	genpd->suspended_count--;
 	genpd_unlock(genpd);
 
-	if (genpd->dev_ops.stop && genpd->dev_ops.start)
-		ret = pm_runtime_force_resume(dev);
+	if (genpd->dev_ops.stop && genpd->dev_ops.start &&
+	    !pm_runtime_status_suspended(dev)) {
+		ret = genpd_start_dev(genpd, dev);
+		if (ret)
+			return ret;
+	}
 
 	ret = pm_generic_resume_noirq(dev);
 	if (ret)
@@ -1123,7 +1128,7 @@  static int genpd_resume_noirq(struct dev
 static int genpd_freeze_noirq(struct device *dev)
 {
 	const struct generic_pm_domain *genpd;
-	int ret = 0;
+	int ret;
 
 	dev_dbg(dev, "%s()\n", __func__);
 
@@ -1135,8 +1140,9 @@  static int genpd_freeze_noirq(struct dev
 	if (ret)
 		return ret;
 
-	if (genpd->dev_ops.stop && genpd->dev_ops.start)
-		ret = pm_runtime_force_suspend(dev);
+	if (genpd->dev_ops.stop && genpd->dev_ops.start &&
+	    !pm_runtime_status_suspended(dev))
+		ret = genpd_stop_dev(genpd, dev);
 
 	return ret;
 }
@@ -1151,7 +1157,7 @@  static int genpd_freeze_noirq(struct dev
 static int genpd_thaw_noirq(struct device *dev)
 {
 	const struct generic_pm_domain *genpd;
-	int ret = 0;
+	int ret;
 
 	dev_dbg(dev, "%s()\n", __func__);
 
@@ -1159,8 +1165,9 @@  static int genpd_thaw_noirq(struct devic
 	if (IS_ERR(genpd))
 		return -EINVAL;
 
-	if (genpd->dev_ops.stop && genpd->dev_ops.start) {
-		ret = pm_runtime_force_resume(dev);
+	if (genpd->dev_ops.stop && genpd->dev_ops.start &&
+	    !pm_runtime_status_suspended(dev)) {
+		ret = genpd_start_dev(genpd, dev);
 		if (ret)
 			return ret;
 	}
@@ -1193,7 +1200,7 @@  static int genpd_poweroff_noirq(struct d
 static int genpd_restore_noirq(struct device *dev)
 {
 	struct generic_pm_domain *genpd;
-	int ret = 0;
+	int ret;
 
 	dev_dbg(dev, "%s()\n", __func__);
 
@@ -1217,8 +1224,9 @@  static int genpd_restore_noirq(struct de
 	genpd_sync_power_on(genpd, true, 0);
 	genpd_unlock(genpd);
 
-	if (genpd->dev_ops.stop && genpd->dev_ops.start) {
-		ret = pm_runtime_force_resume(dev);
+	if (genpd->dev_ops.stop && genpd->dev_ops.start &&
+	    !pm_runtime_status_suspended(dev)) {
+		ret = genpd_start_dev(genpd, dev);
 		if (ret)
 			return ret;
 	}