diff mbox

[RFC,2/3] PM / sleep: Mechanism to avoid resuming runtime-suspended devices unnecessarily

Message ID 1752814.FckPkBGaUc@vostro.rjw.lan (mailing list archive)
State RFC, archived
Headers show

Commit Message

Rafael J. Wysocki May 13, 2014, 1:10 a.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
resume all runtime-suspended devices during system suspend, mostly
because those devices may need to be reprogrammed due to different
wakeup settings for system sleep and for runtime PM.

For some devices, though, it's OK to remain in runtime suspend 
throughout a complete system suspend/resume cycle (if the device was in
runtime suspend at the start of the cycle).  We would like to do this
whenever possible, to avoid the overhead of extra power-up and power-down
events.

However, problems may arise because the device's descendants may require
it to be at full power at various points during the cycle.  Therefore the
most straightforward way to do this safely is if the device and all its
descendants can remain runtime suspended until the complete stage of
system resume.

To this end, introduce a new device PM flag, power.direct_complete
and modify the PM core to use that flag as follows.

If the ->prepare() callback of a device returns a positive number,
the PM core will regard that as an indication that it may leave the
device runtime-suspended.  It will then check if the system power
transition in progress is a suspend (and not hibernation in particular)
and if the device is, indeed, runtime-suspended.  In that case, the PM
core will set the device's power.direct_complete flag.  Otherwise it
will clear power.direct_complete for the device and it also will later
clear it for the device's parent (if there's one).

Next, the PM core will not invoke the ->suspend() ->suspend_late(),
->suspend_irq(), ->resume_irq(), ->resume_early(), or ->resume()
callbacks for all devices having power.direct_complete set.  It
will invoke their ->complete() callbacks, however, and those
callbacks are then responsible for resuming the devices as
appropriate, if necessary.

Changelog partly based on an Alan Stern's description of the idea
(http://marc.info/?l=linux-pm&m=139940466625569&w=2).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/main.c |   45 ++++++++++++++++++++++++++++-----------------
 include/linux/pm.h        |    1 +
 2 files changed, 29 insertions(+), 17 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Ulf Hansson May 13, 2014, 9:30 a.m. UTC | #1
On 13 May 2014 03:10, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> resume all runtime-suspended devices during system suspend, mostly
> because those devices may need to be reprogrammed due to different
> wakeup settings for system sleep and for runtime PM.
>
> For some devices, though, it's OK to remain in runtime suspend
> throughout a complete system suspend/resume cycle (if the device was in
> runtime suspend at the start of the cycle).  We would like to do this
> whenever possible, to avoid the overhead of extra power-up and power-down
> events.
>
> However, problems may arise because the device's descendants may require
> it to be at full power at various points during the cycle.  Therefore the
> most straightforward way to do this safely is if the device and all its
> descendants can remain runtime suspended until the complete stage of
> system resume.
>
> To this end, introduce a new device PM flag, power.direct_complete
> and modify the PM core to use that flag as follows.
>
> If the ->prepare() callback of a device returns a positive number,
> the PM core will regard that as an indication that it may leave the
> device runtime-suspended.  It will then check if the system power
> transition in progress is a suspend (and not hibernation in particular)
> and if the device is, indeed, runtime-suspended.  In that case, the PM
> core will set the device's power.direct_complete flag.  Otherwise it
> will clear power.direct_complete for the device and it also will later
> clear it for the device's parent (if there's one).
>
> Next, the PM core will not invoke the ->suspend() ->suspend_late(),
> ->suspend_irq(), ->resume_irq(), ->resume_early(), or ->resume()
> callbacks for all devices having power.direct_complete set.  It
> will invoke their ->complete() callbacks, however, and those
> callbacks are then responsible for resuming the devices as
> appropriate, if necessary.
>
> Changelog partly based on an Alan Stern's description of the idea
> (http://marc.info/?l=linux-pm&m=139940466625569&w=2).
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

I like this idea! And we don't need to add new functions to the runtime PM API.

[snip]

> -
> -       return error;
> +               return ret;
> +       }
> +       dev->power.direct_complete = ret > 0 && state.event == PM_EVENT_SUSPEND
> +                                       && pm_runtime_suspended(dev);

This might deserve a comment!?

> +       return 0;
>  }
>
>  /**
>

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Ulf Hansson
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern May 13, 2014, 2:49 p.m. UTC | #2
On Tue, 13 May 2014, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> resume all runtime-suspended devices during system suspend, mostly
> because those devices may need to be reprogrammed due to different
> wakeup settings for system sleep and for runtime PM.
> 
> For some devices, though, it's OK to remain in runtime suspend 
> throughout a complete system suspend/resume cycle (if the device was in
> runtime suspend at the start of the cycle).  We would like to do this
> whenever possible, to avoid the overhead of extra power-up and power-down
> events.
> 
> However, problems may arise because the device's descendants may require
> it to be at full power at various points during the cycle.  Therefore the
> most straightforward way to do this safely is if the device and all its
> descendants can remain runtime suspended until the complete stage of
> system resume.
> 
> To this end, introduce a new device PM flag, power.direct_complete
> and modify the PM core to use that flag as follows.
> 
> If the ->prepare() callback of a device returns a positive number,
> the PM core will regard that as an indication that it may leave the
> device runtime-suspended.  It will then check if the system power
> transition in progress is a suspend (and not hibernation in particular)
> and if the device is, indeed, runtime-suspended.  In that case, the PM
> core will set the device's power.direct_complete flag.  Otherwise it
> will clear power.direct_complete for the device and it also will later
> clear it for the device's parent (if there's one).
> 
> Next, the PM core will not invoke the ->suspend() ->suspend_late(),
> ->suspend_irq(), ->resume_irq(), ->resume_early(), or ->resume()
> callbacks for all devices having power.direct_complete set.  It
> will invoke their ->complete() callbacks, however, and those
> callbacks are then responsible for resuming the devices as
> appropriate, if necessary.

Perhaps you should mention here (and maybe even as a comment in the 
code) that ->complete() callbacks may want to call pm_request_resume() 
if dev->power.direct_resume is set, but they shouldn't call 
pm_runtime_resume().

> Changelog partly based on an Alan Stern's description of the idea
> (http://marc.info/?l=linux-pm&m=139940466625569&w=2).
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

...

> @@ -1518,17 +1527,19 @@ static int device_prepare(struct device
>  		callback = dev->driver->pm->prepare;
>  	}
>  
> -	if (callback) {
> -		error = callback(dev);
> -		suspend_report_result(callback, error);
> -	}
> +	if (callback)
> +		ret = callback(dev);
>  
>  	device_unlock(dev);
>  
> -	if (error)
> +	if (ret < 0) {
> +		suspend_report_result(callback, ret);
>  		pm_runtime_put(dev);
> -
> -	return error;
> +		return ret;
> +	}
> +	dev->power.direct_complete = ret > 0 && state.event == PM_EVENT_SUSPEND
> +					&& pm_runtime_suspended(dev);

Shouldn't the flag be set under the spinlock?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern May 13, 2014, 3:12 p.m. UTC | #3
On Tue, 13 May 2014, Rafael J. Wysocki wrote:

> > > +	dev->power.direct_complete = ret > 0 && state.event == PM_EVENT_SUSPEND
> > > +					&& pm_runtime_suspended(dev);
> > 
> > Shouldn't the flag be set under the spinlock?
> 
> I guess you're worried about runtime PM flags being modified in parallel to
> this?  But we've just done the barrier a while ago, so is that still a concern
> here?

A wakeup request from the hardware could cause a runtime resume to 
occur at this time.  The barrier wouldn't prevent that.

It's unlikely, I agree, but not impossible.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki May 13, 2014, 3:13 p.m. UTC | #4
On Tuesday, May 13, 2014 10:49:32 AM Alan Stern wrote:
> On Tue, 13 May 2014, Rafael J. Wysocki wrote:
> 
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > resume all runtime-suspended devices during system suspend, mostly
> > because those devices may need to be reprogrammed due to different
> > wakeup settings for system sleep and for runtime PM.
> > 
> > For some devices, though, it's OK to remain in runtime suspend 
> > throughout a complete system suspend/resume cycle (if the device was in
> > runtime suspend at the start of the cycle).  We would like to do this
> > whenever possible, to avoid the overhead of extra power-up and power-down
> > events.
> > 
> > However, problems may arise because the device's descendants may require
> > it to be at full power at various points during the cycle.  Therefore the
> > most straightforward way to do this safely is if the device and all its
> > descendants can remain runtime suspended until the complete stage of
> > system resume.
> > 
> > To this end, introduce a new device PM flag, power.direct_complete
> > and modify the PM core to use that flag as follows.
> > 
> > If the ->prepare() callback of a device returns a positive number,
> > the PM core will regard that as an indication that it may leave the
> > device runtime-suspended.  It will then check if the system power
> > transition in progress is a suspend (and not hibernation in particular)
> > and if the device is, indeed, runtime-suspended.  In that case, the PM
> > core will set the device's power.direct_complete flag.  Otherwise it
> > will clear power.direct_complete for the device and it also will later
> > clear it for the device's parent (if there's one).
> > 
> > Next, the PM core will not invoke the ->suspend() ->suspend_late(),
> > ->suspend_irq(), ->resume_irq(), ->resume_early(), or ->resume()
> > callbacks for all devices having power.direct_complete set.  It
> > will invoke their ->complete() callbacks, however, and those
> > callbacks are then responsible for resuming the devices as
> > appropriate, if necessary.
> 
> Perhaps you should mention here (and maybe even as a comment in the 
> code) that ->complete() callbacks may want to call pm_request_resume() 
> if dev->power.direct_resume is set, but they shouldn't call 
> pm_runtime_resume().

OK

> > Changelog partly based on an Alan Stern's description of the idea
> > (http://marc.info/?l=linux-pm&m=139940466625569&w=2).
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> ...
> 
> > @@ -1518,17 +1527,19 @@ static int device_prepare(struct device
> >  		callback = dev->driver->pm->prepare;
> >  	}
> >  
> > -	if (callback) {
> > -		error = callback(dev);
> > -		suspend_report_result(callback, error);
> > -	}
> > +	if (callback)
> > +		ret = callback(dev);
> >  
> >  	device_unlock(dev);
> >  
> > -	if (error)
> > +	if (ret < 0) {
> > +		suspend_report_result(callback, ret);
> >  		pm_runtime_put(dev);
> > -
> > -	return error;
> > +		return ret;
> > +	}
> > +	dev->power.direct_complete = ret > 0 && state.event == PM_EVENT_SUSPEND
> > +					&& pm_runtime_suspended(dev);
> 
> Shouldn't the flag be set under the spinlock?

I guess you're worried about runtime PM flags being modified in parallel to
this?  But we've just done the barrier a while ago, so is that still a concern
here?

This won't run in parallel with device_prepare() for any other devices, because
the "complete" phase is sequential.

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki May 13, 2014, 3:43 p.m. UTC | #5
On Tuesday, May 13, 2014 11:12:28 AM Alan Stern wrote:
> On Tue, 13 May 2014, Rafael J. Wysocki wrote:
> 
> > > > +	dev->power.direct_complete = ret > 0 && state.event == PM_EVENT_SUSPEND
> > > > +					&& pm_runtime_suspended(dev);
> > > 
> > > Shouldn't the flag be set under the spinlock?
> > 
> > I guess you're worried about runtime PM flags being modified in parallel to
> > this?  But we've just done the barrier a while ago, so is that still a concern
> > here?
> 
> A wakeup request from the hardware could cause a runtime resume to 
> occur at this time.  The barrier wouldn't prevent that.
> 
> It's unlikely, I agree, but not impossible.

Yeah, I didn't think about that.

But that also can occur in __device_suspend(), after we've checked the flag
and decided not to invoke the ->suspend() callback, right?  So moving the
check in there doesn't help much I'd say.  It closes the race window, but
that's it.

That means that the whole approach based on ->prepare() is problematic
unless we somehow mix it with disabling runtime PM.

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern May 13, 2014, 3:46 p.m. UTC | #6
On Tue, 13 May 2014, Rafael J. Wysocki wrote:

> > A wakeup request from the hardware could cause a runtime resume to 
> > occur at this time.  The barrier wouldn't prevent that.
> > 
> > It's unlikely, I agree, but not impossible.
> 
> Yeah, I didn't think about that.

Come to think of it, if the hardware sends a wakeup request then it
must have been enabled for remote wakeup.  And if the hardware settings
are appropriate for system suspend then it must be enabled for system
wakeup.  Consequently a wakeup from the hardware ought to abort the
system suspend in any case.  So maybe we don't care about this 
scenario.

On the other hand, there may be other mechanisms that could cause a 
runtime resume at this inconvenient time.  A timer routine, for 
instance.

> But that also can occur in __device_suspend(), after we've checked the flag
> and decided not to invoke the ->suspend() callback, right?  So moving the
> check in there doesn't help much I'd say.  It closes the race window, but
> that's it.
> 
> That means that the whole approach based on ->prepare() is problematic
> unless we somehow mix it with disabling runtime PM.

Maybe the call to __pm_runtime_disable() should be moved from
__device_suspend_late() to __device_suspend(), after the callback has
been invoked (or skipped, as the case may be).  Then after runtime PM
has been disabled, you can check the device's status has changed and go
back to invoke the callback if necessary.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki May 13, 2014, 4:16 p.m. UTC | #7
On Tuesday, May 13, 2014 11:46:55 AM Alan Stern wrote:
> On Tue, 13 May 2014, Rafael J. Wysocki wrote:
> 
> > > A wakeup request from the hardware could cause a runtime resume to 
> > > occur at this time.  The barrier wouldn't prevent that.
> > > 
> > > It's unlikely, I agree, but not impossible.
> > 
> > Yeah, I didn't think about that.
> 
> Come to think of it, if the hardware sends a wakeup request then it
> must have been enabled for remote wakeup.  And if the hardware settings
> are appropriate for system suspend then it must be enabled for system
> wakeup.  Consequently a wakeup from the hardware ought to abort the
> system suspend in any case.  So maybe we don't care about this 
> scenario.
> 
> On the other hand, there may be other mechanisms that could cause a 
> runtime resume at this inconvenient time.  A timer routine, for 
> instance.
> 
> > But that also can occur in __device_suspend(), after we've checked the flag
> > and decided not to invoke the ->suspend() callback, right?  So moving the
> > check in there doesn't help much I'd say.  It closes the race window, but
> > that's it.
> > 
> > That means that the whole approach based on ->prepare() is problematic
> > unless we somehow mix it with disabling runtime PM.
> 
> Maybe the call to __pm_runtime_disable() should be moved from
> __device_suspend_late() to __device_suspend(), after the callback has
> been invoked (or skipped, as the case may be).  Then after runtime PM
> has been disabled, you can check the device's status has changed and go
> back to invoke the callback if necessary.

We moved __pm_runtime_disable() to __device_suspend_late() to be able to
use pm_runtime_resume() in __device_suspend() (and we actually do that in
some places now).

But, in principle, we can do __pm_runtime_disable() temporarily in some place
between ->prepare() and ->suspend(), it doesn't matter if that's in
device_prepare() in __device_suspend() really.  Then, we can check the device's
runtime PM status (that'd need to be done carefully to take the disabling into
account) and
(1) if the device is runtime-suspended, set direct_complete for it without
    enabling runtime PM, or
(2) if the device is not runtime-suspended, clear direct_complete for it
    and re-enable runtime PM.
and in case of (1) we would re-enable runtime PM in device_complete().

That should work I suppose?

Of course, question is what ->prepare() is supposed to do then if it needs
to check the state of the device before deciding whether or not to return 1.
I guess it would need to disable runtime PM around that check too.

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern May 13, 2014, 4:19 p.m. UTC | #8
On Tue, 13 May 2014, Rafael J. Wysocki wrote:

> > Maybe the call to __pm_runtime_disable() should be moved from
> > __device_suspend_late() to __device_suspend(), after the callback has
> > been invoked (or skipped, as the case may be).  Then after runtime PM
> > has been disabled, you can check the device's status has changed and go
> > back to invoke the callback if necessary.
> 
> We moved __pm_runtime_disable() to __device_suspend_late() to be able to
> use pm_runtime_resume() in __device_suspend() (and we actually do that in
> some places now).
> 
> But, in principle, we can do __pm_runtime_disable() temporarily in some place
> between ->prepare() and ->suspend(), it doesn't matter if that's in
> device_prepare() in __device_suspend() really.

It should be as late as possible, to allow for detecting wakeup 
requests.

>  Then, we can check the device's
> runtime PM status (that'd need to be done carefully to take the disabling into
> account) and
> (1) if the device is runtime-suspended, set direct_complete for it without
>     enabling runtime PM, or
> (2) if the device is not runtime-suspended, clear direct_complete for it
>     and re-enable runtime PM.
> and in case of (1) we would re-enable runtime PM in device_complete().
> 
> That should work I suppose?

Yes; it's similar to what I proposed.  Note that this can be skipped if 
direct_complete is already clear.

> Of course, question is what ->prepare() is supposed to do then if it needs
> to check the state of the device before deciding whether or not to return 1.
> I guess it would need to disable runtime PM around that check too.

It would be surprising if ->prepare() needed to make any difficult
checks.  This would imply that the device could have multiple
runtime-suspend states, some of which are appropriate for system
suspend while others aren't.  Not impossible, but I wouldn't expect it
to come up often.

Besides, as I mentioned before, we never have to worry about status 
changes.  If one occurs while ->prepare() is running or afterward, it 
means the device is runtime-resumed and therefore the setting of 
direct_complete doesn't matter.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jacob Pan May 14, 2014, 10:24 p.m. UTC | #9
On Tue, 13 May 2014 03:10:19 +0200
"Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> resume all runtime-suspended devices during system suspend, mostly
> because those devices may need to be reprogrammed due to different
> wakeup settings for system sleep and for runtime PM.
> 
> For some devices, though, it's OK to remain in runtime suspend 
> throughout a complete system suspend/resume cycle (if the device was
> in runtime suspend at the start of the cycle).  We would like to do
> this whenever possible, to avoid the overhead of extra power-up and
> power-down events.
> 
> However, problems may arise because the device's descendants may
> require it to be at full power at various points during the cycle.
> Therefore the most straightforward way to do this safely is if the
> device and all its descendants can remain runtime suspended until the
> complete stage of system resume.
> 
> To this end, introduce a new device PM flag, power.direct_complete
> and modify the PM core to use that flag as follows.
> 
> If the ->prepare() callback of a device returns a positive number,
> the PM core will regard that as an indication that it may leave the
> device runtime-suspended.  It will then check if the system power
> transition in progress is a suspend (and not hibernation in
> particular) and if the device is, indeed, runtime-suspended.  In that
> case, the PM core will set the device's power.direct_complete flag.
> Otherwise it will clear power.direct_complete for the device and it
> also will later clear it for the device's parent (if there's one).
> 
> Next, the PM core will not invoke the ->suspend() ->suspend_late(),
> ->suspend_irq(), ->resume_irq(), ->resume_early(), or ->resume()
> callbacks for all devices having power.direct_complete set.  It
> will invoke their ->complete() callbacks, however, and those
> callbacks are then responsible for resuming the devices as
> appropriate, if necessary.
> 
> Changelog partly based on an Alan Stern's description of the idea
> (http://marc.info/?l=linux-pm&m=139940466625569&w=2).
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/power/main.c |   45
> ++++++++++++++++++++++++++++-----------------
> include/linux/pm.h        |    1 + 2 files changed, 29 insertions(+),
> 17 deletions(-)
> 
> Index: linux-pm/include/linux/pm.h
> ===================================================================
> --- linux-pm.orig/include/linux/pm.h
> +++ linux-pm/include/linux/pm.h
> @@ -546,6 +546,7 @@ struct dev_pm_info {
>  	bool			is_late_suspended:1;
>  	bool			ignore_children:1;
>  	bool			early_init:1;	/* Owned by
> the PM core */
> +	bool			direct_complete:1;	/*
> Owned by the PM core */ spinlock_t		lock;
>  #ifdef CONFIG_PM_SLEEP
>  	struct list_head	entry;
> Index: linux-pm/drivers/base/power/main.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/main.c
> +++ linux-pm/drivers/base/power/main.c
> @@ -479,7 +479,7 @@ static int device_resume_noirq(struct de
>  	TRACE_DEVICE(dev);
>  	TRACE_RESUME(0);
>  
> -	if (dev->power.syscore)
> +	if (dev->power.syscore || dev->power.direct_complete)
>  		goto Out;
>  
>  	if (!dev->power.is_noirq_suspended)
> @@ -605,7 +605,7 @@ static int device_resume_early(struct de
>  	TRACE_DEVICE(dev);
>  	TRACE_RESUME(0);
>  
> -	if (dev->power.syscore)
> +	if (dev->power.syscore || dev->power.direct_complete)
>  		goto Out;
>  
>  	if (!dev->power.is_late_suspended)
> @@ -732,7 +732,7 @@ static int device_resume(struct device *
>  	TRACE_DEVICE(dev);
>  	TRACE_RESUME(0);
>  
> -	if (dev->power.syscore)
> +	if (dev->power.syscore || dev->power.direct_complete)
>  		goto Complete;
>  
>  	dpm_wait(dev->parent, async);
> @@ -1007,7 +1007,7 @@ static int __device_suspend_noirq(struct
>  		goto Complete;
>  	}
>  
> -	if (dev->power.syscore)
> +	if (dev->power.syscore || dev->power.direct_complete)
>  		goto Complete;
>  
>  	dpm_wait_for_children(dev, async);
> @@ -1146,7 +1146,7 @@ static int __device_suspend_late(struct
>  		goto Complete;
>  	}
>  
> -	if (dev->power.syscore)
> +	if (dev->power.syscore || dev->power.direct_complete)
>  		goto Complete;
>  
>  	dpm_wait_for_children(dev, async);
> @@ -1312,7 +1312,7 @@ static int __device_suspend(struct devic
>  
>  	dpm_wait_for_children(dev, async);
>  
> -	if (async_error || dev->power.syscore)
> +	if (async_error || dev->power.syscore ||
> dev->power.direct_complete) goto Complete;
>  
>  	dpm_watchdog_set(&wd, dev);
> @@ -1365,10 +1365,19 @@ static int __device_suspend(struct devic
>  
>   End:
>  	if (!error) {
> +		struct device *parent = dev->parent;
> +
>  		dev->power.is_suspended = true;
> -		if (dev->power.wakeup_path
> -		    && dev->parent
> && !dev->parent->power.ignore_children)
> -			dev->parent->power.wakeup_path = true;
> +		if (parent) {
> +			spin_lock_irq(&parent->power.lock);
> +
> +			dev->parent->power.direct_complete = false;
should we respect ignore_children flag here? not all parent devices
create children with proper .prepare() function. this allows parents
override children.
I am looking at USB, a USB device could have logical children such as
ep_xx, they don't go through the same subsystem .prepare().

> +			if (dev->power.wakeup_path
> +			    && !dev->parent->power.ignore_children)
> +				dev->parent->power.wakeup_path =
> true; +
> +			spin_unlock_irq(&parent->power.lock);
> +		}
>  	}
>  
>  	device_unlock(dev);
> @@ -1470,7 +1479,7 @@ static int device_prepare(struct device
>  {
>  	int (*callback)(struct device *) = NULL;
>  	char *info = NULL;
> -	int error = 0;
> +	int ret = 0;
>  
>  	if (dev->power.syscore)
>  		return 0;
> @@ -1518,17 +1527,19 @@ static int device_prepare(struct device
>  		callback = dev->driver->pm->prepare;
>  	}
>  
> -	if (callback) {
> -		error = callback(dev);
> -		suspend_report_result(callback, error);
> -	}
> +	if (callback)
> +		ret = callback(dev);
>  
>  	device_unlock(dev);
>  
> -	if (error)
> +	if (ret < 0) {
> +		suspend_report_result(callback, ret);
>  		pm_runtime_put(dev);
> -
> -	return error;
> +		return ret;
> +	}
> +	dev->power.direct_complete = ret > 0 && state.event ==
> PM_EVENT_SUSPEND
> +					&& pm_runtime_suspended(dev);
> +	return 0;
>  }
>  
>  /**
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Jacob Pan]
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jacob Pan May 15, 2014, 7:03 a.m. UTC | #10
On Thu, 15 May 2014 10:29:42 -0400 (EDT)
Alan Stern <stern@rowland.harvard.edu> wrote:

> On Thu, 15 May 2014, Jacob Pan wrote:
> 
> > > > should we respect ignore_children flag here? not all parent
> > > > devices create children with proper .prepare() function. this
> > > > allows parents override children.
> > > > I am looking at USB, a USB device could have logical children
> > > > such as ep_xx, they don't go through the same
> > > > subsystem .prepare().
> > > 
> > > Well, I'm not sure about that.  Let me consider that for a while.
> > OK. let me be more clear about the situation i see in USB. Correct
> > me if I am wrong, a USB device will always has at least one
> > endpoint/ep_00 as a kid for control pipe, it is a logical device.
> > So when device_prepare() is called, its call back is NULL which
> > makes .direct_complete = 0. Since children device suspend is called
> > before parents, the parents .direct_complete flag will always get
> > cleared.
> > 
> > What i am trying to achieve here is to see if we avoid resuming
> > built-in (hardwired connect_type) non-hub USB devices based on this
> > new patchset. E.g. we don't want to resume/suspend USB camera every
> > time in system suspend/resume cycle if they are already rpm
> > suspended. We can save ~100ms resume time for the devices we have
> > tested.
> 
> This is a good point, but I don't think it is at all related to 
> ignore_children.
> 
> Instead, it seems that the best way to solve it would be to add a 
> ->prepare() handler for usb_ep_device_type that would always turn 
> on direct_complete.
> 
yeah, that would solve the problem with EP device type. But what about
other subdevices. e.g. for USB camera, uvcvideo device? We can add
.prepare(return 1;) for each level but would it be better to have a
flag similar to ignore_children if not ignore_children itself.
Actually, I don't understand why this is not related to
ignore_children. Could you explain?

If the parent knows it can ignore children and already rpm suspended,
why do we still ask children?

> Alan Stern
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Jacob Pan]
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki May 15, 2014, 11:11 a.m. UTC | #11
On Wednesday, May 14, 2014 03:24:21 PM Jacob Pan wrote:
> On Tue, 13 May 2014 03:10:19 +0200
> "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
> 
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > resume all runtime-suspended devices during system suspend, mostly
> > because those devices may need to be reprogrammed due to different
> > wakeup settings for system sleep and for runtime PM.
> > 
> > For some devices, though, it's OK to remain in runtime suspend 
> > throughout a complete system suspend/resume cycle (if the device was
> > in runtime suspend at the start of the cycle).  We would like to do
> > this whenever possible, to avoid the overhead of extra power-up and
> > power-down events.
> > 
> > However, problems may arise because the device's descendants may
> > require it to be at full power at various points during the cycle.
> > Therefore the most straightforward way to do this safely is if the
> > device and all its descendants can remain runtime suspended until the
> > complete stage of system resume.
> > 
> > To this end, introduce a new device PM flag, power.direct_complete
> > and modify the PM core to use that flag as follows.
> > 
> > If the ->prepare() callback of a device returns a positive number,
> > the PM core will regard that as an indication that it may leave the
> > device runtime-suspended.  It will then check if the system power
> > transition in progress is a suspend (and not hibernation in
> > particular) and if the device is, indeed, runtime-suspended.  In that
> > case, the PM core will set the device's power.direct_complete flag.
> > Otherwise it will clear power.direct_complete for the device and it
> > also will later clear it for the device's parent (if there's one).
> > 
> > Next, the PM core will not invoke the ->suspend() ->suspend_late(),
> > ->suspend_irq(), ->resume_irq(), ->resume_early(), or ->resume()
> > callbacks for all devices having power.direct_complete set.  It
> > will invoke their ->complete() callbacks, however, and those
> > callbacks are then responsible for resuming the devices as
> > appropriate, if necessary.
> > 
> > Changelog partly based on an Alan Stern's description of the idea
> > (http://marc.info/?l=linux-pm&m=139940466625569&w=2).
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/base/power/main.c |   45
> > ++++++++++++++++++++++++++++-----------------
> > include/linux/pm.h        |    1 + 2 files changed, 29 insertions(+),
> > 17 deletions(-)
> > 
> > Index: linux-pm/include/linux/pm.h
> > ===================================================================
> > --- linux-pm.orig/include/linux/pm.h
> > +++ linux-pm/include/linux/pm.h
> > @@ -546,6 +546,7 @@ struct dev_pm_info {
> >  	bool			is_late_suspended:1;
> >  	bool			ignore_children:1;
> >  	bool			early_init:1;	/* Owned by
> > the PM core */
> > +	bool			direct_complete:1;	/*
> > Owned by the PM core */ spinlock_t		lock;
> >  #ifdef CONFIG_PM_SLEEP
> >  	struct list_head	entry;
> > Index: linux-pm/drivers/base/power/main.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/power/main.c
> > +++ linux-pm/drivers/base/power/main.c
> > @@ -479,7 +479,7 @@ static int device_resume_noirq(struct de
> >  	TRACE_DEVICE(dev);
> >  	TRACE_RESUME(0);
> >  
> > -	if (dev->power.syscore)
> > +	if (dev->power.syscore || dev->power.direct_complete)
> >  		goto Out;
> >  
> >  	if (!dev->power.is_noirq_suspended)
> > @@ -605,7 +605,7 @@ static int device_resume_early(struct de
> >  	TRACE_DEVICE(dev);
> >  	TRACE_RESUME(0);
> >  
> > -	if (dev->power.syscore)
> > +	if (dev->power.syscore || dev->power.direct_complete)
> >  		goto Out;
> >  
> >  	if (!dev->power.is_late_suspended)
> > @@ -732,7 +732,7 @@ static int device_resume(struct device *
> >  	TRACE_DEVICE(dev);
> >  	TRACE_RESUME(0);
> >  
> > -	if (dev->power.syscore)
> > +	if (dev->power.syscore || dev->power.direct_complete)
> >  		goto Complete;
> >  
> >  	dpm_wait(dev->parent, async);
> > @@ -1007,7 +1007,7 @@ static int __device_suspend_noirq(struct
> >  		goto Complete;
> >  	}
> >  
> > -	if (dev->power.syscore)
> > +	if (dev->power.syscore || dev->power.direct_complete)
> >  		goto Complete;
> >  
> >  	dpm_wait_for_children(dev, async);
> > @@ -1146,7 +1146,7 @@ static int __device_suspend_late(struct
> >  		goto Complete;
> >  	}
> >  
> > -	if (dev->power.syscore)
> > +	if (dev->power.syscore || dev->power.direct_complete)
> >  		goto Complete;
> >  
> >  	dpm_wait_for_children(dev, async);
> > @@ -1312,7 +1312,7 @@ static int __device_suspend(struct devic
> >  
> >  	dpm_wait_for_children(dev, async);
> >  
> > -	if (async_error || dev->power.syscore)
> > +	if (async_error || dev->power.syscore ||
> > dev->power.direct_complete) goto Complete;
> >  
> >  	dpm_watchdog_set(&wd, dev);
> > @@ -1365,10 +1365,19 @@ static int __device_suspend(struct devic
> >  
> >   End:
> >  	if (!error) {
> > +		struct device *parent = dev->parent;
> > +
> >  		dev->power.is_suspended = true;
> > -		if (dev->power.wakeup_path
> > -		    && dev->parent
> > && !dev->parent->power.ignore_children)
> > -			dev->parent->power.wakeup_path = true;
> > +		if (parent) {
> > +			spin_lock_irq(&parent->power.lock);
> > +
> > +			dev->parent->power.direct_complete = false;
> should we respect ignore_children flag here? not all parent devices
> create children with proper .prepare() function. this allows parents
> override children.
> I am looking at USB, a USB device could have logical children such as
> ep_xx, they don't go through the same subsystem .prepare().

Well, I'm not sure about that.  Let me consider that for a while.

Alan, what do you think?

> 
> > +			if (dev->power.wakeup_path
> > +			    && !dev->parent->power.ignore_children)
> > +				dev->parent->power.wakeup_path =
> > true; +
> > +			spin_unlock_irq(&parent->power.lock);
> > +		}
> >  	}
> >  
> >  	device_unlock(dev);
> > @@ -1470,7 +1479,7 @@ static int device_prepare(struct device
> >  {
> >  	int (*callback)(struct device *) = NULL;
> >  	char *info = NULL;
> > -	int error = 0;
> > +	int ret = 0;
> >  
> >  	if (dev->power.syscore)
> >  		return 0;
> > @@ -1518,17 +1527,19 @@ static int device_prepare(struct device
> >  		callback = dev->driver->pm->prepare;
> >  	}
> >  
> > -	if (callback) {
> > -		error = callback(dev);
> > -		suspend_report_result(callback, error);
> > -	}
> > +	if (callback)
> > +		ret = callback(dev);
> >  
> >  	device_unlock(dev);
> >  
> > -	if (error)
> > +	if (ret < 0) {
> > +		suspend_report_result(callback, ret);
> >  		pm_runtime_put(dev);
> > -
> > -	return error;
> > +		return ret;
> > +	}
> > +	dev->power.direct_complete = ret > 0 && state.event ==
> > PM_EVENT_SUSPEND
> > +					&& pm_runtime_suspended(dev);
> > +	return 0;
> >  }
> >  
> >  /**
> > 
> > --

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jacob Pan May 15, 2014, 1:09 p.m. UTC | #12
On Thu, 15 May 2014 13:11:15 +0200
"Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:

> On Wednesday, May 14, 2014 03:24:21 PM Jacob Pan wrote:
> > On Tue, 13 May 2014 03:10:19 +0200
> > "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
> > 
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have
> > > to resume all runtime-suspended devices during system suspend,
> > > mostly because those devices may need to be reprogrammed due to
> > > different wakeup settings for system sleep and for runtime PM.
> > > 
> > > For some devices, though, it's OK to remain in runtime suspend 
> > > throughout a complete system suspend/resume cycle (if the device
> > > was in runtime suspend at the start of the cycle).  We would like
> > > to do this whenever possible, to avoid the overhead of extra
> > > power-up and power-down events.
> > > 
> > > However, problems may arise because the device's descendants may
> > > require it to be at full power at various points during the cycle.
> > > Therefore the most straightforward way to do this safely is if the
> > > device and all its descendants can remain runtime suspended until
> > > the complete stage of system resume.
> > > 
> > > To this end, introduce a new device PM flag, power.direct_complete
> > > and modify the PM core to use that flag as follows.
> > > 
> > > If the ->prepare() callback of a device returns a positive number,
> > > the PM core will regard that as an indication that it may leave
> > > the device runtime-suspended.  It will then check if the system
> > > power transition in progress is a suspend (and not hibernation in
> > > particular) and if the device is, indeed, runtime-suspended.  In
> > > that case, the PM core will set the device's
> > > power.direct_complete flag. Otherwise it will clear
> > > power.direct_complete for the device and it also will later clear
> > > it for the device's parent (if there's one).
> > > 
> > > Next, the PM core will not invoke the ->suspend()
> > > ->suspend_late(), ->suspend_irq(), ->resume_irq(),
> > > ->resume_early(), or ->resume() callbacks for all devices having
> > > power.direct_complete set.  It will invoke their ->complete()
> > > callbacks, however, and those callbacks are then responsible for
> > > resuming the devices as appropriate, if necessary.
> > > 
> > > Changelog partly based on an Alan Stern's description of the idea
> > > (http://marc.info/?l=linux-pm&m=139940466625569&w=2).
> > > 
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >  drivers/base/power/main.c |   45
> > > ++++++++++++++++++++++++++++-----------------
> > > include/linux/pm.h        |    1 + 2 files changed, 29
> > > insertions(+), 17 deletions(-)
> > > 
> > > Index: linux-pm/include/linux/pm.h
> > > ===================================================================
> > > --- linux-pm.orig/include/linux/pm.h
> > > +++ linux-pm/include/linux/pm.h
> > > @@ -546,6 +546,7 @@ struct dev_pm_info {
> > >  	bool			is_late_suspended:1;
> > >  	bool			ignore_children:1;
> > >  	bool			early_init:1;	/*
> > > Owned by the PM core */
> > > +	bool			direct_complete:1;	/*
> > > Owned by the PM core */ spinlock_t		lock;
> > >  #ifdef CONFIG_PM_SLEEP
> > >  	struct list_head	entry;
> > > Index: linux-pm/drivers/base/power/main.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/base/power/main.c
> > > +++ linux-pm/drivers/base/power/main.c
> > > @@ -479,7 +479,7 @@ static int device_resume_noirq(struct de
> > >  	TRACE_DEVICE(dev);
> > >  	TRACE_RESUME(0);
> > >  
> > > -	if (dev->power.syscore)
> > > +	if (dev->power.syscore || dev->power.direct_complete)
> > >  		goto Out;
> > >  
> > >  	if (!dev->power.is_noirq_suspended)
> > > @@ -605,7 +605,7 @@ static int device_resume_early(struct de
> > >  	TRACE_DEVICE(dev);
> > >  	TRACE_RESUME(0);
> > >  
> > > -	if (dev->power.syscore)
> > > +	if (dev->power.syscore || dev->power.direct_complete)
> > >  		goto Out;
> > >  
> > >  	if (!dev->power.is_late_suspended)
> > > @@ -732,7 +732,7 @@ static int device_resume(struct device *
> > >  	TRACE_DEVICE(dev);
> > >  	TRACE_RESUME(0);
> > >  
> > > -	if (dev->power.syscore)
> > > +	if (dev->power.syscore || dev->power.direct_complete)
> > >  		goto Complete;
> > >  
> > >  	dpm_wait(dev->parent, async);
> > > @@ -1007,7 +1007,7 @@ static int __device_suspend_noirq(struct
> > >  		goto Complete;
> > >  	}
> > >  
> > > -	if (dev->power.syscore)
> > > +	if (dev->power.syscore || dev->power.direct_complete)
> > >  		goto Complete;
> > >  
> > >  	dpm_wait_for_children(dev, async);
> > > @@ -1146,7 +1146,7 @@ static int __device_suspend_late(struct
> > >  		goto Complete;
> > >  	}
> > >  
> > > -	if (dev->power.syscore)
> > > +	if (dev->power.syscore || dev->power.direct_complete)
> > >  		goto Complete;
> > >  
> > >  	dpm_wait_for_children(dev, async);
> > > @@ -1312,7 +1312,7 @@ static int __device_suspend(struct devic
> > >  
> > >  	dpm_wait_for_children(dev, async);
> > >  
> > > -	if (async_error || dev->power.syscore)
> > > +	if (async_error || dev->power.syscore ||
> > > dev->power.direct_complete) goto Complete;
> > >  
> > >  	dpm_watchdog_set(&wd, dev);
> > > @@ -1365,10 +1365,19 @@ static int __device_suspend(struct devic
> > >  
> > >   End:
> > >  	if (!error) {
> > > +		struct device *parent = dev->parent;
> > > +
> > >  		dev->power.is_suspended = true;
> > > -		if (dev->power.wakeup_path
> > > -		    && dev->parent
> > > && !dev->parent->power.ignore_children)
> > > -			dev->parent->power.wakeup_path = true;
> > > +		if (parent) {
> > > +			spin_lock_irq(&parent->power.lock);
> > > +
> > > +			dev->parent->power.direct_complete =
> > > false;
> > should we respect ignore_children flag here? not all parent devices
> > create children with proper .prepare() function. this allows parents
> > override children.
> > I am looking at USB, a USB device could have logical children such
> > as ep_xx, they don't go through the same subsystem .prepare().
> 
> Well, I'm not sure about that.  Let me consider that for a while.
OK. let me be more clear about the situation i see in USB. Correct me
if I am wrong, a USB device will always has at least one endpoint/ep_00
as a kid for control pipe, it is a logical device. So when
device_prepare() is called, its call back is NULL which
makes .direct_complete = 0. Since children device suspend is called
before parents, the parents .direct_complete flag will always get
cleared.

What i am trying to achieve here is to see if we avoid resuming
built-in (hardwired connect_type) non-hub USB devices based on this new
patchset. E.g. we don't want to resume/suspend USB camera every time in
system suspend/resume cycle if they are already rpm suspended. We can
save ~100ms resume time for the devices we have tested.

> 
> Alan, what do you think?
> 
> > 
> > > +			if (dev->power.wakeup_path
> > > +
> > > && !dev->parent->power.ignore_children)
> > > +				dev->parent->power.wakeup_path =
> > > true; +
> > > +			spin_unlock_irq(&parent->power.lock);
> > > +		}
> > >  	}
> > >  
> > >  	device_unlock(dev);
> > > @@ -1470,7 +1479,7 @@ static int device_prepare(struct device
> > >  {
> > >  	int (*callback)(struct device *) = NULL;
> > >  	char *info = NULL;
> > > -	int error = 0;
> > > +	int ret = 0;
> > >  
> > >  	if (dev->power.syscore)
> > >  		return 0;
> > > @@ -1518,17 +1527,19 @@ static int device_prepare(struct device
> > >  		callback = dev->driver->pm->prepare;
> > >  	}
> > >  
> > > -	if (callback) {
> > > -		error = callback(dev);
> > > -		suspend_report_result(callback, error);
> > > -	}
> > > +	if (callback)
> > > +		ret = callback(dev);
> > >  
> > >  	device_unlock(dev);
> > >  
> > > -	if (error)
> > > +	if (ret < 0) {
> > > +		suspend_report_result(callback, ret);
> > >  		pm_runtime_put(dev);
> > > -
> > > -	return error;
> > > +		return ret;
> > > +	}
> > > +	dev->power.direct_complete = ret > 0 && state.event ==
> > > PM_EVENT_SUSPEND
> > > +					&&
> > > pm_runtime_suspended(dev);
> > > +	return 0;
> > >  }
> > >  
> > >  /**
> > > 
> > > --
> 
> Rafael
> 

[Jacob Pan]
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern May 15, 2014, 2:29 p.m. UTC | #13
On Thu, 15 May 2014, Jacob Pan wrote:

> > > should we respect ignore_children flag here? not all parent devices
> > > create children with proper .prepare() function. this allows parents
> > > override children.
> > > I am looking at USB, a USB device could have logical children such
> > > as ep_xx, they don't go through the same subsystem .prepare().
> > 
> > Well, I'm not sure about that.  Let me consider that for a while.
> OK. let me be more clear about the situation i see in USB. Correct me
> if I am wrong, a USB device will always has at least one endpoint/ep_00
> as a kid for control pipe, it is a logical device. So when
> device_prepare() is called, its call back is NULL which
> makes .direct_complete = 0. Since children device suspend is called
> before parents, the parents .direct_complete flag will always get
> cleared.
> 
> What i am trying to achieve here is to see if we avoid resuming
> built-in (hardwired connect_type) non-hub USB devices based on this new
> patchset. E.g. we don't want to resume/suspend USB camera every time in
> system suspend/resume cycle if they are already rpm suspended. We can
> save ~100ms resume time for the devices we have tested.

This is a good point, but I don't think it is at all related to 
ignore_children.

Instead, it seems that the best way to solve it would be to add a 
->prepare() handler for usb_ep_device_type that would always turn 
on direct_complete.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern May 15, 2014, 3:58 p.m. UTC | #14
On Thu, 15 May 2014, Jacob Pan wrote:

> On Thu, 15 May 2014 10:29:42 -0400 (EDT)
> Alan Stern <stern@rowland.harvard.edu> wrote:
> 
> > On Thu, 15 May 2014, Jacob Pan wrote:
> > 
> > > > > should we respect ignore_children flag here? not all parent
> > > > > devices create children with proper .prepare() function. this
> > > > > allows parents override children.
> > > > > I am looking at USB, a USB device could have logical children
> > > > > such as ep_xx, they don't go through the same
> > > > > subsystem .prepare().
> > > > 
> > > > Well, I'm not sure about that.  Let me consider that for a while.
> > > OK. let me be more clear about the situation i see in USB. Correct
> > > me if I am wrong, a USB device will always has at least one
> > > endpoint/ep_00 as a kid for control pipe, it is a logical device.
> > > So when device_prepare() is called, its call back is NULL which
> > > makes .direct_complete = 0. Since children device suspend is called
> > > before parents, the parents .direct_complete flag will always get
> > > cleared.
> > > 
> > > What i am trying to achieve here is to see if we avoid resuming
> > > built-in (hardwired connect_type) non-hub USB devices based on this
> > > new patchset. E.g. we don't want to resume/suspend USB camera every
> > > time in system suspend/resume cycle if they are already rpm
> > > suspended. We can save ~100ms resume time for the devices we have
> > > tested.
> > 
> > This is a good point, but I don't think it is at all related to 
> > ignore_children.
> > 
> > Instead, it seems that the best way to solve it would be to add a 
> > ->prepare() handler for usb_ep_device_type that would always turn 
> > on direct_complete.
> > 
> yeah, that would solve the problem with EP device type. But what about
> other subdevices. e.g. for USB camera, uvcvideo device? We can add
> .prepare(return 1;) for each level but would it be better to have a
> flag similar to ignore_children if not ignore_children itself.

Something like that could always be added.

> Actually, I don't understand why this is not related to
> ignore_children. Could you explain?

It's hard to explain why two things are totally separate.  Much better 
for you to describe why you think they _are_ related, so that I can 
explain how you are wrong.

> If the parent knows it can ignore children and already rpm suspended,
> why do we still ask children?

The "ignore_children" flag doesn't mean that the parent can ignore its 
children.  It means that the PM core is allowed to do a runtime suspend 
of the parent while leaving the children at full power.

In particular, it doesn't mean that the children's ->suspend() callback 
will work correctly if it is called while the parent is runtime 
suspended.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kevin Hilman May 15, 2014, 5:35 p.m. UTC | #15
Alan Stern <stern@rowland.harvard.edu> writes:

> On Tue, 13 May 2014, Rafael J. Wysocki wrote:
>
>> > A wakeup request from the hardware could cause a runtime resume to 
>> > occur at this time.  The barrier wouldn't prevent that.
>> > 
>> > It's unlikely, I agree, but not impossible.
>> 
>> Yeah, I didn't think about that.
>
> Come to think of it, if the hardware sends a wakeup request then it
> must have been enabled for remote wakeup.  And if the hardware settings
> are appropriate for system suspend then it must be enabled for system
> wakeup.  Consequently a wakeup from the hardware ought to abort the
> system suspend in any case.  So maybe we don't care about this 
> scenario.
>
> On the other hand, there may be other mechanisms that could cause a 
> runtime resume at this inconvenient time.  A timer routine, for 
> instance.

Another common case is when device X depends on device Y in it's
->prepare or ->suspend path (e.g. need to write to an I2C connected
GPIO/PMIC) in which case, device Y (and the I2C bus) would be runtime
resumed during device X's ->prepare or ->suspend path, and possibly
after device Y (or the I2C busses) ->prepare and ->suspend.

Kevin
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -546,6 +546,7 @@  struct dev_pm_info {
 	bool			is_late_suspended:1;
 	bool			ignore_children:1;
 	bool			early_init:1;	/* Owned by the PM core */
+	bool			direct_complete:1;	/* Owned by the PM core */
 	spinlock_t		lock;
 #ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -479,7 +479,7 @@  static int device_resume_noirq(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.direct_complete)
 		goto Out;
 
 	if (!dev->power.is_noirq_suspended)
@@ -605,7 +605,7 @@  static int device_resume_early(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.direct_complete)
 		goto Out;
 
 	if (!dev->power.is_late_suspended)
@@ -732,7 +732,7 @@  static int device_resume(struct device *
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.direct_complete)
 		goto Complete;
 
 	dpm_wait(dev->parent, async);
@@ -1007,7 +1007,7 @@  static int __device_suspend_noirq(struct
 		goto Complete;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.direct_complete)
 		goto Complete;
 
 	dpm_wait_for_children(dev, async);
@@ -1146,7 +1146,7 @@  static int __device_suspend_late(struct
 		goto Complete;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.direct_complete)
 		goto Complete;
 
 	dpm_wait_for_children(dev, async);
@@ -1312,7 +1312,7 @@  static int __device_suspend(struct devic
 
 	dpm_wait_for_children(dev, async);
 
-	if (async_error || dev->power.syscore)
+	if (async_error || dev->power.syscore || dev->power.direct_complete)
 		goto Complete;
 
 	dpm_watchdog_set(&wd, dev);
@@ -1365,10 +1365,19 @@  static int __device_suspend(struct devic
 
  End:
 	if (!error) {
+		struct device *parent = dev->parent;
+
 		dev->power.is_suspended = true;
-		if (dev->power.wakeup_path
-		    && dev->parent && !dev->parent->power.ignore_children)
-			dev->parent->power.wakeup_path = true;
+		if (parent) {
+			spin_lock_irq(&parent->power.lock);
+
+			dev->parent->power.direct_complete = false;
+			if (dev->power.wakeup_path
+			    && !dev->parent->power.ignore_children)
+				dev->parent->power.wakeup_path = true;
+
+			spin_unlock_irq(&parent->power.lock);
+		}
 	}
 
 	device_unlock(dev);
@@ -1470,7 +1479,7 @@  static int device_prepare(struct device
 {
 	int (*callback)(struct device *) = NULL;
 	char *info = NULL;
-	int error = 0;
+	int ret = 0;
 
 	if (dev->power.syscore)
 		return 0;
@@ -1518,17 +1527,19 @@  static int device_prepare(struct device
 		callback = dev->driver->pm->prepare;
 	}
 
-	if (callback) {
-		error = callback(dev);
-		suspend_report_result(callback, error);
-	}
+	if (callback)
+		ret = callback(dev);
 
 	device_unlock(dev);
 
-	if (error)
+	if (ret < 0) {
+		suspend_report_result(callback, ret);
 		pm_runtime_put(dev);
-
-	return error;
+		return ret;
+	}
+	dev->power.direct_complete = ret > 0 && state.event == PM_EVENT_SUSPEND
+					&& pm_runtime_suspended(dev);
+	return 0;
 }
 
 /**