diff mbox

PM: runtime: add might_sleep to PM runtime functions

Message ID 1311396967-16721-1-git-send-email-ccross@android.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Colin Cross July 23, 2011, 4:56 a.m. UTC
The list of functions that can be called in atomic context is
non-intuitive (pm_runtime_put_sync can not, but
pm_runtime_put_sync_suspend can, if pm_runtime_irq_safe has
been called?).  The code is actively misleading - the entry
points all start with spin_lock_irqsave, suggesting they
are safe to call in atomic context, but may later
enable interrupts.

Add might_sleep_if to all the __pm_runtime_* entry points
to enforce correct usage.

Also add pm_runtime_put_sync_autosuspend to the list of
functions that can be called in atomic context.

Signed-off-by: Colin Cross <ccross@android.com>
---
 Documentation/power/runtime_pm.txt |    1 +
 drivers/base/power/runtime.c       |   16 ++++++++++++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

Comments

Rafael Wysocki July 23, 2011, 10:57 p.m. UTC | #1
On Saturday, July 23, 2011, Colin Cross wrote:
> The list of functions that can be called in atomic context is
> non-intuitive (pm_runtime_put_sync can not, but
> pm_runtime_put_sync_suspend can, if pm_runtime_irq_safe has
> been called?).

However, this behavior is documented.

Also, if you have a clean use case for calling rpm_idle() with interrupts
off, it can be modified to work in analogy with rpm_suspend() in that respect.

> The code is actively misleading - the entry
> points all start with spin_lock_irqsave, suggesting they
> are safe to call in atomic context, but may later
> enable interrupts.

May I say it is this way for a reason?

> Add might_sleep_if to all the __pm_runtime_* entry points
> to enforce correct usage.

I'm not sure how this makes things better.

> Also add pm_runtime_put_sync_autosuspend to the list of
> functions that can be called in atomic context.

OK

In addition to that rpm_idle() is missing the __releases __acquires
annotations.

> Signed-off-by: Colin Cross <ccross@android.com>
> ---
>  Documentation/power/runtime_pm.txt |    1 +
>  drivers/base/power/runtime.c       |   16 ++++++++++++++--
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
> index b24875b..22852b3 100644
> --- a/Documentation/power/runtime_pm.txt
> +++ b/Documentation/power/runtime_pm.txt
> @@ -469,6 +469,7 @@ pm_runtime_autosuspend()
>  pm_runtime_resume()
>  pm_runtime_get_sync()
>  pm_runtime_put_sync_suspend()
> +pm_runtime_put_sync_autosuspend()
>  
>  5. Run-time PM Initialization, Device Probing and Removal
>  
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 0d4587b..fdc4567 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -732,6 +732,8 @@ int __pm_runtime_idle(struct device *dev, int rpmflags)
>  	unsigned long flags;
>  	int retval;
>  
> +	might_sleep_if(!(rpmflags & RPM_ASYNC));
> +
>  	if (rpmflags & RPM_GET_PUT) {
>  		if (!atomic_dec_and_test(&dev->power.usage_count))
>  			return 0;
> @@ -754,13 +756,16 @@ EXPORT_SYMBOL_GPL(__pm_runtime_idle);
>   * return immediately if it is larger than zero.  Then carry out a suspend,
>   * either synchronous or asynchronous.
>   *
> - * This routine may be called in atomic context if the RPM_ASYNC flag is set.
> + * This routine may be called in atomic context if the RPM_ASYNC flag is set,
> + * or if pm_runtime_irq_safe() has been called.
>   */
>  int __pm_runtime_suspend(struct device *dev, int rpmflags)
>  {
>  	unsigned long flags;
>  	int retval;
>  
> +	might_sleep_if(!(rpmflags & RPM_ASYNC) && !dev->power.irq_safe);
> +
>  	if (rpmflags & RPM_GET_PUT) {
>  		if (!atomic_dec_and_test(&dev->power.usage_count))
>  			return 0;
> @@ -782,13 +787,16 @@ EXPORT_SYMBOL_GPL(__pm_runtime_suspend);
>   * If the RPM_GET_PUT flag is set, increment the device's usage count.  Then
>   * carry out a resume, either synchronous or asynchronous.
>   *
> - * This routine may be called in atomic context if the RPM_ASYNC flag is set.
> + * This routine may be called in atomic context if the RPM_ASYNC flag is set,
> + * or if pm_runtime_irq_safe() has been called.
>   */
>  int __pm_runtime_resume(struct device *dev, int rpmflags)
>  {
>  	unsigned long flags;
>  	int retval;
>  
> +	might_sleep_if(!(rpmflags & RPM_ASYNC) && !dev->power.irq_safe);
> +
>  	if (rpmflags & RPM_GET_PUT)
>  		atomic_inc(&dev->power.usage_count);
>  
> @@ -978,6 +986,8 @@ EXPORT_SYMBOL_GPL(pm_runtime_barrier);
>   */
>  void __pm_runtime_disable(struct device *dev, bool check_resume)
>  {
> +	might_sleep();
> +

Well, it's not correct to call spin_lock_irq() from interrupt context anyway.

>  	spin_lock_irq(&dev->power.lock);
>  
>  	if (dev->power.disable_depth > 0) {
> @@ -1184,6 +1194,8 @@ void __pm_runtime_use_autosuspend(struct device *dev, bool use)
>  {
>  	int old_delay, old_use;
>  
> +	might_sleep();
> +
>  	spin_lock_irq(&dev->power.lock);
>  	old_delay = dev->power.autosuspend_delay;
>  	old_use = dev->power.use_autosuspend;
> 

Thanks,
Rafael
Colin Cross July 23, 2011, 11:26 p.m. UTC | #2
On Sat, Jul 23, 2011 at 3:57 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Saturday, July 23, 2011, Colin Cross wrote:
>> The list of functions that can be called in atomic context is
>> non-intuitive (pm_runtime_put_sync can not, but
>> pm_runtime_put_sync_suspend can, if pm_runtime_irq_safe has
>> been called?).
>
> However, this behavior is documented.
>
> Also, if you have a clean use case for calling rpm_idle() with interrupts
> off, it can be modified to work in analogy with rpm_suspend() in that respect.

Yes, Kevin posted that patch in response to a bug that would never
have existed with this patch.  Even with Kevin's change, this patch
still detects drivers that are missing pm_runtime_irq_safe().

>> The code is actively misleading - the entry
>> points all start with spin_lock_irqsave, suggesting they
>> are safe to call in atomic context, but may later
>> enable interrupts.
>
> May I say it is this way for a reason?

I'll reword that

>> Add might_sleep_if to all the __pm_runtime_* entry points
>> to enforce correct usage.
>
> I'm not sure how this makes things better.

I spent hours tracking down a bug that was caused by
pm_runtime_put_sync enabling interrupts when entering idle, which was
causing the timer interrupt to be serviced before the cpu entered
idle, and the cpu to idle forever until a non-timer interrupt
occurred.  The bug would never have been introduced with this patch.
When I ran with this patch, it immediately caught 3 other cases of
incorrect usage in atomic context, any of which could cause deadlocks:
spin_lock_irqsave(driver lock)
pm_runtime_put_sync
spin_lock_irqsave(dev lock)
spin_unlock_irq(dev_lock) - enables interrupts
driver irq
spin_lock(driver lock)

One of the bugs was put_sync instead of put_sync_suspend, which would
not be a problem after Kevin's patch, but the other two were missing
pm_runtime_irq_safe.

Not every developer who calls a pm_runtime function is going to read
the documentation, and this patch will catch the common incorrect
usage the first time it is run.

I'll update this patch on top of Kevin's.

>> Also add pm_runtime_put_sync_autosuspend to the list of
>> functions that can be called in atomic context.
>
> OK
>
> In addition to that rpm_idle() is missing the __releases __acquires
> annotations.

Do you want that added to this patch?  It seems like that fits better
into Kevin's patch, or a third patch.

>> Signed-off-by: Colin Cross <ccross@android.com>
>> ---
>>  Documentation/power/runtime_pm.txt |    1 +
>>  drivers/base/power/runtime.c       |   16 ++++++++++++++--
>>  2 files changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
>> index b24875b..22852b3 100644
>> --- a/Documentation/power/runtime_pm.txt
>> +++ b/Documentation/power/runtime_pm.txt
>> @@ -469,6 +469,7 @@ pm_runtime_autosuspend()
>>  pm_runtime_resume()
>>  pm_runtime_get_sync()
>>  pm_runtime_put_sync_suspend()
>> +pm_runtime_put_sync_autosuspend()
>>
>>  5. Run-time PM Initialization, Device Probing and Removal
>>
>> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
>> index 0d4587b..fdc4567 100644
>> --- a/drivers/base/power/runtime.c
>> +++ b/drivers/base/power/runtime.c
>> @@ -732,6 +732,8 @@ int __pm_runtime_idle(struct device *dev, int rpmflags)
>>       unsigned long flags;
>>       int retval;
>>
>> +     might_sleep_if(!(rpmflags & RPM_ASYNC));
>> +
>>       if (rpmflags & RPM_GET_PUT) {
>>               if (!atomic_dec_and_test(&dev->power.usage_count))
>>                       return 0;
>> @@ -754,13 +756,16 @@ EXPORT_SYMBOL_GPL(__pm_runtime_idle);
>>   * return immediately if it is larger than zero.  Then carry out a suspend,
>>   * either synchronous or asynchronous.
>>   *
>> - * This routine may be called in atomic context if the RPM_ASYNC flag is set.
>> + * This routine may be called in atomic context if the RPM_ASYNC flag is set,
>> + * or if pm_runtime_irq_safe() has been called.
>>   */
>>  int __pm_runtime_suspend(struct device *dev, int rpmflags)
>>  {
>>       unsigned long flags;
>>       int retval;
>>
>> +     might_sleep_if(!(rpmflags & RPM_ASYNC) && !dev->power.irq_safe);
>> +
>>       if (rpmflags & RPM_GET_PUT) {
>>               if (!atomic_dec_and_test(&dev->power.usage_count))
>>                       return 0;
>> @@ -782,13 +787,16 @@ EXPORT_SYMBOL_GPL(__pm_runtime_suspend);
>>   * If the RPM_GET_PUT flag is set, increment the device's usage count.  Then
>>   * carry out a resume, either synchronous or asynchronous.
>>   *
>> - * This routine may be called in atomic context if the RPM_ASYNC flag is set.
>> + * This routine may be called in atomic context if the RPM_ASYNC flag is set,
>> + * or if pm_runtime_irq_safe() has been called.
>>   */
>>  int __pm_runtime_resume(struct device *dev, int rpmflags)
>>  {
>>       unsigned long flags;
>>       int retval;
>>
>> +     might_sleep_if(!(rpmflags & RPM_ASYNC) && !dev->power.irq_safe);
>> +
>>       if (rpmflags & RPM_GET_PUT)
>>               atomic_inc(&dev->power.usage_count);
>>
>> @@ -978,6 +986,8 @@ EXPORT_SYMBOL_GPL(pm_runtime_barrier);
>>   */
>>  void __pm_runtime_disable(struct device *dev, bool check_resume)
>>  {
>> +     might_sleep();
>> +
>
> Well, it's not correct to call spin_lock_irq() from interrupt context anyway.
>
>>       spin_lock_irq(&dev->power.lock);
>>
>>       if (dev->power.disable_depth > 0) {
>> @@ -1184,6 +1194,8 @@ void __pm_runtime_use_autosuspend(struct device *dev, bool use)
>>  {
>>       int old_delay, old_use;
>>
>> +     might_sleep();
>> +
>>       spin_lock_irq(&dev->power.lock);
>>       old_delay = dev->power.autosuspend_delay;
>>       old_use = dev->power.use_autosuspend;
>>
>
> Thanks,
> Rafael
>
Alan Stern July 24, 2011, 1:41 a.m. UTC | #3
On Sat, 23 Jul 2011, Colin Cross wrote:

> On Sat, Jul 23, 2011 at 3:57 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Saturday, July 23, 2011, Colin Cross wrote:
> >> The list of functions that can be called in atomic context is
> >> non-intuitive (pm_runtime_put_sync can not, but
> >> pm_runtime_put_sync_suspend can, if pm_runtime_irq_safe has
> >> been called?).
> >
> > However, this behavior is documented.
> >
> > Also, if you have a clean use case for calling rpm_idle() with interrupts
> > off, it can be modified to work in analogy with rpm_suspend() in that respect.
> 
> Yes, Kevin posted that patch in response to a bug that would never
> have existed with this patch.  Even with Kevin's change, this patch
> still detects drivers that are missing pm_runtime_irq_safe().

I suggest that adding the annotations to __pm_runtime_idle(),
__pm_runtime_suspend(), and __pm_runtime_resume() is entirely
reasonable.  But the annotations to __pm_runtime_disable() and
__pm_runtime_use_autosuspend() do seem unnecessary.

Alan Stern
Colin Cross July 24, 2011, 1:48 a.m. UTC | #4
On Sat, Jul 23, 2011 at 6:41 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Sat, 23 Jul 2011, Colin Cross wrote:
>
>> On Sat, Jul 23, 2011 at 3:57 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> > On Saturday, July 23, 2011, Colin Cross wrote:
>> >> The list of functions that can be called in atomic context is
>> >> non-intuitive (pm_runtime_put_sync can not, but
>> >> pm_runtime_put_sync_suspend can, if pm_runtime_irq_safe has
>> >> been called?).
>> >
>> > However, this behavior is documented.
>> >
>> > Also, if you have a clean use case for calling rpm_idle() with interrupts
>> > off, it can be modified to work in analogy with rpm_suspend() in that respect.
>>
>> Yes, Kevin posted that patch in response to a bug that would never
>> have existed with this patch.  Even with Kevin's change, this patch
>> still detects drivers that are missing pm_runtime_irq_safe().
>
> I suggest that adding the annotations to __pm_runtime_idle(),
> __pm_runtime_suspend(), and __pm_runtime_resume() is entirely
> reasonable.  But the annotations to __pm_runtime_disable() and
> __pm_runtime_use_autosuspend() do seem unnecessary.

OK, I'll drop those.
Rafael Wysocki July 24, 2011, 9:26 p.m. UTC | #5
On Sunday, July 24, 2011, Colin Cross wrote:
> On Sat, Jul 23, 2011 at 3:57 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Saturday, July 23, 2011, Colin Cross wrote:
> >> The list of functions that can be called in atomic context is
> >> non-intuitive (pm_runtime_put_sync can not, but
> >> pm_runtime_put_sync_suspend can, if pm_runtime_irq_safe has
> >> been called?).
> >
> > However, this behavior is documented.
> >
> > Also, if you have a clean use case for calling rpm_idle() with interrupts
> > off, it can be modified to work in analogy with rpm_suspend() in that respect.
> 
> Yes, Kevin posted that patch in response to a bug that would never
> have existed with this patch.  Even with Kevin's change, this patch
> still detects drivers that are missing pm_runtime_irq_safe().
> 
> >> The code is actively misleading - the entry
> >> points all start with spin_lock_irqsave, suggesting they
> >> are safe to call in atomic context, but may later
> >> enable interrupts.
> >
> > May I say it is this way for a reason?
> 
> I'll reword that
> 
> >> Add might_sleep_if to all the __pm_runtime_* entry points
> >> to enforce correct usage.
> >
> > I'm not sure how this makes things better.
> 
> I spent hours tracking down a bug that was caused by
> pm_runtime_put_sync enabling interrupts when entering idle, which was
> causing the timer interrupt to be serviced before the cpu entered
> idle, and the cpu to idle forever until a non-timer interrupt
> occurred.  The bug would never have been introduced with this patch.
> When I ran with this patch, it immediately caught 3 other cases of
> incorrect usage in atomic context, any of which could cause deadlocks:
> spin_lock_irqsave(driver lock)
> pm_runtime_put_sync
> spin_lock_irqsave(dev lock)
> spin_unlock_irq(dev_lock) - enables interrupts
> driver irq
> spin_lock(driver lock)
> 
> One of the bugs was put_sync instead of put_sync_suspend, which would
> not be a problem after Kevin's patch, but the other two were missing
> pm_runtime_irq_safe.
> 
> Not every developer who calls a pm_runtime function is going to read
> the documentation, and this patch will catch the common incorrect
> usage the first time it is run.
> 
> I'll update this patch on top of Kevin's.
> 
> >> Also add pm_runtime_put_sync_autosuspend to the list of
> >> functions that can be called in atomic context.
> >
> > OK
> >
> > In addition to that rpm_idle() is missing the __releases __acquires
> > annotations.
> 
> Do you want that added to this patch?  It seems like that fits better
> into Kevin's patch, or a third patch.

OK, I'll do a separate patch adding those.

Thanks,
Rafael
diff mbox

Patch

diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
index b24875b..22852b3 100644
--- a/Documentation/power/runtime_pm.txt
+++ b/Documentation/power/runtime_pm.txt
@@ -469,6 +469,7 @@  pm_runtime_autosuspend()
 pm_runtime_resume()
 pm_runtime_get_sync()
 pm_runtime_put_sync_suspend()
+pm_runtime_put_sync_autosuspend()
 
 5. Run-time PM Initialization, Device Probing and Removal
 
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 0d4587b..fdc4567 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -732,6 +732,8 @@  int __pm_runtime_idle(struct device *dev, int rpmflags)
 	unsigned long flags;
 	int retval;
 
+	might_sleep_if(!(rpmflags & RPM_ASYNC));
+
 	if (rpmflags & RPM_GET_PUT) {
 		if (!atomic_dec_and_test(&dev->power.usage_count))
 			return 0;
@@ -754,13 +756,16 @@  EXPORT_SYMBOL_GPL(__pm_runtime_idle);
  * return immediately if it is larger than zero.  Then carry out a suspend,
  * either synchronous or asynchronous.
  *
- * This routine may be called in atomic context if the RPM_ASYNC flag is set.
+ * This routine may be called in atomic context if the RPM_ASYNC flag is set,
+ * or if pm_runtime_irq_safe() has been called.
  */
 int __pm_runtime_suspend(struct device *dev, int rpmflags)
 {
 	unsigned long flags;
 	int retval;
 
+	might_sleep_if(!(rpmflags & RPM_ASYNC) && !dev->power.irq_safe);
+
 	if (rpmflags & RPM_GET_PUT) {
 		if (!atomic_dec_and_test(&dev->power.usage_count))
 			return 0;
@@ -782,13 +787,16 @@  EXPORT_SYMBOL_GPL(__pm_runtime_suspend);
  * If the RPM_GET_PUT flag is set, increment the device's usage count.  Then
  * carry out a resume, either synchronous or asynchronous.
  *
- * This routine may be called in atomic context if the RPM_ASYNC flag is set.
+ * This routine may be called in atomic context if the RPM_ASYNC flag is set,
+ * or if pm_runtime_irq_safe() has been called.
  */
 int __pm_runtime_resume(struct device *dev, int rpmflags)
 {
 	unsigned long flags;
 	int retval;
 
+	might_sleep_if(!(rpmflags & RPM_ASYNC) && !dev->power.irq_safe);
+
 	if (rpmflags & RPM_GET_PUT)
 		atomic_inc(&dev->power.usage_count);
 
@@ -978,6 +986,8 @@  EXPORT_SYMBOL_GPL(pm_runtime_barrier);
  */
 void __pm_runtime_disable(struct device *dev, bool check_resume)
 {
+	might_sleep();
+
 	spin_lock_irq(&dev->power.lock);
 
 	if (dev->power.disable_depth > 0) {
@@ -1184,6 +1194,8 @@  void __pm_runtime_use_autosuspend(struct device *dev, bool use)
 {
 	int old_delay, old_use;
 
+	might_sleep();
+
 	spin_lock_irq(&dev->power.lock);
 	old_delay = dev->power.autosuspend_delay;
 	old_use = dev->power.use_autosuspend;