diff mbox

[v3] PM: check for complete cb before device lock in dpm_complete

Message ID 6146fb8483239461040674927df2e71d4555c59e.1445035227.git.todd.e.brandt@linux.intel.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Todd Brandt Oct. 16, 2015, 10:45 p.m. UTC
If a device has no pm_ops complete callback it shouldn't have
to be locked in order to skip it in the dpm_complete call. This causes
problems when a device without a complete callback has already begun
operation after its resume cb is called. It can end up holding up the
system resume as the pm subsystem tries to get a device lock just to
check for a callback that isn't there.

This is basically the original v1 patch but updated for the latest kernel.

Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
---
 drivers/base/power/main.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Comments

Rafael J. Wysocki Oct. 16, 2015, 10:49 p.m. UTC | #1
Hi,

On Sat, Oct 17, 2015 at 12:45 AM, Todd Brandt
<todd.e.brandt@linux.intel.com> wrote:
> If a device has no pm_ops complete callback it shouldn't have
> to be locked in order to skip it in the dpm_complete call. This causes
> problems when a device without a complete callback has already begun
> operation after its resume cb is called. It can end up holding up the
> system resume as the pm subsystem tries to get a device lock just to
> check for a callback that isn't there.
>
> This is basically the original v1 patch but updated for the latest kernel.
>
> Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
> ---
>  drivers/base/power/main.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> index 1710c26..9bb8ff0 100644
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -899,8 +899,6 @@ static void device_complete(struct device *dev, pm_message_t state)
>         if (dev->power.syscore)
>                 return;
>
> -       device_lock(dev);
> -
>         if (dev->pm_domain) {
>                 info = "completing power domain ";
>                 callback = dev->pm_domain->ops.complete;
> @@ -920,13 +918,15 @@ static void device_complete(struct device *dev, pm_message_t state)
>                 callback = dev->driver->pm->complete;
>         }
>
> -       if (callback) {
> -               pm_dev_dbg(dev, state, info);
> -               callback(dev);
> -       }
> +       if (!callback)
> +               goto Complete;
>

Suppose that the devices is unregistered at this point and callback
was pointing to the driver callback.

> +       device_lock(dev);
> +       pm_dev_dbg(dev, state, info);
> +       callback(dev);

I don't think it is correct to execute that callback here in that case, is it?

>         device_unlock(dev);
>
> +Complete:
>         pm_runtime_put(dev);
>  }

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Todd Brandt Oct. 16, 2015, 11:06 p.m. UTC | #2
On Sat, Oct 17, 2015 at 12:49:04AM +0200, Rafael J. Wysocki wrote:
> Hi,
> 
> On Sat, Oct 17, 2015 at 12:45 AM, Todd Brandt
> <todd.e.brandt@linux.intel.com> wrote:
> > If a device has no pm_ops complete callback it shouldn't have
> > to be locked in order to skip it in the dpm_complete call. This causes
> > problems when a device without a complete callback has already begun
> > operation after its resume cb is called. It can end up holding up the
> > system resume as the pm subsystem tries to get a device lock just to
> > check for a callback that isn't there.
> >
> > This is basically the original v1 patch but updated for the latest kernel.
> >
> > Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
> > ---
> >  drivers/base/power/main.c | 12 ++++++------
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > index 1710c26..9bb8ff0 100644
> > --- a/drivers/base/power/main.c
> > +++ b/drivers/base/power/main.c
> > @@ -899,8 +899,6 @@ static void device_complete(struct device *dev, pm_message_t state)
> >         if (dev->power.syscore)
> >                 return;
> >
> > -       device_lock(dev);
> > -
> >         if (dev->pm_domain) {
> >                 info = "completing power domain ";
> >                 callback = dev->pm_domain->ops.complete;
> > @@ -920,13 +918,15 @@ static void device_complete(struct device *dev, pm_message_t state)
> >                 callback = dev->driver->pm->complete;
> >         }
> >
> > -       if (callback) {
> > -               pm_dev_dbg(dev, state, info);
> > -               callback(dev);
> > -       }
> > +       if (!callback)
> > +               goto Complete;
> >
> 
> Suppose that the devices is unregistered at this point and callback
> was pointing to the driver callback.
> 
> > +       device_lock(dev);
> > +       pm_dev_dbg(dev, state, info);
> > +       callback(dev);
> 
> I don't think it is correct to execute that callback here in that case, is it?

Ahh, so we *should* be worried about an unregister in between the check and the call.
I seem to remember a discussion where this was deemed a non-issue but I must have
misheard (it was a while back).

It sounds to me then that there's just no way to do this safely :(. I'll experiment 
with the i8042 driver to see if theres a way to fix this at a lower level. Thus far
it's the only driver which has this issue.

Thanks for the feedback.

> 
> >         device_unlock(dev);
> >
> > +Complete:
> >         pm_runtime_put(dev);
> >  }
> 
> Thanks,
> Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern Oct. 17, 2015, 3:04 p.m. UTC | #3
On Fri, 16 Oct 2015, Todd E Brandt wrote:

> On Sat, Oct 17, 2015 at 12:49:04AM +0200, Rafael J. Wysocki wrote:
> > Hi,
> > 
> > On Sat, Oct 17, 2015 at 12:45 AM, Todd Brandt
> > <todd.e.brandt@linux.intel.com> wrote:
> > > If a device has no pm_ops complete callback it shouldn't have
> > > to be locked in order to skip it in the dpm_complete call. This causes
> > > problems when a device without a complete callback has already begun
> > > operation after its resume cb is called. It can end up holding up the
> > > system resume as the pm subsystem tries to get a device lock just to
> > > check for a callback that isn't there.
> > >
> > > This is basically the original v1 patch but updated for the latest kernel.
> > >
> > > Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
> > > ---
> > >  drivers/base/power/main.c | 12 ++++++------
> > >  1 file changed, 6 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > > index 1710c26..9bb8ff0 100644
> > > --- a/drivers/base/power/main.c
> > > +++ b/drivers/base/power/main.c
> > > @@ -899,8 +899,6 @@ static void device_complete(struct device *dev, pm_message_t state)
> > >         if (dev->power.syscore)
> > >                 return;
> > >
> > > -       device_lock(dev);
> > > -
> > >         if (dev->pm_domain) {
> > >                 info = "completing power domain ";
> > >                 callback = dev->pm_domain->ops.complete;
> > > @@ -920,13 +918,15 @@ static void device_complete(struct device *dev, pm_message_t state)
> > >                 callback = dev->driver->pm->complete;
> > >         }
> > >
> > > -       if (callback) {
> > > -               pm_dev_dbg(dev, state, info);
> > > -               callback(dev);
> > > -       }
> > > +       if (!callback)
> > > +               goto Complete;
> > >
> > 
> > Suppose that the devices is unregistered at this point and callback
> > was pointing to the driver callback.
> > 
> > > +       device_lock(dev);
> > > +       pm_dev_dbg(dev, state, info);
> > > +       callback(dev);
> > 
> > I don't think it is correct to execute that callback here in that case, is it?
> 
> Ahh, so we *should* be worried about an unregister in between the check and the call.
> I seem to remember a discussion where this was deemed a non-issue but I must have
> misheard (it was a while back).
> 
> It sounds to me then that there's just no way to do this safely :(. I'll experiment 
> with the i8042 driver to see if theres a way to fix this at a lower level. Thus far
> it's the only driver which has this issue.

Fixing it in the driver would be best.  (Why does the i8042 device stay
locked long enough to matter?  And doesn't the driver use async
suspend/resume anyway?)  Still, if you want to do it here safely, there
is a way:

	Compute the callback's address
	If it is not NULL:
		Lock the device
		Compute the callback's address again
		If it is not NULL:
			Invoke the callback
		Unlock the device

However, duplicating the computation likek this would be a waste for
almost all devices.  So unless there's a real need to do this, we
should avoid it.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dmitry Torokhov Nov. 29, 2015, 5:33 a.m. UTC | #4
On Fri, Oct 16, 2015 at 3:49 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> Hi,
>
> On Sat, Oct 17, 2015 at 12:45 AM, Todd Brandt
> <todd.e.brandt@linux.intel.com> wrote:
>> If a device has no pm_ops complete callback it shouldn't have
>> to be locked in order to skip it in the dpm_complete call. This causes
>> problems when a device without a complete callback has already begun
>> operation after its resume cb is called. It can end up holding up the
>> system resume as the pm subsystem tries to get a device lock just to
>> check for a callback that isn't there.
>>
>> This is basically the original v1 patch but updated for the latest kernel.
>>
>> Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
>> ---
>>  drivers/base/power/main.c | 12 ++++++------
>>  1 file changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
>> index 1710c26..9bb8ff0 100644
>> --- a/drivers/base/power/main.c
>> +++ b/drivers/base/power/main.c
>> @@ -899,8 +899,6 @@ static void device_complete(struct device *dev, pm_message_t state)
>>         if (dev->power.syscore)
>>                 return;
>>
>> -       device_lock(dev);
>> -
>>         if (dev->pm_domain) {
>>                 info = "completing power domain ";
>>                 callback = dev->pm_domain->ops.complete;
>> @@ -920,13 +918,15 @@ static void device_complete(struct device *dev, pm_message_t state)
>>                 callback = dev->driver->pm->complete;
>>         }
>>
>> -       if (callback) {
>> -               pm_dev_dbg(dev, state, info);
>> -               callback(dev);
>> -       }
>> +       if (!callback)
>> +               goto Complete;
>>
>
> Suppose that the devices is unregistered at this point and callback
> was pointing to the driver callback.
>
>> +       device_lock(dev);
>> +       pm_dev_dbg(dev, state, info);
>> +       callback(dev);
>
> I don't think it is correct to execute that callback here in that case, is it?

Continuing this line of thought: what if driver was unbound and
rebound again before we got to executing device_complete()? In that
case we should not be calling ->complete() even if it is present,
since it is as if prepare() was not called, and there is nothing for
->complete() to undo.

It seems to me we should not be allowing bind/unbind transitions here.

Thanks.
Rafael J. Wysocki Nov. 30, 2015, 2:15 a.m. UTC | #5
On Saturday, November 28, 2015 09:33:20 PM Dmitry Torokhov wrote:
> On Fri, Oct 16, 2015 at 3:49 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> > Hi,
> >
> > On Sat, Oct 17, 2015 at 12:45 AM, Todd Brandt
> > <todd.e.brandt@linux.intel.com> wrote:
> >> If a device has no pm_ops complete callback it shouldn't have
> >> to be locked in order to skip it in the dpm_complete call. This causes
> >> problems when a device without a complete callback has already begun
> >> operation after its resume cb is called. It can end up holding up the
> >> system resume as the pm subsystem tries to get a device lock just to
> >> check for a callback that isn't there.
> >>
> >> This is basically the original v1 patch but updated for the latest kernel.
> >>
> >> Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
> >> ---
> >>  drivers/base/power/main.c | 12 ++++++------
> >>  1 file changed, 6 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> >> index 1710c26..9bb8ff0 100644
> >> --- a/drivers/base/power/main.c
> >> +++ b/drivers/base/power/main.c
> >> @@ -899,8 +899,6 @@ static void device_complete(struct device *dev, pm_message_t state)
> >>         if (dev->power.syscore)
> >>                 return;
> >>
> >> -       device_lock(dev);
> >> -
> >>         if (dev->pm_domain) {
> >>                 info = "completing power domain ";
> >>                 callback = dev->pm_domain->ops.complete;
> >> @@ -920,13 +918,15 @@ static void device_complete(struct device *dev, pm_message_t state)
> >>                 callback = dev->driver->pm->complete;
> >>         }
> >>
> >> -       if (callback) {
> >> -               pm_dev_dbg(dev, state, info);
> >> -               callback(dev);
> >> -       }
> >> +       if (!callback)
> >> +               goto Complete;
> >>
> >
> > Suppose that the devices is unregistered at this point and callback
> > was pointing to the driver callback.
> >
> >> +       device_lock(dev);
> >> +       pm_dev_dbg(dev, state, info);
> >> +       callback(dev);
> >
> > I don't think it is correct to execute that callback here in that case, is it?
> 
> Continuing this line of thought: what if driver was unbound and
> rebound again before we got to executing device_complete()? In that
> case we should not be calling ->complete() even if it is present,
> since it is as if prepare() was not called, and there is nothing for
> ->complete() to undo.
> 
> It seems to me we should not be allowing bind/unbind transitions here.

Probes will be avoided after this patch:

https://patchwork.kernel.org/patch/7589121/

which I'm going to queue up for 4.5.

I'm not sure about unbinds, though.  Some parts of the kernel seem to want
to be able to remove devices during system suspend/resume.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 1710c26..9bb8ff0 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -899,8 +899,6 @@  static void device_complete(struct device *dev, pm_message_t state)
 	if (dev->power.syscore)
 		return;
 
-	device_lock(dev);
-
 	if (dev->pm_domain) {
 		info = "completing power domain ";
 		callback = dev->pm_domain->ops.complete;
@@ -920,13 +918,15 @@  static void device_complete(struct device *dev, pm_message_t state)
 		callback = dev->driver->pm->complete;
 	}
 
-	if (callback) {
-		pm_dev_dbg(dev, state, info);
-		callback(dev);
-	}
+	if (!callback)
+		goto Complete;
 
+	device_lock(dev);
+	pm_dev_dbg(dev, state, info);
+	callback(dev);
 	device_unlock(dev);
 
+Complete:
 	pm_runtime_put(dev);
 }