[PATCH/RFC] MMC: remove unbalanced pm_runtime_suspend()

Message ID	Pine.LNX.4.64.1104191230150.16641@axis700.grange (mailing list archive)
State	New, archived
Headers	show Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by demeter1.kernel.org (8.14.4/8.14.3) with ESMTP id p3JAjQTC029681 for <patchwork-linux-mmc@patchwork.kernel.org>; Tue, 19 Apr 2011 10:46:25 GMT Date: Tue, 19 Apr 2011 12:46:20 +0200 (CEST) From: Guennadi Liakhovetski <g.liakhovetski@gmx.de> To: linux-mmc@vger.kernel.org cc: Alan Stern <stern@rowland.harvard.edu>, Magnus Damm <damm@opensource.se>, Simon Horman <horms@verge.net.au>, "Rafael J. Wysocki" <rjw@sisk.pl> Subject: [PATCH/RFC] MMC: remove unbalanced pm_runtime_suspend() Message-ID: <Pine.LNX.4.64.1104191230150.16641@axis700.grange> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-mmc-owner@vger.kernel.org Precedence: bulk

Guennadi Liakhovetski April 19, 2011, 10:46 a.m. UTC

MMC bus PM operations implement a .runtime_idle() method, which calls 
pm_runtime_suspend(), but this call is not balanced by a resume 
counterpart, which causes problems with repeated card-plug and driver-load 
cycles. Removing this method fixes the disbalance.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
---

With this patch and with v2 of the MMCIF PM patch, that I'll be posting 
shortly, I can load / unload the driver, insert and remove the card and 
suspend and wake up the system multiple times and each time the full PM 
cycle is performed, going down to the platform callbacks. However, it 
might well be, that there's a more correct way to achieve the same.

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ohad Ben Cohen April 19, 2011, 12:44 p.m. UTC | #1

On Tue, Apr 19, 2011 at 1:46 PM, Guennadi Liakhovetski
<g.liakhovetski@gmx.de> wrote:
> MMC bus PM operations implement a .runtime_idle() method, which calls
> pm_runtime_suspend(), but this call is not balanced by a resume
> counterpart,

What's the exact flow you refer to ?

> which causes problems with repeated card-plug and driver-load
> cycles.

Can you please be more specific ? What are you trying to achieve, what
are the problems you encounter ?

>  Removing this method fixes the disbalance.

I'm not sure which disbalance you're referring to, but if you'll
remove this method you will break MMC/SDIO runtime PM.

More specifically, without having this ->runtime_idle() handler, the
last user giving up its power usage_count (e.g. by calling
pm_runtime_put{_sync}) will not end up powering down the MMC card.
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Guennadi Liakhovetski April 19, 2011, 1:23 p.m. UTC | #2

On Tue, 19 Apr 2011, Ohad Ben-Cohen wrote:

> On Tue, Apr 19, 2011 at 1:46 PM, Guennadi Liakhovetski
> <g.liakhovetski@gmx.de> wrote:
> > MMC bus PM operations implement a .runtime_idle() method, which calls
> > pm_runtime_suspend(), but this call is not balanced by a resume
> > counterpart,
> 
> What's the exact flow you refer to ?

Seeing a "struct dev_pm_ops" instance with .runtime_suspend(), 
.runtime_resume(), and .runtime_idle() methods I understand, that 
"suspend" and "resume" are two counterparts, that balance each other. Now 
with "idle" I am not sure which method should balance it. With platform 
devices in the generic case idle ends up calling 
pm_generic_runtime_idle(), which then calls pm_runtime_suspend(). So, 
there should be a balancing pm_runtime_resume() somewhere?

> > which causes problems with repeated card-plug and driver-load
> > cycles.
> 
> Can you please be more specific ? What are you trying to achieve, what
> are the problems you encounter ?

See this patch:

http://article.gmane.org/gmane.linux.ports.sh.devel/10724

The purpose of that patch is to (1) implement runtime PM in a way, that 
whenever the driver is unloaded or the card is ejected the interface is 
powered down, and (2) implement system-wide PM. For (1) doing the usual

probe()
{
	...
	pm_runtime_enable();
	pm_runtime_resume();
	...
}

remove()
{
	...
	pm_runtime_suspend();
	pm_runtime_dieabls();
	...
}

set_ios()
{
	...
	if (power_down)
		pm_runtime_put();
	if (power_up)
		pm_runtime_get_sync();
	...
}

Doesn't work: some internal MMC counters become disbalanced and after one 
card replug or driver reloading the interface gets stuck with power either 
permanently on or off.

> >  Removing this method fixes the disbalance.
> 
> I'm not sure which disbalance you're referring to, but if you'll
> remove this method you will break MMC/SDIO runtime PM.
> 
> More specifically, without having this ->runtime_idle() handler, the
> last user giving up its power usage_count (e.g. by calling
> pm_runtime_put{_sync}) will not end up powering down the MMC card.

How do they work then? Who does the pm_runtime_resume() to undo the 
effects of the pm_runtime_suspend(), or is it the platform runtime-pm, 
that is implementing the "idle" method wrongly?

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ohad Ben Cohen April 19, 2011, 2:16 p.m. UTC | #3

On Tue, Apr 19, 2011 at 4:23 PM, Guennadi Liakhovetski
<g.liakhovetski@gmx.de> wrote:
> Seeing a "struct dev_pm_ops" instance with .runtime_suspend(),
> .runtime_resume(), and .runtime_idle() methods I understand, that
> "suspend" and "resume" are two counterparts, that balance each other. Now
> with "idle" I am not sure which method should balance it. With platform
> devices in the generic case idle ends up calling
> pm_generic_runtime_idle(), which then calls pm_runtime_suspend(). So,
> there should be a balancing pm_runtime_resume() somewhere?

There are many ways with which the dev_pm_ops handlers get called, but
none of them include imbalance.

E.g. take a look how pm_runtime_{get,put}_sync balance each other,
while involving all three runtime pm handlers that you've specified
(suspend/resume/idle).

Can you point out an existing device/flow that demonstrates a runtime
pm imbalance ?

>> More specifically, without having this ->runtime_idle() handler, the
>> last user giving up its power usage_count (e.g. by calling
>> pm_runtime_put{_sync}) will not end up powering down the MMC card.
>
> How do they work then? Who does the pm_runtime_resume() to undo the
> effects of the pm_runtime_suspend()

Let's take SDIO for example; note how sdio_bus_probe and
sdio_bus_remove balance each other with respect to runtime PM api
invocations.

> or is it the platform runtime-pm that is implementing the "idle" method wrongly?

I'm not following what's wrong exactly. If you point out specific code
and specify exactly the issues you witness, it might be easier to
help.

Thanks,
Ohad.
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

'Alan Stern' April 19, 2011, 2:26 p.m. UTC | #4

On Tue, 19 Apr 2011, Guennadi Liakhovetski wrote:

> On Tue, 19 Apr 2011, Ohad Ben-Cohen wrote:
> 
> > On Tue, Apr 19, 2011 at 1:46 PM, Guennadi Liakhovetski
> > <g.liakhovetski@gmx.de> wrote:
> > > MMC bus PM operations implement a .runtime_idle() method, which calls
> > > pm_runtime_suspend(), but this call is not balanced by a resume
> > > counterpart,
> > 
> > What's the exact flow you refer to ?
> 
> Seeing a "struct dev_pm_ops" instance with .runtime_suspend(), 
> .runtime_resume(), and .runtime_idle() methods I understand, that 
> "suspend" and "resume" are two counterparts, that balance each other.

No, they do not balance each other.  There is no "suspend counter" or
"resume counter".  Whenever a suspend or resume call is made, it
overrides all previous calls.

This is different from dev_pm_info's usage_count value, which _is_ a
counter.  However it does not count suspend or resume calls; instead it
counts the number of routines that need to use the device.  When
usage_count drops to 0, the PM core assumes the device is no longer
being used and may therefore be suspended safely (at which point the
core invokes the pm_idle callback).  If usage_count > 0 then attempts
to call pm_runtime_suspend will fail.

>  Now 
> with "idle" I am not sure which method should balance it. With platform 
> devices in the generic case idle ends up calling 
> pm_generic_runtime_idle(), which then calls pm_runtime_suspend(). So, 
> there should be a balancing pm_runtime_resume() somewhere?

No.  If the device is idle, suspending it is appropriate.  The device 
will be resumed later when a driver needs to use it.

> > > which causes problems with repeated card-plug and driver-load
> > > cycles.
> > 
> > Can you please be more specific ? What are you trying to achieve, what
> > are the problems you encounter ?
> 
> See this patch:
> 
> http://article.gmane.org/gmane.linux.ports.sh.devel/10724
> 
> The purpose of that patch is to (1) implement runtime PM in a way, that 
> whenever the driver is unloaded or the card is ejected the interface is 
> powered down, and (2) implement system-wide PM. For (1) doing the usual
> 
> probe()
> {
> 	...
> 	pm_runtime_enable();
> 	pm_runtime_resume();
> 	...
> }
> 
> remove()
> {
> 	...
> 	pm_runtime_suspend();
> 	pm_runtime_dieabls();
> 	...
> }
> 
> set_ios()
> {
> 	...
> 	if (power_down)
> 		pm_runtime_put();
> 	if (power_up)
> 		pm_runtime_get_sync();
> 	...
> }
> 
> Doesn't work: some internal MMC counters become disbalanced and after one 
> card replug or driver reloading the interface gets stuck with power either 
> permanently on or off.

I'm not familiar enough with the inner workings of the MMC subsystem to 
help much with this.  You'd have to explain things in sufficient detail 
first.

> > >  Removing this method fixes the disbalance.
> > 
> > I'm not sure which disbalance you're referring to, but if you'll
> > remove this method you will break MMC/SDIO runtime PM.
> > 
> > More specifically, without having this ->runtime_idle() handler, the
> > last user giving up its power usage_count (e.g. by calling
> > pm_runtime_put{_sync}) will not end up powering down the MMC card.
> 
> How do they work then? Who does the pm_runtime_resume() to undo the 
> effects of the pm_runtime_suspend(), or is it the platform runtime-pm, 
> that is implementing the "idle" method wrongly?

Put it this way: When do you want the interface to be powered up and 
powered down?  That is, what events should cause the power level to 
change?  The code that handles those events is responsible for calling 
the appropriate PM runtime routines.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Guennadi Liakhovetski April 19, 2011, 10:59 p.m. UTC | #5

Thanks to all for clarifications. Since everyone is convinced, that that 
idle function in mmc bus.c is appropriate, I restored it and managed to 
achieve my goals also without it by adjusting the platform runtime pm and 
power domain prototype support.

But I still need those "cheating" calls to

	pm_runtime_put_noidle(&pdev->dev);
and
	pm_runtime_get_noresume(&pdev->dev);

in the sh_mmcif.c driver. If I use the patch as posted at

http://article.gmane.org/gmane.linux.ports.sh.devel/10724

but without those two calls, and load and unload the driver, while a card 
is plugged in, unloading the driver doesn't power down the interface, 
because the usage_count == 1 also after the kernel has soft-ejected the 
card

mmc0: card 0001 removed

With my put_noidle() / get_noresume() hack all cases of modprobe / rmmod, 
card insert / eject work correctly.

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

'Alan Stern' April 20, 2011, 2:22 p.m. UTC | #6

On Wed, 20 Apr 2011, Guennadi Liakhovetski wrote:

> Thanks to all for clarifications. Since everyone is convinced, that that 
> idle function in mmc bus.c is appropriate, I restored it and managed to 
> achieve my goals also without it by adjusting the platform runtime pm and 
> power domain prototype support.
> 
> But I still need those "cheating" calls to
> 
> 	pm_runtime_put_noidle(&pdev->dev);
> and
> 	pm_runtime_get_noresume(&pdev->dev);
> 
> in the sh_mmcif.c driver. If I use the patch as posted at
> 
> http://article.gmane.org/gmane.linux.ports.sh.devel/10724
> 
> but without those two calls, and load and unload the driver, while a card 
> is plugged in, unloading the driver doesn't power down the interface, 
> because the usage_count == 1 also after the kernel has soft-ejected the 
> card

You haven't explained why you want the interface to be powered down
when the driver is unloaded.  Normally, unloading a driver is pretty
much the exact reverse of loading it -- if the interface wasn't powered
down before the driver was loaded originally then it shouldn't be
powered down after the driver is unloaded.

Conversely, if the interface _was_ powered down before the driver was 
loaded originally, then making unload the exact reverse of load will 
naturally leave the interface powered down, with no need for any extra 
"cheating" calls.

> mmc0: card 0001 removed
> 
> With my put_noidle() / get_noresume() hack all cases of modprobe / rmmod, 
> card insert / eject work correctly.

Depending on the subsystem, those calls may not be "cheating" at all.  
When a subsystem has to deal with a variety of device drivers, some of
which may not support runtime PM, a standard trick is for the subsystem
to increment the usage_count value before probing.  Drivers that aren't
runtime-PM-aware will leave the usage_count alone and therefore the
device won't ever be runtime-suspended.  Drivers that _are_
runtime-PM-aware will know to decrement the usage_count in their probe
routine and increment it in their remove routine.

However this involves calling pm_runtime_put_noidle() during probe and
pm_runtime_get_noresume() during remove -- not during module
initialization and module unloading.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Guennadi Liakhovetski April 20, 2011, 2:50 p.m. UTC | #7

On Wed, 20 Apr 2011, Alan Stern wrote:

> On Wed, 20 Apr 2011, Guennadi Liakhovetski wrote:
> 
> > Thanks to all for clarifications. Since everyone is convinced, that that 
> > idle function in mmc bus.c is appropriate, I restored it and managed to 
> > achieve my goals also without it by adjusting the platform runtime pm and 
> > power domain prototype support.
> > 
> > But I still need those "cheating" calls to
> > 
> > 	pm_runtime_put_noidle(&pdev->dev);
> > and
> > 	pm_runtime_get_noresume(&pdev->dev);
> > 
> > in the sh_mmcif.c driver. If I use the patch as posted at
> > 
> > http://article.gmane.org/gmane.linux.ports.sh.devel/10724
> > 
> > but without those two calls, and load and unload the driver, while a card 
> > is plugged in, unloading the driver doesn't power down the interface, 
> > because the usage_count == 1 also after the kernel has soft-ejected the 
> > card
> 
> You haven't explained why you want the interface to be powered down
> when the driver is unloaded.  Normally, unloading a driver is pretty
> much the exact reverse of loading it -- if the interface wasn't powered
> down before the driver was loaded originally then it shouldn't be
> powered down after the driver is unloaded.

The interface was powered down before loading. And yes, sorry, I actually 
meant probing and removing, which in case of platform devices is almost 
equivalent to driver loading and unloading, yes, you can also use bind / 
unbind to do the same.

> Conversely, if the interface _was_ powered down before the driver was 
> loaded originally, then making unload the exact reverse of load will 
> naturally leave the interface powered down, with no need for any extra 
> "cheating" calls.

Not quite, see below.

> > mmc0: card 0001 removed
> > 
> > With my put_noidle() / get_noresume() hack all cases of modprobe / rmmod, 
> > card insert / eject work correctly.
> 
> Depending on the subsystem, those calls may not be "cheating" at all.  
> When a subsystem has to deal with a variety of device drivers, some of
> which may not support runtime PM, a standard trick is for the subsystem
> to increment the usage_count value before probing.  Drivers that aren't
> runtime-PM-aware will leave the usage_count alone and therefore the
> device won't ever be runtime-suspended.  Drivers that _are_
> runtime-PM-aware will know to decrement the usage_count in their probe
> routine and increment it in their remove routine.
> 
> However this involves calling pm_runtime_put_noidle() during probe and
> pm_runtime_get_noresume() during remove -- not during module
> initialization and module unloading.

As I said above - I did mean probe() and remove(). Now, I have now traced 
down the cause of my problem. In dd.c::driver_probe_device():

	pm_runtime_get_noresume(dev);
	pm_runtime_barrier(dev);
	ret = really_probe(dev, drv);
	pm_runtime_put_sync(dev);

Which means, if originally 

	usage_count = 0
	disable_depth = 1

the first get_noresume() increments

	usage_count = 1

In probe() we do enable(), which decrements

	disable_depth = 0

and resume(), which poweres everything up. If we now return from probe(), 
put_sync() will power down and decrement

	usage_count = 0

which is ok for sh_mmcif.c. If there is no card in the slot, we keep 
interface powered down, card polling works also in powered down mode if 
there's a GPIO, or the MMC core will wake us up every time to check for a 
new card. Next, let's say, a card has been inserted, and we rmmod the 
driver. dd.c::__device_release_driver() does

	pm_runtime_get_noresume(dev);
	pm_runtime_barrier(dev);
	.remove();
	pm_runtime_put_sync(dev);

the first get_noresume() increments

	usage_count = 1

Then if we go the "exact inverse" route, we do suspend(), which does 
nothing, because usage_count > 0, then we do disable(), which increments

	disable_depth = 1

After returning from .remove() pm_runtime_put_sync() is executed, but also 
it does nothing, because disable_depth > 0. The interface stays powered 
on. So, as you see, the problem is on the .remove() path. I work around it 
by doing

	pm_runtime_put_sync(&pdev->dev);
	pm_runtime_disable(&pdev->dev);
	pm_runtime_get_noresume(&pdev->dev);

in the driver. This way the first call decrements

	usage_count = 0

and powers down, the second one increments

	disable_depth = 1

the third one

	usage_count = 1

Then, back in dd.c again

	usage_count = 0

and all is happy:-) But the above triplet looks kinda ugly to me. Is this 
really how it is supposed to work?...

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

'Alan Stern' April 20, 2011, 3:12 p.m. UTC | #8

On Wed, 20 Apr 2011, Guennadi Liakhovetski wrote:

> As I said above - I did mean probe() and remove(). Now, I have now traced 
> down the cause of my problem. In dd.c::driver_probe_device():
> 
> 	pm_runtime_get_noresume(dev);
> 	pm_runtime_barrier(dev);
> 	ret = really_probe(dev, drv);
> 	pm_runtime_put_sync(dev);
> 
> Which means, if originally 
> 
> 	usage_count = 0
> 	disable_depth = 1
> 
> the first get_noresume() increments
> 
> 	usage_count = 1
> 
> In probe() we do enable(), which decrements
> 
> 	disable_depth = 0
> 
> and resume(), which poweres everything up. If we now return from probe(), 
> put_sync() will power down and decrement
> 
> 	usage_count = 0
> 
> which is ok for sh_mmcif.c. If there is no card in the slot, we keep 
> interface powered down, card polling works also in powered down mode if 
> there's a GPIO, or the MMC core will wake us up every time to check for a 
> new card. Next, let's say, a card has been inserted, and we rmmod the 
> driver. dd.c::__device_release_driver() does
> 
> 	pm_runtime_get_noresume(dev);
> 	pm_runtime_barrier(dev);
> 	.remove();
> 	pm_runtime_put_sync(dev);
> 
> the first get_noresume() increments
> 
> 	usage_count = 1
> 
> Then if we go the "exact inverse" route, we do suspend(), which does 
> nothing, because usage_count > 0, then we do disable(), which increments
> 
> 	disable_depth = 1
> 
> After returning from .remove() pm_runtime_put_sync() is executed, but also 
> it does nothing, because disable_depth > 0. The interface stays powered 
> on. So, as you see, the problem is on the .remove() path. I work around it 
> by doing
> 
> 	pm_runtime_put_sync(&pdev->dev);
> 	pm_runtime_disable(&pdev->dev);
> 	pm_runtime_get_noresume(&pdev->dev);
> 
> in the driver. This way the first call decrements
> 
> 	usage_count = 0
> 
> and powers down, the second one increments
> 
> 	disable_depth = 1
> 
> the third one
> 
> 	usage_count = 1
> 
> Then, back in dd.c again
> 
> 	usage_count = 0
> 
> and all is happy:-) But the above triplet looks kinda ugly to me. Is this 
> really how it is supposed to work?...

Ah, now I see the problem.  It looks like we did not give sufficient
thought to the case where a device starts off (and therefore should
finish up) in a powered-down state.  Calling pm_runtime_put_sync()
after unbinding the device driver seems a little futile -- with no
driver, the subsystem may not be able to power-down the device!

Rafael, how do you think we should handle this?  Get rid of the 
pm_runtime_get_no_resume() and pm_runtime_put_sync() calls in 
dd.c:__device_release_driver()?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Rafael Wysocki April 20, 2011, 8:06 p.m. UTC | #9

[Added linux-pm to the CC list.]

On Wednesday, April 20, 2011, Alan Stern wrote:
> On Wed, 20 Apr 2011, Guennadi Liakhovetski wrote:
...
> Ah, now I see the problem.  It looks like we did not give sufficient
> thought to the case where a device starts off (and therefore should
> finish up) in a powered-down state.  Calling pm_runtime_put_sync()
> after unbinding the device driver seems a little futile -- with no
> driver, the subsystem may not be able to power-down the device!
> 
> Rafael, how do you think we should handle this?  Get rid of the 
> pm_runtime_get_no_resume() and pm_runtime_put_sync() calls in 
> dd.c:__device_release_driver()?

I think we need pm_runtime_barrier() in there.  Otherwise we risk
removing the driver while there's a runtime PM request pending.

But we can move the pm_runtime_put_sync() before driver_sysfs_remove().

Would that help?

Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

'Alan Stern' April 20, 2011, 9:16 p.m. UTC | #10

On Wed, 20 Apr 2011, Rafael J. Wysocki wrote:

> On Wednesday, April 20, 2011, Alan Stern wrote:
> > On Wed, 20 Apr 2011, Guennadi Liakhovetski wrote:
> ...
> > Ah, now I see the problem.  It looks like we did not give sufficient
> > thought to the case where a device starts off (and therefore should
> > finish up) in a powered-down state.  Calling pm_runtime_put_sync()
> > after unbinding the device driver seems a little futile -- with no
> > driver, the subsystem may not be able to power-down the device!
> > 
> > Rafael, how do you think we should handle this?  Get rid of the 
> > pm_runtime_get_no_resume() and pm_runtime_put_sync() calls in 
> > dd.c:__device_release_driver()?
> 
> I think we need pm_runtime_barrier() in there.  Otherwise we risk
> removing the driver while there's a runtime PM request pending.
> 
> But we can move the pm_runtime_put_sync() before driver_sysfs_remove().

What happens if another runtime PM request is queued between the
put_sync() and the remove callback?  We may need a safe way to prevent
async runtime PM requests while still allowing synchronous requests.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Rafael Wysocki April 20, 2011, 9:44 p.m. UTC | #11

On Wednesday, April 20, 2011, Alan Stern wrote:
> On Wed, 20 Apr 2011, Rafael J. Wysocki wrote:
> 
> > On Wednesday, April 20, 2011, Alan Stern wrote:
> > > On Wed, 20 Apr 2011, Guennadi Liakhovetski wrote:
> > ...
> > > Ah, now I see the problem.  It looks like we did not give sufficient
> > > thought to the case where a device starts off (and therefore should
> > > finish up) in a powered-down state.  Calling pm_runtime_put_sync()
> > > after unbinding the device driver seems a little futile -- with no
> > > driver, the subsystem may not be able to power-down the device!
> > > 
> > > Rafael, how do you think we should handle this?  Get rid of the 
> > > pm_runtime_get_no_resume() and pm_runtime_put_sync() calls in 
> > > dd.c:__device_release_driver()?
> > 
> > I think we need pm_runtime_barrier() in there.  Otherwise we risk
> > removing the driver while there's a runtime PM request pending.
> > 
> > But we can move the pm_runtime_put_sync() before driver_sysfs_remove().
> 
> What happens if another runtime PM request is queued between the
> put_sync() and the remove callback?  We may need a safe way to prevent
> async runtime PM requests while still allowing synchronous requests.

What about making a rule that it is invalid to schedule a future suspend
or queue a resume request of a device whose driver is being removed?

Arguably, we can't prevent people from shooting themselves in the foot this
way or another and I'm not sure if this particular case is worth additional
handling.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

'Alan Stern' April 21, 2011, 1:58 p.m. UTC | #12

On Wed, 20 Apr 2011, Rafael J. Wysocki wrote:

> On Wednesday, April 20, 2011, Alan Stern wrote:
> > On Wed, 20 Apr 2011, Rafael J. Wysocki wrote:
> > 
> > > On Wednesday, April 20, 2011, Alan Stern wrote:
> > > > On Wed, 20 Apr 2011, Guennadi Liakhovetski wrote:
> > > ...
> > > > Ah, now I see the problem.  It looks like we did not give sufficient
> > > > thought to the case where a device starts off (and therefore should
> > > > finish up) in a powered-down state.  Calling pm_runtime_put_sync()
> > > > after unbinding the device driver seems a little futile -- with no
> > > > driver, the subsystem may not be able to power-down the device!
> > > > 
> > > > Rafael, how do you think we should handle this?  Get rid of the 
> > > > pm_runtime_get_no_resume() and pm_runtime_put_sync() calls in 
> > > > dd.c:__device_release_driver()?
> > > 
> > > I think we need pm_runtime_barrier() in there.  Otherwise we risk
> > > removing the driver while there's a runtime PM request pending.
> > > 
> > > But we can move the pm_runtime_put_sync() before driver_sysfs_remove().
> > 
> > What happens if another runtime PM request is queued between the
> > put_sync() and the remove callback?  We may need a safe way to prevent
> > async runtime PM requests while still allowing synchronous requests.
> 
> What about making a rule that it is invalid to schedule a future suspend
> or queue a resume request of a device whose driver is being removed?
> 
> Arguably, we can't prevent people from shooting themselves in the foot this
> way or another and I'm not sure if this particular case is worth additional
> handling.

After thinking about this, I tend to agree.  The synchronization 
issues, combined with the unknown needs of the driver, make this very 
difficult to handle in the PM core.

Here's another possible approach: If a driver wants to leave its device 
in a powered-down state after unbinding then it can invoke its own 
runtime_suspend callback directly, in the following way:

	... unregister all child devices below dev ...
	pm_runtime_disable(dev);
	if (dev->power.runtime_status != RPM_SUSPENDED) {
		pm_set_suspended(dev);
		my_runtime_suspend_callback(dev);
	}

There may be issues regarding coordination with the subsystem or the
power domain; at the moment it's not clear what should be done.  Maybe
the runtime-PM core should include an API for directly invoking the
appropriate callbacks.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Rafael Wysocki April 21, 2011, 6 p.m. UTC | #13

On Thursday, April 21, 2011, Alan Stern wrote:
> On Wed, 20 Apr 2011, Rafael J. Wysocki wrote:
> 
> > On Wednesday, April 20, 2011, Alan Stern wrote:
> > > On Wed, 20 Apr 2011, Rafael J. Wysocki wrote:
> > > 
> > > > On Wednesday, April 20, 2011, Alan Stern wrote:
> > > > > On Wed, 20 Apr 2011, Guennadi Liakhovetski wrote:
> > > > ...
> > > > > Ah, now I see the problem.  It looks like we did not give sufficient
> > > > > thought to the case where a device starts off (and therefore should
> > > > > finish up) in a powered-down state.  Calling pm_runtime_put_sync()
> > > > > after unbinding the device driver seems a little futile -- with no
> > > > > driver, the subsystem may not be able to power-down the device!
> > > > > 
> > > > > Rafael, how do you think we should handle this?  Get rid of the 
> > > > > pm_runtime_get_no_resume() and pm_runtime_put_sync() calls in 
> > > > > dd.c:__device_release_driver()?
> > > > 
> > > > I think we need pm_runtime_barrier() in there.  Otherwise we risk
> > > > removing the driver while there's a runtime PM request pending.
> > > > 
> > > > But we can move the pm_runtime_put_sync() before driver_sysfs_remove().
> > > 
> > > What happens if another runtime PM request is queued between the
> > > put_sync() and the remove callback?  We may need a safe way to prevent
> > > async runtime PM requests while still allowing synchronous requests.
> > 
> > What about making a rule that it is invalid to schedule a future suspend
> > or queue a resume request of a device whose driver is being removed?
> > 
> > Arguably, we can't prevent people from shooting themselves in the foot this
> > way or another and I'm not sure if this particular case is worth additional
> > handling.
> 
> After thinking about this, I tend to agree.  The synchronization 
> issues, combined with the unknown needs of the driver, make this very 
> difficult to handle in the PM core.
> 
> Here's another possible approach: If a driver wants to leave its device 
> in a powered-down state after unbinding then it can invoke its own 
> runtime_suspend callback directly, in the following way:
> 
> 	... unregister all child devices below dev ...
> 	pm_runtime_disable(dev);
> 	if (dev->power.runtime_status != RPM_SUSPENDED) {
> 		pm_set_suspended(dev);
> 		my_runtime_suspend_callback(dev);
> 	}

I think this would work too, but then possibly many drivers would have to
do the same thing in their "remove" routines.

> There may be issues regarding coordination with the subsystem or the
> power domain; at the moment it's not clear what should be done.  Maybe
> the runtime-PM core should include an API for directly invoking the
> appropriate callbacks.

If we choose this approach, then yes, we should provide a suitable API, but
I'm still thinking it would be simpler to move the pm_runtime_put_sync() before driver_sysfs_remove() and make the rule as I said previously. :-)

Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

'Alan Stern' April 21, 2011, 6:36 p.m. UTC | #14

On Thu, 21 Apr 2011, Rafael J. Wysocki wrote:

> > > What about making a rule that it is invalid to schedule a future suspend
> > > or queue a resume request of a device whose driver is being removed?
> > > 
> > > Arguably, we can't prevent people from shooting themselves in the foot this
> > > way or another and I'm not sure if this particular case is worth additional
> > > handling.
> > 
> > After thinking about this, I tend to agree.  The synchronization 
> > issues, combined with the unknown needs of the driver, make this very 
> > difficult to handle in the PM core.
> > 
> > Here's another possible approach: If a driver wants to leave its device 
> > in a powered-down state after unbinding then it can invoke its own 
> > runtime_suspend callback directly, in the following way:
> > 
> > 	... unregister all child devices below dev ...
> > 	pm_runtime_disable(dev);
> > 	if (dev->power.runtime_status != RPM_SUSPENDED) {
> > 		pm_set_suspended(dev);
> > 		my_runtime_suspend_callback(dev);
> > 	}
> 
> I think this would work too, but then possibly many drivers would have to
> do the same thing in their "remove" routines.
> 
> > There may be issues regarding coordination with the subsystem or the
> > power domain; at the moment it's not clear what should be done.  Maybe
> > the runtime-PM core should include an API for directly invoking the
> > appropriate callbacks.
> 
> If we choose this approach, then yes, we should provide a suitable API, but
> I'm still thinking it would be simpler to move the pm_runtime_put_sync() before driver_sysfs_remove() and make the rule as I said previously. :-)

The problem is synchronization.  At what point is the driver supposed 
to stop queuing runtime PM requests?  It would have to be sometime 
before the pm_runtime_barrier() call.  How is the driver supposed to 
know when that point is reached?  The remove routine isn't called until 
later.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Rafael Wysocki April 21, 2011, 8:05 p.m. UTC | #15

On Thursday, April 21, 2011, Alan Stern wrote:
> On Thu, 21 Apr 2011, Rafael J. Wysocki wrote:
> 
> > > > What about making a rule that it is invalid to schedule a future suspend
> > > > or queue a resume request of a device whose driver is being removed?
> > > > 
> > > > Arguably, we can't prevent people from shooting themselves in the foot this
> > > > way or another and I'm not sure if this particular case is worth additional
> > > > handling.
> > > 
> > > After thinking about this, I tend to agree.  The synchronization 
> > > issues, combined with the unknown needs of the driver, make this very 
> > > difficult to handle in the PM core.
> > > 
> > > Here's another possible approach: If a driver wants to leave its device 
> > > in a powered-down state after unbinding then it can invoke its own 
> > > runtime_suspend callback directly, in the following way:
> > > 
> > > 	... unregister all child devices below dev ...
> > > 	pm_runtime_disable(dev);
> > > 	if (dev->power.runtime_status != RPM_SUSPENDED) {
> > > 		pm_set_suspended(dev);
> > > 		my_runtime_suspend_callback(dev);
> > > 	}
> > 
> > I think this would work too, but then possibly many drivers would have to
> > do the same thing in their "remove" routines.
> > 
> > > There may be issues regarding coordination with the subsystem or the
> > > power domain; at the moment it's not clear what should be done.  Maybe
> > > the runtime-PM core should include an API for directly invoking the
> > > appropriate callbacks.
> > 
> > If we choose this approach, then yes, we should provide a suitable API, but
> > I'm still thinking it would be simpler to move the pm_runtime_put_sync() before driver_sysfs_remove() and make the rule as I said previously. :-)
> 
> The problem is synchronization.  At what point is the driver supposed 
> to stop queuing runtime PM requests?  It would have to be sometime 
> before the pm_runtime_barrier() call.  How is the driver supposed to 
> know when that point is reached?  The remove routine isn't called until 
> later.

Executing the driver's callback is not an ideal solution either, because
it simply may be insufficient (it may be necessary to execute the power
domain and/or subsystem callbacks, pretty much what rpm_suspend() does,
but without taking the usage counter into consideration).

Moreover,  if we want the driver's ->remove() to do the cleanup anyway,
there's not much point in doing any cleanup before in the core.  Also,
there's a little problem that the bus ->remove() is called before the
driver's ->remove(), so it may not be entirely possible to power down
the device when the driver's ->remove() is called already.

I think the current code is better than any of the alternatives considered
so far.

Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

'Alan Stern' April 21, 2011, 9:48 p.m. UTC | #16

On Thu, 21 Apr 2011, Rafael J. Wysocki wrote:

> > > If we choose this approach, then yes, we should provide a suitable API, but
> > > I'm still thinking it would be simpler to move the pm_runtime_put_sync() before driver_sysfs_remove() and make the rule as I said previously. :-)
> > 
> > The problem is synchronization.  At what point is the driver supposed 
> > to stop queuing runtime PM requests?  It would have to be sometime 
> > before the pm_runtime_barrier() call.  How is the driver supposed to 
> > know when that point is reached?  The remove routine isn't called until 
> > later.
> 
> Executing the driver's callback is not an ideal solution either, because
> it simply may be insufficient (it may be necessary to execute the power
> domain and/or subsystem callbacks, pretty much what rpm_suspend() does,
> but without taking the usage counter into consideration).

That's why I suggested a new API.  It could do the right callbacks.

> Moreover,  if we want the driver's ->remove() to do the cleanup anyway,
> there's not much point in doing any cleanup before in the core.  Also,
> there's a little problem that the bus ->remove() is called before the
> driver's ->remove(), so it may not be entirely possible to power down
> the device when the driver's ->remove() is called already.

Actually, the bus->remove() callback (if there is one) is responsible
for invoking the driver's callback.  The subsystem should be smart
enough to handle runtime PM requests while the driver's remove callback
is running.

> I think the current code is better than any of the alternatives considered
> so far.

Then you think Guennadi should leave his patch as it is, including the 
"reversed" put/get?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Rafael Wysocki April 21, 2011, 10:06 p.m. UTC | #17

On Thursday, April 21, 2011, Alan Stern wrote:
> On Thu, 21 Apr 2011, Rafael J. Wysocki wrote:
> 
> > > > If we choose this approach, then yes, we should provide a suitable API, but
> > > > I'm still thinking it would be simpler to move the pm_runtime_put_sync() before driver_sysfs_remove() and make the rule as I said previously. :-)
> > > 
> > > The problem is synchronization.  At what point is the driver supposed 
> > > to stop queuing runtime PM requests?  It would have to be sometime 
> > > before the pm_runtime_barrier() call.  How is the driver supposed to 
> > > know when that point is reached?  The remove routine isn't called until 
> > > later.
> > 
> > Executing the driver's callback is not an ideal solution either, because
> > it simply may be insufficient (it may be necessary to execute the power
> > domain and/or subsystem callbacks, pretty much what rpm_suspend() does,
> > but without taking the usage counter into consideration).
> 
> That's why I suggested a new API.  It could do the right callbacks.
> 
> > Moreover,  if we want the driver's ->remove() to do the cleanup anyway,
> > there's not much point in doing any cleanup before in the core.  Also,
> > there's a little problem that the bus ->remove() is called before the
> > driver's ->remove(), so it may not be entirely possible to power down
> > the device when the driver's ->remove() is called already.
> 
> Actually, the bus->remove() callback (if there is one) is responsible
> for invoking the driver's callback.

Ah, sorry, I misread the code in __device_release_driver() (too little
coffee perhaps).

> The subsystem should be smart enough to handle runtime PM requests while
> the driver's remove callback is running.

If we make such a rule, we may as well remove all of the runtime PM
calls from __device_release_driver().
 
> > I think the current code is better than any of the alternatives considered
> > so far.
> 
> Then you think Guennadi should leave his patch as it is, including the 
> "reversed" put/get?

This, or we need to remove the runtime PM calls from __device_release_driver().

I'm a bit worried about the driver_sysfs_remove() and the bus notifier that
in theory may affect the runtime PM callbacks potentially executed before
->remove() is called.

Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH/RFC] MMC: remove unbalanced pm_runtime_suspend()

Commit Message

Comments

Patch