| Message ID | 1343367309.3874.21.camel@yhuang-dev (mailing list archive) |
|---|---|
| State | New, archived |
| Delegated to | Bjorn Helgaas |
Huang Ying <ying.huang@intel.com> writes:

> Do you have time to try the following patch?
>
> Best Regards,
> Huang Ying
>
> ---
>  drivers/pci/pci-driver.c |    6 ++++++
>  1 file changed, 6 insertions(+)
>
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -280,8 +280,12 @@ static long local_pci_probe(void *_ddi)
>  {
>  	struct drv_dev_and_id *ddi = _ddi;
>  	struct device *dev = &ddi->dev->dev;
> +	struct device *parent = dev->parent;
>  	int rc;
>
> +	/* The parent bridge must be in active state when probing */
> +	if (parent)
> +		pm_runtime_get_sync(parent);
>  	/* Unbound PCI devices are always set to disabled and suspended.
>  	 * During probe, the device is set to enabled and active and the
>  	 * usage count is incremented.  If the driver supports runtime PM,
> @@ -298,6 +302,8 @@ static long local_pci_probe(void *_ddi)
>  		pm_runtime_set_suspended(dev);
>  		pm_runtime_put_noidle(dev);
>  	}
> +	if (parent)
> +		pm_runtime_put(parent);
>  	return rc;
>  }

Yup, that worked in the quick test I just did.  lspci reading the device
config will still not wake the bridge, but I assume that is intentional?
But loading the device driver now wakes both the bridge and the device,
so that works.

Bjørn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
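The counter discipline the patch introduces can be sketched as a small userspace model. This is a hypothetical mock, not kernel code: `mock_dev`, the `mock_*` helpers, and `probe_balances_counts()` are all invented names standing in for the runtime-PM API. It only illustrates that the parent bridge's usage count is taken before the probe and balanced afterwards, whether or not the probe succeeds.

```c
/* Hypothetical userspace mock of the runtime-PM usage counting around
 * local_pci_probe(); none of these names are real kernel API. */
#include <stddef.h>

struct mock_dev {
	int usage_count;		/* models the pm_runtime usage counter */
	int active;			/* 1 = RPM_ACTIVE, 0 = RPM_SUSPENDED */
	struct mock_dev *parent;
};

static void mock_get_sync(struct mock_dev *d)	/* ~ pm_runtime_get_sync() */
{
	d->usage_count++;
	d->active = 1;			/* resumed before returning */
}

static void mock_put(struct mock_dev *d)	/* ~ pm_runtime_put() */
{
	if (--d->usage_count == 0)
		d->active = 0;		/* may suspend once idle */
}

/* Model of local_pci_probe() with the patch applied: hold a reference
 * on the parent bridge across the probe so it stays active. */
static int mock_local_pci_probe(struct mock_dev *dev, int probe_rc)
{
	struct mock_dev *parent = dev->parent;

	if (parent)
		mock_get_sync(parent);	/* bridge must be active to probe */

	mock_get_sync(dev);		/* device set enabled and active */
	if (probe_rc) {			/* probe failed: undo */
		dev->active = 0;	/* ~ pm_runtime_set_suspended() */
		dev->usage_count--;	/* ~ pm_runtime_put_noidle() */
	}

	if (parent)
		mock_put(parent);	/* drop the bridge reference */
	return probe_rc;
}

/* Returns 1 if the parent's usage count is balanced after probing. */
int probe_balances_counts(int probe_rc)
{
	struct mock_dev bridge = { 0, 0, NULL };
	struct mock_dev dev = { 0, 0, &bridge };

	mock_local_pci_probe(&dev, probe_rc);
	return bridge.usage_count == 0;
}
```

Whatever the probe's return code, the bridge's count goes up once and comes down once, which matches Bjørn's observation that loading a driver wakes the bridge only for the duration of the probe.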
On Fri, 27 Jul 2012, Huang Ying wrote:

> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -280,8 +280,12 @@ static long local_pci_probe(void *_ddi)
> {
> 	struct drv_dev_and_id *ddi = _ddi;
> 	struct device *dev = &ddi->dev->dev;
> +	struct device *parent = dev->parent;
> 	int rc;
>
> +	/* The parent bridge must be in active state when probing */
> +	if (parent)
> +		pm_runtime_get_sync(parent);

Ooh, this is a very good point.  I completely missed it.

> 	/* Unbound PCI devices are always set to disabled and suspended.
> 	 * During probe, the device is set to enabled and active and the
> 	 * usage count is incremented.  If the driver supports runtime PM,
> @@ -298,6 +302,8 @@ static long local_pci_probe(void *_ddi)
> 		pm_runtime_set_suspended(dev);
> 		pm_runtime_put_noidle(dev);
> 	}
> +	if (parent)
> +		pm_runtime_put(parent);

You should use pm_runtime_put_sync(), not pm_runtime_put().

Alan Stern
On Friday, July 27, 2012, Alan Stern wrote:
> On Fri, 27 Jul 2012, Huang Ying wrote:
>
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -280,8 +280,12 @@ static long local_pci_probe(void *_ddi)
> > {
> > 	struct drv_dev_and_id *ddi = _ddi;
> > 	struct device *dev = &ddi->dev->dev;
> > +	struct device *parent = dev->parent;
> > 	int rc;
> >
> > +	/* The parent bridge must be in active state when probing */
> > +	if (parent)
> > +		pm_runtime_get_sync(parent);
>
> Ooh, this is a very good point.  I completely missed it.
>
> > 	/* Unbound PCI devices are always set to disabled and suspended.
> > 	 * During probe, the device is set to enabled and active and the
> > 	 * usage count is incremented.  If the driver supports runtime PM,
> > @@ -298,6 +302,8 @@ static long local_pci_probe(void *_ddi)
> > 		pm_runtime_set_suspended(dev);
> > 		pm_runtime_put_noidle(dev);
> > 	}
> > +	if (parent)
> > +		pm_runtime_put(parent);
>
> You should use pm_runtime_put_sync(), not pm_runtime_put().

Hmm, why exactly?

Rafael
On Fri, 27 Jul 2012, Rafael J. Wysocki wrote:

> On Friday, July 27, 2012, Alan Stern wrote:
> > On Fri, 27 Jul 2012, Huang Ying wrote:
> >
> > > --- a/drivers/pci/pci-driver.c
> > > +++ b/drivers/pci/pci-driver.c
> > > @@ -280,8 +280,12 @@ static long local_pci_probe(void *_ddi)
> > > {
> > > 	struct drv_dev_and_id *ddi = _ddi;
> > > 	struct device *dev = &ddi->dev->dev;
> > > +	struct device *parent = dev->parent;
> > > 	int rc;
> > >
> > > +	/* The parent bridge must be in active state when probing */
> > > +	if (parent)
> > > +		pm_runtime_get_sync(parent);
> >
> > Ooh, this is a very good point.  I completely missed it.
> >
> > > 	/* Unbound PCI devices are always set to disabled and suspended.
> > > 	 * During probe, the device is set to enabled and active and the
> > > 	 * usage count is incremented.  If the driver supports runtime PM,
> > > @@ -298,6 +302,8 @@ static long local_pci_probe(void *_ddi)
> > > 		pm_runtime_set_suspended(dev);
> > > 		pm_runtime_put_noidle(dev);
> > > 	}
> > > +	if (parent)
> > > +		pm_runtime_put(parent);
> >
> > You should use pm_runtime_put_sync(), not pm_runtime_put().
>
> Hmm, why exactly?

Because it's more efficient to do something directly than to run it in
a workqueue.

Alan Stern
On Friday, July 27, 2012, Alan Stern wrote:
> On Fri, 27 Jul 2012, Rafael J. Wysocki wrote:
>
> > On Friday, July 27, 2012, Alan Stern wrote:
> > > On Fri, 27 Jul 2012, Huang Ying wrote:
> > >
> > > > --- a/drivers/pci/pci-driver.c
> > > > +++ b/drivers/pci/pci-driver.c
> > > > @@ -280,8 +280,12 @@ static long local_pci_probe(void *_ddi)
> > > > {
> > > > 	struct drv_dev_and_id *ddi = _ddi;
> > > > 	struct device *dev = &ddi->dev->dev;
> > > > +	struct device *parent = dev->parent;
> > > > 	int rc;
> > > >
> > > > +	/* The parent bridge must be in active state when probing */
> > > > +	if (parent)
> > > > +		pm_runtime_get_sync(parent);
> > >
> > > Ooh, this is a very good point.  I completely missed it.
> > >
> > > > 	/* Unbound PCI devices are always set to disabled and suspended.
> > > > 	 * During probe, the device is set to enabled and active and the
> > > > 	 * usage count is incremented.  If the driver supports runtime PM,
> > > > @@ -298,6 +302,8 @@ static long local_pci_probe(void *_ddi)
> > > > 		pm_runtime_set_suspended(dev);
> > > > 		pm_runtime_put_noidle(dev);
> > > > 	}
> > > > +	if (parent)
> > > > +		pm_runtime_put(parent);
> > >
> > > You should use pm_runtime_put_sync(), not pm_runtime_put().
> >
> > Hmm, why exactly?
>
> Because it's more efficient to do something directly than to run it in
> a workqueue.

Well, depends.  If that results in a power off (the parent goes into
D3 for example), we may wait as long as 10 ms for that to complete.

Thanks,
Rafael
On Fri, 27 Jul 2012, Rafael J. Wysocki wrote:

> > > > > +	if (parent)
> > > > > +		pm_runtime_put(parent);
> > > >
> > > > You should use pm_runtime_put_sync(), not pm_runtime_put().
> > >
> > > Hmm, why exactly?
> >
> > Because it's more efficient to do something directly than to run it in
> > a workqueue.
>
> Well, depends.  If that results in a power off (the parent goes into
> D3 for example), we may wait as long as 10 ms for that to complete.

True, but so what?  You'd also have to wait 10 ms for the workqueue
item to complete if pm_runtime_put() was used, plus the additional
overhead of scheduling the workqueue thread.

Now, if the length of time required for the probe to run were an issue
then yes, the synchronous routine could make things take longer.  But
probing is generally done asynchronously anyway, right?  And usually
what matters is the time before the device becomes available, not the
time for the probe to run.

Alan Stern
On Saturday, July 28, 2012, Alan Stern wrote:
> On Fri, 27 Jul 2012, Rafael J. Wysocki wrote:
>
> > > > > > +	if (parent)
> > > > > > +		pm_runtime_put(parent);
> > > > >
> > > > > You should use pm_runtime_put_sync(), not pm_runtime_put().
> > > >
> > > > Hmm, why exactly?
> > >
> > > Because it's more efficient to do something directly than to run it in
> > > a workqueue.
> >
> > Well, depends.  If that results in a power off (the parent goes into
> > D3 for example), we may wait as long as 10 ms for that to complete.
>
> True, but so what?  You'd also have to wait 10 ms for the workqueue
> item to complete if pm_runtime_put() was used,

Are you sure?  pm_runtime_put() leads to rpm_idle() with the RPM_ASYNC
flag set, which causes it to queue up the work item and return immediately
without waiting.  Why exactly do you think I'd need to wait, then?

> plus the additional overhead of scheduling the workqueue thread.
>
> Now, if the length of time required for the probe to run were an issue
> then yes, the synchronous routine could make things take longer.  But
> probing is generally done asynchronously anyway, right?  And usually
> what matters is the time before the device becomes available, not the
> time for the probe to run.

Still, I see no reason to wait synchronously for the _parent_ to suspend.

Thanks,
Rafael
On Sat, 28 Jul 2012, Rafael J. Wysocki wrote:

> On Saturday, July 28, 2012, Alan Stern wrote:
> > On Fri, 27 Jul 2012, Rafael J. Wysocki wrote:
> >
> > > > > > > +	if (parent)
> > > > > > > +		pm_runtime_put(parent);
> > > > > >
> > > > > > You should use pm_runtime_put_sync(), not pm_runtime_put().
> > > > >
> > > > > Hmm, why exactly?
> > > >
> > > > Because it's more efficient to do something directly than to run it in
> > > > a workqueue.
> > >
> > > Well, depends.  If that results in a power off (the parent goes into
> > > D3 for example), we may wait as long as 10 ms for that to complete.
> >
> > True, but so what?  You'd also have to wait 10 ms for the workqueue
> > item to complete if pm_runtime_put() was used,
>
> Are you sure?  pm_runtime_put() leads to rpm_idle() with the RPM_ASYNC
> flag set, which causes it to queue up the work item and return immediately
> without waiting.  Why exactly do you think I'd need to wait, then?

What I mean is, it takes 10 ms for the parent to suspend regardless of
whether or not you use the _sync form of the call.  If you're waiting
for the parent to suspend, you would therefore have to wait 10 ms in
either case.

Likewise, if you're waiting for the PCI device to become usable then
you don't have to wait for 10 ms, because you can use it as soon as the
probe routine returns.

> > plus the additional overhead of scheduling the workqueue thread.
> >
> > Now, if the length of time required for the probe to run were an issue
> > then yes, the synchronous routine could make things take longer.  But
> > probing is generally done asynchronously anyway, right?  And usually
> > what matters is the time before the device becomes available, not the
> > time for the probe to run.
>
> Still, I see no reason to wait synchronously for the _parent_ to suspend.

Why not?  There's no penalty, particularly since most device probing
happens asynchronously these days anyway.
Alan Stern
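The distinction being argued over can be sketched in a userspace model. This is a hypothetical mock (the `mock_*` names and the `pending` array are invented, not kernel API): the async put only queues the idle check on a stand-in for the pm workqueue, while the sync put runs it in the caller's context.

```c
/* Hypothetical mock contrasting pm_runtime_put() with
 * pm_runtime_put_sync(); invented names, not kernel API. */
#include <stddef.h>

struct mdev {
	int usage_count;
	int suspended;
};

#define MAX_WORK 8
static struct mdev *pending[MAX_WORK];	/* stand-in for the pm workqueue */
static int npending;

static void idle_check(struct mdev *d)
{
	if (d->usage_count == 0)
		d->suspended = 1;	/* models the actual suspend */
}

void mock_put_sync(struct mdev *d)	/* ~ pm_runtime_put_sync() */
{
	d->usage_count--;
	idle_check(d);			/* runs in the caller's context */
}

void mock_put_async(struct mdev *d)	/* ~ pm_runtime_put() */
{
	d->usage_count--;
	if (npending < MAX_WORK)
		pending[npending++] = d; /* RPM_ASYNC: queue and return */
}

void mock_run_workqueue(void)		/* later, in the pm workqueue */
{
	while (npending > 0)
		idle_check(pending[--npending]);
}
```

With the sync variant the device may already be suspended when the call returns, so the caller absorbs the suspend latency (e.g. the 10 ms D3 transition Rafael mentions); with the async variant the caller returns immediately and the suspend happens later on the workqueue. That is the whole trade-off under discussion.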
On Saturday, July 28, 2012, Alan Stern wrote:
> On Sat, 28 Jul 2012, Rafael J. Wysocki wrote:
>
> > On Saturday, July 28, 2012, Alan Stern wrote:
> > > On Fri, 27 Jul 2012, Rafael J. Wysocki wrote:
> > >
> > > > > > > > +	if (parent)
> > > > > > > > +		pm_runtime_put(parent);
> > > > > > >
> > > > > > > You should use pm_runtime_put_sync(), not pm_runtime_put().
> > > > > >
> > > > > > Hmm, why exactly?
> > > > >
> > > > > Because it's more efficient to do something directly than to run it in
> > > > > a workqueue.
> > > >
> > > > Well, depends.  If that results in a power off (the parent goes into
> > > > D3 for example), we may wait as long as 10 ms for that to complete.
> > >
> > > True, but so what?  You'd also have to wait 10 ms for the workqueue
> > > item to complete if pm_runtime_put() was used,
> >
> > Are you sure?  pm_runtime_put() leads to rpm_idle() with the RPM_ASYNC
> > flag set, which causes it to queue up the work item and return immediately
> > without waiting.  Why exactly do you think I'd need to wait, then?
>
> What I mean is, it takes 10 ms for the parent to suspend regardless of
> whether or not you use the _sync form of the call.  If you're waiting
> for the parent to suspend, you would therefore have to wait 10 ms in
> either case.
>
> Likewise, if you're waiting for the PCI device to become usable then
> you don't have to wait for 10 ms, because you can use it as soon as the
> probe routine returns.

The difference is, if you use _put_sync(), you need to wait the extra 10 ms
for local_pci_probe() to return (if the parent is actually suspended),
although you might not need to wait for it if you used _put(), right?

Which, to me, means that using _put_sync() may not always be better.
It probably doesn't matter a lot, but then the workqueue overhead shouldn't
matter a lot either.
Thanks,
Rafael
On Sun, 29 Jul 2012, Rafael J. Wysocki wrote:

> The difference is, if you use _put_sync(), you need to wait the extra 10 ms
> for local_pci_probe() to return (if the parent is actually suspended),
> although you might not need to wait for it if you used _put(), right?

Yes, that's the difference.  But who waits for local_pci_probe() to
return?  :-)

> Which, to me, means that using _put_sync() may not always be better.
> It probably doesn't matter a lot, but then the workqueue overhead shouldn't
> matter a lot either.

It's just that, in the end, the extra overhead is pretty small.  For me
there's also an issue of style: if you do a synchronous get then it
looks odd not to do a synchronous put.  My feeling has always been that
the async routines are for use in non-process contexts, where the sync
routines can't be used.  Using them just to return a little more
quickly is a foreign idea.

Alan Stern
On Sunday, July 29, 2012, Alan Stern wrote:
> On Sun, 29 Jul 2012, Rafael J. Wysocki wrote:
>
> > The difference is, if you use _put_sync(), you need to wait the extra 10 ms
> > for local_pci_probe() to return (if the parent is actually suspended),
> > although you might not need to wait for it if you used _put(), right?
>
> Yes, that's the difference.  But who waits for local_pci_probe() to
> return?  :-)

pci_register_driver() might, but that's not a big deal.  Hot-plug might
as well, though.

> > Which, to me, means that using _put_sync() may not always be better.
> > It probably doesn't matter a lot, but then the workqueue overhead shouldn't
> > matter a lot either.
>
> It's just that, in the end, the extra overhead is pretty small.  For me
> there's also an issue of style: if you do a synchronous get then it
> looks odd not to do a synchronous put.  My feeling has always been that
> the async routines are for use in non-process contexts, where the sync
> routines can't be used.  Using them just to return a little more
> quickly is a foreign idea.

I see.  :-)

The reason for using sync get is quite obvious: we want to be able to access
the device later on.  For sync put that's not so clear, though.  There are
cases when we definitely want to do it, like the failing .probe() in which
we want to disable runtime PM next to the put, but usually it doesn't hurt
(too much) to defer it IMO.

Thanks,
Rafael
On Sun, Jul 29, 2012 at 8:25 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
>
> For me there's also an issue of style: if you do a synchronous get then it
> looks odd not to do a synchronous put.  My feeling has always been that
> the async routines are for use in non-process contexts, where the sync
> routines can't be used.  Using them just to return a little more
> quickly is a foreign idea.
>
Another way of looking at it is: I need the h/w to be active in order to
proceed, so I call get_sync, but I don't care whether the h/w is put down
immediately after I am done using it.  So the put could be relaxed/async.
At best, we avoid an unnecessary suspend-resume cycle, which could be
expensive power- and time-wise.  At worst, we return a bit quicker.  Or so
I think.
On Mon, 30 Jul 2012, Jassi Brar wrote:

> On Sun, Jul 29, 2012 at 8:25 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> >
> > For me there's also an issue of style: if you do a synchronous get then it
> > looks odd not to do a synchronous put.  My feeling has always been that
> > the async routines are for use in non-process contexts, where the sync
> > routines can't be used.  Using them just to return a little more
> > quickly is a foreign idea.
> >
> Another way of looking at it is: I need the h/w to be active in order to
> proceed, so I call get_sync, but I don't care whether the h/w is put down
> immediately after I am done using it.  So the put could be relaxed/async.
> At best, we avoid an unnecessary suspend-resume cycle, which could be
> expensive power- and time-wise.  At worst, we return a bit quicker.

If the reason for the choice is to opportunistically delay suspending,
there are better ways of doing it: pm_schedule_suspend(),
pm_runtime_put_autosuspend().

Alan Stern
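The autosuspend idea Alan points to can be sketched with a mock using a virtual clock. This is hypothetical, invented code (the `adev` struct, `mock_*` helpers, and the millisecond clock are not kernel API): dropping the last reference only schedules a suspend after a delay, so a device reused quickly never pays for a suspend-resume cycle.

```c
/* Hypothetical mock of the deferral behind pm_runtime_put_autosuspend();
 * invented names, with a virtual clock standing in for jiffies. */
struct adev {
	int usage_count;
	int suspended;
	long delay_ms;		/* ~ the autosuspend delay */
	long expires_ms;	/* when the deferred suspend fires, -1 = none */
};

void mock_put_autosuspend(struct adev *d, long now_ms)
{
	d->usage_count--;
	if (d->usage_count == 0)
		d->expires_ms = now_ms + d->delay_ms; /* schedule, don't suspend */
}

void mock_get(struct adev *d)
{
	d->usage_count++;
	d->expires_ms = -1;	/* reuse cancels the pending suspend */
}

/* Called as the virtual clock advances (models the timer firing). */
void mock_tick(struct adev *d, long now_ms)
{
	if (d->usage_count == 0 && d->expires_ms >= 0 && now_ms >= d->expires_ms)
		d->suspended = 1;
}
```

The point of the suggestion: unlike a bare `pm_runtime_put()`, which merely moves the same immediate suspend onto a workqueue, the autosuspend variants encode "delay the suspend on purpose" explicitly.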
On Sunday, July 29, 2012, Rafael J. Wysocki wrote:
> On Sunday, July 29, 2012, Alan Stern wrote:
> > On Sun, 29 Jul 2012, Rafael J. Wysocki wrote:
> >
> > > The difference is, if you use _put_sync(), you need to wait the extra 10 ms
> > > for local_pci_probe() to return (if the parent is actually suspended),
> > > although you might not need to wait for it if you used _put(), right?
> >
> > Yes, that's the difference.  But who waits for local_pci_probe() to
> > return?  :-)
>
> pci_register_driver() might, but that's not a big deal.  Hot-plug might
> as well, though.
>
> > > Which, to me, means that using _put_sync() may not always be better.
> > > It probably doesn't matter a lot, but then the workqueue overhead shouldn't
> > > matter a lot either.
> >
> > It's just that, in the end, the extra overhead is pretty small.  For me
> > there's also an issue of style: if you do a synchronous get then it
> > looks odd not to do a synchronous put.  My feeling has always been that
> > the async routines are for use in non-process contexts, where the sync
> > routines can't be used.  Using them just to return a little more
> > quickly is a foreign idea.
>
> I see.  :-)
>
> The reason for using sync get is quite obvious: we want to be able to access
> the device later on.  For sync put that's not so clear, though.  There are
> cases when we definitely want to do it, like the failing .probe() in which
> we want to disable runtime PM next to the put, but usually it doesn't hurt
> (too much) to defer it IMO.

Now it occurred to me that perhaps we don't need the current asynchronous
pm_runtime_get() at all.  The usefulness of it is quite questionable, because
either we want to resume the device immediately, for which
pm_runtime_get_sync() should be used, or we just want to bump up the usage
counter, in which case pm_runtime_get_noresume() should always be sufficient.
I fail to see any particularly compelling use case for pm_runtime_get() doing
an asynchronous resume at the moment, but perhaps that's just me.

However, I receive reports of people using pm_runtime_get() where they really
should use pm_runtime_get_sync(), so I wonder if we can simply rename
pm_runtime_get_sync() as pm_runtime_get() and drop the asynchronous version
altogether?

Rafael
[CC: list trimmed]

On Tue, 31 Jul 2012, Rafael J. Wysocki wrote:

> Now it occurred to me that perhaps we don't need the current asynchronous
> pm_runtime_get() at all.  The usefulness of it is quite questionable, because
> either we want to resume the device immediately, for which
> pm_runtime_get_sync() should be used, or we just want to bump up the usage
> counter, in which case pm_runtime_get_noresume() should always be sufficient.
> I fail to see any particularly compelling use case for pm_runtime_get() doing
> an asynchronous resume at the moment, but perhaps that's just me.

There are indeed valid uses for pm_runtime_get().  We are forced to use
it in non-sleepable contexts when we want to resume the device as
quickly as possible.  Example: a driver receives an I/O request from an
interrupt handler.

> However, I receive reports of people using pm_runtime_get() where they really
> should use pm_runtime_get_sync(), so I wonder if we can simply rename
> pm_runtime_get_sync() as pm_runtime_get() and drop the asynchronous version
> altogether?

Well, IMO the naming should have been the other way around from the
start.  That is, we should have made pm_runtime_get be the synchronous
routine and pm_runtime_get_async be the asynchronous one.  But it's too
late to change now.

And no, we can't get rid of the async version.

Alan Stern
On Tuesday, July 31, 2012, Alan Stern wrote: > [CC: list trimmed] > > On Tue, 31 Jul 2012, Rafael J. Wysocki wrote: > > > Now it occured to me that perhaps we don't need the current asynchronous > > pm_runtime_get() at all. The usefulness of it is quite questionable, because > > either we want to resume the device immediately, for which pm_runtime_get_sync() > > should be used, or we just want to bump up the usage counter, in which cases > > pm_runtime_get_noresume() should always be sufficient. I fail to see any > > particularly compelling use case for pm_runtime_get() doing an asynchronous > > resume at the moment, but perhaps that's just me. > > There are indeed valid uses for pm_runtime_get(). We are forced to use > it in non-sleepable contexts when we want to resume the device as > quickly as possible. Example: a driver receives an I/O request from an > interrupt handler. Is it actually suitable to be used in such contexts? It doesn't give any guarantee that the device will be active when it returns, so the caller can't really rely on it. The caller always has to check if the device has been resumed before accessing it anyway, so it may as well do a _get_noresume(), do the check and do pm_request_resume() if needed directly. Now, as far as interrupt handlers are concerned, there are two cases: either the device has to be active to generate an interrupt (like PCI), or the interrupt is generated by something else on behalf of it. We don't seem to handle the first case very well right now, because in that case the interrupt handler will always know that the device is active, so it should do a _get_noresume() and then change the device's status to "active" without calling any kind of "resume", but we don't provide a helper for that. In the second case calling pm_runtime_get() doesn't really help either. 
I think it's better to do _get_noresume(), check the status and if "suspended", set a flag for the resume callback to do something specific before returning successfully, then call pm_request_resume() and return. > > However, I receive reports of people using pm_runtime_get() where they really > > should use pm_runtime_get_sync(), so I wonder if we can simply rename > > pm_runtime_get_sync() as pm_runtime_get() and drop the asynchronous version > > altogether? > > Well, IMO the naming should have been the other way around from the > start. That is, we should have made pm_runtime_get be the synchronous > routine and pm_runtime_get_async be the asynchronous one. But it's too > late to change now. I'm not sure it is too late. If we first change all the instances of pm_runtime_get() to pm_runtime_get_async() and then all of the instances of pm_runtime_get_sync() to pm_runtime_get(), it should be technically possible. Of course, it would be confusing, but that's a different matter. :-) > And no, we can't get rid of the async version. I'm still not sure of that. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday, July 31, 2012, Rafael J. Wysocki wrote:
> On Tuesday, July 31, 2012, Alan Stern wrote:
> > [CC: list trimmed]
> >
> > On Tue, 31 Jul 2012, Rafael J. Wysocki wrote:
> >
> > > Now it occurred to me that perhaps we don't need the current asynchronous
> > > pm_runtime_get() at all.  The usefulness of it is quite questionable,
> > > because either we want to resume the device immediately, for which
> > > pm_runtime_get_sync() should be used, or we just want to bump up the
> > > usage counter, in which case pm_runtime_get_noresume() should always be
> > > sufficient.  I fail to see any particularly compelling use case for
> > > pm_runtime_get() doing an asynchronous resume at the moment, but perhaps
> > > that's just me.
> >
> > There are indeed valid uses for pm_runtime_get().  We are forced to use
> > it in non-sleepable contexts when we want to resume the device as
> > quickly as possible.  Example: a driver receives an I/O request from an
> > interrupt handler.
>
> Is it actually suitable to be used in such contexts?  It doesn't give any
> guarantee that the device will be active when it returns, so the caller
> can't really rely on it.  The caller always has to check if the device has
> been resumed before accessing it anyway, so it may as well do a
> _get_noresume(), do the check and do pm_request_resume() if needed directly.
>
> Now, as far as interrupt handlers are concerned, there are two cases: either
> the device has to be active to generate an interrupt (like PCI), or the
> interrupt is generated by something else on behalf of it.  We don't seem to
> handle the first case very well right now, because in that case the
> interrupt handler will always know that the device is active, so it should
> do a _get_noresume() and then change the device's status to "active" without
> calling any kind of "resume", but we don't provide a helper for that.

Unless, of course, this is a shared interrupt, in which case the handler may
be invoked because _another_ device has generated an interrupt, so the
"active" check will have to be done anyway.

> In the second case calling pm_runtime_get() doesn't really help either.
> I think it's better to do _get_noresume(), check the status and, if
> "suspended", set a flag for the resume callback to do something specific
> before returning successfully, then call pm_request_resume() and return.

I realize that this may be somewhat racy, so perhaps we need something like
pm_request_resume_and_call(dev, func) that will execute the given function
once the device has been resumed (or just happens to be "active" when the
resume work item is run for it).

> > > However, I receive reports of people using pm_runtime_get() where they
> > > really should use pm_runtime_get_sync(), so I wonder if we can simply
> > > rename pm_runtime_get_sync() as pm_runtime_get() and drop the
> > > asynchronous version altogether?
> >
> > Well, IMO the naming should have been the other way around from the
> > start.  That is, we should have made pm_runtime_get be the synchronous
> > routine and pm_runtime_get_async be the asynchronous one.  But it's too
> > late to change now.
>
> I'm not sure it is too late.  If we first change all the instances of
> pm_runtime_get() to pm_runtime_get_async() and then all of the instances of
> pm_runtime_get_sync() to pm_runtime_get(), it should be technically
> possible.  Of course, it would be confusing, but that's a different
> matter.  :-)
>
> > And no, we can't get rid of the async version.
>
> I'm still not sure of that.

Thanks,
Rafael
On Tue, 31 Jul 2012, Rafael J. Wysocki wrote:

> On Tuesday, July 31, 2012, Alan Stern wrote:
> > [CC: list trimmed]
> >
> > On Tue, 31 Jul 2012, Rafael J. Wysocki wrote:
> >
> > > Now it occurred to me that perhaps we don't need the current asynchronous pm_runtime_get() at all. The usefulness of it is quite questionable, because either we want to resume the device immediately, for which pm_runtime_get_sync() should be used, or we just want to bump up the usage counter, in which case pm_runtime_get_noresume() should always be sufficient. I fail to see any particularly compelling use case for pm_runtime_get() doing an asynchronous resume at the moment, but perhaps that's just me.
> >
> > There are indeed valid uses for pm_runtime_get(). We are forced to use it in non-sleepable contexts when we want to resume the device as quickly as possible. Example: a driver receives an I/O request from an interrupt handler.
>
> Is it actually suitable to be used in such contexts? It doesn't give any guarantee that the device will be active when it returns, so the caller can't really rely on it.

Of course not. When you're in interrupt context, you can't wait around to see if the device will actually resume. (Unless you're using irq-safe runtime PM, but we can ignore that case.)

> The caller always has to check if the device has been resumed before accessing it anyway, so it may as well do a _get_noresume(), do the check and do pm_request_resume() if needed directly.

This is exactly equivalent to doing pm_runtime_get() followed by a check, except that it's longer.

Yes, I agree that pm_runtime_get(dev) is nothing more than a shorthand way of doing pm_runtime_get_noresume(dev) followed by pm_request_resume(dev). That doesn't mean it should be eliminated. Shorthands are convenient.

As another example, pm_runtime_get_sync(dev) is nothing more than a shorthand for pm_runtime_get_noresume(dev) followed by pm_runtime_resume(dev). IMO, having these joint functions that do a get/put combined with a suspend/resume/idle is very useful.

> Now, as far as interrupt handlers are concerned, there are two cases: either the device has to be active to generate an interrupt (like PCI), or the interrupt is generated by something else on behalf of it. We don't seem to handle the first case very well right now, because in that case the interrupt handler will always know that the device is active, so it should do a _get_noresume() and then change the device's status to "active" without calling any kind of "resume", but we don't provide a helper for that.

I disagree. Even if the device has to be active to generate an IRQ, it's possible for the interrupt handler to be delayed until after the device is suspended (unless the suspend routine explicitly calls synchronize_irq(), which would be pretty foolish). Hence the handler can't make any assumptions about the device's state.

> In the second case calling pm_runtime_get() doesn't really help either. I think it's better to do _get_noresume(), check the status and if "suspended", set a flag for the resume callback to do something specific before returning successfully, then call pm_request_resume() and return.

This is exactly equivalent to: pm_runtime_get(), check the status, if "suspended" then set a flag, otherwise do the work. Except that again, it's longer.

Check out the sample driver code in section 9 of Documentation/power/runtime_pm.txt, especially the foo_read_or_write() routine.

> > Well, IMO the naming should have been the other way around from the start. That is, we should have made pm_runtime_get be the synchronous routine and pm_runtime_get_async be the asynchronous one. But it's too late to change now.
>
> I'm not sure it is too late. If we first change all the instances of pm_runtime_get() to pm_runtime_get_async() and then all of the instances of pm_runtime_get_sync() to pm_runtime_get(), it should be technically possible. Of course, it would be confusing, but that's a different matter. :-)

If you're willing to risk a certain amount of confusion, count me in. :-)

While you're changing names around, consider also adding a "_runtime" somewhere to:

> 	pm_children_suspended,
> 	pm_schedule_suspend,
> 	pm_request_idle,
> 	pm_request_resume,
> 	pm_request_autosuspend.

For example, we could have pm_runtime_idle_async instead of pm_request_idle.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
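The shorthand relationships discussed above can be sketched in a tiny user-space model. This is a hypothetical mock, not the kernel implementation: all `mock_*` names are invented for illustration, and the "workqueue" is reduced to a pending flag. The point it demonstrates is the one under debate: pm_runtime_get() leaves the device in whatever state it was in, while pm_runtime_get_sync() guarantees it is active on return.

```c
/* Hypothetical user-space model of the runtime-PM "get" variants being
 * discussed -- not the real kernel code, just an illustration of why
 * pm_runtime_get() gives no guarantee that the device is active on return. */
#include <assert.h>

enum rpm_status { RPM_SUSPENDED, RPM_RESUMING, RPM_ACTIVE };

struct mock_dev {
	int usage_count;
	int resume_requested;	/* async resume queued, workqueue not yet run */
	enum rpm_status status;
};

static void mock_get_noresume(struct mock_dev *d)
{
	d->usage_count++;	/* just pin the device; no state change */
}

/* Synchronous resume: the device is active when this returns. */
static void mock_runtime_resume(struct mock_dev *d)
{
	d->status = RPM_ACTIVE;
}

/* Async resume: only queues a request; status is unchanged until the
 * (mock) workqueue runs. */
static void mock_request_resume(struct mock_dev *d)
{
	d->resume_requested = 1;
}

/* pm_runtime_get() == get_noresume() + request_resume() */
static void mock_pm_runtime_get(struct mock_dev *d)
{
	mock_get_noresume(d);
	mock_request_resume(d);
}

/* pm_runtime_get_sync() == get_noresume() + runtime_resume() */
static void mock_pm_runtime_get_sync(struct mock_dev *d)
{
	mock_get_noresume(d);
	mock_runtime_resume(d);
}
```

In this model, a caller that touches the "hardware" right after mock_pm_runtime_get() can still find the device suspended, which is exactly the property Rafael objects to and Alan defends as a deliberate shorthand.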
On Wednesday, August 01, 2012, Alan Stern wrote:
> On Tue, 31 Jul 2012, Rafael J. Wysocki wrote:
>
> > On Tuesday, July 31, 2012, Alan Stern wrote:
> > > [CC: list trimmed]
> > >
> > > On Tue, 31 Jul 2012, Rafael J. Wysocki wrote:
> > >
> > > > Now it occurred to me that perhaps we don't need the current asynchronous pm_runtime_get() at all. The usefulness of it is quite questionable, because either we want to resume the device immediately, for which pm_runtime_get_sync() should be used, or we just want to bump up the usage counter, in which case pm_runtime_get_noresume() should always be sufficient. I fail to see any particularly compelling use case for pm_runtime_get() doing an asynchronous resume at the moment, but perhaps that's just me.
> > >
> > > There are indeed valid uses for pm_runtime_get(). We are forced to use it in non-sleepable contexts when we want to resume the device as quickly as possible. Example: a driver receives an I/O request from an interrupt handler.
> >
> > Is it actually suitable to be used in such contexts? It doesn't give any guarantee that the device will be active when it returns, so the caller can't really rely on it.
>
> Of course not. When you're in interrupt context, you can't wait around to see if the device will actually resume. (Unless you're using irq-safe runtime PM, but we can ignore that case.)
>
> > The caller always has to check if the device has been resumed before accessing it anyway, so it may as well do a _get_noresume(), do the check and do pm_request_resume() if needed directly.
>
> This is exactly equivalent to doing pm_runtime_get() followed by a check, except that it's longer.

Except that I meant something different from what I wrote, sorry about that (must be too much heat).

What I really thought about was to do _get_noresume(), then check if the device is active and if not, queue up a work item (or another delayed execution in process context) that will do pm_runtime_resume() and then access the hardware.

Why do I think it's better than plain pm_runtime_get()? Because the code path calling pm_runtime_resume() will know that it is safe to access the hardware after it has returned (unless it returns an error code, but that's exceptional).

In contrast, with pm_runtime_get() there is no way to predict when the device is going to be resumed, so if the device happens to be suspended when pm_runtime_get() returns, the driver kind of has to poll it until it becomes active, or use a wait queue woken up from the resume callback, or do all of the processing in the resume callback itself (like in the example you mentioned). I'm not sure if the expectation that all driver writers will be able to implement any of these options correctly is a realistic one.

> Yes, I agree that pm_runtime_get(dev) is nothing more than a shorthand way of doing pm_runtime_get_noresume(dev) followed by pm_request_resume(dev). That doesn't mean it should be eliminated. Shorthands are convenient.
>
> As another example, pm_runtime_get_sync(dev) is nothing more than a shorthand for pm_runtime_get_noresume(dev) followed by pm_runtime_resume(dev). IMO, having these joint functions that do a get/put combined with a suspend/resume/idle is very useful.
>
> > Now, as far as interrupt handlers are concerned, there are two cases: either the device has to be active to generate an interrupt (like PCI), or the interrupt is generated by something else on behalf of it. We don't seem to handle the first case very well right now, because in that case the interrupt handler will always know that the device is active, so it should do a _get_noresume() and then change the device's status to "active" without calling any kind of "resume", but we don't provide a helper for that.
>
> I disagree. Even if the device has to be active to generate an IRQ, it's possible for the interrupt handler to be delayed until after the device is suspended (unless the suspend routine explicitly calls synchronize_irq(), which would be pretty foolish). Hence the handler can't make any assumptions about the device's state.
>
> > In the second case calling pm_runtime_get() doesn't really help either. I think it's better to do _get_noresume(), check the status and if "suspended", set a flag for the resume callback to do something specific before returning successfully, then call pm_request_resume() and return.
>
> This is exactly equivalent to: pm_runtime_get(), check the status, if "suspended" then set a flag, otherwise do the work. Except that again, it's longer.
>
> Check out the sample driver code in section 9 of Documentation/power/runtime_pm.txt, especially the foo_read_or_write() routine.

Well, that shouldn't need the is_suspended flag at all, methinks, and the reason it does need it is because it uses pm_runtime_get().

Moreover, processing requests in the resume callback is not something I'd recommend to anyone, because that's going to happen when the status of the device is RPM_RESUMING, we're holding a reference on the parent (if there's one) etc.

So, it looks like I don't really agree with the example. :-)

> > > Well, IMO the naming should have been the other way around from the start. That is, we should have made pm_runtime_get be the synchronous routine and pm_runtime_get_async be the asynchronous one. But it's too late to change now.
> >
> > I'm not sure it is too late. If we first change all the instances of pm_runtime_get() to pm_runtime_get_async() and then all of the instances of pm_runtime_get_sync() to pm_runtime_get(), it should be technically possible. Of course, it would be confusing, but that's a different matter. :-)
>
> If you're willing to risk a certain amount of confusion, count me in. :-)
>
> While you're changing names around, consider also adding a "_runtime" somewhere to:
>
> 	pm_children_suspended,
> 	pm_schedule_suspend,
> 	pm_request_idle,
> 	pm_request_resume,
> 	pm_request_autosuspend.
>
> For example, we could have pm_runtime_idle_async instead of pm_request_idle.

Well, these are not as misleading as pm_runtime_get(), at least in principle.

Thanks,
Rafael
On Wed, 1 Aug 2012, Rafael J. Wysocki wrote:

> What I really thought about was to do _get_noresume(), then check if the device is active and if not, queue up a work item (or another delayed execution in process context) that will do pm_runtime_resume() and then access the hardware.
>
> Why do I think it's better than plain pm_runtime_get()? Because the code path calling pm_runtime_resume() will know that it is safe to access the hardware after it has returned (unless it returns an error code, but that's exceptional).
>
> In contrast, with pm_runtime_get() there is no way to predict when the device is going to be resumed, so if the device happens to be suspended when pm_runtime_get() returns, the driver kind of has to poll it until it becomes active, or use a wait queue woken up from the resume callback, or do all of the processing in the resume callback itself (like in the example you mentioned). I'm not sure if the expectation that all driver writers will be able to implement any of these options correctly is a realistic one.

I don't know about that -- the logic involved in doing the processing within the resume callback isn't terribly complicated. At least, not much more complicated than the logic involved in setting up a custom work routine as you suggest.

Anyway with the existing code, driver writers are free to choose whichever approach they prefer.

> > Check out the sample driver code in section 9 of Documentation/power/runtime_pm.txt, especially the foo_read_or_write() routine.
>
> Well, that shouldn't need the is_suspended flag at all, methinks, and the reason it does need it is because it uses pm_runtime_get().

Not so. Consider your scheme. When starting an I/O request, you call pm_runtime_get_noresume() and then check to see if the device is active, say by calling pm_runtime_suspended(). Suppose at that moment the suspend callback has just finished and has released the private spinlock. The device's status is still RPM_SUSPENDING, so pm_runtime_suspended() returns 0 and you try to carry out the I/O.

To fix this problem you have to synchronize the status checking with the suspend/resume operations. This means the status changes have to occur under the protection of the private lock, which means a private flag is needed.

> Moreover, processing requests in the resume callback is not something I'd recommend to anyone, because that's going to happen when the status of the device is RPM_RESUMING, we're holding a reference on the parent (if there's one) etc.

I don't see any problem with that. The parent's child_count will be incremented while the requests are being processed regardless. And if you've been waiting for the device to resume in order to carry out some processing, within the resume callback is the logical place to do the work. It avoids the overhead of a second context switch.

> So, it looks like I don't really agree with the example. :-)

Feel free to add your scheme as a second example in the document. :-) But please don't remove the first example, unless you can find something actually wrong with it.

> > While you're changing names around, consider also adding a "_runtime" somewhere to:
> >
> > 	pm_children_suspended,
> > 	pm_schedule_suspend,
> > 	pm_request_idle,
> > 	pm_request_resume,
> > 	pm_request_autosuspend.
> >
> > For example, we could have pm_runtime_idle_async instead of pm_request_idle.
>
> Well, these are not as misleading as pm_runtime_get(), at least in principle.

No, but it would be good to be more consistent about our naming. Making sure all the function names contain "_runtime" would help.

I suppose we could keep pm_runtime_get_sync as is, and just change pm_runtime_get to pm_runtime_get_async (and likewise for _put). That could reduce the confusion during the changeover.

Alan Stern
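The race Alan describes above can be seen in a minimal model. This is a hypothetical mock (the `mock_*` names are invented, and the real kernel check is more involved): while the status is RPM_SUSPENDING, a "is it suspended?" test still answers no, so an unwary caller that skipped the private is_suspended flag would proceed with I/O against a device whose suspend callback has already run.

```c
/* Hypothetical mock of the race window: checking pm_runtime_suspended()
 * alone, without a driver-private flag updated under the driver's own
 * lock, lets I/O start while a suspend is in flight.  Simplified
 * stand-ins, not the real API. */
#include <assert.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDING, RPM_SUSPENDED };

struct mock_dev {
	enum rpm_status status;	/* core's view of the device */
	int is_suspended;	/* driver-private flag, per runtime_pm.txt */
};

/* Mirrors the semantics of the real helper: true only once the
 * transition to RPM_SUSPENDED has completed. */
static int mock_pm_runtime_suspended(struct mock_dev *d)
{
	return d->status == RPM_SUSPENDED;
}

/* The racy check: "not suspended, so it must be safe to do I/O". */
static int racy_io_allowed(struct mock_dev *d)
{
	return !mock_pm_runtime_suspended(d);
}

/* The safe check: consult the private flag that the suspend callback
 * sets under the driver's own lock before it returns. */
static int safe_io_allowed(struct mock_dev *d)
{
	return !d->is_suspended;
}
```

With the device mid-transition (status RPM_SUSPENDING, private flag already set by the suspend callback), the racy check says "go ahead" while the flag-based check correctly refuses, which is why the foo_read_or_write() example needs is_suspended.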
On Thursday, August 02, 2012, Alan Stern wrote:
> On Wed, 1 Aug 2012, Rafael J. Wysocki wrote:
>
> > What I really thought about was to do _get_noresume(), then check if the device is active and if not, queue up a work item (or another delayed execution in process context) that will do pm_runtime_resume() and then access the hardware.
> >
> > Why do I think it's better than plain pm_runtime_get()? Because the code path calling pm_runtime_resume() will know that it is safe to access the hardware after it has returned (unless it returns an error code, but that's exceptional).
> >
> > In contrast, with pm_runtime_get() there is no way to predict when the device is going to be resumed, so if the device happens to be suspended when pm_runtime_get() returns, the driver kind of has to poll it until it becomes active, or use a wait queue woken up from the resume callback, or do all of the processing in the resume callback itself (like in the example you mentioned). I'm not sure if the expectation that all driver writers will be able to implement any of these options correctly is a realistic one.
>
> I don't know about that -- the logic involved in doing the processing within the resume callback isn't terribly complicated. At least, not much more complicated than the logic involved in setting up a custom work routine as you suggest.

That's why I had the idea of pm_request_resume_and_call(dev, func) described in this message:

http://marc.info/?l=linux-usb&m=134377126804170&w=4

although perhaps it would be better to call it something like pm_runtime_async_resume_and_call(dev, func).

> Anyway with the existing code, driver writers are free to choose whichever approach they prefer.

I wonder how many instances of pm_runtime_get() used in a wrong way there are in the kernel right now. I guess I'll sacrifice some time to check that.

> > > Check out the sample driver code in section 9 of Documentation/power/runtime_pm.txt, especially the foo_read_or_write() routine.
> >
> > Well, that shouldn't need the is_suspended flag at all, methinks, and the reason it does need it is because it uses pm_runtime_get().
>
> Not so. Consider your scheme. When starting an I/O request, you call pm_runtime_get_noresume() and then check to see if the device is active, say by calling pm_runtime_suspended(). Suppose at that moment the suspend callback has just finished and has released the private spinlock. The device's status is still RPM_SUSPENDING, so pm_runtime_suspended() returns 0 and you try to carry out the I/O.
>
> To fix this problem you have to synchronize the status checking with the suspend/resume operations. This means the status changes have to occur under the protection of the private lock, which means a private flag is needed.

What about checking if the status is RPM_ACTIVE under dev->power.lock? That's what rpm_resume() does, more or less.

> > Moreover, processing requests in the resume callback is not something I'd recommend to anyone, because that's going to happen when the status of the device is RPM_RESUMING, we're holding a reference on the parent (if there's one) etc.
>
> I don't see any problem with that. The parent's child_count will be incremented while the requests are being processed regardless. And if you've been waiting for the device to resume in order to carry out some processing, within the resume callback is the logical place to do the work.

Unless your _driver_ callback is actually executed from within a PM domain callback, for example, and something else may be waiting for it to complete, so your data processing is adding latencies to some other threads. I'm not making that up, by the way, that really can happen.

And what if you are a parent waited for by a child to resume so that _it_ can process its data? Would you still want to process your data in the resume callback in that case?

> It avoids the overhead of a second context switch.

That may be avoided without processing data from within the resume callback, although not with the current code.

> > So, it looks like I don't really agree with the example. :-)
>
> Feel free to add your scheme as a second example in the document. :-) But please don't remove the first example, unless you can find something actually wrong with it.

Well, it should work in general, so it is not functionally wrong. However, I wouldn't like to regard it as the best thing we can offer. :-)

> > > While you're changing names around, consider also adding a "_runtime" somewhere to:
> > >
> > > 	pm_children_suspended,
> > > 	pm_schedule_suspend,
> > > 	pm_request_idle,
> > > 	pm_request_resume,
> > > 	pm_request_autosuspend.
> > >
> > > For example, we could have pm_runtime_idle_async instead of pm_request_idle.
> >
> > Well, these are not as misleading as pm_runtime_get(), at least in principle.
>
> No, but it would be good to be more consistent about our naming. Making sure all the function names contain "_runtime" would help.
>
> I suppose we could keep pm_runtime_get_sync as is, and just change pm_runtime_get to pm_runtime_get_async (and likewise for _put). That could reduce the confusion during the changeover.

Changing pm_runtime_get() to pm_runtime_get_async() would be an improvement, although pm_runtime_get_but_do_not_resume_immediately() might even be better. Or even pm_runtime_get_but_do_not_access_hardware_when_it_returns(). ;-)

I see no reason to have any of those things, though. Yes, they _may_ be useful to someone knowing the runtime PM core internals to save a few lines of code in some places, but generally they lead people to make serious mistakes that tend to be difficult to debug. For this very reason pm_runtime_get() is a bad interface and I wish we hadn't introduced it at all. Even if we give it a more descriptive name, it won't be much better.

And note how that doesn't apply to the pm_runtime_put*() family. After all, doing pm_runtime_put() instead of pm_runtime_put_sync() will never lead to accessing registers of a suspended device.

Thanks,
Rafael
On Thu, 2 Aug 2012, Rafael J. Wysocki wrote:

> > I don't know about that -- the logic involved in doing the processing within the resume callback isn't terribly complicated. At least, not much more complicated than the logic involved in setting up a custom work routine as you suggest.
>
> That's why I had the idea of pm_request_resume_and_call(dev, func) described in this message:
>
> http://marc.info/?l=linux-usb&m=134377126804170&w=4
>
> although perhaps it would be better to call it something like pm_runtime_async_resume_and_call(dev, func).

Hmmm. You'd probably want a version that does a "get" at the same time. I suppose you would call func directly if the device was already resumed, without going through the workqueue?

Yes, I agree this would be a better interface than pm_runtime_get.

> > > Well, that shouldn't need the is_suspended flag at all, methinks, and the reason it does need it is because it uses pm_runtime_get().
> >
> > Not so. Consider your scheme. When starting an I/O request, you call pm_runtime_get_noresume() and then check to see if the device is active, say by calling pm_runtime_suspended(). Suppose at that moment the suspend callback has just finished and has released the private spinlock. The device's status is still RPM_SUSPENDING, so pm_runtime_suspended() returns 0 and you try to carry out the I/O.
> >
> > To fix this problem you have to synchronize the status checking with the suspend/resume operations. This means the status changes have to occur under the protection of the private lock, which means a private flag is needed.
>
> What about checking if the status is RPM_ACTIVE under dev->power.lock? That's what rpm_resume() does, more or less.

That wouldn't solve the race described above.

> > > Moreover, processing requests in the resume callback is not something I'd recommend to anyone, because that's going to happen when the status of the device is RPM_RESUMING, we're holding a reference on the parent (if there's one) etc.
> >
> > I don't see any problem with that. The parent's child_count will be incremented while the requests are being processed regardless. And if you've been waiting for the device to resume in order to carry out some processing, within the resume callback is the logical place to do the work.
>
> Unless your _driver_ callback is actually executed from within a PM domain callback, for example, and something else may be waiting for it to complete, so your data processing is adding latencies to some other threads. I'm not making that up, by the way, that really can happen.
>
> And what if you are a parent waited for by a child to resume so that _it_ can process its data? Would you still want to process your data in the resume callback in that case?

Okay, those are valid reasons. (Although a device handling I/O requests isn't likely to have a child with its own independent I/O handling.)

> > I suppose we could keep pm_runtime_get_sync as is, and just change pm_runtime_get to pm_runtime_get_async (and likewise for _put). That could reduce the confusion during the changeover.
>
> Changing pm_runtime_get() to pm_runtime_get_async() would be an improvement, although pm_runtime_get_but_do_not_resume_immediately() might even be better. Or even pm_runtime_get_but_do_not_access_hardware_when_it_returns(). ;-)
>
> I see no reason to have any of those things, though. Yes, they _may_ be useful to someone knowing the runtime PM core internals to save a few lines of code in some places, but generally they lead people to make serious mistakes that tend to be difficult to debug. For this very reason pm_runtime_get() is a bad interface and I wish we hadn't introduced it at all. Even if we give it a more descriptive name, it won't be much better.
>
> And note how that doesn't apply to the pm_runtime_put*() family. After all, doing pm_runtime_put() instead of pm_runtime_put_sync() will never lead to accessing registers of a suspended device.

All right. But I still think "pm_runtime_put_async" is better than "pm_runtime_put". At least it forces people to think about what they're doing.

Alan Stern
On Fri, Aug 3, 2012 at 10:20 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Thu, 2 Aug 2012, Rafael J. Wysocki wrote:
>
> Hmmm. You'd probably want a version that does a "get" at the same time. I suppose you would call func directly if the device was already resumed, without going through the workqueue?

Maybe it isn't good, the 'func' might not be run in the current context (irq context or some spinlock is held).

>> And what if you are a parent waited for by a child to resume so that _it_ can process its data? Would you still want to process your data in the resume callback in that case?

Looks the child is always resumed after its parent completes the resuming, and current pm_runtime_get doesn't delay the resume of the parent, and just make the .runtime_resume run in another context. Also are there actual examples about the above situation?

> Okay, those are valid reasons. (Although a device handling I/O requests isn't likely to have a child with its own independent I/O handling.)

IMO, the .runtime_resume callback can handle the IO things easily via scheduling worker function if the driver don't want to delay its children's resume.

Thanks,
On Thu, 2 Aug 2012, Alan Stern wrote:
> On Thu, 2 Aug 2012, Rafael J. Wysocki wrote:
>
> > > I don't know about that -- the logic involved in doing the processing within the resume callback isn't terribly complicated. At least, not much more complicated than the logic involved in setting up a custom work routine as you suggest.
> >
> > That's why I had the idea of pm_request_resume_and_call(dev, func) described in this message:
> >
> > http://marc.info/?l=linux-usb&m=134377126804170&w=4
> >
> > although perhaps it would be better to call it something like pm_runtime_async_resume_and_call(dev, func).
>
> Hmmm. You'd probably want a version that does a "get" at the same time. I suppose you would call func directly if the device was already resumed, without going through the workqueue?
>
> Yes, I agree this would be a better interface than pm_runtime_get.

A few problems:

What happens if a system sleep begins after pm_runtime_async_resume_and_call() but before the runtime resume? When the system wakes up, the device will be back at full power without ever executing a runtime PM call. Then how will func get invoked?

What happens if the driver is unbound after pm_runtime_async_resume_and_call() but before the runtime resume? When the runtime resume occurs, func will be invoked -- and the driver might not even be in memory any more.

Alan Stern
On Fri, 3 Aug 2012, Ming Lei wrote:

> On Fri, Aug 3, 2012 at 10:20 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> > On Thu, 2 Aug 2012, Rafael J. Wysocki wrote:
> >
> > Hmmm. You'd probably want a version that does a "get" at the same time. I suppose you would call func directly if the device was already resumed, without going through the workqueue?
>
> Maybe it isn't good, the 'func' might not be run in the current context (irq context or some spinlock is held).

True. But we don't want to wait for the workqueue if the device is already at full power. Suggestions?

> >> And what if you are a parent waited for by a child to resume so that _it_ can process its data? Would you still want to process your data in the resume callback in that case?
>
> Looks the child is always resumed after its parent completes the resuming, and current pm_runtime_get doesn't delay the resume of the parent, and just make the .runtime_resume run in another context.

I don't understand. Rafael's point was that if the parent's resume callback did a bunch of real work, it would delay resuming the child because the child can't be resumed until the parent's resume callback returns.

> Also are there actual examples about the above situation?

I don't know of any. I suspect there aren't many examples. It wouldn't make much sense.

> IMO, the .runtime_resume callback can handle the IO things easily via scheduling worker function if the driver don't want to delay its children's resume.

That was one of Rafael's other suggestions. But calling pm_request_resume_and_call() is easier than scheduling a worker function.

Alan Stern
On Friday, August 03, 2012, Alan Stern wrote:
> On Thu, 2 Aug 2012, Rafael J. Wysocki wrote:
>
> > > I don't know about that -- the logic involved in doing the processing within the resume callback isn't terribly complicated. At least, not much more complicated than the logic involved in setting up a custom work routine as you suggest.
> >
> > That's why I had the idea of pm_request_resume_and_call(dev, func) described in this message:
> >
> > http://marc.info/?l=linux-usb&m=134377126804170&w=4
> >
> > although perhaps it would be better to call it something like pm_runtime_async_resume_and_call(dev, func).
>
> Hmmm. You'd probably want a version that does a "get" at the same time. I suppose you would call func directly if the device was already resumed, without going through the workqueue?

Yes.

> Yes, I agree this would be a better interface than pm_runtime_get.

OK

> > > > Well, that shouldn't need the is_suspended flag at all, methinks, and the reason it does need it is because it uses pm_runtime_get().
> > >
> > > Not so. Consider your scheme. When starting an I/O request, you call pm_runtime_get_noresume() and then check to see if the device is active, say by calling pm_runtime_suspended(). Suppose at that moment the suspend callback has just finished and has released the private spinlock. The device's status is still RPM_SUSPENDING, so pm_runtime_suspended() returns 0 and you try to carry out the I/O.
> > >
> > > To fix this problem you have to synchronize the status checking with the suspend/resume operations. This means the status changes have to occur under the protection of the private lock, which means a private flag is needed.
> >
> > What about checking if the status is RPM_ACTIVE under dev->power.lock? That's what rpm_resume() does, more or less.
>
> That wouldn't solve the race described above.

> > > > Moreover, processing requests in the resume callback is not something I'd recommend to anyone, because that's going to happen when the status of the device is RPM_RESUMING, we're holding a reference on the parent (if there's one) etc.
> > >
> > > I don't see any problem with that. The parent's child_count will be incremented while the requests are being processed regardless. And if you've been waiting for the device to resume in order to carry out some processing, within the resume callback is the logical place to do the work.
> >
> > Unless your _driver_ callback is actually executed from within a PM domain callback, for example, and something else may be waiting for it to complete, so your data processing is adding latencies to some other threads. I'm not making that up, by the way, that really can happen.
> >
> > And what if you are a parent waited for by a child to resume so that _it_ can process its data? Would you still want to process your data in the resume callback in that case?
>
> Okay, those are valid reasons. (Although a device handling I/O requests isn't likely to have a child with its own independent I/O handling.)

I agree that this isn't very likely in practice.

> > > I suppose we could keep pm_runtime_get_sync as is, and just change pm_runtime_get to pm_runtime_get_async (and likewise for _put). That could reduce the confusion during the changeover.
> >
> > Changing pm_runtime_get() to pm_runtime_get_async() would be an improvement, although pm_runtime_get_but_do_not_resume_immediately() might even be better. Or even pm_runtime_get_but_do_not_access_hardware_when_it_returns(). ;-)
> >
> > I see no reason to have any of those things, though. Yes, they _may_ be useful to someone knowing the runtime PM core internals to save a few lines of code in some places, but generally they lead people to make serious mistakes that tend to be difficult to debug. For this very reason pm_runtime_get() is a bad interface and I wish we hadn't introduced it at all. Even if we give it a more descriptive name, it won't be much better.
> >
> > And note how that doesn't apply to the pm_runtime_put*() family. After all, doing pm_runtime_put() instead of pm_runtime_put_sync() will never lead to accessing registers of a suspended device.
>
> All right. But I still think "pm_runtime_put_async" is better than "pm_runtime_put". At least it forces people to think about what they're doing.

I agree. My current plan is to provide a better alternative interface, then to change the name of "pm_runtime_put" to "pm_runtime_put_async" and to document that it's going to be deprecated in future.

Thanks,
Rafael
On Friday, August 03, 2012, Ming Lei wrote:
> On Fri, Aug 3, 2012 at 10:20 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> > On Thu, 2 Aug 2012, Rafael J. Wysocki wrote:
> >
> > Hmmm. You'd probably want a version that does a "get" at the same time. I suppose you would call func directly if the device was already resumed, without going through the workqueue?
>
> Maybe it isn't good, the 'func' might not be run in the current context (irq context or some spinlock is held).

Then I'd say don't use this interface. If you have code that needs to be run in a different context, then you have to use a work item (or equivalent) anyway and you can do a synchronous runtime resume from there.

The problem we want to address here is when there's code that should be run as soon as the device is active, preferably _without_ a context switch.

Thanks,
Rafael
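The pattern Rafael recommends here -- pin the device with _get_noresume() in atomic context, defer everything else to a work item that resumes synchronously -- can be sketched in a single-threaded mock. This is a hypothetical illustration, not kernel code: the `mock_*` and `fake_*` names are invented, and the "workqueue" is reduced to one stored function pointer.

```c
/* Hypothetical single-threaded model of: interrupt path does only
 * _get_noresume() + queue_work(); the work item resumes synchronously
 * and only then touches the hardware.  All names are stand-ins. */
#include <assert.h>
#include <stddef.h>

enum rpm_status { RPM_SUSPENDED, RPM_ACTIVE };

struct mock_dev {
	int usage_count;
	enum rpm_status status;
	int hw_accesses;	/* "hardware" touched only while ACTIVE */
};

static void (*pending_work)(struct mock_dev *);	/* one-slot "workqueue" */

static void mock_get_noresume(struct mock_dev *d) { d->usage_count++; }
static void mock_resume_sync(struct mock_dev *d) { d->status = RPM_ACTIVE; }

/* Work item: process context, so a synchronous resume is fine. */
static void io_work(struct mock_dev *d)
{
	mock_resume_sync(d);
	assert(d->status == RPM_ACTIVE);	/* guaranteed after sync resume */
	d->hw_accesses++;			/* now safe to do the I/O */
}

/* "Interrupt" path: no sleeping allowed -- pin the device, defer the rest. */
static void fake_irq_handler(struct mock_dev *d)
{
	mock_get_noresume(d);	/* keep the device from suspending further */
	pending_work = io_work;	/* queue_work() stand-in */
}

/* Workqueue-thread stand-in: runs whatever was queued. */
static void run_pending_work(struct mock_dev *d)
{
	if (pending_work) {
		pending_work(d);
		pending_work = NULL;
	}
}
```

Between fake_irq_handler() and run_pending_work() nothing touches the hardware, which is exactly the guarantee this scheme buys over pm_runtime_get(): the I/O code runs only after a resume that is known to have completed.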
On Friday, August 03, 2012, Alan Stern wrote:
> On Thu, 2 Aug 2012, Alan Stern wrote:
>
> > On Thu, 2 Aug 2012, Rafael J. Wysocki wrote:
> >
> > > > I don't know about that -- the logic involved in doing the processing
> > > > within the resume callback isn't terribly complicated.  At least, not
> > > > much more complicated than the logic involved in setting up a custom
> > > > work routine as you suggest.
> > >
> > > That's why I had the idea of pm_request_resume_and_call(dev, func)
> > > described in this message:
> > >
> > > http://marc.info/?l=linux-usb&m=134377126804170&w=4
> > >
> > > although perhaps it would be better to call it something like
> > > pm_runtime_async_resume_and_call(dev, func).
> >
> > Hmmm.  You'd probably want a version that does a "get" at the same
> > time.  I suppose you would call func directly if the device was already
> > resumed, without going through the workqueue?
>
> Yes, I agree this would be a better interface than pm_runtime_get.
>
> A few problems:

Which are good points. :-)

> What happens if a system sleep begins after
> pm_runtime_async_resume_and_call() but before the runtime resume?  When
> the system wakes up, the device will be back at full power without ever
> executing a runtime PM call.  Then how will func get invoked?

The pm_runtime_barrier() at the beginning of __device_suspend() guarantees
that if there's a resume request pending at this point, the resume will be
carried out.  I think we can modify rpm_resume() to return only after
func() is executed, so there shouldn't be a problem with that particular
case.

> What happens if the driver is unbound after
> pm_runtime_async_resume_and_call() but before the runtime resume?
> When the runtime resume occurs, func will be invoked -- and the driver
> might not even be in memory any more.

We have no protection against that at the moment, but then there is no
guarantee that the driver's runtime PM callbacks won't be run after it's
been unbound.

Perhaps we should run pm_runtime_barrier() before executing .remove() in
__device_release_driver() (and perhaps run pm_runtime_idle() afterwards in
case the subsystem or PM domain knows how to power down the device even if
there's no driver) and expect .remove() to clean up?

Rafael
On Sat, 4 Aug 2012, Rafael J. Wysocki wrote:

> On Friday, August 03, 2012, Ming Lei wrote:
> > On Fri, Aug 3, 2012 at 10:20 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> > > On Thu, 2 Aug 2012, Rafael J. Wysocki wrote:
> > >
> > > Hmmm.  You'd probably want a version that does a "get" at the same
> > > time.  I suppose you would call func directly if the device was already
> > > resumed, without going through the workqueue?
> >
> > Maybe it isn't good, the 'func' might not be run in the current context
> > (irq context or some spinlock is held).
>
> Then I'd say don't use this interface.  If you have code that needs to
> be run in a different context, then you have to use a work item (or
> equivalent) anyway and you can do a synchronous runtime resume from there.

That wasn't what he meant.  What if the code needs to run in the _same_
context as the caller?  For example, with a spinlock held.  func would
need to know whether it should acquire the lock, which means it needs to
know whether it was called directly or through the workqueue.

I suppose we could pass it an extra argument with this information...
but that's a little ugly.

Alan Stern
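[For readers following the archive: the "extra argument" idea above can be sketched in a few lines of user-space C. Everything here -- `toy_dev`, `toy_resume_and_call`, `toy_cb` -- is made up for illustration; it is not a kernel interface. The point is only that the callback receives a flag saying whether it ran directly in the caller's context (lock already held) or from the workqueue (must take the lock itself).]

```c
#include <stdbool.h>

/* Toy stand-ins; nothing here is a real kernel interface. */
struct toy_dev {
	bool active;     /* powered up? */
	bool lock_held;  /* stand-in for a spinlock the caller may hold */
};

/* The callback is told whether it runs directly in the caller's context
 * (lock already held) or from the workqueue (must take the lock itself). */
typedef void (*toy_func)(struct toy_dev *d, bool called_directly);

static void toy_resume_and_call(struct toy_dev *d, toy_func func)
{
	if (d->active) {
		func(d, true);         /* same context as the caller */
	} else {
		d->active = true;      /* pretend the workqueue resumed it */
		func(d, false);        /* workqueue context */
	}
}

/* Example callback: only acquires the lock when not called directly. */
static int locks_taken_by_cb;
static void toy_cb(struct toy_dev *d, bool called_directly)
{
	if (!called_directly) {
		d->lock_held = true;   /* take the lock ourselves */
		locks_taken_by_cb++;
	}
	/* ... access the device under the lock ... */
}
```

This is exactly the shape Alan calls "a little ugly": the flag leaks the call path into the callback, which motivates the two-callback alternative discussed later in the thread.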
On Sat, 4 Aug 2012, Rafael J. Wysocki wrote:

> > What happens if a system sleep begins after
> > pm_runtime_async_resume_and_call() but before the runtime resume?  When
> > the system wakes up, the device will be back at full power without ever
> > executing a runtime PM call.  Then how will func get invoked?
>
> The pm_runtime_barrier() at the beginning of __device_suspend() guarantees
> that if there's a resume request pending at this point, the resume will be
> carried out.  I think we can modify rpm_resume() to return only after
> func() is executed, so there shouldn't be a problem with that particular
> case.

A resume request can get added to the workqueue after __device_suspend()
runs, in which case the problem will still exist.  Should
pm_runtime_set_active() call func (if such a call is pending)?

> > What happens if the driver is unbound after
> > pm_runtime_async_resume_and_call() but before the runtime resume?
> > When the runtime resume occurs, func will be invoked -- and the driver
> > might not even be in memory any more.
>
> We have no protection against that at the moment, but then there is no
> guarantee that the driver's runtime PM callbacks won't be run after
> it's been unbound.

True.  We have been relying on the subsystem to handle this race, but
obviously that's no good if the callback is directly to the driver.  I
don't see any way to fix this in the PM core.

> Perhaps we should run pm_runtime_barrier() before executing .remove()
> in __device_release_driver()

We already do pm_runtime_get_sync() and pm_runtime_put_sync().  I'm not
sure how necessary those calls really are ... but pm_runtime_barrier()
would be redundant.

> (and perhaps run pm_runtime_idle() afterwards
> in case the subsystem or PM domain knows how to power down the device even
> if there's no driver) and expect .remove() to clean up?

I can't think of anything wrong with calling pm_runtime_idle() at the
end of __device_release_driver().

Alan Stern
On Saturday, August 04, 2012, Alan Stern wrote:
> On Sat, 4 Aug 2012, Rafael J. Wysocki wrote:
>
> > On Friday, August 03, 2012, Ming Lei wrote:
> > > On Fri, Aug 3, 2012 at 10:20 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> > > > On Thu, 2 Aug 2012, Rafael J. Wysocki wrote:
> > > >
> > > > Hmmm.  You'd probably want a version that does a "get" at the same
> > > > time.  I suppose you would call func directly if the device was already
> > > > resumed, without going through the workqueue?
> > >
> > > Maybe it isn't good, the 'func' might not be run in the current context
> > > (irq context or some spinlock is held).
> >
> > Then I'd say don't use this interface.  If you have code that needs to
> > be run in a different context, then you have to use a work item (or
> > equivalent) anyway and you can do a synchronous runtime resume from there.
>
> That wasn't what he meant.  What if the code needs to run in the _same_
> context as the caller?  For example, with a spinlock held.

I see.  I think it wouldn't be unreasonable to require that func should
take all of the necessary locks by itself.

However, I believe that if func is going to be called through the
workqueue, the usage counter should be incremented before the preceding
resume and decremented after func returns by the workqueue thread.

> func would need to know whether it should acquire the lock, which means
> it needs to know whether it was called directly or through the workqueue.
>
> I suppose we could pass it an extra argument with this information...
> but that's a little ugly.

I agree.  I'd prefer to avoid that, if possible.

Thanks,
Rafael
On Sat, 4 Aug 2012, Rafael J. Wysocki wrote:

> > That wasn't what he meant.  What if the code needs to run in the _same_
> > context as the caller?  For example, with a spinlock held.
>
> I see.  I think it wouldn't be unreasonable to require that func should
> take all of the necessary locks by itself.

But then if func was called directly, because the device was at full
power, it would deadlock trying to acquire a lock the caller already
holds.

> However, I believe that if func is going to be called through the
> workqueue, the usage counter should be incremented before the preceding
> resume and decremented after func returns by the workqueue thread.

Okay.

> > func would need to know whether it should acquire the lock, which means
> > it needs to know whether it was called directly or through the workqueue.
> >
> > I suppose we could pass it an extra argument with this information...
> > but that's a little ugly.
>
> I agree.  I'd prefer to avoid that, if possible.

Do you have any better suggestions?

Alan Stern
On Saturday, August 04, 2012, Alan Stern wrote:
> On Sat, 4 Aug 2012, Rafael J. Wysocki wrote:
>
> > > What happens if a system sleep begins after
> > > pm_runtime_async_resume_and_call() but before the runtime resume?  When
> > > the system wakes up, the device will be back at full power without ever
> > > executing a runtime PM call.  Then how will func get invoked?
> >
> > The pm_runtime_barrier() at the beginning of __device_suspend() guarantees
> > that if there's a resume request pending at this point, the resume will be
> > carried out.  I think we can modify rpm_resume() to return only after
> > func() is executed, so there shouldn't be a problem with that particular
> > case.
>
> A resume request can get added to the workqueue after
> __device_suspend() runs, in which case the problem will still exist.

In theory.  But the question is who's going to add that request, since the
driver's (and possibly subsystem's/PM domain's) .suspend() callbacks will
be run by __device_suspend() and we can expect those to ensure a certain
level of consistency.

Anyway, today we'll have the same problem with a runtime resume request
queued up during system suspend (after __device_suspend() has run).

> Should pm_runtime_set_active() call func (if such a call is pending)?
>
> > > What happens if the driver is unbound after
> > > pm_runtime_async_resume_and_call() but before the runtime resume?
> > > When the runtime resume occurs, func will be invoked -- and the driver
> > > might not even be in memory any more.
> >
> > We have no protection against that at the moment, but then there is no
> > guarantee that the driver's runtime PM callbacks won't be run after
> > it's been unbound.
>
> True.  We have been relying on the subsystem to handle this race, but
> obviously that's no good if the callback is directly to the driver.  I
> don't see any way to fix this in the PM core.

Me neither.

> > Perhaps we should run pm_runtime_barrier() before executing .remove()
> > in __device_release_driver()
>
> We already do pm_runtime_get_sync() and pm_runtime_put_sync().  I'm not
> sure how necessary those calls really are ...

They are, because some platforms do some unusual things in their platform
bus type notifiers.

> but pm_runtime_barrier() would be redundant.
>
> > (and perhaps run pm_runtime_idle() afterwards in case the subsystem or
> > PM domain knows how to power down the device even if there's no driver)
> > and expect .remove() to clean up?
>
> I can't think of anything wrong with calling pm_runtime_idle() at the
> end of __device_release_driver().

Yeah.

Thanks,
Rafael
On Saturday, August 04, 2012, Alan Stern wrote:
> On Sat, 4 Aug 2012, Rafael J. Wysocki wrote:
>
> > > That wasn't what he meant.  What if the code needs to run in the _same_
> > > context as the caller?  For example, with a spinlock held.
> >
> > I see.  I think it wouldn't be unreasonable to require that func should
> > take all of the necessary locks by itself.
>
> But then if func was called directly, because the device was at full
> power, it would deadlock trying to acquire a lock the caller already
> holds.

I wonder why the caller may want to take any locks beforehand?

> > However, I believe that if func is going to be called through the
> > workqueue, the usage counter should be incremented before the preceding
> > resume and decremented after func returns by the workqueue thread.
>
> Okay.
>
> > > func would need to know whether it should acquire the lock, which means
> > > it needs to know whether it was called directly or through the workqueue.
> > >
> > > I suppose we could pass it an extra argument with this information...
> > > but that's a little ugly.
> >
> > I agree.  I'd prefer to avoid that, if possible.
>
> Do you have any better suggestions?

Hmm.  What about pm_runtime_get_and_call(dev, func_sync, func_async),
where func_sync() is to be called if the device is already active and
func_async() is to be called if it has to be resumed from the workqueue?

Rafael
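[For readers following the archive: the dispatch Rafael proposes above can be sketched as a user-space toy. `toy_dev`, `toy_get_and_call` and `toy_resume_work` are all illustrative stand-ins, not real kernel code; the usage-count handling follows Rafael's earlier remark that the counter should be taken before the resume and dropped by the workqueue thread after the async callback returns.]

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for struct device and its runtime PM state. */
struct toy_dev {
	bool active;                         /* RPM_ACTIVE vs. suspended */
	int usage_count;                     /* dev->power.usage_count */
	void (*pending)(struct toy_dev *);   /* func_async awaiting resume */
};

typedef void (*toy_func)(struct toy_dev *);

/* Active device: call func_sync in the caller's context and let the
 * caller drop the reference later.  Suspended device: remember
 * func_async so the (simulated) workqueue resume can run it. */
static void toy_get_and_call(struct toy_dev *d,
			     toy_func func_sync, toy_func func_async)
{
	d->usage_count++;
	if (d->active) {
		func_sync(d);
		return;
	}
	d->pending = func_async;
}

/* Simulated asynchronous resume completing on the workqueue thread. */
static void toy_resume_work(struct toy_dev *d)
{
	d->active = true;
	if (d->pending) {
		d->pending(d);
		d->pending = NULL;
		d->usage_count--;   /* drop the async path's reference here */
	}
}

/* Example callbacks recording which path ran. */
static int sync_calls, async_calls;
static void sync_cb(struct toy_dev *d)  { (void)d; sync_calls++; }
static void async_cb(struct toy_dev *d) { (void)d; async_calls++; }
```

Note how the two paths differ in who owns the usage count: on the synchronous path the caller still holds it after the call, which is what makes Alan's follow-up objection (two function pointers to store, two contexts to reason about) concrete.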
On Sat, 4 Aug 2012, Rafael J. Wysocki wrote:

> On Saturday, August 04, 2012, Alan Stern wrote:
> > On Sat, 4 Aug 2012, Rafael J. Wysocki wrote:
> >
> > > > That wasn't what he meant.  What if the code needs to run in the _same_
> > > > context as the caller?  For example, with a spinlock held.
> > >
> > > I see.  I think it wouldn't be unreasonable to require that func should
> > > take all of the necessary locks by itself.
> >
> > But then if func was called directly, because the device was at full
> > power, it would deadlock trying to acquire a lock the caller already
> > holds.
>
> I wonder why the caller may want to take any locks beforehand?

Who knows?  :-)  The best answer may be for the caller not to hold any
locks.  In the runtime-PM document's example driver, the lock would be
dropped before the resume-and-call.

> > Do you have any better suggestions?
>
> Hmm.  What about pm_runtime_get_and_call(dev, func_sync, func_async),
> where func_sync() is to be called if the device is already active and
> func_async() is to be called if it has to be resumed from the workqueue?

That's a little better but not much.  It would require storing two
function pointers in the dev->power structure.

Alan Stern
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -280,8 +280,12 @@ static long local_pci_probe(void *_ddi)
 {
 	struct drv_dev_and_id *ddi = _ddi;
 	struct device *dev = &ddi->dev->dev;
+	struct device *parent = dev->parent;
 	int rc;
 
+	/* The parent bridge must be in active state when probing */
+	if (parent)
+		pm_runtime_get_sync(parent);
 	/* Unbound PCI devices are always set to disabled and suspended.
 	 * During probe, the device is set to enabled and active and the
 	 * usage count is incremented.  If the driver supports runtime PM,
@@ -298,6 +302,8 @@ static long local_pci_probe(void *_ddi)
 		pm_runtime_set_suspended(dev);
 		pm_runtime_put_noidle(dev);
 	}
+	if (parent)
+		pm_runtime_put(parent);
 	return rc;
 }
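[For readers following the archive: the shape of the patch -- raise the parent bridge's usage count before probing the child, drop it afterwards on both the success and failure paths -- can be checked with a small user-space sketch. `toy_dev`, `toy_get_sync`, `toy_put` and `toy_local_probe` are made-up stand-ins for struct device, pm_runtime_get_sync(), pm_runtime_put() and local_pci_probe(); this is not the kernel code itself.]

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_dev {
	struct toy_dev *parent;
	int usage_count;   /* stand-in for dev->power.usage_count */
	bool active;
};

/* Stand-ins for pm_runtime_get_sync()/pm_runtime_put(): take or drop a
 * usage reference; get_sync also powers the device up immediately. */
static void toy_get_sync(struct toy_dev *d)
{
	d->usage_count++;
	d->active = true;
}

static void toy_put(struct toy_dev *d)
{
	if (--d->usage_count == 0)
		d->active = false;   /* eligible to suspend again */
}

/* Mirrors the patched local_pci_probe(): the parent bridge is held
 * active across probe, and the reference is balanced regardless of
 * what probe() returns. */
static int toy_local_probe(struct toy_dev *dev, int (*probe)(struct toy_dev *))
{
	struct toy_dev *parent = dev->parent;
	int rc;

	if (parent)
		toy_get_sync(parent);   /* bridge must be active while probing */
	rc = probe(dev);
	if (parent)
		toy_put(parent);        /* balanced on success and on failure */
	return rc;
}

/* Example probe routines for exercising both paths. */
static bool bridge_was_active;
static int sample_probe_ok(struct toy_dev *d)
{
	bridge_was_active = d->parent->active;
	return 0;
}
static int sample_probe_fail(struct toy_dev *d)
{
	(void)d;
	return -1;
}
```

One caveat carried over from the thread: in the sketch the final put is synchronous, which matches Alan Stern's review comment that the real patch should use pm_runtime_put_sync() rather than pm_runtime_put().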