Message ID | 1381479385-1614-1-git-send-email-tianyu.lan@intel.com (mailing list archive) |
---|---|
State | Rejected, archived |
Headers | show |
On Friday, October 11, 2013 04:16:25 PM tianyu.lan@intel.com wrote: > From: Lan Tianyu <tianyu.lan@intel.com> > > Currently, when one power resource is turned on, devices owning it > will be requested to resume regardless of their runtime pm status. > ACPI power resource maybe turn on in some devices' runtime pm > resume callback(E.G, usb port) while turning on the power resource > will trigger one new resume request of the device. It causes > infinite loop between resume and suspend. This has happened on > clearing usb port's PM Qos NO_POWER_OFF flag twice. This patch is > to add check of physical device's runtime pm status and request resume > if the device is suspended. > > Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> > --- > drivers/acpi/power.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c > index 0dbe5cd..228c138 100644 > --- a/drivers/acpi/power.c > +++ b/drivers/acpi/power.c > @@ -250,8 +250,10 @@ static void acpi_power_resume_dependent(struct work_struct *work) > > mutex_lock(&adev->physical_node_lock); > > - list_for_each_entry(pn, &adev->physical_node_list, node) > - pm_request_resume(pn->dev); > + list_for_each_entry(pn, &adev->physical_node_list, node) { > + if (pm_runtime_suspended(pn->dev)) > + pm_request_resume(pn->dev); > + } This is racy, because the status may change right after you check it and before you call pm_request_resume(). Besides, pm_request_resume() checks the status of the device and won't try to resume it if it is not suspended. > > list_for_each_entry(pn, &adev->power_dependent, node) > pm_request_resume(pn->dev); Thanks!
On 2013?10?11? 19:19, Rafael J. Wysocki wrote: > On Friday, October 11, 2013 04:16:25 PM tianyu.lan@intel.com wrote: >> From: Lan Tianyu <tianyu.lan@intel.com> >> >> Currently, when one power resource is turned on, devices owning it >> will be requested to resume regardless of their runtime pm status. >> ACPI power resource maybe turn on in some devices' runtime pm >> resume callback(E.G, usb port) while turning on the power resource >> will trigger one new resume request of the device. It causes >> infinite loop between resume and suspend. This has happened on >> clearing usb port's PM Qos NO_POWER_OFF flag twice. This patch is >> to add check of physical device's runtime pm status and request resume >> if the device is suspended. >> >> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> >> --- >> drivers/acpi/power.c | 6 ++++-- >> 1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c >> index 0dbe5cd..228c138 100644 >> --- a/drivers/acpi/power.c >> +++ b/drivers/acpi/power.c >> @@ -250,8 +250,10 @@ static void acpi_power_resume_dependent(struct work_struct *work) >> >> mutex_lock(&adev->physical_node_lock); >> >> - list_for_each_entry(pn, &adev->physical_node_list, node) >> - pm_request_resume(pn->dev); >> + list_for_each_entry(pn, &adev->physical_node_list, node) { >> + if (pm_runtime_suspended(pn->dev)) >> + pm_request_resume(pn->dev); >> + } > > This is racy, because the status may change right after you check it and before > you call pm_request_resume(). Yes, the runtime status may be changed just after the check. > > Besides, pm_request_resume() checks the status of the device and won't > try to resume it if it is not suspended. > For this issue, usb port is in the RPM_SUSPENDING state when pm_request_resume() is called. The deferred_resume will be set to true during this procedure and trigger resume after finishing suspend. USB port runtime resume callback will turn on its power resource again and the work of acpi_power_resume_dependent() is scheduled. Because the usb port's usage count remains zero, it's to be suspended soon. When pm_request_resume() of acpi_power_resume_dependent() is called, the usb port is always in the PRM_SUSPENDING. Fall in the loop of suspend and resume. How about running acpi_power_dependent when turning on power resource rather than scheduling a work to run it? After this, pm_request_resume() can check device's right status just after turning on power resource. Furthermore, pm_request_resume() is async resume and this change will not consume much time. >> >> list_for_each_entry(pn, &adev->power_dependent, node) >> pm_request_resume(pn->dev); > > Thanks! >
On Tuesday, October 15, 2013 04:58:28 PM Lan Tianyu wrote: > On 2013?10?11? 19:19, Rafael J. Wysocki wrote: > > On Friday, October 11, 2013 04:16:25 PM tianyu.lan@intel.com wrote: > >> From: Lan Tianyu <tianyu.lan@intel.com> > >> > >> Currently, when one power resource is turned on, devices owning it > >> will be requested to resume regardless of their runtime pm status. > >> ACPI power resource maybe turn on in some devices' runtime pm > >> resume callback(E.G, usb port) while turning on the power resource > >> will trigger one new resume request of the device. It causes > >> infinite loop between resume and suspend. This has happened on > >> clearing usb port's PM Qos NO_POWER_OFF flag twice. This patch is > >> to add check of physical device's runtime pm status and request resume > >> if the device is suspended. > >> > >> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> > >> --- > >> drivers/acpi/power.c | 6 ++++-- > >> 1 file changed, 4 insertions(+), 2 deletions(-) > >> > >> diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c > >> index 0dbe5cd..228c138 100644 > >> --- a/drivers/acpi/power.c > >> +++ b/drivers/acpi/power.c > >> @@ -250,8 +250,10 @@ static void acpi_power_resume_dependent(struct work_struct *work) > >> > >> mutex_lock(&adev->physical_node_lock); > >> > >> - list_for_each_entry(pn, &adev->physical_node_list, node) > >> - pm_request_resume(pn->dev); > >> + list_for_each_entry(pn, &adev->physical_node_list, node) { > >> + if (pm_runtime_suspended(pn->dev)) > >> + pm_request_resume(pn->dev); > >> + } > > > > This is racy, because the status may change right after you check it and before > > you call pm_request_resume(). > > Yes, the runtime status may be changed just after the check. > > > > > Besides, pm_request_resume() checks the status of the device and won't > > try to resume it if it is not suspended. > > > > For this issue, usb port is in the RPM_SUSPENDING state when > pm_request_resume() is called. Why exactly does that happen? > The deferred_resume will be set to true > during this procedure and trigger resume after finishing suspend. USB > port runtime resume callback will turn on its power resource again and > the work of acpi_power_resume_dependent() is scheduled. Because the usb > port's usage count remains zero, it's to be suspended soon. When > pm_request_resume() of acpi_power_resume_dependent() is called, the usb > port is always in the PRM_SUSPENDING. Fall in the loop of suspend and > resume. > > How about running acpi_power_dependent when turning on power resource > rather than scheduling a work to run it? Is this actually going to help? Even if acpi_power_resume_dependent() is run synchronously from within the resume callback, it will still see RPM_SUSPENDING as the device's status, won't it? > After this, pm_request_resume() can check device's right status just after > turning on power resource. The status doesn't change until the .runtime_suspend() callback returns and running pm_request_resume() syncrhonously from that callback for the device being suspended just plain doesn't make sense. > Furthermore, pm_request_resume() is async resume and > this change will not consume much time. acpi_power_resume_dependent() is run from a work item to avoid locking problems. Can you please explain to me how this is possible that the USB port's power resource is turned "on" while the port is RPM_SUSPENDING?
On Wednesday, October 16, 2013 05:26:21 PM Lan Tianyu wrote: > This is a multi-part message in MIME format. > --------------090400010209000300030201 > Content-Type: text/plain; charset=UTF-8 > Content-Transfer-Encoding: 8bit > > On 10/16/2013 05:22 AM, Rafael J. Wysocki wrote: > > On Tuesday, October 15, 2013 04:58:28 PM Lan Tianyu wrote: > >> On 2013å¹´10æ??11æ?¥ 19:19, Rafael J. Wysocki wrote: > >>> On Friday, October 11, 2013 04:16:25 PM tianyu.lan@intel.com > >>> wrote: > >>>> From: Lan Tianyu <tianyu.lan@intel.com> > >>>> > >>>> Currently, when one power resource is turned on, devices owning > >>>> it will be requested to resume regardless of their runtime pm > >>>> status. ACPI power resource maybe turn on in some devices' > >>>> runtime pm resume callback(E.G, usb port) while turning on the > >>>> power resource will trigger one new resume request of the > >>>> device. It causes infinite loop between resume and suspend. > >>>> This has happened on clearing usb port's PM Qos NO_POWER_OFF > >>>> flag twice. This patch is to add check of physical device's > >>>> runtime pm status and request resume if the device is > >>>> suspended. > >>>> > >>>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> --- > >>>> drivers/acpi/power.c | 6 ++++-- 1 file changed, 4 > >>>> insertions(+), 2 deletions(-) > >>>> > >>>> diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c index > >>>> 0dbe5cd..228c138 100644 --- a/drivers/acpi/power.c +++ > >>>> b/drivers/acpi/power.c @@ -250,8 +250,10 @@ static void > >>>> acpi_power_resume_dependent(struct work_struct *work) > >>>> > >>>> mutex_lock(&adev->physical_node_lock); > >>>> > >>>> - list_for_each_entry(pn, &adev->physical_node_list, node) - > >>>> pm_request_resume(pn->dev); + list_for_each_entry(pn, > >>>> &adev->physical_node_list, node) { + if > >>>> (pm_runtime_suspended(pn->dev)) + pm_request_resume(pn->dev); > >>>> + } > >>> > >>> This is racy, because the status may change right after you check > >>> it and before you call pm_request_resume(). > >> > >> Yes, the runtime status may be changed just after the check. > >> > >>> > >>> Besides, pm_request_resume() checks the status of the device and > >>> won't try to resume it if it is not suspended. > >>> > >> > >> For this issue, usb port is in the RPM_SUSPENDING state when > >> pm_request_resume() is called. > > > > Why exactly does that happen? > > > >> The deferred_resume will be set to true during this procedure and > >> trigger resume after finishing suspend. USB port runtime resume > >> callback will turn on its power resource again and the work of > >> acpi_power_resume_dependent() is scheduled. Because the usb port's > >> usage count remains zero, it's to be suspended soon. When > >> pm_request_resume() of acpi_power_resume_dependent() is called, the > >> usb port is always in the PRM_SUSPENDING. Fall in the loop of > >> suspend and resume. > >> > >> How about running acpi_power_dependent when turning on power > >> resource rather than scheduling a work to run it? > > > > Is this actually going to help? Even if > > acpi_power_resume_dependent() is run synchronously from within the > > resume callback, it will still see RPM_SUSPENDING as the device's > > status, won't it? > > > >> After this, pm_request_resume() can check device's right status > >> just after turning on power resource. > > > > The status doesn't change until the .runtime_suspend() callback > > returns and running pm_request_resume() syncrhonously from that > > callback for the device being suspended just plain doesn't make > > sense. > > > > > > Can you please explain to me how this is possible that the USB port's > > power resource is turned "on" while the port is RPM_SUSPENDING? > > > [This mail seems not to be sent to maillist. So resend. Try again > Sorry for noise] > > > Hi Rafael: > > No, I mean the acpi_power_resume_dependent() is running while the port's > status is RPM_SUSPENDING. It runs from a work item and turning on power > resource adds the work to workqueue. There is a timeslot between running > acpi_power_resume_dependent() and turning on power resource. In the > timeslot, the device runtime pm status maybe change. > > In this case, changing PM Qos flag will do pm_runtime_get_sync and > pm_runtime_put usb port. The usb port is without device attached and so > it will be resumed and suspended soon during changing PM Qos flag. > > Resume callback turns on power resource if NO_POWER_OFF flag has been > cleared and add the work of acpi_power_resume_dependent() to > workqueue. After resume, the usb port will be suspended while the > acpi_power_resume_dependent() is running. pm_request_resume() gets > RPM_SUSPENDING as the usb port's runtime status and set the > deferred_resume of usb port. > > After suspend, usb port will be resumed again and turn on power > resource. The work of acpi_power_resume_dependent() is also added to > workqueue. Because the usb port's usage count remains 0 during this > procedure. it will be suspended soon after resume. While suspending, > acpi_power_resume_dependent() is running and pm_request_resume() > sets deferred_resume again. The infinite look starts. So the problem is that the whole thing is racy. There is no guarantee that the power resource will not be turned off after the acpi_power_get_inferred_state() check in acpi_power_resume_dependent() and before rpm_resume() queued by the pm_request_resume() called from there runs. Playing with the time windows doesn't make that race go away. If we did what you suggested, the race still would be there, because the queued up resume may still run after the power resource has been turned off. Unfortunately, I don't see how we can fix this race in a satisfactory way and I'm starting to think that the whole resuming of dependent devices may be a bad idea. IIRC, the original concern was that devices may end up in D0-uninitialized if we don't do that, but then whoever turned the power resource on will probably turn if off at one point anyway, so they will be in that state temporarily. In other words, in addition to the fact that this is racy, there even is no reason to do it. I'll send a patch to rip off that stuff later today. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013?10?16? 20:42, Rafael J. Wysocki wrote: > On Wednesday, October 16, 2013 05:26:21 PM Lan Tianyu wrote: >> This is a multi-part message in MIME format. >> --------------090400010209000300030201 >> Content-Type: text/plain; charset=UTF-8 >> Content-Transfer-Encoding: 8bit >> >> On 10/16/2013 05:22 AM, Rafael J. Wysocki wrote: >>> On Tuesday, October 15, 2013 04:58:28 PM Lan Tianyu wrote: >>>> On 2013å¹´10æ??11æ?¥ 19:19, Rafael J. Wysocki wrote: >>>>> On Friday, October 11, 2013 04:16:25 PM tianyu.lan@intel.com >>>>> wrote: >>>>>> From: Lan Tianyu <tianyu.lan@intel.com> >>>>>> >>>>>> Currently, when one power resource is turned on, devices owning >>>>>> it will be requested to resume regardless of their runtime pm >>>>>> status. ACPI power resource maybe turn on in some devices' >>>>>> runtime pm resume callback(E.G, usb port) while turning on the >>>>>> power resource will trigger one new resume request of the >>>>>> device. It causes infinite loop between resume and suspend. >>>>>> This has happened on clearing usb port's PM Qos NO_POWER_OFF >>>>>> flag twice. This patch is to add check of physical device's >>>>>> runtime pm status and request resume if the device is >>>>>> suspended. >>>>>> >>>>>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> --- >>>>>> drivers/acpi/power.c | 6 ++++-- 1 file changed, 4 >>>>>> insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c index >>>>>> 0dbe5cd..228c138 100644 --- a/drivers/acpi/power.c +++ >>>>>> b/drivers/acpi/power.c @@ -250,8 +250,10 @@ static void >>>>>> acpi_power_resume_dependent(struct work_struct *work) >>>>>> >>>>>> mutex_lock(&adev->physical_node_lock); >>>>>> >>>>>> - list_for_each_entry(pn, &adev->physical_node_list, node) - >>>>>> pm_request_resume(pn->dev); + list_for_each_entry(pn, >>>>>> &adev->physical_node_list, node) { + if >>>>>> (pm_runtime_suspended(pn->dev)) + pm_request_resume(pn->dev); >>>>>> + } >>>>> >>>>> This is racy, because the status may change right after you check >>>>> it and before you call pm_request_resume(). >>>> >>>> Yes, the runtime status may be changed just after the check. >>>> >>>>> >>>>> Besides, pm_request_resume() checks the status of the device and >>>>> won't try to resume it if it is not suspended. >>>>> >>>> >>>> For this issue, usb port is in the RPM_SUSPENDING state when >>>> pm_request_resume() is called. >>> >>> Why exactly does that happen? >>> >>>> The deferred_resume will be set to true during this procedure and >>>> trigger resume after finishing suspend. USB port runtime resume >>>> callback will turn on its power resource again and the work of >>>> acpi_power_resume_dependent() is scheduled. Because the usb port's >>>> usage count remains zero, it's to be suspended soon. When >>>> pm_request_resume() of acpi_power_resume_dependent() is called, the >>>> usb port is always in the PRM_SUSPENDING. Fall in the loop of >>>> suspend and resume. >>>> >>>> How about running acpi_power_dependent when turning on power >>>> resource rather than scheduling a work to run it? >>> >>> Is this actually going to help? Even if >>> acpi_power_resume_dependent() is run synchronously from within the >>> resume callback, it will still see RPM_SUSPENDING as the device's >>> status, won't it? >>> >>>> After this, pm_request_resume() can check device's right status >>>> just after turning on power resource. >>> >>> The status doesn't change until the .runtime_suspend() callback >>> returns and running pm_request_resume() syncrhonously from that >>> callback for the device being suspended just plain doesn't make >>> sense. >>> >>> >>> Can you please explain to me how this is possible that the USB port's >>> power resource is turned "on" while the port is RPM_SUSPENDING? >>> >> [This mail seems not to be sent to maillist. So resend. Try again >> Sorry for noise] >> >> >> Hi Rafael: >> >> No, I mean the acpi_power_resume_dependent() is running while the port's >> status is RPM_SUSPENDING. It runs from a work item and turning on power >> resource adds the work to workqueue. There is a timeslot between running >> acpi_power_resume_dependent() and turning on power resource. In the >> timeslot, the device runtime pm status maybe change. >> >> In this case, changing PM Qos flag will do pm_runtime_get_sync and >> pm_runtime_put usb port. The usb port is without device attached and so >> it will be resumed and suspended soon during changing PM Qos flag. >> >> Resume callback turns on power resource if NO_POWER_OFF flag has been >> cleared and add the work of acpi_power_resume_dependent() to >> workqueue. After resume, the usb port will be suspended while the >> acpi_power_resume_dependent() is running. pm_request_resume() gets >> RPM_SUSPENDING as the usb port's runtime status and set the >> deferred_resume of usb port. >> >> After suspend, usb port will be resumed again and turn on power >> resource. The work of acpi_power_resume_dependent() is also added to >> workqueue. Because the usb port's usage count remains 0 during this >> procedure. it will be suspended soon after resume. While suspending, >> acpi_power_resume_dependent() is running and pm_request_resume() >> sets deferred_resume again. The infinite look starts. > > So the problem is that the whole thing is racy. There is no guarantee > that the power resource will not be turned off after the > acpi_power_get_inferred_state() check in acpi_power_resume_dependent() > and before rpm_resume() queued by the pm_request_resume() called from > there runs. Playing with the time windows doesn't make that race go away. > > If we did what you suggested, the race still would be there, because the > queued up resume may still run after the power resource has been turned off. Yes, the queued up resume will run after the power resource has been turned off but normally the resume should try to turn on the power resource before doing some hardware related operations. If so, there will not be problem, right? > > Unfortunately, I don't see how we can fix this race in a satisfactory way > and I'm starting to think that the whole resuming of dependent devices may > be a bad idea. > > IIRC, the original concern was that devices may end up in D0-uninitialized > if we don't do that, but then whoever turned the power resource on will > probably turn if off at one point anyway, so they will be in that state > temporarily. In other words, in addition to the fact that this is racy, > there even is no reason to do it. > > I'll send a patch to rip off that stuff later today. > > Thanks, > Rafael >
On 2013?10?17? 09:02, Lan Tianyu wrote: > On 2013?10?16? 20:42, Rafael J. Wysocki wrote: >> On Wednesday, October 16, 2013 05:26:21 PM Lan Tianyu wrote: >>> This is a multi-part message in MIME format. >>> --------------090400010209000300030201 >>> Content-Type: text/plain; charset=UTF-8 >>> Content-Transfer-Encoding: 8bit >>> >>> On 10/16/2013 05:22 AM, Rafael J. Wysocki wrote: >>>> On Tuesday, October 15, 2013 04:58:28 PM Lan Tianyu wrote: >>>>> On 2013å¹´10æ??11æ?¥ 19:19, Rafael J. Wysocki wrote: >>>>>> On Friday, October 11, 2013 04:16:25 PM tianyu.lan@intel.com >>>>>> wrote: >>>>>>> From: Lan Tianyu <tianyu.lan@intel.com> >>>>>>> >>>>>>> Currently, when one power resource is turned on, devices owning >>>>>>> it will be requested to resume regardless of their runtime pm >>>>>>> status. ACPI power resource maybe turn on in some devices' >>>>>>> runtime pm resume callback(E.G, usb port) while turning on the >>>>>>> power resource will trigger one new resume request of the >>>>>>> device. It causes infinite loop between resume and suspend. >>>>>>> This has happened on clearing usb port's PM Qos NO_POWER_OFF >>>>>>> flag twice. This patch is to add check of physical device's >>>>>>> runtime pm status and request resume if the device is >>>>>>> suspended. >>>>>>> >>>>>>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> --- >>>>>>> drivers/acpi/power.c | 6 ++++-- 1 file changed, 4 >>>>>>> insertions(+), 2 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c index >>>>>>> 0dbe5cd..228c138 100644 --- a/drivers/acpi/power.c +++ >>>>>>> b/drivers/acpi/power.c @@ -250,8 +250,10 @@ static void >>>>>>> acpi_power_resume_dependent(struct work_struct *work) >>>>>>> >>>>>>> mutex_lock(&adev->physical_node_lock); >>>>>>> >>>>>>> - list_for_each_entry(pn, &adev->physical_node_list, node) - >>>>>>> pm_request_resume(pn->dev); + list_for_each_entry(pn, >>>>>>> &adev->physical_node_list, node) { + if >>>>>>> (pm_runtime_suspended(pn->dev)) + pm_request_resume(pn->dev); >>>>>>> + } >>>>>> >>>>>> This is racy, because the status may change right after you check >>>>>> it and before you call pm_request_resume(). >>>>> >>>>> Yes, the runtime status may be changed just after the check. >>>>> >>>>>> >>>>>> Besides, pm_request_resume() checks the status of the device and >>>>>> won't try to resume it if it is not suspended. >>>>>> >>>>> >>>>> For this issue, usb port is in the RPM_SUSPENDING state when >>>>> pm_request_resume() is called. >>>> >>>> Why exactly does that happen? >>>> >>>>> The deferred_resume will be set to true during this procedure and >>>>> trigger resume after finishing suspend. USB port runtime resume >>>>> callback will turn on its power resource again and the work of >>>>> acpi_power_resume_dependent() is scheduled. Because the usb port's >>>>> usage count remains zero, it's to be suspended soon. When >>>>> pm_request_resume() of acpi_power_resume_dependent() is called, the >>>>> usb port is always in the PRM_SUSPENDING. Fall in the loop of >>>>> suspend and resume. >>>>> >>>>> How about running acpi_power_dependent when turning on power >>>>> resource rather than scheduling a work to run it? >>>> >>>> Is this actually going to help? Even if >>>> acpi_power_resume_dependent() is run synchronously from within the >>>> resume callback, it will still see RPM_SUSPENDING as the device's >>>> status, won't it? >>>> >>>>> After this, pm_request_resume() can check device's right status >>>>> just after turning on power resource. >>>> >>>> The status doesn't change until the .runtime_suspend() callback >>>> returns and running pm_request_resume() syncrhonously from that >>>> callback for the device being suspended just plain doesn't make >>>> sense. >>>> >>>> >>>> Can you please explain to me how this is possible that the USB port's >>>> power resource is turned "on" while the port is RPM_SUSPENDING? >>>> >>> [This mail seems not to be sent to maillist. So resend. Try again >>> Sorry for noise] >>> >>> >>> Hi Rafael: >>> >>> No, I mean the acpi_power_resume_dependent() is running while the port's >>> status is RPM_SUSPENDING. It runs from a work item and turning on power >>> resource adds the work to workqueue. There is a timeslot between running >>> acpi_power_resume_dependent() and turning on power resource. In the >>> timeslot, the device runtime pm status maybe change. >>> >>> In this case, changing PM Qos flag will do pm_runtime_get_sync and >>> pm_runtime_put usb port. The usb port is without device attached and so >>> it will be resumed and suspended soon during changing PM Qos flag. >>> >>> Resume callback turns on power resource if NO_POWER_OFF flag has been >>> cleared and add the work of acpi_power_resume_dependent() to >>> workqueue. After resume, the usb port will be suspended while the >>> acpi_power_resume_dependent() is running. pm_request_resume() gets >>> RPM_SUSPENDING as the usb port's runtime status and set the >>> deferred_resume of usb port. >>> >>> After suspend, usb port will be resumed again and turn on power >>> resource. The work of acpi_power_resume_dependent() is also added to >>> workqueue. Because the usb port's usage count remains 0 during this >>> procedure. it will be suspended soon after resume. While suspending, >>> acpi_power_resume_dependent() is running and pm_request_resume() >>> sets deferred_resume again. The infinite look starts. >> >> So the problem is that the whole thing is racy. There is no guarantee >> that the power resource will not be turned off after the >> acpi_power_get_inferred_state() check in acpi_power_resume_dependent() >> and before rpm_resume() queued by the pm_request_resume() called from >> there runs. Playing with the time windows doesn't make that race go away. >> >> If we did what you suggested, the race still would be there, because the >> queued up resume may still run after the power resource has been turned off. > > Yes, the queued up resume will run after the power resource has been > turned off but normally the resume should try to turn on the power > resource before doing some hardware related operations. If so, there > will not be problem, right? > Sorry. I think I misunderstood your word. Please ignore the previous comment. Yes, there is still a racy. I think of one case that there are multi devices that share one power resource. One device turns on power resource during resuming and queue resumes for other devices. The device enter into suspend and power resource turns off soon. When one other device do resume, the power resource will turn on again and queue resume for the previous device. Just like a Ping-pong. >> >> Unfortunately, I don't see how we can fix this race in a satisfactory way >> and I'm starting to think that the whole resuming of dependent devices may >> be a bad idea. >> >> IIRC, the original concern was that devices may end up in D0-uninitialized >> if we don't do that, but then whoever turned the power resource on will >> probably turn if off at one point anyway, so they will be in that state >> temporarily. In other words, in addition to the fact that this is racy, >> there even is no reason to do it. >> >> I'll send a patch to rip off that stuff later today. Currently, dropping it should be the better choice but I think we still need to resolve the D0-uninitialized problem, right? I will try to resolve it by other way. >> >> Thanks, >> Rafael >> > >
On 2013?10?17? 10:40, Lan Tianyu wrote: > On 2013?10?17? 09:02, Lan Tianyu wrote: >> On 2013?10?16? 20:42, Rafael J. Wysocki wrote: >>> On Wednesday, October 16, 2013 05:26:21 PM Lan Tianyu wrote: >>>> This is a multi-part message in MIME format. >>>> --------------090400010209000300030201 >>>> Content-Type: text/plain; charset=UTF-8 >>>> Content-Transfer-Encoding: 8bit >>>> >>>> On 10/16/2013 05:22 AM, Rafael J. Wysocki wrote: >>>>> On Tuesday, October 15, 2013 04:58:28 PM Lan Tianyu wrote: >>>>>> On 2013å¹´10æ??11æ?¥ 19:19, Rafael J. Wysocki wrote: >>>>>>> On Friday, October 11, 2013 04:16:25 PM tianyu.lan@intel.com >>>>>>> wrote: >>>>>>>> From: Lan Tianyu <tianyu.lan@intel.com> >>>>>>>> >>>>>>>> Currently, when one power resource is turned on, devices owning >>>>>>>> it will be requested to resume regardless of their runtime pm >>>>>>>> status. ACPI power resource maybe turn on in some devices' >>>>>>>> runtime pm resume callback(E.G, usb port) while turning on the >>>>>>>> power resource will trigger one new resume request of the >>>>>>>> device. It causes infinite loop between resume and suspend. >>>>>>>> This has happened on clearing usb port's PM Qos NO_POWER_OFF >>>>>>>> flag twice. This patch is to add check of physical device's >>>>>>>> runtime pm status and request resume if the device is >>>>>>>> suspended. >>>>>>>> >>>>>>>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> --- >>>>>>>> drivers/acpi/power.c | 6 ++++-- 1 file changed, 4 >>>>>>>> insertions(+), 2 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c index >>>>>>>> 0dbe5cd..228c138 100644 --- a/drivers/acpi/power.c +++ >>>>>>>> b/drivers/acpi/power.c @@ -250,8 +250,10 @@ static void >>>>>>>> acpi_power_resume_dependent(struct work_struct *work) >>>>>>>> >>>>>>>> mutex_lock(&adev->physical_node_lock); >>>>>>>> >>>>>>>> - list_for_each_entry(pn, &adev->physical_node_list, node) - >>>>>>>> pm_request_resume(pn->dev); + list_for_each_entry(pn, >>>>>>>> &adev->physical_node_list, node) { + if >>>>>>>> (pm_runtime_suspended(pn->dev)) + pm_request_resume(pn->dev); >>>>>>>> + } >>>>>>> >>>>>>> This is racy, because the status may change right after you check >>>>>>> it and before you call pm_request_resume(). >>>>>> >>>>>> Yes, the runtime status may be changed just after the check. >>>>>> >>>>>>> >>>>>>> Besides, pm_request_resume() checks the status of the device and >>>>>>> won't try to resume it if it is not suspended. >>>>>>> >>>>>> >>>>>> For this issue, usb port is in the RPM_SUSPENDING state when >>>>>> pm_request_resume() is called. >>>>> >>>>> Why exactly does that happen? >>>>> >>>>>> The deferred_resume will be set to true during this procedure and >>>>>> trigger resume after finishing suspend. USB port runtime resume >>>>>> callback will turn on its power resource again and the work of >>>>>> acpi_power_resume_dependent() is scheduled. Because the usb port's >>>>>> usage count remains zero, it's to be suspended soon. When >>>>>> pm_request_resume() of acpi_power_resume_dependent() is called, the >>>>>> usb port is always in the PRM_SUSPENDING. Fall in the loop of >>>>>> suspend and resume. >>>>>> >>>>>> How about running acpi_power_dependent when turning on power >>>>>> resource rather than scheduling a work to run it? >>>>> >>>>> Is this actually going to help? Even if >>>>> acpi_power_resume_dependent() is run synchronously from within the >>>>> resume callback, it will still see RPM_SUSPENDING as the device's >>>>> status, won't it? >>>>> >>>>>> After this, pm_request_resume() can check device's right status >>>>>> just after turning on power resource. >>>>> >>>>> The status doesn't change until the .runtime_suspend() callback >>>>> returns and running pm_request_resume() syncrhonously from that >>>>> callback for the device being suspended just plain doesn't make >>>>> sense. >>>>> >>>>> >>>>> Can you please explain to me how this is possible that the USB port's >>>>> power resource is turned "on" while the port is RPM_SUSPENDING? >>>>> >>>> [This mail seems not to be sent to maillist. So resend. Try again >>>> Sorry for noise] >>>> >>>> >>>> Hi Rafael: >>>> >>>> No, I mean the acpi_power_resume_dependent() is running while the port's >>>> status is RPM_SUSPENDING. It runs from a work item and turning on power >>>> resource adds the work to workqueue. There is a timeslot between running >>>> acpi_power_resume_dependent() and turning on power resource. In the >>>> timeslot, the device runtime pm status maybe change. >>>> >>>> In this case, changing PM Qos flag will do pm_runtime_get_sync and >>>> pm_runtime_put usb port. The usb port is without device attached and so >>>> it will be resumed and suspended soon during changing PM Qos flag. >>>> >>>> Resume callback turns on power resource if NO_POWER_OFF flag has been >>>> cleared and add the work of acpi_power_resume_dependent() to >>>> workqueue. After resume, the usb port will be suspended while the >>>> acpi_power_resume_dependent() is running. pm_request_resume() gets >>>> RPM_SUSPENDING as the usb port's runtime status and set the >>>> deferred_resume of usb port. >>>> >>>> After suspend, usb port will be resumed again and turn on power >>>> resource. The work of acpi_power_resume_dependent() is also added to >>>> workqueue. Because the usb port's usage count remains 0 during this >>>> procedure. it will be suspended soon after resume. While suspending, >>>> acpi_power_resume_dependent() is running and pm_request_resume() >>>> sets deferred_resume again. The infinite look starts. >>> >>> So the problem is that the whole thing is racy. There is no guarantee >>> that the power resource will not be turned off after the >>> acpi_power_get_inferred_state() check in acpi_power_resume_dependent() >>> and before rpm_resume() queued by the pm_request_resume() called from >>> there runs. Playing with the time windows doesn't make that race go away. >>> >>> If we did what you suggested, the race still would be there, because the >>> queued up resume may still run after the power resource has been turned off. >> >> Yes, the queued up resume will run after the power resource has been >> turned off but normally the resume should try to turn on the power >> resource before doing some hardware related operations. If so, there >> will not be problem, right? >> > > Sorry. I think I misunderstood your word. Please ignore the previous > comment. > > Yes, there is still a racy. I think of one case that there are multi > devices that share one power resource. One device turns on power > resource during resuming and queue resumes for other devices. The device > enter into suspend and power resource turns off soon. When one other > device do resume, the power resource will turn on again and queue resume > for the previous device. Just like a Ping-pong. > >>> >>> Unfortunately, I don't see how we can fix this race in a satisfactory way >>> and I'm starting to think that the whole resuming of dependent devices may >>> be a bad idea. >>> >>> IIRC, the original concern was that devices may end up in D0-uninitialized >>> if we don't do that, but then whoever turned the power resource on will >>> probably turn if off at one point anyway, so they will be in that state >>> temporarily. In other words, in addition to the fact that this is racy, >>> there even is no reason to do it. >>> >>> I'll send a patch to rip off that stuff later today. > > Currently, dropping it should be the better choice but I think we still > need to resolve the D0-uninitialized problem, right? I will try to > resolve it by other way. > How about create generic power domain for power resource and adding physical devices and dependent devices to the domain? When the domain powers on, resume all the devices. >>> >>> Thanks, >>> Rafael >>> >> >> > >
On Thursday, October 17, 2013 10:40:03 AM Lan Tianyu wrote: > On 2013?10?17? 09:02, Lan Tianyu wrote: > > On 2013?10?16? 20:42, Rafael J. Wysocki wrote: > >> On Wednesday, October 16, 2013 05:26:21 PM Lan Tianyu wrote: > >>> This is a multi-part message in MIME format. > >>> --------------090400010209000300030201 > >>> Content-Type: text/plain; charset=UTF-8 > >>> Content-Transfer-Encoding: 8bit > >>> > >>> On 10/16/2013 05:22 AM, Rafael J. Wysocki wrote: > >>>> On Tuesday, October 15, 2013 04:58:28 PM Lan Tianyu wrote: > >>>>> On 2013å¹´10æ??11æ?¥ 19:19, Rafael J. Wysocki wrote: > >>>>>> On Friday, October 11, 2013 04:16:25 PM tianyu.lan@intel.com > >>>>>> wrote: > >>>>>>> From: Lan Tianyu <tianyu.lan@intel.com> > >>>>>>> > >>>>>>> Currently, when one power resource is turned on, devices owning > >>>>>>> it will be requested to resume regardless of their runtime pm > >>>>>>> status. ACPI power resource maybe turn on in some devices' > >>>>>>> runtime pm resume callback(E.G, usb port) while turning on the > >>>>>>> power resource will trigger one new resume request of the > >>>>>>> device. It causes infinite loop between resume and suspend. > >>>>>>> This has happened on clearing usb port's PM Qos NO_POWER_OFF > >>>>>>> flag twice. This patch is to add check of physical device's > >>>>>>> runtime pm status and request resume if the device is > >>>>>>> suspended. > >>>>>>> > >>>>>>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> --- > >>>>>>> drivers/acpi/power.c | 6 ++++-- 1 file changed, 4 > >>>>>>> insertions(+), 2 deletions(-) > >>>>>>> > >>>>>>> diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c index > >>>>>>> 0dbe5cd..228c138 100644 --- a/drivers/acpi/power.c +++ > >>>>>>> b/drivers/acpi/power.c @@ -250,8 +250,10 @@ static void > >>>>>>> acpi_power_resume_dependent(struct work_struct *work) > >>>>>>> > >>>>>>> mutex_lock(&adev->physical_node_lock); > >>>>>>> > >>>>>>> - list_for_each_entry(pn, &adev->physical_node_list, node) - > >>>>>>> pm_request_resume(pn->dev); + list_for_each_entry(pn, > >>>>>>> &adev->physical_node_list, node) { + if > >>>>>>> (pm_runtime_suspended(pn->dev)) + pm_request_resume(pn->dev); > >>>>>>> + } > >>>>>> > >>>>>> This is racy, because the status may change right after you check > >>>>>> it and before you call pm_request_resume(). > >>>>> > >>>>> Yes, the runtime status may be changed just after the check. > >>>>> > >>>>>> > >>>>>> Besides, pm_request_resume() checks the status of the device and > >>>>>> won't try to resume it if it is not suspended. > >>>>>> > >>>>> > >>>>> For this issue, usb port is in the RPM_SUSPENDING state when > >>>>> pm_request_resume() is called. > >>>> > >>>> Why exactly does that happen? > >>>> > >>>>> The deferred_resume will be set to true during this procedure and > >>>>> trigger resume after finishing suspend. USB port runtime resume > >>>>> callback will turn on its power resource again and the work of > >>>>> acpi_power_resume_dependent() is scheduled. Because the usb port's > >>>>> usage count remains zero, it's to be suspended soon. When > >>>>> pm_request_resume() of acpi_power_resume_dependent() is called, the > >>>>> usb port is always in the PRM_SUSPENDING. Fall in the loop of > >>>>> suspend and resume. > >>>>> > >>>>> How about running acpi_power_dependent when turning on power > >>>>> resource rather than scheduling a work to run it? > >>>> > >>>> Is this actually going to help? Even if > >>>> acpi_power_resume_dependent() is run synchronously from within the > >>>> resume callback, it will still see RPM_SUSPENDING as the device's > >>>> status, won't it? > >>>> > >>>>> After this, pm_request_resume() can check device's right status > >>>>> just after turning on power resource. > >>>> > >>>> The status doesn't change until the .runtime_suspend() callback > >>>> returns and running pm_request_resume() syncrhonously from that > >>>> callback for the device being suspended just plain doesn't make > >>>> sense. > >>>> > >>>> > >>>> Can you please explain to me how this is possible that the USB port's > >>>> power resource is turned "on" while the port is RPM_SUSPENDING? > >>>> > >>> [This mail seems not to be sent to maillist. So resend. Try again > >>> Sorry for noise] > >>> > >>> > >>> Hi Rafael: > >>> > >>> No, I mean the acpi_power_resume_dependent() is running while the port's > >>> status is RPM_SUSPENDING. It runs from a work item and turning on power > >>> resource adds the work to workqueue. There is a timeslot between running > >>> acpi_power_resume_dependent() and turning on power resource. In the > >>> timeslot, the device runtime pm status maybe change. > >>> > >>> In this case, changing PM Qos flag will do pm_runtime_get_sync and > >>> pm_runtime_put usb port. The usb port is without device attached and so > >>> it will be resumed and suspended soon during changing PM Qos flag. > >>> > >>> Resume callback turns on power resource if NO_POWER_OFF flag has been > >>> cleared and add the work of acpi_power_resume_dependent() to > >>> workqueue. After resume, the usb port will be suspended while the > >>> acpi_power_resume_dependent() is running. pm_request_resume() gets > >>> RPM_SUSPENDING as the usb port's runtime status and set the > >>> deferred_resume of usb port. > >>> > >>> After suspend, usb port will be resumed again and turn on power > >>> resource. The work of acpi_power_resume_dependent() is also added to > >>> workqueue. Because the usb port's usage count remains 0 during this > >>> procedure. it will be suspended soon after resume. While suspending, > >>> acpi_power_resume_dependent() is running and pm_request_resume() > >>> sets deferred_resume again. The infinite look starts. > >> > >> So the problem is that the whole thing is racy. There is no guarantee > >> that the power resource will not be turned off after the > >> acpi_power_get_inferred_state() check in acpi_power_resume_dependent() > >> and before rpm_resume() queued by the pm_request_resume() called from > >> there runs. Playing with the time windows doesn't make that race go away. > >> > >> If we did what you suggested, the race still would be there, because the > >> queued up resume may still run after the power resource has been turned off. > > > > Yes, the queued up resume will run after the power resource has been > > turned off but normally the resume should try to turn on the power > > resource before doing some hardware related operations. If so, there > > will not be problem, right? > > > > Sorry. I think I misunderstood your word. Please ignore the previous > comment. > > Yes, there is still a racy. I think of one case that there are multi > devices that share one power resource. One device turns on power > resource during resuming and queue resumes for other devices. The device > enter into suspend and power resource turns off soon. When one other > device do resume, the power resource will turn on again and queue resume > for the previous device. Just like a Ping-pong. > > >> > >> Unfortunately, I don't see how we can fix this race in a satisfactory way > >> and I'm starting to think that the whole resuming of dependent devices may > >> be a bad idea. > >> > >> IIRC, the original concern was that devices may end up in D0-uninitialized > >> if we don't do that, but then whoever turned the power resource on will > >> probably turn if off at one point anyway, so they will be in that state > >> temporarily. In other words, in addition to the fact that this is racy, > >> there even is no reason to do it. > >> > >> I'll send a patch to rip off that stuff later today. > > Currently, dropping it should be the better choice but I think we still > need to resolve the D0-uninitialized problem, right? Why do you think it is a problem in the first place? Those devices will not be accessed while in that state (unless there's a bug somewhere). Thanks!
On 10/17/2013 07:38 PM, Rafael J. Wysocki wrote: >>>>>>> Unfortunately, I don't see how we can fix this race in a >>>>>>> satisfactory way and I'm starting to think that the whole >>>>>>> resuming of dependent devices may be a bad idea. >>>>>>> >>>>>>> IIRC, the original concern was that devices may end up in >>>>>>> D0-uninitialized if we don't do that, but then whoever >>>>>>> turned the power resource on will probably turn if off at >>>>>>> one point anyway, so they will be in that state >>>>>>> temporarily. In other words, in addition to the fact >>>>>>> that this is racy, there even is no reason to do it. >>>>>>> >>>>>>> I'll send a patch to rip off that stuff later today. >>> >>> Currently, dropping it should be the better choice but I think we >>> still need to resolve the D0-uninitialized problem, right? > Why do you think it is a problem in the first place? Those devices > will not be accessed while in that state (unless there's a bug > somewhere). > Yes, those devices will not be accessed but they will continue to stay D0-uninitiallized without any users before next resume and suspend. PM core and device driver still think they are in the lower power state. At this point, it seems these devices should be put into lower power state(E.G D3hot) than D0-uninitiallized. E.G, two devices share one power resource. After they are suspended and power resource turns off, one device is resumed and power resource turns on. The other device will remain D0-uninitallized until there are resume and suspend for it. It may consume more power than lowest power state it can reach at that point. > Thanks! > > -- I speak only for myself. Rafael J. Wysocki, Intel Open Source > Technology Center. -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Friday, October 18, 2013 09:05:13 PM Lan Tianyu wrote: > On 10/17/2013 07:38 PM, Rafael J. Wysocki wrote: > >>>>>>> Unfortunately, I don't see how we can fix this race in a > >>>>>>> satisfactory way and I'm starting to think that the whole > >>>>>>> resuming of dependent devices may be a bad idea. > >>>>>>> > >>>>>>> IIRC, the original concern was that devices may end up in > >>>>>>> D0-uninitialized if we don't do that, but then whoever > >>>>>>> turned the power resource on will probably turn if off at > >>>>>>> one point anyway, so they will be in that state > >>>>>>> temporarily. In other words, in addition to the fact > >>>>>>> that this is racy, there even is no reason to do it. > >>>>>>> > >>>>>>> I'll send a patch to rip off that stuff later today. > >>> > >>> Currently, dropping it should be the better choice but I think we > >>> still need to resolve the D0-uninitialized problem, right? > > Why do you think it is a problem in the first place? Those devices > > will not be accessed while in that state (unless there's a bug > > somewhere). > > > > Yes, those devices will not be accessed but they will continue to stay > D0-uninitiallized without any users before next resume and suspend. PM > core and device driver still think they are in the lower power state. > > At this point, it seems these devices should be put into lower power > state(E.G D3hot) than D0-uninitiallized. > > E.G, two devices share one power resource. After they are suspended and > power resource turns off, one device is resumed and power resource turns > on. The other device will remain D0-uninitallized until there are resume > and suspend for it. That's not correct. If the device that has been resumed is now suspended again, that will cause the power resource to go off again and the other device will not stay in D0-uninitialized. > It may consume more power than lowest power state it can reach at that point. I don't see a compelling case for that. The only situation of interest is if A and B share a power resource, they both are suspended to start with, so the power resource is off, and then A is resumed and not suspended again for a relatively long time. In that case B *may* be in D0-uninitialized although it could go into D3hot if it were resumed and suspended. Now there's a question of timing, because the resume and suspend of B is only beneficial if it would stay in D0-uninitialized long enough and we can't say in advance when A is going to be suspended again. Moreover, we don't really know what state B is in, because "power restored" need not automatically mean "clock enabled" etc. for various kinds of devices. I think the whole problem is highly theoretical and really only affects PCI where "power restored" pretty much means "reset". And I am yet to see a system where it actually matters. Thanks!
diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c index 0dbe5cd..228c138 100644 --- a/drivers/acpi/power.c +++ b/drivers/acpi/power.c @@ -250,8 +250,10 @@ static void acpi_power_resume_dependent(struct work_struct *work) mutex_lock(&adev->physical_node_lock); - list_for_each_entry(pn, &adev->physical_node_list, node) - pm_request_resume(pn->dev); + list_for_each_entry(pn, &adev->physical_node_list, node) { + if (pm_runtime_suspended(pn->dev)) + pm_request_resume(pn->dev); + } list_for_each_entry(pn, &adev->power_dependent, node) pm_request_resume(pn->dev);