Message ID | 1860008.68DHz0qP41@vostro.rjw.lan (mailing list archive) |
---|---|
State | RFC, archived |
Headers | show |
On Wed, 8 Jan 2014, Rafael J. Wysocki wrote: > On Wednesday, January 08, 2014 03:44:53 PM Alan Stern wrote: > > On Wed, 8 Jan 2014, Alan Stern wrote: > > > > > On Wed, 8 Jan 2014, Alan Stern wrote: > > > > > > > In essence, you want your disk's state to move directly from unpowered > > > > (system asleep) to runtime-suspended with no intermediate active -- but > > > > only if the disk doesn't spin up all by itself. > > > > > > > In the end, it sounds like what you want can be done fairly easily. > > > > > > It turns out there is a snag. The PM core is set up such that during > > > system sleep transitions, there's no way to tell whether or not > > > userspace has permitted runtime PM for a device. (This is because the > > > core wants to prevent untimely runtime suspends from occurring in the > > > middle of a system sleep transition, and to do so it uses the same > > > mechanism as it does when userspace disallows runtime suspend.) > > > > > > As a result, we must not go directly from unpowered to > > > runtime-suspended. Not without some sort of change to the PM core. > > > > Rafael: > > > > Right now the PM core does pm_runtime_get_noresume() during the prepare > > phase of system sleep, and pm_runtime_put() during the complete phase. > > Maybe we should cut down the range of coverage. > > > > For example, we could do the pm_runtime_put() during dpm_resume(), or > > even earlier. That way, the resume routine could do something like: > > > > if (!pm_runtime_in_use(dev)) { > > pm_runtime_disable(dev); > > pm_runtime_set_suspended(dev); > > pm_runtime_enable(dev); > > return 0; > > } > > > > where pm_runtime_in_use() would check the device's usage counters. > > The device would come out of system sleep in a powered-down state, and > > could remain runtime-suspended with no need for an extra > > power-up/power-down cycle. Isn't this the sort of thing the embedded > > people want to do? > > > > As it is, drivers have no way to safely put devices directly into > > runtime suspend when coming out of system sleep. > > > > What do you think? > > There are two problems here. > > First off, for some devices that were runtime-suspended before system suspend > we may not want to touch them at all during system suspend/resume. I think > I know how to address that, but I need to prepare patches yet. That's going > to take a few days at least, sorry about that. > > Second, we may want some devices to be set to "runtime suspended" during > system resume without touching them, so that runtime PM resumes them when > necessary. I have a patch for that, which hasn't been published yet and which > is appended. > > The way it works is that whoever wants the system resume of certain device > to be skipped, sets the (new) skip_resume flag for that device using > device_pm_skip_resume() and that causes the PM core to basically ignore that > device's PM callbacks during system resume. > > Whether or not doing the pm_request_resume(dev) for devices with skip_resume > set in dpm_resume_early() is a good idea is a discussion item in my opinion > (evidently, I thought it was when I was preparing this patch). The situation isn't quite so simple. We're talking specifically about disk drives. Depending on the drive, it may or may not spin up all by itself when the system wakes up. If it does spin up, the final state should be RPM_ACTIVE (the way it is now). If it doesn't, we want the final state to be RPM_SUSPENDED. This is true regardless of the runtime status before the system sleep. This means we can't skip the resume routine. At the very least, the routine needs to ask the disk whether or not it is spinning. Depending on the result, it would then set the RPM status appropriately. In addition, we only want to end up in RPM_SUSPENDED if the usage counters are 0 (in particular, power/control = "auto"). Otherwise, we want the resume routine always to spin up the drive, if it hasn't spun up by itself. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Wed, 8 Jan 2014, Rafael J. Wysocki wrote: > > On Wednesday, January 08, 2014 03:44:53 PM Alan Stern wrote: > > > On Wed, 8 Jan 2014, Alan Stern wrote: > > > > On Wed, 8 Jan 2014, Alan Stern wrote: > > > > > In essence, you want your disk's state to move directly from > > > > > unpowered > > > > > (system asleep) to runtime-suspended with no intermediate active -- > > > > > but > > > > > only if the disk doesn't spin up all by itself. > > > > > > > > > > In the end, it sounds like what you want can be done fairly easily. > > > > > > > > It turns out there is a snag. The PM core is set up such that during > > > > system sleep transitions, there's no way to tell whether or not > > > > userspace has permitted runtime PM for a device. (This is because the > > > > core wants to prevent untimely runtime suspends from occurring in the > > > > middle of a system sleep transition, and to do so it uses the same > > > > mechanism as it does when userspace disallows runtime suspend.) > > > > > > > > As a result, we must not go directly from unpowered to > > > > runtime-suspended. Not without some sort of change to the PM core. > > > > > > Rafael: > > > > > > Right now the PM core does pm_runtime_get_noresume() during the prepare > > > phase of system sleep, and pm_runtime_put() during the complete phase. > > > Maybe we should cut down the range of coverage. > > > > > > For example, we could do the pm_runtime_put() during dpm_resume(), or > > > > > > even earlier. That way, the resume routine could do something like: > > > if (!pm_runtime_in_use(dev)) { > > > > > > pm_runtime_disable(dev); > > > pm_runtime_set_suspended(dev); > > > pm_runtime_enable(dev); > > > return 0; > > > > > > } > > > > > > where pm_runtime_in_use() would check the device's usage counters. > > > The device would come out of system sleep in a powered-down state, and > > > could remain runtime-suspended with no need for an extra > > > power-up/power-down cycle. Isn't this the sort of thing the embedded > > > people want to do? > > > > > > As it is, drivers have no way to safely put devices directly into > > > runtime suspend when coming out of system sleep. > > > > > > What do you think? > > > > There are two problems here. > > > > First off, for some devices that were runtime-suspended before system > > suspend > > we may not want to touch them at all during system suspend/resume. I > > think > > I know how to address that, but I need to prepare patches yet. That's > > going > > to take a few days at least, sorry about that. > > > > Second, we may want some devices to be set to "runtime suspended" during > > system resume without touching them, so that runtime PM resumes them when > > necessary. I have a patch for that, which hasn't been published yet and > > which is appended. > > > > The way it works is that whoever wants the system resume of certain device > > to be skipped, sets the (new) skip_resume flag for that device using > > device_pm_skip_resume() and that causes the PM core to basically ignore > > that > > device's PM callbacks during system resume. > > > > Whether or not doing the pm_request_resume(dev) for devices with > > skip_resume set in dpm_resume_early() is a good idea is a discussion item > > in my opinion (evidently, I thought it was when I was preparing this patch). > > The situation isn't quite so simple. > > We're talking specifically about disk drives. Depending on the drive, > it may or may not spin up all by itself when the system wakes up. If > it does spin up, the final state should be RPM_ACTIVE (the way it is > now). If it doesn't, we want the final state to be RPM_SUSPENDED. > This is true regardless of the runtime status before the system sleep. OK We can do all that in .resume_early() which runs with runtime PM disabled. > This means we can't skip the resume routine. At the very least, the > routine needs to ask the disk whether or not it is spinning. Depending > on the result, it would then set the RPM status appropriately. Yes, .resume_early() can do that. And it can set a (internal) flag for a device telling the driver/bus type .resume() and .complete() to leave the device alone. > In addition, we only want to end up in RPM_SUSPENDED if the usage > counters are 0 (in particular, power/control = "auto"). Otherwise, we > want the resume routine always to spin up the drive, if it hasn't spun > up by itself. That still can be done in .resume_early(). Or in .resume_noirq() if device interrupts are not requisite for that. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 9 Jan 2014, Rafael J. Wysocki wrote: > > The situation isn't quite so simple. > > > > We're talking specifically about disk drives. Depending on the drive, > > it may or may not spin up all by itself when the system wakes up. If > > it does spin up, the final state should be RPM_ACTIVE (the way it is > > now). If it doesn't, we want the final state to be RPM_SUSPENDED. > > This is true regardless of the runtime status before the system sleep. > > OK > > We can do all that in .resume_early() which runs with runtime PM disabled. Either the .resume_early or the .resume routine would be okay, as far as I'm concerned. Is there any strong reason to prefer one over the other, besides the fact that in .resume_early the pm_runtime_disable and _enable calls wouldn't be needed? > > This means we can't skip the resume routine. At the very least, the > > routine needs to ask the disk whether or not it is spinning. Depending > > on the result, it would then set the RPM status appropriately. > > Yes, .resume_early() can do that. And it can set a (internal) flag for > a device telling the driver/bus type .resume() and .complete() to > leave the device alone. I had in mind something more like this for the driver's .resume (or .resume_early) routine: bool leave_suspended = false; int rc = 0; if (!pm_runtime_in_use(dev) && !disk_is_spinning(dev)) leave_suspended = true; else rc = spin_up_the_disk(dev); if (rc == 0) { pm_runtime_disable(dev); if (leave_suspended) pm_runtime_set_suspended(dev); else pm_runtime_set_active(dev); pm_runtime_enable(dev); } return rc; That wouldn't need any special support from the PM core, other than the new pm_runtime_in_use routine. > > In addition, we only want to end up in RPM_SUSPENDED if the usage > > counters are 0 (in particular, power/control = "auto"). Otherwise, we > > want the resume routine always to spin up the drive, if it hasn't spun > > up by itself. > > That still can be done in .resume_early(). Or in .resume_noirq() if device > interrupts are not requisite for that. No, currently it cannot be done. In both resume_early and resume_noirq, the usage counter is _always_ > 0. This is because device_prepare calls pm_runtime_get_noresume, and the corresponding pm_runtime_put doesn't occur until device_complete. That's the problem I need to fix. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 1/9/2014 10:32 AM, Alan Stern wrote: > I had in mind something more like this for the driver's .resume > (or .resume_early) routine: > > bool leave_suspended = false; int rc = 0; > > if (!pm_runtime_in_use(dev) && !disk_is_spinning(dev)) > leave_suspended = true; else rc = spin_up_the_disk(dev); > > if (rc == 0) { pm_runtime_disable(dev); if (leave_suspended) > pm_runtime_set_suspended(dev); else pm_runtime_set_active(dev); > pm_runtime_enable(dev); } return rc; We also don't want to block the resume path while the REQUEST SENSE runs, since it typically takes 10 seconds or so on normal ATA disks. Thus, the resume or resume early needs to be able to return with the device left in the transitioning state and complete or abort the transition later. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (MingW32) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJSzsVVAAoJEI5FoCIzSKrwLGAIAJj3GvPEcwtFvkJ6oAX+4fbf z75OHkOkKtwOTY1SvqpgyYUMdrsTLRYO5ZPCCzZemow6s24xjEhtHVYONKogZ8Rv lYzFpsZhWd7pLjHm5PDUSbEJiOvx9Ba+6JbRYGl9rInv2PdJOSscEpN+4a0/bvGU F72tUfYI58su0RHcee9xXFRBOZkM61DCxNNXAyr+SHa7fIJi5jKrlNSbHPzBXAo0 g/qw8IY86a3SWE4cOGIfsrtL8GpmbwFNkT+BHWhaRlj2wj2VTP5FTZrOHndNL7pU WC29Qxg2cKhSgm15BhNw3ESjn/3rdFaHsV96ietFKMlavXDqRmmwPsTmNK8Pbuo= =5AE6 -----END PGP SIGNATURE----- -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 9 Jan 2014, Phillip Susi wrote: > On 1/9/2014 10:32 AM, Alan Stern wrote: > > I had in mind something more like this for the driver's .resume > > (or .resume_early) routine: > > > > bool leave_suspended = false; int rc = 0; > > > > if (!pm_runtime_in_use(dev) && !disk_is_spinning(dev)) > > leave_suspended = true; else rc = spin_up_the_disk(dev); > > > > if (rc == 0) { pm_runtime_disable(dev); if (leave_suspended) > > pm_runtime_set_suspended(dev); else pm_runtime_set_active(dev); > > pm_runtime_enable(dev); } return rc; > > We also don't want to block the resume path while the REQUEST SENSE > runs, since it typically takes 10 seconds or so on normal ATA disks. > Thus, the resume or resume early needs to be able to return with the > device left in the transitioning state and complete or abort the > transition later. The code will be restructured as part of the asynchronous spin-up change. The resume routine would then look more like this: bool test_spinning = !pm_runtime_in_use(dev); pm_runtime_disable(dev); if (test_spinning) pm_runtime_set_suspended(dev); else pm_runtime_set_active(dev); pm_runtime_enable(dev); start_async_spin_up(dev, test_spinning); return 0; The asynchronous routine would skip the initial test if its second argument is nonzero, and proceed directly to spin-up the disk. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 1/9/2014 11:08 AM, Alan Stern wrote: > The code will be restructured as part of the asynchronous spin-up > change. The resume routine would then look more like this: > > bool test_spinning = !pm_runtime_in_use(dev); > > pm_runtime_disable(dev); if (test_spinning) > pm_runtime_set_suspended(dev); else pm_runtime_set_active(dev); > pm_runtime_enable(dev); > > start_async_spin_up(dev, test_spinning); return 0; > > The asynchronous routine would skip the initial test if its second > argument is nonzero, and proceed directly to spin-up the disk. It isn't enough to background the spin up; the TEST will block, so it needs to be async as well. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (MingW32) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJSzs66AAoJEI5FoCIzSKrwY40H/1gAfH5KxfXvr9gHGMOHAdiw VwUNKJTnADrEREOTEgWSPH8/VYyK2aFPjiGlWKvmZXMNC0CYLPRInSbhLx0cH4x1 uvezYnvVno/xME8QlLTqzVuQIrk3TKjyrd0rCBpj3YvAAyDy8TSGrJZfDAUixZkP zoahqzIthPE52DirixVBYIamsDOZltHOjK2HbfTHZXo9m+dxQirDqL/ClQivt3Gq e0IhRDEXsEb1b4ICQztuqNAp2BV57IzIHCd83+CTd/hkPalHFxJv6Y26IGYrORSZ CRtXWMYiGs2/cIElSpIIQDUL60niqfFPaxsGRCT+AISwu/f4TRzAo2t/t+Vaa6c= =aR9i -----END PGP SIGNATURE----- -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 9 Jan 2014, Phillip Susi wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 1/9/2014 11:08 AM, Alan Stern wrote: > > The code will be restructured as part of the asynchronous spin-up > > change. The resume routine would then look more like this: > > > > bool test_spinning = !pm_runtime_in_use(dev); > > > > pm_runtime_disable(dev); if (test_spinning) > > pm_runtime_set_suspended(dev); else pm_runtime_set_active(dev); > > pm_runtime_enable(dev); > > > > start_async_spin_up(dev, test_spinning); return 0; > > > > The asynchronous routine would skip the initial test if its second > > argument is nonzero, and proceed directly to spin-up the disk. > > It isn't enough to background the spin up; the TEST will block, so it > needs to be async as well. That's what I meant. If the second argument is zero, the asynchronous routine will first test whether the disk is spinning. If it is, the status will be set to RPM_ACTIVE. (The routine may also send a START command -- for some disks this may be necessary.) Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday, January 09, 2014 10:32:37 AM Alan Stern wrote: > On Thu, 9 Jan 2014, Rafael J. Wysocki wrote: > > > > The situation isn't quite so simple. > > > > > > We're talking specifically about disk drives. Depending on the drive, > > > it may or may not spin up all by itself when the system wakes up. If > > > it does spin up, the final state should be RPM_ACTIVE (the way it is > > > now). If it doesn't, we want the final state to be RPM_SUSPENDED. > > > This is true regardless of the runtime status before the system sleep. > > > > OK > > > > We can do all that in .resume_early() which runs with runtime PM disabled. > > Either the .resume_early or the .resume routine would be okay, as far > as I'm concerned. Is there any strong reason to prefer one over the > other, besides the fact that in .resume_early the pm_runtime_disable > and _enable calls wouldn't be needed? Not so much, at least I don't have anything like that in mind right now, but I'm not a big fan of disabling/enabling runtime PM temporarily to have a look at the "current" status. That just doesn't seem quite right. :-) > > > This means we can't skip the resume routine. At the very least, the > > > routine needs to ask the disk whether or not it is spinning. Depending > > > on the result, it would then set the RPM status appropriately. > > > > Yes, .resume_early() can do that. And it can set a (internal) flag for > > a device telling the driver/bus type .resume() and .complete() to > > leave the device alone. > > I had in mind something more like this for the driver's .resume (or > .resume_early) routine: > > bool leave_suspended = false; > int rc = 0; > > if (!pm_runtime_in_use(dev) && !disk_is_spinning(dev)) > leave_suspended = true; > else > rc = spin_up_the_disk(dev); > > if (rc == 0) { > pm_runtime_disable(dev); > if (leave_suspended) > pm_runtime_set_suspended(dev); > else > pm_runtime_set_active(dev); > pm_runtime_enable(dev); > } > return rc; > > That wouldn't need any special support from the PM core, other than the > new pm_runtime_in_use routine. That looks reasonable, but in .resume_early() you can do it like this I think: bool disk_spinning = disk_is_spinning(dev); int rc = 0; if (!pm_runtime_in_use(dev) && !disk_spinning) { pm_runtime_set_suspended(dev); } else { pm_runtime_set_active(dev); if (!disk_spinning) rc = spin_up_the_disk(dev); } and of course there needs to be a flag to pass to .resume() in case the device is supposed to be left in the suspended state. > > > In addition, we only want to end up in RPM_SUSPENDED if the usage > > > counters are 0 (in particular, power/control = "auto"). Otherwise, we > > > want the resume routine always to spin up the drive, if it hasn't spun > > > up by itself. > > > > That still can be done in .resume_early(). Or in .resume_noirq() if device > > interrupts are not requisite for that. > > No, currently it cannot be done. > > In both resume_early and resume_noirq, the usage counter is _always_ > > 0. This is because device_prepare calls pm_runtime_get_noresume, and > the corresponding pm_runtime_put doesn't occur until device_complete. > That's the problem I need to fix. Hmm. Why do you have to compare usage_count with 0 and not with 1? Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On 01/09/2014 08:25 PM, Rafael J. Wysocki wrote: > That looks reasonable, but in .resume_early() you can do it like > this I think: > > bool disk_spinning = disk_is_spinning(dev); int rc = 0; > > if (!pm_runtime_in_use(dev) && !disk_spinning) { > pm_runtime_set_suspended(dev); } else { > pm_runtime_set_active(dev); if (!disk_spinning) rc = > spin_up_the_disk(dev); } You're not allowed to call pm_runtime_set_{suspended,active} without disabling runtime pm. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCgAGBQJSz1L6AAoJEI5FoCIzSKrwNWoIAKXZ6E7Vcrhbf7odxxdp11nq vmr76+RkywUvJWDeRkq5IrgXfepyc/tEdQURNXcUvquOwbxX6NfH/Sb4DYZ+UUva nJ9Po/vCyQAZgXodK3In7cteY4NvZLGVt9m3JkaYHKs3vkDriusSLMI1ghVNZIkW X/VrVWhlnl9XNYWzFZuCs3ccsYpDfQ0akrxqFDoEh/mj5UUVkVXQt+JpkGPuUDD4 8yR3omZ+OFwnjqHLtzC8AOI59wujY2NPjoohQH4Rd+JTJEmAu9Ml5fqkPg90z9b2 C2mnpkEBoDEO4L1pDrD1jmxCwZQApf0qFWfu8qo+OR/VtMnxF5Cpkd+O5H+0OQg= =Ksl4 -----END PGP SIGNATURE----- -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday, January 09, 2014 08:55:06 PM Phillip Susi wrote: > On 01/09/2014 08:25 PM, Rafael J. Wysocki wrote: > > That looks reasonable, but in .resume_early() you can do it like > > this I think: > > > > bool disk_spinning = disk_is_spinning(dev); int rc = 0; > > > > if (!pm_runtime_in_use(dev) && !disk_spinning) { > > pm_runtime_set_suspended(dev); } else { > > pm_runtime_set_active(dev); if (!disk_spinning) rc = > > spin_up_the_disk(dev); } > > You're not allowed to call pm_runtime_set_{suspended,active} without > disabling runtime pm. Well, thanks for reminding me how my code works. ;-) .resume_early() runs with runtime PM disabled. Thanks!
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 1/10/2014 8:35 AM, Rafael J. Wysocki wrote: > Well, thanks for reminding me how my code works. ;-) > > .resume_early() runs with runtime PM disabled. Ahh, then in that case we're back to not wanting to block the resume path on disk_is_spinning(). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (MingW32) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJS0Ae2AAoJEI5FoCIzSKrwiqcH/jaoQ9WtE124xT53OaD7XfZQ C45ftL7iZG/8iF/oW48RaEa8l87yoisThDjdokuMqo6SEkxsxAegg+l9yz98tSoX kTj3x7rE8wPAwC4JCq7+6Xu2XTiK9ujHmohwJNljsqIdhbuLo+Y4XWNrKUzag2WX 8rJbX7bPTfg0ijtOkYUHfpJyap1OkG007MIeH4qQ/vZVd8Pem3Oy1jLdyLtpUMB5 D5ebHo6IdJDJpxX2EDSyHsI5UKEDycH6VLkCz06RpeMBkL4jePzL76D//9qfw+Kc RpOyoOFhYM5q2XeXRSjtPX4vrUe5EqdLfqYriIYiYjNNDqrirhsBDUQ5SdXr7zE= =HLer -----END PGP SIGNATURE----- -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 10 Jan 2014, Rafael J. Wysocki wrote: > On Thursday, January 09, 2014 10:32:37 AM Alan Stern wrote: > > On Thu, 9 Jan 2014, Rafael J. Wysocki wrote: > > > > > > The situation isn't quite so simple. > > > > > > > > We're talking specifically about disk drives. Depending on the drive, > > > > it may or may not spin up all by itself when the system wakes up. If > > > > it does spin up, the final state should be RPM_ACTIVE (the way it is > > > > now). If it doesn't, we want the final state to be RPM_SUSPENDED. > > > > This is true regardless of the runtime status before the system sleep. > > > > > > OK > > > > > > We can do all that in .resume_early() which runs with runtime PM disabled. > > > > Either the .resume_early or the .resume routine would be okay, as far > > as I'm concerned. Is there any strong reason to prefer one over the > > other, besides the fact that in .resume_early the pm_runtime_disable > > and _enable calls wouldn't be needed? > > Not so much, at least I don't have anything like that in mind right now, but > I'm not a big fan of disabling/enabling runtime PM temporarily to have a look > at the "current" status. That just doesn't seem quite right. :-) The disable/enable is in order to _set_ the status, not to have a look at it. > > I had in mind something more like this for the driver's .resume (or > > .resume_early) routine: > > > > bool leave_suspended = false; > > int rc = 0; > > > > if (!pm_runtime_in_use(dev) && !disk_is_spinning(dev)) > > leave_suspended = true; > > else > > rc = spin_up_the_disk(dev); > > > > if (rc == 0) { > > pm_runtime_disable(dev); > > if (leave_suspended) > > pm_runtime_set_suspended(dev); > > else > > pm_runtime_set_active(dev); > > pm_runtime_enable(dev); > > } > > return rc; > > > > That wouldn't need any special support from the PM core, other than the > > new pm_runtime_in_use routine. > > That looks reasonable, but in .resume_early() you can do it like this I think: > > bool disk_spinning = disk_is_spinning(dev); > int rc = 0; > > if (!pm_runtime_in_use(dev) && !disk_spinning) { > pm_runtime_set_suspended(dev); > } else { > pm_runtime_set_active(dev); > if (!disk_spinning) > rc = spin_up_the_disk(dev); > } > > and of course there needs to be a flag to pass to .resume() in case the > device is supposed to be left in the suspended state. Agreed, there are various ways to get the same end result. My question here is: How should we implement pm_runtime_in_use()? > > > > In addition, we only want to end up in RPM_SUSPENDED if the usage > > > > counters are 0 (in particular, power/control = "auto"). Otherwise, we > > > > want the resume routine always to spin up the drive, if it hasn't spun > > > > up by itself. > > > > > > That still can be done in .resume_early(). Or in .resume_noirq() if device > > > interrupts are not requisite for that. > > > > No, currently it cannot be done. > > > > In both resume_early and resume_noirq, the usage counter is _always_ > > > 0. This is because device_prepare calls pm_runtime_get_noresume, and > > the corresponding pm_runtime_put doesn't occur until device_complete. > > That's the problem I need to fix. > > Hmm. Why do you have to compare usage_count with 0 and not with 1? It is an awkward problem. During a system sleep transition, pm_runtime_in_use() should return 0 if the usage_count is 1 (because of the pm_runtime_get_noresume() in device_prepare()). But at other times (i.e., during normal operation), pm_runtime_in_use() should return 0 if the usage_count is 0. I suppose a simple way to solve the problem is to document that pm_runtime_in_use() should only be invoked from within a system sleep callback routine. Then we can compare against 1 and not worry about the normal-operation case. Maybe call it pm_runtime_usage_during_sleep() to make this restriction explicit. Would that be acceptable? Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Friday, January 10, 2014 10:25:32 AM Alan Stern wrote: > On Fri, 10 Jan 2014, Rafael J. Wysocki wrote: > > > On Thursday, January 09, 2014 10:32:37 AM Alan Stern wrote: > > > On Thu, 9 Jan 2014, Rafael J. Wysocki wrote: > > > > > > > > The situation isn't quite so simple. > > > > > > > > > > We're talking specifically about disk drives. Depending on the drive, > > > > > it may or may not spin up all by itself when the system wakes up. If > > > > > it does spin up, the final state should be RPM_ACTIVE (the way it is > > > > > now). If it doesn't, we want the final state to be RPM_SUSPENDED. > > > > > This is true regardless of the runtime status before the system sleep. > > > > > > > > OK > > > > > > > > We can do all that in .resume_early() which runs with runtime PM disabled. > > > > > > Either the .resume_early or the .resume routine would be okay, as far > > > as I'm concerned. Is there any strong reason to prefer one over the > > > other, besides the fact that in .resume_early the pm_runtime_disable > > > and _enable calls wouldn't be needed? > > > > Not so much, at least I don't have anything like that in mind right now, but > > I'm not a big fan of disabling/enabling runtime PM temporarily to have a look > > at the "current" status. That just doesn't seem quite right. :-) > > The disable/enable is in order to _set_ the status, not to have a look > at it. Right. I should have said that in my opinion disabling runtime PM in order to do anything with the fields owned by it and re-enabling it right after that wasn't quite right in principle. Re-adjusting those fields *before* we re-enable runtime PM during system resume is quite different from doing the on-off trick *after* we've re-enabled runtime PM. So, my opinion is that the runtime PM internal information should be made reflect the actual (and/or desirable) state of things before it is re-enabled, which practically means in the .resume_noirq()/.resume_early() time frame. Doing it after that might actually work, but is questionable from the overall consistency perspective (it basically would mean that we ran with wrong runtime PM data for a while after we'd re-enabled it and before it got adjusted, so we shouldn't have re-enabled it in the first place). > > > I had in mind something more like this for the driver's .resume (or > > > .resume_early) routine: > > > > > > bool leave_suspended = false; > > > int rc = 0; > > > > > > if (!pm_runtime_in_use(dev) && !disk_is_spinning(dev)) > > > leave_suspended = true; > > > else > > > rc = spin_up_the_disk(dev); > > > > > > if (rc == 0) { > > > pm_runtime_disable(dev); > > > if (leave_suspended) > > > pm_runtime_set_suspended(dev); > > > else > > > pm_runtime_set_active(dev); > > > pm_runtime_enable(dev); > > > } > > > return rc; > > > > > > That wouldn't need any special support from the PM core, other than the > > > new pm_runtime_in_use routine. > > > > That looks reasonable, but in .resume_early() you can do it like this I think: > > > > bool disk_spinning = disk_is_spinning(dev); > > int rc = 0; > > > > if (!pm_runtime_in_use(dev) && !disk_spinning) { > > pm_runtime_set_suspended(dev); > > } else { > > pm_runtime_set_active(dev); > > if (!disk_spinning) > > rc = spin_up_the_disk(dev); > > } > > > > and of course there needs to be a flag to pass to .resume() in case the > > device is supposed to be left in the suspended state. > > Agreed, there are various ways to get the same end result. My question > here is: How should we implement pm_runtime_in_use()? In my opinion what really matters is (a) whether or not the pm_runtime_enable() in device_resume_early() will actually enable runtime PM and (b) whether or not the pm_runtime_put() in device_complete() will actually make it possible to suspend the device. So, these are the conditions that I would check. > > > > > In addition, we only want to end up in RPM_SUSPENDED if the usage > > > > > counters are 0 (in particular, power/control = "auto"). Otherwise, we > > > > > want the resume routine always to spin up the drive, if it hasn't spun > > > > > up by itself. > > > > > > > > That still can be done in .resume_early(). Or in .resume_noirq() if device > > > > interrupts are not requisite for that. > > > > > > No, currently it cannot be done. > > > > > > In both resume_early and resume_noirq, the usage counter is _always_ > > > > 0. This is because device_prepare calls pm_runtime_get_noresume, and > > > the corresponding pm_runtime_put doesn't occur until device_complete. > > > That's the problem I need to fix. > > > > Hmm. Why do you have to compare usage_count with 0 and not with 1? > > It is an awkward problem. During a system sleep transition, > pm_runtime_in_use() should return 0 if the usage_count is 1 (because of > the pm_runtime_get_noresume() in device_prepare()). > > But at other times (i.e., during normal operation), pm_runtime_in_use() > should return 0 if the usage_count is 0. > > I suppose a simple way to solve the problem is to document that > pm_runtime_in_use() should only be invoked from within a system sleep > callback routine. Then we can compare against 1 and not worry about > the normal-operation case. Maybe call it > pm_runtime_usage_during_sleep() to make this restriction explicit. > > Would that be acceptable? Yes, it would, as far as I'm concerned. In my opinion it only is really useful to call it before calling pm_runtime_enable() in device_resume_early(). Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On 01/10/2014 06:02 PM, Rafael J. Wysocki wrote: > So, my opinion is that the runtime PM internal information should > be made reflect the actual (and/or desirable) state of things > before it is re-enabled, which practically means in the > .resume_noirq()/.resume_early() time frame. > > Doing it after that might actually work, but is questionable from > the overall consistency perspective (it basically would mean that > we ran with wrong runtime PM data for a while after we'd re-enabled > it and before it got adjusted, so we shouldn't have re-enabled it > in the first place). But we don't want to block the system resume path for too long while we figure out what the correct state is. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCgAGBQJS0KedAAoJEI5FoCIzSKrw41EH/0phrR8FKj2D8H2YidMluZ0/ 0crckeYkz4VMEkxMrzjnFl1sxwaWLfYW1GOpJ6FnGZqArjLr0GktOHBJfVTWqNqI bNIFW5rKX/JKgns3Nmv1Zx5twxHuIfVtAF2mjgorabyKyRpaVwsiZa9ogLgBnUVr ipLYX1mBRvmoMPDHYB3/wrefc4c0Eg4n1O24/wVhehxtsBCeN9BKUUEcIRvkiRUj FwMmWXuZMlTGK2LhjmgCMuH7Ea9r8sAJWUkGA8zJJLuS8y7kK9grUwAW5IhfrJGH Wg5sBce8sKhoX7h5gze+rVktrM1B4DDzEt4J356lmIjfeQ75rUrHMzEDvzh021M= =+ubG -----END PGP SIGNATURE----- -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 11 Jan 2014, Rafael J. Wysocki wrote: > I should have said that in my opinion disabling runtime PM in order to do > anything with the fields owned by it and re-enabling it right after that > wasn't quite right in principle. Re-adjusting those fields *before* we > re-enable runtime PM during system resume is quite different from doing > the on-off trick *after* we've re-enabled runtime PM. > > So, my opinion is that the runtime PM internal information should be made > reflect the actual (and/or desirable) state of things before it is re-enabled, > which practically means in the .resume_noirq()/.resume_early() time frame. That makes sense. > Doing it after that might actually work, but is questionable from the > overall consistency perspective (it basically would mean that we ran with > wrong runtime PM data for a while after we'd re-enabled it and before it > got adjusted, so we shouldn't have re-enabled it in the first place). Of course, the middle of a system resume is a rather special time. Still, I agree with your basic point. > > Agreed, there are various ways to get the same end result. My question > > here is: How should we implement pm_runtime_in_use()? > > In my opinion what really matters is (a) whether or not the pm_runtime_enable() > in device_resume_early() will actually enable runtime PM and (b) whether or not > the pm_runtime_put() in device_complete() will actually make it possible to > suspend the device. So, these are the conditions that I would check. Yes. The key being that the check will give the correct result only during the _noirq and _late/_early phases of system sleep, not at other times. > > I suppose a simple way to solve the problem is to document that > > pm_runtime_in_use() should only be invoked from within a system sleep > > callback routine. Then we can compare against 1 and not worry about > > the normal-operation case. Maybe call it > > pm_runtime_usage_during_sleep() to make this restriction explicit. > > > > Would that be acceptable? > > Yes, it would, as far as I'm concerned. > > In my opinion it only is really useful to call it before calling > pm_runtime_enable() in device_resume_early(). Okay, I'll put something together. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 10 Jan 2014, Phillip Susi wrote: > On 01/10/2014 06:02 PM, Rafael J. Wysocki wrote: > > So, my opinion is that the runtime PM internal information should > > be made reflect the actual (and/or desirable) state of things > > before it is re-enabled, which practically means in the > > .resume_noirq()/.resume_early() time frame. > > > > Doing it after that might actually work, but is questionable from > > the overall consistency perspective (it basically would mean that > > we ran with wrong runtime PM data for a while after we'd re-enabled > > it and before it got adjusted, so we shouldn't have re-enabled it > > in the first place). > > But we don't want to block the system resume path for too long while > we figure out what the correct state is. We won't. The resume_early routine will set the status to RPM_ACTIVE or RPM_SUSPENDED, according to the result from pm_runtime_usage_during_sleep. It will also do a pm_runtime_get_sync on the disk's parent, to prevent the parent from going into runtime suspend before the async routine can run. Then the async routine will do the following: If the status is RPM_ACTIVE, spin up the disk. Else (the status is RPM_SUSPENDED) check whether the disk has spun up by itself. If it has, or if you can't tell, call pm_runtime_resume to set the status correctly and do an explicit spin-up. Call pm_runtime_put_sync on the parent. I think this will give the desired result. But there is one potential problem: If the disk has a child device (such as a partition) and the child's driver needs to do I/O as part of its resume routine, the I/O is likely to hang if the disk is runtime suspended. However, I don't know of any child devices like this. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On 01/11/2014 05:50 PM, Alan Stern wrote: > We won't. The resume_early routine will set the status to > RPM_ACTIVE or RPM_SUSPENDED, according to the result from > pm_runtime_usage_during_sleep. It will also do a > pm_runtime_get_sync on the disk's parent, to prevent the parent > from going into runtime suspend before the async routine can run. > > Then the async routine will do the following: > > If the status is RPM_ACTIVE, spin up the disk. > > Else (the status is RPM_SUSPENDED) check whether the disk has spun > up by itself. If it has, or if you can't tell, call > pm_runtime_resume to set the status correctly and do an explicit > spin-up. > > Call pm_runtime_put_sync on the parent. I see now... thanks. > I think this will give the desired result. But there is one > potential problem: If the disk has a child device (such as a > partition) and the child's driver needs to do I/O as part of its > resume routine, the I/O is likely to hang if the disk is runtime > suspended. However, I don't know of any child devices like this. Probably should check dm and md to see if they do IO in their resume. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCgAGBQJS0fTbAAoJEI5FoCIzSKrwTVoH/3Cl4ZEmE+5aUJX5+mCGmoPD Ap+UHe517qsfDWwKaDqURTcXaZOqMfbk71jdTCAIIVTSPB95NWFQg3d/XpWAVY29 yoXdUj28WHTlgtfhtqa92K7lImOijxJ/32myEK9/+dQBDXXnG4n+jt1eL+vo9Rhy vlcAJLKcCsL4SpDD8z/N2lvsmJ5F1vIB5SkM7GiyaR3HHfRIM3hzV4JErtHozgWE x2zNfD6DXk7CR3GnISq2bznIYvRlfsmvzwCjfJ3AYBJdtFQmuwMDznaVKlCpy/Ya G07n0HqghB3ZYseZLZt8HyB/men72SZolJCTb/UnkODTsymq9+6J3XrJ6bb29Vc= =V+EM -----END PGP SIGNATURE----- -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Index: linux-pm/include/linux/pm.h =================================================================== --- linux-pm.orig/include/linux/pm.h +++ linux-pm/include/linux/pm.h @@ -532,9 +532,12 @@ struct dev_pm_info { struct wakeup_source *wakeup; bool wakeup_path:1; bool syscore:1; +#ifdef CONFIG_PM_RUNTIME + bool skip_resume:1; +#endif #else unsigned int should_wakeup:1; -#endif +#endif /* CONFIG_PM_SLEEP */ #ifdef CONFIG_PM_RUNTIME struct timer_list suspend_timer; unsigned long timer_expires; @@ -561,7 +564,7 @@ struct dev_pm_info { unsigned long active_jiffies; unsigned long suspended_jiffies; unsigned long accounting_timestamp; -#endif +#endif /* CONFIG_PM_RUNTIME */ struct pm_subsys_data *subsys_data; /* Owned by the subsystem. */ struct dev_pm_qos *qos; }; @@ -716,4 +719,10 @@ enum dpm_order { DPM_ORDER_DEV_LAST, }; +#if defined(CONFIG_PM_RUNTIME) && defined(CONFIG_PM_SLEEP) +void device_pm_skip_resume(struct device *dev, bool enable); +#else +static inline void device_pm_skip_resume(struct device *dev, bool enable) {} +#endif /* CONFIG_PM_RUNTIME && CONFIG_PM_SLEEP */ + #endif /* _LINUX_PM_H */ Index: linux-pm/drivers/base/power/main.c =================================================================== --- linux-pm.orig/drivers/base/power/main.c +++ linux-pm/drivers/base/power/main.c @@ -384,6 +384,29 @@ static int dpm_run_callback(pm_callback_ return error; } +#ifdef CONFIG_PM_RUNTIME +void device_pm_skip_resume(struct device *dev, bool enable) +{ + device_pm_lock(); + + if (!dev->power.is_prepared) + dev->power.skip_resume = !!enable; + + device_pm_unlock(); +} + +static inline bool skip_device_resume(struct device *dev, pm_message_t state) +{ + return dev->power.skip_resume && (state.event == PM_EVENT_RESUME + || state.event == PM_EVENT_RESTORE); +} +#else +static inline bool skip_device_resume(struct device *dev, pm_message_t state) +{ + return false; +} +#endif /* CONFIG_PM_RUNTIME */ + /*------------------------- Resume routines -------------------------*/ /** @@ -446,21 +469,24 @@ static void dpm_resume_noirq(pm_message_ mutex_lock(&dpm_list_mtx); while (!list_empty(&dpm_noirq_list)) { struct device *dev = to_device(dpm_noirq_list.next); - int error; get_device(dev); list_move_tail(&dev->power.entry, &dpm_late_early_list); - mutex_unlock(&dpm_list_mtx); + if (!skip_device_resume(dev, state)) { + int error; - error = device_resume_noirq(dev, state); - if (error) { - suspend_stats.failed_resume_noirq++; - dpm_save_failed_step(SUSPEND_RESUME_NOIRQ); - dpm_save_failed_dev(dev_name(dev)); - pm_dev_err(dev, state, " noirq", error); - } + mutex_unlock(&dpm_list_mtx); + + error = device_resume_noirq(dev, state); + if (error) { + suspend_stats.failed_resume_noirq++; + dpm_save_failed_step(SUSPEND_RESUME_NOIRQ); + dpm_save_failed_dev(dev_name(dev)); + pm_dev_err(dev, state, " noirq", error); + } - mutex_lock(&dpm_list_mtx); + mutex_lock(&dpm_list_mtx); + } put_device(dev); } mutex_unlock(&dpm_list_mtx); @@ -527,21 +553,39 @@ static void dpm_resume_early(pm_message_ mutex_lock(&dpm_list_mtx); while (!list_empty(&dpm_late_early_list)) { struct device *dev = to_device(dpm_late_early_list.next); - int error; get_device(dev); list_move_tail(&dev->power.entry, &dpm_suspended_list); - mutex_unlock(&dpm_list_mtx); + if (skip_device_resume(dev, state)) { + pm_runtime_set_suspended(dev); + pm_runtime_enable(dev); + /* + * Balance the pm_runtime_get_noresume() in + * device_prepare(). + */ + pm_runtime_put_noidle(dev); + /* + * The device might have been powered up by the platform + * firmware already, so make it resume and then possibly + * suspend again to avoid leaving powered up devices as + * "suspended" for too long. + */ + pm_request_resume(dev); + } else { + int error; - error = device_resume_early(dev, state); - if (error) { - suspend_stats.failed_resume_early++; - dpm_save_failed_step(SUSPEND_RESUME_EARLY); - dpm_save_failed_dev(dev_name(dev)); - pm_dev_err(dev, state, " early", error); - } + mutex_unlock(&dpm_list_mtx); + + error = device_resume_early(dev, state); + if (error) { + suspend_stats.failed_resume_early++; + dpm_save_failed_step(SUSPEND_RESUME_EARLY); + dpm_save_failed_dev(dev_name(dev)); + pm_dev_err(dev, state, " early", error); + } - mutex_lock(&dpm_list_mtx); + mutex_lock(&dpm_list_mtx); + } put_device(dev); } mutex_unlock(&dpm_list_mtx); @@ -682,6 +726,10 @@ void dpm_resume(pm_message_t state) list_for_each_entry(dev, &dpm_suspended_list, power.entry) { INIT_COMPLETION(dev->power.completion); + if (skip_device_resume(dev, state)) { + complete(&dev->power.completion); + continue; + } if (is_async(dev)) { get_device(dev); async_schedule(async_resume, dev); @@ -691,7 +739,7 @@ void dpm_resume(pm_message_t state) while (!list_empty(&dpm_suspended_list)) { dev = to_device(dpm_suspended_list.next); get_device(dev); - if (!is_async(dev)) { + if (!is_async(dev) && !skip_device_resume(dev, state)) { int error; mutex_unlock(&dpm_list_mtx); @@ -780,11 +828,13 @@ void dpm_complete(pm_message_t state) get_device(dev); dev->power.is_prepared = false; list_move(&dev->power.entry, &list); - mutex_unlock(&dpm_list_mtx); + if (!skip_device_resume(dev, state)) { + mutex_unlock(&dpm_list_mtx); - device_complete(dev, state); + device_complete(dev, state); - mutex_lock(&dpm_list_mtx); + mutex_lock(&dpm_list_mtx); + } put_device(dev); } list_splice(&list, &dpm_list);