mbox series

[v9,0/8] PM / devfreq: Add dev_pm_qos support

Message ID cover.1570044052.git.leonard.crestez@nxp.com (mailing list archive)
Headers show
Series PM / devfreq: Add dev_pm_qos support | expand

Message

Leonard Crestez Oct. 2, 2019, 7:25 p.m. UTC
Add dev_pm_qos notifiers to devfreq core in order to support frequency
limits via dev_pm_qos_add_request.

Unlike the rest of devfreq the dev_pm_qos frequency is measured in Khz,
this is consistent with current dev_pm_qos usage for cpufreq and
allows frequencies above 2Ghz (pm_qos expresses limits as s32).

Like with cpufreq the handling of min_freq/max_freq is moved to the
dev_pm_qos mechanism. Constraints from userspace are no longer clamped on
store, instead all values can be written and we only check against OPPs in a
new devfreq_get_freq_range function. This is consistent with the design of
dev_pm_qos.

Notifiers from pm_qos are executed under a single global dev_pm_qos_mtx and
need to take devfreq->lock. Notifier registration takes the same dev_pm_qos_mtx
so in order to prevent lockdep warnings it must be done outside devfreq->lock.
Current devfreq_add_device does all initialization under devfreq->lock and that
needs to be relaxed.

Some of first patches in the series are bugfixes and cleanups, they could be
applied separately.

---
Changes since v8:
* Fix incorrectly reading qos max twice in get_freq_range.
* Make devfreq_notifier_call set scaling_max_freq to ULONG_MAX instead of 0 on
error. This avoids special-casing this in get_freq_range.
* Add a fix for devfreq_notifier_call returning -errno.
* Replace S32_MAX with PM_QOS_MAX_FREQUENCY_DEFAULT_VALUE. This seems cleaner
and avoids overflow when multiplying S32_MAX with 1000 on 32bit platforms. It
overflowed to 0xfffffc18 hz so it was mostly safe anyway.
* Warn when encountering errors on cleanup (but continue).
* Incorporate other smaller comment and printk adjustments from review.
Link to v8:
https://patchwork.kernel.org/cover/11158383/

Changes since v7:
* Only #define HZ_PER_KHZ in patch where it's used.
* Drop devfreq_ prefix for some internal functions.
* Improve qos update error message.
* Remove some unnecessary comments.
* Collect reviews
Link to v7: https://patchwork.kernel.org/cover/11157649/

Changes since v6:
* Don't return errno from devfreq_qos_notifier_call, return NOTIFY_DONE
and print the error.
* More spelling and punctuation nits
Link to v6: https://patchwork.kernel.org/cover/11157201/

Changes since v5:
* Drop patches which are not strictly related to PM QoS.
* Add a comment explaining why devfreq_add_device needs two cleanup paths.
* Remove {} for single line.
* Rename {min,max}_freq_req to user_{min,max}_freq_req
* Collect reviews
Link to v5: https://patchwork.kernel.org/cover/11149497/

Changes since v4:
* Move more devfreq_add_device init ahead of device_register.
* Make devfreq_dev_release cleanup devices not yet in devfreq_list. This is
simpler than previous attempt to add to devfreq_list sonner.
* Take devfreq->lock in trans_stat_show
* Register dev_pm_opp notifier on devfreq parent dev (which has OPPs)
Link to v4: https://patchwork.kernel.org/cover/11114657/

Changes since v3:
* Cleanup locking and error-handling in devfreq_add_device
* Register notifiers after device registration but before governor start
* Keep the initialization of min_req/max_req ahead of device_register
because it's used for sysfs handling
* Use HZ_PER_KHZ instead of 1000
* Add kernel-doc comments
* Move OPP notifier to core
Link to v3: https://patchwork.kernel.org/cover/11104061/

Changes since v2:
* Handle sysfs via dev_pm_qos (in separate patch)
* Add locking to {min,max}_freq_show
* Fix checkpatch issues (long lines etc)
Link to v2: https://patchwork.kernel.org/patch/11084279/

Changes since v1:
* Add doxygen comments for min_nb/max_nb
* Remove notifiers on error/cleanup paths. Keep gotos simple by relying on
dev_pm_qos_remove_notifier ignoring notifiers which were not added.
Link to v1: https://patchwork.kernel.org/patch/11078475/

Leonard Crestez (8):
  PM / devfreq: Don't fail devfreq_dev_release if not in list
  PM / devfreq: Fix devfreq_notifier_call returning errno
  PM / devfreq: Set scaling_max_freq to max on OPP notifier error
  PM / devfreq: Move more initialization before registration
  PM / devfreq: Don't take lock in devfreq_add_device
  PM / devfreq: Introduce get_freq_range helper
  PM / devfreq: Add PM QoS support
  PM / devfreq: Use PM QoS for sysfs min/max_freq

 drivers/devfreq/devfreq.c | 310 ++++++++++++++++++++++++++------------
 include/linux/devfreq.h   |  14 +-
 2 files changed, 224 insertions(+), 100 deletions(-)

Comments

Leonard Crestez Oct. 23, 2019, 2:06 p.m. UTC | #1
On 2019-10-02 10:25 PM, Leonard Crestez wrote:
> Add dev_pm_qos notifiers to devfreq core in order to support frequency
> limits via dev_pm_qos_add_request.
> 
> Unlike the rest of devfreq the dev_pm_qos frequency is measured in Khz,
> this is consistent with current dev_pm_qos usage for cpufreq and
> allows frequencies above 2Ghz (pm_qos expresses limits as s32).
> 
> Like with cpufreq the handling of min_freq/max_freq is moved to the
> dev_pm_qos mechanism. Constraints from userspace are no longer clamped on
> store, instead all values can be written and we only check against OPPs in a
> new devfreq_get_freq_range function. This is consistent with the design of
> dev_pm_qos.
> 
> Notifiers from pm_qos are executed under a single global dev_pm_qos_mtx and
> need to take devfreq->lock. Notifier registration takes the same dev_pm_qos_mtx
> so in order to prevent lockdep warnings it must be done outside devfreq->lock.
> Current devfreq_add_device does all initialization under devfreq->lock and that
> needs to be relaxed.
> 
> Some of first patches in the series are bugfixes and cleanups, they could be
> applied separately.

Hello,

This series was posted a few ago and all patches have been 
reviewed/tested-by multiple people. Possible minor hangups:

1) Matthias found it confusing that min/max_freq in sysfs changes 
on-the-fly. This is not a behavior change and I believe a decent 
workaround would be to implement separate user_min/max_freq files from 
which userspace will always read back the contraints it has written.

2) There was an objection to removing devm from two allocs in PATCH 4. I 
believe current solution is acceptable but a possible alternative would 
be to split device_register into device_initialize and device_add: this 
should allow devm sooner.
Link: https://patchwork.kernel.org/patch/11158385/#22902151

Let me know if you think I should implement the options above and resend.

The bigger problem is that DEV_PM_QOS_MIN/MAX_FREQUENCY was removed from 
pm core because only user (cpufreq) was refactored to use a new 
interface on top of cpufreq_policy. Links to discussion:
     https://patchwork.kernel.org/cover/11193021/
 
https://lore.kernel.org/linux-pm/VI1PR04MB7023DF47D046AEADB4E051EBEE680@VI1PR04MB7023.eurprd04.prod.outlook.com/T/#u

I believe there is still significant value in supporting min/max 
frequency requests on a per-target-device basis. This makes much more 
sense for devfreq that for cpufreq because the whole point of devfreq is 
scaling arbitrary independent devices.

> ---
> Changes since v8:
> * Fix incorrectly reading qos max twice in get_freq_range.
> * Make devfreq_notifier_call set scaling_max_freq to ULONG_MAX instead of 0 on
> error. This avoids special-casing this in get_freq_range.
> * Add a fix for devfreq_notifier_call returning -errno.
> * Replace S32_MAX with PM_QOS_MAX_FREQUENCY_DEFAULT_VALUE. This seems cleaner
> and avoids overflow when multiplying S32_MAX with 1000 on 32bit platforms. It
> overflowed to 0xfffffc18 hz so it was mostly safe anyway.
> * Warn when encountering errors on cleanup (but continue).
> * Incorporate other smaller comment and printk adjustments from review.
> Link to v8:
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fcover%2F11158383%2F&data=02%7C01%7Cleonard.crestez%40nxp.com%7Ca6b53046361642d0fd5708d7476e40e4%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637056411174276846&sdata=YKKW1RJl1YArhtjlA859DeRxys4d4iiB%2FQIz3Nl8OtU%3D&reserved=0
> 
> Changes since v7:
> * Only #define HZ_PER_KHZ in patch where it's used.
> * Drop devfreq_ prefix for some internal functions.
> * Improve qos update error message.
> * Remove some unnecessary comments.
> * Collect reviews
> Link to v7: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fcover%2F11157649%2F&data=02%7C01%7Cleonard.crestez%40nxp.com%7Ca6b53046361642d0fd5708d7476e40e4%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637056411174276846&sdata=0s6mH192c3dvDlfrFEVdqdMqoFRM4QJiLZiRg4c89nQ%3D&reserved=0
> 
> Changes since v6:
> * Don't return errno from devfreq_qos_notifier_call, return NOTIFY_DONE
> and print the error.
> * More spelling and punctuation nits
> Link to v6: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fcover%2F11157201%2F&data=02%7C01%7Cleonard.crestez%40nxp.com%7Ca6b53046361642d0fd5708d7476e40e4%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637056411174276846&sdata=6hcaCgESymHhCk5UbFC182FrgFVdJdDoorAJhZXKuTg%3D&reserved=0
> 
> Changes since v5:
> * Drop patches which are not strictly related to PM QoS.
> * Add a comment explaining why devfreq_add_device needs two cleanup paths.
> * Remove {} for single line.
> * Rename {min,max}_freq_req to user_{min,max}_freq_req
> * Collect reviews
> Link to v5: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fcover%2F11149497%2F&data=02%7C01%7Cleonard.crestez%40nxp.com%7Ca6b53046361642d0fd5708d7476e40e4%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637056411174276846&sdata=oDuaTI7bpOlCEs9mFRgY5eUvGX%2FtpP6m0U5t4SYNKaw%3D&reserved=0
> 
> Changes since v4:
> * Move more devfreq_add_device init ahead of device_register.
> * Make devfreq_dev_release cleanup devices not yet in devfreq_list. This is
> simpler than previous attempt to add to devfreq_list sonner.
> * Take devfreq->lock in trans_stat_show
> * Register dev_pm_opp notifier on devfreq parent dev (which has OPPs)
> Link to v4: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fcover%2F11114657%2F&data=02%7C01%7Cleonard.crestez%40nxp.com%7Ca6b53046361642d0fd5708d7476e40e4%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637056411174276846&sdata=ItIVLxJZVOpO2XL2MBqONWLQ1lHFu2gA7GC1yb%2BQgEE%3D&reserved=0
> 
> Changes since v3:
> * Cleanup locking and error-handling in devfreq_add_device
> * Register notifiers after device registration but before governor start
> * Keep the initialization of min_req/max_req ahead of device_register
> because it's used for sysfs handling
> * Use HZ_PER_KHZ instead of 1000
> * Add kernel-doc comments
> * Move OPP notifier to core
> Link to v3: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fcover%2F11104061%2F&data=02%7C01%7Cleonard.crestez%40nxp.com%7Ca6b53046361642d0fd5708d7476e40e4%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637056411174276846&sdata=unedjnBbjVtifB8koQ0B8eOjqwCnnAeqHGI8QYuH%2Fv4%3D&reserved=0
> 
> Changes since v2:
> * Handle sysfs via dev_pm_qos (in separate patch)
> * Add locking to {min,max}_freq_show
> * Fix checkpatch issues (long lines etc)
> Link to v2: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fpatch%2F11084279%2F&data=02%7C01%7Cleonard.crestez%40nxp.com%7Ca6b53046361642d0fd5708d7476e40e4%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637056411174276846&sdata=boHOmdwyUhKlmk7F3X8oYbLq1MvR4dQWlF14I0AXXes%3D&reserved=0
> 
> Changes since v1:
> * Add doxygen comments for min_nb/max_nb
> * Remove notifiers on error/cleanup paths. Keep gotos simple by relying on
> dev_pm_qos_remove_notifier ignoring notifiers which were not added.
> Link to v1: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fpatch%2F11078475%2F&data=02%7C01%7Cleonard.crestez%40nxp.com%7Ca6b53046361642d0fd5708d7476e40e4%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637056411174276846&sdata=DZlQAKLqQc4Q46a7v%2FtEBexsp1jDSLB8yuZcLXPEHl4%3D&reserved=0
> 
> Leonard Crestez (8):
>    PM / devfreq: Don't fail devfreq_dev_release if not in list
>    PM / devfreq: Fix devfreq_notifier_call returning errno
>    PM / devfreq: Set scaling_max_freq to max on OPP notifier error
>    PM / devfreq: Move more initialization before registration
>    PM / devfreq: Don't take lock in devfreq_add_device
>    PM / devfreq: Introduce get_freq_range helper
>    PM / devfreq: Add PM QoS support
>    PM / devfreq: Use PM QoS for sysfs min/max_freq
> 
>   drivers/devfreq/devfreq.c | 310 ++++++++++++++++++++++++++------------
>   include/linux/devfreq.h   |  14 +-
>   2 files changed, 224 insertions(+), 100 deletions(-)
>
Matthias Kaehlcke Oct. 23, 2019, 4:34 p.m. UTC | #2
On Wed, Oct 23, 2019 at 02:06:40PM +0000, Leonard Crestez wrote:
> On 2019-10-02 10:25 PM, Leonard Crestez wrote:
> > Add dev_pm_qos notifiers to devfreq core in order to support frequency
> > limits via dev_pm_qos_add_request.
> > 
> > Unlike the rest of devfreq the dev_pm_qos frequency is measured in Khz,
> > this is consistent with current dev_pm_qos usage for cpufreq and
> > allows frequencies above 2Ghz (pm_qos expresses limits as s32).
> > 
> > Like with cpufreq the handling of min_freq/max_freq is moved to the
> > dev_pm_qos mechanism. Constraints from userspace are no longer clamped on
> > store, instead all values can be written and we only check against OPPs in a
> > new devfreq_get_freq_range function. This is consistent with the design of
> > dev_pm_qos.
> > 
> > Notifiers from pm_qos are executed under a single global dev_pm_qos_mtx and
> > need to take devfreq->lock. Notifier registration takes the same dev_pm_qos_mtx
> > so in order to prevent lockdep warnings it must be done outside devfreq->lock.
> > Current devfreq_add_device does all initialization under devfreq->lock and that
> > needs to be relaxed.
> > 
> > Some of first patches in the series are bugfixes and cleanups, they could be
> > applied separately.
> 
> Hello,
> 
> This series was posted a few ago and all patches have been 
> reviewed/tested-by multiple people. Possible minor hangups:
> 
> 1) Matthias found it confusing that min/max_freq in sysfs changes 
> on-the-fly. This is not a behavior change and I believe a decent 
> workaround would be to implement separate user_min/max_freq files from 
> which userspace will always read back the contraints it has written.

As you said, it isn't a behavioral change, so it shouldn't be an issue
for this series.

Regarding the workaround I think it would be clearer to have new
xyx_min/max_freq files for the aggregate values. min/max_freq are the
interface userspace uses to specify the limits, it would be strange to
use different files to read them out.

In any case the aggregate values seem to be of little practical value,
except perhaps for monitoring, since they could be stale right after
userspace read them out. We could start with not providing them, and add
them if it turns out that they are actually needed.

> 2) There was an objection to removing devm from two allocs in PATCH 4. I 
> believe current solution is acceptable but a possible alternative would 
> be to split device_register into device_initialize and device_add: this 
> should allow devm sooner.
> Link: https://patchwork.kernel.org/patch/11158385/#22902151
> 
> Let me know if you think I should implement the options above and resend.
> 
> The bigger problem is that DEV_PM_QOS_MIN/MAX_FREQUENCY was removed from 
> pm core because only user (cpufreq) was refactored to use a new 
> interface on top of cpufreq_policy. Links to discussion:
>      https://patchwork.kernel.org/cover/11193021/
>  
> https://lore.kernel.org/linux-pm/VI1PR04MB7023DF47D046AEADB4E051EBEE680@VI1PR04MB7023.eurprd04.prod.outlook.com/T/#u
> 
> I believe there is still significant value in supporting min/max 
> frequency requests on a per-target-device basis. This makes much more 
> sense for devfreq that for cpufreq because the whole point of devfreq is 
> scaling arbitrary independent devices.

Agreed.

It seems Rafael would be ok with reverting the patch that removes
DEV_PM_QOS_MIN/MAX_FREQUENCY, IIUC he just doesn't want to keep it around
at this time because with the rest of his series there remain no in-tree
users.
Leonard Crestez Oct. 24, 2019, 4:21 p.m. UTC | #3
On 23.10.2019 19:34, Matthias Kaehlcke wrote:
> On Wed, Oct 23, 2019 at 02:06:40PM +0000, Leonard Crestez wrote:
>> On 2019-10-02 10:25 PM, Leonard Crestez wrote:
>>> Add dev_pm_qos notifiers to devfreq core in order to support frequency
>>> limits via dev_pm_qos_add_request.
>>>
>>> Unlike the rest of devfreq the dev_pm_qos frequency is measured in Khz,
>>> this is consistent with current dev_pm_qos usage for cpufreq and
>>> allows frequencies above 2Ghz (pm_qos expresses limits as s32).
>>>
>>> Like with cpufreq the handling of min_freq/max_freq is moved to the
>>> dev_pm_qos mechanism. Constraints from userspace are no longer clamped on
>>> store, instead all values can be written and we only check against OPPs in a
>>> new devfreq_get_freq_range function. This is consistent with the design of
>>> dev_pm_qos.
>>>
>>> Notifiers from pm_qos are executed under a single global dev_pm_qos_mtx and
>>> need to take devfreq->lock. Notifier registration takes the same dev_pm_qos_mtx
>>> so in order to prevent lockdep warnings it must be done outside devfreq->lock.
>>> Current devfreq_add_device does all initialization under devfreq->lock and that
>>> needs to be relaxed.
>>>
>>> Some of first patches in the series are bugfixes and cleanups, they could be
>>> applied separately.
>>
>> Hello,
>>
>> This series was posted a few ago and all patches have been
>> reviewed/tested-by multiple people. Possible minor hangups:
>>
>> 1) Matthias found it confusing that min/max_freq in sysfs changes
>> on-the-fly. This is not a behavior change and I believe a decent
>> workaround would be to implement separate user_min/max_freq files from
>> which userspace will always read back the contraints it has written.
> 
> As you said, it isn't a behavioral change, so it shouldn't be an issue
> for this series.
> 
> Regarding the workaround I think it would be clearer to have new
> xyx_min/max_freq files for the aggregate values. min/max_freq are the
> interface userspace uses to specify the limits, it would be strange to
> use different files to read them out.
> 
> In any case the aggregate values seem to be of little practical value,
> except perhaps for monitoring, since they could be stale right after
> userspace read them out. We could start with not providing them, and add
> them if it turns out that they are actually needed.

But the min/max_freq files right now already are already aggregates and 
sysfs is considered ABI. I wouldn't be surprised if somebody has an 
userspace daemon which uses them.

My proposal is to add user_min/max_freq as a new finer-grained interface 
that you can both read and write to, no confusion here.

Writes to min/max_freq would still be supported but only for 
compatibility with older releases.

>> 2) There was an objection to removing devm from two allocs in PATCH 4. I
>> believe current solution is acceptable but a possible alternative would
>> be to split device_register into device_initialize and device_add: this
>> should allow devm sooner.
>> Link: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fpatch%2F11158385%2F%2322902151&data=02%7C01%7Cleonard.crestez%40nxp.com%7Cb89f3efc3c934030fb6108d757d6ecb2%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637074452911403311&sdata=DeOUpVjT1yZ2EWs50CFL98OoTjVMCpQDCM3qjCtKuW0%3D&reserved=0
>>
>> Let me know if you think I should implement the options above and resend.
>>
>> The bigger problem is that DEV_PM_QOS_MIN/MAX_FREQUENCY was removed from
>> pm core because only user (cpufreq) was refactored to use a new
>> interface on top of cpufreq_policy. Links to discussion:
>>       https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fcover%2F11193021%2F&data=02%7C01%7Cleonard.crestez%40nxp.com%7Cb89f3efc3c934030fb6108d757d6ecb2%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637074452911408301&sdata=DxfUtaGch6MilSy5fX8AHN3%2BDIp8MrbQrHH%2B6VdRb%2FI%3D&reserved=0
>>   
>> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flinux-pm%2FVI1PR04MB7023DF47D046AEADB4E051EBEE680%40VI1PR04MB7023.eurprd04.prod.outlook.com%2FT%2F%23u&data=02%7C01%7Cleonard.crestez%40nxp.com%7Cb89f3efc3c934030fb6108d757d6ecb2%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637074452911408301&sdata=sYQZUbzEk2DsWGQ5eQnu2m2%2Bsp%2BYBO16Abyjwf7Z1sQ%3D&reserved=0
>>
>> I believe there is still significant value in supporting min/max
>> frequency requests on a per-target-device basis. This makes much more
>> sense for devfreq that for cpufreq because the whole point of devfreq is
>> scaling arbitrary independent devices.
> 
> Agreed.
> 
> It seems Rafael would be ok with reverting the patch that removes
> DEV_PM_QOS_MIN/MAX_FREQUENCY, IIUC he just doesn't want to keep it around
> at this time because with the rest of his series there remain no in-tree
> users.