
[v4,2/2] Documentation/sysctl: Document uclamp sysctl knobs

Message ID 20200501114927.15248-2-qais.yousef@arm.com (mailing list archive)
State New, archived

Commit Message

Qais Yousef May 1, 2020, 11:49 a.m. UTC
Uclamp exposes 3 sysctl knobs:

	* sched_util_clamp_min
	* sched_util_clamp_max
	* sched_util_clamp_min_rt_default

Document them in sysctl/kernel.rst.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
CC: Jonathan Corbet <corbet@lwn.net>
CC: Juri Lelli <juri.lelli@redhat.com>
CC: Vincent Guittot <vincent.guittot@linaro.org>
CC: Dietmar Eggemann <dietmar.eggemann@arm.com>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ben Segall <bsegall@google.com>
CC: Mel Gorman <mgorman@suse.de>
CC: Luis Chamberlain <mcgrof@kernel.org>
CC: Kees Cook <keescook@chromium.org>
CC: Iurii Zaikin <yzaikin@google.com>
CC: Quentin Perret <qperret@google.com>
CC: Valentin Schneider <valentin.schneider@arm.com>
CC: Patrick Bellasi <patrick.bellasi@matbug.net>
CC: Pavan Kondeti <pkondeti@codeaurora.org>
CC: Randy Dunlap <rdunlap@infradead.org>
CC: linux-doc@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: linux-fsdevel@vger.kernel.org
---

Changes in v4:
	* Punctuation fixes (Randy Dunlap).


 Documentation/admin-guide/sysctl/kernel.rst | 48 +++++++++++++++++++++
 1 file changed, 48 insertions(+)

Comments

Patrick Bellasi May 3, 2020, 5:45 p.m. UTC | #1
Hi Qais,

On Fri, May 01, 2020 at 13:49:27 +0200, Qais Yousef <qais.yousef@arm.com> wrote...

[...]

> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index 0d427fd10941..521c18ce3d92 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -940,6 +940,54 @@ Enables/disables scheduler statistics. Enabling this feature
>  incurs a small amount of overhead in the scheduler but is
>  useful for debugging and performance tuning.
>  
> +sched_util_clamp_min:
> +=====================
> +
> +Max allowed *minimum* utilization.
> +
> +Default value is SCHED_CAPACITY_SCALE (1024), which is the maximum possible
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^

Mmm... I feel one of the two is an implementation detail which should
probably not be exposed?

The user perhaps needs to know the value (1024) but we don't need to
expose the internal representation.


> +value.
> +
> +It means that any requested uclamp.min value cannot be greater than
> +sched_util_clamp_min, i.e., it is restricted to the range
> +[0:sched_util_clamp_min].
> +
> +sched_util_clamp_max:
> +=====================
> +
> +Max allowed *maximum* utilization.
> +
> +Default value is SCHED_CAPACITY_SCALE (1024), which is the maximum possible
> +value.
> +
> +It means that any requested uclamp.max value cannot be greater than
> +sched_util_clamp_max, i.e., it is restricted to the range
> +[0:sched_util_clamp_max].
> +
> +sched_util_clamp_min_rt_default:
> +================================
> +
> +By default Linux is tuned for performance. Which means that RT tasks always run
> +at the highest frequency and most capable (highest capacity) CPU (in
> +heterogeneous systems).
> +
> +Uclamp achieves this by setting the requested uclamp.min of all RT tasks to
> +SCHED_CAPACITY_SCALE (1024) by default, which effectively boosts the tasks to
> +run at the highest frequency and biases them to run on the biggest CPU.
> +
> +This knob allows admins to change the default behavior when uclamp is being
> +used. In battery powered devices particularly, running at the maximum
> +capacity and frequency will increase energy consumption and shorten the battery
> +life.
> +
> +This knob is only effective for RT tasks which the user hasn't modified their
> +requested uclamp.min value via sched_setattr() syscall.
> +
> +This knob will not escape the constraint imposed by sched_util_clamp_min
> +defined above.

Perhaps it's worth specifying that this value is going to be clamped by
the values above? Otherwise it's a bit ambiguous what happens
when it's bigger than sched_util_clamp_min.

> +Any modification is applied lazily on the next opportunity the scheduler needs
> +to calculate the effective value of uclamp.min of the task.
                    ^^^^^^^^^

This is also an implementation detail, I would remove it.

>  
>  seccomp
>  =======


Best,
Patrick
Qais Yousef May 5, 2020, 2:56 p.m. UTC | #2
Hi Patrick

On 05/03/20 19:45, Patrick Bellasi wrote:
> > +sched_util_clamp_min:
> > +=====================
> > +
> > +Max allowed *minimum* utilization.
> > +
> > +Default value is SCHED_CAPACITY_SCALE (1024), which is the maximum possible
>                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Mmm... I feel one of the two is an implementation detail which should
> probably not be exposed?
> 
> The user perhaps needs to know the value (1024) but we don't need to
> expose the internal representation.

Okay.

> 
> 
> > +value.
> > +
> > +It means that any requested uclamp.min value cannot be greater than
> > +sched_util_clamp_min, i.e., it is restricted to the range
> > +[0:sched_util_clamp_min].
> > +
> > +sched_util_clamp_max:
> > +=====================
> > +
> > +Max allowed *maximum* utilization.
> > +
> > +Default value is SCHED_CAPACITY_SCALE (1024), which is the maximum possible
> > +value.
> > +
> > +It means that any requested uclamp.max value cannot be greater than
> > +sched_util_clamp_max, i.e., it is restricted to the range
> > +[0:sched_util_clamp_max].
> > +
> > +sched_util_clamp_min_rt_default:
> > +================================
> > +
> > +By default Linux is tuned for performance. Which means that RT tasks always run
> > +at the highest frequency and most capable (highest capacity) CPU (in
> > +heterogeneous systems).
> > +
> > +Uclamp achieves this by setting the requested uclamp.min of all RT tasks to
> > +SCHED_CAPACITY_SCALE (1024) by default, which effectively boosts the tasks to
> > +run at the highest frequency and biases them to run on the biggest CPU.
> > +
> > +This knob allows admins to change the default behavior when uclamp is being
> > +used. In battery powered devices particularly, running at the maximum
> > +capacity and frequency will increase energy consumption and shorten the battery
> > +life.
> > +
> > +This knob is only effective for RT tasks which the user hasn't modified their
> > +requested uclamp.min value via sched_setattr() syscall.
> > +
> > +This knob will not escape the constraint imposed by sched_util_clamp_min
> > +defined above.
> 
> Perhaps it's worth specifying that this value is going to be clamped by
> the values above? Otherwise it's a bit ambiguous what happens
> when it's bigger than sched_util_clamp_min.

Hmm for me that sentence says exactly what you're asking for.

So what you want is

	s/will not escape the constraint imposed by/will be clamped by/

?

I'm not sure if this will help if the above is already ambiguous. Maybe if
I explicitly say

	..will not escape the *range* constraint imposed by..

sched_util_clamp_min is already defined as a range constraint, so hopefully it
should hit the mark better now?

> 
> > +Any modification is applied lazily on the next opportunity the scheduler needs
> > +to calculate the effective value of uclamp.min of the task.
>                     ^^^^^^^^^
> 
> This is also an implementation detail, I would remove it.

The idea is that this value is not updated 'immediately'/synchronously. So
currently RUNNING tasks will not see the effect, which could generate confusion
when users trip over it. IMO giving an idea of how it's updated will help set
users' expectations. I doubt any will care, but I think it's an important
behavior element that is worth conveying and documenting. I'd be happy to
reword it if necessary.

I have this now

"""
This knob will not escape the range constraint imposed by sched_util_clamp_min
defined above.

For example if

        sched_util_clamp_min_rt_default = 800
        sched_util_clamp_min = 600

Then the boost will be clamped to 600 because 800 is outside of the permissible
range of [0:600]. This could happen for instance if a powersave mode will
restrict all boosts temporarily by modifying sched_util_clamp_min. As soon as
this restriction is lifted, the requested sched_util_clamp_min_rt_default
will take effect.

Any modification is applied lazily to currently running tasks and should be
visible by the next wakeup.
"""
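
The clamping in the example above can be sketched as a small model. This is
only an illustration, assuming the boost an RT task ends up with is the
smaller of the two knob values; effective_rt_boost() is a hypothetical helper
and not a kernel function.

```python
SCHED_CAPACITY_SCALE = 1024  # maximum possible utilization value


def effective_rt_boost(rt_default: int, clamp_min: int) -> int:
    """Model the boost an RT task gets, assuming the default is
    restricted to the range [0:clamp_min]."""
    for name, value in (("rt_default", rt_default), ("clamp_min", clamp_min)):
        if not 0 <= value <= SCHED_CAPACITY_SCALE:
            raise ValueError(f"{name} must be in [0:{SCHED_CAPACITY_SCALE}]")
    return min(rt_default, clamp_min)


# A powersave mode lowers sched_util_clamp_min: the boost is clamped.
print(effective_rt_boost(800, 600))   # prints 600
# The restriction is lifted: the requested default takes effect again.
print(effective_rt_boost(800, 1024))  # prints 800
```

The model keeps the requested default untouched and only limits what is
applied, which matches the "as soon as this restriction is lifted" wording.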

Thanks

--
Qais Yousef
Patrick Bellasi May 11, 2020, 1 p.m. UTC | #3
Hi Qais,

On Tue, May 05, 2020 at 16:56:37 +0200, Qais Yousef <qais.yousef@arm.com> wrote...

>> > +sched_util_clamp_min_rt_default:
>> > +================================
>> > +
>> > +By default Linux is tuned for performance. Which means that RT tasks always run
>> > +at the highest frequency and most capable (highest capacity) CPU (in
>> > +heterogeneous systems).
>> > +
>> > +Uclamp achieves this by setting the requested uclamp.min of all RT tasks to
>> > +SCHED_CAPACITY_SCALE (1024) by default, which effectively boosts the tasks to
>> > +run at the highest frequency and biases them to run on the biggest CPU.
>> > +
>> > +This knob allows admins to change the default behavior when uclamp is being
>> > +used. In battery powered devices particularly, running at the maximum
>> > +capacity and frequency will increase energy consumption and shorten the battery
>> > +life.
>> > +
>> > +This knob is only effective for RT tasks which the user hasn't modified their
>> > +requested uclamp.min value via sched_setattr() syscall.
>> > +
>> > +This knob will not escape the constraint imposed by sched_util_clamp_min
>> > +defined above.
>> 
>> Perhaps it's worth specifying that this value is going to be clamped by
>> the values above? Otherwise it's a bit ambiguous what happens
>> when it's bigger than sched_util_clamp_min.
>
> Hmm for me that sentence says exactly what you're asking for.
>
> So what you want is
>
> 	s/will not escape the constraint imposed by/will be clamped by/
>
> ?
>
> I'm not sure if this will help if the above is already ambiguous. Maybe if
> I explicitly say
>
> 	..will not escape the *range* constraint imposed by..
>
> sched_util_clamp_min is already defined as a range constraint, so hopefully it
> should hit the mark better now?

Right, that also can work.

>> 
>> > +Any modification is applied lazily on the next opportunity the scheduler needs
>> > +to calculate the effective value of uclamp.min of the task.
>>                     ^^^^^^^^^
>> 
>> This is also an implementation detail, I would remove it.
>
> The idea is that this value is not updated 'immediately'/synchronously. So
> currently RUNNING tasks will not see the effect, which could generate confusion
> when users trip over it. IMO giving an idea of how it's updated will help set
> users' expectations. I doubt any will care, but I think it's an important
> behavior element that is worth conveying and documenting. I'd be happy to
> reword it if necessary.

Right, I agree on giving a hint on the lazy update. What I was pointing
out was mainly the reference to the 'effective' value. Maybe we can just
drop that word.

> I have this now
>
> """
> This knob will not escape the range constraint imposed by sched_util_clamp_min
> defined above.
>
> For example if
>
>         sched_util_clamp_min_rt_default = 800
>         sched_util_clamp_min = 600
>
> Then the boost will be clamped to 600 because 800 is outside of the permissible
> range of [0:600]. This could happen for instance if a powersave mode will
> restrict all boosts temporarily by modifying sched_util_clamp_min. As soon as
> this restriction is lifted, the requested sched_util_clamp_min_rt_default
> will take effect.
>
> Any modification is applied lazily to currently running tasks and should be
> visible by the next wakeup.
> """

That's better IMHO, would just slightly change the last sentence to:

       Any modification is applied lazily to tasks and is effective
       starting from their next wakeup.
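
A toy model of the lazy update wording above, assuming a task only picks up
a changed default at its next wakeup. ToyRTTask and the knobs dictionary are
hypothetical names used for illustration; they do not correspond to kernel
structures.

```python
class ToyRTTask:
    """Hypothetical stand-in for an RT task; not a kernel structure."""

    def __init__(self, knobs: dict):
        self.knobs = knobs
        # Value in effect when the task was created.
        self.uclamp_min = knobs["rt_default"]

    def wakeup(self) -> None:
        # The new default is picked up here, not when the knob is written.
        self.uclamp_min = self.knobs["rt_default"]


knobs = {"rt_default": 1024}
task = ToyRTTask(knobs)

knobs["rt_default"] = 512   # admin changes the knob while the task runs
before = task.uclamp_min    # the running task still sees the old value

task.wakeup()
after = task.uclamp_min     # the change is effective from this wakeup on
```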

Best,
Patrick
Qais Yousef May 11, 2020, 3:28 p.m. UTC | #4
Hi Patrick

On 05/11/20 15:00, Patrick Bellasi wrote:

[...]

> > I have this now
> >
> > """
> > This knob will not escape the range constraint imposed by sched_util_clamp_min
> > defined above.
> >
> > For example if
> >
> >         sched_util_clamp_min_rt_default = 800
> >         sched_util_clamp_min = 600
> >
> > Then the boost will be clamped to 600 because 800 is outside of the permissible
> > range of [0:600]. This could happen for instance if a powersave mode will
> > restrict all boosts temporarily by modifying sched_util_clamp_min. As soon as
> > this restriction is lifted, the requested sched_util_clamp_min_rt_default
> > will take effect.
> >
> > Any modification is applied lazily to currently running tasks and should be
> > visible by the next wakeup.
> > """
> 
> That's better IMHO, would just slightly change the last sentence to:
> 
>        Any modification is applied lazily to tasks and is effective
>        starting from their next wakeup.

+1, will post v5 later today.

Thanks

--
Qais Yousef

Patch

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 0d427fd10941..521c18ce3d92 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -940,6 +940,54 @@  Enables/disables scheduler statistics. Enabling this feature
 incurs a small amount of overhead in the scheduler but is
 useful for debugging and performance tuning.
 
+sched_util_clamp_min:
+=====================
+
+Max allowed *minimum* utilization.
+
+Default value is SCHED_CAPACITY_SCALE (1024), which is the maximum possible
+value.
+
+It means that any requested uclamp.min value cannot be greater than
+sched_util_clamp_min, i.e., it is restricted to the range
+[0:sched_util_clamp_min].
+
+sched_util_clamp_max:
+=====================
+
+Max allowed *maximum* utilization.
+
+Default value is SCHED_CAPACITY_SCALE (1024), which is the maximum possible
+value.
+
+It means that any requested uclamp.max value cannot be greater than
+sched_util_clamp_max, i.e., it is restricted to the range
+[0:sched_util_clamp_max].
+
+sched_util_clamp_min_rt_default:
+================================
+
+By default Linux is tuned for performance. Which means that RT tasks always run
+at the highest frequency and most capable (highest capacity) CPU (in
+heterogeneous systems).
+
+Uclamp achieves this by setting the requested uclamp.min of all RT tasks to
+SCHED_CAPACITY_SCALE (1024) by default, which effectively boosts the tasks to
+run at the highest frequency and biases them to run on the biggest CPU.
+
+This knob allows admins to change the default behavior when uclamp is being
+used. In battery powered devices particularly, running at the maximum
+capacity and frequency will increase energy consumption and shorten the battery
+life.
+
+This knob is only effective for RT tasks which the user hasn't modified their
+requested uclamp.min value via sched_setattr() syscall.
+
+This knob will not escape the constraint imposed by sched_util_clamp_min
+defined above.
+
+Any modification is applied lazily on the next opportunity the scheduler needs
+to calculate the effective value of uclamp.min of the task.
 
 seccomp
 =======