diff mbox

[4/7] sched/core: uclamp: add utilization clamping to the CPU controller

Message ID 20180409165615.2326-5-patrick.bellasi@arm.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Patrick Bellasi April 9, 2018, 4:56 p.m. UTC
The cgroup's CPU controller allows a specified (maximum) bandwidth to be
assigned to the tasks of a group. However, this bandwidth is defined and
enforced only on a temporal basis, without considering the actual
frequency the CPU is running at. Thus, the amount of computation completed
by a task within an allocated bandwidth can vary widely depending
on the frequency at which the CPU runs that task.

With the availability of schedutil, the scheduler is now able
to drive frequency selection based on the actual utilization of tasks.
Moreover, the utilization clamping support provides a mechanism to
constrain the frequency selection performed by schedutil, based on
constraints assigned to the tasks currently active on a CPU.

Given the above mechanisms, it is now possible to extend the CPU
controller to specify the minimum (or maximum) utilization which
a task is allowed to generate. By adding new constraints on the minimum and
maximum utilization allowed for tasks in a CPU control group, it will
also be possible to better control the actual amount of CPU bandwidth
consumed by these tasks.

The ultimate goal of this new pair of constraints is to enable:

- boosting: by selecting a higher execution frequency for small tasks
	    which affect the user-interactive experience

- capping: by selecting a lower execution frequency, which usually improves
	   energy efficiency, for big tasks which are mainly related to
	   background activities and thus have no direct impact on
	   the user experience.

This patch extends the CPU controller by adding a couple of new attributes,
util_min and util_max, which can be used to enforce frequency boosting and
capping. Specifically:

- util_min: defines the minimum CPU utilization which should be considered,
	    e.g. when schedutil selects the frequency for a CPU while a
	    task in this group is RUNNABLE,
	    i.e. the task will run at least at the minimum frequency which
	         corresponds to the util_min utilization

- util_max: defines the maximum CPU utilization which should be considered,
	    e.g. when schedutil selects the frequency for a CPU while a
	    task in this group is RUNNABLE,
	    i.e. the task will run at most at the maximum frequency which
	         corresponds to the util_max utilization

These attributes:
a) are tunable at all hierarchy levels, i.e. at the root group level too, thus
   allowing the minimum and maximum frequency constraints to be defined for
   all otherwise non-classified tasks (e.g. autogroups), and to be a sort-of
   replacement for cpufreq's powersave, ondemand and performance
   governors.
b) allow the creation of subgroups of tasks which do not violate the
   utilization constraints defined by the parent group.

Tasks in a subgroup can only be more boosted and/or more capped, which
matches the "limits" schema proposed by the "Resource Distribution
Model (RDM)" suggested by the cgroups v2 documentation:
   Documentation/cgroup-v2.txt

This patch provides the basic support to expose the two new attributes and
to validate their run-time update based on the "limits" of the
aforementioned RDM schema.

We first ensure that, whenever a task group is assigned a specific
clamp_value, it is properly translated into a unique clamp group to be
used in the fast path (i.e. at enqueue/dequeue time). This is done by
slightly refactoring uclamp_group_get to accept a *cgroup_subsys_state
alongside a *task_struct.

When uclamp_group_get is called with a valid *cgroup_subsys_state, a
clamp group is assigned to the task, which is possibly different from
the task-specific clamp group. We then update the current clamp group
accounting for all the tasks which are currently RUNNABLE in
the cgroup, via a new uclamp_group_get_tg() call.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org

---
The actual aggregation of per-task and per-task_group utilization
constraints is provided in a separate patch, to make it clearer and
better documented how this aggregation is performed.
---
 init/Kconfig         |  22 +++++
 kernel/sched/core.c  | 271 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 kernel/sched/sched.h |  21 ++++
 3 files changed, 311 insertions(+), 3 deletions(-)

Comments

Tejun Heo April 9, 2018, 10:24 p.m. UTC | #1
Hello, Patrick.

Comments purely on cgroup interface side.

On Mon, Apr 09, 2018 at 05:56:12PM +0100, Patrick Bellasi wrote:
> This patch extends the CPU controller by adding a couple of new attributes,
> util_min and util_max, which can be used to enforce frequency boosting and
> capping. Specifically:
> 
> - util_min: defines the minimum CPU utilization which should be considered,
> 	    e.g. when  schedutil selects the frequency for a CPU while a
> 	    task in this group is RUNNABLE.
> 	    i.e. the task will run at least at a minimum frequency which
> 	         corresponds to the min_util utilization
> 
> - util_max: defines the maximum CPU utilization which should be considered,
> 	    e.g. when schedutil selects the frequency for a CPU while a
> 	    task in this group is RUNNABLE.
> 	    i.e. the task will run up to a maximum frequency which
> 	         corresponds to the max_util utilization

I'm not too enthusiastic about util_min/max given that it can easily
be read as actual utilization based bandwidth control when what's
actually implemented, IIUC, is affecting CPU frequency selection.
Maybe something like cpu.freq.min/max are better names?

> These attributes:
> a) are tunable at all hierarchy levels, i.e. at root group level too, thus
>    allowing to define the minimum and maximum frequency constraints for all
>    otherwise non-classified tasks (e.g. autogroups) and to be a sort-of
>    replacement for cpufreq's powersave, ondemand and performance
>    governors.

This is a problem which exists for all other interfaces.  For
historical and other reasons, at least till now, we've opted to put
everything at system level outside of cgroup interface.  We might
change this in the future and duplicate system-level information and
interfaces in the root cgroup but we wanna do that in a more systematic
fashion than adding a one-off knob in the cgroup root.

Besides, if a feature makes sense at the system level which is the
cgroup root, it makes sense without cgroup mounted or enabled, so it
needs a place outside cgroup one way or the other.

> b) allow to create subgroups of tasks which are not violating the
>    utilization constraints defined by the parent group.

Tying creation / config operations to the config propagation doesn't
work well with delegation and is inconsistent with what other
controllers are doing.  For cases where the propagated config being
visible in a sub cgroup is necessary, please add .effective files.

> Tasks on a subgroup can only be more boosted and/or capped, which is

Less boosted.  .low at a parent level must set the upper bound of .low
that all its descendants can have.

Thanks.
Patrick Bellasi April 10, 2018, 5:16 p.m. UTC | #2
Hi Tejun,

On 09-Apr 15:24, Tejun Heo wrote:
> On Mon, Apr 09, 2018 at 05:56:12PM +0100, Patrick Bellasi wrote:
> > This patch extends the CPU controller by adding a couple of new attributes,
> > util_min and util_max, which can be used to enforce frequency boosting and
> > capping. Specifically:
> > 
> > - util_min: defines the minimum CPU utilization which should be considered,
> > 	    e.g. when  schedutil selects the frequency for a CPU while a
> > 	    task in this group is RUNNABLE.
> > 	    i.e. the task will run at least at a minimum frequency which
> > 	         corresponds to the min_util utilization
> > 
> > - util_max: defines the maximum CPU utilization which should be considered,
> > 	    e.g. when schedutil selects the frequency for a CPU while a
> > 	    task in this group is RUNNABLE.
> > 	    i.e. the task will run up to a maximum frequency which
> > 	         corresponds to the max_util utilization
> 
> I'm not too enthusiastic about util_min/max given that it can easily
> be read as actual utilization based bandwidth control when what's
> actually implemented, IIUC, is affecting CPU frequency selection.

Right now we are basically affecting the frequency selection.
However, the next step is to use this same interface to possibly bias
task placement.

The idea is that:

- the util_min value can be used to possibly avoid CPUs which have
  a (maybe temporarily) limited capacity, for example, due to thermal
  pressure.

- a util_max value can be used to possibly identify tasks which can
  be co-scheduled together on a (maybe) limited capacity CPU since
  they are more likely "less important" tasks.

Thus, since this is a new user-space API, we would like to find a
concept which is generic enough to express the current requirement but
also easily accommodate future extensions.

> Maybe something like cpu.freq.min/max are better names?

IMO this is something too much platform specific.

I agree that utilization is maybe too much an implementation detail,
but perhaps this can be solved by using a more generic range.

What about using values in the [0..100] range which define:

   a percentage of the maximum available capacity
         for the CPUs in the target system

Do you think this can work?

> > These attributes:
> > a) are tunable at all hierarchy levels, i.e. at root group level too, thus
> >    allowing to define the minimum and maximum frequency constraints for all
> >    otherwise non-classified tasks (e.g. autogroups) and to be a sort-of
> >    replacement for cpufreq's powersave, ondemand and performance
> >    governors.
> 
> This is a problem which exists for all other interfaces.  For
> historical and other reasons, at least till now, we've opted to put
> everything at system level outside of cgroup interface.  We might
> change this in the future and duplicate system-level information and
> interfaces in the root cgroup but we wanna do that in a more systemtic
> fashion than adding an one-off knob in the cgroup root.

I see, I think we can easily come up with a procfs/sysfs interface
usable to define system-wide values.

Any suggestion for something already existing which I can use as a
reference?

> Besides, if a feature makes sense at the system level which is the
> cgroup root, it makes sense without cgroup mounted or enabled, so it
> needs a place outside cgroup one way or the other.

Indeed, and it makes perfect sense now that we also have a non
cgroup-based primary API.

> > b) allow to create subgroups of tasks which are not violating the
> >    utilization constraints defined by the parent group.
> 
> Tying creation / config operations to the config propagation doesn't
> work well with delegation and is inconsistent with what other
> controllers are doing.  For cases where the propagated config being
> visible in a sub cgroup is necessary, please add .effective files.

I'm not sure I understand this point: you mean that we should not
enforce "consistency rules" among parent-child groups?

I have to look better into this "effective" concept.
Meanwhile, can you make a simple example?

> > Tasks on a subgroup can only be more boosted and/or capped, which is
> 
> Less boosted.  .low at a parent level must set the upper bound of .low
> that all its descendants can have.

Is that a mandatory requirement? Or based on a proper justification
you can also accept what I'm proposing?

I've always been of the idea that what I'm proposing could make
more sense in the general case, but perhaps I just need to go back and
better check the use-cases we have on hand to see whether it's really
required or not.

Thanks for the prompt feedback!
Tejun Heo April 10, 2018, 8:05 p.m. UTC | #3
Hello,

On Tue, Apr 10, 2018 at 06:16:12PM +0100, Patrick Bellasi wrote:
> > I'm not too enthusiastic about util_min/max given that it can easily
> > be read as actual utilization based bandwidth control when what's
> > actually implemented, IIUC, is affecting CPU frequency selection.
> 
> Right now we are basically affecting the frequency selection.
> However, the next step is to use this same interface to possibly bias
> task placement.
> 
> The idea is that:
> 
> - the util_min value can be used to possibly avoid CPUs which have
>   a (maybe temporarily) limited capacity, for example, due to thermal
>   pressure.
> 
> - a util_max value can use used to possibly identify tasks which can
>   be co-scheduled together in a (maybe) limited capacity CPU since
>   they are more likely "less important" tasks.
> 
> Thus, since this is a new user-space API, we would like to find a
> concept which is generic enough to express the current requirement but
> also easily accommodate future extensions.

I'm not sure we can overload the meanings like that on the same
interface.  Right now, it doesn't say anything about bandwidth (or
utilization) allocation.  It just limits the frequency range the
particular cpu that the task ended up on can be in and what you're
describing above is the third different thing.  It doesn't seem clear
that they're something which can be overloaded onto the same
interface.

> > Maybe something like cpu.freq.min/max are better names?
> 
> IMO this is something too much platform specific.
> 
> I agree that utilization is maybe too much an implementation detail,
> but perhaps this can be solved by using a more generic range.
> 
> What about using values in the [0..100] range which define:
> 
>    a percentage of the maximum available capacity
>          for the CPUs in the target system
> 
> Do you think this can work?

Yeah, sure, it's more that right now the intention isn't clear.  A
cgroup control knob which limits cpu frequency range while the cgroup
is on a cpu is a very different thing from a cgroup knob which
restricts what tasks can be scheduled on the same cpu.  They're
actually incompatible.  Doing the latter actively breaks the former.

> > This is a problem which exists for all other interfaces.  For
> > historical and other reasons, at least till now, we've opted to put
> > everything at system level outside of cgroup interface.  We might
> > change this in the future and duplicate system-level information and
> > interfaces in the root cgroup but we wanna do that in a more systemtic
> > fashion than adding an one-off knob in the cgroup root.
> 
> I see, I think we can easily come up with a procfs/sysfs interface
> usable to define system-wide values.
> 
> Any suggestion for something already existing which I can use as a
> reference?

Most system level interfaces are there with a long history and things
aren't that consistent.  One route could be finding an interface
implementing a nearby feature and staying consistent with that.

> > Tying creation / config operations to the config propagation doesn't
> > work well with delegation and is inconsistent with what other
> > controllers are doing.  For cases where the propagated config being
> > visible in a sub cgroup is necessary, please add .effective files.
> 
> I'm not sure to understand this point: you mean that we should not
> enforce "consistency rules" among parent-child groups?

You should.  It just shouldn't make configurations fail, because that
ends up breaking delegation.

> I have to look better into this "effective" concept.
> Meanwhile, can you make a simple example?

There's a recent cpuset patchset posted by Waiman Long.  Googling for
lkml cpuset and Waiman Long should find it easily.

> > > Tasks on a subgroup can only be more boosted and/or capped, which is
> > 
> > Less boosted.  .low at a parent level must set the upper bound of .low
> > that all its descendants can have.
> 
> Is that a mandatory requirement? Or based on a proper justification
> you can also accept what I'm proposing?
>
> I've always been more of the idea that what I'm proposing could make
> more sense for a general case but perhaps I just need to go back and
> better check the use-cases we have on hand to see if it's really
> required or not.

Yeah, I think we want to stick to that semantics.  That's what memory
controller does and it'd be really confusing to flip the directions on
different controllers.

Thanks.
Joel Fernandes April 21, 2018, 9:08 p.m. UTC | #4
Hi Tejun,

On Tue, Apr 10, 2018 at 1:05 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Tue, Apr 10, 2018 at 06:16:12PM +0100, Patrick Bellasi wrote:
>> > I'm not too enthusiastic about util_min/max given that it can easily
>> > be read as actual utilization based bandwidth control when what's
>> > actually implemented, IIUC, is affecting CPU frequency selection.
>>
>> Right now we are basically affecting the frequency selection.
>> However, the next step is to use this same interface to possibly bias
>> task placement.
>>
>> The idea is that:
>>
>> - the util_min value can be used to possibly avoid CPUs which have
>>   a (maybe temporarily) limited capacity, for example, due to thermal
>>   pressure.
>>
>> - a util_max value can use used to possibly identify tasks which can
>>   be co-scheduled together in a (maybe) limited capacity CPU since
>>   they are more likely "less important" tasks.
>>
>> Thus, since this is a new user-space API, we would like to find a
>> concept which is generic enough to express the current requirement but
>> also easily accommodate future extensions.
>
> I'm not sure we can overload the meanings like that on the same
> interface.  Right now, it doesn't say anything about bandwidth (or
> utilization) allocation.  It just limits the frequency range the
> particular cpu that the task ended up on can be in and what you're
> describing above is the third different thing.  It doesn't seem clear
> that they're something which can be overloaded onto the same
> interface.

Actually no, it's not about overloading them. What Patrick is
defining here is a property/attribute. What that attribute is used for
(the algorithms that use it) is a different topic. For example, it can
be used by the frequency selection algorithms or by the task placement
algorithm; there are multiple algorithms that can use the property. To
me, this part of the patch makes sense. Maybe it should really be
called "task_size" or something, since that's what it really is.

[...]
>> > > Tasks on a subgroup can only be more boosted and/or capped, which is
>> >
>> > Less boosted.  .low at a parent level must set the upper bound of .low
>> > that all its descendants can have.
>>
>> Is that a mandatory requirement? Or based on a proper justification
>> you can also accept what I'm proposing?
>>
>> I've always been more of the idea that what I'm proposing could make
>> more sense for a general case but perhaps I just need to go back and
>> better check the use-cases we have on hand to see if it's really
>> required or not.
>
> Yeah, I think we want to stick to that semantics.  That's what memory
> controller does and it'd be really confusing to flip the directions on
> different controllers.
>

What about .high? I think there was some confusion about how to
define that for subgroups. It could perhaps be such that the .high of
the parent is the lower bound of the .high on a child, but then I'm not
sure if that fits well with the delegation policies...

thanks,

- Joel
Tejun Heo April 26, 2018, 6:58 p.m. UTC | #5
Hello, Joel.

On Sat, Apr 21, 2018 at 02:08:30PM -0700, Joel Fernandes wrote:
> Actually no, its not about overloading them. What's Patrick is
> defining here is a property/attribute. What that attribute is used for
> (the algorithms that use it) are a different topic. Like, it can be
> used by the frequency selection algorithms or the task placement
> algorithm. There are multiple algorithms that can use the property. To
> me, this part of the patch makes sense. Maybe it should really be
> called "task_size" or something, since that's what it really is.

I understand that the interface can encode certain intentions and then
there can be different strategies to implement that, but the two
things mentioned here seem fundamentally different to declare them to
be two different implementations of the same intention.

> > Yeah, I think we want to stick to that semantics.  That's what memory
> > controller does and it'd be really confusing to flip the directions on
> > different controllers.
> 
> What about the .high ? I think there was some confusion about how to
> define that for subgroups. It could perhaps be such that the .high of
> parent is the lower bound of the .high on child but then I'm not sure
> if that fits well with the delegation policies...

The basic rule is simple.  A child can never obtain more than its
ancestors.

Thanks.

Patch

diff --git a/init/Kconfig b/init/Kconfig
index 977aa4d1e42a..d999879f8625 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -795,6 +795,28 @@  config RT_GROUP_SCHED
 
 endif #CGROUP_SCHED
 
+config UCLAMP_TASK_GROUP
+	bool "Utilization clamping per group of tasks"
+	depends on CGROUP_SCHED
+	depends on UCLAMP_TASK
+	default n
+	help
+	  This feature enables the scheduler to track the clamped utilization
+	  of each CPU, based on the RUNNABLE tasks currently scheduled on that CPU.
+
+	  When this option is enabled, the user can specify a min and max
+	  CPU bandwidth which is allowed for each single task in a group.
+	  The max bandwidth allows clamping of the maximum frequency a task
+	  can use, while the min bandwidth allows the definition of a minimum
+	  frequency a task will always use.
+
+	  When task group based utilization clamping is enabled, any
+	  task-specific clamp value is constrained by the cgroup
+	  specified clamp value. Both minimum and maximum task clamping cannot
+	  be bigger than the corresponding clamping defined at the task group level.
+
+	  If in doubt, say N.
+
 config CGROUP_PIDS
 	bool "PIDs controller"
 	help
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6ee4f380aba6..b8299a4f03e7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1130,8 +1130,22 @@  static inline void uclamp_group_put(int clamp_id, int group_id)
 	raw_spin_unlock_irqrestore(&uc_map[group_id].se_lock, flags);
 }
 
+static inline void uclamp_group_get_tg(struct cgroup_subsys_state *css,
+				       int clamp_id, unsigned int group_id)
+{
+	struct css_task_iter it;
+	struct task_struct *p;
+
+	/* Update clamp groups for RUNNABLE tasks in this TG */
+	css_task_iter_start(css, 0, &it);
+	while ((p = css_task_iter_next(&it)))
+		uclamp_task_update_active(p, clamp_id, group_id);
+	css_task_iter_end(&it);
+}
+
 /**
  * uclamp_group_get: increase the reference count for a clamp group
+ * @css: reference to the task group to account
  * @clamp_id: the clamp index affected by the task group
  * @uc_se: the utilization clamp data for the task group
  * @clamp_value: the new clamp value for the task group
@@ -1145,6 +1159,7 @@  static inline void uclamp_group_put(int clamp_id, int group_id)
  * Return: -ENOSPC if there are not available clamp groups, 0 on success.
  */
 static inline int uclamp_group_get(struct task_struct *p,
+				   struct cgroup_subsys_state *css,
 				   int clamp_id, struct uclamp_se *uc_se,
 				   unsigned int clamp_value)
 {
@@ -1172,8 +1187,13 @@  static inline int uclamp_group_get(struct task_struct *p,
 	uc_map[next_group_id].se_count += 1;
 	raw_spin_unlock_irqrestore(&uc_map[next_group_id].se_lock, flags);
 
+	/* Newly created TGs don't have tasks assigned */
+	if (css)
+		uclamp_group_get_tg(css, clamp_id, next_group_id);
+
 	/* Update current task if task specific clamp has been changed */
-	uclamp_task_update_active(p, clamp_id, next_group_id);
+	if (p)
+		uclamp_task_update_active(p, clamp_id, next_group_id);
 
 	/* Release the previous clamp group */
 	uclamp_group_put(clamp_id, prev_group_id);
@@ -1181,6 +1201,103 @@  static inline int uclamp_group_get(struct task_struct *p,
 	return 0;
 }
 
+#ifdef CONFIG_UCLAMP_TASK_GROUP
+/**
+ * init_uclamp_sched_group: initialize data structures required for TG's
+ *                          utilization clamping
+ */
+static inline void init_uclamp_sched_group(void)
+{
+	struct uclamp_map *uc_map;
+	struct uclamp_se *uc_se;
+	int group_id;
+	int clamp_id;
+
+	/* Root TG's are initialized to the first clamp group */
+	group_id = 0;
+
+	/* Initialize root TG's to default (none) clamp values */
+	for (clamp_id = 0; clamp_id < UCLAMP_CNT; ++clamp_id) {
+		uc_map = &uclamp_maps[clamp_id][0];
+
+		/* Map root TG's clamp value */
+		uclamp_group_init(clamp_id, group_id, uclamp_none(clamp_id));
+
+		/* Init root TG's clamp group */
+		uc_se = &root_task_group.uclamp[clamp_id];
+		uc_se->value = uclamp_none(clamp_id);
+		uc_se->group_id = group_id;
+
+		/* Attach root TG's clamp group */
+		uc_map[group_id].se_count = 1;
+	}
+}
+
+/**
+ * alloc_uclamp_sched_group: initialize a new TG for utilization clamping
+ * @tg: the newly created task group
+ * @parent: its parent task group
+ *
+ * A newly created task group inherits its utilization clamp values, for all
+ * clamp indexes, from its parent task group.
+ * This ensures that its values are properly initialized and that the task
+ * group is accounted in the same parent's group index.
+ *
+ * Return: 0 on error, !0 otherwise
+ */
+static inline int alloc_uclamp_sched_group(struct task_group *tg,
+					   struct task_group *parent)
+{
+	struct uclamp_se *uc_se;
+	int clamp_id;
+	int ret = 1;
+
+	for (clamp_id = 0; clamp_id < UCLAMP_CNT; ++clamp_id) {
+		uc_se = &tg->uclamp[clamp_id];
+
+		uc_se->value = parent->uclamp[clamp_id].value;
+		uc_se->group_id = UCLAMP_NONE;
+
+		if (uclamp_group_get(NULL, NULL, clamp_id, uc_se,
+				     parent->uclamp[clamp_id].value)) {
+			ret = 0;
+			goto out;
+		}
+	}
+
+out:
+	return ret;
+}
+
+/**
+ * free_uclamp_sched_group: release utilization clamp references of a TG
+ * @tg: the task group being removed
+ *
+ * An empty task group can be removed only when it has no more tasks or child
+ * groups. This means that we can also safely release all the reference
+ * counting to clamp groups.
+ */
+static inline void free_uclamp_sched_group(struct task_group *tg)
+{
+	struct uclamp_se *uc_se;
+	int clamp_id;
+
+	for (clamp_id = 0; clamp_id < UCLAMP_CNT; ++clamp_id) {
+		uc_se = &tg->uclamp[clamp_id];
+		uclamp_group_put(clamp_id, uc_se->group_id);
+	}
+}
+
+#else /* CONFIG_UCLAMP_TASK_GROUP */
+static inline void init_uclamp_sched_group(void) { }
+static inline void free_uclamp_sched_group(struct task_group *tg) { }
+static inline int alloc_uclamp_sched_group(struct task_group *tg,
+					   struct task_group *parent)
+{
+	return 1;
+}
+#endif /* CONFIG_UCLAMP_TASK_GROUP */
+
 static inline int __setscheduler_uclamp(struct task_struct *p,
 					const struct sched_attr *attr)
 {
@@ -1196,12 +1313,12 @@  static inline int __setscheduler_uclamp(struct task_struct *p,
 
 	/* Update min utilization clamp */
 	uc_se = &p->uclamp[UCLAMP_MIN];
-	retval |= uclamp_group_get(p, UCLAMP_MIN, uc_se,
+	retval |= uclamp_group_get(p, NULL, UCLAMP_MIN, uc_se,
 				   attr->sched_util_min);
 
 	/* Update max utilization clamp */
 	uc_se = &p->uclamp[UCLAMP_MAX];
-	retval |= uclamp_group_get(p, UCLAMP_MAX, uc_se,
+	retval |= uclamp_group_get(p, NULL, UCLAMP_MAX, uc_se,
 				   attr->sched_util_max);
 
 	mutex_unlock(&uclamp_mutex);
@@ -1243,10 +1360,18 @@  static inline void init_uclamp(void)
 			memset(uc_cpu, UCLAMP_NONE, sizeof(struct uclamp_cpu));
 		}
 	}
+
+	init_uclamp_sched_group();
 }
 
 #else /* CONFIG_UCLAMP_TASK */
 static inline void uclamp_task_update(struct rq *rq, struct task_struct *p) { }
+static inline void free_uclamp_sched_group(struct task_group *tg) { }
+static inline int alloc_uclamp_sched_group(struct task_group *tg,
+					   struct task_group *parent)
+{
+	return 1;
+}
 static inline int __setscheduler_uclamp(struct task_struct *p,
 					const struct sched_attr *attr)
 {
@@ -6823,6 +6948,7 @@  static DEFINE_SPINLOCK(task_group_lock);
 
 static void sched_free_group(struct task_group *tg)
 {
+	free_uclamp_sched_group(tg);
 	free_fair_sched_group(tg);
 	free_rt_sched_group(tg);
 	autogroup_free(tg);
@@ -6844,6 +6970,9 @@  struct task_group *sched_create_group(struct task_group *parent)
 	if (!alloc_rt_sched_group(tg, parent))
 		goto err;
 
+	if (!alloc_uclamp_sched_group(tg, parent))
+		goto err;
+
 	return tg;
 
 err:
@@ -7064,6 +7193,130 @@  static void cpu_cgroup_attach(struct cgroup_taskset *tset)
 		sched_move_task(task);
 }
 
+#ifdef CONFIG_UCLAMP_TASK_GROUP
+static int cpu_util_min_write_u64(struct cgroup_subsys_state *css,
+				  struct cftype *cftype, u64 min_value)
+{
+	struct cgroup_subsys_state *pos;
+	struct uclamp_se *uc_se;
+	struct task_group *tg;
+	int ret = -EINVAL;
+
+	if (min_value > SCHED_CAPACITY_SCALE)
+		return ret;
+
+	mutex_lock(&uclamp_mutex);
+	rcu_read_lock();
+
+	tg = css_tg(css);
+
+	/* Already at the required value */
+	if (tg->uclamp[UCLAMP_MIN].value == min_value) {
+		ret = 0;
+		goto out;
+	}
+
+	/* Ensure to not exceed the maximum clamp value */
+	if (tg->uclamp[UCLAMP_MAX].value < min_value)
+		goto out;
+
+	/* Ensure min clamp fits within parent's clamp value */
+	if (tg->parent &&
+	    tg->parent->uclamp[UCLAMP_MIN].value > min_value)
+		goto out;
+
+	/* Ensure each child is a restriction of this TG */
+	css_for_each_child(pos, css) {
+		if (css_tg(pos)->uclamp[UCLAMP_MIN].value < min_value)
+			goto out;
+	}
+
+	/* Update TG's reference count */
+	uc_se = &tg->uclamp[UCLAMP_MIN];
+	ret = uclamp_group_get(NULL, css, UCLAMP_MIN, uc_se, min_value);
+
+out:
+	rcu_read_unlock();
+	mutex_unlock(&uclamp_mutex);
+
+	return ret;
+}
+
+static int cpu_util_max_write_u64(struct cgroup_subsys_state *css,
+				  struct cftype *cftype, u64 max_value)
+{
+	struct cgroup_subsys_state *pos;
+	struct uclamp_se *uc_se;
+	struct task_group *tg;
+	int ret = -EINVAL;
+
+	if (max_value > SCHED_CAPACITY_SCALE)
+		return ret;
+
+	mutex_lock(&uclamp_mutex);
+	rcu_read_lock();
+
+	tg = css_tg(css);
+
+	/* Already at the required value */
+	if (tg->uclamp[UCLAMP_MAX].value == max_value) {
+		ret = 0;
+		goto out;
+	}
+
+	/* Ensure to not go below the minimum clamp value */
+	if (tg->uclamp[UCLAMP_MIN].value > max_value)
+		goto out;
+
+	/* Ensure max clamp fits within parent's clamp value */
+	if (tg->parent &&
+	    tg->parent->uclamp[UCLAMP_MAX].value < max_value)
+		goto out;
+
+	/* Ensure each child is a restriction of this TG */
+	css_for_each_child(pos, css) {
+		if (css_tg(pos)->uclamp[UCLAMP_MAX].value > max_value)
+			goto out;
+	}
+
+	/* Update TG's reference count */
+	uc_se = &tg->uclamp[UCLAMP_MAX];
+	ret = uclamp_group_get(NULL, css, UCLAMP_MAX, uc_se, max_value);
+
+out:
+	rcu_read_unlock();
+	mutex_unlock(&uclamp_mutex);
+
+	return ret;
+}
+
+static inline u64 cpu_uclamp_read(struct cgroup_subsys_state *css,
+				  enum uclamp_id clamp_id)
+{
+	struct task_group *tg;
+	u64 util_clamp;
+
+	rcu_read_lock();
+	tg = css_tg(css);
+	util_clamp = tg->uclamp[clamp_id].value;
+	rcu_read_unlock();
+
+	return util_clamp;
+}
+
+static u64 cpu_util_min_read_u64(struct cgroup_subsys_state *css,
+				 struct cftype *cft)
+{
+	return cpu_uclamp_read(css, UCLAMP_MIN);
+}
+
+static u64 cpu_util_max_read_u64(struct cgroup_subsys_state *css,
+				 struct cftype *cft)
+{
+	return cpu_uclamp_read(css, UCLAMP_MAX);
+}
+#endif /* CONFIG_UCLAMP_TASK_GROUP */
+
 #ifdef CONFIG_FAIR_GROUP_SCHED
 static int cpu_shares_write_u64(struct cgroup_subsys_state *css,
 				struct cftype *cftype, u64 shareval)
@@ -7391,6 +7644,18 @@  static struct cftype cpu_legacy_files[] = {
 		.read_u64 = cpu_rt_period_read_uint,
 		.write_u64 = cpu_rt_period_write_uint,
 	},
+#endif
+#ifdef CONFIG_UCLAMP_TASK_GROUP
+	{
+		.name = "util_min",
+		.read_u64 = cpu_util_min_read_u64,
+		.write_u64 = cpu_util_min_write_u64,
+	},
+	{
+		.name = "util_max",
+		.read_u64 = cpu_util_max_read_u64,
+		.write_u64 = cpu_util_max_write_u64,
+	},
 #endif
 	{ }	/* Terminate */
 };
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 25c2011ecc41..a91b9cd162a3 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -388,6 +388,11 @@  struct task_group {
 #endif
 
 	struct cfs_bandwidth	cfs_bandwidth;
+
+#ifdef CONFIG_UCLAMP_TASK_GROUP
+	struct			uclamp_se uclamp[UCLAMP_CNT];
+#endif
+
 };
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
@@ -460,6 +465,22 @@  struct uclamp_cpu {
 	struct uclamp_group group[CONFIG_UCLAMP_GROUPS_COUNT + 1];
 };
 
+/**
+ * uclamp_none: default value for a clamp
+ *
+ * This returns the default value for each clamp
+ * - 0 for a min utilization clamp
+ * - SCHED_CAPACITY_SCALE for a max utilization clamp
+ *
+ * Return: the default value for a given utilization clamp
+ */
+static inline unsigned int uclamp_none(int clamp_id)
+{
+	if (clamp_id == UCLAMP_MIN)
+		return 0;
+	return SCHED_CAPACITY_SCALE;
+}
+
 /**
  * uclamp_task_affects: check if a task affects a utilization clamp
  * @p: the task to consider