[v2,3/3] sched/fair: schedutil: explicit update only when required

Schedutil updates for FAIR tasks are triggered implicitly each time a
cfs_rq's utilization is updated via cfs_rq_util_change(), currently
called by update_cfs_rq_load_avg(), when the utilization of a cfs_rq has
changed, and {attach,detach}_entity_load_avg().

This design is based on the idea that "we should callback schedutil
frequently enough" to properly update the CPU frequency at every
utilization change. However, such an integration strategy also has
some downsides:

 - schedutil updates are triggered by RQ's load updates, which makes
   sense in general but does not tell us exactly which other
   RQ-related information has been updated.
   Recently, for example, we had issues due to schedutil dependencies on
   cfs_rq->h_nr_running and estimated utilization updates.

 - cfs_rq_util_change() is mainly a wrapper function for an already
   existing "public API", cpufreq_update_util(), which is required
   just to ensure we actually update schedutil only when we are updating
   a root cfs_rq.
   Thus, especially when task groups are in use, most of the calls to
   this wrapper function are not required.

 - the usage of a wrapper function is not completely consistent across
   fair.c, since we could still need additional explicit calls to
   cpufreq_update_util().
   For example, this already happens to report the IOWAIT boost flag in
   the wakeup path.

 - it makes it hard to integrate new features, since doing so could
   require changing other function prototypes just to pass in an
   additional flag, as happened for example in commit:

      commit ea14b57e8a18 ("sched/cpufreq: Provide migration hint")
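
To illustrate the second point above, here is a minimal userspace sketch
of the root-only filtering done by the wrapper. It is a model, not the
kernel code: the struct layouts, the freq_updates counter and the
demo_wrapper_calls() helper are assumptions made for illustration, and
only the root cfs_rq check is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
struct cfs_rq;

struct rq {
	struct cfs_rq *root;		/* stands in for &rq->cfs */
	unsigned int freq_updates;	/* hypothetical: counts governor callbacks */
};

struct cfs_rq {
	struct rq *rq;			/* owning runqueue, as rq_of() would return */
};

/* Stand-in for the public API: just count the calls. */
static void cpufreq_update_util(struct rq *rq, unsigned int flags)
{
	(void)flags;
	rq->freq_updates++;
}

/* Modelled wrapper: forward to the governor only for the root cfs_rq. */
static void cfs_rq_util_change(struct cfs_rq *cfs_rq, unsigned int flags)
{
	assert(cfs_rq != NULL && cfs_rq->rq != NULL);
	if (cfs_rq->rq->root == cfs_rq)
		cpufreq_update_util(cfs_rq->rq, flags);
}

/*
 * Hypothetical helper: call the wrapper once per level of a three-level
 * task group hierarchy and return how many governor updates resulted.
 */
static unsigned int demo_wrapper_calls(void)
{
	struct rq rq = { 0 };
	struct cfs_rq root = { .rq = &rq };
	struct cfs_rq group = { .rq = &rq };
	struct cfs_rq child = { .rq = &rq };

	rq.root = &root;
	cfs_rq_util_change(&child, 0);	/* filtered out */
	cfs_rq_util_change(&group, 0);	/* filtered out */
	cfs_rq_util_change(&root, 0);	/* reaches the governor */
	return rq.freq_updates;
}
```

In this model, two of the three wrapper calls do nothing, which is the
overhead the second point refers to when task groups are in use.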

All the above considered, let's make schedutil updates more explicit in
fair.c by removing the cfs_rq_util_change() wrapper function in favour
of the existing cpufreq_update_util() public API.
This can be done by calling cpufreq_update_util() explicitly in the few
call sites where it really makes sense and when all the (potentially)
required cfs_rq information has been updated.

This patch mainly removes code and adds explicit schedutil updates
only when we:
 - {enqueue,dequeue}_task_fair() a task to/from the root cfs_rq
 - (un)throttle_cfs_rq() a set of tasks up to the root cfs_rq
 - task_tick_fair() to update the utilization of the root cfs_rq

All the other code paths, currently _indirectly_ covered by a call to
update_load_avg(), are still covered. Indeed, some paths already imply
enqueue/dequeue calls:
 - switch_{to,from}_fair()
 - sched_move_task()
while others are followed by enqueue/dequeue calls:
 - cpu_cgroup_fork() and
   post_init_entity_util_avg():
     are used at wake_up_new_task() time and thus already followed by an
     enqueue_task_fair()
 - migrate_task_rq_fair():
     updates the removed utilization but not the actual cfs_rq
     utilization, which is updated by a following sched event

This new proposal also allows better aggregation of schedutil related
flags, which are required only at enqueue_task_fair() time.
IOWAIT and MIGRATION flags are now requested only when a task is
actually visible at the root cfs_rq level.
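
The flag aggregation described above could be sketched as follows. This
is a simplified userspace model under stated assumptions, not the code
in this patch: the flag values, the struct layouts and the
enqueue_task_model() and demo_enqueue_flags() helpers are hypothetical
and exist only to show the idea of a single update, issued once the task
is visible at the root cfs_rq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; the real ones live in the kernel headers. */
#define SCHED_CPUFREQ_IOWAIT	(1U << 0)
#define SCHED_CPUFREQ_MIGRATION	(1U << 1)

struct cfs_rq {
	struct cfs_rq *parent;		/* NULL for the root cfs_rq */
	bool throttled;
};

struct rq {
	unsigned int updates;		/* hypothetical: governor callbacks seen */
	unsigned int last_flags;	/* hypothetical: flags of the last callback */
};

struct task {
	bool in_iowait;
	bool migrated;
};

/* Stand-in for the public API: record what the governor would see. */
static void cpufreq_update_util(struct rq *rq, unsigned int flags)
{
	rq->updates++;
	rq->last_flags = flags;
}

/*
 * Modelled enqueue: walk up from the leaf cfs_rq; if any level is
 * throttled the task is not visible at the root, so no update and no
 * flags are reported. Otherwise aggregate the flags into one update.
 */
static bool enqueue_task_model(struct rq *rq, struct cfs_rq *leaf,
			       const struct task *p)
{
	struct cfs_rq *cfs_rq;
	unsigned int flags = 0;

	assert(rq != NULL && leaf != NULL && p != NULL);
	for (cfs_rq = leaf; cfs_rq != NULL; cfs_rq = cfs_rq->parent) {
		if (cfs_rq->throttled)
			return false;
	}
	if (p->in_iowait)
		flags |= SCHED_CPUFREQ_IOWAIT;
	if (p->migrated)
		flags |= SCHED_CPUFREQ_MIGRATION;
	cpufreq_update_util(rq, flags);
	return true;
}

/* Hypothetical helper: returns the reported flags, or -1 if no update. */
static int demo_enqueue_flags(bool group_throttled)
{
	struct rq rq = { 0 };
	struct cfs_rq root = { .parent = NULL, .throttled = false };
	struct cfs_rq group = { .parent = &root, .throttled = group_throttled };
	struct task p = { .in_iowait = true, .migrated = true };

	if (!enqueue_task_model(&rq, &group, &p))
		return -1;
	return (int)rq.last_flags;
}
```

In this model, a task enqueued into a throttled group triggers no update
at all, so neither flag reaches the governor until the task becomes
visible at the root level.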

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org

---

Changes in v2:
 - fixed flags masking (Peter)
 - fixed !CONFIG_SMP build (0-DAY)
   by using a cpufreq_enqueue wrapper to avoid setting the
   SCHED_CPUFREQ_MIGRATION flag on !SMP systems
 - removed blank lines (Viresh)

NOTE: this patch changes the behavior of the IOWAIT flag: in case of a
task waking up on a throttled RQ we do not assert the flag to schedutil
anymore. However, this seems to make sense since the task will not be
running anyway.
---
 kernel/sched/fair.c | 90 +++++++++++++++++++++++++----------------------------
 1 file changed, 43 insertions(+), 47 deletions(-)
