diff mbox series

[1/4] sched: consolidate and cleanup access to CPU's max compute capacity

Message ID 20230901130312.247719-2-vincent.guittot@linaro.org (mailing list archive)
State Superseded
Headers show
Series consolidate and cleanup CPU capacity | expand

Checks

Context Check Description
conchuod/cover_letter success Series has a cover letter
conchuod/tree_selection success Guessed tree name to be for-next at HEAD 9a1d204f5c57
conchuod/fixes_present success Fixes tag not required for -next series
conchuod/maintainers_pattern success MAINTAINERS pattern errors before the patch: 2 and now 2
conchuod/verify_signedoff success Signed-off-by tag matches author and committer
conchuod/kdoc success Errors and warnings before: 2 this patch: 2
conchuod/build_rv64_clang_allmodconfig success Errors and warnings before: 12 this patch: 12
conchuod/module_param success Was 0 now: 0
conchuod/build_rv64_gcc_allmodconfig success Errors and warnings before: 10 this patch: 10
conchuod/build_rv32_defconfig success Build OK
conchuod/dtb_warn_rv64 success Errors and warnings before: 39 this patch: 39
conchuod/header_inline success No static functions without inline keyword in header files
conchuod/checkpatch success total: 0 errors, 0 warnings, 0 checks, 147 lines checked
conchuod/build_rv64_nommu_k210_defconfig success Build OK
conchuod/verify_fixes success No Fixes tag
conchuod/build_rv64_nommu_virt_defconfig success Build OK

Commit Message

Vincent Guittot Sept. 1, 2023, 1:03 p.m. UTC
Remove struct rq cpu_capacity_orig field and use arch_scale_cpu_capacity()
instead.

Scheduler uses 3 methods to get access to the CPU's max compute capacity:
- arch_scale_cpu_capacity(cpu) which is the default way to get CPU's capacity.
- cpu_capacity_orig field which is periodically updated with
  arch_scale_cpu_capacity().
- capacity_orig_of(cpu) which encapsulates rq->cpu_capacity_orig

There is no real need to save the value returned by arch_scale_cpu_capacity()
in struct rq. arch_scale_cpu_capacity() returns:
- either a per_cpu variable.
- or a const value for systems which have only one capacity.

Remove cpu_capacity_orig and use arch_scale_cpu_capacity() everywhere.

No functional changes.

some tests of Arm64:
small SMP device (hikey): no noticeable changes
HMP device (RB5): hackbench shows minor improvement (1-2%)
large smp (thx2): hackbench and tbench shows minor improvement (1%)

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/core.c        |  2 +-
 kernel/sched/cpudeadline.c |  2 +-
 kernel/sched/deadline.c    |  4 ++--
 kernel/sched/fair.c        | 18 ++++++++----------
 kernel/sched/rt.c          |  2 +-
 kernel/sched/sched.h       |  6 ------
 kernel/sched/topology.c    |  7 +++++--
 7 files changed, 18 insertions(+), 23 deletions(-)

Comments

Peter Zijlstra Sept. 5, 2023, 11:25 a.m. UTC | #1
On Fri, Sep 01, 2023 at 03:03:09PM +0200, Vincent Guittot wrote:
> Remove struct rq cpu_capacity_orig field and use arch_scale_cpu_capacity()
> instead.
> 
> Scheduler uses 3 methods to get access to the CPU's max compute capacity:
> - arch_scale_cpu_capacity(cpu) which is the default way to get CPU's capacity.
> - cpu_capacity_orig field which is periodically updated with
>   arch_scale_cpu_capacity().
> - capacity_orig_of(cpu) which encapsulates rq->cpu_capacity_orig
> 
> There is no real need to save the value returned by arch_scale_cpu_capacity()
> in struct rq. arch_scale_cpu_capacity() returns:
> - either a per_cpu variable.
> - or a const value for systems which have only one capacity.
> 
> Remove cpu_capacity_orig and use arch_scale_cpu_capacity() everywhere.
> 
> No functional changes.

I think the original thinking was that we wouldn't know how expensive
the function call would end up being, but yeah, given how things stand
this is a nice cleanup.
Dietmar Eggemann Sept. 14, 2023, 8:45 p.m. UTC | #2
On 01/09/2023 15:03, Vincent Guittot wrote:
> Remove struct rq cpu_capacity_orig field and use arch_scale_cpu_capacity()
> instead.
> 
> Scheduler uses 3 methods to get access to the CPU's max compute capacity:
> - arch_scale_cpu_capacity(cpu) which is the default way to get CPU's capacity.
> - cpu_capacity_orig field which is periodically updated with
>   arch_scale_cpu_capacity().
> - capacity_orig_of(cpu) which encapsulates rq->cpu_capacity_orig
> 
> There is no real need to save the value returned by arch_scale_cpu_capacity()
> in struct rq. arch_scale_cpu_capacity() returns:
> - either a per_cpu variable.
> - or a const value for systems which have only one capacity.
> 
> Remove cpu_capacity_orig and use arch_scale_cpu_capacity() everywhere.
> 
> No functional changes.
> 
> some tests of Arm64:
> small SMP device (hikey): no noticeable changes
> HMP device (RB5): hackbench shows minor improvement (1-2%)
> large smp (thx2): hackbench and tbench shows minor improvement (1%)
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>

Next to util_fits_cpu() which uses capacity_orig as a local variable
(which is fine) there is sis() referring to capacity_orig in a comment.

Documentation/scheduler/sched-capacity.rst uses the term `capacity_orig`
in chapter 1.2 to explain the difference between CPU's maximum
(attainable) capacity and capacity as the former reduced by pressure.

Not sure if you want to change those refs as well with this patch?
People might get confused about the term `capacity_orig` pretty soon.

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Vincent Guittot Sept. 15, 2023, 1:20 p.m. UTC | #3
On Thu, 14 Sept 2023 at 22:46, Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> On 01/09/2023 15:03, Vincent Guittot wrote:
> > Remove struct rq cpu_capacity_orig field and use arch_scale_cpu_capacity()
> > instead.
> >
> > Scheduler uses 3 methods to get access to the CPU's max compute capacity:
> > - arch_scale_cpu_capacity(cpu) which is the default way to get CPU's capacity.
> > - cpu_capacity_orig field which is periodically updated with
> >   arch_scale_cpu_capacity().
> > - capacity_orig_of(cpu) which encapsulates rq->cpu_capacity_orig
> >
> > There is no real need to save the value returned by arch_scale_cpu_capacity()
> > in struct rq. arch_scale_cpu_capacity() returns:
> > - either a per_cpu variable.
> > - or a const value for systems which have only one capacity.
> >
> > Remove cpu_capacity_orig and use arch_scale_cpu_capacity() everywhere.
> >
> > No functional changes.
> >
> > some tests of Arm64:
> > small SMP device (hikey): no noticeable changes
> > HMP device (RB5): hackbench shows minor improvement (1-2%)
> > large smp (thx2): hackbench and tbench shows minor improvement (1%)
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>
> Next to util_fits_cpu() which uses capacity_orig as a local variable
> (which is fine) there is sis() referring to capacity_orig in a comment.
>
> Documentation/scheduler/sched-capacity.rst uses the term `capacity_orig`
> in chapter 1.2 to explain the difference between CPU's maximum
> (attainable) capacity and capacity as the former reduced by pressure.

ok, I will have a look at those references to capacity_orig

>
> Not sure if you want to change those refs as well with this patch?
> People might get confused about the term `capacity_orig` pretty soon.
>
> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>

Thanks
diff mbox series

Patch

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index efe3848978a0..6560392f2f83 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10014,7 +10014,7 @@  void __init sched_init(void)
 #ifdef CONFIG_SMP
 		rq->sd = NULL;
 		rq->rd = NULL;
-		rq->cpu_capacity = rq->cpu_capacity_orig = SCHED_CAPACITY_SCALE;
+		rq->cpu_capacity = SCHED_CAPACITY_SCALE;
 		rq->balance_callback = &balance_push_callback;
 		rq->active_balance = 0;
 		rq->next_balance = jiffies;
diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 57c92d751bcd..95baa12a1029 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -131,7 +131,7 @@  int cpudl_find(struct cpudl *cp, struct task_struct *p,
 			if (!dl_task_fits_capacity(p, cpu)) {
 				cpumask_clear_cpu(cpu, later_mask);
 
-				cap = capacity_orig_of(cpu);
+				cap = arch_scale_cpu_capacity(cpu);
 
 				if (cap > max_cap ||
 				    (cpu == task_cpu(p) && cap == max_cap)) {
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 58b542bf2893..c57ef2e0db41 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -132,7 +132,7 @@  static inline unsigned long __dl_bw_capacity(const struct cpumask *mask)
 	int i;
 
 	for_each_cpu_and(i, mask, cpu_active_mask)
-		cap += capacity_orig_of(i);
+		cap += arch_scale_cpu_capacity(i);
 
 	return cap;
 }
@@ -144,7 +144,7 @@  static inline unsigned long __dl_bw_capacity(const struct cpumask *mask)
 static inline unsigned long dl_bw_capacity(int i)
 {
 	if (!sched_asym_cpucap_active() &&
-	    capacity_orig_of(i) == SCHED_CAPACITY_SCALE) {
+	    arch_scale_cpu_capacity(i) == SCHED_CAPACITY_SCALE) {
 		return dl_bw_cpus(i) << SCHED_CAPACITY_SHIFT;
 	} else {
 		RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held(),
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0b7445cd5af9..06d6d0dde48a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4690,7 +4690,7 @@  static inline void util_est_update(struct cfs_rq *cfs_rq,
 	 * To avoid overestimation of actual task utilization, skip updates if
 	 * we cannot grant there is idle time in this CPU.
 	 */
-	if (task_util(p) > capacity_orig_of(cpu_of(rq_of(cfs_rq))))
+	if (task_util(p) > arch_scale_cpu_capacity(cpu_of(rq_of(cfs_rq))))
 		return;
 
 	/*
@@ -4738,14 +4738,14 @@  static inline int util_fits_cpu(unsigned long util,
 		return fits;
 
 	/*
-	 * We must use capacity_orig_of() for comparing against uclamp_min and
+	 * We must use arch_scale_cpu_capacity() for comparing against uclamp_min and
 	 * uclamp_max. We only care about capacity pressure (by using
 	 * capacity_of()) for comparing against the real util.
 	 *
 	 * If a task is boosted to 1024 for example, we don't want a tiny
 	 * pressure to skew the check whether it fits a CPU or not.
 	 *
-	 * Similarly if a task is capped to capacity_orig_of(little_cpu), it
+	 * Similarly if a task is capped to arch_scale_cpu_capacity(little_cpu), it
 	 * should fit a little cpu even if there's some pressure.
 	 *
 	 * Only exception is for thermal pressure since it has a direct impact
@@ -4757,7 +4757,7 @@  static inline int util_fits_cpu(unsigned long util,
 	 * For uclamp_max, we can tolerate a drop in performance level as the
 	 * goal is to cap the task. So it's okay if it's getting less.
 	 */
-	capacity_orig = capacity_orig_of(cpu);
+	capacity_orig = arch_scale_cpu_capacity(cpu);
 	capacity_orig_thermal = capacity_orig - arch_scale_thermal_pressure(cpu);
 
 	/*
@@ -7226,7 +7226,7 @@  select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
 		 * Look for the CPU with best capacity.
 		 */
 		else if (fits < 0)
-			cpu_cap = capacity_orig_of(cpu) - thermal_load_avg(cpu_rq(cpu));
+			cpu_cap = arch_scale_cpu_capacity(cpu) - thermal_load_avg(cpu_rq(cpu));
 
 		/*
 		 * First, select CPU which fits better (-1 being better than 0).
@@ -7468,7 +7468,7 @@  cpu_util(int cpu, struct task_struct *p, int dst_cpu, int boost)
 		util = max(util, util_est);
 	}
 
-	return min(util, capacity_orig_of(cpu));
+	return min(util, arch_scale_cpu_capacity(cpu));
 }
 
 unsigned long cpu_util_cfs(int cpu)
@@ -9251,8 +9251,6 @@  static void update_cpu_capacity(struct sched_domain *sd, int cpu)
 	unsigned long capacity = scale_rt_capacity(cpu);
 	struct sched_group *sdg = sd->groups;
 
-	cpu_rq(cpu)->cpu_capacity_orig = arch_scale_cpu_capacity(cpu);
-
 	if (!capacity)
 		capacity = 1;
 
@@ -9328,7 +9326,7 @@  static inline int
 check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
 {
 	return ((rq->cpu_capacity * sd->imbalance_pct) <
-				(rq->cpu_capacity_orig * 100));
+				(arch_scale_cpu_capacity(cpu_of(rq)) * 100));
 }
 
 /*
@@ -9339,7 +9337,7 @@  check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
 static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
 {
 	return rq->misfit_task_load &&
-		(rq->cpu_capacity_orig < rq->rd->max_cpu_capacity ||
+		(arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
 		 check_cpu_capacity(rq, sd));
 }
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0597ba0f85ff..8f4e8db6e234 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -515,7 +515,7 @@  static inline bool rt_task_fits_capacity(struct task_struct *p, int cpu)
 	min_cap = uclamp_eff_value(p, UCLAMP_MIN);
 	max_cap = uclamp_eff_value(p, UCLAMP_MAX);
 
-	cpu_cap = capacity_orig_of(cpu);
+	cpu_cap = arch_scale_cpu_capacity(cpu);
 
 	return cpu_cap >= min(min_cap, max_cap);
 }
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 3a01b7a2bf66..17ae151e90c0 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1048,7 +1048,6 @@  struct rq {
 	struct sched_domain __rcu	*sd;
 
 	unsigned long		cpu_capacity;
-	unsigned long		cpu_capacity_orig;
 
 	struct balance_callback *balance_callback;
 
@@ -2974,11 +2973,6 @@  static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
 #endif
 
 #ifdef CONFIG_SMP
-static inline unsigned long capacity_orig_of(int cpu)
-{
-	return cpu_rq(cpu)->cpu_capacity_orig;
-}
-
 /**
  * enum cpu_util_type - CPU utilization type
  * @FREQUENCY_UTIL:	Utilization used to select frequency
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 05a5bc678c08..e6b0b6a8e60a 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2479,12 +2479,15 @@  build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	/* Attach the domains */
 	rcu_read_lock();
 	for_each_cpu(i, cpu_map) {
+		unsigned long capacity;
+
 		rq = cpu_rq(i);
 		sd = *per_cpu_ptr(d.sd, i);
 
+		capacity = arch_scale_cpu_capacity(i);
 		/* Use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing: */
-		if (rq->cpu_capacity_orig > READ_ONCE(d.rd->max_cpu_capacity))
-			WRITE_ONCE(d.rd->max_cpu_capacity, rq->cpu_capacity_orig);
+		if (capacity > READ_ONCE(d.rd->max_cpu_capacity))
+			WRITE_ONCE(d.rd->max_cpu_capacity, capacity);
 
 		cpu_attach_domain(sd, d.rd, i);
 	}