Message ID | 20201203141124.7391-5-mgorman@techsingularity.net
---|---
State | New, archived
Series | Reduce time complexity of select_idle_sibling
On Thu, 3 Dec 2020 at 15:11, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> select_idle_core is called when SMT is active and there is likely a free
> core available. It may find idle CPUs but this information is simply
> discarded and the scan starts over again with select_idle_cpu.
>
> This patch caches information on idle CPUs found during the search for
> a core and uses one if no core is found. This is a tradeoff. There may
> be a slight impact when utilisation is low and an idle core can be
> found quickly. It provides improvements as the number of busy CPUs
> approaches 50% of the domain size when SMT is enabled.
>
> With tbench on a 2-socket CascadeLake machine, 80 logical CPUs, HT enabled
>
>                           5.10.0-rc6             5.10.0-rc6
>                            schedstat          idlecandidate
> Hmean     1        500.06 (   0.00%)      505.67 *   1.12%*
> Hmean     2        975.90 (   0.00%)      974.06 *  -0.19%*
> Hmean     4       1902.95 (   0.00%)     1904.43 *   0.08%*
> Hmean     8       3761.73 (   0.00%)     3721.02 *  -1.08%*
> Hmean     16      6713.93 (   0.00%)     6769.17 *   0.82%*
> Hmean     32     10435.31 (   0.00%)    10312.58 *  -1.18%*
> Hmean     64     12325.51 (   0.00%)    13792.01 *  11.90%*
> Hmean     128    21225.21 (   0.00%)    20963.44 *  -1.23%*
> Hmean     256    20532.83 (   0.00%)    20335.62 *  -0.96%*
> Hmean     320    20334.81 (   0.00%)    20147.25 *  -0.92%*
>
> In this particular test, the cost/benefit is marginal except
> for 64 which was a point where the machine was over 50% busy
> but not fully utilised.
>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  kernel/sched/fair.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index fc48cc99b03d..845bc0cd9158 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6066,6 +6066,7 @@ void __update_idle_core(struct rq *rq)
>   */
>  static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
>  {
> +        int idle_candidate = -1;
>          struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>          int core, cpu;
>
> @@ -6084,7 +6085,13 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
>                  schedstat_inc(this_rq()->sis_scanned);
>                  if (!available_idle_cpu(cpu)) {
>                          idle = false;
> -                        break;
> +                        if (idle_candidate != -1)
> +                                break;

If I get your changes correctly, it will now continue to loop on all
cpus of the smt mask to try to find an idle cpu whereas it was
breaking before as soon as a cpu was not idle.

In fact, I thought that you just wanted to be opportunistic and save a
candidate but without looping more cpus than currently.

With the change above you might end up looping all cpus of llc if
there is only one idle cpu in the llc whereas before we were looping
only 1 cpu per core at most. The bottom change makes sense but the
above one is in some way completely replacing select_idle_cpu and
bypassing SIS_PROP and we should avoid that IMO

> +                }
> +
> +                if (idle_candidate == -1 &&
> +                    cpumask_test_cpu(cpu, p->cpus_ptr)) {
> +                        idle_candidate = cpu;
>                  }
>          }
>
> @@ -6099,7 +6106,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
>           */
>          set_idle_cores(target, 0);
>
> -        return -1;
> +        return idle_candidate;
>  }
>
>  /*
> --
> 2.26.2
>
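To put rough numbers on the looping concern raised above, the sketch below is a small, self-contained userspace model, not kernel code. The core/SMT layout, the cpu_idle[] array and the scan_llc() helper are invented for illustration, the task's affinity mask is ignored, and only idle CPUs are cached as candidates (the apparent intent of the patch). It counts how many CPUs each loop variant visits when exactly one CPU in the LLC is idle.

/*
 * Toy model of the scan-count concern, assuming only idle CPUs are meant
 * to be cached. "Original" breaks out of a core's SMT mask on the first
 * busy CPU; "as posted" keeps walking the SMT mask until a candidate has
 * been cached, which in the worst case touches every CPU in the LLC.
 */
#include <stdio.h>
#include <stdbool.h>

#define NR_CORES        20      /* cores per LLC */
#define SMT             2       /* SMT siblings per core */
#define NR_CPUS         (NR_CORES * SMT)

static bool cpu_idle[NR_CPUS];

static int scan_llc(bool posted_variant, int *visited)
{
        int candidate = -1;

        for (int core = 0; core < NR_CORES; core++) {
                bool idle = true;

                for (int i = 0; i < SMT; i++) {
                        int cpu = core * SMT + i;

                        (*visited)++;
                        if (!cpu_idle[cpu]) {
                                idle = false;
                                /* as posted: only break once a candidate is cached */
                                if (!posted_variant || candidate != -1)
                                        break;
                        }
                        if (posted_variant && candidate == -1 && cpu_idle[cpu])
                                candidate = cpu;
                }
                if (idle)
                        return core * SMT;      /* fully idle core found */
        }
        return candidate;
}

int main(void)
{
        int visited;

        cpu_idle[NR_CPUS - 1] = true;   /* only the very last CPU is idle */

        visited = 0;
        scan_llc(false, &visited);
        printf("original loop:  %d CPUs visited\n", visited);

        visited = 0;
        scan_llc(true, &visited);
        printf("as posted:      %d CPUs visited\n", visited);
        return 0;
}

With 20 cores and 2 siblings per core, the original loop gives up after 20 visits (one busy sibling per core), while the posted variant walks all 40 CPUs before returning the cached sibling, effectively doing select_idle_cpu's job without its SIS_PROP scan limit, which is the objection above.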
On Thu, Dec 03, 2020 at 05:35:29PM +0100, Vincent Guittot wrote:
> > index fc48cc99b03d..845bc0cd9158 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6066,6 +6066,7 @@ void __update_idle_core(struct rq *rq)
> >   */
> >  static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
> >  {
> > +        int idle_candidate = -1;
> >          struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> >          int core, cpu;
> >
> > @@ -6084,7 +6085,13 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
> >                  schedstat_inc(this_rq()->sis_scanned);
> >                  if (!available_idle_cpu(cpu)) {
> >                          idle = false;
> > -                        break;
> > +                        if (idle_candidate != -1)
> > +                                break;
> >
> If I get your changes correctly, it will now continue to loop on all
> cpus of the smt mask to try to find an idle cpu whereas it was

That was an oversight; the intent is that the SMT search breaks but the
search for an idle core continues. The patch was taken from a very
different series that unified all the select_idle_* functions as a
single function and I failed to fix it up properly. The unification
series didn't generate good results back 9 months ago when I tried and
I never finished it off. In the current context, it would not make
sense to try a unification again.

> With the change above you might end up looping all cpus of llc if
> there is only one idle cpu in the llc whereas before we were looping
> only 1 cpu per core at most. The bottom change makes sense but the
> above one is in some way completely replacing select_idle_cpu and
> bypassing SIS_PROP and we should avoid that IMO
>

You're right of course, it was never intended to behave like that.

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0a3d338770c4..49b1590e60a9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6084,8 +6084,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
         for_each_cpu(cpu, cpu_smt_mask(core)) {
                 if (!available_idle_cpu(cpu)) {
                         idle = false;
-                        if (idle_candidate != -1)
-                                break;
+                        break;
                 }

                 if (idle_candidate == -1 &&
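For reference, the control flow with the fixup folded in can be sketched as below. This is an editorial toy rather than the kernel implementation: scan_for_idle_core(), cpu_idle[], cpu_allowed[] and the layout constants are invented, standing in for the real cpumask iteration, available_idle_cpu() and cpumask_test_cpu(cpu, p->cpus_ptr). The inner loop breaks on the first busy sibling exactly as before, but an idle, allowed CPU seen along the way is remembered and returned only when no fully idle core exists.

/*
 * Sketch of the intended behaviour (posted patch plus the fixup above),
 * using plain arrays instead of cpumasks. Not kernel code.
 */
#include <stdio.h>
#include <stdbool.h>

#define NR_CORES        4
#define SMT             2
#define NR_CPUS         (NR_CORES * SMT)

static bool cpu_idle[NR_CPUS];          /* stands in for available_idle_cpu() */
static bool cpu_allowed[NR_CPUS];       /* stands in for cpumask_test_cpu(cpu, p->cpus_ptr) */

/* Return the first CPU of a fully idle core, else a cached idle CPU, else -1. */
static int scan_for_idle_core(void)
{
        int idle_candidate = -1;

        for (int core = 0; core < NR_CORES; core++) {
                bool idle = true;

                for (int i = 0; i < SMT; i++) {
                        int cpu = core * SMT + i;

                        if (!cpu_idle[cpu]) {
                                idle = false;
                                break;  /* stop probing this core on the first busy sibling */
                        }

                        /* Only reached for idle CPUs: remember the first allowed one. */
                        if (idle_candidate == -1 && cpu_allowed[cpu])
                                idle_candidate = cpu;
                }

                if (idle)
                        return core * SMT;      /* a whole idle core still wins */
        }

        /* No idle core: fall back to an idle CPU seen during the scan, if any. */
        return idle_candidate;
}

int main(void)
{
        for (int cpu = 0; cpu < NR_CPUS; cpu++)
                cpu_allowed[cpu] = true;

        /* Core 0 has one idle sibling and one busy one; no core is fully idle. */
        cpu_idle[0] = true;
        printf("no idle core -> CPU %d\n", scan_for_idle_core());

        /* Core 2 becomes fully idle and is preferred over the cached sibling. */
        cpu_idle[4] = cpu_idle[5] = true;
        printf("idle core    -> CPU %d\n", scan_for_idle_core());
        return 0;
}

The first call returns the cached sibling (CPU 0) because no core is fully idle; the second returns CPU 4 because an idle core is still preferred over the candidate.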
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fc48cc99b03d..845bc0cd9158 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6066,6 +6066,7 @@ void __update_idle_core(struct rq *rq)
  */
 static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
 {
+        int idle_candidate = -1;
         struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
         int core, cpu;

@@ -6084,7 +6085,13 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
                 schedstat_inc(this_rq()->sis_scanned);
                 if (!available_idle_cpu(cpu)) {
                         idle = false;
-                        break;
+                        if (idle_candidate != -1)
+                                break;
+                }
+
+                if (idle_candidate == -1 &&
+                    cpumask_test_cpu(cpu, p->cpus_ptr)) {
+                        idle_candidate = cpu;
                 }
         }

@@ -6099,7 +6106,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
          */
         set_idle_cores(target, 0);

-        return -1;
+        return idle_candidate;
 }

 /*
select_idle_core is called when SMT is active and there is likely a free
core available. It may find idle CPUs but this information is simply
discarded and the scan starts over again with select_idle_cpu.

This patch caches information on idle CPUs found during the search for
a core and uses one if no core is found. This is a tradeoff. There may
be a slight impact when utilisation is low and an idle core can be
found quickly. It provides improvements as the number of busy CPUs
approaches 50% of the domain size when SMT is enabled.

With tbench on a 2-socket CascadeLake machine, 80 logical CPUs, HT enabled

                          5.10.0-rc6             5.10.0-rc6
                           schedstat          idlecandidate
Hmean     1        500.06 (   0.00%)      505.67 *   1.12%*
Hmean     2        975.90 (   0.00%)      974.06 *  -0.19%*
Hmean     4       1902.95 (   0.00%)     1904.43 *   0.08%*
Hmean     8       3761.73 (   0.00%)     3721.02 *  -1.08%*
Hmean     16      6713.93 (   0.00%)     6769.17 *   0.82%*
Hmean     32     10435.31 (   0.00%)    10312.58 *  -1.18%*
Hmean     64     12325.51 (   0.00%)    13792.01 *  11.90%*
Hmean     128    21225.21 (   0.00%)    20963.44 *  -1.23%*
Hmean     256    20532.83 (   0.00%)    20335.62 *  -0.96%*
Hmean     320    20334.81 (   0.00%)    20147.25 *  -0.92%*

In this particular test, the cost/benefit is marginal except
for 64 which was a point where the machine was over 50% busy
but not fully utilised.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/sched/fair.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
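For context on where the cached CPU pays off, the caller-side ordering can be sketched as below. The *_stub() helpers and their return values are invented for illustration; they only mirror the shape of select_idle_sibling(), which tries select_idle_core() before select_idle_cpu(). When the core scan hands back an idle sibling instead of -1, the second scan is skipped entirely.

/* Minimal sketch of the call ordering; stub bodies are invented. */
#include <stdio.h>

static int select_idle_core_stub(void)
{
        return 3;       /* pretend no idle core was found but CPU 3 was cached */
}

static int select_idle_cpu_stub(void)
{
        return 5;       /* only reached when the core scan returns -1 */
}

static int select_idle_sibling_stub(void)
{
        int i = select_idle_core_stub();

        if (i >= 0)
                return i;       /* with this patch, a cached idle CPU short-circuits here */

        return select_idle_cpu_stub();
}

int main(void)
{
        printf("selected cpu: %d\n", select_idle_sibling_stub());
        return 0;
}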