Message ID | 20201203141124.7391-7-mgorman@techsingularity.net (mailing list archive) |
---|---|
State | New, archived |
Series | Reduce time complexity of select_idle_sibling |
On Thu, 3 Dec 2020 at 15:11, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> The target CPU is definitely not idle in both select_idle_core and
> select_idle_cpu. For select_idle_core(), the SMT is potentially
> checked unnecessarily as the core is definitely not idle if the
> target is busy. For select_idle_cpu(), the first CPU checked is
> simply a waste.
>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  kernel/sched/fair.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 68dd9cd62fbd..1d8f5c4b4936 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6077,6 +6077,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
>  		return -1;
>
>  	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +	__cpumask_clear_cpu(target, cpus);

should clear cpu_smt_mask(target) as we are sure that the core will not be idle

>
>  	for_each_cpu_wrap(core, cpus, target) {
>  		bool idle = true;
> @@ -6181,6 +6182,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>  	time = cpu_clock(this);
>
>  	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +	__cpumask_clear_cpu(target, cpus);
>
>  	for_each_cpu_wrap(cpu, cpus, target) {
>  		schedstat_inc(this_rq()->sis_scanned);
> --
> 2.26.2
>
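Editorial sketch: the difference between the posted hunk and the review suggestion can be modelled in userspace with a plain bitmask. This is not kernel code. The `toy_*` names are hypothetical, and the 2-way SMT layout in which the sibling of `cpu` is `cpu ^ 1` is an assumption made only for illustration; real topologies are described by `cpu_smt_mask()`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model: one bit per CPU, 2-way SMT, siblings are cpu and cpu ^ 1.
 * The sibling layout is an assumption for illustration only.
 */
static uint64_t toy_smt_mask(int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	return (UINT64_C(1) << cpu) | (UINT64_C(1) << (cpu ^ 1));
}

/* Models the posted hunk: only the target itself leaves the scan mask. */
static uint64_t toy_clear_target(uint64_t cpus, int target)
{
	assert(target >= 0 && target < 64);
	return cpus & ~(UINT64_C(1) << target);
}

/* Models the review suggestion: the whole core of the target leaves the mask. */
static uint64_t toy_clear_target_core(uint64_t cpus, int target)
{
	return cpus & ~toy_smt_mask(target);
}
```

With eight CPUs in the mask and target 2, the first helper leaves CPU 3 as a scan candidate while the second drops both CPU 2 and CPU 3, which is the trade-off the rest of the thread debates.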
On Thu, Dec 03, 2020 at 05:38:03PM +0100, Vincent Guittot wrote:
> On Thu, 3 Dec 2020 at 15:11, Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > The target CPU is definitely not idle in both select_idle_core and
> > select_idle_cpu. For select_idle_core(), the SMT is potentially
> > checked unnecessarily as the core is definitely not idle if the
> > target is busy. For select_idle_cpu(), the first CPU checked is
> > simply a waste.
> >
> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> > ---
> >  kernel/sched/fair.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 68dd9cd62fbd..1d8f5c4b4936 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6077,6 +6077,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
> >  		return -1;
> >
> >  	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> > +	__cpumask_clear_cpu(target, cpus);
>
> should clear cpu_smt_mask(target) as we are sure that the core will not be idle
>

The intent was that the sibling might still be an idle candidate. In
the current draft of the series, I do not even clear this so that the
SMT sibling is considered as an idle candidate. The reasoning is that if
there are no idle cores then an SMT sibling of the target is as good an
idle CPU to select as any.
On Thu, 3 Dec 2020 at 18:52, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Thu, Dec 03, 2020 at 05:38:03PM +0100, Vincent Guittot wrote:
> > On Thu, 3 Dec 2020 at 15:11, Mel Gorman <mgorman@techsingularity.net> wrote:
> > >
> > > The target CPU is definitely not idle in both select_idle_core and
> > > select_idle_cpu. For select_idle_core(), the SMT is potentially
> > > checked unnecessarily as the core is definitely not idle if the
> > > target is busy. For select_idle_cpu(), the first CPU checked is
> > > simply a waste.
> > >
> > > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> > > ---
> > >  kernel/sched/fair.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index 68dd9cd62fbd..1d8f5c4b4936 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -6077,6 +6077,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
> > >  		return -1;
> > >
> > >  	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> > > +	__cpumask_clear_cpu(target, cpus);
> >
> > should clear cpu_smt_mask(target) as we are sure that the core will not be idle
> >
>
> The intent was that the sibling might still be an idle candidate. In
> the current draft of the series, I do not even clear this so that the
> SMT sibling is considered as an idle candidate. The reasoning is that if
> there are no idle cores then an SMT sibling of the target is as good an
> idle CPU to select as any.

Isn't the purpose of select_idle_smt ?

select_idle_core() looks for an idle core and opportunistically saves
an idle CPU candidate to skip select_idle_cpu. In this case this is
useless loops for select_idle_core() because we are sure that the core
is not idle

>
> --
> Mel Gorman
> SUSE Labs
On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
> > The intent was that the sibling might still be an idle candidate. In
> > the current draft of the series, I do not even clear this so that the
> > SMT sibling is considered as an idle candidate. The reasoning is that if
> > there are no idle cores then an SMT sibling of the target is as good an
> > idle CPU to select as any.
>
> Isn't the purpose of select_idle_smt ?
>

Only in part.

> select_idle_core() looks for an idle core and opportunistically saves
> an idle CPU candidate to skip select_idle_cpu. In this case this is
> useless loops for select_idle_core() because we are sure that the core
> is not idle
>

If select_idle_core() finds an idle candidate other than the sibling,
it'll use it if there is no idle core -- it picks a busy sibling based
on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
guaranteed to scan the sibling first (ordering) or even reach the sibling
(throttling). select_idle_smt() is a last-ditch effort.
On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
> > > The intent was that the sibling might still be an idle candidate. In
> > > the current draft of the series, I do not even clear this so that the
> > > SMT sibling is considered as an idle candidate. The reasoning is that if
> > > there are no idle cores then an SMT sibling of the target is as good an
> > > idle CPU to select as any.
> >
> > Isn't the purpose of select_idle_smt ?
> >
>
> Only in part.
>
> > select_idle_core() looks for an idle core and opportunistically saves
> > an idle CPU candidate to skip select_idle_cpu. In this case this is
> > useless loops for select_idle_core() because we are sure that the core
> > is not idle
> >
>
> If select_idle_core() finds an idle candidate other than the sibling,
> it'll use it if there is no idle core -- it picks a busy sibling based
> on a linear walk of the cpumask. Similarly, select_idle_cpu() is not

My point is that it's a waste of time to loop the sibling cpus of
target in select_idle_core because it will not help to find an idle
core. The sibling cpus will then be checked either by select_idle_cpu
or select_idle_smt

> guaranteed to scan the sibling first (ordering) or even reach the sibling
> (throttling). select_idle_smt() is a last-ditch effort.
>
> --
> Mel Gorman
> SUSE Labs
On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@linaro.org> wrote:
>
> On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
> > > > The intent was that the sibling might still be an idle candidate. In
> > > > the current draft of the series, I do not even clear this so that the
> > > > SMT sibling is considered as an idle candidate. The reasoning is that if
> > > > there are no idle cores then an SMT sibling of the target is as good an
> > > > idle CPU to select as any.
> > >
> > > Isn't the purpose of select_idle_smt ?
> > >
> >
> > Only in part.
> >
> > > select_idle_core() looks for an idle core and opportunistically saves
> > > an idle CPU candidate to skip select_idle_cpu. In this case this is
> > > useless loops for select_idle_core() because we are sure that the core
> > > is not idle
> > >
> >
> > If select_idle_core() finds an idle candidate other than the sibling,
> > it'll use it if there is no idle core -- it picks a busy sibling based
> > on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
>
> My point is that it's a waste of time to loop the sibling cpus of
> target in select_idle_core because it will not help to find an idle
> core. The sibling cpus will then be checked either by select_idle_cpu
> or select_idle_smt

also, while looping the cpumask, the sibling cpus of not idle cpu are
removed and will not be checked

>
> > guaranteed to scan the sibling first (ordering) or even reach the sibling
> > (throttling). select_idle_smt() is a last-ditch effort.
> >
> > --
> > Mel Gorman
> > SUSE Labs
On 2020/12/4 21:17, Vincent Guittot wrote:
> On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@linaro.org> wrote:
>>
>> On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
>>>
>>> On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
>>>>> The intent was that the sibling might still be an idle candidate. In
>>>>> the current draft of the series, I do not even clear this so that the
>>>>> SMT sibling is considered as an idle candidate. The reasoning is that if
>>>>> there are no idle cores then an SMT sibling of the target is as good an
>>>>> idle CPU to select as any.
>>>>
>>>> Isn't the purpose of select_idle_smt ?
>>>>
>>>
>>> Only in part.
>>>
>>>> select_idle_core() looks for an idle core and opportunistically saves
>>>> an idle CPU candidate to skip select_idle_cpu. In this case this is
>>>> useless loops for select_idle_core() because we are sure that the core
>>>> is not idle
>>>>
>>>
>>> If select_idle_core() finds an idle candidate other than the sibling,
>>> it'll use it if there is no idle core -- it picks a busy sibling based
>>> on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
>>
>> My point is that it's a waste of time to loop the sibling cpus of
>> target in select_idle_core because it will not help to find an idle
>> core. The sibling cpus will then be checked either by select_idle_cpu
>> or select_idle_smt
>
> also, while looping the cpumask, the sibling cpus of not idle cpu are
> removed and will not be checked
>

IIUC, select_idle_core and select_idle_cpu share the same cpumask (select_idle_mask)?
If the target's sibling is removed from select_idle_mask by select_idle_core(),
select_idle_cpu() will lose the chance to pick it up?

Thanks,
-Aubrey
On 2020/12/4 21:40, Li, Aubrey wrote:
> On 2020/12/4 21:17, Vincent Guittot wrote:
>> On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@linaro.org> wrote:
>>>
>>> On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
>>>>
>>>> On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
>>>>>> The intent was that the sibling might still be an idle candidate. In
>>>>>> the current draft of the series, I do not even clear this so that the
>>>>>> SMT sibling is considered as an idle candidate. The reasoning is that if
>>>>>> there are no idle cores then an SMT sibling of the target is as good an
>>>>>> idle CPU to select as any.
>>>>>
>>>>> Isn't the purpose of select_idle_smt ?
>>>>>
>>>>
>>>> Only in part.
>>>>
>>>>> select_idle_core() looks for an idle core and opportunistically saves
>>>>> an idle CPU candidate to skip select_idle_cpu. In this case this is
>>>>> useless loops for select_idle_core() because we are sure that the core
>>>>> is not idle
>>>>>
>>>>
>>>> If select_idle_core() finds an idle candidate other than the sibling,
>>>> it'll use it if there is no idle core -- it picks a busy sibling based
>>>> on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
>>>
>>> My point is that it's a waste of time to loop the sibling cpus of
>>> target in select_idle_core because it will not help to find an idle
>>> core. The sibling cpus will then be checked either by select_idle_cpu
>>> or select_idle_smt
>>
>> also, while looping the cpumask, the sibling cpus of not idle cpu are
>> removed and will not be checked
>>
>
> IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> If the target's sibling is removed from select_idle_mask from select_idle_core(),
> select_idle_cpu() will lose the chance to pick it up?

aha, no, select_idle_mask will be re-assigned in select_idle_cpu() by:

	cpumask_and(cpus, sds_idle_cpus(sd->shared), p->cpus_ptr);

So, yes, I guess we can remove the cpu_smt_mask(target) from select_idle_core() safely.

>
> Thanks,
> -Aubrey
>
On Fri, 4 Dec 2020 at 14:40, Li, Aubrey <aubrey.li@linux.intel.com> wrote:
>
> On 2020/12/4 21:17, Vincent Guittot wrote:
> > On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@linaro.org> wrote:
> >>
> >> On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
> >>>
> >>> On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
> >>>>> The intent was that the sibling might still be an idle candidate. In
> >>>>> the current draft of the series, I do not even clear this so that the
> >>>>> SMT sibling is considered as an idle candidate. The reasoning is that if
> >>>>> there are no idle cores then an SMT sibling of the target is as good an
> >>>>> idle CPU to select as any.
> >>>>
> >>>> Isn't the purpose of select_idle_smt ?
> >>>>
> >>>
> >>> Only in part.
> >>>
> >>>> select_idle_core() looks for an idle core and opportunistically saves
> >>>> an idle CPU candidate to skip select_idle_cpu. In this case this is
> >>>> useless loops for select_idle_core() because we are sure that the core
> >>>> is not idle
> >>>>
> >>>
> >>> If select_idle_core() finds an idle candidate other than the sibling,
> >>> it'll use it if there is no idle core -- it picks a busy sibling based
> >>> on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
> >>
> >> My point is that it's a waste of time to loop the sibling cpus of
> >> target in select_idle_core because it will not help to find an idle
> >> core. The sibling cpus will then be checked either by select_idle_cpu
> >> or select_idle_smt
> >
> > also, while looping the cpumask, the sibling cpus of not idle cpu are
> > removed and will not be checked
> >
>
> IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> If the target's sibling is removed from select_idle_mask from select_idle_core(),
> select_idle_cpu() will lose the chance to pick it up?

This is only relevant for patch 10, which is not to be included IIUC
what Mel said in the cover letter: "Patches 9 and 10 are stupid in the
context of this series."

>
> Thanks,
> -Aubrey
On 2020/12/4 21:47, Vincent Guittot wrote:
> On Fri, 4 Dec 2020 at 14:40, Li, Aubrey <aubrey.li@linux.intel.com> wrote:
>>
>> On 2020/12/4 21:17, Vincent Guittot wrote:
>>> On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@linaro.org> wrote:
>>>>
>>>> On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
>>>>>
>>>>> On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
>>>>>>> The intent was that the sibling might still be an idle candidate. In
>>>>>>> the current draft of the series, I do not even clear this so that the
>>>>>>> SMT sibling is considered as an idle candidate. The reasoning is that if
>>>>>>> there are no idle cores then an SMT sibling of the target is as good an
>>>>>>> idle CPU to select as any.
>>>>>>
>>>>>> Isn't the purpose of select_idle_smt ?
>>>>>>
>>>>>
>>>>> Only in part.
>>>>>
>>>>>> select_idle_core() looks for an idle core and opportunistically saves
>>>>>> an idle CPU candidate to skip select_idle_cpu. In this case this is
>>>>>> useless loops for select_idle_core() because we are sure that the core
>>>>>> is not idle
>>>>>>
>>>>>
>>>>> If select_idle_core() finds an idle candidate other than the sibling,
>>>>> it'll use it if there is no idle core -- it picks a busy sibling based
>>>>> on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
>>>>
>>>> My point is that it's a waste of time to loop the sibling cpus of
>>>> target in select_idle_core because it will not help to find an idle
>>>> core. The sibling cpus will then be checked either by select_idle_cpu
>>>> or select_idle_smt
>>>
>>> also, while looping the cpumask, the sibling cpus of not idle cpu are
>>> removed and will not be checked
>>>
>>
>> IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
>> If the target's sibling is removed from select_idle_mask from select_idle_core(),
>> select_idle_cpu() will lose the chance to pick it up?
>
> This is only relevant for patch 10 which is not to be included IIUC
> what mel said in cover letter : "Patches 9 and 10 are stupid in the
> context of this series."

So the target's sibling can be removed from the cpumask in select_idle_core
in patch 6, and needs to be added back in select_idle_core in patch 10, :)
On Fri, Dec 04, 2020 at 02:17:20PM +0100, Vincent Guittot wrote:
> On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@linaro.org> wrote:
> >
> > On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
> > >
> > > On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
> > > > > The intent was that the sibling might still be an idle candidate. In
> > > > > the current draft of the series, I do not even clear this so that the
> > > > > SMT sibling is considered as an idle candidate. The reasoning is that if
> > > > > there are no idle cores then an SMT sibling of the target is as good an
> > > > > idle CPU to select as any.
> > > >
> > > > Isn't the purpose of select_idle_smt ?
> > > >
> > >
> > > Only in part.
> > >
> > > > select_idle_core() looks for an idle core and opportunistically saves
> > > > an idle CPU candidate to skip select_idle_cpu. In this case this is
> > > > useless loops for select_idle_core() because we are sure that the core
> > > > is not idle
> > > >
> > >
> > > If select_idle_core() finds an idle candidate other than the sibling,
> > > it'll use it if there is no idle core -- it picks a busy sibling based
> > > on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
> >
> > My point is that it's a waste of time to loop the sibling cpus of
> > target in select_idle_core because it will not help to find an idle
> > core. The sibling cpus will then be checked either by select_idle_cpu
> > or select_idle_smt
>

I understand and you're right, the full loop was in the context of a series
that unified select_idle_* where it made sense. The version I'm currently
testing aborts the SMT search if a !idle sibling is encountered. That means
that select_idle_core() will no longer scan the entire domain if there are
no idle cores.

https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/commit/?h=sched-sissearch-v2r6&id=eb04a344cf7d7ca64c0c8fc0bcade261fa08c19e

With the patch on its own, it does mean that select_idle_sibling
starts over because SMT siblings might have been cleared. As an aside,
select_idle_core() has its own problems even then. It can start a scan
for an idle sibling when cpu_rq(target)->nr_running is very large --
over 100+ running tasks which is almost certainly a useless scan for
cores. However, I haven't done anything with that in this series as it
seemed like it would be follow-up work.

> also, while looping the cpumask, the sibling cpus of not idle cpu are
> removed and will not be checked
>

True and I spotted this. I think the load_balance_mask can be abused to
clear siblings during select_idle_core() while using select_idle_mask to
track CPUs that have not been scanned yet so select_idle_cpu only scans
CPUs that have not already been visited.

https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/commit/?h=sched-sissearch-v2r6&id=a6e986dae38855e3be26dfde86bbef1617431dd1

As both the idle candidate and the load_balance_mask abuse are likely to
be controversial, I shuffled the series so that it's ordered from least
controversial to most controversial.

This
https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/log/?h=sched-sissearch-v2r6
is what is currently being tested. It'll take most of the weekend and I'll
post them properly if they pass tests and do not throw up nasty surprises.
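Editorial sketch: the behaviour described in the message above (abort the sibling walk at the first busy sibling, drop the scanned core from the mask, and remember an idle CPU seen along the way) can be modelled in userspace as below. It is written from the prose in this thread, not from the linked commits. The `toy_*` names, the 8-CPU size and the `cpu ^ 1` sibling layout are all assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 8	/* hypothetical: 4 cores x 2 SMT threads */

static bool toy_bit(uint64_t mask, int cpu)
{
	return (mask >> cpu) & 1;
}

/*
 * Returns a CPU on a fully idle core, or -1 if there is none.
 * *candidate receives the first idle CPU seen while scanning, or -1.
 * The walk starts at target and wraps, like for_each_cpu_wrap().
 */
static int toy_select_idle_core(uint64_t cpus, uint64_t idle, int target,
				int *candidate)
{
	assert(target >= 0 && target < TOY_NR_CPUS);
	*candidate = -1;

	for (int i = 0; i < TOY_NR_CPUS; i++) {
		int core = (target + i) % TOY_NR_CPUS;
		int first = core & ~1;	/* assumed sibling pair: first, first + 1 */
		bool core_idle = true;

		if (!toy_bit(cpus, core))
			continue;

		for (int cpu = first; cpu <= first + 1; cpu++) {
			if (!toy_bit(idle, cpu)) {
				core_idle = false;
				break;	/* abort at the first busy sibling */
			}
			if (*candidate == -1 && toy_bit(cpus, cpu))
				*candidate = cpu;
		}
		if (core_idle)
			return core;

		/* Siblings of a busy core are never revisited by this scan. */
		cpus &= ~(UINT64_C(3) << first);
	}
	return -1;
}
```

In this model an idle sibling that sits after a busy one in the sibling walk is never recorded as a candidate, which is one reason a later select_idle_cpu() or select_idle_smt() pass can still matter.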
On Fri, Dec 04, 2020 at 02:47:48PM +0100, Vincent Guittot wrote:
> > IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> > If the target's sibling is removed from select_idle_mask from select_idle_core(),
> > select_idle_cpu() will lose the chance to pick it up?
>
> This is only relevant for patch 10 which is not to be included IIUC
> what mel said in cover letter : "Patches 9 and 10 are stupid in the
> context of this series."
>

Patch 10 was stupid in the context of the prototype because
select_idle_core always returned a CPU. A variation ended up being
reintroduced at the end of the Series Yet To Be Posted so that SMT siblings
are cleared during select_idle_core() but select_idle_cpu() still has a
mask with unvisited CPUs to consider if no idle cores are found.

As far as I know, this would still be compatible with Aubrey's idle
cpu mask as long as it's visited and cleared between select_idle_core
and select_idle_cpu. It relaxes the constraints on Aubrey to some extent
because the idle cpu mask would be a hint so if the information is out
of date, an idle cpu may still be found the normal way.
On Fri, 4 Dec 2020 at 15:31, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Fri, Dec 04, 2020 at 02:47:48PM +0100, Vincent Guittot wrote:
> > > IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> > > If the target's sibling is removed from select_idle_mask from select_idle_core(),
> > > select_idle_cpu() will lose the chance to pick it up?
> >
> > This is only relevant for patch 10 which is not to be included IIUC
> > what mel said in cover letter : "Patches 9 and 10 are stupid in the
> > context of this series."
> >
>
> Patch 10 was stupid in the context of the prototype because
> select_idle_core always returned a CPU. A variation ended up being
> reintroduced at the end of the Series Yet To Be Posted so that SMT siblings
> are cleared during select_idle_core() but select_idle_cpu() still has a
> mask with unvisited CPUs to consider if no idle cores are found.
>
> As far as I know, this would still be compatible with Aubrey's idle
> cpu mask as long as it's visited and cleared between select_idle_core
> and select_idle_cpu. It relaxes the constraints on Aubrey to some extent
> because the idle cpu mask would be a hint so if the information is out
> of date, an idle cpu may still be found the normal way.

But even without patch 10, just replacing sched_domain_span(sd) by
sds_idle_cpus(sd->shared) will ensure that sis loops only on cpus that
get a chance to be idle so select_idle_core is likely to return an
idle_candidate

>
> --
> Mel Gorman
> SUSE Labs
On Fri, Dec 04, 2020 at 04:23:48PM +0100, Vincent Guittot wrote:
> On Fri, 4 Dec 2020 at 15:31, Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > On Fri, Dec 04, 2020 at 02:47:48PM +0100, Vincent Guittot wrote:
> > > > IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> > > > If the target's sibling is removed from select_idle_mask from select_idle_core(),
> > > > select_idle_cpu() will lose the chance to pick it up?
> > >
> > > This is only relevant for patch 10 which is not to be included IIUC
> > > what mel said in cover letter : "Patches 9 and 10 are stupid in the
> > > context of this series."
> > >
> >
> > Patch 10 was stupid in the context of the prototype because
> > select_idle_core always returned a CPU. A variation ended up being
> > reintroduced at the end of the Series Yet To Be Posted so that SMT siblings
> > are cleared during select_idle_core() but select_idle_cpu() still has a
> > mask with unvisited CPUs to consider if no idle cores are found.
> >
> > As far as I know, this would still be compatible with Aubrey's idle
> > cpu mask as long as it's visited and cleared between select_idle_core
> > and select_idle_cpu. It relaxes the constraints on Aubrey to some extent
> > because the idle cpu mask would be a hint so if the information is out
> > of date, an idle cpu may still be found the normal way.
>
> But even without patch 10, just replacing sched_domain_span(sd) by
> sds_idle_cpus(sd->shared) will ensure that sis loops only on cpus that
> get a chance to be idle so select_idle_core is likely to return an
> idle_candidate
>

Yes but if the idle mask is out of date for any reason then idle CPUs might
be missed -- hence the intent to maintain a mask of CPUs visited and use
the idle cpu mask as a hint to prioritise CPUs that are likely idle but
fall back to a normal scan if none of the "idle cpu mask" CPUs are
actually idle.
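Editorial sketch: the "hint plus fallback" idea described above can be modelled in userspace as a two-pass scan that records which CPUs it has visited. This is a toy under stated assumptions, not code from the series or from the idle cpu mask patch. All `toy_*` names are hypothetical, and `really_idle` stands in for whatever per-CPU idle check the scheduler would perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scan the CPUs in mask that were not visited yet; mark each one visited. */
static int toy_scan_unvisited(uint64_t mask, uint64_t really_idle,
			      uint64_t *visited)
{
	for (int cpu = 0; cpu < 64; cpu++) {
		uint64_t bit = UINT64_C(1) << cpu;

		if (!(mask & bit) || (*visited & bit))
			continue;
		*visited |= bit;
		if (really_idle & bit)
			return cpu;
	}
	return -1;
}

/*
 * Pass 1 trusts the (possibly stale) hint to prioritise likely idle CPUs.
 * Pass 2 falls back to the rest of the span, skipping CPUs already visited.
 */
static int toy_select_idle_hinted(uint64_t span, uint64_t hint,
				  uint64_t really_idle, uint64_t *visited)
{
	int cpu;

	assert(visited != NULL);
	*visited = 0;

	cpu = toy_scan_unvisited(span & hint, really_idle, visited);
	if (cpu >= 0)
		return cpu;
	return toy_scan_unvisited(span, really_idle, visited);
}
```

When the hint is accurate, only the hinted CPUs are touched. When it is stale, the fallback still finds an idle CPU, at the cost of a fuller scan, and no CPU is checked twice.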
On Fri, 4 Dec 2020 at 16:40, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Fri, Dec 04, 2020 at 04:23:48PM +0100, Vincent Guittot wrote:
> > On Fri, 4 Dec 2020 at 15:31, Mel Gorman <mgorman@techsingularity.net> wrote:
> > >
> > > On Fri, Dec 04, 2020 at 02:47:48PM +0100, Vincent Guittot wrote:
> > > > > IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> > > > > If the target's sibling is removed from select_idle_mask from select_idle_core(),
> > > > > select_idle_cpu() will lose the chance to pick it up?
> > > >
> > > > This is only relevant for patch 10 which is not to be included IIUC
> > > > what mel said in cover letter : "Patches 9 and 10 are stupid in the
> > > > context of this series."
> > > >
> > >
> > > Patch 10 was stupid in the context of the prototype because
> > > select_idle_core always returned a CPU. A variation ended up being
> > > reintroduced at the end of the Series Yet To Be Posted so that SMT siblings
> > > are cleared during select_idle_core() but select_idle_cpu() still has a
> > > mask with unvisited CPUs to consider if no idle cores are found.
> > >
> > > As far as I know, this would still be compatible with Aubrey's idle
> > > cpu mask as long as it's visited and cleared between select_idle_core
> > > and select_idle_cpu. It relaxes the constraints on Aubrey to some extent
> > > because the idle cpu mask would be a hint so if the information is out
> > > of date, an idle cpu may still be found the normal way.
> >
> > But even without patch 10, just replacing sched_domain_span(sd) by
> > sds_idle_cpus(sd->shared) will ensure that sis loops only on cpus that
> > get a chance to be idle so select_idle_core is likely to return an
> > idle_candidate
> >
>
> Yes but if the idle mask is out of date for any reason then idle CPUs might

In fact it's the opposite, a cpu in idle mask might not be idle but
all cpus that enter idle will be set

> be missed -- hence the intent to maintain a mask of CPUs visited and use
> the idle cpu mask as a hint to prioritise CPUs that are likely idle but
> fall back to a normal scan if none of the "idle cpu mask" CPUs are
> actually idle.
>
> --
> Mel Gorman
> SUSE Labs
On Fri, Dec 04, 2020 at 04:43:05PM +0100, Vincent Guittot wrote:
> On Fri, 4 Dec 2020 at 16:40, Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > On Fri, Dec 04, 2020 at 04:23:48PM +0100, Vincent Guittot wrote:
> > > On Fri, 4 Dec 2020 at 15:31, Mel Gorman <mgorman@techsingularity.net> wrote:
> > > >
> > > > On Fri, Dec 04, 2020 at 02:47:48PM +0100, Vincent Guittot wrote:
> > > > > > IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> > > > > > If the target's sibling is removed from select_idle_mask from select_idle_core(),
> > > > > > select_idle_cpu() will lose the chance to pick it up?
> > > > >
> > > > > This is only relevant for patch 10 which is not to be included IIUC
> > > > > what mel said in cover letter : "Patches 9 and 10 are stupid in the
> > > > > context of this series."
> > > > >
> > > >
> > > > Patch 10 was stupid in the context of the prototype because
> > > > select_idle_core always returned a CPU. A variation ended up being
> > > > reintroduced at the end of the Series Yet To Be Posted so that SMT siblings
> > > > are cleared during select_idle_core() but select_idle_cpu() still has a
> > > > mask with unvisited CPUs to consider if no idle cores are found.
> > > >
> > > > As far as I know, this would still be compatible with Aubrey's idle
> > > > cpu mask as long as it's visited and cleared between select_idle_core
> > > > and select_idle_cpu. It relaxes the constraints on Aubrey to some extent
> > > > because the idle cpu mask would be a hint so if the information is out
> > > > of date, an idle cpu may still be found the normal way.
> > >
> > > But even without patch 10, just replacing sched_domain_span(sd) by
> > > sds_idle_cpus(sd->shared) will ensure that sis loops only on cpus that
> > > get a chance to be idle so select_idle_core is likely to return an
> > > idle_candidate
> > >
> >
> > Yes but if the idle mask is out of date for any reason then idle CPUs might
>
> In fact it's the opposite, a cpu in idle mask might not be idle but
> all cpus that enter idle will be set
>

When I first checked, the information was based on the tick or a CPU
stopping the tick. That was not guaranteed to be up to date so I considered
the best option would be to treat idle cpu mask as advisory. It would
not necessarily cover a CPU that was entering idle and polling before
entering an idle state for example or a rq that would pass sched_idle_cpu()
depending on the timing of the update_idle_cpumask call.

I know you reviewed that patch and v6 may be very different but the more
up to date that information is, the greater the cache conflicts will be
on sched_domain_shared so maintaining the up-to-date information may cost
enough to offset any benefit from reduced searching at wakeup.

If this turns out to be wrong, then great, the idle cpu mask can be used
as both the basis for an idle core search and a fast find of an individual
CPU. If the cost of keeping up to date information is too high then the
idle_cpu_mask can be treated as advisory to start the search and track
CPUs visited.

The series are not either/or, chunks of the series I posted are orthogonal
(e.g. changes to p->recent_cpu_used), the latter parts could either work
with idle cpu mask or be replaced by idle cpu mask depending on which
performs better.
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 68dd9cd62fbd..1d8f5c4b4936 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6077,6 +6077,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
 		return -1;
 
 	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+	__cpumask_clear_cpu(target, cpus);
 
 	for_each_cpu_wrap(core, cpus, target) {
 		bool idle = true;
@@ -6181,6 +6182,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	time = cpu_clock(this);
 
 	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+	__cpumask_clear_cpu(target, cpus);
 
 	for_each_cpu_wrap(cpu, cpus, target) {
 		schedstat_inc(this_rq()->sis_scanned);
The target CPU is definitely not idle in both select_idle_core and
select_idle_cpu. For select_idle_core(), the SMT is potentially
checked unnecessarily as the core is definitely not idle if the
target is busy. For select_idle_cpu(), the first CPU checked is
simply a waste.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/sched/fair.c | 2 ++
 1 file changed, 2 insertions(+)
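Editorial sketch: the saving claimed for select_idle_cpu() can be illustrated with a userspace model of a wrapping scan that starts at the target. This is not the kernel implementation. `toy_scan_wrap` and its parameters are hypothetical names, and a plain bitmask stands in for a `struct cpumask`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy userspace model: walk the CPUs in cpus starting at target and wrapping
 * around, counting how many candidates are examined before an idle one is
 * found. Returns the idle CPU, or -1 if none of the candidates is idle.
 */
static int toy_scan_wrap(uint64_t cpus, uint64_t idle, int target,
			 int nr_cpus, int *scanned)
{
	assert(nr_cpus > 0 && nr_cpus <= 64);
	assert(target >= 0 && target < nr_cpus);

	*scanned = 0;
	for (int i = 0; i < nr_cpus; i++) {
		int cpu = (target + i) % nr_cpus;
		uint64_t bit = UINT64_C(1) << cpu;

		if (!(cpus & bit))
			continue;	/* not a candidate, nothing examined */
		(*scanned)++;
		if (idle & bit)
			return cpu;
	}
	return -1;
}
```

With four CPUs, only CPU 3 idle and target 1, the model examines three candidates when the target is left in the mask and two when the target bit is cleared first, which mirrors the "first CPU checked is simply a waste" remark in the changelog.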