[06/10] sched/fair: Clear the target CPU from the cpumask of CPUs searched

Message ID 20201203141124.7391-7-mgorman@techsingularity.net (mailing list archive)
State New, archived
Series Reduce time complexity of select_idle_sibling

Commit Message

Mel Gorman Dec. 3, 2020, 2:11 p.m. UTC
The target CPU is definitely not idle in both select_idle_core and
select_idle_cpu. For select_idle_core(), the SMT is potentially
checked unnecessarily as the core is definitely not idle if the
target is busy. For select_idle_cpu(), the first CPU checked is
simply a waste.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/sched/fair.c | 2 ++
 1 file changed, 2 insertions(+)
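
As an illustration only, the effect of the two added lines can be modelled
in userspace with a single 64-bit word standing in for the cpumask.
mask_clear_cpu() and scan_wrap() are hypothetical helpers, not kernel APIs;
they only mimic what __cpumask_clear_cpu() and for_each_cpu_wrap() do in the
patch, under the assumption of at most 64 CPUs.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 64	/* assumption: one 64-bit word models the cpumask */

/* Hypothetical stand-in for __cpumask_clear_cpu(): non-atomic bit clear. */
static uint64_t mask_clear_cpu(uint64_t mask, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return mask & ~(UINT64_C(1) << cpu);
}

/*
 * Record the order in which a wrapping scan visits the CPUs set in 'mask':
 * begin at 'start', walk upwards, wrap to 0 and stop before 'start' again.
 * This models the iteration order of for_each_cpu_wrap().
 * Returns the number of CPUs visited.
 */
static unsigned int scan_wrap(uint64_t mask, unsigned int start,
			      unsigned int order[NR_CPUS])
{
	unsigned int n = 0;

	assert(start < NR_CPUS);
	for (unsigned int i = 0; i < NR_CPUS; i++) {
		unsigned int cpu = (start + i) % NR_CPUS;

		if (mask & (UINT64_C(1) << cpu))
			order[n++] = cpu;
	}
	return n;
}
```

With CPUs 0-3 in the mask and target 2, the model visits 2, 3, 0, 1 before
the clear and 3, 0, 1 after it, so the known-busy target is no longer the
first CPU examined.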

Comments

Vincent Guittot Dec. 3, 2020, 4:38 p.m. UTC | #1
On Thu, 3 Dec 2020 at 15:11, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> The target CPU is definitely not idle in both select_idle_core and
> select_idle_cpu. For select_idle_core(), the SMT is potentially
> checked unnecessarily as the core is definitely not idle if the
> target is busy. For select_idle_cpu(), the first CPU checked is
> simply a waste.

>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  kernel/sched/fair.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 68dd9cd62fbd..1d8f5c4b4936 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6077,6 +6077,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
>                 return -1;
>
>         cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +       __cpumask_clear_cpu(target, cpus);

should clear cpu_smt_mask(target) as we are sure that the core will not be idle

>
>         for_each_cpu_wrap(core, cpus, target) {
>                 bool idle = true;
> @@ -6181,6 +6182,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>         time = cpu_clock(this);
>
>         cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +       __cpumask_clear_cpu(target, cpus);
>
>         for_each_cpu_wrap(cpu, cpus, target) {
>                 schedstat_inc(this_rq()->sis_scanned);
> --
> 2.26.2
>
Mel Gorman Dec. 3, 2020, 5:52 p.m. UTC | #2
On Thu, Dec 03, 2020 at 05:38:03PM +0100, Vincent Guittot wrote:
> On Thu, 3 Dec 2020 at 15:11, Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > The target CPU is definitely not idle in both select_idle_core and
> > select_idle_cpu. For select_idle_core(), the SMT is potentially
> > checked unnecessarily as the core is definitely not idle if the
> > target is busy. For select_idle_cpu(), the first CPU checked is
> > simply a waste.
> 
> >
> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> > ---
> >  kernel/sched/fair.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 68dd9cd62fbd..1d8f5c4b4936 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6077,6 +6077,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
> >                 return -1;
> >
> >         cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> > +       __cpumask_clear_cpu(target, cpus);
> 
> should clear cpu_smt_mask(target) as we are sure that the core will not be idle
> 

The intent was that the sibling might still be an idle candidate. In
the current draft of the series, I do not even clear this so that the
SMT sibling is considered as an idle candidate. The reasoning is that if
there are no idle cores then an SMT sibling of the target is as good an
idle CPU to select as any.
Vincent Guittot Dec. 4, 2020, 10:56 a.m. UTC | #3
On Thu, 3 Dec 2020 at 18:52, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Thu, Dec 03, 2020 at 05:38:03PM +0100, Vincent Guittot wrote:
> > On Thu, 3 Dec 2020 at 15:11, Mel Gorman <mgorman@techsingularity.net> wrote:
> > >
> > > The target CPU is definitely not idle in both select_idle_core and
> > > select_idle_cpu. For select_idle_core(), the SMT is potentially
> > > checked unnecessarily as the core is definitely not idle if the
> > > target is busy. For select_idle_cpu(), the first CPU checked is
> > > simply a waste.
> >
> > >
> > > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> > > ---
> > >  kernel/sched/fair.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index 68dd9cd62fbd..1d8f5c4b4936 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -6077,6 +6077,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
> > >                 return -1;
> > >
> > >         cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> > > +       __cpumask_clear_cpu(target, cpus);
> >
> > should clear cpu_smt_mask(target) as we are sure that the core will not be idle
> >
>
> The intent was that the sibling might still be an idle candidate. In
> the current draft of the series, I do not even clear this so that the
> SMT sibling is considered as an idle candidate. The reasoning is that if
> there are no idle cores then an SMT sibling of the target is as good an
> idle CPU to select as any.

Isn't that the purpose of select_idle_smt()?

select_idle_core() looks for an idle core and opportunistically saves
an idle CPU candidate to skip select_idle_cpu(). In this case the loop
iterations are useless for select_idle_core() because we are sure that
the core is not idle.
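
To make the cost being discussed concrete, here is a rough userspace model
of the core scan. It is a sketch under stated assumptions, not the kernel
code: smt_mask() and find_idle_core() are hypothetical names, CPUs are
assumed to be numbered so that siblings are adjacent (2 threads per core),
and a 64-bit word stands in for the cpumask. Clearing the siblings of the
target up front corresponds to something like
cpumask_andnot(cpus, cpus, cpu_smt_mask(target)).

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS   64	/* assumption: one 64-bit word models the cpumask */
#define SMT_WIDTH 2	/* assumption: 2 adjacent sibling threads per core */

/* Hypothetical stand-in for cpu_smt_mask(): all threads of cpu's core. */
static uint64_t smt_mask(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return ((UINT64_C(1) << SMT_WIDTH) - 1) << (cpu - cpu % SMT_WIDTH);
}

/*
 * Wrapping scan for a fully idle core, starting at 'target'. Every core
 * that is probed has all of its siblings removed from the local copy of
 * 'cpus' so it is never probed twice. '*probes' counts the cores examined.
 * Returns the first CPU found on a fully idle core, or -1 if there is none.
 */
static int find_idle_core(uint64_t cpus, uint64_t idle, unsigned int target,
			  unsigned int *probes)
{
	assert(target < NR_CPUS);
	*probes = 0;
	for (unsigned int i = 0; i < NR_CPUS; i++) {
		unsigned int core = (target + i) % NR_CPUS;
		uint64_t siblings;

		if (!(cpus & (UINT64_C(1) << core)))
			continue;

		siblings = smt_mask(core);
		(*probes)++;
		cpus &= ~siblings;
		if ((idle & siblings) == siblings)
			return (int)core;
	}
	return -1;
}
```

In this model, passing `cpus & ~smt_mask(target)` instead of `cpus` saves
exactly one probe: the one spent on the target's own core, which cannot be
idle while the target is busy.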


>
> --
> Mel Gorman
> SUSE Labs
Mel Gorman Dec. 4, 2020, 11:30 a.m. UTC | #4
On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
> > The intent was that the sibling might still be an idle candidate. In
> > the current draft of the series, I do not even clear this so that the
> > SMT sibling is considered as an idle candidate. The reasoning is that if
> > there are no idle cores then an SMT sibling of the target is as good an
> > idle CPU to select as any.
> 
> Isn't the purpose of select_idle_smt ?
> 

Only in part.

> select_idle_core() looks for an idle core and opportunistically saves
> an idle CPU candidate to skip select_idle_cpu. In this case this is
> useless loops for select_idle_core() because we are sure that the core
> is not idle
> 

If select_idle_core() finds an idle candidate other than the sibling,
it'll use it if there is no idle core -- it picks a busy sibling based
on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
guaranteed to scan the sibling first (ordering) or even reach the sibling
(throttling). select_idle_smt() is a last-ditch effort.
Vincent Guittot Dec. 4, 2020, 1:13 p.m. UTC | #5
On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
> > > The intent was that the sibling might still be an idle candidate. In
> > > the current draft of the series, I do not even clear this so that the
> > > SMT sibling is considered as an idle candidate. The reasoning is that if
> > > there are no idle cores then an SMT sibling of the target is as good an
> > > idle CPU to select as any.
> >
> > Isn't the purpose of select_idle_smt ?
> >
>
> Only in part.
>
> > select_idle_core() looks for an idle core and opportunistically saves
> > an idle CPU candidate to skip select_idle_cpu. In this case this is
> > useless loops for select_idle_core() because we are sure that the core
> > is not idle
> >
>
> If select_idle_core() finds an idle candidate other than the sibling,
> it'll use it if there is no idle core -- it picks a busy sibling based
> on a linear walk of the cpumask. Similarly, select_idle_cpu() is not

My point is that it's a waste of time to loop over the sibling CPUs of
target in select_idle_core() because it will not help to find an idle
core. The sibling CPUs will then be checked by either select_idle_cpu()
or select_idle_smt().

> guaranteed to scan the sibling first (ordering) or even reach the sibling
> (throttling). select_idle_smt() is a last-ditch effort.
>
> --
> Mel Gorman
> SUSE Labs
Vincent Guittot Dec. 4, 2020, 1:17 p.m. UTC | #6
On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@linaro.org> wrote:
>
> On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
> > > > The intent was that the sibling might still be an idle candidate. In
> > > > the current draft of the series, I do not even clear this so that the
> > > > SMT sibling is considered as an idle candidate. The reasoning is that if
> > > > there are no idle cores then an SMT sibling of the target is as good an
> > > > idle CPU to select as any.
> > >
> > > Isn't the purpose of select_idle_smt ?
> > >
> >
> > Only in part.
> >
> > > select_idle_core() looks for an idle core and opportunistically saves
> > > an idle CPU candidate to skip select_idle_cpu. In this case this is
> > > useless loops for select_idle_core() because we are sure that the core
> > > is not idle
> > >
> >
> > If select_idle_core() finds an idle candidate other than the sibling,
> > it'll use it if there is no idle core -- it picks a busy sibling based
> > on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
>
> My point is that it's a waste of time to loop the sibling cpus of
> target in select_idle_core because it will not help to find an idle
> core. The sibling  cpus will then be check either by select_idle_cpu
> of select_idle_smt

Also, while looping over the cpumask, the sibling CPUs of a non-idle CPU
are removed and will not be checked.

>
> > guaranteed to scan the sibling first (ordering) or even reach the sibling
> > (throttling). select_idle_smt() is a last-ditch effort.
> >
> > --
> > Mel Gorman
> > SUSE Labs
Aubrey Li Dec. 4, 2020, 1:40 p.m. UTC | #7
On 2020/12/4 21:17, Vincent Guittot wrote:
> On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@linaro.org> wrote:
>>
>> On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
>>>
>>> On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
>>>>> The intent was that the sibling might still be an idle candidate. In
>>>>> the current draft of the series, I do not even clear this so that the
>>>>> SMT sibling is considered as an idle candidate. The reasoning is that if
>>>>> there are no idle cores then an SMT sibling of the target is as good an
>>>>> idle CPU to select as any.
>>>>
>>>> Isn't the purpose of select_idle_smt ?
>>>>
>>>
>>> Only in part.
>>>
>>>> select_idle_core() looks for an idle core and opportunistically saves
>>>> an idle CPU candidate to skip select_idle_cpu. In this case this is
>>>> useless loops for select_idle_core() because we are sure that the core
>>>> is not idle
>>>>
>>>
>>> If select_idle_core() finds an idle candidate other than the sibling,
>>> it'll use it if there is no idle core -- it picks a busy sibling based
>>> on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
>>
>> My point is that it's a waste of time to loop the sibling cpus of
>> target in select_idle_core because it will not help to find an idle
>> core. The sibling  cpus will then be check either by select_idle_cpu
>> of select_idle_smt
> 
> also, while looping the cpumask, the sibling cpus of not idle cpu are
> removed and will not be check
>

IIUC, select_idle_core() and select_idle_cpu() share the same cpumask (select_idle_mask)?
If the target's sibling is removed from select_idle_mask in select_idle_core(),
will select_idle_cpu() lose the chance to pick it up?

Thanks,
-Aubrey
Aubrey Li Dec. 4, 2020, 1:47 p.m. UTC | #8
On 2020/12/4 21:40, Li, Aubrey wrote:
> On 2020/12/4 21:17, Vincent Guittot wrote:
>> On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@linaro.org> wrote:
>>>
>>> On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
>>>>
>>>> On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
>>>>>> The intent was that the sibling might still be an idle candidate. In
>>>>>> the current draft of the series, I do not even clear this so that the
>>>>>> SMT sibling is considered as an idle candidate. The reasoning is that if
>>>>>> there are no idle cores then an SMT sibling of the target is as good an
>>>>>> idle CPU to select as any.
>>>>>
>>>>> Isn't the purpose of select_idle_smt ?
>>>>>
>>>>
>>>> Only in part.
>>>>
>>>>> select_idle_core() looks for an idle core and opportunistically saves
>>>>> an idle CPU candidate to skip select_idle_cpu. In this case this is
>>>>> useless loops for select_idle_core() because we are sure that the core
>>>>> is not idle
>>>>>
>>>>
>>>> If select_idle_core() finds an idle candidate other than the sibling,
>>>> it'll use it if there is no idle core -- it picks a busy sibling based
>>>> on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
>>>
>>> My point is that it's a waste of time to loop the sibling cpus of
>>> target in select_idle_core because it will not help to find an idle
>>> core. The sibling  cpus will then be check either by select_idle_cpu
>>> of select_idle_smt
>>
>> also, while looping the cpumask, the sibling cpus of not idle cpu are
>> removed and will not be check
>>
> 
> IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> If the target's sibling is removed from select_idle_mask from select_idle_core(),
> select_idle_cpu() will lose the chance to pick it up?

aha, no, select_idle_mask will be re-assigned in select_idle_cpu() by:

	cpumask_and(cpus, sds_idle_cpus(sd->shared), p->cpus_ptr);

So, yes, I guess we can remove the cpu_smt_mask(target) from select_idle_core() safely.

> 
> Thanks,
> -Aubrey
>
Vincent Guittot Dec. 4, 2020, 1:47 p.m. UTC | #9
On Fri, 4 Dec 2020 at 14:40, Li, Aubrey <aubrey.li@linux.intel.com> wrote:
>
> On 2020/12/4 21:17, Vincent Guittot wrote:
> > On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@linaro.org> wrote:
> >>
> >> On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
> >>>
> >>> On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
> >>>>> The intent was that the sibling might still be an idle candidate. In
> >>>>> the current draft of the series, I do not even clear this so that the
> >>>>> SMT sibling is considered as an idle candidate. The reasoning is that if
> >>>>> there are no idle cores then an SMT sibling of the target is as good an
> >>>>> idle CPU to select as any.
> >>>>
> >>>> Isn't the purpose of select_idle_smt ?
> >>>>
> >>>
> >>> Only in part.
> >>>
> >>>> select_idle_core() looks for an idle core and opportunistically saves
> >>>> an idle CPU candidate to skip select_idle_cpu. In this case this is
> >>>> useless loops for select_idle_core() because we are sure that the core
> >>>> is not idle
> >>>>
> >>>
> >>> If select_idle_core() finds an idle candidate other than the sibling,
> >>> it'll use it if there is no idle core -- it picks a busy sibling based
> >>> on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
> >>
> >> My point is that it's a waste of time to loop the sibling cpus of
> >> target in select_idle_core because it will not help to find an idle
> >> core. The sibling  cpus will then be check either by select_idle_cpu
> >> of select_idle_smt
> >
> > also, while looping the cpumask, the sibling cpus of not idle cpu are
> > removed and will not be check
> >
>
> IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> If the target's sibling is removed from select_idle_mask from select_idle_core(),
> select_idle_cpu() will lose the chance to pick it up?

This is only relevant for patch 10, which is not to be included if I
understand correctly what Mel said in the cover letter: "Patches 9 and
10 are stupid in the context of this series."

>
> Thanks,
> -Aubrey
Aubrey Li Dec. 4, 2020, 2:07 p.m. UTC | #10
On 2020/12/4 21:47, Vincent Guittot wrote:
> On Fri, 4 Dec 2020 at 14:40, Li, Aubrey <aubrey.li@linux.intel.com> wrote:
>>
>> On 2020/12/4 21:17, Vincent Guittot wrote:
>>> On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@linaro.org> wrote:
>>>>
>>>> On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
>>>>>
>>>>> On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
>>>>>>> The intent was that the sibling might still be an idle candidate. In
>>>>>>> the current draft of the series, I do not even clear this so that the
>>>>>>> SMT sibling is considered as an idle candidate. The reasoning is that if
>>>>>>> there are no idle cores then an SMT sibling of the target is as good an
>>>>>>> idle CPU to select as any.
>>>>>>
>>>>>> Isn't the purpose of select_idle_smt ?
>>>>>>
>>>>>
>>>>> Only in part.
>>>>>
>>>>>> select_idle_core() looks for an idle core and opportunistically saves
>>>>>> an idle CPU candidate to skip select_idle_cpu. In this case this is
>>>>>> useless loops for select_idle_core() because we are sure that the core
>>>>>> is not idle
>>>>>>
>>>>>
>>>>> If select_idle_core() finds an idle candidate other than the sibling,
>>>>> it'll use it if there is no idle core -- it picks a busy sibling based
>>>>> on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
>>>>
>>>> My point is that it's a waste of time to loop the sibling cpus of
>>>> target in select_idle_core because it will not help to find an idle
>>>> core. The sibling  cpus will then be check either by select_idle_cpu
>>>> of select_idle_smt
>>>
>>> also, while looping the cpumask, the sibling cpus of not idle cpu are
>>> removed and will not be check
>>>
>>
>> IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
>> If the target's sibling is removed from select_idle_mask from select_idle_core(),
>> select_idle_cpu() will lose the chance to pick it up?
> 
> This is only relevant for patch 10 which is not to be included IIUC
> what mel said in cover letter : "Patches 9 and 10 are stupid in the
> context of this series."

So the target's sibling can be removed from the cpumask in select_idle_core()
in patch 6, and needs to be added back in select_idle_core() in patch 10. :)
Mel Gorman Dec. 4, 2020, 2:27 p.m. UTC | #11
On Fri, Dec 04, 2020 at 02:17:20PM +0100, Vincent Guittot wrote:
> On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@linaro.org> wrote:
> >
> > On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@techsingularity.net> wrote:
> > >
> > > On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
> > > > > The intent was that the sibling might still be an idle candidate. In
> > > > > the current draft of the series, I do not even clear this so that the
> > > > > SMT sibling is considered as an idle candidate. The reasoning is that if
> > > > > there are no idle cores then an SMT sibling of the target is as good an
> > > > > idle CPU to select as any.
> > > >
> > > > Isn't the purpose of select_idle_smt ?
> > > >
> > >
> > > Only in part.
> > >
> > > > select_idle_core() looks for an idle core and opportunistically saves
> > > > an idle CPU candidate to skip select_idle_cpu. In this case this is
> > > > useless loops for select_idle_core() because we are sure that the core
> > > > is not idle
> > > >
> > >
> > > If select_idle_core() finds an idle candidate other than the sibling,
> > > it'll use it if there is no idle core -- it picks a busy sibling based
> > > on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
> >
> > My point is that it's a waste of time to loop the sibling cpus of
> > target in select_idle_core because it will not help to find an idle
> > core. The sibling  cpus will then be check either by select_idle_cpu
> > of select_idle_smt
> 

I understand and you're right, the full loop was in the context of a series
that unified select_idle_* where it made sense. The version I'm currently
testing aborts the SMT search if a !idle sibling is encountered. That
means that select_idle_core() will no longer scan the entire domain if
there are no idle cores.

https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/commit/?h=sched-sissearch-v2r6&id=eb04a344cf7d7ca64c0c8fc0bcade261fa08c19e

With the patch on its own, it does mean that select_idle_sibling
starts over because SMT siblings might have been cleared. As an aside,
select_idle_core() has its own problems even then. It can start a scan
for an idle sibling when cpu_rq(target)->nr_running is very large --
over 100 running tasks, which is almost certainly a useless scan for
cores. However, I haven't done anything with that in this series as it
seemed like it would be follow-up work.

> also, while looping the cpumask, the sibling cpus of not idle cpu are
> removed and will not be check
> 

True and I spotted this. I think the load_balance_mask can be abused to
clear siblings during select_idle_core() while using select_idle_mask to
track CPUs that have not been scanned yet so select_idle_cpu only scans
CPUs that have not already been visited.

https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/commit/?h=sched-sissearch-v2r6&id=a6e986dae38855e3be26dfde86bbef1617431dd1

As both the idle candidate and the load_balance_mask abuse are likely to
be controversial, I shuffled the series so that it's ordered from least
controversial to most controversial.

This
https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/log/?h=sched-sissearch-v2r6
is what is currently being tested. It'll take most of the weekend and I'll
post them properly if they pass tests and do not throw up nasty surprises.
Mel Gorman Dec. 4, 2020, 2:31 p.m. UTC | #12
On Fri, Dec 04, 2020 at 02:47:48PM +0100, Vincent Guittot wrote:
> > IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> > If the target's sibling is removed from select_idle_mask from select_idle_core(),
> > select_idle_cpu() will lose the chance to pick it up?
> 
> This is only relevant for patch 10 which is not to be included IIUC
> what mel said in cover letter : "Patches 9 and 10 are stupid in the
> context of this series."
> 

Patch 10 was stupid in the context of the prototype because
select_idle_core always returned a CPU. A variation ended up being
reintroduced at the end of the Series Yet To Be Posted so that SMT siblings
are cleared during select_idle_core() but select_idle_cpu() still has a
mask with unvisited CPUs to consider if no idle cores are found.

As far as I know, this would still be compatible with Aubrey's idle
cpu mask as long as it's visited and cleared between select_idle_core
and select_idle_cpu. It relaxes the constraints on Aubrey to some extent
because the idle cpu mask would be a hint so if the information is out
of date, an idle cpu may still be found the normal way.
Vincent Guittot Dec. 4, 2020, 3:23 p.m. UTC | #13
On Fri, 4 Dec 2020 at 15:31, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Fri, Dec 04, 2020 at 02:47:48PM +0100, Vincent Guittot wrote:
> > > IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> > > If the target's sibling is removed from select_idle_mask from select_idle_core(),
> > > select_idle_cpu() will lose the chance to pick it up?
> >
> > This is only relevant for patch 10 which is not to be included IIUC
> > what mel said in cover letter : "Patches 9 and 10 are stupid in the
> > context of this series."
> >
>
> Patch 10 was stupid in the context of the prototype because
> select_idle_core always returned a CPU. A variation ended up being
> reintroduced at the end of the Series Yet To Be Posted so that SMT siblings
> are cleared during select_idle_core() but select_idle_cpu() still has a
> mask with unvisited CPUs to consider if no idle cores are found.
>
> As far as I know, this would still be compatible with Aubrey's idle
> cpu mask as long as it's visited and cleared between select_idle_core
> and select_idle_cpu. It relaxes the contraints on Aubrey to some extent
> because the idle cpu mask would be a hint so if the information is out
> of date, an idle cpu may still be found the normal way.

But even without patch 10, just replacing sched_domain_span(sd) by
sds_idle_cpus(sd->shared) will ensure that sis loops only on cpus that
get a chance to be idle so select_idle_core is likely to return an
idle_candidate

>
> --
> Mel Gorman
> SUSE Labs
Mel Gorman Dec. 4, 2020, 3:40 p.m. UTC | #14
On Fri, Dec 04, 2020 at 04:23:48PM +0100, Vincent Guittot wrote:
> On Fri, 4 Dec 2020 at 15:31, Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > On Fri, Dec 04, 2020 at 02:47:48PM +0100, Vincent Guittot wrote:
> > > > IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> > > > If the target's sibling is removed from select_idle_mask from select_idle_core(),
> > > > select_idle_cpu() will lose the chance to pick it up?
> > >
> > > This is only relevant for patch 10 which is not to be included IIUC
> > > what mel said in cover letter : "Patches 9 and 10 are stupid in the
> > > context of this series."
> > >
> >
> > Patch 10 was stupid in the context of the prototype because
> > select_idle_core always returned a CPU. A variation ended up being
> > reintroduced at the end of the Series Yet To Be Posted so that SMT siblings
> > are cleared during select_idle_core() but select_idle_cpu() still has a
> > mask with unvisited CPUs to consider if no idle cores are found.
> >
> > As far as I know, this would still be compatible with Aubrey's idle
> > cpu mask as long as it's visited and cleared between select_idle_core
> > and select_idle_cpu. It relaxes the contraints on Aubrey to some extent
> > because the idle cpu mask would be a hint so if the information is out
> > of date, an idle cpu may still be found the normal way.
> 
> But even without patch 10, just replacing sched_domain_span(sd) by
> sds_idle_cpus(sd->shared) will ensure that sis loops only on cpus that
> get a chance to be idle so select_idle_core is likely to return an
> idle_candidate
> 

Yes but if the idle mask is out of date for any reason then idle CPUs might
be missed -- hence the intent to maintain a mask of CPUs visited and use
the idle cpu mask as a hint to prioritise CPUs that are likely idle but
fall back to a normal scan if none of the "idle cpu mask" CPUs are
actually idle.
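
The "hint plus fallback" idea described above can be sketched in userspace
as follows. This is only a model under assumptions: scan_mask() and
pick_idle_cpu() are hypothetical names, a 64-bit word stands in for each
cpumask, 'hint' plays the role of the possibly stale idle cpu mask and
'really_idle' plays the role of what an available_idle_cpu()-style check
would report at scan time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 64	/* assumption: one 64-bit word models each cpumask */

/*
 * Scan the CPUs in 'candidates' in ascending order, marking each one in
 * '*visited' and counting it in '*scanned'. Returns the first CPU that is
 * set in 'really_idle', or -1 if none of the candidates is idle.
 */
static int scan_mask(uint64_t candidates, uint64_t really_idle,
		     uint64_t *visited, unsigned int *scanned)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		uint64_t bit = UINT64_C(1) << cpu;

		if (!(candidates & bit))
			continue;
		*visited |= bit;
		(*scanned)++;
		if (really_idle & bit)
			return (int)cpu;
	}
	return -1;
}

/*
 * Pass 1 trusts the hint and only looks at CPUs it claims are idle.
 * Pass 2 is the normal scan, restricted to CPUs not visited in pass 1,
 * so a stale hint costs extra work but never hides an idle CPU.
 */
static int pick_idle_cpu(uint64_t allowed, uint64_t hint,
			 uint64_t really_idle, unsigned int *scanned)
{
	uint64_t visited = 0;
	int cpu;

	assert(scanned != NULL);
	*scanned = 0;

	cpu = scan_mask(allowed & hint, really_idle, &visited, scanned);
	if (cpu >= 0)
		return cpu;

	return scan_mask(allowed & ~visited, really_idle, &visited, scanned);
}
```

When the hint is accurate the search ends in pass 1; when it is stale, each
CPU is still examined at most once across both passes because of the
visited mask.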
Vincent Guittot Dec. 4, 2020, 3:43 p.m. UTC | #15
On Fri, 4 Dec 2020 at 16:40, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Fri, Dec 04, 2020 at 04:23:48PM +0100, Vincent Guittot wrote:
> > On Fri, 4 Dec 2020 at 15:31, Mel Gorman <mgorman@techsingularity.net> wrote:
> > >
> > > On Fri, Dec 04, 2020 at 02:47:48PM +0100, Vincent Guittot wrote:
> > > > > IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> > > > > If the target's sibling is removed from select_idle_mask from select_idle_core(),
> > > > > select_idle_cpu() will lose the chance to pick it up?
> > > >
> > > > This is only relevant for patch 10 which is not to be included IIUC
> > > > what mel said in cover letter : "Patches 9 and 10 are stupid in the
> > > > context of this series."
> > > >
> > >
> > > Patch 10 was stupid in the context of the prototype because
> > > select_idle_core always returned a CPU. A variation ended up being
> > > reintroduced at the end of the Series Yet To Be Posted so that SMT siblings
> > > are cleared during select_idle_core() but select_idle_cpu() still has a
> > > mask with unvisited CPUs to consider if no idle cores are found.
> > >
> > > As far as I know, this would still be compatible with Aubrey's idle
> > > cpu mask as long as it's visited and cleared between select_idle_core
> > > and select_idle_cpu. It relaxes the contraints on Aubrey to some extent
> > > because the idle cpu mask would be a hint so if the information is out
> > > of date, an idle cpu may still be found the normal way.
> >
> > But even without patch 10, just replacing sched_domain_span(sd) by
> > sds_idle_cpus(sd->shared) will ensure that sis loops only on cpus that
> > get a chance to be idle so select_idle_core is likely to return an
> > idle_candidate
> >
>
> Yes but if the idle mask is out of date for any reason then idle CPUs might

In fact it's the opposite: a CPU in the idle mask might not be idle, but
all CPUs that enter idle will be set.

> be missed -- hence the intent to maintain a mask of CPUs visited and use
> the idle cpu mask as a hint to prioritise CPUs that are likely idle but
> fall back to a normal scan if none of the "idle cpu mask" CPUs are
> actually idle.
>
> --
> Mel Gorman
> SUSE Labs
Mel Gorman Dec. 4, 2020, 6:41 p.m. UTC | #16
On Fri, Dec 04, 2020 at 04:43:05PM +0100, Vincent Guittot wrote:
> On Fri, 4 Dec 2020 at 16:40, Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > On Fri, Dec 04, 2020 at 04:23:48PM +0100, Vincent Guittot wrote:
> > > On Fri, 4 Dec 2020 at 15:31, Mel Gorman <mgorman@techsingularity.net> wrote:
> > > >
> > > > On Fri, Dec 04, 2020 at 02:47:48PM +0100, Vincent Guittot wrote:
> > > > > > IIUC, select_idle_core and select_idle_cpu share the same cpumask(select_idle_mask)?
> > > > > > If the target's sibling is removed from select_idle_mask from select_idle_core(),
> > > > > > select_idle_cpu() will lose the chance to pick it up?
> > > > >
> > > > > This is only relevant for patch 10 which is not to be included IIUC
> > > > > what mel said in cover letter : "Patches 9 and 10 are stupid in the
> > > > > context of this series."
> > > > >
> > > >
> > > > Patch 10 was stupid in the context of the prototype because
> > > > select_idle_core always returned a CPU. A variation ended up being
> > > > reintroduced at the end of the Series Yet To Be Posted so that SMT siblings
> > > > are cleared during select_idle_core() but select_idle_cpu() still has a
> > > > mask with unvisited CPUs to consider if no idle cores are found.
> > > >
> > > > As far as I know, this would still be compatible with Aubrey's idle
> > > > cpu mask as long as it's visited and cleared between select_idle_core
> > > > and select_idle_cpu. It relaxes the contraints on Aubrey to some extent
> > > > because the idle cpu mask would be a hint so if the information is out
> > > > of date, an idle cpu may still be found the normal way.
> > >
> > > But even without patch 10, just replacing sched_domain_span(sd) by
> > > sds_idle_cpus(sd->shared) will ensure that sis loops only on cpus that
> > > get a chance to be idle so select_idle_core is likely to return an
> > > idle_candidate
> > >
> >
> > Yes but if the idle mask is out of date for any reason then idle CPUs might
> 
> In fact it's the opposite, a cpu in idle mask might not be idle but
> all cpus that enter idle will be set
> 

When I first checked, the information was based on the tick or a CPU
stopping the tick. That was not guaranteed to be up to date so I considered
the best option would be to treat idle cpu mask as advisory. It would
not necessarily cover a CPU that was entering idle and polling before
entering an idle state for example or a rq that would pass sched_idle_cpu()
depending on the timing of the update_idle_cpumask call.

I know you reviewed that patch and v6 may be very different but the more
up to date that information is, the greater the cache conflicts will be
on sched_domain_shared so maintaining the up-to-date information may cost
enough to offset any benefit from reduced searching at wakeup.

If this turns out to be wrong, then great, the idle cpu mask can be used
as both the basis for an idle core search and a fast find of an individual
CPU. If the cost of keeping up to date information is too high then the
idle_cpu_mask can be treated as advisory to start the search and track
CPUs visited.

The series are not either/or. Chunks of the series I posted are orthogonal
(e.g. changes to p->recent_used_cpu), and the latter parts could either work
with the idle cpu mask or be replaced by the idle cpu mask depending on which
performs better.
Patch

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 68dd9cd62fbd..1d8f5c4b4936 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6077,6 +6077,7 @@  static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
 		return -1;
 
 	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+	__cpumask_clear_cpu(target, cpus);
 
 	for_each_cpu_wrap(core, cpus, target) {
 		bool idle = true;
@@ -6181,6 +6182,7 @@  static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	time = cpu_clock(this);
 
 	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+	__cpumask_clear_cpu(target, cpus);
 
 	for_each_cpu_wrap(cpu, cpus, target) {
 		schedstat_inc(this_rq()->sis_scanned);