Message ID | 20201208153501.1467-3-mgorman@techsingularity.net
---|---
State      | New, archived
Series     | Reduce scanning of runqueues in select_idle_sibling
On Tue, 8 Dec 2020 at 16:35, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP
> even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP
> check and while we are at it, exclude the cost of initialising the CPU
> mask from the average scan cost.
>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  kernel/sched/fair.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index ac7b34e7372b..5c41875aec23 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6153,6 +6153,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>  	if (!this_sd)
>  		return -1;

Just noticed while reviewing the patch that the above related to
this_sd can also go under sched_feat(SIS_PROP)

>
> +	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +
>  	if (sched_feat(SIS_PROP)) {
>  		u64 avg_cost, avg_idle, span_avg;
>
> @@ -6168,11 +6170,9 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>  			nr = div_u64(span_avg, avg_cost);
>  		else
>  			nr = 4;
> -	}
> -
> -	time = cpu_clock(this);
>
> -	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +		time = cpu_clock(this);
> +	}
>
>  	for_each_cpu_wrap(cpu, cpus, target) {
>  		if (!--nr)
> @@ -6181,8 +6181,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>  			break;
>  	}
>
> -	time = cpu_clock(this) - time;
> -	update_avg(&this_sd->avg_scan_cost, time);
> +	if (sched_feat(SIS_PROP)) {
> +		time = cpu_clock(this) - time;
> +		update_avg(&this_sd->avg_scan_cost, time);
> +	}
>
>  	return cpu;
>  }
> --
> 2.26.2
>
On Tue, Dec 08, 2020 at 05:03:21PM +0100, Vincent Guittot wrote:
> On Tue, 8 Dec 2020 at 16:35, Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP
> > even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP
> > check and while we are at it, exclude the cost of initialising the CPU
> > mask from the average scan cost.
> >
> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> > ---
> >  kernel/sched/fair.c | 14 ++++++++------
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index ac7b34e7372b..5c41875aec23 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6153,6 +6153,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> >  	if (!this_sd)
> >  		return -1;
>
> Just noticed while reviewing the patch that the above related to
> this_sd can also go under sched_feat(SIS_PROP)
>

Technically yes, but I also decided against it. It's a functional
difference depending on whether SIS_PROP is set in the highly unlikely
case that this_sd == NULL.

I was also thinking in terms of what happens if SIS_PROP was disabled
and enabled while a search is in progress, even if that's very unlikely.
In that case, this_sd would be uninitialised. That might be impossible
in practice depending on how static branching is implemented, but I
don't think we should rely on the internals of jump labels, and would
rather play it safe. I can move it in if you feel strongly about it, but
I think the disable/enable race is enough of a concern to leave it
alone.
On 2020/12/9 0:03, Vincent Guittot wrote:
> On Tue, 8 Dec 2020 at 16:35, Mel Gorman <mgorman@techsingularity.net> wrote:
>>
>> As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP
>> even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP
>> check and while we are at it, exclude the cost of initialising the CPU
>> mask from the average scan cost.
>>
>> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
>> ---
>>  kernel/sched/fair.c | 14 ++++++++------
>>  1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index ac7b34e7372b..5c41875aec23 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6153,6 +6153,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>>  	if (!this_sd)
>>  		return -1;
>
> Just noticed while reviewing the patch that the above related to
> this_sd can also go under sched_feat(SIS_PROP)
>
>>
>> +	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>> +
>>  	if (sched_feat(SIS_PROP)) {
>>  		u64 avg_cost, avg_idle, span_avg;
>>
>> @@ -6168,11 +6170,9 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>>  			nr = div_u64(span_avg, avg_cost);
>>  		else
>>  			nr = 4;
>> -	}
>> -
>> -	time = cpu_clock(this);
>>
>> -	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>> +		time = cpu_clock(this);
>> +	}
>>
>>  	for_each_cpu_wrap(cpu, cpus, target) {
>>  		if (!--nr)

nr is the key of this throttling mechanism and needs to be placed under
sched_feat(SIS_PROP) as well.

Thanks,
-Aubrey
On Wed, Dec 09, 2020 at 01:28:11PM +0800, Li, Aubrey wrote:
> >> 			nr = div_u64(span_avg, avg_cost);
> >> 		else
> >> 			nr = 4;
> >> -	}
> >> -
> >> -	time = cpu_clock(this);
> >>
> >> -	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> >> +		time = cpu_clock(this);
> >> +	}
> >>
> >> 	for_each_cpu_wrap(cpu, cpus, target) {
> >> 		if (!--nr)
>
> nr is the key of this throttling mechanism and needs to be placed under
> sched_feat(SIS_PROP) as well.
>

It isn't necessary as nr is initialised to INT_MAX if !SIS_PROP.
On 2020/12/9 17:05, Mel Gorman wrote:
> On Wed, Dec 09, 2020 at 01:28:11PM +0800, Li, Aubrey wrote:
>>>> 			nr = div_u64(span_avg, avg_cost);
>>>> 		else
>>>> 			nr = 4;
>>>> -	}
>>>> -
>>>> -	time = cpu_clock(this);
>>>>
>>>> -	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>>>> +		time = cpu_clock(this);
>>>> +	}
>>>>
>>>> 	for_each_cpu_wrap(cpu, cpus, target) {
>>>> 		if (!--nr)
>>
>> nr is the key of this throttling mechanism and needs to be placed under
>> sched_feat(SIS_PROP) as well.
>>
>
> It isn't necessary as nr is initialised to INT_MAX if !SIS_PROP.
>

If !SIS_PROP, nr needs to be decremented and then tested in the loop,
instead of being tested directly. But with SIS_PROP, a test would have
to be added in the loop. Since SIS_PROP is true by default, I think
it's okay to keep the current way.

Thanks,
-Aubrey
On Wed, Dec 09, 2020 at 07:07:11PM +0800, Li, Aubrey wrote:
> On 2020/12/9 17:05, Mel Gorman wrote:
> > On Wed, Dec 09, 2020 at 01:28:11PM +0800, Li, Aubrey wrote:
> >>>> 			nr = div_u64(span_avg, avg_cost);
> >>>> 		else
> >>>> 			nr = 4;
> >>>> -	}
> >>>> -
> >>>> -	time = cpu_clock(this);
> >>>>
> >>>> -	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> >>>> +		time = cpu_clock(this);
> >>>> +	}
> >>>>
> >>>> 	for_each_cpu_wrap(cpu, cpus, target) {
> >>>> 		if (!--nr)
> >>
> >> nr is the key of this throttling mechanism and needs to be placed
> >> under sched_feat(SIS_PROP) as well.
> >>
> >
> > It isn't necessary as nr is initialised to INT_MAX if !SIS_PROP.
> >
>
> If !SIS_PROP, nr needs to be decremented and then tested in the loop,
> instead of being tested directly. But with SIS_PROP, a test would have
> to be added in the loop. Since SIS_PROP is true by default, I think
> it's okay to keep the current way.
>

It's because it's true by default and the cost is negligible that I'm
leaving it alone. The branch cost and the nr accounting cost are
negligible, and it avoids peppering select_idle_cpu() with too many
SIS_PROP checks.
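[Editor's note] The INT_MAX point can be made concrete with a standalone
sketch of the throttling loop. This is plain userspace C, not kernel
code: `scan_for_idle`, the `idle` array and the `sis_prop_enabled` flag
are hypothetical stand-ins for for_each_cpu_wrap() and sched_feat().

```c
#include <limits.h>

/* Toy stand-in for the SIS_PROP scheduler feature flag. */
static int sis_prop_enabled;

/*
 * Scan up to 'budget' entries of 'idle' looking for an idle CPU.
 * When the feature is disabled, nr stays at INT_MAX, so the
 * "if (!--nr)" check in the loop effectively never fires and the
 * whole span is scanned. This is why the decrement-and-test does
 * not itself need to be guarded by the feature flag.
 */
static int scan_for_idle(const int *idle, int ncpus, int budget)
{
	int nr = INT_MAX;
	int cpu;

	if (sis_prop_enabled)
		nr = budget;	/* throttled scan depth */

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (!--nr)
			return -1;	/* scan budget exhausted */
		if (idle[cpu])
			return cpu;
	}
	return -1;
}
```

With the flag off, a 4-entry span whose only idle entry is last is still
found; with the flag on and a budget of 2, the scan gives up first.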
On 2020/12/8 23:34, Mel Gorman wrote:
> As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP
> even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP
> check and while we are at it, exclude the cost of initialising the CPU
> mask from the average scan cost.
>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  kernel/sched/fair.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index ac7b34e7372b..5c41875aec23 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6153,6 +6153,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>  	if (!this_sd)
>  		return -1;
>
> +	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +
>  	if (sched_feat(SIS_PROP)) {
>  		u64 avg_cost, avg_idle, span_avg;
>
> @@ -6168,11 +6170,9 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>  			nr = div_u64(span_avg, avg_cost);
>  		else
>  			nr = 4;
> -	}
> -
> -	time = cpu_clock(this);
>
> -	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +		time = cpu_clock(this);
> +	}
>
>  	for_each_cpu_wrap(cpu, cpus, target) {
>  		if (!--nr)
>  			return -1;

I thought about this again and it does not seem to be consistent here:

- Even if nr reduces to 0, shouldn't avg_scan_cost be updated as well
  before returning -1?
- If avg_scan_cost is not updated because nr is throttled, the first
  time = cpu_clock(this) can be optimized away, as nr is calculated and
  we already know which of the weight of the cpumask and nr is greater.

Thanks,
-Aubrey
On Thu, Dec 10, 2020 at 01:18:05PM +0800, Li, Aubrey wrote:
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index ac7b34e7372b..5c41875aec23 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6153,6 +6153,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> >  	if (!this_sd)
> >  		return -1;
> >
> > +	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> > +
> >  	if (sched_feat(SIS_PROP)) {
> >  		u64 avg_cost, avg_idle, span_avg;
> >
> > @@ -6168,11 +6170,9 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> >  			nr = div_u64(span_avg, avg_cost);
> >  		else
> >  			nr = 4;
> > -	}
> > -
> > -	time = cpu_clock(this);
> >
> > -	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> > +		time = cpu_clock(this);
> > +	}
> >
> >  	for_each_cpu_wrap(cpu, cpus, target) {
> >  		if (!--nr)
> >  			return -1;
>
> I thought about this again and it does not seem to be consistent here:
> - Even if nr reduces to 0, shouldn't avg_scan_cost be updated as well
>   before returning -1?

You're right, but it's outside the scope of this patch. I noted that
this was a problem in
https://lore.kernel.org/r/20201203141124.7391-8-mgorman@techsingularity.net
It's neither a consistent win nor a consistent loss to always account
for it, so it was dropped for this series to keep the number of
controversial patches to a minimum.

> - If avg_scan_cost is not updated because nr is throttled, the first
>   time = cpu_clock(this) can be optimized away, as nr is calculated and
>   we already know which of the weight of the cpumask and nr is greater.
>

That is also outside the scope of this patch. To do that,
cpumask_weight() would have to be calculated, but it's likely to be a
net loss. Even under light load, nr will be smaller than the domain
weight, incurring both the cost of cpumask_weight() and the clock read
in the common case.
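[Editor's note] For readers unfamiliar with the averaging that
avg_scan_cost feeds: the scheduler's update_avg() of this era is a
simple /8-weighted exponential moving average. A self-contained
userspace version (types swapped for stdint equivalents):

```c
#include <stdint.h>

/*
 * Mirrors the scheduler's update_avg(): each new sample pulls the
 * running average 1/8 of the way towards itself, so old scan costs
 * decay geometrically rather than being averaged uniformly.
 */
static void update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg += diff / 8;
}
```

This is why excluding the cpumask_and() cost matters: every sample,
including setup overhead, is folded into the average that later sizes
the scan budget.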
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ac7b34e7372b..5c41875aec23 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6153,6 +6153,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	if (!this_sd)
 		return -1;
 
+	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+
 	if (sched_feat(SIS_PROP)) {
 		u64 avg_cost, avg_idle, span_avg;
 
@@ -6168,11 +6170,9 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 			nr = div_u64(span_avg, avg_cost);
 		else
 			nr = 4;
-	}
-
-	time = cpu_clock(this);
 
-	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+		time = cpu_clock(this);
+	}
 
 	for_each_cpu_wrap(cpu, cpus, target) {
 		if (!--nr)
@@ -6181,8 +6181,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 			break;
 	}
 
-	time = cpu_clock(this) - time;
-	update_avg(&this_sd->avg_scan_cost, time);
+	if (sched_feat(SIS_PROP)) {
+		time = cpu_clock(this) - time;
+		update_avg(&this_sd->avg_scan_cost, time);
+	}
 
 	return cpu;
 }
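[Editor's note] A rough, self-contained model of how the SIS_PROP block
derives the scan budget nr discussed throughout this thread. The /512
scaling and the floor of 4 follow the fair.c of this era, but this is a
simplified userspace sketch, not the kernel function:

```c
#include <stdint.h>

/*
 * Simplified SIS_PROP depth calculation: the scan budget grows with
 * how idle this CPU has been relative to how expensive previous
 * scans were, with a floor of 4 CPUs per scan.
 */
static unsigned int sis_prop_nr(uint64_t rq_avg_idle,
				uint64_t avg_scan_cost,
				unsigned int span_weight)
{
	uint64_t avg_idle = rq_avg_idle / 512;	/* scaling as in fair.c */
	uint64_t avg_cost = avg_scan_cost + 1;	/* avoid division by zero */
	uint64_t span_avg = (uint64_t)span_weight * avg_idle;

	if (span_avg > 4 * avg_cost)
		return (unsigned int)(span_avg / avg_cost);
	return 4;
}
```

Note the feedback loop this creates with avg_scan_cost: if mask setup
cost leaks into the average (the problem this patch fixes), avg_cost is
inflated and the budget shrinks even when scans themselves are cheap.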
As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP
even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP
check and while we are at it, exclude the cost of initialising the CPU
mask from the average scan cost.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/sched/fair.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)