Message ID | 1456190570-4475-2-git-send-email-smuckle@linaro.org (mailing list archive) |
---|---|
State | RFC, archived |
On Tue, Feb 23, 2016 at 2:22 AM, Steve Muckle <steve.muckle@linaro.org> wrote:
> From: Morten Rasmussen <morten.rasmussen@arm.com>
>
> capacity_orig_of() returns the max available compute capacity of a cpu.
> For scale-invariant utilization tracking and energy-aware scheduling
> decisions it is useful to know the compute capacity available at the
> current OPP of a cpu.
>
> cc: Ingo Molnar <mingo@redhat.com>
> cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
> Signed-off-by: Steve Muckle <smuckle@linaro.org>
> ---
>  kernel/sched/fair.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 7ce24a4..3437e01 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4821,6 +4821,17 @@ static long effective_load(struct task_group *tg, int cpu, long wl, long wg)
>  #endif
>
>  /*
> + * Returns the current capacity of cpu after applying both
> + * cpu and freq scaling.
> + */
> +static unsigned long capacity_curr_of(int cpu)
> +{
> +       return cpu_rq(cpu)->cpu_capacity_orig *
> +              arch_scale_freq_capacity(NULL, cpu)

What about architectures that don't have this?

Why is that an architecture feature?  I can easily imagine two x86
platforms using different scale_freq_capacity(), for example.

> +              >> SCHED_CAPACITY_SHIFT;
> +}
> +
> +/*
>   * Detect M:N waker/wakee relationships via a switching-frequency heuristic.
>   * A waker of many should wake a different task than the one last awakened
>   * at a frequency roughly N times higher than one of its wakees. In order
> --

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Feb 23, 2016 at 02:41:20AM +0100, Rafael J. Wysocki wrote:
> >  /*
> > + * Returns the current capacity of cpu after applying both
> > + * cpu and freq scaling.
> > + */
> > +static unsigned long capacity_curr_of(int cpu)
> > +{
> > +       return cpu_rq(cpu)->cpu_capacity_orig *
> > +              arch_scale_freq_capacity(NULL, cpu)
>
> What about architectures that don't have this?

They get the 'default' which is a constant SCHED_CAPACITY_SCALE unit.

> Why is that an architecture feature?

Because not all archs can tell the frequency the same way. Some you
program the DVFS state and they really run at this speed, for those you
can simply report back.

For others, x86 for example, you program a DVFS 'hint' and the hardware
does whatever, we'd have to do APERF/MPERF samples to get an idea of the
actual frequency we ran at.

Also, the having of this makes the load tracking slightly more
expensive, instead of compile time constants we get function calls and
actual multiplications. Its not _too_ bad, but still.

> I can easily imagine two x86 platforms using different
> scale_freq_capacity(), for example.

That's up to the arch, if different x86 platforms need different
thingies the arch implementation needs to offer a selector -- this isn't
'hard'.
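[Editorial illustration, not part of the thread.] The two cases Peter describes, a compile-time constant default and an arch-provided estimate built from counter samples, might be sketched in userspace C as below. Every `demo_` name is hypothetical, the value 10 for SCHED_CAPACITY_SHIFT is assumed, and the counter helper only mimics the idea of an APERF/MPERF ratio; it is not the kernel's implementation and reads no real MSRs.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; the kernel defines these in its own headers. */
#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

/*
 * Hypothetical helper: estimate the frequency scale factor from two
 * counter deltas taken over the same window. delta_actual counts cycles
 * at the speed the core really ran at, delta_ref counts cycles at a
 * fixed reference rate. The shift can overflow for very large deltas,
 * so a real implementation would need to bound the sampling window.
 */
static unsigned long demo_freq_scale_from_counters(uint64_t delta_actual,
						   uint64_t delta_ref)
{
	uint64_t scale;

	assert(delta_ref != 0);
	scale = (delta_actual << SCHED_CAPACITY_SHIFT) / delta_ref;

	/* Sketch assumption: treat the reference rate as the maximum. */
	if (scale > SCHED_CAPACITY_SCALE)
		scale = SCHED_CAPACITY_SCALE;
	return (unsigned long)scale;
}

/*
 * Default used when no override macro is defined: a constant, so
 * callers see no function call and no real multiplication cost.
 */
#ifndef demo_arch_scale_freq_capacity
static inline unsigned long demo_arch_scale_freq_capacity(int cpu)
{
	(void)cpu;
	return SCHED_CAPACITY_SCALE;
}
#endif
```

With these definitions, a core that accumulated 500 actual cycles against 1000 reference cycles yields a scale of 512, i.e. half of SCHED_CAPACITY_SCALE, while the default always reports 1024.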
On Tuesday, February 23, 2016 10:19:16 AM Peter Zijlstra wrote:
> On Tue, Feb 23, 2016 at 02:41:20AM +0100, Rafael J. Wysocki wrote:
> > >  /*
> > > + * Returns the current capacity of cpu after applying both
> > > + * cpu and freq scaling.
> > > + */
> > > +static unsigned long capacity_curr_of(int cpu)
> > > +{
> > > +       return cpu_rq(cpu)->cpu_capacity_orig *
> > > +              arch_scale_freq_capacity(NULL, cpu)
> >
> > What about architectures that don't have this?
>
> They get the 'default' which is a constant SCHED_CAPACITY_SCALE unit.
>
> > Why is that an architecture feature?
>
> Because not all archs can tell the frequency the same way. Some you
> program the DVFS state and they really run at this speed, for those you
> can simply report back.
>
> For others, x86 for example, you program a DVFS 'hint' and the hardware
> does whatever, we'd have to do APERF/MPERF samples to get an idea of the
> actual frequency we ran at.
>
> Also, the having of this makes the load tracking slightly more
> expensive, instead of compile time constants we get function calls and
> actual multiplications. Its not _too_ bad, but still.

That's all correct, but my question should rather be: is arch the right
granularity?

In theory, there may be ARM64-based platforms using ACPI and behaving
like x86 in that respect in the future.

Thanks,
Rafael
On Fri, Feb 26, 2016 at 02:37:19AM +0100, Rafael J. Wysocki wrote:
> That's all correct, but my question should rather be: is arch the right
> granularity?
>
> In theory, there may be ARM64-based platforms using ACPI and behaving
> like x86 in that respect in the future.

Ah, so I started these hooks way before the cpufreq/cpuidle etc.
integration push. Maybe we should look at something like that, but
performance is really critical, you most definitely do not want 3
indirections just because abstract framework crap, that's measurable
overhead on these callsites.

Hence the current inline with constant value or single function call.

And if archs would want a selector, I would recommend boot time call
instruction rewrites a-la alternatives/paravirt.
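[Editorial illustration, not part of the thread.] A rough userspace stand-in for a "selector" chosen once at boot is sketched below. It uses a single function pointer assigned during init, which is only an approximation: the alternatives/paravirt approach Peter recommends rewrites the call instructions themselves, avoiding even this one pointer load. All `demo_` names and the per-cpu ratio table are hypothetical, and the value 10 for SCHED_CAPACITY_SHIFT is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values; the kernel defines these in its own headers. */
#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

typedef unsigned long (*demo_freq_scale_fn)(int cpu);

/* Variant for platforms that always report full scale. */
static unsigned long demo_scale_constant(int cpu)
{
	(void)cpu;
	return SCHED_CAPACITY_SCALE;
}

/* Variant for platforms where the programmed DVFS state can be reported. */
static unsigned long demo_scale_reported(int cpu)
{
	/* Hypothetical current/max frequency ratios for four cpus. */
	static const unsigned long ratio[4] = { 1024, 768, 512, 256 };

	return ratio[(unsigned int)cpu % 4];
}

/* Set once during init; hot paths only ever read it. */
static demo_freq_scale_fn demo_freq_scale = demo_scale_constant;

static void demo_select_freq_scale(int platform_reports_freq)
{
	demo_freq_scale = platform_reports_freq ? demo_scale_reported
						: demo_scale_constant;
}

/* Hot-path accessor: exactly one indirect call, no framework layers. */
static unsigned long demo_current_freq_scale(int cpu)
{
	assert(demo_freq_scale != NULL);
	return demo_freq_scale(cpu);
}
```

The point of the sketch is the shape of the cost: selection happens once, and each call site pays for at most one indirection rather than several layers of framework dispatch.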
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7ce24a4..3437e01 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4821,6 +4821,17 @@ static long effective_load(struct task_group *tg, int cpu, long wl, long wg)
 #endif
 
 /*
+ * Returns the current capacity of cpu after applying both
+ * cpu and freq scaling.
+ */
+static unsigned long capacity_curr_of(int cpu)
+{
+	return cpu_rq(cpu)->cpu_capacity_orig *
+	       arch_scale_freq_capacity(NULL, cpu)
+	       >> SCHED_CAPACITY_SHIFT;
+}
+
+/*
  * Detect M:N waker/wakee relationships via a switching-frequency heuristic.
  * A waker of many should wake a different task than the one last awakened
  * at a frequency roughly N times higher than one of its wakees. In order
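
[Editorial illustration, not part of the patch.] The fixed-point arithmetic in capacity_curr_of() can be tried out in userspace with the sketch below. It assumes SCHED_CAPACITY_SHIFT is 10 (so full scale is 1024) and takes both factors as plain arguments instead of reading them from cpu_rq(cpu) and the arch hook; `demo_capacity_curr` is a hypothetical name.

```c
#include <assert.h>

/* Assumed values; the kernel defines these in its own headers. */
#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

/*
 * Userspace stand-in for capacity_curr_of(): scale the cpu's maximum
 * capacity by the current frequency factor, both expressed in units of
 * SCHED_CAPACITY_SCALE. The multiplication happens before the shift
 * because '*' binds tighter than '>>' in C, matching the patch.
 */
static unsigned long demo_capacity_curr(unsigned long capacity_orig,
					unsigned long freq_scale)
{
	assert(freq_scale <= SCHED_CAPACITY_SCALE);
	return (capacity_orig * freq_scale) >> SCHED_CAPACITY_SHIFT;
}
```

For example, a cpu with an original capacity of 430 running at half its maximum frequency (freq_scale = 512) gives 430 * 512 >> 10 = 215, while a full-capacity cpu at full frequency stays at 1024.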