Message ID | 1457932932-28444-8-git-send-email-mturquette+renesas@baylibre.com (mailing list archive) |
---|---|
State | RFC, archived |
Hi Mike,

On 14/03/16 05:22, Michael Turquette wrote:
> From: Dietmar Eggemann <dietmar.eggemann@arm.com>
>
> Implements cpufreq_scale_freq_capacity() to provide the scheduler with a
> frequency scaling correction factor for more accurate load-tracking.
>
> The factor is:
>
> current_freq(cpu) << SCHED_CAPACITY_SHIFT / max_freq(cpu)
>
> In fact, freq_scale should be a struct cpufreq_policy data member. But
> this would require that the scheduler hot path (__update_load_avg()) would
> have to grab the cpufreq lock. This can be avoided by using per-cpu data
> initialized to SCHED_CAPACITY_SCALE for freq_scale.
>
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Signed-off-by: Michael Turquette <mturquette+renesas@baylibre.com>
> ---
> I'm not as sure about patches 7 & 8, but I included them since I needed
> frequency invariance while testing.
>
> As mentioned by myself in 2014 and Rafael last month, the
> arch_scale_freq_capacity hook is awkward, because this behavior may vary
> within an architecture.
>
> I re-introduce Dietmar's generic cpufreq implementation of the frequency
> invariance hook in this patch, and change the preprocessor magic in
> sched.h to favor the cpufreq implementation over arch- or
> platform-specific ones in the next patch.

Maybe it is worth mentioning that this patch is from EAS RFC5.2
(linux-arm.org/linux-power.git energy_model_rfc_v5.2) which hasn't been
posted to LKML. The last EAS RFCv5 has the Frequency Invariant Engine
(FEI) based on the cpufreq notifier calls (cpufreq_callback,
cpufreq_policy_callback) in the ARM arch code.

> If run-time selection of ops is needed then someone will need to write
> that code.

Right now I see 3 different implementations of the FEI. 1) The X86
aperf/mperf based one (https://lkml.org/lkml/2016/3/3/589), 2) This one
in cpufreq.c and 3) the one based on cpufreq notifiers in ARCH (ARM,
ARM64) code.

I guess with sched_util we do need a solution for all platforms
(different archs, x86 w/ and w/o X86_FEATURE_APERFMPERF, ...).

> I think that this negates the need for the arm arch hooks[0-2], and
> hopefully Morten and Dietmar can weigh in on this.

It's true that we tried to get rid of the usage of the cpufreq callbacks
(cpufreq_callback, cpufreq_policy_callback) with this patch. Plus we
didn't want to implement it twice (for ARM and ARM64).

But 2) would have to work for other ARCHs as well. Maybe as a fall-back
for X86 w/o X86_FEATURE_APERFMPERF feature?

[...]

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
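The factor in the commit message is written informally. As a minimal user-space sketch (not kernel code), the same computation could look like the block below. `SCHED_CAPACITY_SHIFT` is assumed to be 10, the helper name `freq_scale_factor` is hypothetical, and the zero check on the maximum frequency is an addition that the patch does not have. Note that in C the `/` operator binds tighter than `<<`, so the shift has to be parenthesized, as the patch itself does.

```c
#include <stdint.h>

/* Assumed values mirroring the kernel's definitions at the time. */
#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

/*
 * Hypothetical helper: frequency-invariance factor in the range
 * [0, SCHED_CAPACITY_SCALE]. Frequencies are in kHz. The 64-bit
 * intermediate keeps the shifted value from overflowing on 32-bit builds.
 */
static inline unsigned long freq_scale_factor(unsigned int cur_khz,
					      unsigned int max_khz)
{
	if (max_khz == 0)
		return SCHED_CAPACITY_SCALE; /* assumption: treat as unscaled */

	/* Parentheses matter: "cur << SHIFT / max" would shift by SHIFT / max. */
	return (unsigned long)(((uint64_t)cur_khz << SCHED_CAPACITY_SHIFT) /
			       max_khz);
}
```

With `cur_khz = 1200000` and `max_khz = 2400000` the helper returns 512, i.e. half of `SCHED_CAPACITY_SCALE`.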
Quoting Dietmar Eggemann (2016-03-15 12:13:46)
> Hi Mike,
>
> On 14/03/16 05:22, Michael Turquette wrote:
> > From: Dietmar Eggemann <dietmar.eggemann@arm.com>
> >
> > Implements cpufreq_scale_freq_capacity() to provide the scheduler with a
> > frequency scaling correction factor for more accurate load-tracking.
> >
> > The factor is:
> >
> > current_freq(cpu) << SCHED_CAPACITY_SHIFT / max_freq(cpu)
> >
> > In fact, freq_scale should be a struct cpufreq_policy data member. But
> > this would require that the scheduler hot path (__update_load_avg()) would
> > have to grab the cpufreq lock. This can be avoided by using per-cpu data
> > initialized to SCHED_CAPACITY_SCALE for freq_scale.
> >
> > Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> > Signed-off-by: Michael Turquette <mturquette+renesas@baylibre.com>
> > ---
> > I'm not as sure about patches 7 & 8, but I included them since I needed
> > frequency invariance while testing.
> >
> > As mentioned by myself in 2014 and Rafael last month, the
> > arch_scale_freq_capacity hook is awkward, because this behavior may vary
> > within an architecture.
> >
> > I re-introduce Dietmar's generic cpufreq implementation of the frequency
> > invariance hook in this patch, and change the preprocessor magic in
> > sched.h to favor the cpufreq implementation over arch- or
> > platform-specific ones in the next patch.
>
> Maybe it is worth mentioning that this patch is from EAS RFC5.2
> (linux-arm.org/linux-power.git energy_model_rfc_v5.2) which hasn't been
> posted to LKML. The last EAS RFCv5 has the Frequency Invariant Engine
> (FEI) based on the cpufreq notifier calls (cpufreq_callback,
> cpufreq_policy_callback) in the ARM arch code.

Oops, my apologies. I got a little mixed up while developing these
patches and I should have at least asked you about this one before
posting.

I'm really quite happy to drop #7 and #8 if they are too contentious or
if patch #7 is deemed as not-ready by you.

>
> > If run-time selection of ops is needed then someone will need to write
> > that code.
>
> Right now I see 3 different implementations of the FEI. 1) The X86
> aperf/mperf based one (https://lkml.org/lkml/2016/3/3/589), 2) This one
> in cpufreq.c and 3) the one based on cpufreq notifiers in ARCH (ARM,
> ARM64) code.
>
> I guess with sched_util we do need a solution for all platforms
> (different archs, x86 w/ and w/o X86_FEATURE_APERFMPERF, ...).
>
> > I think that this negates the need for the arm arch hooks[0-2], and
> > hopefully Morten and Dietmar can weigh in on this.
>
> It's true that we tried to get rid of the usage of the cpufreq callbacks
> (cpufreq_callback, cpufreq_policy_callback) with this patch. Plus we
> didn't want to implement it twice (for ARM and ARM64).
>
> But 2) would have to work for other ARCHs as well. Maybe as a fall-back
> for X86 w/o X86_FEATURE_APERFMPERF feature?

That's what I had in mind. I guess that some day there will be a need to
select implementations at run-time for both cpufreq (e.g. different
cpufreq drivers might implement arch_scale_freq_capacity) and for the
!CONFIG_CPU_FREQ case (e.g. different platforms might implement
arch_scale_freq_capacity within the same arch).

The cpufreq approach seems the most generic, hence patch #8 to make it
the default.

Regards,
Mike

>
> [...]
On Tue, Mar 15, 2016 at 01:19:17PM -0700, Michael Turquette wrote:
> That's what I had in mind. I guess that some day there will be a need to
> select implementations at run-time for both cpufreq (e.g. different
> cpufreq drivers might implement arch_scale_freq_capacity) and for the
> !CONFIG_CPU_FREQ case (e.g. different platforms might implement
> arch_scale_freq_capacity within the same arch).

No, no runtime selection. That gets us function pointers and other
indirect mess. We should be trying very hard to get rid of that
cpufreq_util_update() pointer, not add more of that gunk.

Use self modifying code if you have to do something.
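The objection above is to indirect calls in the scheduler hot path. A rough user-space sketch of the alternative, build-time selection through the preprocessor, is given below. It loosely follows the "default unless already defined" pattern that the thread calls preprocessor magic in sched.h. Every `demo_` name is hypothetical, the per-CPU data is modelled as a plain array, and `SCHED_CAPACITY_SHIFT` is assumed to be 10.

```c
#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

/* Hypothetical array standing in for the patch's per-CPU freq_scale. */
#define DEMO_NR_CPUS 4
static unsigned long demo_freq_scale[DEMO_NR_CPUS] = {
	SCHED_CAPACITY_SCALE, SCHED_CAPACITY_SCALE,
	SCHED_CAPACITY_SCALE, SCHED_CAPACITY_SCALE,
};

/* A provider opts in at build time by defining the hook as a macro. */
static inline unsigned long demo_cpufreq_scale(int cpu)
{
	return demo_freq_scale[cpu];
}
#define demo_scale_freq_capacity demo_cpufreq_scale

/* Generic default, compiled only when no provider defined the hook. */
#ifndef demo_scale_freq_capacity
static inline unsigned long demo_scale_freq_capacity(int cpu)
{
	(void)cpu;
	return SCHED_CAPACITY_SCALE;
}
#endif

/* Caller: the hook is resolved at compile time, with no function pointer. */
static inline unsigned long demo_scaled_delta(unsigned long delta, int cpu)
{
	return (delta * demo_scale_freq_capacity(cpu)) >> SCHED_CAPACITY_SHIFT;
}
```

Because the choice is fixed when the file is compiled, the compiler can inline the accessor into the caller. The cost is that one build carries exactly one implementation, which is the limitation the earlier message raises.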
On Sun, Mar 13, 2016 at 10:22:11PM -0700, Michael Turquette wrote:
> +static DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
> +
> +unsigned long cpufreq_scale_freq_capacity(struct sched_domain *sd, int cpu)
> +{
> +	return per_cpu(freq_scale, cpu);
> +}

These should be in a header so that we can avoid having to do a function
call.
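One way to read this suggestion, sketched as a single-file user-space model: the accessor becomes a `static inline` function in a header, and the variable gets one non-static definition in a source file. The `hdr_` names are hypothetical and the per-CPU variable is modelled as a plain array. In the kernel the declaration would presumably use `DECLARE_PER_CPU()` in the header instead.

```c
/* ---- header part: what a shared header might contain ---- */
#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)
#define HDR_NR_CPUS 4

/* Declaration only; the storage lives in exactly one source file. */
extern unsigned long hdr_freq_scale[HDR_NR_CPUS];

/* Inline accessor: callers read the value without a function call. */
static inline unsigned long hdr_cpufreq_scale_freq_capacity(int cpu)
{
	return hdr_freq_scale[cpu];
}

/* ---- source part: the single definition ---- */
unsigned long hdr_freq_scale[HDR_NR_CPUS] = {
	SCHED_CAPACITY_SCALE, SCHED_CAPACITY_SCALE,
	SCHED_CAPACITY_SCALE, SCHED_CAPACITY_SCALE,
};
```

A consequence of this layout is that the variable can no longer be `static` to cpufreq.c as in the posted patch, since every includer of the header must be able to reference it.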
On 15/03/16 20:19, Michael Turquette wrote:
> Quoting Dietmar Eggemann (2016-03-15 12:13:46)
>> Hi Mike,
>>
>> On 14/03/16 05:22, Michael Turquette wrote:
>>> From: Dietmar Eggemann <dietmar.eggemann@arm.com>
>>>

[...]

>> Maybe it is worth mentioning that this patch is from EAS RFC5.2
>> (linux-arm.org/linux-power.git energy_model_rfc_v5.2) which hasn't been
>> posted to LKML. The last EAS RFCv5 has the Frequency Invariant Engine
>> (FEI) based on the cpufreq notifier calls (cpufreq_callback,
>> cpufreq_policy_callback) in the ARM arch code.
>
> Oops, my apologies. I got a little mixed up while developing these
> patches and I should have at least asked you about this one before
> posting.

No need to apologize, just wanted to set the context right for people
following the EAS stuff.

> I'm really quite happy to drop #7 and #8 if they are too contentious or
> if patch #7 is deemed as not-ready by you.

If people think that driving frequency invariance from within cpufreq.c
is the right thing to do then this patch can stay. That said, I'm not
sure if we can do a better job of integrating freq_scale into existing
cpufreq data structures.

>>> If run-time selection of ops is needed then someone will need to write
>>> that code.
>>
>> Right now I see 3 different implementations of the FEI. 1) The X86
>> aperf/mperf based one (https://lkml.org/lkml/2016/3/3/589), 2) This one
>> in cpufreq.c and 3) the one based on cpufreq notifiers in ARCH (ARM,
>> ARM64) code.

I rechecked the functionality of this implementation (2) versus (1) and
(3) by running them all together on an X86 system w/
X86_FEATURE_APERFMPERF and acpi-cpufreq (i5 CPU M520) w/o actually doing
the wiring towards arch_scale_freq_capacity and they all behave similarly
or identically. The update of (1) happens much more often, obviously,
since it's driven from the scheduler tick.

...
arch_scale_freq_tick: APERF/MPERF: aperf=5864236223 mperf=6009442434 acnt=200073 mcnt=362360 arch_cpu_freq=565
arch_scale_freq_tick: APERF/MPERF: aperf=5864261886 mperf=6009474724 acnt=25663 mcnt=32290 arch_cpu_freq=813
scale_freq_capacity: CPUFREQ.C: cpu=1 freq_scale=910
cpufreq_callback: NOTIFIER: cpu=1 curr_freq=2133000 max_freq=2400000 freq_scale=910
arch_scale_freq_tick: APERF/MPERF: aperf=5864524610 mperf=6009802492 acnt=262724 mcnt=327768 arch_cpu_freq=820
arch_scale_freq_tick: APERF/MPERF: aperf=5864581063 mperf=6009862268 acnt=56453 mcnt=59776 arch_cpu_freq=967
...

[...]
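The numbers in the log are consistent with `acnt` and `mcnt` being the differences between consecutive `aperf` and `mperf` readings, and with `arch_cpu_freq` being `(acnt << 10) / mcnt`. That relation is inferred here from the logged values only, not taken from the x86 patch. A small user-space check of it, with hypothetical names, might look like this:

```c
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10 /* assumed kernel value */

/* Hypothetical sample of the two counters, as printed in the log. */
struct perf_sample {
	uint64_t aperf;
	uint64_t mperf;
};

/*
 * Ratio of actual to reference cycles between two samples, scaled by
 * 1 << SCHED_CAPACITY_SHIFT. Returns 0 if mperf did not advance.
 */
static inline uint64_t aperf_mperf_scale(struct perf_sample prev,
					 struct perf_sample now)
{
	uint64_t acnt = now.aperf - prev.aperf;
	uint64_t mcnt = now.mperf - prev.mperf;

	if (mcnt == 0)
		return 0;

	return (acnt << SCHED_CAPACITY_SHIFT) / mcnt;
}
```

Feeding the first two APERF/MPERF lines of the log into the helper gives 813, which matches the logged `arch_cpu_freq` of the second line. The cpufreq-based value of 910 likewise equals `(2133000 << 10) / 2400000`.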
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b1ca9c4..e67584f 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -306,6 +306,31 @@ static void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
 #endif
 }
 
+/*********************************************************************
+ *                 FREQUENCY INVARIANT CPU CAPACITY                  *
+ *********************************************************************/
+
+static DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
+
+static void
+scale_freq_capacity(struct cpufreq_policy *policy, struct cpufreq_freqs *freqs)
+{
+	unsigned long cur = freqs ? freqs->new : policy->cur;
+	unsigned long scale = (cur << SCHED_CAPACITY_SHIFT) / policy->max;
+	int cpu;
+
+	pr_debug("cpus %*pbl cur/cur max freq %lu/%u kHz freq scale %lu\n",
+		 cpumask_pr_args(policy->cpus), cur, policy->max, scale);
+
+	for_each_cpu(cpu, policy->cpus)
+		per_cpu(freq_scale, cpu) = scale;
+}
+
+unsigned long cpufreq_scale_freq_capacity(struct sched_domain *sd, int cpu)
+{
+	return per_cpu(freq_scale, cpu);
+}
+
 static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
 		struct cpufreq_freqs *freqs, unsigned int state)
 {
@@ -409,6 +434,8 @@ wait:
 
 	spin_unlock(&policy->transition_lock);
 
+	scale_freq_capacity(policy, freqs);
+
 	cpufreq_notify_transition(policy, freqs, CPUFREQ_PRECHANGE);
 }
 EXPORT_SYMBOL_GPL(cpufreq_freq_transition_begin);
@@ -2125,6 +2152,8 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
 			CPUFREQ_NOTIFY, new_policy);
 
+	scale_freq_capacity(new_policy, NULL);
+
 	policy->min = new_policy->min;
 	policy->max = new_policy->max;
 
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 0e39499..72833be 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -583,4 +583,7 @@ unsigned int cpufreq_generic_get(unsigned int cpu);
 int cpufreq_generic_init(struct cpufreq_policy *policy,
 		struct cpufreq_frequency_table *table,
 		unsigned int transition_latency);
+
+struct sched_domain;
+unsigned long cpufreq_scale_freq_capacity(struct sched_domain *sd, int cpu);
 #endif /* _LINUX_CPUFREQ_H */
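To make the data flow of `scale_freq_capacity()` easier to follow outside the kernel, here is a hedged user-space model of it. The `model_` structs are hypothetical stand-ins for `struct cpufreq_policy` and `struct cpufreq_freqs`, the CPU mask is a plain bitmask, the per-CPU variable is an array, and the zero check on the maximum frequency is an addition that the patch does not have.

```c
#include <stddef.h>
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10 /* assumed kernel value */
#define MODEL_NR_CPUS 8

/* Hypothetical stand-ins for the kernel structures used by the patch. */
struct model_policy {
	uint32_t cpus;        /* bitmask of CPUs sharing this policy */
	unsigned int cur;     /* current frequency, kHz */
	unsigned int max;     /* maximum frequency, kHz */
};

struct model_freqs {
	unsigned int new_khz; /* target frequency of a transition, kHz */
};

/* Models the per-CPU freq_scale, initialized to SCHED_CAPACITY_SCALE. */
static unsigned long model_freq_scale[MODEL_NR_CPUS] = {
	1024, 1024, 1024, 1024, 1024, 1024, 1024, 1024,
};

static void model_scale_freq_capacity(const struct model_policy *policy,
				      const struct model_freqs *freqs)
{
	/* During a transition use the target frequency, else policy->cur. */
	uint64_t cur = freqs ? freqs->new_khz : policy->cur;
	unsigned long scale;
	int cpu;

	if (policy->max == 0)
		return; /* assumption: leave the old value untouched */

	scale = (unsigned long)((cur << SCHED_CAPACITY_SHIFT) / policy->max);

	/* One value per policy, copied to every CPU that shares it. */
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (policy->cpus & (1u << cpu))
			model_freq_scale[cpu] = scale;
}
```

The writer runs on the comparatively rare frequency or policy change, while the reader only loads one word from its own CPU's slot. That is what allows the scheduler hot path to avoid the cpufreq lock, as the commit message describes.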