Message ID | 2188688.SPioTUuSuO@aspire.rjw.lan (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Rafael Wysocki |
Headers | show |
On Wed, Mar 1, 2017 at 12:09 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > If the current P-state selection algorithm is set to "performance" > in intel_pstate_set_policy(), the limits may be initialized from > scratch, but only if no_turbo is not set and the maximum frequency > allowed for the given CPU (i.e. the policy object representing it) > is at least equal to the max frequency supported by the CPU. In all > of the other cases, the limits will not be updated. > > For example, the following can happen: > > # cat intel_pstate/status > active > # echo performance > cpufreq/policy0/scaling_governor > # cat intel_pstate/min_perf_pct > 100 > # echo 94 > intel_pstate/min_perf_pct > # cat intel_pstate/min_perf_pct > 100 > # cat cpufreq/policy0/scaling_max_freq > 3100000 > echo 3000000 > cpufreq/policy0/scaling_max_freq > # cat intel_pstate/min_perf_pct > 94 > # echo 95 > intel_pstate/min_perf_pct > # cat intel_pstate/min_perf_pct > 95 > > That is confusing for two reasons. First, the initial attempt to > change min_perf_pct to 94 seems to have no effect, even though > setting the global limits should always work. Second, after > changing scaling_max_freq for policy0 the global min_perf_pct > attribute shows 94, even though it should have not been affected > by that operation in principle. > > Moreover, the final attempt to change min_perf_pct to 95 worked > as expected, because scaling_max_freq for the only policy with > scaling_governor equal to "performance" was different from the > maximum at that time. > > To make all that confusion go away, modify intel_pstate_set_policy() > so that it doesn't reinitialize the limits at all. > > At the same time, change intel_pstate_set_performance_limits() to > set min_sysfs_pct to 100 in the "performance" limits set so that > switching the P-state selection algorithm to "performance" causes > intel_pstate/min_perf_pct in sysfs to go to 100 (or whatever value > min_sysfs_pct in the "performance" limits is set to later). > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > > -> v2: No changes > > --- > drivers/cpufreq/intel_pstate.c | 10 +++------- > 1 file changed, 3 insertions(+), 7 deletions(-) > > Index: linux-pm/drivers/cpufreq/intel_pstate.c > =================================================================== > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c > +++ linux-pm/drivers/cpufreq/intel_pstate.c > @@ -382,6 +382,7 @@ static void intel_pstate_set_performance > intel_pstate_init_limits(limits); > limits->min_perf_pct = 100; > limits->min_perf = int_ext_tofp(1); > + limits->min_sysfs_pct = 100; This change breaks the per-CPU limits, so the patch is not correct. Withdrawing. Thanks, Rafael
On Thu, Mar 2, 2017 at 6:18 PM, Rafael J. Wysocki <rafael@kernel.org> wrote: > On Wed, Mar 1, 2017 at 12:09 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> >> If the current P-state selection algorithm is set to "performance" >> in intel_pstate_set_policy(), the limits may be initialized from >> scratch, but only if no_turbo is not set and the maximum frequency >> allowed for the given CPU (i.e. the policy object representing it) >> is at least equal to the max frequency supported by the CPU. In all >> of the other cases, the limits will not be updated. >> >> For example, the following can happen: >> >> # cat intel_pstate/status >> active >> # echo performance > cpufreq/policy0/scaling_governor >> # cat intel_pstate/min_perf_pct >> 100 >> # echo 94 > intel_pstate/min_perf_pct >> # cat intel_pstate/min_perf_pct >> 100 >> # cat cpufreq/policy0/scaling_max_freq >> 3100000 >> echo 3000000 > cpufreq/policy0/scaling_max_freq >> # cat intel_pstate/min_perf_pct >> 94 >> # echo 95 > intel_pstate/min_perf_pct >> # cat intel_pstate/min_perf_pct >> 95 >> >> That is confusing for two reasons. First, the initial attempt to >> change min_perf_pct to 94 seems to have no effect, even though >> setting the global limits should always work. Second, after >> changing scaling_max_freq for policy0 the global min_perf_pct >> attribute shows 94, even though it should have not been affected >> by that operation in principle. >> >> Moreover, the final attempt to change min_perf_pct to 95 worked >> as expected, because scaling_max_freq for the only policy with >> scaling_governor equal to "performance" was different from the >> maximum at that time. >> >> To make all that confusion go away, modify intel_pstate_set_policy() >> so that it doesn't reinitialize the limits at all. >> >> At the same time, change intel_pstate_set_performance_limits() to >> set min_sysfs_pct to 100 in the "performance" limits set so that >> switching the P-state selection algorithm to "performance" causes >> intel_pstate/min_perf_pct in sysfs to go to 100 (or whatever value >> min_sysfs_pct in the "performance" limits is set to later). >> >> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> --- >> >> -> v2: No changes >> >> --- >> drivers/cpufreq/intel_pstate.c | 10 +++------- >> 1 file changed, 3 insertions(+), 7 deletions(-) >> >> Index: linux-pm/drivers/cpufreq/intel_pstate.c >> =================================================================== >> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c >> +++ linux-pm/drivers/cpufreq/intel_pstate.c >> @@ -382,6 +382,7 @@ static void intel_pstate_set_performance >> intel_pstate_init_limits(limits); >> limits->min_perf_pct = 100; >> limits->min_perf = int_ext_tofp(1); >> + limits->min_sysfs_pct = 100; > > This change breaks the per-CPU limits, so the patch is not correct. > > Withdrawing. Actually, it appears to be fixable, so I will send a new version later, most likely. Thanks, Rafael
Hi Rafael, Sorry for the delay, but I didn't notice until today that this commit causes a regression, at least in my computer. I have not figured out exactly what is wrong, as I must admit I am finding these policy interactions difficult to follow. On 2017.03.02 14:29 Rafael J. Wysocki wrote: >From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > If the current P-state selection algorithm is set to "performance" > in intel_pstate_set_policy(), the limits may be initialized from ... [cut] ... Going back to kernel 4.11-rc1 I get this after boot**: $ uname -a Linux s15 4.11.0-rc1-stock #217 SMP Sun Mar 5 15:34:38 PST 2017 x86_64 x86_64 x86_64 GNU/Linux $ grep . /sys/devices/system/cpu/intel_pstate/* /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 <<< Correct /sys/devices/system/cpu/intel_pstate/min_perf_pct:43 <<< Correct /sys/devices/system/cpu/intel_pstate/no_turbo:0 /sys/devices/system/cpu/intel_pstate/num_pstates:23 /sys/devices/system/cpu/intel_pstate/status:active /sys/devices/system/cpu/intel_pstate/turbo_pct:18 $ grep . /sys/devices/system/cpu/cpu0/cpufreq/* /sys/devices/system/cpu/cpu0/cpufreq/affected_cpus:0 grep: /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq: Permission denied /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq:3800000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq:1600000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_transition_latency:4294967295 /sys/devices/system/cpu/cpu0/cpufreq/related_cpus:0 /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors:performance powersave /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq:1600805 /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver:intel_pstate /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave <<< Notice /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq:3800000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq:1600000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed:<unsupported> $ sudo rdmsr --bitfield 15:8 -d -a 0x199 <<< Requested P-States 16 16 17 17 16 16 16 16 Going back to kernel 4.11-rc2 I get this after boot**: ~$ uname -a Linux s15 4.11.0-rc2-stock #218 SMP Sun Mar 12 23:57:44 PDT 2017 x86_64 x86_64 x86_64 GNU/Linux $ grep . /sys/devices/system/cpu/intel_pstate/* /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 <<< Correct /sys/devices/system/cpu/intel_pstate/min_perf_pct:100 <<< Incorrect /sys/devices/system/cpu/intel_pstate/no_turbo:0 /sys/devices/system/cpu/intel_pstate/num_pstates:23 /sys/devices/system/cpu/intel_pstate/status:active /sys/devices/system/cpu/intel_pstate/turbo_pct:18 $ grep . /sys/devices/system/cpu/cpu0/cpufreq/* /sys/devices/system/cpu/cpu0/cpufreq/affected_cpus:0 grep: /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq: Permission denied /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq:3800000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq:1600000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_transition_latency:4294967295 /sys/devices/system/cpu/cpu0/cpufreq/related_cpus:0 /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors:performance powersave /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq:1600805 /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver:intel_pstate /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave <<< Notice /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq:3800000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq:3800000 <<< Incorrect /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed:<unsupported> $ sudo rdmsr --bitfield 15:8 -d -a 0x199 <<< Requested P-States 38 <<<< All are pinned, system is idle 38 38 38 38 38 38 38 **: After boot means > 1 minute after boot, because my distro (Ubtunu) starts up using the performance governor and then changes to powersave after 1 minute. ... Doug
Hi Rafael, Disregard. I see this was fixed in kernel 4.11-rc4. While kernel 4.11-rc4 was where I thought I started from earlier, actually I was running 4.11-rc2 at the time. sorry for the noise. On 2017.03.29 15:01 Doug Smythies wrote: > Sorry for the delay, but I didn't notice until today that this > commit causes a regression, at least in my computer. > > I have not figured out exactly what is wrong, as I must > admit I am finding these policy interactions difficult > to follow. > > On 2017.03.02 14:29 Rafael J. Wysocki wrote: >>From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> >> If the current P-state selection algorithm is set to "performance" >> in intel_pstate_set_policy(), the limits may be initialized from > > ... [cut] ... > > Going back to kernel 4.11-rc1 I get this after boot**: > > $ uname -a > Linux s15 4.11.0-rc1-stock #217 SMP Sun Mar 5 15:34:38 PST 2017 x86_64 x86_64 x86_64 GNU/Linux
On Thu, Mar 30, 2017 at 12:16 AM, Doug Smythies <dsmythies@telus.net> wrote: > Hi Rafael, > > Disregard. > I see this was fixed in kernel 4.11-rc4. > While kernel 4.11-rc4 was where I thought I started from earlier, > actually I was running 4.11-rc2 at the time. > > sorry for the noise. No worries. Thanks, Rafael
Index: linux-pm/drivers/cpufreq/intel_pstate.c =================================================================== --- linux-pm.orig/drivers/cpufreq/intel_pstate.c +++ linux-pm/drivers/cpufreq/intel_pstate.c @@ -382,6 +382,7 @@ static void intel_pstate_set_performance intel_pstate_init_limits(limits); limits->min_perf_pct = 100; limits->min_perf = int_ext_tofp(1); + limits->min_sysfs_pct = 100; } static DEFINE_MUTEX(intel_pstate_driver_lock); @@ -2146,16 +2147,11 @@ static int intel_pstate_set_policy(struc mutex_lock(&intel_pstate_limits_lock); if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) { + pr_debug("set performance\n"); if (!perf_limits) { limits = &performance_limits; perf_limits = limits; } - if (policy->max >= policy->cpuinfo.max_freq && - !limits->no_turbo) { - pr_debug("set performance\n"); - intel_pstate_set_performance_limits(perf_limits); - goto out; - } } else { pr_debug("set powersave\n"); if (!perf_limits) { @@ -2166,7 +2162,7 @@ static int intel_pstate_set_policy(struc } intel_pstate_update_perf_limits(policy, perf_limits); - out: + if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE) { /* * NOHZ_FULL CPUs need this as the governor callback may not