Message ID | 1563431200-3042-1-git-send-email-dsmythies@telus.net (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | Revert "cpufreq: schedutil: Don't set next_freq to UINT_MAX" | expand |
On 17-07-19, 23:26, Doug Smythies wrote: > This reverts commit ecd2884291261e3fddbc7651ee11a20d596bb514. > > The commit caused a regression whereby reducing the maximum > CPU clock frequency is ineffective while busy, and the CPU > clock remains unchanged. Once the system has experienced > some idle time, the new limit is implemented. Can you explain why this patch caused that issue ? I am sorry but I couldn't understand it from your email. How are we trying to reduce the frequency? Is clk_set_rate() getting called with that finally and not working ? > A consequence is that any thermal throttling monitoring > and control based on max freq limits fail to respond > in a timely manor, if at all, to a thermal temperature > trip on a busy system.
On 2019.07.18 03:28 Viresh Kumar wrote: > On 17-07-19, 23:26, Doug Smythies wrote: >> This reverts commit ecd2884291261e3fddbc7651ee11a20d596bb514. >> >> The commit caused a regression whereby reducing the maximum >> CPU clock frequency is ineffective while busy, and the CPU >> clock remains unchanged. Once the system has experienced >> some idle time, the new limit is implemented. > > Can you explain why this patch caused that issue ? I am sorry but I couldn't > understand it from your email. How are we trying to reduce the frequency? Is > clk_set_rate() getting called with that finally and not working ? The patch eliminates the "flag", UNIT_MAX, and it's related comment, that was used to indicate if it was a limit change that causes the subsequent execution of sugov_update_single. As for "clk_set_rate()", I don't know. I bisected the kernel and got here. I also looked at reverting B7EAF1AAB9F8, but it seemed to have some purpose which I don't know of more important than this one or not. I'll reinsert the first test below with more detail: On 2019.07.17 23:27 Doug wrote: > Driver: intel_cpufreq ; Governor: Schedutil > Kernel 5.2: $ uname -a Linux s15 5.2.0-stock #608 SMP PREEMPT Mon Jul 8 08:21:56 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux doug@s15:~$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_driver intel_cpufreq intel_cpufreq intel_cpufreq intel_cpufreq intel_cpufreq intel_cpufreq intel_cpufreq intel_cpufreq doug@s15:~$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor schedutil schedutil schedutil schedutil schedutil schedutil schedutil schedutil > Test 1: No thermal throttling involved (I only ever use thermal, if anything): doug@s15:~$ sudo systemctl status thermald.service . thermald.service - Thermal Daemon Service Loaded: loaded (/lib/systemd/system/thermald.service; disabled; vendor preset: enabled) Active: inactive (dead) > - load the system (some sort of CPU stress test). Use whatever you want. I use my own, only because I always have and it prints something every so often which gives an indication of actual clock speed. doug@s15:~$ ~/c/test1 & ~/c/test1 & ~/c/test1 & ~/c/test1 & ~/c/test1 & ~/c/test1 & ~/c/test1 & ~/c/test1 & ~/c/test1 & [1] 3000 [2] 3001 [3] 3002 [4] 3003 [5] 3004 [6] 3005 [7] 3006 [8] 3007 [9] 3008 Watch everything with turobstat: doug@s15:~$ sudo turbostat --quiet --Summary --show Busy%,Bzy_MHz,PkgTmp,PkgWatt,IRQ --interval 5 Busy% Bzy_MHz IRQ PkgTmp PkgWatt 0.03 1600 269 25 3.69 0.07 1600 390 25 3.72 0.06 1600 368 25 3.72 0.06 1600 343 25 3.71 0.04 1600 269 26 3.70 74.72 3474 30230 46 43.95 <<< Add load 100.00 3500 40228 50 58.27 100.00 3500 40196 52 58.59 100.00 3500 40215 55 58.91 100.00 3500 40211 56 59.12 100.00 3500 40291 58 59.33 <<< Try to set 60% max 100.00 3500 40278 59 59.55 100.00 3500 40591 61 59.73 100.00 3500 40279 61 60.10 100.00 3500 40292 63 60.35 100.00 3500 40401 64 60.85 99.99 3500 40352 65 61.12 100.00 3500 40230 67 61.32 100.00 3500 40228 67 61.52 100.00 3500 40400 68 61.60 <<< Try to set 42% max 100.00 3500 40279 69 61.74 100.00 3500 40258 68 61.85 100.00 3500 40226 70 62.01 100.00 3500 40220 70 62.10 100.00 3500 40211 71 62.25 > - adjust downwards the maximum clock frequency > echo 60 | sudo tee /sys/devices/system/cpu/intel_pstate/max_perf_pct doug@s15:~/temp$ echo 60 | sudo tee /sys/devices/system/cpu/intel_pstate/max_perf_pct 60 doug@s15:~/temp$ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct 60 ... wait awhile ... doug@s15:~/temp$ echo 42 | sudo tee /sys/devices/system/cpu/intel_pstate/max_perf_pct 42 doug@s15:~/temp$ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct 42 > - observe the Bzy_MHz does not change on the turbostat output. See annotated turbostat output above. > - reduce the system load, such that there is some idle time. > - observe the Bzy_MHz now does change. > - increase the load again. > - observe the CPU frequency is now pinned to the new limit. Go back to 60% first: doug@s15:~/temp$ echo 60 | sudo tee /sys/devices/system/cpu/intel_pstate/max_perf_pct 60 killall test1 ... wait awhile ... doug@s15:~$ ~/c/test1 & ~/c/test1 & ~/c/test1 & ~/c/test1 & ~/c/test1 & ~/c/test1 & ~/c/test1 & ~/c/test1 & ~/c/test1 & [1] 3040 [2] 3041 [3] 3042 [4] 3043 [5] 3044 [6] 3045 [7] 3046 [8] 3047 [9] 3048 doug@s15:~$ sudo turbostat --quiet --Summary --show Busy%,Bzy_MHz,PkgTmp,PkgWatt,IRQ --interval 5 Busy% Bzy_MHz IRQ PkgTmp PkgWatt 100.00 3500 40289 80 64.32 100.00 3500 40223 79 64.16 100.00 3500 40212 79 64.14 100.00 3500 40269 79 64.17 100.00 3500 41330 79 64.19 <<< Freq is above the 60% limit 14.10 3489 6260 55 12.63 <<< Load removed, now not "busy" 0.03 1600 263 53 4.13 <<< see sugov_update_single 23.22 2293 9582 65 10.72 <<< Load applied 100.00 2300 40229 66 35.09 <<< 3.8 GHz * 0.6 = 2.3 GHz 100.00 2300 40209 66 35.21 100.00 2300 40211 65 35.20 100.00 2300 40219 65 35.16 100.00 2300 40224 65 35.14 The only procedure changes when using the acpi-cpufreq CPU scaling driver and the schedutil governor are for setting and monitoring the max freq settings, as described in the test write-ups. Hope this helps. ... Doug
On 18-07-19, 08:46, Doug Smythies wrote: > On 2019.07.18 03:28 Viresh Kumar wrote: > > On 17-07-19, 23:26, Doug Smythies wrote: > >> This reverts commit ecd2884291261e3fddbc7651ee11a20d596bb514. > >> > >> The commit caused a regression whereby reducing the maximum > >> CPU clock frequency is ineffective while busy, and the CPU > >> clock remains unchanged. Once the system has experienced > >> some idle time, the new limit is implemented. > > > > Can you explain why this patch caused that issue ? I am sorry but I couldn't > > understand it from your email. How are we trying to reduce the frequency? Is > > clk_set_rate() getting called with that finally and not working ? > > The patch eliminates the "flag", UNIT_MAX, and it's related comment, > that was used to indicate if it was a limit change that causes the > subsequent execution of sugov_update_single. I think I may have understood the root cause. Please try the patch I just sent as reply to this thread. Thanks.
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 962cf34..ea4e04b 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -89,8 +89,15 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time) !cpufreq_this_cpu_can_update(sg_policy->policy)) return false; - if (unlikely(sg_policy->need_freq_update)) + if (unlikely(sg_policy->need_freq_update)) { + sg_policy->need_freq_update = false; + /* + * This happens when limits change, so forget the previous + * next_freq value and force an update. + */ + sg_policy->next_freq = UINT_MAX; return true; + } delta_ns = time - sg_policy->last_freq_update_time; @@ -168,10 +175,8 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy, freq = map_util_freq(util, freq, max); - if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update) + if (freq == sg_policy->cached_raw_freq && sg_policy->next_freq != UINT_MAX) return sg_policy->next_freq; - - sg_policy->need_freq_update = false; sg_policy->cached_raw_freq = freq; return cpufreq_driver_resolve_freq(policy, freq); } @@ -457,7 +462,8 @@ static void sugov_update_single(struct update_util_data *hook, u64 time, * Do not reduce the frequency if the CPU has not been idle * recently, as the reduction is likely to be premature then. */ - if (busy && next_f < sg_policy->next_freq) { + if (busy && next_f < sg_policy->next_freq && + sg_policy->next_freq != UINT_MAX) { next_f = sg_policy->next_freq; /* Reset cached freq as next_freq has changed */ @@ -819,7 +825,7 @@ static int sugov_start(struct cpufreq_policy *policy) sg_policy->freq_update_delay_ns = sg_policy->tunables->rate_limit_us * NSEC_PER_USEC; sg_policy->last_freq_update_time = 0; - sg_policy->next_freq = 0; + sg_policy->next_freq = UINT_MAX; sg_policy->work_in_progress = false; sg_policy->need_freq_update = false; sg_policy->cached_raw_freq = 0;