Message ID | 20170324140900.7334-5-juri.lelli@arm.com (mailing list archive) |
---|---|
State | RFC, archived |
On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
> No assumption can be made upon the rate at which frequency updates get
> triggered, as there are scheduling policies (like SCHED_DEADLINE) which
> don't trigger them so frequently.
>
> Remove such assumption from the code.

But the util/max values for idle CPUs may be stale, no?

Thanks,
Rafael
Hi,

On 30/03/17 00:41, Rafael J. Wysocki wrote:
> On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
> > No assumption can be made upon the rate at which frequency updates get
> > triggered, as there are scheduling policies (like SCHED_DEADLINE) which
> > don't trigger them so frequently.
> >
> > Remove such assumption from the code.
>
> But the util/max values for idle CPUs may be stale, no?
>

Right, that might be a problem. A proper solution I think would be to
remotely update such values for idle CPUs, and I believe Vincent is
working on a patch for that.

As mid-term workarounds, changing a bit the current one, come to my
mind:

 - consider TICK_NSEC (continue) only when SCHED_CPUFREQ_DL is not set
 - remove CFS contribution (without triggering a freq update) when a CPU
   enters IDLE; this might not work well, though, as we probably want
   to keep in blocked util contribution for a bit

What you think is the way to go?

Thanks,

- Juri
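The first workaround listed above (apply the TICK_NSEC check only when SCHED_CPUFREQ_DL is not set) could be modelled roughly as follows. This is a user-space sketch, not kernel code: `TICK_NSEC_MODEL`, `MODEL_CPUFREQ_DL`, `struct cpu_sample` and `skip_stale_cpu()` are hypothetical names introduced for illustration, and the tick length and flag bit are assumed values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel definitions; values are assumptions. */
#define TICK_NSEC_MODEL   4000000LL   /* one tick in ns, assuming HZ=250 */
#define MODEL_CPUFREQ_DL  (1U << 1)   /* stands in for SCHED_CPUFREQ_DL */

/* Per-CPU state as seen by the governor (simplified, hypothetical). */
struct cpu_sample {
	int64_t last_update;   /* time of the last utilization update, in ns */
	unsigned int flags;    /* flags passed with that update */
};

/*
 * Return true if this CPU should be left out of the frequency selection.
 * A CPU whose last update carried the DL flag is never skipped, because
 * DEADLINE updates may legitimately arrive less often than once per tick.
 */
static bool skip_stale_cpu(const struct cpu_sample *s,
			   int64_t last_freq_update_time)
{
	int64_t delta_ns = last_freq_update_time - s->last_update;

	if (s->flags & MODEL_CPUFREQ_DL)
		return false;

	return delta_ns > TICK_NSEC_MODEL;
}
```

With this check, a CFS-only CPU that has been silent for longer than a tick is still ignored, while a CPU whose last update came from DEADLINE keeps contributing.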
On 30 March 2017 at 10:58, Juri Lelli <juri.lelli@arm.com> wrote:
> Hi,
>
> On 30/03/17 00:41, Rafael J. Wysocki wrote:
>> On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
>> > No assumption can be made upon the rate at which frequency updates get
>> > triggered, as there are scheduling policies (like SCHED_DEADLINE) which
>> > don't trigger them so frequently.
>> >
>> > Remove such assumption from the code.
>>
>> But the util/max values for idle CPUs may be stale, no?
>>
>
> Right, that might be a problem. A proper solution I think would be to
> remotely update such values for idle CPUs, and I believe Vincent is
> working on a patch for that.

Yes. I'm working on a patch that will regularly update the blocked
load/utilization of idle CPU. This update will be done on a slow pace
to make sure that utilization and load will be decayed regularly

>
> As mid-term workarounds, changing a bit the current one, come to my
> mind:
>
> - consider TICK_NSEC (continue) only when SCHED_CPUFREQ_DL is not set
> - remove CFS contribution (without triggering a freq update) when a CPU
>   enters IDLE; this might not work well, though, as we probably want
>   to keep in blocked util contribution for a bit
>
> What you think is the way to go?
>
> Thanks,
>
> - Juri
On Thu, Mar 30, 2017 at 10:58 AM, Juri Lelli <juri.lelli@arm.com> wrote:
> Hi,

Hi,

> On 30/03/17 00:41, Rafael J. Wysocki wrote:
>> On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
>> > No assumption can be made upon the rate at which frequency updates get
>> > triggered, as there are scheduling policies (like SCHED_DEADLINE) which
>> > don't trigger them so frequently.
>> >
>> > Remove such assumption from the code.
>>
>> But the util/max values for idle CPUs may be stale, no?
>>
>
> Right, that might be a problem. A proper solution I think would be to
> remotely update such values for idle CPUs, and I believe Vincent is
> working on a patch for that.
>
> As mid-term workarounds, changing a bit the current one, come to my
> mind:
>
> - consider TICK_NSEC (continue) only when SCHED_CPUFREQ_DL is not set
> - remove CFS contribution (without triggering a freq update) when a CPU
>   enters IDLE; this might not work well, though, as we probably want
>   to keep in blocked util contribution for a bit
>
> What you think is the way to go?

Well, do we want SCHED_DEADLINE util contribution to be there even for
idle CPUs?

Thanks,
Rafael
On 30/03/17 22:13, Rafael J. Wysocki wrote:
> On Thu, Mar 30, 2017 at 10:58 AM, Juri Lelli <juri.lelli@arm.com> wrote:
> > Hi,
>
> Hi,
>
> > On 30/03/17 00:41, Rafael J. Wysocki wrote:
> >> On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
> >> > No assumption can be made upon the rate at which frequency updates get
> >> > triggered, as there are scheduling policies (like SCHED_DEADLINE) which
> >> > don't trigger them so frequently.
> >> >
> >> > Remove such assumption from the code.
> >>
> >> But the util/max values for idle CPUs may be stale, no?
> >>
> >
> > Right, that might be a problem. A proper solution I think would be to
> > remotely update such values for idle CPUs, and I believe Vincent is
> > working on a patch for that.
> >
> > As mid-term workarounds, changing a bit the current one, come to my
> > mind:
> >
> > - consider TICK_NSEC (continue) only when SCHED_CPUFREQ_DL is not set
> > - remove CFS contribution (without triggering a freq update) when a CPU
> >   enters IDLE; this might not work well, though, as we probably want
> >   to keep in blocked util contribution for a bit
> >
> > What you think is the way to go?
>
> Well, do we want SCHED_DEADLINE util contribution to be there even for
> idle CPUs?
>

DEADLINE util contribution is removed, even if the CPU is idle, by the
reclaiming mechanism when we know (applying GRUB algorithm rules [1])
that it can't be used anymore by a task (roughly speaking). So, we
shouldn't have this problem in the DEADLINE case.

[1] https://marc.info/?l=linux-kernel&m=149029880524038
On Fri, Mar 31, 2017 at 9:31 AM, Juri Lelli <juri.lelli@arm.com> wrote:
> On 30/03/17 22:13, Rafael J. Wysocki wrote:
>> On Thu, Mar 30, 2017 at 10:58 AM, Juri Lelli <juri.lelli@arm.com> wrote:
>> > Hi,
>>
>> Hi,
>>
>> > On 30/03/17 00:41, Rafael J. Wysocki wrote:
>> >> On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
>> >> > No assumption can be made upon the rate at which frequency updates get
>> >> > triggered, as there are scheduling policies (like SCHED_DEADLINE) which
>> >> > don't trigger them so frequently.
>> >> >
>> >> > Remove such assumption from the code.
>> >>
>> >> But the util/max values for idle CPUs may be stale, no?
>> >>
>> >
>> > Right, that might be a problem. A proper solution I think would be to
>> > remotely update such values for idle CPUs, and I believe Vincent is
>> > working on a patch for that.
>> >
>> > As mid-term workarounds, changing a bit the current one, come to my
>> > mind:
>> >
>> > - consider TICK_NSEC (continue) only when SCHED_CPUFREQ_DL is not set
>> > - remove CFS contribution (without triggering a freq update) when a CPU
>> >   enters IDLE; this might not work well, though, as we probably want
>> >   to keep in blocked util contribution for a bit
>> >
>> > What you think is the way to go?
>>
>> Well, do we want SCHED_DEADLINE util contribution to be there even for
>> idle CPUs?
>>
>
> DEADLINE util contribution is removed, even if the CPU is idle, by the
> reclaiming mechanism when we know (applying GRUB algorithm rules [1])
> that it can't be used anymore by a task (roughly speaking). So, we
> shouldn't have this problem in the DEADLINE case.
>
> [1] https://marc.info/?l=linux-kernel&m=149029880524038

OK

Why don't you store the contributions from DL and CFS separately, then
(say, as util_dl, util_cfs, respectively) and only discard the CFS one
if delta_ns > TICK_NSEC?
On 31/03/17 11:03, Rafael J. Wysocki wrote:
> On Fri, Mar 31, 2017 at 9:31 AM, Juri Lelli <juri.lelli@arm.com> wrote:
> > On 30/03/17 22:13, Rafael J. Wysocki wrote:
> >> On Thu, Mar 30, 2017 at 10:58 AM, Juri Lelli <juri.lelli@arm.com> wrote:
> >> > Hi,
> >>
> >> Hi,
> >>
> >> > On 30/03/17 00:41, Rafael J. Wysocki wrote:
> >> >> On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
> >> >> > No assumption can be made upon the rate at which frequency updates get
> >> >> > triggered, as there are scheduling policies (like SCHED_DEADLINE) which
> >> >> > don't trigger them so frequently.
> >> >> >
> >> >> > Remove such assumption from the code.
> >> >>
> >> >> But the util/max values for idle CPUs may be stale, no?
> >> >>
> >> >
> >> > Right, that might be a problem. A proper solution I think would be to
> >> > remotely update such values for idle CPUs, and I believe Vincent is
> >> > working on a patch for that.
> >> >
> >> > As mid-term workarounds, changing a bit the current one, come to my
> >> > mind:
> >> >
> >> > - consider TICK_NSEC (continue) only when SCHED_CPUFREQ_DL is not set
> >> > - remove CFS contribution (without triggering a freq update) when a CPU
> >> >   enters IDLE; this might not work well, though, as we probably want
> >> >   to keep in blocked util contribution for a bit
> >> >
> >> > What you think is the way to go?
> >>
> >> Well, do we want SCHED_DEADLINE util contribution to be there even for
> >> idle CPUs?
> >>
> >
> > DEADLINE util contribution is removed, even if the CPU is idle, by the
> > reclaiming mechanism when we know (applying GRUB algorithm rules [1])
> > that it can't be used anymore by a task (roughly speaking). So, we
> > shouldn't have this problem in the DEADLINE case.
> >
> > [1] https://marc.info/?l=linux-kernel&m=149029880524038
>
> OK
>
> Why don't you store the contributions from DL and CFS separately, then
> (say, as util_dl, util_cfs, respectively) and only discard the CFS one
> if delta_ns > TICK_NSEC?

Sure, this should work as well. I'll try this approach for next version.

Thanks,

- Juri
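For readers following the thread, the approach agreed on above (keep the DL and CFS contributions separate and discard only the CFS one when delta_ns > TICK_NSEC) might look something like the sketch below. It is a user-space model under stated assumptions: `util_dl` and `util_cfs` are the field names suggested in the thread, while `struct cpu_util_model`, `effective_util()`, `TICK_NSEC_MODEL` and the clamping to `max` are hypothetical and not taken from any posted patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for TICK_NSEC; assumes HZ=250 (4 ms tick). */
#define TICK_NSEC_MODEL 4000000LL

/* Per-CPU utilization split by scheduling class (simplified, hypothetical). */
struct cpu_util_model {
	unsigned long util_cfs;   /* CFS contribution, may be stale when idle */
	unsigned long util_dl;    /* DEADLINE contribution, removed by reclaiming */
	unsigned long max;        /* CPU capacity */
	int64_t last_update;      /* time of the last utilization update, in ns */
};

/*
 * Utilization to use for this CPU when picking a frequency.
 * The DL part is always counted; the CFS part is dropped if the CPU has
 * not reported an update within the last tick before the frequency update.
 */
static unsigned long effective_util(const struct cpu_util_model *c,
				    int64_t last_freq_update_time)
{
	int64_t delta_ns = last_freq_update_time - c->last_update;
	unsigned long util = c->util_dl;

	if (delta_ns <= TICK_NSEC_MODEL)
		util += c->util_cfs;

	/* Clamping to capacity is an assumption of this sketch. */
	return util < c->max ? util : c->max;
}
```

The point of the split is that a stale CFS value can be ignored without also throwing away the DEADLINE bandwidth, which stays valid until the reclaiming mechanism removes it.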
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index da67a1cf91e7..40f30373b709 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -233,14 +233,13 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu)
 		 * If the CPU utilization was last updated before the previous
 		 * frequency update and the time elapsed between the last update
 		 * of the CPU utilization and the last frequency update is long
-		 * enough, don't take the CPU into account as it probably is
-		 * idle now (and clear iowait_boost for it).
+		 * enough, reset iowait_boost, as it probably is not boosted
+		 * anymore now.
 		 */
 		delta_ns = last_freq_update_time - j_sg_cpu->last_update;
-		if (delta_ns > TICK_NSEC) {
+		if (delta_ns > TICK_NSEC)
 			j_sg_cpu->iowait_boost = 0;
-			continue;
-		}
+
 		if (j_sg_cpu->flags & SCHED_CPUFREQ_RT)
 			return policy->cpuinfo.max_freq;
 
No assumption can be made upon the rate at which frequency updates get
triggered, as there are scheduling policies (like SCHED_DEADLINE) which
don't trigger them so frequently.

Remove such assumption from the code.

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Claudio Scordino <claudio@evidence.eu.com>
---
 kernel/sched/cpufreq_schedutil.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)
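The behavioural change made by the patch can be illustrated with a small user-space model of the per-policy loop. This is only a sketch: `struct shared_cpu_model`, `struct util_pair`, `pick_util()`, `TICK_NSEC_MODEL` and the `skip_stale` switch are hypothetical names, the RT shortcut and the application of iowait boost are left out, and the starting value of the comparison is simplified. Passing `skip_stale = true` models the code before the patch (stale CPUs are skipped), `false` models the code after it (only iowait_boost is reset).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for TICK_NSEC; assumes HZ=250 (4 ms tick). */
#define TICK_NSEC_MODEL 4000000LL

/* Simplified per-CPU governor state (hypothetical). */
struct shared_cpu_model {
	unsigned long util;
	unsigned long max;
	unsigned long iowait_boost;
	int64_t last_update;   /* time of the last utilization update, in ns */
};

struct util_pair {
	unsigned long util;
	unsigned long max;
};

/* Pick the CPU with the highest util/max ratio among the policy's CPUs. */
static struct util_pair pick_util(struct shared_cpu_model *cpus, size_t n,
				  int64_t last_freq_update_time,
				  bool skip_stale)
{
	struct util_pair best = { 0, 1 };

	for (size_t i = 0; i < n; i++) {
		int64_t delta_ns = last_freq_update_time - cpus[i].last_update;

		if (delta_ns > TICK_NSEC_MODEL) {
			/* Both versions clear the boost of a stale CPU. */
			cpus[i].iowait_boost = 0;
			/* Only the pre-patch code also ignores its utilization. */
			if (skip_stale)
				continue;
		}

		/* Compare ratios without dividing: u1/m1 > u2/m2. */
		if (cpus[i].util * best.max > cpus[i].max * best.util) {
			best.util = cpus[i].util;
			best.max = cpus[i].max;
		}
	}

	return best;
}
```

In this model, a CPU with high utilization that has not reported for more than a tick is ignored when `skip_stale` is true and dominates the result when it is false, which is the stale-value concern raised in the review.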