Message ID | 1518109302-8239-1-git-send-email-claudio@evidence.eu.com (mailing list archive)
State      | Changes Requested, archived
On 08-02-18, 18:01, Claudio Scordino wrote:
> When the SCHED_DEADLINE scheduling class increases the CPU utilization,
> we should not wait for the rate limit, otherwise we may miss some deadline.
>
> Tests using rt-app on Exynos5422 have shown reductions of about 10% of deadline
> misses for tasks with low RT periods.
>
> The patch applies on top of the one recently proposed by Peter to drop the
> SCHED_CPUFREQ_* flags.
>
> Signed-off-by: Claudio Scordino <claudio@evidence.eu.com>
> CC: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
> CC: Patrick Bellasi <patrick.bellasi@arm.com>
> CC: Dietmar Eggemann <dietmar.eggemann@arm.com>
> CC: Morten Rasmussen <morten.rasmussen@arm.com>
> CC: Juri Lelli <juri.lelli@redhat.com>
> CC: Viresh Kumar <viresh.kumar@linaro.org>
> CC: Vincent Guittot <vincent.guittot@linaro.org>
> CC: Todd Kjos <tkjos@android.com>
> CC: Joel Fernandes <joelaf@google.com>
> CC: linux-pm@vger.kernel.org
> CC: linux-kernel@vger.kernel.org
> ---
>  kernel/sched/cpufreq_schedutil.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)

So the previous commit was surely incorrect, as it relied on comparing
frequencies instead of dl-util, and the freq requirements could even have
changed due to CFS.

> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index b0bd77d..d8dcba2 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -74,7 +74,10 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
>
>  /************************ Governor internals ***********************/
>
> -static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> +static bool sugov_should_update_freq(struct sugov_policy *sg_policy,
> +				     u64 time,
> +				     struct sugov_cpu *sg_cpu_old,
> +				     struct sugov_cpu *sg_cpu_new)
>  {
>  	s64 delta_ns;
>
> @@ -111,6 +114,10 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>  		return true;
>  	}
>
> +	/* Ignore rate limit when DL increased utilization. */
> +	if (sg_cpu_new->util_dl > sg_cpu_old->util_dl)
> +		return true;
> +

Changing the frequency has a penalty, especially in the ARM world (and
that's where you are testing your stuff). I am worried that we will
have (corner) cases where we will waste a lot of time changing the
frequencies. For example (I may be wrong here), what if 10 small DL
tasks are queued one after the other? The util will keep on changing
and so will the frequency? There may be more similar cases?

Is it possible to (somehow) check here whether the DL tasks will miss
their deadlines if we continue to run at the current frequency? And only
ignore the rate limit if that is the case?

> 	delta_ns = time - sg_policy->last_freq_update_time;
> 	return delta_ns >= sg_policy->freq_update_delay_ns;
>  }
> @@ -271,6 +278,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>  				unsigned int flags)
>  {
>  	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
> +	struct sugov_cpu sg_cpu_old = *sg_cpu;

Not really a big deal, but this structure is 80 bytes on ARM64, so why
copy everything when what we need is just 8 bytes?
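A minimal sketch of the slimmer variant Viresh is hinting at (hypothetical,
not the posted patch): since the override needs a single comparison, only
the previous util_dl value has to survive, not a copy of the whole
structure.

/* Hypothetical helper; util_dl is the DL utilization cached in sugov_cpu. */
static bool sugov_dl_util_increased(unsigned long util_dl_old,
				    unsigned long util_dl_new)
{
	/* Ignore rate limit when DL increased utilization. */
	return util_dl_new > util_dl_old;
}

/*
 * Caller side, e.g. in sugov_update_single():
 *
 *	unsigned long util_dl_old = sg_cpu->util_dl;	// 8-byte snapshot
 *	...
 *	if (sugov_dl_util_increased(util_dl_old, sg_cpu->util_dl))
 *		// bypass the rate limit
 */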
Hi Viresh,

On 09/02/2018 04:51, Viresh Kumar wrote:
> On 08-02-18, 18:01, Claudio Scordino wrote:
>> When the SCHED_DEADLINE scheduling class increases the CPU utilization,
>> we should not wait for the rate limit, otherwise we may miss some deadline.
>>
>> Tests using rt-app on Exynos5422 have shown reductions of about 10% of deadline
>> misses for tasks with low RT periods.
>>
>> The patch applies on top of the one recently proposed by Peter to drop the
>> SCHED_CPUFREQ_* flags.
>>
>> [...]
>
> So the previous commit was surely incorrect, as it relied on comparing
> frequencies instead of dl-util, and the freq requirements could even have
> changed due to CFS.

You're right. The very first version of the patch (not posted) added a
specific SCHED_CPUFREQ flag to let the scheduling class ask for the rate
limit to be ignored. However, polluting the API with further flags is not
such a good approach, so the next versions didn't introduce such a flag --
but they were incorrect.

>> +	/* Ignore rate limit when DL increased utilization. */
>> +	if (sg_cpu_new->util_dl > sg_cpu_old->util_dl)
>> +		return true;
>> +
>
> Changing the frequency has a penalty, especially in the ARM world (and
> that's where you are testing your stuff). I am worried that we will
> have (corner) cases where we will waste a lot of time changing the
> frequencies. For example (I may be wrong here), what if 10 small DL
> tasks are queued one after the other? The util will keep on changing
> and so will the frequency? There may be more similar cases?

I forgot to say that I've not observed any relevant increase in energy
consumption (measured through a Baylibre Cape). However, the tests had a
very small number of RT tasks.

If I'm not wrong, at the hardware level we do have a physical rate limit
(as we cannot trigger a frequency update when there is one already
ongoing). I don't know whether this could somehow mitigate the effect.

Anyway, I'll repeat the tests with a considerable number of RT tasks to
check whether I can reproduce such a "ramp up" situation. Depending on
the energy results, we may have to choose between meeting more RT
deadlines and consuming less energy.

> Is it possible to (somehow) check here whether the DL tasks will miss
> their deadlines if we continue to run at the current frequency? And only
> ignore the rate limit if that is the case?

I need to think further about it.

>> @@ -271,6 +278,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>>  				unsigned int flags)
>>  {
>>  	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
>> +	struct sugov_cpu sg_cpu_old = *sg_cpu;
>
> Not really a big deal, but this structure is 80 bytes on ARM64, so why
> copy everything when what we need is just 8 bytes?

I didn't want to add deadline-specific code to the
sugov_should_update_freq() signature, as it should remain independent of
the scheduling classes. In my opinion, the best approach would be to group
util_cfs and util_dl in a struct within sugov_cpu and pass that struct to
sugov_should_update_freq().

Thanks for your comments.

Claudio
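A rough sketch of that grouping (hypothetical names, built on the
schedutil internals quoted above):

struct sugov_util {
	unsigned long util_cfs;
	unsigned long util_dl;
};

static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time,
				     struct sugov_util *old,
				     struct sugov_util *new)
{
	s64 delta_ns;

	/* ... existing checks unchanged ... */

	/* Ignore rate limit when DL increased utilization. */
	if (new->util_dl > old->util_dl)
		return true;

	delta_ns = time - sg_policy->last_freq_update_time;
	return delta_ns >= sg_policy->freq_update_delay_ns;
}

The callers would then snapshot a 16-byte struct sugov_util instead of the
whole sugov_cpu, which would also address Viresh's size concern.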
On 09-02-18, 09:02, Claudio Scordino wrote:
> If I'm not wrong, at the hardware level we do have a physical rate limit
> (as we cannot trigger a frequency update when there is one already
> ongoing). I don't know whether this could somehow mitigate the effect.

Yeah, so in the worst case we will start a new freq-change right after
the previous one has finished.
On Friday, February 9, 2018 9:02:34 AM CET Claudio Scordino wrote:
> Hi Viresh,
>
> On 09/02/2018 04:51, Viresh Kumar wrote:
> > On 08-02-18, 18:01, Claudio Scordino wrote:
> >> When the SCHED_DEADLINE scheduling class increases the CPU utilization,
> >> we should not wait for the rate limit, otherwise we may miss some deadline.
> >>
> >> [...]

[cut]

> > Is it possible to (somehow) check here whether the DL tasks will miss
> > their deadlines if we continue to run at the current frequency? And only
> > ignore the rate limit if that is the case?
>
> I need to think further about it.

That would be my approach FWIW.

Increasing the frequency beyond what is necessary means wasting energy
in any case.

Thanks,
Rafael
Hi,

On 09/02/18 11:36, Rafael J. Wysocki wrote:
> On Friday, February 9, 2018 9:02:34 AM CET Claudio Scordino wrote:
> > Hi Viresh,
> >
> > On 09/02/2018 04:51, Viresh Kumar wrote:
> > > On 08-02-18, 18:01, Claudio Scordino wrote:
> > >> [...]
>
> [cut]
>
> > > Is it possible to (somehow) check here whether the DL tasks will miss
> > > their deadlines if we continue to run at the current frequency? And only
> > > ignore the rate limit if that is the case?

Isn't it always the case? The utilization associated with DL tasks is
given by what the user said is needed to meet the task's deadlines
(admission control). If that task wakes up and we realize that adding its
utilization contribution is going to require a frequency change, we should
_theoretically_ always do it, or it will be too late. Now, the user might
have asked for a bit more than what is strictly required (this is usually
the case, to compensate for discrepancies between theory and the real
world, e.g. hw transition limits), but I don't think there is a way to
know "how much". :/

Thanks,

- Juri
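To make the admission-control point concrete, here is a standalone sketch
that mirrors the kernel's fixed-point bandwidth math (BW_SHIFT and the
to_ratio() logic from kernel/sched/sched.h and kernel/sched/core.c); the
runtime/period values below are made up for illustration:

#include <stdio.h>
#include <stdint.h>

#define BW_SHIFT		20	/* as in kernel/sched/sched.h */
#define SCHED_CAPACITY_SCALE	1024

/* Same computation as the kernel's to_ratio(), minus div64_u64(). */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	return (runtime << BW_SHIFT) / period;
}

int main(void)
{
	/* A task admitted with runtime 10 ms every 100 ms (values in ns)... */
	uint64_t bw = to_ratio(100000000ULL, 10000000ULL);

	/* ...contributes ~10% of CPU capacity to util_dl: prints 102/1024. */
	printf("util_dl contribution: %llu/%d\n",
	       (unsigned long long)((bw * SCHED_CAPACITY_SCALE) >> BW_SHIFT),
	       SCHED_CAPACITY_SCALE);
	return 0;
}

When such a task wakes up, schedutil sees this contribution immediately;
waiting out freq_update_delay_ns before acting on it is what the patch
tries to avoid.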
On Fri, Feb 9, 2018 at 11:53 AM, Juri Lelli <juri.lelli@redhat.com> wrote:
> [...]
>
> Isn't it always the case? The utilization associated with DL tasks is
> given by what the user said is needed to meet the task's deadlines
> (admission control). If that task wakes up and we realize that adding its
> utilization contribution is going to require a frequency change, we should
> _theoretically_ always do it, or it will be too late. Now, the user might
> have asked for a bit more than what is strictly required (this is usually
> the case, to compensate for discrepancies between theory and the real
> world, e.g. hw transition limits), but I don't think there is a way to
> know "how much". :/

You are right.

I'm somewhat concerned about "fast switch" cases when the rate limit
is used to reduce overhead.
On Thu, Feb 8, 2018 at 6:01 PM, Claudio Scordino <claudio@evidence.eu.com> wrote:
> When the SCHED_DEADLINE scheduling class increases the CPU utilization,
> we should not wait for the rate limit, otherwise we may miss some deadline.
>
> Tests using rt-app on Exynos5422 have shown reductions of about 10% of deadline
> misses for tasks with low RT periods.
>
> The patch applies on top of the one recently proposed by Peter to drop the
> SCHED_CPUFREQ_* flags.
>
> [...]
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index b0bd77d..d8dcba2 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -74,7 +74,10 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
>
>  /************************ Governor internals ***********************/
>
> -static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> +static bool sugov_should_update_freq(struct sugov_policy *sg_policy,
> +				     u64 time,
> +				     struct sugov_cpu *sg_cpu_old,
> +				     struct sugov_cpu *sg_cpu_new)

This looks somewhat excessive for using just one field from each of these.

>  {
>  	s64 delta_ns;
>
> @@ -111,6 +114,10 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>  		return true;
>  	}
>
> +	/* Ignore rate limit when DL increased utilization. */
> +	if (sg_cpu_new->util_dl > sg_cpu_old->util_dl)
> +		return true;
> +
> 	delta_ns = time - sg_policy->last_freq_update_time;
> 	return delta_ns >= sg_policy->freq_update_delay_ns;
>  }
> @@ -271,6 +278,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>  				unsigned int flags)
>  {
>  	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
> +	struct sugov_cpu sg_cpu_old = *sg_cpu;

And here you copy the entire struct, to pass a pointer to the copy to a
helper function so that it can access one field. That doesn't look
particularly straightforward to me, let alone the overhead.

I guess you may do the check before calling sugov_should_update_freq()
and set sg_policy->need_freq_update if it's true, as you know upfront
that the previous sg_policy->next_freq value isn't going to be used
anyway in that case.
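An untested sketch of that suggestion (it assumes the utilization refresh,
e.g. sugov_get_util(), can be moved before the rate-limit check): do the
DL comparison in the caller and reuse the existing need_freq_update flag,
which already makes sugov_should_update_freq() return true ahead of its
rate-limit test, so the helper's signature stays untouched.

	/* In sugov_update_single() / sugov_update_shared(): */
	unsigned long util_dl_old = sg_cpu->util_dl;

	sugov_get_util(sg_cpu);

	/* Ignore rate limit when DL increased utilization. */
	if (sg_cpu->util_dl > util_dl_old)
		sg_policy->need_freq_update = true;

	if (!sugov_should_update_freq(sg_policy, time))
		return;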
On 09/02/18 12:04, Rafael J. Wysocki wrote:
> On Fri, Feb 9, 2018 at 11:53 AM, Juri Lelli <juri.lelli@redhat.com> wrote:
> > [...]
> >
> > Isn't it always the case? The utilization associated with DL tasks is
> > given by what the user said is needed to meet the task's deadlines
> > (admission control). [...]
>
> You are right.
>
> I'm somewhat concerned about "fast switch" cases when the rate limit
> is used to reduce overhead.

Mmm, right. I'm thinking that in those cases we could leave the rate limit
as is. The user should then be aware of it and consider it as proper
overhead when designing her/his system.

But then, isn't it the same for "non fast switch" platforms? I mean, even
in the latter case we can't go faster than hw limits.. mmm, maybe the
difference is that in the former case we could go as fast as theory would
expect.. but we shouldn't. :)
On Fri, Feb 9, 2018 at 12:26 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
> On 09/02/18 12:04, Rafael J. Wysocki wrote:
>> [...]
>>
>> I'm somewhat concerned about "fast switch" cases when the rate limit
>> is used to reduce overhead.
>
> Mmm, right. I'm thinking that in those cases we could leave the rate limit
> as is. The user should then be aware of it and consider it as proper
> overhead when designing her/his system.
>
> But then, isn't it the same for "non fast switch" platforms? I mean, even
> in the latter case we can't go faster than hw limits.. mmm, maybe the
> difference is that in the former case we could go as fast as theory would
> expect.. but we shouldn't. :)

Well, in practical terms that means "no difference" IMO. :-)

I can imagine that in some cases this approach may lead to better results
than reducing the rate limit overall, but about the general case I'm not
sure.

I mean, if overriding the rate limit doesn't take place very often, then
it really should make no difference overhead-wise. Now, of course, how to
define "not very often" is a good question, as that leads to rate-limiting
the overriding of the original rate limit, and that scheme may continue
indefinitely ...
On 09/02/18 12:37, Rafael J. Wysocki wrote:
> On Fri, Feb 9, 2018 at 12:26 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
> > [...]
> >
> > But then, isn't it the same for "non fast switch" platforms? I mean, even
> > in the latter case we can't go faster than hw limits.. mmm, maybe the
> > difference is that in the former case we could go as fast as theory would
> > expect.. but we shouldn't. :)
>
> Well, in practical terms that means "no difference" IMO. :-)
>
> I can imagine that in some cases this approach may lead to better results
> than reducing the rate limit overall, but about the general case I'm not
> sure.
>
> I mean, if overriding the rate limit doesn't take place very often, then
> it really should make no difference overhead-wise. Now, of course, how to
> define "not very often" is a good question, as that leads to rate-limiting
> the overriding of the original rate limit, and that scheme may continue
> indefinitely ...

:)

My impression is that the rate limit helps a lot for CFS, where the "true"
utilization is not known in advance and being too responsive might
actually be counterproductive.

For DEADLINE (and RT, with differences) we should always respond as
quickly as we can (and probably remember that a frequency transition was
requested if the hw was already performing one, but that's another patch)
because, if we don't, a task belonging to a lower-priority class might
induce deadline misses in higher-priority activities. E.g., a CFS task
that happens to trigger a freq switch right before a DEADLINE task wakes
up and needs a higher frequency to meet its deadline: if we wait for the
rate limit of the CFS-originated transition.. deadline miss!
On Fri, Feb 9, 2018 at 12:51 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
> On 09/02/18 13:08, Rafael J. Wysocki wrote:
>> [...]
>
> My impression is that the rate limit helps a lot for CFS, where the "true"
> utilization is not known in advance and being too responsive might
> actually be counterproductive.
>
> For DEADLINE (and RT, with differences) we should always respond as
> quickly as we can (and probably remember that a frequency transition was
> requested if the hw was already performing one, but that's another patch)
> because, if we don't, a task belonging to a lower-priority class might
> induce deadline misses in higher-priority activities. E.g., a CFS task
> that happens to trigger a freq switch right before a DEADLINE task wakes
> up and needs a higher frequency to meet its deadline: if we wait for the
> rate limit of the CFS-originated transition.. deadline miss!

Fair enough, but if there's too much overhead as a result of this, you
can't guarantee the deadlines to be met anyway.
On 09/02/18 13:08, Rafael J. Wysocki wrote:
> On Fri, Feb 9, 2018 at 12:51 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
> > [...]
> >
> > For DEADLINE (and RT, with differences) we should always respond as
> > quickly as we can [...] E.g., a CFS task that happens to trigger a freq
> > switch right before a DEADLINE task wakes up and needs a higher frequency
> > to meet its deadline: if we wait for the rate limit of the CFS-originated
> > transition.. deadline miss!
>
> Fair enough, but if there's too much overhead as a result of this, you
> can't guarantee the deadlines to be met anyway.

Indeed. I guess this only works if corner cases like the one above don't
happen too often.
On Fri, Feb 9, 2018 at 1:52 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
> On 09/02/18 13:08, Rafael J. Wysocki wrote:
>> [...]
>>
>> Fair enough, but if there's too much overhead as a result of this, you
>> can't guarantee the deadlines to be met anyway.
>
> Indeed. I guess this only works if corner cases like the one above don't
> happen too often.

Well, that's the point.

So there is a tradeoff: do we want to allow deadlines to be missed because
of excessive overhead, or do we want to allow deadlines to be missed
because of the rate limit?
On 09/02/2018 13:56, Rafael J. Wysocki wrote:
> On Fri, Feb 9, 2018 at 1:52 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
>> [...]
>>
>> Indeed. I guess this only works if corner cases like the one above don't
>> happen too often.
>
> Well, that's the point.
>
> So there is a tradeoff: do we want to allow deadlines to be missed because
> of excessive overhead, or do we want to allow deadlines to be missed
> because of the rate limit?

For a very small number of tasks, the tests have indeed shown that the
approach pays off: we get a significant reduction in misses with a
negligible increase in energy consumption.

I still need to check what happens with a large number of tasks, trying to
reproduce the "ramp up" pattern (in which DL keeps increasing the
utilization, ignoring the rate limit and adding overhead).

Thanks,

Claudio
On 09/02/18 13:56, Rafael J. Wysocki wrote:
> On Fri, Feb 9, 2018 at 1:52 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
> > On 09/02/18 13:08, Rafael J. Wysocki wrote:

[...]

> > > My impression is that the rate limit helps a lot for CFS, where the
> > > "true" utilization is not known in advance and being too responsive
> > > might actually be counterproductive.
> > >
> > > For DEADLINE (and RT, with differences) we should always respond as
> > > quickly as we can [...]
> >
> > Fair enough, but if there's too much overhead as a result of this, you
> > can't guarantee the deadlines to be met anyway.
>
> Indeed. I guess this only works if corner cases like the one above don't
> happen too often.
>
> Well, that's the point.
>
> So there is a tradeoff: do we want to allow deadlines to be missed because
> of excessive overhead, or do we want to allow deadlines to be missed
> because of the rate limit?

The difference between the two seems to be that, while overhead is an
intrinsic hw thing, the rate limit is something we mostly have because of
the nature of certain classes of tasks and of how we describe/track them
(at least IMHO). I'd say that for the other classes of tasks (DL/RT) we
would be better off consciously living with the former only, accepting
that the real world is "seldom" not ideal.

But then again, this is just another theory; experiments might easily
prove me wrong. :)
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index b0bd77d..d8dcba2 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -74,7 +74,10 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
 
 /************************ Governor internals ***********************/
 
-static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
+static bool sugov_should_update_freq(struct sugov_policy *sg_policy,
+				     u64 time,
+				     struct sugov_cpu *sg_cpu_old,
+				     struct sugov_cpu *sg_cpu_new)
 {
 	s64 delta_ns;
 
@@ -111,6 +114,10 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
 		return true;
 	}
 
+	/* Ignore rate limit when DL increased utilization. */
+	if (sg_cpu_new->util_dl > sg_cpu_old->util_dl)
+		return true;
+
 	delta_ns = time - sg_policy->last_freq_update_time;
 	return delta_ns >= sg_policy->freq_update_delay_ns;
 }
@@ -271,6 +278,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
 				unsigned int flags)
 {
 	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
+	struct sugov_cpu sg_cpu_old = *sg_cpu;
 	struct sugov_policy *sg_policy = sg_cpu->sg_policy;
 	unsigned long util, max;
 	unsigned int next_f;
@@ -279,7 +287,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
 	sugov_set_iowait_boost(sg_cpu, time, flags);
 	sg_cpu->last_update = time;
 
-	if (!sugov_should_update_freq(sg_policy, time))
+	if (!sugov_should_update_freq(sg_policy, time, &sg_cpu_old, sg_cpu))
 		return;
 
 	busy = sugov_cpu_is_busy(sg_cpu);
@@ -350,6 +358,7 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
 				unsigned int flags)
 {
 	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
+	struct sugov_cpu sg_cpu_old = *sg_cpu;
 	struct sugov_policy *sg_policy = sg_cpu->sg_policy;
 	unsigned int next_f;
 
@@ -359,7 +368,7 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
 	sugov_set_iowait_boost(sg_cpu, time, flags);
 	sg_cpu->last_update = time;
 
-	if (sugov_should_update_freq(sg_policy, time)) {
+	if (sugov_should_update_freq(sg_policy, time, &sg_cpu_old, sg_cpu)) {
 		next_f = sugov_next_freq_shared(sg_cpu, time);
 		sugov_update_commit(sg_policy, time, next_f);
 	}
When the SCHED_DEADLINE scheduling class increases the CPU utilization,
we should not wait for the rate limit, otherwise we may miss some deadline.

Tests using rt-app on Exynos5422 have shown reductions of about 10% of
deadline misses for tasks with low RT periods.

The patch applies on top of the one recently proposed by Peter to drop the
SCHED_CPUFREQ_* flags.

Signed-off-by: Claudio Scordino <claudio@evidence.eu.com>
CC: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
CC: Patrick Bellasi <patrick.bellasi@arm.com>
CC: Dietmar Eggemann <dietmar.eggemann@arm.com>
CC: Morten Rasmussen <morten.rasmussen@arm.com>
CC: Juri Lelli <juri.lelli@redhat.com>
CC: Viresh Kumar <viresh.kumar@linaro.org>
CC: Vincent Guittot <vincent.guittot@linaro.org>
CC: Todd Kjos <tkjos@android.com>
CC: Joel Fernandes <joelaf@google.com>
CC: linux-pm@vger.kernel.org
CC: linux-kernel@vger.kernel.org
---
 kernel/sched/cpufreq_schedutil.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)