cpufreq: schedutil: rate limits for SCHED_DEADLINE

Message ID 1518109302-8239-1-git-send-email-claudio@evidence.eu.com (mailing list archive)
State Changes Requested, archived

Commit Message

Claudio Scordino Feb. 8, 2018, 5:01 p.m. UTC
When the SCHED_DEADLINE scheduling class increases the CPU utilization,
we should not wait for the rate limit, otherwise we may miss some deadlines.

Tests using rt-app on Exynos5422 have shown reductions of about 10% in
deadline misses for tasks with short RT periods.

The patch applies on top of the one recently proposed by Peter to drop the
SCHED_CPUFREQ_* flags.

Signed-off-by: Claudio Scordino <claudio@evidence.eu.com>
CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: Patrick Bellasi <patrick.bellasi@arm.com>
CC: Dietmar Eggemann <dietmar.eggemann@arm.com>
CC: Morten Rasmussen <morten.rasmussen@arm.com>
CC: Juri Lelli <juri.lelli@redhat.com>
CC: Viresh Kumar <viresh.kumar@linaro.org>
CC: Vincent Guittot <vincent.guittot@linaro.org>
CC: Todd Kjos <tkjos@android.com>
CC: Joel Fernandes <joelaf@google.com>
CC: linux-pm@vger.kernel.org
CC: linux-kernel@vger.kernel.org
---
 kernel/sched/cpufreq_schedutil.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

Comments

Viresh Kumar Feb. 9, 2018, 3:51 a.m. UTC | #1
On 08-02-18, 18:01, Claudio Scordino wrote:
> When the SCHED_DEADLINE scheduling class increases the CPU utilization,
> we should not wait for the rate limit, otherwise we may miss some deadlines.
> 
> Tests using rt-app on Exynos5422 have shown reductions of about 10% in
> deadline misses for tasks with short RT periods.
> 
> The patch applies on top of the one recently proposed by Peter to drop the
> SCHED_CPUFREQ_* flags.
> 
[...]

So the previous version was surely incorrect, as it relied on comparing
frequencies instead of DL utilization, and the frequency requirements
could even have changed due to CFS.

> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index b0bd77d..d8dcba2 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -74,7 +74,10 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
>  
>  /************************ Governor internals ***********************/
>  
> -static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> +static bool sugov_should_update_freq(struct sugov_policy *sg_policy,
> +				     u64 time,
> +				     struct sugov_cpu *sg_cpu_old,
> +				     struct sugov_cpu *sg_cpu_new)
>  {
>  	s64 delta_ns;
>  
> @@ -111,6 +114,10 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>  		return true;
>  	}
>  
> +	/* Ignore rate limit when DL increased utilization. */
> +	if (sg_cpu_new->util_dl > sg_cpu_old->util_dl)
> +		return true;
> +

Changing the frequency has a penalty, especially in the ARM world (and
that's where you are testing your stuff). I am worried that we will
have (corner) cases where we will waste a lot of time changing the
frequencies. For example (I may be wrong here), what if 10 small DL
tasks are queued one after the other? The util will keep on changing,
and so will the frequency? There may be more similar cases?

Is it possible to (somehow) check here if the DL tasks will miss
their deadlines if we continue to run at the current frequency? And
only ignore the rate limit if that is the case?
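
For illustration, a sketch of what such a check might look like on top of
this patch. This is hypothetical code, not something posted in the thread:
it reuses the 1.25x utilization headroom that schedutil's get_next_freq()
applies, and whether sg_policy->next_freq is the right reference is
exactly the open question:

/*
 * Hypothetical sketch: ignore the rate limit only when the currently
 * requested frequency cannot serve the admitted DL bandwidth, rather
 * than on any DL utilization increase.
 */
static bool sugov_dl_needs_freq_raise(struct sugov_policy *sg_policy,
				      unsigned long util_dl,
				      unsigned long max)
{
	unsigned int max_freq = sg_policy->policy->cpuinfo.max_freq;
	/* Minimum frequency able to serve util_dl, with schedutil's 1.25x headroom. */
	unsigned int required = (max_freq + (max_freq >> 2)) * util_dl / max;

	return required > sg_policy->next_freq;
}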

>  	delta_ns = time - sg_policy->last_freq_update_time;
>  	return delta_ns >= sg_policy->freq_update_delay_ns;
>  }
> @@ -271,6 +278,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>  				unsigned int flags)
>  {
>  	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
> +	struct sugov_cpu sg_cpu_old = *sg_cpu;

Not really a big deal, but this structure is 80 bytes on ARM64; why
copy everything when all we need is just 8 bytes?
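
For illustration, the cheaper variant would snapshot just that one field
(hypothetical sketch, not part of the patch):

static void sugov_update_single(struct update_util_data *hook, u64 time,
				unsigned int flags)
{
	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
	/* Snapshot only the DL utilization: 8 bytes instead of the ~80-byte struct. */
	unsigned long util_dl_old = sg_cpu->util_dl;

	/* ... rest as in the patch, comparing sg_cpu->util_dl against util_dl_old ... */
}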
Claudio Scordino Feb. 9, 2018, 8:02 a.m. UTC | #2
Hi Viresh,

On 09/02/2018 04:51, Viresh Kumar wrote:
> On 08-02-18, 18:01, Claudio Scordino wrote:
>> When the SCHED_DEADLINE scheduling class increases the CPU utilization,
>> we should not wait for the rate limit, otherwise we may miss some deadlines.
>>
>> Tests using rt-app on Exynos5422 have shown reductions of about 10% in
>> deadline misses for tasks with short RT periods.
>>
>> The patch applies on top of the one recently proposed by Peter to drop the
>> SCHED_CPUFREQ_* flags.
>>
[...]
> 
> So the previous version was surely incorrect, as it relied on comparing
> frequencies instead of DL utilization, and the frequency requirements
> could even have changed due to CFS.

You're right.
The very first version of the patch (not posted) added a specific SCHED_CPUFREQ flag to let the scheduling class ask for the rate limit to be ignored.
However, polluting the API with further flags is not such a good approach.
The next versions didn't introduce such a flag, but were incorrect.

> 
>> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> index b0bd77d..d8dcba2 100644
>> --- a/kernel/sched/cpufreq_schedutil.c
>> +++ b/kernel/sched/cpufreq_schedutil.c
>> @@ -74,7 +74,10 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
>>   
>>   /************************ Governor internals ***********************/
>>   
>> -static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>> +static bool sugov_should_update_freq(struct sugov_policy *sg_policy,
>> +				     u64 time,
>> +				     struct sugov_cpu *sg_cpu_old,
>> +				     struct sugov_cpu *sg_cpu_new)
>>   {
>>   	s64 delta_ns;
>>   
>> @@ -111,6 +114,10 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>>   		return true;
>>   	}
>>   
>> +	/* Ignore rate limit when DL increased utilization. */
>> +	if (sg_cpu_new->util_dl > sg_cpu_old->util_dl)
>> +		return true;
>> +
> 
> Changing the frequency has a penalty, especially in the ARM world (and
> that's where you are testing your stuff). I am worried that we will
> have (corner) cases where we will waste a lot of time changing the
> frequencies. For example (I may be wrong here), what if 10 small DL
> tasks are queued one after the other? The util will keep on changing,
> and so will the frequency? There may be more similar cases?

I forgot to say that I've not observed any relevant increase in energy consumption (measured through a BayLibre cape).
However, the tests had a very small number of RT tasks.

If I'm not wrong, at the hardware level we do have a physical rate limit (as we cannot trigger a frequency update while one is already ongoing).
I don't know whether this could somehow mitigate the effect.

Anyway, I'll repeat the tests with a considerable number of RT tasks to check whether I can reproduce such a "ramp-up" situation.
Depending on the energy results, we may have to choose between meeting more RT deadlines and consuming less energy.

> 
> Is it possible to (somehow) check here if the DL tasks will miss
> their deadlines if we continue to run at the current frequency? And
> only ignore the rate limit if that is the case?

I need to think further about it.

> 
>>   	delta_ns = time - sg_policy->last_freq_update_time;
>>   	return delta_ns >= sg_policy->freq_update_delay_ns;
>>   }
>> @@ -271,6 +278,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>>   				unsigned int flags)
>>   {
>>   	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
>> +	struct sugov_cpu sg_cpu_old = *sg_cpu;
> 
> Not really a big deal, but this structure is 80 bytes on ARM64; why
> copy everything when all we need is just 8 bytes?

I didn't want to add deadline-specific code to the sugov_should_update_freq() signature, as it should remain independent of the scheduling classes.
In my opinion, the best approach would be to group util_cfs and util_dl in a struct within sugov_cpu and pass that struct to sugov_should_update_freq().
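
For illustration, a hypothetical sketch of that grouping (names are made
up; only the grouped fields would then need to be snapshotted and passed
around):

/* Per-class utilization values, grouped so they can be copied and
 * compared as one small unit (~16 bytes) while keeping
 * sugov_should_update_freq() agnostic of the scheduling classes. */
struct sugov_util {
	unsigned long	cfs;
	unsigned long	dl;
};

struct sugov_cpu {
	struct update_util_data	update_util;
	struct sugov_policy	*sg_policy;
	struct sugov_util	util;	/* replaces util_cfs/util_dl */
	/* ... remaining fields unchanged ... */
};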

Thanks for your comments.

                Claudio
Viresh Kumar Feb. 9, 2018, 8:40 a.m. UTC | #3
On 09-02-18, 09:02, Claudio Scordino wrote:
> If I'm not wrong, at the hardware level we do have a physical rate limit (as we cannot trigger a frequency update while one is already ongoing).
> I don't know whether this could somehow mitigate the effect.

Yeah, so in the worst case we will start a new freq-change right after
the previous one has finished.
Rafael J. Wysocki Feb. 9, 2018, 10:36 a.m. UTC | #4
On Friday, February 9, 2018 9:02:34 AM CET Claudio Scordino wrote:
> Hi Viresh,
> 
> On 09/02/2018 04:51, Viresh Kumar wrote:
[...]
> 
> > 
> > Is it possible to (somehow) check here if the DL tasks will miss
> > their deadlines if we continue to run at the current frequency? And
> > only ignore the rate limit if that is the case?
> 
> I need to think further about it.

That would be my approach FWIW.

Increasing the frequency beyond what is necessary means wasting energy
in any case.

Thanks,
Rafael
Juri Lelli Feb. 9, 2018, 10:53 a.m. UTC | #5
Hi,

On 09/02/18 11:36, Rafael J. Wysocki wrote:
> On Friday, February 9, 2018 9:02:34 AM CET Claudio Scordino wrote:
> > Hi Viresh,
> > 
> > On 09/02/2018 04:51, Viresh Kumar wrote:
[...]
> > > Is it possible to (somehow) check here if the DL tasks will miss
> > > their deadlines if we continue to run at the current frequency? And
> > > only ignore the rate limit if that is the case?

Isn't it always the case? Utilization associated with DL tasks is given
by what the user said is needed to meet a task's deadlines (admission
control). If that task wakes up and we realize that adding its
utilization contribution is going to require a frequency change, we
should _theoretically_ always do it, or it will be too late. Now, the
user might have asked for a bit more than what is strictly required
(this is this is usually the case, to compensate for discrepancies between
theory and the real world, e.g. hw transition limits), but I don't think
there is a way to know "how much". :/

Thanks,

- Juri

Rafael J. Wysocki Feb. 9, 2018, 11:04 a.m. UTC | #6
On Fri, Feb 9, 2018 at 11:53 AM, Juri Lelli <juri.lelli@redhat.com> wrote:
> Hi,
>
> On 09/02/18 11:36, Rafael J. Wysocki wrote:
[...]
>> > > Is it possible to (somehow) check here if the DL tasks will miss
>> > > their deadlines if we continue to run at the current frequency? And
>> > > only ignore the rate limit if that is the case?
>
> Isn't it always the case? Utilization associated with DL tasks is given
> by what the user said is needed to meet a task's deadlines (admission
> control). If that task wakes up and we realize that adding its
> utilization contribution is going to require a frequency change, we
> should _theoretically_ always do it, or it will be too late. Now, the
> user might have asked for a bit more than what is strictly required
> (this is usually the case, to compensate for discrepancies between
> theory and the real world, e.g. hw transition limits), but I don't think
> there is a way to know "how much". :/

You are right.

I'm somewhat concerned about "fast switch" cases when the rate limit
is used to reduce overhead.
Rafael J. Wysocki Feb. 9, 2018, 11:14 a.m. UTC | #7
On Thu, Feb 8, 2018 at 6:01 PM, Claudio Scordino
<claudio@evidence.eu.com> wrote:
> When the SCHED_DEADLINE scheduling class increases the CPU utilization,
> we should not wait for the rate limit, otherwise we may miss some deadlines.
>
> Tests using rt-app on Exynos5422 have shown reductions of about 10% in
> deadline misses for tasks with short RT periods.
>
> The patch applies on top of the one recently proposed by Peter to drop the
> SCHED_CPUFREQ_* flags.
>
[...]
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index b0bd77d..d8dcba2 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -74,7 +74,10 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
>
>  /************************ Governor internals ***********************/
>
> -static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> +static bool sugov_should_update_freq(struct sugov_policy *sg_policy,
> +                                    u64 time,
> +                                    struct sugov_cpu *sg_cpu_old,
> +                                    struct sugov_cpu *sg_cpu_new)

This looks somewhat excessive for using just one field from each of these.

>  {
>         s64 delta_ns;
>
> @@ -111,6 +114,10 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>                 return true;
>         }
>
> +       /* Ignore rate limit when DL increased utilization. */
> +       if (sg_cpu_new->util_dl > sg_cpu_old->util_dl)
> +               return true;
> +
>         delta_ns = time - sg_policy->last_freq_update_time;
>         return delta_ns >= sg_policy->freq_update_delay_ns;
>  }
> @@ -271,6 +278,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>                                 unsigned int flags)
>  {
>         struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
> +       struct sugov_cpu sg_cpu_old = *sg_cpu;

And here you copy the entire struct to pass a pointer to the copy to a
helper function so that it can access one field.

That doesn't look particularly straightforward to me, not to mention the
overhead.

I guess you may do the check before calling sugov_should_update_freq()
and set sg_policy->need_freq_update if it's true, as you know upfront
that the previous sg_policy->next_freq value isn't going to be used
anyway in that case.
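
For illustration, a minimal sketch of that alternative (hypothetical
helper, relying on the existing need_freq_update handling in
sugov_should_update_freq()):

/*
 * Latch need_freq_update in the caller when DL utilization grew, so
 * sugov_should_update_freq() keeps its original signature and no
 * sugov_cpu copy is needed.
 */
static inline void sugov_note_dl_increase(struct sugov_policy *sg_policy,
					  unsigned long util_dl_old,
					  unsigned long util_dl_new)
{
	/* A DL bandwidth increase must not wait for the rate limit. */
	if (util_dl_new > util_dl_old)
		sg_policy->need_freq_update = true;
}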
Juri Lelli Feb. 9, 2018, 11:26 a.m. UTC | #8
On 09/02/18 12:04, Rafael J. Wysocki wrote:
> On Fri, Feb 9, 2018 at 11:53 AM, Juri Lelli <juri.lelli@redhat.com> wrote:
[...]
> >> > > Is it possible to (somehow) check here if the DL tasks will miss
> >> > > their deadlines if we continue to run at the current frequency? And
> >> > > only ignore the rate limit if that is the case?
> >
> > Isn't it always the case? Utilization associated with DL tasks is given
> > by what the user said is needed to meet a task's deadlines (admission
> > control). If that task wakes up and we realize that adding its
> > utilization contribution is going to require a frequency change, we
> > should _theoretically_ always do it, or it will be too late. Now, the
> > user might have asked for a bit more than what is strictly required
> > (this is usually the case, to compensate for discrepancies between
> > theory and the real world, e.g. hw transition limits), but I don't think
> > there is a way to know "how much". :/
> 
> You are right.
> 
> I'm somewhat concerned about "fast switch" cases when the rate limit
> is used to reduce overhead.

Mmm, right. I'm thinking that in those cases we could leave rate limit
as is. The user should then be aware of it and consider it as proper
overhead when designing her/his system.

But then, isn't it the same for "non fast switch" platforms? I mean,
even in the latter case we can't go faster than hw limits.. mmm, maybe
the difference is that in the former case we could go as fast as theory
would expect.. but we shouldn't. :)
Rafael J. Wysocki Feb. 9, 2018, 11:37 a.m. UTC | #9
On Fri, Feb 9, 2018 at 12:26 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
> On 09/02/18 12:04, Rafael J. Wysocki wrote:
>> On Fri, Feb 9, 2018 at 11:53 AM, Juri Lelli <juri.lelli@redhat.com> wrote:
[...]
>> > Isn't it always the case? Utilization associated with DL tasks is given
>> > by what the user said is needed to meet a task's deadlines (admission
>> > control). If that task wakes up and we realize that adding its
>> > utilization contribution is going to require a frequency change, we
>> > should _theoretically_ always do it, or it will be too late. Now, the
>> > user might have asked for a bit more than what is strictly required
>> > (this is usually the case, to compensate for discrepancies between
>> > theory and the real world, e.g. hw transition limits), but I don't think
>> > there is a way to know "how much". :/
>>
>> You are right.
>>
>> I'm somewhat concerned about "fast switch" cases when the rate limit
>> is used to reduce overhead.
>
> Mmm, right. I'm thinking that in those cases we could leave rate limit
> as is. The user should then be aware of it and consider it as proper
> overhead when designing her/his system.
>
> But then, isn't it the same for "non fast switch" platforms? I mean,
> even in the latter case we can't go faster than hw limits.. mmm, maybe
> the difference is that in the former case we could go as fast as theory
> would expect.. but we shouldn't. :)

Well, in practical terms that means "no difference" IMO. :-)

I can imagine that in some cases this approach may lead to better
results than reducing the rate limit overall, but the general case I'm
not sure about.

I mean, if overriding the rate limit doesn't take place very often,
then it really should make no difference overhead-wise.  Now, of
course, how to define "not very often" is a good question as that
leads to rate-limiting the overriding of the original rate limit and
that scheme may continue indefinitely ...
Juri Lelli Feb. 9, 2018, 11:51 a.m. UTC | #10
On 09/02/18 12:37, Rafael J. Wysocki wrote:
> On Fri, Feb 9, 2018 at 12:26 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
> > On 09/02/18 12:04, Rafael J. Wysocki wrote:
[...]
> >>
> >> You are right.
> >>
> >> I'm somewhat concerned about "fast switch" cases when the rate limit
> >> is used to reduce overhead.
> >
> > Mmm, right. I'm thinking that in those cases we could leave rate limit
> > as is. The user should then be aware of it and consider it as proper
> > overhead when designing her/his system.
> >
> > But then, isn't it the same for "non fast switch" platforms? I mean,
> > even in the latter case we can't go faster than hw limits.. mmm, maybe
> > the difference is that in the former case we could go as fast as theory
> > would expect.. but we shouldn't. :)
> 
> Well, in practical terms that means "no difference" IMO. :-)
> 
> I can imagine that in some cases this approach may lead to better
> results than reducing the rate limit overall, but the general case I'm
> not sure about.
> 
> I mean, if overriding the rate limit doesn't take place very often,
> then it really should make no difference overhead-wise.  Now, of
> course, how to define "not very often" is a good question as that
> leads to rate-limiting the overriding of the original rate limit and
> that scheme may continue indefinitely ...

:)

My impression is that rate limit helps a lot for CFS, where the "true"
utilization is not known in advance, and being too responsive might
actually be counterproductive.

For DEADLINE (and RT, with differences) we should always respond as
quickly as we can (and probably remember that a frequency transition was
requested if the hw was already performing one, but that's another patch)
because, if we don't, a task belonging to a lower priority class might
induce deadline misses in the highest priority activities. E.g., a CFS
task that happens to trigger a freq switch right before a DEADLINE task
wakes up and needs a higher frequency to meet its deadline: if we wait
for the rate limit of the CFS-originated transition.. deadline miss!
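
For illustration, a hypothetical sketch of that "remember the request"
idea (entirely made-up names, and indeed material for another patch):

/* One-deep latch: if the hardware is still busy with a transition,
 * remember the newest request and replay it on completion instead of
 * dropping it. */
struct freq_request_latch {
	bool		in_transition;
	unsigned int	pending_freq;	/* 0 means no pending request */
};

static void latch_request(struct freq_request_latch *l, unsigned int next_f)
{
	if (l->in_transition) {
		l->pending_freq = next_f;	/* replayed when hw is done */
		return;
	}
	l->in_transition = true;
	/* ... start the hardware transition to next_f here ... */
}

static void latch_transition_done(struct freq_request_latch *l)
{
	unsigned int f = l->pending_freq;

	l->in_transition = false;
	if (f) {
		l->pending_freq = 0;
		latch_request(l, f);	/* replay the remembered request */
	}
}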
Rafael J. Wysocki Feb. 9, 2018, 12:08 p.m. UTC | #11
On Fri, Feb 9, 2018 at 12:51 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
> On 09/02/18 12:37, Rafael J. Wysocki wrote:
>> On Fri, Feb 9, 2018 at 12:26 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
>> > On 09/02/18 12:04, Rafael J. Wysocki wrote:
[...]
>> >>
>> >> You are right.
>> >>
>> >> I'm somewhat concerned about "fast switch" cases when the rate limit
>> >> is used to reduce overhead.
>> >
>> > Mmm, right. I'm thinking that in those cases we could leave rate limit
>> > as is. The user should then be aware of it and consider it as proper
>> > overhead when designing her/his system.
>> >
>> > But then, isn't it the same for "non fast switch" platforms? I mean,
>> > even in the latter case we can't go faster than hw limits.. mmm, maybe
>> > the difference is that in the former case we could go as fast as theory
>> > would expect.. but we shouldn't. :)
>>
>> Well, in practical terms that means "no difference" IMO. :-)
>>
>> I can imagine that in some cases this approach may lead to better
>> results than reducing the rate limit overall, but the general case I'm
>> not sure about.
>>
>> I mean, if overriding the rate limit doesn't take place very often,
>> then it really should make no difference overhead-wise.  Now, of
>> course, how to define "not very often" is a good question as that
>> leads to rate-limiting the overriding of the original rate limit and
>> that scheme may continue indefinitely ...
>
> :)
>
> My impression is that rate limit helps a lot for CFS, where the "true"
> utilization is not known in advance, and being too responsive might
> actually be counterproductive.
>
> For DEADLINE (and RT, with differences) we should always respond as
> quickly as we can (and probably remember that a frequency transition was
> requested if the hw was already performing one, but that's another patch)
> because, if we don't, a task belonging to a lower priority class might
> induce deadline misses in the highest priority activities. E.g., a CFS
> task that happens to trigger a freq switch right before a DEADLINE task
> wakes up and needs a higher frequency to meet its deadline: if we wait
> for the rate limit of the CFS-originated transition.. deadline miss!

Fair enough, but if there's too much overhead as a result of this, you
can't guarantee the deadlines to be met anyway.
Juri Lelli Feb. 9, 2018, 12:52 p.m. UTC | #12
On 09/02/18 13:08, Rafael J. Wysocki wrote:
> On Fri, Feb 9, 2018 at 12:51 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
[...]
> >
> > :)
> >
> > My impression is that rate limit helps a lot for CFS, where the "true"
> > utilization is not known in advance, and being too responsive might
> > actually be counterproductive.
> >
> > For DEADLINE (and RT, with differences) we should always respond as
> > quickly as we can (and probably remember that a frequency transition was
> > requested if the hw was already performing one, but that's another patch)
> > because, if we don't, a task belonging to a lower priority class might
> > induce deadline misses in the highest priority activities. E.g., a CFS
> > task that happens to trigger a freq switch right before a DEADLINE task
> > wakes up and needs a higher frequency to meet its deadline: if we wait
> > for the rate limit of the CFS-originated transition.. deadline miss!
> 
> Fair enough, but if there's too much overhead as a result of this, you
> can't guarantee the deadlines to be met anyway.

Indeed. I guess this only works if corner cases like the one above don't
happen too often.
Rafael J. Wysocki Feb. 9, 2018, 12:56 p.m. UTC | #13
On Fri, Feb 9, 2018 at 1:52 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
> On 09/02/18 13:08, Rafael J. Wysocki wrote:
>> On Fri, Feb 9, 2018 at 12:51 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
[...]
>> >
>> > :)
>> >
>> > My impression is that rate limit helps a lot for CFS, where the "true"
>> > utilization is not known in advance, and being too responsive might
>> > actually be counterproductive.
>> >
>> > For DEADLINE (and RT, with differences) we should always respond as
>> > quickly as we can (and probably remember that a frequency transition was
>> > requested if the hw was already performing one, but that's another patch)
>> > because, if we don't, a task belonging to a lower priority class might
>> > induce deadline misses in the highest priority activities. E.g., a CFS
>> > task that happens to trigger a freq switch right before a DEADLINE task
>> > wakes up and needs a higher frequency to meet its deadline: if we wait
>> > for the rate limit of the CFS-originated transition.. deadline miss!
>>
>> Fair enough, but if there's too much overhead as a result of this, you
>> can't guarantee the deadlines to be met anyway.
>
> Indeed. I guess this only works if corner cases like the one above don't
> happen too often.

Well, that's the point.

So there is a tradeoff: do we want to allow deadlines to be missed
because of excessive overhead, or do we want to allow deadlines to be
missed because of the rate limit?
Claudio Scordino Feb. 9, 2018, 1:20 p.m. UTC | #14
On 09/02/2018 13:56, Rafael J. Wysocki wrote:
> On Fri, Feb 9, 2018 at 1:52 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
>> On 09/02/18 13:08, Rafael J. Wysocki wrote:
[...]
>>>
>>> Fair enough, but if there's too much overhead as a result of this, you
>>> can't guarantee the deadlines to be met anyway.
>>
>> Indeed. I guess this only works if corner cases like the one above don't
>> happen too often.
> 
> Well, that's the point.
> 
> So there is a tradeoff: do we want to allow deadlines to be missed
> because of excessive overhead, or do we want to allow deadlines to be
> missed because of the rate limit?

For a small number of tasks, the tests have indeed shown that the approach pays off: we get a significant reduction in deadline misses with a negligible increase in energy consumption.
I still need to check what happens with a large number of tasks, trying to reproduce the "ramp-up" pattern (in which DL keeps increasing the utilization, ignoring the rate limit and adding overhead).

Thanks,

               Claudio
Juri Lelli Feb. 9, 2018, 1:25 p.m. UTC | #15
On 09/02/18 13:56, Rafael J. Wysocki wrote:
> On Fri, Feb 9, 2018 at 1:52 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
> > On 09/02/18 13:08, Rafael J. Wysocki wrote:
> >> On Fri, Feb 9, 2018 at 12:51 PM, Juri Lelli <juri.lelli@redhat.com> wrote:

[...]

> >> > My impression is that rate limit helps a lot for CFS, where the "true"
> >> > utilization is not known in advance, and being too responsive might
> >> > actually be counterproductive.
> >> >
> >> > For DEADLINE (and RT, with differences) we should always respond as
> >> > quickly as we can (and probably remember that a frequency transition was
> >> > requested if the hw was already performing one, but that's another patch)
> >> > because, if we don't, a task belonging to a lower priority class might
> >> > induce deadline misses in the highest priority activities. E.g., a CFS
> >> > task that happens to trigger a freq switch right before a DEADLINE task
> >> > wakes up and needs a higher frequency to meet its deadline: if we wait
> >> > for the rate limit of the CFS-originated transition.. deadline miss!
> >>
> >> Fair enough, but if there's too much overhead as a result of this, you
> >> can't guarantee the deadlines to be met anyway.
> >
> > Indeed. I guess this only works if corner cases like the one above don't
> > happen too often.
> 
> Well, that's the point.
> 
> So there is a tradeoff: do we want to allow deadlines to be missed
> because of excessive overhead, or do we want to allow deadlines to be
> missed because of the rate limit?

The difference between the two seems to be that while overhead is an
intrinsic hw thing, the rate limit is something we mostly have to cope
with because of the nature of certain classes of tasks and how we
describe/track them (at least IMHO). I'd say that for the other classes
of tasks (DL/RT) we would be better off consciously living with the
former only and accepting that the real world is "seldom" not ideal.

But then again, this is just another theory; experiments might easily
prove me wrong. :)

Patch

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index b0bd77d..d8dcba2 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -74,7 +74,10 @@  static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
 
 /************************ Governor internals ***********************/
 
-static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
+static bool sugov_should_update_freq(struct sugov_policy *sg_policy,
+				     u64 time,
+				     struct sugov_cpu *sg_cpu_old,
+				     struct sugov_cpu *sg_cpu_new)
 {
 	s64 delta_ns;
 
@@ -111,6 +114,10 @@  static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
 		return true;
 	}
 
+	/* Ignore rate limit when DL increased utilization. */
+	if (sg_cpu_new->util_dl > sg_cpu_old->util_dl)
+		return true;
+
 	delta_ns = time - sg_policy->last_freq_update_time;
 	return delta_ns >= sg_policy->freq_update_delay_ns;
 }
@@ -271,6 +278,7 @@  static void sugov_update_single(struct update_util_data *hook, u64 time,
 				unsigned int flags)
 {
 	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
+	struct sugov_cpu sg_cpu_old = *sg_cpu;
 	struct sugov_policy *sg_policy = sg_cpu->sg_policy;
 	unsigned long util, max;
 	unsigned int next_f;
@@ -279,7 +287,7 @@  static void sugov_update_single(struct update_util_data *hook, u64 time,
 	sugov_set_iowait_boost(sg_cpu, time, flags);
 	sg_cpu->last_update = time;
 
-	if (!sugov_should_update_freq(sg_policy, time))
+	if (!sugov_should_update_freq(sg_policy, time, &sg_cpu_old, sg_cpu))
 		return;
 
 	busy = sugov_cpu_is_busy(sg_cpu);
@@ -350,6 +358,7 @@  static void sugov_update_shared(struct update_util_data *hook, u64 time,
 				unsigned int flags)
 {
 	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
+	struct sugov_cpu sg_cpu_old = *sg_cpu;
 	struct sugov_policy *sg_policy = sg_cpu->sg_policy;
 	unsigned int next_f;
 
@@ -359,7 +368,7 @@  static void sugov_update_shared(struct update_util_data *hook, u64 time,
 	sugov_set_iowait_boost(sg_cpu, time, flags);
 	sg_cpu->last_update = time;
 
-	if (sugov_should_update_freq(sg_policy, time)) {
+	if (sugov_should_update_freq(sg_policy, time, &sg_cpu_old, sg_cpu)) {
 		next_f = sugov_next_freq_shared(sg_cpu, time);
 		sugov_update_commit(sg_policy, time, next_f);
 	}