[V4] intel_pstate: Use avg_pstate instead of current_pstate

Message ID	1459754617-8872-1-git-send-email-philippe.longepe@linux.intel.com (mailing list archive)
State	Superseded, archived
Headers	show Return-Path: <linux-pm-owner@kernel.org> From: Philippe Longepe <philippe.longepe@linux.intel.com> To: linux-pm@vger.kernel.org Cc: srinivas.pandruvada@linux.intel.com, rafael@kernel.org, len.brown@intel.com Subject: [PATCH V4] intel_pstate: Use avg_pstate instead of current_pstate Date: Mon, 4 Apr 2016 09:23:36 +0200 Message-Id: <1459754617-8872-1-git-send-email-philippe.longepe@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-pm-owner@vger.kernel.org Precedence: bulk

plongepe April 4, 2016, 7:23 a.m. UTC

The result returned by pid_calc() is subtracted from current_pstate
(which is the pstate requested during the last period) in order to
obtain the target pstate for the current iteration.
However, current_pstate may not reflect the real current P-state of
the CPU.  In particular, that P-state may be higher because of the
frequency sharing per module.

The theory is:

- The load is the percentage of time spent in C0 and is related to
the average frequency during the same period (We'll not have the
same load at 1GHz or at 2GHz for the same task running).

- The current frequency can be completely different than the
average frequency (because of frequency sharing or throttling).

=> The frequency shift computed by the pid_calc is based on the
load, so it must be applied to the frequency with which the load
was measured.

Using the average pstate instead of current pstate solve some
migration issues (e.g when a task migrates from one core to another
in the same package/module and all of the cores in there except for
that particular one are basically idle).

Performance and power comparison with this patch on Android:

IPLoad+Avg-Pstate vs IP Load:

Benchmark               ?Perf    ?Power
FishTank                10.45%    3.1%
SmartBench-Gaming       -0.1%   -10.4%
SmartBench-Productivity -0.8%   -10.4%
CandyCrush                n/a   -17.4%
AngryBirds                n/a    -5.9%
videoPlayback             n/a   -13.9%
audioPlayback             n/a    -4.9%
IcyRocks-20-50           0.0%   -38.4%
iozone RR               -0.16%  -1.3%
iozone RW                0.74%  -1.3%

Comparison with the perf algorithm:
(this patch in cpu_load vs Core algorithm)

Benchmark               ?Perf    ?Power
SmartBench-Gaming       -0.58%   -22.8%
SmartBench-Productivity  0.82%
CandyCrush                n/a    -20.8%
AngryBirds                n/a    -37.0%
videoPlayback             n/a    -53.4%
audioPlayback             n/a     -2.1%
iozone RR               -0.55%   -13.29%
iozone RW                2.22%

=> No regression > 1% observed and a huge power improvement!

Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
---
 drivers/cpufreq/intel_pstate.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Rafael J. Wysocki April 22, 2016, 12:48 a.m. UTC | #1

On Monday, April 04, 2016 09:23:36 AM Philippe Longepe wrote:
> The result returned by pid_calc() is subtracted from current_pstate
> (which is the pstate requested during the last period) in order to
> obtain the target pstate for the current iteration.
> However, current_pstate may not reflect the real current P-state of
> the CPU.  In particular, that P-state may be higher because of the
> frequency sharing per module.
> 
> The theory is:
> 
> - The load is the percentage of time spent in C0 and is related to
> the average frequency during the same period (We'll not have the
> same load at 1GHz or at 2GHz for the same task running).
> 
> - The current frequency can be completely different than the
> average frequency (because of frequency sharing or throttling).
> 
> => The frequency shift computed by the pid_calc is based on the
> load, so it must be applied to the frequency with which the load
> was measured.
> 
> Using the average pstate instead of current pstate solve some
> migration issues (e.g when a task migrates from one core to another
> in the same package/module and all of the cores in there except for
> that particular one are basically idle).
> 
> Performance and power comparison with this patch on Android:
> 
> IPLoad+Avg-Pstate vs IP Load:
> 
> Benchmark               ?Perf    ?Power
> FishTank                10.45%    3.1%
> SmartBench-Gaming       -0.1%   -10.4%
> SmartBench-Productivity -0.8%   -10.4%
> CandyCrush                n/a   -17.4%
> AngryBirds                n/a    -5.9%
> videoPlayback             n/a   -13.9%
> audioPlayback             n/a    -4.9%
> IcyRocks-20-50           0.0%   -38.4%
> iozone RR               -0.16%  -1.3%
> iozone RW                0.74%  -1.3%
> 
> Comparison with the perf algorithm:
> (this patch in cpu_load vs Core algorithm)
> 
> Benchmark               ?Perf    ?Power
> SmartBench-Gaming       -0.58%   -22.8%
> SmartBench-Productivity  0.82%
> CandyCrush                n/a    -20.8%
> AngryBirds                n/a    -37.0%
> videoPlayback             n/a    -53.4%
> audioPlayback             n/a     -2.1%
> iozone RR               -0.55%   -13.29%
> iozone RW                2.22%
> 
> => No regression > 1% observed and a huge power improvement!
> 
> Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>

Srinivas, what about this one?  Yes?  No?  Maybe?

> ---
>  drivers/cpufreq/intel_pstate.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 4b64452..b998e1d 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -919,6 +919,12 @@ static inline int32_t get_avg_frequency(struct cpudata *cpu)
>  		cpu->pstate.scaling, cpu->sample.mperf);
>  }
>  
> +static inline int32_t get_avg_pstate(struct cpudata *cpu)
> +{
> +	return div64_u64(cpu->pstate.max_pstate_physical * cpu->sample.aperf,
> +		cpu->sample.mperf);
> +}
> +
>  static inline int32_t get_target_pstate_use_cpu_load(struct cpudata *cpu)
>  {
>  	struct sample *sample = &cpu->sample;
> @@ -951,7 +957,7 @@ static inline int32_t get_target_pstate_use_cpu_load(struct cpudata *cpu)
>  	cpu_load = div64_u64(int_tofp(100) * mperf, sample->tsc);
>  	cpu->sample.busy_scaled = cpu_load;
>  
> -	return cpu->pstate.current_pstate - pid_calc(&cpu->pid, cpu_load);
> +	return get_avg_pstate(cpu) - pid_calc(&cpu->pid, cpu_load);
>  }
>  
>  static inline int32_t get_target_pstate_use_performance(struct cpudata *cpu)
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

srinivas pandruvada April 22, 2016, 12:59 a.m. UTC | #2

On Fri, 2016-04-22 at 02:48 +0200, Rafael J. Wysocki wrote:
> On Monday, April 04, 2016 09:23:36 AM Philippe Longepe wrote:
> > 
> > The result returned by pid_calc() is subtracted from current_pstate
> > (which is the pstate requested during the last period) in order to
> > obtain the target pstate for the current iteration.
> > However, current_pstate may not reflect the real current P-state of
> > the CPU.  In particular, that P-state may be higher because of the
> > frequency sharing per module.
> > 
> > The theory is:
> > 
> > - The load is the percentage of time spent in C0 and is related to
> > the average frequency during the same period (We'll not have the
> > same load at 1GHz or at 2GHz for the same task running).
> > 
> > - The current frequency can be completely different than the
> > average frequency (because of frequency sharing or throttling).
> > 
> > => The frequency shift computed by the pid_calc is based on the
> > load, so it must be applied to the frequency with which the load
> > was measured.
> > 
> > Using the average pstate instead of current pstate solve some
> > migration issues (e.g when a task migrates from one core to another
> > in the same package/module and all of the cores in there except for
> > that particular one are basically idle).
> > 
> > Performance and power comparison with this patch on Android:
> > 
> > IPLoad+Avg-Pstate vs IP Load:
> > 
> > Benchmark               ?Perf    ?Power
> > FishTank                10.45%    3.1%
> > SmartBench-Gaming       -0.1%   -10.4%
> > SmartBench-Productivity -0.8%   -10.4%
> > CandyCrush                n/a   -17.4%
> > AngryBirds                n/a    -5.9%
> > videoPlayback             n/a   -13.9%
> > audioPlayback             n/a    -4.9%
> > IcyRocks-20-50           0.0%   -38.4%
> > iozone RR               -0.16%  -1.3%
> > iozone RW                0.74%  -1.3%
> > 
> > Comparison with the perf algorithm:
> > (this patch in cpu_load vs Core algorithm)
> > 
> > Benchmark               ?Perf    ?Power
> > SmartBench-Gaming       -0.58%   -22.8%
> > SmartBench-Productivity  0.82%
> > CandyCrush                n/a    -20.8%
> > AngryBirds                n/a    -37.0%
> > videoPlayback             n/a    -53.4%
> > audioPlayback             n/a     -2.1%
> > iozone RR               -0.55%   -13.29%
> > iozone RW                2.22%
> > 
> > => No regression > 1% observed and a huge power improvement!
> > 
> > Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>


> Srinivas, what about this one?  Yes?  No?  Maybe?
> 
I am fine with this change.
I would like to edit the commit and remove the bottom table compare
with core and statement with exclamation as this not relevant here for
the affected platforms.

Do you want me to send update?


Thanks,
Srinivas

> > 
> > ---
> >  drivers/cpufreq/intel_pstate.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/cpufreq/intel_pstate.c
> > b/drivers/cpufreq/intel_pstate.c
> > index 4b64452..b998e1d 100644
> > --- a/drivers/cpufreq/intel_pstate.c
> > +++ b/drivers/cpufreq/intel_pstate.c
> > @@ -919,6 +919,12 @@ static inline int32_t get_avg_frequency(struct
> > cpudata *cpu)
> >  		cpu->pstate.scaling, cpu->sample.mperf);
> >  }
> >  
> > +static inline int32_t get_avg_pstate(struct cpudata *cpu)
> > +{
> > +	return div64_u64(cpu->pstate.max_pstate_physical * cpu-
> > >sample.aperf,
> > +		cpu->sample.mperf);
> > +}
> > +
> >  static inline int32_t get_target_pstate_use_cpu_load(struct
> > cpudata *cpu)
> >  {
> >  	struct sample *sample = &cpu->sample;
> > @@ -951,7 +957,7 @@ static inline int32_t
> > get_target_pstate_use_cpu_load(struct cpudata *cpu)
> >  	cpu_load = div64_u64(int_tofp(100) * mperf, sample->tsc);
> >  	cpu->sample.busy_scaled = cpu_load;
> >  
> > -	return cpu->pstate.current_pstate - pid_calc(&cpu->pid,
> > cpu_load);
> > +	return get_avg_pstate(cpu) - pid_calc(&cpu->pid,
> > cpu_load);
> >  }
> >  
> >  static inline int32_t get_target_pstate_use_performance(struct
> > cpudata *cpu)
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Rafael J. Wysocki April 22, 2016, 1:01 a.m. UTC | #3

On Fri, Apr 22, 2016 at 2:59 AM, Srinivas Pandruvada
<srinivas.pandruvada@linux.intel.com> wrote:
> On Fri, 2016-04-22 at 02:48 +0200, Rafael J. Wysocki wrote:
>> On Monday, April 04, 2016 09:23:36 AM Philippe Longepe wrote:
>> >
>> > The result returned by pid_calc() is subtracted from current_pstate
>> > (which is the pstate requested during the last period) in order to
>> > obtain the target pstate for the current iteration.
>> > However, current_pstate may not reflect the real current P-state of
>> > the CPU.  In particular, that P-state may be higher because of the
>> > frequency sharing per module.
>> >
>> > The theory is:
>> >
>> > - The load is the percentage of time spent in C0 and is related to
>> > the average frequency during the same period (We'll not have the
>> > same load at 1GHz or at 2GHz for the same task running).
>> >
>> > - The current frequency can be completely different than the
>> > average frequency (because of frequency sharing or throttling).
>> >
>> > => The frequency shift computed by the pid_calc is based on the
>> > load, so it must be applied to the frequency with which the load
>> > was measured.
>> >
>> > Using the average pstate instead of current pstate solve some
>> > migration issues (e.g when a task migrates from one core to another
>> > in the same package/module and all of the cores in there except for
>> > that particular one are basically idle).
>> >
>> > Performance and power comparison with this patch on Android:
>> >
>> > IPLoad+Avg-Pstate vs IP Load:
>> >
>> > Benchmark               ?Perf    ?Power
>> > FishTank                10.45%    3.1%
>> > SmartBench-Gaming       -0.1%   -10.4%
>> > SmartBench-Productivity -0.8%   -10.4%
>> > CandyCrush                n/a   -17.4%
>> > AngryBirds                n/a    -5.9%
>> > videoPlayback             n/a   -13.9%
>> > audioPlayback             n/a    -4.9%
>> > IcyRocks-20-50           0.0%   -38.4%
>> > iozone RR               -0.16%  -1.3%
>> > iozone RW                0.74%  -1.3%
>> >
>> > Comparison with the perf algorithm:
>> > (this patch in cpu_load vs Core algorithm)
>> >
>> > Benchmark               ?Perf    ?Power
>> > SmartBench-Gaming       -0.58%   -22.8%
>> > SmartBench-Productivity  0.82%
>> > CandyCrush                n/a    -20.8%
>> > AngryBirds                n/a    -37.0%
>> > videoPlayback             n/a    -53.4%
>> > audioPlayback             n/a     -2.1%
>> > iozone RR               -0.55%   -13.29%
>> > iozone RW                2.22%
>> >
>> > => No regression > 1% observed and a huge power improvement!
>> >
>> > Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
>
>
>> Srinivas, what about this one?  Yes?  No?  Maybe?
>>
> I am fine with this change.
> I would like to edit the commit and remove the bottom table compare
> with core and statement with exclamation as this not relevant here for
> the affected platforms.
>
> Do you want me to send update?

Yes, please.
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Doug Smythies April 22, 2016, 3:18 p.m. UTC | #4

Srinivas,

Recall a couple of months ago, on the "Increase hold-off time before busyness is scaled"
thread, Stephane suggested we try applying this method on the get_target_pstate_use_performance
branch of the intel_pstate driver, as opposed to just on the get_target_pstate_use_cpu_load branch.

It didn't make any difference with respect to the hold-off time issue.
However, I did spend considerable time testing it in other scenarios.
It does somewhat temper the occasional tendency to suddenly have a ridiculously high
scaled busy number with virtually no load (the same issue from the
"[intel-pstate driver regression] processor frequency very high even if in idle" thread,
that continued in https://bugzilla.kernel.org/show_bug.cgi?id=115771 )
I didn't find any significant regression, and did observe some small energy savings
in some scenarios (admittedly, small enough to have possibly been simply test to test
variations, and I didn't do enough tests to extract a definite trend).

I am suggesting to consider extending the patch to get_target_pstate_use_performance also.

... Doug


--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

srinivas pandruvada April 22, 2016, 4:26 p.m. UTC | #5

Hi Doug,

On Fri, 2016-04-22 at 08:18 -0700, Doug Smythies wrote:
> Srinivas,
> 
> Recall a couple of months ago, on the "Increase hold-off time before
> busyness is scaled"
> thread, Stephane suggested we try applying this method on the
> get_target_pstate_use_performance
> branch of the intel_pstate driver, as opposed to just on the
> get_target_pstate_use_cpu_load branch.
> 
> It didn't make any difference with respect to the hold-off time
> issue.
> However, I did spend considerable time testing it in other scenarios.
> It does somewhat temper the occasional tendency to suddenly have a
> ridiculously high
> scaled busy number with virtually no load (the same issue from the
> "[intel-pstate driver regression] processor frequency very high even
> if in idle" thread,
> that continued in https://bugzilla.kernel.org/show_bug.cgi?id=115771 
> )
> I didn't find any significant regression, and did observe some small
> energy savings
> in some scenarios (admittedly, small enough to have possibly been
> simply test to test
> variations, and I didn't do enough tests to extract a definite
> trend).
> 
> I am suggesting to consider extending the patch to
> get_target_pstate_use_performance also.
> 
Many core platforms have per core P states. So here the assumption that
we get a boost of P State on one core because some activity on other
core will not hold true.

We are experimenting with algorithm to improve core performance
(similar approach as yours + IO boost), hopefully we can publish soon.

Thanks,
Srinivas


> ... Doug
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[V4] intel_pstate: Use avg_pstate instead of current_pstate

Commit Message

Comments

Patch