diff mbox

[v2,2/3] cpufreq: intel_pstate: Adjust policy->max

Message ID 1461384233-24214-3-git-send-email-srinivas.pandruvada@linux.intel.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

srinivas pandruvada April 23, 2016, 4:03 a.m. UTC
When policy->max is changed via _PPC or sysfs and is more than the max non
turbo frequency, it does not really change resulting performance in some
processors. When policy->max results in a P-State ratio more than the
turbo activation ratio, then processor can choose any P-State up to max
turbo. So the user or _PPC setting has no value, but this can cause
undesirable side effects like:
- Showing reduced max percentage in Intel P-State sysfs
- It can cause reduced max performance, if the policy->max is set to
the least turbo frequency and because of precision error in calculation
of ceiling limit, we may end up in a limit which is in non turbo region.
This issue is more prone when we enforce _PPC limit, because of the way
_PPC limit is set to indicate the beginning of turbo region when config
TDP feature is in use.

When config TDP feature is ON, the max non turbo ratio can be less than
max physical non turbo ratio. In this case _PPC points to turbo activation
ratio + 1. In this case we don't need to treat this as the reduced
frequency in set_policy callback, as we can get performance up to max
turbo frequency.

In this change when config TDP is active (When the physical max non turbo
ratio is more than the current max non turbo ratio), any request above
current max non turbo is treated as full performance.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
---
 drivers/cpufreq/intel_pstate.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Konstantin Khlebnikov April 25, 2016, 10:13 a.m. UTC | #1
On 23.04.2016 07:03, Srinivas Pandruvada wrote:
> When policy->max is changed via _PPC or sysfs and is more than the max non
> turbo frequency, it does not really change resulting performance in some
> processors. When policy->max results in a P-State ratio more than the
> turbo activation ratio, then processor can choose any P-State up to max
> turbo. So the user or _PPC setting has no value, but this can cause
> undesirable side effects like:
> - Showing reduced max percentage in Intel P-State sysfs
> - It can cause reduced max performance, if the policy->max is set to
> the least turbo frequency and because of precision error in calculation
> of ceiling limit, we may end up in a limit which is in non turbo region.
> This issue is more prone when we enforce _PPC limit, because of the way
> _PPC limit is set to indicate the beginning of turbo region when config
> TDP feature is in use.

I don't understand this. This fix for configuration where maximum allowed
frequency between maximum non-turbo and first turbo frequency?
Or this address regression that Borislav Petkov reported last year where
_PSS had bogus pstate 0xff?

>
> When config TDP feature is ON, the max non turbo ratio can be less than
> max physical non turbo ratio. In this case _PPC points to turbo activation
> ratio + 1. In this case we don't need to treat this as the reduced
> frequency in set_policy callback, as we can get performance up to max
> turbo frequency.
>
> In this change when config TDP is active (When the physical max non turbo
> ratio is more than the current max non turbo ratio), any request above
> current max non turbo is treated as full performance.
>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> ---
>   drivers/cpufreq/intel_pstate.c | 12 ++++++++++++
>   1 file changed, 12 insertions(+)
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index b3e8124..c9cc72d 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -1428,11 +1428,23 @@ static void intel_pstate_set_performance_limits(struct perf_limits *limits)
>
>   static int intel_pstate_set_policy(struct cpufreq_policy *policy)
>   {
> +	struct cpudata *cpu;
> +
>   	if (!policy->cpuinfo.max_freq)
>   		return -ENODEV;
>
>   	intel_pstate_clear_update_util_hook(policy->cpu);
>
> +	cpu = all_cpu_data[0];
> +	if (cpu->pstate.max_pstate_physical > cpu->pstate.max_pstate) {
> +		if (policy->max < policy->cpuinfo.max_freq &&
> +		    policy->max > (cpu->pstate.max_pstate *
> +				   cpu->pstate.scaling)) {
> +			pr_info("policy->max > max non turbo frequency\n");
> +			policy->max = policy->cpuinfo.max_freq;
> +		}
> +	}
> +
>   	if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
>   		limits = &performance_limits;
>   		if (policy->max >= policy->cpuinfo.max_freq) {
>
srinivas pandruvada April 25, 2016, 6:58 p.m. UTC | #2
On Mon, 2016-04-25 at 13:13 +0300, Konstantin Khlebnikov wrote:
> On 23.04.2016 07:03, Srinivas Pandruvada wrote:
> > 
> > 
> 

[...]

> > - It can cause reduced max performance, if the policy->max is set
> > to
> > the least turbo frequency and because of precision error in
> > calculation
> > of ceiling limit, we may end up in a limit which is in non turbo
> > region.
> > This issue is more prone when we enforce _PPC limit, because of the
> > way
> > _PPC limit is set to indicate the beginning of turbo region when
> > config
> > TDP feature is in use.
> I don't understand this. This fix for configuration where maximum
> allowed
> frequency between maximum non-turbo and first turbo frequency?
> Or this address regression that Borislav Petkov reported last year
> where
> _PSS had bogus pstate 0xff?
> 

I am glad you asked this question. The requested max scaling frequency
either via _PPC or via cpufreq-sysfs, will be converted into a fixed
floating point max percent scale. On majority of the cases this will
result in correct max (What you set via scaling_max, you will see that
as max). But not 100% of time. If your _PPC is requested at a point
where we have issue, we will loose performance as we will not request
turbo.

Let's look at real example from a Broadwell laptop with config TDP.

_PSS table from a Broadwell laptop

2301000 2300000 2200000 2000000 1900000 1800000 1700000 1500000 1400000
1300000 1100000 1000000 900000 800000 600000 500000

The actual results by disabling config TDP so that we can get what you
requested on or below 2300000Khz.

scaling_max_freq	Max Requested P-State	Resultant scaling
max
---------------------------------------- ----------------------
2400000			18			2900000 (max
turbo)
2300000			17			2300000 (max
physical non turbo)
2200000			15			2100000
2100000			15			2100000
2000000			13			1900000 
1900000			13			1900000
1800000			12			1800000
1700000			11			1700000
1600000			10			1600000
1500000			f			1500000
1400000			e			1400000
1300000			d			1300000
1200000			c			1200000
1100000			a			1000000
1000000			a			1000000
900000			9			 900000
800000			8			 800000
700000			7			 700000
600000			6			 600000
500000			5			 500000
------------------------------------------------------------------


Now set the config TDP level 1 ratio as 0x0b (equivalent to 1100000KHz)
in BIOS (not every system will let you adjust this).
The turbo activation ratio will be set to one less than that, which
will be 0x0a (So any request above 1000000KHz should result in turbo
region assuming no thermal limits).
Here _PPC will request max to 1100000KHz (which basically should still
result in turbo as this is more than the turbo activation ratio upto
max allowable turbo frequency), but actual calculation resulted in a
max ceiling P-State which is 0x0a.
So under any load we will not go to turbo frequency. This will be a
huge performance hit.

Thanks,
Srinivas








--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki April 25, 2016, 9:24 p.m. UTC | #3
On Mon, Apr 25, 2016 at 8:58 PM, Srinivas Pandruvada
<srinivas.pandruvada@linux.intel.com> wrote:
> On Mon, 2016-04-25 at 13:13 +0300, Konstantin Khlebnikov wrote:
>> On 23.04.2016 07:03, Srinivas Pandruvada wrote:
>> >
>> >
>>
>
> [...]
>
>> > - It can cause reduced max performance, if the policy->max is set
>> > to
>> > the least turbo frequency and because of precision error in
>> > calculation
>> > of ceiling limit, we may end up in a limit which is in non turbo
>> > region.
>> > This issue is more prone when we enforce _PPC limit, because of the
>> > way
>> > _PPC limit is set to indicate the beginning of turbo region when
>> > config
>> > TDP feature is in use.
>> I don't understand this. This fix for configuration where maximum
>> allowed
>> frequency between maximum non-turbo and first turbo frequency?
>> Or this address regression that Borislav Petkov reported last year
>> where
>> _PSS had bogus pstate 0xff?
>>
>
> I am glad you asked this question. The requested max scaling frequency
> either via _PPC or via cpufreq-sysfs, will be converted into a fixed
> floating point max percent scale. On majority of the cases this will
> result in correct max (What you set via scaling_max, you will see that
> as max). But not 100% of time. If your _PPC is requested at a point
> where we have issue, we will loose performance as we will not request
> turbo.
>
> Let's look at real example from a Broadwell laptop with config TDP.
>
> _PSS table from a Broadwell laptop
>
> 2301000 2300000 2200000 2000000 1900000 1800000 1700000 1500000 1400000
> 1300000 1100000 1000000 900000 800000 600000 500000
>
> The actual results by disabling config TDP so that we can get what you
> requested on or below 2300000Khz.
>
> scaling_max_freq        Max Requested P-State   Resultant scaling
> max
> ---------------------------------------- ----------------------
> 2400000                 18                      2900000 (max
> turbo)
> 2300000                 17                      2300000 (max
> physical non turbo)
> 2200000                 15                      2100000
> 2100000                 15                      2100000
> 2000000                 13                      1900000
> 1900000                 13                      1900000
> 1800000                 12                      1800000
> 1700000                 11                      1700000
> 1600000                 10                      1600000
> 1500000                 f                       1500000
> 1400000                 e                       1400000
> 1300000                 d                       1300000
> 1200000                 c                       1200000
> 1100000                 a                       1000000
> 1000000                 a                       1000000
> 900000                  9                        900000
> 800000                  8                        800000
> 700000                  7                        700000
> 600000                  6                        600000
> 500000                  5                        500000
> ------------------------------------------------------------------
>
>
> Now set the config TDP level 1 ratio as 0x0b (equivalent to 1100000KHz)
> in BIOS (not every system will let you adjust this).
> The turbo activation ratio will be set to one less than that, which
> will be 0x0a (So any request above 1000000KHz should result in turbo
> region assuming no thermal limits).
> Here _PPC will request max to 1100000KHz (which basically should still
> result in turbo as this is more than the turbo activation ratio upto
> max allowable turbo frequency), but actual calculation resulted in a
> max ceiling P-State which is 0x0a.
> So under any load we will not go to turbo frequency. This will be a
> huge performance hit.

Maybe you can fold the above into the patch changelog?
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index b3e8124..c9cc72d 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1428,11 +1428,23 @@  static void intel_pstate_set_performance_limits(struct perf_limits *limits)
 
 static int intel_pstate_set_policy(struct cpufreq_policy *policy)
 {
+	struct cpudata *cpu;
+
 	if (!policy->cpuinfo.max_freq)
 		return -ENODEV;
 
 	intel_pstate_clear_update_util_hook(policy->cpu);
 
+	cpu = all_cpu_data[0];
+	if (cpu->pstate.max_pstate_physical > cpu->pstate.max_pstate) {
+		if (policy->max < policy->cpuinfo.max_freq &&
+		    policy->max > (cpu->pstate.max_pstate *
+				   cpu->pstate.scaling)) {
+			pr_info("policy->max > max non turbo frequency\n");
+			policy->max = policy->cpuinfo.max_freq;
+		}
+	}
+
 	if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
 		limits = &performance_limits;
 		if (policy->max >= policy->cpuinfo.max_freq) {