[v2] cpufreq: pcc-cpufreq: Disable dynamic scaling on many-CPU systems

Message ID: 5423012.ZZnfdYddaT@aspire.rjw.lan
State: Mainlined
Delegated to: Rafael Wysocki

Commit Message

Rafael J. Wysocki July 17, 2018, 4:14 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The firmware interface used by the pcc-cpufreq driver is
fundamentally not scalable and using it for dynamic CPU performance
scaling on systems with many CPUs leads to degraded performance.

For this reason, disable dynamic CPU performance scaling on systems
with pcc-cpufreq where the number of CPUs present at the driver init
time is greater than 4.  Also make the driver print corresponding
complaints to the kernel log.

Reported-by: Andreas Herrmann <aherrmann@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

-> v2: Rework the messages printed in the problematic case.

---
 drivers/cpufreq/pcc-cpufreq.c |    9 +++++++++
 1 file changed, 9 insertions(+)
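
As a rough illustration of the check described in the commit message, the user-space C sketch below mirrors the decision made at driver init time. The struct, the flag value, and every sketch_-prefixed name are hypothetical stand-ins and not the kernel definitions; only the threshold of 4 CPUs is taken from the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel definitions; the flag value is illustrative. */
#define SKETCH_NO_AUTO_DYNAMIC_SWITCHING (1u << 0)
#define SKETCH_MAX_DYNAMIC_CPUS 4u	/* threshold taken from the patch */

struct sketch_driver {
	unsigned int flags;
};

/*
 * Set the "no dynamic switching" flag when more CPUs are present than the
 * threshold allows. Returns true if dynamic scaling was disabled.
 */
static bool sketch_limit_dynamic_scaling(struct sketch_driver *drv,
					 unsigned int present_cpus)
{
	if (present_cpus <= SKETCH_MAX_DYNAMIC_CPUS)
		return false;

	drv->flags |= SKETCH_NO_AUTO_DYNAMIC_SWITCHING;
	fprintf(stderr, "%s: Too many CPUs, dynamic performance scaling disabled\n",
		__func__);
	return true;
}
```

With 4 present CPUs the flags stay untouched; with 120 (the count on the reporter's system) the flag is set before the driver would be registered.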

Comments

Andreas Herrmann July 17, 2018, 8:13 p.m. UTC | #1
On Tue, Jul 17, 2018 at 06:14:58PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> The firmware interface used by the pcc-cpufreq driver is
> fundamentally not scalable and using it for dynamic CPU performance
> scaling on systems with many CPUs leads to degraded performance.
> 
> For this reason, disable dynamic CPU performance scaling on systems
> with pcc-cpufreq where the number of CPUs present at the driver init
> time is greater than 4.  Also make the driver print corresponding
> complaints to the kernel log.
> 
> Reported-by: Andreas Herrmann <aherrmann@suse.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> -> v2: Rework the messages printed in the problematic case.

I've tested this patch. Effect is as expected: driver loads but use of
ondemand governor is not allowed. Sample output:

[   40.757519] pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1200 MHz, 2800 MHz
[   40.831705] pcc_cpufreq_init: Too many CPUs, dynamic performance scaling disabled
[   40.898353] pcc_cpufreq_init: Try to enable a different scaling driver through BIOS settings
[   40.972327] pcc_cpufreq_init: and complain to the system vendor
[   41.025620] cpufreq: Can't use ondemand governor as dynamic switching is disallowed. Fallback to performance governor
...
[   41.187928] cpufreq: Can't use ondemand governor as dynamic switching is disallowed. Fallback to performance governor

The last message is shown once for each online CPU in the system (i.e. 120x).

Looks good to me.


Andreas

> ---
>  drivers/cpufreq/pcc-cpufreq.c |    9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> Index: linux-pm/drivers/cpufreq/pcc-cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/pcc-cpufreq.c
> +++ linux-pm/drivers/cpufreq/pcc-cpufreq.c
> @@ -589,6 +589,15 @@ static int __init pcc_cpufreq_init(void)
>  		return ret;
>  	}
>  
> +	if (num_present_cpus() > 4) {
> +		pcc_cpufreq_driver.flags |= CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING;
> +		pr_err("%s: Too many CPUs, dynamic performance scaling disabled\n",
> +		       __func__);
> +		pr_err("%s: Try to enable a different scaling driver through BIOS settings\n",
> +		       __func__);
> +		pr_err("%s: and complain to the system vendor\n", __func__);
> +	}
> +
>  	ret = cpufreq_register_driver(&pcc_cpufreq_driver);
>  
>  	return ret;
> 
>

Rafael J. Wysocki July 18, 2018, 7:44 a.m. UTC | #2
On Tue, Jul 17, 2018 at 10:13 PM, Andreas Herrmann <aherrmann@suse.com> wrote:
> On Tue, Jul 17, 2018 at 06:14:58PM +0200, Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> The firmware interface used by the pcc-cpufreq driver is
>> fundamentally not scalable and using it for dynamic CPU performance
>> scaling on systems with many CPUs leads to degraded performance.
>>
>> For this reason, disable dynamic CPU performance scaling on systems
>> with pcc-cpufreq where the number of CPUs present at the driver init
>> time is greater than 4.  Also make the driver print corresponding
>> complaints to the kernel log.
>>
>> Reported-by: Andreas Herrmann <aherrmann@suse.com>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> ---
>>
>> -> v2: Rework the messages printed in the problematic case.
>
> I've tested this patch. Effect is as expected: driver loads but use of
> ondemand governor is not allowed. Sample output:
>
> [   40.757519] pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1200 MHz, 2800 MHz
> [   40.831705] pcc_cpufreq_init: Too many CPUs, dynamic performance scaling disabled
> [   40.898353] pcc_cpufreq_init: Try to enable a different scaling driver through BIOS settings
> [   40.972327] pcc_cpufreq_init: and complain to the system vendor
> [   41.025620] cpufreq: Can't use ondemand governor as dynamic switching is disallowed. Fallback to performance governor
> ...
> [   41.187928] cpufreq: Can't use ondemand governor as dynamic switching is disallowed. Fallback to performance governor
>
> Last message is shown for each online CPU in the system  (ie. 120x).
>
> Looks good to me.

Thanks a lot!

Please also try https://patchwork.kernel.org/patch/10530321/

Cheers,
Rafael

Peter Zijlstra July 18, 2018, 8:23 a.m. UTC | #3
On Tue, Jul 17, 2018 at 10:13:23PM +0200, Andreas Herrmann wrote:
> On Tue, Jul 17, 2018 at 06:14:58PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > The firmware interface used by the pcc-cpufreq driver is
> > fundamentally not scalable and using it for dynamic CPU performance
> > scaling on systems with many CPUs leads to degraded performance.
> > 
> > For this reason, disable dynamic CPU performance scaling on systems
> > with pcc-cpufreq where the number of CPUs present at the driver init
> > time is greater than 4.  Also make the driver print corresponding
> > complaints to the kernel log.
> > 
> > Reported-by: Andreas Herrmann <aherrmann@suse.com>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> > 
> > -> v2: Rework the messages printed in the problematic case.
> 
> I've tested this patch. Effect is as expected: driver loads but use of
> ondemand governor is not allowed. Sample output:
> 
> [   40.757519] pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1200 MHz, 2800 MHz
> [   40.831705] pcc_cpufreq_init: Too many CPUs, dynamic performance scaling disabled
> [   40.898353] pcc_cpufreq_init: Try to enable a different scaling driver through BIOS settings

BTW, Andreas, is that BIOS option available through the normal BIOS
settings, or is it in the "secret" BIOS menu that HP has? If it is in
the "secret" one (^A IIRC) then we might want to explicitly mention
that.

Andreas Herrmann July 18, 2018, 9:34 a.m. UTC | #4
On Wed, Jul 18, 2018 at 10:23:52AM +0200, Peter Zijlstra wrote:
> On Tue, Jul 17, 2018 at 10:13:23PM +0200, Andreas Herrmann wrote:
> > On Tue, Jul 17, 2018 at 06:14:58PM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > The firmware interface used by the pcc-cpufreq driver is
> > > fundamentally not scalable and using it for dynamic CPU performance
> > > scaling on systems with many CPUs leads to degraded performance.
> > > 
> > > For this reason, disable dynamic CPU performance scaling on systems
> > > with pcc-cpufreq where the number of CPUs present at the driver init
> > > time is greater than 4.  Also make the driver print corresponding
> > > complaints to the kernel log.
> > > 
> > > Reported-by: Andreas Herrmann <aherrmann@suse.com>
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > > 
> > > -> v2: Rework the messages printed in the problematic case.
> > 
> > I've tested this patch. Effect is as expected: driver loads but use of
> > ondemand governor is not allowed. Sample output:
> > 
> > [   40.757519] pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1200 MHz, 2800 MHz
> > [   40.831705] pcc_cpufreq_init: Too many CPUs, dynamic performance scaling disabled
> > [   40.898353] pcc_cpufreq_init: Try to enable a different scaling driver through BIOS settings
> 
> BTW, Andreas, is that BIOS option available through the normal BIOS
> settings, or it is in the "secret" BIOS menu that HP has? If it is in
> the "secret" one (^A IIRC) then we might want to explicitly mention
> that.

The relevant options are not in a secret section (so far I was not
even aware of a secret section). But on those systems some stuff is
available via iLO interface (browser) and some only via "BIOS/Platform
Configuration (RBSU)". The iLO interface has a couple of options that
are only visible with an advanced iLO license.

The options affecting Linux cpufreq subsystem are

(a) available only via "BIOS/Platform Configuration (RBSU)"

  [Power Management]->
    [Advanced Power Options]->
      [Collaborative Power Control] == "enabled" | "disabled"

 If set to "disabled": No cpufreq driver will load and the platform is
 solely responsible for CPU frequency adaptation (whatever that means).

(b) available via "BIOS/Platform Configuration (RBSU)" and iLO interface

  [Power Management]->
    [Power Regulator] == "Dynamic Power Savings Mode" | "OS Control Mode" |
                         "Static Low Power Mode" | "Static High Performance Mode"

 "Dynamic Power Savings Mode" allows pcc-cpufreq to load and "OS
 Control Mode" allows intel-pstate to be loaded. We are now changing this
 so that intel-pstate is also loaded with "Dynamic Power Savings Mode"
 (if available; if not, pcc-cpufreq will still be loaded, but it now
 emits a warning and disallows use of the ondemand governor if too many
 CPUs are present).

 AFAIK the other (static) modes have no real meaning when
 "Collaborative Power Control" is enabled, because as soon as a cpufreq
 driver is loaded the frequency will be adjusted by the OS. I can't
 remember which one (pcc-cpufreq or intel-pstate) would be loaded; I
 only tried it once or twice.

Note that the above is valid for a DL580 Gen8 with Intel CPUs. It
might be slightly different on Gen9/Gen10 and/or other models. I.e.,
for models based on AMD CPUs, we have just added a new warning in
pcc-cpufreq and disabled use of the ondemand governor when the system
has more than 4 CPUs.


Andreas
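
To see which driver and governor a given BIOS setting ends up selecting, the standard cpufreq sysfs attributes for CPU 0 can be read after boot. The following is a minimal sketch, assuming a Linux system that exposes /sys/devices/system/cpu/cpu0/cpufreq/; the helper names are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a file into buf and strip the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or read.
 */
static int sketch_read_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	if (len == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Print the scaling driver and governor currently in use for CPU 0. */
static void sketch_print_cpufreq_state(void)
{
	static const char *const attrs[] = { "scaling_driver", "scaling_governor" };
	char path[128];
	char value[64];
	size_t i;

	for (i = 0; i < sizeof(attrs) / sizeof(attrs[0]); i++) {
		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu0/cpufreq/%s", attrs[i]);
		if (sketch_read_line(path, value, sizeof(value)) == 0)
			printf("%s: %s\n", attrs[i], value);
		else
			printf("%s: not available (no cpufreq driver loaded?)\n",
			       attrs[i]);
	}
}
```

If "Collaborative Power Control" is disabled as described under (a), the cpufreq directory is expected to be absent and the sketch reports both attributes as not available.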

Patch

Index: linux-pm/drivers/cpufreq/pcc-cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/pcc-cpufreq.c
+++ linux-pm/drivers/cpufreq/pcc-cpufreq.c
@@ -589,6 +589,15 @@  static int __init pcc_cpufreq_init(void)
 		return ret;
 	}
 
+	if (num_present_cpus() > 4) {
+		pcc_cpufreq_driver.flags |= CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING;
+		pr_err("%s: Too many CPUs, dynamic performance scaling disabled\n",
+		       __func__);
+		pr_err("%s: Try to enable a different scaling driver through BIOS settings\n",
+		       __func__);
+		pr_err("%s: and complain to the system vendor\n", __func__);
+	}
+
 	ret = cpufreq_register_driver(&pcc_cpufreq_driver);
 
 	return ret;
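
The effect of the flag set above, as reported in comment #1, is that a request for the ondemand governor falls back to the performance governor. The sketch below models that observed behaviour in user space only; it is not the cpufreq core code, and the struct and all sketch_-prefixed names are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a governor: a name plus whether it switches dynamically. */
struct sketch_governor {
	const char *name;
	bool dynamic_switching;
};

static const struct sketch_governor sketch_performance = { "performance", false };
static const struct sketch_governor sketch_ondemand = { "ondemand", true };

/*
 * Return the governor to use: the requested one, unless it relies on dynamic
 * switching and the driver has disallowed that, in which case fall back to
 * the performance governor.
 */
static const struct sketch_governor *
sketch_pick_governor(const struct sketch_governor *requested,
		     bool driver_disallows_dynamic)
{
	if (requested->dynamic_switching && driver_disallows_dynamic) {
		fprintf(stderr,
			"Can't use %s governor as dynamic switching is disallowed. Fallback to %s governor\n",
			requested->name, sketch_performance.name);
		return &sketch_performance;
	}
	return requested;
}
```

In this model a static governor such as performance is always accepted, which matches the test report: the driver still loads, and only dynamic governors are refused.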