diff mbox

Do not modify perf bias performance setting by default at boot

Message ID 5246881.PY2xISUnh1@skinner (mailing list archive)
State RFC, archived
Headers show

Commit Message

Thomas Renninger March 7, 2016, 1:24 p.m. UTC
It came out that on certain CPUs perf bias MSR has to be set to performance,
so that the CPU enters turbo states at all.

Better do not try to wrongly adjust perf bias value,
its value probably is intended by BIOS.

This is a revert of mainline git commits:
commit b51ef52df71cb28e9d90cd1d48b79bf19f0bab06
commit 17edf2d79f1ea6dfdb4c444801d928953b9f98d6
commit abe48b108247e9b90b4c6739662a2e5c765ed114

Signed-off-by: Thomas Renninger <trenn@suse.de>

---
 arch/x86/kernel/cpu/common.c |   18 ------------------
 arch/x86/kernel/cpu/cpu.h    |    1 -
 arch/x86/kernel/cpu/intel.c  |   32 --------------------------------
 3 files changed, 51 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Thomas Renninger March 7, 2016, 4:17 p.m. UTC | #1
On Monday, March 07, 2016 02:24:54 PM Thomas Renninger wrote:
> It came out that on certain CPUs perf bias MSR has to be set to performance,
> so that the CPU enters turbo states at all.
> 
> Better do not try to wrongly adjust perf bias value,
> its value probably is intended by BIOS.

This whole perf bias CPU feature and trying to adjust the value
at boot time gets more and more a mess.

I try to sum this up, try to collect some missing bits and get them
published through mailing lists. Hopefully some root causes of what
went wrong are identified and passed on to the right people to get
things fixed.

1. It started with:
commit abe48b108247e9b90b4c6739662a2e5c765ed114
Author: Len Brown <len.brown@intel.com>
Date:   Thu Jul 14 00:53:24 2011 -0400

    x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS

...
    However, the typical BIOS fails to initialize the MSR, presumably
    because this is handled by high-volume shrink-wrap operating systems...
    
    Linux distros, on the other hand, do not yet invoke x86_energy_perf_policy(8).
    As a result, WSM-EP, SNB, and later hardware from Intel will run in its
    default hardware power-on state (performance), which assumes that users
    care for performance at all costs and not for energy efficiency.
    While that is fine for performance benchmarks, the hardware's intended default
    operating point is "normal" mode...
    
    Initialize the MSR to the "normal" by default during kernel boot.
...


2. Four years later it has been identified that the boot processor's perf bias
switches back to 0 on suspend.

I wonder whether this happens generally. Or do BIOS hooks also reset the BIAS
value on suspend?

3. It has now be identified that specific CPUs can only enter Turbo modes, if
the perf BIAS value is set to 0 (performance). But the performance value cannot
be set anymore by BIOS because of above patches.


My questions:

1. If this is true:
"While that is fine for performance benchmarks, the hardware's intended default
operating point is "normal" mode..."
Then the HW (CPU) must initialize to 5 (normal).
It's not the BIOS' fault which forgets to set something,
this may always be intended.
commit b51ef52df71cb28e9d90cd1d48b79bf19f0bab06
commit 17edf2d79f1ea6dfdb4c444801d928953b9f98d6
commit abe48b108247e9b90b4c6739662a2e5c765ed114

Intel must tell their microcode writers to initialize perf BIAS with 5, if this
is what Intel wants.

2. Why is the value for the boot processor reset to 0 on suspend?
Is this a platform/BIOS issue only?
Or could this be another major misdesign in perf BIAS CPU implementation.

3. What exactly was the problem?
The argumentation that the perf BIAS "should" be 5 was rather vague.
What exactly are the concequences? Getting a dirty workaround into the x86 core
code only by this information without any values or references is bad.

          Thomas

PS: I will submit the patch (revert of commit b51ef52df71cb2, 17edf2d79f1ea6d and
abe48b108247e9) now to our SLE code base.

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -13,7 +13,6 @@ 
 #include <linux/kgdb.h>
 #include <linux/smp.h>
 #include <linux/io.h>
-#include <linux/syscore_ops.h>
 
 #include <asm/stackprotector.h>
 #include <asm/perf_event.h>
@@ -1489,20 +1488,3 @@  inline bool __static_cpu_has_safe(u16 bi
 	return boot_cpu_has(bit);
 }
 EXPORT_SYMBOL_GPL(__static_cpu_has_safe);
-
-static void bsp_resume(void)
-{
-	if (this_cpu->c_bsp_resume)
-		this_cpu->c_bsp_resume(&boot_cpu_data);
-}
-
-static struct syscore_ops cpu_syscore_ops = {
-	.resume		= bsp_resume,
-};
-
-static int __init init_cpu_syscore(void)
-{
-	register_syscore_ops(&cpu_syscore_ops);
-	return 0;
-}
-core_initcall(init_cpu_syscore);
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -13,7 +13,6 @@  struct cpu_dev {
 	void		(*c_init)(struct cpuinfo_x86 *);
 	void		(*c_identify)(struct cpuinfo_x86 *);
 	void		(*c_detect_tlb)(struct cpuinfo_x86 *);
-	void		(*c_bsp_resume)(struct cpuinfo_x86 *);
 	int		c_x86_vendor;
 #ifdef CONFIG_X86_32
 	/* Optional vendor specific routine to obtain the cache size. */
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -372,36 +372,6 @@  static void detect_vmx_virtcap(struct cp
 	}
 }
 
-static void init_intel_energy_perf(struct cpuinfo_x86 *c)
-{
-	u64 epb;
-
-	/*
-	 * Initialize MSR_IA32_ENERGY_PERF_BIAS if not already initialized.
-	 * (x86_energy_perf_policy(8) is available to change it at run-time.)
-	 */
-	if (!cpu_has(c, X86_FEATURE_EPB))
-		return;
-
-	rdmsrl(MSR_IA32_ENERGY_PERF_BIAS, epb);
-	if ((epb & 0xF) != ENERGY_PERF_BIAS_PERFORMANCE)
-		return;
-
-	pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was 'performance'\n");
-	pr_warn_once("ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)\n");
-	epb = (epb & ~0xF) | ENERGY_PERF_BIAS_NORMAL;
-	wrmsrl(MSR_IA32_ENERGY_PERF_BIAS, epb);
-}
-
-static void intel_bsp_resume(struct cpuinfo_x86 *c)
-{
-	/*
-	 * MSR_IA32_ENERGY_PERF_BIAS is lost across suspend/resume,
-	 * so reinitialize it properly like during bootup:
-	 */
-	init_intel_energy_perf(c);
-}
-
 static void init_intel(struct cpuinfo_x86 *c)
 {
 	unsigned int l2 = 0;
@@ -509,7 +479,6 @@  static void init_intel(struct cpuinfo_x8
 	if (cpu_has(c, X86_FEATURE_VMX))
 		detect_vmx_virtcap(c);
 
-	init_intel_energy_perf(c);
 }
 
 #ifdef CONFIG_X86_32
@@ -764,7 +733,6 @@  static const struct cpu_dev intel_cpu_de
 	.c_detect_tlb	= intel_detect_tlb,
 	.c_early_init   = early_init_intel,
 	.c_init		= init_intel,
-	.c_bsp_resume	= intel_bsp_resume,
 	.c_x86_vendor	= X86_VENDOR_INTEL,
 };