Message ID | 5246881.PY2xISUnh1@skinner (mailing list archive) |
---|---|
State | RFC, archived |
Headers | show |
On Monday, March 07, 2016 02:24:54 PM Thomas Renninger wrote: > It came out that on certain CPUs perf bias MSR has to be set to performance, > so that the CPU enters turbo states at all. > > Better do not try to wrongly adjust perf bias value, > its value probably is intended by BIOS. This whole perf bias CPU feature and trying to adjust the value at boot time gets more and more a mess. I try to sum this up, try to collect some missing bits and get them published through mailing lists. Hopefully some root causes of what went wrong are identified and passed on to the right people to get things fixed. 1. It started with: commit abe48b108247e9b90b4c6739662a2e5c765ed114 Author: Len Brown <len.brown@intel.com> Date: Thu Jul 14 00:53:24 2011 -0400 x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS ... However, the typical BIOS fails to initialize the MSR, presumably because this is handled by high-volume shrink-wrap operating systems... Linux distros, on the other hand, do not yet invoke x86_energy_perf_policy(8). As a result, WSM-EP, SNB, and later hardware from Intel will run in its default hardware power-on state (performance), which assumes that users care for performance at all costs and not for energy efficiency. While that is fine for performance benchmarks, the hardware's intended default operating point is "normal" mode... Initialize the MSR to the "normal" by default during kernel boot. ... 2. Four years later it has been identified that the boot processor's perf bias switches back to 0 on suspend. I wonder whether this happens generally. Or do BIOS hooks also reset the BIAS value on suspend? 3. It has now be identified that specific CPUs can only enter Turbo modes, if the perf BIAS value is set to 0 (performance). But the performance value cannot be set anymore by BIOS because of above patches. My questions: 1. If this is true: "While that is fine for performance benchmarks, the hardware's intended default operating point is "normal" mode..." Then the HW (CPU) must initialize to 5 (normal). It's not the BIOS' fault which forgets to set something, this may always be intended. commit b51ef52df71cb28e9d90cd1d48b79bf19f0bab06 commit 17edf2d79f1ea6dfdb4c444801d928953b9f98d6 commit abe48b108247e9b90b4c6739662a2e5c765ed114 Intel must tell their microcode writers to initialize perf BIAS with 5, if this is what Intel wants. 2. Why is the value for the boot processor reset to 0 on suspend? Is this a platform/BIOS issue only? Or could this be another major misdesign in perf BIAS CPU implementation. 3. What exactly was the problem? The argumentation that the perf BIAS "should" be 5 was rather vague. What exactly are the concequences? Getting a dirty workaround into the x86 core code only by this information without any values or references is bad. Thomas PS: I will submit the patch (revert of commit b51ef52df71cb2, 17edf2d79f1ea6d and abe48b108247e9) now to our SLE code base. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
--- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -13,7 +13,6 @@ #include <linux/kgdb.h> #include <linux/smp.h> #include <linux/io.h> -#include <linux/syscore_ops.h> #include <asm/stackprotector.h> #include <asm/perf_event.h> @@ -1489,20 +1488,3 @@ inline bool __static_cpu_has_safe(u16 bi return boot_cpu_has(bit); } EXPORT_SYMBOL_GPL(__static_cpu_has_safe); - -static void bsp_resume(void) -{ - if (this_cpu->c_bsp_resume) - this_cpu->c_bsp_resume(&boot_cpu_data); -} - -static struct syscore_ops cpu_syscore_ops = { - .resume = bsp_resume, -}; - -static int __init init_cpu_syscore(void) -{ - register_syscore_ops(&cpu_syscore_ops); - return 0; -} -core_initcall(init_cpu_syscore); --- a/arch/x86/kernel/cpu/cpu.h +++ b/arch/x86/kernel/cpu/cpu.h @@ -13,7 +13,6 @@ struct cpu_dev { void (*c_init)(struct cpuinfo_x86 *); void (*c_identify)(struct cpuinfo_x86 *); void (*c_detect_tlb)(struct cpuinfo_x86 *); - void (*c_bsp_resume)(struct cpuinfo_x86 *); int c_x86_vendor; #ifdef CONFIG_X86_32 /* Optional vendor specific routine to obtain the cache size. */ --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -372,36 +372,6 @@ static void detect_vmx_virtcap(struct cp } } -static void init_intel_energy_perf(struct cpuinfo_x86 *c) -{ - u64 epb; - - /* - * Initialize MSR_IA32_ENERGY_PERF_BIAS if not already initialized. - * (x86_energy_perf_policy(8) is available to change it at run-time.) - */ - if (!cpu_has(c, X86_FEATURE_EPB)) - return; - - rdmsrl(MSR_IA32_ENERGY_PERF_BIAS, epb); - if ((epb & 0xF) != ENERGY_PERF_BIAS_PERFORMANCE) - return; - - pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was 'performance'\n"); - pr_warn_once("ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)\n"); - epb = (epb & ~0xF) | ENERGY_PERF_BIAS_NORMAL; - wrmsrl(MSR_IA32_ENERGY_PERF_BIAS, epb); -} - -static void intel_bsp_resume(struct cpuinfo_x86 *c) -{ - /* - * MSR_IA32_ENERGY_PERF_BIAS is lost across suspend/resume, - * so reinitialize it properly like during bootup: - */ - init_intel_energy_perf(c); -} - static void init_intel(struct cpuinfo_x86 *c) { unsigned int l2 = 0; @@ -509,7 +479,6 @@ static void init_intel(struct cpuinfo_x8 if (cpu_has(c, X86_FEATURE_VMX)) detect_vmx_virtcap(c); - init_intel_energy_perf(c); } #ifdef CONFIG_X86_32 @@ -764,7 +733,6 @@ static const struct cpu_dev intel_cpu_de .c_detect_tlb = intel_detect_tlb, .c_early_init = early_init_intel, .c_init = init_intel, - .c_bsp_resume = intel_bsp_resume, .c_x86_vendor = X86_VENDOR_INTEL, };
It came out that on certain CPUs perf bias MSR has to be set to performance, so that the CPU enters turbo states at all. Better do not try to wrongly adjust perf bias value, its value probably is intended by BIOS. This is a revert of mainline git commits: commit b51ef52df71cb28e9d90cd1d48b79bf19f0bab06 commit 17edf2d79f1ea6dfdb4c444801d928953b9f98d6 commit abe48b108247e9b90b4c6739662a2e5c765ed114 Signed-off-by: Thomas Renninger <trenn@suse.de> --- arch/x86/kernel/cpu/common.c | 18 ------------------ arch/x86/kernel/cpu/cpu.h | 1 - arch/x86/kernel/cpu/intel.c | 32 -------------------------------- 3 files changed, 51 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html