Message ID | 4687430.mfM0GbdeDL@skinner (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
This in fact is a re-send, including x86 maintainers. Even this is a PM (Power Management) issue, the code is in the x86 architecture paths. From last submit: > > Patch is against latest linux-pm kernel. > > Rafael: Can you queue this one up, please. > Well, I'm not an x86 arch maintainer. > Can you at least CC them, please? Can someone, best would be a x86 maintainer?, take the patch please if it is considered ok. Thanks, Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Friday, February 26, 2016 05:38:00 PM Thomas Renninger wrote: > The assumption that BIOSes never want to have this register being set to > full performance (zero) is wrong. > > While wrongly overruling this BIOS setting and set it from performance > to normal did not hurt that much, because nobody really knew the effects inside > Intel processors. > > But with Broadwell-EP processor (E5-2687W v4) the CPU will not enter turbo modes > if this value is not set to performance. > > So switch logic to tell the user in a friendly way (info) that the CPU is in > performance mode and how to switch via userspace if this is not intended. > > But otherwise trust that the BIOS has set the correct value here and do not > blindly overrule. > > How this has been found: SLE11 had this patch, SLE12 it slipped through. > It took quite some time to nail down that this patch missing is the reason > for not entering turbo modes with this specific processor. > > Signed-off-by: Thomas Renninger <trenn@suse.com> > > --- a/arch/x86/kernel/cpu/intel.c 2016-02-26 17:19:55.731042972 +0100 > +++ b/arch/x86/kernel/cpu/intel.c 2016-02-26 17:20:48.598020581 +0100 > @@ -377,8 +377,12 @@ static void init_intel_energy_perf(struc > u64 epb; > > /* > - * Initialize MSR_IA32_ENERGY_PERF_BIAS if not already initialized. > - * (x86_energy_perf_policy(8) is available to change it at run-time.) > + * On server platforms energy bias typically is set to > + * performance on purpose. > + * On other platforms it may happen that MSR_IA32_ENERGY_PERF_BIAS > + * did not get initialized properly by BIOS. > + * Best is to to keep BIOS settings and give the user a hint whether > + * to change it via cpupower-set(8) userspace tool at runtime. > */ > if (!cpu_has(c, X86_FEATURE_EPB)) > return; > @@ -387,10 +391,8 @@ static void init_intel_energy_perf(struc > if ((epb & 0xF) != ENERGY_PERF_BIAS_PERFORMANCE) > return; > > - pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was 'performance'\n"); > - pr_warn_once("ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)\n"); > - epb = (epb & ~0xF) | ENERGY_PERF_BIAS_NORMAL; > - wrmsrl(MSR_IA32_ENERGY_PERF_BIAS, epb); > + pr_info_once("ENERGY_PERF_BIAS is set to 'performance'\n"); > + pr_info_once("ENERGY_PERF_BIAS: Update with cpupower-set(8)\n"); This doesn't need to be cpupower-set IMO. > } > > static void intel_bsp_resume(struct cpuinfo_x86 *c) > Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Saturday, February 27, 2016 12:15:47 AM Rafael J. Wysocki wrote: > On Friday, February 26, 2016 05:38:00 PM Thomas Renninger wrote: > > The assumption that BIOSes never want to have this register being set to > > full performance (zero) is wrong. > > > > While wrongly overruling this BIOS setting and set it from performance > > to normal did not hurt that much, because nobody really knew the effects > > inside Intel processors. > > > > But with Broadwell-EP processor (E5-2687W v4) the CPU will not enter turbo > > modes if this value is not set to performance. > > > > So switch logic to tell the user in a friendly way (info) that the CPU is > > in performance mode and how to switch via userspace if this is not > > intended. > > > > But otherwise trust that the BIOS has set the correct value here and do > > not > > blindly overrule. > > > > How this has been found: SLE11 had this patch, SLE12 it slipped through. > > It took quite some time to nail down that this patch missing is the reason > > for not entering turbo modes with this specific processor. > > > > Signed-off-by: Thomas Renninger <trenn@suse.com> > > > > --- a/arch/x86/kernel/cpu/intel.c 2016-02-26 17:19:55.731042972 +0100 > > +++ b/arch/x86/kernel/cpu/intel.c 2016-02-26 17:20:48.598020581 +0100 > > @@ -377,8 +377,12 @@ static void init_intel_energy_perf(struc > > > > u64 epb; > > > > /* > > > > - * Initialize MSR_IA32_ENERGY_PERF_BIAS if not already initialized. > > - * (x86_energy_perf_policy(8) is available to change it at run-time.) > > + * On server platforms energy bias typically is set to > > + * performance on purpose. > > + * On other platforms it may happen that MSR_IA32_ENERGY_PERF_BIAS > > + * did not get initialized properly by BIOS. > > + * Best is to to keep BIOS settings and give the user a hint whether > > + * to change it via cpupower-set(8) userspace tool at runtime. > > > > */ > > > > if (!cpu_has(c, X86_FEATURE_EPB)) > > > > return; > > > > @@ -387,10 +391,8 @@ static void init_intel_energy_perf(struc > > > > if ((epb & 0xF) != ENERGY_PERF_BIAS_PERFORMANCE) > > > > return; > > > > - pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was 'performance'\n"); > > - pr_warn_once("ENERGY_PERF_BIAS: View and update with > > x86_energy_perf_policy(8)\n"); - epb = (epb & ~0xF) | > > ENERGY_PERF_BIAS_NORMAL; > > - wrmsrl(MSR_IA32_ENERGY_PERF_BIAS, epb); > > + pr_info_once("ENERGY_PERF_BIAS is set to 'performance'\n"); > > + pr_info_once("ENERGY_PERF_BIAS: Update with cpupower-set(8)\n"); > > This doesn't need to be cpupower-set IMO. You mean why switch the message from: x86_energy_perf_policy to cpupower-set ? IMO x86_energy_perf_policy should not exist. It has been introduce before cpupower set -b. Having an extra tool/binary for this functionality is an unneeded packaging overhead for distros. Also having more and more of such CPU specific tools is not userfriendly. cpupower supports all power relevant features of your CPU and on all architectures (or at least it should). People should know this one better than "x86_energy_perf_policy" and theoretically intuitively find it, even without a message. So it would be nice to get the message fixed as well. Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday, March 01, 2016 01:17:37 PM Thomas Renninger wrote: > On Saturday, February 27, 2016 12:15:47 AM Rafael J. Wysocki wrote: > > On Friday, February 26, 2016 05:38:00 PM Thomas Renninger wrote: > > > The assumption that BIOSes never want to have this register being set to > > > full performance (zero) is wrong. > > > > > > While wrongly overruling this BIOS setting and set it from performance > > > to normal did not hurt that much, because nobody really knew the effects > > > inside Intel processors. > > > > > > But with Broadwell-EP processor (E5-2687W v4) the CPU will not enter turbo > > > modes if this value is not set to performance. > > > > > > So switch logic to tell the user in a friendly way (info) that the CPU is > > > in performance mode and how to switch via userspace if this is not > > > intended. > > > > > > But otherwise trust that the BIOS has set the correct value here and do > > > not > > > blindly overrule. > > > > > > How this has been found: SLE11 had this patch, SLE12 it slipped through. > > > It took quite some time to nail down that this patch missing is the reason > > > for not entering turbo modes with this specific processor. > > > > > > Signed-off-by: Thomas Renninger <trenn@suse.com> > > > > > > --- a/arch/x86/kernel/cpu/intel.c 2016-02-26 17:19:55.731042972 +0100 > > > +++ b/arch/x86/kernel/cpu/intel.c 2016-02-26 17:20:48.598020581 +0100 > > > @@ -377,8 +377,12 @@ static void init_intel_energy_perf(struc > > > > > > u64 epb; > > > > > > /* > > > > > > - * Initialize MSR_IA32_ENERGY_PERF_BIAS if not already initialized. > > > - * (x86_energy_perf_policy(8) is available to change it at run-time.) > > > + * On server platforms energy bias typically is set to > > > + * performance on purpose. > > > + * On other platforms it may happen that MSR_IA32_ENERGY_PERF_BIAS > > > + * did not get initialized properly by BIOS. > > > + * Best is to to keep BIOS settings and give the user a hint whether > > > + * to change it via cpupower-set(8) userspace tool at runtime. > > > > > > */ > > > > > > if (!cpu_has(c, X86_FEATURE_EPB)) > > > > > > return; > > > > > > @@ -387,10 +391,8 @@ static void init_intel_energy_perf(struc > > > > > > if ((epb & 0xF) != ENERGY_PERF_BIAS_PERFORMANCE) > > > > > > return; > > > > > > - pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was > 'performance'\n"); > > > - pr_warn_once("ENERGY_PERF_BIAS: View and update with > > > x86_energy_perf_policy(8)\n"); - epb = (epb & ~0xF) | > > > ENERGY_PERF_BIAS_NORMAL; > > > - wrmsrl(MSR_IA32_ENERGY_PERF_BIAS, epb); > > > + pr_info_once("ENERGY_PERF_BIAS is set to 'performance'\n"); > > > + pr_info_once("ENERGY_PERF_BIAS: Update with cpupower-set(8)\n"); > > > > This doesn't need to be cpupower-set IMO. > > You mean why switch the message from: > x86_energy_perf_policy to cpupower-set > ? > > IMO x86_energy_perf_policy should not exist. It has been introduce before > cpupower set -b. > Having an extra tool/binary for this functionality is an unneeded packaging > overhead for distros. > Also having more and more of such CPU specific tools is not userfriendly. > cpupower supports all power relevant features of your CPU and on all > architectures (or at least it should). People should know this one better > than "x86_energy_perf_policy" and theoretically intuitively find it, even > without a message. > > So it would be nice to get the message fixed as well. My point is that since "cpupower set -b" is not the only way to set this, it doesn't seem appropriate to refer to it explicitly from a kernel message. I actually don't think the second message is necessary at all. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday, March 02, 2016 01:26:18 AM Rafael J. Wysocki wrote: > On Tuesday, March 01, 2016 01:17:37 PM Thomas Renninger wrote: > > > > if (!cpu_has(c, X86_FEATURE_EPB))z > > > > > > > > return; > > > > > > > > @@ -387,10 +391,8 @@ static void init_intel_energy_perf(struc > > > > > > > > if ((epb & 0xF) != ENERGY_PERF_BIAS_PERFORMANCE) > > > > > > > > return; > > > > > > > > - pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was > > > > 'performance'\n"); > > > > > > - pr_warn_once("ENERGY_PERF_BIAS: View and update with > > > > x86_energy_perf_policy(8)\n"); - epb = (epb & ~0xF) | > > > > ENERGY_PERF_BIAS_NORMAL; > > > > - wrmsrl(MSR_IA32_ENERGY_PERF_BIAS, epb); > > > > + pr_info_once("ENERGY_PERF_BIAS is set to 'performance'\n"); > > > > + pr_info_once("ENERGY_PERF_BIAS: Update with cpupower- set(8)\n"); > > > > > > This doesn't need to be cpupower-set IMO. > > > > You mean why switch the message from: > > x86_energy_perf_policy to cpupower-set > > ? > > > > IMO x86_energy_perf_policy should not exist. It has been introduce before > > cpupower set -b. > > Having an extra tool/binary for this functionality is an unneeded > > packaging > > overhead for distros. > > Also having more and more of such CPU specific tools is not userfriendly. > > cpupower supports all power relevant features of your CPU and on all > > architectures (or at least it should). People should know this one better > > than "x86_energy_perf_policy" and theoretically intuitively find it, even > > without a message. > > > > So it would be nice to get the message fixed as well. > > My point is that since "cpupower set -b" is not the only way to set this, > it doesn't seem appropriate to refer to it explicitly from a kernel message. > > I actually don't think the second message is necessary at all. Hmm, thinking a bit more about this, I think the whole init_intel_energy_perf() function check should vanish. The check should get moved into the powertop userspace tool. This one is used to optimize platform for power saving features. This would also keep the kernel core code clean... If you agree I will send the patch. Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, On Fri, Mar 4, 2016 at 9:37 AM, Thomas Renninger <trenn@suse.de> wrote: > On Wednesday, March 02, 2016 01:26:18 AM Rafael J. Wysocki wrote: >> On Tuesday, March 01, 2016 01:17:37 PM Thomas Renninger wrote: >> > > > if (!cpu_has(c, X86_FEATURE_EPB))z >> > > > >> > > > return; >> > > > >> > > > @@ -387,10 +391,8 @@ static void init_intel_energy_perf(struc >> > > > >> > > > if ((epb & 0xF) != ENERGY_PERF_BIAS_PERFORMANCE) >> > > > >> > > > return; >> > > > >> > > > - pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was >> > >> > 'performance'\n"); >> > >> > > > - pr_warn_once("ENERGY_PERF_BIAS: View and update with >> > > > x86_energy_perf_policy(8)\n"); - epb = (epb & ~0xF) | >> > > > ENERGY_PERF_BIAS_NORMAL; >> > > > - wrmsrl(MSR_IA32_ENERGY_PERF_BIAS, epb); >> > > > + pr_info_once("ENERGY_PERF_BIAS is set to 'performance'\n"); >> > > > + pr_info_once("ENERGY_PERF_BIAS: Update with cpupower- > set(8)\n"); >> > > >> > > This doesn't need to be cpupower-set IMO. >> > >> > You mean why switch the message from: >> > x86_energy_perf_policy to cpupower-set >> > ? >> > >> > IMO x86_energy_perf_policy should not exist. It has been introduce before >> > cpupower set -b. >> > Having an extra tool/binary for this functionality is an unneeded >> > packaging >> > overhead for distros. >> > Also having more and more of such CPU specific tools is not userfriendly. >> > cpupower supports all power relevant features of your CPU and on all >> > architectures (or at least it should). People should know this one better >> > than "x86_energy_perf_policy" and theoretically intuitively find it, even >> > without a message. >> > >> > So it would be nice to get the message fixed as well. >> >> My point is that since "cpupower set -b" is not the only way to set this, >> it doesn't seem appropriate to refer to it explicitly from a kernel message. >> >> I actually don't think the second message is necessary at all. > > Hmm, thinking a bit more about this, I think the whole > init_intel_energy_perf() function check should vanish. > > The check should get moved into the powertop userspace tool. > This one is used to optimize platform for power saving features. > > This would also keep the kernel core code clean... > > If you agree I will send the patch. I need to talk to Len about that, but why don't you send it anyway? If we are not going to update the knob, I'm not seeing much value in checking it from the kernel. A few people read boot logs if their systems work as expected, so the value of the message alone is quite limited IMO. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> But with Broadwell-EP processor (E5-2687W v4) the CPU will not enter turbo modes > if this value is not set to performance BDX-EP supports HWP. Are these failing machines running in HWP mode? (On BDX-EP, and only on BDX-EP, EPB acts to set the BIAS for HWP, because that processor doesn't yet have EPP.)
On Monday, March 07, 2016 07:50:57 PM Len Brown wrote: > > But with Broadwell-EP processor (E5-2687W v4) the CPU will not enter turbo > > modes if this value is not set to performance > > BDX-EP supports HWP. > Are these failing machines running in HWP mode? > > (On BDX-EP, and only on BDX-EP, EPB acts to set the BIAS for HWP, > because that processor doesn't yet have EPP.) I am not sure whether I can publish platform info. I asked for and added you to CC of the private bug a while ago. This kernel is run: SLES12 SP1, 3.12.49-4-default I grepped the whole supportconfig for -i hwp and couldn't find anything, so I very much expect this machine is/was not run in HWP mode, right? I still question the usefulness of the "initialize perf bias to normal" hack. For our distro I am pretty sure, we do not want to have this one and we prefer the CPU or BIOS initialized value, even (or especially) if it is set to performance. Are there any known platforms where this would really be an issue and how sever would it be? I already removed the "set perf bias to normal" years ago for SLE11 without getting any regression or negative reports. Now finding the workaround on the hack to run a suspend hook to adjust perf bias value on each suspend cycle... This is going into a wrong direction. I agree that if this needs resetting after each suspend cycle, userspace should not need to care about it. I could imagine a sysfs variable here: /sys/devices/system/cpu/intel_pstate/perf_bias but the static setting from 0 to 6 in the x86 core code and the suspend callback should get reverted, right? Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 8, 2016 at 7:14 AM, Thomas Renninger <trenn@suse.de> wrote: > On Monday, March 07, 2016 07:50:57 PM Len Brown wrote: >> > But with Broadwell-EP processor (E5-2687W v4) the CPU will not enter turbo >> > modes if this value is not set to performance >> >> BDX-EP supports HWP. >> Are these failing machines running in HWP mode? >> >> (On BDX-EP, and only on BDX-EP, EPB acts to set the BIAS for HWP, >> because that processor doesn't yet have EPP.) > > I am not sure whether I can publish platform info. I asked for and added you > to CC of the private bug a while ago. > > This kernel is run: SLES12 SP1, 3.12.49-4-default > I grepped the whole supportconfig for -i hwp and couldn't find anything, so I > very much expect this machine is/was not run in HWP mode, right? /proc/cpuinfo will have a bunch of hwp stuff if CPUID is advertising the feature. The BIOS could put it in HWP mode by default, or the intel_pstate driver could enable HWP. If intel_pstate did that, it would print it in dmesg. pr_info("intel_pstate: HWP enabled\n"); > I still question the usefulness of the "initialize perf bias to normal" hack. > For our distro I am pretty sure, we do not want to have this one and > we prefer the CPU or BIOS initialized value, even (or especially) if it is > set to performance. > > Are there any known platforms where this would really be an issue > and how sever would it be? > > I already removed the "set perf bias to normal" years ago for SLE11 without > getting any regression or negative reports. > > Now finding the workaround on the hack to run a suspend hook to > adjust perf bias value on each suspend cycle... This is going into > a wrong direction. > > I agree that if this needs resetting after each suspend cycle, userspace > should not need to care about it. I could imagine a sysfs variable here: > /sys/devices/system/cpu/intel_pstate/perf_bias As EPB it is unrelated to intel_pstate, this would be a bad place to put such state. > but the static setting from 0 to 6 in the x86 core code and > the suspend callback should get reverted, right? The kernel is supplying a reasonable default. It is up to the operating system to apply -- from user space -- a value that is consistent with the performance profile for how the machine is being used. I proposed doing such policy in the kernel a number of years ago, and a number of non-kernel people insisted strenuously that the kernel should not know or care, and that user-space should manage this. Perhaps we need to examine how well abdicating responsibility to the upper OS has been going... I'm not going to tell you that you can't make policy choices in distro-specific kernel patches. Maybe it is difficult for some distros to modify user-space. But I am not going to recommend such changes go in the upstream kernel, since it would impact other users. thanks, Len Brown, Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
--- a/arch/x86/kernel/cpu/intel.c 2016-02-26 17:19:55.731042972 +0100 +++ b/arch/x86/kernel/cpu/intel.c 2016-02-26 17:20:48.598020581 +0100 @@ -377,8 +377,12 @@ static void init_intel_energy_perf(struc u64 epb; /* - * Initialize MSR_IA32_ENERGY_PERF_BIAS if not already initialized. - * (x86_energy_perf_policy(8) is available to change it at run-time.) + * On server platforms energy bias typically is set to + * performance on purpose. + * On other platforms it may happen that MSR_IA32_ENERGY_PERF_BIAS + * did not get initialized properly by BIOS. + * Best is to to keep BIOS settings and give the user a hint whether + * to change it via cpupower-set(8) userspace tool at runtime. */ if (!cpu_has(c, X86_FEATURE_EPB)) return; @@ -387,10 +391,8 @@ static void init_intel_energy_perf(struc if ((epb & 0xF) != ENERGY_PERF_BIAS_PERFORMANCE) return; - pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was 'performance'\n"); - pr_warn_once("ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)\n"); - epb = (epb & ~0xF) | ENERGY_PERF_BIAS_NORMAL; - wrmsrl(MSR_IA32_ENERGY_PERF_BIAS, epb); + pr_info_once("ENERGY_PERF_BIAS is set to 'performance'\n"); + pr_info_once("ENERGY_PERF_BIAS: Update with cpupower-set(8)\n"); } static void intel_bsp_resume(struct cpuinfo_x86 *c)
The assumption that BIOSes never want to have this register being set to full performance (zero) is wrong. While wrongly overruling this BIOS setting and set it from performance to normal did not hurt that much, because nobody really knew the effects inside Intel processors. But with Broadwell-EP processor (E5-2687W v4) the CPU will not enter turbo modes if this value is not set to performance. So switch logic to tell the user in a friendly way (info) that the CPU is in performance mode and how to switch via userspace if this is not intended. But otherwise trust that the BIOS has set the correct value here and do not blindly overrule. How this has been found: SLE11 had this patch, SLE12 it slipped through. It took quite some time to nail down that this patch missing is the reason for not entering turbo modes with this specific processor. Signed-off-by: Thomas Renninger <trenn@suse.com> -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html