Message ID | 20200625224931.1468150-1-srinivas.pandruvada@linux.intel.com (mailing list archive) |
---|---|
State | Superseded, archived |
Series | [UPDATE,v3,1/2] cpufreq: intel_pstate: Allow enable/disable energy efficiency |
On Thu, Jun 25, 2020 at 03:49:31PM -0700, Srinivas Pandruvada wrote:
> By default intel_pstate driver disables energy efficiency by setting
> MSR_IA32_POWER_CTL bit 19 for Kaby Lake desktop CPU model in HWP mode.
> This CPU model is also shared by Coffee Lake desktop CPUs. This allows
> these systems to reach maximum possible frequency. But this adds power
> penalty, which some customers don't want. They want some way to enable/
> disable dynamically.
>
> So, add an additional attribute "energy_efficiency_enable" under
> /sys/devices/system/cpu/intel_pstate/ for these CPU models. This allows
> to read and write bit 19 ("Disable Energy Efficiency Optimization") in
> the MSR IA32_POWER_CTL.

Yes, this is how functionality behind MSRs should be made available to
userspace - not poking at naked MSRs. Good.

> This attribute is present in both HWP and non-HWP mode as this has an
> effect in both modes. Refer to Intel Software Developer's manual for
> details. The scope of this bit is package wide. Also these systems
> support only one package. So read/write MSR on the current CPU is
> enough.
>
> Suggested-by: Len Brown <lenb@kernel.org>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> ---
> v3 update
> Moved the MSR bit definition to msr-index.h from intel_pstate.c as Doug
> wanted. Offline checking with Borislav, for MSR definition it is
> fine to move to msr-index.h even for single user of the definition. But
> here the MSR definition is already in msr-index.h, but adding the MSR bit
> definition also.

Yes.

Btw, no need for the "offline checking" - you can do this on the mailing
list just fine.

>  Documentation/admin-guide/pm/intel_pstate.rst |  9 ++++
>  arch/x86/include/asm/msr-index.h              |  1 +
>  drivers/cpufreq/intel_pstate.c                | 47 ++++++++++++++++++-
>  3 files changed, 55 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/pm/intel_pstate.rst b/Documentation/admin-guide/pm/intel_pstate.rst
> index 39d80bc29ccd..1ca2684a94d7 100644
> --- a/Documentation/admin-guide/pm/intel_pstate.rst
> +++ b/Documentation/admin-guide/pm/intel_pstate.rst
> @@ -431,6 +431,15 @@ argument is passed to the kernel in the command line.
>  	supported in the current configuration, writes to this attribute will
>  	fail with an appropriate error.
>
> +``energy_efficiency_enable``
> +	This attribute is only present on platforms, which has CPUs matching

						    which have

> +	Kaby Lake or Coffee Lake desktop CPU model. By default
> +	"energy_efficiency" is disabled on these CPU models in HWP mode by this
> +	driver. Enabling energy efficiency may limit maximum operating
> +	frequency in both HWP and non HWP mode. In non HWP mode, this attribute
> +	has an effect in turbo range only. But in HWP mode, this attribute also
> +	has an effect in non turbo range.

Those last two sentences could be simplified - read strange.

> +
>  Interpretation of Policy Attributes
>  -----------------------------------
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index e8370e64a155..fec86ad14f8d 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -254,6 +254,7 @@
>  #define MSR_PEBS_FRONTEND		0x000003f7
>
>  #define MSR_IA32_POWER_CTL		0x000001fc
> +#define MSR_IA32_POWER_CTL_BIT_EE	19

Sort that MSR in - I know, the rest is not sorted either but we can
start somewhere. So pls put it...

#define MSR_LBR_SELECT			0x000001c8
#define MSR_LBR_TOS			0x000001c9

<--- here.

#define MSR_LBR_NHM_FROM		0x00000680
#define MSR_LBR_NHM_TO			0x000006c0

>  #define MSR_IA32_MC0_CTL		0x00000400
>  #define MSR_IA32_MC0_STATUS		0x00000401
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 8e23a698ce04..daa1d9c12098 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -1218,6 +1218,42 @@ static ssize_t store_hwp_dynamic_boost(struct kobject *a,
>  	return count;
>  }
>
> +static ssize_t show_energy_efficiency_enable(struct kobject *kobj,
> +					     struct kobj_attribute *attr,
> +					     char *buf)
> +{
> +	u64 power_ctl;
> +	int enable;
> +
> +	rdmsrl(MSR_IA32_POWER_CTL, power_ctl);
> +	enable = (power_ctl & BIT(MSR_IA32_POWER_CTL_BIT_EE)) >> MSR_IA32_POWER_CTL_BIT_EE;

So you can simplify to:

	enable = !!(power_ctl & BIT(MSR_IA32_POWER_CTL_BIT_EE));

methinks.

> +	return sprintf(buf, "%d\n", !enable);

If this bit is called

	"Disable Energy Efficiency Optimization"

why do you call your function and sysfs file "enable"? This is making it
more confusing.

Why don't you call it simply: "energy_efficiency" and have it intuitive:

1 - enabled
0 - disabled

?

> +static ssize_t store_energy_efficiency_enable(struct kobject *a,
> +					      struct kobj_attribute *b,
> +					      const char *buf, size_t count)
> +{
> +	u64 power_ctl;
> +	u32 input;
> +	int ret;
> +
> +	ret = kstrtouint(buf, 10, &input);
> +	if (ret)
> +		return ret;
> +
> +	mutex_lock(&intel_pstate_driver_lock);
> +	rdmsrl(MSR_IA32_POWER_CTL, power_ctl);
> +	if (input)

This is too lax - it will be enabled for any !0 value. Please accept
only 0 and 1.

Thx.

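The two points raised about the show handler in the review above (the `!!` simplification and the inverted polarity of the "Disable Energy Efficiency Optimization" bit) can be checked outside the kernel. The following is a minimal userspace sketch, not the driver code: a plain `uint64_t` stands in for the value read from `MSR_IA32_POWER_CTL`, and the names `BIT64`, `POWER_CTL_BIT_EE`, `ee_disabled_shift`, `ee_disabled_bang` and `ee_sysfs_value` are hypothetical, introduced only for illustration.

```c
#include <stdint.h>

/* Bit 19 of IA32_POWER_CTL: "Disable Energy Efficiency Optimization". */
#define POWER_CTL_BIT_EE 19
/* Hypothetical userspace stand-in for the kernel's BIT() macro. */
#define BIT64(n) (UINT64_C(1) << (n))

/* Form used in the patch: mask the bit, then shift it down to bit 0. */
int ee_disabled_shift(uint64_t power_ctl)
{
	return (int)((power_ctl & BIT64(POWER_CTL_BIT_EE)) >> POWER_CTL_BIT_EE);
}

/* Form suggested in the review: !! collapses any non-zero value to 1. */
int ee_disabled_bang(uint64_t power_ctl)
{
	return !!(power_ctl & BIT64(POWER_CTL_BIT_EE));
}

/*
 * Value an "energy_efficiency" style attribute would report:
 * 1 = enabled, 0 = disabled, i.e. the inverse of the MSR bit.
 */
int ee_sysfs_value(uint64_t power_ctl)
{
	return !ee_disabled_bang(power_ctl);
}
```

Both forms yield 0 or 1 for every input, so the `!!` variant is a pure readability change; the inversion in `ee_sysfs_value` is what makes a "1 - enabled, 0 - disabled" attribute sit on top of a "disable" bit.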
On Fri, 2020-06-26 at 10:49 +0200, Borislav Petkov wrote:
> On Thu, Jun 25, 2020 at 03:49:31PM -0700, Srinivas Pandruvada wrote:
> > By default intel_pstate driver disables energy efficiency by
> > setting
> > MSR_IA32_POWER_CTL bit 19 for Kaby Lake desktop CPU model in HWP
> > mode.
> > This CPU model is also shared by Coffee Lake desktop CPUs. This
> > allows
> > these systems to reach maximum possible frequency. But this adds
> > power
> > penalty, which some customers don't want. They want some way to
> > enable/
> > disable dynamically.
> >
> > So, add an additional attribute "energy_efficiency_enable" under
> > /sys/devices/system/cpu/intel_pstate/ for these CPU models. This
> > allows
> > to read and write bit 19 ("Disable Energy Efficiency Optimization")
> > in
> > the MSR IA32_POWER_CTL.
> >

[...]

> > +``energy_efficiency_enable``
> > +	This attribute is only present on platforms, which has CPUs
> > matching
>
> 						    which have
>

Thanks, I will fix that.

> > +	Kaby Lake or Coffee Lake desktop CPU model. By default
> > +	"energy_efficiency" is disabled on these CPU models in HWP mode
> > by this
> > +	driver. Enabling energy efficiency may limit maximum operating
> > +	frequency in both HWP and non HWP mode. In non HWP mode, this
> > attribute
> > +	has an effect in turbo range only. But in HWP mode, this
> > attribute also
> > +	has an effect in non turbo range.
>
> Those last two sentences could be simplified - read strange.

I will try to address this.

[...]

> > @@ -254,6 +254,7 @@
> >  #define MSR_PEBS_FRONTEND		0x000003f7
> >
> >  #define MSR_IA32_POWER_CTL		0x000001fc
> > +#define MSR_IA32_POWER_CTL_BIT_EE	19
>
> Sort that MSR in - I know, the rest is not sorted either but we can
> start somewhere. So pls put it...
>

I will.

> #define MSR_LBR_SELECT			0x000001c8
> #define MSR_LBR_TOS			0x000001c9
>
> <--- here.
>

[...]

> > +
> > +	rdmsrl(MSR_IA32_POWER_CTL, power_ctl);
> > +	enable = (power_ctl & BIT(MSR_IA32_POWER_CTL_BIT_EE)) >>
> > MSR_IA32_POWER_CTL_BIT_EE;
>
> So you can simplify to:
>
> 	enable = !!(power_ctl & BIT(MSR_IA32_POWER_CTL_BIT_EE));
>
> methinks.
>

Better.

> > +	return sprintf(buf, "%d\n", !enable);
>
> If this bit is called
>
> 	"Disable Energy Efficiency Optimization"
>
> why do you call your function and sysfs file "enable"? This is making
> it
> more confusing.
>
> Why don't you call it simply: "energy_efficiency" and have it
> intuitive:
>
> 1 - enabled
> 0 - disabled
>

I think your suggestion is good. The other attributes under this
directory have a similar style. I will get rid of "_enable".

> ?
>
> > +static ssize_t store_energy_efficiency_enable(struct kobject *a,
> > +					      struct kobj_attribute *b,
> > +					      const char *buf, size_t
> > count)
> > +{
> > +	u64 power_ctl;
> > +	u32 input;
> > +	int ret;
> > +
> > +	ret = kstrtouint(buf, 10, &input);
> > +	if (ret)
> > +		return ret;
> > +
> > +	mutex_lock(&intel_pstate_driver_lock);
> > +	rdmsrl(MSR_IA32_POWER_CTL, power_ctl);
> > +	if (input)
>
> This is too lax - it will be enabled for any !0 value. Please accept
> only 0 and 1.
>

OK.

Thanks for the review.

- Srinivas

On Fri, Jun 26, 2020 at 10:49:03AM +0200, Borislav Petkov wrote:
> On Thu, Jun 25, 2020 at 03:49:31PM -0700, Srinivas Pandruvada wrote:
> > +static ssize_t store_energy_efficiency_enable(struct kobject *a,
> > +					      struct kobj_attribute *b,
> > +					      const char *buf, size_t count)
> > +{
> > +	u64 power_ctl;
> > +	u32 input;
> > +	int ret;
> > +
> > +	ret = kstrtouint(buf, 10, &input);
> > +	if (ret)
> > +		return ret;
> > +
> > +	mutex_lock(&intel_pstate_driver_lock);
> > +	rdmsrl(MSR_IA32_POWER_CTL, power_ctl);
> > +	if (input)
>
> This is too lax - it will be enabled for any !0 value. Please accept
> only 0 and 1.

kstrtobool() ftw

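To make the "accept only 0 and 1" request concrete, here is a minimal userspace sketch of a strict parser plus the read-modify-write on bit 19. It is an illustration under stated assumptions, not the driver code and not what a later revision of the patch necessarily does: `parse_strict_bool` and `apply_energy_efficiency` are hypothetical helpers, and a plain `uint64_t` stands in for the MSR value. As far as I know, the kernel's `kstrtobool()` is somewhat more permissive than this sketch, since it also accepts spellings such as y/n and on/off.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Bit 19 of IA32_POWER_CTL: "Disable Energy Efficiency Optimization". */
#define POWER_CTL_BIT_EE 19
/* Hypothetical userspace stand-in for the kernel's BIT() macro. */
#define BIT64(n) (UINT64_C(1) << (n))

/*
 * Hypothetical helper: accept exactly "0" or "1", optionally followed by
 * a single trailing newline (as written by `echo 1 > attribute`).
 * Returns 0 or 1 on success and -EINVAL for anything else.
 */
int parse_strict_bool(const char *buf)
{
	size_t len = strlen(buf);

	if (len == 2 && buf[1] == '\n')
		len = 1;
	if (len != 1 || (buf[0] != '0' && buf[0] != '1'))
		return -EINVAL;

	return buf[0] - '0';
}

/*
 * Read-modify-write on a copy of the register value: enabling energy
 * efficiency clears the "disable" bit, disabling it sets the bit.
 * All other bits are left untouched.
 */
uint64_t apply_energy_efficiency(uint64_t power_ctl, int enable)
{
	if (enable)
		power_ctl &= ~BIT64(POWER_CTL_BIT_EE);
	else
		power_ctl |= BIT64(POWER_CTL_BIT_EE);
	return power_ctl;
}
```

With this shape, an input such as "2" or "10" is rejected instead of silently enabling the feature, which is the behaviour the review asks for.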
diff --git a/Documentation/admin-guide/pm/intel_pstate.rst b/Documentation/admin-guide/pm/intel_pstate.rst
index 39d80bc29ccd..1ca2684a94d7 100644
--- a/Documentation/admin-guide/pm/intel_pstate.rst
+++ b/Documentation/admin-guide/pm/intel_pstate.rst
@@ -431,6 +431,15 @@ argument is passed to the kernel in the command line.
 	supported in the current configuration, writes to this attribute will
 	fail with an appropriate error.
 
+``energy_efficiency_enable``
+	This attribute is only present on platforms, which has CPUs matching
+	Kaby Lake or Coffee Lake desktop CPU model. By default
+	"energy_efficiency" is disabled on these CPU models in HWP mode by this
+	driver. Enabling energy efficiency may limit maximum operating
+	frequency in both HWP and non HWP mode. In non HWP mode, this attribute
+	has an effect in turbo range only. But in HWP mode, this attribute also
+	has an effect in non turbo range.
+
 Interpretation of Policy Attributes
 -----------------------------------
 
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index e8370e64a155..fec86ad14f8d 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -254,6 +254,7 @@
 #define MSR_PEBS_FRONTEND		0x000003f7
 
 #define MSR_IA32_POWER_CTL		0x000001fc
+#define MSR_IA32_POWER_CTL_BIT_EE	19
 
 #define MSR_IA32_MC0_CTL		0x00000400
 #define MSR_IA32_MC0_STATUS		0x00000401
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 8e23a698ce04..daa1d9c12098 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1218,6 +1218,42 @@ static ssize_t store_hwp_dynamic_boost(struct kobject *a,
 	return count;
 }
 
+static ssize_t show_energy_efficiency_enable(struct kobject *kobj,
+					     struct kobj_attribute *attr,
+					     char *buf)
+{
+	u64 power_ctl;
+	int enable;
+
+	rdmsrl(MSR_IA32_POWER_CTL, power_ctl);
+	enable = (power_ctl & BIT(MSR_IA32_POWER_CTL_BIT_EE)) >> MSR_IA32_POWER_CTL_BIT_EE;
+	return sprintf(buf, "%d\n", !enable);
+}
+
+static ssize_t store_energy_efficiency_enable(struct kobject *a,
+					      struct kobj_attribute *b,
+					      const char *buf, size_t count)
+{
+	u64 power_ctl;
+	u32 input;
+	int ret;
+
+	ret = kstrtouint(buf, 10, &input);
+	if (ret)
+		return ret;
+
+	mutex_lock(&intel_pstate_driver_lock);
+	rdmsrl(MSR_IA32_POWER_CTL, power_ctl);
+	if (input)
+		power_ctl &= ~BIT(MSR_IA32_POWER_CTL_BIT_EE);
+	else
+		power_ctl |= BIT(MSR_IA32_POWER_CTL_BIT_EE);
+	wrmsrl(MSR_IA32_POWER_CTL, power_ctl);
+	mutex_unlock(&intel_pstate_driver_lock);
+
+	return count;
+}
+
 show_one(max_perf_pct, max_perf_pct);
 show_one(min_perf_pct, min_perf_pct);
 
@@ -1228,6 +1264,7 @@ define_one_global_rw(min_perf_pct);
 define_one_global_ro(turbo_pct);
 define_one_global_ro(num_pstates);
 define_one_global_rw(hwp_dynamic_boost);
+define_one_global_rw(energy_efficiency_enable);
 
 static struct attribute *intel_pstate_attributes[] = {
 	&status.attr,
@@ -1241,6 +1278,8 @@ static const struct attribute_group intel_pstate_attr_group = {
 	.attrs = intel_pstate_attributes,
 };
 
+static const struct x86_cpu_id intel_pstate_cpu_ee_disable_ids[];
+
 static void __init intel_pstate_sysfs_expose_params(void)
 {
 	struct kobject *intel_pstate_kobject;
@@ -1273,6 +1312,12 @@ static void __init intel_pstate_sysfs_expose_params(void)
 				       &hwp_dynamic_boost.attr);
 		WARN_ON(rc);
 	}
+
+	if (x86_match_cpu(intel_pstate_cpu_ee_disable_ids)) {
+		rc = sysfs_create_file(intel_pstate_kobject,
+				       &energy_efficiency_enable.attr);
+		WARN_ON(rc);
+	}
 }
 
 /************************** sysfs end ************************/
@@ -1288,8 +1333,6 @@ static void intel_pstate_hwp_enable(struct cpudata *cpudata)
 	cpudata->epp_default = intel_pstate_get_epp(cpudata, 0);
 }
 
-#define MSR_IA32_POWER_CTL_BIT_EE	19
-
 /* Disable energy efficiency optimization */
 static void intel_pstate_disable_ee(int cpu)
 {

By default intel_pstate driver disables energy efficiency by setting
MSR_IA32_POWER_CTL bit 19 for Kaby Lake desktop CPU model in HWP mode.
This CPU model is also shared by Coffee Lake desktop CPUs. This allows
these systems to reach maximum possible frequency. But this adds power
penalty, which some customers don't want. They want some way to enable/
disable dynamically.

So, add an additional attribute "energy_efficiency_enable" under
/sys/devices/system/cpu/intel_pstate/ for these CPU models. This allows
to read and write bit 19 ("Disable Energy Efficiency Optimization") in
the MSR IA32_POWER_CTL.

This attribute is present in both HWP and non-HWP mode as this has an
effect in both modes. Refer to Intel Software Developer's manual for
details. The scope of this bit is package wide. Also these systems
support only one package. So read/write MSR on the current CPU is
enough.

Suggested-by: Len Brown <lenb@kernel.org>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
---
v3 update
Moved the MSR bit definition to msr-index.h from intel_pstate.c as Doug
wanted. Offline checking with Borislav, for MSR definition it is
fine to move to msr-index.h even for single user of the definition. But
here the MSR definition is already in msr-index.h, but adding the MSR bit
definition also.

 Documentation/admin-guide/pm/intel_pstate.rst |  9 ++++
 arch/x86/include/asm/msr-index.h              |  1 +
 drivers/cpufreq/intel_pstate.c                | 47 ++++++++++++++++++-
 3 files changed, 55 insertions(+), 2 deletions(-)