diff mbox series

[RFC] cpufreq: intel_pstate: Delete intel_pstate sysfs if failed to register the driver

Message ID 20201009033038.23157-1-yu.c.chen@intel.com (mailing list archive)
State Mainlined, archived
Headers show
Series [RFC] cpufreq: intel_pstate: Delete intel_pstate sysfs if failed to register the driver | expand

Commit Message

Chen Yu Oct. 9, 2020, 3:30 a.m. UTC
There is a corner case that if the intel_pstate driver failed to be
registered(might be due to invalid MSR access) and with the acpi_cpufreq
loaded, the intel_pstate sysfs might still be created, which makes the
user confusing(turbostat for example):

grep . /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
acpi-cpufreq

grep . /sys/devices/system/cpu/intel_pstate/*
/sys/devices/system/cpu/intel_pstate/max_perf_pct:0
/sys/devices/system/cpu/intel_pstate/min_perf_pct:0
grep: /sys/devices/system/cpu/intel_pstate/no_turbo: Resource temporarily unavailable
grep: /sys/devices/system/cpu/intel_pstate/num_pstates: Resource temporarily unavailable
/sys/devices/system/cpu/intel_pstate/status:off
grep: /sys/devices/system/cpu/intel_pstate/turbo_pct: Resource temporarily unavailable

The existing of intel_pstate sysfs does not mean that the intel_pstate driver
has been successfully loaded(for example, echo off to status), but the
intel_pstate sysfs should not co-exist when acpi-cpufreq is also present.
Fix this issue by deleting the intel_pstate sysfs if the driver failed
to be loaded during bootup.

Reported-by: Wendy Wang <wendy.wang@intel.com>
Suggested-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 drivers/cpufreq/intel_pstate.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

Comments

Srinivas Pandruvada Oct. 12, 2020, 1:22 p.m. UTC | #1
On Fri, 2020-10-09 at 11:30 +0800, Chen Yu wrote:
> There is a corner case that if the intel_pstate driver failed to be
> registered(might be due to invalid MSR access) 
Do you have logs why it is not loaded? On supported platforms MSRs
should be invalid.
It may be a case when we are trying to bring up pre-production systems
where some instability in MSRs on certain CPUs. 

But the patch is correct. We can't have invalid folder when
intel_pstate is not used. 

> and with the acpi_cpufreq
> loaded, the intel_pstate sysfs might still be created, which makes
> the
> user confusing(turbostat for example):
> 
> grep . /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
> acpi-cpufreq
> 
> grep . /sys/devices/system/cpu/intel_pstate/*
> /sys/devices/system/cpu/intel_pstate/max_perf_pct:0
> /sys/devices/system/cpu/intel_pstate/min_perf_pct:0
> grep: /sys/devices/system/cpu/intel_pstate/no_turbo: Resource
> temporarily unavailable
> grep: /sys/devices/system/cpu/intel_pstate/num_pstates: Resource
> temporarily unavailable
> /sys/devices/system/cpu/intel_pstate/status:off
> grep: /sys/devices/system/cpu/intel_pstate/turbo_pct: Resource
> temporarily unavailable
> 
> The existing of intel_pstate sysfs does not mean that the
> intel_pstate driver
> has been successfully loaded(for example, echo off to status), but
> the
> intel_pstate sysfs should not co-exist when acpi-cpufreq is also
> present.
> Fix this issue by deleting the intel_pstate sysfs if the driver
> failed
> to be loaded during bootup.
> 
> Reported-by: Wendy Wang <wendy.wang@intel.com>
> Suggested-by: Zhang Rui <rui.zhang@intel.com>
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com


> --->
>  drivers/cpufreq/intel_pstate.c | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/intel_pstate.c
> b/drivers/cpufreq/intel_pstate.c
> index 9a515c460a00..8c5f9680de83 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -1420,6 +1420,26 @@ static void __init
> intel_pstate_sysfs_expose_params(void)
>  	}
>  }
>  
> +static void __init intel_pstate_sysfs_clean(void)
> +{
> +	if (!intel_pstate_kobject)
> +		return;
> +
> +	sysfs_remove_group(intel_pstate_kobject,
> &intel_pstate_attr_group);
> +
> +	if (per_cpu_limits)
> +		goto release_kobj;
> +
> +	sysfs_remove_file(intel_pstate_kobject, &max_perf_pct.attr);
> +	sysfs_remove_file(intel_pstate_kobject, &min_perf_pct.attr);
> +
> +	if (x86_match_cpu(intel_pstate_cpu_ee_disable_ids))
> +		sysfs_remove_file(intel_pstate_kobject,
> &energy_efficiency.attr);
> +
> +release_kobj:
> +	kobject_put(intel_pstate_kobject);
> +}
> +
>  static void intel_pstate_sysfs_expose_hwp_dynamic_boost(void)
>  {
>  	int rc;
> @@ -3063,8 +3083,10 @@ static int __init intel_pstate_init(void)
>  	mutex_lock(&intel_pstate_driver_lock);
>  	rc = intel_pstate_register_driver(default_driver);
>  	mutex_unlock(&intel_pstate_driver_lock);
> -	if (rc)
> +	if (rc) {
> +		intel_pstate_sysfs_clean();
>  		return rc;
> +	}
>  
>  	if (hwp_active) {
>  		const struct x86_cpu_id *id;
Chen Yu Oct. 13, 2020, 6:44 a.m. UTC | #2
Hi Srinivas,
On Mon, Oct 12, 2020 at 06:22:40AM -0700, srinivas pandruvada wrote:
> On Fri, 2020-10-09 at 11:30 +0800, Chen Yu wrote:
> > There is a corner case that if the intel_pstate driver failed to be
> > registered(might be due to invalid MSR access) 
> Do you have logs why it is not loaded? On supported platforms MSRs
> should be invalid.
Unfortunately we don't have the boot up log for now, as it is
a pre-production platform and the low-level simulation(for MSR)
might be unstable.( And there seems to be some environment issue
on pre-production platform to reproduce this issue).
But we can hack the code in intel_pstate to make the driver failed
to be loaded and the issue was reproduced.
> It may be a case when we are trying to bring up pre-production systems
> where some instability in MSRs on certain CPUs. 
> 
> But the patch is correct. We can't have invalid folder when
> intel_pstate is not used. 
> 
> > and with the acpi_cpufreq
> > loaded, the intel_pstate sysfs might still be created, which makes
> > the
> > user confusing(turbostat for example):
> > 
> > grep . /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
> > acpi-cpufreq
> > 
> > grep . /sys/devices/system/cpu/intel_pstate/*
> > /sys/devices/system/cpu/intel_pstate/max_perf_pct:0
> > /sys/devices/system/cpu/intel_pstate/min_perf_pct:0
> > grep: /sys/devices/system/cpu/intel_pstate/no_turbo: Resource
> > temporarily unavailable
> > grep: /sys/devices/system/cpu/intel_pstate/num_pstates: Resource
> > temporarily unavailable
> > /sys/devices/system/cpu/intel_pstate/status:off
> > grep: /sys/devices/system/cpu/intel_pstate/turbo_pct: Resource
> > temporarily unavailable
> > 
> > The existing of intel_pstate sysfs does not mean that the
> > intel_pstate driver
> > has been successfully loaded(for example, echo off to status), but
> > the
> > intel_pstate sysfs should not co-exist when acpi-cpufreq is also
> > present.
> > Fix this issue by deleting the intel_pstate sysfs if the driver
> > failed
> > to be loaded during bootup.
> > 
> > Reported-by: Wendy Wang <wendy.wang@intel.com>
> > Suggested-by: Zhang Rui <rui.zhang@intel.com>
> > Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com
Thanks!


Best,
Chenyu
Rafael J. Wysocki Oct. 16, 2020, 2:34 p.m. UTC | #3
On Fri, Oct 9, 2020 at 5:29 AM Chen Yu <yu.c.chen@intel.com> wrote:
>
> There is a corner case that if the intel_pstate driver failed to be
> registered(might be due to invalid MSR access) and with the acpi_cpufreq
> loaded, the intel_pstate sysfs might still be created, which makes the
> user confusing(turbostat for example):
>
> grep . /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
> acpi-cpufreq
>
> grep . /sys/devices/system/cpu/intel_pstate/*
> /sys/devices/system/cpu/intel_pstate/max_perf_pct:0
> /sys/devices/system/cpu/intel_pstate/min_perf_pct:0
> grep: /sys/devices/system/cpu/intel_pstate/no_turbo: Resource temporarily unavailable
> grep: /sys/devices/system/cpu/intel_pstate/num_pstates: Resource temporarily unavailable
> /sys/devices/system/cpu/intel_pstate/status:off
> grep: /sys/devices/system/cpu/intel_pstate/turbo_pct: Resource temporarily unavailable
>
> The existing of intel_pstate sysfs does not mean that the intel_pstate driver
> has been successfully loaded(for example, echo off to status), but the
> intel_pstate sysfs should not co-exist when acpi-cpufreq is also present.
> Fix this issue by deleting the intel_pstate sysfs if the driver failed
> to be loaded during bootup.
>
> Reported-by: Wendy Wang <wendy.wang@intel.com>
> Suggested-by: Zhang Rui <rui.zhang@intel.com>
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> ---
>  drivers/cpufreq/intel_pstate.c | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 9a515c460a00..8c5f9680de83 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -1420,6 +1420,26 @@ static void __init intel_pstate_sysfs_expose_params(void)
>         }
>  }
>
> +static void __init intel_pstate_sysfs_clean(void)
> +{
> +       if (!intel_pstate_kobject)
> +               return;
> +
> +       sysfs_remove_group(intel_pstate_kobject, &intel_pstate_attr_group);
> +
> +       if (per_cpu_limits)
> +               goto release_kobj;
> +
> +       sysfs_remove_file(intel_pstate_kobject, &max_perf_pct.attr);
> +       sysfs_remove_file(intel_pstate_kobject, &min_perf_pct.attr);
> +
> +       if (x86_match_cpu(intel_pstate_cpu_ee_disable_ids))
> +               sysfs_remove_file(intel_pstate_kobject, &energy_efficiency.attr);
> +
> +release_kobj:
> +       kobject_put(intel_pstate_kobject);
> +}
> +
>  static void intel_pstate_sysfs_expose_hwp_dynamic_boost(void)
>  {
>         int rc;
> @@ -3063,8 +3083,10 @@ static int __init intel_pstate_init(void)
>         mutex_lock(&intel_pstate_driver_lock);
>         rc = intel_pstate_register_driver(default_driver);
>         mutex_unlock(&intel_pstate_driver_lock);
> -       if (rc)
> +       if (rc) {
> +               intel_pstate_sysfs_clean();
>                 return rc;
> +       }
>
>         if (hwp_active) {
>                 const struct x86_cpu_id *id;
> --

Applied as 5.10-rc material with some minor changes and the Srinivas'
ACK, thanks!
diff mbox series

Patch

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 9a515c460a00..8c5f9680de83 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1420,6 +1420,26 @@  static void __init intel_pstate_sysfs_expose_params(void)
 	}
 }
 
+static void __init intel_pstate_sysfs_clean(void)
+{
+	if (!intel_pstate_kobject)
+		return;
+
+	sysfs_remove_group(intel_pstate_kobject, &intel_pstate_attr_group);
+
+	if (per_cpu_limits)
+		goto release_kobj;
+
+	sysfs_remove_file(intel_pstate_kobject, &max_perf_pct.attr);
+	sysfs_remove_file(intel_pstate_kobject, &min_perf_pct.attr);
+
+	if (x86_match_cpu(intel_pstate_cpu_ee_disable_ids))
+		sysfs_remove_file(intel_pstate_kobject, &energy_efficiency.attr);
+
+release_kobj:
+	kobject_put(intel_pstate_kobject);
+}
+
 static void intel_pstate_sysfs_expose_hwp_dynamic_boost(void)
 {
 	int rc;
@@ -3063,8 +3083,10 @@  static int __init intel_pstate_init(void)
 	mutex_lock(&intel_pstate_driver_lock);
 	rc = intel_pstate_register_driver(default_driver);
 	mutex_unlock(&intel_pstate_driver_lock);
-	if (rc)
+	if (rc) {
+		intel_pstate_sysfs_clean();
 		return rc;
+	}
 
 	if (hwp_active) {
 		const struct x86_cpu_id *id;