diff mbox series

cpufreq: fix cpufreq_get() can't get correct CPU frequency

Message ID 20220311081111.159639-1-zhengzucheng@huawei.com (mailing list archive)
State Rejected, archived
Headers show
Series cpufreq: fix cpufreq_get() can't get correct CPU frequency | expand

Commit Message

zhengzucheng March 11, 2022, 8:11 a.m. UTC
From: Zucheng Zheng <zhengzucheng@huawei.com>

On some specific platforms, the cpufreq driver does not define
cpufreq_driver.get() routine (eg:x86 intel_pstate driver), as a
result, the cpufreq_get() can't get the correct CPU frequency.

Modern x86 processors include the hardware needed to accurately
calculate frequency over an interval -- APERF, MPERF and the TSC.
Here we use arch_freq_get_on_cpu() in preference to any driver
driver-specific cpufreq_driver.get() routine to get CPU frequency.

Fixes: f8475cef9008 ("x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF")
Signed-off-by: Zucheng Zheng <zhengzucheng@huawei.com>
---
 drivers/cpufreq/cpufreq.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Greg KH March 11, 2022, 8:33 a.m. UTC | #1
On Fri, Mar 11, 2022 at 04:11:11PM +0800, z00314508 wrote:
> From: Zucheng Zheng <zhengzucheng@huawei.com>
> 
> On some specific platforms, the cpufreq driver does not define
> cpufreq_driver.get() routine (eg:x86 intel_pstate driver), as a
> result, the cpufreq_get() can't get the correct CPU frequency.
> 
> Modern x86 processors include the hardware needed to accurately
> calculate frequency over an interval -- APERF, MPERF and the TSC.
> Here we use arch_freq_get_on_cpu() in preference to any driver
> driver-specific cpufreq_driver.get() routine to get CPU frequency.
> 
> Fixes: f8475cef9008 ("x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF")
> Signed-off-by: Zucheng Zheng <zhengzucheng@huawei.com>
> ---
>  drivers/cpufreq/cpufreq.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 80f535cc8a75..d777257b4454 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1806,10 +1806,14 @@ unsigned int cpufreq_get(unsigned int cpu)
>  {
>  	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
>  	unsigned int ret_freq = 0;
> +	unsigned int freq;
>  
>  	if (policy) {
>  		down_read(&policy->rwsem);
> -		if (cpufreq_driver->get)
> +		freq = arch_freq_get_on_cpu(policy->cpu);
> +		if (freq)
> +			ret_freq = freq;
> +		else if (cpufreq_driver->get)
>  			ret_freq = __cpufreq_get(policy);
>  		up_read(&policy->rwsem);
>  
> -- 
> 2.18.0.huawei.25
> 

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>
Rafael J. Wysocki March 11, 2022, 3:52 p.m. UTC | #2
On Fri, Mar 11, 2022 at 9:11 AM z00314508 <zhengzucheng@huawei.com> wrote:
>
> From: Zucheng Zheng <zhengzucheng@huawei.com>
>
> On some specific platforms, the cpufreq driver does not define
> cpufreq_driver.get() routine (eg:x86 intel_pstate driver), as a

I guess you mean the cpufreq driver ->get callback.

No, intel_pstate doesn't implement it, because it cannot reliably
return the current CPU frequency.

> result, the cpufreq_get() can't get the correct CPU frequency.

No, it can't, if intel_pstate is the driver, but what's the problem?
This function is only called in one place in the kernel and not on x8
even.

> Modern x86 processors include the hardware needed to accurately
> calculate frequency over an interval -- APERF, MPERF and the TSC.

You can compute the average frequency over an interval, but ->get is
expected to return the actual current frequency at the time call time.

> Here we use arch_freq_get_on_cpu() in preference to any driver
> driver-specific cpufreq_driver.get() routine to get CPU frequency.
>
> Fixes: f8475cef9008 ("x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF")

No kidding.

> Signed-off-by: Zucheng Zheng <zhengzucheng@huawei.com>
> ---
>  drivers/cpufreq/cpufreq.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 80f535cc8a75..d777257b4454 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1806,10 +1806,14 @@ unsigned int cpufreq_get(unsigned int cpu)
>  {
>         struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
>         unsigned int ret_freq = 0;
> +       unsigned int freq;
>
>         if (policy) {
>                 down_read(&policy->rwsem);
> -               if (cpufreq_driver->get)
> +               freq = arch_freq_get_on_cpu(policy->cpu);
> +               if (freq)
> +                       ret_freq = freq;
> +               else if (cpufreq_driver->get)

Again, what problem exactly does this address?

>                         ret_freq = __cpufreq_get(policy);
>                 up_read(&policy->rwsem);
>
> --
zhengzucheng March 15, 2022, 2:30 a.m. UTC | #3
On 2022/3/11 23:52, Rafael J. Wysocki wrote:
> On Fri, Mar 11, 2022 at 9:11 AM z00314508 <zhengzucheng@huawei.com> wrote:
>> From: Zucheng Zheng <zhengzucheng@huawei.com>
>>
>> On some specific platforms, the cpufreq driver does not define
>> cpufreq_driver.get() routine (eg:x86 intel_pstate driver), as a
> I guess you mean the cpufreq driver ->get callback.
>
> No, intel_pstate doesn't implement it, because it cannot reliably
> return the current CPU frequency.
>
>> result, the cpufreq_get() can't get the correct CPU frequency.
> No, it can't, if intel_pstate is the driver, but what's the problem?
> This function is only called in one place in the kernel and not on x8
> even.
>
>> Modern x86 processors include the hardware needed to accurately
>> calculate frequency over an interval -- APERF, MPERF and the TSC.
> You can compute the average frequency over an interval, but ->get is
> expected to return the actual current frequency at the time call time.
>
>> Here we use arch_freq_get_on_cpu() in preference to any driver
>> driver-specific cpufreq_driver.get() routine to get CPU frequency.
>>
>> Fixes: f8475cef9008 ("x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF")
> No kidding.
>
>> Signed-off-by: Zucheng Zheng <zhengzucheng@huawei.com>
>> ---
>>   drivers/cpufreq/cpufreq.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 80f535cc8a75..d777257b4454 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1806,10 +1806,14 @@ unsigned int cpufreq_get(unsigned int cpu)
>>   {
>>          struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
>>          unsigned int ret_freq = 0;
>> +       unsigned int freq;
>>
>>          if (policy) {
>>                  down_read(&policy->rwsem);
>> -               if (cpufreq_driver->get)
>> +               freq = arch_freq_get_on_cpu(policy->cpu);
>> +               if (freq)
>> +                       ret_freq = freq;
>> +               else if (cpufreq_driver->get)
> Again, what problem exactly does this address?
Thank you for review.

In some scenarios, cpufreq driver ->get is not defined,
some driver get the CPU frequency by calling cpufreq_get() will return 0.
The modification to this problem is inspired by the implementation of 
the show_scaling_cur_freq().
>
>>                          ret_freq = __cpufreq_get(policy);
>>                  up_read(&policy->rwsem);
>>
>> --
> .
>
Rafael J. Wysocki March 15, 2022, 10:47 a.m. UTC | #4
On Tue, Mar 15, 2022 at 3:30 AM zhengzucheng <zhengzucheng@huawei.com> wrote:
>
>
>
> On 2022/3/11 23:52, Rafael J. Wysocki wrote:
> > On Fri, Mar 11, 2022 at 9:11 AM z00314508 <zhengzucheng@huawei.com> wrote:
> >> From: Zucheng Zheng <zhengzucheng@huawei.com>
> >>
> >> On some specific platforms, the cpufreq driver does not define
> >> cpufreq_driver.get() routine (eg:x86 intel_pstate driver), as a
> > I guess you mean the cpufreq driver ->get callback.
> >
> > No, intel_pstate doesn't implement it, because it cannot reliably
> > return the current CPU frequency.
> >
> >> result, the cpufreq_get() can't get the correct CPU frequency.
> > No, it can't, if intel_pstate is the driver, but what's the problem?
> > This function is only called in one place in the kernel and not on x8
> > even.
> >
> >> Modern x86 processors include the hardware needed to accurately
> >> calculate frequency over an interval -- APERF, MPERF and the TSC.
> > You can compute the average frequency over an interval, but ->get is
> > expected to return the actual current frequency at the time call time.
> >
> >> Here we use arch_freq_get_on_cpu() in preference to any driver
> >> driver-specific cpufreq_driver.get() routine to get CPU frequency.
> >>
> >> Fixes: f8475cef9008 ("x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF")
> > No kidding.
> >
> >> Signed-off-by: Zucheng Zheng <zhengzucheng@huawei.com>
> >> ---
> >>   drivers/cpufreq/cpufreq.c | 6 +++++-
> >>   1 file changed, 5 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> >> index 80f535cc8a75..d777257b4454 100644
> >> --- a/drivers/cpufreq/cpufreq.c
> >> +++ b/drivers/cpufreq/cpufreq.c
> >> @@ -1806,10 +1806,14 @@ unsigned int cpufreq_get(unsigned int cpu)
> >>   {
> >>          struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> >>          unsigned int ret_freq = 0;
> >> +       unsigned int freq;
> >>
> >>          if (policy) {
> >>                  down_read(&policy->rwsem);
> >> -               if (cpufreq_driver->get)
> >> +               freq = arch_freq_get_on_cpu(policy->cpu);
> >> +               if (freq)
> >> +                       ret_freq = freq;
> >> +               else if (cpufreq_driver->get)
> > Again, what problem exactly does this address?
> Thank you for review.
>
> In some scenarios, cpufreq driver ->get is not defined,
> some driver get the CPU frequency by calling cpufreq_get() will return 0.

Which driver?  Again, there is only one calling cpufreq_get() in the
kernel tree and it is not on x86.

> The modification to this problem is inspired by the implementation of
> the show_scaling_cur_freq().
> >
> >>                          ret_freq = __cpufreq_get(policy);
> >>                  up_read(&policy->rwsem);
> >>
> >> --

The answer to my question appears to be that you want cpufreq_get() to
be consistent with show_scaling_cur_freq().

Fair enough, but in that case please make them both call the same
lower-level routine implementing the desired behavior so as to avoid
code duplication.
diff mbox series

Patch

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 80f535cc8a75..d777257b4454 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1806,10 +1806,14 @@  unsigned int cpufreq_get(unsigned int cpu)
 {
 	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
 	unsigned int ret_freq = 0;
+	unsigned int freq;
 
 	if (policy) {
 		down_read(&policy->rwsem);
-		if (cpufreq_driver->get)
+		freq = arch_freq_get_on_cpu(policy->cpu);
+		if (freq)
+			ret_freq = freq;
+		else if (cpufreq_driver->get)
 			ret_freq = __cpufreq_get(policy);
 		up_read(&policy->rwsem);