diff mbox series

[1/2] cpufreq: qcom-hw: Use initialized cpumask for thermal pressure update

Message ID 20220118185612.2067031-1-bjorn.andersson@linaro.org (mailing list archive)
State New, archived
Delegated to: viresh kumar
Headers show
Series [1/2] cpufreq: qcom-hw: Use initialized cpumask for thermal pressure update | expand

Commit Message

Bjorn Andersson Jan. 18, 2022, 6:56 p.m. UTC
In the event that the SoC is under thermal pressure while booting it's
possible for the dcvs notification to happen inbetween the cpufreq
framework calling init and it actually updating the policy's
related_cpus cpumask.

Prior to the introduction of the thermal pressure update helper an empty
cpumask would simply result in the thermal pressure of no cpus being
updated, but the new code will attempt to dereference an invalid per_cpu
variable.

Avoid this problem by using the policy's cpus cpumask instead of the
related_cpus mask, as this is initialized before the interrupt is
registered.

Fixes: 0258cb19c77d ("cpufreq: qcom-cpufreq-hw: Use new thermal pressure update function")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
---
 drivers/cpufreq/qcom-cpufreq-hw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Viresh Kumar Jan. 19, 2022, 6:35 a.m. UTC | #1
On 18-01-22, 10:56, Bjorn Andersson wrote:
> In the event that the SoC is under thermal pressure while booting it's
> possible for the dcvs notification to happen inbetween the cpufreq
> framework calling init and it actually updating the policy's
> related_cpus cpumask.
> 
> Prior to the introduction of the thermal pressure update helper an empty
> cpumask would simply result in the thermal pressure of no cpus being
> updated, but the new code will attempt to dereference an invalid per_cpu
> variable.
> 
> Avoid this problem by using the policy's cpus cpumask instead of the
> related_cpus mask, as this is initialized before the interrupt is
> registered.
> 
> Fixes: 0258cb19c77d ("cpufreq: qcom-cpufreq-hw: Use new thermal pressure update function")
> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
> ---
>  drivers/cpufreq/qcom-cpufreq-hw.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index 05f3d7876e44..866fba3ac6fc 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -297,7 +297,7 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>  	throttled_freq = freq_hz / HZ_PER_KHZ;
>  
>  	/* Update thermal pressure (the boost frequencies are accepted) */
> -	arch_update_thermal_pressure(policy->related_cpus, throttled_freq);
> +	arch_update_thermal_pressure(policy->cpus, throttled_freq);

policy->cpus keeps on changing with CPU hotplug and this can leave
your platform in an inconsistent state. For example, in case where you
offline a CPU from policy, other CPUs get their thermal pressure
updated, online the CPU back and all CPUs of a policy don't have the
same settings anymore.

There are few things we can do here now:

- Check for empty related_cpus and return early. Since related_cpus is
  updated only once, this shall work just fine and must not be racy.

  While at it, I think we can also do something like this in
  topology_update_thermal_pressure() instead:

  	cpu = cpumask_first(cpus);
        if (unlikely(cpu >= NR_CPUS))
                return;

- And while writing this email, I dropped all other ideas in favor of
  change to topology_update_thermal_pressure() :)

--
viresh
Viresh Kumar Jan. 19, 2022, 6:40 a.m. UTC | #2
On 19-01-22, 12:05, Viresh Kumar wrote:
> policy->cpus keeps on changing with CPU hotplug and this can leave
> your platform in an inconsistent state. For example, in case where you
> offline a CPU from policy, other CPUs get their thermal pressure
> updated, online the CPU back and all CPUs of a policy don't have the
> same settings anymore.
> 
> There are few things we can do here now:
> 
> - Check for empty related_cpus and return early. Since related_cpus is
>   updated only once, this shall work just fine and must not be racy.
> 
>   While at it, I think we can also do something like this in
>   topology_update_thermal_pressure() instead:
> 
>   	cpu = cpumask_first(cpus);
>         if (unlikely(cpu >= NR_CPUS))
>                 return;
> 
> - And while writing this email, I dropped all other ideas in favor of
>   change to topology_update_thermal_pressure() :)

And then I saw your second patch, which looks good as otherwise we
will not be able to catch the bug in our system where we are sending
the empty cpumask :)

So the other idea is:

- Revert, or bring back a new version of this and register the
  interrupt from there. But that is also not a very clean solution.

  commit 4bf8e582119e ("cpufreq: Remove ready() callback")

-
Bjorn Andersson Jan. 19, 2022, 3:05 p.m. UTC | #3
On Tue 18 Jan 22:40 PST 2022, Viresh Kumar wrote:

> On 19-01-22, 12:05, Viresh Kumar wrote:
> > policy->cpus keeps on changing with CPU hotplug and this can leave
> > your platform in an inconsistent state. For example, in case where you
> > offline a CPU from policy, other CPUs get their thermal pressure
> > updated, online the CPU back and all CPUs of a policy don't have the
> > same settings anymore.
> > 

Oh, I didn't know that. Then my proposal doesn't seem that awesome.

> > There are few things we can do here now:
> > 
> > - Check for empty related_cpus and return early. Since related_cpus is
> >   updated only once, this shall work just fine and must not be racy.
> > 
> >   While at it, I think we can also do something like this in
> >   topology_update_thermal_pressure() instead:
> > 
> >   	cpu = cpumask_first(cpus);
> >         if (unlikely(cpu >= NR_CPUS))
> >                 return;
> > 
> > - And while writing this email, I dropped all other ideas in favor of
> >   change to topology_update_thermal_pressure() :)
> 
> And then I saw your second patch, which looks good as otherwise we
> will not be able to catch the bug in our system where we are sending
> the empty cpumask :)
> 
> So the other idea is:
> 
> - Revert, or bring back a new version of this and register the
>   interrupt from there. But that is also not a very clean solution.
> 
>   commit 4bf8e582119e ("cpufreq: Remove ready() callback")
> 

We could do this and keep the interrupt disabled until we hit ready().

But I found the resulting issue non-trivial to debug, so I would prefer
if arch_update_thermal_pressure() dealt with the empty cpumask. So as
you suggest in your first reply, I'll respin the second patch alone,
without the WARN_ON().

Thanks,
Bjorn
diff mbox series

Patch

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index 05f3d7876e44..866fba3ac6fc 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -297,7 +297,7 @@  static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
 	throttled_freq = freq_hz / HZ_PER_KHZ;
 
 	/* Update thermal pressure (the boost frequencies are accepted) */
-	arch_update_thermal_pressure(policy->related_cpus, throttled_freq);
+	arch_update_thermal_pressure(policy->cpus, throttled_freq);
 
 	/*
 	 * In the unlikely case policy is unregistered do not enable