[v2,1/2] thermal: int340x: do not set a wrong tcc offset on resume

Message ID 20210909085613.5577-2-atenart@kernel.org (mailing list archive)
State Mainlined
Delegated to: Zhang Rui
Series thermal: int340x: fix tcc offset on resume

Commit Message

Antoine Tenart Sept. 9, 2021, 8:56 a.m. UTC
After upgrading to Linux 5.13.3 I noticed my laptop would shut down due
to overheating (when it should not). It turned out this was due to commit
fe6a6de6692e ("thermal/drivers/int340x/processor_thermal: Fix tcc setting").

What happens is this driver uses a global variable to keep track of the
tcc offset (tcc_offset_save) and uses it on resume. The issue is this
variable is initialized to 0, but is only set in
tcc_offset_degree_celsius_store, i.e. when the tcc offset is explicitly
set by userspace. If that does not happen, the resume path will set the
offset to 0 (in my case the h/w default being 3, the offset would become
too low after a suspend/resume cycle).

The issue did not arise before commit fe6a6de6692e, as the function
setting the offset would return if the offset was 0. This is no longer
the case (rightfully).

Fix this by not applying the offset if it wasn't saved before, reverting
back to the old logic. A better approach will come later, but this will
be easier to apply to stable kernels.

The logic to restore the offset after a resume was there long before
commit fe6a6de6692e, but as a value of 0 was considered invalid I'm
referencing the commit that made the issue possible in the Fixes tag
instead.

Fixes: fe6a6de6692e ("thermal/drivers/int340x/processor_thermal: Fix tcc setting")
Cc: stable@vger.kernel.org
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 .../thermal/intel/int340x_thermal/processor_thermal_device.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
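
The failure mode described in the commit message can be reproduced in a
small user-space model. This is only a sketch and not the driver code:
struct tcc_model and every tcc_model_* function are hypothetical names, the
default of 3 is the value the author reports for his machine, and the model
assumes the hardware offset falls back to its default across a
suspend/resume cycle before the driver restores anything.

```c
#include <assert.h>

/* Hypothetical user-space model; none of these names exist in the driver. */
struct tcc_model {
	int hw_default;	/* offset the platform starts with (3 in the report) */
	int hw_offset;	/* offset currently programmed in the fake hardware */
	int saved;	/* models tcc_offset_save; -1 means "never stored" */
};

void tcc_model_init(struct tcc_model *m, int hw_default, int saved_init)
{
	m->hw_default = hw_default;
	m->hw_offset = hw_default;
	m->saved = saved_init;
}

/* Models the sysfs store path: program the offset and remember it. */
void tcc_model_store(struct tcc_model *m, unsigned int tcc)
{
	m->hw_offset = (int)tcc;
	m->saved = (int)tcc;
}

/* Assumption: the hardware is back at its default after resume. */
static void tcc_model_hw_reset(struct tcc_model *m)
{
	m->hw_offset = m->hw_default;
}

/* Resume path without the guard: always writes the saved value. */
void tcc_model_resume_unconditional(struct tcc_model *m)
{
	tcc_model_hw_reset(m);
	m->hw_offset = m->saved;
}

/* Resume path with the guard from this patch. */
void tcc_model_resume_guarded(struct tcc_model *m)
{
	tcc_model_hw_reset(m);
	if (m->saved >= 0)
		m->hw_offset = m->saved;
}

/* Walks through the scenarios; returns 0 if every check holds. */
int tcc_model_demo(void)
{
	struct tcc_model buggy, fixed;

	/* Zero-initialised save variable, userspace never wrote an offset. */
	tcc_model_init(&buggy, 3, 0);
	tcc_model_resume_unconditional(&buggy);
	assert(buggy.hw_offset == 0);	/* the default of 3 is lost */

	/* Sentinel -1: the default survives the resume. */
	tcc_model_init(&fixed, 3, -1);
	tcc_model_resume_guarded(&fixed);
	assert(fixed.hw_offset == 3);

	/* An explicitly stored offset, including 0, is still restored. */
	tcc_model_store(&fixed, 5);
	tcc_model_resume_guarded(&fixed);
	assert(fixed.hw_offset == 5);
	tcc_model_store(&fixed, 0);
	tcc_model_resume_guarded(&fixed);
	assert(fixed.hw_offset == 0);

	return 0;
}
```

Calling tcc_model_demo() from a test harness exercises all three cases. The
point of the sentinel is that 0 stays a valid offset userspace can request,
while "nothing was ever stored" becomes a separate state.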

Comments

Srinivas Pandruvada Sept. 24, 2021, 4:27 p.m. UTC | #1
Hi Daniel,

This patch is important. Can we send it for a 5.15-rc release?

I see the previous version of this patch is applied to linux-next.
But this series is better as it splits into two patches. The first one
can be easily backported and will fix the problem. The second one is an
improvement.

Thanks,
Srinivas

On Thu, 2021-09-09 at 10:56 +0200, Antoine Tenart wrote:
> After upgrading to Linux 5.13.3 I noticed my laptop would shut down due
> to overheating (when it should not). It turned out this was due to commit
> fe6a6de6692e ("thermal/drivers/int340x/processor_thermal: Fix tcc setting").
> 
> What happens is this driver uses a global variable to keep track of the
> tcc offset (tcc_offset_save) and uses it on resume. The issue is this
> variable is initialized to 0, but is only set in
> tcc_offset_degree_celsius_store, i.e. when the tcc offset is explicitly
> set by userspace. If that does not happen, the resume path will set the
> offset to 0 (in my case the h/w default being 3, the offset would become
> too low after a suspend/resume cycle).
> 
> The issue did not arise before commit fe6a6de6692e, as the function
> setting the offset would return if the offset was 0. This is no longer
> the case (rightfully).
> 
> Fix this by not applying the offset if it wasn't saved before, reverting
> back to the old logic. A better approach will come later, but this will
> be easier to apply to stable kernels.
> 
> The logic to restore the offset after a resume was there long before
> commit fe6a6de6692e, but as a value of 0 was considered invalid I'm
> referencing the commit that made the issue possible in the Fixes tag
> instead.
> 
> Fixes: fe6a6de6692e ("thermal/drivers/int340x/processor_thermal: Fix tcc setting")
> Cc: stable@vger.kernel.org
> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> Signed-off-by: Antoine Tenart <atenart@kernel.org>
> ---
>  .../thermal/intel/int340x_thermal/processor_thermal_device.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
> index 0f0038af2ad4..fb64acfd5e07 100644
> --- a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
> +++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
> @@ -107,7 +107,7 @@ static int tcc_offset_update(unsigned int tcc)
>         return 0;
>  }
>  
> -static unsigned int tcc_offset_save;
> +static int tcc_offset_save = -1;
>  
>  static ssize_t tcc_offset_degree_celsius_store(struct device *dev,
>                                 struct device_attribute *attr, const char *buf,
> @@ -352,7 +352,8 @@ int proc_thermal_resume(struct device *dev)
>         proc_dev = dev_get_drvdata(dev);
>         proc_thermal_read_ppcc(proc_dev);
>  
> -       tcc_offset_update(tcc_offset_save);
> +       if (tcc_offset_save >= 0)
> +               tcc_offset_update(tcc_offset_save);
>  
>         return 0;
>  }
Daniel Lezcano Sept. 24, 2021, 5:40 p.m. UTC | #2
On 24/09/2021 18:27, Srinivas Pandruvada wrote:
> Hi Daniel,
> 
> This patch is important. Can we send it for a 5.15-rc release?
> 
> I see the previous version of this patch is applied to linux-next.
> But this series is better as it splits into two patches. The first one
> can be easily backported and will fix the problem. The second one is an
> improvement.

Yes, it is in the pipe.

I've applied the patch 1/2 to the fixes branch and the patch 2/2 will
land in the next branch as soon as the next -rc is released with the fix
and merged to the next branch.
Srinivas Pandruvada Sept. 24, 2021, 5:51 p.m. UTC | #3
On Fri, 2021-09-24 at 19:40 +0200, Daniel Lezcano wrote:
> On 24/09/2021 18:27, Srinivas Pandruvada wrote:
> > Hi Daniel,
> > 
> > This patch is important. Can we send it for a 5.15-rc release?
> > 
> > I see the previous version of this patch is applied to linux-next.
> > But this series is better as it splits into two patches. The first
> > one
> > can be easily backported and will fix the problem. The second one
> > is an
> > improvement.
> 
> Yes, it is in the pipe.
> 
> I've applied the patch 1/2 to the fixes branch and the patch 2/2 will
> land in the next branch as soon as the next -rc is released with the
> fix
> and merged to the next branch.

Thanks Daniel.

-Srinivas

Antoine Tenart Oct. 20, 2021, 1:38 p.m. UTC | #4
Hello Daniel,

Quoting Daniel Lezcano (2021-09-24 19:40:13)
> 
> I've applied the patch 1/2 to the fixes branch and the patch 2/2 will
> land in the next branch as soon as the next -rc is released with the fix
> and merged to the next branch.

I don't see it in thermal/next even though patch 1 has made it. Not sure
if patch 2 has slipped through the cracks or wasn't pushed yet. If it's
the latter, please ignore this mail.

Thanks!
Antoine
Daniel Lezcano Oct. 21, 2021, 9:47 a.m. UTC | #5
On 20/10/2021 15:38, Antoine Tenart wrote:
> Hello Daniel,
> 
> Quoting Daniel Lezcano (2021-09-24 19:40:13)
>>
>> I've applied the patch 1/2 to the fixes branch and the patch 2/2 will
>> land in the next branch as soon as the next -rc is released with the fix
>> and merged to the next branch.
> 
> I don't see it in thermal/next even though patch 1 has made it. Not sure
> if patch 2 has slipped through the cracks or wasn't pushed yet. If it's
> the latter, please ignore this mail.

Indeed, I thought I picked it but it wasn't.

Thanks for the heads-up, it is applied now.

  -- D.
Antoine Tenart Oct. 21, 2021, 10:02 a.m. UTC | #6
Quoting Daniel Lezcano (2021-10-21 11:47:50)
> On 20/10/2021 15:38, Antoine Tenart wrote:
> > Quoting Daniel Lezcano (2021-09-24 19:40:13)
> >>
> >> I've applied the patch 1/2 to the fixes branch and the patch 2/2 will
> >> land in the next branch as soon as the next -rc is released with the fix
> >> and merged to the next branch.
> > 
> > I don't see it in thermal/next even though patch 1 has made it. Not sure
> > if patch 2 has slipped through the cracks or wasn't pushed yet. If it's
> > the latter, please ignore this mail.
> 
> Indeed, I thought I picked it but it wasn't.
> 
> Thanks for the heads-up, it is applied now.

Thanks!

Patch

diff --git a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
index 0f0038af2ad4..fb64acfd5e07 100644
--- a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
+++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
@@ -107,7 +107,7 @@  static int tcc_offset_update(unsigned int tcc)
 	return 0;
 }
 
-static unsigned int tcc_offset_save;
+static int tcc_offset_save = -1;
 
 static ssize_t tcc_offset_degree_celsius_store(struct device *dev,
 				struct device_attribute *attr, const char *buf,
@@ -352,7 +352,8 @@  int proc_thermal_resume(struct device *dev)
 	proc_dev = dev_get_drvdata(dev);
 	proc_thermal_read_ppcc(proc_dev);
 
-	tcc_offset_update(tcc_offset_save);
+	if (tcc_offset_save >= 0)
+		tcc_offset_update(tcc_offset_save);
 
 	return 0;
 }
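
One detail of the diff above is the type change from unsigned int to int,
which the -1 sentinel and the ">= 0" guard both rely on. With an unsigned
variable, -1 would wrap to UINT_MAX and the guard would always be true. The
snippet below is a minimal sketch of that point; should_restore(),
sentinel_if_unsigned() and sentinel_selfcheck() are hypothetical helpers
and not part of the driver.

```c
#include <assert.h>
#include <limits.h>

/* Guard as written in the patch: restore only non-negative saved values. */
int should_restore(int saved)
{
	return saved >= 0;
}

/* What the -1 sentinel would become had the variable stayed unsigned. */
unsigned int sentinel_if_unsigned(void)
{
	return (unsigned int)-1;
}

/* Checks the properties the patch relies on. */
void sentinel_selfcheck(void)
{
	assert(!should_restore(-1));	/* never stored: skip the restore */
	assert(should_restore(0));	/* 0 is a valid, restorable offset */
	assert(should_restore(5));	/* so is any positive offset */
	assert(sentinel_if_unsigned() == UINT_MAX);	/* -1 wraps when unsigned */
}
```

Because the guard only lets non-negative values through, passing the int
tcc_offset_save to tcc_offset_update(unsigned int tcc) converts it without
changing its value.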