diff mbox series

[v1] thermal: trip: Avoid skipping trips in thermal_zone_set_trips()

Message ID 12509184.O9o76ZdvQC@rjwysocki.net (mailing list archive)
State In Next
Delegated to: Rafael Wysocki
Headers show
Series [v1] thermal: trip: Avoid skipping trips in thermal_zone_set_trips() | expand

Commit Message

Rafael J. Wysocki July 30, 2024, 2:41 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Say there are 3 trip points A, B, C sorted in ascending temperature
order with no hysteresis.  If the zone temerature is exactly equal to
B, thermal_zone_set_trips() will set the boundaries to A and C and the
hardware will not catch any crossing of B (either way) until either A
or C is crossed and the boundaries are changed.

To avoid that, use non-strict inequalities when comparing the trip
threshold to the zone temperature in thermal_zone_set_trips().

In the example above, it will cause both boundaries to be set to B,
which is desirable because an interrupt will trigger when the zone
temperature becomes different from B regardless of which way it goes.
That will allow a new interval to be set depending on the direction of
the zone temperature change.

Fixes: 893bae92237d ("thermal: trip: Make thermal_zone_set_trips() use trip thresholds")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

6.11-rc material.

---
 drivers/thermal/thermal_trip.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Daniel Lezcano July 30, 2024, 2:57 p.m. UTC | #1
On 30/07/2024 16:41, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Say there are 3 trip points A, B, C sorted in ascending temperature
> order with no hysteresis.  If the zone temerature is exactly equal to
> B, thermal_zone_set_trips() will set the boundaries to A and C and the
> hardware will not catch any crossing of B (either way) until either A
> or C is crossed and the boundaries are changed.

When the temperature is B, an interrupt fired which led to the 
thermal_zone_update() and in turn it calls thermal_zone_set_trips()

As the current temperature is equal to trip B, it makes sense to set A 
and C, as B has been processes when handling the thermal trips right 
before calling thermal_zone_set_trips()

I'm failing to understand the issue you describe

Were you able to reproduce the issue with emul_temp ?


> To avoid that, use non-strict inequalities when comparing the trip
> threshold to the zone temperature in thermal_zone_set_trips().
> 
> In the example above, it will cause both boundaries to be set to B,
> which is desirable because an interrupt will trigger when the zone
> temperature becomes different from B regardless of which way it goes.
> That will allow a new interval to be set depending on the direction of
> the zone temperature change.
> 
> Fixes: 893bae92237d ("thermal: trip: Make thermal_zone_set_trips() use trip thresholds")
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> 6.11-rc material.
> 
> ---
>   drivers/thermal/thermal_trip.c |    4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> Index: linux-pm/drivers/thermal/thermal_trip.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/thermal_trip.c
> +++ linux-pm/drivers/thermal/thermal_trip.c
> @@ -82,10 +82,10 @@ void thermal_zone_set_trips(struct therm
>   		return;
>   
>   	for_each_trip_desc(tz, td) {
> -		if (td->threshold < tz->temperature && td->threshold > low)
> +		if (td->threshold <= tz->temperature && td->threshold > low)
>   			low = td->threshold;
>   
> -		if (td->threshold > tz->temperature && td->threshold < high)
> +		if (td->threshold >= tz->temperature && td->threshold < high)
>   			high = td->threshold;
>   	}
>   
> 
> 
>
Rafael J. Wysocki July 30, 2024, 4:46 p.m. UTC | #2
On Tue, Jul 30, 2024 at 4:57 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
> On 30/07/2024 16:41, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Say there are 3 trip points A, B, C sorted in ascending temperature
> > order with no hysteresis.  If the zone temerature is exactly equal to
> > B, thermal_zone_set_trips() will set the boundaries to A and C and the
> > hardware will not catch any crossing of B (either way) until either A
> > or C is crossed and the boundaries are changed.
>
> When the temperature is B, an interrupt fired which led to the
> thermal_zone_update() and in turn it calls thermal_zone_set_trips()
>
> As the current temperature is equal to trip B, it makes sense to set A
> and C, as B has been processes when handling the thermal trips right
> before calling thermal_zone_set_trips()

So say that A, B and C are active trips and the thermal zone uses the
Bang-bang governor.  Say that each trip point has a fan associated
with it, so that every time it is crossed on the way up, the fan
should be turned on, and every time it is crossed on the way down, the
fan should be turned off.  Denote these fans as f_A, f_B, and f_C,
respectively.

Say the initial thermal zone temperature is below A, so the initial
thermal_zone_set_trips() interval is {-INT_MAX, A}.  The zone
temperature starts to rise and A is reached, so an interrupt triggers.
__thermal_zone_device_update() runs and it sees that the zone
temperature is equal to A, so thermal_zone_set_trips() sets the new
interval to {-INT_MAX, B} and f_A is turned on.

Say the zone temperature is still rising, despite f_A being on, and B
is reached.   __thermal_zone_device_update() runs and it sees that the
zone temperature is equal to B, so thermal_zone_set_trips() sets the
new interval to {A, C} and f_B is turned on.

Say the temperature rises somewhat above B, but does not reach C and
starts to fall down.  B is crossed on the way down, so f_B should be
turned off, but nothing happens, because an interrupt will only
trigger when A is reached.

> I'm failing to understand the issue you describe

I hope the above helps.

> Were you able to reproduce the issue with emul_temp ?

I haven't tried, but I'm sure I'd be able to reproduce it.
Rafael J. Wysocki July 30, 2024, 8:17 p.m. UTC | #3
On Tue, Jul 30, 2024 at 6:46 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Jul 30, 2024 at 4:57 PM Daniel Lezcano
> <daniel.lezcano@linaro.org> wrote:
> >
> > On 30/07/2024 16:41, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > Say there are 3 trip points A, B, C sorted in ascending temperature
> > > order with no hysteresis.  If the zone temerature is exactly equal to
> > > B, thermal_zone_set_trips() will set the boundaries to A and C and the
> > > hardware will not catch any crossing of B (either way) until either A
> > > or C is crossed and the boundaries are changed.
> >
> > When the temperature is B, an interrupt fired which led to the
> > thermal_zone_update() and in turn it calls thermal_zone_set_trips()
> >
> > As the current temperature is equal to trip B, it makes sense to set A
> > and C, as B has been processes when handling the thermal trips right
> > before calling thermal_zone_set_trips()
>
> So say that A, B and C are active trips and the thermal zone uses the
> Bang-bang governor.  Say that each trip point has a fan associated
> with it, so that every time it is crossed on the way up, the fan
> should be turned on, and every time it is crossed on the way down, the
> fan should be turned off.  Denote these fans as f_A, f_B, and f_C,
> respectively.
>
> Say the initial thermal zone temperature is below A, so the initial
> thermal_zone_set_trips() interval is {-INT_MAX, A}.  The zone
> temperature starts to rise and A is reached, so an interrupt triggers.
> __thermal_zone_device_update() runs and it sees that the zone
> temperature is equal to A, so thermal_zone_set_trips() sets the new
> interval to {-INT_MAX, B} and f_A is turned on.
>
> Say the zone temperature is still rising, despite f_A being on, and B
> is reached.   __thermal_zone_device_update() runs and it sees that the
> zone temperature is equal to B, so thermal_zone_set_trips() sets the
> new interval to {A, C} and f_B is turned on.
>
> Say the temperature rises somewhat above B, but does not reach C and
> starts to fall down.  B is crossed on the way down, so f_B should be
> turned off, but nothing happens, because an interrupt will only
> trigger when A is reached.

Worse yet, if the zone temperature starts to fall down between A and
B, after setting the thermal_zone_set_trips() interval to {-INT_MAX,
B},,nothing will happen when A is crossed on the way down, and since
the low boundary is clearly of the "don't care" type, f_A will not be
turned off until B is reached AFAICS, which may be never.

> > I'm failing to understand the issue you describe
>
> I hope the above helps.
>
> > Were you able to reproduce the issue with emul_temp ?
>
> I haven't tried, but I'm sure I'd be able to reproduce it.
Daniel Lezcano July 31, 2024, 10:08 a.m. UTC | #4
On 30/07/2024 18:46, Rafael J. Wysocki wrote:
> On Tue, Jul 30, 2024 at 4:57 PM Daniel Lezcano
> <daniel.lezcano@linaro.org> wrote:
>>
>> On 30/07/2024 16:41, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> Say there are 3 trip points A, B, C sorted in ascending temperature
>>> order with no hysteresis.  If the zone temerature is exactly equal to
>>> B, thermal_zone_set_trips() will set the boundaries to A and C and the
>>> hardware will not catch any crossing of B (either way) until either A
>>> or C is crossed and the boundaries are changed.
>>
>> When the temperature is B, an interrupt fired which led to the
>> thermal_zone_update() and in turn it calls thermal_zone_set_trips()
>>
>> As the current temperature is equal to trip B, it makes sense to set A
>> and C, as B has been processes when handling the thermal trips right
>> before calling thermal_zone_set_trips()
> 
> So say that A, B and C are active trips and the thermal zone uses the
> Bang-bang governor.  Say that each trip point has a fan associated
> with it, so that every time it is crossed on the way up, the fan
> should be turned on, and every time it is crossed on the way down, the
> fan should be turned off.  Denote these fans as f_A, f_B, and f_C,
> respectively.
> 
> Say the initial thermal zone temperature is below A, so the initial
> thermal_zone_set_trips() interval is {-INT_MAX, A}.  The zone
> temperature starts to rise and A is reached, so an interrupt triggers.
> __thermal_zone_device_update() runs and it sees that the zone
> temperature is equal to A, so thermal_zone_set_trips() sets the new
> interval to {-INT_MAX, B} and f_A is turned on.
> 
> Say the zone temperature is still rising, despite f_A being on, and B
> is reached.   __thermal_zone_device_update() runs and it sees that the
> zone temperature is equal to B, so thermal_zone_set_trips() sets the
> new interval to {A, C} and f_B is turned on.
> 
> Say the temperature rises somewhat above B, but does not reach C and
> starts to fall down.  B is crossed on the way down, so f_B should be
> turned off, but nothing happens, because an interrupt will only
> trigger when A is reached.
> 
>> I'm failing to understand the issue you describe
> 
> I hope the above helps.

Yes, I understand now. This is really specific when it is with active 
trip points, so passive delay is not involved.

Thanks for the clarification


>> Were you able to reproduce the issue with emul_temp ?
> 
> I haven't tried, but I'm sure I'd be able to reproduce it.
diff mbox series

Patch

Index: linux-pm/drivers/thermal/thermal_trip.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_trip.c
+++ linux-pm/drivers/thermal/thermal_trip.c
@@ -82,10 +82,10 @@  void thermal_zone_set_trips(struct therm
 		return;
 
 	for_each_trip_desc(tz, td) {
-		if (td->threshold < tz->temperature && td->threshold > low)
+		if (td->threshold <= tz->temperature && td->threshold > low)
 			low = td->threshold;
 
-		if (td->threshold > tz->temperature && td->threshold < high)
+		if (td->threshold >= tz->temperature && td->threshold < high)
 			high = td->threshold;
 	}