diff mbox series

[3/3] thermal/core: Fix thermal trip cross point

Message ID 20220707214513.1133506-3-daniel.lezcano@linaro.org (mailing list archive)
State Superseded, archived
Headers show
Series [1/3] thermal/core: Encapsulate the trip point crossed function | expand

Commit Message

Daniel Lezcano July 7, 2022, 9:45 p.m. UTC
The routine doing trip point crossing the way up or down is actually
wrong.

A trip point is composed with a trip temperature and a hysteresis.

The trip temperature is used to detect when the trip point is crossed
the way up.

The trip temperature minus the hysteresis is used to detect when the
trip point is crossed the way down.

|-----------low--------high------------|
             |<--------->|
             |    hyst   |
             |           |
             |          -|--> crossed the way up
             |
         <---|-- crossed the way down

For that, there is a two point comparison: the current temperature and
the previous temperature.

The actual code assumes if the current temperature is greater than the
trip temperature and the previous temperature was lesser, then the
trip point is crossed the way up. That is true only if we crossed the
way down the low temperature boundary from the previous temperature or
if the hysteresis is zero. The temperature can decrease between the
low and high, so the trip point is not crossed the way down and then
increase again and cross the high temperature raising a new trip point
crossed detection which is incorrect. The same scenario happens when
crossing the way down.

The trip point crossing the way up and down must act as parenthesis, a
trip point down must close a trip point up. Today we have multiple
trip point up without the corresponding trip point down.

In order to fix that, we store the previous trip point which gives the
information about the previous trip.

As a sidenote, the thermal_zone_device structure has already the
prev_trip_low and prev_trip_high information which are used by the
thermal_zone_set_trips() function. This one can be changed to be
triggered by the trip temperature crossing function, which makes more
sense, and the two fields will disappear.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/thermal/thermal_core.c | 32 ++++++++++++++++++++++----------
 include/linux/thermal.h        |  2 ++
 2 files changed, 24 insertions(+), 10 deletions(-)

Comments

Zhang Rui July 8, 2022, 3:56 a.m. UTC | #1
On Thu, 2022-07-07 at 23:45 +0200, Daniel Lezcano wrote:
> The routine doing trip point crossing the way up or down is actually
> wrong.
> 
> A trip point is composed with a trip temperature and a hysteresis.
> 
> The trip temperature is used to detect when the trip point is crossed
> the way up.
> 
> The trip temperature minus the hysteresis is used to detect when the
> trip point is crossed the way down.
> 
> > -----------low--------high------------|
> 
>              |<--------->|
>              |    hyst   |
>              |           |
>              |          -|--> crossed the way up
>              |
>          <---|-- crossed the way down
> 
> For that, there is a two point comparison: the current temperature
> and
> the previous temperature.
> 
> The actual code assumes if the current temperature is greater than
> the
> trip temperature and the previous temperature was lesser, then the
> trip point is crossed the way up. That is true only if we crossed the
> way down the low temperature boundary from the previous temperature
> or
> if the hysteresis is zero. The temperature can decrease between the
> low and high, so the trip point is not crossed the way down and then
> increase again and cross the high temperature raising a new trip
> point
> crossed detection which is incorrect. The same scenario happens when
> crossing the way down.
> 
> The trip point crossing the way up and down must act as parenthesis,
> a
> trip point down must close a trip point up. Today we have multiple
> trip point up without the corresponding trip point down.
> 
> In order to fix that, we store the previous trip point which gives
> the
> information about the previous trip.
> 
> As a sidenote, the thermal_zone_device structure has already the
> prev_trip_low and prev_trip_high information which are used by the
> thermal_zone_set_trips() function. This one can be changed to be
> triggered by the trip temperature crossing function, which makes more
> sense, and the two fields will disappear.
> 
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
>  drivers/thermal/thermal_core.c | 32 ++++++++++++++++++++++----------
>  include/linux/thermal.h        |  2 ++
>  2 files changed, 24 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_core.c
> b/drivers/thermal/thermal_core.c
> index f66036b3daae..92bc9ddb6904 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -357,19 +357,30 @@ static void handle_critical_trips(struct
> thermal_zone_device *tz,
>  static void handle_thermal_trip_crossed(struct thermal_zone_device
> *tz, int trip,
>  					int trip_temp, int trip_hyst,
> enum thermal_trip_type trip_type)
>  {
> +	int trip_low_temp = trip_temp - trip_hyst;
> +
>  	if (tz->last_temperature == THERMAL_TEMP_INVALID)
>  		return;
>  
> -	if (tz->last_temperature < trip_temp &&
> -	    tz->temperature >= trip_temp) {
> -		thermal_notify_tz_trip_up(tz->id, trip,
> -					  tz->temperature);
> -	}
> -
> -	if (tz->last_temperature >= trip_temp &&
> -	    tz->temperature < (trip_temp - trip_hyst)) {
> -		thermal_notify_tz_trip_down(tz->id, trip,
> -					    tz->temperature);
> +	/*
> +	 * Due to the hysteresis, a third information is needed to
> +	 * detect when the temperature is wavering between the
> +	 * trip_low_temp and the trip_temp. A trip point is crossed
> +	 * the way up only if the temperature is above it while the
> +	 * previous temperature was below *and* we crossed the
> +	 * trip_temp_low before. The previous trip point give us the
> +	 * previous trip point transition. The similar problem exists
> +	 * when crossing the way down.
> +	 */
> +	if (tz->last_temperature < trip_temp && tz->temperature >=
> trip_temp &&
> +	    trip != tz->prev_trip) {
> +		thermal_notify_tz_trip_up(tz->id, trip, tz-
> >temperature);
> +		tz->prev_trip = trip;
> +		
> +	} else if (tz->last_temperature >= trip_low_temp && tz-
> >temperature < trip_low_temp &&
> +	    trip == tz->prev_trip) {
> +		thermal_notify_tz_trip_down(tz->id, trip, tz-
> >temperature);
> +		tz->prev_trip = trip - 1;

Say, let's assume hysteresis is Zero,
When the temperature increases and we do thermal_notify_tz_trip_up()
for trip 0 and trip 1, tz->prev_trip is set to 1 in this case.
And then the temperature drops below trip 0, we don't have chance to do
thermal_notify_tz_trip_down() for trip 0, because we always handle the
trips in ascending order, and tz->prev_trip is 1 when we do
handle_thermal_trip(0).

thanks,
rui

>  	}
>  }
>  
> @@ -427,6 +438,7 @@ static void thermal_zone_device_init(struct
> thermal_zone_device *tz)
>  {
>  	struct thermal_instance *pos;
>  	tz->temperature = THERMAL_TEMP_INVALID;
> +	tz->prev_trip = -1;
>  	tz->prev_low_trip = -INT_MAX;
>  	tz->prev_high_trip = INT_MAX;
>  	list_for_each_entry(pos, &tz->thermal_instances, tz_node)
> diff --git a/include/linux/thermal.h b/include/linux/thermal.h
> index 231bac2768fb..5b3bfb902d10 100644
> --- a/include/linux/thermal.h
> +++ b/include/linux/thermal.h
> @@ -124,6 +124,7 @@ struct thermal_cooling_device {
>   * @last_temperature:	previous temperature read
>   * @emul_temperature:	emulated temperature when using
> CONFIG_THERMAL_EMULATION
>   * @passive:		1 if you've crossed a passive trip point, 0
> otherwise.
> + * @prev_trip:		previous trip point the thermal zone
> was, -1 if below all of them
>   * @prev_low_trip:	the low current temperature if you've crossed a
> passive
>  			trip point.
>   * @prev_high_trip:	the above current temperature if you've crossed
> a
> @@ -159,6 +160,7 @@ struct thermal_zone_device {
>  	int last_temperature;
>  	int emul_temperature;
>  	int passive;
> +	int prev_trip;
>  	int prev_low_trip;
>  	int prev_high_trip;
>  	atomic_t need_update;
Daniel Lezcano July 8, 2022, 10:15 a.m. UTC | #2
Hi Zhang,

thanks for reviewing

On 08/07/2022 05:56, Zhang Rui wrote:
> On Thu, 2022-07-07 at 23:45 +0200, Daniel Lezcano wrote:
>> The routine doing trip point crossing the way up or down is actually
>> wrong.
>>
>> A trip point is composed with a trip temperature and a hysteresis.
>>
>> The trip temperature is used to detect when the trip point is crossed
>> the way up.
>>
>> The trip temperature minus the hysteresis is used to detect when the
>> trip point is crossed the way down.
>>
>>> -----------low--------high------------|
>>
>>               |<--------->|
>>               |    hyst   |
>>               |           |
>>               |          -|--> crossed the way up
>>               |
>>           <---|-- crossed the way down
>>
>> For that, there is a two point comparison: the current temperature
>> and
>> the previous temperature.
>>
>> The actual code assumes if the current temperature is greater than
>> the
>> trip temperature and the previous temperature was lesser, then the
>> trip point is crossed the way up. That is true only if we crossed the
>> way down the low temperature boundary from the previous temperature
>> or
>> if the hysteresis is zero. The temperature can decrease between the
>> low and high, so the trip point is not crossed the way down and then
>> increase again and cross the high temperature raising a new trip
>> point
>> crossed detection which is incorrect. The same scenario happens when
>> crossing the way down.
>>
>> The trip point crossing the way up and down must act as parenthesis,
>> a
>> trip point down must close a trip point up. Today we have multiple
>> trip point up without the corresponding trip point down.
>>
>> In order to fix that, we store the previous trip point which gives
>> the
>> information about the previous trip.
>>
>> As a sidenote, the thermal_zone_device structure has already the
>> prev_trip_low and prev_trip_high information which are used by the
>> thermal_zone_set_trips() function. This one can be changed to be
>> triggered by the trip temperature crossing function, which makes more
>> sense, and the two fields will disappear.
>>
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>> ---
>>   drivers/thermal/thermal_core.c | 32 ++++++++++++++++++++++----------
>>   include/linux/thermal.h        |  2 ++
>>   2 files changed, 24 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/thermal/thermal_core.c
>> b/drivers/thermal/thermal_core.c
>> index f66036b3daae..92bc9ddb6904 100644
>> --- a/drivers/thermal/thermal_core.c
>> +++ b/drivers/thermal/thermal_core.c
>> @@ -357,19 +357,30 @@ static void handle_critical_trips(struct
>> thermal_zone_device *tz,
>>   static void handle_thermal_trip_crossed(struct thermal_zone_device
>> *tz, int trip,
>>   					int trip_temp, int trip_hyst,
>> enum thermal_trip_type trip_type)
>>   {
>> +	int trip_low_temp = trip_temp - trip_hyst;
>> +
>>   	if (tz->last_temperature == THERMAL_TEMP_INVALID)
>>   		return;
>>   
>> -	if (tz->last_temperature < trip_temp &&
>> -	    tz->temperature >= trip_temp) {
>> -		thermal_notify_tz_trip_up(tz->id, trip,
>> -					  tz->temperature);
>> -	}
>> -
>> -	if (tz->last_temperature >= trip_temp &&
>> -	    tz->temperature < (trip_temp - trip_hyst)) {
>> -		thermal_notify_tz_trip_down(tz->id, trip,
>> -					    tz->temperature);
>> +	/*
>> +	 * Due to the hysteresis, a third information is needed to
>> +	 * detect when the temperature is wavering between the
>> +	 * trip_low_temp and the trip_temp. A trip point is crossed
>> +	 * the way up only if the temperature is above it while the
>> +	 * previous temperature was below *and* we crossed the
>> +	 * trip_temp_low before. The previous trip point give us the
>> +	 * previous trip point transition. The similar problem exists
>> +	 * when crossing the way down.
>> +	 */
>> +	if (tz->last_temperature < trip_temp && tz->temperature >=
>> trip_temp &&
>> +	    trip != tz->prev_trip) {
>> +		thermal_notify_tz_trip_up(tz->id, trip, tz-
>>> temperature);
>> +		tz->prev_trip = trip;
>> +		
>> +	} else if (tz->last_temperature >= trip_low_temp && tz-
>>> temperature < trip_low_temp &&
>> +	    trip == tz->prev_trip) {
>> +		thermal_notify_tz_trip_down(tz->id, trip, tz-
>>> temperature);
>> +		tz->prev_trip = trip - 1;
> 
> Say, let's assume hysteresis is Zero,
> When the temperature increases and we do thermal_notify_tz_trip_up()
> for trip 0 and trip 1, tz->prev_trip is set to 1 in this case.
> And then the temperature drops below trip 0, we don't have chance to do
> thermal_notify_tz_trip_down() for trip 0, because we always handle the
> trips in ascending order, and tz->prev_trip is 1 when we do
> handle_thermal_trip(0).

Well I don't see how to handle this case, except accepting the detection 
will happen at the next temperature update, no ?
Daniel Lezcano July 8, 2022, 10:48 a.m. UTC | #3
On 08/07/2022 05:56, Zhang Rui wrote:

[ ... ]

>> +	if (tz->last_temperature < trip_temp && tz->temperature >=
>> trip_temp &&
>> +	    trip != tz->prev_trip) {
>> +		thermal_notify_tz_trip_up(tz->id, trip, tz-
>>> temperature);
>> +		tz->prev_trip = trip;
>> +		
>> +	} else if (tz->last_temperature >= trip_low_temp && tz-
>>> temperature < trip_low_temp &&
>> +	    trip == tz->prev_trip) {
>> +		thermal_notify_tz_trip_down(tz->id, trip, tz-
>>> temperature);
>> +		tz->prev_trip = trip - 1;
> 
> Say, let's assume hysteresis is Zero,
> When the temperature increases and we do thermal_notify_tz_trip_up()
> for trip 0 and trip 1, tz->prev_trip is set to 1 in this case.
> And then the temperature drops below trip 0, we don't have chance to do
> thermal_notify_tz_trip_down() for trip 0, because we always handle the
> trips in ascending order, and tz->prev_trip is 1 when we do
> handle_thermal_trip(0).
Well actually you are right, I reproduced the scenario with temperature 
emulation. I'll investigate how to handle this case.

Thanks for spotting this



<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
diff mbox series

Patch

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index f66036b3daae..92bc9ddb6904 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -357,19 +357,30 @@  static void handle_critical_trips(struct thermal_zone_device *tz,
 static void handle_thermal_trip_crossed(struct thermal_zone_device *tz, int trip,
 					int trip_temp, int trip_hyst, enum thermal_trip_type trip_type)
 {
+	int trip_low_temp = trip_temp - trip_hyst;
+
 	if (tz->last_temperature == THERMAL_TEMP_INVALID)
 		return;
 
-	if (tz->last_temperature < trip_temp &&
-	    tz->temperature >= trip_temp) {
-		thermal_notify_tz_trip_up(tz->id, trip,
-					  tz->temperature);
-	}
-
-	if (tz->last_temperature >= trip_temp &&
-	    tz->temperature < (trip_temp - trip_hyst)) {
-		thermal_notify_tz_trip_down(tz->id, trip,
-					    tz->temperature);
+	/*
+	 * Due to the hysteresis, a third information is needed to
+	 * detect when the temperature is wavering between the
+	 * trip_low_temp and the trip_temp. A trip point is crossed
+	 * the way up only if the temperature is above it while the
+	 * previous temperature was below *and* we crossed the
+	 * trip_temp_low before. The previous trip point give us the
+	 * previous trip point transition. The similar problem exists
+	 * when crossing the way down.
+	 */
+	if (tz->last_temperature < trip_temp && tz->temperature >= trip_temp &&
+	    trip != tz->prev_trip) {
+		thermal_notify_tz_trip_up(tz->id, trip, tz->temperature);
+		tz->prev_trip = trip;
+		
+	} else if (tz->last_temperature >= trip_low_temp && tz->temperature < trip_low_temp &&
+	    trip == tz->prev_trip) {
+		thermal_notify_tz_trip_down(tz->id, trip, tz->temperature);
+		tz->prev_trip = trip - 1;
 	}
 }
 
@@ -427,6 +438,7 @@  static void thermal_zone_device_init(struct thermal_zone_device *tz)
 {
 	struct thermal_instance *pos;
 	tz->temperature = THERMAL_TEMP_INVALID;
+	tz->prev_trip = -1;
 	tz->prev_low_trip = -INT_MAX;
 	tz->prev_high_trip = INT_MAX;
 	list_for_each_entry(pos, &tz->thermal_instances, tz_node)
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 231bac2768fb..5b3bfb902d10 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -124,6 +124,7 @@  struct thermal_cooling_device {
  * @last_temperature:	previous temperature read
  * @emul_temperature:	emulated temperature when using CONFIG_THERMAL_EMULATION
  * @passive:		1 if you've crossed a passive trip point, 0 otherwise.
+ * @prev_trip:		previous trip point the thermal zone was, -1 if below all of them
  * @prev_low_trip:	the low current temperature if you've crossed a passive
 			trip point.
  * @prev_high_trip:	the above current temperature if you've crossed a
@@ -159,6 +160,7 @@  struct thermal_zone_device {
 	int last_temperature;
 	int emul_temperature;
 	int passive;
+	int prev_trip;
 	int prev_low_trip;
 	int prev_high_trip;
 	atomic_t need_update;