Message ID | 12509184.O9o76ZdvQC@rjwysocki.net (mailing list archive) |
---|---|
State | Mainlined, archived |
Headers | show |
Series | [v1] thermal: trip: Avoid skipping trips in thermal_zone_set_trips() | expand |
On 30/07/2024 16:41, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Say there are 3 trip points A, B, C sorted in ascending temperature > order with no hysteresis. If the zone temerature is exactly equal to > B, thermal_zone_set_trips() will set the boundaries to A and C and the > hardware will not catch any crossing of B (either way) until either A > or C is crossed and the boundaries are changed. When the temperature is B, an interrupt fired which led to the thermal_zone_update() and in turn it calls thermal_zone_set_trips() As the current temperature is equal to trip B, it makes sense to set A and C, as B has been processes when handling the thermal trips right before calling thermal_zone_set_trips() I'm failing to understand the issue you describe Were you able to reproduce the issue with emul_temp ? > To avoid that, use non-strict inequalities when comparing the trip > threshold to the zone temperature in thermal_zone_set_trips(). > > In the example above, it will cause both boundaries to be set to B, > which is desirable because an interrupt will trigger when the zone > temperature becomes different from B regardless of which way it goes. > That will allow a new interval to be set depending on the direction of > the zone temperature change. > > Fixes: 893bae92237d ("thermal: trip: Make thermal_zone_set_trips() use trip thresholds") > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > > 6.11-rc material. > > --- > drivers/thermal/thermal_trip.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > Index: linux-pm/drivers/thermal/thermal_trip.c > =================================================================== > --- linux-pm.orig/drivers/thermal/thermal_trip.c > +++ linux-pm/drivers/thermal/thermal_trip.c > @@ -82,10 +82,10 @@ void thermal_zone_set_trips(struct therm > return; > > for_each_trip_desc(tz, td) { > - if (td->threshold < tz->temperature && td->threshold > low) > + if (td->threshold <= tz->temperature && td->threshold > low) > low = td->threshold; > > - if (td->threshold > tz->temperature && td->threshold < high) > + if (td->threshold >= tz->temperature && td->threshold < high) > high = td->threshold; > } > > > >
On Tue, Jul 30, 2024 at 4:57 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > On 30/07/2024 16:41, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > Say there are 3 trip points A, B, C sorted in ascending temperature > > order with no hysteresis. If the zone temerature is exactly equal to > > B, thermal_zone_set_trips() will set the boundaries to A and C and the > > hardware will not catch any crossing of B (either way) until either A > > or C is crossed and the boundaries are changed. > > When the temperature is B, an interrupt fired which led to the > thermal_zone_update() and in turn it calls thermal_zone_set_trips() > > As the current temperature is equal to trip B, it makes sense to set A > and C, as B has been processes when handling the thermal trips right > before calling thermal_zone_set_trips() So say that A, B and C are active trips and the thermal zone uses the Bang-bang governor. Say that each trip point has a fan associated with it, so that every time it is crossed on the way up, the fan should be turned on, and every time it is crossed on the way down, the fan should be turned off. Denote these fans as f_A, f_B, and f_C, respectively. Say the initial thermal zone temperature is below A, so the initial thermal_zone_set_trips() interval is {-INT_MAX, A}. The zone temperature starts to rise and A is reached, so an interrupt triggers. __thermal_zone_device_update() runs and it sees that the zone temperature is equal to A, so thermal_zone_set_trips() sets the new interval to {-INT_MAX, B} and f_A is turned on. Say the zone temperature is still rising, despite f_A being on, and B is reached. __thermal_zone_device_update() runs and it sees that the zone temperature is equal to B, so thermal_zone_set_trips() sets the new interval to {A, C} and f_B is turned on. Say the temperature rises somewhat above B, but does not reach C and starts to fall down. B is crossed on the way down, so f_B should be turned off, but nothing happens, because an interrupt will only trigger when A is reached. > I'm failing to understand the issue you describe I hope the above helps. > Were you able to reproduce the issue with emul_temp ? I haven't tried, but I'm sure I'd be able to reproduce it.
On Tue, Jul 30, 2024 at 6:46 PM Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Tue, Jul 30, 2024 at 4:57 PM Daniel Lezcano > <daniel.lezcano@linaro.org> wrote: > > > > On 30/07/2024 16:41, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > > > Say there are 3 trip points A, B, C sorted in ascending temperature > > > order with no hysteresis. If the zone temerature is exactly equal to > > > B, thermal_zone_set_trips() will set the boundaries to A and C and the > > > hardware will not catch any crossing of B (either way) until either A > > > or C is crossed and the boundaries are changed. > > > > When the temperature is B, an interrupt fired which led to the > > thermal_zone_update() and in turn it calls thermal_zone_set_trips() > > > > As the current temperature is equal to trip B, it makes sense to set A > > and C, as B has been processes when handling the thermal trips right > > before calling thermal_zone_set_trips() > > So say that A, B and C are active trips and the thermal zone uses the > Bang-bang governor. Say that each trip point has a fan associated > with it, so that every time it is crossed on the way up, the fan > should be turned on, and every time it is crossed on the way down, the > fan should be turned off. Denote these fans as f_A, f_B, and f_C, > respectively. > > Say the initial thermal zone temperature is below A, so the initial > thermal_zone_set_trips() interval is {-INT_MAX, A}. The zone > temperature starts to rise and A is reached, so an interrupt triggers. > __thermal_zone_device_update() runs and it sees that the zone > temperature is equal to A, so thermal_zone_set_trips() sets the new > interval to {-INT_MAX, B} and f_A is turned on. > > Say the zone temperature is still rising, despite f_A being on, and B > is reached. __thermal_zone_device_update() runs and it sees that the > zone temperature is equal to B, so thermal_zone_set_trips() sets the > new interval to {A, C} and f_B is turned on. > > Say the temperature rises somewhat above B, but does not reach C and > starts to fall down. B is crossed on the way down, so f_B should be > turned off, but nothing happens, because an interrupt will only > trigger when A is reached. Worse yet, if the zone temperature starts to fall down between A and B, after setting the thermal_zone_set_trips() interval to {-INT_MAX, B},,nothing will happen when A is crossed on the way down, and since the low boundary is clearly of the "don't care" type, f_A will not be turned off until B is reached AFAICS, which may be never. > > I'm failing to understand the issue you describe > > I hope the above helps. > > > Were you able to reproduce the issue with emul_temp ? > > I haven't tried, but I'm sure I'd be able to reproduce it.
On 30/07/2024 18:46, Rafael J. Wysocki wrote: > On Tue, Jul 30, 2024 at 4:57 PM Daniel Lezcano > <daniel.lezcano@linaro.org> wrote: >> >> On 30/07/2024 16:41, Rafael J. Wysocki wrote: >>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >>> >>> Say there are 3 trip points A, B, C sorted in ascending temperature >>> order with no hysteresis. If the zone temerature is exactly equal to >>> B, thermal_zone_set_trips() will set the boundaries to A and C and the >>> hardware will not catch any crossing of B (either way) until either A >>> or C is crossed and the boundaries are changed. >> >> When the temperature is B, an interrupt fired which led to the >> thermal_zone_update() and in turn it calls thermal_zone_set_trips() >> >> As the current temperature is equal to trip B, it makes sense to set A >> and C, as B has been processes when handling the thermal trips right >> before calling thermal_zone_set_trips() > > So say that A, B and C are active trips and the thermal zone uses the > Bang-bang governor. Say that each trip point has a fan associated > with it, so that every time it is crossed on the way up, the fan > should be turned on, and every time it is crossed on the way down, the > fan should be turned off. Denote these fans as f_A, f_B, and f_C, > respectively. > > Say the initial thermal zone temperature is below A, so the initial > thermal_zone_set_trips() interval is {-INT_MAX, A}. The zone > temperature starts to rise and A is reached, so an interrupt triggers. > __thermal_zone_device_update() runs and it sees that the zone > temperature is equal to A, so thermal_zone_set_trips() sets the new > interval to {-INT_MAX, B} and f_A is turned on. > > Say the zone temperature is still rising, despite f_A being on, and B > is reached. __thermal_zone_device_update() runs and it sees that the > zone temperature is equal to B, so thermal_zone_set_trips() sets the > new interval to {A, C} and f_B is turned on. > > Say the temperature rises somewhat above B, but does not reach C and > starts to fall down. B is crossed on the way down, so f_B should be > turned off, but nothing happens, because an interrupt will only > trigger when A is reached. > >> I'm failing to understand the issue you describe > > I hope the above helps. Yes, I understand now. This is really specific when it is with active trip points, so passive delay is not involved. Thanks for the clarification >> Were you able to reproduce the issue with emul_temp ? > > I haven't tried, but I'm sure I'd be able to reproduce it.
Index: linux-pm/drivers/thermal/thermal_trip.c =================================================================== --- linux-pm.orig/drivers/thermal/thermal_trip.c +++ linux-pm/drivers/thermal/thermal_trip.c @@ -82,10 +82,10 @@ void thermal_zone_set_trips(struct therm return; for_each_trip_desc(tz, td) { - if (td->threshold < tz->temperature && td->threshold > low) + if (td->threshold <= tz->temperature && td->threshold > low) low = td->threshold; - if (td->threshold > tz->temperature && td->threshold < high) + if (td->threshold >= tz->temperature && td->threshold < high) high = td->threshold; }