Message ID | 20241022155147.463475-1-daniel.lezcano@linaro.org (mailing list archive) |
---|---|
Headers | show |
Series | Add thermal user thresholds support | expand |
Please note, I did not fixed the locking issue assuming it will be fixed when the series is applied as stated in v5. As soon as the series is merged, I will send a couple of changes, one to fix the memory leak and the other one to fix the warning reported by syszbot Thanks -- D. On 22/10/2024 17:51, Daniel Lezcano wrote: > The trip points are a firmware description of the temperature limits > of a specific thermal zone where we associate an action which is done > by the kernel. The time resolution is low. > > The userspace has to deal with a more complex thermal management based > on heuristics from different information coming from different > places. The logic is much more complex but based on a bigger time > resolution, usually one second based. > > The purpose of the userspace is to monitor the temperatures from > different places and take actions. However, it can not be constantly > reading the temperature to detect when a temperature threshold has > been reached. This is especially bad for mobile or embedded system as > that will lead to an unacceptable number of wakeup to check the > temperature with nothing to do. > > On the other side, the sensors are now most of the time interrupt > driven. That means the thermal framework will use the temperature trip > points to program the sensor to trigger an interrupt when a > temperature limit is crossed. > > Unfortunately, the userspace can not benefit this feature and current > solutions found here and there, iow out-of-tree, are to add fake trip > points in the firmware and enable the writable trip points. > > This is bad for different reasons, the trip points are for in-kernel > actions, the semantic of their types is used by the thermal framework > and by adding trip points in the device tree is a way to overcome the > current limitation but tampering with how the thermal framework is > supposed to work. The writable trip points is a way to adjust a > temperature limit given a specific platform if the firmware is not > accurate enough and TBH it is more a debug feature from my POV. > > The user thresholds mechanism is a way to have the userspace to tell > thermal framework to send a notification when a temperature limit is > crossed. There is no id, no hysteresis, just the temperature and the > direction of the limit crossing. That means we can be notified when a > temperature threshold is crossed the way up only, or the way down only > or both ways. That allows to create hysteresis values if it is needed. > > Those thresholds are refered as user thresholds in order to do the > difference with the trip points which are similar. > > An user threshold can be added, deleted or flushed. The latter means > all user thresholds belonging to a thermal zone will be deleted. > > When one or several user thresholds are crossed, an event is sent to > the userspace. > > All aforementioned actions and events lead to a notification to the > userspace. > > Along with the kernel changes, the thermal library has been extended > to provide the different API to deal with the new user threshold > netlink events and commands. > > In addition, the thermal-engine skeleton uses these new API by > flushing and adding user thresholds as well as getting the > notification about these actions. > > Overall the series has been tested with the thermal-engine skeleton > and some selftests which are not part of this series. > > Changelog: > V6: > > - Added a comment in the parse_threshold_get() function to clarify > how the size is computed (Lukasz) > > - Changed comparison to check against non zero value in the > __thermal_zone_discover() function (Lukasz) > > V5: > - Added CAP_SYS_ADMIN needed capability when adding, deleting and > flushing a threshold (Rafael) > > - Remove the pid information to prevent leaking pid inside > containers. Also the information is not really needed (Rafael) > > - Renamed "THERMAL_GENL_ATTR_THRESHOLD_WAY" to > "THERMAL_GENL_ATTR_THRESHOLD_DIRECTION". Did not used '*_DIR' as > suggested initially because it can be ambiguous with 'directory' > (Rafael) > > - Renamed 'last_temp' to 'prev_temp' (Rafael) > > - Used CLASS constructor/destructor to get / put the thermal > zone's device refcount (Rafael) > > - Moved locking inside thermal_thresholds_for_each() (Rafael) > > - Reflected the changes above in the thermal library and the > thermal engine skeleton > > > V4: > - Fix missing stubs when THERMAL_NETLINK=n (kernel test robot) > > V3: > - the first patch of the v2 series has been merged > > - Modified the description to split the information between the > cover letter and the patch 1 description (Rafael) > > - Made the thresholds code as part of the core (Rafael) > > - Converted the thresholds into a list and directly declared in > the thermal zone device structure (Rafael) > > - Changed the name of the field in the thermal zone device > structure to user_thresholds (Rafael) > > - Added #include "thermal_thresholds.h" (Rafael) > > - Combined the conditions in the function > __thermal_threshold_is_crossed (Rafael) > > - Moved the function thermal_thresholds_flush() before > thermal_thresholds_exit() (Rafael) > > - Change thermal_thresholds_handle() to return void (Rafael) > > - Move the list field on top the of the structure threshold and > renamed it list_node (Rafael) > > - Changed THERMAL_THRESHOLD_* notifications to > THERMAL_TZ_THRESHOLD_* (Rafael) > > V2: > - Compute min and max in thermal_zone_device_update() but keep > the loop as it is (Rafael) > > - Include slab.h to fix compilation warnings on some architectures > with kmalloc and kfree (kernel test robot) > > Daniel Lezcano (5): > thermal: core: Connect the threshold with the core > thermal/netlink: Add the commands and the events for the thresholds > tools/lib/thermal: Make more generic the command encoding function > tools/lib/thermal: Add the threshold netlink ABI > tools/thermal/thermal-engine: Take into account the thresholds API > > drivers/thermal/thermal_core.c | 9 + > drivers/thermal/thermal_netlink.c | 236 +++++++++++++++++- > drivers/thermal/thermal_netlink.h | 34 +++ > drivers/thermal/thermal_thresholds.c | 36 +-- > drivers/thermal/thermal_thresholds.h | 2 +- > include/uapi/linux/thermal.h | 27 +- > tools/lib/thermal/commands.c | 177 ++++++++++++- > tools/lib/thermal/events.c | 55 +++- > tools/lib/thermal/include/thermal.h | 40 +++ > tools/lib/thermal/libthermal.map | 5 + > tools/lib/thermal/thermal.c | 17 ++ > tools/thermal/lib/Makefile | 2 +- > tools/thermal/thermal-engine/thermal-engine.c | 105 +++++++- > 13 files changed, 681 insertions(+), 64 deletions(-) >
On Tue, Oct 22, 2024 at 5:54 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > > Please note, I did not fixed the locking issue assuming it will be fixed > when the series is applied as stated in v5. Yes. Applied now. Note that I have skipped patch [1/5] because it has been applied already before. It is on my bleeding-edge branch ATM since the commit IDs may still change. > As soon as the series is merged, I will send a couple of changes, one to > fix the memory leak and the other one to fix the warning reported by syszbot It should get to linux-next and thermal in a couple of days. Thanks!
Great thanks ! On 23/10/2024 12:31, Rafael J. Wysocki wrote: > On Tue, Oct 22, 2024 at 5:54 PM Daniel Lezcano > <daniel.lezcano@linaro.org> wrote: >> >> >> Please note, I did not fixed the locking issue assuming it will be fixed >> when the series is applied as stated in v5. > > Yes. > > Applied now. Note that I have skipped patch [1/5] because it has been > applied already before. > > It is on my bleeding-edge branch ATM since the commit IDs may still change. > >> As soon as the series is merged, I will send a couple of changes, one to >> fix the memory leak and the other one to fix the warning reported by syszbot > > It should get to linux-next and thermal in a couple of days. > > Thanks!