thermal/core: Introduce user trip points

Message ID 20240627085451.3813989-1-daniel.lezcano@linaro.org (mailing list archive)
State Changes Requested, archived

Commit Message

Daniel Lezcano June 27, 2024, 8:54 a.m. UTC
Currently the thermal framework has 4 trip point types:

- active : basically for fans (or anything requiring energy to cool
  down)

- passive : a performance limiter

- hot : for a last action before reaching critical

- critical : a point-of-no-return threshold leading to a system shutdown

A thermal zone monitors the temperature against these trip points.
The old way to do that is to actively poll the temperature, which is
very bad for embedded systems, especially mobile ones, and it is even
worse today as a system can have more than fifty thermal zones. The
modern way is to rely on the driver to send an interrupt when a trip
point is crossed, so the system can sleep while the temperature
monitoring is offloaded to dedicated hardware.

However, thermal management is also done from userspace to protect
the user, especially by tracking the skin temperature sensor. The
logic is more complex than what is found in the kernel because it
needs multiple sources indicating the thermal situation of the entire
system.

For this reason it needs to set up trip points at different levels in
order to be informed about what is going on in some thermal zones
when running specific applications.

For instance, the skin temperature must be limited to 43°C in the long
run but can go up to 48°C for 10 minutes, or 60°C for 1 minute.
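
The limits above can be read as a small table of temperature bands,
each with a time budget. The following is only a sketch of how a
userspace thermal engine might encode them; the names skin_band,
skin_time_budget and skin_must_mitigate are hypothetical, the
temperatures are expressed in millidegrees Celsius as the thermal
sysfs interface does, and treating anything above 60°C as having no
budget at all is an assumption, not something stated in this patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKIN_NO_LIMIT (-1)	/* hypothetical marker: no time limit */

/* One band of the illustrative skin temperature policy. */
struct skin_band {
	int max_temp_mc;	/* upper bound of the band, millidegrees C */
	int budget_s;		/* allowed time in the band, in seconds */
};

/* Bands taken from the example figures: 43°C, 48°C/10min, 60°C/1min. */
static const struct skin_band skin_bands[] = {
	{ 43000, SKIN_NO_LIMIT },
	{ 48000, 600 },
	{ 60000, 60 },
};

/* Return the time budget for a temperature, 0 if above all bands. */
static int skin_time_budget(int temp_mc)
{
	size_t i;

	for (i = 0; i < sizeof(skin_bands) / sizeof(skin_bands[0]); i++)
		if (temp_mc <= skin_bands[i].max_temp_mc)
			return skin_bands[i].budget_s;

	return 0;
}

/* True when the time already spent at this level exhausts the budget. */
static bool skin_must_mitigate(int temp_mc, int seconds_in_band)
{
	int budget = skin_time_budget(temp_mc);

	return budget != SKIN_NO_LIMIT && seconds_in_band >= budget;
}
```

Each band boundary (43000, 48000, 60000) is a temperature at which
such an engine would want to be notified, which is where trip points
come in.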

The thermal engine must then rely on trip points to monitor those
temperatures. Unfortunately, today there are only 'active' and
'passive' trip points, which have a specific meaning for the kernel,
not for userspace. That leads to hacks on different mobile and
embedded platforms where 'active' trip points are used to send
notifications to userspace. This is obviously not right because
these trips are handled by the kernel.

This patch introduces the 'user' trip point type, whose semantics are
simple: do nothing at the kernel level, just send a notification to
userspace.
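
As a rough illustration of that semantic, the sketch below models
what happens when a trip of each type is crossed. It is a simplified
standalone model, not the thermal core code: the model_* names and
the action flags are hypothetical, and it assumes every crossing is
reported to userspace regardless of the trip type.

```c
/* Local stand-in following the order of enum thermal_trip_type. */
enum model_trip_type {
	MODEL_TRIP_ACTIVE,
	MODEL_TRIP_PASSIVE,
	MODEL_TRIP_HOT,
	MODEL_TRIP_CRITICAL,
	MODEL_TRIP_USER,
};

/* Hypothetical action flags, for illustration only. */
enum model_action {
	MODEL_NOTIFY_USERSPACE	= 1 << 0,
	MODEL_RUN_GOVERNOR	= 1 << 1,
	MODEL_DRIVER_CALLBACK	= 1 << 2,
	MODEL_SHUTDOWN		= 1 << 3,
};

/* Return the set of actions modelled for a crossed trip point. */
static unsigned int model_trip_crossed(enum model_trip_type type)
{
	/* Assumption: userspace is notified for every trip type. */
	unsigned int actions = MODEL_NOTIFY_USERSPACE;

	switch (type) {
	case MODEL_TRIP_ACTIVE:
	case MODEL_TRIP_PASSIVE:
		actions |= MODEL_RUN_GOVERNOR;	/* kernel mitigation */
		break;
	case MODEL_TRIP_HOT:
		actions |= MODEL_DRIVER_CALLBACK;
		break;
	case MODEL_TRIP_CRITICAL:
		actions |= MODEL_SHUTDOWN;
		break;
	case MODEL_TRIP_USER:
		/* Nothing at the kernel level, notification only. */
		break;
	}

	return actions;
}
```

In this model a 'user' trip is the only type whose result is the
notification flag alone.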

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 .../devicetree/bindings/thermal/thermal-zones.yaml        | 1 +
 drivers/thermal/thermal_core.c                            | 8 ++++++++
 drivers/thermal/thermal_of.c                              | 1 +
 drivers/thermal/thermal_trace.h                           | 4 +++-
 drivers/thermal/thermal_trip.c                            | 1 +
 include/uapi/linux/thermal.h                              | 1 +
 6 files changed, 15 insertions(+), 1 deletion(-)

Comments

Rafael J. Wysocki June 28, 2024, 1:56 p.m. UTC | #1
On Thu, Jun 27, 2024 at 10:55 AM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
> Currently the thermal framework has 4 trip point types:
>
> - active : basically for fans (or anything requiring energy to cool
>   down)
>
> - passive : a performance limiter
>
> - hot : for a last action before reaching critical
>
> - critical : a without return threshold leading to a system shutdown
>
> A thermal zone monitors the temperature regarding these trip
> points. The old way to do that is actively polling the temperature
> which is very bad for embedded systems, especially mobile and it is
> even worse today as we can have more than fifty thermal zones. The
> modern way is to rely on the driver to send an interrupt when the trip
> points are crossed, so the system can sleep while the temperature
> monitoring is offloaded to a dedicated hardware.
>
> However, the thermal aspect is also managed from userspace to protect
> the user, especially tracking down the skin temperature sensor. The
> logic is more complex than what we found in the kernel because it
> needs multiple sources indicating the thermal situation of the entire
> system.
>
> For this reason it needs to setup trip points at different levels in
> order to get informed about what is going on with some thermal zones
> when running some specific application.
>
> For instance, the skin temperature must be limited to 43°C on a long
> run but can go to 48°C for 10 minutes, or 60°C for 1 minute.
>
> The thermal engine must then rely on trip points to monitor those
> temperatures. Unfortunately, today there is only 'active' and
> 'passive' trip points which has a specific meaning for the kernel, not
> the userspace. That leads to hacks in different platforms for mobile
> and embedded systems where 'active' trip points are used to send
> notification to the userspace. This is obviously not right because
> these trip are handled by the kernel.
>
> This patch introduces the 'user' trip point type where its semantic is
> simple: do nothing at the kernel level, just send a notification to
> the user space.
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
>  .../devicetree/bindings/thermal/thermal-zones.yaml        | 1 +
>  drivers/thermal/thermal_core.c                            | 8 ++++++++
>  drivers/thermal/thermal_of.c                              | 1 +
>  drivers/thermal/thermal_trace.h                           | 4 +++-
>  drivers/thermal/thermal_trip.c                            | 1 +
>  include/uapi/linux/thermal.h                              | 1 +
>  6 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> index 68398e7e8655..cb9ea54a192e 100644
> --- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> +++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> @@ -153,6 +153,7 @@ patternProperties:
>                type:
>                  $ref: /schemas/types.yaml#/definitions/string
>                  enum:
> +                  - user     # enable user notification
>                    - active   # enable active cooling e.g. fans
>                    - passive  # enable passive cooling e.g. throttling cpu
>                    - hot      # send notification to driver
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 2aa04c46a425..506f880d9aa9 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -734,6 +734,14 @@ int thermal_bind_cdev_to_trip(struct thermal_zone_device *tz,
>         if (tz != pos1 || cdev != pos2)
>                 return -EINVAL;
>
> +       /*
> +        * It is not allowed to bind a cooling device with a trip
> +        * point user type because no mitigation should happen from
> +        * the kernel with these trip points
> +        */
> +       if (trip->type == THERMAL_TRIP_USER)
> +               return -EINVAL;

Maybe print a debug message when bailing out here?

A check for "user" trips would need to be added to
thermal_governor_trip_crossed() and to the .manage() callbacks in the
power allocator, step-wise and fair-share governors, if I'm not
mistaken.  Especially fair-share and power allocator should not take
them into account IMV.
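
The two suggestions above could look roughly like the standalone
sketch below. It is not kernel code: the sketch_* types and functions
are hypothetical stand-ins, fprintf() stands in for a kernel debug
print, and the real governors apply their own additional filtering
on trip types that is not modelled here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Local stand-in following the order of enum thermal_trip_type. */
enum sketch_trip_type {
	SKETCH_TRIP_ACTIVE,
	SKETCH_TRIP_PASSIVE,
	SKETCH_TRIP_HOT,
	SKETCH_TRIP_CRITICAL,
	SKETCH_TRIP_USER,
};

struct sketch_trip {
	enum sketch_trip_type type;
	int temperature;	/* millidegrees Celsius */
};

/* Refuse the binding for a user trip and say why when bailing out. */
static int sketch_bind_cdev_to_trip(const struct sketch_trip *trip)
{
	if (trip->type == SKETCH_TRIP_USER) {
		fprintf(stderr, "cannot bind a cooling device to a user trip\n");
		return -EINVAL;
	}

	return 0;
}

/* Count the trips a governor would look at, skipping user trips. */
static size_t sketch_count_governed_trips(const struct sketch_trip *trips,
					  size_t nr_trips)
{
	size_t i, count = 0;

	for (i = 0; i < nr_trips; i++) {
		if (trips[i].type == SKETCH_TRIP_USER)
			continue;	/* no kernel mitigation for these */
		count++;
	}

	return count;
}
```
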

> +
>         /* lower default 0, upper default max_state */
>         lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
>
> diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
> index aa34b6e82e26..f6daf921a136 100644
> --- a/drivers/thermal/thermal_of.c
> +++ b/drivers/thermal/thermal_of.c
> @@ -60,6 +60,7 @@ static const char * const trip_types[] = {
>         [THERMAL_TRIP_PASSIVE]  = "passive",
>         [THERMAL_TRIP_HOT]      = "hot",
>         [THERMAL_TRIP_CRITICAL] = "critical",
> +       [THERMAL_TRIP_USER]     = "user",
>  };
>
>  /**
> diff --git a/drivers/thermal/thermal_trace.h b/drivers/thermal/thermal_trace.h
> index df8f4edd6068..739228ecc2e2 100644
> --- a/drivers/thermal/thermal_trace.h
> +++ b/drivers/thermal/thermal_trace.h
> @@ -15,13 +15,15 @@ TRACE_DEFINE_ENUM(THERMAL_TRIP_CRITICAL);
>  TRACE_DEFINE_ENUM(THERMAL_TRIP_HOT);
>  TRACE_DEFINE_ENUM(THERMAL_TRIP_PASSIVE);
>  TRACE_DEFINE_ENUM(THERMAL_TRIP_ACTIVE);
> +TRACE_DEFINE_ENUM(THERMAL_TRIP_USER);
>
>  #define show_tzt_type(type)                                    \
>         __print_symbolic(type,                                  \
>                          { THERMAL_TRIP_CRITICAL, "CRITICAL"},  \
>                          { THERMAL_TRIP_HOT,      "HOT"},       \
>                          { THERMAL_TRIP_PASSIVE,  "PASSIVE"},   \
> -                        { THERMAL_TRIP_ACTIVE,   "ACTIVE"})
> +                        { THERMAL_TRIP_ACTIVE,   "ACTIVE"},    \
> +                        { THERMAL_TRIP_USER,     "USER"})
>
>  TRACE_EVENT(thermal_temperature,
>
> diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c
> index 2a876d3b93aa..a0780bb4ff0d 100644
> --- a/drivers/thermal/thermal_trip.c
> +++ b/drivers/thermal/thermal_trip.c
> @@ -10,6 +10,7 @@
>  #include "thermal_core.h"
>
>  static const char *trip_type_names[] = {
> +       [THERMAL_TRIP_USER] = "user",
>         [THERMAL_TRIP_ACTIVE] = "active",
>         [THERMAL_TRIP_PASSIVE] = "passive",
>         [THERMAL_TRIP_HOT] = "hot",
> diff --git a/include/uapi/linux/thermal.h b/include/uapi/linux/thermal.h
> index fc78bf3aead7..84e556ace5f5 100644
> --- a/include/uapi/linux/thermal.h
> +++ b/include/uapi/linux/thermal.h
> @@ -14,6 +14,7 @@ enum thermal_trip_type {
>         THERMAL_TRIP_PASSIVE,
>         THERMAL_TRIP_HOT,
>         THERMAL_TRIP_CRITICAL,
> +       THERMAL_TRIP_USER,
>  };
>
>  /* Adding event notification support elements */
> --
> 2.43.0
>
Patch

diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
index 68398e7e8655..cb9ea54a192e 100644
--- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
+++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
@@ -153,6 +153,7 @@  patternProperties:
               type:
                 $ref: /schemas/types.yaml#/definitions/string
                 enum:
+                  - user     # enable user notification
                   - active   # enable active cooling e.g. fans
                   - passive  # enable passive cooling e.g. throttling cpu
                   - hot      # send notification to driver
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 2aa04c46a425..506f880d9aa9 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -734,6 +734,14 @@  int thermal_bind_cdev_to_trip(struct thermal_zone_device *tz,
 	if (tz != pos1 || cdev != pos2)
 		return -EINVAL;
 
+	/*
+	 * It is not allowed to bind a cooling device with a trip
+	 * point user type because no mitigation should happen from
+	 * the kernel with these trip points
+	 */
+	if (trip->type == THERMAL_TRIP_USER)
+		return -EINVAL;
+
 	/* lower default 0, upper default max_state */
 	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
 
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index aa34b6e82e26..f6daf921a136 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -60,6 +60,7 @@  static const char * const trip_types[] = {
 	[THERMAL_TRIP_PASSIVE]	= "passive",
 	[THERMAL_TRIP_HOT]	= "hot",
 	[THERMAL_TRIP_CRITICAL]	= "critical",
+	[THERMAL_TRIP_USER]	= "user",
 };
 
 /**
diff --git a/drivers/thermal/thermal_trace.h b/drivers/thermal/thermal_trace.h
index df8f4edd6068..739228ecc2e2 100644
--- a/drivers/thermal/thermal_trace.h
+++ b/drivers/thermal/thermal_trace.h
@@ -15,13 +15,15 @@  TRACE_DEFINE_ENUM(THERMAL_TRIP_CRITICAL);
 TRACE_DEFINE_ENUM(THERMAL_TRIP_HOT);
 TRACE_DEFINE_ENUM(THERMAL_TRIP_PASSIVE);
 TRACE_DEFINE_ENUM(THERMAL_TRIP_ACTIVE);
+TRACE_DEFINE_ENUM(THERMAL_TRIP_USER);
 
 #define show_tzt_type(type)					\
 	__print_symbolic(type,					\
 			 { THERMAL_TRIP_CRITICAL, "CRITICAL"},	\
 			 { THERMAL_TRIP_HOT,      "HOT"},	\
 			 { THERMAL_TRIP_PASSIVE,  "PASSIVE"},	\
-			 { THERMAL_TRIP_ACTIVE,   "ACTIVE"})
+			 { THERMAL_TRIP_ACTIVE,   "ACTIVE"},	\
+			 { THERMAL_TRIP_USER,     "USER"})
 
 TRACE_EVENT(thermal_temperature,
 
diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c
index 2a876d3b93aa..a0780bb4ff0d 100644
--- a/drivers/thermal/thermal_trip.c
+++ b/drivers/thermal/thermal_trip.c
@@ -10,6 +10,7 @@ 
 #include "thermal_core.h"
 
 static const char *trip_type_names[] = {
+	[THERMAL_TRIP_USER] = "user",
 	[THERMAL_TRIP_ACTIVE] = "active",
 	[THERMAL_TRIP_PASSIVE] = "passive",
 	[THERMAL_TRIP_HOT] = "hot",
diff --git a/include/uapi/linux/thermal.h b/include/uapi/linux/thermal.h
index fc78bf3aead7..84e556ace5f5 100644
--- a/include/uapi/linux/thermal.h
+++ b/include/uapi/linux/thermal.h
@@ -14,6 +14,7 @@  enum thermal_trip_type {
 	THERMAL_TRIP_PASSIVE,
 	THERMAL_TRIP_HOT,
 	THERMAL_TRIP_CRITICAL,
+	THERMAL_TRIP_USER,
 };
 
 /* Adding event notification support elements */