diff mbox series

[1/3] thermal: core: Add indication for userspace usage

Message ID 20201130053640.54608-1-kai.heng.feng@canonical.com (mailing list archive)
State New, archived
Delegated to: Zhang Rui

Commit Message

Kai-Heng Feng Nov. 30, 2020, 5:36 a.m. UTC
We are seeing thermal shutdowns on Intel-based mobile workstations. The
shutdown happens while the first trip is handled in
thermal_zone_device_register():
kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down

However, we shouldn't do a thermal shutdown here, since
1) We may want to use a dedicated daemon, Intel's thermald in this case,
to handle thermal shutdown.

2) For ACPI-based systems, _CRT doesn't mean shutdown unless it's inside a
ThermalZone. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
"... If this object it present under a device, the device’s driver
evaluates this object to determine the device’s critical cooling
temperature trip point. This value may then be used by the device’s
driver to program an internal device temperature sensor trip point."

So a "critical trip" here merely means we should take a more aggressive
cooling method.

So add an indication to let the thermal core know it should leave the
thermal device for userspace to handle.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
 drivers/thermal/thermal_core.c | 3 +++
 include/linux/thermal.h        | 2 ++
 2 files changed, 5 insertions(+)
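
The effect described in the commit message can be sketched outside the
kernel with a small model. This is only an illustration, not the kernel
implementation: struct zone, struct zone_params, zone_register() and
zone_update() are hypothetical stand-ins, and the model assumes that a
disabled zone is skipped by critical-trip handling.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
enum zone_mode { ZONE_DISABLED, ZONE_ENABLED };

struct zone_params {
	bool userspace;	/* models the flag this patch adds */
};

struct zone {
	const struct zone_params *tzp;	/* may be NULL */
	enum zone_mode mode;
	int shutdowns;	/* counts simulated critical shutdowns */
};

/*
 * Models critical-trip handling. Assumption: a disabled zone is
 * left alone, so nothing is counted for it.
 */
static void zone_update(struct zone *tz, int temp_c, int crit_c)
{
	if (tz->mode != ZONE_ENABLED)
		return;
	if (temp_c >= crit_c)
		tz->shutdowns++;
}

/*
 * Models registration: the zone starts enabled in this sketch, then
 * the same condition as in the patch hands it over to userspace.
 */
static void zone_register(struct zone *tz, const struct zone_params *tzp)
{
	tz->tzp = tzp;
	tz->mode = ZONE_ENABLED;
	tz->shutdowns = 0;

	if (tz->tzp && tz->tzp->userspace)
		tz->mode = ZONE_DISABLED;
}
```

In this model, a zone registered with .userspace = true records no
shutdown when zone_update() sees 101 C against a 100 C critical trip,
while a zone registered with NULL params records one.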

Comments

Matthew Garrett Dec. 14, 2020, 6:21 p.m. UTC | #1
On Sun, Nov 29, 2020 at 9:36 PM Kai-Heng Feng
<kai.heng.feng@canonical.com> wrote:
>
> We are seeing thermal shutdown on Intel based mobile workstations, the
> shutdown happens during the first trip handle in
> thermal_zone_device_register():
> kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down

Is the temperature reported by the thermal zone actually correct here?
101 C seems extremely excessive.

Kai-Heng Feng Dec. 15, 2020, 12:49 p.m. UTC | #2
On Tue, Dec 15, 2020 at 2:22 AM Matthew Garrett <mjg59@google.com> wrote:
>
> On Sun, Nov 29, 2020 at 9:36 PM Kai-Heng Feng
> <kai.heng.feng@canonical.com> wrote:
> >
> > We are seeing thermal shutdown on Intel based mobile workstations, the
> > shutdown happens during the first trip handle in
> > thermal_zone_device_register():
> > kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down
>
> Is the temperature reported by the thermal zone actually correct here?
> 101 C seems extremely excessive.

According to the ODM/OEM, it's correct.
It happens for a short period when Intel Turbo Boost kicks in.

Kai-Heng

Patch

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index c6d74bc1c90b..6561e3767529 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1477,6 +1477,9 @@  thermal_zone_device_register(const char *type, int trips, int mask,
 			goto unregister;
 	}
 
+	if (tz->tzp && tz->tzp->userspace)
+		thermal_zone_device_disable(tz);
+
 	mutex_lock(&thermal_list_lock);
 	list_add_tail(&tz->node, &thermal_tz_list);
 	mutex_unlock(&thermal_list_lock);
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index d07ea27e72a9..e8e8fac78fc8 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -247,6 +247,8 @@  struct thermal_zone_params {
 	 */
 	bool no_hwmon;
 
+	bool userspace;
+
 	int num_tbps;	/* Number of tbp entries */
 	struct thermal_bind_params *tbp;
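
For the userspace side, a daemon could inspect or change a zone's state
through the "mode" attribute of the thermal sysfs interface. The helpers
below are a minimal sketch under assumptions: the function names are
hypothetical, the path is supplied by the caller (for example
/sys/class/thermal/thermal_zone15/mode), and the file is assumed to hold
either "enabled" or "disabled". This is not taken from thermald.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns true if the mode file at `path` reads "enabled".
 * Returns false for any other content or if the file cannot be read.
 */
static bool zone_mode_is_enabled(const char *path)
{
	char buf[32] = "";
	FILE *f = fopen(path, "r");

	if (!f)
		return false;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';	/* drop the trailing newline, if any */
	return strcmp(buf, "enabled") == 0;
}

/*
 * Writes "enabled" or "disabled" to the mode file at `path`.
 * Returns 0 on success and -1 on failure.
 */
static int zone_set_mode(const char *path, bool enable)
{
	FILE *f = fopen(path, "w");
	bool ok;

	if (!f)
		return -1;
	ok = fputs(enable ? "enabled" : "disabled", f) >= 0;
	if (fclose(f) != 0)
		ok = false;
	return ok ? 0 : -1;
}
```

Writing to the real sysfs attribute normally requires root. The helpers
take the path as a parameter so they can be tried against an ordinary
file first.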