[v2] arm64: dts: allwinner: a64: Add GPU thermal trips to the SoC dtsi

Message ID 0a6110a7b27a050bd58ab3663087eecd8e873ac0.1724126053.git.dsimic@manjaro.org (mailing list archive)
State New

Commit Message

Dragan Simic Aug. 20, 2024, 3:57 a.m. UTC
Add thermal trips for the two GPU thermal sensors found in the Allwinner A64.
There's only one GPU OPP defined since the commit 1428f0c19f9c ("arm64: dts:
allwinner: a64: Run GPU at 432 MHz"), so defining only the critical thermal
trips makes sense for the A64's two GPU thermal zones.

Having these critical thermal trips defined ensures that no hot spots develop
inside the SoC die that exceed the maximum junction temperature.  That might
have been possible before, although quite unlikely, because the CPU and GPU
portions of the SoC are packed closely inside the SoC, so the overheating GPU
would inevitably result in the heat soaking into the CPU portion of the SoC,
causing the CPU thermal sensor to return high readings and trigger the CPU
critical thermal trips.  However, it's better not to rely on the heat soak
and have the critical GPU thermal trips properly defined instead.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
---

Notes:
    Changes in v2:
      - Added "a64:" at the end of the patch subject prefix and adjusted the
        patch subject a bit, to match the usual prefix better
      - Dropped the removal of potentially redundant comments that describe
        the units, as suggested by Icenowy [1] and Chen-Yu [2]
    
    Link to v1: https://lore.kernel.org/linux-sunxi/a17e0df64c5b976b47f19c5a29c02759cd9e5b8c.1723427375.git.dsimic@manjaro.org/T/#u
    
    [1] https://lore.kernel.org/linux-sunxi/24406e36f6facd93e798113303e22925b0a2dcc1.camel@icenowy.me/
    [2] https://lore.kernel.org/linux-sunxi/662f2332efb1d6c21e722066562a72b9@manjaro.org/T/#mdd7b18962c1ae339141061af51b89cd68bc04d50

 arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

նորայր Sept. 3, 2024, 8:46 p.m. UTC | #1
Hello,

I have tested this device tree patch on my PinePhone (1.2 edition) under Gentoo Linux (vanilla-sources-6.10.7).
Before the test, my dmesg contained:

----
[    0.156933] thermal_sys: Registered thermal governor 'fair_share'
[    0.156942] thermal_sys: Registered thermal governor 'bang_bang'
[    0.156960] thermal_sys: Registered thermal governor 'step_wise'
[    0.156976] thermal_sys: Registered thermal governor 'user_space'
[    0.156992] thermal_sys: Registered thermal governor 'power_allocator'
[    1.409536] thermal_sys: Failed to find 'trips' node
[    1.409555] thermal_sys: Failed to find trip points for thermal-sensor id=1
[    1.409594] thermal_sys: Failed to find 'trips' node
[    1.409607] thermal_sys: Failed to find trip points for thermal-sensor id=2
----

After applying the patch and booting with the newer DTB, grepping dmesg for the string 'thermal' shows only:

----
[    0.159456] thermal_sys: Registered thermal governor 'fair_share'
[    0.159465] thermal_sys: Registered thermal governor 'bang_bang'
[    0.159484] thermal_sys: Registered thermal governor 'step_wise'
[    0.159499] thermal_sys: Registered thermal governor 'user_space'
[    0.159515] thermal_sys: Registered thermal governor 'power_allocator'
----

Tested-by: Norayr Chilingarian <norayr@arnet.am>
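
The before/after comparison above can also be checked mechanically. The following is a minimal sketch, not part of the original report: it scans dmesg text for the "Failed to find trip points" warning and lists the affected sensor ids. The function name and the two sample strings are illustrative assumptions; the log lines themselves are copied from the output above.

```python
import re

# Matches the warning quoted in the "before" log; the sensor id is captured.
MISSING_TRIPS = re.compile(
    r"thermal_sys: Failed to find trip points for thermal-sensor id=(\d+)"
)


def sensors_missing_trips(dmesg_text):
    """Return the sorted thermal-sensor ids reported as lacking trip points."""
    return sorted({int(m.group(1)) for m in MISSING_TRIPS.finditer(dmesg_text)})


# Excerpts of the two logs from this report (hypothetical variable names).
BEFORE = """\
[    0.156992] thermal_sys: Registered thermal governor 'power_allocator'
[    1.409536] thermal_sys: Failed to find 'trips' node
[    1.409555] thermal_sys: Failed to find trip points for thermal-sensor id=1
[    1.409594] thermal_sys: Failed to find 'trips' node
[    1.409607] thermal_sys: Failed to find trip points for thermal-sensor id=2
"""

AFTER = """\
[    0.159499] thermal_sys: Registered thermal governor 'user_space'
[    0.159515] thermal_sys: Registered thermal governor 'power_allocator'
"""

if __name__ == "__main__":
    print("before:", sensors_missing_trips(BEFORE))
    print("after: ", sensors_missing_trips(AFTER))
```

On a live system the same function could be fed the output of `dmesg`, assuming the warning text is unchanged in the running kernel version.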

On Tue, 20 Aug 2024 05:57:47 +0200
Dragan Simic <dsimic@manjaro.org> wrote:

> [...]

Chen-Yu Tsai Sept. 4, 2024, 2:56 p.m. UTC | #2
On Tue, 20 Aug 2024 05:57:47 +0200, Dragan Simic wrote:
> Add thermal trips for the two GPU thermal sensors found in the Allwinner A64.
> There's only one GPU OPP defined since the commit 1428f0c19f9c ("arm64: dts:
> allwinner: a64: Run GPU at 432 MHz"), so defining only the critical thermal
> trips makes sense for the A64's two GPU thermal zones.
> 
> Having these critical thermal trips defined ensures that no hot spots develop
> inside the SoC die that exceed the maximum junction temperature.  That might
> have been possible before, although quite unlikely, because the CPU and GPU
> portions of the SoC are packed closely inside the SoC, so the overheating GPU
> would inevitably result in the heat soaking into the CPU portion of the SoC,
> causing the CPU thermal sensor to return high readings and trigger the CPU
> critical thermal trips.  However, it's better not to rely on the heat soak
> and have the critical GPU thermal trips properly defined instead.
> 
> [...]

Applied to sunxi/for-next in sunxi/linux.git, thanks!

[1/1] arm64: dts: allwinner: a64: Add GPU thermal trips to the SoC dtsi
      https://git.kernel.org/sunxi/linux/c/89f1a037e97c

Best regards,
Patch

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
index e868ca5ae753..a5c3920e0f04 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
@@ -263,13 +263,29 @@  gpu0_thermal: gpu0-thermal {
 			polling-delay-passive = <0>;
 			polling-delay = <0>;
 			thermal-sensors = <&ths 1>;
+
+			trips {
+				gpu0_crit: gpu0-crit {
+					temperature = <110000>;
+					hysteresis = <2000>;
+					type = "critical";
+				};
+			};
 		};
 
 		gpu1_thermal: gpu1-thermal {
 			/* milliseconds */
 			polling-delay-passive = <0>;
 			polling-delay = <0>;
 			thermal-sensors = <&ths 2>;
+
+			trips {
+				gpu1_crit: gpu1-crit {
+					temperature = <110000>;
+					hysteresis = <2000>;
+					type = "critical";
+				};
+			};
 		};
 	};
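
The trip values in the patch are in millidegrees Celsius, so <110000> with a hysteresis of <2000> means a critical trip at 110 °C with 2 °C of hysteresis. A rough way to confirm that the new trips are registered on a running system is to read them back through the thermal sysfs interface. The sketch below assumes the standard trip_point_N_temp, trip_point_N_type and trip_point_N_hyst files under /sys/class/thermal/thermal_zone*; the helper names are hypothetical, and which zone number maps to gpu0-thermal or gpu1-thermal depends on the board.

```python
from pathlib import Path


def read_trips(zone_dir):
    """Return (type, temp_celsius, hyst_celsius) tuples for one thermal zone."""
    zone = Path(zone_dir)
    trips = []
    index = 0
    # Trip points are exposed as consecutively numbered files starting at 0.
    while (zone / f"trip_point_{index}_temp").exists():
        kind = (zone / f"trip_point_{index}_type").read_text().strip()
        temp_mc = int((zone / f"trip_point_{index}_temp").read_text())
        hyst_file = zone / f"trip_point_{index}_hyst"
        # Treat a missing hysteresis file as zero (an assumption of this sketch).
        hyst_mc = int(hyst_file.read_text()) if hyst_file.exists() else 0
        # sysfs reports millidegrees Celsius; convert to degrees.
        trips.append((kind, temp_mc / 1000, hyst_mc / 1000))
        index += 1
    return trips


def print_trips(root="/sys/class/thermal"):
    """Print every trip point of every thermal zone found under root."""
    for zone in sorted(Path(root).glob("thermal_zone*")):
        name = (zone / "type").read_text().strip()
        for kind, temp, hyst in read_trips(zone):
            print(f"{name}: {kind} trip at {temp:.1f} C (hysteresis {hyst:.1f} C)")
```

Calling print_trips() on a device booted with the patched DTB should then list one critical trip for each of the two GPU zones, provided the zones are registered under the names used in the dtsi.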