Message ID | 20250129-topic-sm8650-thermal-cpu-idle-v3-2-62ab1a64098d@linaro.org (mailing list archive) |
---|---|
State | Superseded |
Series | arm64: dts: qcom: sm8650: rework CPU & GPU thermal zones |
On 29/01/2025 10:43, Neil Armstrong wrote:
> On the SM8650, the dynamic clock and voltage scaling (DCVS) for the GPU
> is done from the HLOS, but the GPU can achieve a much higher temperature
> before failing according to the reference downstream implementation.
>
> Set higher temperatures in the GPU trip points corresponding to
> the temperatures provided by Qualcomm in the downstream source, much
> closer to the junction temperature and with a higher critical
> temperature trip in case the HLOS DCVS cannot handle the
> temperature surge.

Since the tsens MAX_THRESHOLD, which leads to a system
monitor thermal shutdown, is set at 120C, I need to lower
the critical and hot trip points, so please ignore this patchset.

Thanks,
Neil

>
> Fixes: 497624ed5506 ("arm64: dts: qcom: sm8650: Throttle the GPU when overheating")
> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> ---
>  arch/arm64/boot/dts/qcom/sm8650.dtsi | 48 ++++++++++++++++++------------------
>  1 file changed, 24 insertions(+), 24 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi b/arch/arm64/boot/dts/qcom/sm8650.dtsi
> index 95509ce2713d4fcc3dbe0c5cd5827312d5681af4..e9fcf05cb084b7979ecf0f4712fed332e9f4b07a 100644
> --- a/arch/arm64/boot/dts/qcom/sm8650.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sm8650.dtsi
> @@ -6173,19 +6173,19 @@ map0 {
>
>  			trips {
>  				gpu0_alert0: trip-point0 {
> -					temperature = <85000>;
> +					temperature = <95000>;
>  					hysteresis = <1000>;
>  					type = "passive";
>  				};
>
>  				trip-point1 {
> -					temperature = <90000>;
> +					temperature = <115000>;
>  					hysteresis = <1000>;
>  					type = "hot";
>  				};
>
>  				trip-point2 {
> -					temperature = <110000>;
> +					temperature = <125000>;
>  					hysteresis = <1000>;
>  					type = "critical";
>  				};
> @@ -6206,19 +6206,19 @@ map0 {
>
>  			trips {
>  				gpu1_alert0: trip-point0 {
> -					temperature = <85000>;
> +					temperature = <95000>;
>  					hysteresis = <1000>;
>  					type = "passive";
>  				};
>
>  				trip-point1 {
> -					temperature = <90000>;
> +					temperature = <115000>;
>  					hysteresis = <1000>;
>  					type = "hot";
>  				};
>
>  				trip-point2 {
> -					temperature = <110000>;
> +					temperature = <125000>;
>  					hysteresis = <1000>;
>  					type = "critical";
>  				};
> @@ -6239,19 +6239,19 @@ map0 {
>
>  			trips {
>  				gpu2_alert0: trip-point0 {
> -					temperature = <85000>;
> +					temperature = <95000>;
>  					hysteresis = <1000>;
>  					type = "passive";
>  				};
>
>  				trip-point1 {
> -					temperature = <90000>;
> +					temperature = <115000>;
>  					hysteresis = <1000>;
>  					type = "hot";
>  				};
>
>  				trip-point2 {
> -					temperature = <110000>;
> +					temperature = <125000>;
>  					hysteresis = <1000>;
>  					type = "critical";
>  				};
> @@ -6272,19 +6272,19 @@ map0 {
>
>  			trips {
>  				gpu3_alert0: trip-point0 {
> -					temperature = <85000>;
> +					temperature = <95000>;
>  					hysteresis = <1000>;
>  					type = "passive";
>  				};
>
>  				trip-point1 {
> -					temperature = <90000>;
> +					temperature = <115000>;
>  					hysteresis = <1000>;
>  					type = "hot";
>  				};
>
>  				trip-point2 {
> -					temperature = <110000>;
> +					temperature = <125000>;
>  					hysteresis = <1000>;
>  					type = "critical";
>  				};
> @@ -6305,19 +6305,19 @@ map0 {
>
>  			trips {
>  				gpu4_alert0: trip-point0 {
> -					temperature = <85000>;
> +					temperature = <95000>;
>  					hysteresis = <1000>;
>  					type = "passive";
>  				};
>
>  				trip-point1 {
> -					temperature = <90000>;
> +					temperature = <115000>;
>  					hysteresis = <1000>;
>  					type = "hot";
>  				};
>
>  				trip-point2 {
> -					temperature = <110000>;
> +					temperature = <125000>;
>  					hysteresis = <1000>;
>  					type = "critical";
>  				};
> @@ -6338,19 +6338,19 @@ map0 {
>
>  			trips {
>  				gpu5_alert0: trip-point0 {
> -					temperature = <85000>;
> +					temperature = <95000>;
>  					hysteresis = <1000>;
>  					type = "passive";
>  				};
>
>  				trip-point1 {
> -					temperature = <90000>;
> +					temperature = <115000>;
>  					hysteresis = <1000>;
>  					type = "hot";
>  				};
>
>  				trip-point2 {
> -					temperature = <110000>;
> +					temperature = <125000>;
>  					hysteresis = <1000>;
>  					type = "critical";
>  				};
> @@ -6371,19 +6371,19 @@ map0 {
>
>  			trips {
>  				gpu6_alert0: trip-point0 {
> -					temperature = <85000>;
> +					temperature = <95000>;
>  					hysteresis = <1000>;
>  					type = "passive";
>  				};
>
>  				trip-point1 {
> -					temperature = <90000>;
> +					temperature = <115000>;
>  					hysteresis = <1000>;
>  					type = "hot";
>  				};
>
>  				trip-point2 {
> -					temperature = <110000>;
> +					temperature = <125000>;
>  					hysteresis = <1000>;
>  					type = "critical";
>  				};
> @@ -6404,19 +6404,19 @@ map0 {
>
>  			trips {
>  				gpu7_alert0: trip-point0 {
> -					temperature = <85000>;
> +					temperature = <95000>;
>  					hysteresis = <1000>;
>  					type = "passive";
>  				};
>
>  				trip-point1 {
> -					temperature = <90000>;
> +					temperature = <115000>;
>  					hysteresis = <1000>;
>  					type = "hot";
>  				};
>
>  				trip-point2 {
> -					temperature = <110000>;
> +					temperature = <125000>;
>  					hysteresis = <1000>;
>  					type = "critical";
>  				};
>
On 29.01.2025 3:41 PM, Neil Armstrong wrote:
> On 29/01/2025 10:43, Neil Armstrong wrote:
>> On the SM8650, the dynamic clock and voltage scaling (DCVS) for the GPU
>> is done from the HLOS, but the GPU can achieve a much higher temperature
>> before failing according to the reference downstream implementation.
>>
>> Set higher temperatures in the GPU trip points corresponding to
>> the temperatures provided by Qualcomm in the downstream source, much
>> closer to the junction temperature and with a higher critical
>> temperature trip in case the HLOS DCVS cannot handle the
>> temperature surge.
>
> Since the tsens MAX_THRESHOLD, which leads to a system
> monitor thermal shutdown, is set at 120C, I need to lower
> the critical and hot trip points, so please ignore this patchset.

Should we make the "critical" trip point something like 110 or so? If
LMH triggers a hard shutdown at 120, the OS will not have any time to
take action. And 120 sounds like we're pushing it quite hard anyway.

Konrad
On 01/02/2025 16:37, Konrad Dybcio wrote:
> On 29.01.2025 3:41 PM, Neil Armstrong wrote:
>> On 29/01/2025 10:43, Neil Armstrong wrote:
>>> On the SM8650, the dynamic clock and voltage scaling (DCVS) for the GPU
>>> is done from the HLOS, but the GPU can achieve a much higher temperature
>>> before failing according to the reference downstream implementation.
>>>
>>> Set higher temperatures in the GPU trip points corresponding to
>>> the temperatures provided by Qualcomm in the downstream source, much
>>> closer to the junction temperature and with a higher critical
>>> temperature trip in case the HLOS DCVS cannot handle the
>>> temperature surge.
>>
>> Since the tsens MAX_THRESHOLD, which leads to a system
>> monitor thermal shutdown, is set at 120C, I need to lower
>> the critical and hot trip points, so please ignore this patchset.
>
> Should we make the "critical" trip point something like 110 or so? If
> LMH triggers a hard shutdown at 120, the OS will not have any time to
> take action. And 120 sounds like we're pushing it quite hard anyway.

My plan is to harmonize and use 110 for hot and 115 for critical, and,
where a passive cooling device is available, 95C for the passive trip.

Neil

>
> Konrad
On 3.02.2025 9:23 AM, neil.armstrong@linaro.org wrote:
> On 01/02/2025 16:37, Konrad Dybcio wrote:
>> On 29.01.2025 3:41 PM, Neil Armstrong wrote:
>>> On 29/01/2025 10:43, Neil Armstrong wrote:
>>>> On the SM8650, the dynamic clock and voltage scaling (DCVS) for the GPU
>>>> is done from the HLOS, but the GPU can achieve a much higher temperature
>>>> before failing according to the reference downstream implementation.
>>>>
>>>> Set higher temperatures in the GPU trip points corresponding to
>>>> the temperatures provided by Qualcomm in the downstream source, much
>>>> closer to the junction temperature and with a higher critical
>>>> temperature trip in case the HLOS DCVS cannot handle the
>>>> temperature surge.
>>>
>>> Since the tsens MAX_THRESHOLD, which leads to a system
>>> monitor thermal shutdown, is set at 120C, I need to lower
>>> the critical and hot trip points, so please ignore this patchset.
>>
>> Should we make the "critical" trip point something like 110 or so? If
>> LMH triggers a hard shutdown at 120, the OS will not have any time to
>> take action. And 120 sounds like we're pushing it quite hard anyway.
>
>
> My plan is to harmonize and use 110 for hot and 115 for critical, and,
> where a passive cooling device is available, 95C for the passive trip.

sounds good!

Konrad
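
For reference, the values agreed in the exchange above could look roughly like this in one GPU thermal zone. This is only a sketch of the plan as stated in this thread (passive at 95C, hot at 110C, critical at 115C, all below the 120C tsens MAX_THRESHOLD); it is not taken from any later revision of the series. It assumes the trip layout, the 1C hysteresis and the gpu0_alert0 label from the diff in this patch stay unchanged. Temperatures are in millidegrees Celsius, as in the existing trips.

```dts
/* Sketch only: values from the discussion, layout copied from this patch. */
trips {
	gpu0_alert0: trip-point0 {
		/* 95C: passive trip, where a passive cooling device is available */
		temperature = <95000>;
		hysteresis = <1000>;
		type = "passive";
	};

	trip-point1 {
		/* 110C: hot trip */
		temperature = <110000>;
		hysteresis = <1000>;
		type = "hot";
	};

	trip-point2 {
		/* 115C: critical trip, 5C below the 120C MAX_THRESHOLD */
		temperature = <115000>;
		hysteresis = <1000>;
		type = "critical";
	};
};
```

With these numbers the OS-level critical trip fires before the hardware shutdown threshold quoted above, which is the margin Konrad asked about.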
diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi b/arch/arm64/boot/dts/qcom/sm8650.dtsi
index 95509ce2713d4fcc3dbe0c5cd5827312d5681af4..e9fcf05cb084b7979ecf0f4712fed332e9f4b07a 100644
--- a/arch/arm64/boot/dts/qcom/sm8650.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8650.dtsi
@@ -6173,19 +6173,19 @@ map0 {
 
 			trips {
 				gpu0_alert0: trip-point0 {
-					temperature = <85000>;
+					temperature = <95000>;
 					hysteresis = <1000>;
 					type = "passive";
 				};
 
 				trip-point1 {
-					temperature = <90000>;
+					temperature = <115000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 
 				trip-point2 {
-					temperature = <110000>;
+					temperature = <125000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
@@ -6206,19 +6206,19 @@ map0 {
 
 			trips {
 				gpu1_alert0: trip-point0 {
-					temperature = <85000>;
+					temperature = <95000>;
 					hysteresis = <1000>;
 					type = "passive";
 				};
 
 				trip-point1 {
-					temperature = <90000>;
+					temperature = <115000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 
 				trip-point2 {
-					temperature = <110000>;
+					temperature = <125000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
@@ -6239,19 +6239,19 @@ map0 {
 
 			trips {
 				gpu2_alert0: trip-point0 {
-					temperature = <85000>;
+					temperature = <95000>;
 					hysteresis = <1000>;
 					type = "passive";
 				};
 
 				trip-point1 {
-					temperature = <90000>;
+					temperature = <115000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 
 				trip-point2 {
-					temperature = <110000>;
+					temperature = <125000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
@@ -6272,19 +6272,19 @@ map0 {
 
 			trips {
 				gpu3_alert0: trip-point0 {
-					temperature = <85000>;
+					temperature = <95000>;
 					hysteresis = <1000>;
 					type = "passive";
 				};
 
 				trip-point1 {
-					temperature = <90000>;
+					temperature = <115000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 
 				trip-point2 {
-					temperature = <110000>;
+					temperature = <125000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
@@ -6305,19 +6305,19 @@ map0 {
 
 			trips {
 				gpu4_alert0: trip-point0 {
-					temperature = <85000>;
+					temperature = <95000>;
 					hysteresis = <1000>;
 					type = "passive";
 				};
 
 				trip-point1 {
-					temperature = <90000>;
+					temperature = <115000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 
 				trip-point2 {
-					temperature = <110000>;
+					temperature = <125000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
@@ -6338,19 +6338,19 @@ map0 {
 
 			trips {
 				gpu5_alert0: trip-point0 {
-					temperature = <85000>;
+					temperature = <95000>;
 					hysteresis = <1000>;
 					type = "passive";
 				};
 
 				trip-point1 {
-					temperature = <90000>;
+					temperature = <115000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 
 				trip-point2 {
-					temperature = <110000>;
+					temperature = <125000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
@@ -6371,19 +6371,19 @@ map0 {
 
 			trips {
 				gpu6_alert0: trip-point0 {
-					temperature = <85000>;
+					temperature = <95000>;
 					hysteresis = <1000>;
 					type = "passive";
 				};
 
 				trip-point1 {
-					temperature = <90000>;
+					temperature = <115000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 
 				trip-point2 {
-					temperature = <110000>;
+					temperature = <125000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
@@ -6404,19 +6404,19 @@ map0 {
 
 			trips {
 				gpu7_alert0: trip-point0 {
-					temperature = <85000>;
+					temperature = <95000>;
 					hysteresis = <1000>;
 					type = "passive";
 				};
 
 				trip-point1 {
-					temperature = <90000>;
+					temperature = <115000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 
 				trip-point2 {
-					temperature = <110000>;
+					temperature = <125000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
On the SM8650, the dynamic clock and voltage scaling (DCVS) for the GPU
is done from the HLOS, but the GPU can achieve a much higher temperature
before failing according to the reference downstream implementation.

Set higher temperatures in the GPU trip points corresponding to
the temperatures provided by Qualcomm in the downstream source, much
closer to the junction temperature and with a higher critical
temperature trip in case the HLOS DCVS cannot handle the
temperature surge.

Fixes: 497624ed5506 ("arm64: dts: qcom: sm8650: Throttle the GPU when overheating")
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
 arch/arm64/boot/dts/qcom/sm8650.dtsi | 48 ++++++++++++++++++------------------
 1 file changed, 24 insertions(+), 24 deletions(-)