[2/2] ARM: dts: rockchip: Configure the GPU thermal zone for mickey

Message ID 20190520170132.91571-2-mka@chromium.org
State New
Series
  • [1/2] ARM: dts: rockchip: Limit GPU frequency on veyron mickey to 300 MHz when the CPU gets very hot

Commit Message

Matthias Kaehlcke May 20, 2019, 5:01 p.m. UTC
mickey crams a lot of hardware into a tiny package, which requires
more aggressive thermal throttling than for devices with a larger
footprint. Configure the GPU thermal zone to throttle the GPU
progressively at temperatures >= 60°C. Heat dissipated by the
CPUs also affects the GPU temperature, hence we cap the CPU
frequency to 1.4 GHz for temperatures above 65°C. Further throttling
of the CPUs may be performed by the CPU thermal zone.

The configuration matches that of the downstram Chrome OS 3.14
kernel, the 'official' kernel for mickey.

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
---
Note: this patch depends on "ARM: dts: rockchip: Add #cooling-cells
entry for rk3288 GPU" (https://lore.kernel.org/patchwork/patch/1075005/)
---
 arch/arm/boot/dts/rk3288-veyron-mickey.dts | 64 ++++++++++++++++++++++
 1 file changed, 64 insertions(+)

Comments

Doug Anderson May 20, 2019, 8:21 p.m. UTC | #1
Hi,

On Mon, May 20, 2019 at 10:01 AM Matthias Kaehlcke <mka@chromium.org> wrote:
>
> mickey crams a lot of hardware into a tiny package, which requires
> more aggressive thermal throttling than for devices with a larger
> footprint. Configure the GPU thermal zone to throttle the GPU
> progressively at temperatures >= 60°C. Heat dissipated by the
> CPUs also affects the GPU temperature, hence we cap the CPU
> frequency to 1.4 GHz for temperatures above 65°C. Further throttling
> of the CPUs may be performed by the CPU thermal zone.
>
> The configuration matches that of the downstram Chrome OS 3.14

s/downstram/downstream


> +       cooling-maps {
> +               /* After 1st level throttle the GPU down to as low as 400 MHz */
> +               gpu_warmish_limit_gpu {
> +                       trip = <&gpu_alert_warmish>;
> +                       cooling-device = <&gpu THERMAL_NO_LIMIT 1>;

As per my comment in patch #1, you are probably ending up throttling
to 500 MHz, not 400 MHz.  Below will all have similar problems unless
we actually delete the 500 MHz operating point.


> +               };
> +
> +               /*
> +                * Slightly after we throttle the GPU, we'll also make sure that
> +                * the CPU can't go faster than 1.4 GHz.  Note that we won't
> +                * throttle the CPU lower than 1.4 GHz due to GPU heat--we'll
> +                * let the CPU do the rest itself.
> +                */
> +               gpu_warm_limit_cpu {
> +                       trip = <&gpu_alert_warm>;
> +                       cooling-device = <&cpu0 4 4>;

Shouldn't you list cpu1, cpu2, and cpu3 too?  That'd match what
upstream did elsewhere in this file?

Matthias Kaehlcke May 20, 2019, 9:21 p.m. UTC | #2
On Mon, May 20, 2019 at 01:21:33PM -0700, Doug Anderson wrote:
> Hi,
> 
> On Mon, May 20, 2019 at 10:01 AM Matthias Kaehlcke <mka@chromium.org> wrote:
> >
> > mickey crams a lot of hardware into a tiny package, which requires
> > more aggressive thermal throttling than for devices with a larger
> > footprint. Configure the GPU thermal zone to throttle the GPU
> > progressively at temperatures >= 60°C. Heat dissipated by the
> > CPUs also affects the GPU temperature, hence we cap the CPU
> > frequency to 1.4 GHz for temperatures above 65°C. Further throttling
> > of the CPUs may be performed by the CPU thermal zone.
> >
> > The configuration matches that of the downstram Chrome OS 3.14
> 
> s/downstram/downstream

ack

> 
> > +       cooling-maps {
> > +               /* After 1st level throttle the GPU down to as low as 400 MHz */
> > +               gpu_warmish_limit_gpu {
> > +                       trip = <&gpu_alert_warmish>;
> > +                       cooling-device = <&gpu THERMAL_NO_LIMIT 1>;
> 
> As per my comment in patch #1, you are probably ending up throttling
> to 500 MHz, not 400 MHz.  Below will all have similar problems unless
> we actually delete the 500 MHz operating point.

Thanks for pointing that out. As per discussion on patch #1 we'll
disable the 500 MHz OPP to stay in sync with downstream and avoid
problems in case someone decides to re-purpose NPLL.
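
To illustrate the point about cooling states, here is a minimal sketch of
the first map with the state arithmetic spelled out in a comment. It
assumes that cooling states index the GPU's operating points from the
fastest (state 0) downwards and that the fastest GPU OPP is 600 MHz;
neither is stated in this thread, so treat the frequencies as
illustrative only.

```dts
cooling-maps {
	/*
	 * Assumed GPU OPPs, fastest first (illustrative, not taken
	 * from this patch):
	 *
	 *   with the 500 MHz OPP:    0 = 600, 1 = 500, 2 = 400, 3 = 300
	 *   without the 500 MHz OPP: 0 = 600, 1 = 400, 2 = 300
	 *
	 * An upper limit of state 1 therefore only reaches 400 MHz
	 * once the 500 MHz OPP is gone.
	 */
	gpu_warmish_limit_gpu {
		trip = <&gpu_alert_warmish>;
		cooling-device = <&gpu THERMAL_NO_LIMIT 1>;
	};
};
```

Under the same assumptions, states 2 and 3 in the remaining maps shift by
one step in the same way, which is why the comments in the patch only
match once the 500 MHz OPP is disabled.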

> > +               };
> > +
> > +               /*
> > +                * Slightly after we throttle the GPU, we'll also make sure that
> > +                * the CPU can't go faster than 1.4 GHz.  Note that we won't
> > +                * throttle the CPU lower than 1.4 GHz due to GPU heat--we'll
> > +                * let the CPU do the rest itself.
> > +                */
> > +               gpu_warm_limit_cpu {
> > +                       trip = <&gpu_alert_warm>;
> > +                       cooling-device = <&cpu0 4 4>;
> 
> Shouldn't you list cpu1, cpu2, and cpu3 too?  That'd match what
> upstream did elsewhere in this file?

ack, should have noticed, I 'yelled' at others before for not doing this ...
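
For reference, the map with all four cores listed might look like the
sketch below. It assumes the SoC dtsi provides the labels cpu1, cpu2 and
cpu3 alongside cpu0, and it keeps the same state limits (4 4) for every
core; this is not the text of a later patch revision.

```dts
gpu_warm_limit_cpu {
	trip = <&gpu_alert_warm>;
	/* cpu1..cpu3 labels are assumed to exist next to cpu0 */
	cooling-device = <&cpu0 4 4>,
			 <&cpu1 4 4>,
			 <&cpu2 4 4>,
			 <&cpu3 4 4>;
};
```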

Patch

diff --git a/arch/arm/boot/dts/rk3288-veyron-mickey.dts b/arch/arm/boot/dts/rk3288-veyron-mickey.dts
index f118d92a49d0..f0b83afa2a60 100644
--- a/arch/arm/boot/dts/rk3288-veyron-mickey.dts
+++ b/arch/arm/boot/dts/rk3288-veyron-mickey.dts
@@ -138,6 +138,70 @@ 
 	/delete-property/mmc-hs200-1_8v;
 };
 
+&gpu_thermal {
+	/delete-node/ trips;
+	/delete-node/ cooling-maps;
+
+	trips {
+		gpu_alert_warmish: gpu_alert_warmish {
+			temperature = <60000>; /* millicelsius */
+			hysteresis = <2000>; /* millicelsius */
+			type = "passive";
+		};
+		gpu_alert_warm: gpu_alert_warm {
+			temperature = <65000>; /* millicelsius */
+			hysteresis = <2000>; /* millicelsius */
+			type = "passive";
+		};
+		gpu_alert_hotter: gpu_alert_hotter {
+			temperature = <84000>; /* millicelsius */
+			hysteresis = <2000>; /* millicelsius */
+			type = "passive";
+		};
+		gpu_alert_very_very_hot: gpu_alert_very_very_hot {
+			temperature = <86000>; /* millicelsius */
+			hysteresis = <2000>; /* millicelsius */
+			type = "passive";
+		};
+		gpu_crit: gpu_crit {
+			temperature = <90000>; /* millicelsius */
+			hysteresis = <2000>; /* millicelsius */
+			type = "critical";
+		};
+	};
+
+	cooling-maps {
+		/* After 1st level throttle the GPU down to as low as 400 MHz */
+		gpu_warmish_limit_gpu {
+			trip = <&gpu_alert_warmish>;
+			cooling-device = <&gpu THERMAL_NO_LIMIT 1>;
+		};
+
+		/*
+		 * Slightly after we throttle the GPU, we'll also make sure that
+		 * the CPU can't go faster than 1.4 GHz.  Note that we won't
+		 * throttle the CPU lower than 1.4 GHz due to GPU heat--we'll
+		 * let the CPU do the rest itself.
+		 */
+		gpu_warm_limit_cpu {
+			trip = <&gpu_alert_warm>;
+			cooling-device = <&cpu0 4 4>;
+		};
+
+		/* When hot, GPU goes down to 300 MHz */
+		gpu_hotter_limit_gpu {
+			trip = <&gpu_alert_hotter>;
+			cooling-device = <&gpu 2 2>;
+		};
+
+		/* When really hot, don't let GPU go _above_ 300 MHz */
+		gpu_very_very_hot_limit_gpu {
+			trip = <&gpu_alert_very_very_hot>;
+			cooling-device = <&gpu 2 THERMAL_NO_LIMIT>;
+		};
+	};
+};
+
 &i2c2 {
 	status = "disabled";
 };