diff mbox

ARM: dts: exynos: Exynos5422 Odroid-XU* incomplete thermal-zones definition

Message ID 20170623210913.ugutgop7ratwolfy@inc028000032.lancs.ac.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Willy WOLFF June 23, 2017, 9:09 p.m. UTC
Odroid XU*-familly boards has thermal sensors per A15 cores, but the actual
 thermal-zones define only cooling-maps action for cpu0.

If the application is running on all cores but core4 (first core of the A15
 cluster), the CPU can reach high temperature without any proper cooling
 action.

As already discus in prior mail, and on IRC, it's a quit big code
duplication, but I don't found the write way to express that in a better
 way.

The situation for this board is that we have multiple sensors, but
 matching cooling devices for these sensors act for the same physical
 device (FAN and A15 cluster, as each core of the cluster share the same
 frequency).
In fact, of-thermal.c:473:thermal_zone_of_sensor_register() can't use
multiple sensors for one single thermal zone.
This patch follow the path taken in arch/arm/boot/dts/qcom-apq8084.dtsi:97

I'm interested to extending the thermal driver, but it will takes time.
 So this is a workaround before refactoring the driver.
If somebody knows how to write it better, any advice and suggestions
are more than welcome.

Also, the comment for cpu_alert4 in cooling-maps definition is not
accurate, 11 steps for A15 correspond to 700MHz, not 600MHz.

Signed-off-by: Willy Wolff <willy.mh.wolff@gmail.com>
---
 arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi | 249 +++++++++++++++++++--
 1 file changed, 231 insertions(+), 18 deletions(-)

Comments

Krzysztof Kozlowski June 25, 2017, 7:57 a.m. UTC | #1
On Fri, Jun 23, 2017 at 10:09:14PM +0100, Willy Wolff wrote:
> Odroid XU*-familly boards has thermal sensors per A15 cores, but the actual
>  thermal-zones define only cooling-maps action for cpu0.
> 
> If the application is running on all cores but core4 (first core of the A15
>  cluster), the CPU can reach high temperature without any proper cooling
>  action.

Commit msg has weird paragraph indentation. Please write it as regular
commit, like others.

> 
> As already discus in prior mail, and on IRC, it's a quit big code
> duplication, but I don't found the write way to express that in a better
>  way.
> 
> The situation for this board is that we have multiple sensors, but
>  matching cooling devices for these sensors act for the same physical
>  device (FAN and A15 cluster, as each core of the cluster share the same
>  frequency).
> In fact, of-thermal.c:473:thermal_zone_of_sensor_register() can't use
> multiple sensors for one single thermal zone.
> This patch follow the path taken in arch/arm/boot/dts/qcom-apq8084.dtsi:97
> 
> I'm interested to extending the thermal driver, but it will takes time.
>  So this is a workaround before refactoring the driver.
> If somebody knows how to write it better, any advice and suggestions
> are more than welcome.

Skip all data irrelevant to the issue (like what is your interest,
discussions made on LKML or IRC). Just describe exactly what is the
problem (and maybe how it can be reproduced) and what is the solution.
You can mention what could be the optimal solution and why we cannot use
it now.

> 
> Also, the comment for cpu_alert4 in cooling-maps definition is not
> accurate, 11 steps for A15 correspond to 700MHz, not 600MHz.
> 
> Signed-off-by: Willy Wolff <willy.mh.wolff@gmail.com>
> ---
>  arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi | 249 +++++++++++++++++++--
>  1 file changed, 231 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
> index 05b9afdd6757..0759cc0812fb 100644
> --- a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
> +++ b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
> @@ -64,22 +64,22 @@
>  			polling-delay-passive = <250>;
>  			polling-delay = <0>;
>  			trips {
> -				cpu_alert0: cpu-alert-0 {
> +				cpu0_alert0: cpu-alert-0 {
>  					temperature = <50000>; /* millicelsius */
>  					hysteresis = <5000>; /* millicelsius */
>  					type = "active";
>  				};
> -				cpu_alert1: cpu-alert-1 {
> +				cpu0_alert1: cpu-alert-1 {
>  					temperature = <60000>; /* millicelsius */
>  					hysteresis = <5000>; /* millicelsius */
>  					type = "active";
>  				};
> -				cpu_alert2: cpu-alert-2 {
> +				cpu0_alert2: cpu-alert-2 {
>  					temperature = <70000>; /* millicelsius */
>  					hysteresis = <5000>; /* millicelsius */
>  					type = "active";
>  				};
> -				cpu_crit0: cpu-crit-0 {
> +				cpu0_crit0: cpu-crit-0 {
>  					temperature = <120000>; /* millicelsius */
>  					hysteresis = <0>; /* millicelsius */
>  					type = "critical";
> @@ -88,14 +88,14 @@
>  				 * Exynos542x supports only 4 trip-points
>  				 * so for these polling mode is required.
>  				 * Start polling at temperature level of last
> -				 * interrupt-driven trip: cpu_alert2
> +				 * interrupt-driven trip: cpu0_alert2
>  				 */
> -				cpu_alert3: cpu-alert-3 {
> +				cpu0_alert3: cpu-alert-3 {
>  					temperature = <70000>; /* millicelsius */
>  					hysteresis = <10000>; /* millicelsius */
>  					type = "passive";
>  				};
> -				cpu_alert4: cpu-alert-4 {
> +				cpu0_alert4: cpu-alert-4 {
>  					temperature = <85000>; /* millicelsius */
>  					hysteresis = <10000>; /* millicelsius */
>  					type = "passive";
> @@ -104,43 +104,256 @@
>  			};
>  			cooling-maps {
>  				map0 {
> -					trip = <&cpu_alert0>;
> +					trip = <&cpu0_alert0>;
>  					cooling-device = <&fan0 0 1>;
>  				};
>  				map1 {
> -					trip = <&cpu_alert1>;
> +					trip = <&cpu0_alert1>;
>  					cooling-device = <&fan0 1 2>;
>  				};
>  				map2 {
> -					trip = <&cpu_alert2>;
> +					trip = <&cpu0_alert2>;
>  					cooling-device = <&fan0 2 3>;
>  				};
>  				/*
> -				 * When reaching cpu_alert3, reduce CPU
> +				 * When reaching cpu0_alert3, reduce CPU
>  				 * by 2 steps. On Exynos5422/5800 that would
>  				 * be: 1600 MHz and 1100 MHz.
>  				 */
>  				map3 {
> -					trip = <&cpu_alert3>;
> +					trip = <&cpu0_alert3>;
>  					cooling-device = <&cpu0 0 2>;
>  				};
>  				map4 {
> -					trip = <&cpu_alert3>;
> +					trip = <&cpu0_alert3>;
>  					cooling-device = <&cpu4 0 2>;
>  				};
>  
>  				/*
> -				 * When reaching cpu_alert4, reduce CPU
> -				 * further, down to 600 MHz (11 steps for big,
> +				 * When reaching cpu0_alert4, reduce CPU
> +				 * further, down to 600 MHz (12 steps for big,
>  				 * 7 steps for LITTLE).
>  				 */
>  				map5 {
> -					trip = <&cpu_alert4>;
> +					trip = <&cpu0_alert4>;
>  					cooling-device = <&cpu0 3 7>;
>  				};
>  				map6 {
> -					trip = <&cpu_alert4>;
> -					cooling-device = <&cpu4 3 11>;
> +					trip = <&cpu0_alert4>;
> +					cooling-device = <&cpu4 3 12>;
> +				};
> +			};
> +		};
> +		cpu1_thermal: cpu1-thermal {
> +			thermal-sensors = <&tmu_cpu1 0>;
> +			polling-delay-passive = <250>;
> +			polling-delay = <0>;
> +			trips {
> +				cpu1_alert0: cpu-alert-0 {
> +					temperature = <50000>;
> +					hysteresis = <5000>;
> +					type = "active";
> +				};
> +				cpu1_alert1: cpu-alert-1 {
> +					temperature = <60000>;
> +					hysteresis = <5000>;
> +					type = "active";
> +				};
> +				cpu1_alert2: cpu-alert-2 {
> +					temperature = <70000>;
> +					hysteresis = <5000>;
> +					type = "active";
> +				};
> +				cpu1_crit0: cpu-crit-0 {
> +					temperature = <120000>;
> +					hysteresis = <0>;
> +					type = "critical";
> +				};
> +
> +				cpu1_alert3: cpu-alert-3 {
> +					temperature = <70000>;
> +					hysteresis = <10000>; /* millicelsius */

You removed these comments in other cases so this should go away as
well.

Best regards,
Krzysztof
Anand Moon June 25, 2017, 1:23 p.m. UTC | #2
Hi Willy,

On 24 June 2017 at 02:39, Willy Wolff <willy.mh.wolff@gmail.com> wrote:
> Odroid XU*-familly boards has thermal sensors per A15 cores, but the actual
>  thermal-zones define only cooling-maps action for cpu0.
>
> If the application is running on all cores but core4 (first core of the A15
>  cluster), the CPU can reach high temperature without any proper cooling
>  action.
>
> As already discus in prior mail, and on IRC, it's a quit big code
> duplication, but I don't found the write way to express that in a better
>  way.
>
> The situation for this board is that we have multiple sensors, but
>  matching cooling devices for these sensors act for the same physical
>  device (FAN and A15 cluster, as each core of the cluster share the same
>  frequency).
> In fact, of-thermal.c:473:thermal_zone_of_sensor_register() can't use
> multiple sensors for one single thermal zone.
> This patch follow the path taken in arch/arm/boot/dts/qcom-apq8084.dtsi:97
>
> I'm interested to extending the thermal driver, but it will takes time.
>  So this is a workaround before refactoring the driver.
> If somebody knows how to write it better, any advice and suggestions
> are more than welcome.
>
> Also, the comment for cpu_alert4 in cooling-maps definition is not
> accurate, 11 steps for A15 correspond to 700MHz, not 600MHz.
>

[snip]

Few point to from my side.

1: We should also increase the trip points temperature so that it can
throttle at high temperature.
2: We should also increase the tips from 4 to 8 to support different
cluster of cpu's.
3: To avoid duplication of cooling-maps we can make tmu sensor work
differently for cluster of cpu's
      tmu_cpu0: handle pwm-fan control.
      tmu_cpu1: handle cpu[0-3] cpufreq mapping.
      tmu_cpu2: handle cpu[4-7] cpufreq mapping.
      tmu_gpu: handle gpu (mali) cpu throttle.

Please share your thought on this new approach.

I generally test using linaro's pm-qa and most of the time the test case fails
      # git clone git://git.linaro.org/tools/pm-qa.git
      # make -C thermal check

-Best Regards
-Anand Moon
Krzysztof Kozlowski June 25, 2017, 1:31 p.m. UTC | #3
On Sun, Jun 25, 2017 at 06:53:24PM +0530, Anand Moon wrote:
> Hi Willy,
> 
> On 24 June 2017 at 02:39, Willy Wolff <willy.mh.wolff@gmail.com> wrote:
> > Odroid XU*-familly boards has thermal sensors per A15 cores, but the actual
> >  thermal-zones define only cooling-maps action for cpu0.
> >
> > If the application is running on all cores but core4 (first core of the A15
> >  cluster), the CPU can reach high temperature without any proper cooling
> >  action.
> >
> > As already discus in prior mail, and on IRC, it's a quit big code
> > duplication, but I don't found the write way to express that in a better
> >  way.
> >
> > The situation for this board is that we have multiple sensors, but
> >  matching cooling devices for these sensors act for the same physical
> >  device (FAN and A15 cluster, as each core of the cluster share the same
> >  frequency).
> > In fact, of-thermal.c:473:thermal_zone_of_sensor_register() can't use
> > multiple sensors for one single thermal zone.
> > This patch follow the path taken in arch/arm/boot/dts/qcom-apq8084.dtsi:97
> >
> > I'm interested to extending the thermal driver, but it will takes time.
> >  So this is a workaround before refactoring the driver.
> > If somebody knows how to write it better, any advice and suggestions
> > are more than welcome.
> >
> > Also, the comment for cpu_alert4 in cooling-maps definition is not
> > accurate, 11 steps for A15 correspond to 700MHz, not 600MHz.
> >
> 
> [snip]
> 
> Few point to from my side.
> 
> 1: We should also increase the trip points temperature so that it can
> throttle at high temperature.

It is not related to this problem. If you wish to address different
problem related to non-optimal choice of temperature, then please send a
separate patch explaining chosen values.

> 2: We should also increase the tips from 4 to 8 to support different
> cluster of cpu's.

There are 4 CPU thermal zones on Exynos5422. What do you want to expand?

> 3: To avoid duplication of cooling-maps we can make tmu sensor work
> differently for cluster of cpu's
>       tmu_cpu0: handle pwm-fan control.
>       tmu_cpu1: handle cpu[0-3] cpufreq mapping.
>       tmu_cpu2: handle cpu[4-7] cpufreq mapping.

I miss the point behind this. Why fan should work only when CPU4
(tmu_cpu0) is heated and not CPU5-7 (rest of cpu tmu's)?

Best regards,
Krzysztof


>       tmu_gpu: handle gpu (mali) cpu throttle.
> 
> Please share your thought on this new approach.
> 
> I generally test using linaro's pm-qa and most of the time the test case fails
>       # git clone git://git.linaro.org/tools/pm-qa.git
>       # make -C thermal check
> 
> -Best Regards
> -Anand Moon
Anand Moon June 25, 2017, 3:18 p.m. UTC | #4
Hi Krzystof,

On 25 June 2017 at 19:01, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> On Sun, Jun 25, 2017 at 06:53:24PM +0530, Anand Moon wrote:
>> Hi Willy,
>>
>> On 24 June 2017 at 02:39, Willy Wolff <willy.mh.wolff@gmail.com> wrote:
>> > Odroid XU*-familly boards has thermal sensors per A15 cores, but the actual
>> >  thermal-zones define only cooling-maps action for cpu0.
>> >
>> > If the application is running on all cores but core4 (first core of the A15
>> >  cluster), the CPU can reach high temperature without any proper cooling
>> >  action.
>> >
>> > As already discus in prior mail, and on IRC, it's a quit big code
>> > duplication, but I don't found the write way to express that in a better
>> >  way.
>> >
>> > The situation for this board is that we have multiple sensors, but
>> >  matching cooling devices for these sensors act for the same physical
>> >  device (FAN and A15 cluster, as each core of the cluster share the same
>> >  frequency).
>> > In fact, of-thermal.c:473:thermal_zone_of_sensor_register() can't use
>> > multiple sensors for one single thermal zone.
>> > This patch follow the path taken in arch/arm/boot/dts/qcom-apq8084.dtsi:97
>> >
>> > I'm interested to extending the thermal driver, but it will takes time.
>> >  So this is a workaround before refactoring the driver.
>> > If somebody knows how to write it better, any advice and suggestions
>> > are more than welcome.
>> >
>> > Also, the comment for cpu_alert4 in cooling-maps definition is not
>> > accurate, 11 steps for A15 correspond to 700MHz, not 600MHz.
>> >
>>
>> [snip]
>>
>> Few point to from my side.
>>
>> 1: We should also increase the trip points temperature so that it can
>> throttle at high temperature.
>
> It is not related to this problem. If you wish to address different
> problem related to non-optimal choice of temperature, then please send a
> separate patch explaining chosen values.

Certain task  tends to run slow in current thermal zone setting.
If needed I will send a patch for this.

>> 2: We should also increase the tips from 4 to 8 to support different
>> cluster of cpu's.
>
> There are 4 CPU thermal zones on Exynos5422. What do you want to expand?

What I meant was to support more trip point to address below.
[    2.776320] exynos-tmu 100a0000.tmu: More trip points than
supported by this TMU.
[    2.782370] exynos-tmu 100a0000.tmu: 2 trip points should be
configured in polling mode.

>> 3: To avoid duplication of cooling-maps we can make tmu sensor work
>> differently for cluster of cpu's
>>       tmu_cpu0: handle pwm-fan control.
>>       tmu_cpu1: handle cpu[0-3] cpufreq mapping.
>>       tmu_cpu2: handle cpu[4-7] cpufreq mapping.
>
> I miss the point behind this. Why fan should work only when CPU4
> (tmu_cpu0) is heated and not CPU5-7 (rest of cpu tmu's)?

Ok fan should work on all the thermal zone. To avoid thermal shutdown.

But I want to avoid scaling down of all the cores of cpu to low freq
as cooling-maps cross the alert temperature. For below example.
----
 map3 {
              trip = <&cpu1_alert3>;
              cooling-device = <&cpu0 0 2>;
 };
 map4 {
              trip = <&cpu1_alert3>;
              cooling-device = <&cpu4 0 2>;
 };

 map5 {
              trip = <&cpu1_alert4>;
              cooling-device = <&cpu0 3 7>;
 };
 map6 {
             trip = <&cpu1_alert4>;
             cooling-device = <&cpu4 3 12>;
 };

What I want to configure thermal zone as.

cpu0_thermal: cpu0-thermal {
     configure cluster of cpu[0-3]
     {
        tips
     }
     cooling map
     {
         device handle cpu[0-3] with frequency scaling at particular
alert temperature.
     }
}

cpu1_thermal: cpu1-thermal {
     configure cluster of cpu[4-7]
     {
        tips
     }
     cooling map
     {
         device handle cpu[4-7] with frequency scaling at particular
alert temperature.
     }
}

We can chose to configure rest of the thermal-zone on this approach.
Please share your thoughts.

Best Regards
-Anand Moon
Krzysztof Kozlowski June 25, 2017, 3:29 p.m. UTC | #5
On Sun, Jun 25, 2017 at 08:48:13PM +0530, Anand Moon wrote:
> Hi Krzystof,
> >> 2: We should also increase the tips from 4 to 8 to support different
> >> cluster of cpu's.
> >
> > There are 4 CPU thermal zones on Exynos5422. What do you want to expand?
> 
> What I meant was to support more trip point to address below.
> [    2.776320] exynos-tmu 100a0000.tmu: More trip points than
> supported by this TMU.
> [    2.782370] exynos-tmu 100a0000.tmu: 2 trip points should be
> configured in polling mode.
> 

I do not understand what you want to achieve. I added 2 trip points in
polling mode for the CPU cooling mode. Just describe the problem and
send the patch - it is the best way to explain one's thought...

> >> 3: To avoid duplication of cooling-maps we can make tmu sensor work
> >> differently for cluster of cpu's
> >>       tmu_cpu0: handle pwm-fan control.
> >>       tmu_cpu1: handle cpu[0-3] cpufreq mapping.
> >>       tmu_cpu2: handle cpu[4-7] cpufreq mapping.
> >
> > I miss the point behind this. Why fan should work only when CPU4
> > (tmu_cpu0) is heated and not CPU5-7 (rest of cpu tmu's)?
> 
> Ok fan should work on all the thermal zone. To avoid thermal shutdown.
> 
> But I want to avoid scaling down of all the cores of cpu to low freq
> as cooling-maps cross the alert temperature. For below example.
> ----
>  map3 {
>               trip = <&cpu1_alert3>;
>               cooling-device = <&cpu0 0 2>;
>  };
>  map4 {
>               trip = <&cpu1_alert3>;
>               cooling-device = <&cpu4 0 2>;
>  };
> 
>  map5 {
>               trip = <&cpu1_alert4>;
>               cooling-device = <&cpu0 3 7>;
>  };
>  map6 {
>              trip = <&cpu1_alert4>;
>              cooling-device = <&cpu4 3 12>;
>  };
> 
> What I want to configure thermal zone as.
> 
> cpu0_thermal: cpu0-thermal {
>      configure cluster of cpu[0-3]
>      {
>         tips
>      }
>      cooling map
>      {
>          device handle cpu[0-3] with frequency scaling at particular
> alert temperature.
>      }
> }

cpu0_thermal is attached to tmu_cpu0 which is the temperature of CPU4
(first big core). I do not see reason behind connecting thermal zone
(thus temperature) of CPU4 with frequency of LITTLE cluster (CPU0-3). In
case of busy CPU4, you will scale down CPU0-3. Does not make sense.

> cpu1_thermal: cpu1-thermal {
>      configure cluster of cpu[4-7]
>      {
>         tips
>      }
>      cooling map
>      {
>          device handle cpu[4-7] with frequency scaling at particular
> alert temperature.
>      }
> }
> 
> We can chose to configure rest of the thermal-zone on this approach.
> Please share your thoughts.

I am sorry, I do not understand the idea, the problem nor the solution.

Best regards,
Krzysztof
Anand Moon June 25, 2017, 4:55 p.m. UTC | #6
Hi Krzysztof

On 25 June 2017 at 20:59, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> On Sun, Jun 25, 2017 at 08:48:13PM +0530, Anand Moon wrote:
>> Hi Krzystof,
>> >> 2: We should also increase the tips from 4 to 8 to support different
>> >> cluster of cpu's.
>> >
>> > There are 4 CPU thermal zones on Exynos5422. What do you want to expand?
>>
>> What I meant was to support more trip point to address below.
>> [    2.776320] exynos-tmu 100a0000.tmu: More trip points than
>> supported by this TMU.
>> [    2.782370] exynos-tmu 100a0000.tmu: 2 trip points should be
>> configured in polling mode.
>>
>
> I do not understand what you want to achieve. I added 2 trip points in
> polling mode for the CPU cooling mode. Just describe the problem and
> send the patch - it is the best way to explain one's thought...
>
>> >> 3: To avoid duplication of cooling-maps we can make tmu sensor work
>> >> differently for cluster of cpu's
>> >>       tmu_cpu0: handle pwm-fan control.
>> >>       tmu_cpu1: handle cpu[0-3] cpufreq mapping.
>> >>       tmu_cpu2: handle cpu[4-7] cpufreq mapping.
>> >
>> > I miss the point behind this. Why fan should work only when CPU4
>> > (tmu_cpu0) is heated and not CPU5-7 (rest of cpu tmu's)?
>>
>> Ok fan should work on all the thermal zone. To avoid thermal shutdown.
>>
>> But I want to avoid scaling down of all the cores of cpu to low freq
>> as cooling-maps cross the alert temperature. For below example.
>> ----
>>  map3 {
>>               trip = <&cpu1_alert3>;
>>               cooling-device = <&cpu0 0 2>;
>>  };
>>  map4 {
>>               trip = <&cpu1_alert3>;
>>               cooling-device = <&cpu4 0 2>;
>>  };
>>
>>  map5 {
>>               trip = <&cpu1_alert4>;
>>               cooling-device = <&cpu0 3 7>;
>>  };
>>  map6 {
>>              trip = <&cpu1_alert4>;
>>              cooling-device = <&cpu4 3 12>;
>>  };
>>
>> What I want to configure thermal zone as.
>>
>> cpu0_thermal: cpu0-thermal {
>>      configure cluster of cpu[0-3]
>>      {
>>         tips
>>      }
>>      cooling map
>>      {
>>          device handle cpu[0-3] with frequency scaling at particular
>> alert temperature.
>>      }
>> }
>
> cpu0_thermal is attached to tmu_cpu0 which is the temperature of CPU4
> (first big core). I do not see reason behind connecting thermal zone
> (thus temperature) of CPU4 with frequency of LITTLE cluster (CPU0-3). In
> case of busy CPU4, you will scale down CPU0-3. Does not make sense.
>

Both the cpu cluster are independent and have different cpu-freqency scaling.
that is the reason to for my changes.

>> cpu1_thermal: cpu1-thermal {
>>      configure cluster of cpu[4-7]
>>      {
>>         tips
>>      }
>>      cooling map
>>      {
>>          device handle cpu[4-7] with frequency scaling at particular
>> alert temperature.
>>      }
>> }
>>
>> We can chose to configure rest of the thermal-zone on this approach.
>> Please share your thoughts.
>
> I am sorry, I do not understand the idea, the problem nor the solution.
>
> Best regards,
> Krzysztof
>

Sorry for not able to express my self in technical terms.
my changes are based on exynos5433-tmu at this moment.
Just attaching small patch to share my point.

Best Regards
-Anand Moon
Krzysztof Kozlowski June 25, 2017, 5:19 p.m. UTC | #7
On Sun, Jun 25, 2017 at 10:25:32PM +0530, Anand Moon wrote:
> Hi Krzysztof
> 
> On 25 June 2017 at 20:59, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > On Sun, Jun 25, 2017 at 08:48:13PM +0530, Anand Moon wrote:
> >> Hi Krzystof,
> >> >> 2: We should also increase the tips from 4 to 8 to support different
> >> >> cluster of cpu's.
> >> >
> >> > There are 4 CPU thermal zones on Exynos5422. What do you want to expand?
> >>
> >> What I meant was to support more trip point to address below.
> >> [    2.776320] exynos-tmu 100a0000.tmu: More trip points than
> >> supported by this TMU.
> >> [    2.782370] exynos-tmu 100a0000.tmu: 2 trip points should be
> >> configured in polling mode.
> >>
> >
> > I do not understand what you want to achieve. I added 2 trip points in
> > polling mode for the CPU cooling mode. Just describe the problem and
> > send the patch - it is the best way to explain one's thought...
> >
> >> >> 3: To avoid duplication of cooling-maps we can make tmu sensor work
> >> >> differently for cluster of cpu's
> >> >>       tmu_cpu0: handle pwm-fan control.
> >> >>       tmu_cpu1: handle cpu[0-3] cpufreq mapping.
> >> >>       tmu_cpu2: handle cpu[4-7] cpufreq mapping.
> >> >
> >> > I miss the point behind this. Why fan should work only when CPU4
> >> > (tmu_cpu0) is heated and not CPU5-7 (rest of cpu tmu's)?
> >>
> >> Ok fan should work on all the thermal zone. To avoid thermal shutdown.
> >>
> >> But I want to avoid scaling down of all the cores of cpu to low freq
> >> as cooling-maps cross the alert temperature. For below example.
> >> ----
> >>  map3 {
> >>               trip = <&cpu1_alert3>;
> >>               cooling-device = <&cpu0 0 2>;
> >>  };
> >>  map4 {
> >>               trip = <&cpu1_alert3>;
> >>               cooling-device = <&cpu4 0 2>;
> >>  };
> >>
> >>  map5 {
> >>               trip = <&cpu1_alert4>;
> >>               cooling-device = <&cpu0 3 7>;
> >>  };
> >>  map6 {
> >>              trip = <&cpu1_alert4>;
> >>              cooling-device = <&cpu4 3 12>;
> >>  };
> >>
> >> What I want to configure thermal zone as.
> >>
> >> cpu0_thermal: cpu0-thermal {
> >>      configure cluster of cpu[0-3]
> >>      {
> >>         tips
> >>      }
> >>      cooling map
> >>      {
> >>          device handle cpu[0-3] with frequency scaling at particular
> >> alert temperature.
> >>      }
> >> }
> >
> > cpu0_thermal is attached to tmu_cpu0 which is the temperature of CPU4
> > (first big core). I do not see reason behind connecting thermal zone
> > (thus temperature) of CPU4 with frequency of LITTLE cluster (CPU0-3). In
> > case of busy CPU4, you will scale down CPU0-3. Does not make sense.
> >
> 
> Both the cpu cluster are independent and have different cpu-freqency scaling.
> that is the reason to for my changes.

Still does not make sense. Big is busy, LITTLE is doing nothing and you
want to scale down LITTLE. No reason.

> 
> >> cpu1_thermal: cpu1-thermal {
> >>      configure cluster of cpu[4-7]
> >>      {
> >>         tips
> >>      }
> >>      cooling map
> >>      {
> >>          device handle cpu[4-7] with frequency scaling at particular
> >> alert temperature.
> >>      }
> >> }
> >>
> >> We can chose to configure rest of the thermal-zone on this approach.
> >> Please share your thoughts.
> >
> > I am sorry, I do not understand the idea, the problem nor the solution.
> >
> > Best regards,
> > Krzysztof
> >
> 
> Sorry for not able to express my self in technical terms.
> my changes are based on exynos5433-tmu at this moment.
> Just attaching small patch to share my point.

On Odroid XU3-family, all TMU interrupt-driven configureable trip points
are configured. Why are you referring to Exynos5433?

Best regards,
Krzysztof
Anand Moon June 25, 2017, 5:59 p.m. UTC | #8
Hi Krzysztof,

On 25 June 2017 at 22:49, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> On Sun, Jun 25, 2017 at 10:25:32PM +0530, Anand Moon wrote:
>> Hi Krzysztof
>>
>> On 25 June 2017 at 20:59, Krzysztof Kozlowski <krzk@kernel.org> wrote:
>> > On Sun, Jun 25, 2017 at 08:48:13PM +0530, Anand Moon wrote:
>> >> Hi Krzystof,
>> >> >> 2: We should also increase the tips from 4 to 8 to support different
>> >> >> cluster of cpu's.
>> >> >
>> >> > There are 4 CPU thermal zones on Exynos5422. What do you want to expand?
>> >>
>> >> What I meant was to support more trip point to address below.
>> >> [    2.776320] exynos-tmu 100a0000.tmu: More trip points than
>> >> supported by this TMU.
>> >> [    2.782370] exynos-tmu 100a0000.tmu: 2 trip points should be
>> >> configured in polling mode.
>> >>
>> >
>> > I do not understand what you want to achieve. I added 2 trip points in
>> > polling mode for the CPU cooling mode. Just describe the problem and
>> > send the patch - it is the best way to explain one's thought...
>> >
>> >> >> 3: To avoid duplication of cooling-maps we can make tmu sensor work
>> >> >> differently for cluster of cpu's
>> >> >>       tmu_cpu0: handle pwm-fan control.
>> >> >>       tmu_cpu1: handle cpu[0-3] cpufreq mapping.
>> >> >>       tmu_cpu2: handle cpu[4-7] cpufreq mapping.
>> >> >
>> >> > I miss the point behind this. Why fan should work only when CPU4
>> >> > (tmu_cpu0) is heated and not CPU5-7 (rest of cpu tmu's)?
>> >>
>> >> Ok fan should work on all the thermal zone. To avoid thermal shutdown.
>> >>
>> >> But I want to avoid scaling down of all the cores of cpu to low freq
>> >> as cooling-maps cross the alert temperature. For below example.
>> >> ----
>> >>  map3 {
>> >>               trip = <&cpu1_alert3>;
>> >>               cooling-device = <&cpu0 0 2>;
>> >>  };
>> >>  map4 {
>> >>               trip = <&cpu1_alert3>;
>> >>               cooling-device = <&cpu4 0 2>;
>> >>  };
>> >>
>> >>  map5 {
>> >>               trip = <&cpu1_alert4>;
>> >>               cooling-device = <&cpu0 3 7>;
>> >>  };
>> >>  map6 {
>> >>              trip = <&cpu1_alert4>;
>> >>              cooling-device = <&cpu4 3 12>;
>> >>  };
>> >>
>> >> What I want to configure thermal zone as.
>> >>
>> >> cpu0_thermal: cpu0-thermal {
>> >>      configure cluster of cpu[0-3]
>> >>      {
>> >>         tips
>> >>      }
>> >>      cooling map
>> >>      {
>> >>          device handle cpu[0-3] with frequency scaling at particular
>> >> alert temperature.
>> >>      }
>> >> }
>> >
>> > cpu0_thermal is attached to tmu_cpu0 which is the temperature of CPU4
>> > (first big core). I do not see reason behind connecting thermal zone
>> > (thus temperature) of CPU4 with frequency of LITTLE cluster (CPU0-3). In
>> > case of busy CPU4, you will scale down CPU0-3. Does not make sense.
>> >
>>
>> Both the cpu cluster are independent and have different cpu-freqency scaling.
>> that is the reason to for my changes.
>
> Still does not make sense. Big is busy, LITTLE is doing nothing and you
> want to scale down LITTLE. No reason.
>
>>
>> >> cpu1_thermal: cpu1-thermal {
>> >>      configure cluster of cpu[4-7]
>> >>      {
>> >>         tips
>> >>      }
>> >>      cooling map
>> >>      {
>> >>          device handle cpu[4-7] with frequency scaling at particular
>> >> alert temperature.
>> >>      }
>> >> }
>> >>
>> >> We can chose to configure rest of the thermal-zone on this approach.
>> >> Please share your thoughts.
>> >
>> > I am sorry, I do not understand the idea, the problem nor the solution.
>> >
>> > Best regards,
>> > Krzysztof
>> >
>>
>> Sorry for not able to express my self in technical terms.
>> my changes are based on exynos5433-tmu at this moment.
>> Just attaching small patch to share my point.
>
> On Odroid XU3-family, all TMU interrupt-driven configureable trip points
> are configured. Why are you referring to Exynos5433?

Because both these ARCH are big.Little.

Best Regards
-Anand
diff mbox

Patch

diff --git a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
index 05b9afdd6757..0759cc0812fb 100644
--- a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
+++ b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
@@ -64,22 +64,22 @@ 
 			polling-delay-passive = <250>;
 			polling-delay = <0>;
 			trips {
-				cpu_alert0: cpu-alert-0 {
+				cpu0_alert0: cpu-alert-0 {
 					temperature = <50000>; /* millicelsius */
 					hysteresis = <5000>; /* millicelsius */
 					type = "active";
 				};
-				cpu_alert1: cpu-alert-1 {
+				cpu0_alert1: cpu-alert-1 {
 					temperature = <60000>; /* millicelsius */
 					hysteresis = <5000>; /* millicelsius */
 					type = "active";
 				};
-				cpu_alert2: cpu-alert-2 {
+				cpu0_alert2: cpu-alert-2 {
 					temperature = <70000>; /* millicelsius */
 					hysteresis = <5000>; /* millicelsius */
 					type = "active";
 				};
-				cpu_crit0: cpu-crit-0 {
+				cpu0_crit0: cpu-crit-0 {
 					temperature = <120000>; /* millicelsius */
 					hysteresis = <0>; /* millicelsius */
 					type = "critical";
@@ -88,14 +88,14 @@ 
 				 * Exynos542x supports only 4 trip-points
 				 * so for these polling mode is required.
 				 * Start polling at temperature level of last
-				 * interrupt-driven trip: cpu_alert2
+				 * interrupt-driven trip: cpu0_alert2
 				 */
-				cpu_alert3: cpu-alert-3 {
+				cpu0_alert3: cpu-alert-3 {
 					temperature = <70000>; /* millicelsius */
 					hysteresis = <10000>; /* millicelsius */
 					type = "passive";
 				};
-				cpu_alert4: cpu-alert-4 {
+				cpu0_alert4: cpu-alert-4 {
 					temperature = <85000>; /* millicelsius */
 					hysteresis = <10000>; /* millicelsius */
 					type = "passive";
@@ -104,43 +104,256 @@ 
 			};
 			cooling-maps {
 				map0 {
-					trip = <&cpu_alert0>;
+					trip = <&cpu0_alert0>;
 					cooling-device = <&fan0 0 1>;
 				};
 				map1 {
-					trip = <&cpu_alert1>;
+					trip = <&cpu0_alert1>;
 					cooling-device = <&fan0 1 2>;
 				};
 				map2 {
-					trip = <&cpu_alert2>;
+					trip = <&cpu0_alert2>;
 					cooling-device = <&fan0 2 3>;
 				};
 				/*
-				 * When reaching cpu_alert3, reduce CPU
+				 * When reaching cpu0_alert3, reduce CPU
 				 * by 2 steps. On Exynos5422/5800 that would
 				 * be: 1600 MHz and 1100 MHz.
 				 */
 				map3 {
-					trip = <&cpu_alert3>;
+					trip = <&cpu0_alert3>;
 					cooling-device = <&cpu0 0 2>;
 				};
 				map4 {
-					trip = <&cpu_alert3>;
+					trip = <&cpu0_alert3>;
 					cooling-device = <&cpu4 0 2>;
 				};
 
 				/*
-				 * When reaching cpu_alert4, reduce CPU
-				 * further, down to 600 MHz (11 steps for big,
+				 * When reaching cpu0_alert4, reduce CPU
+				 * further, down to 600 MHz (12 steps for big,
 				 * 7 steps for LITTLE).
 				 */
 				map5 {
-					trip = <&cpu_alert4>;
+					trip = <&cpu0_alert4>;
 					cooling-device = <&cpu0 3 7>;
 				};
 				map6 {
-					trip = <&cpu_alert4>;
-					cooling-device = <&cpu4 3 11>;
+					trip = <&cpu0_alert4>;
+					cooling-device = <&cpu4 3 12>;
+				};
+			};
+		};
+		cpu1_thermal: cpu1-thermal {
+			thermal-sensors = <&tmu_cpu1 0>;
+			polling-delay-passive = <250>;
+			polling-delay = <0>;
+			trips {
+				cpu1_alert0: cpu-alert-0 {
+					temperature = <50000>;
+					hysteresis = <5000>;
+					type = "active";
+				};
+				cpu1_alert1: cpu-alert-1 {
+					temperature = <60000>;
+					hysteresis = <5000>;
+					type = "active";
+				};
+				cpu1_alert2: cpu-alert-2 {
+					temperature = <70000>;
+					hysteresis = <5000>;
+					type = "active";
+				};
+				cpu1_crit0: cpu-crit-0 {
+					temperature = <120000>;
+					hysteresis = <0>;
+					type = "critical";
+				};
+
+				cpu1_alert3: cpu-alert-3 {
+					temperature = <70000>;
+					hysteresis = <10000>; /* millicelsius */
+					type = "passive";
+				};
+				cpu1_alert4: cpu-alert-4 {
+					temperature = <85000>;
+					hysteresis = <10000>;
+					type = "passive";
+				};
+
+			};
+			cooling-maps {
+				map0 {
+					trip = <&cpu1_alert0>;
+					cooling-device = <&fan0 0 1>;
+				};
+				map1 {
+					trip = <&cpu1_alert1>;
+					cooling-device = <&fan0 1 2>;
+				};
+				map2 {
+					trip = <&cpu1_alert2>;
+					cooling-device = <&fan0 2 3>;
+				};
+
+				map3 {
+					trip = <&cpu1_alert3>;
+					cooling-device = <&cpu0 0 2>;
+				};
+				map4 {
+					trip = <&cpu1_alert3>;
+					cooling-device = <&cpu4 0 2>;
+				};
+
+				map5 {
+					trip = <&cpu1_alert4>;
+					cooling-device = <&cpu0 3 7>;
+				};
+				map6 {
+					trip = <&cpu1_alert4>;
+					cooling-device = <&cpu4 3 12>;
+				};
+			};
+		};
+		cpu2_thermal: cpu2-thermal {
+			thermal-sensors = <&tmu_cpu2 0>;
+			polling-delay-passive = <250>;
+			polling-delay = <0>;
+			trips {
+				cpu2_alert0: cpu-alert-0 {
+					temperature = <50000>;
+					hysteresis = <5000>;
+					type = "active";
+				};
+				cpu2_alert1: cpu-alert-1 {
+					temperature = <60000>;
+					hysteresis = <5000>;
+					type = "active";
+				};
+				cpu2_alert2: cpu-alert-2 {
+					temperature = <70000>;
+					hysteresis = <5000>;
+					type = "active";
+				};
+				cpu2_crit0: cpu-crit-0 {
+					temperature = <120000>;
+					hysteresis = <0>;
+					type = "critical";
+				};
+
+				cpu2_alert3: cpu-alert-3 {
+					temperature = <70000>;
+					hysteresis = <10000>;
+					type = "passive";
+				};
+				cpu2_alert4: cpu-alert-4 {
+					temperature = <85000>;
+					hysteresis = <10000>;
+					type = "passive";
+				};
+
+			};
+			cooling-maps {
+				map0 {
+					trip = <&cpu2_alert0>;
+					cooling-device = <&fan0 0 1>;
+				};
+				map1 {
+					trip = <&cpu2_alert1>;
+					cooling-device = <&fan0 1 2>;
+				};
+				map2 {
+					trip = <&cpu2_alert2>;
+					cooling-device = <&fan0 2 3>;
+				};
+
+				map3 {
+					trip = <&cpu2_alert3>;
+					cooling-device = <&cpu0 0 2>;
+				};
+				map4 {
+					trip = <&cpu2_alert3>;
+					cooling-device = <&cpu4 0 2>;
+				};
+
+				map5 {
+					trip = <&cpu2_alert4>;
+					cooling-device = <&cpu0 3 7>;
+				};
+				map6 {
+					trip = <&cpu2_alert4>;
+					cooling-device = <&cpu4 3 12>;
+				};
+			};
+		};
+		cpu3_thermal: cpu3-thermal {
+			thermal-sensors = <&tmu_cpu3 0>;
+			polling-delay-passive = <250>;
+			polling-delay = <0>;
+			trips {
+				cpu3_alert0: cpu-alert-0 {
+					temperature = <50000>;
+					hysteresis = <5000>;
+					type = "active";
+				};
+				cpu3_alert1: cpu-alert-1 {
+					temperature = <60000>;
+					hysteresis = <5000>;
+					type = "active";
+				};
+				cpu3_alert2: cpu-alert-2 {
+					temperature = <70000>;
+					hysteresis = <5000>;
+					type = "active";
+				};
+				cpu3_crit0: cpu-crit-0 {
+					temperature = <120000>;
+					hysteresis = <0>;
+					type = "critical";
+				};
+
+				cpu3_alert3: cpu-alert-3 {
+					temperature = <70000>;
+					hysteresis = <10000>;
+					type = "passive";
+				};
+				cpu3_alert4: cpu-alert-4 {
+					temperature = <85000>;
+					hysteresis = <10000>;
+					type = "passive";
+				};
+
+			};
+			cooling-maps {
+				map0 {
+					trip = <&cpu3_alert0>;
+					cooling-device = <&fan0 0 1>;
+				};
+				map1 {
+					trip = <&cpu3_alert1>;
+					cooling-device = <&fan0 1 2>;
+				};
+				map2 {
+					trip = <&cpu3_alert2>;
+					cooling-device = <&fan0 2 3>;
+				};
+
+				map3 {
+					trip = <&cpu3_alert3>;
+					cooling-device = <&cpu0 0 2>;
+				};
+				map4 {
+					trip = <&cpu3_alert3>;
+					cooling-device = <&cpu4 0 2>;
+				};
+
+				map5 {
+					trip = <&cpu3_alert4>;
+					cooling-device = <&cpu0 3 7>;
+				};
+				map6 {
+					trip = <&cpu3_alert4>;
+					cooling-device = <&cpu4 3 12>;
 				};
 			};
 		};