Message ID | 20170623210913.ugutgop7ratwolfy@inc028000032.lancs.ac.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Jun 23, 2017 at 10:09:14PM +0100, Willy Wolff wrote: > Odroid XU*-familly boards has thermal sensors per A15 cores, but the actual > thermal-zones define only cooling-maps action for cpu0. > > If the application is running on all cores but core4 (first core of the A15 > cluster), the CPU can reach high temperature without any proper cooling > action. Commit msg has weird paragraph indentation. Please write it as regular commit, like others. > > As already discus in prior mail, and on IRC, it's a quit big code > duplication, but I don't found the write way to express that in a better > way. > > The situation for this board is that we have multiple sensors, but > matching cooling devices for these sensors act for the same physical > device (FAN and A15 cluster, as each core of the cluster share the same > frequency). > In fact, of-thermal.c:473:thermal_zone_of_sensor_register() can't use > multiple sensors for one single thermal zone. > This patch follow the path taken in arch/arm/boot/dts/qcom-apq8084.dtsi:97 > > I'm interested to extending the thermal driver, but it will takes time. > So this is a workaround before refactoring the driver. > If somebody knows how to write it better, any advice and suggestions > are more than welcome. Skip all data irrelevant to the issue (like what is your interest, discussions made on LKML or IRC). Just describe exactly what is the problem (and maybe how it can be reproduced) and what is the solution. You can mention what could be the optimal solution and why we cannot use it now. > > Also, the comment for cpu_alert4 in cooling-maps definition is not > accurate, 11 steps for A15 correspond to 700MHz, not 600MHz. > > Signed-off-by: Willy Wolff <willy.mh.wolff@gmail.com> > --- > arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi | 249 +++++++++++++++++++-- > 1 file changed, 231 insertions(+), 18 deletions(-) > > diff --git a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi > index 05b9afdd6757..0759cc0812fb 100644 > --- a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi > +++ b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi > @@ -64,22 +64,22 @@ > polling-delay-passive = <250>; > polling-delay = <0>; > trips { > - cpu_alert0: cpu-alert-0 { > + cpu0_alert0: cpu-alert-0 { > temperature = <50000>; /* millicelsius */ > hysteresis = <5000>; /* millicelsius */ > type = "active"; > }; > - cpu_alert1: cpu-alert-1 { > + cpu0_alert1: cpu-alert-1 { > temperature = <60000>; /* millicelsius */ > hysteresis = <5000>; /* millicelsius */ > type = "active"; > }; > - cpu_alert2: cpu-alert-2 { > + cpu0_alert2: cpu-alert-2 { > temperature = <70000>; /* millicelsius */ > hysteresis = <5000>; /* millicelsius */ > type = "active"; > }; > - cpu_crit0: cpu-crit-0 { > + cpu0_crit0: cpu-crit-0 { > temperature = <120000>; /* millicelsius */ > hysteresis = <0>; /* millicelsius */ > type = "critical"; > @@ -88,14 +88,14 @@ > * Exynos542x supports only 4 trip-points > * so for these polling mode is required. > * Start polling at temperature level of last > - * interrupt-driven trip: cpu_alert2 > + * interrupt-driven trip: cpu0_alert2 > */ > - cpu_alert3: cpu-alert-3 { > + cpu0_alert3: cpu-alert-3 { > temperature = <70000>; /* millicelsius */ > hysteresis = <10000>; /* millicelsius */ > type = "passive"; > }; > - cpu_alert4: cpu-alert-4 { > + cpu0_alert4: cpu-alert-4 { > temperature = <85000>; /* millicelsius */ > hysteresis = <10000>; /* millicelsius */ > type = "passive"; > @@ -104,43 +104,256 @@ > }; > cooling-maps { > map0 { > - trip = <&cpu_alert0>; > + trip = <&cpu0_alert0>; > cooling-device = <&fan0 0 1>; > }; > map1 { > - trip = <&cpu_alert1>; > + trip = <&cpu0_alert1>; > cooling-device = <&fan0 1 2>; > }; > map2 { > - trip = <&cpu_alert2>; > + trip = <&cpu0_alert2>; > cooling-device = <&fan0 2 3>; > }; > /* > - * When reaching cpu_alert3, reduce CPU > + * When reaching cpu0_alert3, reduce CPU > * by 2 steps. On Exynos5422/5800 that would > * be: 1600 MHz and 1100 MHz. > */ > map3 { > - trip = <&cpu_alert3>; > + trip = <&cpu0_alert3>; > cooling-device = <&cpu0 0 2>; > }; > map4 { > - trip = <&cpu_alert3>; > + trip = <&cpu0_alert3>; > cooling-device = <&cpu4 0 2>; > }; > > /* > - * When reaching cpu_alert4, reduce CPU > - * further, down to 600 MHz (11 steps for big, > + * When reaching cpu0_alert4, reduce CPU > + * further, down to 600 MHz (12 steps for big, > * 7 steps for LITTLE). > */ > map5 { > - trip = <&cpu_alert4>; > + trip = <&cpu0_alert4>; > cooling-device = <&cpu0 3 7>; > }; > map6 { > - trip = <&cpu_alert4>; > - cooling-device = <&cpu4 3 11>; > + trip = <&cpu0_alert4>; > + cooling-device = <&cpu4 3 12>; > + }; > + }; > + }; > + cpu1_thermal: cpu1-thermal { > + thermal-sensors = <&tmu_cpu1 0>; > + polling-delay-passive = <250>; > + polling-delay = <0>; > + trips { > + cpu1_alert0: cpu-alert-0 { > + temperature = <50000>; > + hysteresis = <5000>; > + type = "active"; > + }; > + cpu1_alert1: cpu-alert-1 { > + temperature = <60000>; > + hysteresis = <5000>; > + type = "active"; > + }; > + cpu1_alert2: cpu-alert-2 { > + temperature = <70000>; > + hysteresis = <5000>; > + type = "active"; > + }; > + cpu1_crit0: cpu-crit-0 { > + temperature = <120000>; > + hysteresis = <0>; > + type = "critical"; > + }; > + > + cpu1_alert3: cpu-alert-3 { > + temperature = <70000>; > + hysteresis = <10000>; /* millicelsius */ You removed these comments in other cases so this should go away as well. Best regards, Krzysztof
Hi Willy, On 24 June 2017 at 02:39, Willy Wolff <willy.mh.wolff@gmail.com> wrote: > Odroid XU*-familly boards has thermal sensors per A15 cores, but the actual > thermal-zones define only cooling-maps action for cpu0. > > If the application is running on all cores but core4 (first core of the A15 > cluster), the CPU can reach high temperature without any proper cooling > action. > > As already discus in prior mail, and on IRC, it's a quit big code > duplication, but I don't found the write way to express that in a better > way. > > The situation for this board is that we have multiple sensors, but > matching cooling devices for these sensors act for the same physical > device (FAN and A15 cluster, as each core of the cluster share the same > frequency). > In fact, of-thermal.c:473:thermal_zone_of_sensor_register() can't use > multiple sensors for one single thermal zone. > This patch follow the path taken in arch/arm/boot/dts/qcom-apq8084.dtsi:97 > > I'm interested to extending the thermal driver, but it will takes time. > So this is a workaround before refactoring the driver. > If somebody knows how to write it better, any advice and suggestions > are more than welcome. > > Also, the comment for cpu_alert4 in cooling-maps definition is not > accurate, 11 steps for A15 correspond to 700MHz, not 600MHz. > [snip] Few point to from my side. 1: We should also increase the trip points temperature so that it can throttle at high temperature. 2: We should also increase the tips from 4 to 8 to support different cluster of cpu's. 3: To avoid duplication of cooling-maps we can make tmu sensor work differently for cluster of cpu's tmu_cpu0: handle pwm-fan control. tmu_cpu1: handle cpu[0-3] cpufreq mapping. tmu_cpu2: handle cpu[4-7] cpufreq mapping. tmu_gpu: handle gpu (mali) cpu throttle. Please share your thought on this new approach. I generally test using linaro's pm-qa and most of the time the test case fails # git clone git://git.linaro.org/tools/pm-qa.git # make -C thermal check -Best Regards -Anand Moon
On Sun, Jun 25, 2017 at 06:53:24PM +0530, Anand Moon wrote: > Hi Willy, > > On 24 June 2017 at 02:39, Willy Wolff <willy.mh.wolff@gmail.com> wrote: > > Odroid XU*-familly boards has thermal sensors per A15 cores, but the actual > > thermal-zones define only cooling-maps action for cpu0. > > > > If the application is running on all cores but core4 (first core of the A15 > > cluster), the CPU can reach high temperature without any proper cooling > > action. > > > > As already discus in prior mail, and on IRC, it's a quit big code > > duplication, but I don't found the write way to express that in a better > > way. > > > > The situation for this board is that we have multiple sensors, but > > matching cooling devices for these sensors act for the same physical > > device (FAN and A15 cluster, as each core of the cluster share the same > > frequency). > > In fact, of-thermal.c:473:thermal_zone_of_sensor_register() can't use > > multiple sensors for one single thermal zone. > > This patch follow the path taken in arch/arm/boot/dts/qcom-apq8084.dtsi:97 > > > > I'm interested to extending the thermal driver, but it will takes time. > > So this is a workaround before refactoring the driver. > > If somebody knows how to write it better, any advice and suggestions > > are more than welcome. > > > > Also, the comment for cpu_alert4 in cooling-maps definition is not > > accurate, 11 steps for A15 correspond to 700MHz, not 600MHz. > > > > [snip] > > Few point to from my side. > > 1: We should also increase the trip points temperature so that it can > throttle at high temperature. It is not related to this problem. If you wish to address different problem related to non-optimal choice of temperature, then please send a separate patch explaining chosen values. > 2: We should also increase the tips from 4 to 8 to support different > cluster of cpu's. There are 4 CPU thermal zones on Exynos5422. What do you want to expand? > 3: To avoid duplication of cooling-maps we can make tmu sensor work > differently for cluster of cpu's > tmu_cpu0: handle pwm-fan control. > tmu_cpu1: handle cpu[0-3] cpufreq mapping. > tmu_cpu2: handle cpu[4-7] cpufreq mapping. I miss the point behind this. Why fan should work only when CPU4 (tmu_cpu0) is heated and not CPU5-7 (rest of cpu tmu's)? Best regards, Krzysztof > tmu_gpu: handle gpu (mali) cpu throttle. > > Please share your thought on this new approach. > > I generally test using linaro's pm-qa and most of the time the test case fails > # git clone git://git.linaro.org/tools/pm-qa.git > # make -C thermal check > > -Best Regards > -Anand Moon
Hi Krzystof, On 25 June 2017 at 19:01, Krzysztof Kozlowski <krzk@kernel.org> wrote: > On Sun, Jun 25, 2017 at 06:53:24PM +0530, Anand Moon wrote: >> Hi Willy, >> >> On 24 June 2017 at 02:39, Willy Wolff <willy.mh.wolff@gmail.com> wrote: >> > Odroid XU*-familly boards has thermal sensors per A15 cores, but the actual >> > thermal-zones define only cooling-maps action for cpu0. >> > >> > If the application is running on all cores but core4 (first core of the A15 >> > cluster), the CPU can reach high temperature without any proper cooling >> > action. >> > >> > As already discus in prior mail, and on IRC, it's a quit big code >> > duplication, but I don't found the write way to express that in a better >> > way. >> > >> > The situation for this board is that we have multiple sensors, but >> > matching cooling devices for these sensors act for the same physical >> > device (FAN and A15 cluster, as each core of the cluster share the same >> > frequency). >> > In fact, of-thermal.c:473:thermal_zone_of_sensor_register() can't use >> > multiple sensors for one single thermal zone. >> > This patch follow the path taken in arch/arm/boot/dts/qcom-apq8084.dtsi:97 >> > >> > I'm interested to extending the thermal driver, but it will takes time. >> > So this is a workaround before refactoring the driver. >> > If somebody knows how to write it better, any advice and suggestions >> > are more than welcome. >> > >> > Also, the comment for cpu_alert4 in cooling-maps definition is not >> > accurate, 11 steps for A15 correspond to 700MHz, not 600MHz. >> > >> >> [snip] >> >> Few point to from my side. >> >> 1: We should also increase the trip points temperature so that it can >> throttle at high temperature. > > It is not related to this problem. If you wish to address different > problem related to non-optimal choice of temperature, then please send a > separate patch explaining chosen values. Certain task tends to run slow in current thermal zone setting. If needed I will send a patch for this. >> 2: We should also increase the tips from 4 to 8 to support different >> cluster of cpu's. > > There are 4 CPU thermal zones on Exynos5422. What do you want to expand? What I meant was to support more trip point to address below. [ 2.776320] exynos-tmu 100a0000.tmu: More trip points than supported by this TMU. [ 2.782370] exynos-tmu 100a0000.tmu: 2 trip points should be configured in polling mode. >> 3: To avoid duplication of cooling-maps we can make tmu sensor work >> differently for cluster of cpu's >> tmu_cpu0: handle pwm-fan control. >> tmu_cpu1: handle cpu[0-3] cpufreq mapping. >> tmu_cpu2: handle cpu[4-7] cpufreq mapping. > > I miss the point behind this. Why fan should work only when CPU4 > (tmu_cpu0) is heated and not CPU5-7 (rest of cpu tmu's)? Ok fan should work on all the thermal zone. To avoid thermal shutdown. But I want to avoid scaling down of all the cores of cpu to low freq as cooling-maps cross the alert temperature. For below example. ---- map3 { trip = <&cpu1_alert3>; cooling-device = <&cpu0 0 2>; }; map4 { trip = <&cpu1_alert3>; cooling-device = <&cpu4 0 2>; }; map5 { trip = <&cpu1_alert4>; cooling-device = <&cpu0 3 7>; }; map6 { trip = <&cpu1_alert4>; cooling-device = <&cpu4 3 12>; }; What I want to configure thermal zone as. cpu0_thermal: cpu0-thermal { configure cluster of cpu[0-3] { tips } cooling map { device handle cpu[0-3] with frequency scaling at particular alert temperature. } } cpu1_thermal: cpu1-thermal { configure cluster of cpu[4-7] { tips } cooling map { device handle cpu[4-7] with frequency scaling at particular alert temperature. } } We can chose to configure rest of the thermal-zone on this approach. Please share your thoughts. Best Regards -Anand Moon
On Sun, Jun 25, 2017 at 08:48:13PM +0530, Anand Moon wrote: > Hi Krzystof, > >> 2: We should also increase the tips from 4 to 8 to support different > >> cluster of cpu's. > > > > There are 4 CPU thermal zones on Exynos5422. What do you want to expand? > > What I meant was to support more trip point to address below. > [ 2.776320] exynos-tmu 100a0000.tmu: More trip points than > supported by this TMU. > [ 2.782370] exynos-tmu 100a0000.tmu: 2 trip points should be > configured in polling mode. > I do not understand what you want to achieve. I added 2 trip points in polling mode for the CPU cooling mode. Just describe the problem and send the patch - it is the best way to explain one's thought... > >> 3: To avoid duplication of cooling-maps we can make tmu sensor work > >> differently for cluster of cpu's > >> tmu_cpu0: handle pwm-fan control. > >> tmu_cpu1: handle cpu[0-3] cpufreq mapping. > >> tmu_cpu2: handle cpu[4-7] cpufreq mapping. > > > > I miss the point behind this. Why fan should work only when CPU4 > > (tmu_cpu0) is heated and not CPU5-7 (rest of cpu tmu's)? > > Ok fan should work on all the thermal zone. To avoid thermal shutdown. > > But I want to avoid scaling down of all the cores of cpu to low freq > as cooling-maps cross the alert temperature. For below example. > ---- > map3 { > trip = <&cpu1_alert3>; > cooling-device = <&cpu0 0 2>; > }; > map4 { > trip = <&cpu1_alert3>; > cooling-device = <&cpu4 0 2>; > }; > > map5 { > trip = <&cpu1_alert4>; > cooling-device = <&cpu0 3 7>; > }; > map6 { > trip = <&cpu1_alert4>; > cooling-device = <&cpu4 3 12>; > }; > > What I want to configure thermal zone as. > > cpu0_thermal: cpu0-thermal { > configure cluster of cpu[0-3] > { > tips > } > cooling map > { > device handle cpu[0-3] with frequency scaling at particular > alert temperature. > } > } cpu0_thermal is attached to tmu_cpu0 which is the temperature of CPU4 (first big core). I do not see reason behind connecting thermal zone (thus temperature) of CPU4 with frequency of LITTLE cluster (CPU0-3). In case of busy CPU4, you will scale down CPU0-3. Does not make sense. > cpu1_thermal: cpu1-thermal { > configure cluster of cpu[4-7] > { > tips > } > cooling map > { > device handle cpu[4-7] with frequency scaling at particular > alert temperature. > } > } > > We can chose to configure rest of the thermal-zone on this approach. > Please share your thoughts. I am sorry, I do not understand the idea, the problem nor the solution. Best regards, Krzysztof
Hi Krzysztof On 25 June 2017 at 20:59, Krzysztof Kozlowski <krzk@kernel.org> wrote: > On Sun, Jun 25, 2017 at 08:48:13PM +0530, Anand Moon wrote: >> Hi Krzystof, >> >> 2: We should also increase the tips from 4 to 8 to support different >> >> cluster of cpu's. >> > >> > There are 4 CPU thermal zones on Exynos5422. What do you want to expand? >> >> What I meant was to support more trip point to address below. >> [ 2.776320] exynos-tmu 100a0000.tmu: More trip points than >> supported by this TMU. >> [ 2.782370] exynos-tmu 100a0000.tmu: 2 trip points should be >> configured in polling mode. >> > > I do not understand what you want to achieve. I added 2 trip points in > polling mode for the CPU cooling mode. Just describe the problem and > send the patch - it is the best way to explain one's thought... > >> >> 3: To avoid duplication of cooling-maps we can make tmu sensor work >> >> differently for cluster of cpu's >> >> tmu_cpu0: handle pwm-fan control. >> >> tmu_cpu1: handle cpu[0-3] cpufreq mapping. >> >> tmu_cpu2: handle cpu[4-7] cpufreq mapping. >> > >> > I miss the point behind this. Why fan should work only when CPU4 >> > (tmu_cpu0) is heated and not CPU5-7 (rest of cpu tmu's)? >> >> Ok fan should work on all the thermal zone. To avoid thermal shutdown. >> >> But I want to avoid scaling down of all the cores of cpu to low freq >> as cooling-maps cross the alert temperature. For below example. >> ---- >> map3 { >> trip = <&cpu1_alert3>; >> cooling-device = <&cpu0 0 2>; >> }; >> map4 { >> trip = <&cpu1_alert3>; >> cooling-device = <&cpu4 0 2>; >> }; >> >> map5 { >> trip = <&cpu1_alert4>; >> cooling-device = <&cpu0 3 7>; >> }; >> map6 { >> trip = <&cpu1_alert4>; >> cooling-device = <&cpu4 3 12>; >> }; >> >> What I want to configure thermal zone as. >> >> cpu0_thermal: cpu0-thermal { >> configure cluster of cpu[0-3] >> { >> tips >> } >> cooling map >> { >> device handle cpu[0-3] with frequency scaling at particular >> alert temperature. >> } >> } > > cpu0_thermal is attached to tmu_cpu0 which is the temperature of CPU4 > (first big core). I do not see reason behind connecting thermal zone > (thus temperature) of CPU4 with frequency of LITTLE cluster (CPU0-3). In > case of busy CPU4, you will scale down CPU0-3. Does not make sense. > Both the cpu cluster are independent and have different cpu-freqency scaling. that is the reason to for my changes. >> cpu1_thermal: cpu1-thermal { >> configure cluster of cpu[4-7] >> { >> tips >> } >> cooling map >> { >> device handle cpu[4-7] with frequency scaling at particular >> alert temperature. >> } >> } >> >> We can chose to configure rest of the thermal-zone on this approach. >> Please share your thoughts. > > I am sorry, I do not understand the idea, the problem nor the solution. > > Best regards, > Krzysztof > Sorry for not able to express my self in technical terms. my changes are based on exynos5433-tmu at this moment. Just attaching small patch to share my point. Best Regards -Anand Moon
On Sun, Jun 25, 2017 at 10:25:32PM +0530, Anand Moon wrote: > Hi Krzysztof > > On 25 June 2017 at 20:59, Krzysztof Kozlowski <krzk@kernel.org> wrote: > > On Sun, Jun 25, 2017 at 08:48:13PM +0530, Anand Moon wrote: > >> Hi Krzystof, > >> >> 2: We should also increase the tips from 4 to 8 to support different > >> >> cluster of cpu's. > >> > > >> > There are 4 CPU thermal zones on Exynos5422. What do you want to expand? > >> > >> What I meant was to support more trip point to address below. > >> [ 2.776320] exynos-tmu 100a0000.tmu: More trip points than > >> supported by this TMU. > >> [ 2.782370] exynos-tmu 100a0000.tmu: 2 trip points should be > >> configured in polling mode. > >> > > > > I do not understand what you want to achieve. I added 2 trip points in > > polling mode for the CPU cooling mode. Just describe the problem and > > send the patch - it is the best way to explain one's thought... > > > >> >> 3: To avoid duplication of cooling-maps we can make tmu sensor work > >> >> differently for cluster of cpu's > >> >> tmu_cpu0: handle pwm-fan control. > >> >> tmu_cpu1: handle cpu[0-3] cpufreq mapping. > >> >> tmu_cpu2: handle cpu[4-7] cpufreq mapping. > >> > > >> > I miss the point behind this. Why fan should work only when CPU4 > >> > (tmu_cpu0) is heated and not CPU5-7 (rest of cpu tmu's)? > >> > >> Ok fan should work on all the thermal zone. To avoid thermal shutdown. > >> > >> But I want to avoid scaling down of all the cores of cpu to low freq > >> as cooling-maps cross the alert temperature. For below example. > >> ---- > >> map3 { > >> trip = <&cpu1_alert3>; > >> cooling-device = <&cpu0 0 2>; > >> }; > >> map4 { > >> trip = <&cpu1_alert3>; > >> cooling-device = <&cpu4 0 2>; > >> }; > >> > >> map5 { > >> trip = <&cpu1_alert4>; > >> cooling-device = <&cpu0 3 7>; > >> }; > >> map6 { > >> trip = <&cpu1_alert4>; > >> cooling-device = <&cpu4 3 12>; > >> }; > >> > >> What I want to configure thermal zone as. > >> > >> cpu0_thermal: cpu0-thermal { > >> configure cluster of cpu[0-3] > >> { > >> tips > >> } > >> cooling map > >> { > >> device handle cpu[0-3] with frequency scaling at particular > >> alert temperature. > >> } > >> } > > > > cpu0_thermal is attached to tmu_cpu0 which is the temperature of CPU4 > > (first big core). I do not see reason behind connecting thermal zone > > (thus temperature) of CPU4 with frequency of LITTLE cluster (CPU0-3). In > > case of busy CPU4, you will scale down CPU0-3. Does not make sense. > > > > Both the cpu cluster are independent and have different cpu-freqency scaling. > that is the reason to for my changes. Still does not make sense. Big is busy, LITTLE is doing nothing and you want to scale down LITTLE. No reason. > > >> cpu1_thermal: cpu1-thermal { > >> configure cluster of cpu[4-7] > >> { > >> tips > >> } > >> cooling map > >> { > >> device handle cpu[4-7] with frequency scaling at particular > >> alert temperature. > >> } > >> } > >> > >> We can chose to configure rest of the thermal-zone on this approach. > >> Please share your thoughts. > > > > I am sorry, I do not understand the idea, the problem nor the solution. > > > > Best regards, > > Krzysztof > > > > Sorry for not able to express my self in technical terms. > my changes are based on exynos5433-tmu at this moment. > Just attaching small patch to share my point. On Odroid XU3-family, all TMU interrupt-driven configureable trip points are configured. Why are you referring to Exynos5433? Best regards, Krzysztof
Hi Krzysztof, On 25 June 2017 at 22:49, Krzysztof Kozlowski <krzk@kernel.org> wrote: > On Sun, Jun 25, 2017 at 10:25:32PM +0530, Anand Moon wrote: >> Hi Krzysztof >> >> On 25 June 2017 at 20:59, Krzysztof Kozlowski <krzk@kernel.org> wrote: >> > On Sun, Jun 25, 2017 at 08:48:13PM +0530, Anand Moon wrote: >> >> Hi Krzystof, >> >> >> 2: We should also increase the tips from 4 to 8 to support different >> >> >> cluster of cpu's. >> >> > >> >> > There are 4 CPU thermal zones on Exynos5422. What do you want to expand? >> >> >> >> What I meant was to support more trip point to address below. >> >> [ 2.776320] exynos-tmu 100a0000.tmu: More trip points than >> >> supported by this TMU. >> >> [ 2.782370] exynos-tmu 100a0000.tmu: 2 trip points should be >> >> configured in polling mode. >> >> >> > >> > I do not understand what you want to achieve. I added 2 trip points in >> > polling mode for the CPU cooling mode. Just describe the problem and >> > send the patch - it is the best way to explain one's thought... >> > >> >> >> 3: To avoid duplication of cooling-maps we can make tmu sensor work >> >> >> differently for cluster of cpu's >> >> >> tmu_cpu0: handle pwm-fan control. >> >> >> tmu_cpu1: handle cpu[0-3] cpufreq mapping. >> >> >> tmu_cpu2: handle cpu[4-7] cpufreq mapping. >> >> > >> >> > I miss the point behind this. Why fan should work only when CPU4 >> >> > (tmu_cpu0) is heated and not CPU5-7 (rest of cpu tmu's)? >> >> >> >> Ok fan should work on all the thermal zone. To avoid thermal shutdown. >> >> >> >> But I want to avoid scaling down of all the cores of cpu to low freq >> >> as cooling-maps cross the alert temperature. For below example. >> >> ---- >> >> map3 { >> >> trip = <&cpu1_alert3>; >> >> cooling-device = <&cpu0 0 2>; >> >> }; >> >> map4 { >> >> trip = <&cpu1_alert3>; >> >> cooling-device = <&cpu4 0 2>; >> >> }; >> >> >> >> map5 { >> >> trip = <&cpu1_alert4>; >> >> cooling-device = <&cpu0 3 7>; >> >> }; >> >> map6 { >> >> trip = <&cpu1_alert4>; >> >> cooling-device = <&cpu4 3 12>; >> >> }; >> >> >> >> What I want to configure thermal zone as. >> >> >> >> cpu0_thermal: cpu0-thermal { >> >> configure cluster of cpu[0-3] >> >> { >> >> tips >> >> } >> >> cooling map >> >> { >> >> device handle cpu[0-3] with frequency scaling at particular >> >> alert temperature. >> >> } >> >> } >> > >> > cpu0_thermal is attached to tmu_cpu0 which is the temperature of CPU4 >> > (first big core). I do not see reason behind connecting thermal zone >> > (thus temperature) of CPU4 with frequency of LITTLE cluster (CPU0-3). In >> > case of busy CPU4, you will scale down CPU0-3. Does not make sense. >> > >> >> Both the cpu cluster are independent and have different cpu-freqency scaling. >> that is the reason to for my changes. > > Still does not make sense. Big is busy, LITTLE is doing nothing and you > want to scale down LITTLE. No reason. > >> >> >> cpu1_thermal: cpu1-thermal { >> >> configure cluster of cpu[4-7] >> >> { >> >> tips >> >> } >> >> cooling map >> >> { >> >> device handle cpu[4-7] with frequency scaling at particular >> >> alert temperature. >> >> } >> >> } >> >> >> >> We can chose to configure rest of the thermal-zone on this approach. >> >> Please share your thoughts. >> > >> > I am sorry, I do not understand the idea, the problem nor the solution. >> > >> > Best regards, >> > Krzysztof >> > >> >> Sorry for not able to express my self in technical terms. >> my changes are based on exynos5433-tmu at this moment. >> Just attaching small patch to share my point. > > On Odroid XU3-family, all TMU interrupt-driven configureable trip points > are configured. Why are you referring to Exynos5433? Because both these ARCH are big.Little. Best Regards -Anand
diff --git a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi index 05b9afdd6757..0759cc0812fb 100644 --- a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi +++ b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi @@ -64,22 +64,22 @@ polling-delay-passive = <250>; polling-delay = <0>; trips { - cpu_alert0: cpu-alert-0 { + cpu0_alert0: cpu-alert-0 { temperature = <50000>; /* millicelsius */ hysteresis = <5000>; /* millicelsius */ type = "active"; }; - cpu_alert1: cpu-alert-1 { + cpu0_alert1: cpu-alert-1 { temperature = <60000>; /* millicelsius */ hysteresis = <5000>; /* millicelsius */ type = "active"; }; - cpu_alert2: cpu-alert-2 { + cpu0_alert2: cpu-alert-2 { temperature = <70000>; /* millicelsius */ hysteresis = <5000>; /* millicelsius */ type = "active"; }; - cpu_crit0: cpu-crit-0 { + cpu0_crit0: cpu-crit-0 { temperature = <120000>; /* millicelsius */ hysteresis = <0>; /* millicelsius */ type = "critical"; @@ -88,14 +88,14 @@ * Exynos542x supports only 4 trip-points * so for these polling mode is required. * Start polling at temperature level of last - * interrupt-driven trip: cpu_alert2 + * interrupt-driven trip: cpu0_alert2 */ - cpu_alert3: cpu-alert-3 { + cpu0_alert3: cpu-alert-3 { temperature = <70000>; /* millicelsius */ hysteresis = <10000>; /* millicelsius */ type = "passive"; }; - cpu_alert4: cpu-alert-4 { + cpu0_alert4: cpu-alert-4 { temperature = <85000>; /* millicelsius */ hysteresis = <10000>; /* millicelsius */ type = "passive"; @@ -104,43 +104,256 @@ }; cooling-maps { map0 { - trip = <&cpu_alert0>; + trip = <&cpu0_alert0>; cooling-device = <&fan0 0 1>; }; map1 { - trip = <&cpu_alert1>; + trip = <&cpu0_alert1>; cooling-device = <&fan0 1 2>; }; map2 { - trip = <&cpu_alert2>; + trip = <&cpu0_alert2>; cooling-device = <&fan0 2 3>; }; /* - * When reaching cpu_alert3, reduce CPU + * When reaching cpu0_alert3, reduce CPU * by 2 steps. On Exynos5422/5800 that would * be: 1600 MHz and 1100 MHz. */ map3 { - trip = <&cpu_alert3>; + trip = <&cpu0_alert3>; cooling-device = <&cpu0 0 2>; }; map4 { - trip = <&cpu_alert3>; + trip = <&cpu0_alert3>; cooling-device = <&cpu4 0 2>; }; /* - * When reaching cpu_alert4, reduce CPU - * further, down to 600 MHz (11 steps for big, + * When reaching cpu0_alert4, reduce CPU + * further, down to 600 MHz (12 steps for big, * 7 steps for LITTLE). */ map5 { - trip = <&cpu_alert4>; + trip = <&cpu0_alert4>; cooling-device = <&cpu0 3 7>; }; map6 { - trip = <&cpu_alert4>; - cooling-device = <&cpu4 3 11>; + trip = <&cpu0_alert4>; + cooling-device = <&cpu4 3 12>; + }; + }; + }; + cpu1_thermal: cpu1-thermal { + thermal-sensors = <&tmu_cpu1 0>; + polling-delay-passive = <250>; + polling-delay = <0>; + trips { + cpu1_alert0: cpu-alert-0 { + temperature = <50000>; + hysteresis = <5000>; + type = "active"; + }; + cpu1_alert1: cpu-alert-1 { + temperature = <60000>; + hysteresis = <5000>; + type = "active"; + }; + cpu1_alert2: cpu-alert-2 { + temperature = <70000>; + hysteresis = <5000>; + type = "active"; + }; + cpu1_crit0: cpu-crit-0 { + temperature = <120000>; + hysteresis = <0>; + type = "critical"; + }; + + cpu1_alert3: cpu-alert-3 { + temperature = <70000>; + hysteresis = <10000>; /* millicelsius */ + type = "passive"; + }; + cpu1_alert4: cpu-alert-4 { + temperature = <85000>; + hysteresis = <10000>; + type = "passive"; + }; + + }; + cooling-maps { + map0 { + trip = <&cpu1_alert0>; + cooling-device = <&fan0 0 1>; + }; + map1 { + trip = <&cpu1_alert1>; + cooling-device = <&fan0 1 2>; + }; + map2 { + trip = <&cpu1_alert2>; + cooling-device = <&fan0 2 3>; + }; + + map3 { + trip = <&cpu1_alert3>; + cooling-device = <&cpu0 0 2>; + }; + map4 { + trip = <&cpu1_alert3>; + cooling-device = <&cpu4 0 2>; + }; + + map5 { + trip = <&cpu1_alert4>; + cooling-device = <&cpu0 3 7>; + }; + map6 { + trip = <&cpu1_alert4>; + cooling-device = <&cpu4 3 12>; + }; + }; + }; + cpu2_thermal: cpu2-thermal { + thermal-sensors = <&tmu_cpu2 0>; + polling-delay-passive = <250>; + polling-delay = <0>; + trips { + cpu2_alert0: cpu-alert-0 { + temperature = <50000>; + hysteresis = <5000>; + type = "active"; + }; + cpu2_alert1: cpu-alert-1 { + temperature = <60000>; + hysteresis = <5000>; + type = "active"; + }; + cpu2_alert2: cpu-alert-2 { + temperature = <70000>; + hysteresis = <5000>; + type = "active"; + }; + cpu2_crit0: cpu-crit-0 { + temperature = <120000>; + hysteresis = <0>; + type = "critical"; + }; + + cpu2_alert3: cpu-alert-3 { + temperature = <70000>; + hysteresis = <10000>; + type = "passive"; + }; + cpu2_alert4: cpu-alert-4 { + temperature = <85000>; + hysteresis = <10000>; + type = "passive"; + }; + + }; + cooling-maps { + map0 { + trip = <&cpu2_alert0>; + cooling-device = <&fan0 0 1>; + }; + map1 { + trip = <&cpu2_alert1>; + cooling-device = <&fan0 1 2>; + }; + map2 { + trip = <&cpu2_alert2>; + cooling-device = <&fan0 2 3>; + }; + + map3 { + trip = <&cpu2_alert3>; + cooling-device = <&cpu0 0 2>; + }; + map4 { + trip = <&cpu2_alert3>; + cooling-device = <&cpu4 0 2>; + }; + + map5 { + trip = <&cpu2_alert4>; + cooling-device = <&cpu0 3 7>; + }; + map6 { + trip = <&cpu2_alert4>; + cooling-device = <&cpu4 3 12>; + }; + }; + }; + cpu3_thermal: cpu3-thermal { + thermal-sensors = <&tmu_cpu3 0>; + polling-delay-passive = <250>; + polling-delay = <0>; + trips { + cpu3_alert0: cpu-alert-0 { + temperature = <50000>; + hysteresis = <5000>; + type = "active"; + }; + cpu3_alert1: cpu-alert-1 { + temperature = <60000>; + hysteresis = <5000>; + type = "active"; + }; + cpu3_alert2: cpu-alert-2 { + temperature = <70000>; + hysteresis = <5000>; + type = "active"; + }; + cpu3_crit0: cpu-crit-0 { + temperature = <120000>; + hysteresis = <0>; + type = "critical"; + }; + + cpu3_alert3: cpu-alert-3 { + temperature = <70000>; + hysteresis = <10000>; + type = "passive"; + }; + cpu3_alert4: cpu-alert-4 { + temperature = <85000>; + hysteresis = <10000>; + type = "passive"; + }; + + }; + cooling-maps { + map0 { + trip = <&cpu3_alert0>; + cooling-device = <&fan0 0 1>; + }; + map1 { + trip = <&cpu3_alert1>; + cooling-device = <&fan0 1 2>; + }; + map2 { + trip = <&cpu3_alert2>; + cooling-device = <&fan0 2 3>; + }; + + map3 { + trip = <&cpu3_alert3>; + cooling-device = <&cpu0 0 2>; + }; + map4 { + trip = <&cpu3_alert3>; + cooling-device = <&cpu4 0 2>; + }; + + map5 { + trip = <&cpu3_alert4>; + cooling-device = <&cpu0 3 7>; + }; + map6 { + trip = <&cpu3_alert4>; + cooling-device = <&cpu4 3 12>; }; }; };
Odroid XU*-familly boards has thermal sensors per A15 cores, but the actual thermal-zones define only cooling-maps action for cpu0. If the application is running on all cores but core4 (first core of the A15 cluster), the CPU can reach high temperature without any proper cooling action. As already discus in prior mail, and on IRC, it's a quit big code duplication, but I don't found the write way to express that in a better way. The situation for this board is that we have multiple sensors, but matching cooling devices for these sensors act for the same physical device (FAN and A15 cluster, as each core of the cluster share the same frequency). In fact, of-thermal.c:473:thermal_zone_of_sensor_register() can't use multiple sensors for one single thermal zone. This patch follow the path taken in arch/arm/boot/dts/qcom-apq8084.dtsi:97 I'm interested to extending the thermal driver, but it will takes time. So this is a workaround before refactoring the driver. If somebody knows how to write it better, any advice and suggestions are more than welcome. Also, the comment for cpu_alert4 in cooling-maps definition is not accurate, 11 steps for A15 correspond to 700MHz, not 600MHz. Signed-off-by: Willy Wolff <willy.mh.wolff@gmail.com> --- arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi | 249 +++++++++++++++++++-- 1 file changed, 231 insertions(+), 18 deletions(-)