Message ID | 20230106164618.1845281-1-vincent.guittot@linaro.org (mailing list archive) |
---|---|
State | Accepted |
Series | arm64: dts: qcom: sdm845: correct dynamic power coefficients |
On Fri, Jan 06, 2023 at 05:46:18PM +0100, Vincent Guittot wrote:

Seems like using get_maintainer.pl would have saved you some trouble ;)

> While stressing EAS on my dragonboard RB3, I have noticed that LITTLE cores
> were never selected as the most energy efficient CPU whatever the
> utilization level of the waking task.
>
> The energy model framework uses its cost field to estimate the energy with
> the formula:
>
>   nrg = cost of the selected OPP * utilization / CPU's max capacity
>
> which ends up selecting the CPU with the lowest cost / max capacity ratio
> as long as the utilization fits in the OPP's capacity.
>
> If we compare the cost of a little OPP with similar capacity of a big OPP
> like :
>         OPP(kHz)  OPP capacity  cost    max capacity  cost/max capacity
> LITTLE  1766400   407           351114  407           863
> big     1056000   408           520267  1024          508
>
> This can be interpreted as the LITTLE core consumes 70% more than big core
> for the same compute capacity.
>
> According to [1], LITTLE consumes 10% less than big core for Coremark
> benchmark at those OPPs. If we consider that everything else stays
> unchanged, the dynamic-power-coefficient of LITTLE core should be
> only 53% of the current value: 290 * 53% = 154
>
> Set the dynamic-power-coefficient of CPU0-3 to 154 to fix the energy model.
>

This sounds reasonable.

But if the math was wrong for SDM845, I would assume that sm8150 and
sm8250 are wrong as well, as that's what 0e0a8e35d725 is based on. And
should I assume that patches for other platforms are off by 53% as well?

Can you help me understand how to arrive at this number? (Without
considering everything else stays unchanged, if needed).

Regards,
Bjorn

> [1] https://github.com/kdrag0n/freqbench/tree/master/results/sdm845/main
>
> Fixes: 0e0a8e35d725 ("arm64: dts: qcom: sdm845: correct dynamic power coefficients")
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  arch/arm64/boot/dts/qcom/sdm845.dtsi | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> [...]
On Fri, 6 Jan 2023 at 19:28, Bjorn Andersson <andersson@kernel.org> wrote:
>
> On Fri, Jan 06, 2023 at 05:46:18PM +0100, Vincent Guittot wrote:
>
> Seems like using get_maintainer.pl would have saved you some trouble ;)

The worst is that I used it but only checked names and not emails
when I reused the list of the original patch :-(

> [...]
>
> This sounds reasonable.
>
> But if the math was wrong for SDM845, I would assume that sm8150 and
> sm8250 are wrong as well, as that's what 0e0a8e35d725 is based on. And
> should I assume that patches for other platforms are off by 53% as well?

I don't think that we can assume that there is an error and in
particular the same 53% error for others.

>
> Can you help me understand how to arrive at this number? (Without
> considering everything else stays unchanged, if needed).

In order to do the full computation, we need the voltage of each OPP
which I don't have as they are provided by the LUT at boot IIUC.
Instead I have used the debugfs output of the energy model and
compared the value of (perf_state->cost/cpu_max_capacity) with the
energy and duration figures available in [1].

In the case of SDM845, it was pretty easy to compare the OPPs of big
and LITTLE because the duration and the perf result were the same for
2 OPPs so we should have :

(little OPP(1766400)->cost / little max capacity (407)) / (big
OPP(1056000)->cost / big max capacity(1024)) = little OPP(1766400)
energy / big OPP(1056000) energy

(little OPP(1766400)->cost / little max capacity (407)) / (big
OPP(1056000)->cost / big max capacity(1024)) = 0.90

but current output gives:

(little OPP(1766400)->cost / little max capacity (407)) / (big
OPP(1056000)->cost / big max capacity(1024)) = 1.70

As we consider everything else constant, it can be simplified by:

correct_little_dynamic-power-coefficient * const_A = 0.90

Whereas we currently have

current_little_dynamic-power-coefficient * const_A = 1.70

and we end up with

correct_little_dynamic-power-coefficient = 0.90 / 1.70 *
current_little_dynamic-power-coefficient = 154

That being said, it can be simpler as the energy model provides the
power figures

little OPP(1766400)->power = 351114 uW
big OPP(1056000)->power = 195991 uW
ratio = 1.79

[1] results gives
little OPP(1766400)->power = 193.281 mW
big OPP(1056000)->power = 216.405 mW
ratio = 0.89

The ratios are a bit different and give a
correct_little_dynamic-power-coefficient = 144 which is different
than when using ->cost. This probably comes from rounding and which
figures have been used to compute the model.

If you have the voltage of each OPP, the formula used in the energy
model is power (uW) = dynamic-power-coefficient * V^2 * Freq (MHz),
with the voltage in volts, so you can compute
dynamic-power-coefficient for each OPP. They should be close
and then you will have to decide which one is the "best"

I don't have access to sm8150 or sm8250 boards but you can use the
same process to check that the energy model is aligned with the
figures in [1]

[1] https://github.com/kdrag0n/freqbench/tree/master/results

Regards,
Vincent

>
> Regards,
> Bjorn
>
> [...]
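Editorial note: the arithmetic in the reply above can be checked with a few lines of Python. This is a minimal sketch and not part of the patch. The function names are hypothetical, the ratios are the rounded values quoted in the email, and the first function assumes, as the email does, that the modelled LITTLE/big ratio scales linearly with the LITTLE coefficient while everything else stays constant. The second function assumes the coefficient is expressed in uW/MHz/V^2 with the voltage in volts; the thread gives no OPP voltages, so no real voltage is plugged in.

```python
def corrected_coefficient(current_coeff, measured_ratio, model_ratio):
    """Scale the coefficient so the modelled LITTLE/big ratio matches the
    measured one, assuming the modelled ratio is proportional to the
    LITTLE dynamic-power-coefficient (everything else held constant)."""
    return current_coeff * measured_ratio / model_ratio


def coefficient_from_opp(power_uw, volts, freq_mhz):
    """Invert power (uW) = coeff * V^2 * f (MHz) for a single OPP.
    Needs the OPP voltage, which is not available in the thread."""
    return power_uw / (volts ** 2 * freq_mhz)


CURRENT_LITTLE_COEFF = 290  # value in sdm845.dtsi before the patch

# Method 1: ratios of cost / max capacity (0.90 measured, 1.70 modelled).
via_cost = corrected_coefficient(CURRENT_LITTLE_COEFF, 0.90, 1.70)

# Method 2: ratios of the power figures (0.89 measured, 1.79 modelled).
via_power = corrected_coefficient(CURRENT_LITTLE_COEFF, 0.89, 1.79)

if __name__ == "__main__":
    print(f"coefficient via cost ratios:  {via_cost:.1f}")
    print(f"coefficient via power ratios: {via_power:.1f}")
```

Rounded to the nearest integer, the two methods give the 154 and 144 mentioned in the email; the patch uses the cost-based value.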
On Mon, Jan 09, 2023 at 06:02:29PM +0100, Vincent Guittot wrote:
> On Fri, 6 Jan 2023 at 19:28, Bjorn Andersson <andersson@kernel.org> wrote:
> >
> > On Fri, Jan 06, 2023 at 05:46:18PM +0100, Vincent Guittot wrote:
> >
> > Seems like using get_maintainer.pl would have saved you some trouble ;)
>
> The worst is that I used it but only checked names and not emails
> when I reused the list of the original patch :-(
>

:)

> > [...]
> >
> > This sounds reasonable.
> >

Dmitry, what do you think about this?

> > But if the math was wrong for SDM845, I would assume that sm8150 and
> > sm8250 are wrong as well, as that's what 0e0a8e35d725 is based on. And
> > should I assume that patches for other platforms are off by 53% as well?
>
> I don't think that we can assume that there is an error and in
> particular the same 53% error for others.
>
> [...]
>
> I don't have access to sm8150 or sm8250 boards but you can use the
> same process to check that the energy model is aligned with the
> figures in [1]
>
> [1] https://github.com/kdrag0n/freqbench/tree/master/results
>
> Regards,
> Vincent

Thanks for the explanation Vincent!

Regards,
Bjorn

> [...]
Hi Bjorn,

On Thu, 19 Jan 2023 at 00:26, Bjorn Andersson <andersson@kernel.org> wrote:
>
> [...]
>
> > > > According to [1], LITTLE consumes 10% less than big core for Coremark
> > > > benchmark at those OPPs. If we consider that everything else stays
> > > > unchanged, the dynamic-power-coefficient of LITTLE core should be
> > > > only 53% of the current value: 290 * 53% = 154
> > > >
> > > > Set the dynamic-power-coefficient of CPU0-3 to 154 to fix the energy model.
> > > >
> > >
> > > This is sounds reasonable.
> > >
>
> Dmitry, what do you think about this?

What is the status for this patch ?
The problem is still present AFAICT

Regards,
Vincent

> [...]
On Fri, 6 Jan 2023 17:46:18 +0100, Vincent Guittot wrote:
> While stressing EAS on my dragonboard RB3, I have noticed that LITTLE cores
> where never selected as the most energy efficient CPU whatever the
> utilization level of waking task.
>
> energy model framework uses its cost field to estimate the energy with
> the formula:
>
> [...]

Applied, thanks!

[1/1] arm64: dts: qcom: sdm845: correct dynamic power coefficients
      commit: 44750f153699b6e4f851a399287e5c8df208d696

Best regards,
diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 65032b94b46d..869bdb9bce6e 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -198,7 +198,7 @@ CPU0: cpu@0 {
 			reg = <0x0 0x0>;
 			enable-method = "psci";
 			capacity-dmips-mhz = <611>;
-			dynamic-power-coefficient = <290>;
+			dynamic-power-coefficient = <154>;
 			qcom,freq-domain = <&cpufreq_hw 0>;
 			operating-points-v2 = <&cpu0_opp_table>;
 			interconnects = <&gladiator_noc MASTER_APPSS_PROC 3 &mem_noc SLAVE_EBI1 3>,
@@ -222,7 +222,7 @@ CPU1: cpu@100 {
 			reg = <0x0 0x100>;
 			enable-method = "psci";
 			capacity-dmips-mhz = <611>;
-			dynamic-power-coefficient = <290>;
+			dynamic-power-coefficient = <154>;
 			qcom,freq-domain = <&cpufreq_hw 0>;
 			operating-points-v2 = <&cpu0_opp_table>;
 			interconnects = <&gladiator_noc MASTER_APPSS_PROC 3 &mem_noc SLAVE_EBI1 3>,
@@ -243,7 +243,7 @@ CPU2: cpu@200 {
 			reg = <0x0 0x200>;
 			enable-method = "psci";
 			capacity-dmips-mhz = <611>;
-			dynamic-power-coefficient = <290>;
+			dynamic-power-coefficient = <154>;
 			qcom,freq-domain = <&cpufreq_hw 0>;
 			operating-points-v2 = <&cpu0_opp_table>;
 			interconnects = <&gladiator_noc MASTER_APPSS_PROC 3 &mem_noc SLAVE_EBI1 3>,
@@ -264,7 +264,7 @@ CPU3: cpu@300 {
 			reg = <0x0 0x300>;
 			enable-method = "psci";
 			capacity-dmips-mhz = <611>;
-			dynamic-power-coefficient = <290>;
+			dynamic-power-coefficient = <154>;
 			qcom,freq-domain = <&cpufreq_hw 0>;
 			operating-points-v2 = <&cpu0_opp_table>;
 			interconnects = <&gladiator_noc MASTER_APPSS_PROC 3 &mem_noc SLAVE_EBI1 3>,
While stressing EAS on my dragonboard RB3, I have noticed that LITTLE cores
were never selected as the most energy efficient CPU whatever the
utilization level of the waking task.

The energy model framework uses its cost field to estimate the energy with
the formula:

  nrg = cost of the selected OPP * utilization / CPU's max capacity

which ends up selecting the CPU with the lowest cost / max capacity ratio
as long as the utilization fits in the OPP's capacity.

If we compare the cost of a little OPP with similar capacity of a big OPP
like :
        OPP(kHz)  OPP capacity  cost    max capacity  cost/max capacity
LITTLE  1766400   407           351114  407           863
big     1056000   408           520267  1024          508

This can be interpreted as the LITTLE core consumes 70% more than big core
for the same compute capacity.

According to [1], LITTLE consumes 10% less than big core for Coremark
benchmark at those OPPs. If we consider that everything else stays
unchanged, the dynamic-power-coefficient of LITTLE core should be
only 53% of the current value: 290 * 53% = 154

Set the dynamic-power-coefficient of CPU0-3 to 154 to fix the energy model.

[1] https://github.com/kdrag0n/freqbench/tree/master/results/sdm845/main

Fixes: 0e0a8e35d725 ("arm64: dts: qcom: sdm845: correct dynamic power coefficients")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
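Editorial note: the comparison in the commit message can be reproduced with a short Python sketch. It is an illustration only, not kernel code. The `Opp` class and the `UTILIZATION` value are hypothetical; the OPP figures are the ones from the table above, and `energy()` implements the `nrg` formula exactly as the commit message states it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opp:
    name: str
    freq_khz: int
    capacity: int      # compute capacity reached at this OPP
    cost: int          # energy model cost of this OPP
    max_capacity: int  # capacity of the CPU at its highest OPP

    def energy(self, utilization):
        """nrg = cost of the selected OPP * utilization / CPU's max capacity."""
        return self.cost * utilization / self.max_capacity


# Figures taken from the table in the commit message.
LITTLE = Opp("LITTLE", 1766400, 407, 351114, 407)
BIG = Opp("big", 1056000, 408, 520267, 1024)

UTILIZATION = 400  # hypothetical task utilization that fits both OPPs

# The utilization cancels out, so this is the cost / max capacity ratio.
ratio = LITTLE.energy(UTILIZATION) / BIG.energy(UTILIZATION)

if __name__ == "__main__":
    for opp in (LITTLE, BIG):
        print(f"{opp.name}: cost/max capacity = {opp.cost / opp.max_capacity:.0f}")
    print(f"LITTLE / big estimated energy ratio: {ratio:.2f}")
```

Because the utilization cancels in the ratio, any task that fits both OPPs is estimated to cost about 70% more on LITTLE with the old coefficient, which is why the LITTLE cores were never picked.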