[4/4] energy_model: use a fixed reference frequency

Message ID 20230901130312.247719-5-vincent.guittot@linaro.org (mailing list archive)
State Superseded
Series consolidate and cleanup CPU capacity

Checks

Context Check Description
conchuod/cover_letter success Series has a cover letter
conchuod/tree_selection success Guessed tree name to be for-next at HEAD 9a1d204f5c57
conchuod/fixes_present success Fixes tag not required for -next series
conchuod/maintainers_pattern success MAINTAINERS pattern errors before the patch: 2 and now 2
conchuod/verify_signedoff success Signed-off-by tag matches author and committer
conchuod/kdoc success Errors and warnings before: 0 this patch: 0
conchuod/build_rv64_clang_allmodconfig success Errors and warnings before: 2344 this patch: 2344
conchuod/module_param success Was 0 now: 0
conchuod/build_rv64_gcc_allmodconfig success Errors and warnings before: 940 this patch: 940
conchuod/build_rv32_defconfig success Build OK
conchuod/dtb_warn_rv64 success Errors and warnings before: 39 this patch: 39
conchuod/header_inline success No static functions without inline keyword in header files
conchuod/checkpatch success total: 0 errors, 0 warnings, 0 checks, 41 lines checked
conchuod/build_rv64_nommu_k210_defconfig success Build OK
conchuod/verify_fixes success No Fixes tag
conchuod/build_rv64_nommu_virt_defconfig success Build OK

Commit Message

Vincent Guittot Sept. 1, 2023, 1:03 p.m. UTC
The last item of a performance domain is not always the performance point
that has been used to compute the CPU's capacity. This can lead to a different
target frequency compared with other parts of the system, like schedutil, and
would result in a wrong energy estimation.

A new arch_scale_freq_ref() is available to return a fixed and coherent
frequency reference that can be used when computing the CPU's frequency
for a level of utilization. Use this function when available, or fall back
to the last performance domain item otherwise.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 include/linux/energy_model.h | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)
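
The mismatch described in the commit message can be illustrated with a small
userspace sketch. The helper uses the same arithmetic as the kernel's
map_util_freq() (freq * util / cap). Everything prefixed with sketch_/SKETCH_
is hypothetical, and the frequencies are made-up numbers chosen only to show
the effect: the capacity is assumed to have been computed at 2.0 GHz while the
last performance state of the domain is a 2.4 GHz boost entry.

```c
#include <assert.h>

/* Same arithmetic as the kernel's map_util_freq(): freq * util / cap. */
static inline unsigned long sketch_map_util_freq(unsigned long util,
						 unsigned long freq,
						 unsigned long cap)
{
	assert(cap > 0);
	return freq * util / cap;
}

/* Hypothetical values, in kHz. */
#define SKETCH_REF_FREQ_KHZ	2000000UL	/* frequency used to compute capacity */
#define SKETCH_LAST_PS_KHZ	2400000UL	/* last (boost) perf state of the domain */
#define SKETCH_CPU_CAPACITY	1024UL

/* Target frequency when the fixed reference frequency is used. */
static inline unsigned long sketch_freq_from_ref(unsigned long util)
{
	return sketch_map_util_freq(util, SKETCH_REF_FREQ_KHZ,
				    SKETCH_CPU_CAPACITY);
}

/* Target frequency when the last perf state is used instead. */
static inline unsigned long sketch_freq_from_last_ps(unsigned long util)
{
	return sketch_map_util_freq(util, SKETCH_LAST_PS_KHZ,
				    SKETCH_CPU_CAPACITY);
}
```

With these assumed numbers, a utilization of 512 maps to 1000000 kHz with the
reference frequency but to 1200000 kHz with the last performance state, so the
two code paths would look up different performance states for the same load.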

Comments

Lukasz Luba Sept. 4, 2023, 12:40 p.m. UTC | #1
On 9/1/23 14:03, Vincent Guittot wrote:
> The last item of a performance domain is not always the performance point
> that has been used to compute CPU's capacity. This can lead to different
> target frequency compared with other part of the system like schedutil and
> would result in wrong energy estimation.
> 
> a new arch_scale_freq_ref() is available to return a fixed and coherent
> frequency reference that can be used when computing the CPU's frequency
> for an level of utilization. Use this function when available or fallback
> to the last performance domain item otherwise.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>   include/linux/energy_model.h | 20 +++++++++++++++++---
>   1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index b9caa01dfac4..7ee07be6928e 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -204,6 +204,20 @@ struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
>   	return ps;
>   }
>   
> +#ifdef arch_scale_freq_ref
> +static __always_inline
> +unsigned long  arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> +{
> +	return arch_scale_freq_ref(cpu);
> +}
> +#else
> +static __always_inline
> +unsigned long  arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> +{
> +	return pd->table[pd->nr_perf_states - 1].frequency;
> +}
> +#endif
> +
>   /**
>    * em_cpu_energy() - Estimates the energy consumed by the CPUs of a
>    *		performance domain
> @@ -224,7 +238,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
>   				unsigned long max_util, unsigned long sum_util,
>   				unsigned long allowed_cpu_cap)
>   {
> -	unsigned long freq, scale_cpu;
> +	unsigned long freq, ref_freq, scale_cpu;
>   	struct em_perf_state *ps;
>   	int cpu;
>   
> @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
>   	 */
>   	cpu = cpumask_first(to_cpumask(pd->cpus));
>   	scale_cpu = arch_scale_cpu_capacity(cpu);
> -	ps = &pd->table[pd->nr_perf_states - 1];
> +	ref_freq = arch_scale_freq_ref_em(cpu, pd);
>   
>   	max_util = map_util_perf(max_util);
>   	max_util = min(max_util, allowed_cpu_cap);
> -	freq = map_util_freq(max_util, ps->frequency, scale_cpu);
> +	freq = map_util_freq(max_util, ref_freq, scale_cpu);
>   
>   	/*
>   	 * Find the lowest performance state of the Energy Model above the

LGTM,

Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>

FYI, I'm going to send my v4 for the EM, hopefully in the next few days, so
those changes might collide. But we can sort this out later (when both are
ready).

Regards,
Lukasz
Pierre Gondois Sept. 5, 2023, 10:05 a.m. UTC | #2
Hello Vincent,
I tried the patch set on a platform that uses cppc_cpufreq and has boosting
frequencies.

1-
On such a platform, the CPU capacity comes from the CPPC highest_frequency
field. The CPU capacity is set to the capacity of the boosting frequency.
This behaviour is different from DT platforms, where the CPU capacity is
updated whenever the boosting mode is enabled (it seems).

Wouldn't it be better to have CPU max capacities set to their boosting
capacity, as for CPPC based platforms? It seems the max frequency is always
available somehow for all the cpufreq drivers with boosting available, i.e.
acpi-cpufreq, amd-pstate, cppc_cpufreq.


2-
On the CPPC based platforms, the per_cpu freq_factor is not used/updated,
meaning that we have:
arch_scale_freq_ref_em()
\-arch_scale_freq_ref()
   \-topology_get_freq_ref()
     \-per_cpu(freq_factor, cpu) (set to the default value: 1)
and em_cpu_energy()'s ref_freq variable is then set to 1 instead of the max
frequency (leading to a 0 energy computation).

3-
Also, just in case: arch_scale_freq_ref_policy() and cpufreq_get_hw_max_freq()
seem to have close (but not identical) purposes.

Regards,
Pierre
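
The frequency step of point 2 can be sketched in userspace as follows. The
helper uses the same arithmetic as the kernel's map_util_freq(), and
sketch_pick_state() is a simplified, hypothetical stand-in for the lookup done
by em_pd_get_efficient_state() (it ignores inefficient-state skipping). The
table values are made up. The sketch only shows what happens to the target
frequency when the reference is 1; how that propagates to the final energy
value is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Same arithmetic as the kernel's map_util_freq(): freq * util / cap. */
static inline unsigned long sketch_map_util_freq(unsigned long util,
						 unsigned long freq,
						 unsigned long cap)
{
	assert(cap > 0);
	return freq * util / cap;
}

/*
 * Hypothetical, simplified lookup: index of the lowest table entry whose
 * frequency is >= freq, or the last entry if none qualifies.
 */
static inline size_t sketch_pick_state(const unsigned long *freqs, size_t n,
				       unsigned long freq)
{
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		if (freqs[i] >= freq)
			return i;
	}
	return n - 1;
}
```

With a reference of 1 and a capacity of 1024, any utilization below 1024 gives
a target frequency of 0 because of the integer division, so the lookup always
lands on the first table entry regardless of the actual load.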

Peter Zijlstra Sept. 5, 2023, 11:33 a.m. UTC | #3
On Tue, Sep 05, 2023 at 12:05:30PM +0200, Pierre Gondois wrote:
> Hello Vincent,
> I tried the patch-set on a platform using cppc_cpufreq and that has boosting
> frequencies,
> 
> 1-
> On such platform, the CPU capacity comes from the CPPC highest_frequency
> field. The CPU capacity is set to the capacity of the boosting frequency.
> This behaviour is different from DT platforms where the CPU capacity is
> updated whenever the boosting mode is enabled (it seems).
> 
> Wouldn't it be better to have CPU max capacities set to their boosting
> capacity as for CPPC base platforms ? It seems the max frequency is always
> available somehow for all the cpufreq drivers with boosting available, i.e.
> acpi-cpufreq, amd-pstate, cppc_cpufreq.

So on Intel we don't use the max (turbo) boost value, but typically end
up picking the 4-core turbo value or something. There's a comment in
arch/x86/kernel/cpu/aperfmperf.c.

Per that comment it probably makes sense to be able to differentiate
between a mobile device and a server, or perhaps we can (ab)use the EAS
enable knob for this distinction?

That is, I'm not sure it makes sense to always pick the highest boost
frequency for ARM64 servers, very much analogous to how we don't do that
on Intel.
Vincent Guittot Sept. 5, 2023, 1:16 p.m. UTC | #4
On Tue, 5 Sept 2023 at 12:05, Pierre Gondois <pierre.gondois@arm.com> wrote:
>
> Hello Vincent,
> I tried the patch-set on a platform using cppc_cpufreq and that has boosting
> frequencies,
>
> 1-
> On such platform, the CPU capacity comes from the CPPC highest_frequency
> field. The CPU capacity is set to the capacity of the boosting frequency.
> This behaviour is different from DT platforms where the CPU capacity is
> updated whenever the boosting mode is enabled (it seems).

OK, I hadn't noticed that cppc_cpufreq would be impacted by this
change in arch_topology. I'm going to check how to fix that.

>
> Wouldn't it be better to have CPU max capacities set to their boosting
> capacity as for CPPC base platforms ? It seems the max frequency is always
> available somehow for all the cpufreq drivers with boosting available, i.e.
> acpi-cpufreq, amd-pstate, cppc_cpufreq.

Some platforms will never enable boost, or boost is only temporarily
available before being capped. As a result, some prefer to use a more
sustainable freq for their max capacity. That's why we can't always
use the max/boost freq.

>
>
> 2-
> On the CPPC based platforms, the per_cpu freq_factor is not used/updated,
> meaning that we have:
> arch_scale_freq_ref_em()
> \-arch_scale_freq_ref()
>    \-topology_get_freq_ref()
>      \-per_cpu(freq_factor, cpu) (set to the default value: 1)
> and em_cpu_energy()'s ref_freq variable is then set to 1 instead of the max
> frequency (leading to a 0 energy computation).

IIUC, cppc uses the default CPU capacity of arch_topology and then
never updates it, and it creates an EM for this SMP system.
OK, so you have an EM set up with ACPI and SMP.

I'm going to check where we could set this reference frequency for your case.

>
> 3-
> Also just in case, arch_scale_freq_ref_policy() and cpufreq_get_hw_max_freq()
> seem to have close (but not identical) purpose,
>
> Regards,
> Pierre
>
Dietmar Eggemann Sept. 14, 2023, 9:07 p.m. UTC | #5
On 01/09/2023 15:03, Vincent Guittot wrote:

[...]

> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index b9caa01dfac4..7ee07be6928e 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -204,6 +204,20 @@ struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
>  	return ps;
>  }
>  
> +#ifdef arch_scale_freq_ref
> +static __always_inline
> +unsigned long  arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)

Why is this function named with the arch prefix?

So far we have 5 arch functions (arch_scale_freq_tick() <->
arch_scale_freq_ref()) and e.g. Arm/Arm64 defines them with their
topology_foo implementations.

Isn't arch_scale_freq_ref_em() (as well as arch_scale_freq_ref_policy())
different in this sense, and so a proper EM function, which should be
reflected in its name?

> +{
> +	return arch_scale_freq_ref(cpu);
> +}
> +#else
> +static __always_inline
> +unsigned long  arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> +{
> +	return pd->table[pd->nr_perf_states - 1].frequency;
> +}
> +#endif

[...]

> @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
>  	 */
>  	cpu = cpumask_first(to_cpumask(pd->cpus));
>  	scale_cpu = arch_scale_cpu_capacity(cpu);
> -	ps = &pd->table[pd->nr_perf_states - 1];
> +	ref_freq = arch_scale_freq_ref_em(cpu, pd);

Why not use the existing `unsigned long freq` here, as in schedutil's
get_next_freq()?

>  
>  	max_util = map_util_perf(max_util);

[...]
Vincent Guittot Sept. 15, 2023, 1:35 p.m. UTC | #6
On Thu, 14 Sept 2023 at 23:07, Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> On 01/09/2023 15:03, Vincent Guittot wrote:
>
> [...]
>
> > diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> > index b9caa01dfac4..7ee07be6928e 100644
> > --- a/include/linux/energy_model.h
> > +++ b/include/linux/energy_model.h
> > @@ -204,6 +204,20 @@ struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
> >       return ps;
> >  }
> >
> > +#ifdef arch_scale_freq_ref
> > +static __always_inline
> > +unsigned long  arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
>
> Why is this function named with the arch prefix?
>
> So far we have 5 arch functions (arch_scale_freq_tick() <->
> arch_scale_freq_ref()) and e.g. Arm/Arm64 defines them with there
> topology_foo implementations.
>
> Isn't arch_scale_freq_ref_em() (as well as arch_scale_freq_ref_policy())
> different in this sense and so a proper EM function which should
> manifest in its name?

arch_scale_freq_ref_em() is there to handle cases where
arch_scale_freq_ref() is not defined by the arch. I kept the arch_ prefix
because this should be provided by any architecture that wants to use the EM.

In the case of EM, it's only there for allyes|randconfig on arches that
don't use arch_topology.c, like x86_64.

>
> > +{
> > +     return arch_scale_freq_ref(cpu);
> > +}
> > +#else
> > +static __always_inline
> > +unsigned long  arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> > +{
> > +     return pd->table[pd->nr_perf_states - 1].frequency;
> > +}
> > +#endif
>
> [...]
>
> > @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> >        */
> >       cpu = cpumask_first(to_cpumask(pd->cpus));
> >       scale_cpu = arch_scale_cpu_capacity(cpu);
> > -     ps = &pd->table[pd->nr_perf_states - 1];
> > +     ref_freq = arch_scale_freq_ref_em(cpu, pd);
>
> Why not using existing `unsigned long freq` here like in schedutil's
> get_next_freq()?

I find it easier to read and understand, and it will not make any difference
in the compiled code.

>
> >
> >       max_util = map_util_perf(max_util);
>
> [...]
>
Dietmar Eggemann Sept. 18, 2023, 8:46 p.m. UTC | #7
On 15/09/2023 15:35, Vincent Guittot wrote:
> On Thu, 14 Sept 2023 at 23:07, Dietmar Eggemann
> <dietmar.eggemann@arm.com> wrote:
>>
>> On 01/09/2023 15:03, Vincent Guittot wrote:

[...]

>>> +#ifdef arch_scale_freq_ref
>>> +static __always_inline
>>> +unsigned long  arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
>>
>> Why is this function named with the arch prefix?
>>
>> So far we have 5 arch functions (arch_scale_freq_tick() <->
>> arch_scale_freq_ref()) and e.g. Arm/Arm64 defines them with there
>> topology_foo implementations.
>>
>> Isn't arch_scale_freq_ref_em() (as well as arch_scale_freq_ref_policy())
>> different in this sense and so a proper EM function which should
>> manifest in its name?
> 
> arch_scale_freq_ref_em() is there to handle cases where
> arch_scale_freq_ref() is not defined by arch. I keep arch_ prefix
> because this should be provided by architecture which wants to use EM.

That's correct, x86_64 with CONFIG_ENERGY_MODEL=y needs
arch_scale_freq_ref_em() to return the highest perf_state of the perf_domain.
But this function, as opposed to arch_scale_freq_ref(), does not have to
be provided by the arch itself. It's provided by the EM instead.
That's why I doubt whether it should be named arch_scale_freq_ref_em().

> In the case of EM, it's only there for allyes|randconfig on arch that
> doesn't use arch_topology.c like x86_64

[...]

>>> @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
>>>        */
>>>       cpu = cpumask_first(to_cpumask(pd->cpus));
>>>       scale_cpu = arch_scale_cpu_capacity(cpu);
>>> -     ps = &pd->table[pd->nr_perf_states - 1];
>>> +     ref_freq = arch_scale_freq_ref_em(cpu, pd);
>>
>> Why not using existing `unsigned long freq` here like in schedutil's
>> get_next_freq()?
> 
> Find it easier to read and understand and will not make any difference
> in the compiled code

True, but I thought it would be easier to detect the functional
similarity between em_cpu_energy() (*) and get_next_freq() that way.

freq = arch_scale_freq_ref_{policy,em}({policy,(cpu, pd)});
... (in case of *)
freq = map_util_freq(util, freq, max);

Just a nitpick ...
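
For reference, the two styles being compared can be written side by side as a
small userspace sketch. All names here are hypothetical, and the helper only
reuses the map_util_freq() arithmetic (freq * util / cap). It is meant to show
that the choice is about readability, since both variants compute the same
value.

```c
#include <assert.h>

/* Same arithmetic as the kernel's map_util_freq(): freq * util / cap. */
static inline unsigned long sketch_map_util_freq(unsigned long util,
						 unsigned long freq,
						 unsigned long cap)
{
	assert(cap > 0);
	return freq * util / cap;
}

/* Style used by the patch: a dedicated ref_freq local. */
static inline unsigned long sketch_target_two_locals(unsigned long util,
						     unsigned long ref,
						     unsigned long max)
{
	unsigned long freq, ref_freq;

	ref_freq = ref;
	freq = sketch_map_util_freq(util, ref_freq, max);
	return freq;
}

/* Style suggested in the review: reuse freq for the reference. */
static inline unsigned long sketch_target_one_local(unsigned long util,
						    unsigned long ref,
						    unsigned long max)
{
	unsigned long freq;

	freq = ref;
	freq = sketch_map_util_freq(util, freq, max);
	return freq;
}
```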

[...]
Ionela Voinescu Sept. 21, 2023, 10:12 a.m. UTC | #8
On Friday 01 Sep 2023 at 15:03:12 (+0200), Vincent Guittot wrote:
> The last item of a performance domain is not always the performance point
> that has been used to compute CPU's capacity. This can lead to different
> target frequency compared with other part of the system like schedutil and
> would result in wrong energy estimation.
> 
> a new arch_scale_freq_ref() is available to return a fixed and coherent
> frequency reference that can be used when computing the CPU's frequency
> for an level of utilization. Use this function when available or fallback
> to the last performance domain item otherwise.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  include/linux/energy_model.h | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index b9caa01dfac4..7ee07be6928e 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -204,6 +204,20 @@ struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
>  	return ps;
>  }
>  
> +#ifdef arch_scale_freq_ref
> +static __always_inline
> +unsigned long  arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)

The comments in patch 3/4 should be considered for this function and its
use as well.

Thanks,
Ionela.

> +{
> +	return arch_scale_freq_ref(cpu);
> +}
> +#else
> +static __always_inline
> +unsigned long  arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> +{
> +	return pd->table[pd->nr_perf_states - 1].frequency;
> +}
> +#endif
> +
>  /**
>   * em_cpu_energy() - Estimates the energy consumed by the CPUs of a
>   *		performance domain
> @@ -224,7 +238,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
>  				unsigned long max_util, unsigned long sum_util,
>  				unsigned long allowed_cpu_cap)
>  {
> -	unsigned long freq, scale_cpu;
> +	unsigned long freq, ref_freq, scale_cpu;
>  	struct em_perf_state *ps;
>  	int cpu;
>  
> @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
>  	 */
>  	cpu = cpumask_first(to_cpumask(pd->cpus));
>  	scale_cpu = arch_scale_cpu_capacity(cpu);
> -	ps = &pd->table[pd->nr_perf_states - 1];
> +	ref_freq = arch_scale_freq_ref_em(cpu, pd);
>  
>  	max_util = map_util_perf(max_util);
>  	max_util = min(max_util, allowed_cpu_cap);
> -	freq = map_util_freq(max_util, ps->frequency, scale_cpu);
> +	freq = map_util_freq(max_util, ref_freq, scale_cpu);
>  
>  	/*
>  	 * Find the lowest performance state of the Energy Model above the
> -- 
> 2.34.1
> 
>
Patch

diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
index b9caa01dfac4..7ee07be6928e 100644
--- a/include/linux/energy_model.h
+++ b/include/linux/energy_model.h
@@ -204,6 +204,20 @@  struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
 	return ps;
 }
 
+#ifdef arch_scale_freq_ref
+static __always_inline
+unsigned long  arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
+{
+	return arch_scale_freq_ref(cpu);
+}
+#else
+static __always_inline
+unsigned long  arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
+{
+	return pd->table[pd->nr_perf_states - 1].frequency;
+}
+#endif
+
 /**
  * em_cpu_energy() - Estimates the energy consumed by the CPUs of a
  *		performance domain
@@ -224,7 +238,7 @@  static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
 				unsigned long max_util, unsigned long sum_util,
 				unsigned long allowed_cpu_cap)
 {
-	unsigned long freq, scale_cpu;
+	unsigned long freq, ref_freq, scale_cpu;
 	struct em_perf_state *ps;
 	int cpu;
 
@@ -241,11 +255,11 @@  static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
 	 */
 	cpu = cpumask_first(to_cpumask(pd->cpus));
 	scale_cpu = arch_scale_cpu_capacity(cpu);
-	ps = &pd->table[pd->nr_perf_states - 1];
+	ref_freq = arch_scale_freq_ref_em(cpu, pd);
 
 	max_util = map_util_perf(max_util);
 	max_util = min(max_util, allowed_cpu_cap);
-	freq = map_util_freq(max_util, ps->frequency, scale_cpu);
+	freq = map_util_freq(max_util, ref_freq, scale_cpu);
 
 	/*
 	 * Find the lowest performance state of the Energy Model above the
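
The compile-time fallback introduced by the patch can be tried outside the
kernel with a minimal sketch like the one below. The struct and function names
prefixed with sketch_ are hypothetical stand-ins for the kernel types, and
SKETCH_HAVE_ARCH_REF is an assumed switch that plays the role of an
architecture defining arch_scale_freq_ref; the 2000000 kHz value is made up.

```c
#include <assert.h>

/* Hypothetical, reduced versions of the EM structures. */
struct sketch_perf_state {
	unsigned long frequency;	/* kHz */
};

struct sketch_perf_domain {
	const struct sketch_perf_state *table;
	int nr_perf_states;
};

/* An "architecture" opts in by defining SKETCH_HAVE_ARCH_REF. */
#ifdef SKETCH_HAVE_ARCH_REF
static inline unsigned long sketch_arch_freq_ref(int cpu)
{
	(void)cpu;
	return 2000000UL;	/* made-up fixed reference frequency */
}
#define sketch_arch_scale_freq_ref sketch_arch_freq_ref
#endif

#ifdef sketch_arch_scale_freq_ref
/* Arch-provided reference is available: use it. */
static inline unsigned long
sketch_freq_ref_em(int cpu, const struct sketch_perf_domain *pd)
{
	(void)pd;
	return sketch_arch_scale_freq_ref(cpu);
}
#else
/* No arch hook: fall back to the last perf state of the domain. */
static inline unsigned long
sketch_freq_ref_em(int cpu, const struct sketch_perf_domain *pd)
{
	(void)cpu;
	assert(pd->nr_perf_states > 0);
	return pd->table[pd->nr_perf_states - 1].frequency;
}
#endif
```

Built without SKETCH_HAVE_ARCH_REF, the helper returns the frequency of the
last table entry, which is the pre-patch behaviour; built with it, the table
is ignored and the fixed reference is returned.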