Message ID | 20231129110853.94344-4-lukasz.luba@arm.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
Series | Introduce runtime modifiable Energy Model | expand |
On 11/29/23 11:08, Lukasz Luba wrote: > The Energy Model might be updated at runtime and the energy efficiency > for each OPP may change. Thus, there is a need to update also the > cpufreq framework and make it aligned to the new values. In order to > do that, use a first active CPU from the Performance Domain. This is > needed since the first CPU in the cpumask might be offline when we > run this code path. I didn't understand the problem here. It seems you're fixing a race, but the description is not clear to me what the race is. > > Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> > --- > kernel/power/energy_model.c | 11 +++++++++-- > 1 file changed, 9 insertions(+), 2 deletions(-) > > diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c > index 42486674b834..aa7c89f9e115 100644 > --- a/kernel/power/energy_model.c > +++ b/kernel/power/energy_model.c > @@ -243,12 +243,19 @@ em_cpufreq_update_efficiencies(struct device *dev, struct em_perf_state *table) > struct em_perf_domain *pd = dev->em_pd; > struct cpufreq_policy *policy; > int found = 0; > - int i; > + int i, cpu; > > if (!_is_cpu_device(dev) || !pd) > return; > > - policy = cpufreq_cpu_get(cpumask_first(em_span_cpus(pd))); > + /* Try to get a CPU which is active and in this PD */ > + cpu = cpumask_first_and(em_span_cpus(pd), cpu_active_mask); > + if (cpu >= nr_cpu_ids) { > + dev_warn(dev, "EM: No online CPU for CPUFreq policy\n"); > + return; > + } > + > + policy = cpufreq_cpu_get(cpu); Shouldn't policy be NULL here if all policy->realted_cpus were offlined? Cheers -- Qais Yousef > if (!policy) { > dev_warn(dev, "EM: Access to CPUFreq policy failed\n"); > return; > -- > 2.25.1 >
On 12/17/23 17:58, Qais Yousef wrote: > On 11/29/23 11:08, Lukasz Luba wrote: >> The Energy Model might be updated at runtime and the energy efficiency >> for each OPP may change. Thus, there is a need to update also the >> cpufreq framework and make it aligned to the new values. In order to >> do that, use a first active CPU from the Performance Domain. This is >> needed since the first CPU in the cpumask might be offline when we >> run this code path. > > I didn't understand the problem here. It seems you're fixing a race, but the > description is not clear to me what the race is. I have explained that in v1, v4 comments for this patch. When the EM is registered the fist CPU is always online. No problem for the old code, but for new code with runtime modification at later time, potentially from different subsystems - it it (e.g. thermal, drivers, etc). The fist CPU might be offline, but still such EM update for this domain shouldn'y fail. Although, when the CPU is offline we cannot get the valid policy... We can get it for next cpu in the cpumask, that's what the code is doing. > >> >> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> >> --- >> kernel/power/energy_model.c | 11 +++++++++-- >> 1 file changed, 9 insertions(+), 2 deletions(-) >> >> diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c >> index 42486674b834..aa7c89f9e115 100644 >> --- a/kernel/power/energy_model.c >> +++ b/kernel/power/energy_model.c >> @@ -243,12 +243,19 @@ em_cpufreq_update_efficiencies(struct device *dev, struct em_perf_state *table) >> struct em_perf_domain *pd = dev->em_pd; >> struct cpufreq_policy *policy; >> int found = 0; >> - int i; >> + int i, cpu; >> >> if (!_is_cpu_device(dev) || !pd) >> return; >> >> - policy = cpufreq_cpu_get(cpumask_first(em_span_cpus(pd))); >> + /* Try to get a CPU which is active and in this PD */ >> + cpu = cpumask_first_and(em_span_cpus(pd), cpu_active_mask); >> + if (cpu >= nr_cpu_ids) { >> + dev_warn(dev, "EM: No online CPU for CPUFreq policy\n"); >> + return; >> + } >> + >> + policy = cpufreq_cpu_get(cpu); > > Shouldn't policy be NULL here if all policy->realted_cpus were offlined? It will be NULL but we will capture that fact in other way in the 'if' above. We want something else. We want to get policy using 'some' online CPU's id from our known cpumask. Then we can continue with such policy in the code.
On 12/19/23 10:53, Lukasz Luba wrote: > > > On 12/17/23 17:58, Qais Yousef wrote: > > On 11/29/23 11:08, Lukasz Luba wrote: > > > The Energy Model might be updated at runtime and the energy efficiency > > > for each OPP may change. Thus, there is a need to update also the > > > cpufreq framework and make it aligned to the new values. In order to > > > do that, use a first active CPU from the Performance Domain. This is > > > needed since the first CPU in the cpumask might be offline when we > > > run this code path. > > > > I didn't understand the problem here. It seems you're fixing a race, but the > > description is not clear to me what the race is. > > I have explained that in v1, v4 comments for this patch. > When the EM is registered the fist CPU is always online. No problem > for the old code, but for new code with runtime modification at > later time, potentially from different subsystems - it it (e.g. thermal, > drivers, etc). The fist CPU might be offline, but still such EM > update for this domain shouldn'y fail. Although, when the CPU is offline > we cannot get the valid policy... > > We can get it for next cpu in the cpumask, that's what the code is > doing. Okay, I can see now that cpufreq_cpu_get_raw() ignores offline CPUs intentionally. A new variant seems better to me. But the experts know better. So LGTM.
On 12/28/23 17:13, Qais Yousef wrote: > On 12/19/23 10:53, Lukasz Luba wrote: >> >> >> On 12/17/23 17:58, Qais Yousef wrote: >>> On 11/29/23 11:08, Lukasz Luba wrote: >>>> The Energy Model might be updated at runtime and the energy efficiency >>>> for each OPP may change. Thus, there is a need to update also the >>>> cpufreq framework and make it aligned to the new values. In order to >>>> do that, use a first active CPU from the Performance Domain. This is >>>> needed since the first CPU in the cpumask might be offline when we >>>> run this code path. >>> >>> I didn't understand the problem here. It seems you're fixing a race, but the >>> description is not clear to me what the race is. >> >> I have explained that in v1, v4 comments for this patch. >> When the EM is registered the fist CPU is always online. No problem >> for the old code, but for new code with runtime modification at >> later time, potentially from different subsystems - it it (e.g. thermal, >> drivers, etc). The fist CPU might be offline, but still such EM >> update for this domain shouldn'y fail. Although, when the CPU is offline >> we cannot get the valid policy... >> >> We can get it for next cpu in the cpumask, that's what the code is >> doing. > > Okay, I can see now that cpufreq_cpu_get_raw() ignores offline CPUs > intentionally. > > A new variant seems better to me. But the experts know better. So LGTM. Thanks. Yes, I will gently ask Viresh to have a look at those places cpufreq related places.
diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c index 42486674b834..aa7c89f9e115 100644 --- a/kernel/power/energy_model.c +++ b/kernel/power/energy_model.c @@ -243,12 +243,19 @@ em_cpufreq_update_efficiencies(struct device *dev, struct em_perf_state *table) struct em_perf_domain *pd = dev->em_pd; struct cpufreq_policy *policy; int found = 0; - int i; + int i, cpu; if (!_is_cpu_device(dev) || !pd) return; - policy = cpufreq_cpu_get(cpumask_first(em_span_cpus(pd))); + /* Try to get a CPU which is active and in this PD */ + cpu = cpumask_first_and(em_span_cpus(pd), cpu_active_mask); + if (cpu >= nr_cpu_ids) { + dev_warn(dev, "EM: No online CPU for CPUFreq policy\n"); + return; + } + + policy = cpufreq_cpu_get(cpu); if (!policy) { dev_warn(dev, "EM: Access to CPUFreq policy failed\n"); return;
The Energy Model might be updated at runtime and the energy efficiency for each OPP may change. Thus, there is a need to update also the cpufreq framework and make it aligned to the new values. In order to do that, use a first active CPU from the Performance Domain. This is needed since the first CPU in the cpumask might be offline when we run this code path. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> --- kernel/power/energy_model.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)