[v3,0/3] Add support for AArch64 AMUv1-based arch_freq_get_on_cpu

Message ID 20240312083431.3239989-1-beata.michalska@arm.com (mailing list archive)

Message

Beata Michalska March 12, 2024, 8:34 a.m. UTC
Introducing an arm64-specific version of arch_freq_get_on_cpu, cashing in on
the existing implementation for FIE and AMUv1 support: the frequency scale
factor, updated on each sched tick, serves as a base for retrieving
the frequency for a given CPU. The result represents an average frequency
between ticks, so its accuracy is limited.
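
As a rough illustration of reversing the frequency scale factor into a
frequency, here is a minimal userspace sketch. It is not the code from the
patches: the helper name scale_to_freq_khz is hypothetical, and it assumes the
scale factor is a fixed-point value where 1 << SCHED_CAPACITY_SHIFT (1024)
means "running at the reference frequency".

```c
#include <stdint.h>

/* Fixed-point shift used for capacity values (1024 == 100%). */
#define SCHED_CAPACITY_SHIFT 10

/*
 * Hypothetical helper: recover an average frequency (kHz) from a
 * frequency scale factor and a reference frequency (kHz).
 * Multiply first, then shift, so low-order bits are not lost early.
 */
static inline uint64_t scale_to_freq_khz(uint64_t scale, uint64_t ref_freq_khz)
{
	return (scale * ref_freq_khz) >> SCHED_CAPACITY_SHIFT;
}
```

With a reference frequency of 3000000 kHz, a scale factor of 512 maps to
1500000 kHz. Since the factor is only refreshed on the tick, the value is an
average over the last tick period rather than an instantaneous reading.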

The changes have been tested only lightly (due to some limitations) on
an FVP model.

Relevant discussions:
[1] https://lore.kernel.org/all/20240229162520.970986-1-vanshikonda@os.amperecomputing.com/
[2] https://lore.kernel.org/all/7eozim2xnepacnnkzxlbx34hib4otycnbn4dqymfziqou5lw5u@5xzpv3t7sxo3/
[3] https://lore.kernel.org/all/20231212072617.14756-1-lihuisong@huawei.com/
[4] https://lore.kernel.org/lkml/ZIHpd6unkOtYVEqP@e120325.cambridge.arm.com/T/#m4e74cb5a0aaa353c60fedc6cfb95ab7a6e381e3c

v3:
- dropping changes to cpufreq_verify_current_freq
- pulling in changes from Ionela initializing capacity_freq_ref to 0
  (thanks for that!)  and applying suggestions made by her during last review:
	- switching to arch_scale_freq_capacity and arch_scale_freq_ref when
	  reversing freq scale factor computation
	- swapping shift with multiplication
- adding time limit for considering last scale update as valid
- updating frequency scale factor upon entering idle
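
The "time limit for considering last scale update as valid" item can be
pictured with the sketch below. This is an assumption-laden model, not the
patch itself: the name scale_sample_is_fresh and the 20 ms window are
hypothetical placeholders, and timestamps are plain millisecond counters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical validity window in milliseconds; the real limit may differ. */
#define SCALE_MAX_AGE_MS 20u

/*
 * Return true if the last scale factor update is recent enough to be
 * trusted. Unsigned subtraction stays correct across counter wrap-around.
 */
static inline bool scale_sample_is_fresh(uint32_t now_ms, uint32_t last_update_ms)
{
	return (uint32_t)(now_ms - last_update_ms) <= SCALE_MAX_AGE_MS;
}
```

A caller would fall back to another source (or report that no value is
available) when the sample is stale, instead of returning an outdated average.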

v2:
- Splitting the patches
- Adding comment for full dyntick mode
- Plugging arch_freq_get_on_cpu into cpufreq_verify_current_freq instead
  of in show_cpuinfo_cur_freq to allow the framework to stay more in sync
  with potential freq changes

Beata Michalska (2):
  arm64: Provide an AMU-based version of arch_freq_get_on_cpu
  arm64: Update AMU-based frequency scale factor on entering idle

Ionela Voinescu (1):
  arch_topology: init capacity_freq_ref to 0

 arch/arm64/kernel/topology.c | 116 +++++++++++++++++++++++++++++++----
 drivers/base/arch_topology.c |   8 ++-
 2 files changed, 110 insertions(+), 14 deletions(-)

Comments

Ionela Voinescu March 13, 2024, 12:27 p.m. UTC | #1
Hey,

On Tuesday 12 Mar 2024 at 08:34:28 (+0000), Beata Michalska wrote:
> Introducing arm64 specific version of arch_freq_get_on_cpu, cashing in on
> existing implementation for FIE and AMUv1 support: the frequency scale
> factor, updated on each sched tick, serves as a base for retrieving
> the frequency for a given CPU, representing an average frequency
> reported between the ticks - thus its accuracy is limited.
> 
> The changes have been rather lightly (due to some limitations) tested on
> an FVP model.
> 
> Relevant discussions:
> [1] https://lore.kernel.org/all/20240229162520.970986-1-vanshikonda@os.amperecomputing.com/
> [2] https://lore.kernel.org/all/7eozim2xnepacnnkzxlbx34hib4otycnbn4dqymfziqou5lw5u@5xzpv3t7sxo3/
> [3] https://lore.kernel.org/all/20231212072617.14756-1-lihuisong@huawei.com/
> [4] https://lore.kernel.org/lkml/ZIHpd6unkOtYVEqP@e120325.cambridge.arm.com/T/#m4e74cb5a0aaa353c60fedc6cfb95ab7a6e381e3c
> 
> v3:
> - dropping changes to cpufreq_verify_current_freq
> - pulling in changes from Ionela initializing capacity_freq_ref to 0
>   (thanks for that!)  and applying suggestions made by her during last review:
> 	- switching to arch_scale_freq_capacity and arch_scale_freq_ref when
> 	  reversing freq scale factor computation
> 	- swapping shift with multiplication
> - adding time limit for considering last scale update as valid
> - updating frequency scale factor upon entering idle
> 
> v2:
> - Splitting the patches
> - Adding comment for full dyntick mode
> - Plugging arch_freq_get_on_cpu into cpufreq_verify_current_freq instead
>   of in show_cpuinfo_cur_freq to allow the framework to stay more in sync
>   with potential freq changes
> 
> Beata Michalska (2):
>   arm64: Provide an AMU-based version of arch_freq_get_on_cpu
>   arm64: Update AMU-based frequency scale factor on entering idle
> 
> Ionela Voinescu (1):
>   arch_topology: init capacity_freq_ref to 0
> 

Should there have been a patch that adds a call to
arch_freq_get_on_cpu() from show_cpuinfo_cur_freq() as well?

My understanding from this [1] thread and others referenced there is
that this was something we wanted.

[1] https://lore.kernel.org/lkml/2cfbc633-1e94-d741-2337-e1b0cf48b81b@nvidia.com/

Thanks,
Ionela.


>  arch/arm64/kernel/topology.c | 116 +++++++++++++++++++++++++++++++----
>  drivers/base/arch_topology.c |   8 ++-
>  2 files changed, 110 insertions(+), 14 deletions(-)
> 
> -- 
> 2.25.1
>
Beata Michalska March 13, 2024, 11:49 p.m. UTC | #2
On Wed, Mar 13, 2024 at 12:27:53PM +0000, Ionela Voinescu wrote:
> Hey,
> 
> On Tuesday 12 Mar 2024 at 08:34:28 (+0000), Beata Michalska wrote:
> > Introducing arm64 specific version of arch_freq_get_on_cpu, cashing in on
> > existing implementation for FIE and AMUv1 support: the frequency scale
> > factor, updated on each sched tick, serves as a base for retrieving
> > the frequency for a given CPU, representing an average frequency
> > reported between the ticks - thus its accuracy is limited.
> > 
> > The changes have been rather lightly (due to some limitations) tested on
> > an FVP model.
> > 
> > Relevant discussions:
> > [1] https://lore.kernel.org/all/20240229162520.970986-1-vanshikonda@os.amperecomputing.com/
> > [2] https://lore.kernel.org/all/7eozim2xnepacnnkzxlbx34hib4otycnbn4dqymfziqou5lw5u@5xzpv3t7sxo3/
> > [3] https://lore.kernel.org/all/20231212072617.14756-1-lihuisong@huawei.com/
> > [4] https://lore.kernel.org/lkml/ZIHpd6unkOtYVEqP@e120325.cambridge.arm.com/T/#m4e74cb5a0aaa353c60fedc6cfb95ab7a6e381e3c
> > 
> > v3:
> > - dropping changes to cpufreq_verify_current_freq
> > - pulling in changes from Ionela initializing capacity_freq_ref to 0
> >   (thanks for that!)  and applying suggestions made by her during last review:
> > 	- switching to arch_scale_freq_capacity and arch_scale_freq_ref when
> > 	  reversing freq scale factor computation
> > 	- swapping shift with multiplication
> > - adding time limit for considering last scale update as valid
> > - updating frequency scale factor upon entering idle
> > 
> > v2:
> > - Splitting the patches
> > - Adding comment for full dyntick mode
> > - Plugging arch_freq_get_on_cpu into cpufreq_verify_current_freq instead
> >   of in show_cpuinfo_cur_freq to allow the framework to stay more in sync
> >   with potential freq changes
> > 
> > Beata Michalska (2):
> >   arm64: Provide an AMU-based version of arch_freq_get_on_cpu
> >   arm64: Update AMU-based frequency scale factor on entering idle
> > 
> > Ionela Voinescu (1):
> >   arch_topology: init capacity_freq_ref to 0
> > 
> 
> Should there have been a patch that adds a call to
> arch_freq_get_on_cpu() from show_cpuinfo_cur_freq() as well?
> 
> My understanding from this [1] thread and others referenced there is
> that was something we wanted.
>
Right, so I must have misunderstood that, as the way I read it was that
it is acceptable to keep things as they are wrt cpufreq sysfs entries.

---
BR
Beata
> [1] https://lore.kernel.org/lkml/2cfbc633-1e94-d741-2337-e1b0cf48b81b@nvidia.com/
> 
> Thanks,
> Ionela.
> 
> 
> >  arch/arm64/kernel/topology.c | 116 +++++++++++++++++++++++++++++++----
> >  drivers/base/arch_topology.c |   8 ++-
> >  2 files changed, 110 insertions(+), 14 deletions(-)
> > 
> > -- 
> > 2.25.1
> >
Sumit Gupta March 20, 2024, 4:52 p.m. UTC | #3
Hi Beata,

>> On Tuesday 12 Mar 2024 at 08:34:28 (+0000), Beata Michalska wrote:
>>> Introducing arm64 specific version of arch_freq_get_on_cpu, cashing in on
>>> existing implementation for FIE and AMUv1 support: the frequency scale
>>> factor, updated on each sched tick, serves as a base for retrieving
>>> the frequency for a given CPU, representing an average frequency
>>> reported between the ticks - thus its accuracy is limited.
>>>
>>> The changes have been rather lightly (due to some limitations) tested on
>>> an FVP model.
>>>
>>> Relevant discussions:
>>> [1] https://lore.kernel.org/all/20240229162520.970986-1-vanshikonda@os.amperecomputing.com/
>>> [2] https://lore.kernel.org/all/7eozim2xnepacnnkzxlbx34hib4otycnbn4dqymfziqou5lw5u@5xzpv3t7sxo3/
>>> [3] https://lore.kernel.org/all/20231212072617.14756-1-lihuisong@huawei.com/
>>> [4] https://lore.kernel.org/lkml/ZIHpd6unkOtYVEqP@e120325.cambridge.arm.com/T/#m4e74cb5a0aaa353c60fedc6cfb95ab7a6e381e3c
>>>
>>> v3:
>>> - dropping changes to cpufreq_verify_current_freq
>>> - pulling in changes from Ionela initializing capacity_freq_ref to 0
>>>    (thanks for that!)  and applying suggestions made by her during last review:
>>>      - switching to arch_scale_freq_capacity and arch_scale_freq_ref when
>>>        reversing freq scale factor computation
>>>      - swapping shift with multiplication
>>> - adding time limit for considering last scale update as valid
>>> - updating frequency scale factor upon entering idle
>>>
>>> v2:
>>> - Splitting the patches
>>> - Adding comment for full dyntick mode
>>> - Plugging arch_freq_get_on_cpu into cpufreq_verify_current_freq instead
>>>    of in show_cpuinfo_cur_freq to allow the framework to stay more in sync
>>>    with potential freq changes
>>>
>>> Beata Michalska (2):
>>>    arm64: Provide an AMU-based version of arch_freq_get_on_cpu
>>>    arm64: Update AMU-based frequency scale factor on entering idle
>>>
>>> Ionela Voinescu (1):
>>>    arch_topology: init capacity_freq_ref to 0
>>>
>>
>> Should there have been a patch that adds a call to
>> arch_freq_get_on_cpu() from show_cpuinfo_cur_freq() as well?
>>
>> My understanding from this [1] thread and others referenced there is
>> that was something we wanted.
>>
> Right, so I must have misunderstood that, as the way I read it was that
> it is acceptable to keep things as they are wrt cpufreq sysfs entries.
> 
> ---
> BR
> Beata
>> [1] https://lore.kernel.org/lkml/2cfbc633-1e94-d741-2337-e1b0cf48b81b@nvidia.com/
>>
>> Thanks,
>> Ionela.
>>

Yes, the change to show_cpuinfo_cur_freq from [1] is needed.

[1] https://lore.kernel.org/lkml/20230606155754.245998-1-beata.michalska@arm.com/

Thank you,
Sumit Gupta
Vanshidhar Konda March 25, 2024, 4:10 p.m. UTC | #4
On Tue, Mar 12, 2024 at 08:34:28AM +0000, Beata Michalska wrote:
>Introducing arm64 specific version of arch_freq_get_on_cpu, cashing in on
>existing implementation for FIE and AMUv1 support: the frequency scale
>factor, updated on each sched tick, serves as a base for retrieving
>the frequency for a given CPU, representing an average frequency
>reported between the ticks - thus its accuracy is limited.
>
>The changes have been rather lightly (due to some limitations) tested on
>an FVP model.
>

I tested these changes on an Ampere system. The results from reading
scaling_cur_freq look reasonable in the majority of cases I tested. I
only saw some unexpected behavior with cores that were configured as
nohz_full.

I observed the unexplained behavior when I tested as follows:
1. Run stress on all cores
    stress-ng --cpu 186 --timeout 10m --metrics-brief
2. Observe scaling_cur_freq and cpuinfo_cur_freq for all cores
    scaling_cur_freq values were within a few % of cpuinfo_cur_freq
3. Kill stress test
4. Observe scaling_cur_freq and cpuinfo_cur_freq for all cores
    scaling_cur_freq values were within a few % of cpuinfo_cur_freq for
    most cores except the ones configured as nohz_full.

nohz_full = 122-127
core   scaling_cur_freq  cpuinfo_cur_freq
[122]: 2997070           1000000
[123]: 2997070           1000000
[124]: 3000038           1000000
[125]: 2997070           1000000
[126]: 2997070           1000000
[127]: 2997070           1000000

These values persisted for multiple seconds. I suspect the cores
entered WFI and there was no update to the scale factor while those
cores were idle.
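
For anyone wanting to repeat the comparison above, the check can be scripted
along these lines. This is only a sketch: the helper names read_khz and
deviation_percent are made up for illustration, and it assumes both sysfs
files report a single value in kHz.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read a single kHz value from a sysfs file; returns -1 on failure. */
static long read_khz(const char *path)
{
	FILE *f = fopen(path, "r");
	long khz = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &khz) != 1)
		khz = -1;
	fclose(f);
	return khz;
}

/* Relative deviation of scaling_cur_freq from cpuinfo_cur_freq, in percent. */
static double deviation_percent(long scaling_khz, long cpuinfo_khz)
{
	return 100.0 * labs(scaling_khz - cpuinfo_khz) / (double)cpuinfo_khz;
}
```

The two values would be read from
/sys/devices/system/cpu/cpuN/cpufreq/scaling_cur_freq and
/sys/devices/system/cpu/cpuN/cpufreq/cpuinfo_cur_freq (the latter usually
needs root). For core 122 in the table, 2997070 against 1000000 is a
deviation of roughly 200%, far outside the "few %" seen on the other cores.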

Thanks,
Vanshi
Beata Michalska April 3, 2024, 9:30 p.m. UTC | #5
On Wed, Mar 20, 2024 at 10:22:22PM +0530, Sumit Gupta wrote:
> Hi Beata,
> 
> > > On Tuesday 12 Mar 2024 at 08:34:28 (+0000), Beata Michalska wrote:
> > > > Introducing arm64 specific version of arch_freq_get_on_cpu, cashing in on
> > > > existing implementation for FIE and AMUv1 support: the frequency scale
> > > > factor, updated on each sched tick, serves as a base for retrieving
> > > > the frequency for a given CPU, representing an average frequency
> > > > reported between the ticks - thus its accuracy is limited.
> > > > 
> > > > The changes have been rather lightly (due to some limitations) tested on
> > > > an FVP model.
> > > > 
> > > > Relevant discussions:
> > > > [1] https://lore.kernel.org/all/20240229162520.970986-1-vanshikonda@os.amperecomputing.com/
> > > > [2] https://lore.kernel.org/all/7eozim2xnepacnnkzxlbx34hib4otycnbn4dqymfziqou5lw5u@5xzpv3t7sxo3/
> > > > [3] https://lore.kernel.org/all/20231212072617.14756-1-lihuisong@huawei.com/
> > > > [4] https://lore.kernel.org/lkml/ZIHpd6unkOtYVEqP@e120325.cambridge.arm.com/T/#m4e74cb5a0aaa353c60fedc6cfb95ab7a6e381e3c
> > > > 
> > > > v3:
> > > > - dropping changes to cpufreq_verify_current_freq
> > > > - pulling in changes from Ionela initializing capacity_freq_ref to 0
> > > >    (thanks for that!)  and applying suggestions made by her during last review:
> > > >      - switching to arch_scale_freq_capacity and arch_scale_freq_ref when
> > > >        reversing freq scale factor computation
> > > >      - swapping shift with multiplication
> > > > - adding time limit for considering last scale update as valid
> > > > - updating frequency scale factor upon entering idle
> > > > 
> > > > v2:
> > > > - Splitting the patches
> > > > - Adding comment for full dyntick mode
> > > > - Plugging arch_freq_get_on_cpu into cpufreq_verify_current_freq instead
> > > >    of in show_cpuinfo_cur_freq to allow the framework to stay more in sync
> > > >    with potential freq changes
> > > > 
> > > > Beata Michalska (2):
> > > >    arm64: Provide an AMU-based version of arch_freq_get_on_cpu
> > > >    arm64: Update AMU-based frequency scale factor on entering idle
> > > > 
> > > > Ionela Voinescu (1):
> > > >    arch_topology: init capacity_freq_ref to 0
> > > > 
> > > 
> > > Should there have been a patch that adds a call to
> > > arch_freq_get_on_cpu() from show_cpuinfo_cur_freq() as well?
> > > 
> > > My understanding from this [1] thread and others referenced there is
> > > that was something we wanted.
> > > 
> > Right, so I must have misunderstood that, as the way I read it was that
> > it is acceptable to keep things as they are wrt cpufreq sysfs entries.
> > 
> > ---
> > BR
> > Beata
> > > [1] https://lore.kernel.org/lkml/2cfbc633-1e94-d741-2337-e1b0cf48b81b@nvidia.com/
> > > 
> > > Thanks,
> > > Ionela.
> > > 
> 
> Yes, the change to show_cpuinfo_cur_freq from [1] is needed.
>
Noted. Will send an update including fixes and this requested change.

---
BR
Beata
> [1]
> https://lore.kernel.org/lkml/20230606155754.245998-1-beata.michalska@arm.com/
> 
> Thank you,
> Sumit Gupta
Beata Michalska April 3, 2024, 9:34 p.m. UTC | #6
On Mon, Mar 25, 2024 at 09:10:26AM -0700, Vanshidhar Konda wrote:
> On Tue, Mar 12, 2024 at 08:34:28AM +0000, Beata Michalska wrote:
> > Introducing arm64 specific version of arch_freq_get_on_cpu, cashing in on
> > existing implementation for FIE and AMUv1 support: the frequency scale
> > factor, updated on each sched tick, serves as a base for retrieving
> > the frequency for a given CPU, representing an average frequency
> > reported between the ticks - thus its accuracy is limited.
> > 
> > The changes have been rather lightly (due to some limitations) tested on
> > an FVP model.
> > 
> 
> I tested these changes on an Ampere system. The results from reading
> scaling_cur_freq look reasonable in the majority of cases I tested. I
> only saw some unexpected behavior with cores that were configured for
> no_hz full.
> 
> I observed the unexplained behavior when I tested as follows:
> 1. Run stress on all cores
>    stress-ng --cpu 186 --timeout 10m --metrics-brief
> 2. Observe scaling_cur_freq and cpuinfo_cur_freq for all cores
>    scaling_cur_freq values were within a few % of cpuinfo_cur_freq
> 3. Kill stress test
> 4. Observe scaling_cur_freq and cpuinfo_cur_freq for all cores
>    scaling_cur_freq values were within a few % of cpuinfo_cur_freq for
>    most cores except the ones configured with no_hz full.
> 
> no_hz full = 122-127
> core   scaling_cur_freq  cpuinfo_cur_freq
> [122]: 2997070           1000000
> [123]: 2997070           1000000
> [124]: 3000038           1000000
> [125]: 2997070           1000000
> [126]: 2997070           1000000
> [127]: 2997070           1000000
> 
> These values were reflected for multiple seconds. I suspect the cores
> entered WFI and there was no update to the scale while those cores were
> idle.
>
Right, so the problem is with updating the counters upon entering idle, which at
this point is done for all CPUs; it should exclude the full dynticks ones,
otherwise it leads to such bad readings. So for nohz_full cores the cpufreq
driver will have to take care of getting the current frequency.

Will be sending a fix for that.
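
To make the intended exclusion concrete, here is a small userspace model of
the decision. It is not the actual fix: the struct and function names are
hypothetical, and the nohz_full set is modelled as a single inclusive CPU
range purely for illustration.

```c
#include <stdbool.h>

/* Hypothetical model: one inclusive range of nohz_full (full dynticks) CPUs. */
struct cpu_range {
	unsigned int first;
	unsigned int last;
};

/* True if the CPU falls inside the nohz_full range; a NULL range means none. */
static bool cpu_is_nohz_full(const struct cpu_range *r, unsigned int cpu)
{
	return r && cpu >= r->first && cpu <= r->last;
}

/* Only CPUs outside the nohz_full range refresh the scale factor on idle entry. */
static bool should_update_scale_on_idle(const struct cpu_range *nohz_full,
					unsigned int cpu)
{
	return !cpu_is_nohz_full(nohz_full, cpu);
}
```

With nohz_full = 122-127 as in the report, CPU 124 would skip the idle-entry
update and rely on the cpufreq driver for its current frequency, while CPU 121
would keep updating as before.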

Thank you very much for testing - appreciate that!

---
BR
Beata
> Thanks,
> Vanshi