mbox series

[v1,0/3] arm64: Use AMU counters for measuring CPU frequency

Message ID 20240229162520.970986-1-vanshikonda@os.amperecomputing.com (mailing list archive)
Headers show
Series arm64: Use AMU counters for measuring CPU frequency | expand

Message

Vanshidhar Konda Feb. 29, 2024, 4:25 p.m. UTC
AMU extension was added to Armv8.4 as an optional extension. The
extension provides architectural counters that can be used to measure
CPU frequency - CPU_CYCLES and CNT_CYCLES.

In the kernel FIE uses these counters to compute frequency scale on
every tick. The counters are also be used in the CPPC driver if the
firmware publishes support for registered & delivered registers using
ACPI FFH.

In the current implementation using these counters in the CPPC driver
results in inaccurate measurement in some cases. This has been discussed
in [1] and [2].

In the current implementation, CPPC delivered register and reference
register are read in two different cpc_read calls(). There could be
significant latency between the CPU reading these two registers due to
the core being interrupted - leading to an inaccurate result. Also, when
these registers are in FFH region, reading each register using cpc_read
will result in 2 IPI interrpts to the core whose registers are being read.
It will also wake up any core in idle to read the AMU counters.

In this patch series, there are two changes:
- Implement arch_freq_get_on_cpu() for arm64 to record AMU counters on
  every clock tick
- CPPC driver reads delivered and reference registers in a single IPI
  while avoiding a wake up on idle core to read AMU counters; also
    allows measuring CPU frequency of isolated CPUs

Results on an AmpereOne system with 128 cores after the patch:

When system is idle:
core   scaling_cur_freq  cpuinfo_cur_freq
[0]:   3068518           3000000
[1]:   1030869           1000000
[2]:   1031296           1000000
[3]:   1032224           1000000
[4]:   1032469           1000000
[5]:   1030987           1000000
...
...
isolcpus = 122-127
[122]: 1030724           1000000
[123]: 1030667           1000000
[124]: 1031888           1000000
[125]: 1031047           1000000
[126]: 1031683           1000000
[127]: 1030794           1000000

With stress applied to core 122-126:
core   scaling_cur_freq  cpuinfo_cur_freq
[0]:   3050000           3000000
[1]:   1031068           1000000
[2]:   1030699           1000000
[3]:   1031818           1000000
[4]:   1032251           1000000
[5]:   1031282           1000000
...
...
isolcpus = 122-127
[122]: 3000061           3012000
[123]: 3000041           3008000
[124]: 3000038           2998000
[125]: 3000062           2995000
[126]: 3000035           3004000
[127]: 1031440           1000000

[1]: https://lore.kernel.org/all/20230328193846.8757-1-yang@os.amperecomputing.com/
[2]: https://lore.kernel.org/linux-arm-kernel/20231212072617.14756-1-lihuisong@huawei.com/

Vanshidhar Konda (3):
  arm64: topology: Add arch_freq_get_on_cpu() support
  arm64: idle: Cache AMU counters before entering idle
  ACPI: CPPC: Read CPC FFH counters in a single IPI

 arch/arm64/kernel/idle.c     |  10 +++
 arch/arm64/kernel/topology.c | 153 ++++++++++++++++++++++++++++++-----
 drivers/acpi/cppc_acpi.c     |  32 +++++++-
 include/acpi/cppc_acpi.h     |  13 +++
 4 files changed, 186 insertions(+), 22 deletions(-)