[00/18] KVM: arm64: Support FEAT_PMUv3 on Apple hardware

Message ID 20241217212048.3709204-1-oliver.upton@linux.dev (mailing list archive)

Oliver Upton Dec. 17, 2024, 9:20 p.m. UTC
One of the interesting features of some Apple M* parts is an IMPDEF trap
that routes EL1/EL0 accesses of the PMUv3 registers to EL2. This allows
a hypervisor to emulate an architectural PMUv3 on top of the IMPDEF PMU
hardware present in the CPU.
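
A minimal sketch of one piece of that emulation path, assuming the hypervisor wants to describe an IMPDEF trap in the same form as an architectural system register trap so that existing handlers can decode it. The layout follows the architectural ESR encoding for trapped MSR/MRS accesses (EC 0x18); the `sysreg_enc` struct and `synth_sysreg_esr()` helper are hypothetical names for illustration and are not taken from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Exception class for a trapped AArch64 MSR/MRS/system instruction. */
#define ESR_EC_SYS64	0x18u
#define ESR_EC_SHIFT	26
#define ESR_IL		(1u << 25)	/* trapped instruction was 32 bits wide */

/* Hypothetical container for a system register encoding. */
struct sysreg_enc {
	uint8_t op0, op1, crn, crm, op2;
};

/* PMCCNTR_EL0 is S3_3_C9_C13_0 in the architectural encoding space. */
static const struct sysreg_enc PMCCNTR_EL0 = { 3, 3, 9, 13, 0 };

/*
 * Build the syndrome an architectural sysreg trap would have reported.
 * ISS layout: Op0[21:20] Op2[19:17] Op1[16:14] CRn[13:10] Rt[9:5]
 * CRm[4:1] Direction[0], where Direction is 1 for a read (MRS) and
 * 0 for a write (MSR).
 */
static uint32_t synth_sysreg_esr(struct sysreg_enc r, unsigned int rt,
				 bool is_read)
{
	assert(rt < 32);

	return (ESR_EC_SYS64 << ESR_EC_SHIFT) | ESR_IL |
	       ((uint32_t)(r.op0 & 0x3) << 20) |
	       ((uint32_t)(r.op2 & 0x7) << 17) |
	       ((uint32_t)(r.op1 & 0x7) << 14) |
	       ((uint32_t)(r.crn & 0xf) << 10) |
	       ((uint32_t)rt << 5) |
	       ((uint32_t)(r.crm & 0xf) << 1) |
	       (is_read ? 1u : 0u);
}
```

For example, `synth_sysreg_esr(PMCCNTR_EL0, 1, true)` describes a guest `mrs x1, pmccntr_el0`. How the hardware actually reports the register and direction of an IMPDEF trap is not covered by this sketch.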

And if you squint, this _might_ look like a CPU erratum :-)

This series takes advantage of these IMPDEF traps to provide PMUv3 to
KVM guests. As a starting point, only expose the fixed CPU cycle counter
and no event counters. Conveniently, this is enough to get Windows
running as a KVM guest on Apple hardware.

I've tried to keep the deviation to a minimum by refactoring some of the
flows used for PMUv3, e.g. computing PMCEID from the arm_pmu bitmap
instead of reading hardware directly.
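
As a rough illustration of computing PMCEID from a bitmap rather than from hardware, the sketch below derives the two ID register values from a software bitmap of supported events. The `event_bitmap` struct and both helpers are hypothetical stand-ins, not the arm_pmu structures; only the register layout (low half for common events, high half for the 0x4000 extended range) is architectural.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a driver's bitmap of supported events. */
struct event_bitmap {
	uint64_t base;	/* bit n set => common event 0x0000 + n supported */
	uint64_t ext;	/* bit n set => extended event 0x4000 + n supported */
};

/* Mark an event as supported; events outside both ranges are ignored. */
static void event_bitmap_set(struct event_bitmap *bm, unsigned int event)
{
	assert(bm != NULL);

	if (event < 64)
		bm->base |= UINT64_C(1) << event;
	else if (event >= 0x4000 && event < 0x4040)
		bm->ext |= UINT64_C(1) << (event - 0x4000);
}

/*
 * PMCEID<idx>_EL0: bits [31:0] advertise common events 32*idx .. 32*idx+31,
 * bits [63:32] advertise extended events 0x4000+32*idx .. 0x4000+32*idx+31.
 */
static uint64_t compute_pmceid(const struct event_bitmap *bm, unsigned int idx)
{
	assert(bm != NULL && idx < 2);

	uint64_t lo = (bm->base >> (32 * idx)) & 0xffffffffu;
	uint64_t hi = (bm->ext >> (32 * idx)) & 0xffffffffu;

	return lo | (hi << 32);
}
```

Because the value comes from software state, the same code path works whether or not the CPU implements the architectural ID registers at all.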

RFC -> v1:
 - Rebase to 6.13-rc3
 - Add support for 1 event counter in addition to CPU cycle counter
 - Don't sneak past the PMU event filter (Marc)
 - Have the PMU driver provide a PMUv3 -> HW event ID mapping (Marc)
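
The last two changelog items can be pictured together: the filter is applied to the event number the guest programmed, and only then is that number translated to a hardware event ID. This is a sketch under stated assumptions. The table, the single-word filter, and `map_guest_event()` are hypothetical, and the hardware IDs are placeholder values rather than IDs quoted from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural PMUv3 common event numbers. */
#define PMUV3_INST_RETIRED		0x08
#define PMUV3_CPU_CYCLES		0x11
#define PMUV3_BR_RETIRED		0x21
#define PMUV3_BR_MIS_PRED_RETIRED	0x22

struct event_map {
	uint16_t pmuv3;	/* event number the guest programs */
	uint16_t hw;	/* placeholder hardware event ID (illustrative only) */
};

static const struct event_map hw_map[] = {
	{ PMUV3_INST_RETIRED,		0x8c },
	{ PMUV3_CPU_CYCLES,		0x02 },
	{ PMUV3_BR_RETIRED,		0x8d },
	{ PMUV3_BR_MIS_PRED_RETIRED,	0xcb },
};

/*
 * allow is a filter over PMUv3 event numbers 0..63 (bit n set => allowed).
 * The filter is checked against the guest-visible number before remapping,
 * so translation cannot be used to reach an event the filter denies.
 */
static bool map_guest_event(uint64_t allow, uint16_t pmuv3_event,
			    uint16_t *hw_event)
{
	assert(hw_event != NULL);

	if (pmuv3_event >= 64 || !(allow & (UINT64_C(1) << pmuv3_event)))
		return false;

	for (size_t i = 0; i < sizeof(hw_map) / sizeof(hw_map[0]); i++) {
		if (hw_map[i].pmuv3 == pmuv3_event) {
			*hw_event = hw_map[i].hw;
			return true;
		}
	}

	return false;	/* allowed by the filter, but no hardware equivalent */
}
```

A caller would reject the guest's event programming whenever `map_guest_event()` returns false, leaving `*hw_event` untouched.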

Tested on my M2 with Linux and Windows guests. If possible, I'd
appreciate someone testing on an M1 as I haven't added those MIDRs to
the erratum yet.

Oliver Upton (18):
  drivers/perf: apple_m1: Refactor event select/filter configuration
  drivers/perf: apple_m1: Support host/guest event filtering
  drivers/perf: apple_m1: Map generic branch events
  drivers/perf: apple_m1: Provide helper for mapping PMUv3 events
  KVM: arm64: Compute PMCEID from arm_pmu's event bitmaps
  KVM: arm64: Always support SW_INCR PMU event
  KVM: arm64: Remap PMUv3 events onto hardware
  KVM: arm64: Use a cpucap to determine if system supports FEAT_PMUv3
  KVM: arm64: Drop kvm_arm_pmu_available static key
  KVM: arm64: Use guard() to cleanup usage of arm_pmus_lock
  KVM: arm64: Move PMUVer filtering into KVM code
  KVM: arm64: Compute synthetic sysreg ESR for Apple PMUv3 traps
  KVM: arm64: Advertise PMUv3 if IMPDEF traps are present
  KVM: arm64: Advertise 0 event counters for IMPDEF PMU
  arm64: Enable IMP DEF PMUv3 traps on Apple M2
  drivers/perf: apple_m1: Map a few more PMUv3 events
  KVM: arm64: Provide 1 event counter on IMPDEF hardware
  KVM: arm64: selftests: Add test for probing PMUv3 sysregs

 arch/arm64/include/asm/apple_m1_pmu.h         |   1 +
 arch/arm64/include/asm/cpufeature.h           |  28 +---
 arch/arm64/kernel/cpu_errata.c                |  38 +++++
 arch/arm64/kernel/cpufeature.c                |  19 +++
 arch/arm64/kernel/image-vars.h                |   5 -
 arch/arm64/kvm/arm.c                          |   4 +-
 arch/arm64/kvm/hyp/include/hyp/switch.h       |   4 +-
 arch/arm64/kvm/hyp/vhe/switch.c               |  22 +++
 arch/arm64/kvm/pmu-emul.c                     | 127 +++++++++++-----
 arch/arm64/kvm/pmu.c                          |  10 +-
 arch/arm64/tools/cpucaps                      |   2 +
 drivers/perf/apple_m1_cpu_pmu.c               | 103 ++++++++++---
 include/kvm/arm_pmu.h                         |  15 +-
 include/linux/perf/arm_pmu.h                  |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../kvm/aarch64/pmuv3_register_probe.c        | 135 ++++++++++++++++++
 16 files changed, 407 insertions(+), 108 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/aarch64/pmuv3_register_probe.c


base-commit: 78d4f34e2115b517bcbfe7ec0d018bbbb6f9b0b8

Comments

Janne Grunau Dec. 21, 2024, 1:45 p.m. UTC | #1
On Tue, Dec 17, 2024 at 01:20:30PM -0800, Oliver Upton wrote:
> [...]
> 
> Tested on my M2 with Linux and Windows guests. If possible, I'd
> appreciate someone testing on an M1 as I haven't added those MIDRs to
> the erratum yet.

Tested on M1 (t8103) with perf in a Linux guest and the patch below.

Tested-by: Janne Grunau <j@jannau.net>

I'll import this into the downstream asahi kernel as there was a request
for performance counters to aid FEX-Emu development recently.

Janne

--- 

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 441ee4ffc7709..45ef67ec970f5 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -195,6 +195,12 @@ has_neoverse_n1_erratum_1542419(const struct arm64_cpu_capabilities *entry,
 }

 static const struct midr_range impdef_pmuv3_cpus[] = {
+	MIDR_ALL_VERSIONS(MIDR_APPLE_M1_ICESTORM),
+	MIDR_ALL_VERSIONS(MIDR_APPLE_M1_FIRESTORM),
+	MIDR_ALL_VERSIONS(MIDR_APPLE_M1_ICESTORM_PRO),
+	MIDR_ALL_VERSIONS(MIDR_APPLE_M1_FIRESTORM_PRO),
+	MIDR_ALL_VERSIONS(MIDR_APPLE_M1_ICESTORM_MAX),
+	MIDR_ALL_VERSIONS(MIDR_APPLE_M1_FIRESTORM_MAX),
 	MIDR_ALL_VERSIONS(MIDR_APPLE_M2_BLIZZARD),
 	MIDR_ALL_VERSIONS(MIDR_APPLE_M2_AVALANCHE),
 	MIDR_ALL_VERSIONS(MIDR_APPLE_M2_BLIZZARD_PRO),
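
For readers unfamiliar with MIDR-based matching, here is a simplified, standalone sketch of what an "all versions" entry amounts to: compare the implementer, architecture, and part number fields of MIDR_EL1 and ignore variant and revision. The macro and function names are hypothetical, and the part numbers are placeholders rather than values copied from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MIDR_EL1: implementer[31:24] variant[23:20] arch[19:16] part[15:4] rev[3:0] */
#define MIDR_MODEL_MASK	0xff0ffff0u	/* everything except variant and revision */
#define MIDR_MODEL(imp, part) \
	(((uint32_t)(imp) << 24) | (0xfu << 16) | ((uint32_t)(part) << 4))

#define IMP_APPLE	0x61u	/* Apple's implementer code */
#define PART_EXAMPLE_A	0x022u	/* placeholder part numbers for illustration */
#define PART_EXAMPLE_B	0x023u

static const uint32_t impdef_pmuv3_models[] = {
	MIDR_MODEL(IMP_APPLE, PART_EXAMPLE_A),
	MIDR_MODEL(IMP_APPLE, PART_EXAMPLE_B),
};

/* True if midr names a listed CPU model, whatever its variant or revision. */
static bool midr_matches_any_version(uint32_t midr)
{
	size_t n = sizeof(impdef_pmuv3_models) / sizeof(impdef_pmuv3_models[0]);

	for (size_t i = 0; i < n; i++) {
		if ((midr & MIDR_MODEL_MASK) == impdef_pmuv3_models[i])
			return true;
	}

	return false;
}
```

Extending coverage to another core is then a matter of adding one more model entry to the table, which is what the diff above does for the M1 family.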

Oliver Upton Dec. 21, 2024, 10 p.m. UTC | #2
On Sat, Dec 21, 2024 at 02:45:49PM +0100, Janne Grunau wrote:
> On Tue, Dec 17, 2024 at 01:20:30PM -0800, Oliver Upton wrote:
> > [...]
> 
> Tested on M1 (t8103) with perf in a Linux guest and the patch below
> 
> Tested-by: Janne Grunau <j@jannau.net>
> 
> I'll import this into the downstream asahi kernel as there was a request
> for performance counters to aid FEX-Emu development recently.
> 
> Janne

Awesome, greatly appreciate the testing, Janne. Hopefully we can get this
worked out for upstream too :)

Will Deacon Jan. 8, 2025, 12:38 p.m. UTC | #3
Hi Oliver,

On Tue, Dec 17, 2024 at 01:20:30PM -0800, Oliver Upton wrote:
> One of the interesting features of some Apple M* parts is an IMPDEF trap
> that routes EL1/EL0 accesses of the PMUv3 registers to EL2. This allows
> a hypervisor to emulate an architectural PMUv3 on top of the IMPDEF PMU
> hardware present in the CPU.
> 
> And if you squint, this _might_ look like a CPU erratum :-)
> 
> [...]

What's your plan for this series? I started looking at it and I can take
the first four apple_m1 patches if you like?

Will

Oliver Upton Jan. 8, 2025, 8:14 p.m. UTC | #4
Hey Will,

On Wed, Jan 08, 2025 at 12:38:41PM +0000, Will Deacon wrote:
> What's your plan for this series? I started looking at it and I can take
> the first four apple_m1 patches if you like?

I plan on posting a respin of it by next week, which should look pretty
much the same besides cleaning up the build error I introduced :)

Besides that, I think we need to decide on the KVM side of things
whether or not we want to support an event counter in addition to the
PMU cycle counter. Janne's FEX use case would certainly benefit from it.

Do you think you could grab patch #3? Now that the series has the PMUv3
event remapping helper, that patch is entirely unrelated to the rest of it.

Marc Zyngier Jan. 8, 2025, 9:26 p.m. UTC | #5
On Wed, 08 Jan 2025 20:14:07 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> [...]
> 
> Besides that, I think we need to decide on the KVM side of things
> whether or not we want to support an event counter in addition to the
> PMU cycle counter. Janne's FEX use case would certainly benefit from it.

I think we should always be able to support *one* counter on top of
the cycle counter. Doing more than that would result in inconsistent
behaviours (some events only count on a single counter).

Unless we restrict ourselves to a very small set of events that we can
always schedule on any counter, but this doesn't sound very promising.

Thoughts?

	M.
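
The inconsistency with more than one event counter can be illustrated with a small scheduling check. This is a sketch, assuming each event carries a bitmask of the hardware counters it may run on; the masks used in the usage note are invented and do not describe the real constraints of any Apple core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * affinity[i] is a bitmask of the hardware counters that event i may use.
 * Returns true if every event can be placed on its own free counter.
 * Plain backtracking search, which is fine for a handful of events.
 */
static bool can_schedule(const uint32_t *affinity, size_t n,
			 uint32_t free_counters)
{
	assert(affinity != NULL || n == 0);

	if (n == 0)
		return true;

	uint32_t candidates = affinity[0] & free_counters;

	while (candidates) {
		/* Isolate the lowest set bit as the counter to try next. */
		uint32_t pick = candidates & (~candidates + 1u);

		if (can_schedule(affinity + 1, n - 1, free_counters & ~pick))
			return true;
		candidates &= ~pick;
	}

	return false;
}
```

With two events that are both restricted to the same counter, say `{ 1u << 7, 1u << 7 }`, the check fails for `n == 2` but succeeds for either event alone. A guest given two virtual event counters could therefore see some event pairs work and others fail, whereas a single virtual event counter can host any one event that has a free hardware counter.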

Oliver Upton Jan. 8, 2025, 11:06 p.m. UTC | #6
On Wed, Jan 08, 2025 at 09:26:54PM +0000, Marc Zyngier wrote:
> On Wed, 08 Jan 2025 20:14:07 +0000,
> Oliver Upton <oliver.upton@linux.dev> wrote:
> > 
> > [...]
> > 
> > Besides that, I think we need to decide on the KVM side of things
> > whether or not we want to support an event counter in addition to the
> > PMU cycle counter. Janne's FEX use case would certainly benefit from it.
> 
> I think we should always be able to support *one* counter on top of
> the cycle counter. Doing more than that would result in inconsistent
> behaviours (some events only count on a single counter).
> 
> Unless we restrict ourselves to a very small set of events that we can
> always schedule on any counter, but this doesn't sound very promising.

I definitely agree that a single event counter is the way to go. Dealing
with this IMPDEF crud is gross already, and coping with event affinities
would only make it worse.

I was more wanting to test the idea that we want programmable event
counters at all, although it isn't that much of a burden on top of the
cycle counter.

I'll un-RFC the tail of the series in v2 then.

Will Deacon Jan. 10, 2025, 4:22 p.m. UTC | #7
On Tue, 17 Dec 2024 13:20:30 -0800, Oliver Upton wrote:
> One of the interesting features of some Apple M* parts is an IMPDEF trap
> that routes EL1/EL0 accesses of the PMUv3 registers to EL2. This allows
> a hypervisor to emulate an architectural PMUv3 on top of the IMPDEF PMU
> hardware present in the CPU.
> 
> And if you squint, this _might_ look like a CPU erratum :-)
> 
> [...]

Applied the branch events patch to will (for-next/perf), thanks!

[03/18] drivers/perf: apple_m1: Map generic branch events
        https://git.kernel.org/will/c/4575353d82e2

Cheers,