diff mbox series

[RFC,05/14] KVM: arm64: Always allow fixed cycle counter

Message ID 20241203193220.1070811-6-oliver.upton@linux.dev (mailing list archive)
State New
Headers show
Series KVM: arm64: Support FEAT_PMUv3 on Apple hardware | expand

Commit Message

Oliver Upton Dec. 3, 2024, 7:32 p.m. UTC
The fixed CPU cycle counter is mandatory for PMUv3, so it doesn't make a
lot of sense allowing userspace to filter it. Only apply the PMU event
filter to *programmed* event counters.

While at it, use the generic CPU_CYCLES perf event to back the cycle
counter, potentially allowing non-PMUv3 drivers to map the event onto
the underlying implementation.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/kvm/pmu-emul.c | 35 +++++++++++++++++++----------------
 1 file changed, 19 insertions(+), 16 deletions(-)

Comments

Marc Zyngier Dec. 3, 2024, 9:32 p.m. UTC | #1
On Tue, 03 Dec 2024 19:32:11 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> The fixed CPU cycle counter is mandatory for PMUv3, so it doesn't make a
> lot of sense allowing userspace to filter it. Only apply the PMU event
> filter to *programmed* event counters.

But that's a change in ABI, isn't it? We explicitly say in the
documentation that the cycle counter can be filtered by specifying
event 0x11.

More importantly, the current filtering works in terms of events, and
not in terms of counters.

Instead of changing the ABI, how about simply not supporting filtering
on such non-compliant HW? Surely that would simplify a few things.

Thanks,

	M.
Oliver Upton Dec. 3, 2024, 10:32 p.m. UTC | #2
On Tue, Dec 03, 2024 at 09:32:10PM +0000, Marc Zyngier wrote:
> On Tue, 03 Dec 2024 19:32:11 +0000,
> Oliver Upton <oliver.upton@linux.dev> wrote:
> > 
> > The fixed CPU cycle counter is mandatory for PMUv3, so it doesn't make a
> > lot of sense allowing userspace to filter it. Only apply the PMU event
> > filter to *programmed* event counters.
> 
> But that's a change in ABI, isn't it? We explicitly say in the
> documentation that the cycle counter can be filtered by specifying
> event 0x11.

Yeah... A bit of a dirty shortcut I took because I don't like the ABI,
but distaste isn't enough to break it :)

> More importantly, the current filtering works in terms of events, and
> not in terms of counters.
> 
> Instead of changing the ABI, how about simply not supporting filtering
> on such non-compliant HW? Surely that would simplify a few things.

Yeah, that sounds reasonable. Especially if we allow programmable event
counters where the event ID space doesn't match the architecture.
Marc Zyngier Dec. 4, 2024, 9:04 a.m. UTC | #3
On Tue, 03 Dec 2024 22:32:38 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> On Tue, Dec 03, 2024 at 09:32:10PM +0000, Marc Zyngier wrote:
> > On Tue, 03 Dec 2024 19:32:11 +0000,
> > Oliver Upton <oliver.upton@linux.dev> wrote:
> > > 
> > > The fixed CPU cycle counter is mandatory for PMUv3, so it doesn't make a
> > > lot of sense allowing userspace to filter it. Only apply the PMU event
> > > filter to *programmed* event counters.
> > 
> > But that's a change in ABI, isn't it? We explicitly say in the
> > documentation that the cycle counter can be filtered by specifying
> > event 0x11.
> 
> Yeah... A bit of a dirty shortcut I took because I don't like the ABI,
> but distaste isn't enough to break it :)
> 
> > More importantly, the current filtering works in terms of events, and
> > not in terms of counters.
> > 
> > Instead of changing the ABI, how about simply not supporting filtering
> > on such non-compliant HW? Surely that would simplify a few things.
> 
> Yeah, that sounds reasonable. Especially if we allow programmable event
> counters where the event ID space doesn't match the architecture.

Another thing I have been wondering is if a slightly better approach
would be to move some of the handling to the PMU driver itself, and
let it emulate PMUv3 if it can. This would allow conversion of event
numbers in situ rather than polluting the PMUv3 code in KVM.

	M.
Oliver Upton Dec. 4, 2024, 9:56 p.m. UTC | #4
On Wed, Dec 04, 2024 at 09:04:26AM +0000, Marc Zyngier wrote:
> On Tue, 03 Dec 2024 22:32:38 +0000,
> Oliver Upton <oliver.upton@linux.dev> wrote:
> > > More importantly, the current filtering works in terms of events, and
> > > not in terms of counters.
> > > 
> > > Instead of changing the ABI, how about simply not supporting filtering
> > > on such non-compliant HW? Surely that would simplify a few things.
> > 
> > Yeah, that sounds reasonable. Especially if we allow programmable event
> > counters where the event ID space doesn't match the architecture.
> 
> Another thing I have been wondering is if a slightly better approach
> would be to move some of the handling to the PMU driver itself, and
> let it emulate PMUv3 if it can. This would allow conversion of event
> numbers in situ rather than polluting the PMUv3 code in KVM.

Sure, but I think the actual event fed into perf_event_create_kernel_counter()
should be the correct hardware event, not a PMUv3 event reinterpreted
behind the scenes. Otherwise, we'd need to devise an alternate config encoding
for PMUv3-like events since the event ID spaces overlap.

I'm thinking this could be a helper in the arm_pmu struct that takes a
PMUv3 event and spits out (in this case) an M1 event. The resulting KVM
code would be miniscule, like:

u64 kvm_map_pmu_event(struct kvm *kvm, u64 eventsel)
{
	struct arm_pmu *pmu = kvm->arch.arm_pmu;

	if (!pmu->map_pmuv3_event)
		return eventsel;

	return pmu->map_pmuv3_event(eventsel);
}

static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
{

	[...]

	attr.config = kvm_map_pmu_event(vcpu->kvm, eventsel);
	event = perf_event_create_kernel_counter(&attr, ...);
}

We could even have the M1 PMU driver populate arm_pmu->pmceid_bitmap
with the events it knows about and get PMCEID emulation for free.
Marc Zyngier Dec. 10, 2024, 9:49 a.m. UTC | #5
On Wed, 04 Dec 2024 21:56:58 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> On Wed, Dec 04, 2024 at 09:04:26AM +0000, Marc Zyngier wrote:
> > On Tue, 03 Dec 2024 22:32:38 +0000,
> > Oliver Upton <oliver.upton@linux.dev> wrote:
> > > > More importantly, the current filtering works in terms of events, and
> > > > not in terms of counters.
> > > > 
> > > > Instead of changing the ABI, how about simply not supporting filtering
> > > > on such non-compliant HW? Surely that would simplify a few things.
> > > 
> > > Yeah, that sounds reasonable. Especially if we allow programmable event
> > > counters where the event ID space doesn't match the architecture.
> > 
> > Another thing I have been wondering is if a slightly better approach
> > would be to move some of the handling to the PMU driver itself, and
> > let it emulate PMUv3 if it can. This would allow conversion of event
> > numbers in situ rather than polluting the PMUv3 code in KVM.
> 
> Sure, but I think the actual event fed into perf_event_create_kernel_counter()
> should be the correct hardware event, not a PMUv3 event reinterpreted
> behind the scenes. Otherwise, we'd need to devise an alternate config encoding
> for PMUv3-like events since the event ID spaces overlap.
> 
> I'm thinking this could be a helper in the arm_pmu struct that takes a
> PMUv3 event and spits out (in this case) an M1 event. The resulting KVM
> code would be miniscule, like:
> 
> u64 kvm_map_pmu_event(struct kvm *kvm, u64 eventsel)
> {
> 	struct arm_pmu *pmu = kvm->arch.arm_pmu;
> 
> 	if (!pmu->map_pmuv3_event)
> 		return eventsel;
> 
> 	return pmu->map_pmuv3_event(eventsel);
> }
> 
> static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
> {
> 
> 	[...]
> 
> 	attr.config = kvm_map_pmu_event(vcpu->kvm, eventsel);
> 	event = perf_event_create_kernel_counter(&attr, ...);
> }
> 
> We could even have the M1 PMU driver populate arm_pmu->pmceid_bitmap
> with the events it knows about and get PMCEID emulation for free.

CONFIG_PMUv3_COMPAT? ;-)

I think this is probably the less intrusive thing to do for KVM
(outside of the butt-ugly trap decoding thingy), and it squarely puts
the responsibility on the PMU driver to expose something that makes
sense in a given context.

Thanks,

	M.
diff mbox series

Patch

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 809d65b912e8..3e7091e1a2e4 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -707,26 +707,27 @@  static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
 	evtreg = kvm_pmc_read_evtreg(pmc);
 
 	kvm_pmu_stop_counter(pmc);
-	if (pmc->idx == ARMV8_PMU_CYCLE_IDX)
+	if (pmc->idx == ARMV8_PMU_CYCLE_IDX) {
 		eventsel = ARMV8_PMUV3_PERFCTR_CPU_CYCLES;
-	else
+	} else {
 		eventsel = evtreg & kvm_pmu_event_mask(vcpu->kvm);
 
-	/*
-	 * Neither SW increment nor chained events need to be backed
-	 * by a perf event.
-	 */
-	if (eventsel == ARMV8_PMUV3_PERFCTR_SW_INCR ||
-	    eventsel == ARMV8_PMUV3_PERFCTR_CHAIN)
-		return;
+		/*
+		 * If we have a filter in place and that the event isn't
+		 * allowed, do not install a perf event either.
+		 */
+		if (vcpu->kvm->arch.pmu_filter &&
+		    !test_bit(eventsel, vcpu->kvm->arch.pmu_filter))
+			return;
 
-	/*
-	 * If we have a filter in place and that the event isn't allowed, do
-	 * not install a perf event either.
-	 */
-	if (vcpu->kvm->arch.pmu_filter &&
-	    !test_bit(eventsel, vcpu->kvm->arch.pmu_filter))
-		return;
+		/*
+		 * Neither SW increment nor chained events need to be backed
+		 * by a perf event.
+		 */
+		if (eventsel == ARMV8_PMUV3_PERFCTR_SW_INCR ||
+		    eventsel == ARMV8_PMUV3_PERFCTR_CHAIN)
+			return;
+	}
 
 	memset(&attr, 0, sizeof(struct perf_event_attr));
 	attr.type = arm_pmu->pmu.type;
@@ -877,6 +878,8 @@  static u64 compute_pmceid0(struct arm_pmu *pmu)
 
 	/* always support CHAIN */
 	val |= BIT(ARMV8_PMUV3_PERFCTR_CHAIN);
+	/* always support CPU_CYCLES */
+	val |= BIT(ARMV8_PMUV3_PERFCTR_CPU_CYCLES);
 	return val;
 }