Message ID | 20241203193220.1070811-6-oliver.upton@linux.dev (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | KVM: arm64: Support FEAT_PMUv3 on Apple hardware | expand |
On Tue, 03 Dec 2024 19:32:11 +0000, Oliver Upton <oliver.upton@linux.dev> wrote: > > The fixed CPU cycle counter is mandatory for PMUv3, so it doesn't make a > lot of sense allowing userspace to filter it. Only apply the PMU event > filter to *programmed* event counters. But that's a change in ABI, isn't it? We explicitly say in the documentation that the cycle counter can be filtered by specifying event 0x11. More importantly, the current filtering works in terms of events, and not in terms of counters. Instead of changing the ABI, how about simply not supporting filtering on such non-compliant HW? Surely that would simplify a few things. Thanks, M.
On Tue, Dec 03, 2024 at 09:32:10PM +0000, Marc Zyngier wrote: > On Tue, 03 Dec 2024 19:32:11 +0000, > Oliver Upton <oliver.upton@linux.dev> wrote: > > > > The fixed CPU cycle counter is mandatory for PMUv3, so it doesn't make a > > lot of sense allowing userspace to filter it. Only apply the PMU event > > filter to *programmed* event counters. > > But that's a change in ABI, isn't it? We explicitly say in the > documentation that the cycle counter can be filtered by specifying > event 0x11. Yeah... A bit of a dirty shortcut I took because I don't like the ABI, but distaste isn't enough to break it :) > More importantly, the current filtering works in terms of events, and > not in terms of counters. > > Instead of changing the ABI, how about simply not supporting filtering > on such non-compliant HW? Surely that would simplify a few things. Yeah, that sounds reasonable. Especially if we allow programmable event counters where the event ID space doesn't match the architecture.
On Tue, 03 Dec 2024 22:32:38 +0000, Oliver Upton <oliver.upton@linux.dev> wrote: > > On Tue, Dec 03, 2024 at 09:32:10PM +0000, Marc Zyngier wrote: > > On Tue, 03 Dec 2024 19:32:11 +0000, > > Oliver Upton <oliver.upton@linux.dev> wrote: > > > > > > The fixed CPU cycle counter is mandatory for PMUv3, so it doesn't make a > > > lot of sense allowing userspace to filter it. Only apply the PMU event > > > filter to *programmed* event counters. > > > > But that's a change in ABI, isn't it? We explicitly say in the > > documentation that the cycle counter can be filtered by specifying > > event 0x11. > > Yeah... A bit of a dirty shortcut I took because I don't like the ABI, > but distaste isn't enough to break it :) > > > More importantly, the current filtering works in terms of events, and > > not in terms of counters. > > > > Instead of changing the ABI, how about simply not supporting filtering > > on such non-compliant HW? Surely that would simplify a few things. > > Yeah, that sounds reasonable. Especially if we allow programmable event > counters where the event ID space doesn't match the architecture. Another thing I have been wondering is if a slightly better approach would be to move some of the handling to the PMU driver itself, and let it emulate PMUv3 if it can. This would allow conversion of event numbers in situ rather than polluting the PMUv3 code in KVM. M.
On Wed, Dec 04, 2024 at 09:04:26AM +0000, Marc Zyngier wrote: > On Tue, 03 Dec 2024 22:32:38 +0000, > Oliver Upton <oliver.upton@linux.dev> wrote: > > > More importantly, the current filtering works in terms of events, and > > > not in terms of counters. > > > > > > Instead of changing the ABI, how about simply not supporting filtering > > > on such non-compliant HW? Surely that would simplify a few things. > > > > Yeah, that sounds reasonable. Especially if we allow programmable event > > counters where the event ID space doesn't match the architecture. > > Another thing I have been wondering is if a slightly better approach > would be to move some of the handling to the PMU driver itself, and > let it emulate PMUv3 if it can. This would allow conversion of event > numbers in situ rather than polluting the PMUv3 code in KVM. Sure, but I think the actual event fed into perf_event_create_kernel_counter() should be the correct hardware event, not a PMUv3 event reinterpreted behind the scenes. Otherwise, we'd need to devise an alternate config encoding for PMUv3-like events since the event ID spaces overlap. I'm thinking this could be a helper in the arm_pmu struct that takes a PMUv3 event and spits out (in this case) an M1 event. The resulting KVM code would be miniscule, like: u64 kvm_map_pmu_event(struct kvm *kvm, u64 eventsel) { struct arm_pmu *pmu = kvm->arch.arm_pmu; if (!pmu->map_pmuv3_event) return eventsel; return pmu->map_pmuv3_event(eventsel); } static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc) { [...] attr.config = kvm_map_pmu_event(vcpu->kvm, eventsel); event = perf_event_create_kernel_counter(&attr, ...); } We could even have the M1 PMU driver populate arm_pmu->pmceid_bitmap with the events it knows about and get PMCEID emulation for free.
On Wed, 04 Dec 2024 21:56:58 +0000, Oliver Upton <oliver.upton@linux.dev> wrote: > > On Wed, Dec 04, 2024 at 09:04:26AM +0000, Marc Zyngier wrote: > > On Tue, 03 Dec 2024 22:32:38 +0000, > > Oliver Upton <oliver.upton@linux.dev> wrote: > > > > More importantly, the current filtering works in terms of events, and > > > > not in terms of counters. > > > > > > > > Instead of changing the ABI, how about simply not supporting filtering > > > > on such non-compliant HW? Surely that would simplify a few things. > > > > > > Yeah, that sounds reasonable. Especially if we allow programmable event > > > counters where the event ID space doesn't match the architecture. > > > > Another thing I have been wondering is if a slightly better approach > > would be to move some of the handling to the PMU driver itself, and > > let it emulate PMUv3 if it can. This would allow conversion of event > > numbers in situ rather than polluting the PMUv3 code in KVM. > > Sure, but I think the actual event fed into perf_event_create_kernel_counter() > should be the correct hardware event, not a PMUv3 event reinterpreted > behind the scenes. Otherwise, we'd need to devise an alternate config encoding > for PMUv3-like events since the event ID spaces overlap. > > I'm thinking this could be a helper in the arm_pmu struct that takes a > PMUv3 event and spits out (in this case) an M1 event. The resulting KVM > code would be miniscule, like: > > u64 kvm_map_pmu_event(struct kvm *kvm, u64 eventsel) > { > struct arm_pmu *pmu = kvm->arch.arm_pmu; > > if (!pmu->map_pmuv3_event) > return eventsel; > > return pmu->map_pmuv3_event(eventsel); > } > > static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc) > { > > [...] > > attr.config = kvm_map_pmu_event(vcpu->kvm, eventsel); > event = perf_event_create_kernel_counter(&attr, ...); > } > > We could even have the M1 PMU driver populate arm_pmu->pmceid_bitmap > with the events it knows about and get PMCEID emulation for free. CONFIG_PMUv3_COMPAT? ;-) I think this is probably the less intrusive thing to do for KVM (outside of the butt-ugly trap decoding thingy), and it squarely puts the responsibility on the PMU driver to expose something that makes sense in a given context. Thanks, M.
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c index 809d65b912e8..3e7091e1a2e4 100644 --- a/arch/arm64/kvm/pmu-emul.c +++ b/arch/arm64/kvm/pmu-emul.c @@ -707,26 +707,27 @@ static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc) evtreg = kvm_pmc_read_evtreg(pmc); kvm_pmu_stop_counter(pmc); - if (pmc->idx == ARMV8_PMU_CYCLE_IDX) + if (pmc->idx == ARMV8_PMU_CYCLE_IDX) { eventsel = ARMV8_PMUV3_PERFCTR_CPU_CYCLES; - else + } else { eventsel = evtreg & kvm_pmu_event_mask(vcpu->kvm); - /* - * Neither SW increment nor chained events need to be backed - * by a perf event. - */ - if (eventsel == ARMV8_PMUV3_PERFCTR_SW_INCR || - eventsel == ARMV8_PMUV3_PERFCTR_CHAIN) - return; + /* + * If we have a filter in place and that the event isn't + * allowed, do not install a perf event either. + */ + if (vcpu->kvm->arch.pmu_filter && + !test_bit(eventsel, vcpu->kvm->arch.pmu_filter)) + return; - /* - * If we have a filter in place and that the event isn't allowed, do - * not install a perf event either. - */ - if (vcpu->kvm->arch.pmu_filter && - !test_bit(eventsel, vcpu->kvm->arch.pmu_filter)) - return; + /* + * Neither SW increment nor chained events need to be backed + * by a perf event. + */ + if (eventsel == ARMV8_PMUV3_PERFCTR_SW_INCR || + eventsel == ARMV8_PMUV3_PERFCTR_CHAIN) + return; + } memset(&attr, 0, sizeof(struct perf_event_attr)); attr.type = arm_pmu->pmu.type; @@ -877,6 +878,8 @@ static u64 compute_pmceid0(struct arm_pmu *pmu) /* always support CHAIN */ val |= BIT(ARMV8_PMUV3_PERFCTR_CHAIN); + /* always support CPU_CYCLES */ + val |= BIT(ARMV8_PMUV3_PERFCTR_CPU_CYCLES); return val; }
The fixed CPU cycle counter is mandatory for PMUv3, so it doesn't make a lot of sense allowing userspace to filter it. Only apply the PMU event filter to *programmed* event counters. While at it, use the generic CPU_CYCLES perf event to back the cycle counter, potentially allowing non-PMUv3 drivers to map the event onto the underlying implementation. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> --- arch/arm64/kvm/pmu-emul.c | 35 +++++++++++++++++++---------------- 1 file changed, 19 insertions(+), 16 deletions(-)