Message ID | 20220823093221.38075-6-likexu@tencent.com
---|---
State | New, archived
Series | x86/pmu: Corner cases fixes and optimization
On Tue, Aug 23, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
>
> During a KVM-trap from vm-exit to vm-entry, requests from different
> sources will try to create one or more perf_events via reprogram_counter(),
> which will allow some predecessor actions to be undone posteriorly,
> especially repeated calls to some perf subsystem interfaces. These
> repetitive calls can be omitted because only the final state of the
> perf_event and the hardware resources it occupies will take effect
> for the guest right before the vm-entry.
>
> To realize this optimization, KVM marks the creation requirements via
> an inline version of reprogram_counter(), and then defers the actual
> execution with the help of vcpu KVM_REQ_PMU request.

Use imperative mood and state what change is being made, not what KVM's
behavior is as a result of the change.  And this is way more complicated
than it needs to be, and it also neglects to call out that the deferred
logic is needed for a bug fix.  IIUC:

  Batch reprogramming PMU counters by setting KVM_REQ_PMU and thus
  deferring reprogramming to kvm_pmu_handle_event() to avoid reprogramming
  a counter multiple times during a single VM-Exit.

  Deferring programming will also allow KVM to fix a bug where immediately
  reprogramming a counter can result in sleeping (taking a mutex) while
  interrupts are disabled in the VM-Exit fastpath.

> Opportunistically update related comments to avoid misunderstandings.
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index d9b9a0f0db17..6940cbeee54d 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -101,7 +101,7 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
>  	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
>  	bool skip_pmi = false;
>
> -	/* Ignore counters that have been reprogrammed already. */
> +	/* Ignore counters that have not been reprogrammed. */

Eh, just drop this comment, it's fairly obvious what the code is doing, and
your suggested comment is wrong in the sense that the counters haven't
actually been reprogrammed, i.e. it should be:

	/* Ignore counters that don't need to be reprogrammed. */

but IMO that's pretty obvious.

>  	if (test_and_set_bit(pmc->idx, pmu->reprogram_pmi))
>  		return;
>
> @@ -293,7 +293,7 @@ static bool check_pmu_event_filter(struct kvm_pmc *pmc)
>  	return allow_event;
>  }
>
> -void reprogram_counter(struct kvm_pmc *pmc)
> +static void __reprogram_counter(struct kvm_pmc *pmc)

This is misleading.  Double-underscore variants are usually inner helpers,
whereas these have a different relationship.  Instead of renaming
reprogram_counter(), how about introducing kvm_pmu_request_counter_reprogram()
to make it obvious that KVM is _requesting_ a reprogram and not actually
doing the reprogram?
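Something like this, e.g. (untested, just a sketch of the suggested naming;
the body is the same as the inline helper the patch adds to pmu.h):

	/*
	 * Only record that the counter needs to be reprogrammed and raise
	 * KVM_REQ_PMU; the actual reprogramming is deferred to
	 * kvm_pmu_handle_event() right before the next VM-Entry.
	 */
	static inline void kvm_pmu_request_counter_reprogram(struct kvm_pmc *pmc)
	{
		__set_bit(pmc->idx, pmc_to_pmu(pmc)->reprogram_pmi);
		kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
	}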
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2c96c43c313a..4e568a7ef464 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -493,6 +493,7 @@ struct kvm_pmc {
 	struct perf_event *perf_event;
 	struct kvm_vcpu *vcpu;
 	/*
+	 * only for creating or reusing perf_event,
 	 * eventsel value for general purpose counters,
 	 * ctrl value for fixed counters.
 	 */
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index d9b9a0f0db17..6940cbeee54d 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -101,7 +101,7 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
 	bool skip_pmi = false;
 
-	/* Ignore counters that have been reprogrammed already. */
+	/* Ignore counters that have not been reprogrammed. */
 	if (test_and_set_bit(pmc->idx, pmu->reprogram_pmi))
 		return;
 
@@ -293,7 +293,7 @@ static bool check_pmu_event_filter(struct kvm_pmc *pmc)
 	return allow_event;
 }
 
-void reprogram_counter(struct kvm_pmc *pmc)
+static void __reprogram_counter(struct kvm_pmc *pmc)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
 	u64 eventsel = pmc->eventsel;
@@ -335,7 +335,6 @@ void reprogram_counter(struct kvm_pmc *pmc)
 			      !(eventsel & ARCH_PERFMON_EVENTSEL_OS),
 			      eventsel & ARCH_PERFMON_EVENTSEL_INT);
 }
-EXPORT_SYMBOL_GPL(reprogram_counter);
 
 void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 {
@@ -345,11 +344,12 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 	for_each_set_bit(bit, pmu->reprogram_pmi, X86_PMC_IDX_MAX) {
 		struct kvm_pmc *pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, bit);
 
-		if (unlikely(!pmc || !pmc->perf_event)) {
+		if (unlikely(!pmc)) {
 			clear_bit(bit, pmu->reprogram_pmi);
 			continue;
 		}
-		reprogram_counter(pmc);
+
+		__reprogram_counter(pmc);
 	}
 
 	/*
@@ -527,7 +527,7 @@ static void kvm_pmu_incr_counter(struct kvm_pmc *pmc)
 	prev_count = pmc->counter;
 	pmc->counter = (pmc->counter + 1) & pmc_bitmask(pmc);
 
-	reprogram_counter(pmc);
+	__reprogram_counter(pmc);
 	if (pmc->counter < prev_count)
 		__kvm_perf_overflow(pmc, false);
 }
@@ -542,7 +542,9 @@ static inline bool eventsel_match_perf_hw_id(struct kvm_pmc *pmc,
 static inline bool cpl_is_matched(struct kvm_pmc *pmc)
 {
 	bool select_os, select_user;
-	u64 config = pmc->current_config;
+	u64 config = pmc_is_gp(pmc) ? pmc->eventsel :
+			  (u64)fixed_ctrl_field(pmc_to_pmu(pmc)->fixed_ctr_ctrl,
+						pmc->idx - INTEL_PMC_IDX_FIXED);
 
 	if (pmc_is_gp(pmc)) {
 		select_os = config & ARCH_PERFMON_EVENTSEL_OS;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 5cc5721f260b..d193d1dc6de0 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -183,7 +183,11 @@ static inline void kvm_init_pmu_capability(void)
 					     KVM_PMC_MAX_FIXED);
 }
 
-void reprogram_counter(struct kvm_pmc *pmc);
+static inline void reprogram_counter(struct kvm_pmc *pmc)
+{
+	__set_bit(pmc->idx, pmc_to_pmu(pmc)->reprogram_pmi);
+	kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
+}
 
 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu);
 void kvm_pmu_handle_event(struct kvm_vcpu *vcpu);
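For reference, a simplified sketch (not part of this patch) of how the
deferred request is consumed: the existing KVM_REQ_PMU handling on the
VM-entry path calls kvm_pmu_handle_event(), so every reprogram request
raised during a single VM-Exit is folded into one pass over reprogram_pmi.

	/* On the next VM-entry (request-handling loop in vcpu_enter_guest()): */
	if (kvm_check_request(KVM_REQ_PMU, vcpu))
		kvm_pmu_handle_event(vcpu);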