| Message ID | 20220727233424.2968356-4-seanjc@google.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | KVM: x86: Intel PERF_CAPABILITIES fix and cleanups |
On 28/7/2022 7:34 am, Sean Christopherson wrote:
> guest_cpuid_has() is expensive due to the linear search of guest CPUID
> entries, intel_pmu_lbr_is_enabled() is checked on every VM-Enter, _and_
> simply enumerating the same "Model" as the host causes KVM to set the
> number of LBR records to a non-zero value.

Before reconsidering vcpu->arch.perf_capabilities to reach a conclusion,
would this minor inline change help reduce my sins?

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0ecbbae42976..06a21d66be13 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7039,7 +7039,8 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	pt_guest_enter(vmx);
 
 	atomic_switch_perf_msrs(vmx);
-	if (intel_pmu_lbr_is_enabled(vcpu))
+	if (vmx->lbr_desc.records.nr &&
+	    (vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT))
 		vmx_passthrough_lbr_msrs(vcpu);
 
 	if (enable_preemption_timer)
On Fri, Jul 29, 2022, Like Xu wrote:
> On 28/7/2022 7:34 am, Sean Christopherson wrote:
> > guest_cpuid_has() is expensive due to the linear search of guest CPUID
> > entries, intel_pmu_lbr_is_enabled() is checked on every VM-Enter, _and_
> > simply enumerating the same "Model" as the host causes KVM to set the
> > number of LBR records to a non-zero value.
>
> Before reconsidering vcpu->arch.perf_capabilities to reach a conclusion,
> would this minor inline change help reduce my sins?
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 0ecbbae42976..06a21d66be13 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7039,7 +7039,8 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> 	pt_guest_enter(vmx);
>
> 	atomic_switch_perf_msrs(vmx);
> -	if (intel_pmu_lbr_is_enabled(vcpu))
> +	if (vmx->lbr_desc.records.nr &&
> +	    (vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT))

That doesn't do the right thing if X86_FEATURE_PDCM is cleared in guest
CPUID.  It doesn't even require odd userspace behavior, since
intel_pmu_init() does:

	vcpu->arch.perf_capabilities = vmx_get_perf_capabilities();

E.g. older userspace that doesn't set MSR_IA32_PERF_CAPABILITIES will
clear PDCM without touching the vCPU's MSR value.

In the unlikely scenario we can't figure out a solution for
PERF_CAPABILITIES, the alternative I tried first is to implement a
generic CPUID feature "caching" scheme and use it to expedite the PDCM
lookup.  I scrapped that approach when I realized that KVM really should
be able to consume PERF_CAPABILITIES during PMU refresh.

I'm hesitant to even suggest a generic caching implementation because I
suspect most performance critical uses of guest CPUID will be similar to
PDCM, i.e. can be captured during KVM_SET_CPUID2 without requiring an
explicit cache.  And for PERF_CAPABILITIES, IMO a robust implementation
is a must have, i.e. we've failed if we can't handle it during PMU
refresh.
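[Editor's note: for readers unfamiliar with the "generic CPUID feature
caching" idea that Sean mentions trying and scrapping, a minimal sketch of
that shape is shown below. None of these names exist in KVM; the types and
functions (`vcpu_cpuid_cache`, `cpuid_cache_refresh`, `cpuid_cache_has`)
are purely illustrative. The idea: do the O(n) walk of guest CPUID entries
once, at KVM_SET_CPUID2 time, and fold the results into a bitmap so hot
paths such as VM-Enter get an O(1) bit test.]

```c
#include <stdbool.h>
#include <stdint.h>

/* Features hot enough to be worth caching; each gets a bit index. */
enum cached_feature {
	CACHED_FEATURE_PDCM,
	CACHED_FEATURE_MAX,
};

struct vcpu_cpuid_cache {
	uint64_t bitmap;	/* one bit per cached feature */
};

/*
 * Would be invoked from the KVM_SET_CPUID2 path, where an expensive
 * guest_cpuid_has()-style linear walk of the CPUID entries is tolerable.
 * The walk itself is stubbed out here as a boolean argument.
 */
static void cpuid_cache_refresh(struct vcpu_cpuid_cache *cache, bool has_pdcm)
{
	cache->bitmap = 0;
	if (has_pdcm)
		cache->bitmap |= 1ull << CACHED_FEATURE_PDCM;
}

/* O(1) lookup for hot paths, replacing the linear CPUID entry search. */
static bool cpuid_cache_has(const struct vcpu_cpuid_cache *cache,
			    enum cached_feature f)
{
	return cache->bitmap & (1ull << f);
}
```

As the following paragraph argues, such an explicit cache may be
unnecessary when the derived state (here, the effective LBR configuration)
can instead be computed once during PMU refresh.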
On 28/7/2022 7:34 am, Sean Christopherson wrote:
> -bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
> -{
> -	struct x86_pmu_lbr *lbr = vcpu_to_lbr_records(vcpu);
> -
> -	return lbr->nr && (vcpu_get_perf_capabilities(vcpu) & PMU_CAP_LBR_FMT);
> -}
> -
>  static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu *vcpu, u32 index)
>  {
>  	struct x86_pmu_lbr *records = vcpu_to_lbr_records(vcpu);
> @@ -590,7 +583,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
>  	bitmap_set(pmu->all_valid_pmc_idx,
>  		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
>
> -	if (cpuid_model_is_consistent(vcpu))
> +	perf_capabilities = vcpu_get_perf_capabilities(vcpu);
> +	if (cpuid_model_is_consistent(vcpu) &&
> +	    (perf_capabilities & PMU_CAP_LBR_FMT))
>  		x86_perf_get_lbr(&lbr_desc->records);

As one of the evil sources that added a CPUID walk to the critical path:

x86_perf_get_lbr() is one of the perf interfaces, and KVM cannot always
trust that the returned lbr_desc->records.nr is > 0.  If it is not, we
have to tweak perf_capabilities inside KVM, which violates user input
again.

Do you have more input on how to address this issue?

>  	else
>  		lbr_desc->records.nr = 0;
On Tue, Aug 02, 2022, Like Xu wrote:
> On 28/7/2022 7:34 am, Sean Christopherson wrote:
> > -bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
> > -{
> > -	struct x86_pmu_lbr *lbr = vcpu_to_lbr_records(vcpu);
> > -
> > -	return lbr->nr && (vcpu_get_perf_capabilities(vcpu) & PMU_CAP_LBR_FMT);
> > -}
> > -
> >  static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu *vcpu, u32 index)
> >  {
> >  	struct x86_pmu_lbr *records = vcpu_to_lbr_records(vcpu);
> > @@ -590,7 +583,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
> >  	bitmap_set(pmu->all_valid_pmc_idx,
> >  		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
> >
> > -	if (cpuid_model_is_consistent(vcpu))
> > +	perf_capabilities = vcpu_get_perf_capabilities(vcpu);
> > +	if (cpuid_model_is_consistent(vcpu) &&
> > +	    (perf_capabilities & PMU_CAP_LBR_FMT))
> >  		x86_perf_get_lbr(&lbr_desc->records);
>
> As one of the evil sources that added a CPUID walk to the critical path:
>
> x86_perf_get_lbr() is one of the perf interfaces, and KVM cannot always
> trust that the returned lbr_desc->records.nr is > 0.  If it is not, we
> have to tweak perf_capabilities inside KVM, which violates user input
> again.
>
> Do you have more input on how to address this issue?

First, drop the unnecessary stub and return value from x86_perf_get_lbr().
KVM selects PERF_EVENTS, so the stub, and thus the error path, can't be
hit.  I'll add patches to the series to do this.

Second, check the number of perf LBRs in vmx_get_perf_capabilities() and
advertise PMU_CAP_LBR_FMT iff perf fully supports LBRs.

---
From: Sean Christopherson <seanjc@google.com>
Date: Tue, 2 Aug 2022 07:45:33 -0700
Subject: [PATCH] KVM: VMX: Advertise PMU LBRs if and only if perf supports
 LBRs

Advertise LBR support to userspace via MSR_IA32_PERF_CAPABILITIES if and
only if perf fully supports LBRs.  Perf may disable LBRs (by zeroing the
number of LBRs) even on platforms that allegedly support LBRs, e.g. if
probing any LBR MSRs during setup fails.
Fixes: be635e34c284 ("KVM: vmx/pmu: Expose LBR_FMT in the MSR_IA32_PERF_CAPABILITIES")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/capabilities.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index c5e5dfef69c7..d2fdaf888d33 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -404,6 +404,7 @@ static inline bool vmx_pebs_supported(void)
 static inline u64 vmx_get_perf_capabilities(void)
 {
 	u64 perf_cap = PMU_CAP_FW_WRITES;
+	struct x86_pmu_lbr lbr;
 	u64 host_perf_cap = 0;
 
 	if (!enable_pmu)
@@ -412,7 +413,9 @@ static inline u64 vmx_get_perf_capabilities(void)
 	if (boot_cpu_has(X86_FEATURE_PDCM))
 		rdmsrl(MSR_IA32_PERF_CAPABILITIES, host_perf_cap);
 
-	perf_cap |= host_perf_cap & PMU_CAP_LBR_FMT;
+	x86_perf_get_lbr(&lbr);
+	if (lbr.nr)
+		perf_cap |= host_perf_cap & PMU_CAP_LBR_FMT;
 
 	if (vmx_pebs_supported()) {
 		perf_cap |= host_perf_cap & PERF_CAP_PEBS_MASK;

base-commit: 1f011a0755c2135b035cdee3b54e3adc426ec95c
--
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index cfcb590afaa7..d111dc0d86df 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -171,13 +171,6 @@ static inline struct kvm_pmc *get_fw_gp_pmc(struct kvm_pmu *pmu, u32 msr)
 	return get_gp_pmc(pmu, msr, MSR_IA32_PMC0);
 }
 
-bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
-{
-	struct x86_pmu_lbr *lbr = vcpu_to_lbr_records(vcpu);
-
-	return lbr->nr && (vcpu_get_perf_capabilities(vcpu) & PMU_CAP_LBR_FMT);
-}
-
 static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu *vcpu, u32 index)
 {
 	struct x86_pmu_lbr *records = vcpu_to_lbr_records(vcpu);
@@ -590,7 +583,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	bitmap_set(pmu->all_valid_pmc_idx,
 		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
 
-	if (cpuid_model_is_consistent(vcpu))
+	perf_capabilities = vcpu_get_perf_capabilities(vcpu);
+	if (cpuid_model_is_consistent(vcpu) &&
+	    (perf_capabilities & PMU_CAP_LBR_FMT))
 		x86_perf_get_lbr(&lbr_desc->records);
 	else
 		lbr_desc->records.nr = 0;
@@ -598,7 +593,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	if (lbr_desc->records.nr)
 		bitmap_set(pmu->all_valid_pmc_idx, INTEL_PMC_IDX_FIXED_VLBR, 1);
 
-	perf_capabilities = vcpu_get_perf_capabilities(vcpu);
 	if (perf_capabilities & PERF_CAP_PEBS_FORMAT) {
 		if (perf_capabilities & PERF_CAP_PEBS_BASELINE) {
 			pmu->pebs_enable_mask = counter_mask;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 690421b7d26c..c05e302fe2b1 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -526,9 +526,12 @@ static inline struct x86_pmu_lbr *vcpu_to_lbr_records(struct kvm_vcpu *vcpu)
 	return &vcpu_to_lbr_desc(vcpu)->records;
 }
 
+static inline bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
+{
+	return !!vcpu_to_lbr_records(vcpu)->nr;
+}
+
 void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu);
-bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu);
-
 int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu);
 void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu);
Now that the PMU is refreshed when MSR_IA32_PERF_CAPABILITIES is written
by host userspace, zero out the number of LBR records for a vCPU during
PMU refresh if PMU_CAP_LBR_FMT is not set in PERF_CAPABILITIES instead of
handling the check at run-time.

guest_cpuid_has() is expensive due to the linear search of guest CPUID
entries, intel_pmu_lbr_is_enabled() is checked on every VM-Enter, _and_
simply enumerating the same "Model" as the host causes KVM to set the
number of LBR records to a non-zero value.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/pmu_intel.c | 12 +++---------
 arch/x86/kvm/vmx/vmx.h       |  7 +++++--
 2 files changed, 8 insertions(+), 11 deletions(-)