
[3/3] KVM: VMX: Adjust number of LBR records for PERF_CAPABILITIES at refresh

Message ID 20220727233424.2968356-4-seanjc@google.com (mailing list archive)
State New, archived
Series KVM: x86: Intel PERF_CAPABILITIES fix and cleanups

Commit Message

Sean Christopherson July 27, 2022, 11:34 p.m. UTC
Now that the PMU is refreshed when MSR_IA32_PERF_CAPABILITIES is written
by host userspace, zero out the number of LBR records for a vCPU during
PMU refresh if PMU_CAP_LBR_FMT is not set in PERF_CAPABILITIES instead of
handling the check at run-time.

guest_cpuid_has() is expensive due to the linear search of guest CPUID
entries, intel_pmu_lbr_is_enabled() is checked on every VM-Enter, _and_
simply enumerating the same "Model" as the host causes KVM to set the
number of LBR records to a non-zero value.
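
For illustration, a minimal standalone sketch (simplified types, not KVM's
actual structures) of the difference: the old hot path re-derives PDCM with
a linear walk of the guest's CPUID entries on every VM-Enter, while the new
hot path is a single field test against state computed once at PMU refresh.

#include <stdbool.h>
#include <stdint.h>

struct cpuid_entry {
	uint32_t function, index;
	uint32_t eax, ebx, ecx, edx;
};

/* Old hot path (modeled): O(nent) walk on every VM-Enter. */
static bool cpuid_has_pdcm(const struct cpuid_entry *entries, int nent)
{
	for (int i = 0; i < nent; i++) {
		if (entries[i].function == 1 && entries[i].index == 0)
			return entries[i].ecx & (1u << 15); /* CPUID.1:ECX[15] == PDCM */
	}
	return false;
}

struct lbr_records_model {
	int nr;	/* zeroed at PMU refresh if LBRs aren't exposed */
};

/* New hot path (modeled): a single load and compare. */
static bool lbr_is_enabled(const struct lbr_records_model *lbr)
{
	return lbr->nr != 0;
}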

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/pmu_intel.c | 12 +++---------
 arch/x86/kvm/vmx/vmx.h       |  7 +++++--
 2 files changed, 8 insertions(+), 11 deletions(-)

Comments

Like Xu July 29, 2022, 10:10 a.m. UTC | #1
On 28/7/2022 7:34 am, Sean Christopherson wrote:
> guest_cpuid_has() is expensive due to the linear search of guest CPUID
> entries, intel_pmu_lbr_is_enabled() is checked on every VM-Enter, _and_
> simply enumerating the same "Model" as the host causes KVM to set the
> number of LBR records to a non-zero value.

Before reconsidering vcpu->arch.perf_capabilities and reaching a conclusion,
would this minor inline change help reduce my sins?

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0ecbbae42976..06a21d66be13 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7039,7 +7039,8 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
  	pt_guest_enter(vmx);

  	atomic_switch_perf_msrs(vmx);
-	if (intel_pmu_lbr_is_enabled(vcpu))
+	if (vmx->lbr_desc.records.nr &&
+	    (vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT))
  		vmx_passthrough_lbr_msrs(vcpu);

  	if (enable_preemption_timer)
Sean Christopherson July 29, 2022, 5:28 p.m. UTC | #2
On Fri, Jul 29, 2022, Like Xu wrote:
> On 28/7/2022 7:34 am, Sean Christopherson wrote:
> > guest_cpuid_has() is expensive due to the linear search of guest CPUID
> > entries, intel_pmu_lbr_is_enabled() is checked on every VM-Enter, _and_
> > simply enumerating the same "Model" as the host causes KVM to set the
> > number of LBR records to a non-zero value.
> 
> Before reconsidering vcpu->arch.perf_capabilities and reaching a conclusion,
> would this minor inline change help reduce my sins?
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 0ecbbae42976..06a21d66be13 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7039,7 +7039,8 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  	pt_guest_enter(vmx);
> 
>  	atomic_switch_perf_msrs(vmx);
> -	if (intel_pmu_lbr_is_enabled(vcpu))
> +	if (vmx->lbr_desc.records.nr &&
> +	    (vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT))

That doesn't do the right thing if X86_FEATURE_PDCM is cleared in guest CPUID.
It doesn't even require odd userspace behavior since intel_pmu_init() does:

	vcpu->arch.perf_capabilities = vmx_get_perf_capabilities();

E.g. older userspace that doesn't set MSR_IA32_PERF_CAPABILITIES will clear PDCM
without touching the vCPU's MSR value.
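
To make that ordering concrete, a tiny standalone model (illustrative
values; the PMU_CAP_LBR_FMT mask mirrors the one in capabilities.h):

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PMU_CAP_LBR_FMT 0x3fULL

int main(void)
{
	/* vCPU creation: intel_pmu_init() seeds the MSR from the host. */
	uint64_t perf_capabilities = PMU_CAP_LBR_FMT;

	/* Older userspace clears PDCM in guest CPUID... */
	bool guest_pdcm = false;

	/*
	 * ...but never writes MSR_IA32_PERF_CAPABILITIES, so a hot-path
	 * check on the MSR value alone disagrees with guest CPUID.
	 */
	assert((perf_capabilities & PMU_CAP_LBR_FMT) && !guest_pdcm);
	return 0;
}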

In the unlikely scenario we can't figure out a solution for PERF_CAPABILITIES,
the alternative I tried first is to implement a generic CPUID feature "caching"
scheme and use it to expedite the PDCM lookup.  I scrapped that approach when I
realized that KVM really should be able to consume PERF_CAPABILITIES during PMU
refresh.

I'm hesitant to even suggest a generic caching implementation because I suspect
most performance-critical uses of guest CPUID will be similar to PDCM, i.e. can
be captured during KVM_SET_CPUID2 without requiring an explicit cache.  And for
PERF_CAPABILITIES, IMO a robust implementation is a must-have, i.e. we've failed
if we can't handle it during PMU refresh.
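
For reference, the scrapped idea amounted to something like the following
rough sketch; all names here are hypothetical, not the abandoned patch:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-vCPU cache of hot CPUID feature bits, recomputed once
 * per KVM_SET_CPUID2 so hot paths avoid the linear entry walk. */
#define CACHED_PDCM	(1u << 0)

struct cpuid_feature_cache {
	uint32_t bits;
};

/* Refresh path: invoked after the guest CPUID entries are updated. */
static void feature_cache_refresh(struct cpuid_feature_cache *c, bool has_pdcm)
{
	c->bits = 0;
	if (has_pdcm)
		c->bits |= CACHED_PDCM;
}

/* Hot path, e.g. per VM-Enter: O(1) instead of a CPUID entry walk. */
static bool feature_cache_has_pdcm(const struct cpuid_feature_cache *c)
{
	return c->bits & CACHED_PDCM;
}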
Like Xu Aug. 2, 2022, 12:04 p.m. UTC | #3
On 28/7/2022 7:34 am, Sean Christopherson wrote:
> -bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
> -{
> -	struct x86_pmu_lbr *lbr = vcpu_to_lbr_records(vcpu);
> -
> -	return lbr->nr && (vcpu_get_perf_capabilities(vcpu) & PMU_CAP_LBR_FMT);
> -}
> -
>   static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu *vcpu, u32 index)
>   {
>   	struct x86_pmu_lbr *records = vcpu_to_lbr_records(vcpu);
> @@ -590,7 +583,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
>   	bitmap_set(pmu->all_valid_pmc_idx,
>   		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
>   
> -	if (cpuid_model_is_consistent(vcpu))
> +	perf_capabilities = vcpu_get_perf_capabilities(vcpu);
> +	if (cpuid_model_is_consistent(vcpu) &&
> +	    (perf_capabilities & PMU_CAP_LBR_FMT))
>   		x86_perf_get_lbr(&lbr_desc->records);

Speaking as the author of one of the evil sources that added a CPUID walk
to the critical path:

x86_perf_get_lbr() is one of the perf interfaces; KVM cannot always trust
that the returned lbr_desc->records.nr is > 0, and if it isn't, we have to
tweak perf_capabilities inside KVM, which again violates the user's input.

Do you have more input on how to address this issue?

>   	else
>   		lbr_desc->records.nr = 0;
Sean Christopherson Aug. 2, 2022, 2:57 p.m. UTC | #4
On Tue, Aug 02, 2022, Like Xu wrote:
> On 28/7/2022 7:34 am, Sean Christopherson wrote:
> > -bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
> > -{
> > -	struct x86_pmu_lbr *lbr = vcpu_to_lbr_records(vcpu);
> > -
> > -	return lbr->nr && (vcpu_get_perf_capabilities(vcpu) & PMU_CAP_LBR_FMT);
> > -}
> > -
> >   static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu *vcpu, u32 index)
> >   {
> >   	struct x86_pmu_lbr *records = vcpu_to_lbr_records(vcpu);
> > @@ -590,7 +583,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
> >   	bitmap_set(pmu->all_valid_pmc_idx,
> >   		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
> > -	if (cpuid_model_is_consistent(vcpu))
> > +	perf_capabilities = vcpu_get_perf_capabilities(vcpu);
> > +	if (cpuid_model_is_consistent(vcpu) &&
> > +	    (perf_capabilities & PMU_CAP_LBR_FMT))
> >   		x86_perf_get_lbr(&lbr_desc->records);
> 
> Speaking as the author of one of the evil sources that added a CPUID walk
> to the critical path:
> 
> x86_perf_get_lbr() is one of the perf interfaces; KVM cannot always trust
> that the returned lbr_desc->records.nr is > 0, and if it isn't, we have to
> tweak perf_capabilities inside KVM, which again violates the user's input.
> 
> Do you have more input on how to address this issue?

First, drop the unnecessary stub and return value from x86_perf_get_lbr().  KVM
selects PERF_EVENTS, so the stub and thus error path can't be hit.  I'll add
patches to the series to do this.
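
For that first point, the end state would look roughly like the below
(hypothetical sketch; the actual patches may differ):

-extern int x86_perf_get_lbr(struct x86_pmu_lbr *lbr);
+extern void x86_perf_get_lbr(struct x86_pmu_lbr *lbr);

i.e. delete the stub outright and stop checking a return value that is
always zero in practice.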

Second, check the number of perf LBRs in vmx_get_perf_capabilities() and advertise
PMU_CAP_LBR_FMT iff perf fully supports LBRs.

---
From: Sean Christopherson <seanjc@google.com>
Date: Tue, 2 Aug 2022 07:45:33 -0700
Subject: [PATCH] KVM: VMX: Advertise PMU LBRs if and only if perf supports
 LBRs

Advertise LBR support to userspace via MSR_IA32_PERF_CAPABILITIES if and
only if perf fully supports LBRs.  Perf may disable LBRs (by zeroing the
number of LBRs) even on platforms that allegedly support LBRs, e.g. if
probing any LBR MSRs during setup fails.

Fixes: be635e34c284 ("KVM: vmx/pmu: Expose LBR_FMT in the MSR_IA32_PERF_CAPABILITIES")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/capabilities.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index c5e5dfef69c7..d2fdaf888d33 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -404,6 +404,7 @@ static inline bool vmx_pebs_supported(void)
 static inline u64 vmx_get_perf_capabilities(void)
 {
 	u64 perf_cap = PMU_CAP_FW_WRITES;
+	struct x86_pmu_lbr lbr;
 	u64 host_perf_cap = 0;

 	if (!enable_pmu)
@@ -412,7 +413,9 @@ static inline u64 vmx_get_perf_capabilities(void)
 	if (boot_cpu_has(X86_FEATURE_PDCM))
 		rdmsrl(MSR_IA32_PERF_CAPABILITIES, host_perf_cap);

-	perf_cap |= host_perf_cap & PMU_CAP_LBR_FMT;
+	x86_perf_get_lbr(&lbr);
+	if (lbr.nr)
+		perf_cap |= host_perf_cap & PMU_CAP_LBR_FMT;

 	if (vmx_pebs_supported()) {
 		perf_cap |= host_perf_cap & PERF_CAP_PEBS_MASK;

base-commit: 1f011a0755c2135b035cdee3b54e3adc426ec95c
--

Patch

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index cfcb590afaa7..d111dc0d86df 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -171,13 +171,6 @@  static inline struct kvm_pmc *get_fw_gp_pmc(struct kvm_pmu *pmu, u32 msr)
 	return get_gp_pmc(pmu, msr, MSR_IA32_PMC0);
 }
 
-bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
-{
-	struct x86_pmu_lbr *lbr = vcpu_to_lbr_records(vcpu);
-
-	return lbr->nr && (vcpu_get_perf_capabilities(vcpu) & PMU_CAP_LBR_FMT);
-}
-
 static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu *vcpu, u32 index)
 {
 	struct x86_pmu_lbr *records = vcpu_to_lbr_records(vcpu);
@@ -590,7 +583,9 @@  static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	bitmap_set(pmu->all_valid_pmc_idx,
 		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
 
-	if (cpuid_model_is_consistent(vcpu))
+	perf_capabilities = vcpu_get_perf_capabilities(vcpu);
+	if (cpuid_model_is_consistent(vcpu) &&
+	    (perf_capabilities & PMU_CAP_LBR_FMT))
 		x86_perf_get_lbr(&lbr_desc->records);
 	else
 		lbr_desc->records.nr = 0;
@@ -598,7 +593,6 @@  static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	if (lbr_desc->records.nr)
 		bitmap_set(pmu->all_valid_pmc_idx, INTEL_PMC_IDX_FIXED_VLBR, 1);
 
-	perf_capabilities = vcpu_get_perf_capabilities(vcpu);
 	if (perf_capabilities & PERF_CAP_PEBS_FORMAT) {
 		if (perf_capabilities & PERF_CAP_PEBS_BASELINE) {
 			pmu->pebs_enable_mask = counter_mask;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 690421b7d26c..c05e302fe2b1 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -526,9 +526,12 @@  static inline struct x86_pmu_lbr *vcpu_to_lbr_records(struct kvm_vcpu *vcpu)
 	return &vcpu_to_lbr_desc(vcpu)->records;
 }
 
+static inline bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
+{
+	return !!vcpu_to_lbr_records(vcpu)->nr;
+}
+
 void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu);
-bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu);
-
 int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu);
 void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu);