Message ID | 20240126085444.324918-12-xiong.y.zhang@linux.intel.com (mailing list archive)
---|---
State | New, archived
Series | KVM: x86/pmu: Introduce passthrough vPMU
On Fri, Jan 26, 2024, Xiong Zhang wrote:
> Finally, always propagate enable_passthrough_pmu and perf_capabilities into
> kvm->arch for each KVM instance.

Why?  arch.enable_passthrough_pmu is simply "arch.enable_pmu && enable_passthrough_pmu",
I don't see any reason to cache that information on a per-VM basis.

Blech, it's also cached in vcpu->pmu.passthrough, which is even more complexity
that doesn't add any value.  E.g. code that is reachable iff the VM/vCPU has a
PMU can simply check the module param.

And if we commit to that model (all or nothing), then we can probably end up
with cleaner code overall, because we bifurcate everything at the module level,
e.g. we could even use static_call() if we had reason to.
On Fri, Jan 26, 2024, Xiong Zhang wrote:
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4432e736129f..074452aa700d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -193,6 +193,11 @@ bool __read_mostly enable_pmu = true;
>  EXPORT_SYMBOL_GPL(enable_pmu);
>  module_param(enable_pmu, bool, 0444);
>
> +/* Enable/disable PMU virtualization */

Heh, copy+paste fail.  Just omit the comment, it's pretty self-explanatory.

> +bool __read_mostly enable_passthrough_pmu = true;
> +EXPORT_SYMBOL_GPL(enable_passthrough_pmu);
> +module_param(enable_passthrough_pmu, bool, 0444);

Almost forgot.  Two things:

 1. KVM should not enable the passthrough/mediated PMU by default until it has
    reached feature parity with the existing PMU, because otherwise we are
    essentially breaking userspace.  And if for some reason the passthrough PMU
    *can't* reach feature parity, then (a) that's super interesting, and (b) we
    need a more explicit/deliberate transition plan.

 2. The module param absolutely must not be exposed to userspace until all
    patches are in place.  The easiest way to do that without creating
    dependency hell is to simply not create the module param.  I.e. this patch
    should do _only_

	bool __read_mostly enable_passthrough_pmu;
	EXPORT_SYMBOL_GPL(enable_passthrough_pmu);
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d7036982332e..f2e73e6830a3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1371,6 +1371,7 @@ struct kvm_arch {
 	bool bus_lock_detection_enabled;
 	bool enable_pmu;
+	bool enable_passthrough_pmu;
 
 	u32 notify_window;
 	u32 notify_vmexit_flags;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 1d64113de488..51011603c799 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -208,6 +208,20 @@ static inline void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
 			enable_pmu = false;
 	}
 
+	/* Pass-through vPMU is only supported in Intel CPUs. */
+	if (!is_intel)
+		enable_passthrough_pmu = false;
+
+	/*
+	 * Pass-through vPMU requires at least PerfMon version 4 because the
+	 * implementation requires the usage of MSR_CORE_PERF_GLOBAL_STATUS_SET
+	 * for counter emulation as well as PMU context switch. In addition, it
+	 * requires host PMU support on passthrough mode. Disable pass-through
+	 * vPMU if any condition fails.
+	 */
+	if (!enable_pmu || kvm_pmu_cap.version < 4 || !kvm_pmu_cap.passthrough)
+		enable_passthrough_pmu = false;
+
 	if (!enable_pmu) {
 		memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap));
 		return;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index be20a60047b1..e4610b80e519 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7835,13 +7835,14 @@ static u64 vmx_get_perf_capabilities(void)
 	if (boot_cpu_has(X86_FEATURE_PDCM))
 		rdmsrl(MSR_IA32_PERF_CAPABILITIES, host_perf_cap);
 
-	if (!cpu_feature_enabled(X86_FEATURE_ARCH_LBR)) {
+	if (!cpu_feature_enabled(X86_FEATURE_ARCH_LBR) &&
+	    !enable_passthrough_pmu) {
 		x86_perf_get_lbr(&lbr);
 		if (lbr.nr)
 			perf_cap |= host_perf_cap & PMU_CAP_LBR_FMT;
 	}
 
-	if (vmx_pebs_supported()) {
+	if (vmx_pebs_supported() && !enable_passthrough_pmu) {
 		perf_cap |= host_perf_cap & PERF_CAP_PEBS_MASK;
 		if ((perf_cap & PERF_CAP_PEBS_FORMAT) < 4)
 			perf_cap &= ~PERF_CAP_PEBS_BASELINE;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4432e736129f..074452aa700d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -193,6 +193,11 @@ bool __read_mostly enable_pmu = true;
 EXPORT_SYMBOL_GPL(enable_pmu);
 module_param(enable_pmu, bool, 0444);
 
+/* Enable/disable PMU virtualization */
+bool __read_mostly enable_passthrough_pmu = true;
+EXPORT_SYMBOL_GPL(enable_passthrough_pmu);
+module_param(enable_passthrough_pmu, bool, 0444);
+
 bool __read_mostly eager_page_split = true;
 module_param(eager_page_split, bool, 0644);
 
@@ -6553,6 +6558,9 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 	mutex_lock(&kvm->lock);
 	if (!kvm->created_vcpus) {
 		kvm->arch.enable_pmu = !(cap->args[0] & KVM_PMU_CAP_DISABLE);
+		/* Disable passthrough PMU if enable_pmu is false. */
+		if (!kvm->arch.enable_pmu)
+			kvm->arch.enable_passthrough_pmu = false;
 		r = 0;
 	}
 	mutex_unlock(&kvm->lock);
@@ -12480,6 +12488,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	kvm->arch.default_tsc_khz = max_tsc_khz ? : tsc_khz;
 	kvm->arch.guest_can_read_msr_platform_info = true;
 	kvm->arch.enable_pmu = enable_pmu;
+	kvm->arch.enable_passthrough_pmu = enable_passthrough_pmu;
 
 #if IS_ENABLED(CONFIG_HYPERV)
 	spin_lock_init(&kvm->arch.hv_root_tdp_lock);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 5184fde1dc54..38b73e98eae9 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -329,6 +329,7 @@ extern u64 host_arch_capabilities;
 extern struct kvm_caps kvm_caps;
 
 extern bool enable_pmu;
+extern bool enable_passthrough_pmu;
 
 /*
  * Get a filtered version of KVM's supported XCR0 that strips out dynamic