Message ID | 20210823193709.55886-3-seanjc@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | perf/x86/intel: KVM: PT intr handler fix and cleanup | expand |
Sean Christopherson <seanjc@google.com> writes: > diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h > index 0e4f2b1fa9fb..b06dbbd7eeeb 100644 > --- a/arch/x86/kvm/pmu.h > +++ b/arch/x86/kvm/pmu.h > @@ -41,6 +41,7 @@ struct kvm_pmu_ops { > void (*reset)(struct kvm_vcpu *vcpu); > void (*deliver_pmi)(struct kvm_vcpu *vcpu); > void (*cleanup)(struct kvm_vcpu *vcpu); > + void (*handle_intel_pt_intr)(void); What's this one for? Regards, -- Alex
On Tue, Aug 24, 2021, Alexander Shishkin wrote: > Sean Christopherson <seanjc@google.com> writes: > > > diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h > > index 0e4f2b1fa9fb..b06dbbd7eeeb 100644 > > --- a/arch/x86/kvm/pmu.h > > +++ b/arch/x86/kvm/pmu.h > > @@ -41,6 +41,7 @@ struct kvm_pmu_ops { > > void (*reset)(struct kvm_vcpu *vcpu); > > void (*deliver_pmi)(struct kvm_vcpu *vcpu); > > void (*cleanup)(struct kvm_vcpu *vcpu); > > + void (*handle_intel_pt_intr)(void); > > What's this one for? Doh, the remnants of one of my explorations trying to figure out the least gross way to conditionally register the handling. I'll get it removed. Good eyeballs, thanks!
On 24/8/2021 3:37 am, Sean Christopherson wrote: > Override the Processor Trace (PT) interrupt handler for guest mode if and > only if PT is configured for host+guest mode, i.e. is being used > independently by both host and guest. If PT is configured for system > mode, the host fully controls PT and must handle all events. > > Fixes: 8479e04e7d6b ("KVM: x86: Inject PMI for KVM guest") > Cc: stable@vger.kernel.org > Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> > Reported-by: Artem Kashkanov <artem.kashkanov@intel.com> > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/include/asm/kvm_host.h | 1 + > arch/x86/kvm/pmu.h | 1 + > arch/x86/kvm/vmx/vmx.c | 1 + > arch/x86/kvm/x86.c | 4 +++- > 4 files changed, 6 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index 09b256db394a..1ea4943a73d7 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -1494,6 +1494,7 @@ struct kvm_x86_init_ops { > int (*disabled_by_bios)(void); > int (*check_processor_compatibility)(void); > int (*hardware_setup)(void); > + bool (*intel_pt_intr_in_guest)(void); > > struct kvm_x86_ops *runtime_ops; > }; > diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h > index 0e4f2b1fa9fb..b06dbbd7eeeb 100644 > --- a/arch/x86/kvm/pmu.h > +++ b/arch/x86/kvm/pmu.h > @@ -41,6 +41,7 @@ struct kvm_pmu_ops { > void (*reset)(struct kvm_vcpu *vcpu); > void (*deliver_pmi)(struct kvm_vcpu *vcpu); > void (*cleanup)(struct kvm_vcpu *vcpu); > + void (*handle_intel_pt_intr)(void); > }; > > static inline u64 pmc_bitmask(struct kvm_pmc *pmc) > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index fada1055f325..f19d72136f77 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -7896,6 +7896,7 @@ static struct kvm_x86_init_ops vmx_init_ops __initdata = { > .disabled_by_bios = vmx_disabled_by_bios, > .check_processor_compatibility = vmx_check_processor_compat, > .hardware_setup = hardware_setup, > + .intel_pt_intr_in_guest = vmx_pt_mode_is_host_guest, > > .runtime_ops = &vmx_x86_ops, > }; > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index fb6015f97f9e..b5ade47dad9c 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -8305,7 +8305,7 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = { > .is_in_guest = kvm_is_in_guest, > .is_user_mode = kvm_is_user_mode, > .get_guest_ip = kvm_get_guest_ip, > - .handle_intel_pt_intr = kvm_handle_intel_pt_intr, > + .handle_intel_pt_intr = NULL, > }; > > #ifdef CONFIG_X86_64 > @@ -11061,6 +11061,8 @@ int kvm_arch_hardware_setup(void *opaque) > memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops)); > kvm_ops_static_call_update(); > > + if (ops->intel_pt_intr_in_guest && ops->intel_pt_intr_in_guest()) > + kvm_guest_cbs.handle_intel_pt_intr = kvm_handle_intel_pt_intr; Emm, it's still buggy. The guest "unknown NMI" from the host Intel PT can still be reproduced after the following operation: rmmod kvm_intel modprobe kvm-intel pt_mode=1 ept=1 rmmod kvm_intel modprobe kvm-intel pt_mode=1 ept=0 Since the handle_intel_pt_intr is not reset to NULL in kvm_arch_hardware_unsetup(), and the previous function pointer still exists in the generic KVM data structure. But I have to say that this fix is much better than the one I proposed [1]. [1] https://lore.kernel.org/lkml/20210514084436.848396-1-like.xu@linux.intel.com/ > perf_register_guest_info_callbacks(&kvm_guest_cbs); > > if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES)) >
On Wed, Aug 25, 2021, Like Xu wrote: > On 24/8/2021 3:37 am, Sean Christopherson wrote: > > @@ -11061,6 +11061,8 @@ int kvm_arch_hardware_setup(void *opaque) > > memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops)); > > kvm_ops_static_call_update(); > > + if (ops->intel_pt_intr_in_guest && ops->intel_pt_intr_in_guest()) > > + kvm_guest_cbs.handle_intel_pt_intr = kvm_handle_intel_pt_intr; > > Emm, it's still buggy. > > The guest "unknown NMI" from the host Intel PT can still be reproduced > after the following operation: > > rmmod kvm_intel > modprobe kvm-intel pt_mode=1 ept=1 > rmmod kvm_intel > modprobe kvm-intel pt_mode=1 ept=0 > > Since the handle_intel_pt_intr is not reset to NULL in kvm_arch_hardware_unsetup(), > and the previous function pointer still exists in the generic KVM data structure. Ooof, good catch. Any preference between nullifying handle_intel_pt_intr in setup() vs. unsetup()? I think I like the idea of "unwinding" during unsetup(), even though it splits the logic a bit. diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b5ade47dad9c..ffc6c2d73508 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11093,6 +11093,7 @@ int kvm_arch_hardware_setup(void *opaque) void kvm_arch_hardware_unsetup(void) { perf_unregister_guest_info_callbacks(&kvm_guest_cbs); + kvm_guest_cbs.handle_intel_pt_intr = NULL; static_call(kvm_x86_hardware_unsetup)(); } vs. diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b5ade47dad9c..462aa7a360e9 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11063,6 +11063,8 @@ int kvm_arch_hardware_setup(void *opaque) if (ops->intel_pt_intr_in_guest && ops->intel_pt_intr_in_guest()) kvm_guest_cbs.handle_intel_pt_intr = kvm_handle_intel_pt_intr; + else + kvm_guest_cbs.handle_intel_pt_intr = NULL; perf_register_guest_info_callbacks(&kvm_guest_cbs); if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES)) > But I have to say that this fix is much better than the one I proposed [1]. > > [1] https://lore.kernel.org/lkml/20210514084436.848396-1-like.xu@linux.intel.com/ > > > perf_register_guest_info_callbacks(&kvm_guest_cbs); > > if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES)) > >
On Wed, Aug 25, 2021, Sean Christopherson wrote: > On Wed, Aug 25, 2021, Like Xu wrote: > > On 24/8/2021 3:37 am, Sean Christopherson wrote: > > > @@ -11061,6 +11061,8 @@ int kvm_arch_hardware_setup(void *opaque) > > > memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops)); > > > kvm_ops_static_call_update(); > > > + if (ops->intel_pt_intr_in_guest && ops->intel_pt_intr_in_guest()) > > > + kvm_guest_cbs.handle_intel_pt_intr = kvm_handle_intel_pt_intr; > > > > Emm, it's still buggy. > > > > The guest "unknown NMI" from the host Intel PT can still be reproduced > > after the following operation: > > > > rmmod kvm_intel > > modprobe kvm-intel pt_mode=1 ept=1 > > rmmod kvm_intel > > modprobe kvm-intel pt_mode=1 ept=0 > > > > Since the handle_intel_pt_intr is not reset to NULL in kvm_arch_hardware_unsetup(), > > and the previous function pointer still exists in the generic KVM data structure. > > Ooof, good catch. Any preference between nullifying handle_intel_pt_intr in > setup() vs. unsetup()? I think I like the idea of "unwinding" during unsetup(), > even though it splits the logic a bit. Never mind, I figured out a way to clean all this up and land the PT interrupt handler in vmx.c where it belongs. Getting there is a bit of a journey, but it's very doable. That means unwinding in unsetup() is the preferred approach, otherwise there would be potential for leaving a dangling pointer if a different vendor module was succesfully loaded.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 09b256db394a..1ea4943a73d7 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1494,6 +1494,7 @@ struct kvm_x86_init_ops { int (*disabled_by_bios)(void); int (*check_processor_compatibility)(void); int (*hardware_setup)(void); + bool (*intel_pt_intr_in_guest)(void); struct kvm_x86_ops *runtime_ops; }; diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h index 0e4f2b1fa9fb..b06dbbd7eeeb 100644 --- a/arch/x86/kvm/pmu.h +++ b/arch/x86/kvm/pmu.h @@ -41,6 +41,7 @@ struct kvm_pmu_ops { void (*reset)(struct kvm_vcpu *vcpu); void (*deliver_pmi)(struct kvm_vcpu *vcpu); void (*cleanup)(struct kvm_vcpu *vcpu); + void (*handle_intel_pt_intr)(void); }; static inline u64 pmc_bitmask(struct kvm_pmc *pmc) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index fada1055f325..f19d72136f77 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7896,6 +7896,7 @@ static struct kvm_x86_init_ops vmx_init_ops __initdata = { .disabled_by_bios = vmx_disabled_by_bios, .check_processor_compatibility = vmx_check_processor_compat, .hardware_setup = hardware_setup, + .intel_pt_intr_in_guest = vmx_pt_mode_is_host_guest, .runtime_ops = &vmx_x86_ops, }; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index fb6015f97f9e..b5ade47dad9c 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8305,7 +8305,7 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = { .is_in_guest = kvm_is_in_guest, .is_user_mode = kvm_is_user_mode, .get_guest_ip = kvm_get_guest_ip, - .handle_intel_pt_intr = kvm_handle_intel_pt_intr, + .handle_intel_pt_intr = NULL, }; #ifdef CONFIG_X86_64 @@ -11061,6 +11061,8 @@ int kvm_arch_hardware_setup(void *opaque) memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops)); kvm_ops_static_call_update(); + if (ops->intel_pt_intr_in_guest && ops->intel_pt_intr_in_guest()) + kvm_guest_cbs.handle_intel_pt_intr = kvm_handle_intel_pt_intr; perf_register_guest_info_callbacks(&kvm_guest_cbs); if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
Override the Processor Trace (PT) interrupt handler for guest mode if and only if PT is configured for host+guest mode, i.e. is being used independently by both host and guest. If PT is configured for system mode, the host fully controls PT and must handle all events. Fixes: 8479e04e7d6b ("KVM: x86: Inject PMI for KVM guest") Cc: stable@vger.kernel.org Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reported-by: Artem Kashkanov <artem.kashkanov@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/pmu.h | 1 + arch/x86/kvm/vmx/vmx.c | 1 + arch/x86/kvm/x86.c | 4 +++- 4 files changed, 6 insertions(+), 1 deletion(-)