Message ID | 20231204074535.9567-1-likexu@tencent.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: x86/intr: Explicitly check NMI from guest to eliminate false positives | expand |
On Mon, Dec 04, 2023, Like Xu wrote: > From: Like Xu <likexu@tencent.com> > > Explicitly checking the source of external interrupt is indeed NMI and not > other types in the kvm_arch_pmi_in_guest(), which prevents perf-kvm false > positive samples generated after vm-exit but before kvm_before_interrupt() > from being incorrectly labelled as guest samples: ... > Fixes: 73cd107b9685 ("KVM: x86: Drop current_vcpu for kvm_running_vcpu + kvm_arch_vcpu variable") The behavior is deliberate, and was added by commit dd60d217062f ("KVM: x86: Fix perf timer mode IP reporting"). *If* we want to undo that, then the best "fix" would be to effective reverting that commit by dropping the IRQ usage of kvm_before_interrupt() and renaming the helpers kvm_{before,after}_nmi(). But my understanding is that the behavior is necessary for select PMU usage.
On 4/12/2023 11:55 pm, Sean Christopherson wrote: > On Mon, Dec 04, 2023, Like Xu wrote: >> From: Like Xu <likexu@tencent.com> >> >> Explicitly checking the source of external interrupt is indeed NMI and not >> other types in the kvm_arch_pmi_in_guest(), which prevents perf-kvm false >> positive samples generated after vm-exit but before kvm_before_interrupt() >> from being incorrectly labelled as guest samples: > > ... > >> Fixes: 73cd107b9685 ("KVM: x86: Drop current_vcpu for kvm_running_vcpu + kvm_arch_vcpu variable") > > The behavior is deliberate, and was added by commit dd60d217062f ("KVM: x86: Fix > perf timer mode IP reporting"). *If* we want to undo that, then the best "fix" > would be to effective reverting that commit by dropping the IRQ usage of > kvm_before_interrupt() and renaming the helpers kvm_{before,after}_nmi(). But > my understanding is that the behavior is necessary for select PMU usage. Thanks for your comment. Yes, the commit dd60d217062f should be tracked. We don't have to undo the commit, and we also don't want to hurt either of the perf/core use cases (including perf NMI and timer modes). Thanks to the introduction of "enum kvm_intr_type", we can cover both cases instead of sacrificing one of two modes, how about: diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index c8c7e2475a18..5db607a564c6 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1868,8 +1868,16 @@ static inline int kvm_arch_flush_remote_tlbs_range(struct kvm *kvm, gfn_t gfn, } #endif /* CONFIG_HYPERV */ +enum kvm_intr_type { + /* Values are arbitrary, but must be non-zero. */ + KVM_HANDLING_IRQ = 1, + KVM_HANDLING_NMI, +}; + +/* Linux always use NMI for PMU. */ #define kvm_arch_pmi_in_guest(vcpu) \ - ((vcpu) && (vcpu)->arch.handling_intr_from_guest) + ((vcpu) && (vcpu)->arch.handling_intr_from_guest && \ + (in_nmi() == ((vcpu)->arch.handling_intr_from_guest == KVM_HANDLING_NMI))) void __init kvm_mmu_x86_module_init(void); int kvm_mmu_vendor_module_init(void); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 2f7e19166658..4dc38092d599 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -431,12 +431,6 @@ static inline bool kvm_notify_vmexit_enabled(struct kvm *kvm) return kvm->arch.notify_vmexit_flags & KVM_X86_NOTIFY_VMEXIT_ENABLED; } -enum kvm_intr_type { - /* Values are arbitrary, but must be non-zero. */ - KVM_HANDLING_IRQ = 1, - KVM_HANDLING_NMI, -}; - static __always_inline void kvm_before_interrupt(struct kvm_vcpu *vcpu, enum kvm_intr_type intr) {
On Tue, Dec 05, 2023, Like Xu wrote: > On 4/12/2023 11:55 pm, Sean Christopherson wrote: > > On Mon, Dec 04, 2023, Like Xu wrote: > > > From: Like Xu <likexu@tencent.com> > > > > > > Explicitly checking the source of external interrupt is indeed NMI and not > > > other types in the kvm_arch_pmi_in_guest(), which prevents perf-kvm false > > > positive samples generated after vm-exit but before kvm_before_interrupt() > > > from being incorrectly labelled as guest samples: > > > > ... > > > > > Fixes: 73cd107b9685 ("KVM: x86: Drop current_vcpu for kvm_running_vcpu + kvm_arch_vcpu variable") > > > > The behavior is deliberate, and was added by commit dd60d217062f ("KVM: x86: Fix > > perf timer mode IP reporting"). *If* we want to undo that, then the best "fix" > > would be to effective reverting that commit by dropping the IRQ usage of > > kvm_before_interrupt() and renaming the helpers kvm_{before,after}_nmi(). But > > my understanding is that the behavior is necessary for select PMU usage. > > Thanks for your comment. Yes, the commit dd60d217062f should be tracked. > > We don't have to undo the commit, and we also don't want to hurt > either of the perf/core use cases (including perf NMI and timer modes). > > Thanks to the introduction of "enum kvm_intr_type", we can cover both cases > instead of sacrificing one of two modes, how about: Hmm, yeah, that should work. It's not the prettiest thing, but I don't see an easy way to remedy that (I tried). False positives are still possible, but it's a clear improvement over what we have. > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index c8c7e2475a18..5db607a564c6 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -1868,8 +1868,16 @@ static inline int > kvm_arch_flush_remote_tlbs_range(struct kvm *kvm, gfn_t gfn, > } > #endif /* CONFIG_HYPERV */ > > +enum kvm_intr_type { > + /* Values are arbitrary, but must be non-zero. */ > + KVM_HANDLING_IRQ = 1, > + KVM_HANDLING_NMI, > +}; > + > +/* Linux always use NMI for PMU. */ > #define kvm_arch_pmi_in_guest(vcpu) \ > - ((vcpu) && (vcpu)->arch.handling_intr_from_guest) > + ((vcpu) && (vcpu)->arch.handling_intr_from_guest && \ > + (in_nmi() == ((vcpu)->arch.handling_intr_from_guest == KVM_HANDLING_NMI))) > > void __init kvm_mmu_x86_module_init(void); > int kvm_mmu_vendor_module_init(void); > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h > index 2f7e19166658..4dc38092d599 100644 > --- a/arch/x86/kvm/x86.h > +++ b/arch/x86/kvm/x86.h > @@ -431,12 +431,6 @@ static inline bool kvm_notify_vmexit_enabled(struct kvm *kvm) > return kvm->arch.notify_vmexit_flags & KVM_X86_NOTIFY_VMEXIT_ENABLED; > } > > -enum kvm_intr_type { > - /* Values are arbitrary, but must be non-zero. */ > - KVM_HANDLING_IRQ = 1, > - KVM_HANDLING_NMI, > -}; > - > static __always_inline void kvm_before_interrupt(struct kvm_vcpu *vcpu, > enum kvm_intr_type intr) > { > -- > 2.43.0 > > I noticed that timer mode is used when perf-record uses SW events like > "cpu-clock" event, > with or without hw-PMU support. My side of the tests no longer show false > positives and > the above diff does not affect the use of perf timer mode as well. Any better move ?
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index c8c7e2475a18..93e667f3099d 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1868,8 +1868,15 @@ static inline int kvm_arch_flush_remote_tlbs_range(struct kvm *kvm, gfn_t gfn, } #endif /* CONFIG_HYPERV */ +enum kvm_intr_type { + /* Values are arbitrary, but must be non-zero. */ + KVM_HANDLING_IRQ = 1, + KVM_HANDLING_NMI, +}; + +/* Linux always use NMI for PMU. */ #define kvm_arch_pmi_in_guest(vcpu) \ - ((vcpu) && (vcpu)->arch.handling_intr_from_guest) + ((vcpu) && ((vcpu)->arch.handling_intr_from_guest == KVM_HANDLING_NMI)) void __init kvm_mmu_x86_module_init(void); int kvm_mmu_vendor_module_init(void); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 2f7e19166658..4dc38092d599 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -431,12 +431,6 @@ static inline bool kvm_notify_vmexit_enabled(struct kvm *kvm) return kvm->arch.notify_vmexit_flags & KVM_X86_NOTIFY_VMEXIT_ENABLED; } -enum kvm_intr_type { - /* Values are arbitrary, but must be non-zero. */ - KVM_HANDLING_IRQ = 1, - KVM_HANDLING_NMI, -}; - static __always_inline void kvm_before_interrupt(struct kvm_vcpu *vcpu, enum kvm_intr_type intr) {