diff mbox series

KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest

Message ID 20220515171633.902901-1-yanfei.xu@intel.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest | expand

Commit Message

Yanfei Xu May 15, 2022, 5:16 p.m. UTC
When kernel handles the vm-exit caused by external interrupts and PMI,
it always set a type of kvm_intr_type to handling_intr_from_guest to
tell if it's dealing an IRQ or NMI.
However, the further type judgment is missing in kvm_arch_pmi_in_guest().
It could make the PMI of intel_pt wrongly considered it comes from a
guest once the PMI breaks the handling of vm-exit of external interrupts.

Fixes: db215756ae59 ("KVM: x86: More precisely identify NMI from guest when handling PMI")
Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 8 +++++++-
 arch/x86/kvm/x86.h              | 6 ------
 2 files changed, 7 insertions(+), 7 deletions(-)

Comments

Sean Christopherson May 16, 2022, 1:58 p.m. UTC | #1
On Mon, May 16, 2022, Yanfei Xu wrote:
> When kernel handles the vm-exit caused by external interrupts and PMI,
> it always set a type of kvm_intr_type to handling_intr_from_guest to
> tell if it's dealing an IRQ or NMI.
> However, the further type judgment is missing in kvm_arch_pmi_in_guest().
> It could make the PMI of intel_pt wrongly considered it comes from a
> guest once the PMI breaks the handling of vm-exit of external interrupts.
> 
> Fixes: db215756ae59 ("KVM: x86: More precisely identify NMI from guest when handling PMI")
> Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 8 +++++++-
>  arch/x86/kvm/x86.h              | 6 ------
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 4ff36610af6a..308cf19f123d 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1582,8 +1582,14 @@ static inline int kvm_arch_flush_remote_tlb(struct kvm *kvm)
>  		return -ENOTSUPP;
>  }
>  
> +enum kvm_intr_type {
> +	/* Values are arbitrary, but must be non-zero. */
> +	KVM_HANDLING_IRQ = 1,
> +	KVM_HANDLING_NMI,
> +};
> +
>  #define kvm_arch_pmi_in_guest(vcpu) \
> -	((vcpu) && (vcpu)->arch.handling_intr_from_guest)
> +	((vcpu) && (vcpu)->arch.handling_intr_from_guest == KVM_HANDLING_NMI)

My understanding is that this isn't correct as a general change, as perf events
can use regular IRQs in some cases.  See commit dd60d217062f4 ("KVM: x86: Fix perf
timer mode IP reporting").

I assume there's got to be a way to know which mode perf is using, e.g. we should
be able to make this look something like:

	((vcpu) && (vcpu)->arch.handling_intr_from_guest == kvm_pmi_vector)


>  void kvm_mmu_x86_module_init(void);
>  int kvm_mmu_vendor_module_init(void);
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 588792f00334..3bdf1bc76863 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -344,12 +344,6 @@ static inline bool kvm_cstate_in_guest(struct kvm *kvm)
>  	return kvm->arch.cstate_in_guest;
>  }
>  
> -enum kvm_intr_type {
> -	/* Values are arbitrary, but must be non-zero. */
> -	KVM_HANDLING_IRQ = 1,
> -	KVM_HANDLING_NMI,
> -};
> -
>  static inline void kvm_before_interrupt(struct kvm_vcpu *vcpu,
>  					enum kvm_intr_type intr)
>  {
> -- 
> 2.32.0
>
Yanfei Xu May 18, 2022, 10:01 a.m. UTC | #2
On 2022/5/16 21:58, Sean Christopherson wrote:
> On Mon, May 16, 2022, Yanfei Xu wrote:
>> When kernel handles the vm-exit caused by external interrupts and PMI,
>> it always set a type of kvm_intr_type to handling_intr_from_guest to
>> tell if it's dealing an IRQ or NMI.
>> However, the further type judgment is missing in kvm_arch_pmi_in_guest().
>> It could make the PMI of intel_pt wrongly considered it comes from a
>> guest once the PMI breaks the handling of vm-exit of external interrupts.
>>
>> Fixes: db215756ae59 ("KVM: x86: More precisely identify NMI from guest when handling PMI")
>> Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
>> ---
>>   arch/x86/include/asm/kvm_host.h | 8 +++++++-
>>   arch/x86/kvm/x86.h              | 6 ------
>>   2 files changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 4ff36610af6a..308cf19f123d 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1582,8 +1582,14 @@ static inline int kvm_arch_flush_remote_tlb(struct kvm *kvm)
>>   		return -ENOTSUPP;
>>   }
>>   
>> +enum kvm_intr_type {
>> +	/* Values are arbitrary, but must be non-zero. */
>> +	KVM_HANDLING_IRQ = 1,
>> +	KVM_HANDLING_NMI,
>> +};
>> +
>>   #define kvm_arch_pmi_in_guest(vcpu) \
>> -	((vcpu) && (vcpu)->arch.handling_intr_from_guest)
>> +	((vcpu) && (vcpu)->arch.handling_intr_from_guest == KVM_HANDLING_NMI)
> My understanding is that this isn't correct as a general change, as perf events
> can use regular IRQs in some cases.  See commit dd60d217062f4 ("KVM: x86: Fix perf
> timer mode IP reporting").
>
> I assume there's got to be a way to know which mode perf is using, e.g. we should
> be able to make this look something like:
>
> 	((vcpu) && (vcpu)->arch.handling_intr_from_guest == kvm_pmi_vector)

Thanks for your comments, I am going to understand these clearly and 
then reply or give a next version.

Regards,
Yanfei

>
>>   void kvm_mmu_x86_module_init(void);
>>   int kvm_mmu_vendor_module_init(void);
>> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>> index 588792f00334..3bdf1bc76863 100644
>> --- a/arch/x86/kvm/x86.h
>> +++ b/arch/x86/kvm/x86.h
>> @@ -344,12 +344,6 @@ static inline bool kvm_cstate_in_guest(struct kvm *kvm)
>>   	return kvm->arch.cstate_in_guest;
>>   }
>>   
>> -enum kvm_intr_type {
>> -	/* Values are arbitrary, but must be non-zero. */
>> -	KVM_HANDLING_IRQ = 1,
>> -	KVM_HANDLING_NMI,
>> -};
>> -
>>   static inline void kvm_before_interrupt(struct kvm_vcpu *vcpu,
>>   					enum kvm_intr_type intr)
>>   {
>> -- 
>> 2.32.0
>>
Yanfei Xu May 20, 2022, 9:54 a.m. UTC | #3
Hi Sean,
You are right, the change of kvm_arch_pmi_in_guest is incorrect, because it should cover two cases of PMI. 
For the PMI of intel pt, it certainly is the NMI PMI. So how about fixing it like below?

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 610355b9ccce..378036c1cf94 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7856,7 +7856,7 @@ static unsigned int vmx_handle_intel_pt_intr(void)
        struct kvm_vcpu *vcpu = kvm_get_running_vcpu();

        /* '0' on failure so that the !PT case can use a RET0 static call. */
-       if (!kvm_arch_pmi_in_guest(vcpu))
+       if (!kvm_handling_nmi_from_guest(vcpu))
                return 0;

        kvm_make_request(KVM_REQ_PMI, vcpu);

Thanks,
Yanfei

-----Original Message-----
From: Sean Christopherson <seanjc@google.com> 
Sent: Monday, May 16, 2022 9:58 PM
To: Xu, Yanfei <yanfei.xu@intel.com>
Cc: pbonzini@redhat.com; vkuznets@redhat.com; wanpengli@tencent.com; jmattson@google.com; joro@8bytes.org; tglx@linutronix.de; mingo@redhat.com; bp@alien8.de; dave.hansen@linux.intel.com; x86@kernel.org; kvm@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest

On Mon, May 16, 2022, Yanfei Xu wrote:
> When kernel handles the vm-exit caused by external interrupts and PMI, 
> it always set a type of kvm_intr_type to handling_intr_from_guest to 
> tell if it's dealing an IRQ or NMI.
> However, the further type judgment is missing in kvm_arch_pmi_in_guest().
> It could make the PMI of intel_pt wrongly considered it comes from a 
> guest once the PMI breaks the handling of vm-exit of external interrupts.
> 
> Fixes: db215756ae59 ("KVM: x86: More precisely identify NMI from guest 
> when handling PMI")
> Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 8 +++++++-
>  arch/x86/kvm/x86.h              | 6 ------
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h 
> b/arch/x86/include/asm/kvm_host.h index 4ff36610af6a..308cf19f123d 
> 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1582,8 +1582,14 @@ static inline int kvm_arch_flush_remote_tlb(struct kvm *kvm)
>  		return -ENOTSUPP;
>  }
>  
> +enum kvm_intr_type {
> +	/* Values are arbitrary, but must be non-zero. */
> +	KVM_HANDLING_IRQ = 1,
> +	KVM_HANDLING_NMI,
> +};
> +
>  #define kvm_arch_pmi_in_guest(vcpu) \
> -	((vcpu) && (vcpu)->arch.handling_intr_from_guest)
> +	((vcpu) && (vcpu)->arch.handling_intr_from_guest == 
> +KVM_HANDLING_NMI)

My understanding is that this isn't correct as a general change, as perf events can use regular IRQs in some cases.  See commit dd60d217062f4 ("KVM: x86: Fix perf timer mode IP reporting").

I assume there's got to be a way to know which mode perf is using, e.g. we should be able to make this look something like:

	((vcpu) && (vcpu)->arch.handling_intr_from_guest == kvm_pmi_vector)


>  void kvm_mmu_x86_module_init(void);
>  int kvm_mmu_vendor_module_init(void); diff --git a/arch/x86/kvm/x86.h 
> b/arch/x86/kvm/x86.h index 588792f00334..3bdf1bc76863 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -344,12 +344,6 @@ static inline bool kvm_cstate_in_guest(struct kvm *kvm)
>  	return kvm->arch.cstate_in_guest;
>  }
>  
> -enum kvm_intr_type {
> -	/* Values are arbitrary, but must be non-zero. */
> -	KVM_HANDLING_IRQ = 1,
> -	KVM_HANDLING_NMI,
> -};
> -
>  static inline void kvm_before_interrupt(struct kvm_vcpu *vcpu,
>  					enum kvm_intr_type intr)
>  {
> --
> 2.32.0
>
Sean Christopherson May 20, 2022, 4:31 p.m. UTC | #4
Please don't top-post.

On Fri, May 20, 2022, Xu, Yanfei wrote:
> From: Sean Christopherson <seanjc@google.com> 
> On Mon, May 16, 2022, Yanfei Xu wrote:
> > diff --git a/arch/x86/include/asm/kvm_host.h 
> > b/arch/x86/include/asm/kvm_host.h index 4ff36610af6a..308cf19f123d 
> > 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1582,8 +1582,14 @@ static inline int kvm_arch_flush_remote_tlb(struct kvm *kvm)
> >  		return -ENOTSUPP;
> >  }
> >  
> > +enum kvm_intr_type {
> > +	/* Values are arbitrary, but must be non-zero. */
> > +	KVM_HANDLING_IRQ = 1,
> > +	KVM_HANDLING_NMI,
> > +};
> > +
> >  #define kvm_arch_pmi_in_guest(vcpu) \
> > -	((vcpu) && (vcpu)->arch.handling_intr_from_guest)
> > +	((vcpu) && (vcpu)->arch.handling_intr_from_guest == 
> > +KVM_HANDLING_NMI)
> 
> My understanding is that this isn't correct as a general change, as perf
> events can use regular IRQs in some cases.  See commit dd60d217062f4 ("KVM:
> x86: Fix perf timer mode IP reporting").
> 
> I assume there's got to be a way to know which mode perf is using, e.g. we
> should be able to make this look something like:
> 
> 	((vcpu) && (vcpu)->arch.handling_intr_from_guest == kvm_pmi_vector)

> Hi Sean,
> You are right, the change of kvm_arch_pmi_in_guest is incorrect, because it should cover two cases of PMI. 
> For the PMI of intel pt, it certainly is the NMI PMI. So how about fixing it like below?

Yep, that works.  I did enough spelunking to figure out how we can fix the generic
issue, but it's per-event and requires a decent amount of plumbing in perf.

perf_guest_handle_intel_pt_intr() doesn't bother with perf_guest_state() since it's
such a specialized event, so fixing this in vmx_handle_intel_pt_intr() would likely
be the long-term solution even if/when the generic case is fixed.
Yanfei Xu May 21, 2022, 4:31 a.m. UTC | #5
> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Saturday, May 21, 2022 12:32 AM
> To: Xu, Yanfei <yanfei.xu@intel.com>
> Cc: pbonzini@redhat.com; vkuznets@redhat.com; wanpengli@tencent.com;
> jmattson@google.com; joro@8bytes.org; tglx@linutronix.de;
> mingo@redhat.com; bp@alien8.de; dave.hansen@linux.intel.com;
> x86@kernel.org; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; Wang,
> Wei W <wei.w.wang@intel.com>; Liang, Kan <kan.liang@intel.com>
> Subject: Re: [PATCH] KVM: x86: Fix the intel_pt PMI handling wrongly considered
> from guest
> 
> Please don't top-post.
Got it, just correctly configured my mailbox with prefix each line of original message.

> 
> On Fri, May 20, 2022, Xu, Yanfei wrote:
> > From: Sean Christopherson <seanjc@google.com> On Mon, May 16, 2022,
> > Yanfei Xu wrote:
> > > diff --git a/arch/x86/include/asm/kvm_host.h
> > > b/arch/x86/include/asm/kvm_host.h index 4ff36610af6a..308cf19f123d
> > > 100644
> > > --- a/arch/x86/include/asm/kvm_host.h
> > > +++ b/arch/x86/include/asm/kvm_host.h
> > > @@ -1582,8 +1582,14 @@ static inline int
> kvm_arch_flush_remote_tlb(struct kvm *kvm)
> > >  		return -ENOTSUPP;
> > >  }
> > >
> > > +enum kvm_intr_type {
> > > +	/* Values are arbitrary, but must be non-zero. */
> > > +	KVM_HANDLING_IRQ = 1,
> > > +	KVM_HANDLING_NMI,
> > > +};
> > > +
> > >  #define kvm_arch_pmi_in_guest(vcpu) \
> > > -	((vcpu) && (vcpu)->arch.handling_intr_from_guest)
> > > +	((vcpu) && (vcpu)->arch.handling_intr_from_guest ==
> > > +KVM_HANDLING_NMI)
> >
> > My understanding is that this isn't correct as a general change, as
> > perf events can use regular IRQs in some cases.  See commit dd60d217062f4
> ("KVM:
> > x86: Fix perf timer mode IP reporting").
> >
> > I assume there's got to be a way to know which mode perf is using,
> > e.g. we should be able to make this look something like:
> >
> > 	((vcpu) && (vcpu)->arch.handling_intr_from_guest == kvm_pmi_vector)
> 
> > Hi Sean,
> > You are right, the change of kvm_arch_pmi_in_guest is incorrect, because it
> should cover two cases of PMI.
> > For the PMI of intel pt, it certainly is the NMI PMI. So how about fixing it like
> below?
> 
> Yep, that works.  I did enough spelunking to figure out how we can fix the
> generic issue, but it's per-event and requires a decent amount of plumbing in
> perf.

Agree, it's per-event for the generic issue...
> 
> perf_guest_handle_intel_pt_intr() doesn't bother with perf_guest_state() since
> it's such a specialized event, so fixing this in vmx_handle_intel_pt_intr() would
> likely be the long-term solution even if/when the generic case is fixed.

Will send the v2.

Thanks,
Yanfei
diff mbox series

Patch

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4ff36610af6a..308cf19f123d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1582,8 +1582,14 @@  static inline int kvm_arch_flush_remote_tlb(struct kvm *kvm)
 		return -ENOTSUPP;
 }
 
+enum kvm_intr_type {
+	/* Values are arbitrary, but must be non-zero. */
+	KVM_HANDLING_IRQ = 1,
+	KVM_HANDLING_NMI,
+};
+
 #define kvm_arch_pmi_in_guest(vcpu) \
-	((vcpu) && (vcpu)->arch.handling_intr_from_guest)
+	((vcpu) && (vcpu)->arch.handling_intr_from_guest == KVM_HANDLING_NMI)
 
 void kvm_mmu_x86_module_init(void);
 int kvm_mmu_vendor_module_init(void);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 588792f00334..3bdf1bc76863 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -344,12 +344,6 @@  static inline bool kvm_cstate_in_guest(struct kvm *kvm)
 	return kvm->arch.cstate_in_guest;
 }
 
-enum kvm_intr_type {
-	/* Values are arbitrary, but must be non-zero. */
-	KVM_HANDLING_IRQ = 1,
-	KVM_HANDLING_NMI,
-};
-
 static inline void kvm_before_interrupt(struct kvm_vcpu *vcpu,
 					enum kvm_intr_type intr)
 {