[V3] KVM: x86: Sync the pending Posted-Interrupts

Message ID 1548924722-64060-1-git-send-email-luwei.kang@intel.com (mailing list archive)
State New, archived

Commit Message

Luwei Kang Jan. 31, 2019, 8:52 a.m. UTC
Some Posted-Interrupts from passthrough devices may be lost or
overwritten when the vCPU is in the runnable state.

The SN (Suppress Notification) bit of the PID (Posted Interrupt
Descriptor) is set when the vCPU is preempted (the vCPU is in
KVM_MP_STATE_RUNNABLE state but not running on a physical CPU). If a
posted interrupt arrives at this time, the IRQ remapping facility sets
the corresponding bit in PIR (Posted Interrupt Requests) without
setting ON (Outstanding Notification). As a result, the interrupt
cannot be synced to the APIC virtualization registers and is not
handled by the guest, because ON is zero.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 5 +++++
 arch/x86/kvm/x86.c     | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)
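
The failure mode described in the commit message can be illustrated with a small single-threaded model. This is only a sketch, not KVM code: `struct pid_model` and every `model_*` function are hypothetical names invented for this illustration, and the hardware behaviour is simplified to what the commit message states (PIR is always updated, ON is only set while SN is clear).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_VECTORS   256
#define MODEL_PIR_WORDS (MODEL_VECTORS / 32)

/* Hypothetical, simplified stand-in for a Posted Interrupt Descriptor. */
struct pid_model {
	uint32_t pir[MODEL_PIR_WORDS];	/* one request bit per vector */
	bool on;			/* Outstanding Notification */
	bool sn;			/* Suppress Notification */
};

/* Models the remapping hardware posting a vector: PIR is always
 * updated, but ON is left alone while SN is set. */
static void model_hw_post(struct pid_model *pid, unsigned int vector)
{
	assert(vector < MODEL_VECTORS);
	pid->pir[vector / 32] |= UINT32_C(1) << (vector % 32);
	if (!pid->sn)
		pid->on = true;
}

static bool model_pir_empty(const struct pid_model *pid)
{
	for (size_t i = 0; i < MODEL_PIR_WORDS; i++)
		if (pid->pir[i] != 0)
			return false;
	return true;
}

/* Models the sync step: PIR is only inspected when ON is set.
 * Returns the number of vectors moved into the modelled IRR. */
static int model_sync_pir(struct pid_model *pid, uint32_t irr[MODEL_PIR_WORDS])
{
	int moved = 0;

	if (!pid->on)
		return 0;
	pid->on = false;
	for (size_t i = 0; i < MODEL_PIR_WORDS; i++) {
		uint32_t bits = pid->pir[i];

		pid->pir[i] = 0;
		irr[i] |= bits;
		for (; bits != 0; bits &= bits - 1)
			moved++;
	}
	return moved;
}

/* vCPU load without the fix: SN is cleared and nothing else happens. */
static void model_load_unfixed(struct pid_model *pid)
{
	pid->sn = false;
}

/* vCPU load with the idea of this patch: after clearing SN, set ON
 * if any request was posted while notifications were suppressed. */
static void model_load_fixed(struct pid_model *pid)
{
	pid->sn = false;
	if (!model_pir_empty(pid))
		pid->on = true;
}
```

In this model, posting a vector while `sn` is set and then calling `model_load_unfixed()` leaves the bit stranded in `pir`, so `model_sync_pir()` returns 0. Calling `model_load_fixed()` instead sets `on`, and the next sync picks the vector up.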

Comments

Paolo Bonzini Jan. 31, 2019, 9:25 a.m. UTC | #1
On 31/01/19 09:52, Luwei Kang wrote:
> Some Posted-Interrupts from passthrough devices may be lost or
> overwritten when the vCPU is in the runnable state.
>
> The SN (Suppress Notification) bit of the PID (Posted Interrupt
> Descriptor) is set when the vCPU is preempted (the vCPU is in
> KVM_MP_STATE_RUNNABLE state but not running on a physical CPU). If a
> posted interrupt arrives at this time, the IRQ remapping facility sets
> the corresponding bit in PIR (Posted Interrupt Requests) without
> setting ON (Outstanding Notification). As a result, the interrupt
> cannot be synced to the APIC virtualization registers and is not
> handled by the guest, because ON is zero.
> 
> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 5 +++++
>  arch/x86/kvm/x86.c     | 2 +-
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 4341175..8ed9634 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1221,6 +1221,11 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>  		new.sn = 0;
>  	} while (cmpxchg64(&pi_desc->control, old.control,
>  			   new.control) != old.control);
> +

	/*
	 * Clear SN before reading the bitmap.  The VT-d firmware
	 * writes the bitmap and reads SN atomically (5.2.3 in the
	 * spec), so it doesn't really have a memory barrier that
	 * pairs with this, but we cannot do that and we need one.
	 */

> +	smp_mb__after_atomic();
> +
> +	if (!bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS))
> +		pi_test_and_set_on(pi_desc);

You can add pi_set_on for use here.  The fast path with pi_clear_sn
should also be removed.

>  }

>  /*
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 3d27206..5bcf2c4 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7794,7 +7794,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  	 * 1) We should set ->mode before checking ->requests.  Please see
>  	 * the comment in kvm_vcpu_exiting_guest_mode().
>  	 *
> -	 * 2) For APICv, we should set ->mode before checking PIR.ON.  This
> +	 * 2) For APICv, we should set ->mode before checking PID.PIR. This

This should be PID.ON.

Paolo

>  	 * pairs with the memory barrier implicit in pi_test_and_set_on
>  	 * (see vmx_deliver_posted_interrupt).
>  	 *
>
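
The suggestion to add a `pi_set_on` helper can be illustrated in userspace with C11 atomics. This is a hedged sketch rather than the kernel helper: `struct pi_ctrl_model`, the `MODEL_*` bit positions and both functions are assumptions made for this example. It only shows the difference between setting ON and reporting the previous value versus setting ON unconditionally, which is all the new call site needs since it ignores the return value.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are assumptions for this model only. */
#define MODEL_ON_BIT (UINT64_C(1) << 0)
#define MODEL_SN_BIT (UINT64_C(1) << 1)

/* Hypothetical stand-in for the control word of the descriptor. */
struct pi_ctrl_model {
	_Atomic uint64_t control;
};

/* Sets ON and reports whether it was already set. */
static bool model_test_and_set_on(struct pi_ctrl_model *c)
{
	return (atomic_fetch_or(&c->control, MODEL_ON_BIT) & MODEL_ON_BIT) != 0;
}

/* Sets ON without reporting the old value. */
static void model_set_on(struct pi_ctrl_model *c)
{
	atomic_fetch_or(&c->control, MODEL_ON_BIT);
}
```

Both helpers leave the other control bits, such as SN, untouched; they differ only in whether the caller learns the previous state of ON.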
Luwei Kang Feb. 1, 2019, 5:44 a.m. UTC | #2
> > Some Posted-Interrupts from passthrough devices may be lost or
> > overwritten when the vCPU is in the runnable state.
> >
> > The SN (Suppress Notification) bit of the PID (Posted Interrupt
> > Descriptor) is set when the vCPU is preempted (the vCPU is in
> > KVM_MP_STATE_RUNNABLE state but not running on a physical CPU). If a
> > posted interrupt arrives at this time, the IRQ remapping facility sets
> > the corresponding bit in PIR (Posted Interrupt Requests) without
> > setting ON (Outstanding Notification). As a result, the interrupt
> > cannot be synced to the APIC virtualization registers and is not
> > handled by the guest, because ON is zero.
> >
> > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 5 +++++
> >  arch/x86/kvm/x86.c     | 2 +-
> >  2 files changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 4341175..8ed9634 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -1221,6 +1221,11 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
> >  		new.sn = 0;
> >  	} while (cmpxchg64(&pi_desc->control, old.control,
> >  			   new.control) != old.control);
> > +
> 
> 	/*
> 	 * Clear SN before reading the bitmap.  The VT-d firmware
> 	 * writes the bitmap and reads SN atomically (5.2.3 in the
> 	 * spec), so it doesn't really have a memory barrier that
> 	 * pairs with this, but we cannot do that and we need one.
> 	 */
> 
> > +	smp_mb__after_atomic();
> > +
> > +	if (!bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS))
> > +		pi_test_and_set_on(pi_desc);
> 
> You can add pi_set_on for use here.  The fast path with pi_clear_sn should
> also be removed.


Do you mean removing the code below from vmx_vcpu_pi_load(), so that ON can be set whenever PIR is not zero?

--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1192,21 +1192,6 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
        if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu)
                return;

-       /*
-        * First handle the simple case where no cmpxchg is necessary; just
-        * allow posting non-urgent interrupts.
-        *
-        * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change
-        * PI.NDST: pi_post_block will do it for us and the wakeup_handler
-        * expects the VCPU to be on the blocked_vcpu_list that matches
-        * PI.NDST.
-        */
-       if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR ||
-           vcpu->cpu == cpu) {
-               pi_clear_sn(pi_desc);
-               return;
-       }

Thanks,
Luwei Kang
Paolo Bonzini Feb. 4, 2019, 10 a.m. UTC | #3
On 01/02/19 06:44, Kang, Luwei wrote:
> 
> Do you mean removing the code below from vmx_vcpu_pi_load(), so that ON can be set whenever PIR is not zero?
> 
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1192,21 +1192,6 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>         if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu)
>                 return;
> 
> -       /*
> -        * First handle the simple case where no cmpxchg is necessary; just
> -        * allow posting non-urgent interrupts.
> -        *
> -        * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change
> -        * PI.NDST: pi_post_block will do it for us and the wakeup_handler
> -        * expects the VCPU to be on the blocked_vcpu_list that matches
> -        * PI.NDST.
> -        */
> -       if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR ||
> -           vcpu->cpu == cpu) {
> -               pi_clear_sn(pi_desc);
> -               return;
> -       }

Yes, exactly.

Paolo
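
The flow agreed on in this exchange (no early-return fast path, clear SN, full barrier, then check PIR) might look roughly like the following userspace sketch. It is an illustration under stated assumptions, not the kernel implementation: `struct pi_desc_sketch`, the `SKETCH_*` constants and both functions are hypothetical, C11 atomics stand in for `cmpxchg64()` and `smp_mb__after_atomic()`, and the NDST update performed by the real loop is omitted.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions and sizes are assumptions for this sketch only. */
#define SKETCH_ON        (UINT64_C(1) << 0)
#define SKETCH_SN        (UINT64_C(1) << 1)
#define SKETCH_PIR_WORDS 4	/* 256 vectors / 64 bits per word */

struct pi_desc_sketch {
	_Atomic uint64_t pir[SKETCH_PIR_WORDS];
	_Atomic uint64_t control;
};

static bool sketch_pir_empty(struct pi_desc_sketch *pd)
{
	for (size_t i = 0; i < SKETCH_PIR_WORDS; i++)
		if (atomic_load(&pd->pir[i]) != 0)
			return false;
	return true;
}

static void sketch_pi_load(struct pi_desc_sketch *pd)
{
	uint64_t old = atomic_load(&pd->control);
	uint64_t new;

	/* No shortcut that clears SN and returns early: every load
	 * goes through the loop and then reaches the PIR check. */
	do {
		new = old & ~SKETCH_SN;
	} while (!atomic_compare_exchange_weak(&pd->control, &old, new));

	/* Order the SN clear before the PIR reads below. */
	atomic_thread_fence(memory_order_seq_cst);

	/* Requests posted while SN was set never raised ON, so raise it
	 * here to make the next sync look at PIR. */
	if (!sketch_pir_empty(pd))
		atomic_fetch_or(&pd->control, SKETCH_ON);
}
```

With a descriptor whose control word has only SN set and one bit pending in `pir`, `sketch_pi_load()` leaves the control word with ON set and SN clear. With an empty `pir`, it only clears SN.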
Patch

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4341175..8ed9634 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1221,6 +1221,11 @@  static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
 		new.sn = 0;
 	} while (cmpxchg64(&pi_desc->control, old.control,
 			   new.control) != old.control);
+
+	smp_mb__after_atomic();
+
+	if (!bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS))
+		pi_test_and_set_on(pi_desc);
 }
 
 /*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3d27206..5bcf2c4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7794,7 +7794,7 @@  static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	 * 1) We should set ->mode before checking ->requests.  Please see
 	 * the comment in kvm_vcpu_exiting_guest_mode().
 	 *
-	 * 2) For APICv, we should set ->mode before checking PIR.ON.  This
+	 * 2) For APICv, we should set ->mode before checking PID.PIR. This
 	 * pairs with the memory barrier implicit in pi_test_and_set_on
 	 * (see vmx_deliver_posted_interrupt).
 	 *