diff mbox series

[v3] KVM: x86/xen: Inject vCPU upcall vector when local APIC is enabled

Message ID 0717051e9379614721483aaef29572e0356cd347.camel@infradead.org (mailing list archive)
State New, archived
Headers show
Series [v3] KVM: x86/xen: Inject vCPU upcall vector when local APIC is enabled | expand

Commit Message

David Woodhouse Jan. 16, 2024, 11:16 a.m. UTC
From: David Woodhouse <dwmw@amazon.co.uk>

Linux guests since commit b1c3497e604d ("x86/xen: Add support for
HVMOP_set_evtchn_upcall_vector") in v6.0 onwards will use the per-vCPU
upcall vector when it's advertised in the Xen CPUID leaves.

This upcall is injected through the local APIC as an MSI, unlike the
older system vector which was merely injected by the hypervisor any time
the CPU was able to receive an interrupt and the upcall_pending flags is
set in its vcpu_info.

Effectively, that makes the per-CPU upcall edge triggered instead of
level triggered.

We lose edges.

Specifically, when the local APIC is *disabled*, delivering the MSI
will fail. Xen checks the vcpu_info->evtchn_upcall_pending flag when
enabling the local APIC for a vCPU and injects the vector immediately
if so.

Since userspace doesn't get to notice when the guest enables a local
APIC which is emulated in KVM, KVM needs to do the same.

Astute reviewers may note that kvm_xen_inject_vcpu_vector() function has
a WARN_ON_ONCE() in the case where kvm_irq_delivery_to_apic_fast() fails
and returns false. In the case where the MSI is not delivered due to the
local APIC being disabled, kvm_irq_delivery_to_apic_fast() still returns
true but the value in *r is zero. So the WARN_ON_ONCE() remains correct,
as that case should still never happen.

Fixes: fde0451be8fb3 ("KVM: x86/xen: Support per-vCPU event channel upcall via local APIC")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Cc: stable@vger.kernel.org
---
 v3: Repost, add Cc:stable
 v2: Add Fixes: tag.

 arch/x86/kvm/lapic.c |  5 ++++-
 arch/x86/kvm/xen.c   |  2 +-
 arch/x86/kvm/xen.h   | 18 ++++++++++++++++++
 3 files changed, 23 insertions(+), 2 deletions(-)

Comments

Sean Christopherson Jan. 16, 2024, 4:52 p.m. UTC | #1
On Tue, Jan 16, 2024, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> Linux guests since commit b1c3497e604d ("x86/xen: Add support for
> HVMOP_set_evtchn_upcall_vector") in v6.0 onwards will use the per-vCPU
> upcall vector when it's advertised in the Xen CPUID leaves.
> 
> This upcall is injected through the local APIC as an MSI, unlike the
> older system vector which was merely injected by the hypervisor any time
> the CPU was able to receive an interrupt and the upcall_pending flags is
> set in its vcpu_info.
> 
> Effectively, that makes the per-CPU upcall edge triggered instead of
> level triggered.
> 
> We lose edges.

Pronouns.  And losing edges isn't wrong in and of itself.  The only thing that
is "wrong" is that KVM doesn't exactly follow Xen's behavior.  How about smushing
the above sentence and the next two paragraphs into:

  Effectively, that makes the per-CPU upcall edge triggered instead of
  level triggered, which results in the upcall being lost if the MSI is
  delivered when the local APIC is *disabled*.

  Xen checks the vcpu_info->evtchn_upcall_pending flag when enabling the
  local APIC for a vCPU and injects the vector immediately if so.  Do the
  same in KVM since KVM doesn't provide a way for userspace to notice when
  the guest software enables a local APIC.
 
> Specifically, when the local APIC is *disabled*, delivering the MSI
> will fail. Xen checks the vcpu_info->evtchn_upcall_pending flag when
> enabling the local APIC for a vCPU and injects the vector immediately
> if so.
> 
> Since userspace doesn't get to notice when the guest enables a local
> APIC which is emulated in KVM, KVM needs to do the same.
> 
> Astute reviewers may note that kvm_xen_inject_vcpu_vector() function has
> a WARN_ON_ONCE() in the case where kvm_irq_delivery_to_apic_fast() fails
> and returns false. In the case where the MSI is not delivered due to the
> local APIC being disabled, kvm_irq_delivery_to_apic_fast() still returns
> true but the value in *r is zero. So the WARN_ON_ONCE() remains correct,
> as that case should still never happen.
> 
> Fixes: fde0451be8fb3 ("KVM: x86/xen: Support per-vCPU event channel upcall via local APIC")
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> Reviewed-by: Paul Durrant <paul@xen.org>
> Cc: stable@vger.kernel.org
> ---
>  v3: Repost, add Cc:stable
>  v2: Add Fixes: tag.
> 
>  arch/x86/kvm/lapic.c |  5 ++++-
>  arch/x86/kvm/xen.c   |  2 +-
>  arch/x86/kvm/xen.h   | 18 ++++++++++++++++++
>  3 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 3242f3da2457..1e715ca717bc 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -41,6 +41,7 @@
>  #include "ioapic.h"
>  #include "trace.h"
>  #include "x86.h"
> +#include "xen.h"
>  #include "cpuid.h"
>  #include "hyperv.h"
>  #include "smm.h"
> @@ -499,8 +500,10 @@ static inline void apic_set_spiv(struct kvm_lapic *apic, u32 val)
>         }
>  
>         /* Check if there are APF page ready requests pending */
> -       if (enabled)
> +       if (enabled) {

Argh, not your code, but there's really no reason this check needs to be outside
of the "if (enabled != apic->sw_enabled)" block.  I don't care about optimizing
anything, I just don't like that it implies that KVM always needs to take action
if SPIV is written.  Probably not worth trying to "fix" at this point though :-(

>                 kvm_make_request(KVM_REQ_APF_READY, apic->vcpu);
> +               kvm_xen_enable_lapic(apic->vcpu);

I think we should name the Xen helper kvm_xen_sw_enable_lapic(), to make it clear
that the behavior doesn't apply to the APIC being hardware enabled via MSR.
Unless it does apply to that path?
David Woodhouse Jan. 16, 2024, 6:39 p.m. UTC | #2
On Tue, 2024-01-16 at 08:52 -0800, Sean Christopherson wrote:
> On Tue, Jan 16, 2024, David Woodhouse wrote:
> > From: David Woodhouse <dwmw@amazon.co.uk>
> > 
> > Linux guests since commit b1c3497e604d ("x86/xen: Add support for
> > HVMOP_set_evtchn_upcall_vector") in v6.0 onwards will use the per-vCPU
> > upcall vector when it's advertised in the Xen CPUID leaves.
> > 
> > This upcall is injected through the local APIC as an MSI, unlike the
> > older system vector which was merely injected by the hypervisor any time
> > the CPU was able to receive an interrupt and the upcall_pending flags is
> > set in its vcpu_info.
> > 
> > Effectively, that makes the per-CPU upcall edge triggered instead of
> > level triggered.
> > 
> > We lose edges.
> 
> Pronouns.  And losing edges isn't wrong in and of itself.  The only thing that
> is "wrong" is that KVM doesn't exactly follow Xen's behavior.  How about smushing
> the above sentence and the next two paragraphs into:
> 
>   Effectively, that makes the per-CPU upcall edge triggered instead of
>   level triggered, which results in the upcall being lost if the MSI is
>   delivered when the local APIC is *disabled*.
> 
>   Xen checks the vcpu_info->evtchn_upcall_pending flag when enabling the
>   local APIC for a vCPU and injects the vector immediately if so.  Do the
>   same in KVM since KVM doesn't provide a way for userspace to notice when
>   the guest software enables a local APIC.

Yep, that works.

> > @@ -499,8 +500,10 @@ static inline void apic_set_spiv(struct kvm_lapic *apic, u32 val)
> >         }
> >  
> >         /* Check if there are APF page ready requests pending */
> > -       if (enabled)
> > +       if (enabled) {
> 
> Argh, not your code, but there's really no reason this check needs to be outside
> of the "if (enabled != apic->sw_enabled)" block.  I don't care about optimizing
> anything, I just don't like that it implies that KVM always needs to take action
> if SPIV is written.  Probably not worth trying to "fix" at this point though :-(
> 
> >                 kvm_make_request(KVM_REQ_APF_READY, apic->vcpu);
> > +               kvm_xen_enable_lapic(apic->vcpu);
> 
> I think we should name the Xen helper kvm_xen_sw_enable_lapic(), to make it clear
> that the behavior doesn't apply to the APIC being hardware enabled via MSR.
> Unless it does apply to that path?

I believe Xen only injects the interrupts when the SPIV register is
written with bit 8 set (i.e. not SW disabled). It doesn't look like a
HW disable/enable will reinject the vector. So yes, renaming it makes
some sense. 

It *shouldn't* be moved inside the 'if (enabled != apic->sw_enabled)'
block, if it's to emulate the Xen behaviour faithfully. With an upcall
pending, I believe that a HW enable followed by *any* write to the SPIV
register will inject the vector. 

(The APIC starts up with the software bit in SPIV *enabled*).

I'll send a new version with updated commit message and that name
change. Thanks.
diff mbox series

Patch

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3242f3da2457..1e715ca717bc 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -41,6 +41,7 @@ 
 #include "ioapic.h"
 #include "trace.h"
 #include "x86.h"
+#include "xen.h"
 #include "cpuid.h"
 #include "hyperv.h"
 #include "smm.h"
@@ -499,8 +500,10 @@  static inline void apic_set_spiv(struct kvm_lapic *apic, u32 val)
        }
 
        /* Check if there are APF page ready requests pending */
-       if (enabled)
+       if (enabled) {
                kvm_make_request(KVM_REQ_APF_READY, apic->vcpu);
+               kvm_xen_enable_lapic(apic->vcpu);
+       }
 }
 
 static inline void kvm_apic_set_xapic_id(struct kvm_lapic *apic, u8 id)
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 972098ec4f6a..7a0e3f7a9d20 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -483,7 +483,7 @@  void kvm_xen_update_runstate(struct kvm_vcpu *v, int state)
                kvm_xen_update_runstate_guest(v, state == RUNSTATE_runnable);
 }
 
-static void kvm_xen_inject_vcpu_vector(struct kvm_vcpu *v)
+void kvm_xen_inject_vcpu_vector(struct kvm_vcpu *v)
 {
        struct kvm_lapic_irq irq = { };
        int r;
diff --git a/arch/x86/kvm/xen.h b/arch/x86/kvm/xen.h
index f8f1fe22d090..8eba3943b246 100644
--- a/arch/x86/kvm/xen.h
+++ b/arch/x86/kvm/xen.h
@@ -18,6 +18,7 @@  extern struct static_key_false_deferred kvm_xen_enabled;
 
 int __kvm_xen_has_interrupt(struct kvm_vcpu *vcpu);
 void kvm_xen_inject_pending_events(struct kvm_vcpu *vcpu);
+void kvm_xen_inject_vcpu_vector(struct kvm_vcpu *vcpu);
 int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data);
 int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data);
 int kvm_xen_hvm_set_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data);
@@ -36,6 +37,19 @@  int kvm_xen_setup_evtchn(struct kvm *kvm,
                         const struct kvm_irq_routing_entry *ue);
 void kvm_xen_update_tsc_info(struct kvm_vcpu *vcpu);
 
+static inline void kvm_xen_enable_lapic(struct kvm_vcpu *vcpu)
+{
+       /*
+        * The local APIC is being enabled. If the per-vCPU upcall vector is
+        * set and the vCPU's evtchn_upcall_pending flag is set, inject the
+        * interrupt.
+        */
+       if (static_branch_unlikely(&kvm_xen_enabled.key) &&
+           vcpu->arch.xen.vcpu_info_cache.active &&
+           vcpu->arch.xen.upcall_vector && __kvm_xen_has_interrupt(vcpu))
+               kvm_xen_inject_vcpu_vector(vcpu);
+}
+
 static inline bool kvm_xen_msr_enabled(struct kvm *kvm)
 {
        return static_branch_unlikely(&kvm_xen_enabled.key) &&
@@ -101,6 +115,10 @@  static inline void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
 {
 }
 
+static inline void kvm_xen_enable_lapic(struct kvm_vcpu *vcpu)
+{
+}
+
 static inline bool kvm_xen_msr_enabled(struct kvm *kvm)
 {
        return false;