diff mbox series

[2/5] x86/vlapic: introduce an EOI callback mechanism

Message ID 20200812124709.4165-3-roger.pau@citrix.com (mailing list archive)
State New, archived
Headers show
Series x86/vlapic: implement EOI callbacks | expand

Commit Message

Roger Pau Monne Aug. 12, 2020, 12:47 p.m. UTC
Add a new vlapic_set_irq_callback helper in order to inject a vector
and set a callback to be executed when the guest performs the end of
interrupt acknowledgment.

Such functionality will be used to migrate the current ad hoc handling
done in vlapic_handle_EOI for the vectors that require some logic to
be executed when the end of interrupt is performed.

No current users are migrated to use this new functionality yet, so
not functional change expected as a result.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/vlapic.c        | 54 ++++++++++++++++++++++++++++++--
 xen/include/asm-x86/hvm/vlapic.h | 10 ++++++
 2 files changed, 62 insertions(+), 2 deletions(-)

Comments

Andrew Cooper Aug. 13, 2020, 2:33 p.m. UTC | #1
On 12/08/2020 13:47, Roger Pau Monne wrote:
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 7b5c633033..7369be468b 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -159,8 +160,26 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
>      else
>          vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]);
>  
> +    if ( callback )
> +    {
> +        unsigned long flags;
> +
> +        spin_lock_irqsave(&vlapic->callback_lock, flags);
> +        vlapic->callbacks[vec].callback = callback;
> +        vlapic->callbacks[vec].data = data;
> +        spin_unlock_irqrestore(&vlapic->callback_lock, flags);
> +    }
> +    else
> +        /*
> +         * Removing the callback can be done with a single atomic operation
> +         * without requiring the lock, as the callback data doesn't need to be
> +         * cleared.
> +         */
> +        write_atomic(&vlapic->callbacks[vec].callback, NULL);
> +
>      if ( hvm_funcs.update_eoi_exit_bitmap )
> -        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig);
> +        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec,
> +                          trig || callback);

I can't reason about this being correct.

When you say EOI, what property do you want, exactly, because there are
a couple of options.

All interrupts, edge or level, require acking at the local APIC, to mark
the interrupt as no longer in service.  For edge interrupts and hardware
APIC acceleration, this will be completed without a VMExit.

Level interrupts from the IO-APIC further require EOI'ing there. 
Whether this requires an explicit action or not depends on the LAPIC
"EOI Broadcast" setting.  I'm not sure if we virtualise and/or support
this setting.


What exactly are we going to want to do from these callbacks
(eventually, not just in this series alone)?

If it can't be made to work for level interrupts only, I'm afraid I
don't think this plan is viable.

(Some trivial comments to follow, but this is the meaty question.)

>  
>      if ( hvm_funcs.deliver_posted_intr )
>          alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec);
> @@ -168,6 +187,11 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
>          vcpu_kick(target);
>  }
>  
> +void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
> +{
> +    vlapic_set_irq_callback(vlapic, vec, trig, NULL, NULL);

Static inline in the header file?

> @@ -1636,9 +1671,23 @@ int vlapic_init(struct vcpu *v)
>      }
>      clear_page(vlapic->regs);
>  
> +    if ( !vlapic->callbacks )
> +    {
> +        vlapic->callbacks = xmalloc_array(typeof(*(vlapic->callbacks)),
> +                                          X86_NR_VECTORS);

This is a large data structure.  At a minimum, you can subtract 16 from
it, because vectors 0 thru 0xf can't legally be targetted by interrupts.

Sadly, I don't think it can be reduced beyond that, because a guest
could arrange for every other vector to become pending in level mode,
even if the overwhelming common option for VMs these days would be to
have no level interrupts at all.

~Andrew
Roger Pau Monne Aug. 13, 2020, 5:41 p.m. UTC | #2
On Thu, Aug 13, 2020 at 03:33:37PM +0100, Andrew Cooper wrote:
> On 12/08/2020 13:47, Roger Pau Monne wrote:
> > diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> > index 7b5c633033..7369be468b 100644
> > --- a/xen/arch/x86/hvm/vlapic.c
> > +++ b/xen/arch/x86/hvm/vlapic.c
> > @@ -159,8 +160,26 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
> >      else
> >          vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]);
> >  
> > +    if ( callback )
> > +    {
> > +        unsigned long flags;
> > +
> > +        spin_lock_irqsave(&vlapic->callback_lock, flags);
> > +        vlapic->callbacks[vec].callback = callback;
> > +        vlapic->callbacks[vec].data = data;
> > +        spin_unlock_irqrestore(&vlapic->callback_lock, flags);
> > +    }
> > +    else
> > +        /*
> > +         * Removing the callback can be done with a single atomic operation
> > +         * without requiring the lock, as the callback data doesn't need to be
> > +         * cleared.
> > +         */
> > +        write_atomic(&vlapic->callbacks[vec].callback, NULL);
> > +
> >      if ( hvm_funcs.update_eoi_exit_bitmap )
> > -        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig);
> > +        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec,
> > +                          trig || callback);
> 
> I can't reason about this being correct.

Note the 'trig' part should have been dropped in patch 5, as then
the vIO-APIC will properly register the EOI callback for level
triggered interrupts.

Note that if we ever implement EOI suppression the vIO-APIC callback
would then have to check whether the bit is currently set in order to
decide whether to perform the EOI on the vIO-APIC or not.

This just requests a vmexit whenever the caller of vlapic_set_irq
requires an EOI callback, so that it can be executed when using the
virtual interrupt delivery feature that would normally avoid such
vmexit.

> When you say EOI, what property do you want, exactly, because there are
> a couple of options.

Here (in the update_eoi_exit_bitmap context) I want a vmexit whenever
an EIO callback for the injected vector needs to be executed, that's
requested by the caller of vlapic_set_irq. We have to keep the 'trig'
part because vIO-APIC is not switched to use the new callback
infrastructure in this patch.

> All interrupts, edge or level, require acking at the local APIC, to mark
> the interrupt as no longer in service.  For edge interrupts and hardware
> APIC acceleration, this will be completed without a VMExit.

It would be completed without a vmexit as long as the corresponding
vector bit in the EOI exit bitmap is not set when using virtual
interrupt delivery.

I think this is currently wrong, as we require a vmexit to happen for
non-maskable edge MSI interrupts, and the corresponding EOI exit
bitmap bit doesn't seem to be set?

Maybe there's something I'm missing.

> Level interrupts from the IO-APIC further require EOI'ing there. 
> Whether this requires an explicit action or not depends on the LAPIC
> "EOI Broadcast" setting.  I'm not sure if we virtualise and/or support
> this setting.

No, our current vlapic implementation doesn't support EOI broadcast
suppression AFAICT, as bit 24 in VLAPIC_VERSION is 0. So the EOI of a
level interrupts is always broadcasted to the IO-APIC(s).

> 
> What exactly are we going to want to do from these callbacks
> (eventually, not just in this series alone)?

So this series just moves the current hooks in vlapic_handle_EOI to
become dynamically set by the users that inject the vectors.

We also need EOI callbacks for edge triggered interrupts, as
non-maskable edge MSIs from passed-through devices currently require
an EOI on the physical lapic when the guest also performs such EOI,
see hvm_dpci_msi_eoi.

> If it can't be made to work for level interrupts only, I'm afraid I
> don't think this plan is viable.

I think it can be made to work, the code here will keep the callback
for level triggered interrupts, so that the EOI is forwarded to the
vIO-APIC, but I don't see why this would be limited to level
interrupts only, we could need the same for edge interrupts, as it
seems like the SynIC viridian extensions could also make use of this
if we ever implement them fully.

> (Some trivial comments to follow, but this is the meaty question.)
> 
> >  
> >      if ( hvm_funcs.deliver_posted_intr )
> >          alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec);
> > @@ -168,6 +187,11 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
> >          vcpu_kick(target);
> >  }
> >  
> > +void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
> > +{
> > +    vlapic_set_irq_callback(vlapic, vec, trig, NULL, NULL);
> 
> Static inline in the header file?

Sure.

> > @@ -1636,9 +1671,23 @@ int vlapic_init(struct vcpu *v)
> >      }
> >      clear_page(vlapic->regs);
> >  
> > +    if ( !vlapic->callbacks )
> > +    {
> > +        vlapic->callbacks = xmalloc_array(typeof(*(vlapic->callbacks)),
> > +                                          X86_NR_VECTORS);
> 
> This is a large data structure.  At a minimum, you can subtract 16 from
> it, because vectors 0 thru 0xf can't legally be targetted by interrupts.
> 
> Sadly, I don't think it can be reduced beyond that, because a guest
> could arrange for every other vector to become pending in level mode,
> even if the overwhelming common option for VMs these days would be to
> have no level interrupts at all.

I'm still not sure I understand why you mention level triggered
interrupts specifically here, the lapic EOI is performed for both
level and edge trigger interrupts in order to clear the bit in ISR,
and hence we could have an EOI callback regardless of the triggering
mode?

It's just a matter of the caller requesting an EOI callback, and then
it will be executed when the guest performs the EOI, regardless of
whether the interrupt is level or edge triggered.

Thanks, Roger.
Jan Beulich Aug. 25, 2020, 5:56 a.m. UTC | #3
On 12.08.2020 14:47, Roger Pau Monne wrote:
> Add a new vlapic_set_irq_callback helper in order to inject a vector
> and set a callback to be executed when the guest performs the end of
> interrupt acknowledgment.

One thing I don't understand at all for now is how these
callbacks are going to be re-instated after migration for
not-yet-EOIed interrupts.

> @@ -1636,9 +1671,23 @@ int vlapic_init(struct vcpu *v)
>      }
>      clear_page(vlapic->regs);
>  
> +    if ( !vlapic->callbacks )
> +    {
> +        vlapic->callbacks = xmalloc_array(typeof(*(vlapic->callbacks)),

There's a pair of not really needed parentheses here (also
again a little further down).

> +                                          X86_NR_VECTORS);
> +        if ( !vlapic->callbacks )
> +        {
> +            dprintk(XENLOG_ERR, "alloc vlapic callbacks error: %d/%d\n",
> +                    v->domain->domain_id, v->vcpu_id);

Please use %pd.

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 7b5c633033..7369be468b 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -144,7 +144,8 @@  bool vlapic_test_irq(const struct vlapic *vlapic, uint8_t vec)
     return vlapic_test_vector(vec, &vlapic->regs->data[APIC_IRR]);
 }
 
-void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
+void vlapic_set_irq_callback(struct vlapic *vlapic, uint8_t vec, uint8_t trig,
+                             vlapic_eoi_callback_t *callback, void *data)
 {
     struct vcpu *target = vlapic_vcpu(vlapic);
 
@@ -159,8 +160,26 @@  void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
     else
         vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]);
 
+    if ( callback )
+    {
+        unsigned long flags;
+
+        spin_lock_irqsave(&vlapic->callback_lock, flags);
+        vlapic->callbacks[vec].callback = callback;
+        vlapic->callbacks[vec].data = data;
+        spin_unlock_irqrestore(&vlapic->callback_lock, flags);
+    }
+    else
+        /*
+         * Removing the callback can be done with a single atomic operation
+         * without requiring the lock, as the callback data doesn't need to be
+         * cleared.
+         */
+        write_atomic(&vlapic->callbacks[vec].callback, NULL);
+
     if ( hvm_funcs.update_eoi_exit_bitmap )
-        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig);
+        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec,
+                          trig || callback);
 
     if ( hvm_funcs.deliver_posted_intr )
         alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec);
@@ -168,6 +187,11 @@  void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
         vcpu_kick(target);
 }
 
+void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
+{
+    vlapic_set_irq_callback(vlapic, vec, trig, NULL, NULL);
+}
+
 static int vlapic_find_highest_isr(const struct vlapic *vlapic)
 {
     return vlapic_find_highest_vector(&vlapic->regs->data[APIC_ISR]);
@@ -461,6 +485,9 @@  void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
 {
     struct vcpu *v = vlapic_vcpu(vlapic);
     struct domain *d = v->domain;
+    vlapic_eoi_callback_t *callback;
+    void *data;
+    unsigned long flags;
 
     /* All synic SINTx vectors are edge triggered */
 
@@ -470,6 +497,14 @@  void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
         viridian_synic_ack_sint(v, vector);
 
     hvm_dpci_msi_eoi(d, vector);
+
+    spin_lock_irqsave(&vlapic->callback_lock, flags);
+    callback = vlapic->callbacks[vector].callback;
+    data = vlapic->callbacks[vector].data;
+    spin_unlock_irqrestore(&vlapic->callback_lock, flags);
+
+    if ( callback )
+        callback(v, vector, data);
 }
 
 static bool_t is_multicast_dest(struct vlapic *vlapic, unsigned int short_hand,
@@ -1636,9 +1671,23 @@  int vlapic_init(struct vcpu *v)
     }
     clear_page(vlapic->regs);
 
+    if ( !vlapic->callbacks )
+    {
+        vlapic->callbacks = xmalloc_array(typeof(*(vlapic->callbacks)),
+                                          X86_NR_VECTORS);
+        if ( !vlapic->callbacks )
+        {
+            dprintk(XENLOG_ERR, "alloc vlapic callbacks error: %d/%d\n",
+                    v->domain->domain_id, v->vcpu_id);
+            return -ENOMEM;
+        }
+    }
+    memset(vlapic->callbacks, 0, sizeof(*(vlapic->callbacks)) * X86_NR_VECTORS);
+
     vlapic_reset(vlapic);
 
     spin_lock_init(&vlapic->esr_lock);
+    spin_lock_init(&vlapic->callback_lock);
 
     tasklet_init(&vlapic->init_sipi.tasklet, vlapic_init_sipi_action, v);
 
@@ -1660,6 +1709,7 @@  void vlapic_destroy(struct vcpu *v)
     destroy_periodic_time(&vlapic->pt);
     unmap_domain_page_global(vlapic->regs);
     free_domheap_page(vlapic->regs_page);
+    XFREE(vlapic->callbacks);
 }
 
 /*
diff --git a/xen/include/asm-x86/hvm/vlapic.h b/xen/include/asm-x86/hvm/vlapic.h
index 8f908928c3..6782508a68 100644
--- a/xen/include/asm-x86/hvm/vlapic.h
+++ b/xen/include/asm-x86/hvm/vlapic.h
@@ -73,6 +73,9 @@ 
 #define vlapic_clear_vector(vec, bitmap)                                \
     clear_bit(VEC_POS(vec), (uint32_t *)((bitmap) + REG_POS(vec)))
 
+typedef void vlapic_eoi_callback_t(struct vcpu *v, unsigned int vector,
+                                   void *data);
+
 struct vlapic {
     struct hvm_hw_lapic      hw;
     struct hvm_hw_lapic_regs *regs;
@@ -89,6 +92,11 @@  struct vlapic {
         uint32_t             icr, dest;
         struct tasklet       tasklet;
     } init_sipi;
+    struct {
+        vlapic_eoi_callback_t *callback;
+        void                 *data;
+    } *callbacks;
+    spinlock_t               callback_lock;
 };
 
 /* vlapic's frequence is 100 MHz */
@@ -111,6 +119,8 @@  void vlapic_reg_write(struct vcpu *v, unsigned int reg, uint32_t val);
 bool_t is_vlapic_lvtpc_enabled(struct vlapic *vlapic);
 
 bool vlapic_test_irq(const struct vlapic *vlapic, uint8_t vec);
+void vlapic_set_irq_callback(struct vlapic *vlapic, uint8_t vec, uint8_t trig,
+                             vlapic_eoi_callback_t *callback, void *data);
 void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig);
 
 int vlapic_has_pending_irq(struct vcpu *v);