Message ID | 20200812124709.4165-3-roger.pau@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/vlapic: implement EOI callbacks | expand |
On 12/08/2020 13:47, Roger Pau Monne wrote: > diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c > index 7b5c633033..7369be468b 100644 > --- a/xen/arch/x86/hvm/vlapic.c > +++ b/xen/arch/x86/hvm/vlapic.c > @@ -159,8 +160,26 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig) > else > vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]); > > + if ( callback ) > + { > + unsigned long flags; > + > + spin_lock_irqsave(&vlapic->callback_lock, flags); > + vlapic->callbacks[vec].callback = callback; > + vlapic->callbacks[vec].data = data; > + spin_unlock_irqrestore(&vlapic->callback_lock, flags); > + } > + else > + /* > + * Removing the callback can be done with a single atomic operation > + * without requiring the lock, as the callback data doesn't need to be > + * cleared. > + */ > + write_atomic(&vlapic->callbacks[vec].callback, NULL); > + > if ( hvm_funcs.update_eoi_exit_bitmap ) > - alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig); > + alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, > + trig || callback); I can't reason about this being correct. When you say EOI, what property do you want, exactly, because there are a couple of options. All interrupts, edge or level, require acking at the local APIC, to mark the interrupt as no longer in service. For edge interrupts and hardware APIC acceleration, this will be completed without a VMExit. Level interrupts from the IO-APIC further require EOI'ing there. Whether this requires an explicit action or not depends on the LAPIC "EOI Broadcast" setting. I'm not sure if we virtualise and/or support this setting. What exactly are we going to want to do from these callbacks (eventually, not just in this series alone)? If it can't be made to work for level interrupts only, I'm afraid I don't think this plan is viable. (Some trivial comments to follow, but this is the meaty question.) > > if ( hvm_funcs.deliver_posted_intr ) > alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec); > @@ -168,6 +187,11 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig) > vcpu_kick(target); > } > > +void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig) > +{ > + vlapic_set_irq_callback(vlapic, vec, trig, NULL, NULL); Static inline in the header file? > @@ -1636,9 +1671,23 @@ int vlapic_init(struct vcpu *v) > } > clear_page(vlapic->regs); > > + if ( !vlapic->callbacks ) > + { > + vlapic->callbacks = xmalloc_array(typeof(*(vlapic->callbacks)), > + X86_NR_VECTORS); This is a large data structure. At a minimum, you can subtract 16 from it, because vectors 0 thru 0xf can't legally be targetted by interrupts. Sadly, I don't think it can be reduced beyond that, because a guest could arrange for every other vector to become pending in level mode, even if the overwhelming common option for VMs these days would be to have no level interrupts at all. ~Andrew
On Thu, Aug 13, 2020 at 03:33:37PM +0100, Andrew Cooper wrote: > On 12/08/2020 13:47, Roger Pau Monne wrote: > > diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c > > index 7b5c633033..7369be468b 100644 > > --- a/xen/arch/x86/hvm/vlapic.c > > +++ b/xen/arch/x86/hvm/vlapic.c > > @@ -159,8 +160,26 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig) > > else > > vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]); > > > > + if ( callback ) > > + { > > + unsigned long flags; > > + > > + spin_lock_irqsave(&vlapic->callback_lock, flags); > > + vlapic->callbacks[vec].callback = callback; > > + vlapic->callbacks[vec].data = data; > > + spin_unlock_irqrestore(&vlapic->callback_lock, flags); > > + } > > + else > > + /* > > + * Removing the callback can be done with a single atomic operation > > + * without requiring the lock, as the callback data doesn't need to be > > + * cleared. > > + */ > > + write_atomic(&vlapic->callbacks[vec].callback, NULL); > > + > > if ( hvm_funcs.update_eoi_exit_bitmap ) > > - alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig); > > + alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, > > + trig || callback); > > I can't reason about this being correct. Note the 'trig' part should have been dropped in patch 5, as then the vIO-APIC will properly register the EOI callback for level triggered interrupts. Note that if we ever implement EOI suppression the vIO-APIC callback would then have to check whether the bit is currently set in order to decide whether to perform the EOI on the vIO-APIC or not. This just requests a vmexit whenever the caller of vlapic_set_irq requires an EOI callback, so that it can be executed when using the virtual interrupt delivery feature that would normally avoid such vmexit. > When you say EOI, what property do you want, exactly, because there are > a couple of options. Here (in the update_eoi_exit_bitmap context) I want a vmexit whenever an EIO callback for the injected vector needs to be executed, that's requested by the caller of vlapic_set_irq. We have to keep the 'trig' part because vIO-APIC is not switched to use the new callback infrastructure in this patch. > All interrupts, edge or level, require acking at the local APIC, to mark > the interrupt as no longer in service. For edge interrupts and hardware > APIC acceleration, this will be completed without a VMExit. It would be completed without a vmexit as long as the corresponding vector bit in the EOI exit bitmap is not set when using virtual interrupt delivery. I think this is currently wrong, as we require a vmexit to happen for non-maskable edge MSI interrupts, and the corresponding EOI exit bitmap bit doesn't seem to be set? Maybe there's something I'm missing. > Level interrupts from the IO-APIC further require EOI'ing there. > Whether this requires an explicit action or not depends on the LAPIC > "EOI Broadcast" setting. I'm not sure if we virtualise and/or support > this setting. No, our current vlapic implementation doesn't support EOI broadcast suppression AFAICT, as bit 24 in VLAPIC_VERSION is 0. So the EOI of a level interrupts is always broadcasted to the IO-APIC(s). > > What exactly are we going to want to do from these callbacks > (eventually, not just in this series alone)? So this series just moves the current hooks in vlapic_handle_EOI to become dynamically set by the users that inject the vectors. We also need EOI callbacks for edge triggered interrupts, as non-maskable edge MSIs from passed-through devices currently require an EOI on the physical lapic when the guest also performs such EOI, see hvm_dpci_msi_eoi. > If it can't be made to work for level interrupts only, I'm afraid I > don't think this plan is viable. I think it can be made to work, the code here will keep the callback for level triggered interrupts, so that the EOI is forwarded to the vIO-APIC, but I don't see why this would be limited to level interrupts only, we could need the same for edge interrupts, as it seems like the SynIC viridian extensions could also make use of this if we ever implement them fully. > (Some trivial comments to follow, but this is the meaty question.) > > > > > if ( hvm_funcs.deliver_posted_intr ) > > alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec); > > @@ -168,6 +187,11 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig) > > vcpu_kick(target); > > } > > > > +void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig) > > +{ > > + vlapic_set_irq_callback(vlapic, vec, trig, NULL, NULL); > > Static inline in the header file? Sure. > > @@ -1636,9 +1671,23 @@ int vlapic_init(struct vcpu *v) > > } > > clear_page(vlapic->regs); > > > > + if ( !vlapic->callbacks ) > > + { > > + vlapic->callbacks = xmalloc_array(typeof(*(vlapic->callbacks)), > > + X86_NR_VECTORS); > > This is a large data structure. At a minimum, you can subtract 16 from > it, because vectors 0 thru 0xf can't legally be targetted by interrupts. > > Sadly, I don't think it can be reduced beyond that, because a guest > could arrange for every other vector to become pending in level mode, > even if the overwhelming common option for VMs these days would be to > have no level interrupts at all. I'm still not sure I understand why you mention level triggered interrupts specifically here, the lapic EOI is performed for both level and edge trigger interrupts in order to clear the bit in ISR, and hence we could have an EOI callback regardless of the triggering mode? It's just a matter of the caller requesting an EOI callback, and then it will be executed when the guest performs the EOI, regardless of whether the interrupt is level or edge triggered. Thanks, Roger.
On 12.08.2020 14:47, Roger Pau Monne wrote: > Add a new vlapic_set_irq_callback helper in order to inject a vector > and set a callback to be executed when the guest performs the end of > interrupt acknowledgment. One thing I don't understand at all for now is how these callbacks are going to be re-instated after migration for not-yet-EOIed interrupts. > @@ -1636,9 +1671,23 @@ int vlapic_init(struct vcpu *v) > } > clear_page(vlapic->regs); > > + if ( !vlapic->callbacks ) > + { > + vlapic->callbacks = xmalloc_array(typeof(*(vlapic->callbacks)), There's a pair of not really needed parentheses here (also again a little further down). > + X86_NR_VECTORS); > + if ( !vlapic->callbacks ) > + { > + dprintk(XENLOG_ERR, "alloc vlapic callbacks error: %d/%d\n", > + v->domain->domain_id, v->vcpu_id); Please use %pd. Jan
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c index 7b5c633033..7369be468b 100644 --- a/xen/arch/x86/hvm/vlapic.c +++ b/xen/arch/x86/hvm/vlapic.c @@ -144,7 +144,8 @@ bool vlapic_test_irq(const struct vlapic *vlapic, uint8_t vec) return vlapic_test_vector(vec, &vlapic->regs->data[APIC_IRR]); } -void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig) +void vlapic_set_irq_callback(struct vlapic *vlapic, uint8_t vec, uint8_t trig, + vlapic_eoi_callback_t *callback, void *data) { struct vcpu *target = vlapic_vcpu(vlapic); @@ -159,8 +160,26 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig) else vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]); + if ( callback ) + { + unsigned long flags; + + spin_lock_irqsave(&vlapic->callback_lock, flags); + vlapic->callbacks[vec].callback = callback; + vlapic->callbacks[vec].data = data; + spin_unlock_irqrestore(&vlapic->callback_lock, flags); + } + else + /* + * Removing the callback can be done with a single atomic operation + * without requiring the lock, as the callback data doesn't need to be + * cleared. + */ + write_atomic(&vlapic->callbacks[vec].callback, NULL); + if ( hvm_funcs.update_eoi_exit_bitmap ) - alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig); + alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, + trig || callback); if ( hvm_funcs.deliver_posted_intr ) alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec); @@ -168,6 +187,11 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig) vcpu_kick(target); } +void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig) +{ + vlapic_set_irq_callback(vlapic, vec, trig, NULL, NULL); +} + static int vlapic_find_highest_isr(const struct vlapic *vlapic) { return vlapic_find_highest_vector(&vlapic->regs->data[APIC_ISR]); @@ -461,6 +485,9 @@ void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector) { struct vcpu *v = vlapic_vcpu(vlapic); struct domain *d = v->domain; + vlapic_eoi_callback_t *callback; + void *data; + unsigned long flags; /* All synic SINTx vectors are edge triggered */ @@ -470,6 +497,14 @@ void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector) viridian_synic_ack_sint(v, vector); hvm_dpci_msi_eoi(d, vector); + + spin_lock_irqsave(&vlapic->callback_lock, flags); + callback = vlapic->callbacks[vector].callback; + data = vlapic->callbacks[vector].data; + spin_unlock_irqrestore(&vlapic->callback_lock, flags); + + if ( callback ) + callback(v, vector, data); } static bool_t is_multicast_dest(struct vlapic *vlapic, unsigned int short_hand, @@ -1636,9 +1671,23 @@ int vlapic_init(struct vcpu *v) } clear_page(vlapic->regs); + if ( !vlapic->callbacks ) + { + vlapic->callbacks = xmalloc_array(typeof(*(vlapic->callbacks)), + X86_NR_VECTORS); + if ( !vlapic->callbacks ) + { + dprintk(XENLOG_ERR, "alloc vlapic callbacks error: %d/%d\n", + v->domain->domain_id, v->vcpu_id); + return -ENOMEM; + } + } + memset(vlapic->callbacks, 0, sizeof(*(vlapic->callbacks)) * X86_NR_VECTORS); + vlapic_reset(vlapic); spin_lock_init(&vlapic->esr_lock); + spin_lock_init(&vlapic->callback_lock); tasklet_init(&vlapic->init_sipi.tasklet, vlapic_init_sipi_action, v); @@ -1660,6 +1709,7 @@ void vlapic_destroy(struct vcpu *v) destroy_periodic_time(&vlapic->pt); unmap_domain_page_global(vlapic->regs); free_domheap_page(vlapic->regs_page); + XFREE(vlapic->callbacks); } /* diff --git a/xen/include/asm-x86/hvm/vlapic.h b/xen/include/asm-x86/hvm/vlapic.h index 8f908928c3..6782508a68 100644 --- a/xen/include/asm-x86/hvm/vlapic.h +++ b/xen/include/asm-x86/hvm/vlapic.h @@ -73,6 +73,9 @@ #define vlapic_clear_vector(vec, bitmap) \ clear_bit(VEC_POS(vec), (uint32_t *)((bitmap) + REG_POS(vec))) +typedef void vlapic_eoi_callback_t(struct vcpu *v, unsigned int vector, + void *data); + struct vlapic { struct hvm_hw_lapic hw; struct hvm_hw_lapic_regs *regs; @@ -89,6 +92,11 @@ struct vlapic { uint32_t icr, dest; struct tasklet tasklet; } init_sipi; + struct { + vlapic_eoi_callback_t *callback; + void *data; + } *callbacks; + spinlock_t callback_lock; }; /* vlapic's frequence is 100 MHz */ @@ -111,6 +119,8 @@ void vlapic_reg_write(struct vcpu *v, unsigned int reg, uint32_t val); bool_t is_vlapic_lvtpc_enabled(struct vlapic *vlapic); bool vlapic_test_irq(const struct vlapic *vlapic, uint8_t vec); +void vlapic_set_irq_callback(struct vlapic *vlapic, uint8_t vec, uint8_t trig, + vlapic_eoi_callback_t *callback, void *data); void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig); int vlapic_has_pending_irq(struct vcpu *v);
Add a new vlapic_set_irq_callback helper in order to inject a vector and set a callback to be executed when the guest performs the end of interrupt acknowledgment. Such functionality will be used to migrate the current ad hoc handling done in vlapic_handle_EOI for the vectors that require some logic to be executed when the end of interrupt is performed. No current users are migrated to use this new functionality yet, so not functional change expected as a result. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- xen/arch/x86/hvm/vlapic.c | 54 ++++++++++++++++++++++++++++++-- xen/include/asm-x86/hvm/vlapic.h | 10 ++++++ 2 files changed, 62 insertions(+), 2 deletions(-)