diff mbox

[question] Why newer QEMU may lose irq when doing migration?

Message ID CACzj_yV+DCt8d5KwH__-e=3KiuUTpnhmdh3sKQ9YrpGzEA5mEg@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Wincy Van Dec. 22, 2014, 1:55 p.m. UTC
2014-12-19 17:50 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>
>
> On 19/12/2014 04:58, Wincy Van wrote:
>> 2014-12-17 18:46 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>>>
>>>
>>> On 17/12/2014 04:46, Wincy Van wrote:
>>>> Hi, all:
>>>>
>>>> The patchset (https://lkml.org/lkml/2014/3/18/309) fixed migration of
>>>> Windows guests, but commit 0bc830b05c667218d703f2026ec866c49df974fc
>>>> (KVM: ioapic: clear IRR for edge-triggered interrupts at delivery)
>>>> introduced a bug (see
>>>> https://www.mail-archive.com/kvm@vger.kernel.org/msg109813.html).
>>>>
>>>> From the description "Unlike the old qemu-kvm, which really never did
>>>> that, with new QEMU it is for some reason
>>>> somewhat likely to migrate a VM with a nonzero IRR in the ioapic."
>>>>
>>>> Why could new QEMU do that? I can not find any codes about the "some reason"..
>>>> As we know, once a irq is set in kvm's ioapic, the ioapic will send
>>>> that irq to lapic, this is an "atomic" operation.
>>>
>>> It can happen if the IRQ is masked in the IOAPIC, for example.  Until
>>> commit 0bc830b, KVM could not distinguish two cases:
>>>
>>> 1) an edge-triggered interrupt that was raised while the IOAPIC had it
>>> masked
>>>
>>> 2) an edge-triggered interrupt that was raised and delivered, but for
>>> which userspace left the level to 1.
>>
>> It seems that QEMU's rtc behavior is case 2. But before this patchset, a rtc
>> interrupt may be lost when doing migration, and guest will not acknowledge it,
>> then the newer rtc interrupts are ignored forever. I think this is
>> none of the cases above, because the interrupt was lost. It must be
>> something wrong here.
>
> There is a third case actually.
>
> If the source kernel is an old one before commit 2c2bf0113697 (KVM: Use
> eoi to track RTC interrupt delivery status, 2013-04-11), ioapic->irr can
> also be set if the RTC interrupt was coalesced (for example because the
> PPR was too high to deliver it).
>

Yeah, understood, thanks.

> Instead, commit 2c2bf0113697 will not set ioapic->irr in this case.
> Yang, was this intentional?
>
> The question, however, is then why my patch set worked (fixed migration)
> even without moving
>
>                 ioapic->irr |= mask;
>
> above this:
>
>                 if (irq == RTC_GSI && line_status &&
>                         rtc_irq_check_coalesced(ioapic)) {
>                         ret = 0; /* coalesced */
>                         goto out;
>                 }
>

Paolo, how about this patch?

It uses a new field named irr_delivered to record the status of
edge-triggered interrupt,
and clears the delivered irr in kvm_get_ioapic. So it have the same
effect of commit
0bc830b while avoid the bug of Windows guests.



Thanks,

Wincy
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index b1947e0..eda7905 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -206,6 +206,8 @@  static int ioapic_set_irq(struct kvm_ioapic
*ioapic, unsigned int irq,

        old_irr = ioapic->irr;
        ioapic->irr |= mask;
+      if (edge)
+              ioapic->irr_delivered &= ~mask;
        if ((edge && old_irr == ioapic->irr) ||
            (!edge && entry.fields.remote_irr)) {
                ret = 0;
@@ -349,7 +351,7 @@  static int ioapic_service(struct kvm_ioapic
*ioapic, int irq, bool line_status)
        irqe.shorthand = 0;

        if (irqe.trig_mode == IOAPIC_EDGE_TRIG)
-               ioapic->irr &= ~(1 << irq);
+              ioapic->irr_delivered |= 1 << irq;

        if (irq == RTC_GSI && line_status) {
                /*
@@ -597,6 +599,7 @@  static void kvm_ioapic_reset(struct kvm_ioapic *ioapic)
        ioapic->base_address = IOAPIC_DEFAULT_BASE_ADDRESS;
        ioapic->ioregsel = 0;
        ioapic->irr = 0;
+      ioapic->irr_delivered = 0;
        ioapic->id = 0;
        memset(ioapic->irq_eoi, 0x00, IOAPIC_NUM_PINS);
        rtc_irq_eoi_tracking_reset(ioapic);
@@ -654,6 +657,7 @@  int kvm_get_ioapic(struct kvm *kvm, struct
kvm_ioapic_state *state)

        spin_lock(&ioapic->lock);
        memcpy(state, ioapic, sizeof(struct kvm_ioapic_state));
+      state->irr &= ~ioapic->irr_delivered;
        spin_unlock(&ioapic->lock);
        return 0;
 }
diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
index 3c91955..a5cdfc0 100644
--- a/arch/x86/kvm/ioapic.h
+++ b/arch/x86/kvm/ioapic.h
@@ -77,6 +77,7 @@  struct kvm_ioapic {
        struct rtc_status rtc_status;
        struct delayed_work eoi_inject;
        u32 irq_eoi[IOAPIC_NUM_PINS];
+      u32 irr_delivered;
 };

 #ifdef DEBUG