diff mbox

[v2] pc: apic: fix touch LAPIC when irqchip is split

Message ID 1473832464-3478-1-git-send-email-wanpeng.li@hotmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Wanpeng Li Sept. 14, 2016, 5:54 a.m. UTC
From: Wanpeng Li <wanpeng.li@hotmail.com>

Add -kernel_irqchip=split
./x86-run x86/eventinj.flat

qemu-system-x86_64 -enable-kvm -machine kernel_irqchip=split -cpu host
-device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc
none -serial stdio -device pci-testdev -kernel x86/eventinj.flat
enabling apic
paging enabled
cr0 = 80010011
cr3 = 7fff000
cr4 = 20
Sending vec 33 and 62 and mask one with TPR
irq1 running
irq1 running
After 33/62 TPR test
FAIL: TPR
irq0 running
irq0 running

Both irq1 and irq0 are executing twice.

kvm_entry: vcpu 0
kvm_exit: reason MSR_WRITE rip 0x401f33 info 0 0
kvm_apic: apic_write APIC_EOI = 0x0
kvm_eoi: apicid 0 vector 62
kvm_msr: msr_write 80b = 0x0
kvm_entry: vcpu 0
kvm_exit: reason PENDING_INTERRUPT rip 0x401f35 info 0 0
kvm_userspace_exit: reason KVM_EXIT_IRQ_WINDOW_OPEN (7)
kvm_inj_virq: irq 62
kvm_entry: vcpu 0
kvm_exit: reason IO_INSTRUCTION rip 0x4016ec info 3fd0008 0

From the trace we can see there is an interrupt window exit 
after the first interrupt EOI(irq 62), and the same irq(62) 
is injected duplicately after the interrupt window.

QEMU does KVM_INTERRUPT(62) ioctl after KVM exits with 
KVM_EXIT_IRQ_WINDOW_OPEN, which QEMU requested while the
guest was printing.  The printing calls

serial_update_irq() -> qemu_irq_lower() -> qemu_set_irq() ->
gsi_handler() -> qemu_set_irq() -> pic_irq_request() ->
apic_deliver_pic_intr() -> kvm_handle_interrupt()

kvm_handle_interrupt() does

	interrupt_request |= CPU_INTERRUPT_HARD

which later calls cpu_get_pic_interrupt() in kvm_arch_pre_run(), 
but that function uses stale information from APIC and injects 
62 again. If we synchronized the APIC, then the test would #GP, 
because there would be no injectable interrupt in LAPIC or PIC, 
so pic_read_irq() would return 15, thinking it was spurious.

This patch fix it by don't touch LAPIC if LAPIC is in kernel.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
v1 -> v2:
 * cleanup coding style issue

 hw/i386/pc.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

Comments

Radim Krčmář Sept. 14, 2016, 2:05 p.m. UTC | #1
2016-09-14 13:54+0800, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> Add -kernel_irqchip=split
> ./x86-run x86/eventinj.flat
> 
> qemu-system-x86_64 -enable-kvm -machine kernel_irqchip=split -cpu host
> -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc
> none -serial stdio -device pci-testdev -kernel x86/eventinj.flat
> enabling apic
> paging enabled
> cr0 = 80010011
> cr3 = 7fff000
> cr4 = 20
> Sending vec 33 and 62 and mask one with TPR
> irq1 running
> irq1 running
> After 33/62 TPR test
> FAIL: TPR
> irq0 running
> irq0 running
> 
> Both irq1 and irq0 are executing twice.
> 
> kvm_entry: vcpu 0
> kvm_exit: reason MSR_WRITE rip 0x401f33 info 0 0
> kvm_apic: apic_write APIC_EOI = 0x0
> kvm_eoi: apicid 0 vector 62
> kvm_msr: msr_write 80b = 0x0
> kvm_entry: vcpu 0
> kvm_exit: reason PENDING_INTERRUPT rip 0x401f35 info 0 0
> kvm_userspace_exit: reason KVM_EXIT_IRQ_WINDOW_OPEN (7)
> kvm_inj_virq: irq 62
> kvm_entry: vcpu 0
> kvm_exit: reason IO_INSTRUCTION rip 0x4016ec info 3fd0008 0
> 
> From the trace we can see there is an interrupt window exit 
> after the first interrupt EOI(irq 62), and the same irq(62) 
> is injected duplicately after the interrupt window.
> 
> QEMU does KVM_INTERRUPT(62) ioctl after KVM exits with 
> KVM_EXIT_IRQ_WINDOW_OPEN, which QEMU requested while the
> guest was printing.  The printing calls
> 
> serial_update_irq() -> qemu_irq_lower() -> qemu_set_irq() ->
> gsi_handler() -> qemu_set_irq() -> pic_irq_request() ->
> apic_deliver_pic_intr() -> kvm_handle_interrupt()
> 
> kvm_handle_interrupt() does
> 
> 	interrupt_request |= CPU_INTERRUPT_HARD
> 
> which later calls cpu_get_pic_interrupt() in kvm_arch_pre_run(), 
> but that function uses stale information from APIC and injects 
> 62 again. If we synchronized the APIC, then the test would #GP, 
> because there would be no injectable interrupt in LAPIC or PIC, 
> so pic_read_irq() would return 15, thinking it was spurious.
> 
> This patch fix it by don't touch LAPIC if LAPIC is in kernel.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Suggested-by: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
diff mbox

Patch

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index e31f70f..4f3d508 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -161,14 +161,16 @@  int cpu_get_pic_interrupt(CPUX86State *env)
     X86CPU *cpu = x86_env_get_cpu(env);
     int intno;
 
-    intno = apic_get_interrupt(cpu->apic_state);
-    if (intno >= 0) {
-        return intno;
-    }
-    /* read the irq from the PIC */
-    if (!apic_accept_pic_intr(cpu->apic_state)) {
-        return -1;
-    }
+    if (!kvm_irqchip_in_kernel()) {
+        intno = apic_get_interrupt(cpu->apic_state);
+        if (intno >= 0) {
+            return intno;
+        }
+        /* read the irq from the PIC */
+        if (!apic_accept_pic_intr(cpu->apic_state)) {
+            return -1;
+        }
+    }
 
     intno = pic_read_irq(isa_pic);
     return intno;
@@ -180,7 +182,7 @@  static void pic_irq_request(void *opaque, int irq, int level)
     X86CPU *cpu = X86_CPU(cs);
 
     DPRINTF("pic_irqs: %s irq %d\n", level? "raise" : "lower", irq);
-    if (cpu->apic_state) {
+    if (cpu->apic_state && !kvm_irqchip_in_kernel()) {
         CPU_FOREACH(cs) {
             cpu = X86_CPU(cs);
             if (apic_accept_pic_intr(cpu->apic_state)) {