Message ID | 20250304013335.4155703-4-seanjc@google.com (mailing list archive) |
---|---|
State | New |
Series | KVM: x86: Optimize "stale" EOI bitmap exiting |
On Mon, 2025-03-03 at 17:33 -0800, Sean Christopherson wrote:
> From: weizijie <zijie.wei@linux.alibaba.com>
>
> Rescan I/O APIC routes for a vCPU after handling an intercepted I/O APIC
> EOI for an IRQ that is not targeting said vCPU, i.e. after handling what's
> effectively a stale EOI VM-Exit. If a level-triggered IRQ is in-flight
> when IRQ routing changes, e.g. because the guest change routing from its
                                                   ^ changes ?
> IRQ handler, then KVM intercepts EOIs on both the new and old target vCPUs,
> so that the in-flight IRQ can be de-asserted when it's EOI'd.
>
> However, only the EOI for the in-flight IRQ needs to intercepted, as IRQs
                                                       ^ be intercepted
> on the same vector with the new routing are coincidental, i.e. occur only
> if the guest is reusing the vector for multiple interrupt sources. If the
> I/O APIC routes aren't rescanned, KVM will unnecessarily intercept EOIs
> for the vector and negative impact the vCPU's interrupt performance.
>
> Note, both commit db2bdcbbbd32 ("KVM: x86: fix edge EOI and IOAPIC reconfig
> race") and commit 0fc5a36dd6b3 ("KVM: x86: ioapic: Fix level-triggered EOI
> and IOAPIC reconfigure race") mentioned this issue, but it was considered
> a "rare" occurrence thus was not addressed. However in real environments,
> this issue can happen even in a well-behaved guest.
>
> Cc: Kai Huang <kai.huang@intel.com>
> Co-developed-by: xuyun <xuyun_xy.xy@linux.alibaba.com>
> Signed-off-by: xuyun <xuyun_xy.xy@linux.alibaba.com>
> Signed-off-by: weizijie <zijie.wei@linux.alibaba.com>
> [sean: massage changelog and comments, use int/-1, reset at scan]
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Kai Huang <kai.huang@intel.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 44007a351e88..b378414c3104 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1025,6 +1025,7 @@ struct kvm_vcpu_arch {

 	int pending_ioapic_eoi;
 	int pending_external_vector;
+	int highest_stale_pending_ioapic_eoi;

 	/* be preempted when it's in kernel-mode(cpl=0) */
 	bool preempted_in_kernel;
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 14590d9c4a37..d6d792b5d1bd 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -412,9 +412,21 @@ void kvm_scan_ioapic_irq(struct kvm_vcpu *vcpu, u32 dest_id, u16 dest_mode,
 	 * level-triggered IRQ. The EOI needs to be intercepted and forwarded
 	 * to I/O APIC emulation so that the IRQ can be de-asserted.
 	 */
-	if (kvm_apic_match_dest(vcpu, NULL, APIC_DEST_NOSHORT, dest_id, dest_mode) ||
-	    kvm_apic_pending_eoi(vcpu, vector))
+	if (kvm_apic_match_dest(vcpu, NULL, APIC_DEST_NOSHORT, dest_id, dest_mode)) {
 		__set_bit(vector, ioapic_handled_vectors);
+	} else if (kvm_apic_pending_eoi(vcpu, vector)) {
+		__set_bit(vector, ioapic_handled_vectors);
+
+		/*
+		 * Track the highest pending EOI for which the vCPU is NOT the
+		 * target in the new routing. Only the EOI for the IRQ that is
+		 * in-flight (for the old routing) needs to be intercepted, any
+		 * future IRQs that arrive on this vCPU will be coincidental to
+		 * the level-triggered routing and don't need to be intercepted.
+		 */
+		if ((int)vector > vcpu->arch.highest_stale_pending_ioapic_eoi)
+			vcpu->arch.highest_stale_pending_ioapic_eoi = vector;
+	}
 }

 void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9dbc0f5d9865..6af84a0f84f3 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1459,6 +1459,14 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector)
 	if (!kvm_ioapic_handles_vector(apic, vector))
 		return;

+	/*
+	 * If the intercepted EOI is for an IRQ that was pending from previous
+	 * routing, then re-scan the I/O APIC routes as EOIs for the IRQ likely
+	 * no longer need to be intercepted.
+	 */
+	if (apic->vcpu->arch.highest_stale_pending_ioapic_eoi == vector)
+		kvm_make_request(KVM_REQ_SCAN_IOAPIC, apic->vcpu);
+
 	/* Request a KVM exit to inform the userspace IOAPIC. */
 	if (irqchip_split(apic->vcpu->kvm)) {
 		apic->vcpu->arch.pending_ioapic_eoi = vector;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7d4b9e2f1a38..a40b09dfb36a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10650,6 +10650,7 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 		return;

 	bitmap_zero(vcpu->arch.ioapic_handled_vectors, 256);
+	vcpu->arch.highest_stale_pending_ioapic_eoi = -1;

 	kvm_x86_call(sync_pir_to_irr)(vcpu);
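
For readers following the thread, the scan/EOI/rescan lifecycle in the patch can be sketched as a small userspace model. This is not KVM code: every `toy_*` name, the single-route I/O APIC, and the `rescan_on_stale_eoi` switch are hypothetical and exist only for illustration. The model assumes one level-triggered vector and looks only at the old target vCPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_VECTORS 256

/* Hypothetical stand-in for the per-vCPU state touched by the patch. */
struct toy_vcpu {
	int id;
	bool handled[NR_VECTORS];     /* models ioapic_handled_vectors */
	bool pending_eoi[NR_VECTORS]; /* models kvm_apic_pending_eoi() */
	int highest_stale;            /* models highest_stale_pending_ioapic_eoi */
	bool scan_requested;          /* models a pending KVM_REQ_SCAN_IOAPIC */
};

/* Hypothetical one-entry routing table: a single vector and its target. */
struct toy_route {
	int vector;
	int dest_vcpu;
};

/* Models the scan: reset the tracking state, then re-evaluate the route. */
static void toy_scan(struct toy_vcpu *v, const struct toy_route *r)
{
	memset(v->handled, 0, sizeof(v->handled));
	v->highest_stale = -1;
	v->scan_requested = false;

	if (r->dest_vcpu == v->id) {
		/* vCPU is the target in the current routing. */
		v->handled[r->vector] = true;
	} else if (v->pending_eoi[r->vector]) {
		/* Not the target, but an IRQ from the old routing is in flight. */
		v->handled[r->vector] = true;
		if (r->vector > v->highest_stale)
			v->highest_stale = r->vector;
	}
}

/* Models the EOI path; returns true if the EOI was intercepted. */
static bool toy_eoi(struct toy_vcpu *v, int vector)
{
	assert(vector >= 0 && vector < NR_VECTORS);

	v->pending_eoi[vector] = false;
	if (!v->handled[vector])
		return false;

	/* The stale EOI has been seen, so ask for a rescan. */
	if (v->highest_stale == vector)
		v->scan_requested = true;
	return true;
}

/*
 * Replays the changelog scenario on vCPU 0 and returns how many of four
 * EOIs on the vector were intercepted. The first EOI is the stale one; the
 * other three stand for unrelated IRQs that happen to reuse the vector.
 */
static int toy_demo(bool rescan_on_stale_eoi)
{
	struct toy_vcpu v0 = { .id = 0, .highest_stale = -1 };
	struct toy_route route = { .vector = 0x31, .dest_vcpu = 0 };
	int intercepted = 0;

	toy_scan(&v0, &route);
	v0.pending_eoi[route.vector] = true; /* level IRQ in flight on vCPU 0 */
	route.dest_vcpu = 1;                 /* guest re-routes from its handler */
	toy_scan(&v0, &route);               /* vector stays intercepted (stale) */

	for (int i = 0; i < 4; i++) {
		intercepted += toy_eoi(&v0, route.vector);
		if (rescan_on_stale_eoi && v0.scan_requested)
			toy_scan(&v0, &route);
	}
	return intercepted;
}
```

In this model, `toy_demo(true)` intercepts only the stale EOI, while `toy_demo(false)` keeps intercepting every later EOI on the vector, which is the overhead the changelog describes.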