Message ID | 20250224235542.2562848-1-seanjc@google.com (mailing list archive) |
---|---|
Series | KVM: x86: nVMX IRQ fix and VM teardown cleanups |
On 2/25/25 00:55, Sean Christopherson wrote:

> This was _supposed_ to be a tiny one-off patch to fix a nVMX bug where KVM
> fails to detect that, after nested VM-Exit, L1 has a pending IRQ (or NMI).
> But because x86's nested teardown flows are garbage (KVM simply forces a
> nested VM-Exit to put the vCPU back into L1), that simple fix snowballed.
>
> The immediate issue is that checking for a pending interrupt accesses the
> legacy PIC, and x86's kvm_arch_destroy_vm() currently frees the PIC before
> destroying vCPUs, i.e. checking for IRQs during the forced nested VM-Exit
> results in a NULL pointer deref (or use-after-free if KVM didn't nullify
> the PIC pointer). That's patch 1.
>
> Patch 2 is the original nVMX fix.
>
> The remaining patches attempt to bring a bit of sanity to x86's VM
> teardown code, which has accumulated a lot of cruft over the years. E.g.
> KVM currently unloads each vCPU's MMUs in a separate operation from
> destroying vCPUs, all because when guest SMP support was added, KVM had a
> kludgy MMU teardown flow that broke when a VM had more than one vCPU.
> And that oddity lived on, for 18 years...

Queued patches 1 and 2 to kvm/master, and everything to kvm/queue (pending a little more testing and the related TDX change).

Paolo
Hello:

This series was applied to riscv/linux.git (for-next) by Paolo Bonzini <pbonzini@redhat.com>:

On Mon, 24 Feb 2025 15:55:35 -0800 you wrote:

> This was _supposed_ to be a tiny one-off patch to fix a nVMX bug where KVM
> fails to detect that, after nested VM-Exit, L1 has a pending IRQ (or NMI).
> But because x86's nested teardown flows are garbage (KVM simply forces a
> nested VM-Exit to put the vCPU back into L1), that simple fix snowballed.
>
> The immediate issue is that checking for a pending interrupt accesses the
> legacy PIC, and x86's kvm_arch_destroy_vm() currently frees the PIC before
> destroying vCPUs, i.e. checking for IRQs during the forced nested VM-Exit
> results in a NULL pointer deref (or use-after-free if KVM didn't nullify
> the PIC pointer). That's patch 1.
>
> [...]

Here is the summary with links:

- [1/7] KVM: x86: Free vCPUs before freeing VM state
  https://git.kernel.org/riscv/c/17bcd7144263
- [2/7] KVM: nVMX: Process events on nested VM-Exit if injectable IRQ or NMI is pending
  https://git.kernel.org/riscv/c/982caaa11504
- [3/7] KVM: Assert that a destroyed/freed vCPU is no longer visible
  (no matching commit)
- [4/7] KVM: x86: Don't load/put vCPU when unloading its MMU during teardown
  (no matching commit)
- [5/7] KVM: x86: Unload MMUs during vCPU destruction, not before
  (no matching commit)
- [6/7] KVM: x86: Fold guts of kvm_arch_sync_events() into kvm_arch_pre_destroy_vm()
  (no matching commit)
- [7/7] KVM: Drop kvm_arch_sync_events() now that all implementations are nops
  (no matching commit)

You are awesome, thank you!
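
For readers following the ordering problem that patch 1 addresses, here is a toy userspace model in C. It is only a sketch, not kernel code: every `toy_*` name is hypothetical, and it assumes each vCPU is still running a nested guest at teardown, so destroying it forces a nested VM-Exit that looks at the PIC. Instead of crashing, the model counts the accesses that would have hit an already-freed PIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_VCPUS 4

/* Hypothetical stand-in for the legacy PIC state owned by the VM. */
struct toy_pic {
    int pending_irq;
};

/* Hypothetical stand-in for a VM: one PIC plus a few vCPUs. */
struct toy_vm {
    struct toy_pic *pic;
    bool vcpu_alive[TOY_MAX_VCPUS];
    int nr_vcpus;
    int stale_pic_accesses; /* vCPU teardowns that found the PIC already freed */
};

static void toy_vm_init(struct toy_vm *vm, int nr_vcpus)
{
    assert(nr_vcpus >= 0 && nr_vcpus <= TOY_MAX_VCPUS);
    vm->pic = calloc(1, sizeof(*vm->pic));
    assert(vm->pic != NULL);
    vm->nr_vcpus = nr_vcpus;
    vm->stale_pic_accesses = 0;
    for (int i = 0; i < TOY_MAX_VCPUS; i++)
        vm->vcpu_alive[i] = i < nr_vcpus;
}

/* Free the PIC and nullify the pointer, as the cover letter describes. */
static void toy_free_pic(struct toy_vm *vm)
{
    free(vm->pic);
    vm->pic = NULL;
}

/*
 * Destroying a vCPU forces a nested VM-Exit (assumed here), and that exit
 * checks for a pending IRQ, which consults the PIC.
 */
static void toy_destroy_vcpu(struct toy_vm *vm, int idx)
{
    if (vm->pic == NULL)
        vm->stale_pic_accesses++;      /* in the real flow: NULL pointer deref */
    else
        (void)vm->pic->pending_irq;    /* safe: the PIC still exists */
    vm->vcpu_alive[idx] = false;
}

static void toy_destroy_vcpus(struct toy_vm *vm)
{
    for (int i = 0; i < vm->nr_vcpus; i++)
        toy_destroy_vcpu(vm, i);
}

/*
 * Tear the VM down in one of two orders and return how many PIC accesses
 * happened after the PIC was freed.
 */
static int toy_teardown(struct toy_vm *vm, bool free_vcpus_first)
{
    if (free_vcpus_first) {
        toy_destroy_vcpus(vm);   /* order after patch 1: vCPUs, then VM state */
        toy_free_pic(vm);
    } else {
        toy_free_pic(vm);        /* old order: PIC is gone before vCPUs */
        toy_destroy_vcpus(vm);
    }
    return vm->stale_pic_accesses;
}
```

In this model, calling `toy_teardown()` with `free_vcpus_first` set to `false` on a two-vCPU VM reports two stale accesses, one per vCPU, while `true` reports none. That matches the motivation given in the cover letter for freeing vCPUs before freeing VM state.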