
[v4,0/4] KVM: x86: Fix and harden reg caching from !TASK context

Message ID: 20241009175002.1118178-1-seanjc@google.com

Message

Sean Christopherson Oct. 9, 2024, 5:49 p.m. UTC
Fix a (VMX only) bug reported by Maxim where KVM caches a stale SS.AR_BYTES
when involuntary preemption schedules out a vCPU during vmx_vcpu_reset(), and
ultimately clobbers the VMCS's SS.AR_BYTES if userspace does KVM_GET_SREGS
=> KVM_SET_SREGS, i.e. if userspace writes the stale value back into KVM.
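
Roughly, the buggy flow is (illustrative sketch, call chains abbreviated):

  /*
   * vCPU task (vmx_vcpu_reset())           involuntary preemption
   * ----------------------------           ----------------------
   * vmx_segment_cache_clear()
   *                                        kvm_sched_out()
   *                                          vmx_get_cpl()
   *                                            reads SS.AR_BYTES from the
   *                                            VMCS (pre-reset value) and
   *                                            stashes it in the cache
   * vmcs_write32(GUEST_SS_AR_BYTES, ...)
   *   => cached value is now stale
   *
   * KVM_GET_SREGS later returns the stale SS.AR_BYTES from the cache,
   * and KVM_SET_SREGS writes it back into the VMCS.
   */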

v4, as this is a spiritual successor to Maxim's earlier series.

Patch 1 fixes the underlying problem by avoiding the cache in kvm_sched_out().
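
E.g. a sketch of the idea in kvm_arch_vcpu_put(), which is reached via
kvm_sched_out() (the hook name below is illustrative, see the patch for
the actual interface):

  /*
   * Query the CPL straight from the VMCS/VMCB instead of going through,
   * and thereby populating, the register cache.
   */
  if (vcpu->preempted && !vcpu->arch.guest_state_protected)
  	vcpu->arch.preempted_in_kernel =
  		!kvm_x86_call(get_cpl_no_cache)(vcpu);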

Patch 2 fixes vmx_vcpu_reset() to invalidate the cache _after_ writing the
VMCS, which also fixes the VMCS clobbering bug, but isn't as robust of a fix
for KVM as a whole, e.g. any other flow that invalidates the cache too "early"
would be susceptible to the bug, and on its own doesn't allow for the
hardening in patch 3.
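
I.e. the ordering in vmx_vcpu_reset() changes roughly as follows
(heavily simplified):

  /* Before: the cache is invalidated first, so being preempted in the
   * middle of the VMCS writes below can re-populate it with stale,
   * pre-reset values. */
  vmx_segment_cache_clear(vmx);
  seg_setup(VCPU_SREG_CS);
  vmcs_write16(GUEST_CS_SELECTOR, 0xf000);
  ...

  /* After: invalidate the cache only once the VMCS writes are done, so
   * anything cached in the interim is discarded. */
  seg_setup(VCPU_SREG_CS);
  vmcs_write16(GUEST_CS_SELECTOR, 0xf000);
  ...
  vmx_segment_cache_clear(vmx);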

Patch 3 hardens KVM against using the register caches from !TASK context.
Except for PMI callbacks, which are tightly bounded, i.e. can't run while
KVM is modifying segment information, using the register caches from IRQ/NMI
context is unsafe.
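
E.g. a sketch of the assert (simplified relative to the actual patch):

  /*
   * WARN once (lockdep builds only) if the register cache is accessed
   * outside of task context, with a carve-out for PMI callbacks that
   * run while KVM is handling an interrupt received from the guest.
   */
  #define kvm_assert_register_caching_allowed(vcpu)	\
  	lockdep_assert_once(in_task() ||		\
  			    kvm_arch_pmi_in_guest(vcpu))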

Patch 4 is a tangentially related cleanup.
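
Specifically, the PMI callback that reports the guest RIP would otherwise
read through the register cache even when guest state is protected (e.g.
SEV-ES), where the real RIP is unavailable anyway.  A sketch of the idea,
assuming the change lands in kvm_arch_vcpu_get_ip():

  unsigned long kvm_arch_vcpu_get_ip(struct kvm_vcpu *vcpu)
  {
  	/* RIP is unreadable for protected guests; report '0' rather
  	 * than whatever happens to be lurking in the cache. */
  	return vcpu->arch.guest_state_protected ? 0 : kvm_rip_read(vcpu);
  }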

v3: https://lore.kernel.org/all/20240725175232.337266-1-mlevitsk@redhat.com

Maxim Levitsky (1):
  KVM: VMX: reset the segment cache after segment init in
    vmx_vcpu_reset()

Sean Christopherson (3):
  KVM: x86: Bypass register cache when querying CPL from kvm_sched_out()
  KVM: x86: Add lockdep-guarded asserts on register cache usage
  KVM: x86: Use '0' for guest RIP if PMI encounters protected guest
    state

 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/kvm_cache_regs.h      | 17 +++++++++++++++++
 arch/x86/kvm/svm/svm.c             |  1 +
 arch/x86/kvm/vmx/main.c            |  1 +
 arch/x86/kvm/vmx/vmx.c             | 29 +++++++++++++++++++++--------
 arch/x86/kvm/vmx/vmx.h             |  1 +
 arch/x86/kvm/x86.c                 | 15 ++++++++++++++-
 8 files changed, 57 insertions(+), 9 deletions(-)


base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b

Comments

Paolo Bonzini Oct. 10, 2024, 1:06 p.m. UTC | #1
On 10/9/24 19:49, Sean Christopherson wrote:
> Fix a (VMX only) bug reported by Maxim where KVM caches a stale SS.AR_BYTES
> when involuntary preemption schedules out a vCPU during vmx_vcpu_reset(), and
> ultimately clobbers the VMCS's SS.AR_BYTES if userspace does KVM_GET_SREGS
> => KVM_SET_SREGS, i.e. if userspace writes the stale value back into KVM.
> 
> v4, as this is a spiritual successor to Maxim's earlier series.
> 
> Patch 1 fixes the underlying problem by avoiding the cache in kvm_sched_out().

I think we want this one in stable?

Thanks,

Paolo

> Patch 2 fixes vmx_vcpu_reset() to invalidate the cache _after_ writing the
> VMCS, which also fixes the VMCS clobbering bug, but isn't as robust of a fix
> for KVM as a whole, e.g. any other flow that invalidates the cache too "early"
> would be susceptible to the bug, and on its own doesn't allow for the
> hardening in patch 3.
> 
> Patch 3 hardens KVM against using the register caches from !TASK context.
> Except for PMI callbacks, which are tightly bounded, i.e. can't run while
> KVM is modifying segment information, using the register caches from IRQ/NMI
> context is unsafe.
> 
> Patch 4 is a tangentially related cleanup.
> 
> v3: https://lore.kernel.org/all/20240725175232.337266-1-mlevitsk@redhat.com
> 
> Maxim Levitsky (1):
>    KVM: VMX: reset the segment cache after segment init in
>      vmx_vcpu_reset()
> 
> Sean Christopherson (3):
>    KVM: x86: Bypass register cache when querying CPL from kvm_sched_out()
>    KVM: x86: Add lockdep-guarded asserts on register cache usage
>    KVM: x86: Use '0' for guest RIP if PMI encounters protected guest
>      state
> 
>   arch/x86/include/asm/kvm-x86-ops.h |  1 +
>   arch/x86/include/asm/kvm_host.h    |  1 +
>   arch/x86/kvm/kvm_cache_regs.h      | 17 +++++++++++++++++
>   arch/x86/kvm/svm/svm.c             |  1 +
>   arch/x86/kvm/vmx/main.c            |  1 +
>   arch/x86/kvm/vmx/vmx.c             | 29 +++++++++++++++++++++--------
>   arch/x86/kvm/vmx/vmx.h             |  1 +
>   arch/x86/kvm/x86.c                 | 15 ++++++++++++++-
>   8 files changed, 57 insertions(+), 9 deletions(-)
> 
> 
> base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b
Sean Christopherson Oct. 10, 2024, 4:17 p.m. UTC | #2
On Thu, Oct 10, 2024, Paolo Bonzini wrote:
> On 10/9/24 19:49, Sean Christopherson wrote:
> > Fix a (VMX only) bug reported by Maxim where KVM caches a stale SS.AR_BYTES
> > when involuntary preemption schedules out a vCPU during vmx_vcpu_reset(), and
> > ultimately clobbers the VMCS's SS.AR_BYTES if userspace does KVM_GET_SREGS
> > => KVM_SET_SREGS, i.e. if userspace writes the stale value back into KVM.
> > 
> > v4, as this is a spiritual successor to Maxim's earlier series.
> > 
> > Patch 1 fixes the underlying problem by avoiding the cache in kvm_sched_out().
> 
> I think we want this one in stable?

If anything, we should send Maxim's patch to stable trees.  While not a complete
fix, it resolves the only known scenario where caching SS.AR_BYTES is truly
problematic, it's as low risk as patches get, and it's much more likely to backport
cleanly to older kernels.

> > Patch 2 fixes vmx_vcpu_reset() to invalidate the cache _after_ writing the
> > VMCS, which also fixes the VMCS clobbering bug, but isn't as robust of a fix
> > for KVM as a whole, e.g. any other flow that invalidates the cache too "early"
> > would be susceptible to the bug, and on its own doesn't allow for the
> > hardening in patch 3.
Paolo Bonzini Oct. 10, 2024, 4:24 p.m. UTC | #3
On Thu, Oct 10, 2024 at 6:17 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Oct 10, 2024, Paolo Bonzini wrote:
> > On 10/9/24 19:49, Sean Christopherson wrote:
> > > Fix a (VMX only) bug reported by Maxim where KVM caches a stale SS.AR_BYTES
> > > when involuntary preemption schedules out a vCPU during vmx_vcpu_reset(), and
> > > ultimately clobbers the VMCS's SS.AR_BYTES if userspace does KVM_GET_SREGS
> > > => KVM_SET_SREGS, i.e. if userspace writes the stale value back into KVM.
> > >
> > > v4, as this is a spiritual successor to Maxim's earlier series.
> > >
> > > Patch 1 fixes the underlying problem by avoiding the cache in kvm_sched_out().
> >
> > I think we want this one in stable?
>
> If anything, we should send Maxim's patch to stable trees.  While not a complete
> fix, it resolves the only known scenario where caching SS.AR_BYTES is truly
> problematic, it's as low risk as patches get, and it's much more likely to backport
> cleanly to older kernels.

Ok, this works for me.

Paolo