mbox series

[v3,0/7] KVM: few more SMM fixes

Message ID 20210913140954.165665-1-mlevitsk@redhat.com (mailing list archive)
Headers show
Series KVM: few more SMM fixes | expand

Message

Maxim Levitsky Sept. 13, 2021, 2:09 p.m. UTC
These are few SMM fixes I was working on last week.

* Patch 1,2 fixes a minor issue that remained after
  commit 37be407b2ce8 ("KVM: nSVM: Fix L1 state corruption upon return from SMM")

  While now, returns to guest mode from SMM work due to restored state from HSAVE
  area, the guest entry still sees incorrect HSAVE state.

  This for example breaks return from SMM when the guest is 32 bit, due to PDPTRs
  loading which are done using incorrect MMU state which is incorrect,
  because it was setup with incorrect L1 HSAVE state.

  V3: updated with review feedback from Sean.

* Patch 3 fixes a theoretical issue that I introduced with my SREGS2 patchset,
  which Sean Christopherson pointed out.

  The issue is that KVM_REQ_GET_NESTED_STATE_PAGES request is not only used
  for completing the load of the nested state, but it is also used to complete
  exit from SMM to guest mode, and my compatibility hack of pdptrs_from_userspace
  was done assuming that this is not done.

  V3: I moved the reset of pdptrs_from_userspace to common x86 code.

* Patch 4 makes SVM SMM exit to be a bit more similar to how VMX does it
  by also raising KVM_REQ_GET_NESTED_STATE_PAGES requests.

  I do have doubts about why we need to do this on VMX though. The initial
  justification for this comes from

  7f7f1ba33cf2 ("KVM: x86: do not load vmcs12 pages while still in SMM")

  With all the MMU changes, I am not sure that we can still have a case
  of not up to date MMU when we enter the nested guest from SMM.
  On SVM it does seem to work anyway without this.

* Patch 5 fixes guest emulation failure when unrestricted_guest=0 and we reach
  handle_exception_nmi_irqoff.
  That function takes stale values from current vmcs and fails not taking into account
  the fact that we are emulating invalid guest state now, and thus no VM exit happened.

* Patch 6 fixed a corner case where return from SMM is slightly corrupting
  the L2 segment register state when unrestricted_guest=0 due to real mode segement
  caching register logic, but later it restores it correctly from SMMRAM.
  Fix this by not failing nested_vmx_enter_non_root_mode and delaying this
  failure to the next nested VM entry.

* Patch 7 fixes another corner case where emulation_required was not updated
  correctly on nested VMexit when restoring the L1 segement registers.

I still track 2 SMM issues:

1. When HyperV guest is running nested, and uses SMM enabled OVMF, it crashes and
   reboots during the boot process.

2. Nested migration on VMX is still broken when L1 floods itself with SMIs.

Best regards,
	Maxim Levitsky

Maxim Levitsky (7):
  KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm
  KVM: x86: nSVM: restore the L1 host state prior to resuming nested
    guest on SMM exit
  KVM: x86: reset pdptrs_from_userspace when exiting smm
  KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM
    mode
  KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest
    state
  KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if
    !from_vmentry
  KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit

 arch/x86/kvm/svm/nested.c |   9 ++-
 arch/x86/kvm/svm/svm.c    | 131 ++++++++++++++++++++------------------
 arch/x86/kvm/svm/svm.h    |   3 +-
 arch/x86/kvm/vmx/nested.c |   9 ++-
 arch/x86/kvm/vmx/vmx.c    |  28 ++++++--
 arch/x86/kvm/vmx/vmx.h    |   1 +
 arch/x86/kvm/x86.c        |   7 ++
 7 files changed, 113 insertions(+), 75 deletions(-)

Comments

Paolo Bonzini Sept. 22, 2021, 2:35 p.m. UTC | #1
On 13/09/21 16:09, Maxim Levitsky wrote:
> These are few SMM fixes I was working on last week.
> 
> * Patch 1,2 fixes a minor issue that remained after
>    commit 37be407b2ce8 ("KVM: nSVM: Fix L1 state corruption upon return from SMM")
> 
>    While now, returns to guest mode from SMM work due to restored state from HSAVE
>    area, the guest entry still sees incorrect HSAVE state.
> 
>    This for example breaks return from SMM when the guest is 32 bit, due to PDPTRs
>    loading which are done using incorrect MMU state which is incorrect,
>    because it was setup with incorrect L1 HSAVE state.
> 
>    V3: updated with review feedback from Sean.
> 
> * Patch 3 fixes a theoretical issue that I introduced with my SREGS2 patchset,
>    which Sean Christopherson pointed out.
> 
>    The issue is that KVM_REQ_GET_NESTED_STATE_PAGES request is not only used
>    for completing the load of the nested state, but it is also used to complete
>    exit from SMM to guest mode, and my compatibility hack of pdptrs_from_userspace
>    was done assuming that this is not done.
> 
>    V3: I moved the reset of pdptrs_from_userspace to common x86 code.
> 
> * Patch 4 makes SVM SMM exit to be a bit more similar to how VMX does it
>    by also raising KVM_REQ_GET_NESTED_STATE_PAGES requests.
> 
>    I do have doubts about why we need to do this on VMX though. The initial
>    justification for this comes from
> 
>    7f7f1ba33cf2 ("KVM: x86: do not load vmcs12 pages while still in SMM")
> 
>    With all the MMU changes, I am not sure that we can still have a case
>    of not up to date MMU when we enter the nested guest from SMM.
>    On SVM it does seem to work anyway without this.
> 
> * Patch 5 fixes guest emulation failure when unrestricted_guest=0 and we reach
>    handle_exception_nmi_irqoff.
>    That function takes stale values from current vmcs and fails not taking into account
>    the fact that we are emulating invalid guest state now, and thus no VM exit happened.
> 
> * Patch 6 fixed a corner case where return from SMM is slightly corrupting
>    the L2 segment register state when unrestricted_guest=0 due to real mode segement
>    caching register logic, but later it restores it correctly from SMMRAM.
>    Fix this by not failing nested_vmx_enter_non_root_mode and delaying this
>    failure to the next nested VM entry.
> 
> * Patch 7 fixes another corner case where emulation_required was not updated
>    correctly on nested VMexit when restoring the L1 segement registers.
> 
> I still track 2 SMM issues:
> 
> 1. When HyperV guest is running nested, and uses SMM enabled OVMF, it crashes and
>     reboots during the boot process.
> 
> 2. Nested migration on VMX is still broken when L1 floods itself with SMIs.
> 
> Best regards,
> 	Maxim Levitsky
> 
> Maxim Levitsky (7):
>    KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm
>    KVM: x86: nSVM: restore the L1 host state prior to resuming nested
>      guest on SMM exit
>    KVM: x86: reset pdptrs_from_userspace when exiting smm
>    KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM
>      mode
>    KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest
>      state
>    KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if
>      !from_vmentry
>    KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit
> 
>   arch/x86/kvm/svm/nested.c |   9 ++-
>   arch/x86/kvm/svm/svm.c    | 131 ++++++++++++++++++++------------------
>   arch/x86/kvm/svm/svm.h    |   3 +-
>   arch/x86/kvm/vmx/nested.c |   9 ++-
>   arch/x86/kvm/vmx/vmx.c    |  28 ++++++--
>   arch/x86/kvm/vmx/vmx.h    |   1 +
>   arch/x86/kvm/x86.c        |   7 ++
>   7 files changed, 113 insertions(+), 75 deletions(-)
> 

Queued, thanks.  However, I'm keeping patch 1 for 5.16 only.

Paolo
Sean Christopherson Sept. 22, 2021, 2:46 p.m. UTC | #2
On Wed, Sep 22, 2021, Paolo Bonzini wrote:
> On 13/09/21 16:09, Maxim Levitsky wrote:
> >    KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit

...
 
> Queued, thanks.  However, I'm keeping patch 1 for 5.16 only.

I'm pretty sure the above patch is wrong, emulation_required can simply be
cleared on emulated VM-Exit.
Paolo Bonzini Sept. 22, 2021, 3:45 p.m. UTC | #3
On 22/09/21 16:46, Sean Christopherson wrote:
> On Wed, Sep 22, 2021, Paolo Bonzini wrote:
>> On 13/09/21 16:09, Maxim Levitsky wrote:
>>>     KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit
> 
> ...
>   
>> Queued, thanks.  However, I'm keeping patch 1 for 5.16 only.
> 
> I'm pretty sure the above patch is wrong, emulation_required can simply be
> cleared on emulated VM-Exit.

Are you sure?  I think you can at least set the host segment fields to a 
data segment that requires emulation.  For example the DPL of the host 
DS is hardcoded to zero, but the RPL comes from the selector field and 
the DS selector is not validated.  Therefore a subsequent vmentry could 
fail the access rights tests of 26.3.1.2 Checks on Guest Segment Registers:

DS, ES, FS, GS. The DPL cannot be less than the RPL in the selector 
field if (1) the “unrestricted guest” VM-execution control is 0; (2) the 
register is usable; and (3) the Type in the access-rights field is in 
the range 0 – 11 (data segment or non-conforming code segment).

Paolo
Sean Christopherson Sept. 22, 2021, 3:52 p.m. UTC | #4
On Wed, Sep 22, 2021, Paolo Bonzini wrote:
> On 22/09/21 16:46, Sean Christopherson wrote:
> > On Wed, Sep 22, 2021, Paolo Bonzini wrote:
> > > On 13/09/21 16:09, Maxim Levitsky wrote:
> > > >     KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit
> > 
> > ...
> > > Queued, thanks.  However, I'm keeping patch 1 for 5.16 only.
> > 
> > I'm pretty sure the above patch is wrong, emulation_required can simply be
> > cleared on emulated VM-Exit.
> 
> Are you sure?

Pretty sure, but not 100% sure :-)

> I think you can at least set the host segment fields to a data segment that
> requires emulation.  For example the DPL of the host DS is hardcoded to zero,
> but the RPL comes from the selector field and the DS selector is not
> validated.

HOST_DS_SEL is validated:

  In the selector field for each of CS, SS, DS, ES, FS, GS and TR, the RPL
  (bits 1:0) and the TI flag (bit 2) must be 0.

> Therefore a subsequent vmentry could fail the access rights tests of 26.3.1.2
> Checks on Guest Segment Registers:

Yes, but this path is loading host state on VM-Exit.

> DS, ES, FS, GS. The DPL cannot be less than the RPL in the selector field if
> (1) the “unrestricted guest” VM-execution control is 0; (2) the register is
> usable; and (3) the Type in the access-rights field is in the range 0 – 11
> (data segment or non-conforming code segment).
> 
> Paolo
>
Paolo Bonzini Sept. 22, 2021, 6:17 p.m. UTC | #5
On 22/09/21 17:52, Sean Christopherson wrote:
> On Wed, Sep 22, 2021, Paolo Bonzini wrote:
>> On 22/09/21 16:46, Sean Christopherson wrote:
>>> On Wed, Sep 22, 2021, Paolo Bonzini wrote:
>>>> On 13/09/21 16:09, Maxim Levitsky wrote:
>>>>>      KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit
>>>
>>> ...
>>>> Queued, thanks.  However, I'm keeping patch 1 for 5.16 only.
>>>
>>> I'm pretty sure the above patch is wrong, emulation_required can simply be
>>> cleared on emulated VM-Exit.
>>
>> Are you sure?
> 
> Pretty sure, but not 100% sure :-)
> 
>> I think you can at least set the host segment fields to a data segment that
>> requires emulation.  For example the DPL of the host DS is hardcoded to zero,
>> but the RPL comes from the selector field and the DS selector is not
>> validated.
> 
> HOST_DS_SEL is validated:
> 
>    In the selector field for each of CS, SS, DS, ES, FS, GS and TR, the RPL
>    (bits 1:0) and the TI flag (bit 2) must be 0.

Ah, I think that's a bug in the manual.  In "27.5.2 Loading Host Segment 
and Descriptor-Table Registers" the reference to 26.3.1.2 should be 
26.2.3 ("Checks on Host Segment and Descriptor-Table Registers").  That 
one does cover all segment registers.  Hmm, who do we ask now about 
fixing Intel manuals?

So yeah, a WARN_ON_ONCE might be in order.  But I don't feel super safe 
making it false when it is possible to make KVM do something that is at 
least sensible.

Paolo

>> Therefore a subsequent vmentry could fail the access rights tests of 26.3.1.2
>> Checks on Guest Segment Registers:
> 
> Yes, but this path is loading host state on VM-Exit.
> 
>> DS, ES, FS, GS. The DPL cannot be less than the RPL in the selector field if
>> (1) the “unrestricted guest” VM-execution control is 0; (2) the register is
>> usable; and (3) the Type in the access-rights field is in the range 0 – 11
>> (data segment or non-conforming code segment).
>>
>> Paolo
>>
>