mbox series

[00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

Message ID 20220627160440.31857-1-vkuznets@redhat.com (mailing list archive)
Headers show
Series KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs | expand

Message

Vitaly Kuznetsov June 27, 2022, 4:04 p.m. UTC
Changes since RFC:
- "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
  infrastructure is later used in other patches [Sean] PATCHes 1-3 added
  to support the change.
- "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
  added [Sean].
- Commit messages added.

vmcs_config is a sanitized version of host VMX MSRs where some controls are
filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are 
discovered, some inconsistencies in controls are detected,...) but
nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
in exposing undesired controls to L1. Switch to using vmcs_config instead.

Sean Christopherson (1):
  KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup

Vitaly Kuznetsov (13):
  KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
  KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
    setup_vmcs_config()
  KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
    in setup_vmcs_config()
  KVM: VMX: Extend VMX controls macro shenanigans
  KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
    setup_vmcs_config()
  KVM: VMX: Add missing VMEXIT controls to vmcs_config
  KVM: VMX: Add missing VMENTRY controls to vmcs_config
  KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
  KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
  KVM: VMX: Store required-1 VMX controls in vmcs_config
  KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
  KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
  KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
    nested MSR

 arch/x86/kvm/vmx/capabilities.h |  16 +--
 arch/x86/kvm/vmx/nested.c       |  37 +++---
 arch/x86/kvm/vmx/nested.h       |   2 +-
 arch/x86/kvm/vmx/vmx.c          | 198 ++++++++++++++------------------
 arch/x86/kvm/vmx/vmx.h          | 118 +++++++++++++++++++
 5 files changed, 229 insertions(+), 142 deletions(-)

Comments

Jim Mattson June 27, 2022, 5:50 p.m. UTC | #1
On Mon, Jun 27, 2022 at 9:04 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Changes since RFC:
> - "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
>   infrastructure is later used in other patches [Sean] PATCHes 1-3 added
>   to support the change.
> - "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
>   added [Sean].
> - Commit messages added.
>
> vmcs_config is a sanitized version of host VMX MSRs where some controls are
> filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are
> discovered, some inconsistencies in controls are detected,...) but
> nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
> in exposing undesired controls to L1. Switch to using vmcs_config instead.
>
> Sean Christopherson (1):
>   KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup
>
> Vitaly Kuznetsov (13):
>   KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
>   KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
>     setup_vmcs_config()
>   KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
>     in setup_vmcs_config()
>   KVM: VMX: Extend VMX controls macro shenanigans
>   KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
>     setup_vmcs_config()
>   KVM: VMX: Add missing VMEXIT controls to vmcs_config
>   KVM: VMX: Add missing VMENTRY controls to vmcs_config
>   KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
>   KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
>   KVM: VMX: Store required-1 VMX controls in vmcs_config
>   KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
>   KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
>   KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
>     nested MSR
>
>  arch/x86/kvm/vmx/capabilities.h |  16 +--
>  arch/x86/kvm/vmx/nested.c       |  37 +++---
>  arch/x86/kvm/vmx/nested.h       |   2 +-
>  arch/x86/kvm/vmx/vmx.c          | 198 ++++++++++++++------------------
>  arch/x86/kvm/vmx/vmx.h          | 118 +++++++++++++++++++
>  5 files changed, 229 insertions(+), 142 deletions(-)
>
> --
> 2.35.3
>

Just checking that this doesn't introduce any backwards-compatibility
issues. That is, all features that were reported as being available in
the past should still be available moving forward.

Thanks,

--jim
Maxim Levitsky June 28, 2022, 9:08 a.m. UTC | #2
On Mon, 2022-06-27 at 18:04 +0200, Vitaly Kuznetsov wrote:
> Changes since RFC:
> - "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
>   infrastructure is later used in other patches [Sean] PATCHes 1-3 added
>   to support the change.
> - "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
>   added [Sean].
> - Commit messages added.
> 
> vmcs_config is a sanitized version of host VMX MSRs where some controls are
> filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are 
> discovered, some inconsistencies in controls are detected,...) but
> nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
> in exposing undesired controls to L1. Switch to using vmcs_config instead.
> 
> Sean Christopherson (1):
>   KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup
> 
> Vitaly Kuznetsov (13):
>   KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
>   KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
>     setup_vmcs_config()
>   KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
>     in setup_vmcs_config()
>   KVM: VMX: Extend VMX controls macro shenanigans
>   KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
>     setup_vmcs_config()
>   KVM: VMX: Add missing VMEXIT controls to vmcs_config
>   KVM: VMX: Add missing VMENTRY controls to vmcs_config
>   KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
>   KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
>   KVM: VMX: Store required-1 VMX controls in vmcs_config
>   KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
>   KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
>   KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
>     nested MSR
> 
>  arch/x86/kvm/vmx/capabilities.h |  16 +--
>  arch/x86/kvm/vmx/nested.c       |  37 +++---
>  arch/x86/kvm/vmx/nested.h       |   2 +-
>  arch/x86/kvm/vmx/vmx.c          | 198 ++++++++++++++------------------
>  arch/x86/kvm/vmx/vmx.h          | 118 +++++++++++++++++++
>  5 files changed, 229 insertions(+), 142 deletions(-)
> 
Sorry that I was a bit out of loop on this, so before I review it,
does this patch series solve the eVMCS issue we had alone,
or we still need the eVMCS version patch series as well?

Best regards,
	Maxim Levitsky
Vitaly Kuznetsov June 28, 2022, 10:04 a.m. UTC | #3
Maxim Levitsky <mlevitsk@redhat.com> writes:

> On Mon, 2022-06-27 at 18:04 +0200, Vitaly Kuznetsov wrote:
>> Changes since RFC:
>> - "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
>>   infrastructure is later used in other patches [Sean] PATCHes 1-3 added
>>   to support the change.
>> - "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
>>   added [Sean].
>> - Commit messages added.
>> 
>> vmcs_config is a sanitized version of host VMX MSRs where some controls are
>> filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are 
>> discovered, some inconsistencies in controls are detected,...) but
>> nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
>> in exposing undesired controls to L1. Switch to using vmcs_config instead.
>> 
>> Sean Christopherson (1):
>>   KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup
>> 
>> Vitaly Kuznetsov (13):
>>   KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
>>   KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
>>     setup_vmcs_config()
>>   KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
>>     in setup_vmcs_config()
>>   KVM: VMX: Extend VMX controls macro shenanigans
>>   KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
>>     setup_vmcs_config()
>>   KVM: VMX: Add missing VMEXIT controls to vmcs_config
>>   KVM: VMX: Add missing VMENTRY controls to vmcs_config
>>   KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
>>   KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
>>   KVM: VMX: Store required-1 VMX controls in vmcs_config
>>   KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
>>   KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
>>   KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
>>     nested MSR
>> 
>>  arch/x86/kvm/vmx/capabilities.h |  16 +--
>>  arch/x86/kvm/vmx/nested.c       |  37 +++---
>>  arch/x86/kvm/vmx/nested.h       |   2 +-
>>  arch/x86/kvm/vmx/vmx.c          | 198 ++++++++++++++------------------
>>  arch/x86/kvm/vmx/vmx.h          | 118 +++++++++++++++++++
>>  5 files changed, 229 insertions(+), 142 deletions(-)
>> 
> Sorry that I was a bit out of loop on this, so before I review it,
> does this patch series solve the eVMCS issue we had alone,
> or we still need the eVMCS version patch series as well?

"[PATCH 00/11] KVM: VMX: Support TscScaling and EnclsExitingBitmap whith
eVMCS" adds new features, namely TSC scaling for both KVM-on-Hyper-V and
Hyper-V-on-KVM. This series doesn't add any features but avoids the
problem reported by Anirudh by properly filtering values in L1 VMX MSRs.

TL;DR: Both series are needed.
Maxim Levitsky June 28, 2022, 10:08 a.m. UTC | #4
On Tue, 2022-06-28 at 12:04 +0200, Vitaly Kuznetsov wrote:
> Maxim Levitsky <mlevitsk@redhat.com> writes:
> 
> > On Mon, 2022-06-27 at 18:04 +0200, Vitaly Kuznetsov wrote:
> > > Changes since RFC:
> > > - "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
> > >   infrastructure is later used in other patches [Sean] PATCHes 1-3 added
> > >   to support the change.
> > > - "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
> > >   added [Sean].
> > > - Commit messages added.
> > > 
> > > vmcs_config is a sanitized version of host VMX MSRs where some controls are
> > > filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are 
> > > discovered, some inconsistencies in controls are detected,...) but
> > > nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
> > > in exposing undesired controls to L1. Switch to using vmcs_config instead.
> > > 
> > > Sean Christopherson (1):
> > >   KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup
> > > 
> > > Vitaly Kuznetsov (13):
> > >   KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
> > >   KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
> > >     setup_vmcs_config()
> > >   KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
> > >     in setup_vmcs_config()
> > >   KVM: VMX: Extend VMX controls macro shenanigans
> > >   KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
> > >     setup_vmcs_config()
> > >   KVM: VMX: Add missing VMEXIT controls to vmcs_config
> > >   KVM: VMX: Add missing VMENTRY controls to vmcs_config
> > >   KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
> > >   KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
> > >   KVM: VMX: Store required-1 VMX controls in vmcs_config
> > >   KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
> > >   KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
> > >   KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
> > >     nested MSR
> > > 
> > >  arch/x86/kvm/vmx/capabilities.h |  16 +--
> > >  arch/x86/kvm/vmx/nested.c       |  37 +++---
> > >  arch/x86/kvm/vmx/nested.h       |   2 +-
> > >  arch/x86/kvm/vmx/vmx.c          | 198 ++++++++++++++------------------
> > >  arch/x86/kvm/vmx/vmx.h          | 118 +++++++++++++++++++
> > >  5 files changed, 229 insertions(+), 142 deletions(-)
> > > 
> > Sorry that I was a bit out of loop on this, so before I review it,
> > does this patch series solve the eVMCS issue we had alone,
> > or we still need the eVMCS version patch series as well?
> 
> "[PATCH 00/11] KVM: VMX: Support TscScaling and EnclsExitingBitmap whith
> eVMCS" adds new features, namely TSC scaling for both KVM-on-Hyper-V and
> Hyper-V-on-KVM. This series doesn't add any features but avoids the
> problem reported by Anirudh by properly filtering values in L1 VMX MSRs.
> 
> TL;DR: Both series are needed.
> 

Roger that!

Best regards,
	Maxim Levitsky
Vitaly Kuznetsov June 28, 2022, 2:04 p.m. UTC | #5
Jim Mattson <jmattson@google.com> writes:

> On Mon, Jun 27, 2022 at 9:04 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>>
>> Changes since RFC:
>> - "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
>>   infrastructure is later used in other patches [Sean] PATCHes 1-3 added
>>   to support the change.
>> - "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
>>   added [Sean].
>> - Commit messages added.
>>
>> vmcs_config is a sanitized version of host VMX MSRs where some controls are
>> filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are
>> discovered, some inconsistencies in controls are detected,...) but
>> nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
>> in exposing undesired controls to L1. Switch to using vmcs_config instead.
>>
>> Sean Christopherson (1):
>>   KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup
>>
>> Vitaly Kuznetsov (13):
>>   KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
>>   KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
>>     setup_vmcs_config()
>>   KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
>>     in setup_vmcs_config()
>>   KVM: VMX: Extend VMX controls macro shenanigans
>>   KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
>>     setup_vmcs_config()
>>   KVM: VMX: Add missing VMEXIT controls to vmcs_config
>>   KVM: VMX: Add missing VMENTRY controls to vmcs_config
>>   KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
>>   KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
>>   KVM: VMX: Store required-1 VMX controls in vmcs_config
>>   KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
>>   KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
>>   KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
>>     nested MSR
>>
>>  arch/x86/kvm/vmx/capabilities.h |  16 +--
>>  arch/x86/kvm/vmx/nested.c       |  37 +++---
>>  arch/x86/kvm/vmx/nested.h       |   2 +-
>>  arch/x86/kvm/vmx/vmx.c          | 198 ++++++++++++++------------------
>>  arch/x86/kvm/vmx/vmx.h          | 118 +++++++++++++++++++
>>  5 files changed, 229 insertions(+), 142 deletions(-)
>>
>> --
>> 2.35.3
>>
>
> Just checking that this doesn't introduce any backwards-compatibility
> issues. That is, all features that were reported as being available in
> the past should still be available moving forward.
>

All the controls nested_vmx_setup_ctls_msrs() set are in the newly
introduced KVM_REQ_VMX_*/KVM_OPT_VMX_* sets so we should be good here
(unless I screwed up, of course).

There's going to be some changes though. E.g this series was started by
Anirudh's report when KVM was exposing SECONDARY_EXEC_TSC_SCALING while
running on KVM and using eVMCS which doesn't support the control. This
is a bug and I don't think we need and 'bug compatibility' here.

Another change is that VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL/
VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL will now be filtered out on the
"broken" CPUs (the list is in setup_vmcs_config()). I *think* this is
also OK but if not, we can move the filtering to vmx_vmentry_ctrl()/
vmx_vmexit_ctrl().
Jim Mattson June 28, 2022, 3:28 p.m. UTC | #6
On Tue, Jun 28, 2022 at 7:04 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Jim Mattson <jmattson@google.com> writes:
>
> > On Mon, Jun 27, 2022 at 9:04 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >>
> >> Changes since RFC:
> >> - "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
> >>   infrastructure is later used in other patches [Sean] PATCHes 1-3 added
> >>   to support the change.
> >> - "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
> >>   added [Sean].
> >> - Commit messages added.
> >>
> >> vmcs_config is a sanitized version of host VMX MSRs where some controls are
> >> filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are
> >> discovered, some inconsistencies in controls are detected,...) but
> >> nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
> >> in exposing undesired controls to L1. Switch to using vmcs_config instead.
> >>
> >> Sean Christopherson (1):
> >>   KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup
> >>
> >> Vitaly Kuznetsov (13):
> >>   KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
> >>   KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
> >>     setup_vmcs_config()
> >>   KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
> >>     in setup_vmcs_config()
> >>   KVM: VMX: Extend VMX controls macro shenanigans
> >>   KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
> >>     setup_vmcs_config()
> >>   KVM: VMX: Add missing VMEXIT controls to vmcs_config
> >>   KVM: VMX: Add missing VMENTRY controls to vmcs_config
> >>   KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
> >>   KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
> >>   KVM: VMX: Store required-1 VMX controls in vmcs_config
> >>   KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
> >>   KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
> >>   KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
> >>     nested MSR
> >>
> >>  arch/x86/kvm/vmx/capabilities.h |  16 +--
> >>  arch/x86/kvm/vmx/nested.c       |  37 +++---
> >>  arch/x86/kvm/vmx/nested.h       |   2 +-
> >>  arch/x86/kvm/vmx/vmx.c          | 198 ++++++++++++++------------------
> >>  arch/x86/kvm/vmx/vmx.h          | 118 +++++++++++++++++++
> >>  5 files changed, 229 insertions(+), 142 deletions(-)
> >>
> >> --
> >> 2.35.3
> >>
> >
> > Just checking that this doesn't introduce any backwards-compatibility
> > issues. That is, all features that were reported as being available in
> > the past should still be available moving forward.
> >
>
> All the controls nested_vmx_setup_ctls_msrs() set are in the newly
> introduced KVM_REQ_VMX_*/KVM_OPT_VMX_* sets so we should be good here
> (unless I screwed up, of course).
>
> There's going to be some changes though. E.g this series was started by
> Anirudh's report when KVM was exposing SECONDARY_EXEC_TSC_SCALING while
> running on KVM and using eVMCS which doesn't support the control. This
> is a bug and I don't think we need and 'bug compatibility' here.

You cannot force VM termination on a kernel upgrade. On live migration
from an older kernel, the new kernel must be willing to accept the
suspended state of a VM that was running under the older kernel. In
particular, the new KVM_SET_MSRS must accept the values of the VMX
capability MSRS that userspace obtains from the older KVM_GET_MSRS. I
don't know if this is what you are referring to as "bug
compatibility," but if it is, then we absolutely do need it.

> Another change is that VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL/
> VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL will now be filtered out on the
> "broken" CPUs (the list is in setup_vmcs_config()). I *think* this is
> also OK but if not, we can move the filtering to vmx_vmentry_ctrl()/
> vmx_vmexit_ctrl().
>
> --
> Vitaly
>
Vitaly Kuznetsov June 28, 2022, 4:01 p.m. UTC | #7
Jim Mattson <jmattson@google.com> writes:

> On Tue, Jun 28, 2022 at 7:04 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>>

...

>> Jim Mattson <jmattson@google.com> writes:
>>
>> > Just checking that this doesn't introduce any backwards-compatibility
>> > issues. That is, all features that were reported as being available in
>> > the past should still be available moving forward.
>> >
>>
>> All the controls nested_vmx_setup_ctls_msrs() set are in the newly
>> introduced KVM_REQ_VMX_*/KVM_OPT_VMX_* sets so we should be good here
>> (unless I screwed up, of course).
>>
>> There's going to be some changes though. E.g this series was started by
>> Anirudh's report when KVM was exposing SECONDARY_EXEC_TSC_SCALING while
>> running on KVM and using eVMCS which doesn't support the control. This
>> is a bug and I don't think we need and 'bug compatibility' here.
>
> You cannot force VM termination on a kernel upgrade. On live migration
> from an older kernel, the new kernel must be willing to accept the
> suspended state of a VM that was running under the older kernel. In
> particular, the new KVM_SET_MSRS must accept the values of the VMX
> capability MSRS that userspace obtains from the older KVM_GET_MSRS. I
> don't know if this is what you are referring to as "bug
> compatibility," but if it is, then we absolutely do need it.
>

Oh, right you are, we do seem to have a problem. Even for eVMCS case,
the fact that we expose a feature which can't be used in VMX control
MSRs doesn't mean that the VM is broken. In particular, the VM may not
be using VMX features at all. Same goes to PERF_GLOBAL_CTRL errata.

vmx_restore_control_msr() currenly does strict checking of the supplied
data against what was initially set by nested_vmx_setup_ctls_msrs(),
this basically means we cannot drop feature bits, just add them. Out of
top of my head I don't see a solution other than relaxing the check by
introducing a "revoke list"... Another questions is whether we want
guest visible MSR value to remain like it was before migration or we can
be brave and clear 'broken' feature bits there (the features are
'broken' so they couldn't be in use, right?). I'm not sure.

Anirudh, the same concern applies to your 'intermediate' patch too.

Smart ideas on what can be done are more than welcome)
Jim Mattson June 28, 2022, 5:11 p.m. UTC | #8
On Tue, Jun 28, 2022 at 9:01 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Jim Mattson <jmattson@google.com> writes:
>
> > On Tue, Jun 28, 2022 at 7:04 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >>
>
> ...
>
> >> Jim Mattson <jmattson@google.com> writes:
> >>
> >> > Just checking that this doesn't introduce any backwards-compatibility
> >> > issues. That is, all features that were reported as being available in
> >> > the past should still be available moving forward.
> >> >
> >>
> >> All the controls nested_vmx_setup_ctls_msrs() set are in the newly
> >> introduced KVM_REQ_VMX_*/KVM_OPT_VMX_* sets so we should be good here
> >> (unless I screwed up, of course).
> >>
> >> There's going to be some changes though. E.g this series was started by
> >> Anirudh's report when KVM was exposing SECONDARY_EXEC_TSC_SCALING while
> >> running on KVM and using eVMCS which doesn't support the control. This
> >> is a bug and I don't think we need and 'bug compatibility' here.
> >
> > You cannot force VM termination on a kernel upgrade. On live migration
> > from an older kernel, the new kernel must be willing to accept the
> > suspended state of a VM that was running under the older kernel. In
> > particular, the new KVM_SET_MSRS must accept the values of the VMX
> > capability MSRS that userspace obtains from the older KVM_GET_MSRS. I
> > don't know if this is what you are referring to as "bug
> > compatibility," but if it is, then we absolutely do need it.
> >
>
> Oh, right you are, we do seem to have a problem. Even for eVMCS case,
> the fact that we expose a feature which can't be used in VMX control
> MSRs doesn't mean that the VM is broken. In particular, the VM may not
> be using VMX features at all. Same goes to PERF_GLOBAL_CTRL errata.
>
> vmx_restore_control_msr() currenly does strict checking of the supplied
> data against what was initially set by nested_vmx_setup_ctls_msrs(),
> this basically means we cannot drop feature bits, just add them. Out of
> top of my head I don't see a solution other than relaxing the check by
> introducing a "revoke list"... Another questions is whether we want
> guest visible MSR value to remain like it was before migration or we can
> be brave and clear 'broken' feature bits there (the features are
> 'broken' so they couldn't be in use, right?). I'm not sure.

Read-only MSRs cannot be changed after their values may have been
observed by the guest.

> Anirudh, the same concern applies to your 'intermediate' patch too.
>
> Smart ideas on what can be done are more than welcome)

You could define a bunch of "quirks," and userspace could use
KVM_CAP_DISABLE_QUIRKS2 to ask that the broken bits be cleared.
Vitaly Kuznetsov June 29, 2022, 9:06 a.m. UTC | #9
Jim Mattson <jmattson@google.com> writes:

> On Tue, Jun 28, 2022 at 9:01 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>>

...

>
> Read-only MSRs cannot be changed after their values may have been
> observed by the guest.
>
>> Anirudh, the same concern applies to your 'intermediate' patch too.
>>
>> Smart ideas on what can be done are more than welcome)
>
> You could define a bunch of "quirks," and userspace could use
> KVM_CAP_DISABLE_QUIRKS2 to ask that the broken bits be cleared.

This sounds correct, but awful :-) I, however, think we can avoid this.

For the KVM-on-eVMCS case:
- When combined with "[PATCH 00/11] KVM: VMX: Support TscScaling and
EnclsExitingBitmap whith eVMCS" series
(https://lore.kernel.org/kvm/20220621155830.60115-1-vkuznets@redhat.com/),
the filtering we do in setup_vmcs_config() is no longer needed. I need
to check various available Hyper-V versions but my initial investigation
shows that we were only filtering out TSC Scaling and 'Load
IA32_PERF_GLOBAL_CTRL' vmexit/vmentry, the rest were never present in
VMX control MSRs (as presented by Hyper-V) in the first place.

For PERF_GLOBAL_CTRL errata:
- We can move the filtering to vmx_vmexit_ctrl()/vmx_vmentry_ctrl()
preserving the status quo: KVM doesn't use the feature but it is exposed
to L1 hypervisor (and L1 hypervisor presumably has the same check and
doesn't use the feature. FWIW, the workaround was added in 2011 and the
erratas it references appeared in 2010, this means that the affected
CPUs are quite old, modern proprietary hypervisors won't likely boot
there).

If we do the above, there's going to be no changes to VMX control MSRs
generated by nested_vmx_setup_ctls_msrs(). I, however, need to work on
a combined series.
Jim Mattson June 29, 2022, 10:27 p.m. UTC | #10
On Wed, Jun 29, 2022 at 2:06 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> For PERF_GLOBAL_CTRL errata:
> - We can move the filtering to vmx_vmexit_ctrl()/vmx_vmentry_ctrl()
> preserving the status quo: KVM doesn't use the feature but it is exposed
> to L1 hypervisor (and L1 hypervisor presumably has the same check and
> doesn't use the feature. FWIW, the workaround was added in 2011 and the
> erratas it references appeared in 2010, this means that the affected
> CPUs are quite old, modern proprietary hypervisors won't likely boot
> there).
Sadly, Nehalem and Westmere are well-supported by KVM today, and we
will probably still continue to support them for at least another
decade. They both have EPT, unrestricted guest, and other VT-x2
features that KVM still considers optional.
Sean Christopherson July 6, 2022, 10:30 p.m. UTC | #11
On Wed, Jun 29, 2022, Jim Mattson wrote:
> On Wed, Jun 29, 2022 at 2:06 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> 
> > For PERF_GLOBAL_CTRL errata:
> > - We can move the filtering to vmx_vmexit_ctrl()/vmx_vmentry_ctrl()
> > preserving the status quo: KVM doesn't use the feature but it is exposed
> > to L1 hypervisor (and L1 hypervisor presumably has the same check and
> > doesn't use the feature. FWIW, the workaround was added in 2011 and the
> > erratas it references appeared in 2010, this means that the affected
> > CPUs are quite old, modern proprietary hypervisors won't likely boot
> > there).
> Sadly, Nehalem and Westmere are well-supported by KVM today, and we
> will probably still continue to support them for at least another
> decade. They both have EPT, unrestricted guest, and other VT-x2
> features that KVM still considers optional.

Nehalem doesn't have unrestricted guest.  Nehalem is the only generation with EPT
but not unrestricted guest.