mbox series

[00/61] KVM: x86: Introduce KVM cpu caps

Message ID 20200201185218.24473-1-sean.j.christopherson@intel.com
Headers show
Series KVM: x86: Introduce KVM cpu caps | expand

Message

Sean Christopherson Feb. 1, 2020, 6:51 p.m. UTC
Introduce what is effectively a KVM-specific copy of the x86_capabilities
array in boot_cpu_data, kvm_cpu_caps.  kvm_cpu_caps is initialized by
copying boot_cpu_data.x86_capabilities before ->hardware_setup().  It is
then updated by KVM's CPUID logic (both common x86 and VMX/SVM specific)
to adjust the caps to reflect the CPU that KVM will expose to the guest.

Super cool things:
  - Kills off 8 kvm_x86_ops hooks.
  - Eliminates a retpoline from pretty much every page fault, and more
    retpolines throughout KVM.
  - Automagically handles selecting the appropriate eax/ebx/ecx/edx
    registers when updating CPUID feature bits.
  - Adds an auditing capability to double check that the function and
    index of a CPUID entry are correct during reverse CPUID lookup.

This is sort of a v2 of "KVM: x86: Purge kvm_x86_ops->*_supported()"[*],
but only a handful of the 26 patches from that series are carried forward
as is, and this series is obviously much more ambitiuous in scope.  And
unlike that series, there isn't a single patch in here that makes me go
"eww", and the end result is pretty awesome :-)

Quick synopsis:
  1. Refactor the KVM_GET_SUPPORTED_CPUID stack to consolidate code,
     remove crustiness, and set the stage for introducing kvm_cpu_caps.

  2. Introduce cpuid_entry_*() accessors/mutators to automatically
     handle retrieving the correct reg from a CPUID entry, and to audit
     that the entry matches the reserve CPUID lookup entry.  The
     cpuid_entry_*() helpers make moving the code from common x86 to
     vendor code much less risky.

  3. Move CPUID adjustments to vendor code in preparation for kvm_cpu_caps,
     which will be initialized at load time before the kvm_x86_ops hooks
     are ready to be used, i.e. before ->hardware_setup().

  4. Introduce kvm_cpu_caps and move all the CPUID code over to kvm_cpu_caps.

  5. Use kvm_cpu_cap_has() to kill off a bunch of ->*_supported() hooks.

  6. Additional cleanup in tangentially related areas to kill off even more
     ->*_supported() hooks.

  7. Profit!

Some of (6) could maybe be moved to a different series, but there would
likely be a number of minor conflicts.  I dropped as many arbitrary cleanup
patches as I could without letting any of the ->*_supported() hooks live,
and without losing confidence in the correctness of the refactoring.

Tested by verifying the output of KVM_GET_SUPPORTED_CPUID is identical
before and after on every patch on a Haswell and Coffee Lake.  Verified
correctness when hiding features via Qemu (running this version of KVM
in L1), e.g. that UMIP is correctly emulated for L2 when it's hidden from
L1, on relevant patches.

Boot tested and ran kvm-unit-tests at key points, e.g. large page handling.

The big untested pieces are PKU, XSAVES and PT on Intel, and everything AMD.

[*] https://lkml.kernel.org/r/20200129234640.8147-1-sean.j.christopherson@intel.com

Sean Christopherson (61):
  KVM: x86: Return -E2BIG when KVM_GET_SUPPORTED_CPUID hits max entries
  KVM: x86: Refactor loop around do_cpuid_func() to separate helper
  KVM: x86: Simplify handling of Centaur CPUID leafs
  KVM: x86: Clean up error handling in kvm_dev_ioctl_get_cpuid()
  KVM: x86: Check userapce CPUID array size after validating sub-leaf
  KVM: x86: Move CPUID 0xD.1 handling out of the index>0 loop
  KVM: x86: Check for CPUID 0xD.N support before validating array size
  KVM: x86: Warn on zero-size save state for valid CPUID 0xD.N sub-leaf
  KVM: x86: Refactor CPUID 0xD.N sub-leaf entry creation
  KVM: x86: Clean up CPUID 0x7 sub-leaf loop
  KVM: x86: Drop the explicit @index from do_cpuid_7_mask()
  KVM: x86: Drop redundant boot cpu checks on SSBD feature bits
  KVM: x86: Consolidate CPUID array max num entries checking
  KVM: x86: Hoist loop counter and terminator to top of
    __do_cpuid_func()
  KVM: x86: Refactor CPUID 0x4 and 0x8000001d handling
  KVM: x86: Encapsulate CPUID entries and metadata in struct
  KVM: x86: Drop redundant array size check
  KVM: x86: Use common loop iterator when handling CPUID 0xD.N
  KVM: VMX: Add helpers to query Intel PT mode
  KVM: x86: Calculate the supported xcr0 mask at load time
  KVM: x86: Use supported_xcr0 to detect MPX support
  KVM: x86: Make kvm_mpx_supported() an inline function
  KVM: x86: Clear output regs for CPUID 0x14 if PT isn't exposed to
    guest
  KVM: x86: Drop explicit @func param from ->set_supported_cpuid()
  KVM: x86: Use u32 for holding CPUID register value in helpers
  KVM: x86: Introduce cpuid_entry_{get,has}() accessors
  KVM: x86: Introduce cpuid_entry_{change,set,clear}() mutators
  KVM: x86: Refactor cpuid_mask() to auto-retrieve the register
  KVM: x86: Add Kconfig-controlled auditing of reverse CPUID lookups
  KVM: x86: Handle MPX CPUID adjustment in VMX code
  KVM: x86: Handle INVPCID CPUID adjustment in VMX code
  KVM: x86: Handle UMIP emulation CPUID adjustment in VMX code
  KVM: x86: Handle PKU CPUID adjustment in VMX code
  KVM: x86: Handle RDTSCP CPUID adjustment in VMX code
  KVM: x86: Handle Intel PT CPUID adjustment in VMX code
  KVM: x86: Handle GBPAGE CPUID adjustment for EPT in VMX code
  KVM: x86: Refactor handling of XSAVES CPUID adjustment
  KVM: x86: Introduce kvm_cpu_caps to replace runtime CPUID masking
  KVM: SVM: Convert feature updates from CPUID to KVM cpu caps
  KVM: VMX: Convert feature updates from CPUID to KVM cpu caps
  KVM: x86: Move XSAVES CPUID adjust to VMX's KVM cpu cap update
  KVM: x86: Add a helper to check kernel support when setting cpu cap
  KVM: x86: Use KVM cpu caps to mark CR4.LA57 as not-reserved
  KVM: x86: Use KVM cpu caps to track UMIP emulation
  KVM: x86: Fold CPUID 0x7 masking back into __do_cpuid_func()
  KVM: x86: Remove the unnecessary loop on CPUID 0x7 sub-leafs
  KVM: x86: Squash CPUID 0x2.0 insanity for modern CPUs
  KVM: x86: Do host CPUID at load time to mask KVM cpu caps
  KVM: x86: Override host CPUID results with kvm_cpu_caps
  KVM: x86: Set emulated/transmuted feature bits via kvm_cpu_caps
  KVM: x86: Use kvm_cpu_caps to detect Intel PT support
  KVM: x86: Use KVM cpu caps to detect MSR_TSC_AUX virt support
  KVM: VMX: Directly use VMX capabilities helper to detect RDTSCP
    support
  KVM: x86: Check for Intel PT MSR virtualization using KVM cpu caps
  KVM: VMX: Directly query Intel PT mode when refreshing PMUs
  KVM: SVM: Refactor logging of NPT enabled/disabled
  KVM: x86/mmu: Merge kvm_{enable,disable}_tdp() into a common function
  KVM: x86/mmu: Configure max page level during hardware setup
  KVM: x86: Don't propagate MMU lpage support to memslot.disallow_lpage
  KVM: Drop largepages_enabled and its accessor/mutator
  KVM: x86: Move VMX's host_efer to common x86 code

 arch/x86/include/asm/kvm_host.h |  15 +-
 arch/x86/kvm/Kconfig            |  10 +
 arch/x86/kvm/cpuid.c            | 771 +++++++++++++++-----------------
 arch/x86/kvm/cpuid.h            | 123 ++++-
 arch/x86/kvm/mmu/mmu.c          |  22 +-
 arch/x86/kvm/svm.c              | 117 ++---
 arch/x86/kvm/vmx/capabilities.h |  25 +-
 arch/x86/kvm/vmx/nested.c       |   2 +-
 arch/x86/kvm/vmx/pmu_intel.c    |   2 +-
 arch/x86/kvm/vmx/vmx.c          | 125 +++---
 arch/x86/kvm/vmx/vmx.h          |   5 +-
 arch/x86/kvm/x86.c              |  48 +-
 arch/x86/kvm/x86.h              |  10 +-
 include/linux/kvm_host.h        |   2 -
 virt/kvm/kvm_main.c             |  13 -
 15 files changed, 662 insertions(+), 628 deletions(-)

Comments

Paolo Bonzini Feb. 25, 2020, 3:25 p.m. UTC | #1
On 25/02/20 16:18, Vitaly Kuznetsov wrote:
> Would it be better or worse if we eliminate set_supported_cpuid() hook
> completely by doing an ugly hack like (completely untested):

Yes, it makes sense.

Paolo
Sean Christopherson Feb. 28, 2020, 1:37 a.m. UTC | #2
On Tue, Feb 25, 2020 at 04:25:34PM +0100, Paolo Bonzini wrote:
> On 25/02/20 16:18, Vitaly Kuznetsov wrote:
> > Would it be better or worse if we eliminate set_supported_cpuid() hook
> > completely by doing an ugly hack like (completely untested):
> 
> Yes, it makes sense.

Works for me, I'll tack it on.  I think my past self kept it because I was
planning on using vmx_set_supported_cpuid() for SGX, which adds multiple
sub-leafs, but I'm pretty sure I can squeeze them into kvm_cpu_caps with
a few extra shenanigans.
Paolo Bonzini Feb. 28, 2020, 7:04 a.m. UTC | #3
On 28/02/20 02:37, Sean Christopherson wrote:
>>> Would it be better or worse if we eliminate set_supported_cpuid() hook
>>> completely by doing an ugly hack like (completely untested):
>> Yes, it makes sense.
> Works for me, I'll tack it on.  I think my past self kept it because I was
> planning on using vmx_set_supported_cpuid() for SGX, which adds multiple
> sub-leafs, but I'm pretty sure I can squeeze them into kvm_cpu_caps with
> a few extra shenanigans.
> 

We can add it back for full CPUID leaves; it may even make sense to move
PT processing there (but not in this series).

Paolo
Sean Christopherson Feb. 29, 2020, 6:32 p.m. UTC | #4
On Tue, Feb 25, 2020 at 04:18:38PM +0100, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> 
> >
> >   7. Profit!
> 
> Would it be better or worse if we eliminate set_supported_cpuid() hook
> completely by doing an ugly hack like (completely untested):
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index a2a091d328c6..5ad291d48e1b 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1145,8 +1145,6 @@ struct kvm_x86_ops {
>  
>         void (*set_tdp_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
>  
> -       void (*set_supported_cpuid)(struct kvm_cpuid_entry2 *entry);
> -
>         bool (*has_wbinvd_exit)(void);
>  
>         u64 (*read_l1_tsc_offset)(struct kvm_vcpu *vcpu);
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index e8beb1e542a8..88431fc02797 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -749,6 +749,16 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>  		cpuid_entry_override(entry, CPUID_8000_0008_EBX);
>  		break;
>  	}
> +	case 0x8000000A:
> +		if (boot_cpu_has(X86_FEATURE_SVM)) {
> +			entry->eax = 1; /* SVM revision 1 */
> +			entry->ebx = 8; /* Lets support 8 ASIDs in case we add proper
> +					   ASID emulation to nested SVM */
> +			entry->ecx = 0; /* Reserved */
> +			entry->edx = 0; /* Per default do not support any
> +					   additional features */

Lucky thing that you suggested this change, patch ("KVM: SVM: Convert
feature updates from CPUID to KVM cpu caps") was buggy in that clearing
entry->edx here would wipe out all X86_FEATURE_NRIPS and X86_FEATURE_NPT.
Only noticed it when moving this code.
Vitaly Kuznetsov March 2, 2020, 9:03 a.m. UTC | #5
Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Tue, Feb 25, 2020 at 04:18:38PM +0100, Vitaly Kuznetsov wrote:
>> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>> 
>> >
>> >   7. Profit!
>> 
>> Would it be better or worse if we eliminate set_supported_cpuid() hook
>> completely by doing an ugly hack like (completely untested):
>> 
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index a2a091d328c6..5ad291d48e1b 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1145,8 +1145,6 @@ struct kvm_x86_ops {
>>  
>>         void (*set_tdp_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
>>  
>> -       void (*set_supported_cpuid)(struct kvm_cpuid_entry2 *entry);
>> -
>>         bool (*has_wbinvd_exit)(void);
>>  
>>         u64 (*read_l1_tsc_offset)(struct kvm_vcpu *vcpu);
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index e8beb1e542a8..88431fc02797 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -749,6 +749,16 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>>  		cpuid_entry_override(entry, CPUID_8000_0008_EBX);
>>  		break;
>>  	}
>> +	case 0x8000000A:
>> +		if (boot_cpu_has(X86_FEATURE_SVM)) {
>> +			entry->eax = 1; /* SVM revision 1 */
>> +			entry->ebx = 8; /* Lets support 8 ASIDs in case we add proper
>> +					   ASID emulation to nested SVM */
>> +			entry->ecx = 0; /* Reserved */
>> +			entry->edx = 0; /* Per default do not support any
>> +					   additional features */
>
> Lucky thing that you suggested this change, patch ("KVM: SVM: Convert
> feature updates from CPUID to KVM cpu caps") was buggy in that clearing
> entry->edx here would wipe out all X86_FEATURE_NRIPS and X86_FEATURE_NPT.
> Only noticed it when moving this code. 
>

I plan to give your v2 a spin on AMD Epyc, just in case)