diff mbox series

[RFC,2/4] target/i386: define CPU models to model x86-64 ABI levels

Message ID 20210201153606.4158076-3-berrange@redhat.com (mailing list archive)
State New, archived
Series target/i386/cpu: introduce new CPU models for x86-64 ABI levels

Commit Message

Daniel P. Berrangé Feb. 1, 2021, 3:36 p.m. UTC
To paraphrase:

  https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level/

In 2020, AMD, Intel, Red Hat, and SUSE worked together to define
three microarchitecture levels on top of the historical x86-64
baseline:

  * x86-64:    original x86_64 baseline instruction set
  * x86-64-v2: vector instructions up to Streaming SIMD
               Extensions 4.2 (SSE4.2) and Supplemental
               Streaming SIMD Extensions 3 (SSSE3), the
               POPCNT instruction, and CMPXCHG16B
  * x86-64-v3: vector instructions up to AVX2, MOVBE,
               and additional bit-manipulation instructions.
  * x86-64-v4: vector instructions from some of the
               AVX-512 variants.

This list of features is defined in the doc at:

  https://gitlab.com/x86-psABIs/x86-64-ABI/
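As a rough illustration of how the levels nest, the sketch below classifies a feature set by the highest level it satisfies. The ABI_F_* flags and the function name are invented for this example and are not QEMU's CPUID_* macros. The per-level sets are an abbreviated subset (the features named above plus the BMI and AVX-512 bits used in the patch), and the baseline is assumed present; the psABI document remains the authoritative list.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative flags for this sketch only (hypothetical, not QEMU macros). */
enum {
    ABI_F_SSSE3    = 1u << 0,
    ABI_F_SSE4_2   = 1u << 1,
    ABI_F_POPCNT   = 1u << 2,
    ABI_F_CX16     = 1u << 3,   /* CMPXCHG16B */
    ABI_F_AVX2     = 1u << 4,
    ABI_F_MOVBE    = 1u << 5,
    ABI_F_BMI1     = 1u << 6,
    ABI_F_BMI2     = 1u << 7,
    ABI_F_AVX512F  = 1u << 8,
    ABI_F_AVX512DQ = 1u << 9,
    ABI_F_AVX512CD = 1u << 10,
    ABI_F_AVX512BW = 1u << 11,
    ABI_F_AVX512VL = 1u << 12,

    /* Features each level adds on top of the previous one (abbreviated). */
    ABI_V2_ADDED = ABI_F_SSSE3 | ABI_F_SSE4_2 | ABI_F_POPCNT | ABI_F_CX16,
    ABI_V3_ADDED = ABI_F_AVX2 | ABI_F_MOVBE | ABI_F_BMI1 | ABI_F_BMI2,
    ABI_V4_ADDED = ABI_F_AVX512F | ABI_F_AVX512DQ | ABI_F_AVX512CD |
                   ABI_F_AVX512BW | ABI_F_AVX512VL,
};

/*
 * Return the highest ABI level (1..4) whose added features are all
 * present in 'have'. The levels are cumulative, so a gap at one level
 * caps the result even if higher-level features are present.
 */
int x86_64_abi_level(uint32_t have)
{
    static const uint32_t added[] = {
        ABI_V2_ADDED, ABI_V3_ADDED, ABI_V4_ADDED,
    };
    int level = 1; /* baseline assumed */

    for (size_t i = 0; i < sizeof(added) / sizeof(added[0]); i++) {
        if ((have & added[i]) != added[i]) {
            break;
        }
        level++;
    }
    return level;
}
```

Because the loop stops at the first unsatisfied level, a CPU that has the AVX-512 bits but lacks, say, MOVBE is still reported as level 2.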

QEMU has historically defaulted to the "qemu64" CPU model on
x86_64 targets, which is approximately the x86-64 baseline
ABI, with 'SVM' added.

It might be desirable for QEMU to provide CPU models
that closely correspond to the ABI levels, while offering portability
across the maximum number of physical CPUs.

Historically we've found that defining CPU models with an arbitrary
combination of CPU features can be problematic, as some guest OSes
do not check every feature they use, and instead assume that if
they see feature "XX", then "YY" will always exist. This is reasonable
on bare metal, but subject to breakage in virtualization.
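To make the "XX implies YY" hazard concrete, here is a minimal sketch contrasting an inferred check with a direct one. The SSE4.2/POPCNT pairing is only a hypothetical stand-in for "XX" and "YY", and the function names are invented; the bit positions are those of CPUID.01H:ECX.

```c
#include <stdbool.h>
#include <stdint.h>

#define LEAF1_ECX_SSE4_2 (1u << 20) /* CPUID.01H:ECX.sse4.2 [bit 20] */
#define LEAF1_ECX_POPCNT (1u << 23) /* CPUID.01H:ECX.popcnt [bit 23] */

/*
 * Fragile: infers POPCNT from SSE4.2. This tends to hold on physical
 * CPUs, but a virtual CPU model may expose one bit without the other.
 */
bool can_use_popcnt_fragile(uint32_t leaf1_ecx)
{
    return (leaf1_ecx & LEAF1_ECX_SSE4_2) != 0;
}

/* Robust: tests the exact feature bit that the code depends on. */
bool can_use_popcnt(uint32_t leaf1_ecx)
{
    return (leaf1_ecx & LEAF1_ECX_POPCNT) != 0;
}
```

With an ECX value that has only bit 20 set, the fragile check returns true while the robust one returns false, which is the kind of divergence an arbitrary feature combination can trigger in a guest.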

Thus in defining the CPU models for the ABI levels, this patch attempts
to base each of them on an existing CPU model.

While each x86-64-abiNNN has a designated vendor, they are designed
to be vendor-agnostic models, i.e. they are capable of running on any
AMD or Intel physical CPU model that satisfies the ABI level. E.g.
although the x86-64-abi2 model is based on Nehalem, it should be
able to run guests on an Opteron_G4/G5/EPYC host, since those CPUs
support the features required by the x86-64 v2 ABI.

More precisely the models were defined as:

 * x86-64-abi1: close match for Opteron_G1, minus
                vme
 * x86-64-abi2: perfect match for Nehalem
 * x86-64-abi3: close match of Haswell-noTSX, minus
                aes pcid erms invpcid tsc-deadline
                x2apic pclmulqdq
 * x86-64-abi4: close match of Skylake-Server-noTSX-IBRS, minus
                spec-ctrl
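The "minus" lists above can be thought of as the result of intersecting feature words across every model that satisfies a level, then comparing the chosen base model against that intersection. The following is only a sketch of that idea: ModelSketch, N_WORDS and both function names are hypothetical and do not mirror QEMU's X86CPUDefinition or how the lists were actually produced.

```c
#include <stddef.h>
#include <stdint.h>

#define N_WORDS 4 /* hypothetical number of feature words per model */

typedef struct {
    const char *name;
    uint32_t words[N_WORDS]; /* one bitmask per CPUID feature word */
} ModelSketch;

/*
 * Lowest common denominator: a feature bit survives only if every
 * candidate model sets it, i.e. a bitwise AND per feature word.
 * With no candidates the result is all zeroes.
 */
void feature_lcd(const ModelSketch *models, size_t n, uint32_t out[N_WORDS])
{
    for (size_t w = 0; w < N_WORDS; w++) {
        out[w] = n ? UINT32_MAX : 0;
        for (size_t i = 0; i < n; i++) {
            out[w] &= models[i].words[w];
        }
    }
}

/* Bits set in the base model but absent from the LCD: the "minus" list. */
void features_to_strip(const uint32_t base[N_WORDS],
                       const uint32_t lcd[N_WORDS],
                       uint32_t out[N_WORDS])
{
    for (size_t w = 0; w < N_WORDS; w++) {
        out[w] = base[w] & ~lcd[w];
    }
}
```

An empty strip mask corresponds to the "perfect match" case, where the base model already equals the common denominator.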

None of the CPU models declare any VMX/SVM features. This would
make them unable to support nested virtualization with live
migration.

Given their vendor-agnostic design, these CPU models are primarily
thought to be useful as the default CPU model for machine types. QEMU
upstream is quite conservative in mandating new hardware features,
but a downstream vendor may choose to mandate a newer x86-64 ABI
level for downstream only machine types.

Note that TCG is currently unable to support the two newest ABI levels,
as it lacks these features:

* x86-64-abi3:

  CPUID.01H:ECX.fma [bit 12]
  CPUID.01H:ECX.avx [bit 28]
  CPUID.01H:ECX.f16c [bit 29]
  CPUID.07H:EBX.avx2 [bit 5]

* x86-64-abi4:

  CPUID.01H:ECX.pcid [bit 17]
  CPUID.01H:ECX.x2apic [bit 21]
  CPUID.01H:ECX.tsc-deadline [bit 24]
  CPUID.07H:EBX.invpcid [bit 10]
  CPUID.07H:EBX.avx512f [bit 16]
  CPUID.07H:EBX.avx512dq [bit 17]
  CPUID.07H:EBX.rdseed [bit 18]
  CPUID.07H:EBX.avx512cd [bit 28]
  CPUID.07H:EBX.avx512bw [bit 30]
  CPUID.07H:EBX.avx512vl [bit 31]
  CPUID.80000001H:ECX.3dnowprefetch [bit 8]
  CPUID.0DH:EAX.xsavec [bit 1]
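For readers unfamiliar with the notation in these lists, each entry names a CPUID leaf, an output register, and a bit within it. A small sketch of how such an entry could be represented and tested follows; the FeatureBit type, the SK_* enum and the function name are invented for illustration, and only the x86-64-abi3 entries are copied in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Index into the four CPUID output registers (names local to this sketch). */
typedef enum { SK_EAX, SK_EBX, SK_ECX, SK_EDX } SketchReg;

typedef struct {
    uint32_t leaf;    /* CPUID input leaf, e.g. 0x07 */
    SketchReg reg;    /* which output register holds the bit */
    unsigned bit;     /* bit position within that register */
    const char *name; /* feature name as printed in the list */
} FeatureBit;

/* The x86-64-abi3 entries from the list above, in table form. */
static const FeatureBit abi3_tcg_missing[] = {
    { 0x01, SK_ECX, 12, "fma"  },
    { 0x01, SK_ECX, 28, "avx"  },
    { 0x01, SK_ECX, 29, "f16c" },
    { 0x07, SK_EBX,  5, "avx2" },
};

/*
 * 'regs' holds EAX, EBX, ECX, EDX as returned for f->leaf.
 * Returns true if the feature's bit is set in the relevant register.
 */
bool feature_bit_set(const FeatureBit *f, const uint32_t regs[4])
{
    return ((regs[f->reg] >> f->bit) & 1u) != 0;
}
```

The caller is responsible for supplying the register values of the matching leaf; the sketch deliberately does not execute CPUID itself so that it stays portable.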

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 docs/system/cpu-models-x86-abi.csv |   8 ++
 target/i386/cpu.c                  | 156 +++++++++++++++++++++++++++++
 2 files changed, 164 insertions(+)

Comments

David Edmondson Feb. 2, 2021, 9:46 a.m. UTC | #1
On Monday, 2021-02-01 at 15:36:04 GMT, Daniel P. Berrangé wrote:

> To paraphrase:
>
>   https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level/
>
> In 2020, AMD, Intel, Red Hat, and SUSE worked together to define
> three microarchitecture levels on top of the historical x86-64
> baseline:
>
>   * x86-64:    original x86_64 baseline instruction set
>   * x86-64-v2: vector instructions up to Streaming SIMD
>                Extensions 4.2 (SSE4.2)  and Supplemental
> 	       Streaming SIMD Extensions 3 (SSSE3), the
> 	       POPCNT instruction, and CMPXCHG16B
>   * x86-64-v3: vector instructions up to AVX2, MOVBE,
>                and additional bit-manipulation instructions.
>   * x86-64-v4: vector instructions from some of the
>                AVX-512 variants.
>
> This list of features is defined in the doc at:
>
>   https://gitlab.com/x86-psABIs/x86-64-ABI/
>
> QEMU has historically defaulted to the "qemu64" CPU model on
> x86_64 targets, which is approximately the x86-64 baseline
> ABI, with 'SVM' added.
>
> It is thought it might be desirable if QEMU could provide CPU models
> that closely correspond to the ABI levels, while offering portability
> across the maximum number of physical CPUs.
>
> Historically we've found that defining CPU models with an arbitrary
> combination of CPU features can be problematic, as some guest OS
> will not check all features they use, and instead assume that if
> they see feature "XX", then "YY" will always exist. This is reasonable
> in bare metal, but subject to breakage in virtualization.
>
> Thus in defining the CPI models for the ABI levels, this patch attempted

s/CPI/CPU/

> to base them off an existing CPU model.
>
> While each x86-64-abiNNN has a designated vendor, they are designed
> to be vendor agnostic models. ie they are capable of running on any
> AMD or Intel physical CPU model that satisfies the ABI level. eg

Only AMD or Intel? (You mention the Hugon chips elsewhere.)

> althgouh the x86-64-abi2 model is based on Nehalem, it should be
> able to run guests on an Opteron_G4/G5/EPYC host, since those CPUs
> support the features required by the x86-64 v2 ABI.
>
> More precisely the models were defined as:
>
>  * x86-64-abi1: close match for Opteron_G1, minus
>                 vme
>  * x86-64-abi2: perfect match for Nehalem
>  * x86-64-abi3: close match of Haswell-noTSX, minus
>                 aes pcid erms invpcid tsc-deadline
> 		x2apic pclmulqdq
>  * x86-64-abi4: close match of Skylake-Server-noTSX-IBRS, minus
>                 spec-ctrl
>
> None of the CPU models declare any VMX/SVM features. This would
> make them unable to support nested virtualization with live
> migration.

How about "Unable to support hardware accelerated nested
virtualization." ?

Is live migration relevant?

> Given their vendor agnostic design, these CPU models are primarily
> though to useful as the default CPU model for machine types. QEMU
> upstream is quite conservative in mandating new hardware features,
> but a downstream vendor may choose to mandate a newer x86-64 ABI
> level for downstream only machine types.
>
> Note that TCG is not capable of supporting the 2 newest ABI levels
> currently:
>
> * x86-64-abi3:
>
>   CPUID.01H:ECX.fma [bit 12]
>   CPUID.01H:ECX.avx [bit 28]
>   CPUID.01H:ECX.f16c [bit 29]
>   CPUID.07H:EBX.avx2 [bit 5]
>
> * x86-64-abi4:
>
>   CPUID.01H:ECX.pcid [bit 17]
>   CPUID.01H:ECX.x2apic [bit 21]
>   CPUID.01H:ECX.tsc-deadline [bit 24]
>   CPUID.07H:EBX.invpcid [bit 10]
>   CPUID.07H:EBX.avx512f [bit 16]
>   CPUID.07H:EBX.avx512dq [bit 17]
>   CPUID.07H:EBX.rdseed [bit 18]
>   CPUID.07H:EBX.avx512cd [bit 28]
>   CPUID.07H:EBX.avx512bw [bit 30]
>   CPUID.07H:EBX.avx512vl [bit 31]
>   CPUID.80000001H:ECX.3dnowprefetch [bit 8]
>   CPUID.0DH:EAX.xsavec [bit 1]
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  docs/system/cpu-models-x86-abi.csv |   8 ++
>  target/i386/cpu.c                  | 156 +++++++++++++++++++++++++++++
>  2 files changed, 164 insertions(+)
>
> diff --git a/docs/system/cpu-models-x86-abi.csv b/docs/system/cpu-models-x86-abi.csv
> index 4565e6a535..d34d95d485 100644
> --- a/docs/system/cpu-models-x86-abi.csv
> +++ b/docs/system/cpu-models-x86-abi.csv
> @@ -119,3 +119,11 @@ qemu32,,,,
>  qemu32-v1,,,,
>  qemu64,✅,,,
>  qemu64-v1,✅,,,
> +x86-64-abi1,✅,,,
> +x86-64-abi1-v1,✅,,,
> +x86-64-abi2,✅,✅,,
> +x86-64-abi2-v1,✅,✅,,
> +x86-64-abi3,✅,✅,✅,
> +x86-64-abi3-v1,✅,✅,✅,
> +x86-64-abi4,✅,✅,✅,✅
> +x86-64-abi4-v1,✅,✅,✅,✅
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index ae89024d36..87a775a5eb 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -1827,6 +1827,162 @@ static CPUCaches epyc_rome_cache_info = {
>   */
>  
>  static X86CPUDefinition builtin_x86_defs[] = {
> +    /*
> +     * These first few CPU models are designed to satisfy the
> +     * x86_64 ABI levels defined in:
> +     *
> +     *   https://gitlab.com/x86-psABIs/x86-64-ABI/
> +     *
> +     * They were constructed as follows:
> +     *
> +     *   - Find all the CPU models which can satisfy the ABI
> +     *   - Calculate the lowest common denominator (LCD) of these
> +     *     models' features
> +     *   - Find the named model most closely matching the LCD
> +     *   - Strip its features back to the LCD
> +     *
> +     * The above spec uses the "x86-64-vNN" naming convention.
> +     * This clashes with the "vNN" suffix QEMU uses for versioning.
> +     * Thus we use "abiNNN" as a suffix.
> +     */
> +    {
> +        /*
> +         * Derived from Opteron_G1, minus
> +         *   vme
> +         */
> +        .name = "x86-64-abi1",
> +        .level = 5,
> +        .vendor = CPUID_VENDOR_AMD,
> +        .family = 15,
> +        .model = 6,
> +        .stepping = 1,
> +        .features[FEAT_1_EDX] =
> +            CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
> +            CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
> +            CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
> +            CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE |
> +            CPUID_DE | CPUID_FP87,
> +        .features[FEAT_1_ECX] =
> +            CPUID_EXT_SSE3,
> +        .features[FEAT_8000_0001_EDX] =
> +            CPUID_EXT2_LM | CPUID_EXT2_NX | CPUID_EXT2_SYSCALL,
> +        .xlevel = 0x80000008,
> +        .model_id = "QEMU x86-64 baseline ABI",
> +    },
> +    {
> +        /* Derived from Nehalem */
> +        .name = "x86-64-abi2",
> +        .level = 11,
> +        .vendor = CPUID_VENDOR_INTEL,
> +        .family = 6,
> +        .model = 26,
> +        .stepping = 3,
> +        .features[FEAT_1_EDX] =
> +            CPUID_VME | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
> +            CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
> +            CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
> +            CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE |
> +            CPUID_DE | CPUID_FP87,
> +        .features[FEAT_1_ECX] =
> +            CPUID_EXT_POPCNT | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
> +            CPUID_EXT_CX16 | CPUID_EXT_SSSE3 | CPUID_EXT_SSE3,
> +        .features[FEAT_8000_0001_EDX] =
> +            CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX,
> +        .features[FEAT_8000_0001_ECX] =
> +            CPUID_EXT3_LAHF_LM,
> +        .xlevel = 0x80000008,
> +        .model_id = "QEMU x86-64-v2 ABI",
> +    },
> +    {
> +        /*
> +         * Derived from Haswell-noTSX, minus
> +         *   aes pcid erms invpcid tsc-deadline x2apic pclmulqdq
> +         */
> +        .name = "x86-64-abi3",
> +        .level = 0xd,
> +        .vendor = CPUID_VENDOR_INTEL,
> +        .family = 6,
> +        .model = 60,
> +        .stepping = 1,
> +        .features[FEAT_1_EDX] =
> +            CPUID_VME | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
> +            CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
> +            CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
> +            CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE |
> +            CPUID_DE | CPUID_FP87,
> +        .features[FEAT_1_ECX] =
> +            CPUID_EXT_AVX | CPUID_EXT_XSAVE |
> +            CPUID_EXT_POPCNT | CPUID_EXT_SSE42 |
> +            CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
> +            CPUID_EXT_SSE3 |
> +            CPUID_EXT_FMA | CPUID_EXT_MOVBE |
> +            CPUID_EXT_F16C | CPUID_EXT_RDRAND,
> +        .features[FEAT_8000_0001_EDX] =
> +            CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_NX |
> +            CPUID_EXT2_SYSCALL,
> +        .features[FEAT_8000_0001_ECX] =
> +            CPUID_EXT3_ABM | CPUID_EXT3_LAHF_LM,
> +        .features[FEAT_7_0_EBX] =
> +            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
> +            CPUID_7_0_EBX_AVX2 | CPUID_7_0_EBX_SMEP |
> +            CPUID_7_0_EBX_BMI2,
> +        .features[FEAT_XSAVE] =
> +            CPUID_XSAVE_XSAVEOPT,
> +        .features[FEAT_6_EAX] =
> +            CPUID_6_EAX_ARAT,
> +        .xlevel = 0x80000008,
> +        .model_id = "QEMU x86-64-v3 ABI",
> +    },
> +
> +    {
> +        /*
> +         * Derived from Skylake-Server-noTSX-IBRS, minus:
> +         *  spec-ctrl
> +         */
> +        .name = "x86-64-abi4",
> +        .level = 0xd,
> +        .vendor = CPUID_VENDOR_INTEL,
> +        .family = 6,
> +        .model = 85,
> +        .stepping = 4,
> +        .features[FEAT_1_EDX] =
> +            CPUID_VME | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
> +            CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
> +            CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
> +            CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE |
> +            CPUID_DE | CPUID_FP87,
> +        .features[FEAT_1_ECX] =
> +            CPUID_EXT_AVX | CPUID_EXT_XSAVE | CPUID_EXT_AES |
> +            CPUID_EXT_POPCNT | CPUID_EXT_X2APIC | CPUID_EXT_SSE42 |
> +            CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
> +            CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3 |
> +            CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_FMA | CPUID_EXT_MOVBE |
> +            CPUID_EXT_PCID | CPUID_EXT_F16C | CPUID_EXT_RDRAND,
> +        .features[FEAT_8000_0001_EDX] =
> +            CPUID_EXT2_LM | CPUID_EXT2_PDPE1GB | CPUID_EXT2_RDTSCP |
> +            CPUID_EXT2_NX | CPUID_EXT2_SYSCALL,
> +        .features[FEAT_8000_0001_ECX] =
> +            CPUID_EXT3_ABM | CPUID_EXT3_LAHF_LM | CPUID_EXT3_3DNOWPREFETCH,
> +        .features[FEAT_7_0_EBX] =
> +            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
> +            CPUID_7_0_EBX_AVX2 | CPUID_7_0_EBX_SMEP |
> +            CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_INVPCID |
> +            CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX |
> +            CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_CLWB |
> +            CPUID_7_0_EBX_AVX512F | CPUID_7_0_EBX_AVX512DQ |
> +            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512CD |
> +            CPUID_7_0_EBX_AVX512VL,
> +        .features[FEAT_7_0_ECX] =
> +            CPUID_7_0_ECX_PKU,
> +        .features[FEAT_XSAVE] =
> +            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
> +            CPUID_XSAVE_XGETBV1,
> +        .features[FEAT_6_EAX] =
> +            CPUID_6_EAX_ARAT,
> +        .xlevel = 0x80000008,
> +        .model_id = "QEMU x86-64-v4 ABI",
> +    },
> +
>      {
>          .name = "qemu64",
>          .level = 0xd,
> -- 
> 2.29.2

dme.
Daniel P. Berrangé Feb. 2, 2021, 12:32 p.m. UTC | #2
On Tue, Feb 02, 2021 at 09:46:55AM +0000, David Edmondson wrote:
> On Monday, 2021-02-01 at 15:36:04 GMT, Daniel P. Berrangé wrote:
> 
> > To paraphrase:
> >
> >   https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level/
> >
> > In 2020, AMD, Intel, Red Hat, and SUSE worked together to define
> > three microarchitecture levels on top of the historical x86-64
> > baseline:
> >
> >   * x86-64:    original x86_64 baseline instruction set
> >   * x86-64-v2: vector instructions up to Streaming SIMD
> >                Extensions 4.2 (SSE4.2)  and Supplemental
> > 	       Streaming SIMD Extensions 3 (SSSE3), the
> > 	       POPCNT instruction, and CMPXCHG16B
> >   * x86-64-v3: vector instructions up to AVX2, MOVBE,
> >                and additional bit-manipulation instructions.
> >   * x86-64-v4: vector instructions from some of the
> >                AVX-512 variants.
> >
> > This list of features is defined in the doc at:
> >
> >   https://gitlab.com/x86-psABIs/x86-64-ABI/
> >
> > QEMU has historically defaulted to the "qemu64" CPU model on
> > x86_64 targets, which is approximately the x86-64 baseline
> > ABI, with 'SVM' added.
> >
> > It is thought it might be desirable if QEMU could provide CPU models
> > that closely correspond to the ABI levels, while offering portability
> > across the maximum number of physical CPUs.
> >
> > Historically we've found that defining CPU models with an arbitrary
> > combination of CPU features can be problematic, as some guest OS
> > will not check all features they use, and instead assume that if
> > they see feature "XX", then "YY" will always exist. This is reasonable
> > in bare metal, but subject to breakage in virtualization.
> >
> > Thus in defining the CPI models for the ABI levels, this patch attempted
> 
> s/CPI/CPU/
> 
> > to base them off an existing CPU model.
> >
> > While each x86-64-abiNNN has a designated vendor, they are designed
> > to be vendor agnostic models. ie they are capable of running on any
> > AMD or Intel physical CPU model that satisfies the ABI level. eg
> 
> Only AMD or Intel? (You mention the Hugon chips elsewhere.)

In theory any x86 CPU that meets the ABI level, regardless of vendor
but if any vendor's set of CPUID features diverges too far from other
CPUs of similar level we might have problems.

For example, I had to specially avoid including "aes" in the
x86-64-abi3 because of the Hugon chips lacking it. There might
be other cases like this, since I've only compared CPUID sets
for named CPUs that QEMU includes.

> > None of the CPU models declare any VMX/SVM features. This would
> > make them unable to support nested virtualization with live
> > migration.
> 
> How about "Unable to support hardware accelerated nested
> virtualization." ?
> 
> Is live migration relevant?

Choice of CPU models is a critical decision in your future ability
to live migrate, so I wanted to call that out specifically.

Regards,
Daniel
David Edmondson Feb. 2, 2021, 12:50 p.m. UTC | #3
On Tuesday, 2021-02-02 at 12:32:39 GMT, Daniel P. Berrangé wrote:

> On Tue, Feb 02, 2021 at 09:46:55AM +0000, David Edmondson wrote:
>> On Monday, 2021-02-01 at 15:36:04 GMT, Daniel P. Berrangé wrote:
>> 
>> > To paraphrase:
>> >
>> >   https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level/
>> >
>> > In 2020, AMD, Intel, Red Hat, and SUSE worked together to define
>> > three microarchitecture levels on top of the historical x86-64
>> > baseline:
>> >
>> >   * x86-64:    original x86_64 baseline instruction set
>> >   * x86-64-v2: vector instructions up to Streaming SIMD
>> >                Extensions 4.2 (SSE4.2)  and Supplemental
>> > 	       Streaming SIMD Extensions 3 (SSSE3), the
>> > 	       POPCNT instruction, and CMPXCHG16B
>> >   * x86-64-v3: vector instructions up to AVX2, MOVBE,
>> >                and additional bit-manipulation instructions.
>> >   * x86-64-v4: vector instructions from some of the
>> >                AVX-512 variants.
>> >
>> > This list of features is defined in the doc at:
>> >
>> >   https://gitlab.com/x86-psABIs/x86-64-ABI/
>> >
>> > QEMU has historically defaulted to the "qemu64" CPU model on
>> > x86_64 targets, which is approximately the x86-64 baseline
>> > ABI, with 'SVM' added.
>> >
>> > It is thought it might be desirable if QEMU could provide CPU models
>> > that closely correspond to the ABI levels, while offering portability
>> > across the maximum number of physical CPUs.
>> >
>> > Historically we've found that defining CPU models with an arbitrary
>> > combination of CPU features can be problematic, as some guest OS
>> > will not check all features they use, and instead assume that if
>> > they see feature "XX", then "YY" will always exist. This is reasonable
>> > in bare metal, but subject to breakage in virtualization.
>> >
>> > Thus in defining the CPI models for the ABI levels, this patch attempted
>> 
>> s/CPI/CPU/
>> 
>> > to base them off an existing CPU model.
>> >
>> > While each x86-64-abiNNN has a designated vendor, they are designed
>> > to be vendor agnostic models. ie they are capable of running on any
>> > AMD or Intel physical CPU model that satisfies the ABI level. eg
>> 
>> Only AMD or Intel? (You mention the Hugon chips elsewhere.)
>
> In theory any x86 CPU that meets the ABI level, regardless of vendor
> but if any vendor's set of CPUID features diverges too far from other
> CPUs of similar level we might have problems.

Understood - so why say "AMD or Intel"?

> For example, I had to specially avoid including "aes" in the
> x86-64-abi3 because of the Hugon chips lacking it. There might
> be other cases like this, since I've only compared CPUID sets
> for named CPUs that QEMU includes.
>
>> > None of the CPU models declare any VMX/SVM features. This would
>> > make them unable to support nested virtualization with live
>> > migration.
>> 
>> How about "Unable to support hardware accelerated nested
>> virtualization." ?
>> 
>> Is live migration relevant?
>
> Choice of CPU models is a critical decision in your future ability
> to live migrate, so I wanted to call that out specifically.

But the restriction, I believe, is not that you won't be able to live
migrate with nested VMs, it's that you don't get support for nested VMs
(well, hardware accelerated - you could still run a fully emulated
nested VM). Live migration with nested VMs is irrelevant if I don't
*get* nested VMs.

dme.
Daniel P. Berrangé Feb. 2, 2021, 12:54 p.m. UTC | #4
On Tue, Feb 02, 2021 at 12:50:53PM +0000, David Edmondson wrote:
> On Tuesday, 2021-02-02 at 12:32:39 GMT, Daniel P. Berrangé wrote:
> 
> > On Tue, Feb 02, 2021 at 09:46:55AM +0000, David Edmondson wrote:
> >> On Monday, 2021-02-01 at 15:36:04 GMT, Daniel P. Berrangé wrote:
> >> 
> >> > To paraphrase:
> >> >
> >> >   https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level/
> >> >
> >> > In 2020, AMD, Intel, Red Hat, and SUSE worked together to define
> >> > three microarchitecture levels on top of the historical x86-64
> >> > baseline:
> >> >
> >> >   * x86-64:    original x86_64 baseline instruction set
> >> >   * x86-64-v2: vector instructions up to Streaming SIMD
> >> >                Extensions 4.2 (SSE4.2)  and Supplemental
> >> > 	       Streaming SIMD Extensions 3 (SSSE3), the
> >> > 	       POPCNT instruction, and CMPXCHG16B
> >> >   * x86-64-v3: vector instructions up to AVX2, MOVBE,
> >> >                and additional bit-manipulation instructions.
> >> >   * x86-64-v4: vector instructions from some of the
> >> >                AVX-512 variants.
> >> >
> >> > This list of features is defined in the doc at:
> >> >
> >> >   https://gitlab.com/x86-psABIs/x86-64-ABI/
> >> >
> >> > QEMU has historically defaulted to the "qemu64" CPU model on
> >> > x86_64 targets, which is approximately the x86-64 baseline
> >> > ABI, with 'SVM' added.
> >> >
> >> > It is thought it might be desirable if QEMU could provide CPU models
> >> > that closely correspond to the ABI levels, while offering portability
> >> > across the maximum number of physical CPUs.
> >> >
> >> > Historically we've found that defining CPU models with an arbitrary
> >> > combination of CPU features can be problematic, as some guest OS
> >> > will not check all features they use, and instead assume that if
> >> > they see feature "XX", then "YY" will always exist. This is reasonable
> >> > in bare metal, but subject to breakage in virtualization.
> >> >
> >> > Thus in defining the CPI models for the ABI levels, this patch attempted
> >> 
> >> s/CPI/CPU/
> >> 
> >> > to base them off an existing CPU model.
> >> >
> >> > While each x86-64-abiNNN has a designated vendor, they are designed
> >> > to be vendor agnostic models. ie they are capable of running on any
> >> > AMD or Intel physical CPU model that satisfies the ABI level. eg
> >> 
> >> Only AMD or Intel? (You mention the Hugon chips elsewhere.)
> >
> > In theory any x86 CPU that meets the ABI level, regardless of vendor
> > but if any vendor's set of CPUID features diverges too far from other
> > CPUs of similar level we might have problems.
> 
> Understood - so why say "AMD or Intel"?

This text is just giving an example - it doesn't need to be an
exhaustive list of vendors.  Running AMD CPU models on an Intel
host and vice versa are the two most common scenarios.

> 
> > For example, I had to specially avoid including "aes" in the
> > x86-64-abi3 because of the Hugon chips lacking it. There might
> > be other cases like this, since I've only compared CPUID sets
> > for named CPUs that QEMU includes.
> >
> >> > None of the CPU models declare any VMX/SVM features. This would
> >> > make them unable to support nested virtualization with live
> >> > migration.
> >> 
> >> How about "Unable to support hardware accelerated nested
> >> virtualization." ?
> >> 
> >> Is live migration relevant?
> >
> > Choice of CPU models is a critical decision in your future ability
> > to live migrate, so I wanted to call that out specifically.
> 
> But the restriction, I believe, is not that you won't be able to live
> migrate with nested VMs, it's that you don't get support for nested VMs
> (well, hardware accelerated - you could still run a fully emulated
> nested VM). Live migration with nested VMs is irrelevant if I don't
> *get* nested VMs.

What I mean is that if you use "-cpu x86-64-abi2,+vmx", KVM will
enable nested virt, but I believe live migration will fail because
we've not declared any VMX capabilities in the CPU model. I'll have
to defer to Paolo on the actual failure scenario details.


Regards,
Daniel

Patch

diff --git a/docs/system/cpu-models-x86-abi.csv b/docs/system/cpu-models-x86-abi.csv
index 4565e6a535..d34d95d485 100644
--- a/docs/system/cpu-models-x86-abi.csv
+++ b/docs/system/cpu-models-x86-abi.csv
@@ -119,3 +119,11 @@  qemu32,,,,
 qemu32-v1,,,,
 qemu64,✅,,,
 qemu64-v1,✅,,,
+x86-64-abi1,✅,,,
+x86-64-abi1-v1,✅,,,
+x86-64-abi2,✅,✅,,
+x86-64-abi2-v1,✅,✅,,
+x86-64-abi3,✅,✅,✅,
+x86-64-abi3-v1,✅,✅,✅,
+x86-64-abi4,✅,✅,✅,✅
+x86-64-abi4-v1,✅,✅,✅,✅
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ae89024d36..87a775a5eb 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1827,6 +1827,162 @@  static CPUCaches epyc_rome_cache_info = {
  */
 
 static X86CPUDefinition builtin_x86_defs[] = {
+    /*
+     * These first few CPU models are designed to satisfy the
+     * x86_64 ABI levels defined in:
+     *
+     *   https://gitlab.com/x86-psABIs/x86-64-ABI/
+     *
+     * They were constructed as follows:
+     *
+     *   - Find all the CPU models which can satisfy the ABI
+     *   - Calculate the lowest common denominator (LCD) of these
+     *     models' features
+     *   - Find the named model most closely matching the LCD
+     *   - Strip its features back to the LCD
+     *
+     * The above spec uses the "x86-64-vNN" naming convention.
+     * This clashes with the "vNN" suffix QEMU uses for versioning.
+     * Thus we use "abiNNN" as a suffix.
+     */
+    {
+        /*
+         * Derived from Opteron_G1, minus
+         *   vme
+         */
+        .name = "x86-64-abi1",
+        .level = 5,
+        .vendor = CPUID_VENDOR_AMD,
+        .family = 15,
+        .model = 6,
+        .stepping = 1,
+        .features[FEAT_1_EDX] =
+            CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
+            CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
+            CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
+            CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE |
+            CPUID_DE | CPUID_FP87,
+        .features[FEAT_1_ECX] =
+            CPUID_EXT_SSE3,
+        .features[FEAT_8000_0001_EDX] =
+            CPUID_EXT2_LM | CPUID_EXT2_NX | CPUID_EXT2_SYSCALL,
+        .xlevel = 0x80000008,
+        .model_id = "QEMU x86-64 baseline ABI",
+    },
+    {
+        /* Derived from Nehalem */
+        .name = "x86-64-abi2",
+        .level = 11,
+        .vendor = CPUID_VENDOR_INTEL,
+        .family = 6,
+        .model = 26,
+        .stepping = 3,
+        .features[FEAT_1_EDX] =
+            CPUID_VME | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
+            CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
+            CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
+            CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE |
+            CPUID_DE | CPUID_FP87,
+        .features[FEAT_1_ECX] =
+            CPUID_EXT_POPCNT | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
+            CPUID_EXT_CX16 | CPUID_EXT_SSSE3 | CPUID_EXT_SSE3,
+        .features[FEAT_8000_0001_EDX] =
+            CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX,
+        .features[FEAT_8000_0001_ECX] =
+            CPUID_EXT3_LAHF_LM,
+        .xlevel = 0x80000008,
+        .model_id = "QEMU x86-64-v2 ABI",
+    },
+    {
+        /*
+         * Derived from Haswell-noTSX, minus
+         *   aes pcid erms invpcid tsc-deadline x2apic pclmulqdq
+         */
+        .name = "x86-64-abi3",
+        .level = 0xd,
+        .vendor = CPUID_VENDOR_INTEL,
+        .family = 6,
+        .model = 60,
+        .stepping = 1,
+        .features[FEAT_1_EDX] =
+            CPUID_VME | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
+            CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
+            CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
+            CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE |
+            CPUID_DE | CPUID_FP87,
+        .features[FEAT_1_ECX] =
+            CPUID_EXT_AVX | CPUID_EXT_XSAVE |
+            CPUID_EXT_POPCNT | CPUID_EXT_SSE42 |
+            CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
+            CPUID_EXT_SSE3 |
+            CPUID_EXT_FMA | CPUID_EXT_MOVBE |
+            CPUID_EXT_F16C | CPUID_EXT_RDRAND,
+        .features[FEAT_8000_0001_EDX] =
+            CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_NX |
+            CPUID_EXT2_SYSCALL,
+        .features[FEAT_8000_0001_ECX] =
+            CPUID_EXT3_ABM | CPUID_EXT3_LAHF_LM,
+        .features[FEAT_7_0_EBX] =
+            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
+            CPUID_7_0_EBX_AVX2 | CPUID_7_0_EBX_SMEP |
+            CPUID_7_0_EBX_BMI2,
+        .features[FEAT_XSAVE] =
+            CPUID_XSAVE_XSAVEOPT,
+        .features[FEAT_6_EAX] =
+            CPUID_6_EAX_ARAT,
+        .xlevel = 0x80000008,
+        .model_id = "QEMU x86-64-v3 ABI",
+    },
+
+    {
+        /*
+         * Derived from Skylake-Server-noTSX-IBRS, minus:
+         *  spec-ctrl
+         */
+        .name = "x86-64-abi4",
+        .level = 0xd,
+        .vendor = CPUID_VENDOR_INTEL,
+        .family = 6,
+        .model = 85,
+        .stepping = 4,
+        .features[FEAT_1_EDX] =
+            CPUID_VME | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
+            CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
+            CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
+            CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE |
+            CPUID_DE | CPUID_FP87,
+        .features[FEAT_1_ECX] =
+            CPUID_EXT_AVX | CPUID_EXT_XSAVE | CPUID_EXT_AES |
+            CPUID_EXT_POPCNT | CPUID_EXT_X2APIC | CPUID_EXT_SSE42 |
+            CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
+            CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3 |
+            CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_FMA | CPUID_EXT_MOVBE |
+            CPUID_EXT_PCID | CPUID_EXT_F16C | CPUID_EXT_RDRAND,
+        .features[FEAT_8000_0001_EDX] =
+            CPUID_EXT2_LM | CPUID_EXT2_PDPE1GB | CPUID_EXT2_RDTSCP |
+            CPUID_EXT2_NX | CPUID_EXT2_SYSCALL,
+        .features[FEAT_8000_0001_ECX] =
+            CPUID_EXT3_ABM | CPUID_EXT3_LAHF_LM | CPUID_EXT3_3DNOWPREFETCH,
+        .features[FEAT_7_0_EBX] =
+            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
+            CPUID_7_0_EBX_AVX2 | CPUID_7_0_EBX_SMEP |
+            CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_INVPCID |
+            CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX |
+            CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_CLWB |
+            CPUID_7_0_EBX_AVX512F | CPUID_7_0_EBX_AVX512DQ |
+            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512CD |
+            CPUID_7_0_EBX_AVX512VL,
+        .features[FEAT_7_0_ECX] =
+            CPUID_7_0_ECX_PKU,
+        .features[FEAT_XSAVE] =
+            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
+            CPUID_XSAVE_XGETBV1,
+        .features[FEAT_6_EAX] =
+            CPUID_6_EAX_ARAT,
+        .xlevel = 0x80000008,
+        .model_id = "QEMU x86-64-v4 ABI",
+    },
+
     {
         .name = "qemu64",
         .level = 0xd,