
[v8,04/12] x86emul: support SERIALIZE

Message ID 0bbbf95e-48ec-ee73-5234-52cf9c6c06d8@suse.com (mailing list archive)
State Superseded
Series x86emul: further work

Commit Message

Jan Beulich May 5, 2020, 8:14 a.m. UTC
... enabling its use by all guest kinds at the same time.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v7: Re-base.
v6: New.
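
For reference: SERIALIZE is the three-byte instruction 0f 01 e8, enumerated by CPUID leaf 7, sub-leaf 0, EDX bit 14. A minimal guest-side sketch of detecting and using it follows; this is a hedged illustration, not part of the patch, and the helper names are made up:

    #include <cpuid.h>   /* GCC/Clang wrapper for the CPUID instruction */

    static int have_serialize(void)
    {
        unsigned int eax, ebx, ecx, edx;

        /* CPUID leaf 7, sub-leaf 0: EDX bit 14 enumerates SERIALIZE. */
        if ( !__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) )
            return 0;
        return (edx >> 14) & 1;
    }

    static void arch_serialize(void)
    {
        if ( have_serialize() )
            /* Raw bytes: older assemblers lack the SERIALIZE mnemonic. */
            asm volatile ( ".byte 0x0f, 0x01, 0xe8" ::: "memory" );
        else
        {
            unsigned int a = 0, b, c, d;

            /* CPUID is architecturally serializing and always available. */
            asm volatile ( "cpuid"
                           : "+a" (a), "=b" (b), "=c" (c), "=d" (d)
                           :: "memory" );
        }
    }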

Comments

Andrew Cooper May 7, 2020, 7:32 p.m. UTC | #1
On 05/05/2020 09:14, Jan Beulich wrote:
> ... enabling its use by all guest kinds at the same time.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

> @@ -5660,6 +5661,18 @@ x86_emulate(
>                  goto done;
>              break;
>  
> +        case 0xe8:
> +            switch ( vex.pfx )
> +            {
> +            case vex_none: /* serialize */
> +                host_and_vcpu_must_have(serialize);
> +                asm volatile ( ".byte 0x0f, 0x01, 0xe8" );

There is very little need for an actual implementation here.  The VMExit
to get here is good enough.

The only question is whether pre-unrestricted_guest Intel boxes are
liable to find this in real mode code.

~Andrew
Jan Beulich May 8, 2020, 7:34 a.m. UTC | #2
On 07.05.2020 21:32, Andrew Cooper wrote:
> On 05/05/2020 09:14, Jan Beulich wrote:
>> ... enabling its use by all guest kinds at the same time.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks.

>> @@ -5660,6 +5661,18 @@ x86_emulate(
>>                  goto done;
>>              break;
>>  
>> +        case 0xe8:
>> +            switch ( vex.pfx )
>> +            {
>> +            case vex_none: /* serialize */
>> +                host_and_vcpu_must_have(serialize);
>> +                asm volatile ( ".byte 0x0f, 0x01, 0xe8" );
> 
> There is very little need for an actual implementation here.  The VMExit
> to get here is good enough.
> 
> The only question is whether pre-unrestricted_guest Intel boxes are
> liable to find this in real mode code.

Apart from this we also shouldn't assume HVM in the core emulator,
I think.

Jan
Andrew Cooper May 8, 2020, 1 p.m. UTC | #3
On 08/05/2020 08:34, Jan Beulich wrote:
>>> @@ -5660,6 +5661,18 @@ x86_emulate(
>>>                  goto done;
>>>              break;
>>>  
>>> +        case 0xe8:
>>> +            switch ( vex.pfx )
>>> +            {
>>> +            case vex_none: /* serialize */
>>> +                host_and_vcpu_must_have(serialize);
>>> +                asm volatile ( ".byte 0x0f, 0x01, 0xe8" );
>> There is very little need for an actual implementation here.  The VMExit
>> to get here is good enough.
>>
>> The only question is whether pre-unrestricted_guest Intel boxes are
>> liable to find this in real mode code.
> Apart from this we also shouldn't assume HVM in the core emulator,
> I think.

It's not assuming HVM.  It's just that there is no way we can end up
emulating this instruction from PV context.

If we could end up here in PV context, the exception causing us to
emulate to begin with would be good enough as well.

~Andrew
Jan Beulich May 8, 2020, 1:59 p.m. UTC | #4
On 08.05.2020 15:00, Andrew Cooper wrote:
> On 08/05/2020 08:34, Jan Beulich wrote:
>>>> @@ -5660,6 +5661,18 @@ x86_emulate(
>>>>                  goto done;
>>>>              break;
>>>>  
>>>> +        case 0xe8:
>>>> +            switch ( vex.pfx )
>>>> +            {
>>>> +            case vex_none: /* serialize */
>>>> +                host_and_vcpu_must_have(serialize);
>>>> +                asm volatile ( ".byte 0x0f, 0x01, 0xe8" );
>>> There is very little need for an actual implementation here.  The VMExit
>>> to get here is good enough.
>>>
>>> The only question is whether pre-unrestricted_guest Intel boxes are
>>> liable to find this in real mode code.
>> Apart from this we also shouldn't assume HVM in the core emulator,
>> I think.
> 
> It's not assuming HVM.  It's just that there is no way we can end up
> emulating this instruction from PV context.
> 
> If we could end up here in PV context, the exception causing us to
> emulate to begin with would be good enough as well.

With the current way of where/how emulation gets involved -
yes. I'd like to remind you though of the 4-insn window
shadow code tries to emulate over for PAE guests. There
is no intervening #VMEXIT there.
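
A sketch of that scenario (hypothetical structure, not the actual shadow
code): the PAE shadow fault handler may call back into the emulator several
times in a row, with no VM entry in between, so only the emulator's own
execution of SERIALIZE would actually serialize:

    /* Hypothetical sketch of back-to-back emulation without a #VMEXIT. */
    for ( i = 0; i < 4; i++ )          /* the "4-insn window" */
    {
        rc = x86_emulate(&emul_ctxt.ctxt, &emul_ops);
        if ( rc != X86EMUL_OKAY )
            break;
    }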

So do you want me to drop the asm() then, and switch from
host_and_vcpu_must_have(serialize) to just
vcpu_must_have(serialize)?

Jan
Andrew Cooper May 8, 2020, 3:05 p.m. UTC | #5
On 08/05/2020 14:59, Jan Beulich wrote:
> On 08.05.2020 15:00, Andrew Cooper wrote:
>> On 08/05/2020 08:34, Jan Beulich wrote:
>>>>> @@ -5660,6 +5661,18 @@ x86_emulate(
>>>>>                  goto done;
>>>>>              break;
>>>>>  
>>>>> +        case 0xe8:
>>>>> +            switch ( vex.pfx )
>>>>> +            {
>>>>> +            case vex_none: /* serialize */
>>>>> +                host_and_vcpu_must_have(serialize);
>>>>> +                asm volatile ( ".byte 0x0f, 0x01, 0xe8" );
>>>> There is very little need for an actual implementation here.  The VMExit
>>>> to get here is good enough.
>>>>
>>>> The only question is whether pre-unrestricted_guest Intel boxes are
>>>> liable to find this in real mode code.
>>> Apart from this we also shouldn't assume HVM in the core emulator,
>>> I think.
>> It's not assuming HVM.  It's just that there is no way we can end up
>> emulating this instruction from PV context.
>>
>> If we could end up here in PV context, the exception causing us to
>> emulate to begin with would be good enough as well.
> With the current way of where/how emulation gets involved -
> yes. I'd like to remind you though of the 4-insn window
> shadow code tries to emulate over for PAE guests. There
> is no intervening #VMEXIT there.
>
> So do you want me to drop the asm() then, and switch from
> host_and_vcpu_must_have(serialize) to just
> vcpu_must_have(serialize)?

No - keep it as is.  This isn't a fastpath, and it is much safer to
assume there might be something we've forgotten.  (Or perhaps some new
future behaviour included in the definition of architecturally serialising).

~Andrew
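
The case therefore stays as in the hunk below. An annotated copy, with
explanatory comments added here (they are not in the patch itself):

    case 0xe8:
        switch ( vex.pfx )
        {
        case vex_none: /* serialize */
            /* The feature is needed both on the host (the insn gets
             * executed for real below) and in the vCPU's policy (the
             * guest must be allowed to use it). */
            host_and_vcpu_must_have(serialize);
            /* Execute SERIALIZE itself, spelled as raw bytes since older
             * assemblers lack the mnemonic.  Kept even though a #VMEXIT
             * typically serializes, to cover emulation paths without an
             * intervening exit (e.g. the PAE shadow 4-insn window). */
            asm volatile ( ".byte 0x0f, 0x01, 0xe8" );
            break;
        default:
            goto unimplemented_insn;
        }
        break;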

Patch

--- a/tools/libxl/libxl_cpuid.c
+++ b/tools/libxl/libxl_cpuid.c
@@ -214,6 +214,7 @@  int libxl_cpuid_parse_config(libxl_cpuid
         {"avx512-4vnniw",0x00000007,  0, CPUID_REG_EDX,  2,  1},
         {"avx512-4fmaps",0x00000007,  0, CPUID_REG_EDX,  3,  1},
         {"md-clear",     0x00000007,  0, CPUID_REG_EDX, 10,  1},
+        {"serialize",    0x00000007,  0, CPUID_REG_EDX, 14,  1},
         {"cet-ibt",      0x00000007,  0, CPUID_REG_EDX, 20,  1},
         {"ibrsb",        0x00000007,  0, CPUID_REG_EDX, 26,  1},
         {"stibp",        0x00000007,  0, CPUID_REG_EDX, 27,  1},
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -161,6 +161,7 @@  static const char *const str_7d0[32] =
 
     [10] = "md-clear",
     /* 12 */                [13] = "tsx-force-abort",
+    [14] = "serialize",
 
     [18] = "pconfig",
     [20] = "cet-ibt",
--- a/tools/tests/x86_emulator/x86-emulate.h
+++ b/tools/tests/x86_emulator/x86-emulate.h
@@ -158,6 +158,7 @@  static inline bool xcr0_mask(uint64_t ma
 #define cpu_has_movdir64b  cp.feat.movdir64b
 #define cpu_has_avx512_4vnniw (cp.feat.avx512_4vnniw && xcr0_mask(0xe6))
 #define cpu_has_avx512_4fmaps (cp.feat.avx512_4fmaps && xcr0_mask(0xe6))
+#define cpu_has_serialize  cp.feat.serialize
 #define cpu_has_avx512_bf16 (cp.feat.avx512_bf16 && xcr0_mask(0xe6))
 
 #define cpu_has_xgetbv1   (cpu_has_xsave && cp.xstate.xgetbv1)
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1927,6 +1927,7 @@  amd_like(const struct x86_emulate_ctxt *
 #define vcpu_has_enqcmd()      (ctxt->cpuid->feat.enqcmd)
 #define vcpu_has_avx512_4vnniw() (ctxt->cpuid->feat.avx512_4vnniw)
 #define vcpu_has_avx512_4fmaps() (ctxt->cpuid->feat.avx512_4fmaps)
+#define vcpu_has_serialize()   (ctxt->cpuid->feat.serialize)
 #define vcpu_has_avx512_bf16() (ctxt->cpuid->feat.avx512_bf16)
 
 #define vcpu_must_have(feat) \
@@ -5660,6 +5661,18 @@  x86_emulate(
                 goto done;
             break;
 
+        case 0xe8:
+            switch ( vex.pfx )
+            {
+            case vex_none: /* serialize */
+                host_and_vcpu_must_have(serialize);
+                asm volatile ( ".byte 0x0f, 0x01, 0xe8" );
+                break;
+            default:
+                goto unimplemented_insn;
+            }
+            break;
+
         case 0xf8: /* swapgs */
             generate_exception_if(!mode_64bit(), EXC_UD);
             generate_exception_if(!mode_ring0(), EXC_GP, 0);
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -129,6 +129,7 @@ 
 #define cpu_has_avx512_4vnniw   boot_cpu_has(X86_FEATURE_AVX512_4VNNIW)
 #define cpu_has_avx512_4fmaps   boot_cpu_has(X86_FEATURE_AVX512_4FMAPS)
 #define cpu_has_tsx_force_abort boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)
+#define cpu_has_serialize       boot_cpu_has(X86_FEATURE_SERIALIZE)
 
 /* CPUID level 0x00000007:1.eax */
 #define cpu_has_avx512_bf16     boot_cpu_has(X86_FEATURE_AVX512_BF16)
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -258,6 +258,7 @@  XEN_CPUFEATURE(AVX512_4VNNIW, 9*32+ 2) /
 XEN_CPUFEATURE(AVX512_4FMAPS, 9*32+ 3) /*A  AVX512 Multiply Accumulation Single Precision */
 XEN_CPUFEATURE(MD_CLEAR,      9*32+10) /*A  VERW clears microarchitectural buffers */
 XEN_CPUFEATURE(TSX_FORCE_ABORT, 9*32+13) /* MSR_TSX_FORCE_ABORT.RTM_ABORT */
+XEN_CPUFEATURE(SERIALIZE,     9*32+14) /*A  SERIALIZE insn */
 XEN_CPUFEATURE(CET_IBT,       9*32+20) /*   CET - Indirect Branch Tracking */
 XEN_CPUFEATURE(IBRSB,         9*32+26) /*A  IBRS and IBPB support (used by Intel) */
 XEN_CPUFEATURE(STIBP,         9*32+27) /*A  STIBP */
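
With the libxl_cpuid.c hunk above, the bit can also be toggled from a guest
config. A hedged example using xl's cpuid list syntax (see xl.cfg; policy
values are "0"/"1"/"x"/"k"/"s"):

    # Hide SERIALIZE from the guest:
    cpuid = [ "host,serialize=0" ]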