[v3,7/9] x86/svm: VMEntry/Exit logic for MSR_SPEC_CTRL

Message ID: 20220131153621.8863-1-andrew.cooper3@citrix.com (mailing list archive)
State: New, archived

Commit Message

Andrew Cooper Jan. 31, 2022, 3:36 p.m. UTC
Hardware maintains both host and guest versions of MSR_SPEC_CTRL, but guests
run with the logical OR of both values.  Therefore, in principle we want to
clear Xen's value before entering the guest.  However, for migration
compatibility, and for performance reasons with SEV-SNP guests, we want the
ability to use a nonzero value behind the guest's back.  Use vcpu_msrs to hold
this value, with the guest value in the VMCB.

On the VMEntry path, adjusting MSR_SPEC_CTRL must be done after CLGI so as to
be atomic with respect to NMIs/etc.
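
In rough C terms, the entry-side logic implemented in asm below looks
like this (illustrative sketch only; field names are taken from this
patch, and the real code runs inside the CLGI/STGI region in entry.S):

    /*
     * Sketch: skip the WRMSR when MSR_SPEC_CTRL already holds the
     * intended value, as tracked in cpu_info.last_spec_ctrl.
     */
    unsigned int val = v->arch.msrs->spec_ctrl.raw;

    if ( val != info->last_spec_ctrl )
    {
        wrmsrl(MSR_SPEC_CTRL, val);
        info->last_spec_ctrl = val;
    }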

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

v3:
 * Implement in asm
---
 xen/arch/x86/hvm/svm/entry.S             | 34 +++++++++++++++++++++++++++-----
 xen/arch/x86/include/asm/msr.h           |  9 +++++++++
 xen/arch/x86/include/asm/spec_ctrl_asm.h |  3 +++
 xen/arch/x86/x86_64/asm-offsets.c        |  1 +
 4 files changed, 42 insertions(+), 5 deletions(-)

Comments

Jan Beulich Feb. 1, 2022, 11:47 a.m. UTC | #1
On 31.01.2022 16:36, Andrew Cooper wrote:
> Hardware maintains both host and guest versions of MSR_SPEC_CTRL, but guests
> run with the logical OR of both values.  Therefore, in principle we want to
> clear Xen's value before entering the guest.  However, for migration
> compatibility,

I think you've explained this to me before, but I can't seem to put
all of it together right now. Could you expand on how a non-zero value
behind a guest's back can help with migration compatibility? At first
glance I would be inclined to say that only what the guest actually
gets to see and use can affect its migration.

> and for performance reasons with SEV-SNP guests, we want the
> ability to use a nonzero value behind the guest's back.  Use vcpu_msrs to hold
> this value, with the guest value in the VMCB.
> 
> On the VMEntry path, adjusting MSR_SPEC_CTRL must be done after CLGI so as to
> be atomic with respect to NMIs/etc.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Preferably with the above expansion and with one further style
issue (see below) taken care of
Reviewed-by: Jan Beulich <jbeulich@suse.com>

> --- a/xen/arch/x86/hvm/svm/entry.S
> +++ b/xen/arch/x86/hvm/svm/entry.S
> @@ -55,11 +55,23 @@ __UNLIKELY_END(nsvm_hap)
>          mov  %rsp, %rdi
>          call svm_vmenter_helper
>  
> -        mov VCPU_arch_msrs(%rbx), %rax
> -        mov VCPUMSR_spec_ctrl_raw(%rax), %eax
> +        clgi
>  
>          /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */
> -        /* SPEC_CTRL_EXIT_TO_SVM   (nothing currently) */
> +        /* SPEC_CTRL_EXIT_TO_SVM       Req: b=curr %rsp=regs/cpuinfo, Clob: acd */
> +        .macro svm_vmentry_spec_ctrl
> +            mov    VCPU_arch_msrs(%rbx), %rax
> +            movzbl CPUINFO_last_spec_ctrl(%rsp), %edx
> +            mov    VCPUMSR_spec_ctrl_raw(%rax), %eax
> +            cmp    %edx, %eax
> +            je 1f  /* Skip write if value is correct. */

Would you mind padding the insn operand properly, in line with all
others nearby?

Jan
Andrew Cooper Feb. 1, 2022, 12:28 p.m. UTC | #2
On 01/02/2022 11:47, Jan Beulich wrote:
> On 31.01.2022 16:36, Andrew Cooper wrote:
>> Hardware maintains both host and guest versions of MSR_SPEC_CTRL, but guests
>> run with the logical OR of both values.  Therefore, in principle we want to
>> clear Xen's value before entering the guest.  However, for migration
>> compatibility,
> I think you've explained this to me before, but I can't seem to put
> all of it together right now. Could you expand on how a non-zero value
> behind a guest's back can help with migration compatibility? At first
> glance I would be inclined to say that only what the guest actually
> gets to see and use can affect its migration.

For VMs which see VIRT_SPEC_CTRL (compatibility with Fam15 thru Zen1),
writes of VIRT_SPEC_CTRL.SSBD (probably) need to use SSBD behind the
guest's back.  I say probably because I think this is the least bad
implementation option, but until we have working support, it's still a
guess.
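
In sketch form, and with the same caveat that this is a guess, a guest
write of VIRT_SPEC_CTRL.SSBD might be reflected into Xen's value, which
hardware ORs with the guest's MSR_SPEC_CTRL while the guest runs:

    /* Hypothetical: mirror the guest's VIRT_SPEC_CTRL.SSBD request
     * into Xen's behind-the-guest value. */
    if ( val & SPEC_CTRL_SSBD )
        v->arch.msrs->spec_ctrl.raw |= SPEC_CTRL_SSBD;
    else
        v->arch.msrs->spec_ctrl.raw &= ~SPEC_CTRL_SSBD;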

For the ultra paranoid, a VM migrating in which can't see PSFD (e.g. for
compatibility with Zen2) should have PSFD set behind its back.  Except
that SSBD also has an appropriate side effect, so the existing "I'm a
piece of critical code" signals that have grown in various OSes continue
to do the safe thing on PSFD-capable hardware.  Given that we don't
activate SSBD by default, we shouldn't set PSFD behind an unaware guest
by default either.

That then leaves the question of what spec-ctrl=ssbd,psfd should mean,
because ssbd is currently system-wide (if enabled) on AMD.  This series
changes that for HVM guests, and it will change again shortly for PV
guests, which is obviously the better default behaviour.  But we could
have a system-wide option on top of guest support in most cases, if
someone sees a need.

>> and for performance reasons with SEV-SNP guests, we want the
>> ability to use a nonzero value behind the guest's back.

For completeness: for SEV-SNP, IBRS needs setting to avoid VMEntry
issuing an IBPB.  More specifically, the VMExit=>Entry path must not
clear IBRS, at which point hardware knows that nothing can have got into
the indirect predictor.
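
In this scheme, that would amount to something like the following
(hypothetical, as SEV-SNP support doesn't exist yet):

    /* Hypothetical: keep IBRS set in Xen's value so the VMExit=>Entry
     * path never clears MSR_SPEC_CTRL.IBRS, letting hardware skip the
     * implicit IBPB on VMEntry. */
    v->arch.msrs->spec_ctrl.raw |= SPEC_CTRL_IBRS;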


>>   Use vcpu_msrs to hold
>> this value, with the guest value in the VMCB.
>>
>> On the VMEntry path, adjusting MSR_SPEC_CTRL must be done after CLGI so as to
>> be atomic with respect to NMIs/etc.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Preferably with the above expansion and with one further style
> issue (see below) taken care of
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks

>
>> --- a/xen/arch/x86/hvm/svm/entry.S
>> +++ b/xen/arch/x86/hvm/svm/entry.S
>> @@ -55,11 +55,23 @@ __UNLIKELY_END(nsvm_hap)
>>          mov  %rsp, %rdi
>>          call svm_vmenter_helper
>>  
>> -        mov VCPU_arch_msrs(%rbx), %rax
>> -        mov VCPUMSR_spec_ctrl_raw(%rax), %eax
>> +        clgi
>>  
>>          /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */
>> -        /* SPEC_CTRL_EXIT_TO_SVM   (nothing currently) */
>> +        /* SPEC_CTRL_EXIT_TO_SVM       Req: b=curr %rsp=regs/cpuinfo, Clob: acd */
>> +        .macro svm_vmentry_spec_ctrl
>> +            mov    VCPU_arch_msrs(%rbx), %rax
>> +            movzbl CPUINFO_last_spec_ctrl(%rsp), %edx
>> +            mov    VCPUMSR_spec_ctrl_raw(%rax), %eax
>> +            cmp    %edx, %eax
>> +            je 1f  /* Skip write if value is correct. */
> Would you mind padding the insn operand properly, in line with all
> others nearby?

Oops yes.  Fixed.

~Andrew
Jan Beulich Feb. 1, 2022, 12:40 p.m. UTC | #3
On 01.02.2022 13:28, Andrew Cooper wrote:
> On 01/02/2022 11:47, Jan Beulich wrote:
>> On 31.01.2022 16:36, Andrew Cooper wrote:
>>> Hardware maintains both host and guest versions of MSR_SPEC_CTRL, but guests
>>> run with the logical OR of both values.  Therefore, in principle we want to
>>> clear Xen's value before entering the guest.  However, for migration
>>> compatibility,
>> I think you've explained this to me before, but I can't seem to put
>> all of it together right now. Could you expand on how a non-zero value
>> behind a guest's back can help with migration compatibility? At first
>> glance I would be inclined to say that only what the guest actually
>> gets to see and use can affect its migration.
> 
> For VMs which see VIRT_SPEC_CTRL (compatibility with Fam15 thru Zen1),
> writes of VIRT_SPEC_CTRL.SSBD (probably) need to use SSBD behind the
> guest's back.  I say probably because I think this is the least bad
> implementation option, but until we have working support, it's still a
> guess.

So this is future work (and mentioning just this in the description
would be enough to address my comment), but ...

> For the ultra paranoid, a VM migrating in which can't see PSFD (e.g. for
> compatibility with Zen2) should have PSFD set behind its back.

... this is something we should be doing right away then?

Jan
Andrew Cooper Feb. 1, 2022, 12:46 p.m. UTC | #4
On 01/02/2022 12:40, Jan Beulich wrote:
> On 01.02.2022 13:28, Andrew Cooper wrote:
>> On 01/02/2022 11:47, Jan Beulich wrote:
>>> On 31.01.2022 16:36, Andrew Cooper wrote:
>>>> Hardware maintains both host and guest versions of MSR_SPEC_CTRL, but guests
>>>> run with the logical OR of both values.  Therefore, in principle we want to
>>>> clear Xen's value before entering the guest.  However, for migration
>>>> compatibility,
>>> I think you've explained this to me before, but I can't seem to put
>>> all of it together right now. Could you expand on how a non-zero value
>>> behind a guest's back can help with migration compatibility? At first
>>> glance I would be inclined to say that only what the guest actually
>>> gets to see and use can affect its migration.
>> For VMs which see VIRT_SPEC_CTRL (compatibility with Fam15 thru Zen1),
>> writes of VIRT_SPEC_CTRL.SSBD (probably) need to use SSBD behind the
>> guest's back.  I say probably because I think this is the least bad
>> implementation option, but until we have working support, it's still a
>> guess.
> So this is future work (and mentioning just this in the description
> would be enough to address my comment)

Near future, but yes.

> , but ...
>
>> For the ultra paranoid, a VM migrating in which can't see PSFD (e.g. for
>> compatibility with Zen2) should have PSFD set behind its back.
> ... this is something we should be doing right away then?

Except that the second half of that paragraph was an argument as to why
not.

What OSes expose to userspace for "I need speculative safety" works
whether the kernel can see PSFD or not.

~Andrew
Patch

diff --git a/xen/arch/x86/hvm/svm/entry.S b/xen/arch/x86/hvm/svm/entry.S
index 276215d36aff..16b642c9e2de 100644
--- a/xen/arch/x86/hvm/svm/entry.S
+++ b/xen/arch/x86/hvm/svm/entry.S
@@ -55,11 +55,23 @@  __UNLIKELY_END(nsvm_hap)
         mov  %rsp, %rdi
         call svm_vmenter_helper
 
-        mov VCPU_arch_msrs(%rbx), %rax
-        mov VCPUMSR_spec_ctrl_raw(%rax), %eax
+        clgi
 
         /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */
-        /* SPEC_CTRL_EXIT_TO_SVM   (nothing currently) */
+        /* SPEC_CTRL_EXIT_TO_SVM       Req: b=curr %rsp=regs/cpuinfo, Clob: acd */
+        .macro svm_vmentry_spec_ctrl
+            mov    VCPU_arch_msrs(%rbx), %rax
+            movzbl CPUINFO_last_spec_ctrl(%rsp), %edx
+            mov    VCPUMSR_spec_ctrl_raw(%rax), %eax
+            cmp    %edx, %eax
+            je 1f  /* Skip write if value is correct. */
+            mov    $MSR_SPEC_CTRL, %ecx
+            xor    %edx, %edx
+            wrmsr
+            mov    %al, CPUINFO_last_spec_ctrl(%rsp)
+1:          /* No Spectre v1 concerns.  Execution will hit VMRUN imminently. */
+        .endm
+        ALTERNATIVE "", svm_vmentry_spec_ctrl, X86_FEATURE_SC_MSR_HVM
 
         pop  %r15
         pop  %r14
@@ -78,7 +90,6 @@  __UNLIKELY_END(nsvm_hap)
         pop  %rsi
         pop  %rdi
 
-        clgi
         sti
         vmrun
 
@@ -86,8 +97,21 @@  __UNLIKELY_END(nsvm_hap)
 
         GET_CURRENT(bx)
 
-        /* SPEC_CTRL_ENTRY_FROM_SVM    Req: b=curr %rsp=regs/cpuinfo, Clob: ac  */
+        /* SPEC_CTRL_ENTRY_FROM_SVM    Req: %rsp=regs/cpuinfo         Clob: acd */
         ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_SC_RSB_HVM
+
+        .macro svm_vmexit_spec_ctrl
+            /*
+             * Write to MSR_SPEC_CTRL unconditionally, for the RAS[:32]
+             * flushing side effect.
+             */
+            mov    $MSR_SPEC_CTRL, %ecx
+            movzbl CPUINFO_xen_spec_ctrl(%rsp), %eax
+            xor    %edx, %edx
+            wrmsr
+            mov    %al, CPUINFO_last_spec_ctrl(%rsp)
+        .endm
+        ALTERNATIVE "", svm_vmexit_spec_ctrl, X86_FEATURE_SC_MSR_HVM
         /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
 
         stgi
diff --git a/xen/arch/x86/include/asm/msr.h b/xen/arch/x86/include/asm/msr.h
index 657a3295613d..ce4fe51afe54 100644
--- a/xen/arch/x86/include/asm/msr.h
+++ b/xen/arch/x86/include/asm/msr.h
@@ -297,6 +297,15 @@  struct vcpu_msrs
      *
      * For VT-x guests, the guest value is held in the MSR guest load/save
      * list.
+     *
+     * For SVM, the guest value lives in the VMCB, and hardware saves/restores
+     * the host value automatically.  However, guests run with the OR of the
+     * host and guest value, which allows Xen to set protections behind the
+     * guest's back.
+     *
+     * We must clear/restore Xen's value before/after VMRUN to avoid unduly
+     * influencing the guest.  In order to support "behind the guest's back"
+     * protections, we load this value (commonly 0) before VMRUN.
      */
     struct {
         uint32_t raw;
diff --git a/xen/arch/x86/include/asm/spec_ctrl_asm.h b/xen/arch/x86/include/asm/spec_ctrl_asm.h
index 9c0c7622c41f..02b3b18ce69f 100644
--- a/xen/arch/x86/include/asm/spec_ctrl_asm.h
+++ b/xen/arch/x86/include/asm/spec_ctrl_asm.h
@@ -46,6 +46,9 @@ 
  *   - On VMX by using MSR load/save lists to have vmentry/exit atomically
  *     load/save the guest value.  Xen's value is loaded in regular code, and
  *     there is no need to use the shadow logic (below).
+ *   - On SVM by altering MSR_SPEC_CTRL inside the CLGI/STGI region.  This
+ *     makes the changes atomic with respect to NMIs/etc, so no need for
+ *     shadowing logic.
  *
  * Factor 2 is harder.  We maintain a shadow_spec_ctrl value, and a use_shadow
  * boolean in the per cpu spec_ctrl_flags.  The synchronous use is:
diff --git a/xen/arch/x86/x86_64/asm-offsets.c b/xen/arch/x86/x86_64/asm-offsets.c
index 649892643fe9..287dac101ad4 100644
--- a/xen/arch/x86/x86_64/asm-offsets.c
+++ b/xen/arch/x86/x86_64/asm-offsets.c
@@ -126,6 +126,7 @@  void __dummy__(void)
     OFFSET(CPUINFO_pv_cr3, struct cpu_info, pv_cr3);
     OFFSET(CPUINFO_shadow_spec_ctrl, struct cpu_info, shadow_spec_ctrl);
     OFFSET(CPUINFO_xen_spec_ctrl, struct cpu_info, xen_spec_ctrl);
+    OFFSET(CPUINFO_last_spec_ctrl, struct cpu_info, last_spec_ctrl);
     OFFSET(CPUINFO_spec_ctrl_flags, struct cpu_info, spec_ctrl_flags);
     OFFSET(CPUINFO_root_pgt_changed, struct cpu_info, root_pgt_changed);
     OFFSET(CPUINFO_use_pv_cr3, struct cpu_info, use_pv_cr3);