diff mbox series

[11/16] x86/spec-ctrl: Adjust DO_OVERWRITE_RSB to be shadow stack compatible

Message ID 20200501225838.9866-12-andrew.cooper3@citrix.com (mailing list archive)
State Superseded
Headers show
Series x86: Support for CET Supervisor Shadow Stacks | expand

Commit Message

Andrew Cooper May 1, 2020, 10:58 p.m. UTC
The 32 calls need dropping from the shadow stack as well as the regular stack.
To shorten the code, we can use the 32bit forms of RDSSP/INCSSP, but need to
double up the input to INCSSP to counter the operand size based multiplier.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wl@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/include/asm-x86/spec_ctrl_asm.h | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

Comments

Jan Beulich May 7, 2020, 1:22 p.m. UTC | #1
On 02.05.2020 00:58, Andrew Cooper wrote:
> @@ -114,6 +114,16 @@
>      sub $1, %ecx
>      jnz .L\@_fill_rsb_loop
>      mov %\tmp, %rsp                 /* Restore old %rsp */
> +
> +#ifdef CONFIG_XEN_SHSTK
> +    mov $1, %ecx
> +    rdsspd %ecx
> +    cmp $1, %ecx
> +    je .L\@_shstk_done
> +    mov $64, %ecx                   /* 64 * 4 bytes, given incsspd */
> +    incsspd %ecx                    /* Restore old SSP */
> +.L\@_shstk_done:
> +#endif

The latest here I wonder why you don't use alternatives patching.
I thought that's what you've introduced the synthetic feature
flag for.

Jan
Andrew Cooper May 7, 2020, 1:25 p.m. UTC | #2
On 07/05/2020 14:22, Jan Beulich wrote:
> On 02.05.2020 00:58, Andrew Cooper wrote:
>> @@ -114,6 +114,16 @@
>>      sub $1, %ecx
>>      jnz .L\@_fill_rsb_loop
>>      mov %\tmp, %rsp                 /* Restore old %rsp */
>> +
>> +#ifdef CONFIG_XEN_SHSTK
>> +    mov $1, %ecx
>> +    rdsspd %ecx
>> +    cmp $1, %ecx
>> +    je .L\@_shstk_done
>> +    mov $64, %ecx                   /* 64 * 4 bytes, given incsspd */
>> +    incsspd %ecx                    /* Restore old SSP */
>> +.L\@_shstk_done:
>> +#endif
> The latest here I wonder why you don't use alternatives patching.
> I thought that's what you've introduced the synthetic feature
> flag for.

We're already in the middle of an alternative and they don't nest.  More
importantly, this path gets used on the BSP, after patching and before
CET gets enabled.

~Andrew
Jan Beulich May 7, 2020, 1:38 p.m. UTC | #3
On 07.05.2020 15:25, Andrew Cooper wrote:
> On 07/05/2020 14:22, Jan Beulich wrote:
>> On 02.05.2020 00:58, Andrew Cooper wrote:
>>> @@ -114,6 +114,16 @@
>>>      sub $1, %ecx
>>>      jnz .L\@_fill_rsb_loop
>>>      mov %\tmp, %rsp                 /* Restore old %rsp */
>>> +
>>> +#ifdef CONFIG_XEN_SHSTK
>>> +    mov $1, %ecx
>>> +    rdsspd %ecx
>>> +    cmp $1, %ecx
>>> +    je .L\@_shstk_done
>>> +    mov $64, %ecx                   /* 64 * 4 bytes, given incsspd */
>>> +    incsspd %ecx                    /* Restore old SSP */
>>> +.L\@_shstk_done:
>>> +#endif
>> The latest here I wonder why you don't use alternatives patching.
>> I thought that's what you've introduced the synthetic feature
>> flag for.
> 
> We're already in the middle of an alternative and they don't nest.  More
> importantly, this path gets used on the BSP, after patching and before
> CET gets enabled.

Oh, I should have noticed this. The first point could be dealt with,
but I agree the second pretty much rules out patching.

Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
diff mbox series

Patch

diff --git a/xen/include/asm-x86/spec_ctrl_asm.h b/xen/include/asm-x86/spec_ctrl_asm.h
index c60093b090..cb34299a86 100644
--- a/xen/include/asm-x86/spec_ctrl_asm.h
+++ b/xen/include/asm-x86/spec_ctrl_asm.h
@@ -83,9 +83,9 @@ 
  * Requires nothing
  * Clobbers \tmp (%rax by default), %rcx
  *
- * Requires 256 bytes of stack space, but %rsp has no net change. Based on
- * Google's performance numbers, the loop is unrolled to 16 iterations and two
- * calls per iteration.
+ * Requires 256 bytes of {,shadow}stack space, but %rsp/SSP has no net
+ * change. Based on Google's performance numbers, the loop is unrolled to 16
+ * iterations and two calls per iteration.
  *
  * The call filling the RSB needs a nonzero displacement.  A nop would do, but
  * we use "1: pause; lfence; jmp 1b" to safely contains any ret-based
@@ -114,6 +114,16 @@ 
     sub $1, %ecx
     jnz .L\@_fill_rsb_loop
     mov %\tmp, %rsp                 /* Restore old %rsp */
+
+#ifdef CONFIG_XEN_SHSTK
+    mov $1, %ecx
+    rdsspd %ecx
+    cmp $1, %ecx
+    je .L\@_shstk_done
+    mov $64, %ecx                   /* 64 * 4 bytes, given incsspd */
+    incsspd %ecx                    /* Restore old SSP */
+.L\@_shstk_done:
+#endif
 .endm
 
 .macro DO_SPEC_CTRL_ENTRY_FROM_HVM