x86/EFI: meet further spec requirements for runtime calls

Message ID 5824A916020000780011DBB9@prv-mh.provo.novell.com (mailing list archive)
State New, archived

Commit Message

Jan Beulich Nov. 10, 2016, 4:06 p.m. UTC
So far we didn't guarantee 16-byte alignment of the stack: While (so
far) we don't tell the compiler to use smaller alignment, we also don't
guarantee 16-byte alignment when establishing stack pointers for new
vCPU-s. Runtime service functions using SSE instructions may end with
#GP(0) without that.

Note that -mpreferred-stack-boundary=3 can be used only from gcc 4.8
onwards, and -mincoming-stack-boundary=3 only from 5.3 onwards. For
that reason an alternative approach (using higher than necessary
alignment) is used when building with older compilers.

Furthermore, clear any pending x87 exceptions before the FLDCW we do, so
that it cannot raise #MF.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/efi/Makefile
+++ b/xen/arch/x86/efi/Makefile
@@ -14,5 +14,10 @@ extra-$(efi) += boot.init.o relocs-dummy
 %.o: %.ihex
 	$(OBJCOPY) -I ihex -O binary $< $@
 
+cc-runtime.o := $(CC) -mno-sse
+$(call cc-option-add,cflags-runtime.o,cc-runtime.o,-mpreferred-stack-boundary=3)
+$(call cc-option-add,cflags-runtime.o,cc-runtime.o,-mincoming-stack-boundary=3)
+runtime.o: CFLAGS += $(cflags-runtime.o)
+
 stub.o: $(extra-y)
 nogcov-$(efi) += stub.o
--- a/xen/common/efi/runtime.c
+++ b/xen/common/efi/runtime.c
@@ -59,12 +59,26 @@ unsigned long efi_rs_enter(void)
     static const u16 fcw = FCW_DEFAULT;
     static const u32 mxcsr = MXCSR_DEFAULT;
     unsigned long cr3 = read_cr3();
+#if __GNUC__ < 5 || (__GNUC__ == 5 && __GNUC_MINOR__ < 3)
+/*
+ * -mpreferred-stack-boundary=3 can be used only from gcc 4.8 onwards,
+ * and -mincoming-stack-boundary=3 only from 5.3 onwards. Therefore higher
+ * than necessary alignment is being forced here in that case.
+ */
+# define FORCE_ALIGN 32
+#else
+# define FORCE_ALIGN 16
+#endif
+    unsigned long __aligned(FORCE_ALIGN) placeholder[0];
+#undef FORCE_ALIGN
+
+    asm volatile("" : "+m" (placeholder));
 
     if ( !efi_l4_pgtable )
         return 0;
 
     save_fpu_enable();
-    asm volatile ( "fldcw %0" :: "m" (fcw) );
+    asm volatile ( "fnclex; fldcw %0" :: "m" (fcw) );
     asm volatile ( "ldmxcsr %0" :: "m" (mxcsr) );
 
     spin_lock(&efi_rs_lock);

Comments

Andrew Cooper Nov. 11, 2016, 3:39 p.m. UTC | #1
On 10/11/16 16:06, Jan Beulich wrote:
> So far we didn't guarantee 16-byte alignment of the stack: While (so
> far) we don't tell the compiler to use smaller alignment, we also don't
> guarantee 16-byte alignment when establishing stack pointers for new
> vCPU-s. Runtime service functions using SSE instructions may end with
> #GP(0) without that.
>
> Note that -mpreferred-stack-boundary=3 can be used only from gcc 4.8
> onwards, and -mincoming-stack-boundary=3 only from 5.3 onwards. It is
> for that reason that an alternative approach (using higher than
> necessary alignment) is being used when building with such older
> compilers.
>
> Furthermore we should avoid #MF to be raised on the FLDCW we do.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Wei Liu Nov. 12, 2016, 6:48 a.m. UTC | #2
On Fri, Nov 11, 2016 at 03:39:26PM +0000, Andrew Cooper wrote:
> On 10/11/16 16:06, Jan Beulich wrote:
> > So far we didn't guarantee 16-byte alignment of the stack: While (so
> > far) we don't tell the compiler to use smaller alignment, we also don't
> > guarantee 16-byte alignment when establishing stack pointers for new
> > vCPU-s. Runtime service functions using SSE instructions may end with
> > #GP(0) without that.
> >
> > Note that -mpreferred-stack-boundary=3 can be used only from gcc 4.8
> > onwards, and -mincoming-stack-boundary=3 only from 5.3 onwards. It is
> > for that reason that an alternative approach (using higher than
> > necessary alignment) is being used when building with such older
> > compilers.
> >
> > Furthermore we should avoid #MF to be raised on the FLDCW we do.
> >
> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Applied.

Jan Beulich Nov. 14, 2016, 7:50 a.m. UTC | #3
>>> On 12.11.16 at 07:48, <wei.liu2@citrix.com> wrote:
> On Fri, Nov 11, 2016 at 03:39:26PM +0000, Andrew Cooper wrote:
>> On 10/11/16 16:06, Jan Beulich wrote:
>> > So far we didn't guarantee 16-byte alignment of the stack: While (so
>> > far) we don't tell the compiler to use smaller alignment, we also don't
>> > guarantee 16-byte alignment when establishing stack pointers for new
>> > vCPU-s. Runtime service functions using SSE instructions may end with
>> > #GP(0) without that.
>> >
>> > Note that -mpreferred-stack-boundary=3 can be used only from gcc 4.8
>> > onwards, and -mincoming-stack-boundary=3 only from 5.3 onwards. It is
>> > for that reason that an alternative approach (using higher than
>> > necessary alignment) is being used when building with such older
>> > compilers.
>> >
>> > Furthermore we should avoid #MF to be raised on the FLDCW we do.
>> >
>> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> 
>> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> Applied.

I have to withdraw this patch (and hence revert it) - it has both an
active and a latent thinko/bug: The active one is that forcing stack
alignment in efi_rs_enter() is completely pointless. We want its
callers to have an aligned stack. The latent one is that with
-mpreferred-stack-boundary=3 the compiler is free to align the
calling function's stack, but allocate an odd number of longs on the
stack, so that the called function would still receive a misaligned
stack. The conclusion is that we shouldn't use
-mpreferred-stack-boundary=3, yet using
-mincoming-stack-boundary=3 alone would mean all functions in 
runtime.c would get their stack aligned. That might be acceptable,
but is wasteful. I think universally going the route of forcing larger
than necessary alignment (as done by the broken patch just for
older gcc) is the better route, albeit I think I should really check
that all gcc versions usable for building the EFI parts actually
honor the alignment (ISTR that very old gcc doesn't).
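
The latent problem described above comes down to simple arithmetic, which can be modelled as below. This is an illustrative model only, not Xen code: sp_at_call() and is_aligned16() are hypothetical names, and the model assumes 8-byte longs and a caller whose frame starts on a 16-byte boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: the stack pointer at the point of a call, for a
 * caller whose frame base is 16-byte aligned and which has allocated
 * local_longs 8-byte slots. With an 8-byte preferred boundary the
 * compiler need not round local_longs up to an even number.
 */
static uint64_t sp_at_call(uint64_t aligned_base, unsigned int local_longs)
{
    assert(aligned_base % 16 == 0);
    return aligned_base - 8ULL * local_longs;
}

/* True if sp sits on a 16-byte boundary. */
static bool is_aligned16(uint64_t sp)
{
    return (sp & 15) == 0;
}
```

For a base of 0x7000, two longs leave the pointer at 0x6ff0, which is still aligned, whereas three longs leave it at 0x6fe8, which is 8 bytes off, so the called function would start from a misaligned stack.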

The alternative of always forcing an aligned stack would seem to
be quite a bit more intrusive a change, due to struct cpu_user_regs
(and the part of it actually covered by get_stack_bottom()) not
being a multiple of 16 in size. But I'll check more closely whether
this might also be a viable route.

Jan
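
The "forcing larger than necessary alignment" route mentioned in the reply can be sketched in user space as follows. This is a minimal illustration under assumptions: aligned_local_misalignment() is a hypothetical name, a one-element array replaces the patch's zero-length array for portability, and the attribute is spelled out instead of using Xen's __aligned() macro. Note that the forced alignment applies only to the frame of the function containing the variable, which is why placing it in efi_rs_enter() does not help that function's callers.

```c
#include <stdint.h>

#define FORCE_ALIGN 32

/*
 * Hypothetical helper: declare an over-aligned local so the compiler has
 * to realign this function's stack frame, then report how far the local
 * is from a FORCE_ALIGN boundary (expected to be 0).
 */
static uintptr_t aligned_local_misalignment(void)
{
    unsigned long placeholder[1]
        __attribute__((__aligned__(FORCE_ALIGN))) = { 0 };

    /* The empty asm with a memory operand keeps the variable alive. */
    __asm__ __volatile__ ( "" : "+m" (placeholder) );

    return (uintptr_t)placeholder % FORCE_ALIGN;
}
```

Whether a given compiler version actually honors the requested alignment for stack variables is the open question raised in the reply; the sketch assumes a compiler that does.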