Message ID | 5824A916020000780011DBB9@prv-mh.provo.novell.com (mailing list archive) |
---|---|
State | New, archived |
On 10/11/16 16:06, Jan Beulich wrote:
> So far we didn't guarantee 16-byte alignment of the stack: While (so
> far) we don't tell the compiler to use smaller alignment, we also don't
> guarantee 16-byte alignment when establishing stack pointers for new
> vCPU-s. Runtime service functions using SSE instructions may end with
> #GP(0) without that.
>
> Note that -mpreferred-stack-boundary=3 can be used only from gcc 4.8
> onwards, and -mincoming-stack-boundary=3 only from 5.3 onwards. It is
> for that reason that an alternative approach (using higher than
> necessary alignment) is being used when building with such older
> compilers.
>
> Furthermore we should avoid #MF to be raised on the FLDCW we do.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
On Fri, Nov 11, 2016 at 03:39:26PM +0000, Andrew Cooper wrote:
> On 10/11/16 16:06, Jan Beulich wrote:
> > So far we didn't guarantee 16-byte alignment of the stack: While (so
> > far) we don't tell the compiler to use smaller alignment, we also don't
> > guarantee 16-byte alignment when establishing stack pointers for new
> > vCPU-s. Runtime service functions using SSE instructions may end with
> > #GP(0) without that.
> >
> > Note that -mpreferred-stack-boundary=3 can be used only from gcc 4.8
> > onwards, and -mincoming-stack-boundary=3 only from 5.3 onwards. It is
> > for that reason that an alternative approach (using higher than
> > necessary alignment) is being used when building with such older
> > compilers.
> >
> > Furthermore we should avoid #MF to be raised on the FLDCW we do.
> >
> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Applied.
>>> On 12.11.16 at 07:48, <wei.liu2@citrix.com> wrote:
> On Fri, Nov 11, 2016 at 03:39:26PM +0000, Andrew Cooper wrote:
>> On 10/11/16 16:06, Jan Beulich wrote:
>> > So far we didn't guarantee 16-byte alignment of the stack: While (so
>> > far) we don't tell the compiler to use smaller alignment, we also don't
>> > guarantee 16-byte alignment when establishing stack pointers for new
>> > vCPU-s. Runtime service functions using SSE instructions may end with
>> > #GP(0) without that.
>> >
>> > Note that -mpreferred-stack-boundary=3 can be used only from gcc 4.8
>> > onwards, and -mincoming-stack-boundary=3 only from 5.3 onwards. It is
>> > for that reason that an alternative approach (using higher than
>> > necessary alignment) is being used when building with such older
>> > compilers.
>> >
>> > Furthermore we should avoid #MF to be raised on the FLDCW we do.
>> >
>> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>
>> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> Applied.

I have to withdraw this patch (and hence revert it) - it has both an
active and a latent thinko/bug:

The active one is that forcing stack alignment in efi_rs_enter() is
completely pointless. We want its callers to have an aligned stack.

The latent one is that with -mpreferred-stack-boundary=3 the compiler
is free to align the calling function's stack, but allocate an odd
number of longs on the stack, so that the called function would still
receive a misaligned stack.

The conclusion is that we shouldn't use -mpreferred-stack-boundary=3,
yet using -mincoming-stack-boundary=3 alone would mean all functions
in runtime.c would get their stack aligned. That might be acceptable,
but is wasteful.
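The latent problem comes down to simple arithmetic, which the following minimal sketch illustrates. It is not Xen code: `sp_before_call` and `callee_sees_16byte_alignment` are hypothetical helpers, and the model assumes a caller whose frame base is 16-byte aligned and which then reserves some number of 8-byte slots (longs) before issuing the call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: stack pointer value right before a call, after the
 * caller reserved `nlongs` 8-byte slots below a 16-byte aligned `base`.
 * The stack grows downwards, so reserving space subtracts from the pointer.
 */
static uint64_t sp_before_call(uint64_t base, unsigned int nlongs)
{
    assert(base % 16 == 0);              /* caller's own frame is aligned */
    return base - 8u * (uint64_t)nlongs;
}

/* True if the stack pointer handed to the callee is 16-byte aligned. */
static bool callee_sees_16byte_alignment(uint64_t sp)
{
    return sp % 16 == 0;
}
```

With an even slot count the callee still sees a 16-byte aligned pointer; with an odd count it is off by 8, which is the situation an 8-byte preferred stack boundary permits.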
I think universally going the route of forcing larger than necessary
alignment (as done by the broken patch just for older gcc) is the
better route, albeit I think I should really check that all gcc
versions usable for building the EFI parts actually honor the
alignment (ISTR that very old gcc doesn't).

The alternative of always forcing an aligned stack would seem to be
quite a bit more intrusive a change, due to struct cpu_user_regs (and
the part of it actually covered by get_stack_bottom()) not being a
multiple of 16 in size. But I'll check more closely whether this
might also be a viable route.

Jan
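The "larger than necessary alignment" route can be sketched outside of Xen roughly as follows. This is an illustration only: `FORCE_ALIGN_DEMO` and `aligned_local_misalignment` are hypothetical names, the value 32 is borrowed from the reverted patch, and `__attribute__((aligned))` on a local variable is a GNU extension that, as noted above, very old gcc may not honor. The function reports the misalignment of an over-aligned local, so whether a given compiler honors the request can be checked at run time.

```c
#include <stdint.h>

/* Illustrative alignment; the reverted patch used 32 for older compilers. */
#define FORCE_ALIGN_DEMO 32

/*
 * Returns the address of an over-aligned local modulo the requested
 * alignment. A compiler that honors the attribute realigns the frame,
 * so the result is 0.
 */
uintptr_t aligned_local_misalignment(void)
{
    unsigned char __attribute__((aligned(FORCE_ALIGN_DEMO)))
        marker[FORCE_ALIGN_DEMO];
    /* Go through a volatile so the remainder is computed at run time
     * instead of being folded from the declared alignment. */
    volatile uintptr_t addr = (uintptr_t)marker;

    return addr % FORCE_ALIGN_DEMO;
}
```

A non-zero return value would indicate a compiler that ignores the requested alignment, which is the case that would need checking before relying on this approach.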
--- a/xen/arch/x86/efi/Makefile
+++ b/xen/arch/x86/efi/Makefile
@@ -14,5 +14,10 @@ extra-$(efi) += boot.init.o relocs-dummy
 %.o: %.ihex
 	$(OBJCOPY) -I ihex -O binary $< $@
 
+cc-runtime.o := $(CC) -mno-sse
+$(call cc-option-add,cflags-runtime.o,cc-runtime.o,-mpreferred-stack-boundary=3)
+$(call cc-option-add,cflags-runtime.o,cc-runtime.o,-mincoming-stack-boundary=3)
+runtime.o: CFLAGS += $(cflags-runtime.o)
+
 stub.o: $(extra-y)
 nogcov-$(efi) += stub.o
--- a/xen/common/efi/runtime.c
+++ b/xen/common/efi/runtime.c
@@ -59,12 +59,26 @@ unsigned long efi_rs_enter(void)
     static const u16 fcw = FCW_DEFAULT;
     static const u32 mxcsr = MXCSR_DEFAULT;
     unsigned long cr3 = read_cr3();
+#if __GNUC__ < 5 || (__GNUC__ == 5 && __GNUC_MINOR__ < 3)
+/*
+ * -mpreferred-stack-boundary=3 can be used only from gcc 4.8 onwards,
+ * and -mincoming-stack-boundary=3 only from 5.3 onwards. Therefore higher
+ * than necessary alignment is being forced here in that case.
+ */
+# define FORCE_ALIGN 32
+#else
+# define FORCE_ALIGN 16
+#endif
+    unsigned long __aligned(FORCE_ALIGN) placeholder[0];
+#undef FORCE_ALIGN
+
+    asm volatile("" : "+m" (placeholder));
 
     if ( !efi_l4_pgtable )
         return 0;
 
     save_fpu_enable();
-    asm volatile ( "fldcw %0" :: "m" (fcw) );
+    asm volatile ( "fnclex; fldcw %0" :: "m" (fcw) );
     asm volatile ( "ldmxcsr %0" :: "m" (mxcsr) );
 
     spin_lock(&efi_rs_lock);