Message ID | 20211026141420.17138-5-jiangshanlai@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | None | expand |
On Tue, Oct 26, 2021 at 10:13:34PM +0800, Lai Jiangshan wrote: > From: Lai Jiangshan <laijs@linux.alibaba.com> > > While in the native case, PER_CPU_VAR(cpu_tss_rw + TSS_sp0) is the > trampoline stack. But XEN pv doesn't use trampoline stack, so > PER_CPU_VAR(cpu_tss_rw + TSS_sp0) is also the kernel stack. Hence source > and destination stacks are identical in that case, which means reusing > swapgs_restore_regs_and_return_to_usermode() in XEN pv would cause %rsp > to move up to the top of the kernel stack and leave the IRET frame below > %rsp, which is dangerous to be corrupted if #NMI / #MC hit as either of > these events occurring in the middle of the stack pushing would clobber > data on the (original) stack. > > And swapgs_restore_regs_and_return_to_usermode() pushing the IRET frame > on to the original address is useless and error-prone when there is any > future attempt to modify the code. > > Fixes: 7f2590a110b8 ("x86/entry/64: Use a per-CPU trampoline stack for IDT entries") > Cc: Jan Beulich <jbeulich@suse.com> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Juergen Gross <jgross@suse.com> > Cc: Peter Anvin <hpa@zytor.com> > Cc: xen-devel@lists.xenproject.org> > Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> > --- > arch/x86/entry/entry_64.S | 9 ++++++--- > arch/x86/entry/entry_64_compat.S | 7 ++++--- > arch/x86/xen/xen-asm.S | 27 +++++++++++++++++++++++++++ > 3 files changed, 37 insertions(+), 6 deletions(-) > > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S > index 9d468c8877e2..0dde5a253dda 100644 > --- a/arch/x86/entry/entry_64.S > +++ b/arch/x86/entry/entry_64.S > @@ -119,7 +119,7 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL) > * In the Xen PV case we must use iret anyway. > */ > > - ALTERNATIVE "", "jmp swapgs_restore_regs_and_return_to_usermode", \ > + ALTERNATIVE "", "jmp xenpv_restore_regs_and_return_to_usermode", \ Instead of sprinkling all those ALTERNATIVE calls everywhere, why don't you simply jump to the xenpv-one at the swapgs_restore_regs_and_return_to_usermode label itself and have a single ALTERNATIVE there?
On 2021/11/2 16:58, Borislav Petkov wrote: >> */ >> >> - ALTERNATIVE "", "jmp swapgs_restore_regs_and_return_to_usermode", \ >> + ALTERNATIVE "", "jmp xenpv_restore_regs_and_return_to_usermode", \ > > Instead of sprinkling all those ALTERNATIVE calls everywhere, > why don't you simply jump to the xenpv-one at the > swapgs_restore_regs_and_return_to_usermode label itself and have a > single ALTERNATIVE there? > It will add a 5-byte NOP at the beginning of the native swapgs_restore_regs_and_return_to_usermode. I avoided adding unneeded code in the native code even if it is NOPs and avoided melting xenpv-one into the native one which will reduce the code readability. I will follow your preference since a 5-byte NOP is so negligible in the slow path with an iret instruction. Or other option that adds macros to wrap the ALTERNATIVE. RESTORE_REGS_AND_RETURN_TO_USERMODE and COND_RESTORE_REGS_AND_RETURN_TO_USERMODE (test %eax before jmp in native case)
On Tue, Nov 02, 2021 at 05:19:46PM +0800, Lai Jiangshan wrote: > It will add a 5-byte NOP at the beginning of the native > swapgs_restore_regs_and_return_to_usermode. So? > I avoided adding unneeded code in the native code even if it is NOPs > and avoided melting xenpv-one into the native one which will reduce > the code readability. How does this reduce code readability?! diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index e38a4cf795d9..bf1de54a1fca 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -567,6 +567,10 @@ __irqentry_text_end: SYM_CODE_START_LOCAL(common_interrupt_return) SYM_INNER_LABEL(swapgs_restore_regs_and_return_to_usermode, SYM_L_GLOBAL) + + ALTERNATIVE "", "jmp xenpv_restore_regs_and_return_to_usermode", \ + X86_FEATURE_XENPV + #ifdef CONFIG_DEBUG_ENTRY /* Assert that pt_regs indicates user mode. */ testb $3, CS(%rsp) > I will follow your preference since a 5-byte NOP is so negligible in the slow > path with an iret instruction. Yes, we do already gazillion things on those entry and exit paths. > Or other option that adds macros to wrap the ALTERNATIVE. > RESTORE_REGS_AND_RETURN_TO_USERMODE and > COND_RESTORE_REGS_AND_RETURN_TO_USERMODE (test %eax before jmp in native case) No, the main goal is to keep the asm code as readable and as simple as possible. If macros or whatever need to be added, there better be a good reason for them. Saving a NOP is not one of them. Thx.
On November 2, 2021 10:49:44 AM GMT+01:00, Borislav Petkov <bp@alien8.de> wrote: >On Tue, Nov 02, 2021 at 05:19:46PM +0800, Lai Jiangshan wrote: >> It will add a 5-byte NOP at the beginning of the native >> swapgs_restore_regs_and_return_to_usermode. > >So? > It would be interesting to have an "override function with jmp" alternatives macro. It doesn't require any changes to the alternatives mechanism proper (but possibly to objtool): it would just insert an alternatives entry without adding any code including nops to the main path. It would of course only be applicable to a jmp, so a syntax like OVERRIDE_JMP feature, target rather than open-coding the instruction would probably be a good idea. That would reduce the trade-off to zero. >> I avoided adding unneeded code in the native code even if it is NOPs >> and avoided melting xenpv-one into the native one which will reduce >> the code readability. > >How does this reduce code readability?! > >diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S >index e38a4cf795d9..bf1de54a1fca 100644 >--- a/arch/x86/entry/entry_64.S >+++ b/arch/x86/entry/entry_64.S >@@ -567,6 +567,10 @@ __irqentry_text_end: > > SYM_CODE_START_LOCAL(common_interrupt_return) > SYM_INNER_LABEL(swapgs_restore_regs_and_return_to_usermode, SYM_L_GLOBAL) >+ >+ ALTERNATIVE "", "jmp xenpv_restore_regs_and_return_to_usermode", \ >+ X86_FEATURE_XENPV >+ > #ifdef CONFIG_DEBUG_ENTRY > /* Assert that pt_regs indicates user mode. */ > testb $3, CS(%rsp) > >> I will follow your preference since a 5-byte NOP is so negligible in the slow >> path with an iret instruction. > >Yes, we do already gazillion things on those entry and exit paths. > >> Or other option that adds macros to wrap the ALTERNATIVE. >> RESTORE_REGS_AND_RETURN_TO_USERMODE and >> COND_RESTORE_REGS_AND_RETURN_TO_USERMODE (test %eax before jmp in native case) > >No, the main goal is to keep the asm code as readable and as simple as >possible. > >If macros or whatever need to be added, there better be a good reason >for them. Saving a NOP is not one of them. > >Thx. >
On Tue, Nov 02, 2021 at 12:22:50PM +0100, H. Peter Anvin wrote: > It would be interesting to have an "override function with jmp" > alternatives macro. It doesn't require any changes to the alternatives > mechanism proper (but possibly to objtool): it would just insert an > alternatives entry without adding any code including nops to the main > path. It would of course only be applicable to a jmp, so a syntax like > OVERRIDE_JMP feature, target rather than open-coding the instruction > would probably be a good idea. I think you wanna say ALTERNATIVE_JMP here seeing how we have ALTERNATIVE_CALL already :) As to marking it properly, we can finally add that struct alt_instr.flags thing we have been trying to add for years now. /me adds it to his evergrowing todo. If anyone beats /me to it, /me will gladly have a look at it.
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 9d468c8877e2..0dde5a253dda 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -119,7 +119,7 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL) * In the Xen PV case we must use iret anyway. */ - ALTERNATIVE "", "jmp swapgs_restore_regs_and_return_to_usermode", \ + ALTERNATIVE "", "jmp xenpv_restore_regs_and_return_to_usermode", \ X86_FEATURE_XENPV movq RCX(%rsp), %rcx @@ -286,7 +286,8 @@ SYM_CODE_START(ret_from_fork) UNWIND_HINT_REGS movq %rsp, %rdi call syscall_exit_to_user_mode /* returns with IRQs disabled */ - jmp swapgs_restore_regs_and_return_to_usermode + ALTERNATIVE "jmp swapgs_restore_regs_and_return_to_usermode", \ + "jmp xenpv_restore_regs_and_return_to_usermode", X86_FEATURE_XENPV 1: /* kernel thread */ @@ -566,6 +567,7 @@ __irqentry_text_start: __irqentry_text_end: SYM_CODE_START_LOCAL(common_interrupt_return) +SYM_INNER_LABEL(xenpv_restore_regs_and_return_to_usermode, SYM_L_WEAK) /* placeholder */ SYM_INNER_LABEL(swapgs_restore_regs_and_return_to_usermode, SYM_L_GLOBAL) #ifdef CONFIG_DEBUG_ENTRY /* Assert that pt_regs indicates user mode. */ @@ -1055,7 +1057,8 @@ SYM_CODE_START_LOCAL(error_return) DEBUG_ENTRY_ASSERT_IRQS_OFF testb $3, CS(%rsp) jz restore_regs_and_return_to_kernel - jmp swapgs_restore_regs_and_return_to_usermode + ALTERNATIVE "jmp swapgs_restore_regs_and_return_to_usermode", \ + "jmp xenpv_restore_regs_and_return_to_usermode", X86_FEATURE_XENPV SYM_CODE_END(error_return) /* diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S index 0051cf5c792d..2a4d9532dfd5 100644 --- a/arch/x86/entry/entry_64_compat.S +++ b/arch/x86/entry/entry_64_compat.S @@ -139,7 +139,7 @@ SYM_INNER_LABEL(entry_SYSENTER_compat_after_hwframe, SYM_L_GLOBAL) call do_SYSENTER_32 /* XEN PV guests always use IRET path */ ALTERNATIVE "testl %eax, %eax; jz swapgs_restore_regs_and_return_to_usermode", \ - "jmp swapgs_restore_regs_and_return_to_usermode", X86_FEATURE_XENPV + "jmp xenpv_restore_regs_and_return_to_usermode", X86_FEATURE_XENPV jmp sysret32_from_system_call .Lsysenter_fix_flags: @@ -256,7 +256,7 @@ SYM_INNER_LABEL(entry_SYSCALL_compat_after_hwframe, SYM_L_GLOBAL) call do_fast_syscall_32 /* XEN PV guests always use IRET path */ ALTERNATIVE "testl %eax, %eax; jz swapgs_restore_regs_and_return_to_usermode", \ - "jmp swapgs_restore_regs_and_return_to_usermode", X86_FEATURE_XENPV + "jmp xenpv_restore_regs_and_return_to_usermode", X86_FEATURE_XENPV /* Opportunistic SYSRET */ sysret32_from_system_call: @@ -411,5 +411,6 @@ SYM_CODE_START(entry_INT80_compat) movq %rsp, %rdi call do_int80_syscall_32 - jmp swapgs_restore_regs_and_return_to_usermode + ALTERNATIVE "jmp swapgs_restore_regs_and_return_to_usermode", \ + "jmp xenpv_restore_regs_and_return_to_usermode", X86_FEATURE_XENPV SYM_CODE_END(entry_INT80_compat) diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S index 220dd9678494..032be1bab113 100644 --- a/arch/x86/xen/xen-asm.S +++ b/arch/x86/xen/xen-asm.S @@ -20,6 +20,7 @@ #include <linux/init.h> #include <linux/linkage.h> +#include <../entry/calling.h> .pushsection .noinstr.text, "ax" /* @@ -192,6 +193,32 @@ SYM_CODE_START(xen_iret) jmp hypercall_iret SYM_CODE_END(xen_iret) +/* + * XEN pv doesn't use trampoline stack, PER_CPU_VAR(cpu_tss_rw + TSS_sp0) is + * also the kernel stack. Reusing swapgs_restore_regs_and_return_to_usermode() + * in XEN pv would cause %rsp to move up to the top of the kernel stack and + * leave the IRET frame below %rsp, which is dangerous to be corrupted if #NMI + * interrupts. And swapgs_restore_regs_and_return_to_usermode() pushing the IRET + * frame at the same address is useless. + */ +SYM_CODE_START(xenpv_restore_regs_and_return_to_usermode) + UNWIND_HINT_REGS +#ifdef CONFIG_DEBUG_ENTRY + /* Assert that pt_regs indicates user mode. */ + testb $3, CS(%rsp) + jnz 1f + ud2 +1: +#endif + POP_REGS + + /* stackleak_erase() can work safely on the kernel stack. */ + STACKLEAK_ERASE_NOCLOBBER + + addq $8, %rsp /* skip regs->orig_ax */ + jmp xen_iret +SYM_CODE_END(xenpv_restore_regs_and_return_to_usermode) + /* * Xen handles syscall callbacks much like ordinary exceptions, which * means we have: