Message ID | 20200918192312.25978-9-yu-cheng.yu@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Control-flow Enforcement: Indirect Branch Tracking | expand |
On 9/18/20 12:23 PM, Yu-cheng Yu wrote: > Emulation of the legacy vsyscall page is required by some programs > built before 2013. Newer programs after 2013 don't use it. > Disable vsyscall emulation when Control-flow Enforcement (CET) is > enabled to enhance security. How does this "enhance security"? What is the connection between vsyscall emulation and CET?
On Fri 2020-09-18 12:32:57, Dave Hansen wrote: > On 9/18/20 12:23 PM, Yu-cheng Yu wrote: > > Emulation of the legacy vsyscall page is required by some programs > > built before 2013. Newer programs after 2013 don't use it. > > Disable vsyscall emulation when Control-flow Enforcement (CET) is > > enabled to enhance security. > > How does this "enhance security"? > > What is the connection between vsyscall emulation and CET? Boom. We don't break compatibility by default, and you should not tell people to enable CET by default if you plan to do this. Best regards, Pavel
On Fri, Sep 18, 2020 at 2:00 PM Pavel Machek <pavel@ucw.cz> wrote: > > On Fri 2020-09-18 12:32:57, Dave Hansen wrote: > > On 9/18/20 12:23 PM, Yu-cheng Yu wrote: > > > Emulation of the legacy vsyscall page is required by some programs > > > built before 2013. Newer programs after 2013 don't use it. > > > Disable vsyscall emulation when Control-flow Enforcement (CET) is > > > enabled to enhance security. > > > > How does this "enhance security"? > > > > What is the connection between vsyscall emulation and CET? > > Boom. > > We don't break compatibility by default, and you should not tell > people to enable CET by default if you plan to do this. > Nothing will be broken. CET enabled applications don't use/need vsyscall emulation.
On 9/18/20 2:06 PM, H.J. Lu wrote: > On Fri, Sep 18, 2020 at 2:00 PM Pavel Machek <pavel@ucw.cz> wrote: >> On Fri 2020-09-18 12:32:57, Dave Hansen wrote: >>> On 9/18/20 12:23 PM, Yu-cheng Yu wrote: >>>> Emulation of the legacy vsyscall page is required by some programs >>>> built before 2013. Newer programs after 2013 don't use it. >>>> Disable vsyscall emulation when Control-flow Enforcement (CET) is >>>> enabled to enhance security. >>> How does this "enhance security"? >>> >>> What is the connection between vsyscall emulation and CET? >> Boom. >> >> We don't break compatibility by default, and you should not tell >> people to enable CET by default if you plan to do this. >> > Nothing will be broken. CET enabled applications don't use/need > vsyscall emulation. Hi H.J., Could you explain your logic a bit more thoroughly, please? I also suspect that Pavel was confused by your changelog where you said that you do this when "CET is enabled". Does enabled in this context mean: 1. Just CET support compiled in, or 2. Compiled in and on CET hardware, or 3. Compiled in to the kernel enabled in the app and running on CET hardware?
On 9/18/2020 2:00 PM, Pavel Machek wrote: > On Fri 2020-09-18 12:32:57, Dave Hansen wrote: >> On 9/18/20 12:23 PM, Yu-cheng Yu wrote: >>> Emulation of the legacy vsyscall page is required by some programs >>> built before 2013. Newer programs after 2013 don't use it. >>> Disable vsyscall emulation when Control-flow Enforcement (CET) is >>> enabled to enhance security. >> >> How does this "enhance security"? >> >> What is the connection between vsyscall emulation and CET? > > Boom. > > We don't break compatibility by default, and you should not tell > people to enable CET by default if you plan to do this. I would revise the wording if there is another version. What this patch does is: If an application is compiled for CET and the system supports it, then the application cannot do vsyscall emulation. Earlier we allow the emulation, and had a patch that fixes the shadow stack and endbr for the emulation code. Since newer programs mostly do no do the emulation, we changed the patch do block it when attempted. This patch would not block any legacy applications or any applications on older machines. Yu-cheng
On Fri, Sep 18, 2020 at 2:17 PM Dave Hansen <dave.hansen@intel.com> wrote: > > On 9/18/20 2:06 PM, H.J. Lu wrote: > > On Fri, Sep 18, 2020 at 2:00 PM Pavel Machek <pavel@ucw.cz> wrote: > >> On Fri 2020-09-18 12:32:57, Dave Hansen wrote: > >>> On 9/18/20 12:23 PM, Yu-cheng Yu wrote: > >>>> Emulation of the legacy vsyscall page is required by some programs > >>>> built before 2013. Newer programs after 2013 don't use it. > >>>> Disable vsyscall emulation when Control-flow Enforcement (CET) is > >>>> enabled to enhance security. > >>> How does this "enhance security"? > >>> > >>> What is the connection between vsyscall emulation and CET? > >> Boom. > >> > >> We don't break compatibility by default, and you should not tell > >> people to enable CET by default if you plan to do this. > >> > > Nothing will be broken. CET enabled applications don't use/need > > vsyscall emulation. > > Hi H.J., > > Could you explain your logic a bit more thoroughly, please? Here is my CET slides for LPC 2020: https://gitlab.com/cet-software/cet-smoke-test/-/wikis/uploads/09431a51248858e6f716a59065d732e2/CET-LPC-2020.pdf which may have answers for most questions. > I also suspect that Pavel was confused by your changelog where you said > that you do this when "CET is enabled". Does enabled in this context mean: > 1. Just CET support compiled in, or > 2. Compiled in and on CET hardware, or > 3. Compiled in to the kernel enabled in the app and running on CET > hardware? CET is enabled only in a process if 1. All components are CET enabled, and 2. CPU supports CET, and 3. Kernel supports CET.
On Fri 2020-09-18 14:21:10, Yu, Yu-cheng wrote: > On 9/18/2020 2:00 PM, Pavel Machek wrote: > > On Fri 2020-09-18 12:32:57, Dave Hansen wrote: > > > On 9/18/20 12:23 PM, Yu-cheng Yu wrote: > > > > Emulation of the legacy vsyscall page is required by some programs > > > > built before 2013. Newer programs after 2013 don't use it. > > > > Disable vsyscall emulation when Control-flow Enforcement (CET) is > > > > enabled to enhance security. > > > > > > How does this "enhance security"? > > > > > > What is the connection between vsyscall emulation and CET? > > > > Boom. > > > > We don't break compatibility by default, and you should not tell > > people to enable CET by default if you plan to do this. > > I would revise the wording if there is another version. What this patch > does is: > > If an application is compiled for CET and the system supports it, then the > application cannot do vsyscall emulation. Earlier we allow the emulation, > and had a patch that fixes the shadow stack and endbr for the emulation > code. Since newer programs mostly do no do the emulation, we changed the > patch do block it when attempted. > > This patch would not block any legacy applications or any applications on > older machines. Aha, makes sense, sorry for the noise. Best regards, Pavel
> Here is my CET slides for LPC 2020: > > https://gitlab.com/cet-software/cet-smoke-test/-/wikis/uploads/09431a51248858e6f716a59065d732e2/CET-LPC-2020.pdf > > which may have answers for most questions. Hi H.J., I know you're not super familiar with our kernel development process, which might be why you've been repeatedly referring folks to your presentations. Our efforts here on the kernel mailing list are to ensure that we have a nice standalone changelog in the kernel forever. Great as your presentation might be, it isn't a part of the changelog. I'm not asking these questions because I need them explained to me. I'm asking because I need Yu-cheng to ensure he has the answers in his changelog when the patches are merged.
On Fri, Sep 18, 2020 at 12:23 PM Yu-cheng Yu <yu-cheng.yu@intel.com> wrote: > > Emulation of the legacy vsyscall page is required by some programs > built before 2013. Newer programs after 2013 don't use it. > Disable vsyscall emulation when Control-flow Enforcement (CET) is > enabled to enhance security. > > Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> > --- > v12: > - Disable vsyscall emulation only when it is attempted (vs. at compile time). > > arch/x86/entry/vsyscall/vsyscall_64.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c > index 44c33103a955..3196e963e365 100644 > --- a/arch/x86/entry/vsyscall/vsyscall_64.c > +++ b/arch/x86/entry/vsyscall/vsyscall_64.c > @@ -150,6 +150,15 @@ bool emulate_vsyscall(unsigned long error_code, > > WARN_ON_ONCE(address != regs->ip); > > +#ifdef CONFIG_X86_INTEL_CET > + if (current->thread.cet.shstk_size || > + current->thread.cet.ibt_enabled) { > + warn_bad_vsyscall(KERN_INFO, regs, > + "vsyscall attempted with cet enabled"); > + return false; > + } Nope, try again. Having IBT on does *not* mean that every library in the process knows that we have indirect branch tracking. The legacy bitmap exists for a reason. Also, I want a way to flag programs as not using the vsyscall page, but that flag should not be called CET. And a process with vsyscalls off should not be able to read the vsyscall page, and /proc/self/maps should be correct. So you have some choices: 1. Drop this patch and make it work. 2. Add a real per-process vsyscall control. Either make it depend on vsyscall=xonly and wire it up correctly or actually make it work correctly with vsyscall=emulate. NAK to any hacks in this space. Do it right or don't do it at all.
On 9/18/2020 5:11 PM, Andy Lutomirski wrote: > On Fri, Sep 18, 2020 at 12:23 PM Yu-cheng Yu <yu-cheng.yu@intel.com> wrote: >> >> Emulation of the legacy vsyscall page is required by some programs >> built before 2013. Newer programs after 2013 don't use it. >> Disable vsyscall emulation when Control-flow Enforcement (CET) is >> enabled to enhance security. >> >> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> >> --- >> v12: >> - Disable vsyscall emulation only when it is attempted (vs. at compile time). >> >> arch/x86/entry/vsyscall/vsyscall_64.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c >> index 44c33103a955..3196e963e365 100644 >> --- a/arch/x86/entry/vsyscall/vsyscall_64.c >> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c >> @@ -150,6 +150,15 @@ bool emulate_vsyscall(unsigned long error_code, >> >> WARN_ON_ONCE(address != regs->ip); >> >> +#ifdef CONFIG_X86_INTEL_CET >> + if (current->thread.cet.shstk_size || >> + current->thread.cet.ibt_enabled) { >> + warn_bad_vsyscall(KERN_INFO, regs, >> + "vsyscall attempted with cet enabled"); >> + return false; >> + } > > Nope, try again. Having IBT on does *not* mean that every library in > the process knows that we have indirect branch tracking. The legacy > bitmap exists for a reason. Also, I want a way to flag programs as > not using the vsyscall page, but that flag should not be called CET. > And a process with vsyscalls off should not be able to read the > vsyscall page, and /proc/self/maps should be correct. > > So you have some choices: > > 1. Drop this patch and make it work. > > 2. Add a real per-process vsyscall control. Either make it depend on > vsyscall=xonly and wire it up correctly or actually make it work > correctly with vsyscall=emulate. > > NAK to any hacks in this space. Do it right or don't do it at all. > We can drop this patch, and bring back the previous patch that fixes up shadow stack and ibt. That makes vsyscall emulation work correctly, and does not force the application to do anything different from what is working now. I will post the previous patch as a reply to this thread so that people can make comments on it. Yu-cheng
On Mon, 2020-09-21 at 09:22 -0700, Yu, Yu-cheng wrote: > On 9/18/2020 5:11 PM, Andy Lutomirski wrote: > > On Fri, Sep 18, 2020 at 12:23 PM Yu-cheng Yu <yu-cheng.yu@intel.com> wrote: > > > Emulation of the legacy vsyscall page is required by some programs > > > built before 2013. Newer programs after 2013 don't use it. > > > Disable vsyscall emulation when Control-flow Enforcement (CET) is > > > enabled to enhance security. > > > > > > Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> [...] > > > > Nope, try again. Having IBT on does *not* mean that every library in > > the process knows that we have indirect branch tracking. The legacy > > bitmap exists for a reason. Also, I want a way to flag programs as > > not using the vsyscall page, but that flag should not be called CET. > > And a process with vsyscalls off should not be able to read the > > vsyscall page, and /proc/self/maps should be correct. > > > > So you have some choices: > > > > 1. Drop this patch and make it work. > > > > 2. Add a real per-process vsyscall control. Either make it depend on > > vsyscall=xonly and wire it up correctly or actually make it work > > correctly with vsyscall=emulate. > > > > NAK to any hacks in this space. Do it right or don't do it at all. > > > > We can drop this patch, and bring back the previous patch that fixes up > shadow stack and ibt. That makes vsyscall emulation work correctly, and > does not force the application to do anything different from what is > working now. I will post the previous patch as a reply to this thread > so that people can make comments on it. > > Yu-cheng Here is the patch: ------ From dfdee39c795ee5dcee2c77f6ba344a61f4d8124b Mon Sep 17 00:00:00 2001 From: Yu-cheng Yu <yu-cheng.yu@intel.com> Date: Thu, 29 Nov 2018 14:15:38 -0800 Subject: [PATCH 34/43] x86/vsyscall/64: Fixup Shadow Stack and Indirect Branch Tracking for vsyscall emulation Vsyscall entry points are effectively branch targets. Mark them with ENDBR64 opcodes. When emulating the RET instruction, unwind the shadow stack and reset IBT state machine. Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> --- arch/x86/entry/vsyscall/vsyscall_64.c | 29 +++++++++++++++++++++++ arch/x86/entry/vsyscall/vsyscall_emu_64.S | 9 +++++++ arch/x86/entry/vsyscall/vsyscall_trace.h | 1 + 3 files changed, 39 insertions(+) diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c index 44c33103a955..0131c9f7f9c5 100644 --- a/arch/x86/entry/vsyscall/vsyscall_64.c +++ b/arch/x86/entry/vsyscall/vsyscall_64.c @@ -38,6 +38,9 @@ #include <asm/fixmap.h> #include <asm/traps.h> #include <asm/paravirt.h> +#include <asm/fpu/xstate.h> +#include <asm/fpu/types.h> +#include <asm/fpu/internal.h> #define CREATE_TRACE_POINTS #include "vsyscall_trace.h" @@ -286,6 +289,32 @@ bool emulate_vsyscall(unsigned long error_code, /* Emulate a ret instruction. */ regs->ip = caller; regs->sp += 8; + + if (current->thread.cet.shstk_size || + current->thread.cet.ibt_enabled) { + u64 r; + + fpregs_lock(); + if (test_thread_flag(TIF_NEED_FPU_LOAD)) + __fpregs_load_activate(); + +#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER + /* Fixup branch tracking */ + if (current->thread.cet.ibt_enabled) { + rdmsrl(MSR_IA32_U_CET, r); + wrmsrl(MSR_IA32_U_CET, r & ~CET_WAIT_ENDBR); + } +#endif + +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER + /* Unwind shadow stack. */ + if (current->thread.cet.shstk_size) { + rdmsrl(MSR_IA32_PL3_SSP, r); + wrmsrl(MSR_IA32_PL3_SSP, r + 8); + } +#endif + fpregs_unlock(); + } return true; sigsegv: diff --git a/arch/x86/entry/vsyscall/vsyscall_emu_64.S b/arch/x86/entry/vsyscall/vsyscall_emu_64.S index 2e203f3a25a7..040696333457 100644 --- a/arch/x86/entry/vsyscall/vsyscall_emu_64.S +++ b/arch/x86/entry/vsyscall/vsyscall_emu_64.S @@ -17,16 +17,25 @@ __PAGE_ALIGNED_DATA .type __vsyscall_page, @object __vsyscall_page: +#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER + endbr64 +#endif mov $__NR_gettimeofday, %rax syscall ret .balign 1024, 0xcc +#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER + endbr64 +#endif mov $__NR_time, %rax syscall ret .balign 1024, 0xcc +#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER + endbr64 +#endif mov $__NR_getcpu, %rax syscall ret diff --git a/arch/x86/entry/vsyscall/vsyscall_trace.h b/arch/x86/entry/vsyscall/vsyscall_trace.h index 3c3f9765a85c..7aa2101ada44 100644 --- a/arch/x86/entry/vsyscall/vsyscall_trace.h +++ b/arch/x86/entry/vsyscall/vsyscall_trace.h @@ -25,6 +25,7 @@ TRACE_EVENT(emulate_vsyscall, #endif #undef TRACE_INCLUDE_PATH +#undef TRACE_INCLUDE_FILE #define TRACE_INCLUDE_PATH ../../arch/x86/entry/vsyscall/ #define TRACE_INCLUDE_FILE vsyscall_trace #include <trace/define_trace.h>
On Mon, Sep 21, 2020 at 3:37 PM Yu-cheng Yu <yu-cheng.yu@intel.com> wrote: > > On Mon, 2020-09-21 at 09:22 -0700, Yu, Yu-cheng wrote: > > On 9/18/2020 5:11 PM, Andy Lutomirski wrote: > > > On Fri, Sep 18, 2020 at 12:23 PM Yu-cheng Yu <yu-cheng.yu@intel.com> wrote: > > > > Emulation of the legacy vsyscall page is required by some programs > > > > built before 2013. Newer programs after 2013 don't use it. > > > > Disable vsyscall emulation when Control-flow Enforcement (CET) is > > > > enabled to enhance security. > > > > > > > > Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> > [...] > > > > > > Nope, try again. Having IBT on does *not* mean that every library in > > > the process knows that we have indirect branch tracking. The legacy > > > bitmap exists for a reason. Also, I want a way to flag programs as > > > not using the vsyscall page, but that flag should not be called CET. > > > And a process with vsyscalls off should not be able to read the > > > vsyscall page, and /proc/self/maps should be correct. > > > > > > So you have some choices: > > > > > > 1. Drop this patch and make it work. > > > > > > 2. Add a real per-process vsyscall control. Either make it depend on > > > vsyscall=xonly and wire it up correctly or actually make it work > > > correctly with vsyscall=emulate. > > > > > > NAK to any hacks in this space. Do it right or don't do it at all. > > > > > > > We can drop this patch, and bring back the previous patch that fixes up > > shadow stack and ibt. That makes vsyscall emulation work correctly, and > > does not force the application to do anything different from what is > > working now. I will post the previous patch as a reply to this thread > > so that people can make comments on it. > > > > Yu-cheng > > Here is the patch: > > ------ > > From dfdee39c795ee5dcee2c77f6ba344a61f4d8124b Mon Sep 17 00:00:00 2001 > From: Yu-cheng Yu <yu-cheng.yu@intel.com> > Date: Thu, 29 Nov 2018 14:15:38 -0800 > Subject: [PATCH 34/43] x86/vsyscall/64: Fixup Shadow Stack and Indirect Branch > Tracking for vsyscall emulation > > Vsyscall entry points are effectively branch targets. Mark them with > ENDBR64 opcodes. When emulating the RET instruction, unwind the shadow > stack and reset IBT state machine. > > Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> > --- > arch/x86/entry/vsyscall/vsyscall_64.c | 29 +++++++++++++++++++++++ > arch/x86/entry/vsyscall/vsyscall_emu_64.S | 9 +++++++ > arch/x86/entry/vsyscall/vsyscall_trace.h | 1 + > 3 files changed, 39 insertions(+) > > diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c > b/arch/x86/entry/vsyscall/vsyscall_64.c > index 44c33103a955..0131c9f7f9c5 100644 > --- a/arch/x86/entry/vsyscall/vsyscall_64.c > +++ b/arch/x86/entry/vsyscall/vsyscall_64.c > @@ -38,6 +38,9 @@ > #include <asm/fixmap.h> > #include <asm/traps.h> > #include <asm/paravirt.h> > +#include <asm/fpu/xstate.h> > +#include <asm/fpu/types.h> > +#include <asm/fpu/internal.h> > > #define CREATE_TRACE_POINTS > #include "vsyscall_trace.h" > @@ -286,6 +289,32 @@ bool emulate_vsyscall(unsigned long error_code, > /* Emulate a ret instruction. */ > regs->ip = caller; > regs->sp += 8; > + > + if (current->thread.cet.shstk_size || > + current->thread.cet.ibt_enabled) { > + u64 r; > + > + fpregs_lock(); > + if (test_thread_flag(TIF_NEED_FPU_LOAD)) > + __fpregs_load_activate(); Wouldn't this be nicer if you operated on the memory image, not the registers? > + > +#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER > + /* Fixup branch tracking */ > + if (current->thread.cet.ibt_enabled) { > + rdmsrl(MSR_IA32_U_CET, r); > + wrmsrl(MSR_IA32_U_CET, r & ~CET_WAIT_ENDBR); > + } > +#endif Seems reasonable on first glance. > + > +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER > + /* Unwind shadow stack. */ > + if (current->thread.cet.shstk_size) { > + rdmsrl(MSR_IA32_PL3_SSP, r); > + wrmsrl(MSR_IA32_PL3_SSP, r + 8); > + } > +#endif What happens if the result is noncanonical? A quick skim of the SDM didn't find anything. This latter issue goes away if you operate on the memory image, though -- writing a bogus value is just fine, since the FP restore will handle it.
On 9/21/2020 4:48 PM, Andy Lutomirski wrote: > On Mon, Sep 21, 2020 at 3:37 PM Yu-cheng Yu <yu-cheng.yu@intel.com> wrote: >> >> On Mon, 2020-09-21 at 09:22 -0700, Yu, Yu-cheng wrote: [...] >> >> Here is the patch: >> >> ------ >> >> From dfdee39c795ee5dcee2c77f6ba344a61f4d8124b Mon Sep 17 00:00:00 2001 >> From: Yu-cheng Yu <yu-cheng.yu@intel.com> >> Date: Thu, 29 Nov 2018 14:15:38 -0800 >> Subject: [PATCH 34/43] x86/vsyscall/64: Fixup Shadow Stack and Indirect Branch >> Tracking for vsyscall emulation >> >> Vsyscall entry points are effectively branch targets. Mark them with >> ENDBR64 opcodes. When emulating the RET instruction, unwind the shadow >> stack and reset IBT state machine. >> >> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> >> --- >> arch/x86/entry/vsyscall/vsyscall_64.c | 29 +++++++++++++++++++++++ >> arch/x86/entry/vsyscall/vsyscall_emu_64.S | 9 +++++++ >> arch/x86/entry/vsyscall/vsyscall_trace.h | 1 + >> 3 files changed, 39 insertions(+) >> >> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c >> b/arch/x86/entry/vsyscall/vsyscall_64.c >> index 44c33103a955..0131c9f7f9c5 100644 >> --- a/arch/x86/entry/vsyscall/vsyscall_64.c >> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c >> @@ -38,6 +38,9 @@ >> #include <asm/fixmap.h> >> #include <asm/traps.h> >> #include <asm/paravirt.h> >> +#include <asm/fpu/xstate.h> >> +#include <asm/fpu/types.h> >> +#include <asm/fpu/internal.h> >> >> #define CREATE_TRACE_POINTS >> #include "vsyscall_trace.h" >> @@ -286,6 +289,32 @@ bool emulate_vsyscall(unsigned long error_code, >> /* Emulate a ret instruction. */ >> regs->ip = caller; >> regs->sp += 8; >> + >> + if (current->thread.cet.shstk_size || >> + current->thread.cet.ibt_enabled) { >> + u64 r; >> + >> + fpregs_lock(); >> + if (test_thread_flag(TIF_NEED_FPU_LOAD)) >> + __fpregs_load_activate(); > > Wouldn't this be nicer if you operated on the memory image, not the registers? Do you mean writing to the XSAVES area? > >> + >> +#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER >> + /* Fixup branch tracking */ >> + if (current->thread.cet.ibt_enabled) { >> + rdmsrl(MSR_IA32_U_CET, r); >> + wrmsrl(MSR_IA32_U_CET, r & ~CET_WAIT_ENDBR); >> + } >> +#endif > > Seems reasonable on first glance. > >> + >> +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER >> + /* Unwind shadow stack. */ >> + if (current->thread.cet.shstk_size) { >> + rdmsrl(MSR_IA32_PL3_SSP, r); >> + wrmsrl(MSR_IA32_PL3_SSP, r + 8); >> + } >> +#endif > > What happens if the result is noncanonical? A quick skim of the SDM > didn't find anything. This latter issue goes away if you operate on > the memory image, though -- writing a bogus value is just fine, since > the FP restore will handle it. > At this point, the MSR's value can still be valid or is already saved to memory. If we are going to write to memory, then the MSR must be saved first. So I chose to do __fpregs_load_activate() and write the MSR. Maybe we can check the address before writing it to the MSR? Yu-cheng
On Mon, Sep 21, 2020 at 04:48:25PM -0700, Andy Lutomirski wrote: > On Mon, Sep 21, 2020 at 3:37 PM Yu-cheng Yu <yu-cheng.yu@intel.com> wrote: > > diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c > > b/arch/x86/entry/vsyscall/vsyscall_64.c > > index 44c33103a955..0131c9f7f9c5 100644 > > --- a/arch/x86/entry/vsyscall/vsyscall_64.c > > +++ b/arch/x86/entry/vsyscall/vsyscall_64.c > > @@ -38,6 +38,9 @@ > > #include <asm/fixmap.h> > > #include <asm/traps.h> > > #include <asm/paravirt.h> > > +#include <asm/fpu/xstate.h> > > +#include <asm/fpu/types.h> > > +#include <asm/fpu/internal.h> > > > > #define CREATE_TRACE_POINTS > > #include "vsyscall_trace.h" > > @@ -286,6 +289,32 @@ bool emulate_vsyscall(unsigned long error_code, > > /* Emulate a ret instruction. */ > > regs->ip = caller; > > regs->sp += 8; > > + > > + if (current->thread.cet.shstk_size || > > + current->thread.cet.ibt_enabled) { > > + u64 r; > > + > > + fpregs_lock(); > > + if (test_thread_flag(TIF_NEED_FPU_LOAD)) > > + __fpregs_load_activate(); > > Wouldn't this be nicer if you operated on the memory image, not the registers? > > > + > > +#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER > > + /* Fixup branch tracking */ > > + if (current->thread.cet.ibt_enabled) { > > + rdmsrl(MSR_IA32_U_CET, r); > > + wrmsrl(MSR_IA32_U_CET, r & ~CET_WAIT_ENDBR); > > + } > > +#endif > > Seems reasonable on first glance. > > > + > > +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER > > + /* Unwind shadow stack. */ > > + if (current->thread.cet.shstk_size) { > > + rdmsrl(MSR_IA32_PL3_SSP, r); > > + wrmsrl(MSR_IA32_PL3_SSP, r + 8); > > + } > > +#endif > > What happens if the result is noncanonical? A quick skim of the SDM > didn't find anything. This latter issue goes away if you operate on > the memory image, though -- writing a bogus value is just fine, since > the FP restore will handle it. #GP, the SSP MSRs do canonical checks.
On Tue, Sep 22, 2020 at 10:46 AM Yu, Yu-cheng <yu-cheng.yu@intel.com> wrote: > > On 9/21/2020 4:48 PM, Andy Lutomirski wrote: > > On Mon, Sep 21, 2020 at 3:37 PM Yu-cheng Yu <yu-cheng.yu@intel.com> wrote: > >> > >> On Mon, 2020-09-21 at 09:22 -0700, Yu, Yu-cheng wrote: > > [...] > > >> > >> Here is the patch: > >> > >> ------ > >> > >> From dfdee39c795ee5dcee2c77f6ba344a61f4d8124b Mon Sep 17 00:00:00 2001 > >> From: Yu-cheng Yu <yu-cheng.yu@intel.com> > >> Date: Thu, 29 Nov 2018 14:15:38 -0800 > >> Subject: [PATCH 34/43] x86/vsyscall/64: Fixup Shadow Stack and Indirect Branch > >> Tracking for vsyscall emulation > >> > >> Vsyscall entry points are effectively branch targets. Mark them with > >> ENDBR64 opcodes. When emulating the RET instruction, unwind the shadow > >> stack and reset IBT state machine. > >> > >> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> > >> --- > >> arch/x86/entry/vsyscall/vsyscall_64.c | 29 +++++++++++++++++++++++ > >> arch/x86/entry/vsyscall/vsyscall_emu_64.S | 9 +++++++ > >> arch/x86/entry/vsyscall/vsyscall_trace.h | 1 + > >> 3 files changed, 39 insertions(+) > >> > >> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c > >> b/arch/x86/entry/vsyscall/vsyscall_64.c > >> index 44c33103a955..0131c9f7f9c5 100644 > >> --- a/arch/x86/entry/vsyscall/vsyscall_64.c > >> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c > >> @@ -38,6 +38,9 @@ > >> #include <asm/fixmap.h> > >> #include <asm/traps.h> > >> #include <asm/paravirt.h> > >> +#include <asm/fpu/xstate.h> > >> +#include <asm/fpu/types.h> > >> +#include <asm/fpu/internal.h> > >> > >> #define CREATE_TRACE_POINTS > >> #include "vsyscall_trace.h" > >> @@ -286,6 +289,32 @@ bool emulate_vsyscall(unsigned long error_code, > >> /* Emulate a ret instruction. */ > >> regs->ip = caller; > >> regs->sp += 8; > >> + > >> + if (current->thread.cet.shstk_size || > >> + current->thread.cet.ibt_enabled) { > >> + u64 r; > >> + > >> + fpregs_lock(); > >> + if (test_thread_flag(TIF_NEED_FPU_LOAD)) > >> + __fpregs_load_activate(); > > > > Wouldn't this be nicer if you operated on the memory image, not the registers? > > Do you mean writing to the XSAVES area? Yes. > > > > >> + > >> +#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER > >> + /* Fixup branch tracking */ > >> + if (current->thread.cet.ibt_enabled) { > >> + rdmsrl(MSR_IA32_U_CET, r); > >> + wrmsrl(MSR_IA32_U_CET, r & ~CET_WAIT_ENDBR); > >> + } > >> +#endif > > > > Seems reasonable on first glance. > > > >> + > >> +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER > >> + /* Unwind shadow stack. */ > >> + if (current->thread.cet.shstk_size) { > >> + rdmsrl(MSR_IA32_PL3_SSP, r); > >> + wrmsrl(MSR_IA32_PL3_SSP, r + 8); > >> + } > >> +#endif > > > > What happens if the result is noncanonical? A quick skim of the SDM > > didn't find anything. This latter issue goes away if you operate on > > the memory image, though -- writing a bogus value is just fine, since > > the FP restore will handle it. > > > > At this point, the MSR's value can still be valid or is already saved to > memory. If we are going to write to memory, then the MSR must be saved > first. So I chose to do __fpregs_load_activate() and write the MSR. > > Maybe we can check the address before writing it to the MSR? Performance is almost irrelevant here, and the writing-to-XSAVES-area approach should have the benefit that the exception handling and signaling happens for free.
On 9/23/2020 2:29 PM, Sean Christopherson wrote: > On Mon, Sep 21, 2020 at 04:48:25PM -0700, Andy Lutomirski wrote: >> On Mon, Sep 21, 2020 at 3:37 PM Yu-cheng Yu <yu-cheng.yu@intel.com> wrote: >>> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c >>> b/arch/x86/entry/vsyscall/vsyscall_64.c >>> index 44c33103a955..0131c9f7f9c5 100644 >>> --- a/arch/x86/entry/vsyscall/vsyscall_64.c >>> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c >>> @@ -38,6 +38,9 @@ >>> #include <asm/fixmap.h> >>> #include <asm/traps.h> >>> #include <asm/paravirt.h> >>> +#include <asm/fpu/xstate.h> >>> +#include <asm/fpu/types.h> >>> +#include <asm/fpu/internal.h> >>> >>> #define CREATE_TRACE_POINTS >>> #include "vsyscall_trace.h" >>> @@ -286,6 +289,32 @@ bool emulate_vsyscall(unsigned long error_code, >>> /* Emulate a ret instruction. */ >>> regs->ip = caller; >>> regs->sp += 8; >>> + >>> + if (current->thread.cet.shstk_size || >>> + current->thread.cet.ibt_enabled) { >>> + u64 r; >>> + >>> + fpregs_lock(); >>> + if (test_thread_flag(TIF_NEED_FPU_LOAD)) >>> + __fpregs_load_activate(); >> >> Wouldn't this be nicer if you operated on the memory image, not the registers? >> >>> + >>> +#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER >>> + /* Fixup branch tracking */ >>> + if (current->thread.cet.ibt_enabled) { >>> + rdmsrl(MSR_IA32_U_CET, r); >>> + wrmsrl(MSR_IA32_U_CET, r & ~CET_WAIT_ENDBR); >>> + } >>> +#endif >> >> Seems reasonable on first glance. >> >>> + >>> +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER >>> + /* Unwind shadow stack. */ >>> + if (current->thread.cet.shstk_size) { >>> + rdmsrl(MSR_IA32_PL3_SSP, r); >>> + wrmsrl(MSR_IA32_PL3_SSP, r + 8); >>> + } >>> +#endif >> >> What happens if the result is noncanonical? A quick skim of the SDM >> didn't find anything. This latter issue goes away if you operate on >> the memory image, though -- writing a bogus value is just fine, since >> the FP restore will handle it. > > #GP, the SSP MSRs do canonical checks. I think I'll add a check here for (r + 8) >= TASK_SIZE_MAX. It is better than getting a fault. Thanks, Yu-cheng
On 9/23/2020 2:34 PM, Andy Lutomirski wrote: > On Tue, Sep 22, 2020 at 10:46 AM Yu, Yu-cheng <yu-cheng.yu@intel.com> wrote: >> >> On 9/21/2020 4:48 PM, Andy Lutomirski wrote: >>> On Mon, Sep 21, 2020 at 3:37 PM Yu-cheng Yu <yu-cheng.yu@intel.com> wrote: >>>> >>>> On Mon, 2020-09-21 at 09:22 -0700, Yu, Yu-cheng wrote: >> >> [...] >> >>>> >>>> Here is the patch: >>>> >>>> ------ >>>> >>>> From dfdee39c795ee5dcee2c77f6ba344a61f4d8124b Mon Sep 17 00:00:00 2001 >>>> From: Yu-cheng Yu <yu-cheng.yu@intel.com> >>>> Date: Thu, 29 Nov 2018 14:15:38 -0800 >>>> Subject: [PATCH 34/43] x86/vsyscall/64: Fixup Shadow Stack and Indirect Branch >>>> Tracking for vsyscall emulation >>>> >>>> Vsyscall entry points are effectively branch targets. Mark them with >>>> ENDBR64 opcodes. When emulating the RET instruction, unwind the shadow >>>> stack and reset IBT state machine. >>>> >>>> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> >>>> --- >>>> arch/x86/entry/vsyscall/vsyscall_64.c | 29 +++++++++++++++++++++++ >>>> arch/x86/entry/vsyscall/vsyscall_emu_64.S | 9 +++++++ >>>> arch/x86/entry/vsyscall/vsyscall_trace.h | 1 + >>>> 3 files changed, 39 insertions(+) >>>> >>>> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c >>>> b/arch/x86/entry/vsyscall/vsyscall_64.c >>>> index 44c33103a955..0131c9f7f9c5 100644 >>>> --- a/arch/x86/entry/vsyscall/vsyscall_64.c >>>> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c >>>> @@ -38,6 +38,9 @@ >>>> #include <asm/fixmap.h> >>>> #include <asm/traps.h> >>>> #include <asm/paravirt.h> >>>> +#include <asm/fpu/xstate.h> >>>> +#include <asm/fpu/types.h> >>>> +#include <asm/fpu/internal.h> >>>> >>>> #define CREATE_TRACE_POINTS >>>> #include "vsyscall_trace.h" >>>> @@ -286,6 +289,32 @@ bool emulate_vsyscall(unsigned long error_code, >>>> /* Emulate a ret instruction. */ >>>> regs->ip = caller; >>>> regs->sp += 8; >>>> + >>>> + if (current->thread.cet.shstk_size || >>>> + current->thread.cet.ibt_enabled) { >>>> + u64 r; >>>> + >>>> + fpregs_lock(); >>>> + if (test_thread_flag(TIF_NEED_FPU_LOAD)) >>>> + __fpregs_load_activate(); >>> >>> Wouldn't this be nicer if you operated on the memory image, not the registers? >> >> Do you mean writing to the XSAVES area? > > Yes. > >> >>> >>>> + >>>> +#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER >>>> + /* Fixup branch tracking */ >>>> + if (current->thread.cet.ibt_enabled) { >>>> + rdmsrl(MSR_IA32_U_CET, r); >>>> + wrmsrl(MSR_IA32_U_CET, r & ~CET_WAIT_ENDBR); >>>> + } >>>> +#endif >>> >>> Seems reasonable on first glance. >>> >>>> + >>>> +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER >>>> + /* Unwind shadow stack. */ >>>> + if (current->thread.cet.shstk_size) { >>>> + rdmsrl(MSR_IA32_PL3_SSP, r); >>>> + wrmsrl(MSR_IA32_PL3_SSP, r + 8); >>>> + } >>>> +#endif >>> >>> What happens if the result is noncanonical? A quick skim of the SDM >>> didn't find anything. This latter issue goes away if you operate on >>> the memory image, though -- writing a bogus value is just fine, since >>> the FP restore will handle it. >>> >> >> At this point, the MSR's value can still be valid or is already saved to >> memory. If we are going to write to memory, then the MSR must be saved >> first. So I chose to do __fpregs_load_activate() and write the MSR. >> >> Maybe we can check the address before writing it to the MSR? > > Performance is almost irrelevant here, and the writing-to-XSAVES-area > approach should have the benefit that the exception handling and > signaling happens for free. > Ok, I will change it. Thanks, Yu-cheng
On 9/23/20 3:06 PM, Yu, Yu-cheng wrote: > I think I'll add a check here for (r + 8) >= TASK_SIZE_MAX. It is > better than getting a fault. There's also wrmsr_safe().
On 9/23/2020 3:08 PM, Dave Hansen wrote: > On 9/23/20 3:06 PM, Yu, Yu-cheng wrote: >> I think I'll add a check here for (r + 8) >= TASK_SIZE_MAX. It is >> better than getting a fault. > > There's also wrmsr_safe(). > Yes, thanks. Since I am going to change this to: fpu__prepare_write(), then write to the XSAVES area. The kernel does not expect XRSTORS to fail ("Bad FPU state detected..." message). So maybe still check the address first.
On Wed, Sep 23, 2020 at 3:20 PM Yu, Yu-cheng <yu-cheng.yu@intel.com> wrote: > > On 9/23/2020 3:08 PM, Dave Hansen wrote: > > On 9/23/20 3:06 PM, Yu, Yu-cheng wrote: > >> I think I'll add a check here for (r + 8) >= TASK_SIZE_MAX. It is > >> better than getting a fault. > > > > There's also wrmsr_safe(). > > > Yes, thanks. > > Since I am going to change this to: > > fpu__prepare_write(), then write to the XSAVES area. > > The kernel does not expect XRSTORS to fail ("Bad FPU state detected..." > message). So maybe still check the address first. Surely there are plenty of ways to use ptrace() to poke garbage into the FPU state. We should be able to handle this type of failure somewhat gracefully.
On 9/23/20 3:47 PM, Andy Lutomirski wrote: > On Wed, Sep 23, 2020 at 3:20 PM Yu, Yu-cheng <yu-cheng.yu@intel.com> wrote: >> On 9/23/2020 3:08 PM, Dave Hansen wrote: >>> On 9/23/20 3:06 PM, Yu, Yu-cheng wrote: >>>> I think I'll add a check here for (r + 8) >= TASK_SIZE_MAX. It is >>>> better than getting a fault. >>> There's also wrmsr_safe(). >>> >> Yes, thanks. >> >> Since I am going to change this to: >> >> fpu__prepare_write(), then write to the XSAVES area. >> >> The kernel does not expect XRSTORS to fail ("Bad FPU state detected..." >> message). So maybe still check the address first. > Surely there are plenty of ways to use ptrace() to poke garbage into > the FPU state. We should be able to handle this type of failure > somewhat gracefully. Yeah, agreed. I'd much rather make XRSTORS able to #GP gracefully than teach the kernel exhaustively about every possible error condition it can encounter. We *might* want to do something like to preserve the warning if the task hasn't been ptrace'd, or had the memory buffer written to directly or tainted in another way.
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c index 44c33103a955..3196e963e365 100644 --- a/arch/x86/entry/vsyscall/vsyscall_64.c +++ b/arch/x86/entry/vsyscall/vsyscall_64.c @@ -150,6 +150,15 @@ bool emulate_vsyscall(unsigned long error_code, WARN_ON_ONCE(address != regs->ip); +#ifdef CONFIG_X86_INTEL_CET + if (current->thread.cet.shstk_size || + current->thread.cet.ibt_enabled) { + warn_bad_vsyscall(KERN_INFO, regs, + "vsyscall attempted with cet enabled"); + return false; + } +#endif + if (vsyscall_mode == NONE) { warn_bad_vsyscall(KERN_INFO, regs, "vsyscall attempted with vsyscall=none");
Emulation of the legacy vsyscall page is required by some programs built before 2013. Newer programs after 2013 don't use it. Disable vsyscall emulation when Control-flow Enforcement (CET) is enabled to enhance security. Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> --- v12: - Disable vsyscall emulation only when it is attempted (vs. at compile time). arch/x86/entry/vsyscall/vsyscall_64.c | 9 +++++++++ 1 file changed, 9 insertions(+)