[v21,06/26] x86/cet: Add control-protection fault handler

Message ID	20210217222730.15819-7-yu-cheng.yu@intel.com (mailing list archive)
State	New, archived
Headers	From: Yu-cheng Yu <yu-cheng.yu@intel.com> Subject: [PATCH v21 06/26] x86/cet: Add control-protection fault handler Date: Wed, 17 Feb 2021 14:27:10 -0800 In-Reply-To: <20210217222730.15819-1-yu-cheng.yu@intel.com>
Series	Control-flow Enforcement: Shadow Stack

Yu-cheng Yu Feb. 17, 2021, 10:27 p.m. UTC

A control-protection fault is triggered when a control-flow transfer
attempt violates Shadow Stack or Indirect Branch Tracking constraints.
For example, the return address for a RET instruction differs from the copy
on the shadow stack; or an indirect JMP instruction, without the NOTRACK
prefix, arrives at a non-ENDBR opcode.

The control-protection fault handler works similarly to the general
protection fault handler.  It delivers SIGSEGV with the new si_code
SEGV_CPERR to the signal handler.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
---
 arch/x86/include/asm/idtentry.h    |  4 ++
 arch/x86/kernel/idt.c              |  4 ++
 arch/x86/kernel/signal_compat.c    |  2 +-
 arch/x86/kernel/traps.c            | 63 ++++++++++++++++++++++++++++++
 include/uapi/asm-generic/siginfo.h |  3 +-
 5 files changed, 74 insertions(+), 2 deletions(-)

Borislav Petkov Feb. 24, 2021, 4:13 p.m. UTC | #1

On Wed, Feb 17, 2021 at 02:27:10PM -0800, Yu-cheng Yu wrote:
> +/*
> + * When a control protection exception occurs, send a signal to the responsible
> + * application.  Currently, control protection is only enabled for user mode.
> + * This exception should not come from kernel mode.
> + */
> +DEFINE_IDTENTRY_ERRORCODE(exc_control_protection)
> +{
> +	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
> +				      DEFAULT_RATELIMIT_BURST);

Pls move that out of the function - those "static" qualifiers get missed
easily when inside a function.

> +	struct task_struct *tsk;
> +
> +	if (!user_mode(regs)) {
> +		pr_emerg("PANIC: unexpected kernel control protection fault\n");
> +		die("kernel control protection fault", regs, error_code);
> +		panic("Machine halted.");
> +	}
> +
> +	cond_local_irq_enable(regs);
> +
> +	if (!boot_cpu_has(X86_FEATURE_CET))
> +		WARN_ONCE(1, "Control protection fault with CET support disabled\n");
> +
> +	tsk = current;
> +	tsk->thread.error_code = error_code;
> +	tsk->thread.trap_nr = X86_TRAP_CP;
> +
> +	/*
> +	 * Ratelimit to prevent log spamming.
> +	 */
> +	if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) &&
> +	    __ratelimit(&rs)) {
> +		unsigned long ssp;
> +		int err;
> +
> +		err = array_index_nospec(error_code, ARRAY_SIZE(control_protection_err));

"err" as an automatic variable is confusing - we use those to denote
whether the function returned an error or not. Call yours "cpf_type" or
so.

> +
> +		rdmsrl(MSR_IA32_PL3_SSP, ssp);
> +		pr_emerg("%s[%d] control protection ip:%lx sp:%lx ssp:%lx error:%lx(%s)",
> +			 tsk->comm, task_pid_nr(tsk),
> +			 regs->ip, regs->sp, ssp, error_code,
> +			 control_protection_err[err]);
> +		print_vma_addr(KERN_CONT " in ", regs->ip);
> +		pr_cont("\n");
> +	}
> +
> +	force_sig_fault(SIGSEGV, SEGV_CPERR,
> +			(void __user *)uprobe_get_trap_addr(regs));

Why is this calling an uprobes function?

Also, do not break that line even if it is longer than 80.

> +	cond_local_irq_disable(regs);
> +}
> +#endif
> +
>  static bool do_int3(struct pt_regs *regs)
>  {
>  	int res;
> diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
> index d2597000407a..1c2ea91284a0 100644
> --- a/include/uapi/asm-generic/siginfo.h
> +++ b/include/uapi/asm-generic/siginfo.h
> @@ -231,7 +231,8 @@ typedef struct siginfo {
>  #define SEGV_ADIPERR	7	/* Precise MCD exception */
>  #define SEGV_MTEAERR	8	/* Asynchronous ARM MTE error */
>  #define SEGV_MTESERR	9	/* Synchronous ARM MTE exception */
> -#define NSIGSEGV	9
> +#define SEGV_CPERR	10	/* Control protection fault */
> +#define NSIGSEGV	10

I still don't see the patch adding this to the manpage of sigaction(2).

There's a git repo there: https://www.kernel.org/doc/man-pages/

and I'm pretty sure Michael takes patches.
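
The error-code lookup discussed above can be sketched in userspace,
using the suggested name cpf_type.  This is only an illustration under
stated assumptions: clamp_index() is a hypothetical stand-in that
mirrors the clamping result of array_index_nospec() (out-of-range
becomes 0) without any speculation hardening, and the table strings are
placeholders, since the patch's control_protection_err[] table is not
quoted in this thread.

```c
#include <stddef.h>

/* Placeholder table; entry 0 doubles as the out-of-range fallback. */
static const char *const cpf_names[] = {
	"unknown",
	"near-ret",
	"far-ret/iret",
	"endbranch",
	"rstorssp",
	"setssbsy",
};

#define CPF_NAMES_LEN (sizeof(cpf_names) / sizeof(cpf_names[0]))

/*
 * Hypothetical userspace stand-in for array_index_nospec():
 * returns index when it is in range, 0 otherwise.
 */
size_t clamp_index(unsigned long index, size_t size)
{
	return index < size ? (size_t)index : 0;
}

/* Map a raw error code to a printable fault type. */
const char *cpf_type_name(unsigned long error_code)
{
	size_t cpf_type = clamp_index(error_code, CPF_NAMES_LEN);

	return cpf_names[cpf_type];
}
```

Naming the index cpf_type keeps "err" free for its usual meaning of a
function's return status.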

Yu-cheng Yu Feb. 24, 2021, 4:44 p.m. UTC | #2

On 2/24/2021 8:13 AM, Borislav Petkov wrote:
> On Wed, Feb 17, 2021 at 02:27:10PM -0800, Yu-cheng Yu wrote:
>> +/*
>> + * When a control protection exception occurs, send a signal to the responsible
>> + * application.  Currently, control protection is only enabled for user mode.
>> + * This exception should not come from kernel mode.
>> + */
>> +DEFINE_IDTENTRY_ERRORCODE(exc_control_protection)
>> +{
>> +	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
>> +				      DEFAULT_RATELIMIT_BURST);

[...]

>> +
>> +		rdmsrl(MSR_IA32_PL3_SSP, ssp);
>> +		pr_emerg("%s[%d] control protection ip:%lx sp:%lx ssp:%lx error:%lx(%s)",
>> +			 tsk->comm, task_pid_nr(tsk),
>> +			 regs->ip, regs->sp, ssp, error_code,
>> +			 control_protection_err[err]);
>> +		print_vma_addr(KERN_CONT " in ", regs->ip);
>> +		pr_cont("\n");
>> +	}
>> +
>> +	force_sig_fault(SIGSEGV, SEGV_CPERR,
>> +			(void __user *)uprobe_get_trap_addr(regs));
> 
> Why is this calling an uprobes function?
> 

I will change it to error_get_trap_addr().

> Also, do not break that line even if it is longer than 80.
> 
>> +	cond_local_irq_disable(regs);
>> +}
>> +#endif
>> +
>>   static bool do_int3(struct pt_regs *regs)
>>   {
>>   	int res;
>> diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
>> index d2597000407a..1c2ea91284a0 100644
>> --- a/include/uapi/asm-generic/siginfo.h
>> +++ b/include/uapi/asm-generic/siginfo.h
>> @@ -231,7 +231,8 @@ typedef struct siginfo {
>>   #define SEGV_ADIPERR	7	/* Precise MCD exception */
>>   #define SEGV_MTEAERR	8	/* Asynchronous ARM MTE error */
>>   #define SEGV_MTESERR	9	/* Synchronous ARM MTE exception */
>> -#define NSIGSEGV	9
>> +#define SEGV_CPERR	10	/* Control protection fault */
>> +#define NSIGSEGV	10
> 
> I still don't see the patch adding this to the manpage of sigaction(2).
> 
> There's a git repo there: https://www.kernel.org/doc/man-pages/
> 
> and I'm pretty sure Michael takes patches.
> 

I will send a patch.

--
Yu-cheng

Borislav Petkov Feb. 24, 2021, 4:53 p.m. UTC | #3

On Wed, Feb 24, 2021 at 08:44:45AM -0800, Yu, Yu-cheng wrote:
> > > +	force_sig_fault(SIGSEGV, SEGV_CPERR,
> > > +			(void __user *)uprobe_get_trap_addr(regs));
> > 
> > Why is this calling an uprobes function?
> > 
> 
> I will change it to error_get_trap_addr().

"/*
  * Posix requires to provide the address of the faulting instruction for
  * SIGILL (#UD) and SIGFPE (#DE) in the si_addr member of siginfo_t.
  ..."

Is yours SIGILL or SIGFPE?

Yu-cheng Yu Feb. 24, 2021, 5:56 p.m. UTC | #4

On 2/24/2021 8:53 AM, Borislav Petkov wrote:
> On Wed, Feb 24, 2021 at 08:44:45AM -0800, Yu, Yu-cheng wrote:
>>>> +	force_sig_fault(SIGSEGV, SEGV_CPERR,
>>>> +			(void __user *)uprobe_get_trap_addr(regs));
>>>
>>> Why is this calling an uprobes function?
>>>
>>
>> I will change it to error_get_trap_addr().
> 
> "/*
>    * Posix requires to provide the address of the faulting instruction for
>    * SIGILL (#UD) and SIGFPE (#DE) in the si_addr member of siginfo_t.
>    ..."
> 
> Is yours SIGILL or SIGFPE?
> 

No.  Maybe I am doing too much.  The GP fault sets si_addr to zero, for 
example.  So maybe do the same here?

Borislav Petkov Feb. 24, 2021, 7:20 p.m. UTC | #5

On Wed, Feb 24, 2021 at 09:56:13AM -0800, Yu, Yu-cheng wrote:
> No.  Maybe I am doing too much.  The GP fault sets si_addr to zero, for
> example.  So maybe do the same here?

No, you're looking at this from the wrong angle. This is going to be
user-visible and the moment it gets upstream, it is cast in stone.

So the whole use case of what luserspace needs to do or is going to do
or wants to do on a SEGV_CPERR, needs to be described, agreed upon by
people etc before it goes out. And thus clarified whether the address
gets copied out or not.

Thx.

Andy Lutomirski Feb. 24, 2021, 7:30 p.m. UTC | #6

On Wed, Feb 24, 2021 at 11:20 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Feb 24, 2021 at 09:56:13AM -0800, Yu, Yu-cheng wrote:
> > No.  Maybe I am doing too much.  The GP fault sets si_addr to zero, for
> > example.  So maybe do the same here?
>
> No, you're looking at this from the wrong angle. This is going to be
> user-visible and the moment it gets upstream, it is cast in stone.
>
> So the whole use case of what luserspace needs to do or is going to do
> or wants to do on a SEGV_CPERR, needs to be described, agreed upon by
> people etc before it goes out. And thus clarified whether the address
> gets copied out or not.

I vote 0.  The address is in ucontext->gregs[REG_RIP] [0] regardless.
Why do we need to stick a copy somewhere else?

[0] or however it's spelled.  i can never remember.

>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
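
For reference, the spelling on x86-64 Linux with glibc is
uc_mcontext.gregs[REG_RIP].  A minimal sketch, assuming that platform;
fault_ip_from_ucontext() is a hypothetical helper name, and the REG_RIP
fallback value 16 is the glibc x86-64 index, used only if the header
did not expose it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc exposes REG_RIP only with this defined */
#endif
#include <signal.h>
#include <stdint.h>
#include <ucontext.h>

/* Assumption: glibc x86-64 index of RIP in gregs[]. */
#ifndef REG_RIP
#define REG_RIP 16
#endif

/*
 * Hypothetical helper: extract the interrupted instruction pointer
 * from the third argument of an SA_SIGINFO handler.
 */
uintptr_t fault_ip_from_ucontext(const void *ctx)
{
	const ucontext_t *uc = ctx;

	return (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
}
```

A handler with the signature (int, siginfo_t *, void *) would pass its
third argument straight to this helper, so the faulting IP remains
available even if si_addr is reported as 0.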

Borislav Petkov Feb. 24, 2021, 7:42 p.m. UTC | #7

On Wed, Feb 24, 2021 at 11:30:34AM -0800, Andy Lutomirski wrote:
> On Wed, Feb 24, 2021 at 11:20 AM Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Wed, Feb 24, 2021 at 09:56:13AM -0800, Yu, Yu-cheng wrote:
> > > No.  Maybe I am doing too much.  The GP fault sets si_addr to zero, for
> > > example.  So maybe do the same here?
> >
> > No, you're looking at this from the wrong angle. This is going to be
> > user-visible and the moment it gets upstream, it is cast in stone.
> >
> > So the whole use case of what luserspace needs to do or is going to do
> > or wants to do on a SEGV_CPERR, needs to be described, agreed upon by
> > people etc before it goes out. And thus clarified whether the address
> > gets copied out or not.
> 
> I vote 0.  The address is in ucontext->gregs[REG_RIP] [0] regardless.
> Why do we need to stick a copy somewhere else?
> 
> [0] or however it's spelled.  i can never remember.

Fine with me. Let's have this documented in the manpage and then we can
move forward with this.

Thx.

Yu-cheng Yu Feb. 24, 2021, 7:52 p.m. UTC | #8

On 2/24/2021 11:42 AM, Borislav Petkov wrote:
> On Wed, Feb 24, 2021 at 11:30:34AM -0800, Andy Lutomirski wrote:
>> On Wed, Feb 24, 2021 at 11:20 AM Borislav Petkov <bp@alien8.de> wrote:
>>>
>>> On Wed, Feb 24, 2021 at 09:56:13AM -0800, Yu, Yu-cheng wrote:
>>>> No.  Maybe I am doing too much.  The GP fault sets si_addr to zero, for
>>>> example.  So maybe do the same here?
>>>
>>> No, you're looking at this from the wrong angle. This is going to be
>>> user-visible and the moment it gets upstream, it is cast in stone.
>>>
>>> So the whole use case of what luserspace needs to do or is going to do
>>> or wants to do on a SEGV_CPERR, needs to be described, agreed upon by
>>> people etc before it goes out. And thus clarified whether the address
>>> gets copied out or not.
>>
>> I vote 0.  The address is in ucontext->gregs[REG_RIP] [0] regardless.
>> Why do we need to stick a copy somewhere else?
>>
>> [0] or however it's spelled.  i can never remember.
> 
> Fine with me. Let's have this documented in the manpage and then we can
> move forward with this.
> 
> Thx.
> 

The man page at https://man7.org/linux/man-pages/man2/sigaction.2.html says,

SIGILL, SIGFPE, SIGSEGV, SIGBUS, and SIGTRAP fill in si_addr with the 
address of the fault.

But it is not entirely true.

I will send a patch to update it, and another patch for the si_code.

--
Yu-cheng
