[RFC,v5,2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack

Message ID 1508631773-2502-3-git-send-email-alex.popov@linux.com
State New

Commit Message

Alexander Popov Oct. 22, 2017, 12:22 a.m. UTC
The STACKLEAK feature erases the kernel stack before returning from
syscalls. That reduces the information which kernel stack leak bugs can
reveal and blocks some uninitialized stack variable attacks. Moreover,
STACKLEAK provides runtime checks for kernel stack overflow detection.

This commit introduces the STACKLEAK gcc plugin. It is needed for:
 - tracking the lowest border of the kernel stack, which is important
    for the code erasing the used part of the kernel stack at the end
    of syscalls (comes in a separate commit);
 - checking that alloca calls don't cause stack overflow.

So this plugin instruments the kernel code inserting:
 - the check_alloca() call before alloca and the track_stack() call
    after it;
 - the track_stack() call for the functions with a stack frame size
    greater than or equal to CONFIG_STACKLEAK_TRACK_MIN_SIZE.
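As a rough illustration (not part of the patch), the two instrumentation rules above could be pictured in user space as follows. The logging stubs, copy_and_measure() and needs_track_stack() are hypothetical names made up for this sketch; only check_alloca(), track_stack() and the "greater than or equal to" rule come from the description above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-ins for the kernel hooks; they only log calls. */
enum hook { HOOK_CHECK_ALLOCA, HOOK_TRACK_STACK };

#define HOOK_LOG_MAX 8

static enum hook hook_log[HOOK_LOG_MAX];
static size_t hook_count;
static unsigned long last_checked_size;

static void check_alloca(unsigned long size)
{
	last_checked_size = size;
	if (hook_count < HOOK_LOG_MAX)
		hook_log[hook_count++] = HOOK_CHECK_ALLOCA;
}

static void track_stack(void)
{
	if (hook_count < HOOK_LOG_MAX)
		hook_log[hook_count++] = HOOK_TRACK_STACK;
}

/* What a function using alloca might look like after instrumentation. */
static size_t copy_and_measure(const char *src, size_t len)
{
	char *buf;

	check_alloca(len + 1);			/* inserted before the alloca */
	buf = __builtin_alloca(len + 1);	/* GCC/Clang builtin */
	track_stack();				/* inserted after the alloca */

	memcpy(buf, src, len);
	buf[len] = '\0';
	return strlen(buf);
}

/* Second rule: instrument when the frame size is >= the configured minimum. */
static int needs_track_stack(unsigned long frame_size,
			     unsigned long track_min_size)
{
	return frame_size >= track_min_size;
}
```

Calling copy_and_measure() once leaves two entries in hook_log: the check first, the tracking call second.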

The STACKLEAK feature is ported from grsecurity/PaX. More information at:
  https://grsecurity.net/
  https://pax.grsecurity.net/

This code is modified from Brad Spengler/PaX Team's code in the last
public patch of grsecurity/PaX based on our understanding of the code.
Changes or omissions from the original code are ours and don't reflect
the original grsecurity/PaX code.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 arch/Kconfig                           |  12 +
 arch/x86/kernel/dumpstack.c            |  15 ++
 fs/exec.c                              |  30 +++
 scripts/Makefile.gcc-plugins           |   3 +
 scripts/gcc-plugins/stackleak_plugin.c | 470 +++++++++++++++++++++++++++++++++
 5 files changed, 530 insertions(+)
 create mode 100644 scripts/gcc-plugins/stackleak_plugin.c

Comments

Alexander Popov Oct. 30, 2017, 4:51 p.m. UTC | #1
On 22.10.2017 03:22, Alexander Popov wrote:
> The STACKLEAK feature erases the kernel stack before returning from
> syscalls. That reduces the information which kernel stack leak bugs can
> reveal and blocks some uninitialized stack variable attacks. Moreover,
> STACKLEAK provides runtime checks for kernel stack overflow detection.
> 
> This commit introduces the STACKLEAK gcc plugin. It is needed for:
>  - tracking the lowest border of the kernel stack, which is important
>     for the code erasing the used part of the kernel stack at the end
>     of syscalls (comes in a separate commit);
>  - checking that alloca calls don't cause stack overflow.
> 
> So this plugin instruments the kernel code inserting:
>  - the check_alloca() call before alloca and the track_stack() call
>     after it;
>  - the track_stack() call for the functions with a stack frame size
>     greater than or equal to CONFIG_STACKLEAK_TRACK_MIN_SIZE.
> 
> The STACKLEAK feature is ported from grsecurity/PaX. More information at:
>   https://grsecurity.net/
>   https://pax.grsecurity.net/
> 
> This code is modified from Brad Spengler/PaX Team's code in the last
> public patch of grsecurity/PaX based on our understanding of the code.
> Changes or omissions from the original code are ours and don't reflect
> the original grsecurity/PaX code.
> 
> Signed-off-by: Alexander Popov <alex.popov@linux.com>
> ---
>  arch/Kconfig                           |  12 +
>  arch/x86/kernel/dumpstack.c            |  15 ++
>  fs/exec.c                              |  30 +++
>  scripts/Makefile.gcc-plugins           |   3 +
>  scripts/gcc-plugins/stackleak_plugin.c | 470 +++++++++++++++++++++++++++++++++
>  5 files changed, 530 insertions(+)
>  create mode 100644 scripts/gcc-plugins/stackleak_plugin.c
> 

[...]

> diff --git a/fs/exec.c b/fs/exec.c
> index 3e14ba2..481ef4b 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1958,3 +1958,33 @@ COMPAT_SYSCALL_DEFINE5(execveat, int, fd,
>  				  argv, envp, flags);
>  }
>  #endif
> +
> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> +void __used track_stack(void)
> +{
> +	/*
> +	 * N.B. The arch-specific part of the STACKLEAK feature fills the
> +	 * kernel stack with the poison value, which has the register width.
> +	 * That code assumes that the value of thread.lowest_stack is aligned
> +	 * on the register width boundary.
> +	 *
> +	 * That is true for x86 and x86_64 because of the kernel stack
> +	 * alignment on these platforms (for details, see cc_stack_align in
> +	 * arch/x86/Makefile). Take care of that when you port STACKLEAK to
> +	 * new platforms.
> +	 */
> +	unsigned long sp = (unsigned long)&sp;
> +
> +	if (sp < current->thread.lowest_stack &&
> +	    sp >= (unsigned long)task_stack_page(current) +
> +					2 * sizeof(unsigned long)) {
> +		current->thread.lowest_stack = sp;
> +	}
> +
> +#ifndef CONFIG_VMAP_STACK
> +	if (unlikely((sp & (THREAD_SIZE - 1)) < (THREAD_SIZE / 16)))
> +		BUG();

Hello, I need your help! I'm trying to solve the problem with the recursive
BUG() here (currently on x86_64).

When the thread stack is exhausted, this BUG() is hit. But do_error_trap(),
which handles the exception, calls track_stack() itself again (since it is
instrumented by the gcc plugin). So this recursion proceeds with exhausting the
thread stack.

Finally the stack pointer drops below the bottom of the stack and the BUG() is
no longer hit, because (sp & (THREAD_SIZE - 1)) is large again. The following
oops message is printed:

[   17.529075] ------------[ cut here ]------------
[   17.529075] kernel BUG at fs/exec.c:1986!
[   17.529075] invalid opcode: 0000 [#1] SMP
[   17.529075] Dumping ftrace buffer:
[   17.529075]    (ftrace buffer empty)
[   17.529075] Modules linked in: lkdtm
[   17.529075] CPU: 0 PID: 2651 Comm: sh Not tainted 4.14.0-rc5+ #10
[   17.529075] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   17.529075] task: ffff880079950040 task.stack: ffff88007a904000
[   17.529075] RIP: 0010:track_stack+0x52/0x60
[   17.529075] RSP: 0018:ffff88007a904078 EFLAGS: 00010193
[   17.529075] RAX: 0000000000000078 RBX: 0000000000000006 RCX: ffff88007a904010
[   17.529075] RDX: ffff880079950040 RSI: ffff88007a904000 RDI: ffff88007a904168
[   17.529075] RBP: ffff88007a904080 R08: 0000000000000004 R09: 000000000000018c
[   17.529075] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007a904168
[   17.529075] R13: 0000000000000004 R14: ffffffff81bd33c9 R15: 0000000000000000
[   17.529075] FS:  00007f469977e700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[   17.529075] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   17.529075] CR2: 00000000021ab618 CR3: 000000007a404000 CR4: 00000000000006f0
[   17.529075] Call Trace:
[   17.529075]  do_error_trap+0x25/0xe0
[   17.529075]  do_invalid_op+0x2a/0x30
[   17.529075]  invalid_op+0x18/0x20
[   17.529075] RIP: 0010:track_stack+0x52/0x60
[   17.529075] RSP: 0018:ffff88007a904218 EFLAGS: 00010193
[   17.529075] RAX: 0000000000000218 RBX: 0000000000000006 RCX: ffff88007a904010
[   17.529075] RDX: ffff880079950040 RSI: ffff88007a904000 RDI: ffff88007a904308
[   17.529075] RBP: ffff88007a904220 R08: 0000000000000004 R09: 000000000000018c
[   17.529075] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007a904308
[   17.529075] R13: 0000000000000004 R14: ffffffff81bd33c9 R15: 0000000000000000
[   17.529075]  do_error_trap+0x25/0xe0
[   17.529075]  do_invalid_op+0x2a/0x30
[   17.529075]  invalid_op+0x18/0x20
[   17.529075] RIP: 0010:track_stack+0x52/0x60
[   17.529075] RSP: 0018:ffff88007a9043b8 EFLAGS: 00010393
[   17.529075] RAX: 00000000000003b8 RBX: ffff88007a907d90 RCX: ffff88007a904010
[   17.529075] RDX: ffff880079950040 RSI: ffff88007a904000 RDI: ffff88007a907d90
[   17.529075] RBP: ffff88007a9043c0 R08: 0000000000000000 R09: 000000000000018c
[   17.529075] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000003d0
[   17.529075] R13: ffffffffc0007020 R14: 0000000000000016 R15: 000000000000003d
[   17.529075]  recursion+0x14/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  ? put_dec+0x24/0xb0
[   17.529075]  ? put_dec+0x24/0xb0
[   17.529075]  ? number+0x2c5/0x2e0
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  ? vsnprintf+0xd1/0x4b0
[   17.529075]  ? wait_for_xmitr+0x32/0x90
[   17.529075]  ? serial8250_console_putchar+0x25/0x30
[   17.529075]  ? wait_for_xmitr+0x90/0x90
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  ? up+0x30/0x50
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  ? vprintk_emit+0x24a/0x2d0
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  ? vprintk_func+0x31/0x80
[   17.529075]  ? printk+0x4a/0x55
[   17.529075]  lkdtm_STACKLEAK_TRACK_STACK+0x39/0x4d [lkdtm]
[   17.529075]  lkdtm_do_action+0x18/0x20 [lkdtm]
[   17.529075]  direct_entry+0xcc/0x130 [lkdtm]
[   17.529075]  full_proxy_write+0x4f/0x90
[   17.529075]  __vfs_write+0x36/0x140
[   17.529075]  ? security_file_permission+0x36/0xb0
[   17.529075]  vfs_write+0xb1/0x1a0
[   17.529075]  SyS_write+0x46/0xa0
[   17.529075]  entry_SYSCALL_64_fastpath+0x1a/0xaa
[   17.529075] RIP: 0033:0x7f46992a1600
[   17.529075] RSP: 002b:00007fffe4466838 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   17.529075] RAX: ffffffffffffffda RBX: 000000000000006b RCX: 00007f46992a1600
[   17.529075] RDX: 0000000000000016 RSI: 00000000021a9610 RDI: 0000000000000001
[   17.529075] RBP: 0000000000002710 R08: 0000000000000003 R09: 0000000000002010
[   17.529075] R10: 0000000000000871 R11: 0000000000000246 R12: 00007f469955eb58
[   17.529075] R13: 0000000000002010 R14: 00000000021a9600 R15: 00007f469955eb00
[   17.529075] Code: 8d 4e 10 48 39 c8 73 0f 25 ff 3f 00 00 48 3d ff 03 00 00 76 16 c9 c3 48 89 82 30 0a 00 00 25 ff 3f 00 00 48 3d ff 03 00 00 77 ea <0f> 0b 66 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 54 53
[   17.529075] RIP: track_stack+0x52/0x60 RSP: ffff88007a904078
[   17.529075] ---[ end trace 3f0d8f585f4dcc08 ]---
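
The wraparound visible in this oops can be replayed in user space. This is only a sketch under assumptions: THREAD_SIZE is taken to be 16 KiB (the "25 ff 3f 00 00" and "48 3d ff 03 00 00" bytes in the Code: line look like a mask of 0x3fff and a compare against 0x3ff), and the 0x1a0 step is simply the difference between the successive RSP values above (0x3b8, 0x218, 0x78). The function names are made up for the illustration.

```c
#include <stdint.h>

/* Assumed from the oops above: a 16 KiB thread stack. */
#define THREAD_SIZE 0x4000ULL

/* Same predicate as the !CONFIG_VMAP_STACK check in track_stack(). */
static int guard_check_fires(uint64_t sp)
{
	return (sp & (THREAD_SIZE - 1)) < (THREAD_SIZE / 16);
}

/*
 * Count how many times the check fires if every pass of BUG() handling
 * pushes sp down by 'step' bytes before track_stack() runs again.
 * The loop is capped so that a degenerate step cannot spin forever.
 */
static unsigned int count_recursive_bugs(uint64_t sp, uint64_t step,
					 uint64_t *final_sp)
{
	unsigned int hits = 0;

	while (guard_check_fires(sp) && hits < 64) {
		hits++;
		sp -= step;
	}
	*final_sp = sp;
	return hits;
}
```

Starting from 0xffff88007a9043b8 with a step of 0x1a0, the check fires three times (matching the three track_stack+0x52 entries in the trace) and then stops at 0xffff88007a903ed8, which is already below task.stack (ffff88007a904000).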


But printing this message corrupts the memory below the exhausted stack, right?

Can we somehow handle this BUG() in another stack to avoid the recursion?

Thanks!

> +#endif /* !CONFIG_VMAP_STACK */
> +}
> +EXPORT_SYMBOL(track_stack);
> +#endif /* CONFIG_GCC_PLUGIN_STACKLEAK */

[...]

Best regards,
Alexander

Peter Zijlstra Oct. 30, 2017, 5:32 p.m. UTC | #2
On Mon, Oct 30, 2017 at 07:51:33PM +0300, Alexander Popov wrote:
> When the thread stack is exhausted, this BUG() is hit. But do_error_trap(),
> which handles the exception, calls track_stack() itself again (since it is
> instrumented by the gcc plugin). So this recursion proceeds with exhausting the
> thread stack.

Add a __attribute__((nostacktrack)) on it?

Alexander Popov Oct. 30, 2017, 6:06 p.m. UTC | #3
Hello Peter,

Thanks for your reply.

On 30.10.2017 20:32, Peter Zijlstra wrote:
> On Mon, Oct 30, 2017 at 07:51:33PM +0300, Alexander Popov wrote:
>> When the thread stack is exhausted, this BUG() is hit. But do_error_trap(),
>> which handles the exception, calls track_stack() itself again (since it is
>> instrumented by the gcc plugin). So this recursion proceeds with exhausting the
>> thread stack.
> 
> Add a __attribute__((nostacktrack)) on it?

Yes, I already tried some blacklisting in the plugin, but it didn't really help,
because:

1. there are other (more than 5) instrumented functions that are called during
BUG() handling too;

2. decreasing CONFIG_STACKLEAK_TRACK_MIN_SIZE would add more instrumented
functions, which would have to be blacklisted manually (not good).

I guess handling BUG() in another stack would be a solution. For example, Andy
Lutomirski calls handle_stack_overflow in the DOUBLEFAULT_STACK
(arch/x86/mm/fault.c). Should I do something similar?

Thanks!

Best regards,
Alexander

Alexander Popov Nov. 14, 2017, 3:36 p.m. UTC | #4
On 30.10.2017 21:06, Alexander Popov wrote:
> On 30.10.2017 20:32, Peter Zijlstra wrote:
>> On Mon, Oct 30, 2017 at 07:51:33PM +0300, Alexander Popov wrote:
>>> When the thread stack is exhausted, this BUG() is hit. But do_error_trap(),
>>> which handles the exception, calls track_stack() itself again (since it is
>>> instrumented by the gcc plugin). So this recursion proceeds with exhausting the
>>> thread stack.
>>
>> Add a __attribute__((nostacktrack)) on it?
> 
> Yes, I already tried some blacklisting in the plugin, but it didn't really help,
> because:
> 
> 1. there are other (more than 5) instrumented functions, that are called during
> BUG() handling too;
> 
> 2. decreasing CONFIG_STACKLEAK_TRACK_MIN_SIZE would add more instrumented
> functions, which should be manually blacklisted (not good).
> 
> I guess handling BUG() in another stack would be a solution. For example, Andy
> Lutomirski calls handle_stack_overflow in the DOUBLEFAULT_STACK
> (arch/x86/mm/fault.c). Should I do something similar?

Hello Andy! May I ask your advice?

When CONFIG_VMAP_STACK is disabled and STACKLEAK is enabled (for example, on
x86_32), we need another way to detect stack depth overflow. That is the reason
for having this BUG() in track_stack(). But it turns out to be recursive, since
track_stack() will be called again during BUG() handling.

We can avoid that recursion by handling oops in another stack. It looks similar
to the way you call handle_stack_overflow() in arch/x86/mm/fault.c. But it seems
that I can't reuse that code, am I right?

How should I do it properly?

By the way, you wrote that you have some entry code changes which conflict with
STACKLEAK. May I ask for more details?

Best regards,
Alexander

Andy Lutomirski Nov. 14, 2017, 4:13 p.m. UTC | #5
On Tue, Nov 14, 2017 at 7:36 AM, Alexander Popov <alex.popov@linux.com> wrote:
> On 30.10.2017 21:06, Alexander Popov wrote:
>> On 30.10.2017 20:32, Peter Zijlstra wrote:
>>> On Mon, Oct 30, 2017 at 07:51:33PM +0300, Alexander Popov wrote:
>>>> When the thread stack is exhausted, this BUG() is hit. But do_error_trap(),
>>>> which handles the exception, calls track_stack() itself again (since it is
>>>> instrumented by the gcc plugin). So this recursion proceeds with exhausting the
>>>> thread stack.
>>>
>>> Add a __attribute__((nostacktrack)) on it?
>>
>> Yes, I already tried some blacklisting in the plugin, but it didn't really help,
>> because:
>>
>> 1. there are other (more than 5) instrumented functions, that are called during
>> BUG() handling too;
>>
>> 2. decreasing CONFIG_STACKLEAK_TRACK_MIN_SIZE would add more instrumented
>> functions, which should be manually blacklisted (not good).
>>
>> I guess handling BUG() in another stack would be a solution. For example, Andy
>> Lutomirski calls handle_stack_overflow in the DOUBLEFAULT_STACK
>> (arch/x86/mm/fault.c). Should I do something similar?
>
> Hello Andy! May I ask your advice?
>
> When CONFIG_VMAP_STACK is disabled and STACKLEAK is enabled (for example, on
> x86_32), we need another way to detect stack depth overflow. That is the reason
> of having this BUG() in track_stack(). But it turns out to be recursive since
> track_stack() will be called again during BUG() handling.

What does the STEAKLACK plugin actually do?  I haven't followed this enough.

>
> We can avoid that recursion by handling oops in another stack. It looks similar
> to the way you call handle_stack_overflow() in arch/x86/mm/fault.c. But it seems
> that I can't reuse that code, am I right?

You'd probably have to make it 32-bit compatible, which means making a
32-bit variant of this thingy:

                asm volatile ("movq %[stack], %%rsp\n\t"
                              "call handle_stack_overflow\n\t"
                              "1: jmp 1b"
                              : ASM_CALL_CONSTRAINT
                              : "D" ("kernel stack overflow (page fault)"),
                                "S" (regs), "d" (address),
                                [stack] "rm" (stack));

Or you could force a double-fault.

>
> How should I do it properly?
>
> By the way, you wrote that you have some entry code changes which conflict with
> STACKLEAK. May I ask for more details?

It's this thing:

https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=x86/entry_stack.wip

and I'll probably drop the ".wip" from the name shortly.

>
> Best regards,
> Alexander

Mark Rutland Nov. 14, 2017, 4:33 p.m. UTC | #6
On Tue, Nov 14, 2017 at 08:13:43AM -0800, Andy Lutomirski wrote:
> What does the STEAKLACK plugin actually do?  I haven't followed this enough.

The plugin adds instrumentation to track the maximum stack depth, though only
functions with a sufficiently large stackframe are instrumented.

The basic idea is that this can then be used to lazily zero stack after use
(just before return to userspace), to minimize the risk of problems resulting
from uninitialized variables (either copied to userspace and leaking secrets,
or controlled by userspace and influencing the kernel).

That, and catching some stack overflows in the absence of VMAP'd stacks.

Thanks,
Mark.

Alexander Popov Nov. 14, 2017, 9:09 p.m. UTC | #7
Thanks, Mark!

Please see my comments below.

On 14.11.2017 19:33, Mark Rutland wrote:
> On Tue, Nov 14, 2017 at 08:13:43AM -0800, Andy Lutomirski wrote:
>> What does the STEAKLACK plugin actually do?  I haven't followed this enough.
> 
> The plugin adds instrumentation to track the maximum stack depth, though only
> functions with a sufficiently large stackframe are instrumented.

Yes. Functions with a big stack frame call track_stack() to update the
lowest_stack value. If CONFIG_VMAP_STACK is disabled, track_stack() is compiled
with a check for detecting stack depth overflow. This check is what I'm asking
about.
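
To make the lowest_stack update concrete, here is a small user-space model of it. It mirrors the condition in the track_stack() hunk of this patch, but struct thread_model and track_stack_model() are hypothetical names used only for this sketch, and plain integers stand in for real stack addresses.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-thread state that track_stack() updates. */
struct thread_model {
	uintptr_t stack_base;	/* plays the role of task_stack_page(current) */
	uintptr_t lowest_stack;	/* plays the role of thread.lowest_stack */
};

/*
 * Record sp as the new lowest point only if it is lower than anything
 * seen so far and still at least two words above the stack base.
 */
static void track_stack_model(struct thread_model *t, uintptr_t sp)
{
	if (sp < t->lowest_stack &&
	    sp >= t->stack_base + 2 * sizeof(unsigned long))
		t->lowest_stack = sp;
}
```

A deeper sp lowers the recorded value, a shallower one leaves it alone, and a value within the two words just above the base is ignored.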

> The basic idea is that this can then be used to lazily zero stack after use
> (just before return to userspace), to minimize the risk of problems resulting
> from uninitialized variables (either copied to userspace and leaking secrets,
> or controlled by userspace and influencing the kernel).

Yes.

Actually the used part of the kernel stack is not zeroed. It is filled with
STACKLEAK_POISON (-0xBEEF), which points to the unused hole in the x86_64
virtual memory map.
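
For reference, a tiny user-space sketch of what the poison value looks like as a register-width word. The real erasing code comes in a separate commit of this series, so the fill loop and the poison_range() name here are only assumptions made for illustration.

```c
#include <stddef.h>

#define STACKLEAK_POISON (-0xBEEFL)	/* value quoted in this thread */

/*
 * Fill every word in [lowest, top) with the poison value and return
 * the number of words written. On a 64-bit build each word becomes
 * 0xffffffffffff4111.
 */
static size_t poison_range(unsigned long *lowest, unsigned long *top)
{
	size_t words = 0;

	while (lowest < top) {
		*lowest++ = (unsigned long)STACKLEAK_POISON;
		words++;
	}
	return words;
}
```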

> That, and catching some stack overflows in the absence of VMAP'd stacks.

The STACKLEAK plugin also adds a check_alloca() call before each alloca to
block the "Stack Clash" attack against the kernel stack.
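
The size test inside check_alloca() can be modelled in user space like this. The arithmetic follows the check_alloca() hunk in arch/x86/kernel/dumpstack.c from this patch; alloca_would_bug() is a hypothetical helper that returns the condition instead of calling BUG_ON(), and it takes the stack bounds as plain numbers instead of using get_stack_info().

```c
/*
 * Returns nonzero when the patch's check_alloca() would hit BUG_ON():
 * either fewer than 256 bytes are left on the stack, or the requested
 * size would eat into that 256-byte reserve. The first test also keeps
 * "stack_left - 256" from wrapping around.
 */
static int alloca_would_bug(unsigned long sp, unsigned long stack_begin,
			    unsigned long size)
{
	unsigned long stack_left = sp - stack_begin;

	return stack_left < 256 || size >= stack_left - 256;
}
```

With 4096 bytes left, a request of 3839 bytes passes and a request of 3840 bytes does not.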

More details and statistics are available in the cover letter and commit
messages in this patch series.

Best regards,
Alexander

Andy Lutomirski Nov. 14, 2017, 9:17 p.m. UTC | #8
On Tue, Nov 14, 2017 at 1:09 PM, Alexander Popov <alex.popov@linux.com> wrote:
> Thanks, Mark!
>
> Please see my comments below.
>
> On 14.11.2017 19:33, Mark Rutland wrote:
>> On Tue, Nov 14, 2017 at 08:13:43AM -0800, Andy Lutomirski wrote:
>>> What does the STEAKLACK plugin actually do?  I haven't followed this enough.
>>
>> The plugin adds instrumentation to track the maximum stack depth, though only
>> functions with a sufficiently large stackframe are instrumented.
>
> Yes. Functions with a big stack frame call track_stack() to update the
> lowest_stack value. If CONFIG_VMAP_STACK is disabled, track_stack() is compiled
> with a check for detecting stack depth overflow. This check is what I'm asking
> about.

Then you'll probably have to do something like what I did in the
VMAP_STACK code.

That being said, I don't entirely see the point.  If you want a
hardened kernel, you're going to enable VMAP_STACK.  Are there really
users of hardened 32-bit kernels?

Alexander Popov Nov. 14, 2017, 9:50 p.m. UTC | #9
Hello Andy,

Thanks for your prompt reply!

On 14.11.2017 19:13, Andy Lutomirski wrote:
> On Tue, Nov 14, 2017 at 7:36 AM, Alexander Popov <alex.popov@linux.com> wrote:
>> On 30.10.2017 21:06, Alexander Popov wrote:
>>> On 30.10.2017 20:32, Peter Zijlstra wrote:
>>>> On Mon, Oct 30, 2017 at 07:51:33PM +0300, Alexander Popov wrote:
>>>>> When the thread stack is exhausted, this BUG() is hit. But do_error_trap(),
>>>>> which handles the exception, calls track_stack() itself again (since it is
>>>>> instrumented by the gcc plugin). So this recursion proceeds with exhausting the
>>>>> thread stack.
>>>>
>>>> Add a __attribute__((nostacktrack)) on it?
>>>
>>> Yes, I already tried some blacklisting in the plugin, but it didn't really help,
>>> because:
>>>
>>> 1. there are other (more than 5) instrumented functions, that are called during
>>> BUG() handling too;
>>>
>>> 2. decreasing CONFIG_STACKLEAK_TRACK_MIN_SIZE would add more instrumented
>>> functions, which should be manually blacklisted (not good).
>>>
>>> I guess handling BUG() in another stack would be a solution. For example, Andy
>>> Lutomirski calls handle_stack_overflow in the DOUBLEFAULT_STACK
>>> (arch/x86/mm/fault.c). Should I do something similar?
>>
>> Hello Andy! May I ask your advice?
>>
>> When CONFIG_VMAP_STACK is disabled and STACKLEAK is enabled (for example, on
>> x86_32), we need another way to detect stack depth overflow. That is the reason
>> of having this BUG() in track_stack(). But it turns out to be recursive since
>> track_stack() will be called again during BUG() handling.
> 
> What does the STEAKLACK plugin actually do?  I haven't followed this enough.

I've just replied to Mark's explanation.

>> We can avoid that recursion by handling oops in another stack. It looks similar
>> to the way you call handle_stack_overflow() in arch/x86/mm/fault.c. But it seems
>> that I can't reuse that code, am I right?
> 
> You'd probably have to make 32-bit compatible, which means making a
> 32-bit variant of this thingy:
> 
>                 asm volatile ("movq %[stack], %%rsp\n\t"
>                               "call handle_stack_overflow\n\t"
>                               "1: jmp 1b"
>                               : ASM_CALL_CONSTRAINT
>                               : "D" ("kernel stack overflow (page fault)"),
>                                 "S" (regs), "d" (address),
>                                 [stack] "rm" (stack));

Hm, I don't have these pt_regs in track_stack(). That is why I think I can't
reuse your handle_stack_overflow(). I guess crafting pt_regs by hand would not
look good.

> Or you could force a double-fault.

Could you elaborate on that?

The goal is to have a verbose oops message and kill the offending process (if we
work on behalf of a process). Can I do that?

>> How should I do it properly?
>>
>> By the way, you wrote that you have some entry code changes which conflict with
>> STACKLEAK. May I ask for more details?
> 
> It's this thing:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=x86/entry_stack.wip
> 
> and I'll probably drop the ".wip" from the name shortly.

Wow, it's big. I'll look into it and maybe return with questions.

Best regards,
Alexander

Alexander Popov Nov. 14, 2017, 10:03 p.m. UTC | #10
On 15.11.2017 00:17, Andy Lutomirski wrote:
> On Tue, Nov 14, 2017 at 1:09 PM, Alexander Popov <alex.popov@linux.com> wrote:
>> Thanks, Mark!
>>
>> Please see my comments below.
>>
>> On 14.11.2017 19:33, Mark Rutland wrote:
>>> On Tue, Nov 14, 2017 at 08:13:43AM -0800, Andy Lutomirski wrote:
>>>> What does the STEAKLACK plugin actually do?  I haven't followed this enough.
>>>
>>> The plugin adds instrumentation to track the maximum stack depth, though only
>>> functions with a sufficiently large stackframe are instrumented.
>>
>> Yes. Functions with a big stack frame call track_stack() to update the
>> lowest_stack value. If CONFIG_VMAP_STACK is disabled, track_stack() is compiled
>> with a check for detecting stack depth overflow. This check is what I'm asking
>> about.
> 
> Then you'll probably have to do something like what I did in the
> VMAP_STACK code.

Yes!

> That being said, I don't entirely see the point.  If you want a
> hardened kernel, you're going to enable VMAP_STACK.  Are there really
> users of hardened 32-bit kernels?

You know, STACKLEAK already supports x86_32. It would be a pity to make
STACKLEAK dependent on VMAP_STACK and hence to drop STACKLEAK support for this
platform.

I hope there is a way to add a good-looking check to track_stack() and get at
least some benefit from it (although it will not catch all overflow cases).

Best regards,
Alexander

Patch

diff --git a/arch/Kconfig b/arch/Kconfig
index e9ec94c..f2de598 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -543,6 +543,18 @@  config GCC_PLUGIN_STACKLEAK
 	   * https://grsecurity.net/
 	   * https://pax.grsecurity.net/
 
+config STACKLEAK_TRACK_MIN_SIZE
+	int "Minimum stack frame size of functions tracked by STACKLEAK"
+	default 100
+	range 0 4096
+	depends on GCC_PLUGIN_STACKLEAK
+	help
+	  The STACKLEAK gcc plugin instruments the kernel code for tracking
+	  the lowest border of the kernel stack (and for some other purposes).
+	  It inserts the track_stack() call for the functions with a stack
+	  frame size greater than or equal to this parameter. If unsure,
+	  leave the default value 100.
+
 config HAVE_CC_STACKPROTECTOR
 	bool
 	help
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index f13b4c0..5a9b6cc 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -315,3 +315,18 @@  static int __init code_bytes_setup(char *s)
 	return 1;
 }
 __setup("code_bytes=", code_bytes_setup);
+
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+void __used check_alloca(unsigned long size)
+{
+	unsigned long sp = (unsigned long)&sp;
+	struct stack_info stack_info = {0};
+	unsigned long visit_mask = 0;
+	unsigned long stack_left;
+
+	BUG_ON(get_stack_info(&sp, current, &stack_info, &visit_mask));
+	stack_left = sp - (unsigned long)stack_info.begin;
+	BUG_ON(stack_left < 256 || size >= stack_left - 256);
+}
+EXPORT_SYMBOL(check_alloca);
+#endif
diff --git a/fs/exec.c b/fs/exec.c
index 3e14ba2..481ef4b 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1958,3 +1958,33 @@  COMPAT_SYSCALL_DEFINE5(execveat, int, fd,
 				  argv, envp, flags);
 }
 #endif
+
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+void __used track_stack(void)
+{
+	/*
+	 * N.B. The arch-specific part of the STACKLEAK feature fills the
+	 * kernel stack with the poison value, which has the register width.
+	 * That code assumes that the value of thread.lowest_stack is aligned
+	 * on the register width boundary.
+	 *
+	 * That is true for x86 and x86_64 because of the kernel stack
+	 * alignment on these platforms (for details, see cc_stack_align in
+	 * arch/x86/Makefile). Take care of that when you port STACKLEAK to
+	 * new platforms.
+	 */
+	unsigned long sp = (unsigned long)&sp;
+
+	if (sp < current->thread.lowest_stack &&
+	    sp >= (unsigned long)task_stack_page(current) +
+					2 * sizeof(unsigned long)) {
+		current->thread.lowest_stack = sp;
+	}
+
+#ifndef CONFIG_VMAP_STACK
+	if (unlikely((sp & (THREAD_SIZE - 1)) < (THREAD_SIZE / 16)))
+		BUG();
+#endif /* !CONFIG_VMAP_STACK */
+}
+EXPORT_SYMBOL(track_stack);
+#endif /* CONFIG_GCC_PLUGIN_STACKLEAK */
diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
index d1f7b0d..3793c41 100644
--- a/scripts/Makefile.gcc-plugins
+++ b/scripts/Makefile.gcc-plugins
@@ -34,6 +34,9 @@  ifdef CONFIG_GCC_PLUGINS
   gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_RANDSTRUCT)	+= -DRANDSTRUCT_PLUGIN
   gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_RANDSTRUCT_PERFORMANCE)	+= -fplugin-arg-randomize_layout_plugin-performance-mode
 
+  gcc-plugin-$(CONFIG_GCC_PLUGIN_STACKLEAK)	+= stackleak_plugin.so
+  gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STACKLEAK)	+= -DSTACKLEAK_PLUGIN -fplugin-arg-stackleak_plugin-track-min-size=$(CONFIG_STACKLEAK_TRACK_MIN_SIZE)
+
   GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y))
 
   export PLUGINCC GCC_PLUGINS_CFLAGS GCC_PLUGIN GCC_PLUGIN_SUBDIR
diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c
new file mode 100644
index 0000000..1461d88
--- /dev/null
+++ b/scripts/gcc-plugins/stackleak_plugin.c
@@ -0,0 +1,470 @@ 
+/*
+ * Copyright 2011-2017 by the PaX Team <pageexec@freemail.hu>
+ * Modified by Alexander Popov <alex.popov@linux.com>
+ * Licensed under the GPL v2
+ *
+ * Note: the choice of the license means that the compilation process is
+ * NOT 'eligible' as defined by gcc's library exception to the GPL v3,
+ * but for the kernel it doesn't matter since it doesn't link against
+ * any of the gcc libraries
+ *
+ * This gcc plugin is needed for tracking the lowest border of the kernel stack
+ * and checking that alloca calls don't cause stack overflow. It instruments
+ * the kernel code inserting:
+ *  - the check_alloca() call before alloca and the track_stack() call after it;
+ *  - the track_stack() call for the functions with a stack frame size greater
+ *     than or equal to the "track-min-size" plugin parameter.
+ *
+ * This plugin is ported from grsecurity/PaX. For more information see:
+ *   https://grsecurity.net/
+ *   https://pax.grsecurity.net/
+ *
+ * Debugging:
+ *  - use fprintf() to stderr, debug_generic_expr(), debug_gimple_stmt()
+ *     and print_rtl();
+ *  - add "-fdump-tree-all -fdump-rtl-all" to the plugin CFLAGS in
+ *     Makefile.gcc-plugins to see the verbose dumps of the gcc passes;
+ *  - use gcc -E to understand the preprocessing shenanigans;
+ *  - use gcc with enabled CFG/GIMPLE/SSA verification (--enable-checking).
+ */
+
+#include "gcc-common.h"
+
+__visible int plugin_is_GPL_compatible;
+
+static int track_frame_size = -1;
+static const char track_function[] = "track_stack";
+static const char check_function[] = "check_alloca";
+
+/*
+ * Mark these global variables (roots) for gcc garbage collector since
+ * they point to the garbage-collected memory.
+ */
+static GTY(()) tree track_function_decl;
+static GTY(()) tree check_function_decl;
+
+static struct plugin_info stackleak_plugin_info = {
+	.version = "201707101337",
+	.help = "track-min-size=nn\ttrack stack for functions with a stack frame size >= nn bytes\n"
+};
+
+static void stackleak_check_alloca(gimple_stmt_iterator *gsi)
+{
+	gimple stmt;
+	gcall *check_alloca;
+	tree alloca_size;
+	cgraph_node_ptr node;
+	int frequency;
+	basic_block bb;
+
+	/* Insert call to void check_alloca(unsigned long size) */
+	alloca_size = gimple_call_arg(gsi_stmt(*gsi), 0);
+	stmt = gimple_build_call(check_function_decl, 1, alloca_size);
+	check_alloca = as_a_gcall(stmt);
+	gsi_insert_before(gsi, check_alloca, GSI_SAME_STMT);
+
+	/* Update the cgraph */
+	bb = gimple_bb(check_alloca);
+	node = cgraph_get_create_node(check_function_decl);
+	gcc_assert(node);
+	frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+	cgraph_create_edge(cgraph_get_node(current_function_decl), node,
+			check_alloca, bb->count, frequency, bb->loop_depth);
+}
+
+static void stackleak_add_instrumentation(gimple_stmt_iterator *gsi, bool after)
+{
+	gimple stmt;
+	gcall *track_stack;
+	cgraph_node_ptr node;
+	int frequency;
+	basic_block bb;
+
+	/* Insert call to void track_stack(void) */
+	stmt = gimple_build_call(track_function_decl, 0);
+	track_stack = as_a_gcall(stmt);
+	if (after)
+		gsi_insert_after(gsi, track_stack, GSI_CONTINUE_LINKING);
+	else
+		gsi_insert_before(gsi, track_stack, GSI_SAME_STMT);
+
+	/* Update the cgraph */
+	bb = gimple_bb(track_stack);
+	node = cgraph_get_create_node(track_function_decl);
+	gcc_assert(node);
+	frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+	cgraph_create_edge(cgraph_get_node(current_function_decl), node,
+			track_stack, bb->count, frequency, bb->loop_depth);
+}
+
+static bool is_alloca(gimple stmt)
+{
+	if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA))
+		return true;
+
+#if BUILDING_GCC_VERSION >= 4007
+	if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA_WITH_ALIGN))
+		return true;
+#endif
+
+	return false;
+}
+
+/*
+ * Work with the GIMPLE representation of the code.
+ * Insert the check_alloca() call before alloca and track_stack() call after
+ * it. Also insert track_stack() call into the beginning of the function
+ * if it is not instrumented.
+ */
+static unsigned int stackleak_tree_instrument_execute(void)
+{
+	basic_block bb, entry_bb;
+	bool prologue_instrumented = false, is_leaf = true;
+	gimple_stmt_iterator gsi;
+
+	/*
+	 * ENTRY_BLOCK_PTR is a basic block which represents the possible entry
+	 * point of a function. This block does not contain any code and
+	 * has a CFG edge to its successor.
+	 */
+	gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+	entry_bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+
+	/*
+	 * 1. Loop through the GIMPLE statements in each of cfun basic blocks.
+	 * cfun is a global variable which represents the function that is
+	 * currently processed.
+	 */
+	FOR_EACH_BB_FN(bb, cfun) {
+		for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
+			gimple stmt;
+
+			stmt = gsi_stmt(gsi);
+
+			/* A leaf function is a function which makes no calls */
+			if (is_gimple_call(stmt))
+				is_leaf = false;
+
+			if (!is_alloca(stmt))
+				continue;
+
+			/* 2. Insert stack overflow check before alloca call */
+			stackleak_check_alloca(&gsi);
+
+			/* 3. Insert track_stack() call after alloca call */
+			stackleak_add_instrumentation(&gsi, true);
+			if (bb == entry_bb)
+				prologue_instrumented = true;
+		}
+	}
+
+	if (prologue_instrumented)
+		return 0;
+
+	/*
+	 * Special cases to skip the instrumentation.
+	 *
+	 * Taking the address of static inline functions materializes them,
+	 * but we mustn't instrument some of them as the resulting stack
+	 * alignment required by the function call ABI will break other
+	 * assumptions regarding the expected (but not otherwise enforced)
+	 * register clobbering ABI.
+	 *
+	 * Case in point: native_save_fl on amd64, when optimized for size,
+	 * would clobber rdx if it were instrumented here.
+	 *
+	 * TODO: any more special cases?
+	 */
+	if (is_leaf &&
+	    !TREE_PUBLIC(current_function_decl) &&
+	    DECL_DECLARED_INLINE_P(current_function_decl)) {
+		return 0;
+	}
+
+	if (is_leaf &&
+	    !strncmp(IDENTIFIER_POINTER(DECL_NAME(current_function_decl)),
+		     "_paravirt_", 10)) {
+		return 0;
+	}
+
+	/* 4. Insert track_stack() call at the function beginning */
+	bb = entry_bb;
+	if (!single_pred_p(bb)) {
+		/* gcc_assert(bb_loop_depth(bb) ||
+				(bb->flags & BB_IRREDUCIBLE_LOOP)); */
+		split_edge(single_succ_edge(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+		gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+		bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+	}
+	gsi = gsi_after_labels(bb);
+	stackleak_add_instrumentation(&gsi, false);
+
+	return 0;
+}
+
+/*
+ * Work with the RTL representation of the code.
+ * Remove the unneeded track_stack() calls from the functions which don't
+ * call alloca and have the stack frame size less than track_frame_size.
+ */
+static unsigned int stackleak_final_execute(void)
+{
+	rtx_insn *insn, *next;
+
+	if (cfun->calls_alloca)
+		return 0;
+
+	if (get_frame_size() >= track_frame_size)
+		return 0;
+
+	/*
+	 * 1. Find track_stack() calls. Loop through the chain of insns,
+	 * which is an RTL representation of the code for a function.
+	 *
+	 * An example of a matching insn:
+	 *    (call_insn 8 4 10 2 (call (mem (symbol_ref ("track_stack")
+	 *    [flags 0x41] <function_decl 0x7f7cd3302a80 track_stack>)
+	 *    [0 track_stack S1 A8]) (0)) 675 {*call} (expr_list
+	 *    (symbol_ref ("track_stack") [flags 0x41] <function_decl
+	 *    0x7f7cd3302a80 track_stack>) (expr_list (0) (nil))) (nil))
+	 */
+	for (insn = get_insns(); insn; insn = next) {
+		rtx body;
+
+		next = NEXT_INSN(insn);
+
+		/* Check the expression code of the insn */
+		if (!CALL_P(insn))
+			continue;
+
+		/*
+		 * Check the expression code of the insn body, which is an RTL
+		 * Expression (RTX) describing the side effect performed by
+		 * that insn.
+		 */
+		body = PATTERN(insn);
+		if (GET_CODE(body) != CALL)
+			continue;
+
+		/*
+		 * Check the first operand of the call expression. It should
+		 * be a mem RTX describing the needed subroutine with a
+		 * symbol_ref RTX.
+		 */
+		body = XEXP(body, 0);
+		if (GET_CODE(body) != MEM)
+			continue;
+
+		body = XEXP(body, 0);
+		if (GET_CODE(body) != SYMBOL_REF)
+			continue;
+
+		if (SYMBOL_REF_DECL(body) != track_function_decl)
+			continue;
+
+		/* 2. Delete the track_stack() call */
+		delete_insn_and_edges(insn);
+#if BUILDING_GCC_VERSION >= 4007
+		if (GET_CODE(next) == NOTE &&
+		    NOTE_KIND(next) == NOTE_INSN_CALL_ARG_LOCATION) {
+			insn = next;
+			next = NEXT_INSN(insn);
+			delete_insn_and_edges(insn);
+		}
+#endif
+	}
+
+	/*
+	 * Uncomment the following to see the code which was cleaned at this
+	 * pass. It should not contain check_alloca() and track_stack() calls.
+	 * The stack frame size should be less than track_frame_size.
+	 *
+	 * warning(0, "Instrumentation is removed, stack frame size: %ld",
+	 * 						get_frame_size());
+	 * print_simple_rtl(stderr, get_insns());
+	 */
+
+	return 0;
+}
+
+static bool stackleak_track_stack_gate(void)
+{
+	tree section;
+
+	section = lookup_attribute("section",
+				   DECL_ATTRIBUTES(current_function_decl));
+	if (section && TREE_VALUE(section)) {
+		section = TREE_VALUE(TREE_VALUE(section));
+
+		if (!strncmp(TREE_STRING_POINTER(section), ".init.text", 10))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".devinit.text", 13))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".cpuinit.text", 13))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".meminit.text", 13))
+			return false;
+	}
+
+	return track_frame_size >= 0;
+}
+
+/* Build function declarations for track_stack() and check_alloca() */
+static void stackleak_start_unit(void *gcc_data __unused,
+				 void *user_data __unused)
+{
+	tree fntype;
+
+	/* void track_stack(void) */
+	fntype = build_function_type_list(void_type_node, NULL_TREE);
+	track_function_decl = build_fn_decl(track_function, fntype);
+	DECL_ASSEMBLER_NAME(track_function_decl); /* for LTO */
+	TREE_PUBLIC(track_function_decl) = 1;
+	TREE_USED(track_function_decl) = 1;
+	DECL_EXTERNAL(track_function_decl) = 1;
+	DECL_ARTIFICIAL(track_function_decl) = 1;
+	DECL_PRESERVE_P(track_function_decl) = 1;
+
+	/* void check_alloca(unsigned long) */
+	fntype = build_function_type_list(void_type_node,
+				long_unsigned_type_node, NULL_TREE);
+	check_function_decl = build_fn_decl(check_function, fntype);
+	DECL_ASSEMBLER_NAME(check_function_decl); /* for LTO */
+	TREE_PUBLIC(check_function_decl) = 1;
+	TREE_USED(check_function_decl) = 1;
+	DECL_EXTERNAL(check_function_decl) = 1;
+	DECL_ARTIFICIAL(check_function_decl) = 1;
+	DECL_PRESERVE_P(check_function_decl) = 1;
+}
+
+/*
+ * Pass gate function is a predicate function that gets executed before the
+ * corresponding pass. If the return value is 'true' the pass gets executed,
+ * otherwise, it is skipped.
+ */
+static bool stackleak_tree_instrument_gate(void)
+{
+	return stackleak_track_stack_gate();
+}
+
+#define PASS_NAME stackleak_tree_instrument
+#define PROPERTIES_REQUIRED PROP_gimple_leh | PROP_cfg
+#define TODO_FLAGS_START TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts
+#define TODO_FLAGS_FINISH TODO_verify_ssa | TODO_verify_stmts | TODO_dump_func \
+			| TODO_update_ssa | TODO_rebuild_cgraph_edges
+#include "gcc-generate-gimple-pass.h"
+
+static bool stackleak_final_gate(void)
+{
+	return stackleak_track_stack_gate();
+}
+
+#define PASS_NAME stackleak_final
+#define TODO_FLAGS_FINISH TODO_dump_func
+#include "gcc-generate-rtl-pass.h"
+
+/*
+ * Every gcc plugin exports a plugin_init() function that is called right
+ * after the plugin is loaded. This function is responsible for registering
+ * the plugin callbacks and doing other required initialization.
+ */
+__visible int plugin_init(struct plugin_name_args *plugin_info,
+			  struct plugin_gcc_version *version)
+{
+	const char * const plugin_name = plugin_info->base_name;
+	const int argc = plugin_info->argc;
+	const struct plugin_argument * const argv = plugin_info->argv;
+	int i = 0; /* argc must be 1 (checked below), so only argv[0] is used */
+
+	/* Extra GGC root tables describing our GTY-ed data */
+	static const struct ggc_root_tab gt_ggc_r_gt_stackleak[] = {
+		{
+			.base = &track_function_decl,
+			.nelt = 1,
+			.stride = sizeof(track_function_decl),
+			.cb = &gt_ggc_mx_tree_node,
+			.pchw = &gt_pch_nx_tree_node
+		},
+		{
+			.base = &check_function_decl,
+			.nelt = 1,
+			.stride = sizeof(check_function_decl),
+			.cb = &gt_ggc_mx_tree_node,
+			.pchw = &gt_pch_nx_tree_node
+		},
+		LAST_GGC_ROOT_TAB
+	};
+
+	/*
+	 * The stackleak_tree_instrument pass should be executed before the
+	 * "optimized" pass, which is the control flow graph cleanup that is
+	 * performed just before expanding gcc trees to the RTL. In former
+	 * versions of the plugin this new pass was inserted before the
+	 * "tree_profile" pass, which is currently called "profile".
+	 */
+	PASS_INFO(stackleak_tree_instrument, "optimized", 1,
+						PASS_POS_INSERT_BEFORE);
+
+	/*
+	 * The stackleak_final pass should be executed before the "final" pass,
+	 * which turns the RTL (Register Transfer Language) into assembly.
+	 */
+	PASS_INFO(stackleak_final, "final", 1, PASS_POS_INSERT_BEFORE);
+
+	if (!plugin_default_version_check(version, &gcc_version)) {
+		error(G_("incompatible gcc/plugin versions"));
+		return 1;
+	}
+
+	/* Parse the plugin arguments */
+	if (argc != 1) {
+		error(G_("bad number of plugin arguments: %d"), argc);
+		return 1;
+	}
+
+	if (strcmp(argv[i].key, "track-min-size")) {
+		error(G_("unknown option '-fplugin-arg-%s-%s'"),
+				plugin_name, argv[i].key);
+		return 1;
+	}
+
+	if (!argv[i].value) {
+		error(G_("no value supplied for option '-fplugin-arg-%s-%s'"),
+				plugin_name, argv[i].key);
+		return 1;
+	}
+
+	track_frame_size = atoi(argv[i].value);
+	if (track_frame_size < 0) {
+		error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"),
+				plugin_name, argv[i].key, argv[i].value);
+		return 1;
+	}
+
+	/* Give the information about the plugin */
+	register_callback(plugin_name, PLUGIN_INFO, NULL,
+						&stackleak_plugin_info);
+
+	/* Register to be called before processing a translation unit */
+	register_callback(plugin_name, PLUGIN_START_UNIT,
+					&stackleak_start_unit, NULL);
+
+	/* Register an extra GCC garbage collector (GGC) root table */
+	register_callback(plugin_name, PLUGIN_REGISTER_GGC_ROOTS, NULL,
+					(void *)&gt_ggc_r_gt_stackleak);
+
+	/*
+	 * Hook into the Pass Manager to register new gcc passes.
+	 *
+	 * The stack frame size info is available only at the last RTL pass,
+	 * when it's too late to insert complex code like a function call.
+	 * So we register two gcc passes to instrument every function at first
+	 * and remove the unneeded instrumentation later.
+	 */
+	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+					&stackleak_tree_instrument_pass_info);
+	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+					&stackleak_final_pass_info);
+
+	return 0;
+}
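
The argument handling at the end of plugin_init() accepts exactly one
"track-min-size=nn" option and converts the value with atoi(), which
silently accepts trailing junk (for example "12abc" is read as 12).
Below is a minimal standalone sketch of a stricter variant based on
strtol(). It is not part of the patch: struct demo_plugin_arg and
demo_parse_track_min_size() are hypothetical names that only model the
key/value pair of gcc's struct plugin_argument, and the function returns
-1 where the plugin would call error().

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the key/value pair in gcc's plugin_argument. */
struct demo_plugin_arg {
	const char *key;
	const char *value;
};

/*
 * Parse a single "track-min-size=nn" argument.
 * Returns the size (>= 0) on success or -1 on any error.
 */
static int demo_parse_track_min_size(const struct demo_plugin_arg *argv,
				     int argc)
{
	char *end;
	long val;

	assert(argv != NULL);

	/* Exactly one argument is expected, as in the patch */
	if (argc != 1)
		return -1;

	if (!argv[0].key || strcmp(argv[0].key, "track-min-size"))
		return -1;

	/* A missing or empty value is an error */
	if (!argv[0].value || argv[0].value[0] == '\0')
		return -1;

	/* Unlike atoi(), strtol() lets us reject trailing junk and overflow */
	errno = 0;
	val = strtol(argv[0].value, &end, 10);
	if (errno || *end != '\0' || val < 0 || val > INT_MAX)
		return -1;

	return (int)val;
}
```

With this sketch, { "track-min-size", "100" } is parsed as 100, while an
unknown key, a NULL value, "-1" and "12abc" are all rejected.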