diff mbox

[RFC,v2,1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls

Message ID 1497018608-22168-1-git-send-email-alex.popov@linux.com (mailing list archive)
State New, archived
Headers show

Commit Message

Alexander Popov June 9, 2017, 2:30 p.m. UTC
Hello,

My employer Positive Technologies kindly allowed me to spend a part of my
working time on helping KSPP. So I join the initiative of porting STACKLEAK
to the mainline, which was started by Tycho Andersen.

STACKLEAK is a security feature developed by Grsecurity/PaX (kudos to them!),
which can mitigate the damage from kernel stack leak bugs (see CVE-2010-2963,
CVE-2016-4569 and CVE-2016-4578).

I carefully extracted STACKLEAK from the last public patch of Grsecurity/PaX
and do my best to understand it. So I added some comments describing that
understanding. You are welcome to discuss it.

Here are the results of a brief performance test on x86_64:

Hardware: Intel Core i7-4770, 16 GB RAM

Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
 --metrics-brief
Result on 4.11-rc8:
  cpu bogo ops 269955
  iosync bogo ops 9809985
  vm bogo ops 17093608
Result on 4.11-rc8+stackleak:
  cpu bogo ops 270106 (+0.6%)
  iosync bogo ops 9474535 (-3.42%)
  vm bogo ops 17093608 (the same)

Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
Average result on 4.11-rc8: 8.71s
Average result on 4.11-rc8+stackleak: 9.08s (+4.29%)

Test #3: building the Linux kernel with Ubuntu config (time make -j9)
Result on 4.11-rc8:
  real	32m14.893s
  user	237m30.886s
  sys	11m12.273s
Result on 4.11-rc8+stackleak:
  real	32m26.881s (+0.62%)
  user	238m38.926s (+0.48%)
  sys	11m36.426s (+3.59%)

There is an important aspect of STACKLEAK on i386. PaX Team might know
about it, I can't say for sure. Dear Grsecurity, is it fine to talk with
you about such things in that mailing list? Should I do it differently?

The STACKLEAK gcc plugin does not instrument the kernel code for i386. I've
checked that for the last public patch of Grsecurity/PaX and see the same
behaviour on my machines. The reason: the ix86_cmodel for the Linux kernel
on i386 is not CM_KERNEL. So the STACKLEAK plugin seems to skip the
instrumentation for that platform. As a result, on i386 erase_kstack()
always starts to search for the bottom of the stack from the top minus 128.

See the results of the same performance tests on i386:

Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
 --metrics-brief
Result on 4.11-rc8:
  cpu bogo ops 207754
  iosync bogo ops 9442815
  vm bogo ops 8546804
Result on 4.11-rc8+stackleak:
  cpu bogo ops 206061 (-0.81%)
  iosync bogo ops 9435139 (-0.08%)
  vm bogo ops 8546804 (the same)

Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
Average result on 4.11-rc8: 8.197s
Average result on 4.11-rc8+stackleak: 9.134s (+11.43%)

Test #3: building the Linux kernel with Ubuntu config (time make -j9)
Result on 4.11-rc8:
  real	18m15.372s
  user	129m58.169s
  sys	8m27.884s
Result on 4.11-rc8+stackleak:
  real	18m34.244s (+1.72%)
  user	132m33.843s (+2.00%)
  sys	8m37.658s (+1.92%)

More things to be done:
 - understand how the STACKLEAK gcc plugin works;
 - develop tests for STACKLEAK.

Best regards,
Alexander

-- >8 --

From d22af45233b2f6d657a29dcb1815b35a5a45c539 Mon Sep 17 00:00:00 2001
From: Alexander Popov <alex.popov@linux.com>
Date: Fri, 9 Jun 2017 15:21:16 +0300
Subject: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the
 kernel stack at the end of syscalls

The stackleak feature erases the kernel stack before returning from syscalls.
That reduces the information which a kernel stack leak bug can reveal.

This feature consists of:
  - the architecture-specific code filling the used part of the kernel stack
  with a poison value before returning to the userspace (currently only
  for i386 and x86_64);
  - the gcc plugin for tracking the lowest border of the kernel stack. It
  instruments the kernel code inserting the track_stack() function call if
  a stack frame size is over a specified limit.

The stackleak feature is ported from grsecurity/PaX. For more information see:
  https://grsecurity.net/
  https://pax.grsecurity.net/

This code is verbatim from Brad Spengler/PaX Team's code in the last public
patch of grsecurity/PaX based on our understanding of the code. Changes or
omissions from the original code are ours and don't reflect the original
grsecurity/PaX code.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Tycho Andersen <tycho@docker.com>
---
 arch/Kconfig                           |  20 ++
 arch/x86/Kconfig                       |   1 +
 arch/x86/entry/common.c                |  17 +-
 arch/x86/entry/entry_32.S              |  71 +++++++
 arch/x86/entry/entry_64.S              |  99 ++++++++++
 arch/x86/entry/entry_64_compat.S       |   8 +
 arch/x86/include/asm/processor.h       |   4 +
 arch/x86/kernel/asm-offsets.c          |   9 +
 arch/x86/kernel/dumpstack_32.c         |  12 ++
 arch/x86/kernel/dumpstack_64.c         |  33 ++++
 arch/x86/kernel/process_32.c           |   5 +
 arch/x86/kernel/process_64.c           |   5 +
 fs/exec.c                              |  17 ++
 scripts/Makefile.gcc-plugins           |   3 +
 scripts/gcc-plugins/stackleak_plugin.c | 342 +++++++++++++++++++++++++++++++++
 15 files changed, 643 insertions(+), 3 deletions(-)
 create mode 100644 scripts/gcc-plugins/stackleak_plugin.c

Comments

Kees Cook June 9, 2017, 5:28 p.m. UTC | #1
On Fri, Jun 9, 2017 at 7:30 AM, Alexander Popov <alex.popov@linux.com> wrote:
> Hello,
>
> My employer Positive Technologies kindly allowed me to spend a part of my
> working time on helping KSPP. So I join the initiative of porting STACKLEAK
> to the mainline, which was started by Tycho Andersen.

Great, thanks for working on this!

> STACKLEAK is a security feature developed by Grsecurity/PaX (kudos to them!),
> which can mitigate the damage from kernel stack leak bugs (see CVE-2010-2963,
> CVE-2016-4569 and CVE-2016-4578).

It might help to distinguish that this covers at least two classes of
issues: stack content exposure (the latter two CVEs above), and it can
block some uninitialized variable attacks (like the mentioned
CVE-2010-2963). In other words, this can be more than just an info
exposure defense.

> I carefully extracted STACKLEAK from the last public patch of Grsecurity/PaX
> and do my best to understand it. So I added some comments describing that
> understanding. You are welcome to discuss it.

Great, adding more comments to the gcc plugin will help others who are
trying to get up to speed on it (like me). :)

> Here are the results of a brief performance test on x86_64:
>
> Hardware: Intel Core i7-4770, 16 GB RAM
>
> Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
>  --metrics-brief
> Result on 4.11-rc8:
>   cpu bogo ops 269955
>   iosync bogo ops 9809985
>   vm bogo ops 17093608
> Result on 4.11-rc8+stackleak:
>   cpu bogo ops 270106 (+0.6%)
>   iosync bogo ops 9474535 (-3.42%)
>   vm bogo ops 17093608 (the same)
>
> Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
> Average result on 4.11-rc8: 8.71s
> Average result on 4.11-rc8+stackleak: 9.08s (+4.29%)
>
> Test #3: building the Linux kernel with Ubuntu config (time make -j9)
> Result on 4.11-rc8:
>   real  32m14.893s
>   user  237m30.886s
>   sys   11m12.273s
> Result on 4.11-rc8+stackleak:
>   real  32m26.881s (+0.62%)
>   user  238m38.926s (+0.48%)
>   sys   11m36.426s (+3.59%)

Awesome, and thanks for the benchmarks! That should really help people
understand the trade-offs for using this feature (and is likely worth
mentioning in the Kconfig). Seems like less than 4% overhead, maybe
much less? Real time on build times seems like a tiny difference, but
hackbench shows 4%.

> There is an important aspect of STACKLEAK on i386. PaX Team might know
> about it, I can't say for sure. Dear Grsecurity, is it fine to talk with
> you about such things in that mailing list? Should I do it differently?
>
> The STACKLEAK gcc plugin does not instrument the kernel code for i386. I've
> checked that for the last public patch of Grsecurity/PaX and see the same
> behaviour on my machines. The reason: the ix86_cmodel for the Linux kernel
> on i386 is not CM_KERNEL. So the STACKLEAK plugin seems to skip the
> instrumentation for that platform. As a result, on i386 erase_kstack()
> always starts to search for the bottom of the stack from the top minus 128.

What version of gcc did you use for these builds, btw? Perhaps
something changed there?

> See the results of the same performance tests on i386:
>
> Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
>  --metrics-brief
> Result on 4.11-rc8:
>   cpu bogo ops 207754
>   iosync bogo ops 9442815
>   vm bogo ops 8546804
> Result on 4.11-rc8+stackleak:
>   cpu bogo ops 206061 (-0.81%)
>   iosync bogo ops 9435139 (-0.08%)
>   vm bogo ops 8546804 (the same)
>
> Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
> Average result on 4.11-rc8: 8.197s
> Average result on 4.11-rc8+stackleak: 9.134s (+11.43%)
>
> Test #3: building the Linux kernel with Ubuntu config (time make -j9)
> Result on 4.11-rc8:
>   real  18m15.372s
>   user  129m58.169s
>   sys   8m27.884s
> Result on 4.11-rc8+stackleak:
>   real  18m34.244s (+1.72%)
>   user  132m33.843s (+2.00%)
>   sys   8m37.658s (+1.92%)
>
> More things to be done:
>  - understand how the STACKLEAK gcc plugin works;
>  - develop tests for STACKLEAK.

Since this is mostly an anti-exposure defense, LKDTM is probably not a
good match (i.e. organizing a test for the uninit variable case can be
very fragile). I think something similar to test_user_copy.c would be
better.

> From d22af45233b2f6d657a29dcb1815b35a5a45c539 Mon Sep 17 00:00:00 2001
> From: Alexander Popov <alex.popov@linux.com>
> Date: Fri, 9 Jun 2017 15:21:16 +0300
> Subject: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the
>  kernel stack at the end of syscalls
>
> The stackleak feature erases the kernel stack before returning from syscalls.
> That reduces the information which a kernel stack leak bug can reveal.
>
> This feature consists of:
>   - the architecture-specific code filling the used part of the kernel stack
>   with a poison value before returning to the userspace (currently only
>   for i386 and x86_64);
>   - the gcc plugin for tracking the lowest border of the kernel stack. It
>   instruments the kernel code inserting the track_stack() function call if
>   a stack frame size is over a specified limit.
>
> The stackleak feature is ported from grsecurity/PaX. For more information see:
>   https://grsecurity.net/
>   https://pax.grsecurity.net/
>
> This code is verbatim from Brad Spengler/PaX Team's code in the last public

Maybe "nearly verbatim" since Kconfigs and such were (correctly)
renamed/expanded, and comments were added.

> patch of grsecurity/PaX based on our understanding of the code. Changes or
> omissions from the original code are ours and don't reflect the original
> grsecurity/PaX code.
>
> Signed-off-by: Alexander Popov <alex.popov@linux.com>
> Signed-off-by: Tycho Andersen <tycho@docker.com>

This S-o-b is not right, I think it should just be yours. Perhaps say
something like "Continued upstreaming work started by Tycho Anderson."
in the commit log.

> ---
>  arch/Kconfig                           |  20 ++
>  arch/x86/Kconfig                       |   1 +
>  arch/x86/entry/common.c                |  17 +-
>  arch/x86/entry/entry_32.S              |  71 +++++++
>  arch/x86/entry/entry_64.S              |  99 ++++++++++
>  arch/x86/entry/entry_64_compat.S       |   8 +
>  arch/x86/include/asm/processor.h       |   4 +
>  arch/x86/kernel/asm-offsets.c          |   9 +
>  arch/x86/kernel/dumpstack_32.c         |  12 ++
>  arch/x86/kernel/dumpstack_64.c         |  33 ++++
>  arch/x86/kernel/process_32.c           |   5 +
>  arch/x86/kernel/process_64.c           |   5 +
>  fs/exec.c                              |  17 ++
>  scripts/Makefile.gcc-plugins           |   3 +
>  scripts/gcc-plugins/stackleak_plugin.c | 342 +++++++++++++++++++++++++++++++++
>  15 files changed, 643 insertions(+), 3 deletions(-)
>  create mode 100644 scripts/gcc-plugins/stackleak_plugin.c
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index cd211a1..a209bd5 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -436,6 +436,26 @@ config GCC_PLUGIN_STRUCTLEAK_VERBOSE
>           initialized. Since not all existing initializers are detected
>           by the plugin, this can produce false positive warnings.
>
> +config ARCH_HAS_STACKLEAK
> +       def_bool n

This can just be "bool" (it will default off).

> +       help
> +         An architecture should select this if it has the code which
> +         fills the used part of the kernel stack with a poison value
> +         before returning from system calls.

Maybe specifically mention the -0xBEEF value?

> +
> +config STACKLEAK

I would follow the naming of the others, and call this GCC_PLUGIN_STACKLEAK

> +       bool "Erase the kernel stack before returning from syscalls"
> +       depends on GCC_PLUGINS
> +       depends on ARCH_HAS_STACKLEAK
> +       help
> +         This option makes the kernel erase the kernel stack before it
> +         returns from a system call. That reduces the information which
> +         a kernel stack leak bug can reveal.
> +
> +         This plugin was ported from grsecurity/PaX. More information at:
> +          * https://grsecurity.net/
> +          * https://pax.grsecurity.net/

Is Kconfig smart enough to include this in the gcc plugins sub-page
when the ARCH_HAS_STACKLEAK config is between this and the others?

> +
>  config HAVE_CC_STACKPROTECTOR
>         bool
>         help
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index cc98d5a..5988a5f 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -56,6 +56,7 @@ config X86
>         select ARCH_HAS_PMEM_API                if X86_64
>         select ARCH_HAS_SET_MEMORY
>         select ARCH_HAS_SG_CHAIN
> +       select ARCH_HAS_STACKLEAK
>         select ARCH_HAS_STRICT_KERNEL_RWX
>         select ARCH_HAS_STRICT_MODULE_RWX
>         select ARCH_HAS_UBSAN_SANITIZE_ALL
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index 370c42c..8ff0a84 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -43,6 +43,12 @@ __visible inline void enter_from_user_mode(void)
>  static inline void enter_from_user_mode(void) {}
>  #endif
>
> +#ifdef CONFIG_STACKLEAK
> +asmlinkage void erase_kstack(void);
> +#else
> +static void erase_kstack(void) {}
> +#endif
> +
>  static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
>  {
>  #ifdef CONFIG_X86_64
> @@ -79,8 +85,10 @@ static long syscall_trace_enter(struct pt_regs *regs)
>                 emulated = true;
>
>         if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
> -           tracehook_report_syscall_entry(regs))
> +           tracehook_report_syscall_entry(regs)) {
> +               erase_kstack();
>                 return -1L;
> +       }
>
>         if (emulated)
>                 return -1L;
> @@ -114,9 +122,11 @@ static long syscall_trace_enter(struct pt_regs *regs)
>                         sd.args[5] = regs->bp;
>                 }
>
> -               ret = __secure_computing(&sd);
> -               if (ret == -1)
> +               ret = secure_computing(&sd);
> +               if (ret == -1) {
> +                       erase_kstack();
>                         return ret;
> +               }
>         }
>  #endif

Since this is x86-specific, maybe Cc x86@kernel.org and/or Andy
Lutomirski on follow-up versions. I'm sure they'll have opinions about
changes to the entry code. :)

It seems like it shouldn't be too hard to add on-user-return erasure
code to other architectures too.

-Kees
Alexander Popov June 9, 2017, 11 p.m. UTC | #2
On 09.06.2017 20:28, Kees Cook wrote:
> On Fri, Jun 9, 2017 at 7:30 AM, Alexander Popov <alex.popov@linux.com> wrote:
>> Hello,
>>
>> My employer Positive Technologies kindly allowed me to spend a part of my
>> working time on helping KSPP. So I join the initiative of porting STACKLEAK
>> to the mainline, which was started by Tycho Andersen.
> 
> Great, thanks for working on this!

Hello Kees, thanks for your prompt reply.

>> STACKLEAK is a security feature developed by Grsecurity/PaX (kudos to them!),
>> which can mitigate the damage from kernel stack leak bugs (see CVE-2010-2963,
>> CVE-2016-4569 and CVE-2016-4578).
> 
> It might help to distinguish that this covers at least two classes of
> issues: stack content exposure (the latter two CVEs above), and it can
> block some uninitialized variable attacks (like the mentioned
> CVE-2010-2963). In other words, this can be more than just an info
> exposure defense.

Thanks, I'll improve the description in the next version.

>> I carefully extracted STACKLEAK from the last public patch of Grsecurity/PaX
>> and do my best to understand it. So I added some comments describing that
>> understanding. You are welcome to discuss it.
> 
> Great, adding more comments to the gcc plugin will help others who are
> trying to get up to speed on it (like me). :)

Currently I annotated the arch-specific asm part of the STACKLEAK feature.
However, there are a few questions in my comments marked with TODO.
Playing with the gcc plugin is the next step.

>> Here are the results of a brief performance test on x86_64:
>>
>> Hardware: Intel Core i7-4770, 16 GB RAM
>>
>> Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
>>  --metrics-brief
>> Result on 4.11-rc8:
>>   cpu bogo ops 269955
>>   iosync bogo ops 9809985
>>   vm bogo ops 17093608
>> Result on 4.11-rc8+stackleak:
>>   cpu bogo ops 270106 (+0.6%)
>>   iosync bogo ops 9474535 (-3.42%)
>>   vm bogo ops 17093608 (the same)
>>
>> Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
>> Average result on 4.11-rc8: 8.71s
>> Average result on 4.11-rc8+stackleak: 9.08s (+4.29%)
>>
>> Test #3: building the Linux kernel with Ubuntu config (time make -j9)
>> Result on 4.11-rc8:
>>   real  32m14.893s
>>   user  237m30.886s
>>   sys   11m12.273s
>> Result on 4.11-rc8+stackleak:
>>   real  32m26.881s (+0.62%)
>>   user  238m38.926s (+0.48%)
>>   sys   11m36.426s (+3.59%)
> 
> Awesome, and thanks for the benchmarks! That should really help people
> understand the trade-offs for using this feature (and is likely worth
> mentioning in the Kconfig). Seems like less than 4% overhead, maybe
> much less? Real time on build times seems like a tiny difference, but
> hackbench shows 4%.

Yes, the performance penalty of STACKLEAK differs a lot depending on the
kind of load. Do you have any idea which test can give a bigger slowdown?
It should be some rapid syscall exhausting the kernel stack hard.

>> There is an important aspect of STACKLEAK on i386. PaX Team might know
>> about it, I can't say for sure. Dear Grsecurity, is it fine to talk with
>> you about such things in that mailing list? Should I do it differently?
>>
>> The STACKLEAK gcc plugin does not instrument the kernel code for i386. I've
>> checked that for the last public patch of Grsecurity/PaX and see the same
>> behaviour on my machines. The reason: the ix86_cmodel for the Linux kernel
>> on i386 is not CM_KERNEL. So the STACKLEAK plugin seems to skip the
>> instrumentation for that platform. As a result, on i386 erase_kstack()
>> always starts to search for the bottom of the stack from the top minus 128.
> 
> What version of gcc did you use for these builds, btw? Perhaps
> something changed there?

I tried to compile the Linux kernel for i386 with two different versions of gcc:

#gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
6.3.0-18ubuntu2~16.04' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared
--enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/
--enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify
--enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin
--enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib
--enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 6.3.0 20170519 (Ubuntu/Linaro 6.3.0-18ubuntu2~16.04)

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-linux-gnu/5/lto-wrapper
Target: i686-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.4'
--with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls
--with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify
--enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin
--enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-i386/jre --enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-i386
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-i386 --with-arch-directory=i386
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-targets=all
--enable-multiarch --disable-werror --with-arch-32=i686 --with-multilib-list=m32,m64,mx32
--enable-multilib --with-tune=generic --enable-checking=release --build=i686-linux-gnu
--host=i686-linux-gnu --target=i686-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)

Both of them compiled it with ix86_cmodel=0, so the STACKLEAK instrumentation
was skipped.

>> See the results of the same performance tests on i386:
>>
>> Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
>>  --metrics-brief
>> Result on 4.11-rc8:
>>   cpu bogo ops 207754
>>   iosync bogo ops 9442815
>>   vm bogo ops 8546804
>> Result on 4.11-rc8+stackleak:
>>   cpu bogo ops 206061 (-0.81%)
>>   iosync bogo ops 9435139 (-0.08%)
>>   vm bogo ops 8546804 (the same)
>>
>> Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
>> Average result on 4.11-rc8: 8.197s
>> Average result on 4.11-rc8+stackleak: 9.134s (+11.43%)
>>
>> Test #3: building the Linux kernel with Ubuntu config (time make -j9)
>> Result on 4.11-rc8:
>>   real  18m15.372s
>>   user  129m58.169s
>>   sys   8m27.884s
>> Result on 4.11-rc8+stackleak:
>>   real  18m34.244s (+1.72%)
>>   user  132m33.843s (+2.00%)
>>   sys   8m37.658s (+1.92%)
>>
>> More things to be done:
>>  - understand how the STACKLEAK gcc plugin works;
>>  - develop tests for STACKLEAK.
> 
> Since this is mostly an anti-exposure defense, LKDTM is probably not a
> good match (i.e. organizing a test for the uninit variable case can be
> very fragile). I think something similar to test_user_copy.c would be
> better.

Ok, thanks for the clue.

>> From d22af45233b2f6d657a29dcb1815b35a5a45c539 Mon Sep 17 00:00:00 2001
>> From: Alexander Popov <alex.popov@linux.com>
>> Date: Fri, 9 Jun 2017 15:21:16 +0300
>> Subject: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the
>>  kernel stack at the end of syscalls
>>
>> The stackleak feature erases the kernel stack before returning from syscalls.
>> That reduces the information which a kernel stack leak bug can reveal.
>>
>> This feature consists of:
>>   - the architecture-specific code filling the used part of the kernel stack
>>   with a poison value before returning to the userspace (currently only
>>   for i386 and x86_64);
>>   - the gcc plugin for tracking the lowest border of the kernel stack. It
>>   instruments the kernel code inserting the track_stack() function call if
>>   a stack frame size is over a specified limit.
>>
>> The stackleak feature is ported from grsecurity/PaX. For more information see:
>>   https://grsecurity.net/
>>   https://pax.grsecurity.net/
>>
>> This code is verbatim from Brad Spengler/PaX Team's code in the last public
> 
> Maybe "nearly verbatim" since Kconfigs and such were (correctly)
> renamed/expanded, and comments were added.

Agree, I'll fix it.

>> patch of grsecurity/PaX based on our understanding of the code. Changes or
>> omissions from the original code are ours and don't reflect the original
>> grsecurity/PaX code.
>>
>> Signed-off-by: Alexander Popov <alex.popov@linux.com>
>> Signed-off-by: Tycho Andersen <tycho@docker.com>
> 
> This S-o-b is not right, I think it should just be yours. Perhaps say
> something like "Continued upstreaming work started by Tycho Anderson."
> in the commit log.

Ok, excuse me.
Tycho looked at this patch and tested it. I had to use "Tested-by".

>> ---
>>  arch/Kconfig                           |  20 ++
>>  arch/x86/Kconfig                       |   1 +
>>  arch/x86/entry/common.c                |  17 +-
>>  arch/x86/entry/entry_32.S              |  71 +++++++
>>  arch/x86/entry/entry_64.S              |  99 ++++++++++
>>  arch/x86/entry/entry_64_compat.S       |   8 +
>>  arch/x86/include/asm/processor.h       |   4 +
>>  arch/x86/kernel/asm-offsets.c          |   9 +
>>  arch/x86/kernel/dumpstack_32.c         |  12 ++
>>  arch/x86/kernel/dumpstack_64.c         |  33 ++++
>>  arch/x86/kernel/process_32.c           |   5 +
>>  arch/x86/kernel/process_64.c           |   5 +
>>  fs/exec.c                              |  17 ++
>>  scripts/Makefile.gcc-plugins           |   3 +
>>  scripts/gcc-plugins/stackleak_plugin.c | 342 +++++++++++++++++++++++++++++++++
>>  15 files changed, 643 insertions(+), 3 deletions(-)
>>  create mode 100644 scripts/gcc-plugins/stackleak_plugin.c
>>
>> diff --git a/arch/Kconfig b/arch/Kconfig
>> index cd211a1..a209bd5 100644
>> --- a/arch/Kconfig
>> +++ b/arch/Kconfig
>> @@ -436,6 +436,26 @@ config GCC_PLUGIN_STRUCTLEAK_VERBOSE
>>           initialized. Since not all existing initializers are detected
>>           by the plugin, this can produce false positive warnings.
>>
>> +config ARCH_HAS_STACKLEAK
>> +       def_bool n
> 
> This can just be "bool" (it will default off).

Ok.

>> +       help
>> +         An architecture should select this if it has the code which
>> +         fills the used part of the kernel stack with a poison value
>> +         before returning from system calls.
> 
> Maybe specifically mention the -0xBEEF value?

Ok. Should I create some macro for it?

>> +
>> +config STACKLEAK
> 
> I would follow the naming of the others, and call this GCC_PLUGIN_STACKLEAK

It seems to me that GCC_PLUGIN_STACKLEAK is not a right name since the whole
feature consists of two parts: the arch-specific asm code actually cleaning
the kernel stack and the gcc plugin which helps to do it faster and more
reliable. What do you think?

>> +       bool "Erase the kernel stack before returning from syscalls"
>> +       depends on GCC_PLUGINS
>> +       depends on ARCH_HAS_STACKLEAK
>> +       help
>> +         This option makes the kernel erase the kernel stack before it
>> +         returns from a system call. That reduces the information which
>> +         a kernel stack leak bug can reveal.
>> +
>> +         This plugin was ported from grsecurity/PaX. More information at:
>> +          * https://grsecurity.net/
>> +          * https://pax.grsecurity.net/
> 
> Is Kconfig smart enough to include this in the gcc plugins sub-page
> when the ARCH_HAS_STACKLEAK config is between this and the others?

You are right, it's outside gcc plugins subpage. If we finally choose the
name GCC_PLUGIN_STACKLEAK, I'll put it inside.

>> +
>>  config HAVE_CC_STACKPROTECTOR
>>         bool
>>         help
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index cc98d5a..5988a5f 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -56,6 +56,7 @@ config X86
>>         select ARCH_HAS_PMEM_API                if X86_64
>>         select ARCH_HAS_SET_MEMORY
>>         select ARCH_HAS_SG_CHAIN
>> +       select ARCH_HAS_STACKLEAK
>>         select ARCH_HAS_STRICT_KERNEL_RWX
>>         select ARCH_HAS_STRICT_MODULE_RWX
>>         select ARCH_HAS_UBSAN_SANITIZE_ALL
>> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
>> index 370c42c..8ff0a84 100644
>> --- a/arch/x86/entry/common.c
>> +++ b/arch/x86/entry/common.c
>> @@ -43,6 +43,12 @@ __visible inline void enter_from_user_mode(void)
>>  static inline void enter_from_user_mode(void) {}
>>  #endif
>>
>> +#ifdef CONFIG_STACKLEAK
>> +asmlinkage void erase_kstack(void);
>> +#else
>> +static void erase_kstack(void) {}
>> +#endif
>> +
>>  static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
>>  {
>>  #ifdef CONFIG_X86_64
>> @@ -79,8 +85,10 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>                 emulated = true;
>>
>>         if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
>> -           tracehook_report_syscall_entry(regs))
>> +           tracehook_report_syscall_entry(regs)) {
>> +               erase_kstack();
>>                 return -1L;
>> +       }
>>
>>         if (emulated)
>>                 return -1L;
>> @@ -114,9 +122,11 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>                         sd.args[5] = regs->bp;
>>                 }
>>
>> -               ret = __secure_computing(&sd);
>> -               if (ret == -1)
>> +               ret = secure_computing(&sd);
>> +               if (ret == -1) {
>> +                       erase_kstack();
>>                         return ret;
>> +               }
>>         }
>>  #endif
> 
> Since this is x86-specific, maybe Cc x86@kernel.org and/or Andy
> Lutomirski on follow-up versions. I'm sure they'll have opinions about
> changes to the entry code. :)

Thanks! Of course, I'll do it for the next version.

> It seems like it shouldn't be too hard to add on-user-return erasure
> code to other architectures too.

Yes, maybe. At the same time the gcc plugin might somehow depend on x86.
I'll investigate that.

Best regards,
Alexander
Laura Abbott June 13, 2017, 9:51 p.m. UTC | #3
On 06/09/2017 10:28 AM, Kees Cook wrote:
> It seems like it shouldn't be too hard to add on-user-return erasure
> code to other architectures too.

I played around getting this to compile for arm64 with a dummy
stack clearing function. arm64 is doing something special with the
efistub so it fails to link with

drivers/firmware/efi/libstub/arm-stub.c:45:(.init.text+0x54): relocation
	truncated to fit: R_AARCH64_CALL26 against undefined symbol `__efistub_track_stack'

The relocation to the .init.text section and appending __efistub happens after
compilation so the checks in the plugin itself don't work. I haven't come up
with a solution to not have the plugin run on the stub yet.

Thanks,
Laura
Hector Martin June 20, 2017, 9:06 a.m. UTC | #4
On 09/06/17 23:30, Alexander Popov wrote:
> +/*
> + * Copyright 2011-2017 by the PaX Team <pageexec@freemail.hu>
> + * Licensed under the GPL v2
> + *
> + * Note: the choice of the license means that the compilation process is
> + *       NOT 'eligible' as defined by gcc's library exception to the GPL v3,
> + *       but for the kernel it doesn't matter since it doesn't link against
> + *       any of the gcc libraries

I know this isn't the first time I've mentioned this, but I think it 
bears repeating that, as far as I understand the licensing (IANAL), GCC 
plug-ins licensed as GPLv2 are in violation of the GCC license.

See [1], which says that this plug-in and GCC form a "single combined 
program", and [2], which says that in this case the plug-in must be 
GPL-compatible. Though not specified in this informal FAQ entry, for 
GPLv3-licensed GCC, this obviously means GPLv3-compatible, and the GPLv2 
is *not* GPLv3-compatible ([3], especially the second paragraph).

In addition, orthogonally, the motivation behind this licensing 
(preventing userspace code from using these plug-ins through a gross 
abuse of the anti-proprietary-plugin licensing that libgcc uses) also 
prevents them from being used with some kernel architectures that *do* 
link against libgcc.

This can all be fixed by requesting that the plug-ins be relicensed as 
"GPLv2 or later", although evidently this goes against the interests of 
the original author (PaX team) of restricting usage of these plug-ins, 
so I doubt they will do that.

Alternatively, the plug-in can be rewritten to operate on an *extracted* 
intermediate representation of the GCC internal state, with a 
GPLv3-compatible shim used as a GCC plug-in to extract said state, and 
the actual core of the transformation done in a separate binary that 
does not link with GCC and may use any license. This approach is what 
PaX team *should* have done from the beginning in order to use their 
anti-userspace licensing hack without violating any licenses, IMO. This 
would still disallow the plug-in from being used on kernel architectures 
that do link libgcc, and on userspace code that does the same.

Failing those, the only remaining option I see is a clean rewrite of the 
plug-ins without using any of the original code, and licensing that as 
GPLv2+.

[1] https://www.gnu.org/licenses/gpl-faq.en.html#GPLPlugins
[2] https://www.gnu.org/licenses/gpl-faq.en.html#GPLAndPlugins
[3] https://www.gnu.org/licenses/gpl-faq.en.html#v2v3Compatibility
Mark Rutland June 20, 2017, 11:20 a.m. UTC | #5
On Tue, Jun 13, 2017 at 02:51:59PM -0700, Laura Abbott wrote:
> On 06/09/2017 10:28 AM, Kees Cook wrote:
> > It seems like it shouldn't be too hard to add on-user-return erasure
> > code to other architectures too.
> 
> I played around getting this to compile for arm64 with a dummy
> stack clearing function. arm64 is doing something special with the
> efistub so it fails to link with
> 
> drivers/firmware/efi/libstub/arm-stub.c:45:(.init.text+0x54): relocation
> 	truncated to fit: R_AARCH64_CALL26 against undefined symbol `__efistub_track_stack'
> 
> The relocation to the .init.text section and appending __efistub happens after
> compilation so the checks in the plugin itself don't work. I haven't come up
> with a solution to not have the plugin run on the stub yet.

Can we do something like what we do for KCOV, and (only) place the
plugin-invoking flags to into CFLAGS_STACKLEAK, which we can filter out
in scripts/Makefile.lib?

Thanks,
Mark.
Ard Biesheuvel June 20, 2017, 2:13 p.m. UTC | #6
On 20 June 2017 at 13:20, Mark Rutland <mark.rutland@arm.com> wrote:
> On Tue, Jun 13, 2017 at 02:51:59PM -0700, Laura Abbott wrote:
>> On 06/09/2017 10:28 AM, Kees Cook wrote:
>> > It seems like it shouldn't be too hard to add on-user-return erasure
>> > code to other architectures too.
>>
>> I played around getting this to compile for arm64 with a dummy
>> stack clearing function. arm64 is doing something special with the
>> efistub so it fails to link with
>>
>> drivers/firmware/efi/libstub/arm-stub.c:45:(.init.text+0x54): relocation
>>       truncated to fit: R_AARCH64_CALL26 against undefined symbol `__efistub_track_stack'
>>
>> The relocation to the .init.text section and appending __efistub happens after
>> compilation so the checks in the plugin itself don't work. I haven't come up
>> with a solution to not have the plugin run on the stub yet.
>
> Can we do something like what we do for KCOV, and (only) place the
> plugin-invoking flags to into CFLAGS_STACKLEAK, which we can filter out
> in scripts/Makefile.lib?
>

I agree. This instrumentation serves no purpose in the EFI stub, given
that there are no syscalls to return from, and so it would be better
if we could prevent the plugin from seeing these source files to begin
with. In fact, I could imagine that there are certain hot paths in
kthreads that would benefit from having this disabled as well.
Kees Cook June 20, 2017, 7:07 p.m. UTC | #7
On Tue, Jun 20, 2017 at 2:06 AM, Hector Martin "marcan"
<marcan@marcan.st> wrote:
> On 09/06/17 23:30, Alexander Popov wrote:
>>
>> +/*
>> + * Copyright 2011-2017 by the PaX Team <pageexec@freemail.hu>
>> + * Licensed under the GPL v2
>> + *
>> + * Note: the choice of the license means that the compilation process is
>> + *       NOT 'eligible' as defined by gcc's library exception to the GPL
>> v3,
>> + *       but for the kernel it doesn't matter since it doesn't link
>> against
>> + *       any of the gcc libraries
>
>
> I know this isn't the first time I've mentioned this, but I think it bears
> repeating that, as far as I understand the licensing (IANAL), GCC plug-ins
> licensed as GPLv2 are in violation of the GCC license.

I continue to think that the wording is unclear and that rational
people can come to different conclusions. To guide me I try to look at
the intent of the license ("stay Free Software") and the decisions of
the author ("this should be GPLv2"). Since the authors are quite clear
about their rationale, and I don't see anything that appears to be
trying to dodge the plugin being Free Software, I stand by my earlier
conclusion[1].

-Kees

[1] http://www.openwall.com/lists/kernel-hardening/2016/05/31/1
Kees Cook June 20, 2017, 7:11 p.m. UTC | #8
On Tue, Jun 20, 2017 at 4:20 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Tue, Jun 13, 2017 at 02:51:59PM -0700, Laura Abbott wrote:
>> On 06/09/2017 10:28 AM, Kees Cook wrote:
>> > It seems like it shouldn't be too hard to add on-user-return erasure
>> > code to other architectures too.
>>
>> I played around getting this to compile for arm64 with a dummy
>> stack clearing function. arm64 is doing something special with the
>> efistub so it fails to link with
>>
>> drivers/firmware/efi/libstub/arm-stub.c:45:(.init.text+0x54): relocation
>>       truncated to fit: R_AARCH64_CALL26 against undefined symbol `__efistub_track_stack'
>>
>> The relocation to the .init.text section and appending __efistub happens after
>> compilation so the checks in the plugin itself don't work. I haven't come up
>> with a solution to not have the plugin run on the stub yet.
>
> Can we do something like what we do for KCOV, and (only) place the
> plugin-invoking flags to into CFLAGS_STACKLEAK, which we can filter out
> in scripts/Makefile.lib?

That seems fine to me, though plugins have tended to provide a
DISABLE_foo_bar_PLUGIN export (like used in ppc for turning off the
latent entropy plugin):

scripts/Makefile.gcc-plugins:    DISABLE_LATENT_ENTROPY_PLUGIN +=
-fplugin-arg-latent_entropy_plugin-disable

arch/powerpc/kernel/Makefile:CFLAGS_cputable.o +=
$(DISABLE_LATENT_ENTROPY_PLUGIN)
arch/powerpc/kernel/Makefile:CFLAGS_prom_init.o +=
$(DISABLE_LATENT_ENTROPY_PLUGIN)
...


-Kees
Kees Cook June 20, 2017, 7:14 p.m. UTC | #9
On Fri, Jun 9, 2017 at 7:30 AM, Alexander Popov <alex.popov@linux.com> wrote:
> diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c
> new file mode 100644
> index 0000000..2ee49c4
> --- /dev/null
> +++ b/scripts/gcc-plugins/stackleak_plugin.c
> [...]
> + * TODO:
> + * - initialize all local variables
> [...]
> +static bool init_locals;
> +
> +static struct plugin_info stackleak_plugin_info = {
> +       .version        = "201602181345",
> +       .help           = "track-lowest-sp=nn\ttrack sp in functions whose frame size is at least nn bytes\n"
> +//                       "initialize-locals\t\tforcibly initialize all stack frames\n"
> +};
> [...]
> +       for (i = 0; i < argc; ++i) {
> +               if (!strcmp(argv[i].key, "track-lowest-sp")) {
> +                       if (!argv[i].value) {
> +                               error(G_("no value supplied for option '-fplugin-arg-%s-%s'"), plugin_name, argv[i].key);
> +                               continue;
> +                       }
> +                       track_frame_size = atoi(argv[i].value);
> +                       if (argv[i].value[0] < '0' || argv[i].value[0] > '9' || track_frame_size < 0)
> +                               error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"), plugin_name, argv[i].key, argv[i].value);
> +                       continue;
> +               }
> +               if (!strcmp(argv[i].key, "initialize-locals")) {
> +                       if (argv[i].value) {
> +                               error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"), plugin_name, argv[i].key, argv[i].value);
> +                               continue;
> +                       }
> +                       init_locals = true;
> +                       continue;
> +               }
> +               error(G_("unknown option '-fplugin-arg-%s-%s'"), plugin_name, argv[i].key);
> +       }

Just noticed this: there is an undocumented "initialize-locals" plugin
argument that set an unused init_locals variables. (And it's
unused-ness matches the TODO at the top.)

Probably the unused arg/var should be dropped (or better yet:
implemented to do the initialization).

-Kees
Kees Cook June 20, 2017, 7:20 p.m. UTC | #10
On Fri, Jun 9, 2017 at 4:00 PM, Alexander Popov <alex.popov@linux.com> wrote:
> On 09.06.2017 20:28, Kees Cook wrote:
>> Awesome, and thanks for the benchmarks! That should really help people
>> understand the trade-offs for using this feature (and is likely worth
>> mentioning in the Kconfig). Seems like less than 4% overhead, maybe
>> much less? Real time on build times seems like a tiny difference, but
>> hackbench shows 4%.
>
> Yes, the performance penalty of STACKLEAK differs a lot depending on the
> kind of load. Do you have any idea which test can give a bigger slowdown?
> It should be some rapid syscall exhausting the kernel stack hard.

I can't think of anything off the top of my head. You could play with
CONFIG_FRAME_WARN[1] and related tools to find a deep call path and
try that?

[1] http://elinux.org/Kernel_Small_Stacks

>> Maybe specifically mention the -0xBEEF value?
>
> Ok. Should I create some macro for it?

Maybe? It's not really clear how useful that might be. If it's easy,
then yeah, use a common macro for the value, if it creates header
soup, leave it open-coded.

>> I would follow the naming of the others, and call this GCC_PLUGIN_STACKLEAK
>
> It seems to me that GCC_PLUGIN_STACKLEAK is not a right name since the whole
> feature consists of two parts: the arch-specific asm code actually cleaning
> the kernel stack and the gcc plugin which helps to do it faster and more
> reliable. What do you think?

It looks like the feature requires the plugin, so I think the common
naming (GCC_PLUGIN_STACKLEAK) would be preferred. But perhaps I'm
overlooking something where the plugin is not used?

Thanks!

-Kees
Hector Martin June 20, 2017, 8:22 p.m. UTC | #11
On 2017-06-21 04:07, Kees Cook wrote:
> On Tue, Jun 20, 2017 at 2:06 AM, Hector Martin "marcan"
> <marcan@marcan.st> wrote:
>> I know this isn't the first time I've mentioned this, but I think it bears
>> repeating that, as far as I understand the licensing (IANAL), GCC plug-ins
>> licensed as GPLv2 are in violation of the GCC license.
> 
> I continue to think that the wording is unclear and that rational
> people can come to different conclusions. To guide me I try to look at
> the intent of the license ("stay Free Software") and the decisions of
> the author ("this should be GPLv2"). Since the authors are quite clear
> about their rationale, and I don't see anything that appears to be
> trying to dodge the plugin being Free Software, I stand by my earlier
> conclusion[1].

I think this is a somewhat naive view. Even if we ignore the letter of
the license (which I do believe we shouldn't, to keep us out of legal
hot water) and focus on the spirit/intent, what is going on here has a
seriously strange smell.

For one, the GPLv2 license has been chosen *explicitly* to re-target a
licensing scheme intended to prevent usage of GCC with non-free plugins
to, instead, limit usage of the plug-in to only certain kinds of
software (namely the kernel). This was done in order to effectively use
the free plug-in as an advertisement for a (AIUI) GPLv3-licensed "full"
version that is (AIUI) distributed under a wrapper contract that
attempts to discourage redistribution (i.e. exercising the user's rights
under the license) under threat of contract discontinuation. Whether
this is legal or not (and I believe it is, at least this part of the
story), it's certainly not what the GCC authors had in mind when they
developed the plug-in interface and very much violates the intent of the
libgcc exception mechanism (which is just to discourage proprietary
plug-ins).

Additionally, the simple fact that the plug-in is not GPLv3-compatible
means it cannot be upstreamed into GCC itself. This prevents GCC itself
from benefiting from this usage of its plug-in API, and is also clearly
not what the GCC authors intend.

Ultimately, the GCC authors laid out a simple rule with their license
choice: "keep plug-ins GPLv3-compatible", and this plug-in violates that
rule - worse, does so in order to explicitly limit users' freedom. I
think it's easy to argue that this violates the spirit of free software
and the intent of the FSF.

For what it's worth, I did ask a GCC developer and he agrees with my
analysis (though obviously this is a single data point and does not
represent the view of the FSF's lawyers).
Mark Rutland June 21, 2017, 9:24 a.m. UTC | #12
On Tue, Jun 20, 2017 at 12:11:56PM -0700, Kees Cook wrote:
> On Tue, Jun 20, 2017 at 4:20 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Tue, Jun 13, 2017 at 02:51:59PM -0700, Laura Abbott wrote:
> >> On 06/09/2017 10:28 AM, Kees Cook wrote:
> >> > It seems like it shouldn't be too hard to add on-user-return erasure
> >> > code to other architectures too.
> >>
> >> I played around getting this to compile for arm64 with a dummy
> >> stack clearing function. arm64 is doing something special with the
> >> efistub so it fails to link with
> >>
> >> drivers/firmware/efi/libstub/arm-stub.c:45:(.init.text+0x54): relocation
> >>       truncated to fit: R_AARCH64_CALL26 against undefined symbol `__efistub_track_stack'
> >>
> >> The relocation to the .init.text section and appending __efistub happens after
> >> compilation so the checks in the plugin itself don't work. I haven't come up
> >> with a solution to not have the plugin run on the stub yet.
> >
> > Can we do something like what we do for KCOV, and (only) place the
> > plugin-invoking flags to into CFLAGS_STACKLEAK, which we can filter out
> > in scripts/Makefile.lib?
> 
> That seems fine to me, though plugins have tended to provide a
> DISABLE_foo_bar_PLUGIN export (like used in ppc for turning off the
> latent entropy plugin):
> 
> scripts/Makefile.gcc-plugins:    DISABLE_LATENT_ENTROPY_PLUGIN +=
> -fplugin-arg-latent_entropy_plugin-disable
> 
> arch/powerpc/kernel/Makefile:CFLAGS_cputable.o +=
> $(DISABLE_LATENT_ENTROPY_PLUGIN)
> arch/powerpc/kernel/Makefile:CFLAGS_prom_init.o +=
> $(DISABLE_LATENT_ENTROPY_PLUGIN)
> ...

Interesting. I hadn't encountered that style before.

I'd seen the per-makefile config disables:

drivers/firmware/efi/libstub/Makefile:GCOV_PROFILE                      := n
drivers/firmware/efi/libstub/Makefile:KASAN_SANITIZE                    := n
drivers/firmware/efi/libstub/Makefile:UBSAN_SANITIZE                    := n
drivers/firmware/efi/libstub/Makefile:KCOV_INSTRUMENT                   := n

... and per-object config disables:

arch/arm64/mm/Makefile:KASAN_SANITIZE_physaddr.o        += n
arch/arm64/mm/Makefile:KASAN_SANITIZE_kasan_init.o      := n

... which AFAICT are both handled by the standard scripts/Makefile.lib idiom:

ifeq ($(CONFIG_KASAN),y)
_c_flags += $(if $(patsubst n%,, \
                $(KASAN_SANITIZE_$(basetarget).o)$(KASAN_SANITIZE)y), \
                $(CFLAGS_KASAN))
endif

... it seems a shame for plugins to follow a different pattern.

Thanks,
Mark.
Laura Abbott June 21, 2017, 3:54 p.m. UTC | #13
On 06/21/2017 02:24 AM, Mark Rutland wrote:
> On Tue, Jun 20, 2017 at 12:11:56PM -0700, Kees Cook wrote:
>> On Tue, Jun 20, 2017 at 4:20 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>>> On Tue, Jun 13, 2017 at 02:51:59PM -0700, Laura Abbott wrote:
>>>> On 06/09/2017 10:28 AM, Kees Cook wrote:
>>>>> It seems like it shouldn't be too hard to add on-user-return erasure
>>>>> code to other architectures too.
>>>>
>>>> I played around getting this to compile for arm64 with a dummy
>>>> stack clearing function. arm64 is doing something special with the
>>>> efistub so it fails to link with
>>>>
>>>> drivers/firmware/efi/libstub/arm-stub.c:45:(.init.text+0x54): relocation
>>>>       truncated to fit: R_AARCH64_CALL26 against undefined symbol `__efistub_track_stack'
>>>>
>>>> The relocation to the .init.text section and appending __efistub happens after
>>>> compilation so the checks in the plugin itself don't work. I haven't come up
>>>> with a solution to not have the plugin run on the stub yet.
>>>
>>> Can we do something like what we do for KCOV, and (only) place the
>>> plugin-invoking flags to into CFLAGS_STACKLEAK, which we can filter out
>>> in scripts/Makefile.lib?
>>
>> That seems fine to me, though plugins have tended to provide a
>> DISABLE_foo_bar_PLUGIN export (like used in ppc for turning off the
>> latent entropy plugin):
>>
>> scripts/Makefile.gcc-plugins:    DISABLE_LATENT_ENTROPY_PLUGIN +=
>> -fplugin-arg-latent_entropy_plugin-disable
>>
>> arch/powerpc/kernel/Makefile:CFLAGS_cputable.o +=
>> $(DISABLE_LATENT_ENTROPY_PLUGIN)
>> arch/powerpc/kernel/Makefile:CFLAGS_prom_init.o +=
>> $(DISABLE_LATENT_ENTROPY_PLUGIN)
>> ...
> 
> Interesting. I hadn't encountered that style before.
> 
> I'd seen the per-makefile config disables:
> 
> drivers/firmware/efi/libstub/Makefile:GCOV_PROFILE                      := n
> drivers/firmware/efi/libstub/Makefile:KASAN_SANITIZE                    := n
> drivers/firmware/efi/libstub/Makefile:UBSAN_SANITIZE                    := n
> drivers/firmware/efi/libstub/Makefile:KCOV_INSTRUMENT                   := n
> 
> ... and per-object config disables:
> 
> arch/arm64/mm/Makefile:KASAN_SANITIZE_physaddr.o        += n
> arch/arm64/mm/Makefile:KASAN_SANITIZE_kasan_init.o      := n
> 
> ... which AFAICT are both handled by the standard scripts/Makefile.lib idiom:
> 
> ifeq ($(CONFIG_KASAN),y)
> _c_flags += $(if $(patsubst n%,, \
>                 $(KASAN_SANITIZE_$(basetarget).o)$(KASAN_SANITIZE)y), \
>                 $(CFLAGS_KASAN))
> endif
> 
> ... it seems a shame for plugins to follow a different pattern.
> 
> Thanks,
> Mark.
> 

That's what I tried to do as a first attempt but I got bogged down
in Makefiles and figuring out how to not pass the plugin flags to
specific files. Agreed it would be good to have the plugins follow
a similar pattern.

Thanks,
Laura
Laura Abbott July 10, 2017, 10:04 p.m. UTC | #14
I made an attempt at implementing stack clearing for arm64 using roughly
the same algorithm as x86. It passes some level of basic tests but it definitely
needs more careful review and thought ("submit early and often").

As an added follow up, self-protection.rst should also be updated with some
details about how stackleak actually works for people who want to follow on
for other arches.

Laura Abbott (2):
  stackleak: Update for arm64
  arm64: Clear the stack

 arch/arm64/Kconfig                     |  1 +
 arch/arm64/include/asm/processor.h     |  3 ++
 arch/arm64/kernel/asm-offsets.c        |  3 ++
 arch/arm64/kernel/entry.S              | 92 ++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/process.c            | 18 +++++++
 drivers/firmware/efi/libstub/Makefile  |  3 +-
 scripts/Makefile.gcc-plugins           |  5 +-
 scripts/gcc-plugins/stackleak_plugin.c | 25 +++++++--
 8 files changed, 143 insertions(+), 7 deletions(-)
Alexander Popov July 11, 2017, 10:56 p.m. UTC | #15
Hello Laura,

On 11.07.2017 01:04, Laura Abbott wrote:
> I made an attempt at implementing stack clearing for arm64 using roughly
> the same algorithm as x86. It passes some level of basic tests but it definitely
> needs more careful review and thought ("submit early and often").

Thank you for joining that fun!

I will send the third version of the STACKLEAK patch shortly, please see the
changes. I'm going to CC x86@kernel.org. Hope they will have a look at the stack
erasing, and we will gain better understanding.

> As an added follow up, self-protection.rst should also be updated with some
> details about how stackleak actually works for people who want to follow on
> for other arches.

Thanks, I've added this to the TODO.

Best regards,
Alexander
diff mbox

Patch

diff --git a/arch/Kconfig b/arch/Kconfig
index cd211a1..a209bd5 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -436,6 +436,26 @@  config GCC_PLUGIN_STRUCTLEAK_VERBOSE
 	  initialized. Since not all existing initializers are detected
 	  by the plugin, this can produce false positive warnings.
 
+config ARCH_HAS_STACKLEAK
+	def_bool n
+	help
+	  An architecture should select this if it has the code which
+	  fills the used part of the kernel stack with a poison value
+	  before returning from system calls.
+
+config STACKLEAK
+	bool "Erase the kernel stack before returning from syscalls"
+	depends on GCC_PLUGINS
+	depends on ARCH_HAS_STACKLEAK
+	help
+	  This option makes the kernel erase the kernel stack before it
+	  returns from a system call. That reduces the information which
+	  a kernel stack leak bug can reveal.
+
+	  This plugin was ported from grsecurity/PaX. More information at:
+	   * https://grsecurity.net/
+	   * https://pax.grsecurity.net/
+
 config HAVE_CC_STACKPROTECTOR
 	bool
 	help
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cc98d5a..5988a5f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -56,6 +56,7 @@  config X86
 	select ARCH_HAS_PMEM_API		if X86_64
 	select ARCH_HAS_SET_MEMORY
 	select ARCH_HAS_SG_CHAIN
+	select ARCH_HAS_STACKLEAK
 	select ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_HAS_STRICT_MODULE_RWX
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 370c42c..8ff0a84 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -43,6 +43,12 @@  __visible inline void enter_from_user_mode(void)
 static inline void enter_from_user_mode(void) {}
 #endif
 
+#ifdef CONFIG_STACKLEAK
+asmlinkage void erase_kstack(void);
+#else
+static void erase_kstack(void) {}
+#endif
+
 static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
 {
 #ifdef CONFIG_X86_64
@@ -79,8 +85,10 @@  static long syscall_trace_enter(struct pt_regs *regs)
 		emulated = true;
 
 	if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
-	    tracehook_report_syscall_entry(regs))
+	    tracehook_report_syscall_entry(regs)) {
+		erase_kstack();
 		return -1L;
+	}
 
 	if (emulated)
 		return -1L;
@@ -114,9 +122,11 @@  static long syscall_trace_enter(struct pt_regs *regs)
 			sd.args[5] = regs->bp;
 		}
 
-		ret = __secure_computing(&sd);
-		if (ret == -1)
+		ret = secure_computing(&sd);
+		if (ret == -1) {
+			erase_kstack();
 			return ret;
+		}
 	}
 #endif
 
@@ -125,6 +135,7 @@  static long syscall_trace_enter(struct pt_regs *regs)
 
 	do_audit_syscall_entry(regs, arch);
 
+	erase_kstack();
 	return ret ?: regs->orig_ax;
 }
 
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 57f7ec3..d8610e9 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -78,6 +78,73 @@ 
 #endif
 .endm
 
+.macro erase_kstack
+#ifdef CONFIG_STACKLEAK
+	call erase_kstack
+#endif
+.endm
+
+#ifdef CONFIG_STACKLEAK
+/*
+ * ebp: thread_info
+ */
+ENTRY(erase_kstack)
+	pushl	%edi
+	pushl	%ecx
+	pushl	%eax
+	pushl	%ebp
+
+	movl	PER_CPU_VAR(current_task), %ebp
+	mov	TASK_lowest_stack(%ebp), %edi
+	mov	$-0xBEEF, %eax
+	std
+
+1:
+	mov	%edi, %ecx
+	and	$THREAD_SIZE_asm - 1, %ecx
+	shr	$2, %ecx
+	repne	scasl
+	jecxz	2f
+
+	cmp	$2*16, %ecx
+	jc	2f
+
+	mov	$2*16, %ecx
+	repe	scasl
+	jecxz	2f
+	jne	1b
+
+2:
+	cld
+	or	$2*4, %edi
+	mov	%esp, %ecx
+	sub	%edi, %ecx
+
+	cmp	$THREAD_SIZE_asm, %ecx
+	jb	3f
+	ud2
+
+3:
+	shr	$2, %ecx
+	rep	stosl
+
+	/*
+	 * TODO: sp0 on x86_32 is not reliable, right?
+	 * Doubt because of the definition of cpu_current_top_of_stack
+	 * in arch/x86/kernel/cpu/common.c.
+	 */
+	mov	TASK_thread_sp0(%ebp), %edi
+	sub	$128, %edi
+	mov	%edi, TASK_lowest_stack(%ebp)
+
+	popl	%ebp
+	popl	%eax
+	popl	%ecx
+	popl	%edi
+	ret
+ENDPROC(erase_kstack)
+#endif
+
 /*
  * User gs save/restore
  *
@@ -440,6 +507,8 @@  ENTRY(entry_SYSENTER_32)
 	ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
 		    "jmp .Lsyscall_32_done", X86_FEATURE_XENPV
 
+	erase_kstack
+
 /* Opportunistic SYSEXIT */
 	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
 	movl	PT_EIP(%esp), %edx	/* pt_regs->ip */
@@ -526,6 +595,8 @@  ENTRY(entry_INT80_32)
 	call	do_int80_syscall_32
 .Lsyscall_32_done:
 
+	erase_kstack
+
 restore_all:
 	TRACE_IRQS_IRET
 .Lrestore_all_notrace:
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 044d18e..84829d4 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -58,6 +58,94 @@  ENDPROC(native_usergs_sysret64)
 #endif
 .endm
 
+.macro erase_kstack
+#ifdef CONFIG_STACKLEAK
+	call erase_kstack
+#endif
+.endm
+
+#ifdef CONFIG_STACKLEAK
+ENTRY(erase_kstack)
+	pushq	%rdi
+	pushq	%rcx
+	pushq	%rax
+	pushq	%r11
+
+	movq	PER_CPU_VAR(current_task), %r11
+	mov	TASK_lowest_stack(%r11), %rdi
+	mov	$-0xBEEF, %rax		/* -0xBEEF is a stack poison. */
+	std
+
+1:
+	/*
+	 * Let's search for the poison value in the stack.
+	 * Start from the lowest_stack and go to the bottom (see std above).
+	 */
+	mov	%edi, %ecx
+	and	$THREAD_SIZE_asm - 1, %ecx
+	shr	$3, %ecx
+	repne	scasq
+	jecxz	2f	/* Didn't find it. Go to poisoning. */
+
+	/*
+	 * Found the poison value in the stack. Go to poisoning if there are
+	 * less than 16 qwords left.
+	 */
+	cmp	$2*8, %ecx
+	jc	2f
+
+	/*
+	 * Check that 16 further qwords contain poison (avoid false positives).
+	 * If so, the part of the stack below the address in %rdi is likely
+	 * to be poisoned. Otherwise we need to search deeper.
+	 */
+	mov	$2*8, %ecx
+	repe	scasq
+	jecxz	2f	/* Poison the upper part of the stack. */
+	jne	1b	/* Search deeper. */
+
+2:
+	/*
+	 * Prepare the counter for poisoning the kernel stack between
+	 * %rdi and %rsp.
+	 *
+	 * TODO: Sorry, don't understand why the following OR instruction is
+	 * needed. That may be connected to the thread.lowest_stack
+	 * initialization in arch/x86/kernel/process_64.c, where it is set
+	 * to the task_stack_page address + 2 * sizeof(unsigned long).
+	 */
+	cld
+	or	$2*8, %rdi
+	mov	%esp, %ecx
+	sub	%edi, %ecx
+
+	/* Check that the counter value is sane. */
+	cmp	$THREAD_SIZE_asm, %rcx
+	jb	3f
+	ud2
+
+3:
+	/*
+	 * So let's write the poison value to the kernel stack. Start from the
+	 * address in %rdi and move up (see cld above) to the address in %rsp
+	 * (not included, used memory).
+	 */
+	shr	$3, %ecx
+	rep	stosq
+
+	/* Set the lowest_stack value to the top_of_stack - 256. */
+	mov	TASK_thread_sp0(%r11), %rdi
+	sub	$256, %rdi
+	mov	%rdi, TASK_lowest_stack(%r11)
+
+	popq	%r11
+	popq	%rax
+	popq	%rcx
+	popq	%rdi
+	ret
+ENDPROC(erase_kstack)
+#endif
+
 /*
  * When dynamic function tracer is enabled it will add a breakpoint
  * to all locations that it is about to modify, sync CPUs, update
@@ -218,6 +306,8 @@  entry_SYSCALL_64_fastpath:
 	testl	$_TIF_ALLWORK_MASK, TASK_TI_flags(%r11)
 	jnz	1f
 
+	erase_kstack
+
 	LOCKDEP_SYS_EXIT
 	TRACE_IRQS_ON		/* user mode is traced as IRQs on */
 	movq	RIP(%rsp), %rcx
@@ -246,6 +336,8 @@  entry_SYSCALL64_slow_path:
 	call	do_syscall_64		/* returns with IRQs disabled */
 
 return_from_SYSCALL_64:
+	erase_kstack
+
 	RESTORE_EXTRA_REGS
 	TRACE_IRQS_IRETQ		/* we're about to change IF */
 
@@ -419,6 +511,7 @@  ENTRY(ret_from_fork)
 2:
 	leaq	FRAME_OFFSET(%rsp),%rdi	/* pt_regs pointer */
 	call	syscall_return_slowpath	/* returns with IRQs disabled */
+	erase_kstack
 	TRACE_IRQS_ON			/* user mode is traced as IRQS on */
 	SWAPGS
 	FRAME_END
@@ -532,6 +625,12 @@  ret_from_intr:
 GLOBAL(retint_user)
 	mov	%rsp,%rdi
 	call	prepare_exit_to_usermode
+
+	/*
+	 * TODO: Do we need to call erase_kstack here?
+	 * The PaX patch has it here commented out.
+	 */
+
 	TRACE_IRQS_IRETQ
 	SWAPGS
 	jmp	restore_regs_and_iret
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index e1721da..330516a 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -18,6 +18,12 @@ 
 
 	.section .entry.text, "ax"
 
+	.macro erase_kstack
+#ifdef CONFIG_STACKLEAK
+	call erase_kstack
+#endif
+	.endm
+
 /*
  * 32-bit SYSENTER entry.
  *
@@ -229,6 +235,7 @@  ENTRY(entry_SYSCALL_compat)
 
 	/* Opportunistic SYSRET */
 sysret32_from_system_call:
+	erase_kstack
 	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
 	movq	RBX(%rsp), %rbx		/* pt_regs->rbx */
 	movq	RBP(%rsp), %rbp		/* pt_regs->rbp */
@@ -337,6 +344,7 @@  ENTRY(entry_INT80_compat)
 .Lsyscall_32_done:
 
 	/* Go back to user mode. */
+	erase_kstack
 	TRACE_IRQS_ON
 	SWAPGS
 	jmp	restore_regs_and_iret
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f385eca..049abff 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -469,6 +469,10 @@  struct thread_struct {
 
 	mm_segment_t		addr_limit;
 
+#ifdef CONFIG_STACKLEAK
+	unsigned long		lowest_stack;
+#endif
+
 	unsigned int		sig_on_uaccess_err:1;
 	unsigned int		uaccess_err:1;	/* uaccess failed */
 
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index de827d6..31ba8e9 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -37,6 +37,10 @@  void common(void) {
 	BLANK();
 	OFFSET(TASK_TI_flags, task_struct, thread_info.flags);
 	OFFSET(TASK_addr_limit, task_struct, thread.addr_limit);
+#ifdef CONFIG_STACKLEAK
+	OFFSET(TASK_lowest_stack, task_struct, thread.lowest_stack);
+	OFFSET(TASK_thread_sp0, task_struct, thread.sp0);
+#endif
 
 	BLANK();
 	OFFSET(crypto_tfm_ctx_offset, crypto_tfm, __crt_ctx);
@@ -73,6 +77,11 @@  void common(void) {
 	OFFSET(PV_MMU_read_cr2, pv_mmu_ops, read_cr2);
 #endif
 
+#ifdef CONFIG_STACKLEAK
+	BLANK();
+	DEFINE(THREAD_SIZE_asm, THREAD_SIZE);
+#endif
+
 #ifdef CONFIG_XEN
 	BLANK();
 	OFFSET(XEN_vcpu_info_mask, vcpu_info, evtchn_upcall_mask);
diff --git a/arch/x86/kernel/dumpstack_32.c b/arch/x86/kernel/dumpstack_32.c
index b0b3a3d..a223b4e 100644
--- a/arch/x86/kernel/dumpstack_32.c
+++ b/arch/x86/kernel/dumpstack_32.c
@@ -174,3 +174,15 @@  int is_valid_bugaddr(unsigned long ip)
 
 	return ud2 == 0x0b0f;
 }
+
+#ifdef CONFIG_STACKLEAK
+void __used check_alloca(unsigned long size)
+{
+	unsigned long sp = (unsigned long)&sp, stack_left;
+
+	/* all kernel stacks are of the same size */
+	stack_left = sp & (THREAD_SIZE - 1);
+	BUG_ON(stack_left < 256 || size >= stack_left - 256);
+}
+EXPORT_SYMBOL(check_alloca);
+#endif
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index a8b117e..c2ffb9f 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -188,3 +188,36 @@  int is_valid_bugaddr(unsigned long ip)
 
 	return ud2 == 0x0b0f;
 }
+
+#ifdef CONFIG_STACKLEAK
+void __used check_alloca(unsigned long size)
+{
+	struct stack_info stack_info = {0};
+	unsigned long visit_mask = 0;
+	unsigned long sp = (unsigned long)&sp;
+	unsigned long stack_left;
+
+	BUG_ON(get_stack_info(&sp, current, &stack_info, &visit_mask));
+
+	switch (stack_info.type) {
+	case STACK_TYPE_TASK:
+		stack_left = sp & (THREAD_SIZE - 1);
+		break;
+
+	case STACK_TYPE_IRQ:
+		stack_left = sp & (IRQ_STACK_SIZE - 1);
+		break;
+
+	case STACK_TYPE_EXCEPTION ... STACK_TYPE_EXCEPTION_LAST:
+		stack_left = sp & (EXCEPTION_STKSZ - 1);
+		break;
+
+	case STACK_TYPE_SOFTIRQ:
+	default:
+		BUG();
+	}
+
+	BUG_ON(stack_left < 256 || size >= stack_left - 256);
+}
+EXPORT_SYMBOL(check_alloca);
+#endif
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 4c818f8..5355b12 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -134,6 +134,11 @@  int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 	p->thread.sp0 = (unsigned long) (childregs+1);
 	memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
 
+#ifdef CONFIG_STACKLEAK
+	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
+						2 * sizeof(unsigned long);
+#endif
+
 	if (unlikely(p->flags & PF_KTHREAD)) {
 		/* kernel thread */
 		memset(childregs, 0, sizeof(struct pt_regs));
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index d6b784a..f943558 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -161,6 +161,11 @@  int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 	p->thread.sp = (unsigned long) fork_frame;
 	p->thread.io_bitmap_ptr = NULL;
 
+#ifdef CONFIG_STACKLEAK
+	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
+						2 * sizeof(unsigned long);
+#endif
+
 	savesegment(gs, p->thread.gsindex);
 	p->thread.gsbase = p->thread.gsindex ? 0 : me->thread.gsbase;
 	savesegment(fs, p->thread.fsindex);
diff --git a/fs/exec.c b/fs/exec.c
index 65145a3..c041611 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1927,3 +1927,20 @@  COMPAT_SYSCALL_DEFINE5(execveat, int, fd,
 				  argv, envp, flags);
 }
 #endif
+
+#ifdef CONFIG_STACKLEAK
+void __used track_stack(void)
+{
+	unsigned long sp = (unsigned long)&sp;
+
+	if (sp < current->thread.lowest_stack &&
+	    sp >= (unsigned long)task_stack_page(current) +
+					2 * sizeof(unsigned long)) {
+		current->thread.lowest_stack = sp;
+	}
+
+	if (unlikely((sp & ~(THREAD_SIZE - 1)) < (THREAD_SIZE / 16)))
+		BUG();
+}
+EXPORT_SYMBOL(track_stack);
+#endif
diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
index 8233553..6cf9487 100644
--- a/scripts/Makefile.gcc-plugins
+++ b/scripts/Makefile.gcc-plugins
@@ -29,6 +29,9 @@  ifdef CONFIG_GCC_PLUGINS
   gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE)	+= -fplugin-arg-structleak_plugin-verbose
   gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STRUCTLEAK)	+= -DSTRUCTLEAK_PLUGIN
 
+  gcc-plugin-$(CONFIG_STACKLEAK)	+= stackleak_plugin.so
+  gcc-plugin-cflags-$(CONFIG_STACKLEAK)	+= -DSTACKLEAK_PLUGIN -fplugin-arg-stackleak_plugin-track-lowest-sp=100
+
   GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y))
 
   export PLUGINCC GCC_PLUGINS_CFLAGS GCC_PLUGIN GCC_PLUGIN_SUBDIR
diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c
new file mode 100644
index 0000000..2ee49c4
--- /dev/null
+++ b/scripts/gcc-plugins/stackleak_plugin.c
@@ -0,0 +1,342 @@ 
+/*
+ * Copyright 2011-2017 by the PaX Team <pageexec@freemail.hu>
+ * Licensed under the GPL v2
+ *
+ * Note: the choice of the license means that the compilation process is
+ *       NOT 'eligible' as defined by gcc's library exception to the GPL v3,
+ *       but for the kernel it doesn't matter since it doesn't link against
+ *       any of the gcc libraries
+ *
+ * gcc plugin to help implement various PaX features
+ *
+ * - track lowest stack pointer
+ *
+ * TODO:
+ * - initialize all local variables
+ *
+ * BUGS:
+ * - none known
+ */
+
+#include "gcc-common.h"
+
+__visible int plugin_is_GPL_compatible;
+
+static int track_frame_size = -1;
+static const char track_function[] = "track_stack";
+static const char check_function[] = "check_alloca";
+static GTY(()) tree track_function_decl;
+static GTY(()) tree check_function_decl;
+static bool init_locals;
+
+static struct plugin_info stackleak_plugin_info = {
+	.version	= "201602181345",
+	.help		= "track-lowest-sp=nn\ttrack sp in functions whose frame size is at least nn bytes\n"
+//			  "initialize-locals\t\tforcibly initialize all stack frames\n"
+};
+
+static void stackleak_check_alloca(gimple_stmt_iterator *gsi)
+{
+	gimple stmt;
+	gcall *check_alloca;
+	tree alloca_size;
+	cgraph_node_ptr node;
+	int frequency;
+	basic_block bb;
+
+	// insert call to void check_alloca(unsigned long size)
+	alloca_size = gimple_call_arg(gsi_stmt(*gsi), 0);
+	stmt = gimple_build_call(check_function_decl, 1, alloca_size);
+	check_alloca = as_a_gcall(stmt);
+	gsi_insert_before(gsi, check_alloca, GSI_SAME_STMT);
+
+	// update the cgraph
+	bb = gimple_bb(check_alloca);
+	node = cgraph_get_create_node(check_function_decl);
+	gcc_assert(node);
+	frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+	cgraph_create_edge(cgraph_get_node(current_function_decl), node, check_alloca, bb->count, frequency, bb->loop_depth);
+}
+
+static void stackleak_add_instrumentation(gimple_stmt_iterator *gsi, bool after)
+{
+	gimple stmt;
+	gcall *track_stack;
+	cgraph_node_ptr node;
+	int frequency;
+	basic_block bb;
+
+	// insert call to void track_stack(void)
+	stmt = gimple_build_call(track_function_decl, 0);
+	track_stack = as_a_gcall(stmt);
+	if (after)
+		gsi_insert_after(gsi, track_stack, GSI_CONTINUE_LINKING);
+	else
+		gsi_insert_before(gsi, track_stack, GSI_SAME_STMT);
+
+	// update the cgraph
+	bb = gimple_bb(track_stack);
+	node = cgraph_get_create_node(track_function_decl);
+	gcc_assert(node);
+	frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+	cgraph_create_edge(cgraph_get_node(current_function_decl), node, track_stack, bb->count, frequency, bb->loop_depth);
+}
+
+static bool is_alloca(gimple stmt)
+{
+	if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA))
+		return true;
+
+#if BUILDING_GCC_VERSION >= 4007
+	if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA_WITH_ALIGN))
+		return true;
+#endif
+
+	return false;
+}
+
+static unsigned int stackleak_tree_instrument_execute(void)
+{
+	basic_block bb, entry_bb;
+	bool prologue_instrumented = false, is_leaf = true;
+
+	entry_bb = ENTRY_BLOCK_PTR_FOR_FN(cfun)->next_bb;
+
+	// 1. loop through BBs and GIMPLE statements
+	FOR_EACH_BB_FN(bb, cfun) {
+		gimple_stmt_iterator gsi;
+
+		for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
+			gimple stmt;
+
+			stmt = gsi_stmt(gsi);
+
+			if (is_gimple_call(stmt))
+				is_leaf = false;
+
+			// gimple match: align 8 built-in BUILT_IN_NORMAL:BUILT_IN_ALLOCA attributes <tree_list 0xb7576450>
+			if (!is_alloca(stmt))
+				continue;
+
+			// 2. insert stack overflow check before each __builtin_alloca call
+			stackleak_check_alloca(&gsi);
+
+			// 3. insert track call after each __builtin_alloca call
+			stackleak_add_instrumentation(&gsi, true);
+			if (bb == entry_bb)
+				prologue_instrumented = true;
+		}
+	}
+
+	// special cases for some bad linux code: taking the address of static inline functions will materialize them
+	// but we mustn't instrument some of them as the resulting stack alignment required by the function call ABI
+	// will break other assumptions regarding the expected (but not otherwise enforced) register clobbering  ABI.
+	// case in point: native_save_fl on amd64 when optimized for size clobbers rdx if it were instrumented here.
+	if (is_leaf && !TREE_PUBLIC(current_function_decl) && DECL_DECLARED_INLINE_P(current_function_decl))
+		return 0;
+	if (is_leaf && !strncmp(IDENTIFIER_POINTER(DECL_NAME(current_function_decl)), "_paravirt_", 10))
+		return 0;
+
+	// 4. insert track call at the beginning
+	if (!prologue_instrumented) {
+		gimple_stmt_iterator gsi;
+
+		gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+		bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+		if (!single_pred_p(bb)) {
+//			gcc_assert(bb_loop_depth(bb) || (bb->flags & BB_IRREDUCIBLE_LOOP));
+			split_edge(single_succ_edge(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+			gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+			bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+		}
+		gsi = gsi_after_labels(bb);
+		stackleak_add_instrumentation(&gsi, false);
+	}
+
+	return 0;
+}
+
+static unsigned int stackleak_final_execute(void)
+{
+	rtx_insn *insn, *next;
+
+	if (cfun->calls_alloca)
+		return 0;
+
+	// keep calls only if function frame is big enough
+	if (get_frame_size() >= track_frame_size)
+		return 0;
+
+	// 1. find track_stack calls
+	for (insn = get_insns(); insn; insn = next) {
+		// rtl match: (call_insn 8 7 9 3 (call (mem (symbol_ref ("track_stack") [flags 0x41] <function_decl 0xb7470e80 track_stack>) [0 S1 A8]) (4)) -1 (nil) (nil))
+		rtx body;
+
+		next = NEXT_INSN(insn);
+		if (!CALL_P(insn))
+			continue;
+		body = PATTERN(insn);
+		if (GET_CODE(body) != CALL)
+			continue;
+		body = XEXP(body, 0);
+		if (GET_CODE(body) != MEM)
+			continue;
+		body = XEXP(body, 0);
+		if (GET_CODE(body) != SYMBOL_REF)
+			continue;
+//		if (strcmp(XSTR(body, 0), track_function))
+		if (SYMBOL_REF_DECL(body) != track_function_decl)
+			continue;
+//		warning(0, "track_frame_size: %d %ld %d", cfun->calls_alloca, get_frame_size(), track_frame_size);
+		// 2. delete call
+		delete_insn_and_edges(insn);
+#if BUILDING_GCC_VERSION >= 4007
+		if (GET_CODE(next) == NOTE && NOTE_KIND(next) == NOTE_INSN_CALL_ARG_LOCATION) {
+			insn = next;
+			next = NEXT_INSN(insn);
+			delete_insn_and_edges(insn);
+		}
+#endif
+	}
+
+//	print_simple_rtl(stderr, get_insns());
+//	print_rtl(stderr, get_insns());
+//	warning(0, "track_frame_size: %d %ld %d", cfun->calls_alloca, get_frame_size(), track_frame_size);
+
+	return 0;
+}
+
+static bool stackleak_track_stack_gate(void)
+{
+	tree section;
+
+	if (ix86_cmodel != CM_KERNEL)
+		return false;
+
+	section = lookup_attribute("section", DECL_ATTRIBUTES(current_function_decl));
+	if (section && TREE_VALUE(section)) {
+		section = TREE_VALUE(TREE_VALUE(section));
+
+		if (!strncmp(TREE_STRING_POINTER(section), ".init.text", 10))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".devinit.text", 13))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".cpuinit.text", 13))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".meminit.text", 13))
+			return false;
+	}
+
+	return track_frame_size >= 0;
+}
+
+static void stackleak_start_unit(void *gcc_data __unused, void *user_data __unused)
+{
+	tree fntype;
+
+	// void track_stack(void)
+	fntype = build_function_type_list(void_type_node, NULL_TREE);
+	track_function_decl = build_fn_decl(track_function, fntype);
+	DECL_ASSEMBLER_NAME(track_function_decl); // for LTO
+	TREE_PUBLIC(track_function_decl) = 1;
+	TREE_USED(track_function_decl) = 1;
+	DECL_EXTERNAL(track_function_decl) = 1;
+	DECL_ARTIFICIAL(track_function_decl) = 1;
+	DECL_PRESERVE_P(track_function_decl) = 1;
+
+	// void check_alloca(unsigned long)
+	fntype = build_function_type_list(void_type_node, long_unsigned_type_node, NULL_TREE);
+	check_function_decl = build_fn_decl(check_function, fntype);
+	DECL_ASSEMBLER_NAME(check_function_decl); // for LTO
+	TREE_PUBLIC(check_function_decl) = 1;
+	TREE_USED(check_function_decl) = 1;
+	DECL_EXTERNAL(check_function_decl) = 1;
+	DECL_ARTIFICIAL(check_function_decl) = 1;
+	DECL_PRESERVE_P(check_function_decl) = 1;
+}
+
+static bool stackleak_tree_instrument_gate(void)
+{
+	return stackleak_track_stack_gate();
+}
+
+#define PASS_NAME stackleak_tree_instrument
+#define PROPERTIES_REQUIRED PROP_gimple_leh | PROP_cfg
+#define TODO_FLAGS_START TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts
+#define TODO_FLAGS_FINISH TODO_verify_ssa | TODO_verify_stmts | TODO_dump_func | TODO_update_ssa | TODO_rebuild_cgraph_edges
+#include "gcc-generate-gimple-pass.h"
+
+static bool stackleak_final_gate(void)
+{
+	return stackleak_track_stack_gate();
+}
+
+#define PASS_NAME stackleak_final
+#define TODO_FLAGS_FINISH TODO_dump_func
+#include "gcc-generate-rtl-pass.h"
+
+__visible int plugin_init(struct plugin_name_args *plugin_info, struct plugin_gcc_version *version)
+{
+	const char * const plugin_name = plugin_info->base_name;
+	const int argc = plugin_info->argc;
+	const struct plugin_argument * const argv = plugin_info->argv;
+	int i;
+
+	static const struct ggc_root_tab gt_ggc_r_gt_stackleak[] = {
+		{
+			.base = &track_function_decl,
+			.nelt = 1,
+			.stride = sizeof(track_function_decl),
+			.cb = &gt_ggc_mx_tree_node,
+			.pchw = &gt_pch_nx_tree_node
+		},
+		{
+			.base = &check_function_decl,
+			.nelt = 1,
+			.stride = sizeof(check_function_decl),
+			.cb = &gt_ggc_mx_tree_node,
+			.pchw = &gt_pch_nx_tree_node
+		},
+		LAST_GGC_ROOT_TAB
+	};
+
+//	PASS_INFO(stackleak_tree_instrument, "tree_profile", 1, PASS_POS_INSERT_BEFORE);
+	PASS_INFO(stackleak_tree_instrument, "optimized", 1, PASS_POS_INSERT_BEFORE);
+	PASS_INFO(stackleak_final, "final", 1, PASS_POS_INSERT_BEFORE);
+
+	if (!plugin_default_version_check(version, &gcc_version)) {
+		error(G_("incompatible gcc/plugin versions"));
+		return 1;
+	}
+
+	register_callback(plugin_name, PLUGIN_INFO, NULL, &stackleak_plugin_info);
+
+	for (i = 0; i < argc; ++i) {
+		if (!strcmp(argv[i].key, "track-lowest-sp")) {
+			if (!argv[i].value) {
+				error(G_("no value supplied for option '-fplugin-arg-%s-%s'"), plugin_name, argv[i].key);
+				continue;
+			}
+			track_frame_size = atoi(argv[i].value);
+			if (argv[i].value[0] < '0' || argv[i].value[0] > '9' || track_frame_size < 0)
+				error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"), plugin_name, argv[i].key, argv[i].value);
+			continue;
+		}
+		if (!strcmp(argv[i].key, "initialize-locals")) {
+			if (argv[i].value) {
+				error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"), plugin_name, argv[i].key, argv[i].value);
+				continue;
+			}
+			init_locals = true;
+			continue;
+		}
+		error(G_("unknown option '-fplugin-arg-%s-%s'"), plugin_name, argv[i].key);
+	}
+
+	register_callback(plugin_name, PLUGIN_START_UNIT, &stackleak_start_unit, NULL);
+	register_callback(plugin_name, PLUGIN_REGISTER_GGC_ROOTS, NULL, (void *)&gt_ggc_r_gt_stackleak);
+	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &stackleak_tree_instrument_pass_info);
+	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &stackleak_final_pass_info);
+
+	return 0;
+}