[v2] use -fstack-protector-strong

Message ID 20131126203727.GA352@www.outflux.net (mailing list archive)
State New, archived

Commit Message

Kees Cook Nov. 26, 2013, 8:37 p.m. UTC
Build the kernel with -fstack-protector-strong when it is available
(gcc 4.9 and later). This increases the coverage of the stack protector
without the heavy performance hit of -fstack-protector-all.

The stack protector options available in gcc are:

-fstack-protector-all:
Adds the stack-canary saving prefix and stack-canary checking suffix to
_all_ function entry and exit. Results in substantial use of stack space
for saving the canary for deep stack users (e.g. historically xfs), and
measurable (though shockingly still low) performance hit due to all the
saving/checking. Really not suitable for sane systems, and was entirely
removed as an option from the kernel many years ago.

-fstack-protector:
Adds the canary save/check to functions that define a local char array
of 8 or more bytes (--param=ssp-buffer-size=N, N=8 by default).
Traditionally, stack overflows happened with string-based
manipulations, so this was a way to find those functions. Very few
total functions actually get the canary; no measurable performance or
size overhead.

-fstack-protector-strong:
Adds the canary for a wider set of functions, since history has shown that
it's not just those with strings that have ultimately been vulnerable to
stack-busting attacks. With this superset, more functions end up with
a canary, but the set still remains small compared to all functions, with
no measurable change in performance. Based on the original design document,
a function gets the canary when it contains any of:
- local variable's address used as part of the RHS of an assignment or
  function argument
- local variable is an array (or union containing an array), regardless
  of array type or length
- uses register local variables
https://docs.google.com/a/google.com/document/d/1xXBH6rRZue4f296vGt9YQcuLVQHeE516stHwt8M9xyU
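
As a rough illustration of the criteria above, the hypothetical functions
below (none of them are kernel code) are annotated with the option that
would be expected to instrument them. This is only a sketch based on the
rules as listed here; actual results depend on the compiler version and
optimization level (inlining can remove a local entirely), so check the
generated assembly for calls to __stack_chk_fail before relying on it.

```c
#include <stddef.h>
#include <string.h>

/* No arrays and no local address taken: not expected to get a canary
 * under -fstack-protector or -fstack-protector-strong. */
int add(int a, int b)
{
	return a + b;
}

/* Local char array of at least 8 bytes: expected to get a canary under
 * both -fstack-protector and -fstack-protector-strong. */
size_t copy_name(const char *src)
{
	char buf[16];

	strncpy(buf, src, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';
	return strlen(buf);
}

/* Small non-char array: skipped by -fstack-protector, but covered by
 * -fstack-protector-strong ("regardless of array type or length"). */
int sum_pair(int x, int y)
{
	int pair[2] = { x, y };

	return pair[0] + pair[1];
}

/* Hypothetical helper that writes through a pointer. */
static void store(int *out, int v)
{
	*out = v;
}

/* Address of a local passed as a function argument: expected to be
 * covered only by -fstack-protector-strong. */
int via_pointer(int v)
{
	int local;

	store(&local, v);
	return local;
}
```

Compiling this file to assembly twice, once with each option, and comparing
which functions reference __stack_chk_fail is one way to see the difference
in coverage.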

Chrome OS x86_64 build is less than 0.16% larger:

-rwxr-xr-x 1 kees kees 118219343 Apr 17 12:26 vmlinux.orig
-rwxr-xr-x 1 kees kees 118407919 Apr 19 15:00 vmlinux

Ubuntu x86_64 build, using 14.04's config is less than 0.14% larger:

-rwxrwxr-x 1 kees kees 174384144 Nov 26 11:00 vmlinux.ubuntu-gcc-4.9
-rwxrwxr-x 1 kees kees 174627120 Nov 26 11:09 vmlinux.ubuntu-gcc-4.9+strong

On a defconfig x86_64 build (with CONFIG_CC_STACKPROTECTOR enabled), the
vmlinux file is about 9.5% larger:

-rwxrwxr-x 1 kees kees  22134340 Nov 26 10:28 vmlinux.gcc-4.8
-rwxrwxr-x 1 kees kees  22123870 Nov 26 10:40 vmlinux.gcc-4.9
-rwxrwxr-x 1 kees kees  24225118 Nov 26 10:42 vmlinux.gcc-4.9+strong

ARM's compressed boot code now triggers stack protection, so a static
guard was added. Since this is only used during decompression and was
never protected before, the exposure here is very small. Once it switches
to the full kernel, the stack guard is back to normal.
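
For readers unfamiliar with what the guard does, the following userspace
sketch writes out by hand the save/check sequence the compiler would
normally emit around a protected function. All demo_* names are
hypothetical and exist only for this illustration; the real symbols are
__stack_chk_guard and __stack_chk_fail, and the clobber argument merely
simulates an overflow. The constant is the same static value used in this
patch.

```c
/* Static guard value, set once before any protected function runs. */
static unsigned long demo_guard;
static int demo_failures;

void demo_guard_setup(void)
{
	demo_guard = 0x000a0dffUL;	/* same constant as in the patch */
}

/* Stand-in for __stack_chk_fail(): just count failures. */
static void demo_chk_fail(void)
{
	demo_failures++;
}

int demo_failure_count(void)
{
	return demo_failures;
}

/* Hand-written equivalent of an instrumented function. */
int demo_protected(int clobber)
{
	/* "Prefix": copy the guard into the function's own frame. */
	volatile unsigned long canary = demo_guard;

	/* Function body; a non-zero clobber simulates an overflow
	 * that overwrites the saved canary. */
	if (clobber)
		canary ^= 0xffUL;

	/* "Suffix": compare the saved copy with the guard on exit. */
	if (canary != demo_guard) {
		demo_chk_fail();
		return -1;
	}
	return 0;
}
```

After demo_guard_setup(), demo_protected(0) returns 0 and demo_protected(1)
returns -1 and bumps the failure count. A fixed guard like this is
predictable, which is why the message above treats it as acceptable only
for the short decompression phase.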

Chrome OS has been using -fstack-protector-strong for its kernel builds
for the last 8 months with no problems.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
v2:
 - added description of all stack protector options
 - added size comparisons for Ubuntu and defconfig
---
 arch/arm/Makefile               |    3 ++-
 arch/arm/boot/compressed/misc.c |   14 ++++++++++++++
 arch/x86/Makefile               |    2 +-
 3 files changed, 17 insertions(+), 2 deletions(-)

Comments

Ingo Molnar Nov. 27, 2013, 11:27 a.m. UTC | #1
* Kees Cook <keescook@chromium.org> wrote:

> On a defconfig x86_64 build (with CONFIG_CC_STACKPROTECTOR enabled), the
> delta in size is just under 9% larger:
> 
>  -rwxrwxr-x 1 kees kees  22134340 Nov 26 10:28 vmlinux.gcc-4.8
>  -rwxrwxr-x 1 kees kees  22123870 Nov 26 10:40 vmlinux.gcc-4.9
>  -rwxrwxr-x 1 kees kees  24225118 Nov 26 10:42 vmlinux.gcc-4.9+strong

Please run it through 'size' so that we know the real text size 
increases.

If the cost of -fstack-protector-strong is really +9% in kernel text 
size then that's rather significant!

If this option blows up our performance critical codepaths as well 
then this will likely cause a runtime slowdown as well, in addition to 
the increase in I$ footprint. That needs to be measured.

CONFIG_CC_STACKPROTECTOR=y is relatively cheap today. For example on 
x86-64 defconfig:

      text     data      bss       dec  filename
  11378972  1455056  1191936  14025964  vmlinux  # CONFIG_CC_STACKPROTECTOR is not set
  11420243  1455056  1191936  14067235  vmlinux  # CONFIG_CC_STACKPROTECTOR=y

that's a +0.3% cost currently.

Thanks,

	Ingo
Kees Cook Nov. 27, 2013, 5:21 p.m. UTC | #2
On Wed, Nov 27, 2013 at 3:27 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Kees Cook <keescook@chromium.org> wrote:
>
>> On a defconfig x86_64 build (with CONFIG_CC_STACKPROTECTOR enabled), the
>> delta in size is just under 9% larger:
>>
>>  -rwxrwxr-x 1 kees kees  22134340 Nov 26 10:28 vmlinux.gcc-4.8
>>  -rwxrwxr-x 1 kees kees  22123870 Nov 26 10:40 vmlinux.gcc-4.9
>>  -rwxrwxr-x 1 kees kees  24225118 Nov 26 10:42 vmlinux.gcc-4.9+strong
>
> Please run it through 'size' so that we know the real text size
> increases.

    text     data      bss       dec     hex  filename
11407474  1453792  1191936  14053202  d66f52  vmlinux.gcc-4.8
11458837  1457504  1191936  14108277  d74675  vmlinux.gcc-4.9
11682929  1457504  1191936  14332369  dab1d1  vmlinux.gcc-4.9+strong

Looks to be 2% for defconfig. That's way better. Shall I send a v3?
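
For anyone wanting to reproduce the percentage from the text column of the
'size' output above, a small helper along these lines does the arithmetic.
The function name is made up for this example, and it reports growth in
hundredths of a percent using truncating integer division.

```c
/* Growth from old_size to new_size in hundredths of a percent
 * (basis points), truncated toward zero. Assumes old_size > 0. */
long long growth_basis_points(long long old_size, long long new_size)
{
	return (new_size - old_size) * 10000 / old_size;
}
```

With the gcc-4.9 text sizes, growth_basis_points(11458837, 11682929) gives
195, i.e. roughly 1.95%. The earlier baseline comparison without and with
CONFIG_CC_STACKPROTECTOR (11378972 vs 11420243) gives 36, i.e. roughly
0.36%.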

> If the cost of -fstack-protector-strong is really +9% in kernel text
> size then that's rather significant!
>
> If this option blows up our performance critical codepaths as well
> then this will likely cause a runtime slowdown as well, in addition to
> the increase in I$ footprint. That needs to be measured.
>
> CONFIG_CC_STACKPROTECTOR=y is relatively cheap today. For example on
> x86-64 defconfig:
>
>       text    data    bss     dec       filename
>   11378972    1455056 1191936 14025964  vmlinux  # CONFIG_CC_STACKPROTECTOR is not set
>   11420243    1455056 1191936 14067235  vmlinux  CONFIG_CC_STACKPROTECTOR=y
>
> that's a +0.3% cost currently.

Yeah -- not a lot of functions have char arrays. :)

>
> Thanks,
>
>         Ingo

-Kees
Ingo Molnar Nov. 27, 2013, 5:54 p.m. UTC | #3
* Kees Cook <keescook@chromium.org> wrote:

> On Wed, Nov 27, 2013 at 3:27 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > * Kees Cook <keescook@chromium.org> wrote:
> >
> >> On a defconfig x86_64 build (with CONFIG_CC_STACKPROTECTOR enabled), the
> >> delta in size is just under 9% larger:
> >>
> >>  -rwxrwxr-x 1 kees kees  22134340 Nov 26 10:28 vmlinux.gcc-4.8
> >>  -rwxrwxr-x 1 kees kees  22123870 Nov 26 10:40 vmlinux.gcc-4.9
> >>  -rwxrwxr-x 1 kees kees  24225118 Nov 26 10:42 vmlinux.gcc-4.9+strong
> >
> > Please run it through 'size' so that we know the real text size
> > increases.
> 
>     text     data      bss       dec     hex  filename
> 11407474  1453792  1191936  14053202  d66f52  vmlinux.gcc-4.8
> 11458837  1457504  1191936  14108277  d74675  vmlinux.gcc-4.9
> 11682929  1457504  1191936  14332369  dab1d1  vmlinux.gcc-4.9+strong
> 
> Looks to be 2% for defconfig. That's way better. Shall I send a v3?

Well, it's better than 9%, but still almost an order of magnitude 
higher than the cost is today, and a lot of distros have 
CONFIG_CC_STACKPROTECTOR=y.

So it would be nice to measure how much the instruction count goes up 
in some realistic system-bound test. How much does something like 
kernel/built-in.o increase, as per 'size' output?

Thanks,

	Ingo
H. Peter Anvin Nov. 27, 2013, 5:55 p.m. UTC | #4
On 11/27/2013 09:54 AM, Ingo Molnar wrote:
>>
>> Looks to be 2% for defconfig. That's way better. Shall I send a v3?
> 
> Well, it's better than 9%, but still almost an order of magnitude 
> higher than the cost is today, and a lot of distros have 
> CONFIG_CC_STACKPROTECTOR=y.
> 
> So it would be nice to measure how much the instruction count goes up 
> in some realistic system-bound test. How much does something like 
> kernel/built-in.o increase, as per 'size' output?
> 

Do we need CONFIG_CC_STACKPROTECTOR_STRONG?

	-hpa
Kees Cook Nov. 27, 2013, 6:11 p.m. UTC | #5
On Wed, Nov 27, 2013 at 9:55 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/27/2013 09:54 AM, Ingo Molnar wrote:
>>>
>>> Looks to be 2% for defconfig. That's way better. Shall I send a v3?
>>
>> Well, it's better than 9%, but still almost an order of magnitude
>> higher than the cost is today, and a lot of distros have
>> CONFIG_CC_STACKPROTECTOR=y.
>>
>> So it would be nice to measure how much the instruction count goes up
>> in some realistic system-bound test. How much does something like
>> kernel/built-in.o increase, as per 'size' output?

   text    data     bss     dec     hex filename
 929611   90851  594496 1614958  18a46e built-in.o-gcc-4.9
 954648   90851  594496 1639995  19063b built-in.o-gcc-4.9+strong

Looks like 3% for defconfig + CONFIG_CC_STACKPROTECTOR

>
> Do we need CONFIG_CC_STACKPROTECTOR_STRONG?

I'm hoping to avoid this since nearly anyone using CC_STACKPROTECTOR
would want strong added, but as a fallback, I'm happy to implement it
as a separate config item.

-Kees
Kees Cook Dec. 17, 2013, 12:57 a.m. UTC | #6
On Wed, Nov 27, 2013 at 10:11 AM, Kees Cook <keescook@chromium.org> wrote:
> On Wed, Nov 27, 2013 at 9:55 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 11/27/2013 09:54 AM, Ingo Molnar wrote:
>>>>
>>>> Looks to be 2% for defconfig. That's way better. Shall I send a v3?
>>>
>>> Well, it's better than 9%, but still almost an order of magnitude
>>> higher than the cost is today, and a lot of distros have
>>> CONFIG_CC_STACKPROTECTOR=y.
>>>
>>> So it would be nice to measure how much the instruction count goes up
>>> in some realistic system-bound test. How much does something like
>>> kernel/built-in.o increase, as per 'size' output?
>
>    text    data     bss     dec     hex filename
>  929611   90851  594496 1614958  18a46e built-in.o-gcc-4.9
>  954648   90851  594496 1639995  19063b built-in.o-gcc-4.9+strong
>
> Looks like 3% for defconfig + CONFIG_CC_STACKPROTECTOR
>
>>
>> Do we need CONFIG_CC_STACKPROTECTOR_STRONG?
>
> I'm hoping to avoid this since nearly anyone using CC_STACKPROTECTOR
> would want strong added, but as a fallback, I'm happy to implement it
> as a separate config item.

Any verdict on this? Should I go with adding ..._STRONG like we used
to have for ..._ALL, or is defaulting to -strong best?

-Kees
Ingo Molnar Dec. 17, 2013, 11:29 a.m. UTC | #7
* Kees Cook <keescook@chromium.org> wrote:

> On Wed, Nov 27, 2013 at 10:11 AM, Kees Cook <keescook@chromium.org> wrote:
> > On Wed, Nov 27, 2013 at 9:55 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> >> On 11/27/2013 09:54 AM, Ingo Molnar wrote:
> >>>>
> >>>> Looks to be 2% for defconfig. That's way better. Shall I send a v3?
> >>>
> >>> Well, it's better than 9%, but still almost an order of magnitude
> >>> higher than the cost is today, and a lot of distros have
> >>> CONFIG_CC_STACKPROTECTOR=y.
> >>>
> >>> So it would be nice to measure how much the instruction count goes up
> >>> in some realistic system-bound test. How much does something like
> >>> kernel/built-in.o increase, as per 'size' output?
> >
> >    text    data     bss     dec     hex filename
> >  929611   90851  594496 1614958  18a46e built-in.o-gcc-4.9
> >  954648   90851  594496 1639995  19063b built-in.o-gcc-4.9+strong
> >
> > Looks like 3% for defconfig + CONFIG_CC_STACKPROTECTOR
> >
> >>
> >> Do we need CONFIG_CC_STACKPROTECTOR_STRONG?
> >
> > I'm hoping to avoid this since nearly anyone using 
> > CC_STACKPROTECTOR would want strong added, but as a fallback, I'm 
> > happy to implement it as a separate config item.
> 
> Any verdict on this? Should I go with adding ..._STRONG like we used 
> to have for ..._ALL, or is defaulting to -strong best?

I'm not opposed to the feature itself, just to the specific structure 
you presented - as outlined in my review feedback.

The cost of the feature itself appears to be significant (this cost 
should be outlined in the help text btw), while I think the cost of 
adding this as a new _STRONG option is minimal.

So I'd go forward with addressing two issues:

1)

I'd add the new STACKPROTECTOR_STRONG option and maybe rename the old 
one to STACKPROTECTOR_WEAK.

If in a year or two most distros have switched over to the _STRONG 
variant, despite its costs, then we can drop the weak variant.

2)

It would also be nice to see a head to head comparison of the 3 
variants:

	!STACKPROTECTOR
	STACKPROTECTOR_LIGHT
	STACKPROTECTOR_STRONG

of defconfig vmlinux size and estimated number of checks inserted in 
each case - so people/distros can make an informed decision about the 
relative quality differences between these variants and whether they 
want to carry the costs of that.

Thanks,

	Ingo

Patch

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index c99b1086d83d..c6d3ea1c063e 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -41,7 +41,8 @@  KBUILD_CFLAGS	+=-fno-omit-frame-pointer -mapcs -mno-sched-prolog
 endif
 
 ifeq ($(CONFIG_CC_STACKPROTECTOR),y)
-KBUILD_CFLAGS	+=-fstack-protector
+KBUILD_CFLAGS	+= $(call cc-option,-fstack-protector-strong,-fstack-protector)
+
 endif
 
 ifeq ($(CONFIG_CPU_BIG_ENDIAN),y)
diff --git a/arch/arm/boot/compressed/misc.c b/arch/arm/boot/compressed/misc.c
index 31bd43b82095..d4f891f56996 100644
--- a/arch/arm/boot/compressed/misc.c
+++ b/arch/arm/boot/compressed/misc.c
@@ -127,6 +127,18 @@  asmlinkage void __div0(void)
 	error("Attempting division by 0!");
 }
 
+unsigned long __stack_chk_guard;
+
+void __stack_chk_guard_setup(void)
+{
+	__stack_chk_guard = 0x000a0dff;
+}
+
+void __stack_chk_fail(void)
+{
+	error("stack-protector: Kernel stack is corrupted\n");
+}
+
 extern int do_decompress(u8 *input, int len, u8 *output, void (*error)(char *x));
 
 
@@ -137,6 +149,8 @@  decompress_kernel(unsigned long output_start, unsigned long free_mem_ptr_p,
 {
 	int ret;
 
+	__stack_chk_guard_setup();
+
 	output_data		= (unsigned char *)output_start;
 	free_mem_ptr		= free_mem_ptr_p;
 	free_mem_end_ptr	= free_mem_ptr_end_p;
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 41250fb33985..4ebb054cc323 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -86,7 +86,7 @@  endif
 ifdef CONFIG_CC_STACKPROTECTOR
 	cc_has_sp := $(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh
         ifeq ($(shell $(CONFIG_SHELL) $(cc_has_sp) $(CC) $(KBUILD_CPPFLAGS) $(biarch)),y)
-                stackp-y := -fstack-protector
+                stackp-y := $(call cc-option,-fstack-protector-strong,-fstack-protector)
                 KBUILD_CFLAGS += $(stackp-y)
         else
                 $(warning stack protector enabled but no compiler support)