[v2] mm: Add SLUB free list pointer obfuscation

Message ID	20170623015010.GA137429@beast (mailing list archive)
State	New, archived
Date	Thu, 22 Jun 2017 18:50:10 -0700
From	Kees Cook <keescook@chromium.org>
To	Christoph Lameter <cl@linux.com>, Andrew Morton <akpm@linux-foundation.org>
Cc	Laura Abbott <labbott@redhat.com>, Daniel Micay <danielmicay@gmail.com>, Pekka Enberg <penberg@kernel.org>, David Rientjes <rientjes@google.com>, Joonsoo Kim <iamjoonsoo.kim@lge.com>, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>, Ingo Molnar <mingo@kernel.org>, Josh Triplett <josh@joshtriplett.org>, Andy Lutomirski <luto@kernel.org>, Nicolas Pitre <nicolas.pitre@linaro.org>, Tejun Heo <tj@kernel.org>, Daniel Mack <daniel@zonque.org>, Sebastian Andrzej Siewior <bigeasy@linutronix.de>, Sergey Senozhatsky <sergey.senozhatsky@gmail.com>, Helge Deller <deller@gmx.de>, Rik van Riel <riel@redhat.com>, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kernel-hardening@lists.openwall.com
Subject	[kernel-hardening] [PATCH v2] mm: Add SLUB free list pointer obfuscation

Kees Cook June 23, 2017, 1:50 a.m. UTC

This SLUB free list pointer obfuscation code is modified from Brad
Spengler/PaX Team's code in the last public patch of grsecurity/PaX based
on my understanding of the code. Changes or omissions from the original
code are mine and don't reflect the original grsecurity/PaX code.

This adds a per-cache random value to SLUB caches that is XORed with
their freelist pointers. This adds nearly zero overhead and frustrates the
very common heap overflow exploitation method of overwriting freelist
pointers. A recent example of the attack is written up here:
http://cyseclabs.com/blog/cve-2016-6187-heap-off-by-one-exploit

This is based on patches by Daniel Micay, and refactored to avoid lots
of #ifdef code.

Suggested-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
v2:
- renamed Kconfig to SLAB_FREELIST_HARDENED; labbott.
---
 include/linux/slub_def.h |  4 ++++
 init/Kconfig             |  9 +++++++++
 mm/slub.c                | 32 +++++++++++++++++++++++++++-----
 3 files changed, 40 insertions(+), 5 deletions(-)
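
As a rough user-space illustration of the scheme the commit message describes (not the patch itself), the sketch below XORs a stored "next free" pointer with a per-cache secret and with the address of the slot that holds it. The names struct toy_cache, toy_freeptr(), toy_set_next(), toy_get_next() and toy_demo() are hypothetical, and the secret is a fixed constant only so the demo is repeatable; the patch draws it from get_random_long().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-cache state; only the secret matters here. */
struct toy_cache {
	uintptr_t random;	/* per-cache secret, chosen once at cache creation */
};

/*
 * Encode or decode a freelist pointer. XOR is its own inverse, so the
 * same function serves both directions. ptr_addr is the address of the
 * slot in which the pointer is stored.
 */
static inline void *toy_freeptr(const struct toy_cache *s, void *ptr,
				uintptr_t ptr_addr)
{
	return (void *)((uintptr_t)ptr ^ s->random ^ ptr_addr);
}

/* Store the obfuscated next-free pointer at the start of a free object. */
static inline void toy_set_next(const struct toy_cache *s, void *object,
				void *next)
{
	*(void **)object = toy_freeptr(s, next, (uintptr_t)object);
}

/* Load and de-obfuscate the next-free pointer from a free object. */
static inline void *toy_get_next(const struct toy_cache *s, void *object)
{
	return toy_freeptr(s, *(void **)object, (uintptr_t)object);
}

/* Round trip, then simulate an overflow that writes a raw pointer. */
static void toy_demo(void)
{
	/* Odd constant: it can never equal the (aligned) address of obj_a. */
	struct toy_cache s = { .random = (uintptr_t)0x5a5a1235u };
	void *obj_a[2], *obj_b[2], *target[2];

	toy_set_next(&s, obj_a, obj_b);
	assert(toy_get_next(&s, obj_a) == (void *)obj_b);

	/* Attacker overwrites the slot with an unobfuscated address. */
	*(void **)obj_a = (void *)target;
	assert(toy_get_next(&s, obj_a) != (void *)target);
}
```

Calling toy_demo() from a main() exercises both cases: a legitimate store decodes back to the original pointer, while a raw address written over the slot decodes to something other than the address the attacker chose.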

Kees Cook June 25, 2017, 7:56 p.m. UTC | #1

On Thu, Jun 22, 2017 at 6:50 PM, Kees Cook <keescook@chromium.org> wrote:
> This SLUB free list pointer obfuscation code is modified from Brad
> Spengler/PaX Team's code in the last public patch of grsecurity/PaX based
> on my understanding of the code. Changes or omissions from the original
> code are mine and don't reflect the original grsecurity/PaX code.
>
> This adds a per-cache random value to SLUB caches that is XORed with
> their freelist pointers. This adds nearly zero overhead and frustrates the
> very common heap overflow exploitation method of overwriting freelist
> pointers. A recent example of the attack is written up here:
> http://cyseclabs.com/blog/cve-2016-6187-heap-off-by-one-exploit

BTW, to quantify "nearly zero overhead", I ran multiple 200-run cycles
of "hackbench -g 20 -l 1000", and saw:

before:
mean 10.11882499999999999995
variance .03320378329145728642
stdev .18221905304181911048

after:
mean 10.12654000000000000014
variance .04700556623115577889
stdev .21680767106160192064

The difference gets lost in the noise, but if the above is sensible,
it's 0.07% slower. ;)

-Kees
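
The arithmetic behind the quoted figure can be checked with a few lines of C. This is only a sketch: the helper names slowdown_percent(), shift_in_stdevs() and report_hackbench() are made up for illustration, and the inputs are the means and the "before" standard deviation from the message above, rounded to six decimals.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change of the mean, in percent (positive means slower). */
static double slowdown_percent(double mean_before, double mean_after)
{
	assert(mean_before != 0.0);
	return (mean_after - mean_before) / mean_before * 100.0;
}

/* Size of the shift in the mean relative to the run-to-run spread. */
static double shift_in_stdevs(double mean_before, double mean_after,
			      double stdev_before)
{
	assert(stdev_before > 0.0);
	return (mean_after - mean_before) / stdev_before;
}

/* Print both figures for the hackbench numbers quoted in the message. */
static void report_hackbench(void)
{
	const double before = 10.118825, after = 10.126540;
	const double stdev_before = 0.182219;

	printf("slowdown: %.4f%%\n", slowdown_percent(before, after));
	printf("shift: %.3f stdevs\n",
	       shift_in_stdevs(before, after, stdev_before));
}
```

With these inputs the slowdown comes out at roughly 0.076%, and the shift in the mean is only about 0.04 of one standard deviation, which is consistent with the remark that the difference gets lost in the noise.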

Christoph Lameter (Ampere) June 29, 2017, 5:05 p.m. UTC | #2

On Sun, 25 Jun 2017, Kees Cook wrote:

> The difference gets lost in the noise, but if the above is sensible,
> it's 0.07% slower. ;)

Hmmm... These differences add up. Also in a repetitive benchmark like that
you do not see the impact that the additional cacheline use in the cpu
cache has on larger workloads. Those may be pushed over the edge of l1 or
l2 capacity at some point which then causes drastic regressions.

Kees Cook June 29, 2017, 5:47 p.m. UTC | #3

On Thu, Jun 29, 2017 at 10:05 AM, Christoph Lameter <cl@linux.com> wrote:
> On Sun, 25 Jun 2017, Kees Cook wrote:
>
>> The difference gets lost in the noise, but if the above is sensible,
>> it's 0.07% slower. ;)
>
> Hmmm... These differences add up. Also in a repetitive benchmark like that
> you do not see the impact that the additional cacheline use in the cpu
> cache has on larger workloads. Those may be pushed over the edge of l1 or
> l2 capacity at some point which then causes drastic regressions.

Even if that is true, it may be worth it to some people to have the
protection. Given that it significantly hampers a large class of heap
overflow attacks[1], I think it's an important change to have. I'm not
suggesting this be on by default, it's cleanly behind
CONFIG-controlled macros, and is very limited in scope. If you can Ack
it we can let system builders decide if they want to risk a possible
performance hit. I'm pretty sure most distros would like to have this
protection.

Thanks for looking it over!

-Kees

[1] http://resources.infosecinstitute.com/exploiting-linux-kernel-heap-corruptions-slub-allocator/

Rik van Riel June 29, 2017, 5:54 p.m. UTC | #4

On Thu, 2017-06-29 at 10:47 -0700, Kees Cook wrote:
> On Thu, Jun 29, 2017 at 10:05 AM, Christoph Lameter <cl@linux.com>
> wrote:
> > On Sun, 25 Jun 2017, Kees Cook wrote:
> > 
> > > The difference gets lost in the noise, but if the above is
> > > sensible,
> > > it's 0.07% slower. ;)
> > 
> > Hmmm... These differences add up. Also in a repetitive benchmark
> > like that
> > you do not see the impact that the additional cacheline use in the
> > cpu
> > cache has on larger workloads. Those may be pushed over the edge of
> > l1 or
> > l2 capacity at some point which then causes drastic regressions.
> 
> Even if that is true, it may be worth it to some people to have the
> protection. Given that it significantly hampers a large class of heap
> overflow attacks[1], I think it's an important change to have. I'm
> not
> suggesting this be on by default, it's cleanly behind
> CONFIG-controlled macros, and is very limited in scope. If you can
> Ack
> it we can let system builders decide if they want to risk a possible
> performance hit. I'm pretty sure most distros would like to have this
> protection.

I could certainly see it being useful for all kinds of portable
and network-connected systems where security is simply much
more important than performance.

Tycho Andersen June 29, 2017, 5:56 p.m. UTC | #5

On Thu, Jun 29, 2017 at 01:54:13PM -0400, Rik van Riel wrote:
> On Thu, 2017-06-29 at 10:47 -0700, Kees Cook wrote:
> > On Thu, Jun 29, 2017 at 10:05 AM, Christoph Lameter <cl@linux.com>
> > wrote:
> > > On Sun, 25 Jun 2017, Kees Cook wrote:
> > > 
> > > > The difference gets lost in the noise, but if the above is
> > > > sensible,
> > > > it's 0.07% slower. ;)
> > > 
> > > Hmmm... These differences add up. Also in a repetitive benchmark
> > > like that
> > > you do not see the impact that the additional cacheline use in the
> > > cpu
> > > cache has on larger workloads. Those may be pushed over the edge of
> > > l1 or
> > > l2 capacity at some point which then causes drastic regressions.
> > 
> > Even if that is true, it may be worth it to some people to have the
> > protection. Given that it significantly hampers a large class of heap
> > overflow attacks[1], I think it's an important change to have. I'm
> > not
> > suggesting this be on by default, it's cleanly behind
> > CONFIG-controlled macros, and is very limited in scope. If you can
> > Ack
> > it we can let system builders decide if they want to risk a possible
> > performance hit. I'm pretty sure most distros would like to have this
> > protection.
> 
> I could certainly see it being useful for all kinds of portable
> and network-connected systems where security is simply much
> more important than performance.

Indeed, I believe we would enable this in our kernels.

Cheers,

Tycho

Kees Cook July 5, 2017, 11:30 p.m. UTC | #6

On Thu, Jun 29, 2017 at 10:56 AM, Tycho Andersen <tycho@docker.com> wrote:
> On Thu, Jun 29, 2017 at 01:54:13PM -0400, Rik van Riel wrote:
>> On Thu, 2017-06-29 at 10:47 -0700, Kees Cook wrote:
>> > On Thu, Jun 29, 2017 at 10:05 AM, Christoph Lameter <cl@linux.com>
>> > wrote:
>> > > On Sun, 25 Jun 2017, Kees Cook wrote:
>> > >
>> > > > The difference gets lost in the noise, but if the above is
>> > > > sensible,
>> > > > it's 0.07% slower. ;)
>> > >
>> > > Hmmm... These differences add up. Also in a repetitive benchmark
>> > > like that
>> > > you do not see the impact that the additional cacheline use in the
>> > > cpu
>> > > cache has on larger workloads. Those may be pushed over the edge of
>> > > l1 or
>> > > l2 capacity at some point which then causes drastic regressions.
>> >
>> > Even if that is true, it may be worth it to some people to have the
>> > protection. Given that it significantly hampers a large class of heap
>> > overflow attacks[1], I think it's an important change to have. I'm
>> > not
>> > suggesting this be on by default, it's cleanly behind
>> > CONFIG-controlled macros, and is very limited in scope. If you can
>> > Ack
>> > it we can let system builders decide if they want to risk a possible
>> > performance hit. I'm pretty sure most distros would like to have this
>> > protection.
>>
>> I could certainly see it being useful for all kinds of portable
>> and network-connected systems where security is simply much
>> more important than performance.
>
> Indeed, I believe we would enable this in our kernels.

Andrew and Christoph,

What do you think about carrying this for -mm, since people are
interested in it and it's a very narrow change behind a config (with a
large impact on reducing the exploitability of freelist pointer
overwrites)?

-Kees

Andrew Morton July 5, 2017, 11:39 p.m. UTC | #7

On Thu, 22 Jun 2017 18:50:10 -0700 Kees Cook <keescook@chromium.org> wrote:

> This SLUB free list pointer obfuscation code is modified from Brad
> Spengler/PaX Team's code in the last public patch of grsecurity/PaX based
> on my understanding of the code. Changes or omissions from the original
> code are mine and don't reflect the original grsecurity/PaX code.
> 
> This adds a per-cache random value to SLUB caches that is XORed with
> their freelist pointers. This adds nearly zero overhead and frustrates the
> very common heap overflow exploitation method of overwriting freelist
> pointers. A recent example of the attack is written up here:
> http://cyseclabs.com/blog/cve-2016-6187-heap-off-by-one-exploit
> 
> This is based on patches by Daniel Micay, and refactored to avoid lots
> of #ifdef code.
> 
> ...
>
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1900,6 +1900,15 @@ config SLAB_FREELIST_RANDOM
>  	  security feature reduces the predictability of the kernel slab
>  	  allocator against heap overflows.
>  
> +config SLAB_FREELIST_HARDENED
> +	bool "Harden slab freelist metadata"
> +	depends on SLUB
> +	help
> +	  Many kernel heap attacks try to target slab cache metadata and
> +	  other infrastructure. This option makes minor performance
> +	  sacrifices to harden the kernel slab allocator against common
> +	  freelist exploit methods.
> +

Well, it is optable-outable.

>  config SLUB_CPU_PARTIAL
>  	default y
>  	depends on SLUB && SMP
> diff --git a/mm/slub.c b/mm/slub.c
> index 57e5156f02be..590e7830aaed 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -34,6 +34,7 @@
>  #include <linux/stacktrace.h>
>  #include <linux/prefetch.h>
>  #include <linux/memcontrol.h>
> +#include <linux/random.h>
>  
>  #include <trace/events/kmem.h>
>  
> @@ -238,30 +239,50 @@ static inline void stat(const struct kmem_cache *s, enum stat_item si)
>   * 			Core slab cache functions
>   *******************************************************************/
>  
> +#ifdef CONFIG_SLAB_FREELIST_HARDENED
> +# define initialize_random(s)					\
> +		do {						\
> +			s->random = get_random_long();		\
> +		} while (0)
> +# define FREEPTR_VAL(ptr, ptr_addr, s)	\
> +		(void *)((unsigned long)(ptr) ^ s->random ^ (ptr_addr))
> +#else
> +# define initialize_random(s)		do { } while (0)
> +# define FREEPTR_VAL(ptr, addr, s)	((void *)(ptr))
> +#endif
> +#define FREELIST_ENTRY(ptr_addr, s)				\
> +		FREEPTR_VAL(*(unsigned long *)(ptr_addr),	\
> +			    (unsigned long)ptr_addr, s)
> +

That's a bit of an eyesore.  Is there any reason why we cannot
implement all of the above in nice, conventional C functions?

>
> ...
>
> @@ -3536,6 +3557,7 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags)
>  {
>  	s->flags = kmem_cache_flags(s->size, flags, s->name, s->ctor);
>  	s->reserved = 0;
> +	initialize_random(s);
>  
>  	if (need_reserve_slab_rcu && (s->flags & SLAB_TYPESAFE_BY_RCU))
>  		s->reserved = sizeof(struct rcu_head);

We regularly have issues where the random system just isn't ready
(enough) for clients to use it.  Are you sure the above is actually
useful for the boot-time caches?

Kees Cook July 5, 2017, 11:56 p.m. UTC | #8

On Wed, Jul 5, 2017 at 4:39 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu, 22 Jun 2017 18:50:10 -0700 Kees Cook <keescook@chromium.org> wrote:
>
>> This SLUB free list pointer obfuscation code is modified from Brad
>> Spengler/PaX Team's code in the last public patch of grsecurity/PaX based
>> on my understanding of the code. Changes or omissions from the original
>> code are mine and don't reflect the original grsecurity/PaX code.
>>
>> This adds a per-cache random value to SLUB caches that is XORed with
>> their freelist pointers. This adds nearly zero overhead and frustrates the
>> very common heap overflow exploitation method of overwriting freelist
>> pointers. A recent example of the attack is written up here:
>> http://cyseclabs.com/blog/cve-2016-6187-heap-off-by-one-exploit
>>
>> This is based on patches by Daniel Micay, and refactored to avoid lots
>> of #ifdef code.
>>
>> ...
>>
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -1900,6 +1900,15 @@ config SLAB_FREELIST_RANDOM
>>         security feature reduces the predictability of the kernel slab
>>         allocator against heap overflows.
>>
>> +config SLAB_FREELIST_HARDENED
>> +     bool "Harden slab freelist metadata"
>> +     depends on SLUB
>> +     help
>> +       Many kernel heap attacks try to target slab cache metadata and
>> +       other infrastructure. This option makes minor performance
>> +       sacrifices to harden the kernel slab allocator against common
>> +       freelist exploit methods.
>> +
>
> Well, it is optable-outable.
>
>>  config SLUB_CPU_PARTIAL
>>       default y
>>       depends on SLUB && SMP
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 57e5156f02be..590e7830aaed 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -34,6 +34,7 @@
>>  #include <linux/stacktrace.h>
>>  #include <linux/prefetch.h>
>>  #include <linux/memcontrol.h>
>> +#include <linux/random.h>
>>
>>  #include <trace/events/kmem.h>
>>
>> @@ -238,30 +239,50 @@ static inline void stat(const struct kmem_cache *s, enum stat_item si)
>>   *                   Core slab cache functions
>>   *******************************************************************/
>>
>> +#ifdef CONFIG_SLAB_FREELIST_HARDENED
>> +# define initialize_random(s)                                        \
>> +             do {                                            \
>> +                     s->random = get_random_long();          \
>> +             } while (0)
>> +# define FREEPTR_VAL(ptr, ptr_addr, s)       \
>> +             (void *)((unsigned long)(ptr) ^ s->random ^ (ptr_addr))
>> +#else
>> +# define initialize_random(s)                do { } while (0)
>> +# define FREEPTR_VAL(ptr, addr, s)   ((void *)(ptr))
>> +#endif
>> +#define FREELIST_ENTRY(ptr_addr, s)                          \
>> +             FREEPTR_VAL(*(unsigned long *)(ptr_addr),       \
>> +                         (unsigned long)ptr_addr, s)
>> +
>
> That's a bit of an eyesore.  Is there any reason why we cannot
> implement all of the above in nice, conventional C functions?

I could rework it using static inlines. I was mainly avoiding #ifdef
blocks in the freelist manipulation functions, but I could push them
up to these functions instead. I'll send a v3.
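
A minimal user-space sketch of what such a static-inline rework might look like, with the #ifdef pushed up into the helpers so that callers stay free of conditionals. Everything here is an assumption made for illustration: TOY_FREELIST_HARDENED stands in for CONFIG_SLAB_FREELIST_HARDENED, struct toy_cache for struct kmem_cache, the toy_* function names are invented, and a caller-supplied seed replaces get_random_long().

```c
#include <assert.h>
#include <stdint.h>

#define TOY_FREELIST_HARDENED 1	/* stand-in for the Kconfig option */

struct toy_cache {
#ifdef TOY_FREELIST_HARDENED
	uintptr_t random;	/* per-cache secret */
#endif
	int unused;		/* keeps the struct non-empty when hardening is off */
};

/* Replaces the initialize_random() macro; a no-op when hardening is off. */
static inline void toy_initialize_random(struct toy_cache *s, uintptr_t seed)
{
#ifdef TOY_FREELIST_HARDENED
	s->random = seed;	/* the patch calls get_random_long() here */
#else
	(void)s;
	(void)seed;
#endif
}

/* Replaces FREEPTR_VAL(): encode or decode a pointer for a given slot. */
static inline void *toy_freelist_ptr(const struct toy_cache *s, void *ptr,
				     uintptr_t ptr_addr)
{
#ifdef TOY_FREELIST_HARDENED
	return (void *)((uintptr_t)ptr ^ s->random ^ ptr_addr);
#else
	(void)s;
	(void)ptr_addr;
	return ptr;
#endif
}

/* Replaces FREELIST_ENTRY(): read and decode the pointer stored at ptr_addr. */
static inline void *toy_freelist_entry(const struct toy_cache *s,
				       void *ptr_addr)
{
	return toy_freelist_ptr(s, *(void **)ptr_addr, (uintptr_t)ptr_addr);
}

/* Encode into a slot, then read it back through the accessor. */
static void toy_selfcheck(void)
{
	struct toy_cache s;
	void *slot[1], *next[1];

	toy_initialize_random(&s, (uintptr_t)0x1235u);
	slot[0] = toy_freelist_ptr(&s, next, (uintptr_t)slot);
	assert(toy_freelist_entry(&s, slot) == (void *)next);
}
```

Compared with the macros, the inline functions give the compiler typed arguments to check and evaluate each argument exactly once, while the disabled variant still compiles down to a plain pointer load.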

>> ...
>>
>> @@ -3536,6 +3557,7 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags)
>>  {
>>       s->flags = kmem_cache_flags(s->size, flags, s->name, s->ctor);
>>       s->reserved = 0;
>> +     initialize_random(s);
>>
>>       if (need_reserve_slab_rcu && (s->flags & SLAB_TYPESAFE_BY_RCU))
>>               s->reserved = sizeof(struct rcu_head);
>
> We regularly have issues where the random system just isn't ready
> (enough) for clients to use it.  Are you sure the above is actually
> useful for the boot-time caches?

IMO, this isn't reason enough to block this since we have similar
problems in other places (e.g. the stack canary itself). The random
infrastructure is aware of these problems and is continually improving
(e.g. on x86 without enough pool entropy, this will fall back to
RDRAND, IIUC). Additionally for systems that are chronically short on
entropy they can build with the latent_entropy GCC plugin to at least
bump the seeding around a bit. So, it _can_ be low entropy, but adding
this still improves the situation since the address of the freelist
pointer itself is part of the XORing, so even if the random number was
static, it would still require info exposures to work around the
obfuscation.

Shorter version: yeah, it'll still be useful. :)

-Kees
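
To make the last point concrete, here is a small sketch of why mixing in the slot address still helps when the random value carries no entropy at all. The function names and the addresses are invented for illustration, and random is deliberately set to 0 to model the worst case.

```c
#include <assert.h>
#include <stdint.h>

/* Same XOR as in the patch, on plain integers so the values are reproducible. */
static inline uint64_t toy_obfuscate(uint64_t random, uint64_t ptr,
				     uint64_t ptr_addr)
{
	return ptr ^ random ^ ptr_addr;
}

static void toy_static_random_demo(void)
{
	const uint64_t random = 0;			/* worst case: no entropy */
	const uint64_t target = 0xffff888000002000ULL;	/* made-up object address */
	const uint64_t slot_a = 0xffff888000001000ULL;	/* made-up slot addresses */
	const uint64_t slot_b = 0xffff888000001040ULL;

	uint64_t stored_a = toy_obfuscate(random, target, slot_a);
	uint64_t stored_b = toy_obfuscate(random, target, slot_b);

	/* The same pointer is stored as a different value in each slot. */
	assert(stored_a != stored_b);
	/* Neither stored value is the raw pointer. */
	assert(stored_a != target && stored_b != target);
	/* Decoding at the right slot recovers the pointer. */
	assert(toy_obfuscate(random, stored_a, slot_a) == target);
	/* A value leaked from slot A and replayed at slot B decodes elsewhere. */
	assert(toy_obfuscate(random, stored_a, slot_b) != target);
}
```

To forge a valid entry the attacker therefore needs the address of the slot being overwritten, which in practice means an information leak, in addition to whatever they know about the random value.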
