
[v3] mm: Add SLUB free list pointer obfuscation

Message ID 20170706002718.GA102852@beast (mailing list archive)
State New, archived

Commit Message

Kees Cook July 6, 2017, 12:27 a.m. UTC
This SLUB free list pointer obfuscation code is modified from Brad
Spengler/PaX Team's code in the last public patch of grsecurity/PaX based
on my understanding of the code. Changes or omissions from the original
code are mine and don't reflect the original grsecurity/PaX code.

This adds a per-cache random value to SLUB caches that is XORed with
their freelist pointer address and value. This adds nearly zero overhead
and frustrates the very common heap overflow exploitation method of
overwriting freelist pointers. A recent example of the attack is written
up here: http://cyseclabs.com/blog/cve-2016-6187-heap-off-by-one-exploit
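
Conceptually, the stored freelist pointer is mixed with a per-cache
secret and with the address where the pointer is kept. A minimal
standalone userspace sketch of the idea (illustrative names and values
only, not the kernel code, which is in the patch below):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the per-cache value the patch initializes with get_random_long(). */
static uintptr_t cache_random = 0x8f3c6a2b91d4e7f5UL;

/* XOR is its own inverse, so one helper both obfuscates and deobfuscates. */
static void *mangle(void *ptr, void *ptr_location)
{
	return (void *)((uintptr_t)ptr ^ cache_random ^ (uintptr_t)ptr_location);
}

int main(void)
{
	void *next_free = malloc(64);	/* pretend: next object on the freelist */
	void **slot = malloc(64);	/* pretend: where the freelist pointer is stored */

	*slot = mangle(next_free, slot);	/* store obfuscated */
	void *decoded = mangle(*slot, slot);	/* load and decode */

	assert(decoded == next_free);
	printf("raw %p stored as %p, decodes back to %p\n",
	       next_free, *slot, decoded);
	return 0;
}

An attacker who overwrites the stored word without knowing the
per-cache value no longer controls where the decoded pointer lands.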

This is based on patches by Daniel Micay, and refactored to minimize the
use of #ifdef.

Under 200-count cycles of "hackbench -g 20 -l 1000" I saw the following
run times:

before:
	mean 10.11882499999999999995
	variance .03320378329145728642
	stdev .18221905304181911048

after:
	mean 10.12654000000000000014
	variance .04700556623115577889
	stdev .21680767106160192064

The difference gets lost in the noise, but if the above is to be taken
literally, using CONFIG_SLAB_FREELIST_HARDENED is 0.07% slower.

Suggested-by: Daniel Micay <danielmicay@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tycho Andersen <tycho@docker.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
v3:
- use static inlines instead of macros (akpm).
v2:
- rename CONFIG_SLAB_HARDENED to CONFIG_SLAB_FREELIST_HARDENED (labbott).
---
 include/linux/slub_def.h |  4 ++++
 init/Kconfig             |  9 +++++++++
 mm/slub.c                | 42 +++++++++++++++++++++++++++++++++++++-----
 3 files changed, 50 insertions(+), 5 deletions(-)

Comments

Christoph Lameter (Ampere) July 6, 2017, 1:43 p.m. UTC | #1
On Wed, 5 Jul 2017, Kees Cook wrote:

> @@ -3536,6 +3565,9 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags)
>  {
>  	s->flags = kmem_cache_flags(s->size, flags, s->name, s->ctor);
>  	s->reserved = 0;
> +#ifdef CONFIG_SLAB_FREELIST_HARDENED
> +	s->random = get_random_long();
> +#endif
>
>  	if (need_reserve_slab_rcu && (s->flags & SLAB_TYPESAFE_BY_RCU))
>  		s->reserved = sizeof(struct rcu_head);
>

So if an attacker knows the internal structure of data then he can simply
dereference page->kmem_cache->random to decode the freepointer.

Assuming someone is already targeting a freelist pointer (which indicates
detailed knowledge of the internal structure) then I would think that
someone like that will also figure out how to follow the pointer links to
get to the random value.

Not seeing the point of all of this.
Kees Cook July 6, 2017, 3:48 p.m. UTC | #2
On Thu, Jul 6, 2017 at 6:43 AM, Christoph Lameter <cl@linux.com> wrote:
> On Wed, 5 Jul 2017, Kees Cook wrote:
>
>> @@ -3536,6 +3565,9 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags)
>>  {
>>       s->flags = kmem_cache_flags(s->size, flags, s->name, s->ctor);
>>       s->reserved = 0;
>> +#ifdef CONFIG_SLAB_FREELIST_HARDENED
>> +     s->random = get_random_long();
>> +#endif
>>
>>       if (need_reserve_slab_rcu && (s->flags & SLAB_TYPESAFE_BY_RCU))
>>               s->reserved = sizeof(struct rcu_head);
>>
>
> So if an attacker knows the internal structure of data then he can simply
> dereference page->kmem_cache->random to decode the freepointer.

That requires a series of arbitrary reads. This is protecting against
attacks that use an adjacent slab object write overflow to write the
freelist pointer. This internal structure is very reliable, and has
been the basis of freelist attacks against the kernel for a decade.

> Assuming someone is already targeting a freelist pointer (which indicates
> detailed knowledge of the internal structure) then I would think that
> someone like that will also figure out how to follow the pointer links to
> get to the random value.

The kind of hardening this provides is to frustrate the expansion of
an attacker's capabilities. Most attacks are a chain of exploits that
slowly builds up the ability to perform arbitrary writes. For example,
a slab object overflow isn't an arbitrary write on its own, but when
combined with heap allocation layout control and an adjacent free
object, this can be upgraded to an arbitrary write.
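
As a toy illustration of that upgrade (a deliberately simplified
userspace model, not kernel code): free objects keep the next-free
pointer inline, so a linear overflow out of the neighboring allocated
object replaces it, and two allocations later the allocator hands back
an attacker-chosen address to write through.

#include <stdio.h>
#include <string.h>

#define OBJ_SIZE 32

/* Toy slab: as in SLUB, a free object stores the next-free pointer inside itself. */
static _Alignas(void *) unsigned char slab[4][OBJ_SIZE];
static void *freelist;

static void toy_free(void *obj)
{
	*(void **)obj = freelist;
	freelist = obj;
}

static void *toy_alloc(void)
{
	void *obj = freelist;

	freelist = *(void **)obj;	/* blindly trusts the inline pointer */
	return obj;
}

int main(void)
{
	long victim = 0;
	void *target = &victim;		/* stands in for a sensitive kernel address */

	toy_free(slab[1]);
	toy_free(slab[0]);		/* freelist: slab[0] -> slab[1] -> NULL */

	unsigned char *a = toy_alloc();	/* attacker-controlled object; slab[1] stays free */

	/* Linear overflow out of 'a' clobbers slab[1]'s inline freelist pointer. */
	memcpy(a + OBJ_SIZE, &target, sizeof(target));

	toy_alloc();			/* returns slab[1] */
	long *p = toy_alloc();		/* returns &victim: an arbitrary write target */
	*p = 0x41414141;

	printf("victim is now 0x%lx\n", (unsigned long)victim);
	return 0;
}

With the XOR obfuscation in place, the same overwrite decodes to an
effectively random value instead of the chosen address, so the write
lands somewhere unpredictable (usually an unmapped address) rather than
at the target.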

> Not seeing the point of all of this.

It is a probabilistic defense, but then so is the stack protector.
This is a similar defense; while not perfect it makes the class of
attack much more difficult to mount.

-Kees
Christoph Lameter (Ampere) July 6, 2017, 3:55 p.m. UTC | #3
On Thu, 6 Jul 2017, Kees Cook wrote:

> On Thu, Jul 6, 2017 at 6:43 AM, Christoph Lameter <cl@linux.com> wrote:
> > On Wed, 5 Jul 2017, Kees Cook wrote:
> >
> >> @@ -3536,6 +3565,9 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags)
> >>  {
> >>       s->flags = kmem_cache_flags(s->size, flags, s->name, s->ctor);
> >>       s->reserved = 0;
> >> +#ifdef CONFIG_SLAB_FREELIST_HARDENED
> >> +     s->random = get_random_long();
> >> +#endif
> >>
> >>       if (need_reserve_slab_rcu && (s->flags & SLAB_TYPESAFE_BY_RCU))
> >>               s->reserved = sizeof(struct rcu_head);
> >>
> >
> > So if an attacker knows the internal structure of data then he can simply
> > dereference page->kmem_cache->random to decode the freepointer.
>
> That requires a series of arbitrary reads. This is protecting against
> attacks that use an adjacent slab object write overflow to write the
> freelist pointer. This internal structure is very reliable, and has
> been the basis of freelist attacks against the kernel for a decade.

These reads are not arbitrary. You can usually calculate the page struct
address easily from the address and then do a couple of loads to get
there.

Ok so you get rid of the old attacks because we did not have that
hardening in effect when they designed their approaches?

> It is a probabilistic defense, but then so is the stack protector.
> This is a similar defense; while not perfect it makes the class of
> attack much more difficult to mount.

Na I am not convinced of the "much more difficult". Maybe they will just
have to upgrade their approaches to fetch the proper values to decode.
Daniel Micay July 6, 2017, 4:16 p.m. UTC | #4
On Thu, 2017-07-06 at 10:55 -0500, Christoph Lameter wrote:
> On Thu, 6 Jul 2017, Kees Cook wrote:
> 
> > On Thu, Jul 6, 2017 at 6:43 AM, Christoph Lameter <cl@linux.com>
> > wrote:
> > > On Wed, 5 Jul 2017, Kees Cook wrote:
> > > 
> > > > @@ -3536,6 +3565,9 @@ static int kmem_cache_open(struct
> > > > kmem_cache *s, unsigned long flags)
> > > >  {
> > > >       s->flags = kmem_cache_flags(s->size, flags, s->name, s-
> > > > >ctor);
> > > >       s->reserved = 0;
> > > > +#ifdef CONFIG_SLAB_FREELIST_HARDENED
> > > > +     s->random = get_random_long();
> > > > +#endif
> > > > 
> > > >       if (need_reserve_slab_rcu && (s->flags &
> > > > SLAB_TYPESAFE_BY_RCU))
> > > >               s->reserved = sizeof(struct rcu_head);
> > > > 
> > > 
> > > So if an attacker knows the internal structure of data then he can
> > > simply
> > > dereference page->kmem_cache->random to decode the freepointer.
> > 
> > That requires a series of arbitrary reads. This is protecting
> > against
> > attacks that use an adjacent slab object write overflow to write the
> > freelist pointer. This internal structure is very reliable, and has
> > been the basis of freelist attacks against the kernel for a decade.
> 
> These reads are not arbitrary. You can usually calculate the page
> struct
> address easily from the address and then do a couple of loads to get
> there.

You're describing an arbitrary read vulnerability: an attacker able to
read the value of an address of their choosing. Requiring a powerful
additional primitive, rather than only a small fixed-size overflow or a
weak use-after-free vulnerability, before this common exploit vector can
be used is useful.

A deterministic mitigation would be better, but I don't think an extra
slab allocator for hardened kernels would be welcomed. Since there isn't
a separate allocator for that niche, SLAB or SLUB are used. The ideal
would be bitmaps in `struct page` but that implies another allocator,
using single pages for the smallest size classes and potentially needing
to bloat `struct page` even with that.

There's definitely a limit to the hardening that can be done for SLUB,
but unless forking it into a different allocator is welcome, hardening
SLUB is what will keep being suggested. Similarly, the slab freelist
randomization feature is a much weaker mitigation than it could be
without these constraints placed on it. This change is much lower
complexity than that, and higher value, though...

> Ok so you get rid of the old attacks because we did not have that
> hardening in effect when they designed their approaches?
> 
> > It is a probabilistic defense, but then so is the stack protector.
> > This is a similar defense; while not perfect it makes the class of
> > attack much more difficult to mount.
> 
> Na I am not convinced of the "much more difficult". Maybe they will
> just
> have to upgrade their approaches to fetch the proper values to decode.

To fetch the values they would need an arbitrary read vulnerability or
the ability to dump them via uninitialized slab allocations as an extra
requirement.

An attacker can similarly bypass the stack canary by reading it from
stack frames via a stack buffer read overflow or uninitialized variable
usage leaking stack data. On non-x86, at least with SMP, the stack
canary is also just a global variable that remains the same after
initialization. That doesn't make it useless, although the kernel
doesn't have many linear overflows on the stack, which is the real
issue with it as a mitigation. Despite that, most people are using
kernels with stack canaries, and that has a significant performance
cost, unlike these kinds of changes.
Rik van Riel July 6, 2017, 5:53 p.m. UTC | #5
On Thu, 2017-07-06 at 10:55 -0500, Christoph Lameter wrote:
> On Thu, 6 Jul 2017, Kees Cook wrote:
> 
> > On Thu, Jul 6, 2017 at 6:43 AM, Christoph Lameter <cl@linux.com>
> > wrote:
> > > On Wed, 5 Jul 2017, Kees Cook wrote:
> > > 
> > > > @@ -3536,6 +3565,9 @@ static int kmem_cache_open(struct
> > > > kmem_cache *s, unsigned long flags)
> > > >  {
> > > >       s->flags = kmem_cache_flags(s->size, flags, s->name, s-
> > > > >ctor);
> > > >       s->reserved = 0;
> > > > +#ifdef CONFIG_SLAB_FREELIST_HARDENED
> > > > +     s->random = get_random_long();
> > > > +#endif
> > > > 
> > > >       if (need_reserve_slab_rcu && (s->flags &
> > > > SLAB_TYPESAFE_BY_RCU))
> > > >               s->reserved = sizeof(struct rcu_head);
> > > > 
> > > 
> > > So if an attacker knows the internal structure of data then he
> > > can simply
> > > dereference page->kmem_cache->random to decode the freepointer.
> > 
> > That requires a series of arbitrary reads. This is protecting
> > against
> > attacks that use an adjacent slab object write overflow to write
> > the
> > freelist pointer. This internal structure is very reliable, and has
> > been the basis of freelist attacks against the kernel for a decade.
> 
> These reads are not arbitrary. You can usually calculate the page
> struct
> address easily from the address and then do a couple of loads to get
> there.
> 
> Ok so you get rid of the old attacks because we did not have that
> hardening in effect when they designed their approaches?

The hardening protects against situations where
people do not have arbitrary code execution and
memory read access in the kernel, with the goal
of protecting people from acquiring those abilities.

> > It is a probabilistic defense, but then so is the stack protector.
> > This is a similar defense; while not perfect it makes the class of
> > attack much more difficult to mount.
> 
> Na I am not convinced of the "much more difficult". Maybe they will
> just
> have to upgrade their approaches to fetch the proper values to
> decode.

Easier said than done. For most of the time that an
unpatched vulnerability is outstanding, the attacker
has only that one known issue to work with, before
the kernel is updated by the user to a version that
does not have it.

Bypassing kernel hardening typically requires the
use of multiple vulnerabilities, and the absence of
roadblocks (like hardening) that would make a type
of vulnerability unexploitable.

Between usercopy hardening and these SLUB freelist
canaries (which is what they effectively are), several
classes of exploits are no longer usable.
Kees Cook July 6, 2017, 6:50 p.m. UTC | #6
On Thu, Jul 6, 2017 at 10:53 AM, Rik van Riel <riel@redhat.com> wrote:
> On Thu, 2017-07-06 at 10:55 -0500, Christoph Lameter wrote:
>> On Thu, 6 Jul 2017, Kees Cook wrote:
>> > That requires a series of arbitrary reads. This is protecting
>> > against
>> > attacks that use an adjacent slab object write overflow to write
>> > the
>> > freelist pointer. This internal structure is very reliable, and has
>> > been the basis of freelist attacks against the kernel for a decade.
>>
>> These reads are not arbitrary. You can usually calculate the page struct
>> address easily from the address and then do a couple of loads to get
>> there.
>>
>> Ok so you get rid of the old attacks because we did not have that
>> hardening in effect when they designed their approaches?
>
> The hardening protects against situations where
> people do not have arbitrary code execution and
> memory read access in the kernel, with the goal
> of protecting people from acquiring those abilities.

Right. This is about blocking the escalation of attack capability. For
slab object overflow flaws, there are mainly two exploitation methods:
adjacent allocated object overwrite and adjacent freed object
overwrite (i.e. a freelist pointer overwrite). The first attack
depends heavily on which slab cache (and therefore which structures)
has been exposed by the bug. It's a very narrow and specific attack
method. The freelist attack is entirely general purpose since it
provides a reliable way to gain arbitrary write capabilities.
Protecting against that attack greatly narrows the options for an
attacker, which makes attacks more expensive to create and possibly
less reliable (and reliability is crucial to successful attacks).

>> > It is a probabilistic defense, but then so is the stack protector.
>> > This is a similar defense; while not perfect it makes the class of
>> > attack much more difficult to mount.
>>
>> Na I am not convinced of the "much more difficult". Maybe they will just
>> have to upgrade their approaches to fetch the proper values to
>> decode.
>
> Easier said than done. Most of the time there is an
> unpatched vulnerability outstanding, there is only
> one known issue, before the kernel is updated by the
> user, to a version that does not have that issue.
>
> Bypassing kernel hardening typically requires the
> use of multiple vulnerabilities, and the absence of
> roadblocks (like hardening) that make a type of
> vulnerability exploitable.
>
> Between usercopy hardening, and these slub freelist
> canaries (which is what they effectively are), several
> classes of exploits are no longer usable.

Yup. I've been thinking of this patch kind of like glibc's PTR_MANGLE feature.
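
PTR_MANGLE protects function pointers that glibc stores long-term (for
example in jmp_bufs and atexit handlers) by XORing them with a
per-process guard and rotating the result. A rough standalone model of
the idea; the constants are illustrative and this is not glibc's actual
implementation, which does the equivalent in assembly with the guard
kept in TLS:

#include <assert.h>
#include <stdint.h>

/* Per-process secret; glibc seeds the real one from the auxiliary vector
 * at startup. This sketch assumes 64-bit uintptr_t. */
static uintptr_t pointer_guard = 0x5a5a1234deadbeefUL;

static uintptr_t rotl(uintptr_t v, unsigned n) { return (v << n) | (v >> (64 - n)); }
static uintptr_t rotr(uintptr_t v, unsigned n) { return (v >> n) | (v << (64 - n)); }

/* Mangle on store, demangle on load. */
static uintptr_t ptr_mangle(uintptr_t p)   { return rotl(p ^ pointer_guard, 17); }
static uintptr_t ptr_demangle(uintptr_t p) { return rotr(p, 17) ^ pointer_guard; }

int main(void)
{
	uintptr_t fn = (uintptr_t)&pointer_guard;	/* any pointer value */

	assert(ptr_demangle(ptr_mangle(fn)) == fn);
	return 0;
}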

-Kees

Patch

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 07ef550c6627..d7990a83b416 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -93,6 +93,10 @@  struct kmem_cache {
 #endif
 #endif
 
+#ifdef CONFIG_SLAB_FREELIST_HARDENED
+	unsigned long random;
+#endif
+
 #ifdef CONFIG_NUMA
 	/*
 	 * Defragmentation by allocating from a remote node.
diff --git a/init/Kconfig b/init/Kconfig
index 1d3475fc9496..04ee3e507b9e 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1900,6 +1900,15 @@  config SLAB_FREELIST_RANDOM
 	  security feature reduces the predictability of the kernel slab
 	  allocator against heap overflows.
 
+config SLAB_FREELIST_HARDENED
+	bool "Harden slab freelist metadata"
+	depends on SLUB
+	help
+	  Many kernel heap attacks try to target slab cache metadata and
+	  other infrastructure. This option makes minor performance
+	  sacrifices to harden the kernel slab allocator against common
+	  freelist exploit methods.
+
 config SLUB_CPU_PARTIAL
 	default y
 	depends on SLUB && SMP
diff --git a/mm/slub.c b/mm/slub.c
index 57e5156f02be..eae0628d3346 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -34,6 +34,7 @@ 
 #include <linux/stacktrace.h>
 #include <linux/prefetch.h>
 #include <linux/memcontrol.h>
+#include <linux/random.h>
 
 #include <trace/events/kmem.h>
 
@@ -238,30 +239,58 @@  static inline void stat(const struct kmem_cache *s, enum stat_item si)
  * 			Core slab cache functions
  *******************************************************************/
 
+/*
+ * Returns freelist pointer (ptr). With hardening, this is obfuscated
+ * with an XOR of the address where the pointer is held and a per-cache
+ * random number.
+ */
+static inline void *freelist_ptr(const struct kmem_cache *s, void *ptr,
+				 unsigned long ptr_addr)
+{
+#ifdef CONFIG_SLAB_FREELIST_HARDENED
+	return (void *)((unsigned long)ptr ^ s->random ^ ptr_addr);
+#else
+	return ptr;
+#endif
+}
+
+/* Returns the freelist pointer recorded at location ptr_addr. */
+static inline void *freelist_dereference(const struct kmem_cache *s,
+					 void *ptr_addr)
+{
+	return freelist_ptr(s, (void *)*(unsigned long *)(ptr_addr),
+			    (unsigned long)ptr_addr);
+}
+
 static inline void *get_freepointer(struct kmem_cache *s, void *object)
 {
-	return *(void **)(object + s->offset);
+	return freelist_dereference(s, object + s->offset);
 }
 
 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
 {
-	prefetch(object + s->offset);
+	if (object)
+		prefetch(freelist_dereference(s, object + s->offset));
 }
 
 static inline void *get_freepointer_safe(struct kmem_cache *s, void *object)
 {
+	unsigned long freepointer_addr;
 	void *p;
 
 	if (!debug_pagealloc_enabled())
 		return get_freepointer(s, object);
 
-	probe_kernel_read(&p, (void **)(object + s->offset), sizeof(p));
-	return p;
+	freepointer_addr = (unsigned long)object + s->offset;
+	probe_kernel_read(&p, (void **)freepointer_addr, sizeof(p));
+	return freelist_ptr(s, p, freepointer_addr);
 }
 
 static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
 {
-	*(void **)(object + s->offset) = fp;
+	unsigned long freeptr_addr = (unsigned long)object + s->offset;
+
+	*(void **)freeptr_addr = freelist_ptr(s, fp, freeptr_addr);
 }
 
 /* Loop over all objects in a slab */
@@ -3536,6 +3565,9 @@  static int kmem_cache_open(struct kmem_cache *s, unsigned long flags)
 {
 	s->flags = kmem_cache_flags(s->size, flags, s->name, s->ctor);
 	s->reserved = 0;
+#ifdef CONFIG_SLAB_FREELIST_HARDENED
+	s->random = get_random_long();
+#endif
 
 	if (need_reserve_slab_rcu && (s->flags & SLAB_TYPESAFE_BY_RCU))
 		s->reserved = sizeof(struct rcu_head);
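
A note on the helpers above: XOR is an involution, so freelist_ptr()
both encodes and decodes with the same call, which is why
set_freepointer() and freelist_dereference()/get_freepointer() can all
share it. A quick userspace model mirroring the patch's helpers
(freelist_dereference() is folded into get_freepointer() here; this is
a sketch assuming 64-bit, not kernel code):

#include <assert.h>
#include <stdlib.h>

struct toy_cache {
	size_t offset;		/* where the free pointer lives inside an object */
	unsigned long random;	/* per-cache key, cf. s->random in the patch */
};

static void *freelist_ptr(const struct toy_cache *s, void *ptr, unsigned long ptr_addr)
{
	return (void *)((unsigned long)ptr ^ s->random ^ ptr_addr);
}

static void set_freepointer(struct toy_cache *s, void *object, void *fp)
{
	unsigned long freeptr_addr = (unsigned long)object + s->offset;

	*(void **)freeptr_addr = freelist_ptr(s, fp, freeptr_addr);
}

static void *get_freepointer(struct toy_cache *s, void *object)
{
	unsigned long freeptr_addr = (unsigned long)object + s->offset;

	return freelist_ptr(s, *(void **)freeptr_addr, freeptr_addr);
}

int main(void)
{
	struct toy_cache s = { .offset = 0, .random = 0x8f3c6a2b91d4e7f5UL };
	void *obj = malloc(64), *next = malloc(64);

	set_freepointer(&s, obj, next);
	assert(get_freepointer(&s, obj) == next);	/* decode recovers the pointer */
	assert(*(void **)obj != next);			/* the raw word in memory is obfuscated */
	return 0;
}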