Message ID | 20170706002718.GA102852@beast (mailing list archive) |
---|---|
State | New, archived |
On Wed, 5 Jul 2017, Kees Cook wrote:

> @@ -3536,6 +3565,9 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags)
> {
>         s->flags = kmem_cache_flags(s->size, flags, s->name, s->ctor);
>         s->reserved = 0;
> +#ifdef CONFIG_SLAB_FREELIST_HARDENED
> +       s->random = get_random_long();
> +#endif
>
>         if (need_reserve_slab_rcu && (s->flags & SLAB_TYPESAFE_BY_RCU))
>                 s->reserved = sizeof(struct rcu_head);
>

So if an attacker knows the internal structure of data then he can simply dereference page->kmem_cache->random to decode the freepointer.

Assuming someone is already targeting a freelist pointer (which indicates detailed knowledge of the internal structure) then I would think that someone like that will also figure out how to follow the pointer links to get to the random value.

Not seeing the point of all of this.
On Thu, Jul 6, 2017 at 6:43 AM, Christoph Lameter <cl@linux.com> wrote:
> On Wed, 5 Jul 2017, Kees Cook wrote:
>
>> @@ -3536,6 +3565,9 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags)
>> {
>>         s->flags = kmem_cache_flags(s->size, flags, s->name, s->ctor);
>>         s->reserved = 0;
>> +#ifdef CONFIG_SLAB_FREELIST_HARDENED
>> +       s->random = get_random_long();
>> +#endif
>>
>>         if (need_reserve_slab_rcu && (s->flags & SLAB_TYPESAFE_BY_RCU))
>>                 s->reserved = sizeof(struct rcu_head);
>>
>
> So if an attacker knows the internal structure of data then he can simply dereference page->kmem_cache->random to decode the freepointer.

That requires a series of arbitrary reads. This is protecting against attacks that use an adjacent slab object write overflow to write the freelist pointer. This internal structure is very reliable, and has been the basis of freelist attacks against the kernel for a decade.

> Assuming someone is already targeting a freelist pointer (which indicates detailed knowledge of the internal structure) then I would think that someone like that will also figure out how to follow the pointer links to get to the random value.

The kind of hardening this provides is to frustrate the expansion of an attacker's capabilities. Most attacks are a chain of exploits that slowly builds up the ability to perform arbitrary writes. For example, a slab object overflow isn't an arbitrary write on its own, but when combined with heap allocation layout control and an adjacent free object, this can be upgraded to an arbitrary write.

> Not seeing the point of all of this.

It is a probabilistic defense, but then so is the stack protector. This is a similar defense; while not perfect it makes the class of attack much more difficult to mount.

-Kees
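To make the defense Kees describes concrete, here is a minimal userspace sketch of the same XOR scheme. Everything in it (the `fake_cache` type, the constants, the harness) is illustrative rather than kernel code; only the formula mirrors the patch's freelist_ptr(): the freelist word stored inside a free object is the real "next" pointer XORed with a per-cache secret and with the address of the slot holding it.

```c
#include <stdio.h>
#include <stdint.h>

/* Illustrative stand-in for the per-cache secret added by the patch. */
struct fake_cache { uintptr_t random; };

/*
 * Same shape as the patch's freelist_ptr(): XOR the value with the
 * per-cache secret and the address of the slot holding it. Applying it
 * twice with the same arguments returns the original pointer.
 */
static uintptr_t freelist_ptr(const struct fake_cache *s, uintptr_t ptr,
                              uintptr_t ptr_addr)
{
	return ptr ^ s->random ^ ptr_addr;
}

int main(void)
{
	struct fake_cache s = { .random = 0xdeadbeefcafef00dUL }; /* get_random_long() stand-in */
	uintptr_t slot;                              /* models the freelist word in a free object */
	uintptr_t next = 0xffff888012345678UL;       /* made-up "next free object" address */

	/* What set_freepointer() would store ... */
	slot = freelist_ptr(&s, next, (uintptr_t)&slot);
	/* ... and what get_freepointer() reads back: the original pointer. */
	printf("decoded:  %#lx\n", (unsigned long)freelist_ptr(&s, slot, (uintptr_t)&slot));

	/* An overflow that writes a raw target address into the slot ... */
	uintptr_t target = 0xffffffff81000000UL;     /* attacker's desired address (made up) */
	slot = target;
	/* ... decodes to an uncontrolled value without knowledge of s.random. */
	printf("attacker: %#lx\n", (unsigned long)freelist_ptr(&s, slot, (uintptr_t)&slot));
	return 0;
}
```

Without knowing the per-cache secret, an overflow that writes a chosen address into the freelist slot decodes to a value the attacker does not control, which is what breaks the classic freelist-overwrite technique.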
On Thu, 6 Jul 2017, Kees Cook wrote:

> On Thu, Jul 6, 2017 at 6:43 AM, Christoph Lameter <cl@linux.com> wrote:
> > On Wed, 5 Jul 2017, Kees Cook wrote:
> >
> >> @@ -3536,6 +3565,9 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags)
> >> {
> >>         s->flags = kmem_cache_flags(s->size, flags, s->name, s->ctor);
> >>         s->reserved = 0;
> >> +#ifdef CONFIG_SLAB_FREELIST_HARDENED
> >> +       s->random = get_random_long();
> >> +#endif
> >>
> >>         if (need_reserve_slab_rcu && (s->flags & SLAB_TYPESAFE_BY_RCU))
> >>                 s->reserved = sizeof(struct rcu_head);
> >>
> >
> > So if an attacker knows the internal structure of data then he can simply dereference page->kmem_cache->random to decode the freepointer.
>
> That requires a series of arbitrary reads. This is protecting against attacks that use an adjacent slab object write overflow to write the freelist pointer. This internal structure is very reliable, and has been the basis of freelist attacks against the kernel for a decade.

These reads are not arbitrary. You can usually calculate the page struct address easily from the address and then do a couple of loads to get there.

Ok so you get rid of the old attacks because we did not have that hardening in effect when they designed their approaches?

> It is a probabilistic defense, but then so is the stack protector. This is a similar defense; while not perfect it makes the class of attack much more difficult to mount.

Na I am not convinced of the "much more difficult". Maybe they will just have to upgrade their approaches to fetch the proper values to decode.
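Christoph's counterpoint, made concrete: if the attacker already has a memory-read primitive, the secret is only a short pointer chase away. The sketch below is a heavily simplified userspace model; `leak()` stands in for a read primitive the attacker would have to obtain separately, and the structures are toy stand-ins for struct page and struct kmem_cache (in the kernel the chain would go through virt_to_page() and the page's slab_cache field).

```c
#include <stdint.h>
#include <stdio.h>

/* Heavily simplified stand-ins; the real structures live in the kernel. */
struct fake_kmem_cache { uintptr_t random; };           /* secret is the first member here */
struct fake_page { struct fake_kmem_cache *slab_cache; };

/* Models an arbitrary-read primitive the attacker must already possess. */
static uintptr_t leak(const void *addr)
{
	return *(const uintptr_t *)addr;
}

int main(void)
{
	struct fake_kmem_cache cache = { .random = 0x1122334455667788UL };
	struct fake_page page = { .slab_cache = &cache };

	/*
	 * Step 1: locate the page struct for the object's address
	 * (virt_to_page() in the kernel; simply given here).
	 */
	struct fake_page *pg = &page;

	/* Step 2: two dependent loads recover the per-cache secret. */
	uintptr_t slab_cache = leak(&pg->slab_cache);
	uintptr_t secret = leak((const void *)slab_cache); /* random at offset 0 in this model */

	printf("recovered secret: %#lx\n", (unsigned long)secret);
	return 0;
}
```

The rest of the thread is essentially a disagreement over whether requiring that extra read primitive is a meaningful hurdle.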
On Thu, 2017-07-06 at 10:55 -0500, Christoph Lameter wrote:
> On Thu, 6 Jul 2017, Kees Cook wrote:
>
> > On Thu, Jul 6, 2017 at 6:43 AM, Christoph Lameter <cl@linux.com> wrote:
> > > On Wed, 5 Jul 2017, Kees Cook wrote:
> > >
> > > > @@ -3536,6 +3565,9 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags)
> > > > {
> > > >         s->flags = kmem_cache_flags(s->size, flags, s->name, s->ctor);
> > > >         s->reserved = 0;
> > > > +#ifdef CONFIG_SLAB_FREELIST_HARDENED
> > > > +       s->random = get_random_long();
> > > > +#endif
> > > >
> > > >         if (need_reserve_slab_rcu && (s->flags & SLAB_TYPESAFE_BY_RCU))
> > > >                 s->reserved = sizeof(struct rcu_head);
> > > >
> > >
> > > So if an attacker knows the internal structure of data then he can simply dereference page->kmem_cache->random to decode the freepointer.
> >
> > That requires a series of arbitrary reads. This is protecting against attacks that use an adjacent slab object write overflow to write the freelist pointer. This internal structure is very reliable, and has been the basis of freelist attacks against the kernel for a decade.
>
> These reads are not arbitrary. You can usually calculate the page struct address easily from the address and then do a couple of loads to get there.

You're describing an arbitrary read vulnerability: an attacker able to read the value of an address of their choosing. Requiring a powerful additional primitive like that, rather than only a small fixed-size overflow or a weak use-after-free vulnerability, before a common exploit vector can be used is useful.

A deterministic mitigation would be better, but I don't think an extra slab allocator for hardened kernels would be welcomed. Since there isn't a separate allocator for that niche, SLAB or SLUB are used. The ideal would be bitmaps in `struct page`, but that implies another allocator, using single pages for the smallest size classes, and potentially needing to bloat `struct page` even with that. There's definitely a limit to the hardening that can be done for SLUB, but unless forking it into a different allocator is welcome, that's what will be suggested. Similarly, the slab freelist randomization feature is a much weaker mitigation than it could be without these constraints placed on it. This is much lower complexity than that and higher value, though...

> Ok so you get rid of the old attacks because we did not have that hardening in effect when they designed their approaches?
>
> > It is a probabilistic defense, but then so is the stack protector. This is a similar defense; while not perfect it makes the class of attack much more difficult to mount.
>
> Na I am not convinced of the "much more difficult". Maybe they will just have to upgrade their approaches to fetch the proper values to decode.

To fetch the values they would need an arbitrary read vulnerability, or the ability to dump them via uninitialized slab allocations, as an extra requirement. An attacker can similarly bypass the stack canary by reading it from stack frames via a stack buffer read overflow or an uninitialized variable leaking stack data. On non-x86, at least with SMP, the stack canary is just a global variable that remains the same after initialization, too. That doesn't make it useless, although the kernel doesn't have many linear overflows on the stack, which is the real issue with it as a mitigation.
Despite that, most people are using kernels with stack canaries, and that has a significant performance cost unlike these kinds of changes.
On Thu, 2017-07-06 at 10:55 -0500, Christoph Lameter wrote:
> On Thu, 6 Jul 2017, Kees Cook wrote:
>
> > On Thu, Jul 6, 2017 at 6:43 AM, Christoph Lameter <cl@linux.com> wrote:
> > > On Wed, 5 Jul 2017, Kees Cook wrote:
> > >
> > > > @@ -3536,6 +3565,9 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags)
> > > > {
> > > >         s->flags = kmem_cache_flags(s->size, flags, s->name, s->ctor);
> > > >         s->reserved = 0;
> > > > +#ifdef CONFIG_SLAB_FREELIST_HARDENED
> > > > +       s->random = get_random_long();
> > > > +#endif
> > > >
> > > >         if (need_reserve_slab_rcu && (s->flags & SLAB_TYPESAFE_BY_RCU))
> > > >                 s->reserved = sizeof(struct rcu_head);
> > > >
> > >
> > > So if an attacker knows the internal structure of data then he can simply dereference page->kmem_cache->random to decode the freepointer.
> >
> > That requires a series of arbitrary reads. This is protecting against attacks that use an adjacent slab object write overflow to write the freelist pointer. This internal structure is very reliable, and has been the basis of freelist attacks against the kernel for a decade.
>
> These reads are not arbitrary. You can usually calculate the page struct address easily from the address and then do a couple of loads to get there.
>
> Ok so you get rid of the old attacks because we did not have that hardening in effect when they designed their approaches?

The hardening protects against situations where people do not have arbitrary code execution and memory read access in the kernel, with the goal of protecting people from acquiring those abilities.

> > It is a probabilistic defense, but then so is the stack protector. This is a similar defense; while not perfect it makes the class of attack much more difficult to mount.
>
> Na I am not convinced of the "much more difficult". Maybe they will just have to upgrade their approaches to fetch the proper values to decode.

Easier said than done. Most of the time there is an unpatched vulnerability outstanding, there is only one known issue before the user updates the kernel to a version that does not have that issue.

Bypassing kernel hardening typically requires the use of multiple vulnerabilities, and the absence of roadblocks (like this hardening) that keep a type of vulnerability from being exploitable.

Between usercopy hardening and these slub freelist canaries (which is what they effectively are), several classes of exploits are no longer usable.
On Thu, Jul 6, 2017 at 10:53 AM, Rik van Riel <riel@redhat.com> wrote:
> On Thu, 2017-07-06 at 10:55 -0500, Christoph Lameter wrote:
>> On Thu, 6 Jul 2017, Kees Cook wrote:
>> > That requires a series of arbitrary reads. This is protecting against attacks that use an adjacent slab object write overflow to write the freelist pointer. This internal structure is very reliable, and has been the basis of freelist attacks against the kernel for a decade.
>>
>> These reads are not arbitrary. You can usually calculate the page struct address easily from the address and then do a couple of loads to get there.
>>
>> Ok so you get rid of the old attacks because we did not have that hardening in effect when they designed their approaches?
>
> The hardening protects against situations where people do not have arbitrary code execution and memory read access in the kernel, with the goal of protecting people from acquiring those abilities.

Right. This is about blocking the escalation of attack capability. For slab object overflow flaws, there are mainly two exploitation methods: adjacent allocated object overwrite and adjacent freed object overwrite (i.e. a freelist pointer overwrite). The first attack depends heavily on which slab cache (and therefore which structures) has been exposed by the bug. It's a very narrow and specific attack method. The freelist attack is entirely general purpose since it provides a reliable way to gain arbitrary write capabilities. Protecting against that attack greatly narrows the options for an attacker, which makes attacks more expensive to create and possibly less reliable (and reliability is crucial to successful attacks).

>> > It is a probabilistic defense, but then so is the stack protector. This is a similar defense; while not perfect it makes the class of attack much more difficult to mount.
>>
>> Na I am not convinced of the "much more difficult". Maybe they will just have to upgrade their approaches to fetch the proper values to decode.
>
> Easier said than done. Most of the time there is an unpatched vulnerability outstanding, there is only one known issue before the user updates the kernel to a version that does not have that issue.
>
> Bypassing kernel hardening typically requires the use of multiple vulnerabilities, and the absence of roadblocks (like this hardening) that keep a type of vulnerability from being exploitable.
>
> Between usercopy hardening and these slub freelist canaries (which is what they effectively are), several classes of exploits are no longer usable.

Yup. I've been thinking of this patch kind of like glibc's PTR_MANGLE feature.

-Kees
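For readers unfamiliar with the analogy: glibc's PTR_MANGLE protects saved code pointers (for example in jmp_buf) by XORing them with a per-process guard value and rotating the result before storing them. The sketch below is a rough userspace model of that idea; the guard value, rotate count, and storage details here are illustrative, not glibc's actual internals.

```c
#include <stdint.h>
#include <stdio.h>

#define ROT 17 /* illustrative rotate amount; the real value is a glibc internal */

static uint64_t rotl64(uint64_t v, unsigned n) { return (v << n) | (v >> (64 - n)); }
static uint64_t rotr64(uint64_t v, unsigned n) { return (v >> n) | (v << (64 - n)); }

/* Mangle on store: XOR with a per-process secret, then rotate. */
static uint64_t ptr_mangle(uint64_t ptr, uint64_t guard)
{
	return rotl64(ptr ^ guard, ROT);
}

/* Demangle on load: rotate back, then XOR with the same secret. */
static uint64_t ptr_demangle(uint64_t val, uint64_t guard)
{
	return rotr64(val, ROT) ^ guard;
}

int main(void)
{
	uint64_t guard = 0x0123456789abcdefULL; /* stand-in for the TLS pointer guard */
	uint64_t fn = 0x0000700000401000ULL;    /* made-up code pointer */

	uint64_t stored = ptr_mangle(fn, guard);
	printf("stored:    %#llx\n", (unsigned long long)stored);
	printf("recovered: %#llx\n", (unsigned long long)ptr_demangle(stored, guard));
	return 0;
}
```

The SLUB change is the same shape of defense, with the added twist that the storage address is mixed into the encoding, so identical pointers are stored differently in different objects.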
```diff
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 07ef550c6627..d7990a83b416 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -93,6 +93,10 @@ struct kmem_cache {
 #endif
 #endif
 
+#ifdef CONFIG_SLAB_FREELIST_HARDENED
+	unsigned long random;
+#endif
+
 #ifdef CONFIG_NUMA
 	/*
 	 * Defragmentation by allocating from a remote node.
diff --git a/init/Kconfig b/init/Kconfig
index 1d3475fc9496..04ee3e507b9e 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1900,6 +1900,15 @@ config SLAB_FREELIST_RANDOM
 	  security feature reduces the predictability of the kernel slab
 	  allocator against heap overflows.
 
+config SLAB_FREELIST_HARDENED
+	bool "Harden slab freelist metadata"
+	depends on SLUB
+	help
+	  Many kernel heap attacks try to target slab cache metadata and
+	  other infrastructure. This option makes minor performance
+	  sacrifices to harden the kernel slab allocator against common
+	  freelist exploit methods.
+
 config SLUB_CPU_PARTIAL
 	default y
 	depends on SLUB && SMP
diff --git a/mm/slub.c b/mm/slub.c
index 57e5156f02be..eae0628d3346 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -34,6 +34,7 @@
 #include <linux/stacktrace.h>
 #include <linux/prefetch.h>
 #include <linux/memcontrol.h>
+#include <linux/random.h>
 
 #include <trace/events/kmem.h>
 
@@ -238,30 +239,58 @@ static inline void stat(const struct kmem_cache *s, enum stat_item si)
 *			Core slab cache functions
 *******************************************************************/
 
+/*
+ * Returns freelist pointer (ptr). With hardening, this is obfuscated
+ * with an XOR of the address where the pointer is held and a per-cache
+ * random number.
+ */
+static inline void *freelist_ptr(const struct kmem_cache *s, void *ptr,
+				 unsigned long ptr_addr)
+{
+#ifdef CONFIG_SLAB_FREELIST_HARDENED
+	return (void *)((unsigned long)ptr ^ s->random ^ ptr_addr);
+#else
+	return ptr;
+#endif
+}
+
+/* Returns the freelist pointer recorded at location ptr_addr. */
+static inline void *freelist_dereference(const struct kmem_cache *s,
+					 void *ptr_addr)
+{
+	return freelist_ptr(s, (void *)*(unsigned long *)(ptr_addr),
+			    (unsigned long)ptr_addr);
+}
+
 static inline void *get_freepointer(struct kmem_cache *s, void *object)
 {
-	return *(void **)(object + s->offset);
+	return freelist_dereference(s, object + s->offset);
 }
 
 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
 {
-	prefetch(object + s->offset);
+	if (object)
+		prefetch(freelist_dereference(s, object + s->offset));
 }
 
 static inline void *get_freepointer_safe(struct kmem_cache *s, void *object)
 {
+	unsigned long freepointer_addr;
 	void *p;
 
 	if (!debug_pagealloc_enabled())
 		return get_freepointer(s, object);
 
-	probe_kernel_read(&p, (void **)(object + s->offset), sizeof(p));
-	return p;
+	freepointer_addr = (unsigned long)object + s->offset;
+	probe_kernel_read(&p, (void **)freepointer_addr, sizeof(p));
+	return freelist_ptr(s, p, freepointer_addr);
 }
 
 static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
 {
-	*(void **)(object + s->offset) = fp;
+	unsigned long freeptr_addr = (unsigned long)object + s->offset;
+
+	*(void **)freeptr_addr = freelist_ptr(s, fp, freeptr_addr);
 }
 
 /* Loop over all objects in a slab */
@@ -3536,6 +3565,9 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags)
 {
 	s->flags = kmem_cache_flags(s->size, flags, s->name, s->ctor);
 	s->reserved = 0;
+#ifdef CONFIG_SLAB_FREELIST_HARDENED
+	s->random = get_random_long();
+#endif
 
 	if (need_reserve_slab_rcu && (s->flags & SLAB_TYPESAFE_BY_RCU))
 		s->reserved = sizeof(struct rcu_head);
```
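Two properties of the formula above are worth spelling out, and can be checked with a small standalone harness (this tests only the arithmetic, not kernel code, and the constants are made up): because XOR is its own inverse, the single freelist_ptr() helper both encodes in set_freepointer() and decodes in get_freepointer(); and mixing in ptr_addr binds each encoded value to the slot it is stored in.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* The patch's formula, lifted into a standalone harness. */
static uintptr_t freelist_ptr(uintptr_t random, uintptr_t ptr, uintptr_t ptr_addr)
{
	return ptr ^ random ^ ptr_addr;
}

int main(void)
{
	uintptr_t random = 0x9e3779b97f4a7c15UL; /* stand-in for s->random */
	uintptr_t next = 0xffff888076543210UL;   /* made-up next-free-object address */
	uintptr_t slot_a = 0xffff888000001000UL; /* two different freelist slot addresses */
	uintptr_t slot_b = 0xffff888000002000UL;

	/* Encode then decode with the same slot address round-trips exactly. */
	assert(freelist_ptr(random, freelist_ptr(random, next, slot_a), slot_a) == next);

	/* The same plaintext pointer is stored differently in each slot. */
	printf("slot_a stores %#lx\n", (unsigned long)freelist_ptr(random, next, slot_a));
	printf("slot_b stores %#lx\n", (unsigned long)freelist_ptr(random, next, slot_b));
	return 0;
}
```

The location binding also means that copying an encoded freelist word from one object into another does not hand the attacker a chosen pointer: it decodes to the original pointer XORed with both slot addresses.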
This SLUB free list pointer obfuscation code is modified from Brad Spengler/PaX Team's code in the last public patch of grsecurity/PaX based on my understanding of the code. Changes or omissions from the original code are mine and don't reflect the original grsecurity/PaX code.

This adds a per-cache random value to SLUB caches that is XORed with their freelist pointer address and value. This adds nearly zero overhead and frustrates the very common heap overflow exploitation method of overwriting freelist pointers. A recent example of the attack is written up here:
http://cyseclabs.com/blog/cve-2016-6187-heap-off-by-one-exploit

This is based on patches by Daniel Micay, and refactored to minimize the use of #ifdef.

Under 200-count cycles of "hackbench -g 20 -l 1000" I saw the following run times:

before:
mean 10.11882499999999999995
variance .03320378329145728642
stdev .18221905304181911048

after:
mean 10.12654000000000000014
variance .04700556623115577889
stdev .21680767106160192064

The difference gets lost in the noise, but if the above is to be taken literally, using CONFIG_SLAB_FREELIST_HARDENED is 0.07% slower.

Suggested-by: Daniel Micay <danielmicay@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tycho Andersen <tycho@docker.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
v3:
- use static inlines instead of macros (akpm).
v2:
- rename CONFIG_SLAB_HARDENED to CONFIG_FREELIST_HARDENED (labbott).
---
 include/linux/slub_def.h |  4 ++++
 init/Kconfig             |  9 +++++++++
 mm/slub.c                | 42 +++++++++++++++++++++++++++++++++++++-----
 3 files changed, 50 insertions(+), 5 deletions(-)