From patchwork Thu May 26 20:37:11 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Garnier X-Patchwork-Id: 9137401 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id A110E6075A for ; Thu, 26 May 2016 20:38:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8E12D27BFE for ; Thu, 26 May 2016 20:38:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8138928145; Thu, 26 May 2016 20:38:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, RCVD_IN_DNSWL_MED, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id D050027BFE for ; Thu, 26 May 2016 20:37:58 +0000 (UTC) Received: (qmail 11735 invoked by uid 550); 26 May 2016 20:37:49 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: kernel-hardening@lists.openwall.com Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 11543 invoked from network); 26 May 2016 20:37:48 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ZQLoJnScQYHm2RPs+Fk63Ge6ub4w9i5ISOAbS6EJr54=; b=QaAkjLMksgUL7nx+OGMsFoMks8bkzxFXQvTIYJNZbyjO5fLgxUOwjcrSbAw4hfbL3O hRGMBpegIl0Zs9kGAWBdhq+ZDNZm1BTJ31nrObLKRXEzn5XDGmqY9XPJXE3Kg2fOzZSa PjXxhrRnBH7pbwsn6r3KfgCLTs8EFSPJrkjP8y8EO3MAN8Zr+MNEj7eklUVJvsGoHlBR 2TZzICCFlz38CAYh8uK60DPorUkqrx7zqdrFzIQFOubeFLnMFFOCeTnlBGH8zJqu8COx ZWXan9y7KR29WRSRwQMfZzEPODun4Y0iRCCldw78NphxXyrD+7qcQ8j2G2SIy3heAq67 9zaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ZQLoJnScQYHm2RPs+Fk63Ge6ub4w9i5ISOAbS6EJr54=; b=fjwRPrKvV2MbIytH42lN7YKIQaFiUTWz3zSkyFYBct9fRzlNllC93KXP4Ykql+6NWw YvvMG/xpTOiaD8PJgz4qtMmMr4GRuXNNc/1PU3rCXoOlIKPdPgOU98RCEVoBHoBn3c7j pFIIyvh1FlHhmOKNDxqiLxLvqWbrJMfQ6QLd5ptC19TBGnuZUcYuEl5LsZW2qDxUWofg mW+nl1wa/YTlDtSqVwZroz1g+4ra/FhBPiQF2ftTHGfnLMaiUnXdxbEiyo7usR7Z8WSo rdKQ+6CBFgg04kWQ85RQWaGRMT/TH9RRERqrZkqv5RRs+HF27+PYuUK9zP6AH4h4905R 8eMw== X-Gm-Message-State: ALyK8tLcPCNivm+A6sj62/RxeJ3OhQOXzxP7Uj/nUwey1pgAoYGgAF4bC1PyiX09weF6HIG8 X-Received: by 10.66.122.175 with SMTP id lt15mr16816393pab.51.1464295056317; Thu, 26 May 2016 13:37:36 -0700 (PDT) From: Thomas Garnier To: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , "Paul E . McKenney" , Pranith Kumar , David Howells , Tejun Heo , Johannes Weiner , David Woodhouse , Thomas Garnier , Petr Mladek , Kees Cook Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, gthelen@google.com, kernel-hardening@lists.openwall.com Date: Thu, 26 May 2016 13:37:11 -0700 Message-Id: <1464295031-26375-3-git-send-email-thgarnie@google.com> X-Mailer: git-send-email 2.8.0.rc3.226.g39d4020 In-Reply-To: <1464295031-26375-1-git-send-email-thgarnie@google.com> References: <1464295031-26375-1-git-send-email-thgarnie@google.com> Subject: [kernel-hardening] [PATCH v1 2/2] mm: SLUB Freelist randomization X-Virus-Scanned: ClamAV using ClamSMTP Implements Freelist randomization for the SLUB allocator. It was previous implemented for the SLAB allocator. Both use the same configuration option (CONFIG_SLAB_FREELIST_RANDOM). The list is randomized during initialization of a new set of pages. The order on different freelist sizes is pre-computed at boot for performance. Each kmem_cache has its own randomized freelist. This security feature reduces the predictability of the kernel SLUB allocator against heap overflows rendering attacks much less stable. For example these attacks exploit the predictability of the heap: - Linux Kernel CAN SLUB overflow (https://goo.gl/oMNWkU) - Exploiting Linux Kernel Heap corruptions (http://goo.gl/EXLn95) Performance results: slab_test impact is between 3% to 4% on average for 100000 attemps wihout smp. It is a very focused testing, kernbench show the overall impact on the system is way lower. Before: Single thread testing ===================== 1. Kmalloc: Repeatedly allocate then free test 100000 times kmalloc(8) -> 49 cycles kfree -> 77 cycles 100000 times kmalloc(16) -> 51 cycles kfree -> 79 cycles 100000 times kmalloc(32) -> 53 cycles kfree -> 83 cycles 100000 times kmalloc(64) -> 62 cycles kfree -> 90 cycles 100000 times kmalloc(128) -> 81 cycles kfree -> 97 cycles 100000 times kmalloc(256) -> 98 cycles kfree -> 121 cycles 100000 times kmalloc(512) -> 95 cycles kfree -> 122 cycles 100000 times kmalloc(1024) -> 96 cycles kfree -> 126 cycles 100000 times kmalloc(2048) -> 115 cycles kfree -> 140 cycles 100000 times kmalloc(4096) -> 149 cycles kfree -> 171 cycles 2. Kmalloc: alloc/free test 100000 times kmalloc(8)/kfree -> 70 cycles 100000 times kmalloc(16)/kfree -> 70 cycles 100000 times kmalloc(32)/kfree -> 70 cycles 100000 times kmalloc(64)/kfree -> 70 cycles 100000 times kmalloc(128)/kfree -> 70 cycles 100000 times kmalloc(256)/kfree -> 69 cycles 100000 times kmalloc(512)/kfree -> 70 cycles 100000 times kmalloc(1024)/kfree -> 73 cycles 100000 times kmalloc(2048)/kfree -> 72 cycles 100000 times kmalloc(4096)/kfree -> 71 cycles After: Single thread testing ===================== 1. Kmalloc: Repeatedly allocate then free test 100000 times kmalloc(8) -> 57 cycles kfree -> 78 cycles 100000 times kmalloc(16) -> 61 cycles kfree -> 81 cycles 100000 times kmalloc(32) -> 76 cycles kfree -> 93 cycles 100000 times kmalloc(64) -> 83 cycles kfree -> 94 cycles 100000 times kmalloc(128) -> 106 cycles kfree -> 107 cycles 100000 times kmalloc(256) -> 118 cycles kfree -> 117 cycles 100000 times kmalloc(512) -> 114 cycles kfree -> 116 cycles 100000 times kmalloc(1024) -> 115 cycles kfree -> 118 cycles 100000 times kmalloc(2048) -> 147 cycles kfree -> 131 cycles 100000 times kmalloc(4096) -> 214 cycles kfree -> 161 cycles 2. Kmalloc: alloc/free test 100000 times kmalloc(8)/kfree -> 66 cycles 100000 times kmalloc(16)/kfree -> 66 cycles 100000 times kmalloc(32)/kfree -> 66 cycles 100000 times kmalloc(64)/kfree -> 66 cycles 100000 times kmalloc(128)/kfree -> 65 cycles 100000 times kmalloc(256)/kfree -> 67 cycles 100000 times kmalloc(512)/kfree -> 67 cycles 100000 times kmalloc(1024)/kfree -> 64 cycles 100000 times kmalloc(2048)/kfree -> 67 cycles 100000 times kmalloc(4096)/kfree -> 67 cycles Kernbench, before: Average Optimal load -j 12 Run (std deviation): Elapsed Time 101.873 (1.16069) User Time 1045.22 (1.60447) System Time 88.969 (0.559195) Percent CPU 1112.9 (13.8279) Context Switches 189140 (2282.15) Sleeps 99008.6 (768.091) After: Average Optimal load -j 12 Run (std deviation): Elapsed Time 102.47 (0.562732) User Time 1045.3 (1.34263) System Time 88.311 (0.342554) Percent CPU 1105.8 (6.49444) Context Switches 189081 (2355.78) Sleeps 99231.5 (800.358) Signed-off-by: Thomas Garnier Reviewed-by: Kees Cook --- Based on next-20160526 --- include/linux/slub_def.h | 5 ++ init/Kconfig | 4 +- mm/slub.c | 133 ++++++++++++++++++++++++++++++++++++++++++++--- 3 files changed, 133 insertions(+), 9 deletions(-) diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h index d1faa01..5624c1f 100644 --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -99,6 +99,11 @@ struct kmem_cache { */ int remote_node_defrag_ratio; #endif + +#ifdef CONFIG_SLAB_FREELIST_RANDOM + unsigned int *random_seq; +#endif + struct kmem_cache_node *node[MAX_NUMNODES]; }; diff --git a/init/Kconfig b/init/Kconfig index 7b82f3f..4590629 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1784,10 +1784,10 @@ endchoice config SLAB_FREELIST_RANDOM default n - depends on SLAB + depends on SLAB || SLUB bool "SLAB freelist randomization" help - Randomizes the freelist order used on creating new SLABs. This + Randomizes the freelist order used on creating new pages. This security feature reduces the predictability of the kernel slab allocator against heap overflows. diff --git a/mm/slub.c b/mm/slub.c index 825ff45..f5b3114 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1405,6 +1405,109 @@ static inline struct page *alloc_slab_page(struct kmem_cache *s, return page; } +#ifdef CONFIG_SLAB_FREELIST_RANDOM +/* Pre-initialize the random sequence cache */ +static int init_cache_random_seq(struct kmem_cache *s) +{ + int err; + unsigned long i, count = oo_objects(s->oo); + + err = cache_random_seq_create(s, count, GFP_KERNEL); + if (err) { + pr_err("SLUB: Unable to initialize free list for %s\n", + s->name); + return err; + } + + /* Transform to an offset on the set of pages */ + if (s->random_seq) { + for (i = 0; i < count; i++) + s->random_seq[i] *= s->size; + } + return 0; +} + +/* Initialize each random sequence freelist per cache */ +static void __init init_freelist_randomization(void) +{ + struct kmem_cache *s; + + mutex_lock(&slab_mutex); + + list_for_each_entry(s, &slab_caches, list) + init_cache_random_seq(s); + + mutex_unlock(&slab_mutex); +} + +/* Get the next entry on the pre-computed freelist randomized */ +static void *next_freelist_entry(struct kmem_cache *s, struct page *page, + unsigned long *pos, void *start, + unsigned long page_limit, + unsigned long freelist_count) +{ + unsigned int idx; + + /* + * If the target page allocation failed, the number of objects on the + * page might be smaller than the usual size defined by the cache. + */ + do { + idx = s->random_seq[*pos]; + *pos += 1; + if (*pos >= freelist_count) + *pos = 0; + } while (unlikely(idx >= page_limit)); + + return (char *)start + idx; +} + +/* Shuffle the single linked freelist based on a random pre-computed sequence */ +static bool shuffle_freelist(struct kmem_cache *s, struct page *page) +{ + void *start; + void *cur; + void *next; + unsigned long idx, pos, page_limit, freelist_count; + + if (page->objects < 2 || !s->random_seq) + return false; + + freelist_count = oo_objects(s->oo); + pos = get_random_int() % freelist_count; + + page_limit = page->objects * s->size; + start = fixup_red_left(s, page_address(page)); + + /* First entry is used as the base of the freelist */ + cur = next_freelist_entry(s, page, &pos, start, page_limit, + freelist_count); + page->freelist = cur; + + for (idx = 1; idx < page->objects; idx++) { + setup_object(s, page, cur); + next = next_freelist_entry(s, page, &pos, start, page_limit, + freelist_count); + set_freepointer(s, cur, next); + cur = next; + } + setup_object(s, page, cur); + set_freepointer(s, cur, NULL); + + return true; +} +#else +static inline int init_cache_random_seq(struct kmem_cache *s) +{ + return 0; +} +static inline void init_freelist_randomization(void) { } +static inline bool shuffle_freelist(struct kmem_cache *s, struct page *page) +{ + return false; +} +#endif /* CONFIG_SLAB_FREELIST_RANDOM */ + static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) { struct page *page; @@ -1412,6 +1515,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) gfp_t alloc_gfp; void *start, *p; int idx, order; + bool shuffle; flags &= gfp_allowed_mask; @@ -1473,15 +1577,19 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) kasan_poison_slab(page); - for_each_object_idx(p, idx, s, start, page->objects) { - setup_object(s, page, p); - if (likely(idx < page->objects)) - set_freepointer(s, p, p + s->size); - else - set_freepointer(s, p, NULL); + shuffle = shuffle_freelist(s, page); + + if (!shuffle) { + for_each_object_idx(p, idx, s, start, page->objects) { + setup_object(s, page, p); + if (likely(idx < page->objects)) + set_freepointer(s, p, p + s->size); + else + set_freepointer(s, p, NULL); + } + page->freelist = fixup_red_left(s, start); } - page->freelist = fixup_red_left(s, start); page->inuse = page->objects; page->frozen = 1; @@ -3207,6 +3315,7 @@ static void free_kmem_cache_nodes(struct kmem_cache *s) void __kmem_cache_release(struct kmem_cache *s) { + cache_random_seq_destroy(s); free_percpu(s->cpu_slab); free_kmem_cache_nodes(s); } @@ -3431,6 +3540,13 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags) #ifdef CONFIG_NUMA s->remote_node_defrag_ratio = 1000; #endif + + /* Initialize the pre-computed randomized freelist if slab is up */ + if (slab_state >= UP) { + if (init_cache_random_seq(s)) + goto error; + } + if (!init_kmem_cache_nodes(s)) goto error; @@ -3947,6 +4063,9 @@ void __init kmem_cache_init(void) setup_kmalloc_cache_index_table(); create_kmalloc_caches(0); + /* Setup random freelists for each cache */ + init_freelist_randomization(); + #ifdef CONFIG_SMP register_cpu_notifier(&slab_notifier); #endif