From patchwork Tue Mar 13 20:59:45 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Garnier X-Patchwork-Id: 10280825 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 4C2616038F for ; Tue, 13 Mar 2018 21:07:18 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3B48E284D1 for ; Tue, 13 Mar 2018 21:07:18 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2F098284E4; Tue, 13 Mar 2018 21:07:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id AA841284D2 for ; Tue, 13 Mar 2018 21:07:16 +0000 (UTC) Received: (qmail 22432 invoked by uid 550); 13 Mar 2018 21:01:43 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 22191 invoked from network); 13 Mar 2018 21:01:29 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=BcWZe0Rr5RwfvH1hVbfKDU5JvGsd0U9Mi/STRcOvWlY=; b=s6dvV/O6UeFJ50Wn1vjoveJxDLEBbkcOw8ZTnmncQm16Gy0sjnGGAGWmAA+Krsc2Ut 6UWD/B/EVbLx5RzODq1ESM9MWTFQHzKdhG+3LzYr4LyqarRAmo1qV8z1D0PAuEfxcuuc XHlL1aoAeekXadYqrUu9yC1lNuHVkL9J42VtYbGlJCRzOawY5ZJ1CrcASSJKjsIW96Sq GBsiTiGBjLOIioQBlJaU5dvFNUbZEnmOeY0OgIpTZJJxEk79ehINHtgFcwCoXxWBZ3Lo iBpU5UVmWCrgwUH6fRI+x9PK/TNKCCAAGwrHju57C+/4o8HphiwkMyEmb21NurSUQfyS Q6xQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=BcWZe0Rr5RwfvH1hVbfKDU5JvGsd0U9Mi/STRcOvWlY=; b=gBimzc/m6iMsWW73zR+2GBoZmZG/8wqcN6QSaEeoYiJhQRxZf5UJixTwof+CeXk0Xp eieOPilFsaaIrTxp2myZjozB9HaZ36Bujf4WC/bOHShvrV1mO9qByP55wbMoJqneKBvO aOmcMMNSQm5loW0FhsQGY30/s1Fz/wnks2qs41nelJYQcTEweYv5rVRl0bJeV5LKIApZ uRvy8I8UiboZlMV03umuO6vi6C3D3852UPmxatV6k3/QZon2F934bHmUiM6gfe9sQWXy 6HUKE5qWnYTYEIIQg/ZkW/u2dLAyZxX+nNMeJtID/WxqbY6USUwddasO9OaiFRdZO2dG q96Q== X-Gm-Message-State: AElRT7HvxXCH4J18Qz7uPO2UEbYaes1ytE5mraCjfRWKOvQ9xolVukEq IDp2U4X+k88ghdU62RgnO+k1ng== X-Google-Smtp-Source: AG47ELvd1M3rWm2xbTC+LIjuegixv182PSVn85xHMg/FZ1wYEm7uQ4rItY7n6TSSz+9mJ6RZ7dAj8Q== X-Received: by 10.101.69.205 with SMTP id m13mr1602699pgr.323.1520974876629; Tue, 13 Mar 2018 14:01:16 -0700 (PDT) From: Thomas Garnier To: Herbert Xu , "David S . Miller" , Thomas Gleixner , Ingo Molnar , "H . Peter Anvin" , Peter Zijlstra , Josh Poimboeuf , Greg Kroah-Hartman , Kate Stewart , Thomas Garnier , Arnd Bergmann , Philippe Ombredanne , Arnaldo Carvalho de Melo , Andrey Ryabinin , Matthias Kaehlcke , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Andy Lutomirski , Dominik Brodowski , Borislav Petkov , Borislav Petkov , "Rafael J . Wysocki" , Len Brown , Pavel Machek , Juergen Gross , Alok Kataria , Steven Rostedt , Tejun Heo , Christoph Lameter , Dennis Zhou , Boris Ostrovsky , David Woodhouse , Alexey Dobriyan , "Paul E . McKenney" , Andrew Morton , Nicolas Pitre , Randy Dunlap , "Luis R . Rodriguez" , Christopher Li , Jason Baron , Ashish Kalra , Kyle McMartin , Dou Liyang , Lukas Wunner , Petr Mladek , Sergey Senozhatsky , Masahiro Yamada , Ingo Molnar , Nicholas Piggin , Cao jin , "H . J . Lu" , Paolo Bonzini , =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= , Joerg Roedel , Dave Hansen , Rik van Riel , Jia Zhang , Jiri Slaby , Kyle Huey , Jonathan Corbet , Matthew Wilcox , Michal Hocko , Rob Landley , Baoquan He , Daniel Micay , =?UTF-8?q?Jan=20H=20=2E=20Sch=C3=B6nherr?= Cc: x86@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, virtualization@lists.linux-foundation.org, xen-devel@lists.xenproject.org, linux-arch@vger.kernel.org, linux-sparse@vger.kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com Subject: [PATCH v2 27/27] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB Date: Tue, 13 Mar 2018 13:59:45 -0700 Message-Id: <20180313205945.245105-28-thgarnie@google.com> X-Mailer: git-send-email 2.16.2.660.g709887971b-goog In-Reply-To: <20180313205945.245105-1-thgarnie@google.com> References: <20180313205945.245105-1-thgarnie@google.com> X-Virus-Scanned: ClamAV using ClamSMTP Add a new CONFIG_RANDOMIZE_BASE_LARGE option to benefit from PIE support. It increases the KASLR range from 1GB to 3GB. The new range stars at 0xffffffff00000000 just above the EFI memory region. This option is off by default. The boot code is adapted to create the appropriate page table spanning three PUD pages. The relocation table uses 64-bit integers generated with the updated relocation tool with the large-reloc option. Signed-off-by: Thomas Garnier --- arch/x86/Kconfig | 21 +++++++++++++++++++++ arch/x86/boot/compressed/Makefile | 5 +++++ arch/x86/boot/compressed/misc.c | 10 +++++++++- arch/x86/include/asm/page_64_types.h | 9 +++++++++ arch/x86/kernel/head64.c | 15 ++++++++++++--- arch/x86/kernel/head_64.S | 11 ++++++++++- 6 files changed, 66 insertions(+), 5 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 4b1615e661d6..7ea69cb0153f 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2260,6 +2260,27 @@ config X86_PIE select DYNAMIC_MODULE_BASE select MODULE_REL_CRCS if MODVERSIONS +config RANDOMIZE_BASE_LARGE + bool "Increase the randomization range of the kernel image" + depends on X86_64 && RANDOMIZE_BASE + select X86_PIE + select X86_MODULE_PLTS if MODULES + default n + ---help--- + Build the kernel as a Position Independent Executable (PIE) and + increase the available randomization range from 1GB to 3GB. + + This option impacts performance on kernel CPU intensive workloads up + to 10% due to PIE generated code. Impact on user-mode processes and + typical usage would be significantly less (0.50% when you build the + kernel). + + The kernel and modules will generate slightly more assembly (1 to 2% + increase on the .text sections). The vmlinux binary will be + significantly smaller due to less relocations. + + If unsure say N + config HOTPLUG_CPU bool "Support for hot-pluggable CPUs" depends on SMP diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index 1f734cd98fd3..fb72f53defd0 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -116,7 +116,12 @@ $(obj)/vmlinux.bin: vmlinux FORCE targets += $(patsubst $(obj)/%,%,$(vmlinux-objs-y)) vmlinux.bin.all vmlinux.relocs +# Large randomization require bigger relocation table +ifeq ($(CONFIG_RANDOMIZE_BASE_LARGE),y) +CMD_RELOCS = arch/x86/tools/relocs --large-reloc +else CMD_RELOCS = arch/x86/tools/relocs +endif quiet_cmd_relocs = RELOCS $@ cmd_relocs = $(CMD_RELOCS) $< > $@;$(CMD_RELOCS) --abs-relocs $< $(obj)/vmlinux.relocs: vmlinux FORCE diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index b50c42455e25..746a968690d5 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -170,10 +170,18 @@ void __puthex(unsigned long value) } #if CONFIG_X86_NEED_RELOCS + +/* Large randomization go lower than -2G and use large relocation table */ +#ifdef CONFIG_RANDOMIZE_BASE_LARGE +typedef long rel_t; +#else +typedef int rel_t; +#endif + static void handle_relocations(void *output, unsigned long output_len, unsigned long virt_addr) { - int *reloc; + rel_t *reloc; unsigned long delta, map, ptr; unsigned long min_addr = (unsigned long)output; unsigned long max_addr = min_addr + (VO___bss_start - VO__text); diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h index 2c5a966dc222..85ea681421d2 100644 --- a/arch/x86/include/asm/page_64_types.h +++ b/arch/x86/include/asm/page_64_types.h @@ -46,7 +46,11 @@ #define __PAGE_OFFSET __PAGE_OFFSET_BASE_L4 #endif /* CONFIG_DYNAMIC_MEMORY_LAYOUT */ +#ifdef CONFIG_RANDOMIZE_BASE_LARGE +#define __START_KERNEL_map _AC(0xffffffff00000000, UL) +#else #define __START_KERNEL_map _AC(0xffffffff80000000, UL) +#endif /* CONFIG_RANDOMIZE_BASE_LARGE */ /* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */ @@ -64,9 +68,14 @@ * 512MiB by default, leaving 1.5GiB for modules once the page tables * are fully set up. If kernel ASLR is configured, it can extend the * kernel page table mapping, reducing the size of the modules area. + * On PIE, we relocate the binary 2G lower so add this extra space. */ #if defined(CONFIG_RANDOMIZE_BASE) +#ifdef CONFIG_RANDOMIZE_BASE_LARGE +#define KERNEL_IMAGE_SIZE (_AC(3, UL) * 1024 * 1024 * 1024) +#else #define KERNEL_IMAGE_SIZE (1024 * 1024 * 1024) +#endif #else #define KERNEL_IMAGE_SIZE (512 * 1024 * 1024) #endif diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c index ea4c498369d8..577b47381ba2 100644 --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -63,6 +63,7 @@ EXPORT_SYMBOL(vmemmap_base); #endif #define __head __section(.head.text) +#define pud_count(x) (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT) /* Required for read_cr3 when building as PIE */ unsigned long __force_order; @@ -112,6 +113,8 @@ unsigned long __head __startup_64(unsigned long physaddr, { unsigned long load_delta, *p; unsigned long pgtable_flags; + unsigned long level3_kernel_start, level3_kernel_count; + unsigned long level3_fixmap_start; pgdval_t *pgd; p4dval_t *p4d; pudval_t *pud; @@ -142,6 +145,11 @@ unsigned long __head __startup_64(unsigned long physaddr, /* Include the SME encryption mask in the fixup value */ load_delta += sme_get_me_mask(); + /* Look at the randomization spread to adapt page table used */ + level3_kernel_start = pud_index(__START_KERNEL_map); + level3_kernel_count = pud_count(KERNEL_IMAGE_SIZE); + level3_fixmap_start = level3_kernel_start + level3_kernel_count; + /* Fixup the physical addresses in the page table */ pgd = fixup_pointer(&early_top_pgt, physaddr); @@ -158,8 +166,9 @@ unsigned long __head __startup_64(unsigned long physaddr, } pud = fixup_pointer(&level3_kernel_pgt, physaddr); - pud[510] += load_delta; - pud[511] += load_delta; + for (i = 0; i < level3_kernel_count; i++) + pud[level3_kernel_start + i] += load_delta; + pud[level3_fixmap_start] += load_delta; pmd = fixup_pointer(level2_fixmap_pgt, physaddr); pmd[506] += load_delta; @@ -214,7 +223,7 @@ unsigned long __head __startup_64(unsigned long physaddr, */ pmd = fixup_pointer(level2_kernel_pgt, physaddr); - for (i = 0; i < PTRS_PER_PMD; i++) { + for (i = 0; i < PTRS_PER_PMD * level3_kernel_count; i++) { if (pmd[i] & _PAGE_PRESENT) pmd[i] += load_delta; } diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 87624e1fe22f..c86d9262015f 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -41,12 +41,16 @@ #define l4_index(x) (((x) >> 39) & 511) #define pud_index(x) (((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1)) +#define pud_count(x) (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT) L4_PAGE_OFFSET = l4_index(__PAGE_OFFSET_BASE_L4) L4_START_KERNEL = l4_index(__START_KERNEL_map) L3_START_KERNEL = pud_index(__START_KERNEL_map) +/* Adapt page table L3 space based on range of randomization */ +L3_KERNEL_ENTRY_COUNT = pud_count(KERNEL_IMAGE_SIZE) + .text __HEAD .code64 @@ -436,7 +440,12 @@ NEXT_PAGE(level4_kernel_pgt) NEXT_PAGE(level3_kernel_pgt) .fill L3_START_KERNEL,8,0 /* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */ - .quad level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC + i = 0 + .rept L3_KERNEL_ENTRY_COUNT + .quad level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC \ + + PAGE_SIZE*i + i = i + 1 + .endr .quad level2_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC NEXT_PAGE(level2_kernel_pgt)