From patchwork Thu Aug 10 17:26:14 2017
X-Patchwork-Submitter: Thomas Garnier
X-Patchwork-Id: 9894385
From: Thomas Garnier
To: Herbert Xu, "David S. Miller", Thomas Gleixner, Ingo Molnar,
 "H. Peter Anvin", Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
 Thomas Garnier, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
 Paolo Bonzini, Radim Krčmář, Joerg Roedel, Tom Lendacky, Andy Lutomirski,
 Borislav Petkov, Brian Gerst, "Kirill A. Shutemov", "Rafael J. Wysocki",
 Len Brown, Pavel Machek, Tejun Heo, Christoph Lameter, Paul Gortmaker,
 Chris Metcalf, Andrew Morton, "Paul E. McKenney", Nicolas Pitre,
 Christopher Li, "Rafael J. Wysocki", Lukas Wunner, Mika Westerberg,
 Dou Liyang, Daniel Borkmann, Alexei Starovoitov, Masahiro Yamada,
 Markus Trippelsdorf, Steven Rostedt, Kees Cook, Rik van Riel,
 David Howells, Waiman Long, Kyle Huey, Peter Foley, Tim Chen,
 Catalin Marinas, Ard Biesheuvel, Michal Hocko, Matthew Wilcox, "H. J. Lu",
 Paul Bolle, Rob Landley, Baoquan He, Daniel Micay
Cc: x86@kernel.org, linux-crypto@vger.kernel.org,
 linux-kernel@vger.kernel.org, xen-devel@lists.xenproject.org,
 kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-arch@vger.kernel.org,
 linux-sparse@vger.kernel.org, kernel-hardening@lists.openwall.com
Date: Thu, 10 Aug 2017 10:26:14 -0700
Message-Id: <20170810172615.51965-23-thgarnie@google.com>
In-Reply-To: <20170810172615.51965-1-thgarnie@google.com>
References: <20170810172615.51965-1-thgarnie@google.com>
Subject: [kernel-hardening] [RFC v2 22/23] x86/module: Add support for mcmodel large and PLTs

With PIE support and the KASLR extended range, modules may be mapped
further away from the kernel than before, breaking mcmodel=kernel
expectations.

Add an option to build modules with mcmodel=large. The generated module
code then makes no assumptions about its placement in memory. Even with
this option, however, modules still generate relative calls that expect
kernel functions to be within 2G. To solve this issue, the arm64 PLT code
was adapted for x86_64: when a relative relocation goes out of range, a
dynamic PLT entry is used to correctly jump to the destination.
Signed-off-by: Thomas Garnier
---
 arch/x86/Kconfig              |  10 +++
 arch/x86/Makefile             |  10 ++-
 arch/x86/include/asm/module.h |  17 ++++
 arch/x86/kernel/Makefile      |   2 +
 arch/x86/kernel/module-plts.c | 198 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/module.c      |  18 ++--
 arch/x86/kernel/module.lds    |   4 +
 7 files changed, 252 insertions(+), 7 deletions(-)
 create mode 100644 arch/x86/kernel/module-plts.c
 create mode 100644 arch/x86/kernel/module.lds

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a419f4110872..2b69be667543 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2139,6 +2139,16 @@ config X86_PIE
 	select MODULE_REL_CRCS if MODVERSIONS
 	select X86_GLOBAL_STACKPROTECTOR if CC_STACKPROTECTOR
 
+config X86_MODULE_MODEL_LARGE
+	bool
+	depends on X86_64 && X86_PIE
+
+config X86_MODULE_PLTS
+	bool
+	depends on X86_64
+	select X86_MODULE_MODEL_LARGE
+	select HAVE_MOD_ARCH_SPECIFIC
+
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 05e01588b5af..f980991804f7 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -147,10 +147,18 @@ else
         KBUILD_CFLAGS += -mno-red-zone
 ifdef CONFIG_X86_PIE
         KBUILD_CFLAGS += -fPIC
-        KBUILD_CFLAGS_MODULE += -fno-PIC -mcmodel=kernel
+        KBUILD_CFLAGS_MODULE += -fno-PIC
 else
         KBUILD_CFLAGS += -mcmodel=kernel
 endif
+ifdef CONFIG_X86_MODULE_MODEL_LARGE
+        KBUILD_CFLAGS_MODULE += -mcmodel=large
+else
+        KBUILD_CFLAGS_MODULE += -mcmodel=kernel
+endif
+ifdef CONFIG_X86_MODULE_PLTS
+        KBUILD_LDFLAGS_MODULE += -T $(srctree)/arch/x86/kernel/module.lds
+endif
 
         # -funit-at-a-time shrinks the kernel .text considerably
         # unfortunately it makes reading oopses harder.
diff --git a/arch/x86/include/asm/module.h b/arch/x86/include/asm/module.h
index 9eb7c718aaf8..58d079fb2dc9 100644
--- a/arch/x86/include/asm/module.h
+++ b/arch/x86/include/asm/module.h
@@ -4,12 +4,26 @@
 #include <asm-generic/module.h>
 #include <asm/orc_types.h>
 
+#ifdef CONFIG_X86_MODULE_PLTS
+struct mod_plt_sec {
+	struct elf64_shdr	*plt;
+	int			plt_num_entries;
+	int			plt_max_entries;
+};
+#endif
+
+
 struct mod_arch_specific {
 #ifdef CONFIG_ORC_UNWINDER
 	unsigned int num_orcs;
 	int *orc_unwind_ip;
 	struct orc_entry *orc_unwind;
 #endif
+#ifdef CONFIG_X86_MODULE_PLTS
+	struct mod_plt_sec	core;
+	struct mod_plt_sec	init;
+#endif
 };
 
 #ifdef CONFIG_X86_64
@@ -70,4 +84,7 @@ struct mod_arch_specific {
 # define MODULE_ARCH_VERMAGIC MODULE_PROC_FAMILY
 #endif
 
+u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
+			  Elf64_Sym *sym);
+
 #endif /* _ASM_X86_MODULE_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 287eac7d207f..df32768cc576 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -140,4 +140,6 @@ ifeq ($(CONFIG_X86_64),y)
 	obj-$(CONFIG_PCI_MMCONFIG)	+= mmconf-fam10h_64.o
 	obj-y				+= vsmp_64.o
+
+	obj-$(CONFIG_X86_MODULE_PLTS)	+= module-plts.o
 endif
diff --git a/arch/x86/kernel/module-plts.c b/arch/x86/kernel/module-plts.c
new file mode 100644
index 000000000000..bbf11771f424
--- /dev/null
+++ b/arch/x86/kernel/module-plts.c
@@ -0,0 +1,198 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Generate PLT entries for out-of-bound PC-relative relocations. It is required
+ * when a module can be mapped more than 2G away from the kernel.
+ *
+ * Based on the arm64 module-plts implementation.
+ */
+
+#include <linux/elf.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sort.h>
+
+/* jmp QWORD PTR [rip+0xfffffffffffffff2] */
+const u8 jmp_target[] = { 0xFF, 0x25, 0xF2, 0xFF, 0xFF, 0xFF };
+
+struct plt_entry {
+	u64 target;		    /* Holds the target address */
+	u8 jmp[sizeof(jmp_target)]; /* jmp opcode to target */
+};
+
+static bool in_init(const struct module *mod, void *loc)
+{
+	return (u64)loc - (u64)mod->init_layout.base < mod->init_layout.size;
+}
+
+u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
+			  Elf64_Sym *sym)
+{
+	struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
+							  &mod->arch.init;
+	struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
+	int i = pltsec->plt_num_entries;
+	u64 ret;
+
+	/*
+	 *
+	 * jmp QWORD PTR [rip+0xfffffffffffffff2] # Target address
+	 */
+	plt[i].target = sym->st_value;
+	memcpy(plt[i].jmp, jmp_target, sizeof(jmp_target));
+
+	/*
+	 * Check if the entry we just created is a duplicate. Given that the
+	 * relocations are sorted, this will be the last entry we allocated
+	 * (if one exists).
+	 */
+	if (i > 0 && plt[i].target == plt[i - 1].target) {
+		ret = (u64)&plt[i - 1].jmp;
+	} else {
+		pltsec->plt_num_entries++;
+		BUG_ON(pltsec->plt_num_entries > pltsec->plt_max_entries);
+		ret = (u64)&plt[i].jmp;
+	}
+
+	return ret + rela->r_addend;
+}
+
+#define cmp_3way(a, b)	((a) < (b) ? -1 : (a) > (b))
+
+static int cmp_rela(const void *a, const void *b)
+{
+	const Elf64_Rela *x = a, *y = b;
+	int i;
+
+	/* sort by type, symbol index and addend */
+	i = cmp_3way(ELF64_R_TYPE(x->r_info), ELF64_R_TYPE(y->r_info));
+	if (i == 0)
+		i = cmp_3way(ELF64_R_SYM(x->r_info), ELF64_R_SYM(y->r_info));
+	if (i == 0)
+		i = cmp_3way(x->r_addend, y->r_addend);
+	return i;
+}
+
+static bool duplicate_rel(const Elf64_Rela *rela, int num)
+{
+	/*
+	 * Entries are sorted by type, symbol index and addend. That means
+	 * that, if a duplicate entry exists, it must be in the preceding
+	 * slot.
+	 */
+	return num > 0 && cmp_rela(rela + num, rela + num - 1) == 0;
+}
+
+static unsigned int count_plts(Elf64_Sym *syms, Elf64_Rela *rela, int num,
+			       Elf64_Word dstidx)
+{
+	unsigned int ret = 0;
+	Elf64_Sym *s;
+	int i;
+
+	for (i = 0; i < num; i++) {
+		switch (ELF64_R_TYPE(rela[i].r_info)) {
+		case R_X86_64_PC32:
+			/*
+			 * We only have to consider branch targets that resolve
+			 * to symbols that are defined in a different section.
+			 * This is not simply a heuristic, it is a fundamental
+			 * limitation, since there is no guaranteed way to emit
+			 * PLT entries sufficiently close to the branch if the
+			 * section size exceeds the range of a branch
+			 * instruction. So ignore relocations against defined
+			 * symbols if they live in the same section as the
+			 * relocation target.
+			 */
+			s = syms + ELF64_R_SYM(rela[i].r_info);
+			if (s->st_shndx == dstidx)
+				break;
+
+			/*
+			 * Jump relocations with non-zero addends against
+			 * undefined symbols are supported by the ELF spec, but
+			 * do not occur in practice (e.g., 'jump n bytes past
+			 * the entry point of undefined function symbol f').
+			 * So we need to support them, but there is no need to
+			 * take them into consideration when trying to optimize
+			 * this code. So let's only check for duplicates when
+			 * the addend is zero: this allows us to record the PLT
+			 * entry address in the symbol table itself, rather than
+			 * having to search the list for duplicates each time we
+			 * emit one.
+			 */
+			if (rela[i].r_addend != 0 || !duplicate_rel(rela, i))
+				ret++;
+			break;
+		}
+	}
+	return ret;
+}
+
+int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+			      char *secstrings, struct module *mod)
+{
+	unsigned long core_plts = 0;
+	unsigned long init_plts = 0;
+	Elf64_Sym *syms = NULL;
+	int i;
+
+	/*
+	 * Find the empty .plt section so we can expand it to store the PLT
+	 * entries. Record the symtab address as well.
+	 */
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
+			mod->arch.core.plt = sechdrs + i;
+		else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt"))
+			mod->arch.init.plt = sechdrs + i;
+		else if (sechdrs[i].sh_type == SHT_SYMTAB)
+			syms = (Elf64_Sym *)sechdrs[i].sh_addr;
+	}
+
+	if (!mod->arch.core.plt || !mod->arch.init.plt) {
+		pr_err("%s: module PLT section(s) missing\n", mod->name);
+		return -ENOEXEC;
+	}
+	if (!syms) {
+		pr_err("%s: module symtab section missing\n", mod->name);
+		return -ENOEXEC;
+	}
+
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		Elf64_Rela *rels = (void *)ehdr + sechdrs[i].sh_offset;
+		int numrels = sechdrs[i].sh_size / sizeof(Elf64_Rela);
+		Elf64_Shdr *dstsec = sechdrs + sechdrs[i].sh_info;
+
+		if (sechdrs[i].sh_type != SHT_RELA)
+			continue;
+
+		/* sort by type, symbol index and addend */
+		sort(rels, numrels, sizeof(Elf64_Rela), cmp_rela, NULL);
+
+		if (strncmp(secstrings + dstsec->sh_name, ".init", 5) != 0)
+			core_plts += count_plts(syms, rels, numrels,
+						sechdrs[i].sh_info);
+		else
+			init_plts += count_plts(syms, rels, numrels,
+						sechdrs[i].sh_info);
+	}
+
+	mod->arch.core.plt->sh_type = SHT_NOBITS;
+	mod->arch.core.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	mod->arch.core.plt->sh_addralign = L1_CACHE_BYTES;
+	mod->arch.core.plt->sh_size = (core_plts + 1) * sizeof(struct plt_entry);
+	mod->arch.core.plt_num_entries = 0;
+	mod->arch.core.plt_max_entries = core_plts;
+
+	mod->arch.init.plt->sh_type = SHT_NOBITS;
+	mod->arch.init.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	mod->arch.init.plt->sh_addralign = L1_CACHE_BYTES;
+	mod->arch.init.plt->sh_size = (init_plts + 1) * sizeof(struct plt_entry);
+	mod->arch.init.plt_num_entries = 0;
+	mod->arch.init.plt_max_entries = init_plts;
+
+	return 0;
+}
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index 62e7d70aadd5..061270a972a5 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -187,10 +187,15 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 		case R_X86_64_PC32:
 			val -= (u64)loc;
 			*(u32 *)loc = val;
-#if 0
-			if ((s64)val != *(s32 *)loc)
-				goto overflow;
-#endif
+			if (IS_ENABLED(CONFIG_X86_MODULE_MODEL_LARGE) &&
+			    (s64)val != *(s32 *)loc) {
+				val = module_emit_plt_entry(me, loc, &rel[i],
+							    sym);
+				val -= (u64)loc;
+				*(u32 *)loc = val;
+				if ((s64)val != *(s32 *)loc)
+					goto overflow;
+			}
 			break;
 		default:
 			pr_err("%s: Unknown rela relocation: %llu\n",
@@ -203,8 +208,9 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 overflow:
 	pr_err("overflow in relocation type %d val %Lx\n",
 	       (int)ELF64_R_TYPE(rel[i].r_info), val);
-	pr_err("`%s' likely not compiled with -mcmodel=kernel\n",
-	       me->name);
+	pr_err("`%s' likely not compiled with -mcmodel=%s\n",
+	       me->name,
+	       IS_ENABLED(CONFIG_X86_MODULE_MODEL_LARGE) ? "large" : "kernel");
 	return -ENOEXEC;
 }
 #endif
diff --git a/arch/x86/kernel/module.lds b/arch/x86/kernel/module.lds
new file mode 100644
index 000000000000..f7c9781a9d48
--- /dev/null
+++ b/arch/x86/kernel/module.lds
@@ -0,0 +1,4 @@
+SECTIONS {
+	.plt (NOLOAD) : { BYTE(0) }
+	.init.plt (NOLOAD) : { BYTE(0) }
+}