From patchwork Tue Mar 13 20:59:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Garnier X-Patchwork-Id: 10280787 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 231CA6038F for ; Tue, 13 Mar 2018 21:04:55 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1267321327 for ; Tue, 13 Mar 2018 21:04:55 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 051C6284CE; Tue, 13 Mar 2018 21:04:55 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 5E8DC21327 for ; Tue, 13 Mar 2018 21:04:53 +0000 (UTC) Received: (qmail 21662 invoked by uid 550); 13 Mar 2018 21:01:04 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 20477 invoked from network); 13 Mar 2018 21:00:56 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=2R3+b5kQv464Cvv6Ut0q6JxMSxeora4f0vAEmykSms0=; b=v/I8n5DNFzwnFA0cu1vbbUIkd5RHAXsfBqUXijSiKv/pGKZdsbcUj4chKvc4BnNinT JTu3lAuRwTakWrBIycbwNFcjcFGtJnjjSJa8d3OHMLz6hDzaY4w7itxFHd+ER0YZ/dcK 1WvLv3/JmKkV0pBGxrMFw5kOQcfZc22V9VBAtOXyVKAcaj8JJ5KWIs5CdVxGJqxxJOcQ y8LdqfEnrB0kLcXzPGy0vEyd1PjFa+zAEL5baQE7z41TcT71xRhXhfZemIKWVFUZ4PBZ ZNksZX09yFJeiDgHkPdzEgwxQPGF3afv6vg2kImVlt6ALxhbqG3frFb5PFKGoDoYzUNN StVA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=2R3+b5kQv464Cvv6Ut0q6JxMSxeora4f0vAEmykSms0=; b=J4CrtBJBODfC1OBnr/PYdLP2zYg2lhvlMwCag4F7OMG+o6dlscxjcqg22MZ0fQU8NG YCbr0w+SxRqn/036ivyREdgwmEiogOpaXbScJ0lwD2W4IkTXE0aDBPmd6vvCwAN9z2Ku zLGl4+ncWzpAGYH5XAp5/rigKe3mx8wptvX6II7chs4fyXRY3WORPoFfKtHSI4imKUgB Rp/IBAqQVSVNUe8wmxuVb7MEUJKVlKQbsM0sE7iq+rJmvHUAY54bFGPH2emEgSVHl4zz PaNpoJbyQ6nP2s9wndBPnlNUFvj4zp65jvJFHrWB94tifhL50Qf6U/ZcyqwBPgiSRSiu f1gQ== X-Gm-Message-State: AElRT7EpPVg636oUAaQAB/nnjWQGm7oNjORCj+4kuvuYNt18pzHDL6oL Q6XpuZAa2uE1G9CwHRFvYDLWZg== X-Google-Smtp-Source: AG47ELvURsd1I4ov3EEA2/3HCS2l3aUZt8T1NKHcr3zqBgZsO6WbsluZG338hVXgoenaJ16v4Yetpg== X-Received: by 2002:a17:902:158b:: with SMTP id m11-v6mr1230126pla.300.1520974843557; Tue, 13 Mar 2018 14:00:43 -0700 (PDT) From: Thomas Garnier To: Herbert Xu , "David S . Miller" , Thomas Gleixner , Ingo Molnar , "H . Peter Anvin" , Peter Zijlstra , Josh Poimboeuf , Greg Kroah-Hartman , Kate Stewart , Thomas Garnier , Arnd Bergmann , Philippe Ombredanne , Arnaldo Carvalho de Melo , Andrey Ryabinin , Matthias Kaehlcke , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Andy Lutomirski , Dominik Brodowski , Borislav Petkov , Borislav Petkov , "Rafael J . Wysocki" , Len Brown , Pavel Machek , Juergen Gross , Alok Kataria , Steven Rostedt , Tejun Heo , Christoph Lameter , Dennis Zhou , Boris Ostrovsky , David Woodhouse , Alexey Dobriyan , "Paul E . McKenney" , Andrew Morton , Nicolas Pitre , Randy Dunlap , "Luis R . Rodriguez" , Christopher Li , Jason Baron , Ashish Kalra , Kyle McMartin , Dou Liyang , Lukas Wunner , Petr Mladek , Sergey Senozhatsky , Masahiro Yamada , Ingo Molnar , Nicholas Piggin , Cao jin , "H . J . Lu" , Paolo Bonzini , =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= , Joerg Roedel , Dave Hansen , Rik van Riel , Jia Zhang , Jiri Slaby , Kyle Huey , Jonathan Corbet , Matthew Wilcox , Michal Hocko , Rob Landley , Baoquan He , Daniel Micay , =?UTF-8?q?Jan=20H=20=2E=20Sch=C3=B6nherr?= Cc: x86@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, virtualization@lists.linux-foundation.org, xen-devel@lists.xenproject.org, linux-arch@vger.kernel.org, linux-sparse@vger.kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com Subject: [PATCH v2 14/27] x86/percpu: Adapt percpu for PIE support Date: Tue, 13 Mar 2018 13:59:32 -0700 Message-Id: <20180313205945.245105-15-thgarnie@google.com> X-Mailer: git-send-email 2.16.2.660.g709887971b-goog In-Reply-To: <20180313205945.245105-1-thgarnie@google.com> References: <20180313205945.245105-1-thgarnie@google.com> X-Virus-Scanned: ClamAV using ClamSMTP Perpcu uses a clever design where the .percu ELF section has a virtual address of zero and the relocation code avoid relocating specific symbols. It makes the code simple and easily adaptable with or without SMP support. This design is incompatible with PIE because generated code always try to access the zero virtual address relative to the default mapping address. It becomes impossible when KASLR is configured to go below -2G. This patch solves this problem by removing the zero mapping and adapting the GS base to be relative to the expected address. These changes are done only when PIE is enabled. The original implementation is kept as-is by default. The assembly and PER_CPU macros are changed to use relative references when PIE is enabled. The KALLSYMS_ABSOLUTE_PERCPU configuration is disabled with PIE given percpu symbols are not absolute in this case. Position Independent Executable (PIE) support will allow to extended the KASLR randomization range below the -2G memory limit. Signed-off-by: Thomas Garnier --- arch/x86/entry/calling.h | 2 +- arch/x86/entry/entry_64.S | 4 ++-- arch/x86/include/asm/percpu.h | 25 +++++++++++++++++++------ arch/x86/kernel/cpu/common.c | 4 +++- arch/x86/kernel/head_64.S | 4 ++++ arch/x86/kernel/setup_percpu.c | 2 +- arch/x86/kernel/vmlinux.lds.S | 13 +++++++++++-- arch/x86/lib/cmpxchg16b_emu.S | 8 ++++---- arch/x86/xen/xen-asm.S | 12 ++++++------ init/Kconfig | 2 +- 10 files changed, 52 insertions(+), 24 deletions(-) diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h index be63330c5511..7b48d97111b5 100644 --- a/arch/x86/entry/calling.h +++ b/arch/x86/entry/calling.h @@ -216,7 +216,7 @@ For 32-bit we have the following conventions - kernel is built with .endm #define THIS_CPU_user_pcid_flush_mask \ - PER_CPU_VAR(cpu_tlbstate) + TLB_STATE_user_pcid_flush_mask + PER_CPU_VAR(cpu_tlbstate + TLB_STATE_user_pcid_flush_mask) .macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index c53123468364..d34794368e20 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -358,7 +358,7 @@ ENTRY(__switch_to_asm) #ifdef CONFIG_CC_STACKPROTECTOR movq TASK_stack_canary(%rsi), %rbx - movq %rbx, PER_CPU_VAR(irq_stack_union)+stack_canary_offset + movq %rbx, PER_CPU_VAR(irq_stack_union + stack_canary_offset) #endif #ifdef CONFIG_RETPOLINE @@ -896,7 +896,7 @@ apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt /* * Exception entry points. */ -#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw) + (TSS_ist + ((x) - 1) * 8) +#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw + (TSS_ist + ((x) - 1) * 8)) .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1 ENTRY(\sym) diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h index a06b07399d17..7d1271b536ea 100644 --- a/arch/x86/include/asm/percpu.h +++ b/arch/x86/include/asm/percpu.h @@ -5,9 +5,11 @@ #ifdef CONFIG_X86_64 #define __percpu_seg gs #define __percpu_mov_op movq +#define __percpu_rel (%rip) #else #define __percpu_seg fs #define __percpu_mov_op movl +#define __percpu_rel #endif #ifdef __ASSEMBLY__ @@ -28,10 +30,14 @@ #define PER_CPU(var, reg) \ __percpu_mov_op %__percpu_seg:this_cpu_off, reg; \ lea var(reg), reg -#define PER_CPU_VAR(var) %__percpu_seg:var +/* Compatible with Position Independent Code */ +#define PER_CPU_VAR(var) %__percpu_seg:(var)##__percpu_rel +/* Rare absolute reference */ +#define PER_CPU_VAR_ABS(var) %__percpu_seg:var #else /* ! SMP */ #define PER_CPU(var, reg) __percpu_mov_op $var, reg -#define PER_CPU_VAR(var) var +#define PER_CPU_VAR(var) (var)##__percpu_rel +#define PER_CPU_VAR_ABS(var) var #endif /* SMP */ #ifdef CONFIG_X86_64_SMP @@ -209,27 +215,34 @@ do { \ pfo_ret__; \ }) +/* Position Independent code uses relative addresses only */ +#ifdef CONFIG_X86_PIE +#define __percpu_stable_arg __percpu_arg(a1) +#else +#define __percpu_stable_arg __percpu_arg(P1) +#endif + #define percpu_stable_op(op, var) \ ({ \ typeof(var) pfo_ret__; \ switch (sizeof(var)) { \ case 1: \ - asm(op "b "__percpu_arg(P1)",%0" \ + asm(op "b "__percpu_stable_arg ",%0" \ : "=q" (pfo_ret__) \ : "p" (&(var))); \ break; \ case 2: \ - asm(op "w "__percpu_arg(P1)",%0" \ + asm(op "w "__percpu_stable_arg ",%0" \ : "=r" (pfo_ret__) \ : "p" (&(var))); \ break; \ case 4: \ - asm(op "l "__percpu_arg(P1)",%0" \ + asm(op "l "__percpu_stable_arg ",%0" \ : "=r" (pfo_ret__) \ : "p" (&(var))); \ break; \ case 8: \ - asm(op "q "__percpu_arg(P1)",%0" \ + asm(op "q "__percpu_stable_arg ",%0" \ : "=r" (pfo_ret__) \ : "p" (&(var))); \ break; \ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 348cf4821240..0e9a87a34f76 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -487,7 +487,9 @@ void load_percpu_segment(int cpu) loadsegment(fs, __KERNEL_PERCPU); #else __loadsegment_simple(gs, 0); - wrmsrl(MSR_GS_BASE, (unsigned long)per_cpu(irq_stack_union.gs_base, cpu)); + wrmsrl(MSR_GS_BASE, + (unsigned long)per_cpu(irq_stack_union.gs_base, cpu) - + (unsigned long)__per_cpu_start); #endif load_stack_canary_segment(); } diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 48652f3ec46a..87624e1fe22f 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -266,7 +266,11 @@ ENDPROC(start_cpu0) GLOBAL(initial_code) .quad x86_64_start_kernel GLOBAL(initial_gs) +#ifdef CONFIG_X86_PIE + .quad 0 +#else .quad INIT_PER_CPU_VAR(irq_stack_union) +#endif GLOBAL(initial_stack) /* * The SIZEOF_PTREGS gap is a convention which helps the in-kernel diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c index ea554f812ee1..f7655979f19a 100644 --- a/arch/x86/kernel/setup_percpu.c +++ b/arch/x86/kernel/setup_percpu.c @@ -26,7 +26,7 @@ DEFINE_PER_CPU_READ_MOSTLY(int, cpu_number); EXPORT_PER_CPU_SYMBOL(cpu_number); -#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_X86_PIE) #define BOOT_PERCPU_OFFSET ((unsigned long)__per_cpu_load) #else #define BOOT_PERCPU_OFFSET 0 diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S index 1c43a2e839fa..4a5957900dcc 100644 --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -209,9 +209,14 @@ SECTIONS /* * percpu offsets are zero-based on SMP. PERCPU_VADDR() changes the * output PHDR, so the next output section - .init.text - should - * start another segment - init. + * start another segment - init. For Position Independent Code, the + * per-cpu section cannot be zero-based because everything is relative. */ +#ifdef CONFIG_X86_PIE + PERCPU_SECTION(INTERNODE_CACHE_BYTES) +#else PERCPU_VADDR(INTERNODE_CACHE_BYTES, 0, :percpu) +#endif ASSERT(SIZEOF(.data..percpu) < CONFIG_PHYSICAL_START, "per-CPU data too large - increase CONFIG_PHYSICAL_START") #endif @@ -387,7 +392,11 @@ SECTIONS * Per-cpu symbols which need to be offset from __per_cpu_load * for the boot processor. */ +#ifdef CONFIG_X86_PIE +#define INIT_PER_CPU(x) init_per_cpu__##x = x +#else #define INIT_PER_CPU(x) init_per_cpu__##x = x + __per_cpu_load +#endif INIT_PER_CPU(gdt_page); INIT_PER_CPU(irq_stack_union); @@ -397,7 +406,7 @@ INIT_PER_CPU(irq_stack_union); . = ASSERT((_end - _text <= KERNEL_IMAGE_SIZE), "kernel image bigger than KERNEL_IMAGE_SIZE"); -#ifdef CONFIG_SMP +#if defined(CONFIG_SMP) && !defined(CONFIG_X86_PIE) . = ASSERT((irq_stack_union == 0), "irq_stack_union is not at start of per-cpu area"); #endif diff --git a/arch/x86/lib/cmpxchg16b_emu.S b/arch/x86/lib/cmpxchg16b_emu.S index 9b330242e740..254950604ae4 100644 --- a/arch/x86/lib/cmpxchg16b_emu.S +++ b/arch/x86/lib/cmpxchg16b_emu.S @@ -33,13 +33,13 @@ ENTRY(this_cpu_cmpxchg16b_emu) pushfq cli - cmpq PER_CPU_VAR((%rsi)), %rax + cmpq PER_CPU_VAR_ABS((%rsi)), %rax jne .Lnot_same - cmpq PER_CPU_VAR(8(%rsi)), %rdx + cmpq PER_CPU_VAR_ABS(8(%rsi)), %rdx jne .Lnot_same - movq %rbx, PER_CPU_VAR((%rsi)) - movq %rcx, PER_CPU_VAR(8(%rsi)) + movq %rbx, PER_CPU_VAR_ABS((%rsi)) + movq %rcx, PER_CPU_VAR_ABS(8(%rsi)) popfq mov $1, %al diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S index 8019edd0125c..a5d73d3218be 100644 --- a/arch/x86/xen/xen-asm.S +++ b/arch/x86/xen/xen-asm.S @@ -21,7 +21,7 @@ ENTRY(xen_irq_enable_direct) FRAME_BEGIN /* Unmask events */ - movb $0, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask + movb $0, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask) /* * Preempt here doesn't matter because that will deal with any @@ -30,7 +30,7 @@ ENTRY(xen_irq_enable_direct) */ /* Test for pending */ - testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_pending + testb $0xff, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_pending) jz 1f call check_events @@ -45,7 +45,7 @@ ENTRY(xen_irq_enable_direct) * non-zero. */ ENTRY(xen_irq_disable_direct) - movb $1, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask + movb $1, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask) ret ENDPROC(xen_irq_disable_direct) @@ -59,7 +59,7 @@ ENDPROC(xen_irq_disable_direct) * x86 use opposite senses (mask vs enable). */ ENTRY(xen_save_fl_direct) - testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask + testb $0xff, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask) setz %ah addb %ah, %ah ret @@ -80,7 +80,7 @@ ENTRY(xen_restore_fl_direct) #else testb $X86_EFLAGS_IF>>8, %ah #endif - setz PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask + setz PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask) /* * Preempt here doesn't matter because that will deal with any * pending interrupts. The pending check may end up being run @@ -88,7 +88,7 @@ ENTRY(xen_restore_fl_direct) */ /* check for unmasked and pending */ - cmpw $0x0001, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_pending + cmpw $0x0001, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_pending) jnz 1f call check_events 1: diff --git a/init/Kconfig b/init/Kconfig index 221cac95044f..acc9087546ac 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1365,7 +1365,7 @@ config KALLSYMS_ALL config KALLSYMS_ABSOLUTE_PERCPU bool depends on KALLSYMS - default X86_64 && SMP + default X86_64 && SMP && !X86_PIE config KALLSYMS_BASE_RELATIVE bool