From patchwork Fri Sep 22 17:41:56 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 9966737 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 9C8B16035E for ; Fri, 22 Sep 2017 17:46:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8BED5299A3 for ; Fri, 22 Sep 2017 17:46:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 80976299A8; Fri, 22 Sep 2017 17:46:08 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_MED, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 40FC9299A6 for ; Fri, 22 Sep 2017 17:46:06 +0000 (UTC) Received: (qmail 25606 invoked by uid 550); 22 Sep 2017 17:45:06 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 24471 invoked from network); 22 Sep 2017 17:45:05 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=+y01bk3+eO9zYZZZ7arFvmDIhq28OoMQ/MI8jffZycU=; b=AP92AsXEfT4BJMk+Pk0H9Lvtjk7yfUTnNS95a0Zw2H1thpHsR4ZmUZjr3PnAhVcf98 ZZy7qULgrdg2Vbqgjk+AEwdxkAagfTKrBgXkycCJFQK1yyhmtTrQDAV+4NB48Hn7PJY8 gt32weLh+rKLaiVM3GJJymByxAL1ZttNssrkw5D5EoNTApfC9H/96VSs4AW15CWLmgGz uauTECi29O0wKukeOWiXBcZ+JBg1U28CbcAr+fLYBvfa6ysMpZzMzDSPz4pisCjR6MxL rx2HerCZ+4ndpJdlc4GjlYvfyVQNqUKg8QiF5h89rrz0cxnCboTWMFPSo3iXAnrDk9AT nLKw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=+y01bk3+eO9zYZZZ7arFvmDIhq28OoMQ/MI8jffZycU=; b=MZMezsgkUNlQuucoGoi4Fj7nrKUzw0EBrSXEbE+cky6O+eM78BWeGaOejfDpXqCHjO YUt/FCcwi5Zc1dZVUN8B1pZ7sq8C1XwKkaulkOEyeneuU477vv0hwl0blhLFAlXv4bMW JLS2EQvnTVFVTTJ0NV2BJI0T3JAiK4ZH360zY8UG/bxOy7pwHk/zkVz+f0GdqUAUQDBe z4YyHgqwaDJhLvewSHfiAGojG8X6e5HkcFhnYIpHyd3X+/zJSOgHbkd23a2ysfGf+Gqg CYoBwjeEu+njrY7odLYJaZ/CzeQrLB9uo7VJKXJEbS9pJ2Cr1GvoxtfwnIV+1wfh1Q0u JgTw== X-Gm-Message-State: AHPjjUgYEGkLGUsqpi+KN48foQFj6+9WXGLzI2UCnvGzH86tfhRwoe2f YKFeVn+9rjKhfFS34oU38nk= X-Google-Smtp-Source: AOwi7QBofesSW2vffUInqIChGKKAfY08r72n1lcOlPQMMXLEsE2GcoFRFSCa+FAYXAnMSpyFw6GFrQ== X-Received: by 10.159.216.139 with SMTP id s11mr9707115plp.120.1506102293381; Fri, 22 Sep 2017 10:44:53 -0700 (PDT) From: Eric Biggers To: x86@kernel.org Cc: linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, Andy Lutomirski , Dave Hansen , Dmitry Vyukov , Fenghua Yu , Ingo Molnar , Kevin Hao , Oleg Nesterov , Wanpeng Li , Yu-cheng Yu , Michael Halcrow , Eric Biggers Date: Fri, 22 Sep 2017 10:41:56 -0700 Message-Id: <20170922174156.16780-4-ebiggers3@gmail.com> X-Mailer: git-send-email 2.14.1.821.g8fa685d3b7-goog In-Reply-To: <20170922174156.16780-1-ebiggers3@gmail.com> References: <20170922174156.16780-1-ebiggers3@gmail.com> Subject: [kernel-hardening] [PATCH v4 3/3] x86/fpu: reinitialize FPU registers if restoring FPU state fails X-Virus-Scanned: ClamAV using ClamSMTP From: Eric Biggers Userspace can change the FPU state of a task using the ptrace() or rt_sigreturn() system calls. Because reserved bits in the FPU state can cause the XRSTOR instruction to fail, the kernel has to carefully validate that no reserved bits or other invalid values are being set. Unfortunately, there have been bugs in this validation code. For example, we were not checking that the 'xcomp_bv' field in the xstate_header was 0. As-is, such bugs are exploitable to read the FPU registers of other processes on the system. To do so, an attacker can create a task, assign to it an invalid FPU state, then spin in a loop and monitor the values of the FPU registers. Because the task's FPU registers are not being restored, sometimes the FPU registers will have the values from another process. This is likely to continue to be a problem in the future because the validation done by the CPU instructions like XRSTOR is not immediately visible to kernel developers. Nor will invalid FPU states ever be encountered during ordinary use --- they will only be seen during fuzzing or exploits. There can even be reserved bits outside the xstate_header which are easy to forget about. For example, the MXCSR register contains reserved bits, which were not validated by the KVM_SET_XSAVE ioctl until commit a575813bfe4b ("KVM: x86: Fix load damaged SSEx MXCSR register"). Therefore, mitigate this class of vulnerability by restoring the FPU registers from init_fpstate if restoring from the task's state fails. We actually used to do this, but it was (perhaps unwisely) removed by commit 9ccc27a5d297 ("x86/fpu: Remove error return values from copy_kernel_to_*regs() functions"). This new patch is also a bit different. First, it only clears the registers, not also the bad in-memory state; this is simpler and makes it easier to make the mitigation cover all callers of __copy_kernel_to_fpregs(). Second, it does the register clearing in an exception handler so that no extra instructions are added to context switches. In fact, we *remove* instructions, since previously we were always zeroing the register containing 'err' even if CONFIG_X86_DEBUG_FPU was disabled. Reviewed-by: Rik van Riel Cc: Andy Lutomirski Cc: Dave Hansen Cc: Dmitry Vyukov Cc: Fenghua Yu Cc: Ingo Molnar Cc: Kevin Hao Cc: Oleg Nesterov Cc: Wanpeng Li Cc: Yu-cheng Yu Signed-off-by: Eric Biggers --- arch/x86/include/asm/fpu/internal.h | 51 +++++++++++-------------------------- arch/x86/mm/extable.c | 24 +++++++++++++++++ 2 files changed, 39 insertions(+), 36 deletions(-) diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index 53461be20767..8bee5f1a42e7 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -120,20 +120,11 @@ extern void fpstate_sanitize_xstate(struct fpu *fpu); err; \ }) -#define check_insn(insn, output, input...) \ -({ \ - int err; \ +#define kernel_insn(insn, output, input...) \ asm volatile("1:" #insn "\n\t" \ "2:\n" \ - ".section .fixup,\"ax\"\n" \ - "3: movl $-1,%[err]\n" \ - " jmp 2b\n" \ - ".previous\n" \ - _ASM_EXTABLE(1b, 3b) \ - : [err] "=r" (err), output \ - : "0"(0), input); \ - err; \ -}) + _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_fprestore) \ + : output : input) static inline int copy_fregs_to_user(struct fregs_state __user *fx) { @@ -153,20 +144,16 @@ static inline int copy_fxregs_to_user(struct fxregs_state __user *fx) static inline void copy_kernel_to_fxregs(struct fxregs_state *fx) { - int err; - if (IS_ENABLED(CONFIG_X86_32)) { - err = check_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx)); + kernel_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx)); } else { if (IS_ENABLED(CONFIG_AS_FXSAVEQ)) { - err = check_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx)); + kernel_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx)); } else { /* See comment in copy_fxregs_to_kernel() below. */ - err = check_insn(rex64/fxrstor (%[fx]), "=m" (*fx), [fx] "R" (fx), "m" (*fx)); + kernel_insn(rex64/fxrstor (%[fx]), "=m" (*fx), [fx] "R" (fx), "m" (*fx)); } } - /* Copying from a kernel buffer to FPU registers should never fail: */ - WARN_ON_FPU(err); } static inline int copy_user_to_fxregs(struct fxregs_state __user *fx) @@ -183,9 +170,7 @@ static inline int copy_user_to_fxregs(struct fxregs_state __user *fx) static inline void copy_kernel_to_fregs(struct fregs_state *fx) { - int err = check_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx)); - - WARN_ON_FPU(err); + kernel_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx)); } static inline int copy_user_to_fregs(struct fregs_state __user *fx) @@ -281,18 +266,13 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu) * Use XRSTORS to restore context if it is enabled. XRSTORS supports compact * XSAVE area format. */ -#define XSTATE_XRESTORE(st, lmask, hmask, err) \ +#define XSTATE_XRESTORE(st, lmask, hmask) \ asm volatile(ALTERNATIVE(XRSTOR, \ XRSTORS, X86_FEATURE_XSAVES) \ "\n" \ - "xor %[err], %[err]\n" \ "3:\n" \ - ".pushsection .fixup,\"ax\"\n" \ - "4: movl $-2, %[err]\n" \ - "jmp 3b\n" \ - ".popsection\n" \ - _ASM_EXTABLE(661b, 4b) \ - : [err] "=r" (err) \ + _ASM_EXTABLE_HANDLE(661b, 3b, ex_handler_fprestore)\ + : \ : "D" (st), "m" (*st), "a" (lmask), "d" (hmask) \ : "memory") @@ -336,7 +316,10 @@ static inline void copy_kernel_to_xregs_booting(struct xregs_state *xstate) else XSTATE_OP(XRSTOR, xstate, lmask, hmask, err); - /* We should never fault when copying from a kernel buffer: */ + /* + * We should never fault when copying from a kernel buffer, and the FPU + * state we set at boot time should be valid. + */ WARN_ON_FPU(err); } @@ -365,12 +348,8 @@ static inline void copy_kernel_to_xregs(struct xregs_state *xstate, u64 mask) { u32 lmask = mask; u32 hmask = mask >> 32; - int err; - - XSTATE_XRESTORE(xstate, lmask, hmask, err); - /* We should never fault when copying from a kernel buffer: */ - WARN_ON_FPU(err); + XSTATE_XRESTORE(xstate, lmask, hmask); } /* diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c index c076f710de4c..c3521e2be396 100644 --- a/arch/x86/mm/extable.c +++ b/arch/x86/mm/extable.c @@ -2,6 +2,7 @@ #include #include +#include #include #include @@ -78,6 +79,29 @@ bool ex_handler_refcount(const struct exception_table_entry *fixup, } EXPORT_SYMBOL_GPL(ex_handler_refcount); +/* + * Handler for when we fail to restore a task's FPU state. We should never get + * here because the FPU state of a task using the FPU (task->thread.fpu.state) + * should always be valid. However, past bugs have allowed userspace to set + * reserved bits in the XSAVE area using PTRACE_SETREGSET or sys_rt_sigreturn(). + * These caused XRSTOR to fail when switching to the task, leaking the FPU + * registers of the task previously executing on the CPU. Mitigate this class + * of vulnerability by restoring from the initial state (essentially, zeroing + * out all the FPU registers) if we can't restore from the task's FPU state. + */ +bool ex_handler_fprestore(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr) +{ + regs->ip = ex_fixup_addr(fixup); + + WARN_ONCE(1, "Bad FPU state detected at %pB, reinitializing FPU registers.", + (void *)instruction_pointer(regs)); + + __copy_kernel_to_fpregs(&init_fpstate, -1); + return true; +} +EXPORT_SYMBOL_GPL(ex_handler_fprestore); + bool ex_handler_ext(const struct exception_table_entry *fixup, struct pt_regs *regs, int trapnr) {