From patchwork Sat Feb 18 21:14:21 2023
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 13145668
From: Rick Edgecombe
To: x86@kernel.org, "H . Peter Anvin", Thomas Gleixner, Ingo Molnar,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, linux-arch@vger.kernel.org,
 linux-api@vger.kernel.org, Arnd Bergmann, Andy Lutomirski,
 Balbir Singh, Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
 Eugene Syromiatnikov, Florian Weimer, "H . J . Lu", Jann Horn,
 Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit, Oleg Nesterov,
 Pavel Machek, Peter Zijlstra, Randy Dunlap, Weijiang Yang,
 "Kirill A . Shutemov", John Allen, kcc@google.com, eranian@google.com,
 rppt@kernel.org, jamorris@linux.microsoft.com, dethoma@microsoft.com,
 akpm@linux-foundation.org, Andrew.Cooper3@citrix.com,
 christina.schimpe@intel.com, david@redhat.com, debug@rivosinc.com
Cc: rick.p.edgecombe@intel.com, Yu-cheng Yu
Subject: [PATCH v6 29/41] x86/shstk: Add user-mode shadow stack support
Date: Sat, 18 Feb 2023 13:14:21 -0800
Message-Id: <20230218211433.26859-30-rick.p.edgecombe@intel.com>
In-Reply-To: <20230218211433.26859-1-rick.p.edgecombe@intel.com>
References: <20230218211433.26859-1-rick.p.edgecombe@intel.com>

From: Yu-cheng Yu

Introduce basic shadow stack enabling/disabling/allocation routines.
A task's shadow stack is allocated from memory with VM_SHADOW_STACK flag
and has a fixed size of min(RLIMIT_STACK, 4GB).

Keep the task's shadow stack address and size in thread_struct. This will
be copied when cloning new threads, but needs to be cleared during exec,
so add a function to do this.

Do not support IA32 emulation or x32.

Tested-by: Pengfei Xu
Tested-by: John Allen
Reviewed-by: Kees Cook
Signed-off-by: Yu-cheng Yu
Co-developed-by: Rick Edgecombe
Signed-off-by: Rick Edgecombe
Cc: Kees Cook
---
v5:
 - Switch to EOPNOTSUPP
 - Use MAP_ABOVE4G
 - Move set_clr_bits_msrl() to patch where it is first used

v4:
 - Just set MSR_IA32_U_CET when disabling shadow stack, since we don't
   have IBT yet. (Peterz)

v3:
 - Use define for set_clr_bits_msrl() (Kees)
 - Make some functions static (Kees)
 - Change feature_foo() to features_foo() (Kees)
 - Centralize shadow stack size rlimit checks (Kees)
 - Disable x32 support

v2:
 - Get rid of unnecessary shstk->base checks
 - Don't support IA32 emulation
---
 arch/x86/include/asm/processor.h  |   2 +
 arch/x86/include/asm/shstk.h      |   7 ++
 arch/x86/include/uapi/asm/prctl.h |   3 +
 arch/x86/kernel/shstk.c           | 145 ++++++++++++++++++++++++++++++
 4 files changed, 157 insertions(+)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index bd16e012b3e9..ff98cd6d5af2 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -479,6 +479,8 @@ struct thread_struct {
 #ifdef CONFIG_X86_USER_SHADOW_STACK
 	unsigned long		features;
 	unsigned long		features_locked;
+
+	struct thread_shstk	shstk;
 #endif
 
 	/* Floating point and extended processor state */
diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
index ec753809f074..2b1f7c9b9995 100644
--- a/arch/x86/include/asm/shstk.h
+++ b/arch/x86/include/asm/shstk.h
@@ -8,12 +8,19 @@
 struct task_struct;
 
 #ifdef CONFIG_X86_USER_SHADOW_STACK
+struct thread_shstk {
+	u64	base;
+	u64	size;
+};
+
 long shstk_prctl(struct task_struct *task, int option, unsigned long features);
 void reset_thread_features(void);
+void shstk_free(struct task_struct *p);
 #else
 static inline long shstk_prctl(struct task_struct *task, int option,
 			       unsigned long arg2) { return -EINVAL; }
 static inline void reset_thread_features(void) {}
+static inline void shstk_free(struct task_struct *p) {}
 #endif /* CONFIG_X86_USER_SHADOW_STACK */
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index b2b3b7200b2d..7dfd9dc00509 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -26,4 +26,7 @@
 #define ARCH_SHSTK_DISABLE		0x5002
 #define ARCH_SHSTK_LOCK			0x5003
 
+/* ARCH_SHSTK_ features bits */
+#define ARCH_SHSTK_SHSTK		(1ULL << 0)
+
 #endif /* _ASM_X86_PRCTL_H */
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index 41ed6552e0a5..3cb85224d856 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -8,14 +8,159 @@
 
 #include
 #include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
 #include
 
+static bool features_enabled(unsigned long features)
+{
+	return current->thread.features & features;
+}
+
+static void features_set(unsigned long features)
+{
+	current->thread.features |= features;
+}
+
+static void features_clr(unsigned long features)
+{
+	current->thread.features &= ~features;
+}
+
+static unsigned long alloc_shstk(unsigned long size)
+{
+	int flags = MAP_ANONYMOUS | MAP_PRIVATE | MAP_ABOVE4G;
+	struct mm_struct *mm = current->mm;
+	unsigned long addr, unused;
+
+	mmap_write_lock(mm);
+	addr = do_mmap(NULL, 0, size, PROT_READ, flags,
+		       VM_SHADOW_STACK | VM_WRITE, 0, &unused, NULL);
+
+	mmap_write_unlock(mm);
+
+	return addr;
+}
+
+static unsigned long adjust_shstk_size(unsigned long size)
+{
+	if (size)
+		return PAGE_ALIGN(size);
+
+	return PAGE_ALIGN(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G));
+}
+
+static void unmap_shadow_stack(u64 base, u64 size)
+{
+	while (1) {
+		int r;
+
+		r = vm_munmap(base, size);
+
+		/*
+		 * vm_munmap() returns -EINTR when mmap_lock is held by
+		 * something else, and that lock should not be held for a
+		 * long time. Retry in that case.
+		 */
+		if (r == -EINTR) {
+			cond_resched();
+			continue;
+		}
+
+		/*
+		 * For all other types of vm_munmap() failure, either the
+		 * system is out of memory or there is a bug.
+		 */
+		WARN_ON_ONCE(r);
+		break;
+	}
+}
+
+static int shstk_setup(void)
+{
+	struct thread_shstk *shstk = &current->thread.shstk;
+	unsigned long addr, size;
+
+	/* Already enabled */
+	if (features_enabled(ARCH_SHSTK_SHSTK))
+		return 0;
+
+	/* Also not supported for 32 bit and x32 */
+	if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK) || in_32bit_syscall())
+		return -EOPNOTSUPP;
+
+	size = adjust_shstk_size(0);
+	addr = alloc_shstk(size);
+	if (IS_ERR_VALUE(addr))
+		return PTR_ERR((void *)addr);
+
+	fpregs_lock_and_load();
+	wrmsrl(MSR_IA32_PL3_SSP, addr + size);
+	wrmsrl(MSR_IA32_U_CET, CET_SHSTK_EN);
+	fpregs_unlock();
+
+	shstk->base = addr;
+	shstk->size = size;
+	features_set(ARCH_SHSTK_SHSTK);
+
+	return 0;
+}
+
 void reset_thread_features(void)
 {
+	memset(&current->thread.shstk, 0, sizeof(struct thread_shstk));
 	current->thread.features = 0;
 	current->thread.features_locked = 0;
 }
 
+void shstk_free(struct task_struct *tsk)
+{
+	struct thread_shstk *shstk = &tsk->thread.shstk;
+
+	if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK) ||
+	    !features_enabled(ARCH_SHSTK_SHSTK))
+		return;
+
+	if (!tsk->mm)
+		return;
+
+	unmap_shadow_stack(shstk->base, shstk->size);
+}
+
+static int shstk_disable(void)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK))
+		return -EOPNOTSUPP;
+
+	/* Already disabled? */
+	if (!features_enabled(ARCH_SHSTK_SHSTK))
+		return 0;
+
+	fpregs_lock_and_load();
+	/* Disable WRSS too when disabling shadow stack */
+	wrmsrl(MSR_IA32_U_CET, 0);
+	wrmsrl(MSR_IA32_PL3_SSP, 0);
+	fpregs_unlock();
+
+	shstk_free(current);
+	features_clr(ARCH_SHSTK_SHSTK);
+
+	return 0;
+}
+
 long shstk_prctl(struct task_struct *task, int option, unsigned long features)
 {
 	if (option == ARCH_SHSTK_LOCK) {