From patchwork Wed Feb 5 18:19:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366907 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0EC2D17E0 for ; Wed, 5 Feb 2020 18:20:36 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BDBB122314 for ; Wed, 5 Feb 2020 18:20:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BDBB122314 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 5C7B46B000A; Wed, 5 Feb 2020 13:20:30 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 54BD86B000C; Wed, 5 Feb 2020 13:20:30 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3DF306B000C; Wed, 5 Feb 2020 13:20:30 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0216.hostedemail.com [216.40.44.216]) by kanga.kvack.org (Postfix) with ESMTP id 1463F6B0008 for ; Wed, 5 Feb 2020 13:20:30 -0500 (EST) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id A742740FB for ; Wed, 5 Feb 2020 18:20:29 +0000 (UTC) X-FDA: 76456888578.21.flesh40_78855d5269f24 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:2106:30003:30012:30046:30051:30054:30056:30062:30064:30069:30070:30075:30079:30089:30091,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules: 0:0:0,LF X-HE-Tag: flesh40_78855d5269f24 X-Filterd-Recvd-Size: 15274 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf09.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:27 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:24 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447737" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:24 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. 
Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 01/27] Documentation/x86: Add CET description Date: Wed, 5 Feb 2020 10:19:09 -0800 Message-Id: <20200205181935.3712-2-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Explain no_cet_shstk/no_cet_ibt kernel parameters, and introduce a new document on Control-flow Enforcement Technology (CET). Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook --- .../admin-guide/kernel-parameters.txt | 6 + Documentation/x86/index.rst | 1 + Documentation/x86/intel_cet.rst | 294 ++++++++++++++++++ 3 files changed, 301 insertions(+) create mode 100644 Documentation/x86/intel_cet.rst diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index ade4e6ec23e0..8b69ebf0baed 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3001,6 +3001,12 @@ noexec=on: enable non-executable mappings (default) noexec=off: disable non-executable mappings + no_cet_shstk [X86-64] Disable Shadow Stack for user-mode + applications + + no_cet_ibt [X86-64] Disable Indirect Branch Tracking for user-mode + applications + nosmap [X86,PPC] Disable SMAP (Supervisor Mode Access Prevention) even if it is supported by processor. diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst index a8de2fbc1caa..81f919801765 100644 --- a/Documentation/x86/index.rst +++ b/Documentation/x86/index.rst @@ -19,6 +19,7 @@ x86-specific Documentation tlb mtrr pat + intel_cet intel_mpx intel-iommu intel_txt diff --git a/Documentation/x86/intel_cet.rst b/Documentation/x86/intel_cet.rst new file mode 100644 index 000000000000..71e2462fea5c --- /dev/null +++ b/Documentation/x86/intel_cet.rst @@ -0,0 +1,294 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================================= +Control-flow Enforcement Technology (CET) +========================================= + +[1] Overview +============ + +Control-flow Enforcement Technology (CET) provides protection against +return/jump-oriented programming (ROP) attacks. It can be setup to +protect both applications and the kernel. In the first phase, only +user-mode protection is implemented in the 64-bit kernel; 32-bit +applications are supported in compatibility mode. + +CET introduces Shadow Stack (SHSTK) and Indirect Branch Tracking +(IBT). SHSTK is a secondary stack allocated from memory and cannot +be directly modified by applications. When executing a CALL, the +processor pushes a copy of the return address to SHSTK. Upon +function return, the processor pops the SHSTK copy and compares it +to the one from the program stack. If the two copies differ, the +processor raises a control-protection fault. 
IBT verifies indirect +CALL/JMP targets are intended as marked by the compiler with 'ENDBR' +opcodes (see CET instructions below). + +There are two kernel configuration options: + + X86_INTEL_SHADOW_STACK_USER, and + X86_INTEL_BRANCH_TRACKING_USER. + +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or later +are required. To build a CET-enabled application, GLIBC v2.28 or +later is also required. + +There are two command-line options for disabling CET features:: + + no_cet_shstk - disables SHSTK, and + no_cet_ibt - disables IBT. + +At run time, /proc/cpuinfo shows the availability of SHSTK and IBT. + +[2] CET assembly instructions +============================= + +RDSSP %r + Read the SHSTK pointer into %r. + +INCSSP %r + Unwind (increment) the SHSTK pointer (0 ~ 255) steps as indicated + in the operand register. The GLIBC longjmp uses INCSSP to unwind + the SHSTK until that matches the program stack. When it is + necessary to unwind beyond 255 steps, longjmp divides and repeats + the process. + +RSTORSSP (%r) + Switch to the SHSTK indicated in the 'restore token' pointed by + the operand register and replace the 'restore token' with a new + token to be saved (with SAVEPREVSSP) for the outgoing SHSTK. + +:: + + Before RSTORSSP + + Incoming SHSTK Current/Outgoing SHSTK + + |----------------------| |----------------------| + addr=x | | ssp-> | | + |----------------------| |----------------------| + (%r)-> | rstor_token=(x|Lg) | addr=y-8 | | + |----------------------| |----------------------| + + After RSTORSSP + + |----------------------| |----------------------| + addr=x | | | | + |----------------------| |----------------------| + ssp-> | rstor_token=(y|Pv|Lg)| addr=y-8 | | + |----------------------| |----------------------| + + note: + 1. Only valid addresses and restore tokens can be on the + user-mode SHSTK. + 2. A token is always of type u64 and must align to u64. + 3. The incoming SHSTK pointer in a rstor_token must point to + immediately above the token. + 4. 'Lg' is bit[0] of a rstor_token indicating a 64-bit SHSTK. + 5. 'Pv' is bit[1] of a rstor_token indicating the token is to + be used only for the next SAVEPREVSSP and invalid for + RSTORSSP. + +SAVEPREVSSP + Pop the SHSTK 'restore token' pointed by current SHSTK pointer + and store it at (previous SHSTK pointer - 8). + +:: + + After SAVEPREVSSP + + |----------------------| |----------------------| + ssp-> | | | | + |----------------------| |----------------------| + addr=x-8 | rstor_token=(y|Pv|Lg)| addr=y-8 | rstor_token(y|Lg) | + |----------------------| |----------------------| + +WRUSS %r0, (%r1) + Write the value in %r0 to the SHSTK address pointed by (%r1). + This is a kernel-mode only instruction. + +ENDBR and NOTRACK prefix + When IBT is enabled, an indirect CALL/JMP must either:: + + have a NOTRACK prefix, + reach an ENDBR, or + reach an address within a legacy code page; + + or it results in a control-protection fault. + + When the target address is derived from information that cannot + be modified, the compiler uses the NOTRACK prefix. In other + cases, the compiler inserts an ENDBR at the target address. + + A legacy code page is designated in the legacy code bitmap, which + is explained below in section [8]. 
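As a minimal illustration of the instructions above, the following user-space sketch reads the shadow stack pointer with RDSSP through inline assembly. It is not part of this patch set; it assumes a 64-bit build and an assembler with CET support (Binutils v2.31 or later, as noted in section [1]). Because RDSSP is a no-op when SHSTK is inactive, the pre-zeroed value reads back as 0 in that case::

  #include <stdint.h>
  #include <stdio.h>

  /* Read the current shadow stack pointer; 0 means SHSTK is not active. */
  static inline uint64_t read_ssp(void)
  {
          uint64_t ssp = 0;

          asm volatile ("rdsspq %0" : "+r" (ssp));
          return ssp;
  }

  int main(void)
  {
          uint64_t ssp = read_ssp();

          if (ssp)
                  printf("SHSTK active, SSP = %#llx\n", (unsigned long long)ssp);
          else
                  printf("SHSTK not active\n");
          return 0;
  }

Recent GCC versions expose the same operation as the _get_ssp() intrinsic when building with -mshstk.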
+ +[3] Application Enabling +======================== + +An application's CET capability is marked in its ELF header and can +be verified from the following command output, in the +NT_GNU_PROPERTY_TYPE_0 field: + + readelf -n <application> + +If an application supports CET and is statically linked, it will run +with CET protection. If the application needs any shared libraries, +the loader checks all dependencies and enables CET only when all +requirements are met. + +[4] Legacy Libraries +==================== + +GLIBC provides a few tunables for backward compatibility. + +GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-IBT + Turn off SHSTK/IBT for the current shell. + +GLIBC_TUNABLES=glibc.tune.x86_shstk= + This controls how dlopen() handles SHSTK legacy libraries:: + + on - continue with SHSTK enabled; + permissive - continue with SHSTK off. + +[5] CET system calls +==================== + +The following arch_prctl() system calls are added for CET: + +arch_prctl(ARCH_X86_CET_STATUS, unsigned long *addr) + Return CET feature status. + + The parameter 'addr' is a pointer to a user buffer. + On returning to the caller, the kernel fills the following + information:: + + *addr = SHSTK/IBT status + *(addr + 1) = SHSTK base address + *(addr + 2) = SHSTK size + +arch_prctl(ARCH_X86_CET_DISABLE, unsigned long features) + Disable SHSTK and/or IBT specified in 'features'. Return -EPERM + if CET is locked. + +arch_prctl(ARCH_X86_CET_LOCK) + Lock in CET features. + +arch_prctl(ARCH_X86_CET_ALLOC_SHSTK, unsigned long *addr) + Allocate a new SHSTK and put a restore token at the top. + + The parameter 'addr' is a pointer to a user buffer and indicates + the desired SHSTK size to allocate. On returning to the caller, + the kernel fills '*addr' with the base address of the new SHSTK. + +arch_prctl(ARCH_X86_CET_MARK_LEGACY_CODE, unsigned long *addr) + Mark an address range as IBT legacy code. + + The parameter 'addr' is a pointer to a user buffer that has the + following information:: + + *addr = starting linear address of the legacy code + *(addr + 1) = size of the legacy code + *(addr + 2) = set (1); clear (0) + +Note: + There is no CET-enabling arch_prctl function. By design, CET is + enabled automatically if the binary and the system can support it. + + The parameters passed are always unsigned 64-bit. When an IA32 + application passes pointers, it should use only the lower 32 bits. + +[6] The implementation of the SHSTK +=================================== + +SHSTK size +---------- + +A task's SHSTK is allocated from memory with a fixed size of +RLIMIT_STACK. A compat-mode thread's SHSTK size is 1/4 of +RLIMIT_STACK. The smaller 32-bit thread SHSTK allows more threads to +share a 32-bit address space. + +Signal +------ + +The main program and its signal handlers use the same SHSTK. Because +the SHSTK stores only return addresses, a large SHSTK covers the +condition in which both the program stack and the sigaltstack run out. + +The kernel creates a restore token at the SHSTK restoring address and +verifies that token when restoring from the signal handler. + +IBT for signal delivery and sigreturn is the same as the main +program's setup, except for the WAIT_ENDBR status, which can be read from +MSR_IA32_U_CET. In general, a task is in WAIT_ENDBR after an +indirect CALL/JMP and before the next instruction starts. + +A task's WAIT_ENDBR is reset for its signal handler, but is preserved on +the task's stack and then restored on sigreturn.
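Referring back to the arch_prctl() interface in section [5], the following user-space sketch queries the CET status of the current task. It is illustrative only and assumes the ARCH_X86_CET_STATUS definition that this series adds to the uapi headers; the buffer layout is the one documented above::

  #include <stdio.h>
  #include <unistd.h>
  #include <sys/syscall.h>
  #include <asm/prctl.h>          /* ARCH_X86_CET_STATUS (from the patched headers) */

  int main(void)
  {
          unsigned long buf[3] = { 0, 0, 0 };

          /* buf[0] = SHSTK/IBT status, buf[1] = SHSTK base, buf[2] = SHSTK size */
          if (syscall(SYS_arch_prctl, ARCH_X86_CET_STATUS, buf)) {
                  perror("arch_prctl(ARCH_X86_CET_STATUS)");
                  return 1;
          }

          printf("CET status: %#lx\n", buf[0]);
          printf("SHSTK base: %#lx, size: %#lx\n", buf[1], buf[2]);
          return 0;
  }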
+ +Fork +---- + +The SHSTK's vma has the VM_SHSTK flag set; its PTEs are required to be +read-only and dirty. When a SHSTK PTE is not present, or is not both +read-only and dirty, a SHSTK access triggers a page fault with an additional SHSTK bit set +in the page fault error code. + +When a task forks a child, its SHSTK PTEs are copied and both the +parent's and the child's SHSTK PTEs are cleared of the dirty bit. +Upon the next SHSTK access, the resulting SHSTK page fault is handled +by page copy/re-use. + +When a pthread child is created, the kernel allocates a new SHSTK for +the new thread. + +Setjmp/Longjmp +-------------- + +Longjmp unwinds the SHSTK until it matches the program stack. + +Ucontext +-------- + +In GLIBC, getcontext/setcontext is implemented in a similar way to +setjmp/longjmp. + +When makecontext creates a new ucontext, a new SHSTK is allocated for +that context with the ARCH_X86_CET_ALLOC_SHSTK syscall. The kernel +creates a restore token at the top of the new SHSTK and the user-mode +code switches to the new SHSTK with the RSTORSSP instruction. + +[7] The management of read-only & dirty PTEs for SHSTK +====================================================== + +An R/O and dirty PTE exists in the following cases: + +(a) A page is modified and then shared with a fork()'ed child; +(b) An R/O page that has been COW'ed; +(c) A SHSTK page. + +The processor only checks the dirty bit for (c). To prevent the use +of non-SHSTK memory as SHSTK, we use a spare bit of the 64-bit PTE as +DIRTY_SW for (a) and (b) above. This results in the following PTE +settings:: + + Modified PTE: (R/W + DIRTY_HW) + Modified and shared PTE: (R/O + DIRTY_SW) + R/O PTE, COW'ed: (R/O + DIRTY_SW) + SHSTK PTE: (R/O + DIRTY_HW) + SHSTK PTE, COW'ed: (R/O + DIRTY_HW) + SHSTK PTE, shared: (R/O + DIRTY_SW) + +Note that DIRTY_SW is used only in R/O PTEs, not in R/W PTEs. + +[8] The implementation of IBT legacy bitmap +=========================================== + +When IBT is active, a non-IBT-capable legacy library can be executed +if its address ranges are specified in the legacy code bitmap. The +bitmap covers the whole user-space address range, which is TASK_SIZE_MAX +for 64-bit and TASK_SIZE for IA32, and each of its bits indicates a 4-KB +legacy code page. It is read-only to applications, and is set up by +the kernel as a special mapping the first time the application +calls arch_prctl(ARCH_X86_CET_MARK_LEGACY_CODE). The application +manages the bitmap through this arch_prctl() call.
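To make the legacy-bitmap interface concrete, here is a sketch of how a loader might mark the text range of a non-IBT library, using the ARCH_X86_CET_MARK_LEGACY_CODE buffer layout from section [5]. The helper name is illustrative and the constant is assumed to come from the series' uapi headers::

  #include <unistd.h>
  #include <sys/syscall.h>
  #include <asm/prctl.h>  /* ARCH_X86_CET_MARK_LEGACY_CODE (from the patched headers) */

  /* Set (set=1) or clear (set=0) the legacy-code bits covering [start, start + size). */
  static long cet_mark_legacy_code(unsigned long start, unsigned long size, int set)
  {
          unsigned long buf[3] = { start, size, set ? 1UL : 0UL };

          return syscall(SYS_arch_prctl, ARCH_X86_CET_MARK_LEGACY_CODE, buf);
  }

A dynamic loader would typically call such a helper over the executable segments of a dlopen()'ed non-IBT library before transferring control into it.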
From patchwork Wed Feb 5 18:19:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366905 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4A35F921 for ; Wed, 5 Feb 2020 18:20:32 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 20E5B20674 for ; Wed, 5 Feb 2020 18:20:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 20E5B20674 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B359C6B0006; Wed, 5 Feb 2020 13:20:29 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id A71976B0008; Wed, 5 Feb 2020 13:20:29 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8EB146B000A; Wed, 5 Feb 2020 13:20:29 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0064.hostedemail.com [216.40.44.64]) by kanga.kvack.org (Postfix) with ESMTP id 718186B0006 for ; Wed, 5 Feb 2020 13:20:29 -0500 (EST) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 009878248D51 for ; Wed, 5 Feb 2020 18:20:28 +0000 (UTC) X-FDA: 76456888536.08.card52_788be02a0b749 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com:bp@suse.de,RULES_HIT:30054:30055:30056:30064,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:27,LUA_SUMMARY:none X-HE-Tag: card52_788be02a0b749 X-Filterd-Recvd-Size: 4592 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:28 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:25 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447740" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:24 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. 
Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu , Borislav Petkov Subject: [RFC PATCH v9 02/27] x86/cpufeatures: Add CET CPU feature flags for Control-flow Enforcement Technology (CET) Date: Wed, 5 Feb 2020 10:19:10 -0800 Message-Id: <20200205181935.3712-3-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add CPU feature flags for Control-flow Enforcement Technology (CET). CPUID.(EAX=7,ECX=0):ECX[bit 7] Shadow stack CPUID.(EAX=7,ECX=0):EDX[bit 20] Indirect Branch Tracking Signed-off-by: Yu-cheng Yu Reviewed-by: Borislav Petkov Reviewed-by: Kees Cook --- arch/x86/include/asm/cpufeatures.h | 2 ++ arch/x86/kernel/cpu/cpuid-deps.c | 2 ++ 2 files changed, 4 insertions(+) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index e9b62498fe75..a2c6b1b5c026 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -336,6 +336,7 @@ #define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */ #define X86_FEATURE_WAITPKG (16*32+ 5) /* UMONITOR/UMWAIT/TPAUSE Instructions */ #define X86_FEATURE_AVX512_VBMI2 (16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */ +#define X86_FEATURE_SHSTK (16*32+ 7) /* Shadow Stack */ #define X86_FEATURE_GFNI (16*32+ 8) /* Galois Field New Instructions */ #define X86_FEATURE_VAES (16*32+ 9) /* Vector AES */ #define X86_FEATURE_VPCLMULQDQ (16*32+10) /* Carry-Less Multiplication Double Quadword */ @@ -361,6 +362,7 @@ #define X86_FEATURE_MD_CLEAR (18*32+10) /* VERW clears CPU buffers */ #define X86_FEATURE_TSX_FORCE_ABORT (18*32+13) /* "" TSX_FORCE_ABORT */ #define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */ +#define X86_FEATURE_IBT (18*32+20) /* Indirect Branch Tracking */ #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */ #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */ #define X86_FEATURE_FLUSH_L1D (18*32+28) /* Flush L1D cache */ diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c index 3cbe24ca80ab..fec83cc74b9e 100644 --- a/arch/x86/kernel/cpu/cpuid-deps.c +++ b/arch/x86/kernel/cpu/cpuid-deps.c @@ -69,6 +69,8 @@ static const struct cpuid_dep cpuid_deps[] = { { X86_FEATURE_CQM_MBM_TOTAL, X86_FEATURE_CQM_LLC }, { X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC }, { X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL }, + { X86_FEATURE_SHSTK, X86_FEATURE_XSAVES }, + { X86_FEATURE_IBT, X86_FEATURE_XSAVES }, {} }; From patchwork Wed Feb 5 18:19:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366909 Return-Path: Received: from mail.kernel.org 
(pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9AFE017E0 for ; Wed, 5 Feb 2020 18:20:39 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6422322314 for ; Wed, 5 Feb 2020 18:20:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6422322314 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A4C516B0010; Wed, 5 Feb 2020 13:20:30 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 9DA016B000C; Wed, 5 Feb 2020 13:20:30 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7B17A6B0010; Wed, 5 Feb 2020 13:20:30 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0059.hostedemail.com [216.40.44.59]) by kanga.kvack.org (Postfix) with ESMTP id 4793D6B0008 for ; Wed, 5 Feb 2020 13:20:30 -0500 (EST) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id CF44A180AD802 for ; Wed, 5 Feb 2020 18:20:29 +0000 (UTC) X-FDA: 76456888578.24.thing06_789a6a4ec460d X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30034:30045:30051:30054:30056:30064,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: thing06_789a6a4ec460d X-Filterd-Recvd-Size: 9873 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf16.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:28 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:25 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447746" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:24 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. 
Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 03/27] x86/fpu/xstate: Introduce CET MSR XSAVES supervisor states Date: Wed, 5 Feb 2020 10:19:11 -0800 Message-Id: <20200205181935.3712-4-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Control-flow Enforcement Technology (CET) adds five MSRs. Introduce them and their XSAVES supervisor states: MSR_IA32_U_CET (user-mode CET settings), MSR_IA32_PL3_SSP (user-mode Shadow Stack pointer), MSR_IA32_PL0_SSP (kernel-mode Shadow Stack pointer), MSR_IA32_PL1_SSP (Privilege Level 1 Shadow Stack pointer), MSR_IA32_PL2_SSP (Privilege Level 2 Shadow Stack pointer). v6: - Remove __packed from struct cet_user_state, struct cet_kernel_state. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook --- arch/x86/include/asm/fpu/types.h | 22 ++++++++++++++++++ arch/x86/include/asm/fpu/xstate.h | 5 +++-- arch/x86/include/asm/msr-index.h | 18 +++++++++++++++ arch/x86/include/uapi/asm/processor-flags.h | 2 ++ arch/x86/kernel/fpu/xstate.c | 25 +++++++++++++++++++-- 5 files changed, 68 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h index f098f6cab94b..d7ef4d9c7ad5 100644 --- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -114,6 +114,9 @@ enum xfeature { XFEATURE_Hi16_ZMM, XFEATURE_PT_UNIMPLEMENTED_SO_FAR, XFEATURE_PKRU, + XFEATURE_RESERVED, + XFEATURE_CET_USER, + XFEATURE_CET_KERNEL, XFEATURE_MAX, }; @@ -128,6 +131,8 @@ enum xfeature { #define XFEATURE_MASK_Hi16_ZMM (1 << XFEATURE_Hi16_ZMM) #define XFEATURE_MASK_PT (1 << XFEATURE_PT_UNIMPLEMENTED_SO_FAR) #define XFEATURE_MASK_PKRU (1 << XFEATURE_PKRU) +#define XFEATURE_MASK_CET_USER (1 << XFEATURE_CET_USER) +#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL) #define XFEATURE_MASK_FPSSE (XFEATURE_MASK_FP | XFEATURE_MASK_SSE) #define XFEATURE_MASK_AVX512 (XFEATURE_MASK_OPMASK \ @@ -229,6 +234,23 @@ struct pkru_state { u32 pad; } __packed; +/* + * State component 11 is Control-flow Enforcement user states + */ +struct cet_user_state { + u64 user_cet; /* user control-flow settings */ + u64 user_ssp; /* user shadow stack pointer */ +}; + +/* + * State component 12 is Control-flow Enforcement kernel states + */ +struct cet_kernel_state { + u64 kernel_ssp; /* kernel shadow stack */ + u64 pl1_ssp; /* privilege level 1 shadow stack */ + u64 pl2_ssp; /* privilege level 2 shadow stack */ +}; + struct xstate_header { u64 xfeatures; u64 xcomp_bv; diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h index 9ebfdd543576..952d2515dae4 100644 --- a/arch/x86/include/asm/fpu/xstate.h +++ b/arch/x86/include/asm/fpu/xstate.h @@ -33,13 +33,14 @@ XFEATURE_MASK_BNDCSR) /* All currently supported supervisor features */ -#define SUPPORTED_XFEATURES_MASK_SUPERVISOR (0) +#define SUPPORTED_XFEATURES_MASK_SUPERVISOR (XFEATURE_MASK_CET_USER) /* * Unsupported supervisor features. When a supervisor feature in this mask is * supported in the future, move it to the supported supervisor feature mask. 
*/ -#define UNSUPPORTED_XFEATURES_MASK_SUPERVISOR (XFEATURE_MASK_PT) +#define UNSUPPORTED_XFEATURES_MASK_SUPERVISOR (XFEATURE_MASK_PT | \ + XFEATURE_MASK_CET_KERNEL) /* All supervisor states including supported and unsupported states. */ #define ALL_XFEATURES_MASK_SUPERVISOR (SUPPORTED_XFEATURES_MASK_SUPERVISOR | \ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 084e98da04a7..114e77f5bb6b 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -872,4 +872,22 @@ #define MSR_VM_IGNNE 0xc0010115 #define MSR_VM_HSAVE_PA 0xc0010117 +/* Control-flow Enforcement Technology MSRs */ +#define MSR_IA32_U_CET 0x6a0 /* user mode cet setting */ +#define MSR_IA32_S_CET 0x6a2 /* kernel mode cet setting */ +#define MSR_IA32_PL0_SSP 0x6a4 /* kernel shstk pointer */ +#define MSR_IA32_PL1_SSP 0x6a5 /* ring-1 shstk pointer */ +#define MSR_IA32_PL2_SSP 0x6a6 /* ring-2 shstk pointer */ +#define MSR_IA32_PL3_SSP 0x6a7 /* user shstk pointer */ +#define MSR_IA32_INT_SSP_TAB 0x6a8 /* exception shstk table */ + +/* MSR_IA32_U_CET and MSR_IA32_S_CET bits */ +#define MSR_IA32_CET_SHSTK_EN 0x0000000000000001ULL +#define MSR_IA32_CET_WRSS_EN 0x0000000000000002ULL +#define MSR_IA32_CET_ENDBR_EN 0x0000000000000004ULL +#define MSR_IA32_CET_LEG_IW_EN 0x0000000000000008ULL +#define MSR_IA32_CET_NO_TRACK_EN 0x0000000000000010ULL +#define MSR_IA32_CET_WAIT_ENDBR 0x00000000000000800UL +#define MSR_IA32_CET_BITMAP_MASK 0xfffffffffffff000ULL + #endif /* _ASM_X86_MSR_INDEX_H */ diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h index bcba3c643e63..a8df907e8017 100644 --- a/arch/x86/include/uapi/asm/processor-flags.h +++ b/arch/x86/include/uapi/asm/processor-flags.h @@ -130,6 +130,8 @@ #define X86_CR4_SMAP _BITUL(X86_CR4_SMAP_BIT) #define X86_CR4_PKE_BIT 22 /* enable Protection Keys support */ #define X86_CR4_PKE _BITUL(X86_CR4_PKE_BIT) +#define X86_CR4_CET_BIT 23 /* enable Control-flow Enforcement */ +#define X86_CR4_CET _BITUL(X86_CR4_CET_BIT) /* * x86-64 Task Priority Register, CR8 diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 04f7c6b8dbbc..ec08a2b6feca 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -38,6 +38,9 @@ static const char *xfeature_names[] = "Processor Trace (unused)" , "Protection Keys User registers", "unknown xstate feature" , + "Control-flow User registers" , + "Control-flow Kernel registers" , + "unknown xstate feature" , }; static short xsave_cpuid_features[] __initdata = { @@ -51,6 +54,9 @@ static short xsave_cpuid_features[] __initdata = { X86_FEATURE_AVX512F, X86_FEATURE_INTEL_PT, X86_FEATURE_PKU, + -1, /* Unused */ + X86_FEATURE_SHSTK, /* XFEATURE_CET_USER */ + X86_FEATURE_SHSTK, /* XFEATURE_CET_KERNEL */ }; /* @@ -316,6 +322,8 @@ static void __init print_xstate_features(void) print_xstate_feature(XFEATURE_MASK_ZMM_Hi256); print_xstate_feature(XFEATURE_MASK_Hi16_ZMM); print_xstate_feature(XFEATURE_MASK_PKRU); + print_xstate_feature(XFEATURE_MASK_CET_USER); + print_xstate_feature(XFEATURE_MASK_CET_KERNEL); } /* @@ -563,6 +571,8 @@ static void check_xstate_against_struct(int nr) XCHECK_SZ(sz, nr, XFEATURE_ZMM_Hi256, struct avx_512_zmm_uppers_state); XCHECK_SZ(sz, nr, XFEATURE_Hi16_ZMM, struct avx_512_hi16_state); XCHECK_SZ(sz, nr, XFEATURE_PKRU, struct pkru_state); + XCHECK_SZ(sz, nr, XFEATURE_CET_USER, struct cet_user_state); + XCHECK_SZ(sz, nr, XFEATURE_CET_KERNEL, struct cet_kernel_state); /* * Make *SURE* to add any 
feature numbers in below if @@ -770,8 +780,19 @@ void __init fpu__init_system_xstate(void) * Clear XSAVE features that are disabled in the normal CPUID. */ for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) { - if (!boot_cpu_has(xsave_cpuid_features[i])) - xfeatures_mask_all &= ~BIT_ULL(i); + if (xsave_cpuid_features[i] == X86_FEATURE_SHSTK) { + /* + * X86_FEATURE_SHSTK and X86_FEATURE_IBT share + * same states, but can be enabled separately. + */ + if (!boot_cpu_has(X86_FEATURE_SHSTK) && + !boot_cpu_has(X86_FEATURE_IBT)) + xfeatures_mask_all &= ~BIT_ULL(i); + } else { + if ((xsave_cpuid_features[i] == -1) || + !boot_cpu_has(xsave_cpuid_features[i])) + xfeatures_mask_all &= ~BIT_ULL(i); + } } xfeatures_mask_all &= fpu__get_supported_xfeatures_mask(); From patchwork Wed Feb 5 18:19:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366911 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 60C5B921 for ; Wed, 5 Feb 2020 18:20:42 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 287A222314 for ; Wed, 5 Feb 2020 18:20:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 287A222314 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D0BB36B0008; Wed, 5 Feb 2020 13:20:30 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id C6E2B6B000E; Wed, 5 Feb 2020 13:20:30 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A471D6B0008; Wed, 5 Feb 2020 13:20:30 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0092.hostedemail.com [216.40.44.92]) by kanga.kvack.org (Postfix) with ESMTP id 66E6F6B000D for ; Wed, 5 Feb 2020 13:20:30 -0500 (EST) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id E0578180AD804 for ; Wed, 5 Feb 2020 18:20:29 +0000 (UTC) X-FDA: 76456888578.11.event62_78aff8cf5e54d X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30003:30012:30051:30054:30056:30064:30070:30079:30083,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:17,LUA_SUMMARY:none X-HE-Tag: event62_78aff8cf5e54d X-Filterd-Recvd-Size: 8877 
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:29 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:25 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447751" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:25 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 04/27] x86/cet: Add control-protection fault handler Date: Wed, 5 Feb 2020 10:19:12 -0800 Message-Id: <20200205181935.3712-5-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: A control-protection fault is triggered when a control-flow transfer attempt violates Shadow Stack or Indirect Branch Tracking constraints. For example, the return address for a RET instruction differs from the copy on the Shadow Stack; or an indirect JMP instruction, without the NOTRACK prefix, arrives at a non-ENDBR opcode. The control-protection fault handler works in a similar way as the general protection fault handler. It provides the si_code SEGV_CPERR to the signal handler. v9: - Add Shadow Stack pointer to the fault printout. 
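A user-space program can tell this fault apart from an ordinary SIGSEGV by the new si_code. The sketch below is illustrative only; the SEGV_CPERR value is taken from this patch's siginfo.h change::

  #include <signal.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  #ifndef SEGV_CPERR
  #define SEGV_CPERR 8    /* added to siginfo.h by this patch */
  #endif

  static void segv_handler(int sig, siginfo_t *info, void *ucontext)
  {
          if (info->si_code == SEGV_CPERR) {
                  /* Illustrative only: fprintf() is not async-signal-safe. */
                  fprintf(stderr, "control-protection fault at %p\n", info->si_addr);
          }
          _exit(128 + sig);
  }

  int main(void)
  {
          struct sigaction sa;

          memset(&sa, 0, sizeof(sa));
          sa.sa_sigaction = segv_handler;
          sa.sa_flags = SA_SIGINFO;
          sigemptyset(&sa.sa_mask);
          sigaction(SIGSEGV, &sa, NULL);

          /* ... run code that may violate SHSTK/IBT constraints ... */
          return 0;
  }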
Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook --- arch/x86/entry/entry_64.S | 2 +- arch/x86/include/asm/traps.h | 3 ++ arch/x86/kernel/idt.c | 4 ++ arch/x86/kernel/signal_compat.c | 2 +- arch/x86/kernel/traps.c | 59 ++++++++++++++++++++++++++++++ include/uapi/asm-generic/siginfo.h | 3 +- 6 files changed, 70 insertions(+), 3 deletions(-) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 76942cbd95a1..6ca77312d008 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -1034,7 +1034,7 @@ idtentry spurious_interrupt_bug do_spurious_interrupt_bug has_error_code=0 idtentry coprocessor_error do_coprocessor_error has_error_code=0 idtentry alignment_check do_alignment_check has_error_code=1 idtentry simd_coprocessor_error do_simd_coprocessor_error has_error_code=0 - +idtentry control_protection do_control_protection has_error_code=1 /* * Reload gs selector with exception handling diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index ffa0dc8a535e..7ac26bbd0bef 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -26,6 +26,7 @@ asmlinkage void invalid_TSS(void); asmlinkage void segment_not_present(void); asmlinkage void stack_segment(void); asmlinkage void general_protection(void); +asmlinkage void control_protection(void); asmlinkage void page_fault(void); asmlinkage void async_page_fault(void); asmlinkage void spurious_interrupt_bug(void); @@ -84,6 +85,7 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s); void __init trap_init(void); #endif dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code); +dotraplinkage void do_control_protection(struct pt_regs *regs, long error_code); dotraplinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address); dotraplinkage void do_spurious_interrupt_bug(struct pt_regs *regs, long error_code); dotraplinkage void do_coprocessor_error(struct pt_regs *regs, long error_code); @@ -154,6 +156,7 @@ enum { X86_TRAP_AC, /* 17, Alignment Check */ X86_TRAP_MC, /* 18, Machine Check */ X86_TRAP_XF, /* 19, SIMD Floating-Point Exception */ + X86_TRAP_CP = 21, /* 21 Control Protection Fault */ X86_TRAP_IRET = 32, /* 32, IRET Exception */ }; diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c index 87ef69a72c52..8ed406f469e7 100644 --- a/arch/x86/kernel/idt.c +++ b/arch/x86/kernel/idt.c @@ -102,6 +102,10 @@ static const __initconst struct idt_data def_idts[] = { #elif defined(CONFIG_X86_32) SYSG(IA32_SYSCALL_VECTOR, entry_INT80_32), #endif + +#ifdef CONFIG_X86_64 + INTG(X86_TRAP_CP, control_protection), +#endif }; /* diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c index 9ccbf0576cd0..c572a3de1037 100644 --- a/arch/x86/kernel/signal_compat.c +++ b/arch/x86/kernel/signal_compat.c @@ -27,7 +27,7 @@ static inline void signal_compat_build_tests(void) */ BUILD_BUG_ON(NSIGILL != 11); BUILD_BUG_ON(NSIGFPE != 15); - BUILD_BUG_ON(NSIGSEGV != 7); + BUILD_BUG_ON(NSIGSEGV != 8); BUILD_BUG_ON(NSIGBUS != 5); BUILD_BUG_ON(NSIGTRAP != 5); BUILD_BUG_ON(NSIGCHLD != 6); diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 05da6b5b167b..99c83ee522ed 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -570,6 +570,65 @@ do_general_protection(struct pt_regs *regs, long error_code) } NOKPROBE_SYMBOL(do_general_protection); +static const char * const control_protection_err[] = { + "unknown", + "near-ret", + "far-ret/iret", + "endbranch", + "rstorssp", + "setssbsy", 
+}; + +/* + * When a control protection exception occurs, send a signal + * to the responsible application. Currently, control + * protection is only enabled for the user mode. This + * exception should not come from the kernel mode. + */ +dotraplinkage void +do_control_protection(struct pt_regs *regs, long error_code) +{ + struct task_struct *tsk; + + RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU"); + if (notify_die(DIE_TRAP, "control protection fault", regs, + error_code, X86_TRAP_CP, SIGSEGV) == NOTIFY_STOP) + return; + cond_local_irq_enable(regs); + + if (!user_mode(regs)) + die("kernel control protection fault", regs, error_code); + + if (!static_cpu_has(X86_FEATURE_SHSTK) && + !static_cpu_has(X86_FEATURE_IBT)) + WARN_ONCE(1, "CET is disabled but got control protection fault\n"); + + tsk = current; + tsk->thread.error_code = error_code; + tsk->thread.trap_nr = X86_TRAP_CP; + + if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) && + printk_ratelimit()) { + unsigned int max_err; + unsigned long ssp; + + max_err = ARRAY_SIZE(control_protection_err) - 1; + if ((error_code < 0) || (error_code > max_err)) + error_code = 0; + rdmsrl(MSR_IA32_PL3_SSP, ssp); + pr_info("%s[%d] control protection ip:%lx sp:%lx ssp:%lx error:%lx(%s)", + tsk->comm, task_pid_nr(tsk), + regs->ip, regs->sp, ssp, error_code, + control_protection_err[error_code]); + print_vma_addr(KERN_CONT " in ", regs->ip); + pr_cont("\n"); + } + + force_sig_fault(SIGSEGV, SEGV_CPERR, + (void __user *)uprobe_get_trap_addr(regs)); +} +NOKPROBE_SYMBOL(do_control_protection); + dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code) { #ifdef CONFIG_DYNAMIC_FTRACE diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h index cb3d6c267181..693071dbe641 100644 --- a/include/uapi/asm-generic/siginfo.h +++ b/include/uapi/asm-generic/siginfo.h @@ -229,7 +229,8 @@ typedef struct siginfo { #define SEGV_ACCADI 5 /* ADI not enabled for mapped object */ #define SEGV_ADIDERR 6 /* Disrupting MCD error */ #define SEGV_ADIPERR 7 /* Precise MCD exception */ -#define NSIGSEGV 7 +#define SEGV_CPERR 8 +#define NSIGSEGV 8 /* * SIGBUS si_codes From patchwork Wed Feb 5 18:19:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366913 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CA880921 for ; Wed, 5 Feb 2020 18:20:44 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A064B24658 for ; Wed, 5 Feb 2020 18:20:44 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A064B24658 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 51A0F6B000C; Wed, 5 Feb 2020 13:20:31 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 4A4D26B000D; Wed, 5 Feb 2020 13:20:31 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 391986B000E; Wed, 5 Feb 2020 13:20:31 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com 
(smtprelay0186.hostedemail.com [216.40.44.186]) by kanga.kvack.org (Postfix) with ESMTP id 1D4036B000C for ; Wed, 5 Feb 2020 13:20:31 -0500 (EST) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id B51458248D51 for ; Wed, 5 Feb 2020 18:20:30 +0000 (UTC) X-FDA: 76456888620.24.tray03_78c7c37d3852a X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30003:30054:30055:30056:30064:30070,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:1,LUA_SUMMARY:none X-HE-Tag: tray03_78c7c37d3852a X-Filterd-Recvd-Size: 4549 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf09.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:29 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:25 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447755" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:25 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 05/27] x86/cet/shstk: Add Kconfig option for user-mode Shadow Stack protection Date: Wed, 5 Feb 2020 10:19:13 -0800 Message-Id: <20200205181935.3712-6-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Introduce Kconfig option: X86_INTEL_SHADOW_STACK_USER. Shadow Stack (SHSTK) provides protection against function return address corruption. It is active when the kernel has this feature enabled, and both the processor and the application support it. When this feature is enabled, legacy non-SHSTK applications continue to work, but without SHSTK protection. 
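Whether the processor supports SHSTK (and IBT) can be probed from user space with CPUID, using the bit positions listed in patch 02 of this series (leaf 7, subleaf 0: ECX[7] for SHSTK, EDX[20] for IBT). A rough sketch, with an illustrative helper name::

  #include <cpuid.h>
  #include <stdbool.h>
  #include <stdio.h>

  /* CPUID.(EAX=7,ECX=0): ECX[bit 7] = SHSTK, EDX[bit 20] = IBT */
  static void probe_cet(bool *shstk, bool *ibt)
  {
          unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

          *shstk = *ibt = false;
          if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
                  *shstk = !!(ecx & (1U << 7));
                  *ibt   = !!(edx & (1U << 20));
          }
  }

  int main(void)
  {
          bool shstk, ibt;

          probe_cet(&shstk, &ibt);
          printf("SHSTK: %s, IBT: %s\n", shstk ? "yes" : "no", ibt ? "yes" : "no");
          return 0;
  }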
The user-mode SHSTK protection is only implemented for the 64-bit kernel. IA32 applications are supported under the compatibility mode. Signed-off-by: Yu-cheng Yu --- arch/x86/Kconfig | 22 ++++++++++++++++++++++ arch/x86/Makefile | 7 +++++++ 2 files changed, 29 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5e8949953660..6c34b701c588 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1974,6 +1974,28 @@ config X86_INTEL_TSX_MODE_AUTO side channel attacks- equals the tsx=auto command line parameter. endchoice +config X86_INTEL_CET + def_bool n + +config ARCH_HAS_SHSTK + def_bool n + +config X86_INTEL_SHADOW_STACK_USER + prompt "Intel Shadow Stack for user-mode" + def_bool n + depends on CPU_SUP_INTEL && X86_64 + select ARCH_USES_HIGH_VMA_FLAGS + select X86_INTEL_CET + select ARCH_HAS_SHSTK + ---help--- + Shadow Stack (SHSTK) provides protection against program + stack corruption. It is active when the kernel has this + feature enabled, and the processor and the application + support it. When this feature is enabled, legacy non-SHSTK + applications continue to work, but without SHSTK protection. + + If unsure, say y. + config EFI bool "EFI runtime service support" depends on ACPI diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 94df0868804b..c34f5befa4c8 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -149,6 +149,13 @@ ifdef CONFIG_X86_X32 endif export CONFIG_X86_X32_ABI +# Check assembler Shadow Stack suppot +ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER + ifeq ($(call as-instr, saveprevssp, y),) + $(error CONFIG_X86_INTEL_SHADOW_STACK_USER not supported by the assembler) + endif +endif + # # If the function graph tracer is used with mcount instead of fentry, # '-maccumulate-outgoing-args' is needed to prevent a GCC bug From patchwork Wed Feb 5 18:19:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366917 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C3B86921 for ; Wed, 5 Feb 2020 18:20:49 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 99B2321927 for ; Wed, 5 Feb 2020 18:20:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 99B2321927 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B16206B000D; Wed, 5 Feb 2020 13:20:31 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id ABE396B0037; Wed, 5 Feb 2020 13:20:31 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7FD206B000E; Wed, 5 Feb 2020 13:20:31 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0069.hostedemail.com [216.40.44.69]) by kanga.kvack.org (Postfix) with ESMTP id 607466B0032 for ; Wed, 5 Feb 2020 13:20:31 -0500 (EST) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id DD8E54404 for ; Wed, 5 Feb 2020 18:20:30 +0000 (UTC) X-FDA: 76456888620.05.tree38_78d0e2f65f810 X-Spam-Summary: 
1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30054:30056:30064,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:35,LUA_SUMMARY:none X-HE-Tag: tree38_78d0e2f65f810 X-Filterd-Recvd-Size: 4797 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf16.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:30 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:25 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447762" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:25 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 06/27] mm: Introduce VM_SHSTK for Shadow Stack memory Date: Wed, 5 Feb 2020 10:19:14 -0800 Message-Id: <20200205181935.3712-7-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: A Shadow Stack (SHSTK) PTE must be read-only and have _PAGE_DIRTY set. However, read-only and Dirty PTEs also exist for copy-on-write (COW) pages. These two cases are handled differently for page faults and a new VM flag is necessary for tracking SHSTK VMAs. v9: - Add VM_SHSTK case to arch_vma_name(). - Revise the commit log to explain why a new VM flag is needed. 
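One quick way to observe the effect of this patch from user space is to look for the new VMA name in /proc/<pid>/maps (the "ss" VmFlags bit added to smaps works as well). A minimal sketch, assuming a CET-enabled kernel and application::

  #include <stdio.h>
  #include <string.h>

  /* Print any mapping that arch_vma_name() reports as "[shadow stack]". */
  int main(void)
  {
          char line[512];
          FILE *f = fopen("/proc/self/maps", "r");

          if (!f)
                  return 1;
          while (fgets(line, sizeof(line), f)) {
                  if (strstr(line, "[shadow stack]"))
                          fputs(line, stdout);
          }
          fclose(f);
          return 0;
  }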
Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook --- arch/x86/mm/mmap.c | 2 ++ fs/proc/task_mmu.c | 3 +++ include/linux/mm.h | 8 ++++++++ 3 files changed, 13 insertions(+) diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c index aae9a933dfd4..482813b4c659 100644 --- a/arch/x86/mm/mmap.c +++ b/arch/x86/mm/mmap.c @@ -165,6 +165,8 @@ const char *arch_vma_name(struct vm_area_struct *vma) { if (vma->vm_flags & VM_MPX) return "[mpx]"; + else if (vma->vm_flags & VM_SHSTK) + return "[shadow stack]"; return NULL; } diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 9442631fd4af..590b58ee008a 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -687,6 +687,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) [ilog2(VM_PKEY_BIT4)] = "", #endif #endif /* CONFIG_ARCH_HAS_PKEYS */ +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER + [ilog2(VM_SHSTK)] = "ss", +#endif }; size_t i; diff --git a/include/linux/mm.h b/include/linux/mm.h index cfaa8feecfe8..b5145fbe102e 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -298,11 +298,13 @@ extern unsigned int kobjsize(const void *objp); #define VM_HIGH_ARCH_BIT_2 34 /* bit only usable on 64-bit architectures */ #define VM_HIGH_ARCH_BIT_3 35 /* bit only usable on 64-bit architectures */ #define VM_HIGH_ARCH_BIT_4 36 /* bit only usable on 64-bit architectures */ +#define VM_HIGH_ARCH_BIT_5 37 /* bit only usable on 64-bit architectures */ #define VM_HIGH_ARCH_0 BIT(VM_HIGH_ARCH_BIT_0) #define VM_HIGH_ARCH_1 BIT(VM_HIGH_ARCH_BIT_1) #define VM_HIGH_ARCH_2 BIT(VM_HIGH_ARCH_BIT_2) #define VM_HIGH_ARCH_3 BIT(VM_HIGH_ARCH_BIT_3) #define VM_HIGH_ARCH_4 BIT(VM_HIGH_ARCH_BIT_4) +#define VM_HIGH_ARCH_5 BIT(VM_HIGH_ARCH_BIT_5) #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */ #ifdef CONFIG_ARCH_HAS_PKEYS @@ -340,6 +342,12 @@ extern unsigned int kobjsize(const void *objp); # define VM_MPX VM_NONE #endif +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER +# define VM_SHSTK VM_HIGH_ARCH_5 +#else +# define VM_SHSTK VM_NONE +#endif + #ifndef VM_GROWSUP # define VM_GROWSUP VM_NONE #endif From patchwork Wed Feb 5 18:19:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366915 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 62CFA1820 for ; Wed, 5 Feb 2020 18:20:47 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2EFF921927 for ; Wed, 5 Feb 2020 18:20:47 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2EFF921927 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 8EA1B6B0032; Wed, 5 Feb 2020 13:20:31 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 89BF46B000D; Wed, 5 Feb 2020 13:20:31 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7767B6B0036; Wed, 5 Feb 2020 13:20:31 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0023.hostedemail.com [216.40.44.23]) by kanga.kvack.org (Postfix) with ESMTP id 51DA96B000E for ; Wed, 5 Feb 
2020 13:20:31 -0500 (EST) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id E87F6181AC9CC for ; Wed, 5 Feb 2020 18:20:30 +0000 (UTC) X-FDA: 76456888620.14.jeans93_78d46d72edf51 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30054:30056:30064,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:26,LUA_SUMMARY:none X-HE-Tag: jeans93_78d46d72edf51 X-Filterd-Recvd-Size: 4116 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:30 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:26 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447765" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:25 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 07/27] Add guard pages around a Shadow Stack. Date: Wed, 5 Feb 2020 10:19:15 -0800 Message-Id: <20200205181935.3712-8-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: INCSSPD/INCSSPQ instruction is used to unwind a Shadow Stack (SHSTK). It performs 'pop and discard' of the first and last element from SHSTK in the range specified in the operand. The maximum value of the operand is 255, and the maximum moving distance of the SHSTK pointer is 255 * 4 for INCSSPD, 255 * 8 for INCSSPQ. Since SHSTK has a fixed size, creating a guard page above prevents INCSSP/RET from moving beyond. Likewise, creating a guard page below prevents CALL from underflowing the SHSTK. 
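The arithmetic behind a single guard page can be checked directly. The sketch below is illustrative only and is not part of the patch: the displacement limits restate the figures quoted above, the 4 KiB size assumes the usual x86 base page, and the helper name is made up for this example.

#include <stdbool.h>

#define GUARD_PAGE_SIZE   4096UL        /* one x86 base page */
#define INCSSPD_MAX_BYTES (255UL * 4)   /* largest single INCSSPD move: 1020 bytes */
#define INCSSPQ_MAX_BYTES (255UL * 8)   /* largest single INCSSPQ move: 2040 bytes */

/*
 * No single INCSSP can step over a whole guard page, so any attempt to
 * move the shadow stack pointer off the end of the shadow stack touches
 * the guard page and faults.
 */
static bool single_guard_page_is_enough(void)
{
        return INCSSPD_MAX_BYTES < GUARD_PAGE_SIZE &&
               INCSSPQ_MAX_BYTES < GUARD_PAGE_SIZE;
}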
Signed-off-by: Yu-cheng Yu --- include/linux/mm.h | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index b5145fbe102e..75de07674649 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2464,9 +2464,15 @@ static inline struct vm_area_struct * find_vma_intersection(struct mm_struct * m static inline unsigned long vm_start_gap(struct vm_area_struct *vma) { unsigned long vm_start = vma->vm_start; + unsigned long gap = 0; - if (vma->vm_flags & VM_GROWSDOWN) { - vm_start -= stack_guard_gap; + if (vma->vm_flags & VM_GROWSDOWN) + gap = stack_guard_gap; + else if (vma->vm_flags & VM_SHSTK) + gap = PAGE_SIZE; + + if (gap != 0) { + vm_start -= gap; if (vm_start > vma->vm_start) vm_start = 0; } @@ -2476,9 +2482,15 @@ static inline unsigned long vm_start_gap(struct vm_area_struct *vma) static inline unsigned long vm_end_gap(struct vm_area_struct *vma) { unsigned long vm_end = vma->vm_end; + unsigned long gap = 0; + + if (vma->vm_flags & VM_GROWSUP) + gap = stack_guard_gap; + else if (vma->vm_flags & VM_SHSTK) + gap = PAGE_SIZE; - if (vma->vm_flags & VM_GROWSUP) { - vm_end += stack_guard_gap; + if (gap != 0) { + vm_end += gap; if (vm_end < vma->vm_end) vm_end = -PAGE_SIZE; } From patchwork Wed Feb 5 18:19:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366919 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 852C017E0 for ; Wed, 5 Feb 2020 18:20:52 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4E558218AC for ; Wed, 5 Feb 2020 18:20:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4E558218AC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 73D716B0036; Wed, 5 Feb 2020 13:20:32 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 6EFD76B0037; Wed, 5 Feb 2020 13:20:32 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5B6DE6B006C; Wed, 5 Feb 2020 13:20:32 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0107.hostedemail.com [216.40.44.107]) by kanga.kvack.org (Postfix) with ESMTP id 351206B0036 for ; Wed, 5 Feb 2020 13:20:32 -0500 (EST) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id BD3A18248D51 for ; Wed, 5 Feb 2020 18:20:31 +0000 (UTC) X-FDA: 76456888662.14.cats28_78ebd88c9bd2e X-Spam-Summary: 
1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30054:30056:30064:30070,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:25,LUA_SUMMARY:none X-HE-Tag: cats28_78ebd88c9bd2e X-Filterd-Recvd-Size: 9778 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf09.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:30 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:26 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447771" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:26 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 08/27] x86/mm: Change _PAGE_DIRTY to _PAGE_DIRTY_HW Date: Wed, 5 Feb 2020 10:19:16 -0800 Message-Id: <20200205181935.3712-9-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Before introducing _PAGE_DIRTY_SW for non-hardware memory management purposes in the next patch, rename _PAGE_DIRTY to _PAGE_DIRTY_HW and _PAGE_BIT_DIRTY to _PAGE_BIT_DIRTY_HW to make these PTE dirty bits clearer. There are no functional changes from this patch. v9: - In some places, _PAGE_DIRTY was not changed to _PAGE_DIRTY_HW, because they will be changed again in the next patch to _PAGE_DIRTY_BITS. However, this causes compile issues if the next patch is not yet applied. Fix it by changing all _PAGE_DIRTY to _PAGE_DIRTY_HW.
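As a reminder that this is a pure rename (the snippet below is illustrative and not part of the patch), the hardware D bit stays at PTE bit 6 under its new name:

#define _PAGE_BIT_DIRTY_HW 6                          /* formerly _PAGE_BIT_DIRTY */
#define _PAGE_DIRTY_HW (1ULL << _PAGE_BIT_DIRTY_HW)   /* formerly _PAGE_DIRTY */

_Static_assert(_PAGE_DIRTY_HW == (1ULL << 6),
               "the hardware dirty bit is still bit 6 of the PTE");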
Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook Reviewed-by: Dave Hansen --- arch/x86/include/asm/pgtable.h | 18 +++++++++--------- arch/x86/include/asm/pgtable_types.h | 17 +++++++++-------- arch/x86/kernel/relocate_kernel_64.S | 2 +- arch/x86/kvm/vmx/vmx.c | 2 +- 4 files changed, 20 insertions(+), 19 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index ad97dc155195..ab50d25f9afc 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -122,7 +122,7 @@ extern pmdval_t early_pmd_flags; */ static inline int pte_dirty(pte_t pte) { - return pte_flags(pte) & _PAGE_DIRTY; + return pte_flags(pte) & _PAGE_DIRTY_HW; } @@ -161,7 +161,7 @@ static inline int pte_young(pte_t pte) static inline int pmd_dirty(pmd_t pmd) { - return pmd_flags(pmd) & _PAGE_DIRTY; + return pmd_flags(pmd) & _PAGE_DIRTY_HW; } static inline int pmd_young(pmd_t pmd) @@ -171,7 +171,7 @@ static inline int pmd_young(pmd_t pmd) static inline int pud_dirty(pud_t pud) { - return pud_flags(pud) & _PAGE_DIRTY; + return pud_flags(pud) & _PAGE_DIRTY_HW; } static inline int pud_young(pud_t pud) @@ -312,7 +312,7 @@ static inline pte_t pte_clear_flags(pte_t pte, pteval_t clear) static inline pte_t pte_mkclean(pte_t pte) { - return pte_clear_flags(pte, _PAGE_DIRTY); + return pte_clear_flags(pte, _PAGE_DIRTY_HW); } static inline pte_t pte_mkold(pte_t pte) @@ -332,7 +332,7 @@ static inline pte_t pte_mkexec(pte_t pte) static inline pte_t pte_mkdirty(pte_t pte) { - return pte_set_flags(pte, _PAGE_DIRTY | _PAGE_SOFT_DIRTY); + return pte_set_flags(pte, _PAGE_DIRTY_HW | _PAGE_SOFT_DIRTY); } static inline pte_t pte_mkyoung(pte_t pte) @@ -396,7 +396,7 @@ static inline pmd_t pmd_mkold(pmd_t pmd) static inline pmd_t pmd_mkclean(pmd_t pmd) { - return pmd_clear_flags(pmd, _PAGE_DIRTY); + return pmd_clear_flags(pmd, _PAGE_DIRTY_HW); } static inline pmd_t pmd_wrprotect(pmd_t pmd) @@ -406,7 +406,7 @@ static inline pmd_t pmd_wrprotect(pmd_t pmd) static inline pmd_t pmd_mkdirty(pmd_t pmd) { - return pmd_set_flags(pmd, _PAGE_DIRTY | _PAGE_SOFT_DIRTY); + return pmd_set_flags(pmd, _PAGE_DIRTY_HW | _PAGE_SOFT_DIRTY); } static inline pmd_t pmd_mkdevmap(pmd_t pmd) @@ -450,7 +450,7 @@ static inline pud_t pud_mkold(pud_t pud) static inline pud_t pud_mkclean(pud_t pud) { - return pud_clear_flags(pud, _PAGE_DIRTY); + return pud_clear_flags(pud, _PAGE_DIRTY_HW); } static inline pud_t pud_wrprotect(pud_t pud) @@ -460,7 +460,7 @@ static inline pud_t pud_wrprotect(pud_t pud) static inline pud_t pud_mkdirty(pud_t pud) { - return pud_set_flags(pud, _PAGE_DIRTY | _PAGE_SOFT_DIRTY); + return pud_set_flags(pud, _PAGE_DIRTY_HW | _PAGE_SOFT_DIRTY); } static inline pud_t pud_mkdevmap(pud_t pud) diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index b5e49e6bac63..e647e3c75578 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -15,7 +15,7 @@ #define _PAGE_BIT_PWT 3 /* page write through */ #define _PAGE_BIT_PCD 4 /* page cache disabled */ #define _PAGE_BIT_ACCESSED 5 /* was accessed (raised by CPU) */ -#define _PAGE_BIT_DIRTY 6 /* was written to (raised by CPU) */ +#define _PAGE_BIT_DIRTY_HW 6 /* was written to (raised by CPU) */ #define _PAGE_BIT_PSE 7 /* 4 MB (or 2MB) page */ #define _PAGE_BIT_PAT 7 /* on 4KB pages */ #define _PAGE_BIT_GLOBAL 8 /* Global TLB entry PPro+ */ @@ -45,7 +45,7 @@ #define _PAGE_PWT (_AT(pteval_t, 1) << _PAGE_BIT_PWT) #define _PAGE_PCD (_AT(pteval_t, 1) << _PAGE_BIT_PCD) #define _PAGE_ACCESSED 
(_AT(pteval_t, 1) << _PAGE_BIT_ACCESSED) -#define _PAGE_DIRTY (_AT(pteval_t, 1) << _PAGE_BIT_DIRTY) +#define _PAGE_DIRTY_HW (_AT(pteval_t, 1) << _PAGE_BIT_DIRTY_HW) #define _PAGE_PSE (_AT(pteval_t, 1) << _PAGE_BIT_PSE) #define _PAGE_GLOBAL (_AT(pteval_t, 1) << _PAGE_BIT_GLOBAL) #define _PAGE_SOFTW1 (_AT(pteval_t, 1) << _PAGE_BIT_SOFTW1) @@ -73,7 +73,7 @@ _PAGE_PKEY_BIT3) #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE) -#define _PAGE_KNL_ERRATUM_MASK (_PAGE_DIRTY | _PAGE_ACCESSED) +#define _PAGE_KNL_ERRATUM_MASK (_PAGE_DIRTY_HW | _PAGE_ACCESSED) #else #define _PAGE_KNL_ERRATUM_MASK 0 #endif @@ -111,9 +111,9 @@ #define _PAGE_PROTNONE (_AT(pteval_t, 1) << _PAGE_BIT_PROTNONE) #define _PAGE_TABLE_NOENC (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |\ - _PAGE_ACCESSED | _PAGE_DIRTY) + _PAGE_ACCESSED | _PAGE_DIRTY_HW) #define _KERNPG_TABLE_NOENC (_PAGE_PRESENT | _PAGE_RW | \ - _PAGE_ACCESSED | _PAGE_DIRTY) + _PAGE_ACCESSED | _PAGE_DIRTY_HW) /* * Set of bits not changed in pte_modify. The pte's @@ -122,7 +122,7 @@ * pte_modify() does modify it. */ #define _PAGE_CHG_MASK (PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT | \ - _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY | \ + _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY_HW | \ _PAGE_SOFT_DIRTY | _PAGE_DEVMAP) #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE) @@ -167,7 +167,8 @@ enum page_cache_mode { _PAGE_ACCESSED) #define __PAGE_KERNEL_EXEC \ - (_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_GLOBAL) + (_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY_HW | _PAGE_ACCESSED | \ + _PAGE_GLOBAL) #define __PAGE_KERNEL (__PAGE_KERNEL_EXEC | _PAGE_NX) #define __PAGE_KERNEL_RO (__PAGE_KERNEL & ~_PAGE_RW) @@ -186,7 +187,7 @@ enum page_cache_mode { #define _PAGE_ENC (_AT(pteval_t, sme_me_mask)) #define _KERNPG_TABLE (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | \ - _PAGE_DIRTY | _PAGE_ENC) + _PAGE_DIRTY_HW | _PAGE_ENC) #define _PAGE_TABLE (_KERNPG_TABLE | _PAGE_USER) #define __PAGE_KERNEL_ENC (__PAGE_KERNEL | _PAGE_ENC) diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S index ef3ba99068d3..3acd75f97b61 100644 --- a/arch/x86/kernel/relocate_kernel_64.S +++ b/arch/x86/kernel/relocate_kernel_64.S @@ -15,7 +15,7 @@ */ #define PTR(x) (x << 3) -#define PAGE_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY) +#define PAGE_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY_HW) /* * control_page + KEXEC_CONTROL_CODE_MAX_SIZE diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index e3394c839dea..fbbbf621b0d9 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3503,7 +3503,7 @@ static int init_rmode_identity_map(struct kvm *kvm) /* Set up identity-mapping pagetable for EPT in real mode */ for (i = 0; i < PT32_ENT_PER_PAGE; i++) { tmp = (i << 22) + (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | - _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE); + _PAGE_ACCESSED | _PAGE_DIRTY_HW | _PAGE_PSE); r = kvm_write_guest_page(kvm, identity_map_pfn, &tmp, i * sizeof(tmp), sizeof(tmp)); if (r < 0) From patchwork Wed Feb 5 18:19:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366923 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1EF0F921 for ; Wed, 5 Feb 2020 18:20:58 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 
CE787217F4 for ; Wed, 5 Feb 2020 18:20:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CE787217F4 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 2881A6B0072; Wed, 5 Feb 2020 13:20:33 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 1E9826B0070; Wed, 5 Feb 2020 13:20:33 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E1D446B0071; Wed, 5 Feb 2020 13:20:32 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0077.hostedemail.com [216.40.44.77]) by kanga.kvack.org (Postfix) with ESMTP id BF68F6B006E for ; Wed, 5 Feb 2020 13:20:32 -0500 (EST) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 58EE62C2E for ; Wed, 5 Feb 2020 18:20:32 +0000 (UTC) X-FDA: 76456888704.14.heat93_78f7e3a535215 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:4423:30051:30054:30056:30064:30070,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:1,LUA_SUMMARY:none X-HE-Tag: heat93_78f7e3a535215 X-Filterd-Recvd-Size: 12225 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf16.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:31 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:26 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447778" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:26 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. 
Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 09/27] x86/mm: Introduce _PAGE_DIRTY_SW Date: Wed, 5 Feb 2020 10:19:17 -0800 Message-Id: <20200205181935.3712-10-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When Shadow Stack (SHSTK) is introduced, a R/O and Dirty PTE exists in the following cases: (a) A modified, copy-on-write (COW) page; (b) A R/O page that has been COW'ed; (c) A SHSTK page. To separate non-SHSTK memory from SHSTK, introduce a spare bit of the 64-bit PTE as _PAGE_BIT_DIRTY_SW and use that for case (a) and (b). This results in the following possible settings: Modified PTE: (R/W + DIRTY_HW) Modified and COW PTE: (R/O + DIRTY_SW) R/O PTE COW'ed: (R/O + DIRTY_SW) SHSTK PTE: (R/O + DIRTY_HW) SHSTK shared PTE[1]: (R/O + DIRTY_SW) SHSTK PTE COW'ed: (R/O + DIRTY_HW) [1] When a SHSTK page is being shared among threads, its PTE is cleared of _PAGE_DIRTY_HW, so the next SHSTK access causes a fault, and the page is duplicated and _PAGE_DIRTY_HW is set again. With this, in pte_wrprotect(), if SHSTK is active, use _PAGE_DIRTY_SW for the Dirty bit, and in pte_mkwrite() use _PAGE_DIRTY_HW. The same changes apply to pmd and pud. When this patch is applied, there are six free bits left in the 64-bit PTE. There are no more free bits in the 32-bit PTE (except for PAE) and SHSTK is not implemented for the 32-bit kernel. v9: - Remove pte_move_flags() etc. and put the logic directly in pte_wrprotect()/pte_mkwrite() etc. - Change compile-time conditionals to run-time checks. - Split out pte_modify()/pmd_modify() to a new patch. - Update comments. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook --- arch/x86/include/asm/pgtable.h | 111 ++++++++++++++++++++++++--- arch/x86/include/asm/pgtable_types.h | 31 +++++++- 2 files changed, 131 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index ab50d25f9afc..62aeb118bc36 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -120,9 +120,9 @@ extern pmdval_t early_pmd_flags; * The following only work if pte_present() is true. * Undefined behaviour if not.. 
*/ -static inline int pte_dirty(pte_t pte) +static inline bool pte_dirty(pte_t pte) { - return pte_flags(pte) & _PAGE_DIRTY_HW; + return pte_flags(pte) & _PAGE_DIRTY_BITS; } @@ -159,9 +159,9 @@ static inline int pte_young(pte_t pte) return pte_flags(pte) & _PAGE_ACCESSED; } -static inline int pmd_dirty(pmd_t pmd) +static inline bool pmd_dirty(pmd_t pmd) { - return pmd_flags(pmd) & _PAGE_DIRTY_HW; + return pmd_flags(pmd) & _PAGE_DIRTY_BITS; } static inline int pmd_young(pmd_t pmd) @@ -169,9 +169,9 @@ static inline int pmd_young(pmd_t pmd) return pmd_flags(pmd) & _PAGE_ACCESSED; } -static inline int pud_dirty(pud_t pud) +static inline bool pud_dirty(pud_t pud) { - return pud_flags(pud) & _PAGE_DIRTY_HW; + return pud_flags(pud) & _PAGE_DIRTY_BITS; } static inline int pud_young(pud_t pud) @@ -312,7 +312,7 @@ static inline pte_t pte_clear_flags(pte_t pte, pteval_t clear) static inline pte_t pte_mkclean(pte_t pte) { - return pte_clear_flags(pte, _PAGE_DIRTY_HW); + return pte_clear_flags(pte, _PAGE_DIRTY_BITS); } static inline pte_t pte_mkold(pte_t pte) @@ -322,6 +322,17 @@ static inline pte_t pte_mkold(pte_t pte) static inline pte_t pte_wrprotect(pte_t pte) { + /* + * Use _PAGE_DIRTY_SW on a R/O PTE to set it apart from + * a Shadow Stack PTE, which is R/O + _PAGE_DIRTY_HW. + */ + if (static_cpu_has(X86_FEATURE_SHSTK)) { + if (pte_flags(pte) & _PAGE_DIRTY_HW) { + pte = pte_clear_flags(pte, _PAGE_DIRTY_HW); + pte = pte_set_flags(pte, _PAGE_DIRTY_SW); + } + } + return pte_clear_flags(pte, _PAGE_RW); } @@ -332,9 +343,25 @@ static inline pte_t pte_mkexec(pte_t pte) static inline pte_t pte_mkdirty(pte_t pte) { + pteval_t dirty = _PAGE_DIRTY_HW; + + if (static_cpu_has(X86_FEATURE_SHSTK) && !pte_write(pte)) + dirty = _PAGE_DIRTY_SW; + + return pte_set_flags(pte, dirty | _PAGE_SOFT_DIRTY); +} + +static inline pte_t pte_mkdirty_shstk(pte_t pte) +{ + pte = pte_clear_flags(pte, _PAGE_DIRTY_SW); return pte_set_flags(pte, _PAGE_DIRTY_HW | _PAGE_SOFT_DIRTY); } +static inline bool pte_dirty_hw(pte_t pte) +{ + return pte_flags(pte) & _PAGE_DIRTY_HW; +} + static inline pte_t pte_mkyoung(pte_t pte) { return pte_set_flags(pte, _PAGE_ACCESSED); @@ -342,6 +369,13 @@ static inline pte_t pte_mkyoung(pte_t pte) static inline pte_t pte_mkwrite(pte_t pte) { + if (static_cpu_has(X86_FEATURE_SHSTK)) { + if (pte_flags(pte) & _PAGE_DIRTY_SW) { + pte = pte_clear_flags(pte, _PAGE_DIRTY_SW); + pte = pte_set_flags(pte, _PAGE_DIRTY_HW); + } + } + return pte_set_flags(pte, _PAGE_RW); } @@ -396,19 +430,46 @@ static inline pmd_t pmd_mkold(pmd_t pmd) static inline pmd_t pmd_mkclean(pmd_t pmd) { - return pmd_clear_flags(pmd, _PAGE_DIRTY_HW); + return pmd_clear_flags(pmd, _PAGE_DIRTY_BITS); } static inline pmd_t pmd_wrprotect(pmd_t pmd) { + /* + * Use _PAGE_DIRTY_SW on a R/O PMD to set it apart from + * a Shadow Stack PTE, which is R/O + _PAGE_DIRTY_HW. 
+ */ + if (static_cpu_has(X86_FEATURE_SHSTK)) { + if (pmd_flags(pmd) & _PAGE_DIRTY_HW) { + pmd = pmd_clear_flags(pmd, _PAGE_DIRTY_HW); + pmd = pmd_set_flags(pmd, _PAGE_DIRTY_SW); + } + } + return pmd_clear_flags(pmd, _PAGE_RW); } static inline pmd_t pmd_mkdirty(pmd_t pmd) { + pmdval_t dirty = _PAGE_DIRTY_HW; + + if (static_cpu_has(X86_FEATURE_SHSTK) && !(pmd_flags(pmd) & _PAGE_RW)) + dirty = _PAGE_DIRTY_SW; + + return pmd_set_flags(pmd, dirty | _PAGE_SOFT_DIRTY); +} + +static inline pmd_t pmd_mkdirty_shstk(pmd_t pmd) +{ + pmd = pmd_clear_flags(pmd, _PAGE_DIRTY_SW); return pmd_set_flags(pmd, _PAGE_DIRTY_HW | _PAGE_SOFT_DIRTY); } +static inline bool pmd_dirty_hw(pmd_t pmd) +{ + return pmd_flags(pmd) & _PAGE_DIRTY_HW; +} + static inline pmd_t pmd_mkdevmap(pmd_t pmd) { return pmd_set_flags(pmd, _PAGE_DEVMAP); @@ -426,6 +487,13 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd) static inline pmd_t pmd_mkwrite(pmd_t pmd) { + if (static_cpu_has(X86_FEATURE_SHSTK)) { + if (pmd_flags(pmd) & _PAGE_DIRTY_SW) { + pmd = pmd_clear_flags(pmd, _PAGE_DIRTY_SW); + pmd = pmd_set_flags(pmd, _PAGE_DIRTY_HW); + } + } + return pmd_set_flags(pmd, _PAGE_RW); } @@ -450,17 +518,33 @@ static inline pud_t pud_mkold(pud_t pud) static inline pud_t pud_mkclean(pud_t pud) { - return pud_clear_flags(pud, _PAGE_DIRTY_HW); + return pud_clear_flags(pud, _PAGE_DIRTY_BITS); } static inline pud_t pud_wrprotect(pud_t pud) { + /* + * Use _PAGE_DIRTY_SW on a R/O PUD to set it apart from + * a Shadow Stack PTE, which is R/O + _PAGE_DIRTY_HW. + */ + if (static_cpu_has(X86_FEATURE_SHSTK)) { + if (pud_flags(pud) & _PAGE_DIRTY_HW) { + pud = pud_clear_flags(pud, _PAGE_DIRTY_HW); + pud = pud_set_flags(pud, _PAGE_DIRTY_SW); + } + } + return pud_clear_flags(pud, _PAGE_RW); } static inline pud_t pud_mkdirty(pud_t pud) { - return pud_set_flags(pud, _PAGE_DIRTY_HW | _PAGE_SOFT_DIRTY); + pudval_t dirty = _PAGE_DIRTY_HW; + + if (static_cpu_has(X86_FEATURE_SHSTK) && !(pud_flags(pud) & _PAGE_RW)) + dirty = _PAGE_DIRTY_SW; + + return pud_set_flags(pud, dirty | _PAGE_SOFT_DIRTY); } static inline pud_t pud_mkdevmap(pud_t pud) @@ -480,6 +564,13 @@ static inline pud_t pud_mkyoung(pud_t pud) static inline pud_t pud_mkwrite(pud_t pud) { + if (static_cpu_has(X86_FEATURE_SHSTK)) { + if (pud_flags(pud) & _PAGE_DIRTY_SW) { + pud = pud_clear_flags(pud, _PAGE_DIRTY_SW); + pud = pud_set_flags(pud, _PAGE_DIRTY_HW); + } + } + return pud_set_flags(pud, _PAGE_RW); } diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index e647e3c75578..826823df917f 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -23,7 +23,8 @@ #define _PAGE_BIT_SOFTW2 10 /* " */ #define _PAGE_BIT_SOFTW3 11 /* " */ #define _PAGE_BIT_PAT_LARGE 12 /* On 2MB or 1GB pages */ -#define _PAGE_BIT_SOFTW4 58 /* available for programmer */ +#define _PAGE_BIT_SOFTW4 57 /* available for programmer */ +#define _PAGE_BIT_SOFTW5 58 /* available for programmer */ #define _PAGE_BIT_PKEY_BIT0 59 /* Protection Keys, bit 1/4 */ #define _PAGE_BIT_PKEY_BIT1 60 /* Protection Keys, bit 2/4 */ #define _PAGE_BIT_PKEY_BIT2 61 /* Protection Keys, bit 3/4 */ @@ -35,6 +36,12 @@ #define _PAGE_BIT_SOFT_DIRTY _PAGE_BIT_SOFTW3 /* software dirty tracking */ #define _PAGE_BIT_DEVMAP _PAGE_BIT_SOFTW4 +/* + * This bit indicates a copy-on-write page, and is different from + * _PAGE_BIT_SOFT_DIRTY, which tracks which pages a task writes to. 
+ */ +#define _PAGE_BIT_DIRTY_SW _PAGE_BIT_SOFTW5 /* was written to */ + /* If _PAGE_BIT_PRESENT is clear, we use these: */ /* - if the user mapped it with PROT_NONE; pte_present gives true */ #define _PAGE_BIT_PROTNONE _PAGE_BIT_GLOBAL @@ -108,6 +115,28 @@ #define _PAGE_DEVMAP (_AT(pteval_t, 0)) #endif +/* A R/O and dirty PTE exists in the following cases: + * (a) A modified, copy-on-write (COW) page; + * (b) A R/O page that has been COW'ed; + * (c) A SHSTK page. + * _PAGE_DIRTY_SW is used to separate case (c) from others. + * This results in the following settings: + * + * Modified PTE: (R/W + DIRTY_HW) + * Modified and COW PTE: (R/O + DIRTY_SW) + * R/O PTE COW'ed: (R/O + DIRTY_SW) + * SHSTK PTE: (R/O + DIRTY_HW) + * SHSTK PTE COW'ed: (R/O + DIRTY_HW) + * SHSTK PTE being shared among threads: (R/O + DIRTY_SW) + */ +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER +#define _PAGE_DIRTY_SW (_AT(pteval_t, 1) << _PAGE_BIT_DIRTY_SW) +#else +#define _PAGE_DIRTY_SW (_AT(pteval_t, 0)) +#endif + +#define _PAGE_DIRTY_BITS (_PAGE_DIRTY_HW | _PAGE_DIRTY_SW) + #define _PAGE_PROTNONE (_AT(pteval_t, 1) << _PAGE_BIT_PROTNONE) #define _PAGE_TABLE_NOENC (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |\ From patchwork Wed Feb 5 18:19:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366921 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 223DA17E0 for ; Wed, 5 Feb 2020 18:20:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id ECA13217F4 for ; Wed, 5 Feb 2020 18:20:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org ECA13217F4 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id BAA336B0037; Wed, 5 Feb 2020 13:20:32 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id B58DF6B006C; Wed, 5 Feb 2020 13:20:32 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A70226B006E; Wed, 5 Feb 2020 13:20:32 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 8A9916B006C for ; Wed, 5 Feb 2020 13:20:32 -0500 (EST) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 2CA09180AD804 for ; Wed, 5 Feb 2020 18:20:32 +0000 (UTC) X-FDA: 76456888704.16.alarm73_78f927ec64454 X-Spam-Summary: 
1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30054:30056:30064:30070,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:7,LUA_SUMMARY:none X-HE-Tag: alarm73_78f927ec64454 X-Filterd-Recvd-Size: 4483 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:31 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:26 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447781" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:26 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 10/27] x86/mm: Update pte_modify, pmd_modify, and _PAGE_CHG_MASK for _PAGE_DIRTY_SW Date: Wed, 5 Feb 2020 10:19:18 -0800 Message-Id: <20200205181935.3712-11-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: After the introduction of _PAGE_DIRTY_SW, pte_modify() and pmd_modify() need to set the Dirty bit accordingly: if Shadow Stack is enabled and _PAGE_RW is cleared, use _PAGE_DIRTY_SW; otherwise _PAGE_DIRTY_HW. Since the Dirty bit is modified by pte_modify(), remove _PAGE_DIRTY_HW from _PAGE_CHG_MASK.
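The rule above can be read as a small predicate. The sketch below is a simplified restatement, not the kernel code: pteval_t is reduced to a plain integer, the bit positions are the ones defined earlier in this series (RW is bit 1, the hardware D bit is bit 6, _PAGE_BIT_DIRTY_SW is software bit 58), and the shstk_enabled flag stands in for the static_cpu_has(X86_FEATURE_SHSTK) check.

#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pteval_t;

#define _PAGE_RW        (1ULL << 1)
#define _PAGE_DIRTY_HW  (1ULL << 6)    /* hardware D bit */
#define _PAGE_DIRTY_SW  (1ULL << 58)   /* software dirty bit introduced earlier */

/*
 * R/O + DIRTY_HW is reserved for shadow stack PTEs, so when pte_modify()
 * re-applies dirtiness to a PTE whose new protection is read-only, it
 * must use DIRTY_SW instead of the hardware bit.
 */
static pteval_t dirty_bit_after_modify(pteval_t newval, bool shstk_enabled)
{
        if (shstk_enabled && !(newval & _PAGE_RW))
                return _PAGE_DIRTY_SW;
        return _PAGE_DIRTY_HW;
}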
Signed-off-by: Yu-cheng Yu --- arch/x86/include/asm/pgtable.h | 16 ++++++++++++++++ arch/x86/include/asm/pgtable_types.h | 4 ++-- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 62aeb118bc36..2733e7ec16b3 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -702,6 +702,14 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) val &= _PAGE_CHG_MASK; val |= check_pgprot(newprot) & ~_PAGE_CHG_MASK; val = flip_protnone_guard(oldval, val, PTE_PFN_MASK); + + if (pte_dirty(pte)) { + if (static_cpu_has(X86_FEATURE_SHSTK) && !(val & _PAGE_RW)) + val |= _PAGE_DIRTY_SW; + else + val |= _PAGE_DIRTY_HW; + } + return __pte(val); } @@ -712,6 +720,14 @@ static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) val &= _HPAGE_CHG_MASK; val |= check_pgprot(newprot) & ~_HPAGE_CHG_MASK; val = flip_protnone_guard(oldval, val, PHYSICAL_PMD_PAGE_MASK); + + if (pmd_dirty(pmd)) { + if (static_cpu_has(X86_FEATURE_SHSTK) && !(val & _PAGE_RW)) + val |= _PAGE_DIRTY_SW; + else + val |= _PAGE_DIRTY_HW; + } + return __pmd(val); } diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index 826823df917f..e7e28bf7e919 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -150,8 +150,8 @@ * instance, and is *not* included in this mask since * pte_modify() does modify it. */ -#define _PAGE_CHG_MASK (PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT | \ - _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY_HW | \ +#define _PAGE_CHG_MASK (PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT | \ + _PAGE_SPECIAL | _PAGE_ACCESSED | \ _PAGE_SOFT_DIRTY | _PAGE_DEVMAP) #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE) From patchwork Wed Feb 5 18:19:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366925 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 237A4921 for ; Wed, 5 Feb 2020 18:21:01 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E38AC217F4 for ; Wed, 5 Feb 2020 18:21:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E38AC217F4 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 62D236B006C; Wed, 5 Feb 2020 13:20:33 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 5DC2B6B0070; Wed, 5 Feb 2020 13:20:33 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 283B26B006C; Wed, 5 Feb 2020 13:20:33 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0019.hostedemail.com [216.40.44.19]) by kanga.kvack.org (Postfix) with ESMTP id 0B7F16B006E for ; Wed, 5 Feb 2020 13:20:33 -0500 (EST) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 995F78248D51 for ; Wed, 5 Feb 2020 18:20:32 +0000 (UTC) X-FDA: 76456888704.14.swim49_791362249aa32 X-Spam-Summary: 
1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30054:30055:30056:30064:30070,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:17,LUA_SUMMARY:none X-HE-Tag: swim49_791362249aa32 X-Filterd-Recvd-Size: 2961 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf09.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:31 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:27 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447786" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:26 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 11/27] drm/i915/gvt: Change _PAGE_DIRTY to _PAGE_DIRTY_BITS Date: Wed, 5 Feb 2020 10:19:19 -0800 Message-Id: <20200205181935.3712-12-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: After the introduction of _PAGE_DIRTY_SW, a dirty PTE can have either _PAGE_DIRTY_HW or _PAGE_DIRTY_SW. Change _PAGE_DIRTY to _PAGE_DIRTY_BITS. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook --- drivers/gpu/drm/i915/gvt/gtt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c index 4b04af569c05..e467ca182633 100644 --- a/drivers/gpu/drm/i915/gvt/gtt.c +++ b/drivers/gpu/drm/i915/gvt/gtt.c @@ -1201,7 +1201,7 @@ static int split_2MB_gtt_entry(struct intel_vgpu *vgpu, } /* Clear dirty field. 
*/ - se->val64 &= ~_PAGE_DIRTY; + se->val64 &= ~_PAGE_DIRTY_BITS; ops->clear_pse(se); ops->clear_ips(se); From patchwork Wed Feb 5 18:19:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366927 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C797B1820 for ; Wed, 5 Feb 2020 18:21:03 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 90C36217F4 for ; Wed, 5 Feb 2020 18:21:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 90C36217F4 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id DACC06B0070; Wed, 5 Feb 2020 13:20:33 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id C6D436B0073; Wed, 5 Feb 2020 13:20:33 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B0BBF6B0071; Wed, 5 Feb 2020 13:20:33 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 99AFA6B006E for ; Wed, 5 Feb 2020 13:20:33 -0500 (EST) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 3B6AA49960C for ; Wed, 5 Feb 2020 18:20:33 +0000 (UTC) X-FDA: 76456888746.16.bike81_792d663455759 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30054:30056:30064:30070,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:2,LUA_SUMMARY:none X-HE-Tag: bike81_792d663455759 X-Filterd-Recvd-Size: 6512 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:32 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:27 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447789" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:27 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. 
Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 12/27] x86/mm: Modify ptep_set_wrprotect and pmdp_set_wrprotect for _PAGE_DIRTY_SW Date: Wed, 5 Feb 2020 10:19:20 -0800 Message-Id: <20200205181935.3712-13-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When Shadow Stack (SHSTK) is enabled, the [R/O + PAGE_DIRTY_HW] setting is reserved only for SHSTK. Non-Shadow Stack R/O PTEs are [R/O + PAGE_DIRTY_SW]. When a PTE goes from [R/W + PAGE_DIRTY_HW] to [R/O + PAGE_DIRTY_SW], it could become a transient SHSTK PTE in two cases. The first case is that some processors can start a write but end up seeing a read-only PTE by the time they get to the Dirty bit, creating a transient SHSTK PTE. However, this will not occur on processors supporting SHSTK therefore we don't need a TLB flush here. The second case is that when the software, without atomic, tests & replaces PAGE_DIRTY_HW with PAGE_DIRTY_SW, a transient SHSTK PTE can exist. This is prevented with cmpxchg. Dave Hansen, Jann Horn, Andy Lutomirski, and Peter Zijlstra provided many insights to the issue. Jann Horn provided the cmpxchg solution. v9: - Change compile-time conditionals to runtime checks. - Fix parameters of try_cmpxchg(): change pte_t/pmd_t to pte_t.pte/pmd_t.pmd. v4: - Implement try_cmpxchg(). Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook --- arch/x86/include/asm/pgtable.h | 66 ++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 2733e7ec16b3..43cb27379208 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1253,6 +1253,39 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { + /* + * Some processors can start a write, but end up seeing a read-only + * PTE by the time they get to the Dirty bit. In this case, they + * will set the Dirty bit, leaving a read-only, Dirty PTE which + * looks like a Shadow Stack PTE. + * + * However, this behavior has been improved and will not occur on + * processors supporting Shadow Stack. Without this guarantee, a + * transition to a non-present PTE and flush the TLB would be + * needed. + * + * When changing a writable PTE to read-only and if the PTE has + * _PAGE_DIRTY_HW set, we move that bit to _PAGE_DIRTY_SW so that + * the PTE is not a valid Shadow Stack PTE. + */ +#ifdef CONFIG_X86_64 + if (static_cpu_has(X86_FEATURE_SHSTK)) { + pte_t new_pte, pte = READ_ONCE(*ptep); + + do { + /* + * This is the same as moving _PAGE_DIRTY_HW + * to _PAGE_DIRTY_SW. 
+ */ + new_pte = pte_wrprotect(pte); + new_pte.pte |= (new_pte.pte & _PAGE_DIRTY_HW) >> + _PAGE_BIT_DIRTY_HW << _PAGE_BIT_DIRTY_SW; + new_pte.pte &= ~_PAGE_DIRTY_HW; + } while (!try_cmpxchg(&ptep->pte, &pte.pte, new_pte.pte)); + + return; + } +#endif clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte); } @@ -1303,6 +1336,39 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp) { + /* + * Some processors can start a write, but end up seeing a read-only + * PMD by the time they get to the Dirty bit. In this case, they + * will set the Dirty bit, leaving a read-only, Dirty PMD which + * looks like a Shadow Stack PMD. + * + * However, this behavior has been improved and will not occur on + * processors supporting Shadow Stack. Without this guarantee, a + * transition to a non-present PMD and flush the TLB would be + * needed. + * + * When changing a writable PMD to read-only and if the PMD has + * _PAGE_DIRTY_HW set, we move that bit to _PAGE_DIRTY_SW so that + * the PMD is not a valid Shadow Stack PMD. + */ +#ifdef CONFIG_X86_64 + if (static_cpu_has(X86_FEATURE_SHSTK)) { + pmd_t new_pmd, pmd = READ_ONCE(*pmdp); + + do { + /* + * This is the same as moving _PAGE_DIRTY_HW + * to _PAGE_DIRTY_SW. + */ + new_pmd = pmd_wrprotect(pmd); + new_pmd.pmd |= (new_pmd.pmd & _PAGE_DIRTY_HW) >> + _PAGE_BIT_DIRTY_HW << _PAGE_BIT_DIRTY_SW; + new_pmd.pmd &= ~_PAGE_DIRTY_HW; + } while (!try_cmpxchg(&pmdp->pmd, &pmd.pmd, new_pmd.pmd)); + + return; + } +#endif clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp); } From patchwork Wed Feb 5 18:19:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366929 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5211117E0 for ; Wed, 5 Feb 2020 18:21:06 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 27616217F4 for ; Wed, 5 Feb 2020 18:21:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 27616217F4 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 4AA906B006E; Wed, 5 Feb 2020 13:20:34 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 42D0F6B0071; Wed, 5 Feb 2020 13:20:34 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1244F6B0075; Wed, 5 Feb 2020 13:20:33 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0020.hostedemail.com [216.40.44.20]) by kanga.kvack.org (Postfix) with ESMTP id B9B7A6B006E for ; Wed, 5 Feb 2020 13:20:33 -0500 (EST) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 4E82D40F4 for ; Wed, 5 Feb 2020 18:20:33 +0000 (UTC) X-FDA: 76456888746.02.toy24_792ae5514f71c X-Spam-Summary: 
1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30003:30012:30054:30056:30064:30069:30070,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:1,LUA_SUMMARY:none X-HE-Tag: toy24_792ae5514f71c X-Filterd-Recvd-Size: 4652 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf16.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:32 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:27 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447794" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:27 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 13/27] x86/mm: Shadow Stack page fault error checking Date: Wed, 5 Feb 2020 10:19:21 -0800 Message-Id: <20200205181935.3712-14-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: If a page fault is triggered by a Shadow Stack (SHSTK) access (e.g. CALL/RET) or SHSTK management instructions (e.g. WRUSSQ), then bit[6] of the page fault error code is set. In access_error(), verify a SHSTK page fault is within a SHSTK memory area. It is always an error otherwise. For a valid SHSTK access, set FAULT_FLAG_WRITE to effect copy-on-write. 
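Condensed into a standalone check, the rule reads as follows. This is a sketch only: struct vma_stub and the helper name are stand-ins, X86_PF_SHSTK is the error-code bit 6 added by this patch, and the VM_SHSTK value is the 64-bit VM_HIGH_ARCH_5 bit from the earlier VM_SHSTK patch.

#include <stdbool.h>
#include <stdint.h>

#define X86_PF_SHSTK (1UL << 6)    /* bit 6 of the page fault error code */
#define VM_SHSTK     (1ULL << 37)  /* VM_HIGH_ARCH_5 */

struct vma_stub { uint64_t vm_flags; };

/* A shadow-stack fault is only valid inside a shadow-stack VMA. */
static bool shstk_fault_is_access_error(unsigned long error_code,
                                        const struct vma_stub *vma)
{
        if (!(error_code & X86_PF_SHSTK))
                return false;                   /* not a shadow-stack fault */
        return !(vma->vm_flags & VM_SHSTK);     /* error iff outside a SHSTK VMA */
}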
Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook --- arch/x86/include/asm/traps.h | 2 ++ arch/x86/mm/fault.c | 18 ++++++++++++++++++ 2 files changed, 20 insertions(+) diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 7ac26bbd0bef..8023d177fcd8 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -169,6 +169,7 @@ enum { * bit 3 == 1: use of reserved bit detected * bit 4 == 1: fault was an instruction fetch * bit 5 == 1: protection keys block access + * bit 6 == 1: shadow stack access fault */ enum x86_pf_error_code { X86_PF_PROT = 1 << 0, @@ -177,5 +178,6 @@ enum x86_pf_error_code { X86_PF_RSVD = 1 << 3, X86_PF_INSTR = 1 << 4, X86_PF_PK = 1 << 5, + X86_PF_SHSTK = 1 << 6, }; #endif /* _ASM_X86_TRAPS_H */ diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 304d31d8cbbc..9c1243302663 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1187,6 +1187,17 @@ access_error(unsigned long error_code, struct vm_area_struct *vma) (error_code & X86_PF_INSTR), foreign)) return 1; + /* + * Verify X86_PF_SHSTK is within a Shadow Stack VMA. + * It is always an error if there is a Shadow Stack + * fault outside a Shadow Stack VMA. + */ + if (error_code & X86_PF_SHSTK) { + if (!(vma->vm_flags & VM_SHSTK)) + return 1; + return 0; + } + if (error_code & X86_PF_WRITE) { /* write, present and write, not present: */ if (unlikely(!(vma->vm_flags & VM_WRITE))) @@ -1344,6 +1355,13 @@ void do_user_addr_fault(struct pt_regs *regs, perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); + /* + * If the fault is caused by a Shadow Stack access, + * i.e. CALL/RET/SAVEPREVSSP/RSTORSSP, then set + * FAULT_FLAG_WRITE to effect copy-on-write. + */ + if (hw_error_code & X86_PF_SHSTK) + flags |= FAULT_FLAG_WRITE; if (hw_error_code & X86_PF_WRITE) flags |= FAULT_FLAG_WRITE; if (hw_error_code & X86_PF_INSTR) From patchwork Wed Feb 5 18:19:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366931 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1E278921 for ; Wed, 5 Feb 2020 18:21:09 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DF0B221927 for ; Wed, 5 Feb 2020 18:21:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DF0B221927 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 7BAB36B0071; Wed, 5 Feb 2020 13:20:34 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 6F8466B0078; Wed, 5 Feb 2020 13:20:34 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 396496B0074; Wed, 5 Feb 2020 13:20:34 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0192.hostedemail.com [216.40.44.192]) by kanga.kvack.org (Postfix) with ESMTP id DD6716B0071 for ; Wed, 5 Feb 2020 13:20:33 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 7628F181AEF09 for ; Wed, 5 Feb 
2020 18:20:33 +0000 (UTC) X-FDA: 76456888746.22.thumb90_7932f647a5334 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30003:30054:30056:30064:30079,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:26,LUA_SUMMARY:none X-HE-Tag: thumb90_7932f647a5334 X-Filterd-Recvd-Size: 6495 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf09.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:32 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:27 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447801" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:27 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 14/27] mm: Handle Shadow Stack page fault Date: Wed, 5 Feb 2020 10:19:22 -0800 Message-Id: <20200205181935.3712-15-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When a task does fork(), its Shadow Stack (SHSTK) must be duplicated for the child. This patch implements a flow similar to copy-on-write of an anonymous page, but for SHSTK. A SHSTK PTE must be RO and Dirty. This Dirty bit requirement is used to effect the copying. In copy_one_pte(), clear the Dirty bit from a SHSTK PTE to cause a page fault upon the next SHSTK access. At that time, fix the PTE and copy/re-use the page. 
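A rough user-space model of that state machine may make the Dirty-bit trick easier to follow. The toy struct and helpers below are invented for illustration only and are not the kernel's pte_t API; the real work happens in copy_one_pte() and the wp_page_* paths changed below.

#include <stdbool.h>
#include <stdio.h>

struct fake_pte { bool write; bool dirty; };	/* SHSTK PTE: !write + dirty */

/* fork(): as with COW, drop the Dirty bit so the next SHSTK access faults. */
static struct fake_pte shstk_copy_at_fork(struct fake_pte pte)
{
	pte.dirty = false;
	return pte;
}

/* fault: copy or re-use the page, then restore the read-only + Dirty combination. */
static struct fake_pte shstk_fix_at_fault(struct fake_pte pte)
{
	pte.write = false;
	pte.dirty = true;
	return pte;
}

int main(void)
{
	struct fake_pte pte = { .write = false, .dirty = true };

	pte = shstk_copy_at_fork(pte);
	printf("after fork:  dirty=%d (next CALL/RET faults)\n", pte.dirty);
	pte = shstk_fix_at_fault(pte);
	printf("after fault: dirty=%d (valid SHSTK PTE again)\n", pte.dirty);
	return 0;
}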
Signed-off-by: Yu-cheng Yu --- arch/x86/mm/pgtable.c | 15 +++++++++++++++ include/asm-generic/pgtable.h | 17 +++++++++++++++++ mm/memory.c | 7 ++++++- 3 files changed, 38 insertions(+), 1 deletion(-) diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index 7bd2c3a52297..2eb33794c08d 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -872,3 +872,18 @@ int pmd_free_pte_page(pmd_t *pmd, unsigned long addr) #endif /* CONFIG_X86_64 */ #endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */ + +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER +inline bool arch_copy_pte_mapping(vm_flags_t vm_flags) +{ + return (vm_flags & VM_SHSTK); +} + +inline pte_t pte_set_vma_features(pte_t pte, struct vm_area_struct *vma) +{ + if (vma->vm_flags & VM_SHSTK) + return pte_mkdirty_shstk(pte); + else + return pte; +} +#endif /* CONFIG_X86_INTEL_SHADOW_STACK_USER */ diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index 798ea36a0549..9cb2f9ba5895 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -1190,6 +1190,23 @@ static inline bool arch_has_pfn_modify_check(void) } #endif /* !_HAVE_ARCH_PFN_MODIFY_ALLOWED */ +#ifdef CONFIG_MMU +#ifndef CONFIG_ARCH_HAS_SHSTK +static inline bool arch_copy_pte_mapping(vm_flags_t vm_flags) +{ + return false; +} + +static inline pte_t pte_set_vma_features(pte_t pte, struct vm_area_struct *vma) +{ + return pte; +} +#else +bool arch_copy_pte_mapping(vm_flags_t vm_flags); +pte_t pte_set_vma_features(pte_t pte, struct vm_area_struct *vma); +#endif +#endif /* CONFIG_MMU */ + /* * Architecture PAGE_KERNEL_* fallbacks * diff --git a/mm/memory.c b/mm/memory.c index 45442d9a4f52..6daa28614327 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -772,7 +772,8 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, * If it's a COW mapping, write protect it both * in the parent and the child */ - if (is_cow_mapping(vm_flags) && pte_write(pte)) { + if ((is_cow_mapping(vm_flags) && pte_write(pte)) || + arch_copy_pte_mapping(vm_flags)) { ptep_set_wrprotect(src_mm, addr, src_pte); pte = pte_wrprotect(pte); } @@ -2417,6 +2418,7 @@ static inline void wp_page_reuse(struct vm_fault *vmf) flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte)); entry = pte_mkyoung(vmf->orig_pte); entry = maybe_mkwrite(pte_mkdirty(entry), vma); + entry = pte_set_vma_features(entry, vma); if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1)) update_mmu_cache(vma, vmf->address, vmf->pte); pte_unmap_unlock(vmf->pte, vmf->ptl); @@ -2504,6 +2506,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte)); entry = mk_pte(new_page, vma->vm_page_prot); entry = maybe_mkwrite(pte_mkdirty(entry), vma); + entry = pte_set_vma_features(entry, vma); /* * Clear the pte entry and flush it first, before updating the * pte with the new entry. 
This will avoid a race condition @@ -3023,6 +3026,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) pte = mk_pte(page, vma->vm_page_prot); if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) { pte = maybe_mkwrite(pte_mkdirty(pte), vma); + pte = pte_set_vma_features(pte, vma); vmf->flags &= ~FAULT_FLAG_WRITE; ret |= VM_FAULT_WRITE; exclusive = RMAP_EXCLUSIVE; @@ -3165,6 +3169,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) entry = mk_pte(page, vma->vm_page_prot); if (vma->vm_flags & VM_WRITE) entry = pte_mkwrite(pte_mkdirty(entry)); + entry = pte_set_vma_features(entry, vma); vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, &vmf->ptl); From patchwork Wed Feb 5 18:19:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366933 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0C29F17E0 for ; Wed, 5 Feb 2020 18:21:12 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D4E9E218AC for ; Wed, 5 Feb 2020 18:21:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D4E9E218AC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B83E86B0073; Wed, 5 Feb 2020 13:20:34 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id ABFFB6B0075; Wed, 5 Feb 2020 13:20:34 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 914A96B0074; Wed, 5 Feb 2020 13:20:34 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0219.hostedemail.com [216.40.44.219]) by kanga.kvack.org (Postfix) with ESMTP id 66C866B0075 for ; Wed, 5 Feb 2020 13:20:34 -0500 (EST) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 00116181AEF09 for ; Wed, 5 Feb 2020 18:20:33 +0000 (UTC) X-FDA: 76456888746.21.wave85_794a90c99f35b X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30003:30054:30056:30064:30079,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:2,LUA_SUMMARY:none X-HE-Tag: wave85_794a90c99f35b X-Filterd-Recvd-Size: 5793 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf48.hostedemail.com 
(Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:33 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:28 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447806" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:27 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 15/27] mm: Handle THP/HugeTLB Shadow Stack page fault Date: Wed, 5 Feb 2020 10:19:23 -0800 Message-Id: <20200205181935.3712-16-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This patch implements THP Shadow Stack (SHSTK) copying in the same way as in the previous patch for regular PTE. In copy_huge_pmd(), clear the dirty bit from the PMD to cause a page fault upon the next SHSTK access to the PMD. At that time, fix the PMD and copy/re-use the page. 
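The huge-page case reuses the arch-hook pattern from the previous patch: asm-generic supplies a pass-through default and x86 overrides it for VM_SHSTK mappings. Below is a self-contained toy of that hook; the stand-in types compile outside the kernel tree, and the VM_SHSTK and _PAGE_DIRTY_HW bit positions are placeholders, not the real values.

#include <stdio.h>

typedef struct { unsigned long val; } pmd_t;		/* toy stand-in */
struct vm_area_struct { unsigned long vm_flags; };	/* toy stand-in */
#define VM_SHSTK	(1UL << 5)
#define _PAGE_DIRTY_HW	(1UL << 6)

static pmd_t pmd_mkdirty_shstk(pmd_t pmd)
{
	pmd.val |= _PAGE_DIRTY_HW;	/* shadow stack PMDs are read-only + hw-dirty */
	return pmd;
}

/*
 * The hook: only SHSTK VMAs get the special dirty treatment; everything
 * else passes through unchanged, which is also the asm-generic default.
 */
static pmd_t pmd_set_vma_features(pmd_t pmd, struct vm_area_struct *vma)
{
	if (vma->vm_flags & VM_SHSTK)
		return pmd_mkdirty_shstk(pmd);
	return pmd;
}

int main(void)
{
	struct vm_area_struct vma = { .vm_flags = VM_SHSTK };
	pmd_t pmd = { 0 };

	pmd = pmd_set_vma_features(pmd, &vma);
	printf("hw-dirty set: %d\n", !!(pmd.val & _PAGE_DIRTY_HW));
	return 0;
}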
Signed-off-by: Yu-cheng Yu --- arch/x86/mm/pgtable.c | 8 ++++++++ include/asm-generic/pgtable.h | 11 +++++++++++ mm/huge_memory.c | 4 ++++ 3 files changed, 23 insertions(+) diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index 2eb33794c08d..3340b1d4e9da 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -886,4 +886,12 @@ inline pte_t pte_set_vma_features(pte_t pte, struct vm_area_struct *vma) else return pte; } + +inline pmd_t pmd_set_vma_features(pmd_t pmd, struct vm_area_struct *vma) +{ + if (vma->vm_flags & VM_SHSTK) + return pmd_mkdirty_shstk(pmd); + else + return pmd; +} #endif /* CONFIG_X86_INTEL_SHADOW_STACK_USER */ diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index 9cb2f9ba5895..a9df093fdf45 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -1201,9 +1201,20 @@ static inline pte_t pte_set_vma_features(pte_t pte, struct vm_area_struct *vma) { return pte; } + +#ifdef CONFIG_TRANSPARENT_HUGEPAGE +static inline pmd_t pmd_set_vma_features(pmd_t pmd, struct vm_area_struct *vma) +{ + return pmd; +} +#endif #else bool arch_copy_pte_mapping(vm_flags_t vm_flags); pte_t pte_set_vma_features(pte_t pte, struct vm_area_struct *vma); + +#ifdef CONFIG_TRANSPARENT_HUGEPAGE +pmd_t pmd_set_vma_features(pmd_t pmd, struct vm_area_struct *vma); +#endif #endif #endif /* CONFIG_MMU */ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index a88093213674..93ef368df2dd 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -636,6 +636,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, entry = mk_huge_pmd(page, vma->vm_page_prot); entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); + entry = pmd_set_vma_features(entry, vma); page_add_new_anon_rmap(page, vma, haddr, true); mem_cgroup_commit_charge(page, memcg, false, true); lru_cache_add_active_or_unevictable(page, vma); @@ -1278,6 +1279,7 @@ static vm_fault_t do_huge_pmd_wp_page_fallback(struct vm_fault *vmf, pte_t entry; entry = mk_pte(pages[i], vma->vm_page_prot); entry = maybe_mkwrite(pte_mkdirty(entry), vma); + entry = pte_set_vma_features(entry, vma); memcg = (void *)page_private(pages[i]); set_page_private(pages[i], 0); page_add_new_anon_rmap(pages[i], vmf->vma, haddr, false); @@ -1360,6 +1362,7 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd) pmd_t entry; entry = pmd_mkyoung(orig_pmd); entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); + entry = pmd_set_vma_features(entry, vma); if (pmdp_set_access_flags(vma, haddr, vmf->pmd, entry, 1)) update_mmu_cache_pmd(vma, vmf->address, vmf->pmd); ret |= VM_FAULT_WRITE; @@ -1432,6 +1435,7 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd) pmd_t entry; entry = mk_huge_pmd(new_page, vma->vm_page_prot); entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); + entry = pmd_set_vma_features(entry, vma); pmdp_huge_clear_flush_notify(vma, haddr, vmf->pmd); page_add_new_anon_rmap(new_page, vma, haddr, true); mem_cgroup_commit_charge(new_page, memcg, false, true); From patchwork Wed Feb 5 18:19:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366935 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E17F117E0 for ; Wed, 5 Feb 2020 18:21:14 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 
AA238218AC for ; Wed, 5 Feb 2020 18:21:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AA238218AC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 14CB46B0074; Wed, 5 Feb 2020 13:20:35 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 0AC2C6B0078; Wed, 5 Feb 2020 13:20:35 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EB42D6B007B; Wed, 5 Feb 2020 13:20:34 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0171.hostedemail.com [216.40.44.171]) by kanga.kvack.org (Postfix) with ESMTP id BA1276B0074 for ; Wed, 5 Feb 2020 13:20:34 -0500 (EST) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 570A240F4 for ; Wed, 5 Feb 2020 18:20:34 +0000 (UTC) X-FDA: 76456888788.02.grade11_794faf0331820 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30003:30054:30056:30064:30070:30079,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: grade11_794faf0331820 X-Filterd-Recvd-Size: 7688 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf16.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:33 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:28 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447815" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:28 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. 
Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 16/27] mm: Update can_follow_write_pte() for Shadow Stack Date: Wed, 5 Feb 2020 10:19:24 -0800 Message-Id: <20200205181935.3712-17-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Can_follow_write_pte() verifies that a read-only page is the task's own copy by ensuring the page has gone through faultin_page() and the PTE is Dirty. A Shadow Stack (SHSTK) PTE must be (read-only + _PAGE_DIRTY_HW). When a task does fork(), its SHSTK PTEs become (read-only + _PAGE_DIRTY_SW). This causes the next SHSTK access (i.e. CALL, RET, INCSSP) to trigger a fault; the page is then copied, and (read-only + _PAGE_DIRTY_HW) is restored. To update can_follow_write_pte() for SHSTK, introduce pte_exclusive(). It verifies a data PTE is Dirty and a SHSTK PTE has _PAGE_DIRTY_HW. Also rename can_follow_write_pte() to can_follow_write() to make its meaning clear; i.e. "Can we write to the page?", not "Is the PTE writable?" Also apply same changes to the huge memory case. Signed-off-by: Yu-cheng Yu --- arch/x86/mm/pgtable.c | 18 ++++++++++++++++++ include/asm-generic/pgtable.h | 12 ++++++++++++ mm/gup.c | 8 +++++--- mm/huge_memory.c | 8 +++++--- 4 files changed, 40 insertions(+), 6 deletions(-) diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index 3340b1d4e9da..fa8133f37918 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -887,6 +887,15 @@ inline pte_t pte_set_vma_features(pte_t pte, struct vm_area_struct *vma) return pte; } +inline bool pte_exclusive(pte_t pte, struct vm_area_struct *vma) +{ + if (vma->vm_flags & VM_SHSTK) + return pte_dirty_hw(pte); + else + return pte_dirty(pte); +} + +#ifdef CONFIG_TRANSPARENT_HUGEPAGE inline pmd_t pmd_set_vma_features(pmd_t pmd, struct vm_area_struct *vma) { if (vma->vm_flags & VM_SHSTK) @@ -894,4 +903,13 @@ inline pmd_t pmd_set_vma_features(pmd_t pmd, struct vm_area_struct *vma) else return pmd; } + +inline bool pmd_exclusive(pmd_t pmd, struct vm_area_struct *vma) +{ + if (vma->vm_flags & VM_SHSTK) + return pmd_dirty_hw(pmd); + else + return pmd_dirty(pmd); +} +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ #endif /* CONFIG_X86_INTEL_SHADOW_STACK_USER */ diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index a9df093fdf45..ae9a84fffc25 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -1202,18 +1202,30 @@ static inline pte_t pte_set_vma_features(pte_t pte, struct vm_area_struct *vma) return pte; } +static inline bool pte_exclusive(pte_t pte, struct vm_area_struct *vma) +{ + return pte_dirty(pte); +} + #ifdef CONFIG_TRANSPARENT_HUGEPAGE static inline pmd_t pmd_set_vma_features(pmd_t pmd, struct vm_area_struct *vma) { return pmd; } + +static inline bool pmd_exclusive(pmd_t pmd, struct vm_area_struct *vma) +{ + return pmd_dirty(pmd); +} #endif #else bool arch_copy_pte_mapping(vm_flags_t vm_flags); pte_t pte_set_vma_features(pte_t pte, struct vm_area_struct *vma); +bool pte_exclusive(pte_t pte, struct vm_area_struct *vma); #ifdef CONFIG_TRANSPARENT_HUGEPAGE pmd_t pmd_set_vma_features(pmd_t pmd, struct vm_area_struct *vma); +bool pmd_exclusive(pmd_t pmd, struct vm_area_struct *vma); #endif #endif 
#endif /* CONFIG_MMU */ diff --git a/mm/gup.c b/mm/gup.c index 7646bf993b25..d1dbfbde8443 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -164,10 +164,12 @@ static int follow_pfn_pte(struct vm_area_struct *vma, unsigned long address, * FOLL_FORCE can write to even unwritable pte's, but only * after we've gone through a COW cycle and they are dirty. */ -static inline bool can_follow_write_pte(pte_t pte, unsigned int flags) +static inline bool can_follow_write(pte_t pte, unsigned int flags, + struct vm_area_struct *vma) { return pte_write(pte) || - ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte)); + ((flags & FOLL_FORCE) && (flags & FOLL_COW) && + pte_exclusive(pte, vma)); } static struct page *follow_page_pte(struct vm_area_struct *vma, @@ -205,7 +207,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, } if ((flags & FOLL_NUMA) && pte_protnone(pte)) goto no_page; - if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, flags)) { + if ((flags & FOLL_WRITE) && !can_follow_write(pte, flags, vma)) { pte_unmap_unlock(ptep, ptl); return NULL; } diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 93ef368df2dd..baad346e9f4a 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1469,10 +1469,12 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd) * FOLL_FORCE can write to even unwritable pmd's, but only * after we've gone through a COW cycle and they are dirty. */ -static inline bool can_follow_write_pmd(pmd_t pmd, unsigned int flags) +static inline bool can_follow_write(pmd_t pmd, unsigned int flags, + struct vm_area_struct *vma) { return pmd_write(pmd) || - ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pmd_dirty(pmd)); + ((flags & FOLL_FORCE) && (flags & FOLL_COW) && + pmd_exclusive(pmd, vma)); } struct page *follow_trans_huge_pmd(struct vm_area_struct *vma, @@ -1485,7 +1487,7 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma, assert_spin_locked(pmd_lockptr(mm, pmd)); - if (flags & FOLL_WRITE && !can_follow_write_pmd(*pmd, flags)) + if (flags & FOLL_WRITE && !can_follow_write(*pmd, flags, vma)) goto out; /* Avoid dumping huge zero page */ From patchwork Wed Feb 5 18:19:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366937 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D9E42921 for ; Wed, 5 Feb 2020 18:21:17 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 95BC7218AC for ; Wed, 5 Feb 2020 18:21:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 95BC7218AC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 40BBF6B0078; Wed, 5 Feb 2020 13:20:35 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 31D466B007E; Wed, 5 Feb 2020 13:20:35 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 00E4F6B0075; Wed, 5 Feb 2020 13:20:34 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0166.hostedemail.com [216.40.44.166]) by 
kanga.kvack.org (Postfix) with ESMTP id D01826B0078 for ; Wed, 5 Feb 2020 13:20:34 -0500 (EST) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 78D634405 for ; Wed, 5 Feb 2020 18:20:34 +0000 (UTC) X-FDA: 76456888788.10.sky76_79524b345ff36 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30045:30046:30054:30055:30056:30062:30064:30067:30070:30090,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:6,LUA_SUMMARY:none X-HE-Tag: sky76_79524b345ff36 X-Filterd-Recvd-Size: 12103 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf09.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:33 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:28 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447820" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:28 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 17/27] x86/cet/shstk: User-mode Shadow Stack support Date: Wed, 5 Feb 2020 10:19:25 -0800 Message-Id: <20200205181935.3712-18-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This patch adds basic Shadow Stack (SHSTK) enabling/disabling routines. A task's SHSTK is allocated from memory with VM_SHSTK flag and read-only protection. It has a fixed size of RLIMIT_STACK. v9: - Change cpu_feature_enabled() to static_cpu_has(). - Merge cet_disable_shstk to cet_disable_free_shstk. - Remove the empty slot at the top of the SHSTK, as it is not needed. - Move do_mmap_locked() to alloc_shstk(), which is a static function. v6: - Create a function do_mmap_locked() for SHSTK allocation. v2: - Change noshstk to no_cet_shstk. 
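Because the shadow stack size is pinned to RLIMIT_STACK, the size cet_setup_shstk() would choose can be previewed from user space. The sketch below is purely illustrative; the kernel reads the same limit via rlimit(RLIMIT_STACK).

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_STACK, &rl))
		return 1;

	if (rl.rlim_cur == RLIM_INFINITY)
		printf("stack rlimit is unlimited\n");
	else
		printf("shadow stack would be %llu bytes, mapped read-only with VM_SHSTK\n",
		       (unsigned long long)rl.rlim_cur);
	return 0;
}

Tying the size to RLIMIT_STACK keeps the shadow stack comfortably large, since it holds only return addresses (8 bytes per call frame) rather than full stack frames.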
Signed-off-by: Yu-cheng Yu --- arch/x86/include/asm/cet.h | 31 +++++ arch/x86/include/asm/disabled-features.h | 8 +- arch/x86/include/asm/processor.h | 5 + arch/x86/kernel/Makefile | 2 + arch/x86/kernel/cet.c | 121 ++++++++++++++++++ arch/x86/kernel/cpu/common.c | 25 ++++ arch/x86/kernel/process.c | 1 + .../arch/x86/include/asm/disabled-features.h | 8 +- 8 files changed, 199 insertions(+), 2 deletions(-) create mode 100644 arch/x86/include/asm/cet.h create mode 100644 arch/x86/kernel/cet.c diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h new file mode 100644 index 000000000000..c44c991ca91f --- /dev/null +++ b/arch/x86/include/asm/cet.h @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_CET_H +#define _ASM_X86_CET_H + +#ifndef __ASSEMBLY__ +#include + +struct task_struct; +/* + * Per-thread CET status + */ +struct cet_status { + unsigned long shstk_base; + unsigned long shstk_size; + unsigned int shstk_enabled:1; +}; + +#ifdef CONFIG_X86_INTEL_CET +int cet_setup_shstk(void); +void cet_disable_free_shstk(struct task_struct *p); +#else +static inline void cet_disable_free_shstk(struct task_struct *p) {} +#endif + +#define cpu_x86_cet_enabled() \ + (static_cpu_has(X86_FEATURE_SHSTK) || \ + static_cpu_has(X86_FEATURE_IBT)) + +#endif /* __ASSEMBLY__ */ + +#endif /* _ASM_X86_CET_H */ diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index 8e1d0bb46361..e1454509ad83 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -62,6 +62,12 @@ # define DISABLE_PTI (1 << (X86_FEATURE_PTI & 31)) #endif +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER +#define DISABLE_SHSTK 0 +#else +#define DISABLE_SHSTK (1<<(X86_FEATURE_SHSTK & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -81,7 +87,7 @@ #define DISABLED_MASK13 0 #define DISABLED_MASK14 0 #define DISABLED_MASK15 0 -#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP) +#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP|DISABLE_SHSTK) #define DISABLED_MASK17 0 #define DISABLED_MASK18 0 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 0340aad3f2fc..793d210e64da 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -25,6 +25,7 @@ struct vm86; #include #include #include +#include #include #include @@ -539,6 +540,10 @@ struct thread_struct { unsigned int sig_on_uaccess_err:1; unsigned int uaccess_err:1; /* uaccess failed */ +#ifdef CONFIG_X86_INTEL_CET + struct cet_status cet; +#endif + /* Floating point and extended processor state */ struct fpu fpu; /* diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 6175e370ee4a..b8c1ea4ab7eb 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -142,6 +142,8 @@ obj-$(CONFIG_UNWINDER_ORC) += unwind_orc.o obj-$(CONFIG_UNWINDER_FRAME_POINTER) += unwind_frame.o obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o +obj-$(CONFIG_X86_INTEL_CET) += cet.o + ### # 64 bit specific files ifeq ($(CONFIG_X86_64),y) diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c new file mode 100644 index 000000000000..b4c7d88e9a8f --- /dev/null +++ b/arch/x86/kernel/cet.c @@ -0,0 +1,121 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * cet.c - Control-flow Enforcement (CET) + * + * Copyright (c) 2019, Intel Corporation. 
+ * Yu-cheng Yu + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static void start_update_msrs(void) +{ + fpregs_lock(); + if (test_thread_flag(TIF_NEED_FPU_LOAD)) + __fpregs_load_activate(); +} + +static void end_update_msrs(void) +{ + fpregs_unlock(); +} + +static unsigned long cet_get_shstk_addr(void) +{ + struct fpu *fpu = ¤t->thread.fpu; + unsigned long ssp = 0; + + fpregs_lock(); + + if (fpregs_state_valid(fpu, smp_processor_id())) { + rdmsrl(MSR_IA32_PL3_SSP, ssp); + } else { + struct cet_user_state *p; + + p = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER); + if (p) + ssp = p->user_ssp; + } + + fpregs_unlock(); + return ssp; +} + +static unsigned long alloc_shstk(unsigned long size) +{ + struct mm_struct *mm = current->mm; + unsigned long addr, populate; + + down_write(&mm->mmap_sem); + addr = do_mmap(NULL, 0, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, + VM_SHSTK, 0, &populate, NULL); + up_write(&mm->mmap_sem); + + if (populate) + mm_populate(addr, populate); + + return addr; +} + +int cet_setup_shstk(void) +{ + unsigned long addr, size; + struct cet_status *cet = ¤t->thread.cet; + + if (!static_cpu_has(X86_FEATURE_SHSTK)) + return -EOPNOTSUPP; + + size = rlimit(RLIMIT_STACK); + addr = alloc_shstk(size); + + if (IS_ERR((void *)addr)) + return PTR_ERR((void *)addr); + + cet->shstk_base = addr; + cet->shstk_size = size; + cet->shstk_enabled = 1; + + start_update_msrs(); + wrmsrl(MSR_IA32_PL3_SSP, addr + size); + wrmsrl(MSR_IA32_U_CET, MSR_IA32_CET_SHSTK_EN); + end_update_msrs(); + return 0; +} + +void cet_disable_free_shstk(struct task_struct *tsk) +{ + struct cet_status *cet = &tsk->thread.cet; + + if (!static_cpu_has(X86_FEATURE_SHSTK) || + !cet->shstk_enabled || !cet->shstk_base) + return; + + if (!tsk->mm || (tsk->mm != current->mm)) + return; + + if (tsk == current) { + u64 msr_val; + + start_update_msrs(); + rdmsrl(MSR_IA32_U_CET, msr_val); + wrmsrl(MSR_IA32_U_CET, msr_val & ~MSR_IA32_CET_SHSTK_EN); + end_update_msrs(); + } + + vm_munmap(cet->shstk_base, cet->shstk_size); + cet->shstk_base = 0; + cet->shstk_size = 0; + cet->shstk_enabled = 0; +} diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 2e4d90294fe6..40498ec72fda 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -54,6 +54,7 @@ #include #include #include +#include #include #include "cpu.h" @@ -486,6 +487,29 @@ static __init int setup_disable_pku(char *arg) __setup("nopku", setup_disable_pku); #endif /* CONFIG_X86_64 */ +static __always_inline void setup_cet(struct cpuinfo_x86 *c) +{ + if (cpu_x86_cet_enabled()) + cr4_set_bits(X86_CR4_CET); +} + +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER +static __init int setup_disable_shstk(char *s) +{ + /* require an exact match without trailing characters */ + if (s[0] != '\0') + return 0; + + if (!boot_cpu_has(X86_FEATURE_SHSTK)) + return 1; + + setup_clear_cpu_cap(X86_FEATURE_SHSTK); + pr_info("x86: 'no_cet_shstk' specified, disabling Shadow Stack\n"); + return 1; +} +__setup("no_cet_shstk", setup_disable_shstk); +#endif + /* * Some CPU features depend on higher CPUID levels, which may not always * be available due to CPUID level capping or broken virtualization @@ -1510,6 +1534,7 @@ static void identify_cpu(struct cpuinfo_x86 *c) x86_init_rdrand(c); x86_init_cache_qos(c); setup_pku(c); + setup_cet(c); /* * Clear/Set all flags overridden by options, need do it diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c 
index 8d0b9442202e..e102e63de641 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -43,6 +43,7 @@ #include #include #include +#include #include "process.h" diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h index 8e1d0bb46361..e1454509ad83 100644 --- a/tools/arch/x86/include/asm/disabled-features.h +++ b/tools/arch/x86/include/asm/disabled-features.h @@ -62,6 +62,12 @@ # define DISABLE_PTI (1 << (X86_FEATURE_PTI & 31)) #endif +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER +#define DISABLE_SHSTK 0 +#else +#define DISABLE_SHSTK (1<<(X86_FEATURE_SHSTK & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -81,7 +87,7 @@ #define DISABLED_MASK13 0 #define DISABLED_MASK14 0 #define DISABLED_MASK15 0 -#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP) +#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP|DISABLE_SHSTK) #define DISABLED_MASK17 0 #define DISABLED_MASK18 0 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19) From patchwork Wed Feb 5 18:19:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366939 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 86C0717E0 for ; Wed, 5 Feb 2020 18:21:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5C1BF218AC for ; Wed, 5 Feb 2020 18:21:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5C1BF218AC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 828186B0075; Wed, 5 Feb 2020 13:20:35 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 7E57D6B007B; Wed, 5 Feb 2020 13:20:35 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5E1366B0080; Wed, 5 Feb 2020 13:20:35 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0250.hostedemail.com [216.40.44.250]) by kanga.kvack.org (Postfix) with ESMTP id 38E326B0075 for ; Wed, 5 Feb 2020 13:20:35 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id CBEEF180AD802 for ; Wed, 5 Feb 2020 18:20:34 +0000 (UTC) X-FDA: 76456888788.13.pipe30_796a792e1815e X-Spam-Summary: 
1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30003:30051:30054:30056:30064:30070,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:2,LUA_SUMMARY:none X-HE-Tag: pipe30_796a792e1815e X-Filterd-Recvd-Size: 3900 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:34 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:28 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447824" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:28 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 18/27] x86/cet/shstk: Introduce WRUSS instruction Date: Wed, 5 Feb 2020 10:19:26 -0800 Message-Id: <20200205181935.3712-19-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: WRUSS is a new kernel-mode instruction but writes directly to user Shadow Stack (SHSTK) memory. This is used to construct a return address on SHSTK for the signal handler. This instruction can fault if the user SHSTK is not valid SHSTK memory. In that case, the kernel does a fixup. v4: - Change to asm goto. 
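The fixup works through the asm-goto pattern: the WRUSS instruction gets an exception-table entry whose fixup target is a C label, so a faulting write simply continues on the fail path. Below is a user-space analogue of that control flow; an ordinary x86-64 store stands in for wrussq, there is no exception table here (the label is only reached explicitly), and write_slot() is an invented name.

#include <errno.h>
#include <stdio.h>

static int write_slot(unsigned long *addr, unsigned long val)
{
	if (!addr)
		goto fail;	/* kernel helper reaches 'fail' via the extable fixup */

	__asm__ goto("movq %0, (%1)\n"
		     : /* asm goto allows no outputs */
		     : "r" (val), "r" (addr)
		     : "memory"
		     : fail);
	return 0;
fail:
	return -EPERM;
}

int main(void)
{
	unsigned long slot = 0;
	int ret = write_slot(&slot, 42UL);

	printf("ret=%d slot=%lu\n", ret, slot);
	printf("ret=%d (null pointer path)\n", write_slot(NULL, 0UL));
	return 0;
}

The real helpers below differ only in operand size (wrussd vs. wrussq) and in returning -EPERM when the fixup fires.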
Signed-off-by: Yu-cheng Yu --- arch/x86/include/asm/special_insns.h | 32 ++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index 6d37b8fcfc77..1b9b2e79c353 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -222,6 +222,38 @@ static inline void clwb(volatile void *__p) : [pax] "a" (p)); } +#ifdef CONFIG_X86_INTEL_CET +#if defined(CONFIG_IA32_EMULATION) || defined(CONFIG_X86_X32) +static inline int write_user_shstk_32(unsigned long addr, unsigned int val) +{ + asm_volatile_goto("1: wrussd %1, (%0)\n" + _ASM_EXTABLE(1b, %l[fail]) + :: "r" (addr), "r" (val) + :: fail); + return 0; +fail: + return -EPERM; +} +#else +static inline int write_user_shstk_32(unsigned long addr, unsigned int val) +{ + WARN_ONCE(1, "%s used but not supported.\n", __func__); + return -EFAULT; +} +#endif + +static inline int write_user_shstk_64(unsigned long addr, unsigned long val) +{ + asm_volatile_goto("1: wrussq %1, (%0)\n" + _ASM_EXTABLE(1b, %l[fail]) + :: "r" (addr), "r" (val) + :: fail); + return 0; +fail: + return -EPERM; +} +#endif /* CONFIG_X86_INTEL_CET */ + #define nop() asm volatile ("nop") From patchwork Wed Feb 5 18:19:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366943 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A7A9A921 for ; Wed, 5 Feb 2020 18:21:25 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 63D4721D7D for ; Wed, 5 Feb 2020 18:21:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 63D4721D7D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 329386B007E; Wed, 5 Feb 2020 13:20:36 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 1BDC56B0083; Wed, 5 Feb 2020 13:20:36 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D7DB66B0080; Wed, 5 Feb 2020 13:20:35 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0155.hostedemail.com [216.40.44.155]) by kanga.kvack.org (Postfix) with ESMTP id B7D4D6B007E for ; Wed, 5 Feb 2020 13:20:35 -0500 (EST) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 4F4CF180AD802 for ; Wed, 5 Feb 2020 18:20:35 +0000 (UTC) X-FDA: 76456888830.21.horn61_7975124020a26 X-Spam-Summary: 
1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30003:30045:30046:30051:30054:30056:30064:30069:30070:30079,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:2,LUA_SUMMARY:no ne X-HE-Tag: horn61_7975124020a26 X-Filterd-Recvd-Size: 15703 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf16.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:34 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:29 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447828" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:28 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 19/27] x86/cet/shstk: Handle signals for Shadow Stack Date: Wed, 5 Feb 2020 10:19:27 -0800 Message-Id: <20200205181935.3712-20-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: To deliver a signal, create a Shadow Stack (SHSTK) restore token and put the token and the signal restorer address on the SHSTK. For sigreturn, verify the token and restore the SHSTK pointer. Introduce a signal context extension struct 'sc_ext', which is used to save SHSTK restore token address and WAIT_ENDBR status. WAIT_ENDBR will be introduced later in the Indirect Branch Tracking (IBT) series, but add that into sc_ext now to keep the struct stable in case the IBT series is applied later. v9: - Update CET MSR access according to XSAVES supervisor state changes. - Add 'wait_endbr' to struct 'sc_ext'. - Update and simplify signal frame allocation, setup, and restoration. - Update commit log text. 
v2: - Move CET status from sigcontext to a separate struct sc_ext, which is located above the fpstate on the signal frame. - Add a restore token for sigreturn address. Signed-off-by: Yu-cheng Yu --- arch/x86/ia32/ia32_signal.c | 17 +++ arch/x86/include/asm/cet.h | 7 ++ arch/x86/include/asm/fpu/internal.h | 2 + arch/x86/include/uapi/asm/sigcontext.h | 9 ++ arch/x86/kernel/cet.c | 153 +++++++++++++++++++++++++ arch/x86/kernel/fpu/signal.c | 89 ++++++++++++++ arch/x86/kernel/signal.c | 10 ++ 7 files changed, 287 insertions(+) diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c index 30416d7f19d4..c0bb350a3d2d 100644 --- a/arch/x86/ia32/ia32_signal.c +++ b/arch/x86/ia32/ia32_signal.c @@ -35,6 +35,7 @@ #include #include #include +#include /* * Do a signal return; undo the signal stack. @@ -223,6 +224,7 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs, void __user **fpstate) { unsigned long sp, fx_aligned, math_size; + void __user *restorer = NULL; /* Default to using normal stack */ sp = regs->sp; @@ -236,8 +238,23 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs, ksig->ka.sa.sa_restorer) sp = (unsigned long) ksig->ka.sa.sa_restorer; + if (ksig->ka.sa.sa_flags & SA_RESTORER) { + restorer = ksig->ka.sa.sa_restorer; + } else if (current->mm->context.vdso) { + if (ksig->ka.sa.sa_flags & SA_SIGINFO) + restorer = current->mm->context.vdso + + vdso_image_32.sym___kernel_rt_sigreturn; + else + restorer = current->mm->context.vdso + + vdso_image_32.sym___kernel_sigreturn; + } + sp = fpu__alloc_mathframe(sp, 1, &fx_aligned, &math_size); *fpstate = (struct _fpstate_32 __user *) sp; + + if (save_cet_to_sigframe(*fpstate, (unsigned long)restorer, 1)) + return (void __user *) -1L; + if (copy_fpstate_to_sigframe(*fpstate, (void __user *)fx_aligned, math_size) < 0) return (void __user *) -1L; diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h index c44c991ca91f..409d4f91a0dc 100644 --- a/arch/x86/include/asm/cet.h +++ b/arch/x86/include/asm/cet.h @@ -6,6 +6,8 @@ #include struct task_struct; +struct sc_ext; + /* * Per-thread CET status */ @@ -18,8 +20,13 @@ struct cet_status { #ifdef CONFIG_X86_INTEL_CET int cet_setup_shstk(void); void cet_disable_free_shstk(struct task_struct *p); +int cet_restore_signal(bool ia32, struct sc_ext *sc); +int cet_setup_signal(bool ia32, unsigned long rstor, struct sc_ext *sc); #else static inline void cet_disable_free_shstk(struct task_struct *p) {} +static inline int cet_restore_signal(bool ia32, struct sc_ext *sc) { return -EINVAL; } +static inline int cet_setup_signal(bool ia32, unsigned long rstor, + struct sc_ext *sc) { return -EINVAL; } #endif #define cpu_x86_cet_enabled() \ diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index 42159f45bf9c..241521c0ed02 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -476,6 +476,8 @@ static inline void copy_kernel_to_fpregs(union fpregs_state *fpstate) __copy_kernel_to_fpregs(fpstate, -1); } +extern int save_cet_to_sigframe(void __user *fp, unsigned long restorer, + int is_ia32); extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size); /* diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h index 844d60eb1882..cf2d55db3be4 100644 --- a/arch/x86/include/uapi/asm/sigcontext.h +++ b/arch/x86/include/uapi/asm/sigcontext.h @@ -196,6 +196,15 @@ struct _xstate { /* New processor state extensions go 
here: */ }; +/* + * Located at the end of sigcontext->fpstate, aligned to 8. + */ +struct sc_ext { + unsigned long total_size; + unsigned long ssp; + unsigned long wait_endbr; +}; + /* * The 32-bit signal frame: */ diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c index b4c7d88e9a8f..cba5c7656aab 100644 --- a/arch/x86/kernel/cet.c +++ b/arch/x86/kernel/cet.c @@ -19,6 +19,8 @@ #include #include #include +#include +#include static void start_update_msrs(void) { @@ -69,6 +71,80 @@ static unsigned long alloc_shstk(unsigned long size) return addr; } +#define TOKEN_MODE_MASK 3UL +#define TOKEN_MODE_64 1UL +#define IS_TOKEN_64(token) ((token & TOKEN_MODE_MASK) == TOKEN_MODE_64) +#define IS_TOKEN_32(token) ((token & TOKEN_MODE_MASK) == 0) + +/* + * Verify the restore token at the address of 'ssp' is + * valid and then set shadow stack pointer according to the + * token. + */ +static int verify_rstor_token(bool ia32, unsigned long ssp, + unsigned long *new_ssp) +{ + unsigned long token; + + *new_ssp = 0; + + if (!IS_ALIGNED(ssp, 8)) + return -EINVAL; + + if (get_user(token, (unsigned long __user *)ssp)) + return -EFAULT; + + /* Is 64-bit mode flag correct? */ + if (!ia32 && !IS_TOKEN_64(token)) + return -EINVAL; + else if (ia32 && !IS_TOKEN_32(token)) + return -EINVAL; + + token &= ~TOKEN_MODE_MASK; + + /* + * Restore address properly aligned? + */ + if ((!ia32 && !IS_ALIGNED(token, 8)) || !IS_ALIGNED(token, 4)) + return -EINVAL; + + /* + * Token was placed properly? + */ + if ((ALIGN_DOWN(token, 8) - 8) != ssp) + return -EINVAL; + + *new_ssp = token; + return 0; +} + +/* + * Create a restore token on the shadow stack. + * A token is always 8-byte and aligned to 8. + */ +static int create_rstor_token(bool ia32, unsigned long ssp, + unsigned long *new_ssp) +{ + unsigned long addr; + + *new_ssp = 0; + + if ((!ia32 && !IS_ALIGNED(ssp, 8)) || !IS_ALIGNED(ssp, 4)) + return -EINVAL; + + addr = ALIGN_DOWN(ssp, 8) - 8; + + /* Is the token for 64-bit? */ + if (!ia32) + ssp |= TOKEN_MODE_64; + + if (write_user_shstk_64(addr, ssp)) + return -EFAULT; + + *new_ssp = addr; + return 0; +} + int cet_setup_shstk(void) { unsigned long addr, size; @@ -119,3 +195,80 @@ void cet_disable_free_shstk(struct task_struct *tsk) cet->shstk_size = 0; cet->shstk_enabled = 0; } + +/* + * Called from __fpu__restore_sig() and XSAVES buffer is protected by + * set_thread_flag(TIF_NEED_FPU_LOAD). + */ +int cet_restore_signal(bool ia32, struct sc_ext *sc_ext) +{ + struct cet_user_state *cet_user_state; + struct cet_status *cet = ¤t->thread.cet; + unsigned long new_ssp = 0; + u64 msr_val = 0; + int err; + + if (!cet->shstk_enabled) + return 0; + + cet_user_state = get_xsave_addr(¤t->thread.fpu.state.xsave, + XFEATURE_CET_USER); + if (!cet_user_state) + return -1; + + if (cet->shstk_enabled) { + err = verify_rstor_token(ia32, sc_ext->ssp, &new_ssp); + if (err) + return err; + + cet_user_state->user_ssp = new_ssp; + msr_val |= MSR_IA32_CET_SHSTK_EN; + } + + cet_user_state->user_cet = msr_val; + return 0; +} + +/* + * Setup the shadow stack for the signal handler: first, + * create a restore token to keep track of the current ssp, + * and then the return address of the signal handler. 
+ */ +int cet_setup_signal(bool ia32, unsigned long rstor_addr, struct sc_ext *sc_ext) +{ + struct cet_status *cet = ¤t->thread.cet; + unsigned long ssp = 0, new_ssp = 0; + int err; + + if (!cet->shstk_enabled) + return 0; + + if (cet->shstk_enabled) { + if (!rstor_addr) + return -EINVAL; + + ssp = cet_get_shstk_addr(); + err = create_rstor_token(ia32, ssp, &new_ssp); + if (err) + return err; + + if (ia32) { + ssp = new_ssp - sizeof(u32); + err = write_user_shstk_32(ssp, (unsigned int)rstor_addr); + } else { + ssp = new_ssp - sizeof(u64); + err = write_user_shstk_64(ssp, rstor_addr); + } + + if (err) + return err; + + sc_ext->ssp = new_ssp; + } + + start_update_msrs(); + if (cet->shstk_enabled) + wrmsrl(MSR_IA32_PL3_SSP, ssp); + end_update_msrs(); + + return 0; diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c index 0d3e06a772b0..875cc0fadce3 100644 --- a/arch/x86/kernel/fpu/signal.c +++ b/arch/x86/kernel/fpu/signal.c @@ -52,6 +52,69 @@ static inline int check_for_xstate(struct fxregs_state __user *buf, return 0; } +int save_cet_to_sigframe(void __user *fp, unsigned long restorer, int is_ia32) +{ + int err = 0; + +#ifdef CONFIG_X86_INTEL_CET + if (!current->thread.cet.shstk_enabled) + return 0; + + if (fp) { + struct sc_ext ext = {0, 0, 0}; + + err = cet_setup_signal(is_ia32, restorer, &ext); + if (!err) { + void __user *p = fp; + + ext.total_size = sizeof(ext); + + if (is_ia32) + p += sizeof(struct fregs_state); + + p += fpu_user_xstate_size + FP_XSTATE_MAGIC2_SIZE; + p = (void __user *)ALIGN((unsigned long)p, 8); + + if (copy_to_user(p, &ext, sizeof(ext))) + return -EFAULT; + } + } +#endif + + return err; +} + +static int restore_cet_from_sigframe(int is_ia32, void __user *fp) +{ + int err = 0; + +#ifdef CONFIG_X86_INTEL_CET + if (!current->thread.cet.shstk_enabled) + return 0; + + if (fp) { + struct sc_ext ext = {0, 0, 0}; + void __user *p = fp; + + if (is_ia32) + p += sizeof(struct fregs_state); + + p += fpu_user_xstate_size + FP_XSTATE_MAGIC2_SIZE; + p = (void __user *)ALIGN((unsigned long)p, 8); + + if (copy_from_user(&ext, p, sizeof(ext))) + return -EFAULT; + + if (ext.total_size != sizeof(ext)) + return -EFAULT; + + err = cet_restore_signal(is_ia32, &ext); + } +#endif + + return err; +} + /* * Signal frame handlers. */ @@ -367,6 +430,10 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size) pagefault_disable(); ret = copy_user_to_fpregs_zeroing(buf_fx, xfeatures_user, fx_only); pagefault_enable(); + + if (!ret) + ret = restore_cet_from_sigframe(0, buf); + if (!ret) { if (xfeatures_mask_supervisor()) copy_kernel_to_xregs(&fpu->state.xsave, @@ -397,6 +464,10 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size) sanitize_restored_user_xstate(&fpu->state, envp, xfeatures_user, fx_only); + ret = restore_cet_from_sigframe((int)ia32_fxstate, buf); + if (ret) + goto err_out; + fpregs_lock(); if (unlikely(init_bv)) copy_kernel_to_xregs(&init_fpstate.xsave, init_bv); @@ -468,12 +539,30 @@ int fpu__restore_sig(void __user *buf, int ia32_frame) return __fpu__restore_sig(buf, buf_fx, size); } +static unsigned long fpu__alloc_sigcontext_ext(unsigned long sp) +{ + /* + * sigcontext_ext is at: fpu + fpu_user_xstate_size + + * FP_XSTATE_MAGIC2_SIZE, then aligned to 8. 
+ */ + if (cpu_x86_cet_enabled()) { + struct cet_status *cet = ¤t->thread.cet; + + if (cet->shstk_enabled) + sp -= (sizeof(struct sc_ext) + 8); + } + + return sp; +} + unsigned long fpu__alloc_mathframe(unsigned long sp, int ia32_frame, unsigned long *buf_fx, unsigned long *size) { unsigned long frame_size = xstate_sigframe_size(); + sp = fpu__alloc_sigcontext_ext(sp); + *buf_fx = sp = round_down(sp - frame_size, 64); if (ia32_frame && use_fxsr()) { frame_size += sizeof(struct fregs_state); diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c index ce9421ec285f..b26f5084a8a1 100644 --- a/arch/x86/kernel/signal.c +++ b/arch/x86/kernel/signal.c @@ -46,6 +46,7 @@ #include #include +#include #define COPY(x) do { \ get_user_ex(regs->x, &sc->x); \ @@ -246,6 +247,9 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size, unsigned long buf_fx = 0; int onsigstack = on_sig_stack(sp); int ret; +#ifdef CONFIG_X86_64 + void __user *restorer = NULL; +#endif /* redzone */ if (IS_ENABLED(CONFIG_X86_64)) @@ -277,6 +281,12 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size, if (onsigstack && !likely(on_sig_stack(sp))) return (void __user *)-1L; +#ifdef CONFIG_X86_64 + if (ka->sa.sa_flags & SA_RESTORER) + restorer = ka->sa.sa_restorer; + ret = save_cet_to_sigframe(*fpstate, (unsigned long)restorer, 0); +#endif + /* save i387 and extended state */ ret = copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size); if (ret < 0) From patchwork Wed Feb 5 18:19:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366941 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F3FE517E0 for ; Wed, 5 Feb 2020 18:21:22 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CAAA321D7D for ; Wed, 5 Feb 2020 18:21:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CAAA321D7D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id F06426B007B; Wed, 5 Feb 2020 13:20:35 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id EB7D56B007E; Wed, 5 Feb 2020 13:20:35 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CB7B86B0081; Wed, 5 Feb 2020 13:20:35 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0080.hostedemail.com [216.40.44.80]) by kanga.kvack.org (Postfix) with ESMTP id AF5D46B007B for ; Wed, 5 Feb 2020 13:20:35 -0500 (EST) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 58FC040D9 for ; Wed, 5 Feb 2020 18:20:35 +0000 (UTC) X-FDA: 76456888830.24.lead82_7977c358ebe39 X-Spam-Summary: 
1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30054:30056:30064,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: lead82_7977c358ebe39 X-Filterd-Recvd-Size: 4031 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf09.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:34 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:29 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447831" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:29 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 20/27] ELF: UAPI and Kconfig additions for ELF program properties Date: Wed, 5 Feb 2020 10:19:28 -0800 Message-Id: <20200205181935.3712-21-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Introduce basic ELF definitions relating to the NT_GNU_PROPERTY_TYPE_0 note. 
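
For orientation only (this is not part of the patch): a PT_GNU_PROPERTY segment points at an ordinary ELF note of type NT_GNU_PROPERTY_TYPE_0 whose descriptor is a sequence of struct gnu_property entries. A rough C sketch of that on-disk layout, assuming a 64-bit object, is:

#include <linux/elf.h>
#include <linux/types.h>

/*
 * Sketch only, not part of the patch: the on-disk shape of the
 * NT_GNU_PROPERTY_TYPE_0 note that a PT_GNU_PROPERTY segment points
 * at, for a 64-bit object.
 */
struct gnu_property_note_sketch {
	Elf64_Nhdr nhdr;	/* n_namesz = 4, n_type = NT_GNU_PROPERTY_TYPE_0 */
	char name[4];		/* "GNU\0" */
	struct {
		u32 pr_type;	/* e.g. an x86 feature property (next patch) */
		u32 pr_datasz;	/* payload size, e.g. 4 */
		u32 pr_data;	/* payload: a feature bitmask */
		u32 pr_pad;	/* pads the entry up to 8 bytes */
	} prop[1];		/* one or more entries, sorted by pr_type */
};

Each entry's payload is padded to 8 bytes for ELF64 (4 for ELF32), which is the alignment the generic parser added later in this series relies on.
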
Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook --- fs/Kconfig.binfmt | 3 +++ include/linux/elf.h | 8 ++++++++ include/uapi/linux/elf.h | 1 + 3 files changed, 12 insertions(+) diff --git a/fs/Kconfig.binfmt b/fs/Kconfig.binfmt index 62dc4f577ba1..d2cfe0729a73 100644 --- a/fs/Kconfig.binfmt +++ b/fs/Kconfig.binfmt @@ -36,6 +36,9 @@ config COMPAT_BINFMT_ELF config ARCH_BINFMT_ELF_STATE bool +config ARCH_USE_GNU_PROPERTY + bool + config BINFMT_ELF_FDPIC bool "Kernel support for FDPIC ELF binaries" default y if !BINFMT_ELF diff --git a/include/linux/elf.h b/include/linux/elf.h index e3649b3e970e..459cddcceaac 100644 --- a/include/linux/elf.h +++ b/include/linux/elf.h @@ -2,6 +2,7 @@ #ifndef _LINUX_ELF_H #define _LINUX_ELF_H +#include #include #include @@ -56,4 +57,11 @@ static inline int elf_coredump_extra_notes_write(struct coredump_params *cprm) { extern int elf_coredump_extra_notes_size(void); extern int elf_coredump_extra_notes_write(struct coredump_params *cprm); #endif + +/* NT_GNU_PROPERTY_TYPE_0 header */ +struct gnu_property { + u32 pr_type; + u32 pr_datasz; +}; + #endif /* _LINUX_ELF_H */ diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h index 34c02e4290fe..c37731407074 100644 --- a/include/uapi/linux/elf.h +++ b/include/uapi/linux/elf.h @@ -36,6 +36,7 @@ typedef __s64 Elf64_Sxword; #define PT_LOPROC 0x70000000 #define PT_HIPROC 0x7fffffff #define PT_GNU_EH_FRAME 0x6474e550 +#define PT_GNU_PROPERTY 0x6474e553 #define PT_GNU_STACK (PT_LOOS + 0x474e551) From patchwork Wed Feb 5 18:19:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366945 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 11E4317E0 for ; Wed, 5 Feb 2020 18:21:28 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D306921927 for ; Wed, 5 Feb 2020 18:21:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D306921927 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 78D686B0080; Wed, 5 Feb 2020 13:20:36 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 715516B0082; Wed, 5 Feb 2020 13:20:36 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 42ED06B0081; Wed, 5 Feb 2020 13:20:36 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0216.hostedemail.com [216.40.44.216]) by kanga.kvack.org (Postfix) with ESMTP id 157DA6B0082 for ; Wed, 5 Feb 2020 13:20:36 -0500 (EST) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id AA8F9180AD804 for ; Wed, 5 Feb 2020 18:20:35 +0000 (UTC) X-FDA: 76456888830.26.bag40_798784434ae63 X-Spam-Summary: 
1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30054:30056:30064,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:39,LUA_SUMMARY:none X-HE-Tag: bag40_798784434ae63 X-Filterd-Recvd-Size: 3135 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:34 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:29 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447836" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:29 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 21/27] binfmt_elf: Define GNU_PROPERTY_X86_FEATURE_1_AND Date: Wed, 5 Feb 2020 10:19:29 -0800 Message-Id: <20200205181935.3712-22-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: An ELF file's .note.gnu.property indicates architecture features of the file. Introduce feature definitions for Control-flow Enforcement Technology (CET): Shadow Stack and Indirect Branch Tracking. 
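
As an aside, not part of the patch: the payload of a GNU_PROPERTY_X86_FEATURE_1_AND property is a 32-bit bitmask, so once a loader has extracted it, the new bits are meant to be tested roughly as below. The helper name is invented purely for illustration.

#include <linux/elf.h>
#include <linux/printk.h>

/*
 * Hypothetical helper, not part of the patch: test the feature bits
 * carried in a GNU_PROPERTY_X86_FEATURE_1_AND payload.
 */
static void describe_x86_feature_1(u32 feature_1_and)
{
	if (feature_1_and & GNU_PROPERTY_X86_FEATURE_1_IBT)
		pr_info("object built with IBT support\n");
	if (feature_1_and & GNU_PROPERTY_X86_FEATURE_1_SHSTK)
		pr_info("object built with shadow stack support\n");
}
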
Signed-off-by: Yu-cheng Yu --- include/uapi/linux/elf.h | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h index c37731407074..61251ecabdd7 100644 --- a/include/uapi/linux/elf.h +++ b/include/uapi/linux/elf.h @@ -444,4 +444,11 @@ typedef struct elf64_note { Elf64_Word n_type; /* Content type */ } Elf64_Nhdr; +/* .note.gnu.property types */ +#define GNU_PROPERTY_X86_FEATURE_1_AND 0xc0000002 + +/* Bits of GNU_PROPERTY_X86_FEATURE_1_AND */ +#define GNU_PROPERTY_X86_FEATURE_1_IBT 0x00000001 +#define GNU_PROPERTY_X86_FEATURE_1_SHSTK 0x00000002 + #endif /* _UAPI_LINUX_ELF_H */ From patchwork Wed Feb 5 18:19:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366949 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8AB0C921 for ; Wed, 5 Feb 2020 18:21:32 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 45E7E21927 for ; Wed, 5 Feb 2020 18:21:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 45E7E21927 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 009166B0081; Wed, 5 Feb 2020 13:20:37 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id E236B6B0087; Wed, 5 Feb 2020 13:20:36 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B7FD16B0082; Wed, 5 Feb 2020 13:20:36 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0029.hostedemail.com [216.40.44.29]) by kanga.kvack.org (Postfix) with ESMTP id 9AC006B0083 for ; Wed, 5 Feb 2020 13:20:36 -0500 (EST) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 3558D8248D52 for ; Wed, 5 Feb 2020 18:20:36 +0000 (UTC) X-FDA: 76456888872.15.river89_7996de403853d X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30012:30029:30036:30054:30056:30064:30070,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: river89_7996de403853d X-Filterd-Recvd-Size: 11033 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf09.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 
18:20:35 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:29 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447839" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:29 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 22/27] ELF: Add ELF program property parsing support Date: Wed, 5 Feb 2020 10:19:30 -0800 Message-Id: <20200205181935.3712-23-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Dave Martin ELF program properties will be needed for detecting whether to enable optional architecture or ABI features for a new ELF process. For now, there are no generic properties that we care about, so do nothing unless CONFIG_ARCH_USE_GNU_PROPERTY=y. Otherwise, the presence of properties using the PT_PROGRAM_PROPERTY phdrs entry (if any), and notify each property to the arch code. For now, the added code is not used. Signed-off-by: Dave Martin Signed-off-by: Yu-cheng Yu --- fs/binfmt_elf.c | 127 +++++++++++++++++++++++++++++++++++++++ fs/compat_binfmt_elf.c | 4 ++ include/linux/elf.h | 19 ++++++ include/uapi/linux/elf.h | 4 ++ 4 files changed, 154 insertions(+) diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index ecd8d2698515..054446f93442 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -39,12 +39,18 @@ #include #include #include +#include +#include #include #include #include #include #include +#ifndef ELF_COMPAT +#define ELF_COMPAT 0 +#endif + #ifndef user_long_t #define user_long_t long #endif @@ -678,6 +684,111 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex, * libraries. There is no binary dependent code anywhere else. 
*/ +static int parse_elf_property(const char *data, size_t *off, size_t datasz, + struct arch_elf_state *arch, + bool have_prev_type, u32 *prev_type) +{ + size_t o, step; + const struct gnu_property *pr; + int ret; + + if (*off == datasz) + return -ENOENT; + + if (WARN_ON(*off > datasz || *off % ELF_GNU_PROPERTY_ALIGN)) + return -EIO; + o = *off; + datasz -= *off; + + if (datasz < sizeof(*pr)) + return -EIO; + pr = (const struct gnu_property *)(data + o); + o += sizeof(*pr); + datasz -= sizeof(*pr); + + if (pr->pr_datasz > datasz) + return -EIO; + + WARN_ON(o % ELF_GNU_PROPERTY_ALIGN); + step = round_up(pr->pr_datasz, ELF_GNU_PROPERTY_ALIGN); + if (step > datasz) + return -EIO; + + /* Properties are supposed to be unique and sorted on pr_type: */ + if (have_prev_type && pr->pr_type <= *prev_type) + return -EIO; + *prev_type = pr->pr_type; + + ret = arch_parse_elf_property(pr->pr_type, data + o, + pr->pr_datasz, ELF_COMPAT, arch); + if (ret) + return ret; + + *off = o + step; + return 0; +} + +#define NOTE_DATA_SZ SZ_1K +#define GNU_PROPERTY_TYPE_0_NAME "GNU" +#define NOTE_NAME_SZ (sizeof(GNU_PROPERTY_TYPE_0_NAME)) + +static int parse_elf_properties(struct file *f, const struct elf_phdr *phdr, + struct arch_elf_state *arch) +{ + union { + struct elf_note nhdr; + char data[NOTE_DATA_SZ]; + } note; + loff_t pos; + ssize_t n; + size_t off, datasz; + int ret; + bool have_prev_type; + u32 prev_type; + + if (!IS_ENABLED(CONFIG_ARCH_USE_GNU_PROPERTY) || !phdr) + return 0; + + /* load_elf_binary() shouldn't call us unless this is true... */ + if (WARN_ON(phdr->p_type != PT_GNU_PROPERTY)) + return -EIO; + + /* If the properties are crazy large, that's too bad (for now): */ + if (phdr->p_filesz > sizeof(note)) + return -ENOEXEC; + + pos = phdr->p_offset; + n = kernel_read(f, ¬e, phdr->p_filesz, &pos); + + BUILD_BUG_ON(sizeof(note) < sizeof(note.nhdr) + NOTE_NAME_SZ); + if (n < 0 || n < sizeof(note.nhdr) + NOTE_NAME_SZ) + return -EIO; + + if (note.nhdr.n_type != NT_GNU_PROPERTY_TYPE_0 || + note.nhdr.n_namesz != NOTE_NAME_SZ || + strncmp(note.data + sizeof(note.nhdr), + GNU_PROPERTY_TYPE_0_NAME, n - sizeof(note.nhdr))) + return -EIO; + + off = round_up(sizeof(note.nhdr) + NOTE_NAME_SZ, + ELF_GNU_PROPERTY_ALIGN); + if (off > n) + return -EIO; + + if (note.nhdr.n_descsz > n - off) + return -EIO; + datasz = off + note.nhdr.n_descsz; + + have_prev_type = false; + do { + ret = parse_elf_property(note.data, &off, datasz, arch, + have_prev_type, &prev_type); + have_prev_type = true; + } while (!ret); + + return ret == -ENOENT ? 
0 : ret; +} + static int load_elf_binary(struct linux_binprm *bprm) { struct file *interpreter = NULL; /* to shut gcc up */ @@ -685,6 +796,7 @@ static int load_elf_binary(struct linux_binprm *bprm) int load_addr_set = 0; unsigned long error; struct elf_phdr *elf_ppnt, *elf_phdata, *interp_elf_phdata = NULL; + struct elf_phdr *elf_property_phdata = NULL; unsigned long elf_bss, elf_brk; int bss_prot = 0; int retval, i; @@ -731,6 +843,11 @@ static int load_elf_binary(struct linux_binprm *bprm) for (i = 0; i < loc->elf_ex.e_phnum; i++, elf_ppnt++) { char *elf_interpreter; + if (elf_ppnt->p_type == PT_GNU_PROPERTY) { + elf_property_phdata = elf_ppnt; + continue; + } + if (elf_ppnt->p_type != PT_INTERP) continue; @@ -818,9 +935,14 @@ static int load_elf_binary(struct linux_binprm *bprm) goto out_free_dentry; /* Pass PT_LOPROC..PT_HIPROC headers to arch code */ + elf_property_phdata = NULL; elf_ppnt = interp_elf_phdata; for (i = 0; i < loc->interp_elf_ex.e_phnum; i++, elf_ppnt++) switch (elf_ppnt->p_type) { + case PT_GNU_PROPERTY: + elf_property_phdata = elf_ppnt; + break; + case PT_LOPROC ... PT_HIPROC: retval = arch_elf_pt_proc(&loc->interp_elf_ex, elf_ppnt, interpreter, @@ -831,6 +953,11 @@ static int load_elf_binary(struct linux_binprm *bprm) } } + retval = parse_elf_properties(interpreter ?: bprm->file, + elf_property_phdata, &arch_state); + if (retval) + goto out_free_dentry; + /* * Allow arch code to reject the ELF at this point, whilst it's * still possible to return an error to the code that invoked diff --git a/fs/compat_binfmt_elf.c b/fs/compat_binfmt_elf.c index aaad4ca1217e..13a087bc816b 100644 --- a/fs/compat_binfmt_elf.c +++ b/fs/compat_binfmt_elf.c @@ -17,6 +17,8 @@ #include #include +#define ELF_COMPAT 1 + /* * Rename the basic ELF layout types to refer to the 32-bit class of files. */ @@ -28,11 +30,13 @@ #undef elf_shdr #undef elf_note #undef elf_addr_t +#undef ELF_GNU_PROPERTY_ALIGN #define elfhdr elf32_hdr #define elf_phdr elf32_phdr #define elf_shdr elf32_shdr #define elf_note elf32_note #define elf_addr_t Elf32_Addr +#define ELF_GNU_PROPERTY_ALIGN ELF32_GNU_PROPERTY_ALIGN /* * Some data types as stored in coredump. 
diff --git a/include/linux/elf.h b/include/linux/elf.h index 459cddcceaac..7bdc6da160c7 100644 --- a/include/linux/elf.h +++ b/include/linux/elf.h @@ -22,6 +22,9 @@ SET_PERSONALITY(ex) #endif +#define ELF32_GNU_PROPERTY_ALIGN 4 +#define ELF64_GNU_PROPERTY_ALIGN 8 + #if ELF_CLASS == ELFCLASS32 extern Elf32_Dyn _DYNAMIC []; @@ -32,6 +35,7 @@ extern Elf32_Dyn _DYNAMIC []; #define elf_addr_t Elf32_Off #define Elf_Half Elf32_Half #define Elf_Word Elf32_Word +#define ELF_GNU_PROPERTY_ALIGN ELF32_GNU_PROPERTY_ALIGN #else @@ -43,6 +47,7 @@ extern Elf64_Dyn _DYNAMIC []; #define elf_addr_t Elf64_Off #define Elf_Half Elf64_Half #define Elf_Word Elf64_Word +#define ELF_GNU_PROPERTY_ALIGN ELF64_GNU_PROPERTY_ALIGN #endif @@ -64,4 +69,18 @@ struct gnu_property { u32 pr_datasz; }; +struct arch_elf_state; + +#ifndef CONFIG_ARCH_USE_GNU_PROPERTY +static inline int arch_parse_elf_property(u32 type, const void *data, + size_t datasz, bool compat, + struct arch_elf_state *arch) +{ + return 0; +} +#else +extern int arch_parse_elf_property(u32 type, const void *data, size_t datasz, + bool compat, struct arch_elf_state *arch); +#endif + #endif /* _LINUX_ELF_H */ diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h index 61251ecabdd7..518651708d8f 100644 --- a/include/uapi/linux/elf.h +++ b/include/uapi/linux/elf.h @@ -368,6 +368,7 @@ typedef struct elf64_shdr { * Notes used in ET_CORE. Architectures export some of the arch register sets * using the corresponding note types via the PTRACE_GETREGSET and * PTRACE_SETREGSET requests. + * The note name for all these is "LINUX". */ #define NT_PRSTATUS 1 #define NT_PRFPREG 2 @@ -430,6 +431,9 @@ typedef struct elf64_shdr { #define NT_MIPS_FP_MODE 0x801 /* MIPS floating-point mode */ #define NT_MIPS_MSA 0x802 /* MIPS SIMD registers */ +/* Note types with note name "GNU" */ +#define NT_GNU_PROPERTY_TYPE_0 5 + /* Note header in a PT_NOTE section */ typedef struct elf32_note { Elf32_Word n_namesz; /* Name size */ From patchwork Wed Feb 5 18:19:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366947 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3405217E0 for ; Wed, 5 Feb 2020 18:21:30 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0112721927 for ; Wed, 5 Feb 2020 18:21:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0112721927 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C6CB16B0083; Wed, 5 Feb 2020 13:20:36 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id C1D7A6B0081; Wed, 5 Feb 2020 13:20:36 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id ABDC06B0085; Wed, 5 Feb 2020 13:20:36 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0235.hostedemail.com [216.40.44.235]) by kanga.kvack.org (Postfix) with ESMTP id 83F466B0081 for ; Wed, 5 Feb 2020 13:20:36 -0500 (EST) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com 
[10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 267C58248D51 for ; Wed, 5 Feb 2020 18:20:36 +0000 (UTC) X-FDA: 76456888872.03.walk98_7998bfeed242a X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30054:30056:30064,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: walk98_7998bfeed242a X-Filterd-Recvd-Size: 3799 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf16.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:35 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:30 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447843" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:29 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 23/27] ELF: Introduce arch_setup_elf_property() Date: Wed, 5 Feb 2020 10:19:31 -0800 Message-Id: <20200205181935.3712-24-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: An ELF file's .note.gnu.property indicates architecture features of the file. These features are extracted earlier and stored in the struct 'arch_elf_state'. Introduce arch_setup_elf_property() to setup and enable these features. The first use-case of this function is Shadow Stack and Indirect Branch Tracking, which are introduced later. 
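
To make the division of labor concrete, here is a shape-only sketch (not the x86 code, which follows later in the series) of what an architecture provides after selecting ARCH_USE_GNU_PROPERTY and ARCH_BINFMT_ELF_STATE: arch_parse_elf_property() records properties of interest into arch_elf_state while the note is scanned, and arch_setup_elf_property() later acts on them for the new process. Field names below are placeholders.

#include <linux/elf.h>
#include <linux/errno.h>
#include <linux/types.h>

/* Shape sketch only; the concrete x86 version is a later patch. */
struct arch_elf_state {
	u32 gnu_property;	/* feature bits noticed while parsing */
};

#define INIT_ARCH_ELF_STATE { .gnu_property = 0, }

int arch_parse_elf_property(u32 type, const void *data, size_t datasz,
			    bool compat, struct arch_elf_state *arch)
{
	/* Called once per property found; unknown types are ignored. */
	if (type == GNU_PROPERTY_X86_FEATURE_1_AND && datasz == sizeof(u32))
		arch->gnu_property = *(const u32 *)data;
	return 0;
}

int arch_setup_elf_property(struct arch_elf_state *arch)
{
	/* Called from load_elf_binary() after set_binfmt(); enable the
	 * recorded features for current here. */
	return 0;
}
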
Signed-off-by: Yu-cheng Yu --- fs/binfmt_elf.c | 4 ++++ include/linux/elf.h | 6 ++++++ 2 files changed, 10 insertions(+) diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 054446f93442..56fe6cd437fe 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1213,6 +1213,10 @@ static int load_elf_binary(struct linux_binprm *bprm) set_binfmt(&elf_format); + retval = arch_setup_elf_property(&arch_state); + if (retval < 0) + goto out; + #ifdef ARCH_HAS_SETUP_ADDITIONAL_PAGES retval = arch_setup_additional_pages(bprm, !!interpreter); if (retval < 0) diff --git a/include/linux/elf.h b/include/linux/elf.h index 7bdc6da160c7..81f2161fa4a8 100644 --- a/include/linux/elf.h +++ b/include/linux/elf.h @@ -78,9 +78,15 @@ static inline int arch_parse_elf_property(u32 type, const void *data, { return 0; } + +static inline int arch_setup_elf_property(struct arch_elf_state *arch) +{ + return 0; +} #else extern int arch_parse_elf_property(u32 type, const void *data, size_t datasz, bool compat, struct arch_elf_state *arch); +extern int arch_setup_elf_property(struct arch_elf_state *arch); #endif #endif /* _LINUX_ELF_H */ From patchwork Wed Feb 5 18:19:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366951 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C702417E0 for ; Wed, 5 Feb 2020 18:21:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9DE6921927 for ; Wed, 5 Feb 2020 18:21:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9DE6921927 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 875046B0085; Wed, 5 Feb 2020 13:20:37 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 825266B0088; Wed, 5 Feb 2020 13:20:37 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6C6A76B0087; Wed, 5 Feb 2020 13:20:37 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0091.hostedemail.com [216.40.44.91]) by kanga.kvack.org (Postfix) with ESMTP id 4A39C6B0082 for ; Wed, 5 Feb 2020 13:20:37 -0500 (EST) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id DF4D3180AD802 for ; Wed, 5 Feb 2020 18:20:36 +0000 (UTC) X-FDA: 76456888872.11.offer18_79b1e35837c02 X-Spam-Summary: 
1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30045:30054:30056:30064:30070,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:18,LUA_SUMMARY:none X-HE-Tag: offer18_79b1e35837c02 X-Filterd-Recvd-Size: 4940 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:36 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:30 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447848" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:30 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 24/27] x86/cet/shstk: ELF header parsing for Shadow Stack Date: Wed, 5 Feb 2020 10:19:32 -0800 Message-Id: <20200205181935.3712-25-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Check an ELF file's .note.gnu.property, and setup Shadow Stack if the application supports it. v9: - Change cpu_feature_enabled() to static_cpu_has(). Signed-off-by: Yu-cheng Yu --- arch/x86/Kconfig | 2 ++ arch/x86/include/asm/elf.h | 13 +++++++++++++ arch/x86/kernel/process_64.c | 31 +++++++++++++++++++++++++++++++ 3 files changed, 46 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 6c34b701c588..d1447380e02e 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1987,6 +1987,8 @@ config X86_INTEL_SHADOW_STACK_USER select ARCH_USES_HIGH_VMA_FLAGS select X86_INTEL_CET select ARCH_HAS_SHSTK + select ARCH_USE_GNU_PROPERTY + select ARCH_BINFMT_ELF_STATE ---help--- Shadow Stack (SHSTK) provides protection against program stack corruption. 
It is active when the kernel has this diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h index 69c0f892e310..fac79b621e0a 100644 --- a/arch/x86/include/asm/elf.h +++ b/arch/x86/include/asm/elf.h @@ -367,6 +367,19 @@ extern int compat_arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp); #define compat_arch_setup_additional_pages compat_arch_setup_additional_pages +#ifdef CONFIG_ARCH_BINFMT_ELF_STATE +struct arch_elf_state { + unsigned int gnu_property; +}; + +#define INIT_ARCH_ELF_STATE { \ + .gnu_property = 0, \ +} + +#define arch_elf_pt_proc(ehdr, phdr, elf, interp, state) (0) +#define arch_check_elf(ehdr, interp, interp_ehdr, state) (0) +#endif + /* Do not change the values. See get_align_mask() */ enum align_flags { ALIGN_VA_32 = BIT(0), diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index 506d66830d4d..99548cde0cc6 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -732,3 +732,34 @@ unsigned long KSTK_ESP(struct task_struct *task) { return task_pt_regs(task)->sp; } + +#ifdef CONFIG_ARCH_USE_GNU_PROPERTY +int arch_parse_elf_property(u32 type, const void *data, size_t datasz, + bool compat, struct arch_elf_state *state) +{ + if (type != GNU_PROPERTY_X86_FEATURE_1_AND) + return 0; + + if (datasz != sizeof(unsigned int)) + return -ENOEXEC; + + state->gnu_property = *(unsigned int *)data; + return 0; +} + +int arch_setup_elf_property(struct arch_elf_state *state) +{ + int r = 0; + + memset(¤t->thread.cet, 0, sizeof(struct cet_status)); + + if (static_cpu_has(X86_FEATURE_SHSTK)) { + if (state->gnu_property & GNU_PROPERTY_X86_FEATURE_1_SHSTK) + r = cet_setup_shstk(); + if (r < 0) + return r; + } + + return r; +} +#endif From patchwork Wed Feb 5 18:19:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366953 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2A2D3921 for ; Wed, 5 Feb 2020 18:21:37 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DDE1821741 for ; Wed, 5 Feb 2020 18:21:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DDE1821741 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id CE9086B008A; Wed, 5 Feb 2020 13:20:37 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id BF92D6B0087; Wed, 5 Feb 2020 13:20:37 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A4C9D6B008A; Wed, 5 Feb 2020 13:20:37 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0128.hostedemail.com [216.40.44.128]) by kanga.kvack.org (Postfix) with ESMTP id 887E86B0087 for ; Wed, 5 Feb 2020 13:20:37 -0500 (EST) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 244DB181AEF09 for ; Wed, 5 Feb 2020 18:20:37 +0000 (UTC) X-FDA: 76456888914.26.mom66_79b87746bc92d X-Spam-Summary: 
1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30003:30012:30045:30054:30056:30064,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:1,LUA_SUMMARY:none X-HE-Tag: mom66_79b87746bc92d X-Filterd-Recvd-Size: 7094 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf16.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:36 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:30 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447851" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:30 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 25/27] x86/cet/shstk: Handle thread Shadow Stack Date: Wed, 5 Feb 2020 10:19:33 -0800 Message-Id: <20200205181935.3712-26-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The Shadow Stack (SHSTK) for clone/fork is handled as the following: (1) If ((clone_flags & (CLONE_VFORK | CLONE_VM)) == CLONE_VM), the kernel allocates (and frees on thread exit) a new SHSTK for the child. It is possible for the kernel to complete the clone syscall and set the child's SHSTK pointer to NULL and let the child thread allocate a SHSTK for itself. There are two issues in this approach: It is not compatible with existing code that does inline syscall and it cannot handle signals before the child can successfully allocate a SHSTK. (2) For (clone_flags & CLONE_VFORK), the child uses the existing SHSTK. (3) For all other cases, the SHSTK is copied/reused whenever the parent or the child does a call/ret. This patch handles cases (1) & (2). Case (3) is handled in the SHSTK page fault patches. A 64-bit SHSTK has a fixed size of RLIMIT_STACK. A compat-mode thread SHSTK has a fixed size of 1/4 RLIMIT_STACK. This allows more threads to share a 32-bit address space. 
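
Purely for illustration (these helpers do not exist in the patch), the two rules above boil down to:

#include <linux/resource.h>
#include <linux/sched.h>
#include <linux/sched/signal.h>

/* Case (1): only a thread that shares the mm and is not a vfork
 * child gets its own shadow stack allocated by the kernel. */
static bool child_needs_own_shstk(unsigned long clone_flags)
{
	return (clone_flags & (CLONE_VFORK | CLONE_VM)) == CLONE_VM;
}

/* Sizing: RLIMIT_STACK for a 64-bit thread, a quarter of that for a
 * compat-mode thread so more threads fit into a 32-bit address space. */
static unsigned long thread_shstk_size(bool compat)
{
	unsigned long size = rlimit(RLIMIT_STACK);

	return compat ? size / 4 : size;
}
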
Signed-off-by: Yu-cheng Yu --- arch/x86/include/asm/cet.h | 2 ++ arch/x86/include/asm/mmu_context.h | 3 +++ arch/x86/kernel/cet.c | 41 ++++++++++++++++++++++++++++++ arch/x86/kernel/process.c | 7 +++++ 4 files changed, 53 insertions(+) diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h index 409d4f91a0dc..9a3e2da9c1c4 100644 --- a/arch/x86/include/asm/cet.h +++ b/arch/x86/include/asm/cet.h @@ -19,10 +19,12 @@ struct cet_status { #ifdef CONFIG_X86_INTEL_CET int cet_setup_shstk(void); +int cet_setup_thread_shstk(struct task_struct *p); void cet_disable_free_shstk(struct task_struct *p); int cet_restore_signal(bool ia32, struct sc_ext *sc); int cet_setup_signal(bool ia32, unsigned long rstor, struct sc_ext *sc); #else +static inline int cet_setup_thread_shstk(struct task_struct *p) { return 0; } static inline void cet_disable_free_shstk(struct task_struct *p) {} static inline int cet_restore_signal(bool ia32, struct sc_ext *sc) { return -EINVAL; } static inline int cet_setup_signal(bool ia32, unsigned long rstor, diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 5f33924e200f..6a8189308823 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -13,6 +13,7 @@ #include #include #include +#include #include extern atomic64_t last_mm_ctx_id; @@ -230,6 +231,8 @@ do { \ #else #define deactivate_mm(tsk, mm) \ do { \ + if (!tsk->vfork_done) \ + cet_disable_free_shstk(tsk); \ load_gs_index(0); \ loadsegment(fs, 0); \ } while (0) diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c index cba5c7656aab..5b45abda80a1 100644 --- a/arch/x86/kernel/cet.c +++ b/arch/x86/kernel/cet.c @@ -170,6 +170,47 @@ int cet_setup_shstk(void) return 0; } +int cet_setup_thread_shstk(struct task_struct *tsk) +{ + unsigned long addr, size; + struct cet_user_state *state; + struct cet_status *cet = &tsk->thread.cet; + + if (!cet->shstk_enabled) + return 0; + + state = get_xsave_addr(&tsk->thread.fpu.state.xsave, + XFEATURE_CET_USER); + + if (!state) + return -EINVAL; + + size = rlimit(RLIMIT_STACK); + + /* + * Compat-mode pthreads share a limited address space. + * If each function call takes an average of four slots + * stack space, we need 1/4 of stack size for shadow stack. 
+ */ + if (in_compat_syscall()) + size /= 4; + + addr = alloc_shstk(size); + + if (IS_ERR((void *)addr)) { + cet->shstk_base = 0; + cet->shstk_size = 0; + cet->shstk_enabled = 0; + return PTR_ERR((void *)addr); + } + + fpu__prepare_write(&tsk->thread.fpu); + state->user_ssp = (u64)(addr + size); + cet->shstk_base = addr; + cet->shstk_size = size; + return 0; +} + void cet_disable_free_shstk(struct task_struct *tsk) { struct cet_status *cet = &tsk->thread.cet; diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index e102e63de641..7098618142f2 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -110,6 +110,7 @@ void exit_thread(struct task_struct *tsk) free_vm86(t); + cet_disable_free_shstk(tsk); fpu__drop(fpu); } @@ -180,6 +181,12 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp, if (clone_flags & CLONE_SETTLS) ret = set_new_tls(p, tls); +#ifdef CONFIG_X86_64 + /* Allocate a new shadow stack for pthread */ + if (!ret && (clone_flags & (CLONE_VFORK | CLONE_VM)) == CLONE_VM) + ret = cet_setup_thread_shstk(p); +#endif + if (!ret && unlikely(test_tsk_thread_flag(current, TIF_IO_BITMAP))) io_bitmap_share(p); From patchwork Wed Feb 5 18:19:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366955 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 74F2D921 for ; Wed, 5 Feb 2020 18:21:39 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4B94B22464 for ; Wed, 5 Feb 2020 18:21:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4B94B22464 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 08A9B6B0087; Wed, 5 Feb 2020 13:20:38 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id CE6986B0082; Wed, 5 Feb 2020 13:20:37 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B326C6B0088; Wed, 5 Feb 2020 13:20:37 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0207.hostedemail.com [216.40.44.207]) by kanga.kvack.org (Postfix) with ESMTP id 79F846B0082 for ; Wed, 5 Feb 2020 13:20:37 -0500 (EST) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 1DB981EFF for ; Wed, 5 Feb 2020 18:20:37 +0000 (UTC) X-FDA: 76456888914.21.bikes52_79bab6e2ec43f X-Spam-Summary: 
1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30054:30056:30064:30070,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:1,LUA_SUMMARY:none X-HE-Tag: bikes52_79bab6e2ec43f X-Filterd-Recvd-Size: 3203 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf09.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:36 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:30 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447855" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:30 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 26/27] mm/mmap: Add Shadow Stack pages to memory accounting Date: Wed, 5 Feb 2020 10:19:34 -0800 Message-Id: <20200205181935.3712-27-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add Shadow Stack pages to memory accounting. v8: - Change Shadow Stake pages from data_vm to stack_vm. 
Signed-off-by: Yu-cheng Yu --- mm/mmap.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/mm/mmap.c b/mm/mmap.c index 71e4ffc83bcd..acfa04e2a5dd 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1687,6 +1687,9 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags) if (file && is_file_hugepages(file)) return 0; + if (arch_copy_pte_mapping(vm_flags)) + return 1; + return (vm_flags & (VM_NORESERVE | VM_SHARED | VM_WRITE)) == VM_WRITE; } @@ -3302,6 +3305,8 @@ void vm_stat_account(struct mm_struct *mm, vm_flags_t flags, long npages) mm->stack_vm += npages; else if (is_data_mapping(flags)) mm->data_vm += npages; + else if (arch_copy_pte_mapping(flags)) + mm->stack_vm += npages; } static vm_fault_t special_mapping_fault(struct vm_fault *vmf); From patchwork Wed Feb 5 18:19:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11366957 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0EC5317E0 for ; Wed, 5 Feb 2020 18:21:42 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CC1742192A for ; Wed, 5 Feb 2020 18:21:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CC1742192A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 8EB836B0082; Wed, 5 Feb 2020 13:20:38 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 89A306B0088; Wed, 5 Feb 2020 13:20:38 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7624A6B0089; Wed, 5 Feb 2020 13:20:38 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0147.hostedemail.com [216.40.44.147]) by kanga.kvack.org (Postfix) with ESMTP id 599066B0082 for ; Wed, 5 Feb 2020 13:20:38 -0500 (EST) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id EC4B08248D51 for ; Wed, 5 Feb 2020 18:20:37 +0000 (UTC) X-FDA: 76456888914.02.money06_79d5fdcb20d07 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,:x86@kernel.org:hpa@zytor.com:tglx@linutronix.de:mingo@redhat.com:linux-kernel@vger.kernel.org:linux-doc@vger.kernel.org::linux-arch@vger.kernel.org:linux-api@vger.kernel.org:arnd@arndb.de:luto@kernel.org:bsingharora@gmail.com:bp@alien8.de:gorcunov@gmail.com:dave.hansen@linux.intel.com:esyr@redhat.com:fweimer@redhat.com:hjl.tools@gmail.com:jannh@google.com:corbet@lwn.net:keescook@chromium.org:mike.kravetz@oracle.com:nadav.amit@gmail.com:oleg@redhat.com:pavel@ucw.cz:peterz@infradead.org:rdunlap@infradead.org:ravi.v.shankar@intel.com:vedvyas.shanbhogue@intel.com:dave.martin@arm.com:x86-patch-review@intel.com:yu-cheng.yu@intel.com,RULES_HIT:30003:30046:30051:30054:30056:30064:30069:30075,0,RBL:134.134.136.20:@intel.com:.lbl8.mailshell.net-62.50.0.100 64.95.201.95,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:1,LUA_SUMMARY:none X-HE-Tag: money06_79d5fdcb20d07 X-Filterd-Recvd-Size: 9167 
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Wed, 5 Feb 2020 18:20:37 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Feb 2020 10:20:31 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,406,1574150400"; d="scan'208";a="279447862" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by FMSMGA003.fm.intel.com with ESMTP; 05 Feb 2020 10:20:30 -0800 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Cc: Yu-cheng Yu Subject: [RFC PATCH v9 27/27] x86/cet/shstk: Add arch_prctl functions for Shadow Stack Date: Wed, 5 Feb 2020 10:19:35 -0800 Message-Id: <20200205181935.3712-28-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200205181935.3712-1-yu-cheng.yu@intel.com> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: arch_prctl(ARCH_X86_CET_STATUS, unsigned long *addr) Return CET feature status. The parameter 'addr' is a pointer to a user buffer. On returning to the caller, the kernel fills the following information: *addr = SHSTK/IBT status *(addr + 1) = SHSTK base address *(addr + 2) = SHSTK size arch_prctl(ARCH_X86_CET_DISABLE, unsigned long features) Disable CET features specified in 'features'. Return -EPERM if CET is locked. arch_prctl(ARCH_X86_CET_LOCK) Lock in CET feature. arch_prctl(ARCH_X86_CET_ALLOC_SHSTK, unsigned long *addr) Allocate a new SHSTK. The parameter 'addr' is a pointer to a user buffer and indicates the desired SHSTK size to allocate. On returning to the caller the buffer contains the address of the new SHSTK. There is no CET enabling arch_prctl function. By design, CET is enabled automatically if the binary and the system can support it. The parameters passed are always unsigned 64-bit. When an IA32 application passing pointers, it should only use the lower 32 bits. 
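
A minimal user-space sketch of the interface described above; the ARCH_X86_CET_* values are the ones proposed by this patch, SYS_arch_prctl is the x86-64 syscall number from <sys/syscall.h>, and error handling is kept to a minimum. It is a sketch, not a reference implementation:

#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

#define ARCH_X86_CET_STATUS		0x3001
#define ARCH_X86_CET_ALLOC_SHSTK	0x3004

int main(void)
{
	unsigned long buf[3] = { 0, 0, 0 };
	unsigned long arg = 0x10000;	/* request a 64 KiB shadow stack */

	/* buf[0] = feature bits, buf[1] = SHSTK base, buf[2] = SHSTK size */
	if (syscall(SYS_arch_prctl, ARCH_X86_CET_STATUS, buf) == 0)
		printf("features %#lx base %#lx size %#lx\n",
		       buf[0], buf[1], buf[2]);

	/* On success the size in 'arg' is replaced by the new SHSTK address. */
	if (syscall(SYS_arch_prctl, ARCH_X86_CET_ALLOC_SHSTK, &arg) == 0)
		printf("new shadow stack at %#lx\n", arg);

	return 0;
}

Note how ARCH_X86_CET_ALLOC_SHSTK reuses the same word for input (requested size) and output (new address), matching the description above.
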
Signed-off-by: Yu-cheng Yu
---
 arch/x86/include/asm/cet.h        |  4 ++
 arch/x86/include/uapi/asm/prctl.h |  5 ++
 arch/x86/kernel/Makefile          |  2 +-
 arch/x86/kernel/cet.c             | 29 +++++++++++
 arch/x86/kernel/cet_prctl.c       | 84 +++++++++++++++++++++++++++++++
 arch/x86/kernel/process.c         |  4 +-
 6 files changed, 125 insertions(+), 3 deletions(-)
 create mode 100644 arch/x86/kernel/cet_prctl.c

diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h
index 9a3e2da9c1c4..b64f6d810ae0 100644
--- a/arch/x86/include/asm/cet.h
+++ b/arch/x86/include/asm/cet.h
@@ -14,16 +14,20 @@ struct sc_ext;
 struct cet_status {
 	unsigned long	shstk_base;
 	unsigned long	shstk_size;
+	unsigned int	locked:1;
 	unsigned int	shstk_enabled:1;
 };
 
 #ifdef CONFIG_X86_INTEL_CET
+int prctl_cet(int option, unsigned long arg2);
 int cet_setup_shstk(void);
 int cet_setup_thread_shstk(struct task_struct *p);
+int cet_alloc_shstk(unsigned long *arg);
 void cet_disable_free_shstk(struct task_struct *p);
 int cet_restore_signal(bool ia32, struct sc_ext *sc);
 int cet_setup_signal(bool ia32, unsigned long rstor, struct sc_ext *sc);
 #else
+static inline int prctl_cet(int option, unsigned long arg2) { return -EINVAL; }
 static inline int cet_setup_thread_shstk(struct task_struct *p) { return 0; }
 static inline void cet_disable_free_shstk(struct task_struct *p) {}
 static inline int cet_restore_signal(bool ia32, struct sc_ext *sc) { return -EINVAL; }
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 5a6aac9fa41f..d962f0ec9ccf 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -14,4 +14,9 @@
 #define ARCH_MAP_VDSO_32	0x2002
 #define ARCH_MAP_VDSO_64	0x2003
 
+#define ARCH_X86_CET_STATUS		0x3001
+#define ARCH_X86_CET_DISABLE		0x3002
+#define ARCH_X86_CET_LOCK		0x3003
+#define ARCH_X86_CET_ALLOC_SHSTK	0x3004
+
 #endif /* _ASM_X86_PRCTL_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index b8c1ea4ab7eb..69a19957e200 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -142,7 +142,7 @@ obj-$(CONFIG_UNWINDER_ORC)		+= unwind_orc.o
 obj-$(CONFIG_UNWINDER_FRAME_POINTER)	+= unwind_frame.o
 obj-$(CONFIG_UNWINDER_GUESS)		+= unwind_guess.o
 
-obj-$(CONFIG_X86_INTEL_CET)		+= cet.o
+obj-$(CONFIG_X86_INTEL_CET)		+= cet.o cet_prctl.o
 
 ###
 # 64 bit specific files
diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
index 5b45abda80a1..01aa24c40a5d 100644
--- a/arch/x86/kernel/cet.c
+++ b/arch/x86/kernel/cet.c
@@ -145,6 +145,35 @@ static int create_rstor_token(bool ia32, unsigned long ssp,
 	return 0;
 }
 
+int cet_alloc_shstk(unsigned long *arg)
+{
+	unsigned long len = *arg;
+	unsigned long addr;
+	unsigned long token;
+	unsigned long ssp;
+
+	addr = alloc_shstk(len);
+
+	if (IS_ERR((void *)addr))
+		return PTR_ERR((void *)addr);
+
+	/* Restore token is 8 bytes and aligned to 8 bytes */
+	ssp = addr + len;
+	token = ssp;
+
+	if (!in_ia32_syscall())
+		token |= TOKEN_MODE_64;
+	ssp -= 8;
+
+	if (write_user_shstk_64(ssp, token)) {
+		vm_munmap(addr, len);
+		return -EINVAL;
+	}
+
+	*arg = addr;
+	return 0;
+}
+
 int cet_setup_shstk(void)
 {
 	unsigned long addr, size;
diff --git a/arch/x86/kernel/cet_prctl.c b/arch/x86/kernel/cet_prctl.c
new file mode 100644
index 000000000000..6cf8f87e3d98
--- /dev/null
+++ b/arch/x86/kernel/cet_prctl.c
@@ -0,0 +1,84 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+/* See Documentation/x86/intel_cet.rst. */
+
+static int handle_get_status(unsigned long arg2)
+{
+	struct cet_status *cet = &current->thread.cet;
+	unsigned int features = 0;
+	unsigned long buf[3];
+
+	if (cet->shstk_enabled)
+		features |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
+
+	buf[0] = (unsigned long)features;
+	buf[1] = cet->shstk_base;
+	buf[2] = cet->shstk_size;
+	return copy_to_user((unsigned long __user *)arg2, buf,
+			    sizeof(buf));
+}
+
+static int handle_alloc_shstk(unsigned long arg2)
+{
+	int err = 0;
+	unsigned long arg;
+	unsigned long addr = 0;
+	unsigned long size = 0;
+
+	if (get_user(arg, (unsigned long __user *)arg2))
+		return -EFAULT;
+
+	size = arg;
+	err = cet_alloc_shstk(&arg);
+	if (err)
+		return err;
+
+	addr = arg;
+	if (put_user(addr, (unsigned long __user *)arg2)) {
+		vm_munmap(addr, size);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+int prctl_cet(int option, unsigned long arg2)
+{
+	struct cet_status *cet = &current->thread.cet;
+
+	if (!cpu_x86_cet_enabled())
+		return -EINVAL;
+
+	switch (option) {
+	case ARCH_X86_CET_STATUS:
+		return handle_get_status(arg2);
+
+	case ARCH_X86_CET_DISABLE:
+		if (cet->locked)
+			return -EPERM;
+		if (arg2 & GNU_PROPERTY_X86_FEATURE_1_SHSTK)
+			cet_disable_free_shstk(current);
+
+		return 0;
+
+	case ARCH_X86_CET_LOCK:
+		cet->locked = 1;
+		return 0;
+
+	case ARCH_X86_CET_ALLOC_SHSTK:
+		return handle_alloc_shstk(arg2);
+
+	default:
+		return -EINVAL;
+	}
+}
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 7098618142f2..63dc88070923 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -998,7 +998,7 @@ long do_arch_prctl_common(struct task_struct *task, int option,
 		return get_cpuid_mode();
 	case ARCH_SET_CPUID:
 		return set_cpuid_mode(task, cpuid_enabled);
+	default:
+		return prctl_cet(option, cpuid_enabled);
 	}
-
-	return -EINVAL;
 }
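For completeness, a hedged user-space sketch of the ARCH_X86_CET_ALLOC_SHSTK
size-in/address-out convention implemented by handle_alloc_shstk() above.
The raw arch_prctl syscall, the example size, and the final munmap() cleanup
are assumptions for illustration, not part of this patch.

  /* Minimal sketch: request a new shadow stack from the kernel. */
  #include <stdio.h>
  #include <sys/mman.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  #define ARCH_X86_CET_ALLOC_SHSTK	0x3004

  int main(void)
  {
  	unsigned long arg = 0x10000;	/* in: desired size; out: SHSTK address */

  	if (syscall(SYS_arch_prctl, ARCH_X86_CET_ALLOC_SHSTK, &arg)) {
  		perror("ARCH_X86_CET_ALLOC_SHSTK");
  		return 1;
  	}

  	/* cet_alloc_shstk() placed a restore token at arg + 0x10000 - 8. */
  	printf("new shadow stack at %#lx\n", arg);

  	/* Assumed cleanup: unmap the shadow stack like ordinary memory. */
  	munmap((void *)arg, 0x10000);
  	return 0;
  }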