From patchwork Fri Oct 9 18:32:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827037 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5FA67139F for ; Fri, 9 Oct 2020 18:33:16 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 185CD222B9 for ; Fri, 9 Oct 2020 18:33:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 185CD222B9 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 6FAC46B0062; Fri, 9 Oct 2020 14:33:14 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 6AC406B0068; Fri, 9 Oct 2020 14:33:14 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4FD1D6B006C; Fri, 9 Oct 2020 14:33:14 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 24A466B0062 for ; Fri, 9 Oct 2020 14:33:14 -0400 (EDT) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id AC5738249980 for ; Fri, 9 Oct 2020 18:33:13 +0000 (UTC) X-FDA: 77353234266.23.paper20_5c16faa271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin23.hostedemail.com (Postfix) with ESMTP id 891FB37606 for ; Fri, 9 Oct 2020 18:33:13 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30003:30012:30046:30051:30054:30055:30056:30062:30064:30069:30070:30075:30079:30089,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100;04ygsr1jby94egztiogjkpgbeyw59yc548gqoywf6ooi5rj41zufsijxrimetf7.591k1trobmhbni14qe7sam6kbnd9h8k1ey5ghdyem9piqgrxxenx88sjd1axbkr.s-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: paper20_5c16faa271e2 X-Filterd-Recvd-Size: 10173 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf47.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:12 +0000 (UTC) IronPort-SDR: K6/rgQL7t8yxxIEoQXNZaHpobfqMPE38X7pZ4rMqOZdCNMMakpRKOX/6Fc7YElr8LEFoqUokOd G2MyhzuY+IAw== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390191" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390191" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:10 -0700 IronPort-SDR: SVy66sUDtVoA5a2G2YD/OTN6KwqOHE/c9fxQskputhCBOrlutlHdnwTGRBobLOz+c2VYWFEjiA D4syxx5bJNIg== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031104" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:09 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 01/26] Documentation/x86: Add CET description Date: Fri, 9 Oct 2020 11:32:05 -0700 Message-Id: <20201009183230.26717-2-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Explain no_user_shstk/no_user_ibt kernel parameters, and introduce a new document on Control-flow Enforcement Technology (CET). Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook v13: - Change X86_INTEL_* to X86_*. v12: - Remove ARCH_X86_CET_MMAP_SHSTK information. v11: - Add back GLIBC tunables information. - Add ARCH_X86_CET_MMAP_SHSTK information. v10: - Change no_cet_shstk and no_cet_ibt to no_user_shstk and no_user_ibt. - Remove the opcode section, as it is already in the Intel SDM. - Remove sections related to GLIBC implementation. - Remove shadow stack memory management section, as it is already in the code comments. - Remove legacy bitmap related information, as it is not supported now. - Fix arch_ioctl() related text. - Change SHSTK, IBT to plain English. --- .../admin-guide/kernel-parameters.txt | 6 + Documentation/x86/index.rst | 1 + Documentation/x86/intel_cet.rst | 133 ++++++++++++++++++ 3 files changed, 140 insertions(+) create mode 100644 Documentation/x86/intel_cet.rst diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index a1068742a6df..7c7124a6a7ac 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3164,6 +3164,12 @@ noexec=on: enable non-executable mappings (default) noexec=off: disable non-executable mappings + no_user_shstk [X86-64] Disable Shadow Stack for user-mode + applications + + no_user_ibt [X86-64] Disable Indirect Branch Tracking for user-mode + applications + nosmap [X86,PPC] Disable SMAP (Supervisor Mode Access Prevention) even if it is supported by processor. diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst index 265d9e9a093b..2aef972a868d 100644 --- a/Documentation/x86/index.rst +++ b/Documentation/x86/index.rst @@ -19,6 +19,7 @@ x86-specific Documentation tlb mtrr pat + intel_cet intel-iommu intel_txt amd-memory-encryption diff --git a/Documentation/x86/intel_cet.rst b/Documentation/x86/intel_cet.rst new file mode 100644 index 000000000000..c9ebb3d9dd00 --- /dev/null +++ b/Documentation/x86/intel_cet.rst @@ -0,0 +1,133 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================================= +Control-flow Enforcement Technology (CET) +========================================= + +[1] Overview +============ + +Control-flow Enforcement Technology (CET) is an Intel processor feature +that provides protection against return/jump-oriented programming (ROP) +attacks. It can be set up to protect both applications and the kernel. +Only user-mode protection is implemented in the 64-bit kernel, including +support for running legacy 32-bit applications. + +CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is +a secondary stack allocated from memory and cannot be directly modified by +applications. When executing a CALL, the processor pushes the return +address to both the normal stack and the shadow stack. Upon function +return, the processor pops the shadow stack copy and compares it to the +normal stack copy. If the two differ, the processor raises a control- +protection fault. Indirect branch tracking verifies indirect CALL/JMP +targets are intended as marked by the compiler with 'ENDBR' opcodes. + +There are two kernel configuration options: + + X86_SHADOW_STACK_USER, and + X86_BRANCH_TRACKING_USER. + +These need to be enabled to build a CET-enabled kernel, and Binutils v2.31 +and GCC v8.1 or later are required to build a CET kernel. To build a CET- +enabled application, GLIBC v2.28 or later is also required. + +There are two command-line options for disabling CET features:: + + no_user_shstk - disables user shadow stack, and + no_user_ibt - disables user indirect branch tracking. + +At run time, /proc/cpuinfo shows CET features if the processor supports +CET. + +[2] Application Enabling +======================== + +An application's CET capability is marked in its ELF header and can be +verified from the following command output, in the NT_GNU_PROPERTY_TYPE_0 +field: + + readelf -n + +If an application supports CET and is statically linked, it will run with +CET protection. If the application needs any shared libraries, the loader +checks all dependencies and enables CET when all requirements are met. + +[3] Backward Compatibility +========================== + +GLIBC provides a few tunables for backward compatibility. + +GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-IBT + Turn off SHSTK/IBT for the current shell. + +GLIBC_TUNABLES=glibc.tune.x86_shstk= + This controls how dlopen() handles SHSTK legacy libraries:: + + on - continue with SHSTK enabled; + permissive - continue with SHSTK off. + +[4] CET arch_prctl()'s +====================== + +Several arch_prctl()'s have been added for CET: + +arch_prctl(ARCH_X86_CET_STATUS, u64 *addr) + Return CET feature status. + + The parameter 'addr' is a pointer to a user buffer. + On returning to the caller, the kernel fills the following + information:: + + *addr = shadow stack/indirect branch tracking status + *(addr + 1) = shadow stack base address + *(addr + 2) = shadow stack size + +arch_prctl(ARCH_X86_CET_DISABLE, unsigned int features) + Disable shadow stack and/or indirect branch tracking as specified in + 'features'. Return -EPERM if CET is locked. + +arch_prctl(ARCH_X86_CET_LOCK) + Lock in all CET features. They cannot be turned off afterwards. + +Note: + There is no CET-enabling arch_prctl function. By design, CET is enabled + automatically if the binary and the system can support it. + +[5] The implementation of the Shadow Stack +========================================== + +Shadow Stack size +----------------- + +A task's shadow stack is allocated from memory to a fixed size of +MIN(RLIMIT_STACK, 4 GB). In other words, the shadow stack is allocated to +the maximum size of the normal stack, but capped to 4 GB. However, +a compat-mode application's address space is smaller, each of its thread's +shadow stack size is MIN(1/4 RLIMIT_STACK, 4 GB). + +Signal +------ + +The main program and its signal handlers use the same shadow stack. +Because the shadow stack stores only return addresses, a large shadow +stack covers the condition that both the program stack and the signal +alternate stack run out. + +The kernel creates a restore token for the shadow stack restoring address +and verifies that token when restoring from the signal handler. + +Fork +---- + +The shadow stack's vma has VM_SHSTK flag set; its PTEs are required to be +read-only and dirty. When a shadow stack PTE is not RO and dirty, a +shadow access triggers a page fault with the shadow stack access bit set +in the page fault error code. + +When a task forks a child, its shadow stack PTEs are copied and both the +parent's and the child's shadow stack PTEs are cleared of the dirty bit. +Upon the next shadow stack access, the resulting shadow stack page fault +is handled by page copy/re-use. + +When a pthread child is created, the kernel allocates a new shadow stack +for the new thread. From patchwork Fri Oct 9 18:32:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827039 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6075B139F for ; Fri, 9 Oct 2020 18:33:18 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 285AE22284 for ; Fri, 9 Oct 2020 18:33:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 285AE22284 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E4E126B0068; Fri, 9 Oct 2020 14:33:15 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id DDB9B6B006C; Fri, 9 Oct 2020 14:33:15 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B65556B0070; Fri, 9 Oct 2020 14:33:15 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0159.hostedemail.com [216.40.44.159]) by kanga.kvack.org (Postfix) with ESMTP id 88E266B0068 for ; Fri, 9 Oct 2020 14:33:15 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 21869180AD807 for ; Fri, 9 Oct 2020 18:33:15 +0000 (UTC) X-FDA: 77353234350.13.limit89_600de67271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin13.hostedemail.com (Postfix) with ESMTP id 03C9C18140B67 for ; Fri, 9 Oct 2020 18:33:14 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30054:30055:30056:30064,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100;04y8kwpe991z8b58wucxubqr618sxyp6kcwpah6sgdti4kgsan3ce67oam3sxss.yfraxaqdyt9djbh6kyxfqb4hn4e9qtxx9qxi6wcux5fprcoeznpfte7qdqfh1ax.q-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:26,LUA_SUMMARY:none X-HE-Tag: limit89_600de67271e2 X-Filterd-Recvd-Size: 6458 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf47.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:13 +0000 (UTC) IronPort-SDR: bgcdvRaFYhyKPdz8VMd0jOlqsvMWyVLxOGn3SWqHkL75bbN19HtYNOfyjmgrctPgvD7yUJ9nd2 RyX0ugQJe0FQ== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390196" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390196" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:10 -0700 IronPort-SDR: nlWBVAbcM9MZMAEV6dAYjnLG/+ceiVx1lhrPatb7R9HAns+Y6izzzqFbYIwEizMOYKJhiWHHxY KrxPRgmH9wmA== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031108" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:09 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu , Borislav Petkov Subject: [PATCH v14 02/26] x86/cpufeatures: Add CET CPU feature flags for Control-flow Enforcement Technology (CET) Date: Fri, 9 Oct 2020 11:32:06 -0700 Message-Id: <20201009183230.26717-3-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add CPU feature flags for Control-flow Enforcement Technology (CET). CPUID.(EAX=7,ECX=0):ECX[bit 7] Shadow stack CPUID.(EAX=7,ECX=0):EDX[bit 20] Indirect Branch Tracking Signed-off-by: Yu-cheng Yu Reviewed-by: Borislav Petkov Reviewed-by: Kees Cook --- arch/x86/include/asm/cpufeatures.h | 2 ++ arch/x86/kernel/cpu/cpuid-deps.c | 2 ++ tools/arch/x86/include/asm/cpufeatures.h | 2 ++ 3 files changed, 6 insertions(+) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 2901d5df4366..c794e18e8a14 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -341,6 +341,7 @@ #define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */ #define X86_FEATURE_WAITPKG (16*32+ 5) /* UMONITOR/UMWAIT/TPAUSE Instructions */ #define X86_FEATURE_AVX512_VBMI2 (16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */ +#define X86_FEATURE_SHSTK (16*32+ 7) /* Shadow Stack */ #define X86_FEATURE_GFNI (16*32+ 8) /* Galois Field New Instructions */ #define X86_FEATURE_VAES (16*32+ 9) /* Vector AES */ #define X86_FEATURE_VPCLMULQDQ (16*32+10) /* Carry-Less Multiplication Double Quadword */ @@ -370,6 +371,7 @@ #define X86_FEATURE_SERIALIZE (18*32+14) /* SERIALIZE instruction */ #define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */ #define X86_FEATURE_ARCH_LBR (18*32+19) /* Intel ARCH LBR */ +#define X86_FEATURE_IBT (18*32+20) /* Indirect Branch Tracking */ #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */ #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */ #define X86_FEATURE_FLUSH_L1D (18*32+28) /* Flush L1D cache */ diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c index 3cbe24ca80ab..fec83cc74b9e 100644 --- a/arch/x86/kernel/cpu/cpuid-deps.c +++ b/arch/x86/kernel/cpu/cpuid-deps.c @@ -69,6 +69,8 @@ static const struct cpuid_dep cpuid_deps[] = { { X86_FEATURE_CQM_MBM_TOTAL, X86_FEATURE_CQM_LLC }, { X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC }, { X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL }, + { X86_FEATURE_SHSTK, X86_FEATURE_XSAVES }, + { X86_FEATURE_IBT, X86_FEATURE_XSAVES }, {} }; diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h index 2901d5df4366..c794e18e8a14 100644 --- a/tools/arch/x86/include/asm/cpufeatures.h +++ b/tools/arch/x86/include/asm/cpufeatures.h @@ -341,6 +341,7 @@ #define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */ #define X86_FEATURE_WAITPKG (16*32+ 5) /* UMONITOR/UMWAIT/TPAUSE Instructions */ #define X86_FEATURE_AVX512_VBMI2 (16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */ +#define X86_FEATURE_SHSTK (16*32+ 7) /* Shadow Stack */ #define X86_FEATURE_GFNI (16*32+ 8) /* Galois Field New Instructions */ #define X86_FEATURE_VAES (16*32+ 9) /* Vector AES */ #define X86_FEATURE_VPCLMULQDQ (16*32+10) /* Carry-Less Multiplication Double Quadword */ @@ -370,6 +371,7 @@ #define X86_FEATURE_SERIALIZE (18*32+14) /* SERIALIZE instruction */ #define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */ #define X86_FEATURE_ARCH_LBR (18*32+19) /* Intel ARCH LBR */ +#define X86_FEATURE_IBT (18*32+20) /* Indirect Branch Tracking */ #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */ #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */ #define X86_FEATURE_FLUSH_L1D (18*32+28) /* Flush L1D cache */ From patchwork Fri Oct 9 18:32:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827043 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4214814D5 for ; Fri, 9 Oct 2020 18:33:22 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 142DD22284 for ; Fri, 9 Oct 2020 18:33:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 142DD22284 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id CEF606B006E; Fri, 9 Oct 2020 14:33:16 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id C506C6B0071; Fri, 9 Oct 2020 14:33:16 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AC6A26B0072; Fri, 9 Oct 2020 14:33:16 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0210.hostedemail.com [216.40.44.210]) by kanga.kvack.org (Postfix) with ESMTP id 7C7D06B006E for ; Fri, 9 Oct 2020 14:33:16 -0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 25CB81EF1 for ; Fri, 9 Oct 2020 18:33:16 +0000 (UTC) X-FDA: 77353234392.19.tax32_2b09c21271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin19.hostedemail.com (Postfix) with ESMTP id DFE111AD1B2 for ; Fri, 9 Oct 2020 18:33:15 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30034:30045:30051:30054:30056:30064:30075:30090,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100;04yfc5ukqd5nwag6smufka9wei8i6ocj9xm6ypeqe7u78qchcpxsso9ty9fho46.ity8bkqfznxtt9makkfoyg5ot1nbc5jq1us63xrhpxnt6suc3t1e7w4kztpk9i1.w-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: tax32_2b09c21271e2 X-Filterd-Recvd-Size: 12566 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf29.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:13 +0000 (UTC) IronPort-SDR: GW6piLbSF366I81wNZoZT0Uqy/iAVexjsrVxaOqcWTJ7D8kKyt6zAX2oy3SlSXVGJVWVTRiPFr eDs8PHiNjCxw== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390199" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390199" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:11 -0700 IronPort-SDR: e4W868LYHWRCdb2Q2qLhQWgadVFBercUan7I59aKmXM/FRXzdKvl6IVcA4k6elqSEbVjc/Fege L2j1KYPAp0MA== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031112" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:10 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 03/26] x86/fpu/xstate: Introduce CET MSR XSAVES supervisor states Date: Fri, 9 Oct 2020 11:32:07 -0700 Message-Id: <20201009183230.26717-4-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Control-flow Enforcement Technology (CET) adds five MSRs. Introduce them and their XSAVES supervisor states: MSR_IA32_U_CET (user-mode CET settings), MSR_IA32_PL3_SSP (user-mode Shadow Stack pointer), MSR_IA32_PL0_SSP (kernel-mode Shadow Stack pointer), MSR_IA32_PL1_SSP (Privilege Level 1 Shadow Stack pointer), MSR_IA32_PL2_SSP (Privilege Level 2 Shadow Stack pointer). Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook v12: - Add definition of reserved/suppress* bits for MSR_IA32_U_CET and MSR_IA32_S_CET. v11: - Drop MSR_IA32 prefix for individual bits, and use BIT_ULL(). - Drop MSR_IA32_CET_BITMAP_MASK. v6: - Remove __packed from struct cet_user_state, struct cet_kernel_state. --- arch/x86/include/asm/fpu/types.h | 23 +++++++++++++++-- arch/x86/include/asm/fpu/xstate.h | 5 ++-- arch/x86/include/asm/msr-index.h | 20 +++++++++++++++ arch/x86/include/uapi/asm/processor-flags.h | 2 ++ arch/x86/kernel/fpu/xstate.c | 28 ++++++++++++++++++--- tools/arch/x86/include/asm/msr-index.h | 20 +++++++++++++++ 6 files changed, 91 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h index c87364ea6446..2a7037a6f960 100644 --- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -115,8 +115,8 @@ enum xfeature { XFEATURE_PT_UNIMPLEMENTED_SO_FAR, XFEATURE_PKRU, XFEATURE_RSRVD_COMP_10, - XFEATURE_RSRVD_COMP_11, - XFEATURE_RSRVD_COMP_12, + XFEATURE_CET_USER, + XFEATURE_CET_KERNEL, XFEATURE_RSRVD_COMP_13, XFEATURE_RSRVD_COMP_14, XFEATURE_LBR, @@ -134,6 +134,8 @@ enum xfeature { #define XFEATURE_MASK_Hi16_ZMM (1 << XFEATURE_Hi16_ZMM) #define XFEATURE_MASK_PT (1 << XFEATURE_PT_UNIMPLEMENTED_SO_FAR) #define XFEATURE_MASK_PKRU (1 << XFEATURE_PKRU) +#define XFEATURE_MASK_CET_USER (1 << XFEATURE_CET_USER) +#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL) #define XFEATURE_MASK_LBR (1 << XFEATURE_LBR) #define XFEATURE_MASK_FPSSE (XFEATURE_MASK_FP | XFEATURE_MASK_SSE) @@ -236,6 +238,23 @@ struct pkru_state { u32 pad; } __packed; +/* + * State component 11 is Control-flow Enforcement user states + */ +struct cet_user_state { + u64 user_cet; /* user control-flow settings */ + u64 user_ssp; /* user shadow stack pointer */ +}; + +/* + * State component 12 is Control-flow Enforcement kernel states + */ +struct cet_kernel_state { + u64 kernel_ssp; /* kernel shadow stack */ + u64 pl1_ssp; /* privilege level 1 shadow stack */ + u64 pl2_ssp; /* privilege level 2 shadow stack */ +}; + /* * State component 15: Architectural LBR configuration state. * The size of Arch LBR state depends on the number of LBRs (lbr_depth). diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h index 14ab815132d4..e4408db88bca 100644 --- a/arch/x86/include/asm/fpu/xstate.h +++ b/arch/x86/include/asm/fpu/xstate.h @@ -35,7 +35,7 @@ XFEATURE_MASK_BNDCSR) /* All currently supported supervisor features */ -#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (0) +#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_CET_USER) /* * A supervisor state component may not always contain valuable information, @@ -62,7 +62,8 @@ * Unsupported supervisor features. When a supervisor feature in this mask is * supported in the future, move it to the supported supervisor feature mask. */ -#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT) +#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT | \ + XFEATURE_MASK_CET_KERNEL) /* All supervisor states including supported and unsupported states. */ #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 2859ee4f39a8..f6f3a0e6c664 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -912,4 +912,24 @@ #define MSR_VM_IGNNE 0xc0010115 #define MSR_VM_HSAVE_PA 0xc0010117 +/* Control-flow Enforcement Technology MSRs */ +#define MSR_IA32_U_CET 0x6a0 /* user mode cet setting */ +#define MSR_IA32_S_CET 0x6a2 /* kernel mode cet setting */ +#define MSR_IA32_PL0_SSP 0x6a4 /* kernel shstk pointer */ +#define MSR_IA32_PL1_SSP 0x6a5 /* ring-1 shstk pointer */ +#define MSR_IA32_PL2_SSP 0x6a6 /* ring-2 shstk pointer */ +#define MSR_IA32_PL3_SSP 0x6a7 /* user shstk pointer */ +#define MSR_IA32_INT_SSP_TAB 0x6a8 /* exception shstk table */ + +/* MSR_IA32_U_CET and MSR_IA32_S_CET bits */ +#define CET_SHSTK_EN BIT_ULL(0) +#define CET_WRSS_EN BIT_ULL(1) +#define CET_ENDBR_EN BIT_ULL(2) +#define CET_LEG_IW_EN BIT_ULL(3) +#define CET_NO_TRACK_EN BIT_ULL(4) +#define CET_SUPPRESS_DISABLE BIT_ULL(5) +#define CET_RESERVED (BIT_ULL(6) | BIT_ULL(7) | BIT_ULL(8) | BIT_ULL(9)) +#define CET_SUPPRESS BIT_ULL(10) +#define CET_WAIT_ENDBR BIT_ULL(11) + #endif /* _ASM_X86_MSR_INDEX_H */ diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h index bcba3c643e63..a8df907e8017 100644 --- a/arch/x86/include/uapi/asm/processor-flags.h +++ b/arch/x86/include/uapi/asm/processor-flags.h @@ -130,6 +130,8 @@ #define X86_CR4_SMAP _BITUL(X86_CR4_SMAP_BIT) #define X86_CR4_PKE_BIT 22 /* enable Protection Keys support */ #define X86_CR4_PKE _BITUL(X86_CR4_PKE_BIT) +#define X86_CR4_CET_BIT 23 /* enable Control-flow Enforcement */ +#define X86_CR4_CET _BITUL(X86_CR4_CET_BIT) /* * x86-64 Task Priority Register, CR8 diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 038e19c0019e..705fd9b94e31 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -38,6 +38,9 @@ static const char *xfeature_names[] = "Processor Trace (unused)" , "Protection Keys User registers", "unknown xstate feature" , + "Control-flow User registers" , + "Control-flow Kernel registers" , + "unknown xstate feature" , }; static short xsave_cpuid_features[] __initdata = { @@ -51,6 +54,9 @@ static short xsave_cpuid_features[] __initdata = { X86_FEATURE_AVX512F, X86_FEATURE_INTEL_PT, X86_FEATURE_PKU, + -1, /* Unused */ + X86_FEATURE_SHSTK, /* XFEATURE_CET_USER */ + X86_FEATURE_SHSTK, /* XFEATURE_CET_KERNEL */ }; /* @@ -318,6 +324,8 @@ static void __init print_xstate_features(void) print_xstate_feature(XFEATURE_MASK_ZMM_Hi256); print_xstate_feature(XFEATURE_MASK_Hi16_ZMM); print_xstate_feature(XFEATURE_MASK_PKRU); + print_xstate_feature(XFEATURE_MASK_CET_USER); + print_xstate_feature(XFEATURE_MASK_CET_KERNEL); } /* @@ -592,6 +600,8 @@ static void check_xstate_against_struct(int nr) XCHECK_SZ(sz, nr, XFEATURE_ZMM_Hi256, struct avx_512_zmm_uppers_state); XCHECK_SZ(sz, nr, XFEATURE_Hi16_ZMM, struct avx_512_hi16_state); XCHECK_SZ(sz, nr, XFEATURE_PKRU, struct pkru_state); + XCHECK_SZ(sz, nr, XFEATURE_CET_USER, struct cet_user_state); + XCHECK_SZ(sz, nr, XFEATURE_CET_KERNEL, struct cet_kernel_state); /* * Make *SURE* to add any feature numbers in below if @@ -601,7 +611,8 @@ static void check_xstate_against_struct(int nr) if ((nr < XFEATURE_YMM) || (nr >= XFEATURE_MAX) || (nr == XFEATURE_PT_UNIMPLEMENTED_SO_FAR) || - ((nr >= XFEATURE_RSRVD_COMP_10) && (nr <= XFEATURE_LBR))) { + (nr == XFEATURE_RSRVD_COMP_10) || + ((nr >= XFEATURE_RSRVD_COMP_13) && (nr <= XFEATURE_LBR))) { WARN_ONCE(1, "no structure for xstate: %d\n", nr); XSTATE_WARN_ON(1); } @@ -831,8 +842,19 @@ void __init fpu__init_system_xstate(void) * Clear XSAVE features that are disabled in the normal CPUID. */ for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) { - if (!boot_cpu_has(xsave_cpuid_features[i])) - xfeatures_mask_all &= ~BIT_ULL(i); + if (xsave_cpuid_features[i] == X86_FEATURE_SHSTK) { + /* + * X86_FEATURE_SHSTK and X86_FEATURE_IBT share + * same states, but can be enabled separately. + */ + if (!boot_cpu_has(X86_FEATURE_SHSTK) && + !boot_cpu_has(X86_FEATURE_IBT)) + xfeatures_mask_all &= ~BIT_ULL(i); + } else { + if ((xsave_cpuid_features[i] == -1) || + !boot_cpu_has(xsave_cpuid_features[i])) + xfeatures_mask_all &= ~BIT_ULL(i); + } } xfeatures_mask_all &= fpu__get_supported_xfeatures_mask(); diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h index 2859ee4f39a8..f6f3a0e6c664 100644 --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -912,4 +912,24 @@ #define MSR_VM_IGNNE 0xc0010115 #define MSR_VM_HSAVE_PA 0xc0010117 +/* Control-flow Enforcement Technology MSRs */ +#define MSR_IA32_U_CET 0x6a0 /* user mode cet setting */ +#define MSR_IA32_S_CET 0x6a2 /* kernel mode cet setting */ +#define MSR_IA32_PL0_SSP 0x6a4 /* kernel shstk pointer */ +#define MSR_IA32_PL1_SSP 0x6a5 /* ring-1 shstk pointer */ +#define MSR_IA32_PL2_SSP 0x6a6 /* ring-2 shstk pointer */ +#define MSR_IA32_PL3_SSP 0x6a7 /* user shstk pointer */ +#define MSR_IA32_INT_SSP_TAB 0x6a8 /* exception shstk table */ + +/* MSR_IA32_U_CET and MSR_IA32_S_CET bits */ +#define CET_SHSTK_EN BIT_ULL(0) +#define CET_WRSS_EN BIT_ULL(1) +#define CET_ENDBR_EN BIT_ULL(2) +#define CET_LEG_IW_EN BIT_ULL(3) +#define CET_NO_TRACK_EN BIT_ULL(4) +#define CET_SUPPRESS_DISABLE BIT_ULL(5) +#define CET_RESERVED (BIT_ULL(6) | BIT_ULL(7) | BIT_ULL(8) | BIT_ULL(9)) +#define CET_SUPPRESS BIT_ULL(10) +#define CET_WAIT_ENDBR BIT_ULL(11) + #endif /* _ASM_X86_MSR_INDEX_H */ From patchwork Fri Oct 9 18:32:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827041 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7DF9E139F for ; Fri, 9 Oct 2020 18:33:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3DF86222E8 for ; Fri, 9 Oct 2020 18:33:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3DF86222E8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 227D16B006C; Fri, 9 Oct 2020 14:33:16 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 022016B0070; Fri, 9 Oct 2020 14:33:15 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D122B8E0001; Fri, 9 Oct 2020 14:33:15 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0048.hostedemail.com [216.40.44.48]) by kanga.kvack.org (Postfix) with ESMTP id 9F9126B006C for ; Fri, 9 Oct 2020 14:33:15 -0400 (EDT) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 3F0C5180AD811 for ; Fri, 9 Oct 2020 18:33:15 +0000 (UTC) X-FDA: 77353234350.28.ring30_1c0f96f271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin28.hostedemail.com (Postfix) with ESMTP id 1E1BD6C05 for ; Fri, 9 Oct 2020 18:33:15 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30003:30051:30054:30056:30064:30070:30079:30083,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-62.18.0.100 64.95.201.95;04yga69nf961ms9rku8aoud9dsq18oczp38dtjschrno16w4dcuotb4oerj8mf5.qap8bc1o54hn8e5pp73xj9dexs5967dntgsth3hpzu9eykink5a5acqzq7kwwtz.1-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: ring30_1c0f96f271e2 X-Filterd-Recvd-Size: 7963 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf33.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:14 +0000 (UTC) IronPort-SDR: N5W/U8ee1qvhy87VrrIaa2VgDJp9mf10uCJ5IJlMuXy8sQVknYdSG+k+A3SRxeK8NVg2qKQSfM gzH4PIRnIxzA== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390203" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390203" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:12 -0700 IronPort-SDR: vkpIhFwYaXZu40VveZiiSORQdIKVqGnxs1BvvEwkplNHapf9knDSM56bcasiVEBhrXz3csfcVU 8wv6+ZtUJyHQ== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031116" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:11 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 04/26] x86/cet: Add control-protection fault handler Date: Fri, 9 Oct 2020 11:32:08 -0700 Message-Id: <20201009183230.26717-5-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: A control-protection fault is triggered when a control-flow transfer attempt violates Shadow Stack or Indirect Branch Tracking constraints. For example, the return address for a RET instruction differs from the copy on the Shadow Stack; or an indirect JMP instruction, without the NOTRACK prefix, arrives at a non-ENDBR opcode. The control-protection fault handler works in a similar way as the general protection fault handler. It provides the si_code SEGV_CPERR to the signal handler. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook v13: - Change X86_INTEL_* to X86_*. v10: - Change CONFIG_X86_64 to CONFIG_X86_INTEL_CET. v9: - Add Shadow Stack pointer to the fault printout. --- arch/x86/include/asm/idtentry.h | 4 ++ arch/x86/kernel/idt.c | 4 ++ arch/x86/kernel/signal_compat.c | 2 +- arch/x86/kernel/traps.c | 59 ++++++++++++++++++++++++++++++ include/uapi/asm-generic/siginfo.h | 3 +- 5 files changed, 70 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h index a0638640f1ed..3130a6ec0a2a 100644 --- a/arch/x86/include/asm/idtentry.h +++ b/arch/x86/include/asm/idtentry.h @@ -532,6 +532,10 @@ DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_SS, exc_stack_segment); DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_GP, exc_general_protection); DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_AC, exc_alignment_check); +#ifdef CONFIG_X86_CET +DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_CP, exc_control_protection); +#endif + /* Raw exception entries which need extra work */ DECLARE_IDTENTRY_RAW(X86_TRAP_UD, exc_invalid_op); DECLARE_IDTENTRY_RAW(X86_TRAP_BP, exc_int3); diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c index 7ecf9babf0cb..34f2a7383d5d 100644 --- a/arch/x86/kernel/idt.c +++ b/arch/x86/kernel/idt.c @@ -112,6 +112,10 @@ static const __initconst struct idt_data def_idts[] = { #elif defined(CONFIG_X86_32) SYSG(IA32_SYSCALL_VECTOR, entry_INT80_32), #endif + +#ifdef CONFIG_X86_CET + INTG(X86_TRAP_CP, asm_exc_control_protection), +#endif }; /* diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c index 9ccbf0576cd0..c572a3de1037 100644 --- a/arch/x86/kernel/signal_compat.c +++ b/arch/x86/kernel/signal_compat.c @@ -27,7 +27,7 @@ static inline void signal_compat_build_tests(void) */ BUILD_BUG_ON(NSIGILL != 11); BUILD_BUG_ON(NSIGFPE != 15); - BUILD_BUG_ON(NSIGSEGV != 7); + BUILD_BUG_ON(NSIGSEGV != 8); BUILD_BUG_ON(NSIGBUS != 5); BUILD_BUG_ON(NSIGTRAP != 5); BUILD_BUG_ON(NSIGCHLD != 6); diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 81a2fb711091..97a049258e33 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -597,6 +597,65 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) cond_local_irq_disable(regs); } +#ifdef CONFIG_X86_CET +static const char * const control_protection_err[] = { + "unknown", + "near-ret", + "far-ret/iret", + "endbranch", + "rstorssp", + "setssbsy", +}; + +/* + * When a control protection exception occurs, send a signal + * to the responsible application. Currently, control + * protection is only enabled for the user mode. This + * exception should not come from the kernel mode. + */ +DEFINE_IDTENTRY_ERRORCODE(exc_control_protection) +{ + struct task_struct *tsk; + + if (notify_die(DIE_TRAP, "control protection fault", regs, + error_code, X86_TRAP_CP, SIGSEGV) == NOTIFY_STOP) + return; + cond_local_irq_enable(regs); + + if (!user_mode(regs)) + die("kernel control protection fault", regs, error_code); + + if (!static_cpu_has(X86_FEATURE_SHSTK) && + !static_cpu_has(X86_FEATURE_IBT)) + WARN_ONCE(1, "CET is disabled but got control protection fault\n"); + + tsk = current; + tsk->thread.error_code = error_code; + tsk->thread.trap_nr = X86_TRAP_CP; + + if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) && + printk_ratelimit()) { + unsigned int max_err; + unsigned long ssp; + + max_err = ARRAY_SIZE(control_protection_err) - 1; + if ((error_code < 0) || (error_code > max_err)) + error_code = 0; + rdmsrl(MSR_IA32_PL3_SSP, ssp); + pr_info("%s[%d] control protection ip:%lx sp:%lx ssp:%lx error:%lx(%s)", + tsk->comm, task_pid_nr(tsk), + regs->ip, regs->sp, ssp, error_code, + control_protection_err[error_code]); + print_vma_addr(KERN_CONT " in ", regs->ip); + pr_cont("\n"); + } + + force_sig_fault(SIGSEGV, SEGV_CPERR, + (void __user *)uprobe_get_trap_addr(regs)); + cond_local_irq_disable(regs); +} +#endif + static bool do_int3(struct pt_regs *regs) { int res; diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h index cb3d6c267181..91e10cbe3bb0 100644 --- a/include/uapi/asm-generic/siginfo.h +++ b/include/uapi/asm-generic/siginfo.h @@ -229,7 +229,8 @@ typedef struct siginfo { #define SEGV_ACCADI 5 /* ADI not enabled for mapped object */ #define SEGV_ADIDERR 6 /* Disrupting MCD error */ #define SEGV_ADIPERR 7 /* Precise MCD exception */ -#define NSIGSEGV 7 +#define SEGV_CPERR 8 /* Control protection fault */ +#define NSIGSEGV 8 /* * SIGBUS si_codes From patchwork Fri Oct 9 18:32:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827047 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AD18F14D5 for ; Fri, 9 Oct 2020 18:33:26 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7233F22314 for ; Fri, 9 Oct 2020 18:33:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7233F22314 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 6360D6B0071; Fri, 9 Oct 2020 14:33:17 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 6115A6B0072; Fri, 9 Oct 2020 14:33:17 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3F0DF6B0074; Fri, 9 Oct 2020 14:33:17 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0040.hostedemail.com [216.40.44.40]) by kanga.kvack.org (Postfix) with ESMTP id EC8FA6B0073 for ; Fri, 9 Oct 2020 14:33:16 -0400 (EDT) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 83CA51EF2 for ; Fri, 9 Oct 2020 18:33:16 +0000 (UTC) X-FDA: 77353234392.03.seat41_020d8bb271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin03.hostedemail.com (Postfix) with ESMTP id 6145128A4E8 for ; Fri, 9 Oct 2020 18:33:16 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30003:30051:30054:30055:30056:30064:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100;04y89t4a88dnf4d5zwwjzwp85jm51opgd1x5kwf4hu6diuraww6t7orgn3oep93.5fdicxezkatfqq5btdh5q8hqprbehxjy7hdetn5sxw6cnct1benkdzugcpgt4uc.a-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:25,LUA_SUMMARY:none X-HE-Tag: seat41_020d8bb271e2 X-Filterd-Recvd-Size: 5348 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf47.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:15 +0000 (UTC) IronPort-SDR: Osa9kPJct/4HdT5H8c9yE9RDskGe0VhMXsSvlR7GiGOL+jdySWJlORNJOen7yOTq1RwiDD7jXB uvVksf3DhvQw== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390205" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390205" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:13 -0700 IronPort-SDR: AnsvOg3ftbUY5GVbbBfdSqHuV+7OId3FEb3JM2h+sl8ejNnxytJ7yCESZofgNo3agZ9uZrFaCG rXUU1fYiQsBw== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031121" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:12 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 05/26] x86/cet/shstk: Add Kconfig option for user-mode Shadow Stack Date: Fri, 9 Oct 2020 11:32:09 -0700 Message-Id: <20201009183230.26717-6-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Shadow Stack provides protection against function return address corruption. It is active when the processor supports it, the kernel has CONFIG_X86_SHADOW_STACK_USER, and the application is built for the feature. This is only implemented for the 64-bit kernel. When it is enabled, legacy non-shadow stack applications continue to work, but without protection. Signed-off-by: Yu-cheng Yu v13: - Update help text, and change default to N. - Change X86_INTEL_* to X86_*. v10: - Change SHSTK to shadow stack in the help text. - Change build-time check to config-time check. - Change ARCH_HAS_SHSTK to ARCH_HAS_SHADOW_STACK. --- arch/x86/Kconfig | 33 +++++++++++++++++++++++++++ scripts/as-x86_64-has-shadow-stack.sh | 4 ++++ 2 files changed, 37 insertions(+) create mode 100755 scripts/as-x86_64-has-shadow-stack.sh diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 7101ac64bb20..415fcc869afc 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1927,6 +1927,39 @@ config X86_INTEL_TSX_MODE_AUTO side channel attacks- equals the tsx=auto command line parameter. endchoice +config AS_HAS_SHADOW_STACK + def_bool $(success,$(srctree)/scripts/as-x86_64-has-shadow-stack.sh $(CC)) + help + Test the assembler for shadow stack instructions. + +config X86_CET + def_bool n + +config ARCH_HAS_SHADOW_STACK + def_bool n + +config X86_SHADOW_STACK_USER + prompt "Intel Shadow Stacks for user-mode" + def_bool n + depends on CPU_SUP_INTEL && X86_64 + depends on AS_HAS_SHADOW_STACK + select ARCH_USES_HIGH_VMA_FLAGS + select X86_CET + select ARCH_HAS_SHADOW_STACK + help + Shadow Stacks provides protection against program stack + corruption. It's a hardware feature. This only matters + if you have the right hardware. It's a security hardening + feature and apps must be enabled to use it. You get no + protection "for free" on old userspace. The hardware can + support user and kernel, but this option is for user space + only. + Support for this feature is only known to be present on + processors released in 2020 or later. CET features are also + known to increase kernel text size by 3.7 KB. + + If unsure, say N. + config EFI bool "EFI runtime service support" depends on ACPI diff --git a/scripts/as-x86_64-has-shadow-stack.sh b/scripts/as-x86_64-has-shadow-stack.sh new file mode 100755 index 000000000000..fac1d363a1b8 --- /dev/null +++ b/scripts/as-x86_64-has-shadow-stack.sh @@ -0,0 +1,4 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 + +echo "wrussq %rax, (%rbx)" | $* -x assembler -c - From patchwork Fri Oct 9 18:32:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827045 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AB8F5139F for ; Fri, 9 Oct 2020 18:33:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 704B822284 for ; Fri, 9 Oct 2020 18:33:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 704B822284 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 1F7386B0070; Fri, 9 Oct 2020 14:33:17 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 165C16B0071; Fri, 9 Oct 2020 14:33:17 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F1AD86B0074; Fri, 9 Oct 2020 14:33:16 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0002.hostedemail.com [216.40.44.2]) by kanga.kvack.org (Postfix) with ESMTP id A386B6B0070 for ; Fri, 9 Oct 2020 14:33:16 -0400 (EDT) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 483CF180AD815 for ; Fri, 9 Oct 2020 18:33:16 +0000 (UTC) X-FDA: 77353234392.30.cart37_210c804271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin30.hostedemail.com (Postfix) with ESMTP id 20A46180B3C8B for ; Fri, 9 Oct 2020 18:33:16 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30054:30056:30064:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-62.18.0.100 64.95.201.95;04yrf5e5yk4hwy4zhpcc9nirpxjcayp8okb56465rgu9hjo4urzzg4384a7rkrf.nnkki9sohky1cxmwagxxk6zkk5te39mu6c4tt3k4bnabkaddjiz87o15ej7kc94.c-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: cart37_210c804271e2 X-Filterd-Recvd-Size: 9640 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf33.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:15 +0000 (UTC) IronPort-SDR: ODVKPQJsuQL06yJHX5s1G+ACorRofGaRlwQCaDiz5Z1DFJzV9/JMGgNq5ePAc1042WuNSkst95 x0+GMIckMW9Q== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390209" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390209" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:13 -0700 IronPort-SDR: xRi/Ix4/li5FeqCIhxbfZVk9oqsLgG8ak+XgiiuRR//88BtnPwC1MeqjLh2e2g048gOyET70Mc MwfO1+IWcmmA== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031129" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:13 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu , Dave Hansen Subject: [PATCH v14 06/26] x86/mm: Change _PAGE_DIRTY to _PAGE_DIRTY_HW Date: Fri, 9 Oct 2020 11:32:10 -0700 Message-Id: <20201009183230.26717-7-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Before introducing _PAGE_COW for non-hardware memory management purposes in the next patch, rename _PAGE_DIRTY to _PAGE_DIRTY_HW and _PAGE_BIT_DIRTY to _PAGE_BIT_DIRTY_HW to make meanings more clear. There are no functional changes from this patch. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook Reviewed-by: Dave Hansen v9: - At some places _PAGE_DIRTY were not changed to _PAGE_DIRTY_HW, because they will be changed again in the next patch to _PAGE_DIRTY_BITS. However, this causes compile issues if the next patch is not yet applied. Fix it by changing all _PAGE_DIRTY to _PAGE_DRITY_HW. --- arch/x86/include/asm/pgtable.h | 18 +++++++++--------- arch/x86/include/asm/pgtable_types.h | 11 +++++------ arch/x86/kernel/relocate_kernel_64.S | 2 +- arch/x86/kvm/vmx/vmx.c | 2 +- 4 files changed, 16 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index b836138ce852..86b7acd221c1 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -124,7 +124,7 @@ extern pmdval_t early_pmd_flags; */ static inline int pte_dirty(pte_t pte) { - return pte_flags(pte) & _PAGE_DIRTY; + return pte_flags(pte) & _PAGE_DIRTY_HW; } @@ -163,7 +163,7 @@ static inline int pte_young(pte_t pte) static inline int pmd_dirty(pmd_t pmd) { - return pmd_flags(pmd) & _PAGE_DIRTY; + return pmd_flags(pmd) & _PAGE_DIRTY_HW; } static inline int pmd_young(pmd_t pmd) @@ -173,7 +173,7 @@ static inline int pmd_young(pmd_t pmd) static inline int pud_dirty(pud_t pud) { - return pud_flags(pud) & _PAGE_DIRTY; + return pud_flags(pud) & _PAGE_DIRTY_HW; } static inline int pud_young(pud_t pud) @@ -334,7 +334,7 @@ static inline pte_t pte_clear_uffd_wp(pte_t pte) static inline pte_t pte_mkclean(pte_t pte) { - return pte_clear_flags(pte, _PAGE_DIRTY); + return pte_clear_flags(pte, _PAGE_DIRTY_HW); } static inline pte_t pte_mkold(pte_t pte) @@ -354,7 +354,7 @@ static inline pte_t pte_mkexec(pte_t pte) static inline pte_t pte_mkdirty(pte_t pte) { - return pte_set_flags(pte, _PAGE_DIRTY | _PAGE_SOFT_DIRTY); + return pte_set_flags(pte, _PAGE_DIRTY_HW | _PAGE_SOFT_DIRTY); } static inline pte_t pte_mkyoung(pte_t pte) @@ -435,7 +435,7 @@ static inline pmd_t pmd_mkold(pmd_t pmd) static inline pmd_t pmd_mkclean(pmd_t pmd) { - return pmd_clear_flags(pmd, _PAGE_DIRTY); + return pmd_clear_flags(pmd, _PAGE_DIRTY_HW); } static inline pmd_t pmd_wrprotect(pmd_t pmd) @@ -445,7 +445,7 @@ static inline pmd_t pmd_wrprotect(pmd_t pmd) static inline pmd_t pmd_mkdirty(pmd_t pmd) { - return pmd_set_flags(pmd, _PAGE_DIRTY | _PAGE_SOFT_DIRTY); + return pmd_set_flags(pmd, _PAGE_DIRTY_HW | _PAGE_SOFT_DIRTY); } static inline pmd_t pmd_mkdevmap(pmd_t pmd) @@ -489,7 +489,7 @@ static inline pud_t pud_mkold(pud_t pud) static inline pud_t pud_mkclean(pud_t pud) { - return pud_clear_flags(pud, _PAGE_DIRTY); + return pud_clear_flags(pud, _PAGE_DIRTY_HW); } static inline pud_t pud_wrprotect(pud_t pud) @@ -499,7 +499,7 @@ static inline pud_t pud_wrprotect(pud_t pud) static inline pud_t pud_mkdirty(pud_t pud) { - return pud_set_flags(pud, _PAGE_DIRTY | _PAGE_SOFT_DIRTY); + return pud_set_flags(pud, _PAGE_DIRTY_HW | _PAGE_SOFT_DIRTY); } static inline pud_t pud_mkdevmap(pud_t pud) diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index 816b31c68550..192e1326b3db 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -15,7 +15,7 @@ #define _PAGE_BIT_PWT 3 /* page write through */ #define _PAGE_BIT_PCD 4 /* page cache disabled */ #define _PAGE_BIT_ACCESSED 5 /* was accessed (raised by CPU) */ -#define _PAGE_BIT_DIRTY 6 /* was written to (raised by CPU) */ +#define _PAGE_BIT_DIRTY_HW 6 /* was written to (raised by CPU) */ #define _PAGE_BIT_PSE 7 /* 4 MB (or 2MB) page */ #define _PAGE_BIT_PAT 7 /* on 4KB pages */ #define _PAGE_BIT_GLOBAL 8 /* Global TLB entry PPro+ */ @@ -46,7 +46,7 @@ #define _PAGE_PWT (_AT(pteval_t, 1) << _PAGE_BIT_PWT) #define _PAGE_PCD (_AT(pteval_t, 1) << _PAGE_BIT_PCD) #define _PAGE_ACCESSED (_AT(pteval_t, 1) << _PAGE_BIT_ACCESSED) -#define _PAGE_DIRTY (_AT(pteval_t, 1) << _PAGE_BIT_DIRTY) +#define _PAGE_DIRTY_HW (_AT(pteval_t, 1) << _PAGE_BIT_DIRTY_HW) #define _PAGE_PSE (_AT(pteval_t, 1) << _PAGE_BIT_PSE) #define _PAGE_GLOBAL (_AT(pteval_t, 1) << _PAGE_BIT_GLOBAL) #define _PAGE_SOFTW1 (_AT(pteval_t, 1) << _PAGE_BIT_SOFTW1) @@ -74,7 +74,7 @@ _PAGE_PKEY_BIT3) #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE) -#define _PAGE_KNL_ERRATUM_MASK (_PAGE_DIRTY | _PAGE_ACCESSED) +#define _PAGE_KNL_ERRATUM_MASK (_PAGE_DIRTY_HW | _PAGE_ACCESSED) #else #define _PAGE_KNL_ERRATUM_MASK 0 #endif @@ -126,7 +126,7 @@ * pte_modify() does modify it. */ #define _PAGE_CHG_MASK (PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT | \ - _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY | \ + _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY_HW | \ _PAGE_SOFT_DIRTY | _PAGE_DEVMAP | _PAGE_ENC | \ _PAGE_UFFD_WP) #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE) @@ -163,7 +163,7 @@ enum page_cache_mode { #define __RW _PAGE_RW #define _USR _PAGE_USER #define ___A _PAGE_ACCESSED -#define ___D _PAGE_DIRTY +#define ___D _PAGE_DIRTY_HW #define ___G _PAGE_GLOBAL #define __NX _PAGE_NX @@ -205,7 +205,6 @@ enum page_cache_mode { #define __PAGE_KERNEL_IO __PAGE_KERNEL #define __PAGE_KERNEL_IO_NOCACHE __PAGE_KERNEL_NOCACHE - #ifndef __ASSEMBLY__ #define __PAGE_KERNEL_ENC (__PAGE_KERNEL | _ENC) diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S index a4d9a261425b..e3bb4ff95523 100644 --- a/arch/x86/kernel/relocate_kernel_64.S +++ b/arch/x86/kernel/relocate_kernel_64.S @@ -17,7 +17,7 @@ */ #define PTR(x) (x << 3) -#define PAGE_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY) +#define PAGE_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY_HW) /* * control_page + KEXEC_CONTROL_CODE_MAX_SIZE diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 96979c09ebd1..57c5ad890597 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3620,7 +3620,7 @@ static int init_rmode_identity_map(struct kvm *kvm) /* Set up identity-mapping pagetable for EPT in real mode */ for (i = 0; i < PT32_ENT_PER_PAGE; i++) { tmp = (i << 22) + (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | - _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE); + _PAGE_ACCESSED | _PAGE_DIRTY_HW | _PAGE_PSE); r = kvm_write_guest_page(kvm, identity_map_pfn, &tmp, i * sizeof(tmp), sizeof(tmp)); if (r < 0) From patchwork Fri Oct 9 18:32:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827049 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B946A139F for ; Fri, 9 Oct 2020 18:33:28 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7E5D922284 for ; Fri, 9 Oct 2020 18:33:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7E5D922284 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 462AB6B0072; Fri, 9 Oct 2020 14:33:18 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 3C8516B0073; Fri, 9 Oct 2020 14:33:18 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 158FF6B0075; Fri, 9 Oct 2020 14:33:18 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0135.hostedemail.com [216.40.44.135]) by kanga.kvack.org (Postfix) with ESMTP id BD7C86B0072 for ; Fri, 9 Oct 2020 14:33:17 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 53FC5181AE863 for ; Fri, 9 Oct 2020 18:33:17 +0000 (UTC) X-FDA: 77353234434.29.curve93_3210add271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin29.hostedemail.com (Postfix) with ESMTP id 0EE4618086CC2 for ; Fri, 9 Oct 2020 18:33:17 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30054:30056:30064,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-62.18.0.100 64.95.201.95;04yfk41489jnx98jq361ub686u3riocptryet97u7gcji7w3irp1jbxgib8hgoe.acws414h345ky48qq43nbmdi38ukyh99ippsneerakguxadwj7x8a4g45tt6p7b.o-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:25,LUA_SUMMARY:none X-HE-Tag: curve93_3210add271e2 X-Filterd-Recvd-Size: 5141 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf29.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:16 +0000 (UTC) IronPort-SDR: +ibk6qB4Gr5IapbopeE32En25ICRT00zFqAk2isA3zfqrqvy6Q+WYIyWPeiWy+h0g2y7HPYBmW 7qfcCuVgkRjA== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390213" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390213" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:14 -0700 IronPort-SDR: p/c/19d7BMMhCeposyKVFjSxIPV01lzH35q4x4UDb+xmmhFzOXxKKFTRIn3aOviPM0KZQ8dqEZ /XUyinUESh2Q== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031134" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:13 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu , Christoph Hellwig Subject: [PATCH v14 07/26] x86/mm: Remove _PAGE_DIRTY_HW from kernel RO pages Date: Fri, 9 Oct 2020 11:32:11 -0700 Message-Id: <20201009183230.26717-8-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Kernel read-only PTEs are setup as _PAGE_DIRTY_HW. Since these become shadow stack PTEs, remove the dirty bit. Signed-off-by: Yu-cheng Yu Cc: "H. Peter Anvin" Cc: Kees Cook Cc: Thomas Gleixner Cc: Dave Hansen Cc: Christoph Hellwig Cc: Andy Lutomirski Cc: Ingo Molnar Cc: Borislav Petkov Cc: Peter Zijlstra --- arch/x86/include/asm/pgtable_types.h | 6 +++--- arch/x86/mm/pat/set_memory.c | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index 192e1326b3db..5f31f1c407b9 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -193,10 +193,10 @@ enum page_cache_mode { #define _KERNPG_TABLE (__PP|__RW| 0|___A| 0|___D| 0| 0| _ENC) #define _PAGE_TABLE_NOENC (__PP|__RW|_USR|___A| 0|___D| 0| 0) #define _PAGE_TABLE (__PP|__RW|_USR|___A| 0|___D| 0| 0| _ENC) -#define __PAGE_KERNEL_RO (__PP| 0| 0|___A|__NX|___D| 0|___G) -#define __PAGE_KERNEL_ROX (__PP| 0| 0|___A| 0|___D| 0|___G) +#define __PAGE_KERNEL_RO (__PP| 0| 0|___A|__NX| 0| 0|___G) +#define __PAGE_KERNEL_ROX (__PP| 0| 0|___A| 0| 0| 0|___G) #define __PAGE_KERNEL_NOCACHE (__PP|__RW| 0|___A|__NX|___D| 0|___G| __NC) -#define __PAGE_KERNEL_VVAR (__PP| 0|_USR|___A|__NX|___D| 0|___G) +#define __PAGE_KERNEL_VVAR (__PP| 0|_USR|___A|__NX| 0| 0|___G) #define __PAGE_KERNEL_LARGE (__PP|__RW| 0|___A|__NX|___D|_PSE|___G) #define __PAGE_KERNEL_LARGE_EXEC (__PP|__RW| 0|___A| 0|___D|_PSE|___G) #define __PAGE_KERNEL_WP (__PP|__RW| 0|___A|__NX|___D| 0|___G| __WP) diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index d1b2a889f035..962434fdf0d9 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -1932,7 +1932,7 @@ int set_memory_nx(unsigned long addr, int numpages) int set_memory_ro(unsigned long addr, int numpages) { - return change_page_attr_clear(&addr, numpages, __pgprot(_PAGE_RW), 0); + return change_page_attr_clear(&addr, numpages, __pgprot(_PAGE_RW | _PAGE_DIRTY_HW), 0); } int set_memory_rw(unsigned long addr, int numpages) From patchwork Fri Oct 9 18:32:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827051 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2D7F0139F for ; Fri, 9 Oct 2020 18:33:31 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D452A22284 for ; Fri, 9 Oct 2020 18:33:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D452A22284 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 7DE696B0073; Fri, 9 Oct 2020 14:33:18 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 707F56B0075; Fri, 9 Oct 2020 14:33:18 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 266CD8E0001; Fri, 9 Oct 2020 14:33:18 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0218.hostedemail.com [216.40.44.218]) by kanga.kvack.org (Postfix) with ESMTP id CBD046B0073 for ; Fri, 9 Oct 2020 14:33:17 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 585108249980 for ; Fri, 9 Oct 2020 18:33:17 +0000 (UTC) X-FDA: 77353234434.01.fact13_260211a271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin01.hostedemail.com (Postfix) with ESMTP id 1E82310046495 for ; Fri, 9 Oct 2020 18:33:17 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:4423:30045:30051:30054:30056:30064:30070:30079:30091,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100;04yrm3shaaxm9gpdf76aegz6rqg1gocgprfso47on515nc9hi5qpejgzj7xgjg8.huqyd474d4wb6qiosjixutd1quru58yrpwimmj6bnysecdo1hi1rqa8rjbzygfc.o-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: fact13_260211a271e2 X-Filterd-Recvd-Size: 15401 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf33.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:16 +0000 (UTC) IronPort-SDR: LTxkbisyIUt2MhuyaCmzjH8mCqQ5rGvD//bwzYGfPKahtaqNkUpjM1+sKO0CR0Kj3TY6faJSZs BQwMN8VvRjTA== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390215" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390215" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:15 -0700 IronPort-SDR: WVoRiaZRSHeFbzkI9KihV+1PEhdnCkuHMPz3VYBAUN4m94fPVupXVF0Wv8QfcEUAz377iwAjja jEkvaTw2AdnQ== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031143" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:14 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 08/26] x86/mm: Introduce _PAGE_COW Date: Fri, 9 Oct 2020 11:32:12 -0700 Message-Id: <20201009183230.26717-9-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: There is essentially no room left in the x86 hardware PTEs on some OSes (not Linux). That left the hardware architects looking for a way to represent a new memory type (shadow stack) within the existing bits. They chose to repurpose a lightly-used state: Write=0,Dirty=1. The reason it's lightly used is that Dirty=1 is normally set by hardware and cannot normally be set by hardware on a Write=0 PTE. Software must normally be involved to create one of these PTEs, so software can simply opt to not create them. But that leaves us with a Linux problem: we need to ensure we never create Write=0,Dirty=1 PTEs. In places where we do create them, we need to find an alternative way to represent them _without_ using the same hardware bit combination. Thus, enter _PAGE_COW. This results in the following: (a) A modified, copy-on-write (COW) page: (R/O + _PAGE_COW) (b) A R/O page that has been COW'ed: (R/O + _PAGE_COW) The user page is in a R/O VMA, and get_user_pages() needs a writable copy. The page fault handler creates a copy of the page and sets the new copy's PTE as R/O and _PAGE_COW. (c) A shadow stack PTE: (R/O + _PAGE_DIRTY_HW) (d) A shared shadow stack PTE: (R/O + _PAGE_COW) When a shadow stack page is being shared among processes (this happens at fork()), its PTE is cleared of _PAGE_DIRTY_HW, so the next shadow stack access causes a fault, and the page is duplicated and _PAGE_DIRTY_HW is set again. This is the COW equivalent for shadow stack pages, even though it's copy-on-access rather than copy-on-write. (e) A page where the processor observed a Write=1 PTE, started a write, set Dirty=1, but then observed a Write=0 PTE. That's possible today, but will not happen on processors that support shadow stack. Use _PAGE_COW in pte_wrprotect() and _PAGE_DIRTY_HW in pte_mkwrite(). Apply the same changes to pmd and pud. When this patch is applied, there are six free bits left in the 64-bit PTE. There are no more free bits in the 32-bit PTE (except for PAE) and shadow stack is not implemented for the 32-bit kernel. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook v10: - Change _PAGE_BIT_DIRTY_SW to _PAGE_BIT_COW, as it is used for copy-on- write PTEs. - Update pte_write() and treat shadow stack as writable. - Change *_mkdirty_shstk() to *_mkwrite_shstk() as these make shadow stack pages writable. - Use bit test & shift to move _PAGE_BIT_DIRTY_HW to _PAGE_BIT_COW. - Change static_cpu_has() to cpu_feature_enabled(). - Revise commit log. v9: - Remove pte_move_flags() etc. and put the logic directly in pte_wrprotect()/pte_mkwrite() etc. - Change compile-time conditionals to run-time checks. - Split out pte_modify()/pmd_modify() to a new patch. - Update comments. --- arch/x86/include/asm/pgtable.h | 120 ++++++++++++++++++++++++--- arch/x86/include/asm/pgtable_types.h | 41 ++++++++- 2 files changed, 150 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 86b7acd221c1..ac4ed814be96 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -122,9 +122,9 @@ extern pmdval_t early_pmd_flags; * The following only work if pte_present() is true. * Undefined behaviour if not.. */ -static inline int pte_dirty(pte_t pte) +static inline bool pte_dirty(pte_t pte) { - return pte_flags(pte) & _PAGE_DIRTY_HW; + return pte_flags(pte) & _PAGE_DIRTY_BITS; } @@ -161,9 +161,9 @@ static inline int pte_young(pte_t pte) return pte_flags(pte) & _PAGE_ACCESSED; } -static inline int pmd_dirty(pmd_t pmd) +static inline bool pmd_dirty(pmd_t pmd) { - return pmd_flags(pmd) & _PAGE_DIRTY_HW; + return pmd_flags(pmd) & _PAGE_DIRTY_BITS; } static inline int pmd_young(pmd_t pmd) @@ -171,9 +171,9 @@ static inline int pmd_young(pmd_t pmd) return pmd_flags(pmd) & _PAGE_ACCESSED; } -static inline int pud_dirty(pud_t pud) +static inline bool pud_dirty(pud_t pud) { - return pud_flags(pud) & _PAGE_DIRTY_HW; + return pud_flags(pud) & _PAGE_DIRTY_BITS; } static inline int pud_young(pud_t pud) @@ -183,6 +183,12 @@ static inline int pud_young(pud_t pud) static inline int pte_write(pte_t pte) { + /* + * If _PAGE_DIRTY_HW is set, the PTE must either have + * _PAGE_RW or be a shadow stack PTE, which is logically writable. + */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) + return pte_flags(pte) & (_PAGE_RW | _PAGE_DIRTY_HW); return pte_flags(pte) & _PAGE_RW; } @@ -334,7 +340,7 @@ static inline pte_t pte_clear_uffd_wp(pte_t pte) static inline pte_t pte_mkclean(pte_t pte) { - return pte_clear_flags(pte, _PAGE_DIRTY_HW); + return pte_clear_flags(pte, _PAGE_DIRTY_BITS); } static inline pte_t pte_mkold(pte_t pte) @@ -344,6 +350,17 @@ static inline pte_t pte_mkold(pte_t pte) static inline pte_t pte_wrprotect(pte_t pte) { + /* + * Blindly clearing _PAGE_RW might accidentally create + * a shadow stack PTE (RW=0,Dirty=1). Move the hardware + * dirty value to the software bit. + */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + pte.pte |= (pte.pte & _PAGE_DIRTY_HW) >> + _PAGE_BIT_DIRTY_HW << _PAGE_BIT_COW; + pte = pte_clear_flags(pte, _PAGE_DIRTY_HW); + } + return pte_clear_flags(pte, _PAGE_RW); } @@ -354,6 +371,18 @@ static inline pte_t pte_mkexec(pte_t pte) static inline pte_t pte_mkdirty(pte_t pte) { + pteval_t dirty = _PAGE_DIRTY_HW; + + /* Avoid creating (HW)Dirty=1,Write=0 PTEs */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK) && !pte_write(pte)) + dirty = _PAGE_COW; + + return pte_set_flags(pte, dirty | _PAGE_SOFT_DIRTY); +} + +static inline pte_t pte_mkwrite_shstk(pte_t pte) +{ + pte = pte_clear_flags(pte, _PAGE_COW); return pte_set_flags(pte, _PAGE_DIRTY_HW | _PAGE_SOFT_DIRTY); } @@ -364,6 +393,13 @@ static inline pte_t pte_mkyoung(pte_t pte) static inline pte_t pte_mkwrite(pte_t pte) { + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + if (pte_flags(pte) & _PAGE_COW) { + pte = pte_clear_flags(pte, _PAGE_COW); + pte = pte_set_flags(pte, _PAGE_DIRTY_HW); + } + } + return pte_set_flags(pte, _PAGE_RW); } @@ -435,16 +471,41 @@ static inline pmd_t pmd_mkold(pmd_t pmd) static inline pmd_t pmd_mkclean(pmd_t pmd) { - return pmd_clear_flags(pmd, _PAGE_DIRTY_HW); + return pmd_clear_flags(pmd, _PAGE_DIRTY_BITS); } static inline pmd_t pmd_wrprotect(pmd_t pmd) { + /* + * Blindly clearing _PAGE_RW might accidentally create + * a shadow stack PMD (RW=0,Dirty=1). Move the hardware + * dirty value to the software bit. + */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + pmdval_t v = native_pmd_val(pmd); + + v |= (v & _PAGE_DIRTY_HW) >> _PAGE_BIT_DIRTY_HW << + _PAGE_BIT_COW; + pmd = pmd_clear_flags(__pmd(v), _PAGE_DIRTY_HW); + } + return pmd_clear_flags(pmd, _PAGE_RW); } static inline pmd_t pmd_mkdirty(pmd_t pmd) { + pmdval_t dirty = _PAGE_DIRTY_HW; + + /* Avoid creating (HW)Dirty=1,Write=0 PMDs */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK) && !(pmd_flags(pmd) & _PAGE_RW)) + dirty = _PAGE_COW; + + return pmd_set_flags(pmd, dirty | _PAGE_SOFT_DIRTY); +} + +static inline pmd_t pmd_mkwrite_shstk(pmd_t pmd) +{ + pmd = pmd_clear_flags(pmd, _PAGE_COW); return pmd_set_flags(pmd, _PAGE_DIRTY_HW | _PAGE_SOFT_DIRTY); } @@ -465,6 +526,13 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd) static inline pmd_t pmd_mkwrite(pmd_t pmd) { + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + if (pmd_flags(pmd) & _PAGE_COW) { + pmd = pmd_clear_flags(pmd, _PAGE_COW); + pmd = pmd_set_flags(pmd, _PAGE_DIRTY_HW); + } + } + return pmd_set_flags(pmd, _PAGE_RW); } @@ -489,17 +557,36 @@ static inline pud_t pud_mkold(pud_t pud) static inline pud_t pud_mkclean(pud_t pud) { - return pud_clear_flags(pud, _PAGE_DIRTY_HW); + return pud_clear_flags(pud, _PAGE_DIRTY_BITS); } static inline pud_t pud_wrprotect(pud_t pud) { + /* + * Blindly clearing _PAGE_RW might accidentally create + * a shadow stack PUD (RW=0,Dirty=1). Move the hardware + * dirty value to the software bit. + */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + pudval_t v = native_pud_val(pud); + + v |= (v & _PAGE_DIRTY_HW) >> _PAGE_BIT_DIRTY_HW << + _PAGE_BIT_COW; + pud = pud_clear_flags(__pud(v), _PAGE_DIRTY_HW); + } + return pud_clear_flags(pud, _PAGE_RW); } static inline pud_t pud_mkdirty(pud_t pud) { - return pud_set_flags(pud, _PAGE_DIRTY_HW | _PAGE_SOFT_DIRTY); + pudval_t dirty = _PAGE_DIRTY_HW; + + /* Avoid creating (HW)Dirty=1,Write=0 PUDs */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK) && !(pud_flags(pud) & _PAGE_RW)) + dirty = _PAGE_COW; + + return pud_set_flags(pud, dirty | _PAGE_SOFT_DIRTY); } static inline pud_t pud_mkdevmap(pud_t pud) @@ -519,6 +606,13 @@ static inline pud_t pud_mkyoung(pud_t pud) static inline pud_t pud_mkwrite(pud_t pud) { + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + if (pud_flags(pud) & _PAGE_COW) { + pud = pud_clear_flags(pud, _PAGE_COW); + pud = pud_set_flags(pud, _PAGE_DIRTY_HW); + } + } + return pud_set_flags(pud, _PAGE_RW); } @@ -1132,6 +1226,12 @@ extern int pmdp_clear_flush_young(struct vm_area_struct *vma, #define pmd_write pmd_write static inline int pmd_write(pmd_t pmd) { + /* + * If _PAGE_DIRTY_HW is set, then the PMD must either have + * _PAGE_RW or be a shadow stack PMD, which is logically writable. + */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) + return pmd_flags(pmd) & (_PAGE_RW | _PAGE_DIRTY_HW); return pmd_flags(pmd) & _PAGE_RW; } diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index 5f31f1c407b9..75362df8b226 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -23,7 +23,8 @@ #define _PAGE_BIT_SOFTW2 10 /* " */ #define _PAGE_BIT_SOFTW3 11 /* " */ #define _PAGE_BIT_PAT_LARGE 12 /* On 2MB or 1GB pages */ -#define _PAGE_BIT_SOFTW4 58 /* available for programmer */ +#define _PAGE_BIT_SOFTW4 57 /* available for programmer */ +#define _PAGE_BIT_SOFTW5 58 /* available for programmer */ #define _PAGE_BIT_PKEY_BIT0 59 /* Protection Keys, bit 1/4 */ #define _PAGE_BIT_PKEY_BIT1 60 /* Protection Keys, bit 2/4 */ #define _PAGE_BIT_PKEY_BIT2 61 /* Protection Keys, bit 3/4 */ @@ -36,6 +37,16 @@ #define _PAGE_BIT_SOFT_DIRTY _PAGE_BIT_SOFTW3 /* software dirty tracking */ #define _PAGE_BIT_DEVMAP _PAGE_BIT_SOFTW4 +/* + * This bit indicates a copy-on-write page, and is different from + * _PAGE_BIT_SOFT_DIRTY, which tracks which pages a task writes to. + */ +#ifdef CONFIG_X86_64 +#define _PAGE_BIT_COW _PAGE_BIT_SOFTW5 /* copy-on-write */ +#else +#define _PAGE_BIT_COW 0 +#endif + /* If _PAGE_BIT_PRESENT is clear, we use these: */ /* - if the user mapped it with PROT_NONE; pte_present gives true */ #define _PAGE_BIT_PROTNONE _PAGE_BIT_GLOBAL @@ -117,6 +128,34 @@ #define _PAGE_DEVMAP (_AT(pteval_t, 0)) #endif +/* + * _PAGE_COW is used to separate R/O and copy-on-write PTEs created by + * software from the shadow stack PTE setting required by the hardware: + * (a) A modified, copy-on-write (COW) page: (R/O + _PAGE_COW) + * (b) A R/O page that has been COW'ed: (R/O +_PAGE_COW) + * The user page is in a R/O VMA, and get_user_pages() needs a + * writable copy. The page fault handler creates a copy of the page + * and sets the new copy's PTE as R/O and _PAGE_COW. + * (c) A shadow stack PTE: (R/O + _PAGE_DIRTY_HW) + * (d) A shared (copy-on-access) shadow stack PTE: (R/O + _PAGE_COW) + * When a shadow stack page is being shared among processes (this + * happens at fork()), its PTE is cleared of _PAGE_DIRTY_HW, so the + * next shadow stack access causes a fault, and the page is duplicated + * and _PAGE_DIRTY_HW is set again. This is the COW equivalent for + * shadow stack pages, even though it's copy-on-access rather than + * copy-on-write. + * (e) A page where the processor observed a Write=1 PTE, started a write, + * set Dirty=1, but then observed a Write=0 PTE. That's possible + * today, but will not happen on processors that support shadow stack. + */ +#ifdef CONFIG_X86_SHADOW_STACK_USER +#define _PAGE_COW (_AT(pteval_t, 1) << _PAGE_BIT_COW) +#else +#define _PAGE_COW (_AT(pteval_t, 0)) +#endif + +#define _PAGE_DIRTY_BITS (_PAGE_DIRTY_HW | _PAGE_COW) + #define _PAGE_PROTNONE (_AT(pteval_t, 1) << _PAGE_BIT_PROTNONE) /* From patchwork Fri Oct 9 18:32:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827053 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7442F139F for ; Fri, 9 Oct 2020 18:33:33 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3E7A722284 for ; Fri, 9 Oct 2020 18:33:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3E7A722284 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E25DD6B0074; Fri, 9 Oct 2020 14:33:18 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id DB0186B0075; Fri, 9 Oct 2020 14:33:18 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C28E06B0078; Fri, 9 Oct 2020 14:33:18 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0050.hostedemail.com [216.40.44.50]) by kanga.kvack.org (Postfix) with ESMTP id 8E36D6B0074 for ; Fri, 9 Oct 2020 14:33:18 -0400 (EDT) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 1CF181EE6 for ; Fri, 9 Oct 2020 18:33:18 +0000 (UTC) X-FDA: 77353234476.08.pull11_0711abc271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin08.hostedemail.com (Postfix) with ESMTP id F20831819E769 for ; Fri, 9 Oct 2020 18:33:17 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30054:30055:30056:30064:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-62.18.0.100 64.95.201.95;04ygdeo16ojtqtcnepi85cmn7cjg7ycuwrcxztqjdyqqkpf5ffwshn75n6ixj18.7c987apmf94enkamuo3q98h53bz6xjoyx4cb1bzumdjc95mnnqpfth16eeyiqhf.r-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: pull11_0711abc271e2 X-Filterd-Recvd-Size: 4020 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf47.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:17 +0000 (UTC) IronPort-SDR: 5q/fsMrp3/NhWyYzynYLvR9+WBPsaqJD6DKYlMMNLXNfZcLk90CmKbchIgalliXDzpJop2zXVr VjVQ4/EcKjVg== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390220" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390220" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:16 -0700 IronPort-SDR: IARO35AcYk38TmDCxmOKPwD2ixcVHHNOUoGtIntfM4Mm/tL2anWftfI5OwbNplfJWs/3G7DZlk +IAIuQ5KOAWw== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031150" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:15 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu , David Airlie , Joonas Lahtinen , Jani Nikula , Daniel Vetter , Rodrigo Vivi , Zhenyu Wang , Zhi Wang Subject: [PATCH v14 09/26] drm/i915/gvt: Change _PAGE_DIRTY to _PAGE_DIRTY_BITS Date: Fri, 9 Oct 2020 11:32:13 -0700 Message-Id: <20201009183230.26717-10-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: After the introduction of _PAGE_COW, a modified page's PTE can have either _PAGE_DIRTY_HW or _PAGE_COW. Change _PAGE_DIRTY to _PAGE_DIRTY_BITS. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook Cc: David Airlie Cc: Joonas Lahtinen Cc: Jani Nikula Cc: Daniel Vetter Cc: Rodrigo Vivi Cc: Zhenyu Wang Cc: Zhi Wang --- drivers/gpu/drm/i915/gvt/gtt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c index a3a4305eda01..dd0ab28cfe7d 100644 --- a/drivers/gpu/drm/i915/gvt/gtt.c +++ b/drivers/gpu/drm/i915/gvt/gtt.c @@ -1207,7 +1207,7 @@ static int split_2MB_gtt_entry(struct intel_vgpu *vgpu, } /* Clear dirty field. */ - se->val64 &= ~_PAGE_DIRTY; + se->val64 &= ~_PAGE_DIRTY_BITS; ops->clear_pse(se); ops->clear_ips(se); From patchwork Fri Oct 9 18:32:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827055 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8A210139F for ; Fri, 9 Oct 2020 18:33:35 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5D37E22314 for ; Fri, 9 Oct 2020 18:33:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5D37E22314 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id CF4646B0075; Fri, 9 Oct 2020 14:33:19 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id C2B7D6B0078; Fri, 9 Oct 2020 14:33:19 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AEFE76B007B; Fri, 9 Oct 2020 14:33:19 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0006.hostedemail.com [216.40.44.6]) by kanga.kvack.org (Postfix) with ESMTP id 7F9456B0075 for ; Fri, 9 Oct 2020 14:33:19 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 16A73181AE863 for ; Fri, 9 Oct 2020 18:33:19 +0000 (UTC) X-FDA: 77353234518.16.drug48_0a0b7b9271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin16.hostedemail.com (Postfix) with ESMTP id E8F4410141366 for ; Fri, 9 Oct 2020 18:33:18 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30054:30056:30064:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100;04yf3uzkfdceqiewqzbjohi3715tmypwi5jfhgo7shmoyqz8u944yod5w3tn1qj.nacr54w98zj18cmwzycn75pgigt3sq5ujce4sk61hc411xxapbftiknxnzyp7sp.c-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:26,LUA_SUMMARY:none X-HE-Tag: drug48_0a0b7b9271e2 X-Filterd-Recvd-Size: 5065 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf47.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:18 +0000 (UTC) IronPort-SDR: 7UJ/B8uWPDJknF8TbUu87XMgAqIVO3vEwqpMR2PA2gJEzoGlQiax+Y64L9oYx2aVjF42+Do+bB ePbFEP8U+WmQ== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390223" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390223" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:17 -0700 IronPort-SDR: tRMScDw6017/sGeIsvjQRZxKrFLwxT2cE6sr+LpNl5j3pVHHVDKr9M7regoka0veSrVGLaTwF5 5+s0caKylyYQ== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031153" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:16 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 10/26] x86/mm: Update pte_modify for _PAGE_COW Date: Fri, 9 Oct 2020 11:32:14 -0700 Message-Id: <20201009183230.26717-11-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Pte_modify() changes a PTE to 'newprot'. It doesn't use the pte_*() helpers that a previous patch fixed up, so we need a new site. Introduce fixup_dirty_pte() to set the dirty bits based on _PAGE_RW, and apply the same changes to pmd_modify(). Signed-off-by: Yu-cheng Yu v14: - Update fixup_dirty_pte() and fixup_dirty_pmd() with cpu_feature_enabled(X86_FEATURE_SHSTK). v10: - Replace _PAGE_CHG_MASK approach with fixup functions. --- arch/x86/include/asm/pgtable.h | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index ac4ed814be96..8d4c09831e67 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -727,6 +727,21 @@ static inline pmd_t pmd_mkinvalid(pmd_t pmd) static inline u64 flip_protnone_guard(u64 oldval, u64 val, u64 mask); +static inline pteval_t fixup_dirty_pte(pteval_t pteval) +{ + pte_t pte = __pte(pteval); + + if (cpu_feature_enabled(X86_FEATURE_SHSTK) && pte_dirty(pte)) { + pte = pte_mkclean(pte); + + if (pte_flags(pte) & _PAGE_RW) + pte = pte_set_flags(pte, _PAGE_DIRTY_HW); + else + pte = pte_set_flags(pte, _PAGE_COW); + } + return pte_val(pte); +} + static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) { pteval_t val = pte_val(pte), oldval = val; @@ -737,16 +752,34 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) */ val &= _PAGE_CHG_MASK; val |= check_pgprot(newprot) & ~_PAGE_CHG_MASK; + val = fixup_dirty_pte(val); val = flip_protnone_guard(oldval, val, PTE_PFN_MASK); return __pte(val); } +static inline int pmd_write(pmd_t pmd); +static inline pmdval_t fixup_dirty_pmd(pmdval_t pmdval) +{ + pmd_t pmd = __pmd(pmdval); + + if (cpu_feature_enabled(X86_FEATURE_SHSTK) && pmd_dirty(pmd)) { + pmd = pmd_mkclean(pmd); + + if (pmd_flags(pmd) & _PAGE_RW) + pmd = pmd_set_flags(pmd, _PAGE_DIRTY_HW); + else + pmd = pmd_set_flags(pmd, _PAGE_COW); + } + return pmd_val(pmd); +} + static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) { pmdval_t val = pmd_val(pmd), oldval = val; val &= _HPAGE_CHG_MASK; val |= check_pgprot(newprot) & ~_HPAGE_CHG_MASK; + val = fixup_dirty_pmd(val); val = flip_protnone_guard(oldval, val, PHYSICAL_PMD_PAGE_MASK); return __pmd(val); } From patchwork Fri Oct 9 18:32:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827057 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D2BFF14D5 for ; Fri, 9 Oct 2020 18:33:37 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A450D22284 for ; Fri, 9 Oct 2020 18:33:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A450D22284 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B17E96B0078; Fri, 9 Oct 2020 14:33:20 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id ACAF86B007B; Fri, 9 Oct 2020 14:33:20 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8F4358E0001; Fri, 9 Oct 2020 14:33:20 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0118.hostedemail.com [216.40.44.118]) by kanga.kvack.org (Postfix) with ESMTP id 5ED936B0078 for ; Fri, 9 Oct 2020 14:33:20 -0400 (EDT) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id E9CA1180AD811 for ; Fri, 9 Oct 2020 18:33:19 +0000 (UTC) X-FDA: 77353234518.02.stop77_4505913271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin02.hostedemail.com (Postfix) with ESMTP id CD9E8101A24BA for ; Fri, 9 Oct 2020 18:33:19 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30054:30056:30064:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-62.18.0.100 64.95.201.95;04yrejh3fi94mqte1nnrwdw8wwmytoc6opfa9a9jyab9nauded5a75x84r5xh7m.g7yupnka59613qzmc66fjqw6y335kb9ajt6q6quxz1hz555tckbpig5s87w7hb7.4-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: stop77_4505913271e2 X-Filterd-Recvd-Size: 6756 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf36.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:18 +0000 (UTC) IronPort-SDR: lKrgsL9udhEroNWJDECMgiex8T2LYSIMY1EqRUatmvUL9D6CIGdDGRFRIb4KdewEhubcN5b4Im 6GThr8fmWlXg== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390225" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390225" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:18 -0700 IronPort-SDR: ZezoQJ05eDgw6I0oJZ6LzLHRpzb3pJ0AakUzzEnLudImgnnOD7Fp76Q6AMoR8O/qLcDqD2M5bY pD0HC0v/c10A== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031158" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:17 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 11/26] x86/mm: Update ptep_set_wrprotect() and pmdp_set_wrprotect() for transition from _PAGE_DIRTY_HW to _PAGE_COW Date: Fri, 9 Oct 2020 11:32:15 -0700 Message-Id: <20201009183230.26717-12-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When shadow stack is introduced, [R/O + _PAGE_DIRTY_HW] PTE is reserved for shadow stack. Copy-on-write PTEs have [R/O + _PAGE_COW]. When a PTE goes from [R/W + _PAGE_DIRTY_HW] to [R/O + _PAGE_COW], it could become a transient shadow stack PTE in two cases: The first case is that some processors can start a write but end up seeing a read-only PTE by the time they get to the Dirty bit, creating a transient shadow stack PTE. However, this will not occur on processors supporting shadow stack, therefore we don't need a TLB flush here. The second case is that when the software, without atomic, tests & replaces _PAGE_DIRTY_HW with _PAGE_COW, a transient shadow stack PTE can exist. This is prevented with cmpxchg. Dave Hansen, Jann Horn, Andy Lutomirski, and Peter Zijlstra provided many insights to the issue. Jann Horn provided the cmpxchg solution. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook v10: - Replace bit shift with pte_wrprotect()/pmd_wrprotect(), which use bit test & shift. - Move READ_ONCE of old_pte into try_cmpxchg() loop. - Change static_cpu_has() to cpu_feature_enabled(). v9: - Change compile-time conditionals to runtime checks. - Fix parameters of try_cmpxchg(): change pte_t/pmd_t to pte_t.pte/pmd_t.pmd. v4: - Implement try_cmpxchg(). --- arch/x86/include/asm/pgtable.h | 52 ++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 8d4c09831e67..8e637a5ed9e4 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1230,6 +1230,32 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { + /* + * Some processors can start a write, but end up seeing a read-only + * PTE by the time they get to the Dirty bit. In this case, they + * will set the Dirty bit, leaving a read-only, Dirty PTE which + * looks like a shadow stack PTE. + * + * However, this behavior has been improved and will not occur on + * processors supporting shadow stack. Without this guarantee, a + * transition to a non-present PTE and flush the TLB would be + * needed. + * + * When changing a writable PTE to read-only and if the PTE has + * _PAGE_DIRTY_HW set, move that bit to _PAGE_COW so that the + * PTE is not a shadow stack PTE. + */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + pte_t old_pte, new_pte; + + do { + old_pte = READ_ONCE(*ptep); + new_pte = pte_wrprotect(old_pte); + + } while (!try_cmpxchg(&ptep->pte, &old_pte.pte, new_pte.pte)); + + return; + } clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte); } @@ -1286,6 +1312,32 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp) { + /* + * Some processors can start a write, but end up seeing a read-only + * PMD by the time they get to the Dirty bit. In this case, they + * will set the Dirty bit, leaving a read-only, Dirty PMD which + * looks like a Shadow Stack PMD. + * + * However, this behavior has been improved and will not occur on + * processors supporting Shadow Stack. Without this guarantee, a + * transition to a non-present PMD and flush the TLB would be + * needed. + * + * When changing a writable PMD to read-only and if the PMD has + * _PAGE_DIRTY_HW set, we move that bit to _PAGE_COW so that the + * PMD is not a shadow stack PMD. + */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + pmd_t old_pmd, new_pmd; + + do { + old_pmd = READ_ONCE(*pmdp); + new_pmd = pmd_wrprotect(old_pmd); + + } while (!try_cmpxchg((pmdval_t *)pmdp, (pmdval_t *)&old_pmd, pmd_val(new_pmd))); + + return; + } clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp); } From patchwork Fri Oct 9 18:32:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827059 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 99BBA14D5 for ; Fri, 9 Oct 2020 18:33:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6437222314 for ; Fri, 9 Oct 2020 18:33:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6437222314 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 8CEB26B007B; Fri, 9 Oct 2020 14:33:21 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 880508E0001; Fri, 9 Oct 2020 14:33:21 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6AACC6B007E; Fri, 9 Oct 2020 14:33:21 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0029.hostedemail.com [216.40.44.29]) by kanga.kvack.org (Postfix) with ESMTP id 399366B007B for ; Fri, 9 Oct 2020 14:33:21 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id BB620362A for ; Fri, 9 Oct 2020 18:33:20 +0000 (UTC) X-FDA: 77353234560.06.river41_1017a17271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin06.hostedemail.com (Postfix) with ESMTP id 942A6101CD639 for ; Fri, 9 Oct 2020 18:33:20 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30054:30056:30064:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100;04yrzjeooabe1acxsc1nqug8o8x6xyp5437yt7a5kri351gfay3jprppzb95szb.acp8tiadft5f6mgd1ftfwuotwgjsryyekkaiessra1i4jb96uhm1nmrwy9aneto.1-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: river41_1017a17271e2 X-Filterd-Recvd-Size: 5313 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf47.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:19 +0000 (UTC) IronPort-SDR: NbC18YiQzRfu3Sa6ENqvMXF0cIwM0Un8ZVcENtXHtC1JxktIrG0ClRKOoWqKiBvDiqIwimtEfC GgVxgdUcNQHg== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390228" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390228" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:19 -0700 IronPort-SDR: f9rNyMfPG6xNQOquVmZxqZvQXRiQ8mAicfXLnuim7kGtIJ6/NbcUIkMQ9rhMgzriLkN1oHIIR2 H6E/a5Qvo3yA== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031165" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:18 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 12/26] mm: Introduce VM_SHSTK for shadow stack memory Date: Fri, 9 Oct 2020 11:32:16 -0700 Message-Id: <20201009183230.26717-13-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: A Shadow Stack PTE must be read-only and have _PAGE_DIRTY set. However, read-only and Dirty PTEs also exist for copy-on-write (COW) pages. These two cases are handled differently for page faults. Introduce VM_SHSTK to track shadow stack VMAs. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook v9: - Add VM_SHSTK case to arch_vma_name(). - Revise the commit log to explain why adding a new VM flag. --- arch/x86/mm/mmap.c | 2 ++ fs/proc/task_mmu.c | 3 +++ include/linux/mm.h | 8 ++++++++ 3 files changed, 13 insertions(+) diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c index c90c20904a60..a22c6b6fc607 100644 --- a/arch/x86/mm/mmap.c +++ b/arch/x86/mm/mmap.c @@ -165,6 +165,8 @@ unsigned long get_mmap_base(int is_legacy) const char *arch_vma_name(struct vm_area_struct *vma) { + if (vma->vm_flags & VM_SHSTK) + return "[shadow stack]"; return NULL; } diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 5066b0251ed8..436dd37f2d4c 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -663,6 +663,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) [ilog2(VM_PKEY_BIT4)] = "", #endif #endif /* CONFIG_ARCH_HAS_PKEYS */ +#ifdef CONFIG_X86_SHADOW_STACK_USER + [ilog2(VM_SHSTK)] = "ss", +#endif }; size_t i; diff --git a/include/linux/mm.h b/include/linux/mm.h index 16b799a0522c..12be96b061c7 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -304,11 +304,13 @@ extern unsigned int kobjsize(const void *objp); #define VM_HIGH_ARCH_BIT_2 34 /* bit only usable on 64-bit architectures */ #define VM_HIGH_ARCH_BIT_3 35 /* bit only usable on 64-bit architectures */ #define VM_HIGH_ARCH_BIT_4 36 /* bit only usable on 64-bit architectures */ +#define VM_HIGH_ARCH_BIT_5 37 /* bit only usable on 64-bit architectures */ #define VM_HIGH_ARCH_0 BIT(VM_HIGH_ARCH_BIT_0) #define VM_HIGH_ARCH_1 BIT(VM_HIGH_ARCH_BIT_1) #define VM_HIGH_ARCH_2 BIT(VM_HIGH_ARCH_BIT_2) #define VM_HIGH_ARCH_3 BIT(VM_HIGH_ARCH_BIT_3) #define VM_HIGH_ARCH_4 BIT(VM_HIGH_ARCH_BIT_4) +#define VM_HIGH_ARCH_5 BIT(VM_HIGH_ARCH_BIT_5) #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */ #ifdef CONFIG_ARCH_HAS_PKEYS @@ -324,6 +326,12 @@ extern unsigned int kobjsize(const void *objp); #endif #endif /* CONFIG_ARCH_HAS_PKEYS */ +#ifdef CONFIG_X86_SHADOW_STACK_USER +# define VM_SHSTK VM_HIGH_ARCH_5 +#else +# define VM_SHSTK VM_NONE +#endif + #if defined(CONFIG_X86) # define VM_PAT VM_ARCH_1 /* PAT reserves whole VMA at once (x86) */ #elif defined(CONFIG_PPC) From patchwork Fri Oct 9 18:32:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827061 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1D46614D5 for ; Fri, 9 Oct 2020 18:33:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D57DF222C3 for ; Fri, 9 Oct 2020 18:33:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D57DF222C3 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id F38746B007D; Fri, 9 Oct 2020 14:33:21 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id EEC516B007E; Fri, 9 Oct 2020 14:33:21 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D62AC6B0080; Fri, 9 Oct 2020 14:33:21 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0213.hostedemail.com [216.40.44.213]) by kanga.kvack.org (Postfix) with ESMTP id A332B6B007D for ; Fri, 9 Oct 2020 14:33:21 -0400 (EDT) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 385791EE6 for ; Fri, 9 Oct 2020 18:33:21 +0000 (UTC) X-FDA: 77353234602.23.tooth95_4b15e6b271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin23.hostedemail.com (Postfix) with ESMTP id 1D80637606 for ; Fri, 9 Oct 2020 18:33:21 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30003:30054:30056:30064:30069:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100;04yrghfr3h978dtxg1kxsaczc8784opd9bf64hsw76mmgnwjyiwo8y8j7cbu4sd.gxpm91jod81ppowoyqkbuyt3puxfeda4qny6xryb8tr91p1y5t5gwzr7p6idfzn.q-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: tooth95_4b15e6b271e2 X-Filterd-Recvd-Size: 6050 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf36.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:20 +0000 (UTC) IronPort-SDR: RMRMYGa2CkmVkMXxnEWuOfYg2gK9nSmocae2SbpBu1Nqfft6BPfiPfvHhMwLuvqM3lkvg0PZ1w SmWz50N8Izuw== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390230" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390230" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:19 -0700 IronPort-SDR: P+MUraUybckbWiHQRGxlDqH898uEvbgyWOD5TyH5Q7Cl0/6rBGPRcW9SRcHlo+z/IJdRli2HvL vfKy7HsenfcA== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031170" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:19 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 13/26] x86/mm: Shadow Stack page fault error checking Date: Fri, 9 Oct 2020 11:32:17 -0700 Message-Id: <20201009183230.26717-14-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Shadow stack accesses are those that are performed by the CPU where it expects to encounter a shadow stack mapping. These accesses are performed implicitly by CALL/RET at the site of the shadow stack pointer. These accesses are made explicitly by shadow stack management instructions like WRUSSQ. Shadow stacks accesses to shadow-stack mapping can see faults in normal, valid operation just like regular accesses to regular mappings. Shadow stacks need some of the same features like delayed allocation, swap and copy-on-write. Shadow stack accesses can also result in errors, such as when a shadow stack overflows, or if a shadow stack access occurs to a non-shadow-stack mapping. In handling a shadow stack page fault, verify it occurs within a shadow stack mapping. It is always an error otherwise. For valid shadow stack accesses, set FAULT_FLAG_WRITE to effect copy-on-write. Because clearing _PAGE_DIRTY_HW (vs. _PAGE_RW) is used to trigger the fault, shadow stack read fault and shadow stack write fault are not differentiated and both are handled as a write access. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook v10: -Revise commit log. --- arch/x86/include/asm/traps.h | 2 ++ arch/x86/mm/fault.c | 19 +++++++++++++++++++ 2 files changed, 21 insertions(+) diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 714b1a30e7b0..28b493c53d70 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -50,6 +50,7 @@ void __noreturn handle_stack_overflow(const char *message, * bit 3 == 1: use of reserved bit detected * bit 4 == 1: fault was an instruction fetch * bit 5 == 1: protection keys block access + * bit 6 == 1: shadow stack access fault */ enum x86_pf_error_code { X86_PF_PROT = 1 << 0, @@ -58,5 +59,6 @@ enum x86_pf_error_code { X86_PF_RSVD = 1 << 3, X86_PF_INSTR = 1 << 4, X86_PF_PK = 1 << 5, + X86_PF_SHSTK = 1 << 6, }; #endif /* _ASM_X86_TRAPS_H */ diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 6e3e8a124903..2390399c157f 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1110,6 +1110,17 @@ access_error(unsigned long error_code, struct vm_area_struct *vma) (error_code & X86_PF_INSTR), foreign)) return 1; + /* + * Verify a shadow stack access is within a shadow stack VMA. + * It is always an error otherwise. Normal data access to a + * shadow stack area is checked in the case followed. + */ + if (error_code & X86_PF_SHSTK) { + if (!(vma->vm_flags & VM_SHSTK)) + return 1; + return 0; + } + if (error_code & X86_PF_WRITE) { /* write, present and write, not present: */ if (unlikely(!(vma->vm_flags & VM_WRITE))) @@ -1275,6 +1286,14 @@ void do_user_addr_fault(struct pt_regs *regs, perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); + /* + * Clearing _PAGE_DIRTY_HW is used to detect shadow stack access. + * This method cannot distinguish shadow stack read vs. write. + * For valid shadow stack accesses, set FAULT_FLAG_WRITE to effect + * copy-on-write. + */ + if (hw_error_code & X86_PF_SHSTK) + flags |= FAULT_FLAG_WRITE; if (hw_error_code & X86_PF_WRITE) flags |= FAULT_FLAG_WRITE; if (hw_error_code & X86_PF_INSTR) From patchwork Fri Oct 9 18:32:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827063 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E274914D5 for ; Fri, 9 Oct 2020 18:33:45 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B46E0222BA for ; Fri, 9 Oct 2020 18:33:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B46E0222BA Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id BDDEA6B007E; Fri, 9 Oct 2020 14:33:22 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id B69C36B0080; Fri, 9 Oct 2020 14:33:22 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 995B76B0081; Fri, 9 Oct 2020 14:33:22 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0156.hostedemail.com [216.40.44.156]) by kanga.kvack.org (Postfix) with ESMTP id 6D8E06B007E for ; Fri, 9 Oct 2020 14:33:22 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 0B9E3180AD811 for ; Fri, 9 Oct 2020 18:33:22 +0000 (UTC) X-FDA: 77353234644.21.alarm26_1306d5f271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin21.hostedemail.com (Postfix) with ESMTP id E2D64180442C0 for ; Fri, 9 Oct 2020 18:33:21 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30003:30054:30056:30064:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-62.18.0.100 64.95.201.95;04ygd7ymfnq1mrce4swccukcfrwa7ypp3hwu8jkm9fyy1jdnffm8x4hs1mt3hd7.wfke86crtg1fjikd9jihc4rgotgbu51wa1dqthxiyrspibjfiqzhx4h77zcan4u.c-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:22,LUA_SUMMARY:none X-HE-Tag: alarm26_1306d5f271e2 X-Filterd-Recvd-Size: 6346 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf47.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:20 +0000 (UTC) IronPort-SDR: nFjAZS5Gpw/LxBqPNkWgz1agrDTR7+dBpg7dfi7vTTu93WjYjDqprVY2jrZRiWPTo/CexJDKwR nxye3YbBxZ4g== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390233" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390233" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:20 -0700 IronPort-SDR: YfeV+iuQu9OQjgYh91aEixZ7zdJDHp7e3rXv/cVaWMFM3rVaPr7XOEbVjgJ2OXVZBxbmRq2AHb /dXjTYWTS/Jg== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031174" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:19 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 14/26] x86/mm: Update maybe_mkwrite() for shadow stack Date: Fri, 9 Oct 2020 11:32:18 -0700 Message-Id: <20201009183230.26717-15-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Shadow stack memory is writable, but its VMA has VM_SHSTK instead of VM_WRITE. Update maybe_mkwrite() to include the shadow stack. Signed-off-by: Yu-cheng Yu --- arch/x86/Kconfig | 4 ++++ arch/x86/mm/pgtable.c | 18 ++++++++++++++++++ include/linux/mm.h | 2 ++ include/linux/pgtable.h | 24 ++++++++++++++++++++++++ mm/huge_memory.c | 2 ++ 5 files changed, 50 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 415fcc869afc..7578327226e3 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1935,6 +1935,9 @@ config AS_HAS_SHADOW_STACK config X86_CET def_bool n +config ARCH_MAYBE_MKWRITE + def_bool n + config ARCH_HAS_SHADOW_STACK def_bool n @@ -1945,6 +1948,7 @@ config X86_SHADOW_STACK_USER depends on AS_HAS_SHADOW_STACK select ARCH_USES_HIGH_VMA_FLAGS select X86_CET + select ARCH_MAYBE_MKWRITE select ARCH_HAS_SHADOW_STACK help Shadow Stacks provides protection against program stack diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index dfd82f51ba66..a9666b64bc05 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -610,6 +610,24 @@ int pmdp_clear_flush_young(struct vm_area_struct *vma, } #endif +#ifdef CONFIG_ARCH_MAYBE_MKWRITE +pte_t arch_maybe_mkwrite(pte_t pte, struct vm_area_struct *vma) +{ + if (likely(vma->vm_flags & VM_SHSTK)) + pte = pte_mkwrite_shstk(pte); + return pte; +} + +#ifdef CONFIG_TRANSPARENT_HUGEPAGE +pmd_t arch_maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) +{ + if (likely(vma->vm_flags & VM_SHSTK)) + pmd = pmd_mkwrite_shstk(pmd); + return pmd; +} +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ +#endif /* CONFIG_ARCH_MAYBE_MKWRITE */ + /** * reserve_top_address - reserves a hole in the top of kernel address space * @reserve - size of hole to reserve diff --git a/include/linux/mm.h b/include/linux/mm.h index 12be96b061c7..4f6305106feb 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -969,6 +969,8 @@ static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma) { if (likely(vma->vm_flags & VM_WRITE)) pte = pte_mkwrite(pte); + else + pte = arch_maybe_mkwrite(pte, vma); return pte; } diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 90654cb63e9e..157f5e726896 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1356,6 +1356,30 @@ static inline bool arch_has_pfn_modify_check(void) } #endif /* !_HAVE_ARCH_PFN_MODIFY_ALLOWED */ +#ifdef CONFIG_MMU +#ifdef CONFIG_ARCH_MAYBE_MKWRITE +pte_t arch_maybe_mkwrite(pte_t pte, struct vm_area_struct *vma); + +#ifdef CONFIG_TRANSPARENT_HUGEPAGE +pmd_t arch_maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma); +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ + +#else /* !CONFIG_ARCH_MAYBE_MKWRITE */ +static inline pte_t arch_maybe_mkwrite(pte_t pte, struct vm_area_struct *vma) +{ + return pte; +} + +#ifdef CONFIG_TRANSPARENT_HUGEPAGE +static inline pmd_t arch_maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) +{ + return pmd; +} +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ + +#endif /* CONFIG_ARCH_MAYBE_MKWRITE */ +#endif /* CONFIG_MMU */ + /* * Architecture PAGE_KERNEL_* fallbacks * diff --git a/mm/huge_memory.c b/mm/huge_memory.c index da397779a6d4..01252b00cd06 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -464,6 +464,8 @@ pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) { if (likely(vma->vm_flags & VM_WRITE)) pmd = pmd_mkwrite(pmd); + else + pmd = arch_maybe_pmd_mkwrite(pmd, vma); return pmd; } From patchwork Fri Oct 9 18:32:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827065 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 28B0C139F for ; Fri, 9 Oct 2020 18:33:48 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id F00A722284 for ; Fri, 9 Oct 2020 18:33:47 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F00A722284 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 9E0786B0080; Fri, 9 Oct 2020 14:33:23 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 997846B0081; Fri, 9 Oct 2020 14:33:23 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 80F676B0083; Fri, 9 Oct 2020 14:33:23 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0082.hostedemail.com [216.40.44.82]) by kanga.kvack.org (Postfix) with ESMTP id 49A636B0080 for ; Fri, 9 Oct 2020 14:33:23 -0400 (EDT) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id E4C70181AE863 for ; Fri, 9 Oct 2020 18:33:22 +0000 (UTC) X-FDA: 77353234644.04.boats98_4f0e06d271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin04.hostedemail.com (Postfix) with ESMTP id C316F800C882 for ; Fri, 9 Oct 2020 18:33:22 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30003:30054:30056:30064:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-62.18.0.100 64.95.201.95;04y8fb316pr3b6pzmos3k66grz1s3ycf7ug9yofx4wnokgxmhcptck6q8s3dhng.szwrd4n5iui6aqzcsw8qi5c4mu7yqbnaoowj43cxwiaehtqdxk79hiuc48gzjmn.g-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: boats98_4f0e06d271e2 X-Filterd-Recvd-Size: 5339 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf36.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:21 +0000 (UTC) IronPort-SDR: NqT7QDbfyhJsaWvJGFVwq19ShPXoTSk2KEaQQCocXI7QunaWKDUL4+nV0w+pDxpBPIWeKjLL05 ApswkJKW/CAQ== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390236" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390236" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:21 -0700 IronPort-SDR: YY+fCizLYtbw2KqiHuPIr6IS5sfs3YdjU6d6tLEIV5+4mjpR5qElCkKI1+zQbP+JTfQSVqDzTs x+LsNLy5LJOQ== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031178" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:20 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 15/26] mm: Fixup places that call pte_mkwrite() directly Date: Fri, 9 Oct 2020 11:32:19 -0700 Message-Id: <20201009183230.26717-16-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: A shadow stack page is made writable by pte_mkwrite_shstk(), which sets _PAGE_DIRTY_HW. There are a few places that call pte_mkwrite() directly and miss the maybe_mkwrite() fixup in the previous patch. Fix them with maybe_mkwrite(): - do_anonymous_page() and migrate_vma_insert_page() check VM_WRITE directly and call pte_mkwrite(), which is the same as maybe_mkwrite(). Change them to maybe_mkwrite(). - In do_numa_page(), if the numa entry 'was-writable', then pte_mkwrite() is called directly. Fix it by doing maybe_mkwrite(). - In change_pte_range(), pte_mkwrite() is called directly. Replace it with maybe_mkwrite(). Signed-off-by: Yu-cheng Yu --- mm/memory.c | 5 ++--- mm/migrate.c | 3 +-- mm/mprotect.c | 2 +- 3 files changed, 4 insertions(+), 6 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index fcfc4ca36eba..5fde329791b8 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3563,8 +3563,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) entry = mk_pte(page, vma->vm_page_prot); entry = pte_sw_mkyoung(entry); - if (vma->vm_flags & VM_WRITE) - entry = pte_mkwrite(pte_mkdirty(entry)); + entry = maybe_mkwrite(pte_mkdirty(entry), vma); vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, &vmf->ptl); @@ -4218,7 +4217,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) pte = pte_modify(old_pte, vma->vm_page_prot); pte = pte_mkyoung(pte); if (was_writable) - pte = pte_mkwrite(pte); + pte = maybe_mkwrite(pte, vma); ptep_modify_prot_commit(vma, vmf->address, vmf->pte, old_pte, pte); update_mmu_cache(vma, vmf->address, vmf->pte); diff --git a/mm/migrate.c b/mm/migrate.c index 04a98bb2f568..bba81bbcee80 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -2904,8 +2904,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate, } } else { entry = mk_pte(page, vma->vm_page_prot); - if (vma->vm_flags & VM_WRITE) - entry = pte_mkwrite(pte_mkdirty(entry)); + entry = maybe_mkwrite(pte_mkdirty(entry), vma); } ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl); diff --git a/mm/mprotect.c b/mm/mprotect.c index ce8b8a5eacbb..a8edbcb3af99 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -135,7 +135,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, if (dirty_accountable && pte_dirty(ptent) && (pte_soft_dirty(ptent) || !(vma->vm_flags & VM_SOFTDIRTY))) { - ptent = pte_mkwrite(ptent); + ptent = maybe_mkwrite(ptent, vma); } ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent); pages++; From patchwork Fri Oct 9 18:32:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827067 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B121514D5 for ; Fri, 9 Oct 2020 18:33:50 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7788D222BA for ; Fri, 9 Oct 2020 18:33:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7788D222BA Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id EED836B0081; Fri, 9 Oct 2020 14:33:23 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id E753D6B0082; Fri, 9 Oct 2020 14:33:23 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D18528E0001; Fri, 9 Oct 2020 14:33:23 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0207.hostedemail.com [216.40.44.207]) by kanga.kvack.org (Postfix) with ESMTP id 7BE7A6B0082 for ; Fri, 9 Oct 2020 14:33:23 -0400 (EDT) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 290948249980 for ; Fri, 9 Oct 2020 18:33:23 +0000 (UTC) X-FDA: 77353234686.28.wash86_17180b2271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin28.hostedemail.com (Postfix) with ESMTP id 07CE36C05 for ; Fri, 9 Oct 2020 18:33:23 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30003:30054:30056:30064,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100;04ygzecmoz6g31inmxi73epsth3kcoch91ne5baonnfy7rpum4gm9jhhxgua4h3.iqdfeufm19tme9rddinuf5cnn484sowk1pixz4hifo4a4hbow6shxakzbbn4etd.q-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: wash86_17180b2271e2 X-Filterd-Recvd-Size: 5965 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf47.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:22 +0000 (UTC) IronPort-SDR: xBfgIkbG+6A9aqS4WdOSxZv2VJUVSZAdX0lozJl8++X3nCxzvW7ceYiyCMdxQAEcGxP8+mJ+Gv cHmzAs2tjqrg== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390237" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390237" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:22 -0700 IronPort-SDR: kwqnonbsJTBKvJ1u/MPsZ7qKVw2rupj2KDzxzltIKru8fra2WhgDj8Wmu0wolWM/MgNw5jfCIp 119yCIcITC1w== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031183" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:21 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 16/26] mm: Add guard pages around a shadow stack. Date: Fri, 9 Oct 2020 11:32:20 -0700 Message-Id: <20201009183230.26717-17-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: INCSSP(Q/D) increments shadow stack pointer and 'pops and discards' the first and the last elements in the range, effectively touches those memory areas. The maximum moving distance by INCSSPQ is 255 * 8 = 2040 bytes and 255 * 4 = 1020 bytes by INCSSPD. Both ranges are far from PAGE_SIZE. Thus, putting a gap page on both ends of a shadow stack prevents INCSSP, CALL, and RET from going beyond. Signed-off-by: Yu-cheng Yu v10: - Define ARCH_SHADOW_STACK_GUARD_GAP. --- arch/x86/include/asm/processor.h | 10 ++++++++++ include/linux/mm.h | 24 ++++++++++++++++++++---- 2 files changed, 30 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 97143d87994c..01acbd63cad8 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -840,6 +840,16 @@ static inline void spin_lock_prefetch(const void *x) #define STACK_TOP TASK_SIZE_LOW #define STACK_TOP_MAX TASK_SIZE_MAX +/* + * Shadow stack pointer is moved by CALL, JMP, and INCSSP(Q/D). INCSSPQ + * moves shadow stack pointer up to 255 * 8 = ~2 KB (~1KB for INCSSPD) and + * touches the first and the last element in the range, which triggers a + * page fault if the range is not in a shadow stack. Because of this, + * creating 4-KB guard pages around a shadow stack prevents these + * instructions from going beyond. + */ +#define ARCH_SHADOW_STACK_GUARD_GAP PAGE_SIZE + #define INIT_THREAD { \ .addr_limit = KERNEL_DS, \ } diff --git a/include/linux/mm.h b/include/linux/mm.h index 4f6305106feb..ce461795fd8b 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2615,6 +2615,10 @@ extern vm_fault_t filemap_page_mkwrite(struct vm_fault *vmf); int __must_check write_one_page(struct page *page); void task_dirty_inc(struct task_struct *tsk); +#ifndef ARCH_SHADOW_STACK_GUARD_GAP +#define ARCH_SHADOW_STACK_GUARD_GAP 0 +#endif + extern unsigned long stack_guard_gap; /* Generic expand stack which grows the stack according to GROWS{UP,DOWN} */ extern int expand_stack(struct vm_area_struct *vma, unsigned long address); @@ -2647,9 +2651,15 @@ static inline struct vm_area_struct * find_vma_intersection(struct mm_struct * m static inline unsigned long vm_start_gap(struct vm_area_struct *vma) { unsigned long vm_start = vma->vm_start; + unsigned long gap = 0; - if (vma->vm_flags & VM_GROWSDOWN) { - vm_start -= stack_guard_gap; + if (vma->vm_flags & VM_GROWSDOWN) + gap = stack_guard_gap; + else if (vma->vm_flags & VM_SHSTK) + gap = ARCH_SHADOW_STACK_GUARD_GAP; + + if (gap != 0) { + vm_start -= gap; if (vm_start > vma->vm_start) vm_start = 0; } @@ -2659,9 +2669,15 @@ static inline unsigned long vm_start_gap(struct vm_area_struct *vma) static inline unsigned long vm_end_gap(struct vm_area_struct *vma) { unsigned long vm_end = vma->vm_end; + unsigned long gap = 0; + + if (vma->vm_flags & VM_GROWSUP) + gap = stack_guard_gap; + else if (vma->vm_flags & VM_SHSTK) + gap = ARCH_SHADOW_STACK_GUARD_GAP; - if (vma->vm_flags & VM_GROWSUP) { - vm_end += stack_guard_gap; + if (gap != 0) { + vm_end += gap; if (vm_end < vma->vm_end) vm_end = -PAGE_SIZE; } From patchwork Fri Oct 9 18:32:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827069 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D1E97139F for ; Fri, 9 Oct 2020 18:33:52 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9336D222B9 for ; Fri, 9 Oct 2020 18:33:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9336D222B9 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D90E96B0082; Fri, 9 Oct 2020 14:33:25 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id D3BB58E0001; Fri, 9 Oct 2020 14:33:25 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BDEE36B0085; Fri, 9 Oct 2020 14:33:25 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0150.hostedemail.com [216.40.44.150]) by kanga.kvack.org (Postfix) with ESMTP id 8D6746B0082 for ; Fri, 9 Oct 2020 14:33:25 -0400 (EDT) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 2603E1EE6 for ; Fri, 9 Oct 2020 18:33:25 +0000 (UTC) X-FDA: 77353234770.27.run53_270d947271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin27.hostedemail.com (Postfix) with ESMTP id 06E1A3D663 for ; Fri, 9 Oct 2020 18:33:24 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30001:30054:30056:30064:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100;04yr3fp9jjodnggd8gz6irjqy1daxyp8uigy6srgcr9eyneuu41jibnyjta4g9m.3ibmyy6951pr3upb1ywi1kohwyy9j8x3cxodx5qb1p5katchqy16p5fr6e8cimr.1-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: run53_270d947271e2 X-Filterd-Recvd-Size: 4937 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf38.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:24 +0000 (UTC) IronPort-SDR: v3hW4M2VCfKDNlrX+QC6Zx00S4/DM8HPYCo+lIAIqNEFhnyGjWre511j08omhNHeqvTzpjVDfC 2CNEfVx3spOA== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390241" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390241" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:22 -0700 IronPort-SDR: IEc5Yt270Rmj0Oi9i8rRmRUKNEuBztenkEftbr997kju6CQAkepjSwECp5P+SOQjI3q5i9nx5b Czf+uPq50ZsA== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031186" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:22 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 17/26] mm/mmap: Add shadow stack pages to memory accounting Date: Fri, 9 Oct 2020 11:32:21 -0700 Message-Id: <20201009183230.26717-18-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Account shadow stack pages to stack memory. Signed-off-by: Yu-cheng Yu v10: - Use arch_shadow_stack_mapping() to make meaning clear. v8: - Change shadow stake pages from data_vm to stack_vm. --- arch/x86/mm/pgtable.c | 7 +++++++ include/linux/pgtable.h | 11 +++++++++++ mm/mmap.c | 5 +++++ 3 files changed, 23 insertions(+) diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index a9666b64bc05..68e98f70298b 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -893,3 +893,10 @@ int pmd_free_pte_page(pmd_t *pmd, unsigned long addr) #endif /* CONFIG_X86_64 */ #endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */ + +#ifdef CONFIG_ARCH_HAS_SHADOW_STACK +bool arch_shadow_stack_mapping(vm_flags_t vm_flags) +{ + return (vm_flags & VM_SHSTK); +} +#endif diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 157f5e726896..6f2ca5fffd44 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1380,6 +1380,17 @@ static inline pmd_t arch_maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma #endif /* CONFIG_ARCH_MAYBE_MKWRITE */ #endif /* CONFIG_MMU */ +#ifdef CONFIG_MMU +#ifdef CONFIG_ARCH_HAS_SHADOW_STACK +bool arch_shadow_stack_mapping(vm_flags_t vm_flags); +#else +static inline bool arch_shadow_stack_mapping(vm_flags_t vm_flags) +{ + return false; +} +#endif /* CONFIG_ARCH_HAS_SHADOW_STACK */ +#endif /* CONFIG_MMU */ + /* * Architecture PAGE_KERNEL_* fallbacks * diff --git a/mm/mmap.c b/mm/mmap.c index 40248d84ad5f..574b3f273462 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1682,6 +1682,9 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags) if (file && is_file_hugepages(file)) return 0; + if (arch_shadow_stack_mapping(vm_flags)) + return 1; + return (vm_flags & (VM_NORESERVE | VM_SHARED | VM_WRITE)) == VM_WRITE; } @@ -3352,6 +3355,8 @@ void vm_stat_account(struct mm_struct *mm, vm_flags_t flags, long npages) mm->stack_vm += npages; else if (is_data_mapping(flags)) mm->data_vm += npages; + else if (arch_shadow_stack_mapping(flags)) + mm->stack_vm += npages; } static vm_fault_t special_mapping_fault(struct vm_fault *vmf); From patchwork Fri Oct 9 18:32:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827071 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1B887139F for ; Fri, 9 Oct 2020 18:33:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D9BC222284 for ; Fri, 9 Oct 2020 18:33:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D9BC222284 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 9BBB86B0083; Fri, 9 Oct 2020 14:33:26 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 9904D6B0085; Fri, 9 Oct 2020 14:33:26 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 857FC6B0087; Fri, 9 Oct 2020 14:33:26 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0058.hostedemail.com [216.40.44.58]) by kanga.kvack.org (Postfix) with ESMTP id 573426B0083 for ; Fri, 9 Oct 2020 14:33:26 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id E856A8249980 for ; Fri, 9 Oct 2020 18:33:25 +0000 (UTC) X-FDA: 77353234770.16.dock91_4211aeb271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin16.hostedemail.com (Postfix) with ESMTP id C5EAB100BDB11 for ; Fri, 9 Oct 2020 18:33:25 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:30003:30029:30054:30056:30064:30069:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100;04ygrnyhxbjszzxpkdjekwhpy4aenopb3usdy4nenrewcwy3j45hc53kux175m5.oozzi19aniwh4n7qway5dirwpzz8t4dgm83q9fqz8kghk44k1bewjwbqgp7rpf9.c-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: dock91_4211aeb271e2 X-Filterd-Recvd-Size: 5720 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf35.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:24 +0000 (UTC) IronPort-SDR: WH3WB5u8WgvYSUxT5Iq5uadXxvjqIOqO0I/IRBs/RB/lO81LNfgiBfrvEvK9kTFtX+I1nGt+9N RIqbqOZp1Mzg== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390244" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390244" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:24 -0700 IronPort-SDR: ft9ey/X2dPezFXtXG1CROWRok1s/kjizjQejy/YOe0ly7RCr7xt+vdwUUXgYL2fn+v5HzxrbxZ y+A5wYR5JsWA== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031190" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:22 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu Subject: [PATCH v14 18/26] mm: Update can_follow_write_pte() for shadow stack Date: Fri, 9 Oct 2020 11:32:22 -0700 Message-Id: <20201009183230.26717-19-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Can_follow_write_pte() ensures a read-only page is COWed by checking the FOLL_COW flag, and uses pte_dirty() to validate the flag is still valid. Like a writable data page, a shadow stack page is writable, and becomes read-only during copy-on-write, but it is always dirty. Thus, in the can_follow_write_pte() check, it belongs to the writable page case and should be excluded from the read-only page pte_dirty() check. Apply the same changes to can_follow_write_pmd(). Signed-off-by: Yu-cheng Yu v10: - Reverse name changes to can_follow_write_*(). --- mm/gup.c | 8 +++++--- mm/huge_memory.c | 8 +++++--- 2 files changed, 10 insertions(+), 6 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index e869c634cc9a..10e32f574822 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -384,10 +384,12 @@ static int follow_pfn_pte(struct vm_area_struct *vma, unsigned long address, * FOLL_FORCE can write to even unwritable pte's, but only * after we've gone through a COW cycle and they are dirty. */ -static inline bool can_follow_write_pte(pte_t pte, unsigned int flags) +static inline bool can_follow_write_pte(pte_t pte, unsigned int flags, + struct vm_area_struct *vma) { return pte_write(pte) || - ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte)); + ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte) && + !arch_shadow_stack_mapping(vma->vm_flags)); } static struct page *follow_page_pte(struct vm_area_struct *vma, @@ -430,7 +432,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, } if ((flags & FOLL_NUMA) && pte_protnone(pte)) goto no_page; - if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, flags)) { + if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, flags, vma)) { pte_unmap_unlock(ptep, ptl); return NULL; } diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 01252b00cd06..fd22ceaba945 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1324,10 +1324,12 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd) * FOLL_FORCE can write to even unwritable pmd's, but only * after we've gone through a COW cycle and they are dirty. */ -static inline bool can_follow_write_pmd(pmd_t pmd, unsigned int flags) +static inline bool can_follow_write_pmd(pmd_t pmd, unsigned int flags, + struct vm_area_struct *vma) { return pmd_write(pmd) || - ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pmd_dirty(pmd)); + ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pmd_dirty(pmd) && + !arch_shadow_stack_mapping(vma->vm_flags)); } struct page *follow_trans_huge_pmd(struct vm_area_struct *vma, @@ -1340,7 +1342,7 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma, assert_spin_locked(pmd_lockptr(mm, pmd)); - if (flags & FOLL_WRITE && !can_follow_write_pmd(*pmd, flags)) + if (flags & FOLL_WRITE && !can_follow_write_pmd(*pmd, flags, vma)) goto out; /* Avoid dumping huge zero page */ From patchwork Fri Oct 9 18:32:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 11827073 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 712CE139F for ; Fri, 9 Oct 2020 18:33:57 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3534822284 for ; Fri, 9 Oct 2020 18:33:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3534822284 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 465406B0085; Fri, 9 Oct 2020 14:33:28 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 43C426B0087; Fri, 9 Oct 2020 14:33:28 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2DABA6B0088; Fri, 9 Oct 2020 14:33:28 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0128.hostedemail.com [216.40.44.128]) by kanga.kvack.org (Postfix) with ESMTP id D9B626B0085 for ; Fri, 9 Oct 2020 14:33:27 -0400 (EDT) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 785BE180AD807 for ; Fri, 9 Oct 2020 18:33:27 +0000 (UTC) X-FDA: 77353234854.12.water51_2c10e91271e2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin12.hostedemail.com (Postfix) with ESMTP id 4246E1800CFFD for ; Fri, 9 Oct 2020 18:33:27 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,yu-cheng.yu@intel.com,,RULES_HIT:4423:30036:30051:30054:30056:30064,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100;04yfj37nybtqdkpyg8ru6garhutmhocxedzf6b3bhahgsfgphgmkjwgarf71fq8.s9hg71aceys7xzdn6en99nzptcpdpjmkr4o79gxtqdjnyrdn4ugqecdn7adb9ha.w-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: water51_2c10e91271e2 X-Filterd-Recvd-Size: 8286 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf38.hostedemail.com (Postfix) with ESMTP for ; Fri, 9 Oct 2020 18:33:25 +0000 (UTC) IronPort-SDR: zVx55yuTIoSg8qvju1+Hksb2pmw16XSC6oqtJ1vBbrwcC2t+rYm5qmGDqYUnauIX5Xdj3Yhyrm uX3AGPtFRkHg== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="145390248" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145390248" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:25 -0700 IronPort-SDR: rp6jXdeLjtAnqOqNkHb0JfEYbED0R0ksVdFKAUx62iV7U6+2ULBZABrvNC3qIaselFLUlew7iV HcgqBIo2atGA== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="529031199" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 11:33:24 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu Cc: Yu-cheng Yu , Peter Collingbourne , Andrew Morton Subject: [PATCH v14 19/26] mm: Re-introduce vm_flags to do_mmap() Date: Fri, 9 Oct 2020 11:32:23 -0700 Message-Id: <20201009183230.26717-20-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201009183230.26717-1-yu-cheng.yu@intel.com> References: <20201009183230.26717-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: There was no more caller passing vm_flags to do_mmap(), and vm_flags was removed from the function's input by: commit 45e55300f114 ("mm: remove unnecessary wrapper function do_mmap_pgoff()"). There is a new user now. Shadow stack allocation passes VM_SHSTK to do_mmap(). Re-introduce vm_flags to do_mmap(), but without the old wrapper do_mmap_pgoff(). Instead, make all callers of the wrapper pass a zero vm_flags to do_mmap(). Signed-off-by: Yu-cheng Yu Reviewed-by: Peter Collingbourne Cc: Andrew Morton Cc: Oleg Nesterov Cc: linux-mm@kvack.org v14: - Remove do_mmap_pgoff() and make all callers of do_mmap() pass vm_flags. --- fs/aio.c | 2 +- include/linux/mm.h | 3 ++- ipc/shm.c | 2 +- mm/mmap.c | 10 +++++----- mm/nommu.c | 4 ++-- mm/util.c | 2 +- 6 files changed, 12 insertions(+), 11 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index d5ec30385566..ca8c11665eea 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -527,7 +527,7 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events) ctx->mmap_base = do_mmap(ctx->aio_ring_file, 0, ctx->mmap_size, PROT_READ | PROT_WRITE, - MAP_SHARED, 0, &unused, NULL); + MAP_SHARED, 0, 0, &unused, NULL); mmap_write_unlock(mm); if (IS_ERR((void *)ctx->mmap_base)) { ctx->mmap_size = 0; diff --git a/include/linux/mm.h b/include/linux/mm.h index ce461795fd8b..71677d498300 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2560,7 +2560,8 @@ extern unsigned long mmap_region(struct file *file, unsigned long addr, struct list_head *uf); extern unsigned long do_mmap(struct file *file, unsigned long addr, unsigned long len, unsigned long prot, unsigned long flags, - unsigned long pgoff, unsigned long *populate, struct list_head *uf); + vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate, + struct list_head *uf); extern int __do_munmap(struct mm_struct *, unsigned long, size_t, struct list_head *uf, bool downgrade); extern int do_munmap(struct mm_struct *, unsigned long, size_t, diff --git a/ipc/shm.c b/ipc/shm.c index e25c7c6106bc..91474258933d 100644 --- a/ipc/shm.c +++ b/ipc/shm.c @@ -1556,7 +1556,7 @@ long do_shmat(int shmid, char __user *shmaddr, int shmflg, goto invalid; } - addr = do_mmap(file, addr, size, prot, flags, 0, &populate, NULL); + addr = do_mmap(file, addr, size, prot, flags, 0, 0, &populate, NULL); *raddr = addr; err = 0; if (IS_ERR_VALUE(addr)) diff --git a/mm/mmap.c b/mm/mmap.c index 574b3f273462..fc04184d2eae 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1365,11 +1365,11 @@ static inline bool file_mmap_ok(struct file *file, struct inode *inode, */ unsigned long do_mmap(struct file *file, unsigned long addr, unsigned long len, unsigned long prot, - unsigned long flags, unsigned long pgoff, - unsigned long *populate, struct list_head *uf) + unsigned long flags, vm_flags_t vm_flags, + unsigned long pgoff, unsigned long *populate, + struct list_head *uf) { struct mm_struct *mm = current->mm; - vm_flags_t vm_flags; int pkey = 0; *populate = 0; @@ -1431,7 +1431,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr, * to. we assume access permissions have been handled by the open * of the memory object, so we don't do any here. */ - vm_flags = calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(flags) | + vm_flags |= calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(flags) | mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; if (flags & MAP_LOCKED) @@ -3007,7 +3007,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, file = get_file(vma->vm_file); ret = do_mmap(vma->vm_file, start, size, - prot, flags, pgoff, &populate, NULL); + prot, flags, 0, pgoff, &populate, NULL); fput(file); out: mmap_write_unlock(mm); diff --git a/mm/nommu.c b/mm/nommu.c index 75a327149af1..f67d6bcdfc9f 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -1078,6 +1078,7 @@ unsigned long do_mmap(struct file *file, unsigned long len, unsigned long prot, unsigned long flags, + vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate, struct list_head *uf) @@ -1085,7 +1086,6 @@ unsigned long do_mmap(struct file *file, struct vm_area_struct *vma; struct vm_region *region; struct rb_node *rb; - vm_flags_t vm_flags; unsigned long capabilities, result; int ret; @@ -1104,7 +1104,7 @@ unsigned long do_mmap(struct file *file, /* we've determined that we can make the mapping, now translate what we * now know into VMA flags */ - vm_flags = determine_vm_flags(file, prot, flags, capabilities); + vm_flags |= determine_vm_flags(file, prot, flags, capabilities); /* we're going to need to record the mapping */ region = kmem_cache_zalloc(vm_region_jar, GFP_KERNEL); diff --git a/mm/util.c b/mm/util.c index 5ef378a2a038..beb8b881c080 100644 --- a/mm/util.c +++ b/mm/util.c @@ -503,7 +503,7 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr, if (!ret) { if (mmap_write_lock_killable(mm)) return -EINTR; - ret = do_mmap(file, addr, len, prot, flag, pgoff, &populate, + ret = do_mmap(file, addr, len, prot, flag, 0, pgoff, &populate, &uf); mmap_write_unlock(mm); userfaultfd_unmap_complete(mm, &uf);