From patchwork Thu May 27 23:51:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 12285773 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EA23FC4708A for ; Thu, 27 May 2021 23:57:37 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9227261006 for ; Thu, 27 May 2021 23:57:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9227261006 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 348956B006E; Thu, 27 May 2021 19:57:37 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 31F956B0070; Thu, 27 May 2021 19:57:37 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 197106B0071; Thu, 27 May 2021 19:57:37 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0073.hostedemail.com [216.40.44.73]) by kanga.kvack.org (Postfix) with ESMTP id D6B306B006E for ; Thu, 27 May 2021 19:57:36 -0400 (EDT) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 81D8E1812143D for ; Thu, 27 May 2021 23:57:36 +0000 (UTC) X-FDA: 78188675712.30.6DA86FD Received: from mga18.intel.com (mga18.intel.com 
[134.134.136.126]) by imf01.hostedemail.com (Postfix) with ESMTP id 700F8503BD13 for ; Thu, 27 May 2021 23:57:27 +0000 (UTC) IronPort-SDR: 1wVit2CjKwqFkSe/pPnNqzhYhpfFmjcHR4NnQEqedoUQ5Khy6ZS5+bC3uwgwjYwbaRjHgvEegi jkkfW9GPYLSg== X-IronPort-AV: E=McAfee;i="6200,9189,9997"; a="190229886" X-IronPort-AV: E=Sophos;i="5.83,228,1616482800"; d="scan'208";a="190229886" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 May 2021 16:57:34 -0700 IronPort-SDR: yD/NbNm0YpzjCdn1XHFOqeXCAlkAigTM8S/w8sGOQLElDaLB+al68vr387z6prMfvaj6YU57xc eytDWg4WXIVA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,228,1616482800"; d="scan'208";a="397944347" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by orsmga003.jf.intel.com with ESMTP; 27 May 2021 16:57:33 -0700 Subject: [PATCH 1/5] x86/pkeys: move read_pkru() and write_pkru() To: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org,Dave Hansen ,tglx@linutronix.de,mingo@redhat.com,bp@alien8.de,x86@kernel.org,luto@kernel.org,shuah@kernel.org,babu.moger@amd.com,dave.kleikamp@oracle.com,linuxram@us.ibm.com,bauerman@linux.ibm.com,bigeasy@linutronix.de From: Dave Hansen Date: Thu, 27 May 2021 16:51:11 -0700 References: <20210527235109.B2A9F45F@viggo.jf.intel.com> In-Reply-To: <20210527235109.B2A9F45F@viggo.jf.intel.com> Message-Id: <20210527235111.11857E30@viggo.jf.intel.com> Authentication-Results: imf01.hostedemail.com; dkim=none; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=intel.com (policy=none); spf=none (imf01.hostedemail.com: domain of dave.hansen@linux.intel.com has no SPF policy when checking 134.134.136.126) smtp.mailfrom=dave.hansen@linux.intel.com X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 700F8503BD13 X-Stat-Signature: 97r96js933w3zyeteqb6m3f3sgrjxzd7 X-HE-Tag: 1622159847-635959 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: 
owner-majordomo@kvack.org List-ID: From: Dave Hansen write_pkru() was originally used just to write to the PKRU register. It was mercifully short and sweet and was not out of place in pgtable.h with some other pkey-related code. But, later work included a requirement to also modify the task XSAVE buffer when updating the register. This really is more related to the XSAVE architecture than to paging. Move the read/write_pkru() to asm/fpu/xstate.h. pgtable.h won't miss them. Signed-off-by: Dave Hansen Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: x86@kernel.org Cc: Andy Lutomirski Cc: Shuah Khan Cc: Babu Moger Cc: Dave Kleikamp Cc: Ram Pai Cc: Thiago Jung Bauermann Cc: Sebastian Andrzej Siewior --- b/arch/x86/include/asm/fpu/xstate.h | 29 +++++++++++++++++++++++++++++ b/arch/x86/include/asm/pgtable.h | 29 ----------------------------- b/arch/x86/kernel/process_64.c | 1 + b/arch/x86/kvm/svm/sev.c | 1 + b/arch/x86/kvm/x86.c | 1 + b/arch/x86/mm/pkeys.c | 1 + 6 files changed, 33 insertions(+), 29 deletions(-) diff -puN arch/x86/include/asm/fpu/xstate.h~move-write_pkru arch/x86/include/asm/fpu/xstate.h --- a/arch/x86/include/asm/fpu/xstate.h~move-write_pkru 2021-05-27 16:40:23.110705472 -0700 +++ b/arch/x86/include/asm/fpu/xstate.h 2021-05-27 16:40:23.132705472 -0700 @@ -6,6 +6,7 @@ #include #include +#include #include /* Bit 63 of XCR0 is reserved for future expansion */ @@ -116,4 +117,32 @@ void copy_kernel_to_dynamic_supervisor(s /* Validate an xstate header supplied by userspace (ptrace or sigreturn) */ int validate_user_xstate_header(const struct xstate_header *hdr); +static inline u32 read_pkru(void) +{ + if (boot_cpu_has(X86_FEATURE_OSPKE)) + return rdpkru(); + return 0; +} + +static inline void write_pkru(u32 pkru) +{ + struct pkru_state *pk; + + if (!boot_cpu_has(X86_FEATURE_OSPKE)) + return; + + pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU); + + /* + * The PKRU value in xstate needs to be in sync with the value that is + * written 
to the CPU. The FPU restore on return to userland would + * otherwise load the previous value again. + */ + fpregs_lock(); + if (pk) + pk->pkru = pkru; + __write_pkru(pkru); + fpregs_unlock(); +} + #endif diff -puN arch/x86/include/asm/pgtable.h~move-write_pkru arch/x86/include/asm/pgtable.h --- a/arch/x86/include/asm/pgtable.h~move-write_pkru 2021-05-27 16:40:23.114705472 -0700 +++ b/arch/x86/include/asm/pgtable.h 2021-05-27 16:40:23.135705472 -0700 @@ -126,35 +126,6 @@ static inline int pte_dirty(pte_t pte) return pte_flags(pte) & _PAGE_DIRTY; } - -static inline u32 read_pkru(void) -{ - if (boot_cpu_has(X86_FEATURE_OSPKE)) - return rdpkru(); - return 0; -} - -static inline void write_pkru(u32 pkru) -{ - struct pkru_state *pk; - - if (!boot_cpu_has(X86_FEATURE_OSPKE)) - return; - - pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU); - - /* - * The PKRU value in xstate needs to be in sync with the value that is - * written to the CPU. The FPU restore on return to userland would - * otherwise load the previous value again. 
- */ - fpregs_lock(); - if (pk) - pk->pkru = pkru; - __write_pkru(pkru); - fpregs_unlock(); -} - static inline int pte_young(pte_t pte) { return pte_flags(pte) & _PAGE_ACCESSED; diff -puN arch/x86/kernel/process_64.c~move-write_pkru arch/x86/kernel/process_64.c --- a/arch/x86/kernel/process_64.c~move-write_pkru 2021-05-27 16:40:23.117705472 -0700 +++ b/arch/x86/kernel/process_64.c 2021-05-27 16:40:23.138705472 -0700 @@ -41,6 +41,7 @@ #include #include +#include #include #include #include diff -puN arch/x86/kvm/svm/sev.c~move-write_pkru arch/x86/kvm/svm/sev.c --- a/arch/x86/kvm/svm/sev.c~move-write_pkru 2021-05-27 16:40:23.121705472 -0700 +++ b/arch/x86/kvm/svm/sev.c 2021-05-27 16:40:23.142705472 -0700 @@ -16,6 +16,7 @@ #include #include #include +#include #include #include diff -puN arch/x86/kvm/x86.c~move-write_pkru arch/x86/kvm/x86.c --- a/arch/x86/kvm/x86.c~move-write_pkru 2021-05-27 16:40:23.125705472 -0700 +++ b/arch/x86/kvm/x86.c 2021-05-27 16:40:23.150705472 -0700 @@ -66,6 +66,7 @@ #include #include #include +#include #include /* Ugh! */ #include #include diff -puN arch/x86/mm/pkeys.c~move-write_pkru arch/x86/mm/pkeys.c --- a/arch/x86/mm/pkeys.c~move-write_pkru 2021-05-27 16:40:23.128705472 -0700 +++ b/arch/x86/mm/pkeys.c 2021-05-27 16:40:23.151705472 -0700 @@ -10,6 +10,7 @@ #include /* boot_cpu_has, ... 
*/ #include /* vma_pkey() */ +#include /* read/write_pkru() */ #include /* init_fpstate */ int __execute_only_pkey(struct mm_struct *mm) From patchwork Thu May 27 23:51:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 12285775 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5E6CC47089 for ; Thu, 27 May 2021 23:57:39 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 72D6361006 for ; Thu, 27 May 2021 23:57:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 72D6361006 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D0A9C6B0070; Thu, 27 May 2021 19:57:38 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CCF776B0071; Thu, 27 May 2021 19:57:38 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7D6676B0072; Thu, 27 May 2021 19:57:38 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0194.hostedemail.com [216.40.44.194]) by kanga.kvack.org (Postfix) with ESMTP id 3955C6B0070 for ; Thu, 27 May 2021 19:57:38 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id D0CD2ABF4 for 
; Thu, 27 May 2021 23:57:37 +0000 (UTC) Subject: [PATCH 2/5] x86/pkeys: rename write_pkru() To: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org,Dave Hansen ,tglx@linutronix.de,mingo@redhat.com,bp@alien8.de,x86@kernel.org,luto@kernel.org,shuah@kernel.org,babu.moger@amd.com,dave.kleikamp@oracle.com,linuxram@us.ibm.com,bauerman@linux.ibm.com,bigeasy@linutronix.de From: Dave Hansen Date: Thu, 27 May 2021 16:51:13 -0700 References: <20210527235109.B2A9F45F@viggo.jf.intel.com> In-Reply-To: <20210527235109.B2A9F45F@viggo.jf.intel.com> Message-Id: <20210527235113.C5DAFE12@viggo.jf.intel.com> X-Bogosity: Ham, tests=bogofilter, 
spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Dave Hansen write_pkru() was once concerned purely with writing to the PKRU register. However, the current task XSAVE buffer must also be updated in a coordinated way. Change the naming to reflect that this is an operation which applies to a task (current) and its state in addition to the register itself. Signed-off-by: Dave Hansen Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: x86@kernel.org Cc: Andy Lutomirski Cc: Shuah Khan Cc: Babu Moger Cc: Dave Kleikamp Cc: Ram Pai Cc: Thiago Jung Bauermann Cc: Sebastian Andrzej Siewior --- b/arch/x86/include/asm/fpu/xstate.h | 6 +++++- b/arch/x86/kernel/fpu/xstate.c | 2 +- b/arch/x86/mm/pkeys.c | 2 +- 3 files changed, 7 insertions(+), 3 deletions(-) diff -puN arch/x86/include/asm/fpu/xstate.h~rename-write_pkru arch/x86/include/asm/fpu/xstate.h --- a/arch/x86/include/asm/fpu/xstate.h~rename-write_pkru 2021-05-27 16:40:24.618705468 -0700 +++ b/arch/x86/include/asm/fpu/xstate.h 2021-05-27 16:40:24.631705468 -0700 @@ -124,7 +124,11 @@ static inline u32 read_pkru(void) return 0; } -static inline void write_pkru(u32 pkru) +/* + * Update all of the PKRU state for the current task: + * PKRU register and PKRU xstate. 
+ */ +static inline void current_write_pkru(u32 pkru) { struct pkru_state *pk; diff -puN arch/x86/kernel/fpu/xstate.c~rename-write_pkru arch/x86/kernel/fpu/xstate.c --- a/arch/x86/kernel/fpu/xstate.c~rename-write_pkru 2021-05-27 16:40:24.620705468 -0700 +++ b/arch/x86/kernel/fpu/xstate.c 2021-05-27 16:40:24.633705468 -0700 @@ -1026,7 +1026,7 @@ int arch_set_user_pkey_access(struct tas old_pkru &= ~((PKRU_AD_BIT|PKRU_WD_BIT) << pkey_shift); /* Write old part along with new part: */ - write_pkru(old_pkru | new_pkru_bits); + current_write_pkru(old_pkru | new_pkru_bits); return 0; } diff -puN arch/x86/mm/pkeys.c~rename-write_pkru arch/x86/mm/pkeys.c --- a/arch/x86/mm/pkeys.c~rename-write_pkru 2021-05-27 16:40:24.626705468 -0700 +++ b/arch/x86/mm/pkeys.c 2021-05-27 16:40:24.634705468 -0700 @@ -139,7 +139,7 @@ void copy_init_pkru_to_fpregs(void) * Override the PKRU state that came from 'init_fpstate' * with the baseline from the process. */ - write_pkru(init_pkru_value_snapshot); + current_write_pkru(init_pkru_value_snapshot); } static ssize_t init_pkru_read_file(struct file *file, char __user *user_buf, From patchwork Thu May 27 23:51:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 12285781 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AFFEFC4708B for ; Thu, 27 May 2021 23:57:41 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3C33F60233 for ; Thu, 27 May 2021 23:57:41 +0000 (UTC) 
Subject: [PATCH 3/5] x86/pkeys: skip 
'init_pkru' debugfs file creation when pkeys not supported To: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org,Dave Hansen ,tglx@linutronix.de,mingo@redhat.com,bp@alien8.de,x86@kernel.org,luto@kernel.org,shuah@kernel.org,babu.moger@amd.com,dave.kleikamp@oracle.com,linuxram@us.ibm.com,bauerman@linux.ibm.com,bigeasy@linutronix.de From: Dave Hansen Date: Thu, 27 May 2021 16:51:16 -0700 References: <20210527235109.B2A9F45F@viggo.jf.intel.com> In-Reply-To: <20210527235109.B2A9F45F@viggo.jf.intel.com> Message-Id: <20210527235116.E5A35872@viggo.jf.intel.com> Authentication-Results: imf01.hostedemail.com; dkim=none; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=intel.com (policy=none); spf=none (imf01.hostedemail.com: domain of dave.hansen@linux.intel.com has no SPF policy when checking 134.134.136.126) smtp.mailfrom=dave.hansen@linux.intel.com X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: DC9E2503BD15 X-Stat-Signature: 7assdg4numutudir9ee69g5ky8e9wzkt X-HE-Tag: 1622159849-802249 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Dave Hansen The PKRU hardware is permissive by default: all reads and writes are allowed. The in-kernel policy is restrictive by default: deny all unnecessary access until explicitly requested. That policy can be modified with a debugfs file: "x86/init_pkru". This file is created unconditionally, regardless of PKRU support in the hardware, which is a little silly. Avoid creating the file when pkeys are not available. This also removes the need to check for pkey support at runtime, which would be required once the new pkey modification infrastructure is put in place later in this series. 
Signed-off-by: Dave Hansen Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: x86@kernel.org Cc: Andy Lutomirski Cc: Shuah Khan Cc: Babu Moger Cc: Dave Kleikamp Cc: Ram Pai Cc: Thiago Jung Bauermann Cc: Sebastian Andrzej Siewior --- b/arch/x86/mm/pkeys.c | 4 ++++ 1 file changed, 4 insertions(+) diff -puN arch/x86/mm/pkeys.c~x86-pkeys-skip-debugfs-file arch/x86/mm/pkeys.c --- a/arch/x86/mm/pkeys.c~x86-pkeys-skip-debugfs-file 2021-05-27 16:40:25.847705465 -0700 +++ b/arch/x86/mm/pkeys.c 2021-05-27 16:40:25.852705465 -0700 @@ -193,6 +193,10 @@ static const struct file_operations fops static int __init create_init_pkru_value(void) { + /* Do not expose the file if pkeys are not supported. */ + if (!cpu_feature_enabled(X86_FEATURE_OSPKE)) + return 0; + debugfs_create_file("init_pkru", S_IRUSR | S_IWUSR, arch_debugfs_dir, NULL, &fops_init_pkru); return 0; From patchwork Thu May 27 23:51:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 12285777 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7F378C4707F for ; Thu, 27 May 2021 23:57:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 22A1760233 for ; Thu, 27 May 2021 23:57:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 22A1760233 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass 
smtp.mailfrom=owner-linux-mm@kvack.org Subject: [PATCH 4/5] x86/pkeys: replace PKRU modification infrastructure To: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org,Dave Hansen 
,tglx@linutronix.de,mingo@redhat.com,bp@alien8.de,x86@kernel.org,luto@kernel.org,shuah@kernel.org,babu.moger@amd.com,dave.kleikamp@oracle.com,linuxram@us.ibm.com,bauerman@linux.ibm.com,bigeasy@linutronix.de From: Dave Hansen Date: Thu, 27 May 2021 16:51:18 -0700 References: <20210527235109.B2A9F45F@viggo.jf.intel.com> In-Reply-To: <20210527235109.B2A9F45F@viggo.jf.intel.com> Message-Id: <20210527235118.88C9831B@viggo.jf.intel.com> X-Rspamd-Queue-Id: EBD7CC007753 Authentication-Results: imf22.hostedemail.com; dkim=none; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=intel.com (policy=none); spf=none (imf22.hostedemail.com: domain of dave.hansen@linux.intel.com has no SPF policy when checking 192.55.52.93) smtp.mailfrom=dave.hansen@linux.intel.com X-Rspamd-Server: rspam04 X-Stat-Signature: nay7eujrttfhwuggm58wxnxds4zi175o X-HE-Tag: 1622159852-723128 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Dave Hansen There are two points in the kernel which write to PKRU in a buggy way: * In switch_fpu_finish(), where having xfeatures[PKRU]=0 will result in PKRU being assigned 'init_pkru_value' instead of 0x0. * In write_pkru(), xfeatures[PKRU]=0 will result in PKRU having the correct value, but the XSAVE buffer will remain stale because xfeatures is not updated. Both of them screw up the fact that get_xsave_addr() will return NULL for PKRU when it is in the XSAVE "init state". This went unnoticed until now because on Intel hardware XINUSE[PKRU] is never 0 because of the kernel policy around 'init_pkru_value'. AMD hardware, on the other hand, can set XINUSE[PKRU]=0 via a normal WRPKRU(0). The handy selftests somewhat accidentally produced a case[2] where WRPKRU(0) occurs. get_xsave_addr() is a horrible interface[1], especially when used for writing state. It is too easy for callers to be tricked into thinking: 1. 
On a NULL return that they have no work to do 2. On a valid pointer return that they *can* safely write state without doing more work like setting an xfeatures bit. Wrap get_xsave_addr() with some additional infrastructure. Ensure that callers must declare their intent to read or write to the state. Inject the init state into both reads *and* writes. This ensures that writers never have to deal with detritus from previous state. The new common xstate infrastructure: xstatebuf_get_write_ptr() and xfeature_init_space() should be quite usable for other xfeatures with trivial updates to xfeature_init_space(). My hope is that we can move away from all use of get_xsave_addr(), replacing it with things like xstate_read_pkru(). The new BUG_ON()s are not great. But, they do represent a severe violation of expectations and XSAVE state can be security-sensitive and these represent a truly dazed-and-confused situation. 1. I know, I wrote it. I'm really sorry. 2. https://lore.kernel.org/linux-kselftest/b2e0324a-9125-bb34-9e76-81817df27c48@amd.com/ Signed-off-by: Dave Hansen Fixes: 0d714dba1626 ("x86/fpu: Update xstate's PKRU value on write_pkru()") Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: x86@kernel.org Cc: Andy Lutomirski Cc: Shuah Khan Cc: Babu Moger Cc: Dave Kleikamp Cc: Ram Pai Cc: Thiago Jung Bauermann Cc: Sebastian Andrzej Siewior --- b/arch/x86/include/asm/fpu/internal.h | 8 -- b/arch/x86/include/asm/fpu/xstate.h | 111 +++++++++++++++++++++++++++++++--- b/arch/x86/include/asm/processor.h | 7 ++ b/arch/x86/kernel/cpu/common.c | 6 - b/arch/x86/mm/pkeys.c | 6 - 5 files changed, 115 insertions(+), 23 deletions(-) diff -puN arch/x86/include/asm/fpu/internal.h~write_pkru arch/x86/include/asm/fpu/internal.h --- a/arch/x86/include/asm/fpu/internal.h~write_pkru 2021-05-27 16:40:26.903705463 -0700 +++ b/arch/x86/include/asm/fpu/internal.h 2021-05-27 16:40:26.919705463 -0700 @@ -564,7 +564,6 @@ static inline void switch_fpu_prepare(st static inline void 
switch_fpu_finish(struct fpu *new_fpu) { u32 pkru_val = init_pkru_value; - struct pkru_state *pk; if (!static_cpu_has(X86_FEATURE_FPU)) return; @@ -578,11 +577,8 @@ static inline void switch_fpu_finish(str * PKRU state is switched eagerly because it needs to be valid before we * return to userland e.g. for a copy_to_user() operation. */ - if (current->mm) { - pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU); - if (pk) - pkru_val = pk->pkru; - } + if (current->mm) + pkru_val = xstate_read_pkru(&new_fpu->state.xsave); __write_pkru(pkru_val); /* diff -puN arch/x86/include/asm/fpu/xstate.h~write_pkru arch/x86/include/asm/fpu/xstate.h --- a/arch/x86/include/asm/fpu/xstate.h~write_pkru 2021-05-27 16:40:26.906705463 -0700 +++ b/arch/x86/include/asm/fpu/xstate.h 2021-05-27 16:40:26.919705463 -0700 @@ -124,27 +124,124 @@ static inline u32 read_pkru(void) return 0; } +static inline void xfeature_mark_non_init(struct xregs_state *xstate, + int xfeature_nr) +{ + /* + * Caller will place data in the @xstate buffer. + * Mark the xfeature as non-init: + */ + xstate->header.xfeatures |= BIT_ULL(xfeature_nr); +} + + +/* Set the contents of @xfeature_nr to the hardware init state */ +static inline void xfeature_init_space(struct xregs_state *xstate, + int xfeature_nr) +{ + void *state = get_xsave_addr(xstate, xfeature_nr); + + switch (xfeature_nr) { + case XFEATURE_PKRU: + /* zero the whole state, including reserved bits */ + memset(state, 0, sizeof(struct pkru_state)); + break; + default: + BUG(); + } +} + +/* + * Called when it is necessary to write to an XSAVE + * component feature. Guarantees that a future + * XRSTOR of the 'xstate' buffer will not consider + * @xfeature_nr to be in its init state. + * + * The returned buffer may contain old state. The + * caller must be prepared to fill the entire buffer. + * + * Caller must first ensure that @xfeature_nr is + * enabled and present in the @xstate buffer. 
+ */ +static inline void *xstatebuf_get_write_ptr(struct xregs_state *xstate, + int xfeature_nr) +{ + bool feature_was_init = !(xstate->header.xfeatures & BIT_ULL(xfeature_nr)); + + /* + * xcomp_bv represents whether 'xstate' has space for + * features. If not, something is horribly wrong and + * a write would corrupt memory. Perhaps xfeature_nr + * was not enabled. + */ + BUG_ON(!(xstate->header.xcomp_bv & BIT_ULL(xfeature_nr))); + + /* + * Ensure a sane xfeature_nr, including avoiding + * confusion with XCOMP_BV_COMPACTED_FORMAT. + */ + BUG_ON(xfeature_nr >= XFEATURE_MAX); + + /* Prepare xstate for a write to the xfeature: */ + xfeature_mark_non_init(xstate, xfeature_nr); + + /* + * If xfeature_nr was in the init state, update the buffer + * to match the state. Ensures that callers can safely + * write only a part of the state, they are not forced to + * write it in its entirety. + */ + if (feature_was_init) + xfeature_init_space(xstate, xfeature_nr); + + return get_xsave_addr(xstate, xfeature_nr); +} + +/* Caller must ensure X86_FEATURE_OSPKE is enabled. */ +static inline void xstate_write_pkru(struct xregs_state *xstate, u32 pkru) +{ + struct pkru_state *pk; + + pk = xstatebuf_get_write_ptr(xstate, XFEATURE_PKRU); + pk->pkru = pkru; +} + +/* + * What PKRU value is represented in the 'xstate'? Note, + * this returns the *architecturally* represented value, + * not the literal in-memory value. They may be different. + */ +static inline u32 xstate_read_pkru(struct xregs_state *xstate) +{ + struct pkru_state *pk; + + pk = get_xsave_addr(xstate, XFEATURE_PKRU); + /* + * If present, pull PKRU out of the XSAVE buffer. + * Otherwise, use the hardware init value. + */ + if (pk) + return pk->pkru; + else + return PKRU_HW_INIT_VALUE; +} + /* * Update all of the PKRU state for the current task: * PKRU register and PKRU xstate. 
*/ static inline void current_write_pkru(u32 pkru) { - struct pkru_state *pk; - if (!boot_cpu_has(X86_FEATURE_OSPKE)) return; - pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU); - + fpregs_lock(); /* * The PKRU value in xstate needs to be in sync with the value that is * written to the CPU. The FPU restore on return to userland would * otherwise load the previous value again. */ - fpregs_lock(); - if (pk) - pk->pkru = pkru; + xstate_write_pkru(&current->thread.fpu.state.xsave, pkru); __write_pkru(pkru); fpregs_unlock(); } diff -puN arch/x86/include/asm/processor.h~write_pkru arch/x86/include/asm/processor.h --- a/arch/x86/include/asm/processor.h~write_pkru 2021-05-27 16:40:26.908705463 -0700 +++ b/arch/x86/include/asm/processor.h 2021-05-27 16:40:26.921705463 -0700 @@ -854,4 +854,11 @@ enum mds_mitigations { MDS_MITIGATION_VMWERV, }; +/* + * The XSAVE architecture defines an "init state" for + * PKRU. PKRU is set to this value by XRSTOR when it + * tries to restore PKRU but has no value in the buffer. + */ +#define PKRU_HW_INIT_VALUE 0x0 + #endif /* _ASM_X86_PROCESSOR_H */ diff -puN arch/x86/kernel/cpu/common.c~write_pkru arch/x86/kernel/cpu/common.c --- a/arch/x86/kernel/cpu/common.c~write_pkru 2021-05-27 16:40:26.912705463 -0700 +++ b/arch/x86/kernel/cpu/common.c 2021-05-27 16:40:26.924705463 -0700 @@ -466,8 +466,6 @@ static bool pku_disabled; static __always_inline void setup_pku(struct cpuinfo_x86 *c) { - struct pkru_state *pk; - /* check the boot processor, plus compile options for PKU: */ if (!cpu_feature_enabled(X86_FEATURE_PKU)) return; @@ -478,9 +476,7 @@ static __always_inline void setup_pku(st return; cr4_set_bits(X86_CR4_PKE); - pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU); - if (pk) - pk->pkru = init_pkru_value; + xstate_write_pkru(&current->thread.fpu.state.xsave, init_pkru_value); /* * Seting X86_CR4_PKE will cause the X86_FEATURE_OSPKE * cpuid bit to be set. 
We need to ensure that we diff -puN arch/x86/mm/pkeys.c~write_pkru arch/x86/mm/pkeys.c --- a/arch/x86/mm/pkeys.c~write_pkru 2021-05-27 16:40:26.914705463 -0700 +++ b/arch/x86/mm/pkeys.c 2021-05-27 16:40:26.926705463 -0700 @@ -155,7 +155,6 @@ static ssize_t init_pkru_read_file(struc static ssize_t init_pkru_write_file(struct file *file, const char __user *user_buf, size_t count, loff_t *ppos) { - struct pkru_state *pk; char buf[32]; ssize_t len; u32 new_init_pkru; @@ -178,10 +177,7 @@ static ssize_t init_pkru_write_file(stru return -EINVAL; WRITE_ONCE(init_pkru_value, new_init_pkru); - pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU); - if (!pk) - return -EINVAL; - pk->pkru = new_init_pkru; + xstate_write_pkru(&init_fpstate.xsave, new_init_pkru); return count; } From patchwork Thu May 27 23:51:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 12285779 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 953FDC4708C for ; Thu, 27 May 2021 23:57:45 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 43B5960233 for ; Thu, 27 May 2021 23:57:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 43B5960233 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 6F1876B0073; Thu, 27 May 2021 19:57:44 
-0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6C7588D0001; Thu, 27 May 2021 19:57:44 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 590C86B0075; Thu, 27 May 2021 19:57:44 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0234.hostedemail.com [216.40.44.234]) by kanga.kvack.org (Postfix) with ESMTP id 2277C6B0073 for ; Thu, 27 May 2021 19:57:44 -0400 (EDT) Received: from smtpin33.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id B7A7F1813E4E0 for ; Thu, 27 May 2021 23:57:43 +0000 (UTC) X-FDA: 78188676006.33.DEDD2DC Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by imf19.hostedemail.com (Postfix) with ESMTP id 1810790012EB for ; Thu, 27 May 2021 23:57:32 +0000 (UTC) IronPort-SDR: xw1sT/LLC9LJCCDGv6E3kQTciytAH02pXJUBNiImC630UBY0x4PhhBkJozXuj7f2a4IN9YgnJf IropaVlMz8Ag== X-IronPort-AV: E=McAfee;i="6200,9189,9997"; a="266746138" X-IronPort-AV: E=Sophos;i="5.83,228,1616482800"; d="scan'208";a="266746138" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 May 2021 16:57:41 -0700 IronPort-SDR: WhFUaJMdRoZHktc0K9pH1n1J+PqBSQSOClzMtQtUf/siDr6PqPItak+rH85FtQokhyAvUSdrjv h0lCEipzg22w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,228,1616482800"; d="scan'208";a="477705335" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by orsmga001.jf.intel.com with ESMTP; 27 May 2021 16:57:41 -0700 Subject: [PATCH 5/5] selftests/vm/pkeys: exercise x86 XSAVE init state To: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org,Dave Hansen ,tglx@linutronix.de,mingo@redhat.com,bp@alien8.de,x86@kernel.org,luto@kernel.org,shuah@kernel.org,babu.moger@amd.com,dave.kleikamp@oracle.com,linuxram@us.ibm.com,bauerman@linux.ibm.com,bigeasy@linutronix.de From: Dave Hansen Date: Thu, 27 
May 2021 16:51:19 -0700 References: <20210527235109.B2A9F45F@viggo.jf.intel.com> In-Reply-To: <20210527235109.B2A9F45F@viggo.jf.intel.com> Message-Id: <20210527235119.9D443084@viggo.jf.intel.com> X-Rspamd-Queue-Id: 1810790012EB Authentication-Results: imf19.hostedemail.com; dkim=none; spf=none (imf19.hostedemail.com: domain of dave.hansen@linux.intel.com has no SPF policy when checking 134.134.136.100) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=intel.com (policy=none) X-Rspamd-Server: rspam03 X-Stat-Signature: rhzgjyty5krkt7imeypo3ytkcmnh4t5u X-HE-Tag: 1622159852-391916 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Dave Hansen On x86, there is a set of instructions used to save and restore register state collectively known as the XSAVE architecture. There are about a dozen different features managed with XSAVE. The protection keys register, PKRU, is one of those features. The hardware optimizes XSAVE by tracking when the state has not changed from its initial (init) state. In this case, it can avoid the cost of writing state to memory (it would usually just be a bunch of 0's). When the pkey register is 0x0, the hardware may optionally choose to track the register as being in the init state (optimizing away the writes). AMD CPUs do this more aggressively than Intel CPUs. On x86, PKRU is rarely in its (very permissive) init state. Instead, the value defaults to something very restrictive. It is not surprising that bugs have popped up in the rare cases when PKRU reaches its init state. Add a protection key selftest which gets the protection keys register into its init state in a way that should work on Intel and AMD. Then, do a bunch of pkey register reads to watch for inadvertent changes. 
This adds "-mxsave" to CFLAGS for all the x86 vm selftests in order to allow use of the XSAVE instruction __builtin functions. This will make the builtins available on all of the vm selftests, but is expected to be harmless. Signed-off-by: Dave Hansen Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: x86@kernel.org Cc: Andy Lutomirski Cc: Shuah Khan Cc: Babu Moger Cc: Dave Kleikamp Cc: Ram Pai Cc: Thiago Jung Bauermann Cc: Sebastian Andrzej Siewior --- b/tools/testing/selftests/vm/Makefile | 4 - b/tools/testing/selftests/vm/pkey-x86.h | 1 b/tools/testing/selftests/vm/protection_keys.c | 71 +++++++++++++++++++++++++ 3 files changed, 74 insertions(+), 2 deletions(-) diff -puN tools/testing/selftests/vm/Makefile~init-pkru-selftest tools/testing/selftests/vm/Makefile --- a/tools/testing/selftests/vm/Makefile~init-pkru-selftest 2021-05-27 16:40:28.299705459 -0700 +++ b/tools/testing/selftests/vm/Makefile 2021-05-27 16:40:28.315705459 -0700 @@ -99,7 +99,7 @@ $(1) $(1)_64: $(OUTPUT)/$(1)_64 endef ifeq ($(CAN_BUILD_I386),1) -$(BINARIES_32): CFLAGS += -m32 +$(BINARIES_32): CFLAGS += -m32 -mxsave $(BINARIES_32): LDLIBS += -lrt -ldl -lm $(BINARIES_32): $(OUTPUT)/%_32: %.c $(CC) $(CFLAGS) $(EXTRA_CFLAGS) $(notdir $^) $(LDLIBS) -o $@ @@ -107,7 +107,7 @@ $(foreach t,$(TARGETS),$(eval $(call gen endif ifeq ($(CAN_BUILD_X86_64),1) -$(BINARIES_64): CFLAGS += -m64 +$(BINARIES_64): CFLAGS += -m64 -mxsave $(BINARIES_64): LDLIBS += -lrt -ldl $(BINARIES_64): $(OUTPUT)/%_64: %.c $(CC) $(CFLAGS) $(EXTRA_CFLAGS) $(notdir $^) $(LDLIBS) -o $@ diff -puN tools/testing/selftests/vm/pkey-x86.h~init-pkru-selftest tools/testing/selftests/vm/pkey-x86.h --- a/tools/testing/selftests/vm/pkey-x86.h~init-pkru-selftest 2021-05-27 16:40:28.301705459 -0700 +++ b/tools/testing/selftests/vm/pkey-x86.h 2021-05-27 16:40:28.315705459 -0700 @@ -126,6 +126,7 @@ static inline u32 pkey_bit_position(int #define XSTATE_PKEY_BIT (9) #define XSTATE_PKEY 0x200 +#define XSTATE_BV_OFFSET 512 int 
pkey_reg_xstate_offset(void) { diff -puN tools/testing/selftests/vm/protection_keys.c~init-pkru-selftest tools/testing/selftests/vm/protection_keys.c --- a/tools/testing/selftests/vm/protection_keys.c~init-pkru-selftest 2021-05-27 16:40:28.303705459 -0700 +++ b/tools/testing/selftests/vm/protection_keys.c 2021-05-27 16:40:28.314705459 -0700 @@ -1278,6 +1278,76 @@ void test_pkey_alloc_exhaust(int *ptr, u } } +void arch_force_pkey_reg_init(void) +{ +#if defined(__i386__) || defined(__x86_64__) /* arch */ + u64 *buf; + + /* + * All keys should be allocated and set to allow reads and + * writes, so the register should be all 0. If not, just + * skip the test. + */ + if (read_pkey_reg()) + return; + + /* + * Just allocate an absurd amount of memory rather than + * doing the XSAVE size enumeration dance. + */ + buf = mmap(NULL, 1*MB, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0); + + /* These __builtins require compiling with -mxsave */ + + /* XSAVE to build a valid buffer: */ + __builtin_ia32_xsave(buf, XSTATE_PKEY); + /* Clear XSTATE_BV[PKRU]: */ + buf[XSTATE_BV_OFFSET/sizeof(u64)] &= ~XSTATE_PKEY; + /* XRSTOR will likely get PKRU back to the init state: */ + __builtin_ia32_xrstor(buf, XSTATE_PKEY); + + munmap(buf, 1*MB); +#endif +} + + +/* + * This is mostly useless on ppc for now. But it will not + * hurt anything and should give some better coverage as + * a long-running test that continually checks the pkey + * register. + */ +void test_pkey_init_state(int *ptr, u16 pkey) +{ + int err; + int allocated_pkeys[NR_PKEYS] = {0}; + int nr_allocated_pkeys = 0; + int i; + + for (i = 0; i < NR_PKEYS*3; i++) { + int new_pkey = alloc_pkey(); + if (new_pkey >= 0) + allocated_pkeys[nr_allocated_pkeys++] = new_pkey; + } + + dprintf3("%s()::%d\n", __func__, __LINE__); + + arch_force_pkey_reg_init(); + + /* + * Loop for a bit, hoping to exercise the kernel + * context switch code. 
+ */ + for (i = 0; i < 1000000; i++) + read_pkey_reg(); + + for (i = 0; i < nr_allocated_pkeys; i++) { + err = sys_pkey_free(allocated_pkeys[i]); + pkey_assert(!err); + read_pkey_reg(); /* for shadow checking */ + } +} + /* * pkey 0 is special. It is allocated by default, so you do not * have to call pkey_alloc() to use it first. Make sure that it @@ -1502,6 +1572,7 @@ void (*pkey_tests[])(int *ptr, u16 pkey) test_implicit_mprotect_exec_only_memory, test_mprotect_with_pkey_0, test_ptrace_of_child, + test_pkey_init_state, test_pkey_syscalls_on_non_allocated_pkey, test_pkey_syscalls_bad_args, test_pkey_alloc_exhaust,
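The XSTATE_BV manipulation in arch_force_pkey_reg_init() can be sketched on an ordinary memory buffer. The sketch below executes no XSAVE or XRSTOR instruction, so it runs on any CPU. It only shows the header arithmetic: XSTATE_BV is the first 64-bit word of the XSAVE header, which follows the 512-byte legacy area, and PKRU is feature bit 9. The two constants are the ones from the selftest patch. The mock_* names, MOCK_XSAVE_BYTES, and the starting XSTATE_BV value are assumptions made up for illustration, not kernel or selftest code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the selftest patch above. */
#define XSTATE_PKEY_BIT		9
#define XSTATE_PKEY		(1ULL << XSTATE_PKEY_BIT)	/* 0x200 */
#define XSTATE_BV_OFFSET	512	/* header follows the legacy area */

/* Hypothetical size: 512-byte legacy area plus the 64-byte XSAVE header. */
#define MOCK_XSAVE_BYTES	(XSTATE_BV_OFFSET + 64)

/* Hypothetical helper: read XSTATE_BV from an XSAVE-shaped buffer. */
static uint64_t mock_xstate_bv(const uint64_t *buf)
{
	return buf[XSTATE_BV_OFFSET / sizeof(uint64_t)];
}

/* Hypothetical helper: clear feature bits in XSTATE_BV. */
static void mock_xstate_bv_clear(uint64_t *buf, uint64_t feature_mask)
{
	buf[XSTATE_BV_OFFSET / sizeof(uint64_t)] &= ~feature_mask;
}

/*
 * Pretend XSAVE recorded x87 (bit 0), SSE (bit 1) and PKRU (bit 9)
 * as present, clear the PKRU bit the way the selftest does, and
 * return the resulting XSTATE_BV.
 */
static uint64_t mock_force_pkru_init(void)
{
	uint64_t buf[MOCK_XSAVE_BYTES / sizeof(uint64_t)];

	memset(buf, 0, sizeof(buf));
	buf[XSTATE_BV_OFFSET / sizeof(uint64_t)] = 0x3 | XSTATE_PKEY;

	/* Sanity check: PKRU is marked present before the clear. */
	assert(mock_xstate_bv(buf) & XSTATE_PKEY);

	mock_xstate_bv_clear(buf, XSTATE_PKEY);
	return mock_xstate_bv(buf);
}
```

Calling mock_force_pkru_init() returns 0x3: the x87 and SSE bits survive and only the PKRU bit is cleared. In the real selftest, an XRSTOR of such a buffer with PKRU in the requested-feature mask is what resets PKRU to its init value.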