From patchwork Mon Nov 19 21:47:56 2018
X-Patchwork-Submitter: Yu-cheng Yu
X-Patchwork-Id: 10689539
From: Yu-cheng Yu
To: x86@kernel.org, "H. Peter Anvin", Thomas Gleixner, Ingo Molnar,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org,
    Arnd Bergmann, Andy Lutomirski, Balbir Singh, Cyrill Gorcunov,
    Dave Hansen, Eugene Syromiatnikov, Florian Weimer, "H.J. Lu",
    Jann Horn, Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit,
    Oleg Nesterov, Pavel Machek, Peter Zijlstra, Randy Dunlap,
    "Ravi V. Shankar", Vedvyas Shanbhogue
Cc: Yu-cheng Yu
Subject: [RFC PATCH v6 13/26] x86/mm: Modify ptep_set_wrprotect and pmdp_set_wrprotect for _PAGE_DIRTY_SW
Date: Mon, 19 Nov 2018 13:47:56 -0800
Message-Id: <20181119214809.6086-14-yu-cheng.yu@intel.com>
In-Reply-To: <20181119214809.6086-1-yu-cheng.yu@intel.com>
References: <20181119214809.6086-1-yu-cheng.yu@intel.com>

When Shadow Stack is enabled, the [R/O + PAGE_DIRTY_HW] setting is
reserved only for the Shadow Stack.  Non-Shadow Stack R/O PTEs use
[R/O + PAGE_DIRTY_SW].

When a PTE goes from [R/W + PAGE_DIRTY_HW] to [R/O + PAGE_DIRTY_SW],
it could become a transient Shadow Stack PTE in two cases.
The first case is that some processors can start a write but end up
seeing a read-only PTE by the time they get to the Dirty bit, creating
a transient Shadow Stack PTE.  However, this will not occur on
processors that support Shadow Stack, so no TLB flush is needed here.

The second case is that software, testing and replacing PAGE_DIRTY_HW
with PAGE_DIRTY_SW without an atomic operation, can create a transient
Shadow Stack PTE.  This is prevented with cmpxchg.

Dave Hansen, Jann Horn, Andy Lutomirski, and Peter Zijlstra provided
many insights into the issue.  Jann Horn provided the cmpxchg solution.

Signed-off-by: Yu-cheng Yu
---
 arch/x86/include/asm/pgtable.h | 58 ++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index db4b9d22d2f7..cf0c50ef53d8 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1202,7 +1202,36 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
 static inline void ptep_set_wrprotect(struct mm_struct *mm,
 				      unsigned long addr, pte_t *ptep)
 {
+#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER
+	pte_t new_pte, pte = READ_ONCE(*ptep);
+
+	/*
+	 * Some processors can start a write, but end up
+	 * seeing a read-only PTE by the time they get
+	 * to the Dirty bit.  In this case, they will
+	 * set the Dirty bit, leaving a read-only, Dirty
+	 * PTE which looks like a Shadow Stack PTE.
+	 *
+	 * However, this behavior has been improved and
+	 * will not occur on processors supporting
+	 * Shadow Stacks.  Without this guarantee, a
+	 * transition to a non-present PTE and flush the
+	 * TLB would be needed.
+	 *
+	 * When changing a writable PTE to read-only and
+	 * if the PTE has _PAGE_DIRTY_HW set, we move
+	 * that bit to _PAGE_DIRTY_SW so that the PTE is
+	 * not a valid Shadow Stack PTE.
+	 */
+	do {
+		new_pte = pte_wrprotect(pte);
+		new_pte.pte |= (new_pte.pte & _PAGE_DIRTY_HW) >>
+			       _PAGE_BIT_DIRTY_HW << _PAGE_BIT_DIRTY_SW;
+		new_pte.pte &= ~_PAGE_DIRTY_HW;
+	} while (!try_cmpxchg(ptep, &pte, new_pte));
+#else
 	clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte);
+#endif
 }

 #define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
@@ -1265,7 +1294,36 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm,
 static inline void pmdp_set_wrprotect(struct mm_struct *mm,
 				      unsigned long addr, pmd_t *pmdp)
 {
+#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER
+	pmd_t new_pmd, pmd = READ_ONCE(*pmdp);
+
+	/*
+	 * Some processors can start a write, but end up
+	 * seeing a read-only PMD by the time they get
+	 * to the Dirty bit.  In this case, they will
+	 * set the Dirty bit, leaving a read-only, Dirty
+	 * PMD which looks like a Shadow Stack PMD.
+	 *
+	 * However, this behavior has been improved and
+	 * will not occur on processors supporting
+	 * Shadow Stacks.  Without this guarantee, a
+	 * transition to a non-present PMD and flush the
+	 * TLB would be needed.
+	 *
+	 * When changing a writable PMD to read-only and
+	 * if the PMD has _PAGE_DIRTY_HW set, we move
+	 * that bit to _PAGE_DIRTY_SW so that the PMD is
+	 * not a valid Shadow Stack PMD.
+	 */
+	do {
+		new_pmd = pmd_wrprotect(pmd);
+		new_pmd.pmd |= (new_pmd.pmd & _PAGE_DIRTY_HW) >>
+			       _PAGE_BIT_DIRTY_HW << _PAGE_BIT_DIRTY_SW;
+		new_pmd.pmd &= ~_PAGE_DIRTY_HW;
+	} while (!try_cmpxchg(pmdp, &pmd, new_pmd));
+#else
 	clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp);
+#endif
 }

 #define pud_write pud_write
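
For readers following the bit manipulation above, below is a minimal,
illustrative user-space sketch of the same wrprotect idea: clear R/W,
move a set hardware Dirty bit into the software Dirty position, and only
install the new value with a compare-and-exchange so a transient
[R/O + DIRTY_HW] (shadow-stack-looking) value never becomes visible.
The bit positions _PAGE_BIT_RW, _PAGE_BIT_DIRTY_HW and _PAGE_BIT_DIRTY_SW
below are placeholder assumptions for demonstration only, and GCC's
__atomic_compare_exchange_n() stands in for the kernel's try_cmpxchg();
the authoritative definitions are in this patch series, not in this sketch.

/*
 * Illustrative user-space sketch only -- NOT kernel code.
 * Bit positions are assumed values for demonstration.
 */
#include <stdint.h>
#include <stdio.h>

#define _PAGE_BIT_RW		1	/* assumed position of R/W */
#define _PAGE_BIT_DIRTY_HW	6	/* assumed position of hardware Dirty */
#define _PAGE_BIT_DIRTY_SW	58	/* assumed position of software Dirty */

#define _PAGE_RW		(1ULL << _PAGE_BIT_RW)
#define _PAGE_DIRTY_HW		(1ULL << _PAGE_BIT_DIRTY_HW)
#define _PAGE_DIRTY_SW		(1ULL << _PAGE_BIT_DIRTY_SW)

/* Clear R/W and move a set DIRTY_HW bit into the DIRTY_SW position. */
static uint64_t wrprotect_value(uint64_t pte)
{
	uint64_t new_pte = pte & ~_PAGE_RW;

	new_pte |= (new_pte & _PAGE_DIRTY_HW) >>
		   _PAGE_BIT_DIRTY_HW << _PAGE_BIT_DIRTY_SW;
	new_pte &= ~_PAGE_DIRTY_HW;
	return new_pte;
}

/*
 * Retry loop analogous to the try_cmpxchg() loop in the patch: the new
 * value is installed only if the PTE did not change underneath us.  On
 * failure, 'pte' is reloaded with the current value and we retry.
 */
static void set_wrprotect(uint64_t *ptep)
{
	uint64_t pte = __atomic_load_n(ptep, __ATOMIC_RELAXED);

	while (!__atomic_compare_exchange_n(ptep, &pte, wrprotect_value(pte),
					    0, __ATOMIC_RELAXED,
					    __ATOMIC_RELAXED))
		;
}

int main(void)
{
	uint64_t pte = _PAGE_RW | _PAGE_DIRTY_HW;	/* writable, dirty */

	set_wrprotect(&pte);
	printf("RW=%d DIRTY_HW=%d DIRTY_SW=%d\n",
	       !!(pte & _PAGE_RW),
	       !!(pte & _PAGE_DIRTY_HW),
	       !!(pte & _PAGE_DIRTY_SW));
	/* Expected output: RW=0 DIRTY_HW=0 DIRTY_SW=1 */
	return 0;
}

The compare-and-exchange loop is what distinguishes this from the plain
clear_bit() path: clearing R/W and moving the Dirty bit in two separate
stores could momentarily expose a read-only, hardware-dirty entry.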