From patchwork Fri Sep 21 15:03:36 2018
X-Patchwork-Submitter: Yu-cheng Yu <yu-cheng.yu@intel.com>
X-Patchwork-Id: 10610241
From: Yu-cheng Yu <yu-cheng.yu@intel.com>
To: x86@kernel.org, "H. Peter Anvin", Thomas Gleixner, Ingo Molnar,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org,
    Arnd Bergmann, Andy Lutomirski, Balbir Singh, Cyrill Gorcunov,
    Dave Hansen, Florian Weimer, "H.J. Lu", Jann Horn, Jonathan Corbet,
    Kees Cook, Mike Kravetz, Nadav Amit, Oleg Nesterov, Pavel Machek,
    Peter Zijlstra, Randy Dunlap, "Ravi V. Shankar", Vedvyas Shanbhogue
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Subject: [RFC PATCH v4 12/27] x86/mm: Modify ptep_set_wrprotect and pmdp_set_wrprotect for _PAGE_DIRTY_SW
Date: Fri, 21 Sep 2018 08:03:36 -0700
Message-Id: <20180921150351.20898-13-yu-cheng.yu@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180921150351.20898-1-yu-cheng.yu@intel.com>
References: <20180921150351.20898-1-yu-cheng.yu@intel.com>

When Shadow Stack is enabled, the [R/O + PAGE_DIRTY_HW] setting is
reserved only for the Shadow Stack.  For non-Shadow Stack R/O PTEs, we
use [R/O + PAGE_DIRTY_SW].

When a PTE goes from [R/W + PAGE_DIRTY_HW] to [R/O + PAGE_DIRTY_SW], it
could become a transient Shadow Stack PTE in two cases.
The first case is that some processors can start a write but end up
seeing a read-only PTE by the time they get to the Dirty bit, creating
a transient Shadow Stack PTE.  However, this will not occur on
processors supporting Shadow Stack, and therefore we don't need a TLB
flush here.

The second case is that, when the software non-atomically tests and
replaces PAGE_DIRTY_HW with PAGE_DIRTY_SW, a transient Shadow Stack PTE
can exist.  This is prevented with cmpxchg.

Dave Hansen, Jann Horn, Andy Lutomirski, and Peter Zijlstra provided
many insights into the issue.  Jann Horn provided the cmpxchg solution.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/include/asm/pgtable.h | 58 ++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 3ee554d81480..b6e0ee5c5503 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1203,7 +1203,36 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
 static inline void ptep_set_wrprotect(struct mm_struct *mm,
 				      unsigned long addr, pte_t *ptep)
 {
+#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER
+	pte_t new_pte, pte = READ_ONCE(*ptep);
+
+	/*
+	 * Some processors can start a write, but end up
+	 * seeing a read-only PTE by the time they get
+	 * to the Dirty bit.  In this case, they will
+	 * set the Dirty bit, leaving a read-only, Dirty
+	 * PTE which looks like a Shadow Stack PTE.
+	 *
+	 * However, this behavior has been improved and
+	 * will not occur on processors supporting
+	 * Shadow Stacks.  Without this guarantee, a
+	 * transition to a non-present PTE and a TLB
+	 * flush would be needed.
+	 *
+	 * When changing a writable PTE to read-only and
+	 * if the PTE has _PAGE_DIRTY_HW set, we move
+	 * that bit to _PAGE_DIRTY_SW so that the PTE is
+	 * not a valid Shadow Stack PTE.
+	 */
+	do {
+		new_pte = pte_wrprotect(pte);
+		new_pte.pte |= (new_pte.pte & _PAGE_DIRTY_HW) >>
+			       _PAGE_BIT_DIRTY_HW << _PAGE_BIT_DIRTY_SW;
+		new_pte.pte &= ~_PAGE_DIRTY_HW;
+	} while (!try_cmpxchg(ptep, &pte, new_pte));
+#else
 	clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte);
+#endif
 }
 
 #define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
@@ -1266,7 +1295,36 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm,
 static inline void pmdp_set_wrprotect(struct mm_struct *mm,
 				      unsigned long addr, pmd_t *pmdp)
 {
+#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER
+	pmd_t new_pmd, pmd = READ_ONCE(*pmdp);
+
+	/*
+	 * Some processors can start a write, but end up
+	 * seeing a read-only PMD by the time they get
+	 * to the Dirty bit.  In this case, they will
+	 * set the Dirty bit, leaving a read-only, Dirty
+	 * PMD which looks like a Shadow Stack PMD.
+	 *
+	 * However, this behavior has been improved and
+	 * will not occur on processors supporting
+	 * Shadow Stacks.  Without this guarantee, a
+	 * transition to a non-present PMD and a TLB
+	 * flush would be needed.
+	 *
+	 * When changing a writable PMD to read-only and
+	 * if the PMD has _PAGE_DIRTY_HW set, we move
+	 * that bit to _PAGE_DIRTY_SW so that the PMD is
+	 * not a valid Shadow Stack PMD.
+	 */
+	do {
+		new_pmd = pmd_wrprotect(pmd);
+		new_pmd.pmd |= (new_pmd.pmd & _PAGE_DIRTY_HW) >>
+			       _PAGE_BIT_DIRTY_HW << _PAGE_BIT_DIRTY_SW;
+		new_pmd.pmd &= ~_PAGE_DIRTY_HW;
+	} while (!try_cmpxchg(pmdp, &pmd, new_pmd));
+#else
 	clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp);
+#endif
 }
 
 #define pud_write pud_write
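
As an aside for readers following the bit manipulation, below is a minimal
userspace C sketch of the same transformation applied to a plain 64-bit
word.  The bit positions, the set_wrprotect() helper, and the use of C11
atomics are hypothetical stand-ins chosen for illustration, not the
kernel's real _PAGE_BIT_* definitions or its try_cmpxchg(); the point is
only that clearing R/W and moving the hardware Dirty bit to the software
Dirty bit are published in a single compare-and-exchange, retried if the
entry changes underneath, so no transient [R/O + DIRTY_HW] value is ever
observable.

	/* Illustrative model only; bit positions are assumptions. */
	#include <stdatomic.h>
	#include <stdint.h>
	#include <stdio.h>

	#define BIT_RW        1   /* write-enable bit (assumed position) */
	#define BIT_DIRTY_HW  6   /* hardware-managed dirty bit (assumed) */
	#define BIT_DIRTY_SW  9   /* software dirty bit (assumed) */

	static void set_wrprotect(_Atomic uint64_t *ptep)
	{
		uint64_t pte = atomic_load(ptep);
		uint64_t new_pte;

		do {
			/* Clear R/W, as pte_wrprotect() would. */
			new_pte = pte & ~(1ULL << BIT_RW);
			/* Move the HW dirty bit to the SW dirty bit. */
			new_pte |= ((new_pte >> BIT_DIRTY_HW) & 1ULL) << BIT_DIRTY_SW;
			new_pte &= ~(1ULL << BIT_DIRTY_HW);
			/*
			 * Publish both changes atomically; on failure the
			 * loop reloads the current value into 'pte' and
			 * recomputes, so a transient read-only+DIRTY_HW
			 * entry is never written.
			 */
		} while (!atomic_compare_exchange_weak(ptep, &pte, new_pte));
	}

	int main(void)
	{
		_Atomic uint64_t pte = (1ULL << BIT_RW) | (1ULL << BIT_DIRTY_HW);

		set_wrprotect(&pte);
		printf("pte = %#llx\n", (unsigned long long)atomic_load(&pte));
		return 0;
	}

Running this prints a value with only the assumed DIRTY_SW bit set,
mirroring how the patch keeps write-protected-but-dirty PTEs/PMDs from
ever looking like Shadow Stack entries.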