From patchwork Wed Jul 18 20:14:42 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Yu-cheng Yu
X-Patchwork-Id: 10533223
Message-ID: <1531944882.10738.1.camel@intel.com>
Subject: Re: [RFC PATCH v2 16/27] mm: Modify can_follow_write_pte/pmd for shadow stack
From: Yu-cheng Yu
To: Dave Hansen, x86@kernel.org, "H. Peter Anvin", Thomas Gleixner,
 Ingo Molnar, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org,
 Arnd Bergmann, Andy Lutomirski, Balbir Singh, Cyrill Gorcunov,
 Florian Weimer, "H.J. Lu", Jann Horn, Jonathan Corbet, Kees Cook,
 Mike Kravetz, Nadav Amit, Oleg Nesterov, Pavel Machek, Peter Zijlstra,
 "Ravi V. Shankar", Vedvyas Shanbhogue
Date: Wed, 18 Jul 2018 13:14:42 -0700
References: <20180710222639.8241-1-yu-cheng.yu@intel.com>
 <20180710222639.8241-17-yu-cheng.yu@intel.com>
 <1531328731.15351.3.camel@intel.com>
 <45a85b01-e005-8cb6-af96-b23ce9b5fca7@linux.intel.com>
 <1531868610.3541.21.camel@intel.com>

On Tue, 2018-07-17 at 16:15 -0700, Dave Hansen wrote:
> On 07/17/2018 04:03 PM, Yu-cheng Yu wrote:
> >
> > We need to find a way to differentiate "someone can write to this PTE"
> > from "the write bit is set in this PTE".
> Please think about this:
>
> Should pte_write() tell us whether PTE.W=1, or should it tell us
> that *something* can write to the PTE, which would include
> PTE.W=0/D=1?

Is it better now?

Subject: [PATCH] mm: Modify can_follow_write_pte/pmd for shadow stack

can_follow_write_pte/pmd look for the (RO & DIRTY) PTE/PMD to
verify a non-sharing RO page still exists after a broken COW.

However, a shadow stack PTE is always RO & DIRTY; it can be:

  RO & DIRTY_HW - is_shstk_pte(pte) is true; or
  RO & DIRTY_SW - the page is being shared.

Update these functions to check a non-sharing shadow stack page
still exists after the COW.

Also rename can_follow_write_pte/pmd() to can_follow_write() to
make their meaning clear; i.e. "Can we write to the page?", not
"Is the PTE writable?"

Signed-off-by: Yu-cheng Yu
---
 mm/gup.c         | 38 ++++++++++++++++++++++++++++++++++----
 mm/huge_memory.c | 19 ++++++++++++++-----
 2 files changed, 48 insertions(+), 9 deletions(-)

--
diff --git a/mm/gup.c b/mm/gup.c
index fc5f98069f4e..316967996232 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -63,11 +63,41 @@ static int follow_pfn_pte(struct vm_area_struct *vma, unsigned long address,
 /*
  * FOLL_FORCE can write to even unwritable pte's, but only
  * after we've gone through a COW cycle and they are dirty.
+ *
+ * Background:
+ *
+ * When we force-write to a read-only page, the page fault
+ * handler copies the page and sets the new page's PTE to
+ * RO & DIRTY.  This routine tells
+ *
+ *     "Can we write to the page?"
+ *
+ * by checking:
+ *
+ *     (1) The page has been copied, i.e. FOLL_COW is set;
+ *     (2) The copy still exists and its PTE is RO & DIRTY.
+ *
+ * However, a shadow stack PTE is always RO & DIRTY; it can
+ * be:
+ *
+ *     RO & DIRTY_HW: when is_shstk_pte(pte) is true; or
+ *     RO & DIRTY_SW: when the page is being shared.
+ *
+ * To test a shadow stack's non-sharing page still exists,
+ * we verify that the new page's PTE is_shstk_pte(pte).
  */
-static inline bool can_follow_write_pte(pte_t pte, unsigned int flags)
+static inline bool can_follow_write(pte_t pte, unsigned int flags,
+				    struct vm_area_struct *vma)
 {
-	return pte_write(pte) ||
-		((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte));
+	if (!is_shstk_mapping(vma->vm_flags)) {
+		if (pte_write(pte))
+			return true;
+		return ((flags & FOLL_FORCE) && (flags & FOLL_COW) &&
+			pte_dirty(pte));
+	} else {
+		return ((flags & FOLL_FORCE) && (flags & FOLL_COW) &&
+			is_shstk_pte(pte));
+	}
 }
 
 static struct page *follow_page_pte(struct vm_area_struct *vma,
@@ -105,7 +135,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 	}
 	if ((flags & FOLL_NUMA) && pte_protnone(pte))
 		goto no_page;
-	if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, flags)) {
+	if ((flags & FOLL_WRITE) && !can_follow_write(pte, flags, vma)) {
 		pte_unmap_unlock(ptep, ptl);
 		return NULL;
 	}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 7f3e11d3b64a..822a563678b5 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1388,11 +1388,20 @@ int do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd)
 /*
  * FOLL_FORCE can write to even unwritable pmd's, but only
  * after we've gone through a COW cycle and they are dirty.
+ * See comments in mm/gup.c, can_follow_write().
  */
-static inline bool can_follow_write_pmd(pmd_t pmd, unsigned int flags)
-{
-	return pmd_write(pmd) ||
-	       ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pmd_dirty(pmd));
+static inline bool can_follow_write(pmd_t pmd, unsigned int flags,
+				    struct vm_area_struct *vma)
+{
+	if (!is_shstk_mapping(vma->vm_flags)) {
+		if (pmd_write(pmd))
+			return true;
+		return ((flags & FOLL_FORCE) && (flags & FOLL_COW) &&
+			pmd_dirty(pmd));
+	} else {
+		return ((flags & FOLL_FORCE) && (flags & FOLL_COW) &&
+			is_shstk_pmd(pmd));
+	}
 }
 
 struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
@@ -1405,7 +1414,7 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
 
 	assert_spin_locked(pmd_lockptr(mm, pmd));
 
-	if (flags & FOLL_WRITE && !can_follow_write_pmd(*pmd, flags))
+	if (flags & FOLL_WRITE && !can_follow_write(*pmd, flags, vma))
 		goto out;
 
 	/* Avoid dumping huge zero page */
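
Note for readers of the thread: the decision table that the new
can_follow_write() encodes may be easier to check in a small user-space
model. The sketch below is not kernel code. Every sk_-prefixed name, the
struct layout and the flag values are hypothetical stand-ins invented for
illustration. It also assumes that pte_dirty() reports either dirty bit
and that is_shstk_pte() means "W=0 and DIRTY_HW=1", which is how the
commit message describes a non-shared shadow stack PTE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; the real FOLL_* constants differ. */
#define SK_FOLL_FORCE 0x1u
#define SK_FOLL_COW   0x2u

/* Hypothetical model of the PTE bits this patch cares about. */
struct sk_pte {
	bool write;    /* PTE.W */
	bool dirty_hw; /* hardware dirty bit */
	bool dirty_sw; /* software dirty bit (page is being shared) */
};

/* Assumption: "dirty" means either dirty bit is set. */
static inline bool sk_pte_dirty(struct sk_pte pte)
{
	return pte.dirty_hw || pte.dirty_sw;
}

/* Assumption: a non-shared shadow stack PTE is RO & DIRTY_HW. */
static inline bool sk_is_shstk_pte(struct sk_pte pte)
{
	return !pte.write && pte.dirty_hw;
}

/* Mirrors the branch structure of the patched can_follow_write(). */
static inline bool sk_can_follow_write(struct sk_pte pte, unsigned int flags,
				       bool shstk_mapping)
{
	bool forced_cow = (flags & SK_FOLL_FORCE) && (flags & SK_FOLL_COW);

	if (!shstk_mapping)
		return pte.write || (forced_cow && sk_pte_dirty(pte));

	/* Shadow stack: W is never set, so only a forced COW can pass. */
	return forced_cow && sk_is_shstk_pte(pte);
}

/* Walks the cases named in the commit message. */
static inline void sk_self_check(void)
{
	const unsigned int fc = SK_FOLL_FORCE | SK_FOLL_COW;
	struct sk_pte rw    = { .write = true };
	struct sk_pte ro_hw = { .dirty_hw = true };
	struct sk_pte ro_sw = { .dirty_sw = true };

	assert(sk_can_follow_write(rw, 0, false));      /* plain writable */
	assert(sk_can_follow_write(ro_sw, fc, false));  /* broken COW */
	assert(!sk_can_follow_write(ro_sw, SK_FOLL_FORCE, false));
	assert(sk_can_follow_write(ro_hw, fc, true));   /* private shstk */
	assert(!sk_can_follow_write(ro_sw, fc, true));  /* shared shstk */
	assert(!sk_can_follow_write(ro_hw, 0, true));   /* no forced COW */
}
```

Calling sk_self_check() from a test harness exercises the table. The case
that motivates the patch is the fifth one: under the old pte_dirty() test
a shared shadow stack page (RO & DIRTY_SW) would have been accepted after
FOLL_COW, while the is_shstk_pte() test rejects it.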