From patchwork Fri Apr 26 04:51:25 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918035
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Shaohua Li , Denis Plotnikov , Andrea Arcangeli , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A .
Shutemov" , "Dr . David Alan Gilbert"
Subject: [PATCH v4 01/27] mm: gup: rename "nonblocking" to "locked" where proper
Date: Fri, 26 Apr 2019 12:51:25 +0800
Message-Id: <20190426045151.19556-2-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

There are plenty of places around __get_user_pages() with a parameter named "nonblocking" that does not really mean "it won't block" (it can block); what it actually indicates is whether the mmap_sem was released by up_read() during page fault handling, mostly when VM_FAULT_RETRY is returned. We already have the correct name, "locked", in e.g. get_user_pages_locked() and get_user_pages_remote(), but many places still use "nonblocking". Rename them to "locked" where appropriate to better suit the purpose of the variable. While at it, fix up some of the comments accordingly.

Reviewed-by: Mike Rapoport
Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
---
 mm/gup.c     | 44 +++++++++++++++++++++-----------------------
 mm/hugetlb.c |  8 ++++----
 2 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c index f84e22685aaa..a78d252d6358 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -509,12 +509,12 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address, } /* - * mmap_sem must be held on entry. If @nonblocking != NULL and - * *@flags does not include FOLL_NOWAIT, the mmap_sem may be released. - * If it is, *@nonblocking will be set to 0 and -EBUSY returned. + * mmap_sem must be held on entry. 
If @locked != NULL and *@flags + * does not include FOLL_NOWAIT, the mmap_sem may be released. If it + * is, *@locked will be set to 0 and -EBUSY returned. */ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma, - unsigned long address, unsigned int *flags, int *nonblocking) + unsigned long address, unsigned int *flags, int *locked) { unsigned int fault_flags = 0; vm_fault_t ret; @@ -526,7 +526,7 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma, fault_flags |= FAULT_FLAG_WRITE; if (*flags & FOLL_REMOTE) fault_flags |= FAULT_FLAG_REMOTE; - if (nonblocking) + if (locked) fault_flags |= FAULT_FLAG_ALLOW_RETRY; if (*flags & FOLL_NOWAIT) fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT; @@ -552,8 +552,8 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma, } if (ret & VM_FAULT_RETRY) { - if (nonblocking && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT)) - *nonblocking = 0; + if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT)) + *locked = 0; return -EBUSY; } @@ -630,7 +630,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags) * only intends to ensure the pages are faulted in. * @vmas: array of pointers to vmas corresponding to each page. * Or NULL if the caller does not require them. - * @nonblocking: whether waiting for disk IO or mmap_sem contention + * @locked: whether we're still with the mmap_sem held * * Returns number of pages pinned. This may be fewer than the number * requested. If nr_pages is 0 or negative, returns 0. If no pages @@ -659,13 +659,11 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags) * appropriate) must be called after the page is finished with, and * before put_page is called. * - * If @nonblocking != NULL, __get_user_pages will not wait for disk IO - * or mmap_sem contention, and if waiting is needed to pin all pages, - * *@nonblocking will be set to 0. 
Further, if @gup_flags does not - * include FOLL_NOWAIT, the mmap_sem will be released via up_read() in - * this case. + * If @locked != NULL, *@locked will be set to 0 when mmap_sem is + * released by an up_read(). That can happen if @gup_flags does not + * have FOLL_NOWAIT. * - * A caller using such a combination of @nonblocking and @gup_flags + * A caller using such a combination of @locked and @gup_flags * must therefore hold the mmap_sem for reading only, and recognize * when it's been released. Otherwise, it must be held for either * reading or writing and will not be released. @@ -677,7 +675,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags) static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, unsigned long start, unsigned long nr_pages, unsigned int gup_flags, struct page **pages, - struct vm_area_struct **vmas, int *nonblocking) + struct vm_area_struct **vmas, int *locked) { long ret = 0, i = 0; struct vm_area_struct *vma = NULL; @@ -721,7 +719,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, if (is_vm_hugetlb_page(vma)) { i = follow_hugetlb_page(mm, vma, pages, vmas, &start, &nr_pages, i, - gup_flags, nonblocking); + gup_flags, locked); continue; } } @@ -739,7 +737,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, page = follow_page_mask(vma, start, foll_flags, &ctx); if (!page) { ret = faultin_page(tsk, vma, start, &foll_flags, - nonblocking); + locked); switch (ret) { case 0: goto retry; @@ -1347,7 +1345,7 @@ EXPORT_SYMBOL(get_user_pages_longterm); * @vma: target vma * @start: start address * @end: end address - * @nonblocking: + * @locked: whether the mmap_sem is still held * * This takes care of mlocking the pages too if VM_LOCKED is set. * @@ -1355,14 +1353,14 @@ EXPORT_SYMBOL(get_user_pages_longterm); * * vma->vm_mm->mmap_sem must be held. 
* - * If @nonblocking is NULL, it may be held for read or write and will + * If @locked is NULL, it may be held for read or write and will * be unperturbed. * - * If @nonblocking is non-NULL, it must held for read only and may be - * released. If it's released, *@nonblocking will be set to 0. + * If @locked is non-NULL, it must held for read only and may be + * released. If it's released, *@locked will be set to 0. */ long populate_vma_page_range(struct vm_area_struct *vma, - unsigned long start, unsigned long end, int *nonblocking) + unsigned long start, unsigned long end, int *locked) { struct mm_struct *mm = vma->vm_mm; unsigned long nr_pages = (end - start) / PAGE_SIZE; @@ -1397,7 +1395,7 @@ long populate_vma_page_range(struct vm_area_struct *vma, * not result in a stack expansion that recurses back here. */ return __get_user_pages(current, mm, start, nr_pages, gup_flags, - NULL, NULL, nonblocking); + NULL, NULL, locked); } /* diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 97b1e0290c66..e77b56141f0c 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4191,7 +4191,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, struct page **pages, struct vm_area_struct **vmas, unsigned long *position, unsigned long *nr_pages, - long i, unsigned int flags, int *nonblocking) + long i, unsigned int flags, int *locked) { unsigned long pfn_offset; unsigned long vaddr = *position; @@ -4262,7 +4262,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, spin_unlock(ptl); if (flags & FOLL_WRITE) fault_flags |= FAULT_FLAG_WRITE; - if (nonblocking) + if (locked) fault_flags |= FAULT_FLAG_ALLOW_RETRY; if (flags & FOLL_NOWAIT) fault_flags |= FAULT_FLAG_ALLOW_RETRY | @@ -4279,9 +4279,9 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, break; } if (ret & VM_FAULT_RETRY) { - if (nonblocking && + if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT)) - 
*nonblocking = 0; + *locked = 0; *nr_pages = 0; /* * VM_FAULT_RETRY must not return an

From patchwork Fri Apr 26 04:51:26 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918039
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Shaohua Li , Denis
Plotnikov , Andrea Arcangeli , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . David Alan Gilbert"
Subject: [PATCH v4 02/27] mm: userfault: return VM_FAULT_RETRY on signals
Date: Fri, 26 Apr 2019 12:51:26 +0800
Message-Id: <20190426045151.19556-3-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

The idea comes from the upstream discussion between Linus and Andrea:

https://lkml.org/lkml/2017/10/30/560

A summary of the issue: handle_userfault() used to have a special path that returned VM_FAULT_NOPAGE when a non-fatal signal was detected while waiting for userfault handling, re-acquiring the mmap_sem before returning. That is risky because the vmas may have changed by the time we retake the mmap_sem, and we could even be holding an invalid vma structure.

This patch removes the special path, so we return VM_FAULT_RETRY through the common path even when such signals are pending. Then, for every architecture that passes FAULT_FLAG_ALLOW_RETRY into handle_mm_fault(), we check not only for SIGKILL but for all pending userspace signals right after handle_mm_fault() returns. This lets userspace handle non-fatal signals faster than before.

This patch is a preparation for the next patch, which finally removes the special code path mentioned above from handle_userfault().
Suggested-by: Linus Torvalds Suggested-by: Andrea Arcangeli Reviewed-by: Jerome Glisse Signed-off-by: Peter Xu --- arch/alpha/mm/fault.c | 2 +- arch/arc/mm/fault.c | 11 ++++------- arch/arm/mm/fault.c | 6 +++--- arch/arm64/mm/fault.c | 6 +++--- arch/hexagon/mm/vm_fault.c | 2 +- arch/ia64/mm/fault.c | 2 +- arch/m68k/mm/fault.c | 2 +- arch/microblaze/mm/fault.c | 2 +- arch/mips/mm/fault.c | 2 +- arch/nds32/mm/fault.c | 6 +++--- arch/nios2/mm/fault.c | 2 +- arch/openrisc/mm/fault.c | 2 +- arch/parisc/mm/fault.c | 2 +- arch/powerpc/mm/fault.c | 2 ++ arch/riscv/mm/fault.c | 4 ++-- arch/s390/mm/fault.c | 9 ++++++--- arch/sh/mm/fault.c | 4 ++++ arch/sparc/mm/fault_32.c | 3 +++ arch/sparc/mm/fault_64.c | 3 +++ arch/um/kernel/trap.c | 5 ++++- arch/unicore32/mm/fault.c | 4 ++-- arch/x86/mm/fault.c | 6 +++++- arch/xtensa/mm/fault.c | 3 +++ 23 files changed, 56 insertions(+), 34 deletions(-) diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c index 188fc9256baf..8a2ef90b4bfc 100644 --- a/arch/alpha/mm/fault.c +++ b/arch/alpha/mm/fault.c @@ -150,7 +150,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr, the fault. 
*/ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c index 8df1638259f3..9e9e6eb1f7d0 100644 --- a/arch/arc/mm/fault.c +++ b/arch/arc/mm/fault.c @@ -141,17 +141,14 @@ void do_page_fault(unsigned long address, struct pt_regs *regs) */ fault = handle_mm_fault(vma, address, flags); - if (fatal_signal_pending(current)) { - + if (unlikely((fault & VM_FAULT_RETRY) && signal_pending(current))) { + if (fatal_signal_pending(current) && !user_mode(regs)) + goto no_context; /* * if fault retry, mmap_sem already relinquished by core mm * so OK to return to user mode (with signal handled first) */ - if (fault & VM_FAULT_RETRY) { - if (!user_mode(regs)) - goto no_context; - return; - } + return; } perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index 58f69fa07df9..c41c021bbe40 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -314,12 +314,12 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) fault = __do_page_fault(mm, addr, fsr, flags, tsk); - /* If we need to retry but a fatal signal is pending, handle the + /* If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because * it would already be released in __lock_page_or_retry in * mm/filemap.c. 
*/ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - if (!user_mode(regs)) + if (unlikely(fault & VM_FAULT_RETRY && signal_pending(current))) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; return 0; } diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 1a7e92ab69eb..46c32d639fbf 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -512,13 +512,13 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, if (fault & VM_FAULT_RETRY) { /* - * If we need to retry but a fatal signal is pending, + * If we need to retry but a signal is pending, * handle the signal first. We do not need to release * the mmap_sem because it would already be released * in __lock_page_or_retry in mm/filemap.c. */ - if (fatal_signal_pending(current)) { - if (!user_mode(regs)) + if (signal_pending(current)) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; return 0; } diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c index eb263e61daf4..be10b441d9cc 100644 --- a/arch/hexagon/mm/vm_fault.c +++ b/arch/hexagon/mm/vm_fault.c @@ -104,7 +104,7 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; /* The most common case -- we are done. 
*/ diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index 5baeb022f474..62c2d39d2bed 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -163,7 +163,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index 9b6163c05a75..d9808a807ab8 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -138,7 +138,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, fault = handle_mm_fault(vma, address, flags); pr_debug("handle_mm_fault returns %x\n", fault); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return 0; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index 202ad6a494f5..4fd2dbd0c5ca 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -217,7 +217,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c index 73d8a0f0b810..92374fd091d2 100644 --- a/arch/mips/mm/fault.c +++ b/arch/mips/mm/fault.c @@ -154,7 +154,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); diff --git a/arch/nds32/mm/fault.c 
b/arch/nds32/mm/fault.c index 68d5f2a27f38..da777de8a62e 100644 --- a/arch/nds32/mm/fault.c +++ b/arch/nds32/mm/fault.c @@ -206,12 +206,12 @@ void do_page_fault(unsigned long entry, unsigned long addr, fault = handle_mm_fault(vma, addr, flags); /* - * If we need to retry but a fatal signal is pending, handle the + * If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because it * would already be released in __lock_page_or_retry in mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - if (!user_mode(regs)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; return; } diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c index 6a2e716b959f..bdb1f9db75ba 100644 --- a/arch/nios2/mm/fault.c +++ b/arch/nios2/mm/fault.c @@ -133,7 +133,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c index dc4dbafc1d83..873ecb5d82d7 100644 --- a/arch/openrisc/mm/fault.c +++ b/arch/openrisc/mm/fault.c @@ -165,7 +165,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index c8e8b7c05558..29422eec329d 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -303,7 +303,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, fault = handle_mm_fault(vma, address, flags); - if ((fault & 
VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 887f11bcf330..aaa853e6592f 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -591,6 +591,8 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, */ flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; + if (is_user && signal_pending(current)) + return 0; if (!fatal_signal_pending(current)) goto retry; } diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 88401d5125bc..4fc8d746bec3 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -123,11 +123,11 @@ asmlinkage void do_page_fault(struct pt_regs *regs) fault = handle_mm_fault(vma, addr, flags); /* - * If we need to retry but a fatal signal is pending, handle the + * If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because it * would already be released in __lock_page_or_retry in mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(tsk)) + if ((fault & VM_FAULT_RETRY) && signal_pending(tsk)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c index 11613362c4e7..aba1dad1efcd 100644 --- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -476,9 +476,12 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) * the fault. */ fault = handle_mm_fault(vma, address, flags); - /* No reason to continue if interrupted by SIGKILL. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - fault = VM_FAULT_SIGNAL; + /* Do not continue if interrupted by signals. 
*/ + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) { + if (fatal_signal_pending(current)) + fault = VM_FAULT_SIGNAL; + else + fault = 0; if (flags & FAULT_FLAG_RETRY_NOWAIT) goto out_up; goto out; diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c index 6defd2c6d9b1..baf5d73df40c 100644 --- a/arch/sh/mm/fault.c +++ b/arch/sh/mm/fault.c @@ -506,6 +506,10 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, * have already released it in __lock_page_or_retry * in mm/filemap.c. */ + + if (user_mode(regs) && signal_pending(tsk)) + return; + goto retry; } } diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c index b0440b0edd97..a2c83104fe35 100644 --- a/arch/sparc/mm/fault_32.c +++ b/arch/sparc/mm/fault_32.c @@ -269,6 +269,9 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write, * in mm/filemap.c. */ + if (user_mode(regs) && signal_pending(tsk)) + return; + goto retry; } } diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c index 8f8a604c1300..cad71ec5c7b3 100644 --- a/arch/sparc/mm/fault_64.c +++ b/arch/sparc/mm/fault_64.c @@ -467,6 +467,9 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs) * in mm/filemap.c. 
*/ + if (user_mode(regs) && signal_pending(current)) + return; + goto retry; } } diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c index 0e8b6158f224..05dcd4c5f0d5 100644 --- a/arch/um/kernel/trap.c +++ b/arch/um/kernel/trap.c @@ -76,8 +76,11 @@ int handle_page_fault(unsigned long address, unsigned long ip, fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) { + if (is_user && !fatal_signal_pending(current)) + err = 0; goto out_nosemaphore; + } if (unlikely(fault & VM_FAULT_ERROR)) { if (fault & VM_FAULT_OOM) { diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c index b9a3a50644c1..3611f19234a1 100644 --- a/arch/unicore32/mm/fault.c +++ b/arch/unicore32/mm/fault.c @@ -248,11 +248,11 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs) fault = __do_pf(mm, addr, fsr, flags, tsk); - /* If we need to retry but a fatal signal is pending, handle the + /* If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because * it would already be released in __lock_page_or_retry in * mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return 0; if (!(fault & VM_FAULT_ERROR) && (flags & FAULT_FLAG_ALLOW_RETRY)) { diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 667f1da36208..d9ca1ec26d40 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1481,16 +1481,20 @@ void do_user_addr_fault(struct pt_regs *regs, * that we made any progress. Handle this case first. 
*/ if (unlikely(fault & VM_FAULT_RETRY)) { + bool is_user = flags & FAULT_FLAG_USER; + /* Retry at most once */ if (flags & FAULT_FLAG_ALLOW_RETRY) { flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; + if (is_user && signal_pending(tsk)) + return; if (!fatal_signal_pending(tsk)) goto retry; } /* User mode? Just return to handle the fatal exception */ - if (flags & FAULT_FLAG_USER) + if (is_user) return; /* Not returning to user mode? Handle exceptions or die: */ diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c index 2ab0e0dcd166..792dad5e2f12 100644 --- a/arch/xtensa/mm/fault.c +++ b/arch/xtensa/mm/fault.c @@ -136,6 +136,9 @@ void do_page_fault(struct pt_regs *regs) * in mm/filemap.c. */ + if (user_mode(regs) && signal_pending(current)) + return; + goto retry; } }

From patchwork Fri Apr 26 04:51:27 2019
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, Kirill A. Shutemov, Dr. David Alan Gilbert
Subject: [PATCH v4 03/27] userfaultfd: don't retake mmap_sem to emulate NOPAGE
Date: Fri, 26 Apr 2019 12:51:27 +0800
Message-Id: <20190426045151.19556-4-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

The idea comes from the upstream discussion between Linus and Andrea:

https://lkml.org/lkml/2017/10/30/560

A summary of the issue: handle_userfault() used to have a special path that returned VM_FAULT_NOPAGE when a non-fatal signal was detected while waiting for the userfault to be handled, and it did so by reacquiring the mmap_sem before returning. That is risky: the VMAs may have changed by the time we retake the mmap_sem, and we could even end up holding a stale vma structure. This patch removes that risky path from handle_userfault(), so the callers of handle_mm_fault() can be sure that the VMAs may have changed. With the previous patch we do not lose responsiveness either, since the core mm code can now handle non-fatal userspace signals quickly even when we return VM_FAULT_RETRY.
Suggested-by: Andrea Arcangeli Suggested-by: Linus Torvalds Reviewed-by: Jerome Glisse Signed-off-by: Peter Xu --- fs/userfaultfd.c | 24 ------------------------ 1 file changed, 24 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 89800fc7dc9d..b397bc3b954d 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -514,30 +514,6 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) __set_current_state(TASK_RUNNING); - if (return_to_userland) { - if (signal_pending(current) && - !fatal_signal_pending(current)) { - /* - * If we got a SIGSTOP or SIGCONT and this is - * a normal userland page fault, just let - * userland return so the signal will be - * handled and gdb debugging works. The page - * fault code immediately after we return from - * this function is going to release the - * mmap_sem and it's not depending on it - * (unlike gup would if we were not to return - * VM_FAULT_RETRY). - * - * If a fatal signal is pending we still take - * the streamlined VM_FAULT_RETRY failure path - * and there's no need to retake the mmap_sem - * in such case. 
- */ - down_read(&mm->mmap_sem); - ret = VM_FAULT_NOPAGE; - } - } - /* * Here we race with the list_del; list_add in * userfaultfd_ctx_read(), however because we don't ever run

From patchwork Fri Apr 26 04:51:28 2019
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, Kirill A. Shutemov, Dr. David Alan Gilbert
Subject: [PATCH v4 04/27] mm: allow VM_FAULT_RETRY for multiple times
Date: Fri, 26 Apr 2019 12:51:28 +0800
Message-Id: <20190426045151.19556-5-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

The idea comes from a discussion between Linus and Andrea [1].

Before this patch we only allowed a page fault to retry once, by clearing the FAULT_FLAG_ALLOW_RETRY flag when calling handle_mm_fault() the second time. This was mainly to avoid unexpected starvation of the system by a fault looping forever on a single page. However that should hardly happen: every code path that returns VM_FAULT_RETRY first waits for some condition to change (possibly yielding the CPU in the meantime) before VM_FAULT_RETRY is actually returned.

This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY flag when we receive VM_FAULT_RETRY, so the page fault handler can now retry the fault multiple times if necessary without generating another page fault event. Meanwhile we still keep the FAULT_FLAG_TRIED flag, so the page fault handler can still tell whether a fault is the first attempt or not.
Then we'll have these combinations of fault flags (considering only the ALLOW_RETRY and TRIED flags):

- ALLOW_RETRY and !TRIED: the page fault is allowed to retry, and this is the first try
- ALLOW_RETRY and TRIED: the page fault is allowed to retry, and this is not the first try
- !ALLOW_RETRY and !TRIED: the page fault is not allowed to retry at all
- !ALLOW_RETRY and TRIED: this combination is forbidden and should never be used

Existing code has several places that take special care of the first case above by checking (fault_flags & FAULT_FLAG_ALLOW_RETRY) alone. Since even the second try now has ALLOW_RETRY set, this patch introduces a simple helper that detects the first attempt of a page fault by checking both (fault_flags & FAULT_FLAG_ALLOW_RETRY) and !(fault_flags & FAULT_FLAG_TRIED), and uses that helper in all the existing special paths. One example is __lock_page_or_retry(): we now drop the mmap_sem only on the first attempt of the page fault and keep it on follow-up retries, so the old locking behavior is retained.

This is a nice enhancement to the current code [2] and at the same time supporting material for the future userfaultfd-writeprotect work, since in that work there will always be an explicit userfault write-protect retry for protected pages, and if that cannot resolve the page fault (e.g., when userfaultfd-writeprotect is used in conjunction with swapped pages) we may need a third retry of the page fault. It might also benefit other potential users with similar requirements, like userfault write-protection.

GUP code is not touched yet and will be covered in a follow-up patch. Please read the thread below for more information.
[1] https://lkml.org/lkml/2017/11/2/833 [2] https://lkml.org/lkml/2018/12/30/64 Suggested-by: Linus Torvalds Suggested-by: Andrea Arcangeli Reviewed-by: Jerome Glisse Signed-off-by: Peter Xu --- arch/alpha/mm/fault.c | 2 +- arch/arc/mm/fault.c | 1 - arch/arm/mm/fault.c | 3 --- arch/arm64/mm/fault.c | 5 ---- arch/hexagon/mm/vm_fault.c | 1 - arch/ia64/mm/fault.c | 1 - arch/m68k/mm/fault.c | 3 --- arch/microblaze/mm/fault.c | 1 - arch/mips/mm/fault.c | 1 - arch/nds32/mm/fault.c | 1 - arch/nios2/mm/fault.c | 3 --- arch/openrisc/mm/fault.c | 1 - arch/parisc/mm/fault.c | 4 +--- arch/powerpc/mm/fault.c | 6 ----- arch/riscv/mm/fault.c | 5 ---- arch/s390/mm/fault.c | 5 +--- arch/sh/mm/fault.c | 1 - arch/sparc/mm/fault_32.c | 1 - arch/sparc/mm/fault_64.c | 1 - arch/um/kernel/trap.c | 1 - arch/unicore32/mm/fault.c | 4 +--- arch/x86/mm/fault.c | 2 -- arch/xtensa/mm/fault.c | 1 - drivers/gpu/drm/ttm/ttm_bo_vm.c | 12 +++++++--- include/linux/mm.h | 41 ++++++++++++++++++++++++++++++++- mm/filemap.c | 2 +- mm/shmem.c | 2 +- 27 files changed, 55 insertions(+), 56 deletions(-) diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c index 8a2ef90b4bfc..6a02c0fb36b9 100644 --- a/arch/alpha/mm/fault.c +++ b/arch/alpha/mm/fault.c @@ -169,7 +169,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; + flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would * have already released it in __lock_page_or_retry diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c index 9e9e6eb1f7d0..e7d2947ba72c 100644 --- a/arch/arc/mm/fault.c +++ b/arch/arc/mm/fault.c @@ -167,7 +167,6 @@ void do_page_fault(unsigned long address, struct pt_regs *regs) } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index c41c021bbe40..7910b4b5205d 100644 --- a/arch/arm/mm/fault.c +++ 
b/arch/arm/mm/fault.c @@ -342,9 +342,6 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) regs, addr); } if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 46c32d639fbf..bf8608805df9 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -523,12 +523,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, return 0; } - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk of - * starvation. - */ if (mm_flags & FAULT_FLAG_ALLOW_RETRY) { - mm_flags &= ~FAULT_FLAG_ALLOW_RETRY; mm_flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c index be10b441d9cc..576751597e77 100644 --- a/arch/hexagon/mm/vm_fault.c +++ b/arch/hexagon/mm/vm_fault.c @@ -115,7 +115,6 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index 62c2d39d2bed..9de95d39935e 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -189,7 +189,6 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index d9808a807ab8..b1b2109e4ab4 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -162,9 +162,6 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
*/ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index 4fd2dbd0c5ca..05a4847ac0bf 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -236,7 +236,6 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c index 92374fd091d2..9953b5b571df 100644 --- a/arch/mips/mm/fault.c +++ b/arch/mips/mm/fault.c @@ -178,7 +178,6 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, tsk->min_flt++; } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c index da777de8a62e..3642bdd7909d 100644 --- a/arch/nds32/mm/fault.c +++ b/arch/nds32/mm/fault.c @@ -242,7 +242,6 @@ void do_page_fault(unsigned long entry, unsigned long addr, 1, regs, addr); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c index bdb1f9db75ba..9d4961d51db4 100644 --- a/arch/nios2/mm/fault.c +++ b/arch/nios2/mm/fault.c @@ -157,9 +157,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
*/ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c index 873ecb5d82d7..ff92c5674781 100644 --- a/arch/openrisc/mm/fault.c +++ b/arch/openrisc/mm/fault.c @@ -185,7 +185,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, else tsk->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index 29422eec329d..675b221af198 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -327,14 +327,12 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; - /* * No need to up_read(&mm->mmap_sem) as we would * have already released it in __lock_page_or_retry * in mm/filemap.c. */ - + flags |= FAULT_FLAG_TRIED; goto retry; } } diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index aaa853e6592f..c831cb3ce03f 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -583,13 +583,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, * case. */ if (unlikely(fault & VM_FAULT_RETRY)) { - /* We retry only once */ if (flags & FAULT_FLAG_ALLOW_RETRY) { - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. - */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; if (is_user && signal_pending(current)) return 0; diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 4fc8d746bec3..aad2c0557d2f 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -154,11 +154,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs) 1, regs, addr); } if (fault & VM_FAULT_RETRY) { - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
- */
-	flags &= ~(FAULT_FLAG_ALLOW_RETRY);
 	flags |= FAULT_FLAG_TRIED;

 	/*
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index aba1dad1efcd..4e8c066964a9 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -513,10 +513,7 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
 				fault = VM_FAULT_PFAULT;
 				goto out_up;
 			}
-			/* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
-			 * of starvation. */
-			flags &= ~(FAULT_FLAG_ALLOW_RETRY |
-				   FAULT_FLAG_RETRY_NOWAIT);
+			flags &= ~FAULT_FLAG_RETRY_NOWAIT;
 			flags |= FAULT_FLAG_TRIED;
 			down_read(&mm->mmap_sem);
 			goto retry;
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index baf5d73df40c..cd710e2d7c57 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -498,7 +498,6 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
 				      regs, address);
 		}
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;

 			/*
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index a2c83104fe35..6735cd1c09b9 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -261,7 +261,6 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 				      1, regs, address);
 		}
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;

 			/* No need to up_read(&mm->mmap_sem) as we would
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index cad71ec5c7b3..28d5b4d012c6 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -459,7 +459,6 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 				      1, regs, address);
 		}
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;

 			/* No need to up_read(&mm->mmap_sem) as we would
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index 05dcd4c5f0d5..e7723c133c7f 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -99,7 +99,6 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 			else
 				current->min_flt++;
 			if (fault & VM_FAULT_RETRY) {
-				flags &= ~FAULT_FLAG_ALLOW_RETRY;
 				flags |= FAULT_FLAG_TRIED;

 				goto retry;
diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c
index 3611f19234a1..efca122b5ef7 100644
--- a/arch/unicore32/mm/fault.c
+++ b/arch/unicore32/mm/fault.c
@@ -261,9 +261,7 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 		else
 			tsk->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
-			/* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
-			 * of starvation. */
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
+			flags |= FAULT_FLAG_TRIED;
 			goto retry;
 		}
 	}
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index d9ca1ec26d40..462d26e5fa4d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1483,9 +1483,7 @@ void do_user_addr_fault(struct pt_regs *regs,
 	if (unlikely(fault & VM_FAULT_RETRY)) {
 		bool is_user = flags & FAULT_FLAG_USER;

-		/* Retry at most once */
 		if (flags & FAULT_FLAG_ALLOW_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 			if (is_user && signal_pending(tsk))
 				return;
diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index 792dad5e2f12..7cd55f2d66c9 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -128,7 +128,6 @@ void do_page_fault(struct pt_regs *regs)
 	else
 		current->min_flt++;
 	if (fault & VM_FAULT_RETRY) {
-		flags &= ~FAULT_FLAG_ALLOW_RETRY;
 		flags |= FAULT_FLAG_TRIED;

 		/* No need to up_read(&mm->mmap_sem) as we would
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index e86a29a1e51f..801d109f98ad 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -61,9 +61,10 @@ static vm_fault_t ttm_bo_vm_fault_idle(struct ttm_buffer_object *bo,
 		/*
 		 * If possible, avoid waiting for GPU with mmap_sem
-		 * held.
+		 * held.  We only do this if the fault allows retry and this
+		 * is the first attempt.
 		 */
-		if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
+		if (fault_flag_allow_retry_first(vmf->flags)) {
 			ret = VM_FAULT_RETRY;
 			if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)
 				goto out_unlock;
@@ -132,7 +133,12 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 	 * for the buffer to become unreserved.
 	 */
 	if (unlikely(!reservation_object_trylock(bo->resv))) {
-		if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
+		/*
+		 * If the fault allows retry and this is the first
+		 * fault attempt, we try to release the mmap_sem
+		 * before waiting
+		 */
+		if (fault_flag_allow_retry_first(vmf->flags)) {
 			if (!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
 				ttm_bo_get(bo);
 				up_read(&vmf->vma->vm_mm->mmap_sem);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 76769749b5a5..bad93704abc8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -337,16 +337,55 @@ extern unsigned int kobjsize(const void *objp);
  */
 extern pgprot_t protection_map[16];

+/*
+ * About FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_TRIED: we can specify whether we
+ * would allow page faults to retry by specifying these two fault flags
+ * correctly.  Currently there can be three legal combinations:
+ *
+ * (a) ALLOW_RETRY and !TRIED:  this means the page fault allows retry, and
+ *                              this is the first try
+ *
+ * (b) ALLOW_RETRY and TRIED:   this means the page fault allows retry, and
+ *                              we've already tried at least once
+ *
+ * (c) !ALLOW_RETRY and !TRIED: this means the page fault does not allow retry
+ *
+ * The unlisted combination (!ALLOW_RETRY && TRIED) is illegal and should never
+ * be used.  Note that page faults can be allowed to retry for multiple times,
+ * in which case we'll have an initial fault with flags (a) then later on
+ * continuous faults with flags (b).  We should always try to detect pending
+ * signals before a retry to make sure the continuous page faults can still be
+ * interrupted if necessary.
+ */
+
 #define FAULT_FLAG_WRITE	0x01	/* Fault was a write access */
 #define FAULT_FLAG_MKWRITE	0x02	/* Fault was mkwrite of existing pte */
 #define FAULT_FLAG_ALLOW_RETRY	0x04	/* Retry fault if blocking */
 #define FAULT_FLAG_RETRY_NOWAIT	0x08	/* Don't drop mmap_sem and wait when retrying */
 #define FAULT_FLAG_KILLABLE	0x10	/* The fault task is in SIGKILL killable region */
-#define FAULT_FLAG_TRIED	0x20	/* Second try */
+#define FAULT_FLAG_TRIED	0x20	/* We've tried once */
 #define FAULT_FLAG_USER		0x40	/* The fault originated in userspace */
 #define FAULT_FLAG_REMOTE	0x80	/* faulting for non current tsk/mm */
 #define FAULT_FLAG_INSTRUCTION	0x100	/* The fault was during an instruction fetch */

+/**
+ * fault_flag_allow_retry_first - check ALLOW_RETRY the first time
+ *
+ * This is mostly used for places where we want to try to avoid taking
+ * the mmap_sem for too long a time when waiting for another condition
+ * to change, in which case we can try to be polite to release the
+ * mmap_sem in the first round to avoid potential starvation of other
+ * processes that would also want the mmap_sem.
+ *
+ * Return: true if the page fault allows retry and this is the first
+ * attempt of the fault handling; false otherwise.
+ */
+static inline bool fault_flag_allow_retry_first(unsigned int flags)
+{
+	return (flags & FAULT_FLAG_ALLOW_RETRY) &&
+	       (!(flags & FAULT_FLAG_TRIED));
+}
+
 #define FAULT_FLAG_TRACE \
 	{ FAULT_FLAG_WRITE,		"WRITE" }, \
 	{ FAULT_FLAG_MKWRITE,		"MKWRITE" }, \
diff --git a/mm/filemap.c b/mm/filemap.c
index d78f577baef2..6871b7bc16c3 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1374,7 +1374,7 @@ EXPORT_SYMBOL_GPL(__lock_page_killable);
 int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 			 unsigned int flags)
 {
-	if (flags & FAULT_FLAG_ALLOW_RETRY) {
+	if (fault_flag_allow_retry_first(flags)) {
 		/*
 		 * CAUTION! In this case, mmap_sem is not released
 		 * even though return 0.
diff --git a/mm/shmem.c b/mm/shmem.c
index b3db3779a30a..a2d83ba745f8 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2014,7 +2014,7 @@ static vm_fault_t shmem_fault(struct vm_fault *vmf)
 			DEFINE_WAIT_FUNC(shmem_fault_wait, synchronous_wake_function);

 			ret = VM_FAULT_NOPAGE;
-			if ((vmf->flags & FAULT_FLAG_ALLOW_RETRY) &&
+			if (fault_flag_allow_retry_first(vmf->flags) &&
 			    !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
 				/* It's polite to up mmap_sem if we can */
 				up_read(&vma->vm_mm->mmap_sem);

From patchwork Fri Apr 26 04:51:29 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918047
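The include/linux/mm.h hunk in the patch above documents the three legal ALLOW_RETRY/TRIED combinations and introduces fault_flag_allow_retry_first(). A minimal user-space sketch of that logic — flag values copied from the hunk, everything else illustrative and not kernel code:

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values copied from the include/linux/mm.h hunk above. */
#define FAULT_FLAG_ALLOW_RETRY 0x04
#define FAULT_FLAG_TRIED       0x20

/* Same logic as the new kernel helper, compiled stand-alone:
 * true only for combination (a) — retry allowed, first attempt. */
static bool fault_flag_allow_retry_first(unsigned int flags)
{
	return (flags & FAULT_FLAG_ALLOW_RETRY) &&
	       !(flags & FAULT_FLAG_TRIED);
}
```

Callers that drop mmap_sem to be polite (ttm, shmem, filemap in the hunks above) switch to this helper so they only release the lock on the first attempt, avoiding livelock once retries can happen more than once.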
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
 Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
 Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz,
 Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
 "Dr. David Alan Gilbert"
Subject: [PATCH v4 05/27] mm: gup: allow VM_FAULT_RETRY for multiple times
Date: Fri, 26 Apr 2019 12:51:29 +0800
Message-Id: <20190426045151.19556-6-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

This is the gup counterpart of the change that allows VM_FAULT_RETRY to
happen more than once.

Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
---
 mm/gup.c     | 17 +++++++++++++----
 mm/hugetlb.c |  6 ++++--
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index a78d252d6358..46b1d1412364 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -531,7 +531,10 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
 	if (*flags & FOLL_NOWAIT)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;
 	if (*flags & FOLL_TRIED) {
-		VM_WARN_ON_ONCE(fault_flags & FAULT_FLAG_ALLOW_RETRY);
+		/*
+		 * Note: FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_TRIED
+		 * can co-exist
+		 */
 		fault_flags |= FAULT_FLAG_TRIED;
 	}
@@ -946,17 +949,23 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
 		/* VM_FAULT_RETRY triggered, so seek to the faulting offset */
 		pages += ret;
 		start += ret << PAGE_SHIFT;
+		lock_dropped = true;

+retry:
 		/*
 		 * Repeat on the address that fired VM_FAULT_RETRY
-		 * without FAULT_FLAG_ALLOW_RETRY but with
+		 * with both FAULT_FLAG_ALLOW_RETRY and
 		 * FAULT_FLAG_TRIED.
 		 */
 		*locked = 1;
-		lock_dropped = true;
 		down_read(&mm->mmap_sem);
 		ret = __get_user_pages(tsk, mm, start, 1, flags | FOLL_TRIED,
-				       pages, NULL, NULL);
+				       pages, NULL, locked);
+		if (!*locked) {
+			/* Continue to retry until we succeeded */
+			BUG_ON(ret != 0);
+			goto retry;
+		}
 		if (ret != 1) {
 			BUG_ON(ret > 1);
 			if (!pages_done)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e77b56141f0c..d14e2cc6f7c1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4268,8 +4268,10 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 				fault_flags |= FAULT_FLAG_ALLOW_RETRY |
 					FAULT_FLAG_RETRY_NOWAIT;
 			if (flags & FOLL_TRIED) {
-				VM_WARN_ON_ONCE(fault_flags &
-						FAULT_FLAG_ALLOW_RETRY);
+				/*
+				 * Note: FAULT_FLAG_ALLOW_RETRY and
+				 * FAULT_FLAG_TRIED can co-exist
+				 */
 				fault_flags |= FAULT_FLAG_TRIED;
 			}
 			ret = hugetlb_fault(mm, vma, vaddr, fault_flags);

From patchwork Fri Apr 26 04:51:30 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918049
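The new retry loop in __get_user_pages_locked() above keeps re-faulting the same address until the fault completes without dropping the lock. A stand-alone user-space sketch of that control flow — fake_fault() stands in for __get_user_pages() and the lock bookkeeping is purely illustrative:

```c
#include <assert.h>
#include <stdbool.h>

/* Pretend the fault path returns VM_FAULT_RETRY twice before
 * succeeding; this models repeated retries, now legal. */
static int attempts_needed = 3;

/* Stand-in for __get_user_pages(): on a retry it "drops the lock"
 * and returns 0; on success it returns the number of pages (1). */
static int fake_fault(bool *locked)
{
	if (--attempts_needed > 0) {
		*locked = false;  /* VM_FAULT_RETRY path: mmap_sem dropped */
		return 0;
	}
	*locked = true;
	return 1;
}

static int fault_one_page(void)
{
	bool locked;
	int ret;

retry:
	locked = true;            /* down_read(&mm->mmap_sem) */
	ret = fake_fault(&locked);
	if (!locked) {
		assert(ret == 0); /* mirrors the BUG_ON(ret != 0) */
		goto retry;       /* continue to retry until we succeed */
	}
	return ret;
}
```

The key change over the old code is exactly this loop: previously FOLL_TRIED forbade a second retry, whereas now ALLOW_RETRY and TRIED can co-exist, so the goto can fire any number of times.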
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
 Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
 Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz,
 Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
 "Dr. David Alan Gilbert", Pavel Emelyanov, Rik van Riel
Subject: [PATCH v4 06/27] userfaultfd: wp: add helper for writeprotect check
Date: Fri, 26 Apr 2019 12:51:30 +0800
Message-Id: <20190426045151.19556-7-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

From: Shaohua Li

Add a helper for the writeprotect check.  It will be used later.

Cc: Andrea Arcangeli
Cc: Pavel Emelyanov
Cc: Rik van Riel
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 include/linux/userfaultfd_k.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 37c9eba75c98..38f748e7186e 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -50,6 +50,11 @@ static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_UFFD_MISSING;
 }

+static inline bool userfaultfd_wp(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & VM_UFFD_WP;
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP);
@@ -94,6 +99,11 @@ static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 	return false;
 }

+static inline bool userfaultfd_wp(struct vm_area_struct *vma)
+{
+	return false;
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return false;

From patchwork Fri Apr 26 04:51:31 2019
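The helper added above is a plain vm_flags bit test, matching the existing userfaultfd_missing()/userfaultfd_armed() pattern. A user-space sketch of the same pattern — the VM_UFFD_WP value and the cut-down struct here are illustrative, not the kernel's:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag value; the kernel defines its own VM_UFFD_WP. */
#define VM_UFFD_WP 0x1000UL

/* Cut-down stand-in for the kernel's struct vm_area_struct. */
struct vm_area_struct {
	unsigned long vm_flags;
};

/* Mirrors the new helper: true iff the vma was registered for
 * userfaultfd write-protect tracking. */
static bool userfaultfd_wp(struct vm_area_struct *vma)
{
	return vma->vm_flags & VM_UFFD_WP;
}
```

When CONFIG_USERFAULTFD is off, the second hunk supplies a stub that always returns false, so callers compile away the check.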
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918051
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
 Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
 Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz,
 Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
 "Dr. David Alan Gilbert"
Subject: [PATCH v4 07/27] userfaultfd: wp: hook userfault handler to write
 protection fault
Date: Fri, 26 Apr 2019 12:51:31 +0800
Message-Id: <20190426045151.19556-8-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

From: Andrea Arcangeli

There are several cases in which a write protection fault can happen: a
write to the zero page, to a swapped page, or to a userfault
write-protected page.  When the fault happens, there is no way to know
whether userfaultfd write-protected the page beforehand, so we blindly
issue a userfault notification for any vma with VM_UFFD_WP set,
regardless of whether the application has write-protected that page
yet.  The application should be ready to handle such wp faults.

v1: From: Shaohua Li

v2: Handle the userfault in the common do_wp_page.  If we get there, a
pagetable is present and readonly, so there is no need to do further
processing until we solve the userfault.

In the swapin case, always swap in as readonly.  This will cause false
positive userfaults.  We need to decide later whether to eliminate them
with a flag like soft-dirty in the swap entry (see
_PAGE_SWP_SOFT_DIRTY).  hugetlbfs wouldn't need to worry about
swapouts, and tmpfs would be handled by a swap entry bit like anonymous
memory.

The main problem, with no easy solution to eliminate the false
positives, will be if/when userfaultfd is extended to real filesystem
pagecache.  When the pagecache is freed by reclaim, we can't leave the
radix tree pinned if the inode, and in turn the radix tree, is
reclaimed as well.
The estimation is that full accuracy and lack of false positives could
easily be provided only for anonymous memory (as long as there's no
fork, or as long as MADV_DONTFORK is used on the userfaultfd anonymous
range), tmpfs, and hugetlbfs; it's most certainly worth achieving, but
in a later incremental patch.

v3: Add hooking point for THP wrprotect faults.

CC: Shaohua Li
Signed-off-by: Andrea Arcangeli
[peterx: don't conditionally drop FAULT_FLAG_WRITE in do_swap_page]
Reviewed-by: Mike Rapoport
Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
---
 mm/memory.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index ab650c21bccd..8ccd4927b58d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2492,6 +2492,11 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;

+	if (userfaultfd_wp(vma)) {
+		pte_unmap_unlock(vmf->pte, vmf->ptl);
+		return handle_userfault(vmf, VM_UFFD_WP);
+	}
+
 	vmf->page = vm_normal_page(vma, vmf->address, vmf->orig_pte);
 	if (!vmf->page) {
 		/*
@@ -3707,8 +3712,11 @@ static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf)
 /* `inline' is required to avoid gcc 4.1.2 build error */
 static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd)
 {
-	if (vma_is_anonymous(vmf->vma))
+	if (vma_is_anonymous(vmf->vma)) {
+		if (userfaultfd_wp(vmf->vma))
+			return handle_userfault(vmf, VM_UFFD_WP);
 		return do_huge_pmd_wp_page(vmf, orig_pmd);
+	}
 	if (vmf->vma->vm_ops->huge_fault)
 		return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD);

From patchwork Fri Apr 26 04:51:32 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918053
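The do_wp_page() hook above routes every write-protect fault on a VM_UFFD_WP vma to userspace before the normal COW path runs. A decision-flow sketch in user space — the flag value and the returned path names are illustrative stand-ins, not kernel API:

```c
#include <assert.h>
#include <string.h>

/* Illustrative flag value; the kernel defines its own VM_UFFD_WP. */
#define VM_UFFD_WP 0x1000UL

/* Models the branch added at the top of do_wp_page(): if the vma is
 * registered for uffd-wp, notify the monitor; otherwise fall through
 * to the usual write-protect/COW handling. */
static const char *wp_fault_path(unsigned long vm_flags)
{
	if (vm_flags & VM_UFFD_WP)
		return "handle_userfault";  /* wake the uffd monitor */
	return "cow";                       /* normal wp/COW handling */
}
```

Note the hook fires for any vma with VM_UFFD_WP set, not just pages the application actually write-protected — which is exactly the false-positive behavior the commit message discusses.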
r/IAvgVu4JaYGelENP6JapXgj9hZWdhwneabgX2xl01BouYIF6IAtIEa1DofuXaNdTF3 OQHRz00fo5zDo7Hbw6wFcOphzAF3dV5abmca8d+2LowP6RfnC9v7BHZ356txkCxNVxa4 8CWw== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com X-Gm-Message-State: APjAAAVw0RwdpynzsmiJ7OrH656GbgmP1UQX+RTy4ard2+SZ2An1CG0T 8qIZqAs7bum9i+Y4ibu2P4kuGYaPEJWSvLmIqrw4OlRJgq6N6qrvHtt6/3YgirQlTOAoehAqHk+ ltCXl7oq2fGzmcC1Ap6WcQVcLLAGKRvR5cwYHuhmSil+B7JElHC3TULVauzNlcThlhA== X-Received: by 2002:a37:4896:: with SMTP id v144mr32076057qka.194.1556254398248; Thu, 25 Apr 2019 21:53:18 -0700 (PDT) X-Google-Smtp-Source: APXvYqzetQOrbCxH8kfUNJutePulymHVcJtITmcngu9//GpFbMurOsfWz7fo2O9Qb31fpbDUhxvL X-Received: by 2002:a37:4896:: with SMTP id v144mr32076029qka.194.1556254397360; Thu, 25 Apr 2019 21:53:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1556254397; cv=none; d=google.com; s=arc-20160816; b=zGTab0rQWA/Bek9/Ph9Bymh2XlWLN5phz6/Hn4OM5WIGMtiAKBNBBid3fRmdfeQclD GAtESsv7l1OQ4/Gbon7EZEX4VOuVMbpj70LvHK+Z1RBgFlXvtCN5oBbuLf3cudxDaGIO x/gbdd958lmFhVp+5TbtcUz2Sq+DsEXWLOZMNo3YwwXAOShh11aFoKyKeDFbglzhZKeQ O49MozgWvldbJg9VQhLI0p5XCeIY47YGDWMmKzesYAQYDWzdXXbK83EHPhYSZtenUJCr /T7Q/hcVB/h9uE2tVHRKU95ygkdOu5wh19/r5Sj+PAB2bEPFSPHuO+whs5CWVqWaXe2w q88w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:cc:to:from; bh=6LEoQf2iQS+qIvCfeSdroHAv9oHURBm5B7WiNRfkEUw=; b=a/MlwTbWLbDgbw4GmHm+FpdXuue5AcqUAaNcsajsUwh9QTEh8wksk4h+gvly0JYhfF eXHTPRml8VQCKMxIPikWpKWIRIHcQsJQj+BmQmrjIi6VKRgRW1zJ3aaxQ/1YbeLQNooR AfQowtY6so+EHbo3VwOeVMF+YIk+e5iUC+zMrsW2p2DgUftQmghrR+M94uPydP3lB1RQ X/tXvnPTJ+E1YCwLtbdKm63OVYNmvHXRs0mx+10Ra/rdh5wLzudHEbPGP8ntduOr1eNi tQK4Q1z8N+vKXRBkFes1AwHYl7m8p67yecgTi47vY8BNiAz6NZ/u72MIN4twsG6yVd4X mjwA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass 
(google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id t46si455467qta.282.2019.04.25.21.53.17 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 25 Apr 2019 21:53:17 -0700 (PDT) Received-SPF: pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) client-ip=209.132.183.28; Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 719ED2DA988; Fri, 26 Apr 2019 04:53:16 +0000 (UTC) Received: from xz-x1.nay.redhat.com (dhcp-15-205.nay.redhat.com [10.66.15.205]) by smtp.corp.redhat.com (Postfix) with ESMTP id D38D717B21; Fri, 26 Apr 2019 04:53:10 +0000 (UTC) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Shaohua Li , Denis Plotnikov , Andrea Arcangeli , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . 
David Alan Gilbert" Subject: [PATCH v4 08/27] userfaultfd: wp: add WP pagetable tracking to x86 Date: Fri, 26 Apr 2019 12:51:32 +0800 Message-Id: <20190426045151.19556-9-peterx@redhat.com> In-Reply-To: <20190426045151.19556-1-peterx@redhat.com> References: <20190426045151.19556-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Fri, 26 Apr 2019 04:53:16 +0000 (UTC) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Andrea Arcangeli Accurate userfaultfd WP tracking is possible by tracking exactly which virtual memory ranges were writeprotected by userland. We can't relay only on the RW bit of the mapped pagetable because that information is destroyed by fork() or KSM or swap. If we were to relay on that, we'd need to stay on the safe side and generate false positive wp faults for every swapped out page. 
Signed-off-by: Andrea Arcangeli
[peterx: append _PAGE_UFFD_WP to _PAGE_CHG_MASK]
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 arch/x86/Kconfig                     |  1 +
 arch/x86/include/asm/pgtable.h       | 52 ++++++++++++++++++++++++++++
 arch/x86/include/asm/pgtable_64.h    |  8 ++++-
 arch/x86/include/asm/pgtable_types.h | 11 +++++-
 include/asm-generic/pgtable.h        |  1 +
 include/asm-generic/pgtable_uffd.h   | 51 +++++++++++++++++++++++++++
 init/Kconfig                         |  5 +++
 7 files changed, 127 insertions(+), 2 deletions(-)
 create mode 100644 include/asm-generic/pgtable_uffd.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5ad92419be19..70d369fe08d7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -208,6 +208,7 @@ config X86
 	select USER_STACKTRACE_SUPPORT
 	select VIRT_TO_BUS
 	select X86_FEATURE_NAMES		if PROC_FS
+	select HAVE_ARCH_USERFAULTFD_WP		if USERFAULTFD
 
 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 2779ace16d23..6863236e8484 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -23,6 +23,7 @@
 #ifndef __ASSEMBLY__
 #include
+#include
 
 extern pgd_t early_top_pgt[PTRS_PER_PGD];
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
@@ -293,6 +294,23 @@ static inline pte_t pte_clear_flags(pte_t pte, pteval_t clear)
 	return native_make_pte(v & ~clear);
 }
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pte_uffd_wp(pte_t pte)
+{
+	return pte_flags(pte) & _PAGE_UFFD_WP;
+}
+
+static inline pte_t pte_mkuffd_wp(pte_t pte)
+{
+	return pte_set_flags(pte, _PAGE_UFFD_WP);
+}
+
+static inline pte_t pte_clear_uffd_wp(pte_t pte)
+{
+	return pte_clear_flags(pte, _PAGE_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 static inline pte_t pte_mkclean(pte_t pte)
 {
 	return pte_clear_flags(pte, _PAGE_DIRTY);
@@ -372,6 +390,23 @@ static inline pmd_t pmd_clear_flags(pmd_t pmd, pmdval_t clear)
 	return native_make_pmd(v & ~clear);
 }
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pmd_uffd_wp(pmd_t pmd)
+{
+	return pmd_flags(pmd) & _PAGE_UFFD_WP;
+}
+
+static inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+{
+	return pmd_set_flags(pmd, _PAGE_UFFD_WP);
+}
+
+static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd_clear_flags(pmd, _PAGE_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 static inline pmd_t pmd_mkold(pmd_t pmd)
 {
 	return pmd_clear_flags(pmd, _PAGE_ACCESSED);
@@ -1351,6 +1386,23 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
 #endif
 #endif
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+{
+	return pte_set_flags(pte, _PAGE_SWP_UFFD_WP);
+}
+
+static inline int pte_swp_uffd_wp(pte_t pte)
+{
+	return pte_flags(pte) & _PAGE_SWP_UFFD_WP;
+}
+
+static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+{
+	return pte_clear_flags(pte, _PAGE_SWP_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 #define PKRU_AD_BIT 0x1
 #define PKRU_WD_BIT 0x2
 #define PKRU_BITS_PER_PKEY 2
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 0bb566315621..627666b1c3c0 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -189,7 +189,7 @@ extern void sync_global_pgds(unsigned long start, unsigned long end);
  *
  * | ... | 11| 10| 9|8|7|6|5| 4| 3|2| 1|0| <- bit number
  * | ... |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names
- * | TYPE (59-63) | ~OFFSET (9-58) |0|0|X|X| X| X|X|SD|0| <- swp entry
+ * | TYPE (59-63) | ~OFFSET (9-58) |0|0|X|X| X| X|F|SD|0| <- swp entry
  *
  * G (8) is aliased and used as a PROT_NONE indicator for
  * !present ptes.  We need to start storing swap entries above
@@ -197,9 +197,15 @@ extern void sync_global_pgds(unsigned long start, unsigned long end);
  * erratum where they can be incorrectly set by hardware on
  * non-present PTEs.
  *
+ * SD Bits 1-4 are not used in non-present format and available for
+ * special use described below:
+ *
  * SD (1) in swp entry is used to store soft dirty bit, which helps us
  * remember soft dirty over page migration
  *
+ * F (2) in swp entry is used to record when a pagetable is
+ * writeprotected by userfaultfd WP support.
+ *
  * Bit 7 in swp entry should be 0 because pmd_present checks not only P,
  * but also L and G.
  *
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index d6ff0bbdb394..dd9c6295d610 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -32,6 +32,7 @@
 #define _PAGE_BIT_SPECIAL	_PAGE_BIT_SOFTW1
 #define _PAGE_BIT_CPA_TEST	_PAGE_BIT_SOFTW1
+#define _PAGE_BIT_UFFD_WP	_PAGE_BIT_SOFTW2 /* userfaultfd wrprotected */
 #define _PAGE_BIT_SOFT_DIRTY	_PAGE_BIT_SOFTW3 /* software dirty tracking */
 #define _PAGE_BIT_DEVMAP	_PAGE_BIT_SOFTW4
@@ -100,6 +101,14 @@
 #define _PAGE_SWP_SOFT_DIRTY	(_AT(pteval_t, 0))
 #endif
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+#define _PAGE_UFFD_WP		(_AT(pteval_t, 1) << _PAGE_BIT_UFFD_WP)
+#define _PAGE_SWP_UFFD_WP	_PAGE_USER
+#else
+#define _PAGE_UFFD_WP		(_AT(pteval_t, 0))
+#define _PAGE_SWP_UFFD_WP	(_AT(pteval_t, 0))
+#endif
+
 #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
 #define _PAGE_NX	(_AT(pteval_t, 1) << _PAGE_BIT_NX)
 #define _PAGE_DEVMAP	(_AT(u64, 1) << _PAGE_BIT_DEVMAP)
@@ -124,7 +133,7 @@
  */
 #define _PAGE_CHG_MASK	(PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT |		\
			 _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY |	\
-			 _PAGE_SOFT_DIRTY | _PAGE_DEVMAP)
+			 _PAGE_SOFT_DIRTY | _PAGE_DEVMAP | _PAGE_UFFD_WP)
 #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE)
 
 /*
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index fa782fba51ee..39e4122b667b 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 
 #if 5 - defined(__PAGETABLE_P4D_FOLDED) - defined(__PAGETABLE_PUD_FOLDED) - \
	defined(__PAGETABLE_PMD_FOLDED) != CONFIG_PGTABLE_LEVELS
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
new file mode 100644
index 000000000000..643d1bf559c2
--- /dev/null
+++ b/include/asm-generic/pgtable_uffd.h
@@ -0,0 +1,51 @@
+#ifndef _ASM_GENERIC_PGTABLE_UFFD_H
+#define _ASM_GENERIC_PGTABLE_UFFD_H
+
+#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static __always_inline int pte_uffd_wp(pte_t pte)
+{
+	return 0;
+}
+
+static __always_inline int pmd_uffd_wp(pmd_t pmd)
+{
+	return 0;
+}
+
+static __always_inline pte_t pte_mkuffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static __always_inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static __always_inline pte_t pte_clear_uffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static __always_inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static __always_inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static __always_inline int pte_swp_uffd_wp(pte_t pte)
+{
+	return 0;
+}
+
+static __always_inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+{
+	return pte;
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
+#endif /* _ASM_GENERIC_PGTABLE_UFFD_H */
diff --git a/init/Kconfig b/init/Kconfig
index 4592bf7997c0..76550307948a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1451,6 +1451,11 @@ config ADVISE_SYSCALLS
	  applications use these syscalls, you can disable this option to save
	  space.
+config HAVE_ARCH_USERFAULTFD_WP
+	bool
+	help
+	  Arch has userfaultfd write protection support
+
 config MEMBARRIER
	bool "Enable membarrier() system call" if EXPERT
	default y

From patchwork Fri Apr 26 04:51:33 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918057
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz,
    Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH v4 09/27] userfaultfd: wp: userfaultfd_pte/huge_pmd_wp() helpers
Date: Fri, 26 Apr 2019 12:51:33 +0800
Message-Id: <20190426045151.19556-10-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

From: Andrea Arcangeli

Implement helper methods to invoke userfaultfd wp faults more
selectively: not only when a wp fault triggers on a vma with
vma->vm_flags VM_UFFD_WP set, but only if the _PAGE_UFFD_WP bit is set
in the pagetable too.
Signed-off-by: Andrea Arcangeli
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 include/linux/userfaultfd_k.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 38f748e7186e..c6590c58ce28 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -14,6 +14,8 @@
 #include /* linux/include/uapi/linux/userfaultfd.h */
 
 #include
+#include
+#include
 
 /*
  * CAREFUL: Check include/uapi/asm-generic/fcntl.h when defining
@@ -55,6 +57,18 @@ static inline bool userfaultfd_wp(struct vm_area_struct *vma)
	return vma->vm_flags & VM_UFFD_WP;
 }
 
+static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma,
+				      pte_t pte)
+{
+	return userfaultfd_wp(vma) && pte_uffd_wp(pte);
+}
+
+static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma,
+					   pmd_t pmd)
+{
+	return userfaultfd_wp(vma) && pmd_uffd_wp(pmd);
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
	return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP);
@@ -104,6 +118,19 @@ static inline bool userfaultfd_wp(struct vm_area_struct *vma)
	return false;
 }
 
+static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma,
+				      pte_t pte)
+{
+	return false;
+}
+
+static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma,
+					   pmd_t pmd)
+{
+	return false;
+}
+
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
	return false;

From patchwork Fri Apr 26 04:51:34 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918059
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz,
    Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH v4 10/27] userfaultfd: wp: add UFFDIO_COPY_MODE_WP
Date: Fri, 26 Apr 2019 12:51:34 +0800
Message-Id: <20190426045151.19556-11-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

From: Andrea Arcangeli

This allows UFFDIO_COPY to map pages write-protected.

Signed-off-by: Andrea Arcangeli
[peterx: switch to VM_WARN_ON_ONCE in mfill_atomic_pte; add brackets
 around "dst_vma->vm_flags & VM_WRITE"; fix wordings in comments and
 commit messages]
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c                 |  5 +++--
 include/linux/userfaultfd_k.h    |  2 +-
 include/uapi/linux/userfaultfd.h | 11 +++++-----
 mm/userfaultfd.c                 | 36 ++++++++++++++++++++++----------
 4 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index b397bc3b954d..3092885c9d2c 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1683,11 +1683,12 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx,
	ret = -EINVAL;
	if (uffdio_copy.src + uffdio_copy.len <= uffdio_copy.src)
		goto out;
-	if (uffdio_copy.mode & ~UFFDIO_COPY_MODE_DONTWAKE)
+	if (uffdio_copy.mode & ~(UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP))
		goto out;
	if (mmget_not_zero(ctx->mm)) {
		ret = mcopy_atomic(ctx->mm, uffdio_copy.dst, uffdio_copy.src,
-				   uffdio_copy.len, &ctx->mmap_changing);
+				   uffdio_copy.len, &ctx->mmap_changing,
+				   uffdio_copy.mode);
		mmput(ctx->mm);
	} else {
		return -ESRCH;
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index c6590c58ce28..765ce884cec0 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -34,7 +34,7 @@ extern vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason);
 
 extern ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start,
			    unsigned long src_start, unsigned long len,
-			    bool *mmap_changing);
+			    bool *mmap_changing, __u64 mode);
 extern ssize_t mfill_zeropage(struct mm_struct *dst_mm,
			      unsigned long dst_start,
			      unsigned long len,
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 48f1a7c2f1f0..340f23bc251d 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -203,13 +203,14 @@ struct uffdio_copy {
	__u64 dst;
	__u64 src;
	__u64 len;
+#define UFFDIO_COPY_MODE_DONTWAKE		((__u64)1<<0)
	/*
-	 * There will be a wrprotection flag later that allows to map
-	 * pages wrprotected on the fly. And such a flag will be
-	 * available if the wrprotection ioctl are implemented for the
-	 * range according to the uffdio_register.ioctls.
+	 * UFFDIO_COPY_MODE_WP will map the page write protected on
+	 * the fly.  UFFDIO_COPY_MODE_WP is available only if the
+	 * write protected ioctl is implemented for the range
+	 * according to the uffdio_register.ioctls.
	 */
-#define UFFDIO_COPY_MODE_DONTWAKE		((__u64)1<<0)
+#define UFFDIO_COPY_MODE_WP			((__u64)1<<1)
	__u64 mode;
 
	/*
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index d59b5a73dfb3..eaecc21806da 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -25,7 +25,8 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
			    struct vm_area_struct *dst_vma,
			    unsigned long dst_addr,
			    unsigned long src_addr,
-			    struct page **pagep)
+			    struct page **pagep,
+			    bool wp_copy)
 {
	struct mem_cgroup *memcg;
	pte_t _dst_pte, *dst_pte;
@@ -71,9 +72,9 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
	if (mem_cgroup_try_charge(page, dst_mm, GFP_KERNEL, &memcg, false))
		goto out_release;
 
-	_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
-	if (dst_vma->vm_flags & VM_WRITE)
-		_dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
+	_dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot));
+	if ((dst_vma->vm_flags & VM_WRITE) && !wp_copy)
+		_dst_pte = pte_mkwrite(_dst_pte);
 
	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
	if (dst_vma->vm_file) {
@@ -399,7 +400,8 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
						unsigned long dst_addr,
						unsigned long src_addr,
						struct page **page,
-						bool zeropage)
+						bool zeropage,
+						bool wp_copy)
 {
	ssize_t err;
 
@@ -416,11 +418,13 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
	if (!(dst_vma->vm_flags & VM_SHARED)) {
		if (!zeropage)
			err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
-					       dst_addr, src_addr, page);
+					       dst_addr, src_addr, page,
+					       wp_copy);
		else
			err = mfill_zeropage_pte(dst_mm, dst_pmd,
						 dst_vma, dst_addr);
	} else {
+		VM_WARN_ON_ONCE(wp_copy);
		if (!zeropage)
			err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd,
						     dst_vma, dst_addr,
@@ -438,7 +442,8 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
					      unsigned long src_start,
					      unsigned long len,
					      bool zeropage,
-					      bool *mmap_changing)
+					      bool *mmap_changing,
+					      __u64 mode)
 {
	struct vm_area_struct *dst_vma;
	ssize_t err;
@@ -446,6 +451,7 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
	unsigned long src_addr, dst_addr;
	long copied;
	struct page *page;
+	bool wp_copy;
 
	/*
	 * Sanitize the command parameters:
@@ -502,6 +508,14 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
		    dst_vma->vm_flags & VM_SHARED))
		goto out_unlock;
 
+	/*
+	 * validate 'mode' now that we know the dst_vma: don't allow
+	 * a wrprotect copy if the userfaultfd didn't register as WP.
+	 */
+	wp_copy = mode & UFFDIO_COPY_MODE_WP;
+	if (wp_copy && !(dst_vma->vm_flags & VM_UFFD_WP))
+		goto out_unlock;
+
	/*
	 * If this is a HUGETLB vma, pass off to appropriate routine
	 */
@@ -557,7 +571,7 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
		BUG_ON(pmd_trans_huge(*dst_pmd));
 
		err = mfill_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr,
-				       src_addr, &page, zeropage);
+				       src_addr, &page, zeropage, wp_copy);
		cond_resched();
 
		if (unlikely(err == -ENOENT)) {
@@ -604,14 +618,14 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
 ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start,
		     unsigned long src_start, unsigned long len,
-		     bool *mmap_changing)
+		     bool *mmap_changing, __u64 mode)
 {
	return __mcopy_atomic(dst_mm, dst_start, src_start, len, false,
-			      mmap_changing);
+			      mmap_changing, mode);
 }
 
 ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long start,
		       unsigned long len, bool *mmap_changing)
 {
-	return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing);
+	return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing, 0);
 }

From patchwork Fri Apr 26 04:51:35 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918061
(UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 962D5289B1 for ; Fri, 26 Apr 2019 04:53:45 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8A37828D7F; Fri, 26 Apr 2019 04:53:45 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B859D289B1 for ; Fri, 26 Apr 2019 04:53:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A2CF36B000C; Fri, 26 Apr 2019 00:53:43 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 9DB316B0270; Fri, 26 Apr 2019 00:53:43 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 87D946B0271; Fri, 26 Apr 2019 00:53:43 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) by kanga.kvack.org (Postfix) with ESMTP id 61AD86B000C for ; Fri, 26 Apr 2019 00:53:43 -0400 (EDT) Received: by mail-qt1-f200.google.com with SMTP id k7so1433212qtg.8 for ; Thu, 25 Apr 2019 21:53:43 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:from:to:cc :subject:date:message-id:in-reply-to:references; bh=tdgmvYv2ahMfXs/2KvDiokgAJnqmJRii5iG0BDeDRwU=; b=MEmLzKMETc/IBDJLFOoC9b3hMr8cjMRtvGRi09lYNCA0KAVmTd1f34gMzTQ6I+rDzC IkKCptJK7+pPZ/Kj+02ZzvfbJmOH1s24EzeMqrATXEQ7tqpQZh9oUUO85dssrXNtV/ic Srfk+1HX1DJynpyRLAJvpG2cdKOqdIl4ZyJBt7rcy0KqYTcjG73aJUiJCCUKVXd9dRgB 
MgG1LHe4f0arfY4O01BaiUBpvWrXafM9z/a3JH9lBSwSXvbhum73gdBlvn+5RBPr8Oln t1jjJylMPH9tg5NXsunoXMIAxSbXZVlOzHU1NWY5ksy4u95+UEYlYYVMGJ5Jh2mBKoUA l75g== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com X-Gm-Message-State: APjAAAWgVzYTrPu66qR6P8J9whLgDrnJSKJjDlWKAcude0fCri7VCBqK SrbWXJyI6ZblYkJDrR0BTn8yVL8Rt8gX5EEHkLksmcn9Aq/KIeWwYCC3LrOPT1G1AhYPLNF8MG0 vNPqhNiUmIBF96o39BZKhjTpDbf/RSBjnJPH9Z5jlpTUbiB/JqZ/RGzwEQkDUTJ5FSg== X-Received: by 2002:a37:6897:: with SMTP id d145mr23103246qkc.185.1556254423162; Thu, 25 Apr 2019 21:53:43 -0700 (PDT) X-Google-Smtp-Source: APXvYqzmeesOWdR8UAm3rLdqGA9buIpFGFnKaetxFGfXNoEcCTFalbuUSrejeGylWxxFx+926lyw X-Received: by 2002:a37:6897:: with SMTP id d145mr23103213qkc.185.1556254422312; Thu, 25 Apr 2019 21:53:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1556254422; cv=none; d=google.com; s=arc-20160816; b=y9ZH8TrrOTLOec0mpaGeZRZqX+y7Zv1htYSIE55bPMbPOLeopsyD+pfmPqVVw4rzW3 QhCJUMzr8sJLJ0hPCtjFrErdkPtUXHHHPccoNjhMUK97cACudmbBWGaAM/rUGm/zRGqw qC0OprZzW716LVOoEzbKP6nAJ+HV9Otd5sV2RAuqibrB1+IJg4ZS+tj8WSzZDObQLIno njPeorN+7yT1OXnU/7WdLiDfJLQCeS7/NWYHnK/pI36Nw3b0/9loBZZadi85bUX+XQp0 HIZAVndX2GjgCo6FJby9WTzi1ZIkQ8vAVgD4nxXofUhqbI5J2vGv2opGkKJksyJwcHDi SX7A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:cc:to:from; bh=tdgmvYv2ahMfXs/2KvDiokgAJnqmJRii5iG0BDeDRwU=; b=DPrpmAo4yJEgYGvrZOeafgKuxHtiwiHrvi75v31Y6LLWovOuO8CAqpgv26FVssaVLl nrk3u74wv6qXx8vJUUn9rBrQxNydNZPAaWHfXvAxIDrKwjS41m/vol97/KQTSffjNq7n ztf1aZKdDFXo8NgWzl85voqvqw9cJQrg6zhyN6ExyIdYx7sGP94YhzV2xwnOM9gvREyK qrJPQaUIkIpLIwLEgVytTf9nuKDUJRKMzGHfiFhWaurZdBenyEbM8aYNJgybr+3Jryxd 2QU7nXBh/P/4u/rMklgh5+7LXqQ8krse+eDVXF242QRIJklSY/wFINPMVSpauB6SMTQq mLag== ARC-Authentication-Results: i=1; mx.google.com; spf=pass 
(google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id g9si209388qkb.8.2019.04.25.21.53.42 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 25 Apr 2019 21:53:42 -0700 (PDT) Received-SPF: pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) client-ip=209.132.183.28; Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 73420C07DEA3; Fri, 26 Apr 2019 04:53:41 +0000 (UTC) Received: from xz-x1.nay.redhat.com (dhcp-15-205.nay.redhat.com [10.66.15.205]) by smtp.corp.redhat.com (Postfix) with ESMTP id DBFA5194A0; Fri, 26 Apr 2019 04:53:35 +0000 (UTC) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Shaohua Li , Denis Plotnikov , Andrea Arcangeli , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . 
David Alan Gilbert"
Subject: [PATCH v4 11/27] mm: merge parameters for change_protection()
Date: Fri, 26 Apr 2019 12:51:35 +0800
Message-Id: <20190426045151.19556-12-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

change_protection() is used by both the NUMA-balancing and the mprotect() code, and there is one parameter for each of those callers (dirty_accountable and prot_numa). These parameters are then passed down the whole call chain:

- change_protection_range()
- change_p4d_range()
- change_pud_range()
- change_pmd_range()
- ...

Introduce a single flags argument for change_protection() and all of these helpers to replace the separate parameters, so we no longer pass multiple parameters through every level of the call chain. More importantly, this greatly simplifies introducing any new parameter to change_protection(): in the follow-up patches, a new flag for userfaultfd write protection will be added this way.

No functional change at all.
Reviewed-by: Jerome Glisse Signed-off-by: Peter Xu --- include/linux/huge_mm.h | 2 +- include/linux/mm.h | 14 +++++++++++++- mm/huge_memory.c | 3 ++- mm/mempolicy.c | 2 +- mm/mprotect.c | 29 ++++++++++++++++------------- 5 files changed, 33 insertions(+), 17 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 381e872bfde0..1550fb12dbd4 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -46,7 +46,7 @@ extern bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr, pmd_t *old_pmd, pmd_t *new_pmd); extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, pgprot_t newprot, - int prot_numa); + unsigned long cp_flags); vm_fault_t vmf_insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr, pmd_t *pmd, pfn_t pfn, bool write); vm_fault_t vmf_insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr, diff --git a/include/linux/mm.h b/include/linux/mm.h index bad93704abc8..086e69d4439d 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1641,9 +1641,21 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma, unsigned long old_addr, struct vm_area_struct *new_vma, unsigned long new_addr, unsigned long len, bool need_rmap_locks); + +/* + * Flags used by change_protection(). For now we make it a bitmap so + * that we can pass in multiple flags just like parameters. However + * for now all the callers are only use one of the flags at the same + * time. 
+ */ +/* Whether we should allow dirty bit accounting */ +#define MM_CP_DIRTY_ACCT (1UL << 0) +/* Whether this protection change is for NUMA hints */ +#define MM_CP_PROT_NUMA (1UL << 1) + extern unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, unsigned long end, pgprot_t newprot, - int dirty_accountable, int prot_numa); + unsigned long cp_flags); extern int mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev, unsigned long start, unsigned long end, unsigned long newflags); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 165ea46bf149..64d26b1989d2 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1899,13 +1899,14 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr, * - HPAGE_PMD_NR is protections changed and TLB flush necessary */ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, - unsigned long addr, pgprot_t newprot, int prot_numa) + unsigned long addr, pgprot_t newprot, unsigned long cp_flags) { struct mm_struct *mm = vma->vm_mm; spinlock_t *ptl; pmd_t entry; bool preserve_write; int ret; + bool prot_numa = cp_flags & MM_CP_PROT_NUMA; ptl = __pmd_trans_huge_lock(pmd, vma); if (!ptl) diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 2219e747df49..825053818bcb 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -575,7 +575,7 @@ unsigned long change_prot_numa(struct vm_area_struct *vma, { int nr_updated; - nr_updated = change_protection(vma, addr, end, PAGE_NONE, 0, 1); + nr_updated = change_protection(vma, addr, end, PAGE_NONE, MM_CP_PROT_NUMA); if (nr_updated) count_vm_numa_events(NUMA_PTE_UPDATES, nr_updated); diff --git a/mm/mprotect.c b/mm/mprotect.c index 028c724dcb1a..98091408bd11 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -37,13 +37,15 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, pgprot_t newprot, - int dirty_accountable, int prot_numa) + unsigned long cp_flags) { struct mm_struct 
*mm = vma->vm_mm; pte_t *pte, oldpte; spinlock_t *ptl; unsigned long pages = 0; int target_node = NUMA_NO_NODE; + bool dirty_accountable = cp_flags & MM_CP_DIRTY_ACCT; + bool prot_numa = cp_flags & MM_CP_PROT_NUMA; /* * Can be called with only the mmap_sem for reading by @@ -164,7 +166,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, static inline unsigned long change_pmd_range(struct vm_area_struct *vma, pud_t *pud, unsigned long addr, unsigned long end, - pgprot_t newprot, int dirty_accountable, int prot_numa) + pgprot_t newprot, unsigned long cp_flags) { pmd_t *pmd; unsigned long next; @@ -194,7 +196,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, __split_huge_pmd(vma, pmd, addr, false, NULL); } else { int nr_ptes = change_huge_pmd(vma, pmd, addr, - newprot, prot_numa); + newprot, cp_flags); if (nr_ptes) { if (nr_ptes == HPAGE_PMD_NR) { @@ -209,7 +211,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, /* fall through, the trans huge pmd just split */ } this_pages = change_pte_range(vma, pmd, addr, next, newprot, - dirty_accountable, prot_numa); + cp_flags); pages += this_pages; next: cond_resched(); @@ -225,7 +227,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, static inline unsigned long change_pud_range(struct vm_area_struct *vma, p4d_t *p4d, unsigned long addr, unsigned long end, - pgprot_t newprot, int dirty_accountable, int prot_numa) + pgprot_t newprot, unsigned long cp_flags) { pud_t *pud; unsigned long next; @@ -237,7 +239,7 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma, if (pud_none_or_clear_bad(pud)) continue; pages += change_pmd_range(vma, pud, addr, next, newprot, - dirty_accountable, prot_numa); + cp_flags); } while (pud++, addr = next, addr != end); return pages; @@ -245,7 +247,7 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma, static inline unsigned long 
change_p4d_range(struct vm_area_struct *vma, pgd_t *pgd, unsigned long addr, unsigned long end, - pgprot_t newprot, int dirty_accountable, int prot_numa) + pgprot_t newprot, unsigned long cp_flags) { p4d_t *p4d; unsigned long next; @@ -257,7 +259,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma, if (p4d_none_or_clear_bad(p4d)) continue; pages += change_pud_range(vma, p4d, addr, next, newprot, - dirty_accountable, prot_numa); + cp_flags); } while (p4d++, addr = next, addr != end); return pages; @@ -265,7 +267,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma, static unsigned long change_protection_range(struct vm_area_struct *vma, unsigned long addr, unsigned long end, pgprot_t newprot, - int dirty_accountable, int prot_numa) + unsigned long cp_flags) { struct mm_struct *mm = vma->vm_mm; pgd_t *pgd; @@ -282,7 +284,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma, if (pgd_none_or_clear_bad(pgd)) continue; pages += change_p4d_range(vma, pgd, addr, next, newprot, - dirty_accountable, prot_numa); + cp_flags); } while (pgd++, addr = next, addr != end); /* Only flush the TLB if we actually modified any entries: */ @@ -295,14 +297,15 @@ static unsigned long change_protection_range(struct vm_area_struct *vma, unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, unsigned long end, pgprot_t newprot, - int dirty_accountable, int prot_numa) + unsigned long cp_flags) { unsigned long pages; if (is_vm_hugetlb_page(vma)) pages = hugetlb_change_protection(vma, start, end, newprot); else - pages = change_protection_range(vma, start, end, newprot, dirty_accountable, prot_numa); + pages = change_protection_range(vma, start, end, newprot, + cp_flags); return pages; } @@ -430,7 +433,7 @@ mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev, vma_set_page_prot(vma); change_protection(vma, start, end, vma->vm_page_prot, - dirty_accountable, 0); + 
dirty_accountable ? MM_CP_DIRTY_ACCT : 0); /* * Private VM_LOCKED VMA becoming writable: trigger COW to avoid major

From patchwork Fri Apr 26 04:51:36 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918063
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Shaohua Li , Denis
Plotnikov , Andrea Arcangeli , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . David Alan Gilbert"
Subject: [PATCH v4 12/27] userfaultfd: wp: apply _PAGE_UFFD_WP bit
Date: Fri, 26 Apr 2019 12:51:36 +0800
Message-Id: <20190426045151.19556-13-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

First, introduce two new flags, MM_CP_UFFD_WP[_RESOLVE], for change_protection() when used with uffd-wp, and make sure the two new flags are mutually exclusive. Then:

- For MM_CP_UFFD_WP: apply the _PAGE_UFFD_WP bit and remove _PAGE_RW when a range of memory is write-protected by uffd
- For MM_CP_UFFD_WP_RESOLVE: remove the _PAGE_UFFD_WP bit and recover _PAGE_RW when the write protection is resolved from userspace

Use this new interface in mwriteprotect_range() to replace the old MM_CP_DIRTY_ACCT, and do the change for both PTEs and huge PMDs. We can then distinguish a PTE/PMD that is write-protected for general reasons (e.g., COW or soft-dirty tracking) from one that is write-protected for userfaultfd-wp.

Since _PAGE_UFFD_WP should be preserved across pte_modify(), add it into _PAGE_CHG_MASK as well. Meanwhile, now that we have this new bit, we can be stricter when detecting uffd-wp page faults in both do_wp_page() and wp_huge_pmd().
Reviewed-by: Jerome Glisse Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- include/linux/mm.h | 5 +++++ mm/huge_memory.c | 14 +++++++++++++- mm/memory.c | 4 ++-- mm/mprotect.c | 12 ++++++++++++ mm/userfaultfd.c | 8 ++++++-- 5 files changed, 38 insertions(+), 5 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 086e69d4439d..a5ac81188523 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1652,6 +1652,11 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma, #define MM_CP_DIRTY_ACCT (1UL << 0) /* Whether this protection change is for NUMA hints */ #define MM_CP_PROT_NUMA (1UL << 1) +/* Whether this change is for write protecting */ +#define MM_CP_UFFD_WP (1UL << 2) /* do wp */ +#define MM_CP_UFFD_WP_RESOLVE (1UL << 3) /* Resolve wp */ +#define MM_CP_UFFD_WP_ALL (MM_CP_UFFD_WP | \ + MM_CP_UFFD_WP_RESOLVE) extern unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, unsigned long end, pgprot_t newprot, diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 64d26b1989d2..3885747d4901 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1907,6 +1907,8 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, bool preserve_write; int ret; bool prot_numa = cp_flags & MM_CP_PROT_NUMA; + bool uffd_wp = cp_flags & MM_CP_UFFD_WP; + bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE; ptl = __pmd_trans_huge_lock(pmd, vma); if (!ptl) @@ -1973,6 +1975,13 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, entry = pmd_modify(entry, newprot); if (preserve_write) entry = pmd_mk_savedwrite(entry); + if (uffd_wp) { + entry = pmd_wrprotect(entry); + entry = pmd_mkuffd_wp(entry); + } else if (uffd_wp_resolve) { + entry = pmd_mkwrite(entry); + entry = pmd_clear_uffd_wp(entry); + } ret = HPAGE_PMD_NR; set_pmd_at(mm, addr, pmd, entry); BUG_ON(vma_is_anonymous(vma) && !preserve_write && pmd_write(entry)); @@ -2120,7 +2129,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, 
pmd_t *pmd, struct page *page; pgtable_t pgtable; pmd_t old_pmd, _pmd; - bool young, write, soft_dirty, pmd_migration = false; + bool young, write, soft_dirty, pmd_migration = false, uffd_wp = false; unsigned long addr; int i; @@ -2202,6 +2211,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, write = pmd_write(old_pmd); young = pmd_young(old_pmd); soft_dirty = pmd_soft_dirty(old_pmd); + uffd_wp = pmd_uffd_wp(old_pmd); } VM_BUG_ON_PAGE(!page_count(page), page); page_ref_add(page, HPAGE_PMD_NR - 1); @@ -2235,6 +2245,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, entry = pte_mkold(entry); if (soft_dirty) entry = pte_mksoft_dirty(entry); + if (uffd_wp) + entry = pte_mkuffd_wp(entry); } pte = pte_offset_map(&_pmd, addr); BUG_ON(!pte_none(*pte)); diff --git a/mm/memory.c b/mm/memory.c index 8ccd4927b58d..64bd8075f054 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2492,7 +2492,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; - if (userfaultfd_wp(vma)) { + if (userfaultfd_pte_wp(vma, *vmf->pte)) { pte_unmap_unlock(vmf->pte, vmf->ptl); return handle_userfault(vmf, VM_UFFD_WP); } @@ -3713,7 +3713,7 @@ static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf) static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd) { if (vma_is_anonymous(vmf->vma)) { - if (userfaultfd_wp(vmf->vma)) + if (userfaultfd_huge_pmd_wp(vmf->vma, orig_pmd)) return handle_userfault(vmf, VM_UFFD_WP); return do_huge_pmd_wp_page(vmf, orig_pmd); } diff --git a/mm/mprotect.c b/mm/mprotect.c index 98091408bd11..732d9b6d1d21 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -46,6 +46,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, int target_node = NUMA_NO_NODE; bool dirty_accountable = cp_flags & MM_CP_DIRTY_ACCT; bool prot_numa = cp_flags & MM_CP_PROT_NUMA; + bool uffd_wp = cp_flags & MM_CP_UFFD_WP; + bool uffd_wp_resolve = cp_flags & 
MM_CP_UFFD_WP_RESOLVE; /* * Can be called with only the mmap_sem for reading by @@ -117,6 +119,14 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, if (preserve_write) ptent = pte_mk_savedwrite(ptent); + if (uffd_wp) { + ptent = pte_wrprotect(ptent); + ptent = pte_mkuffd_wp(ptent); + } else if (uffd_wp_resolve) { + ptent = pte_mkwrite(ptent); + ptent = pte_clear_uffd_wp(ptent); + } + /* Avoid taking write faults for known dirty pages */ if (dirty_accountable && pte_dirty(ptent) && (pte_soft_dirty(ptent) || @@ -301,6 +311,8 @@ unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, { unsigned long pages; + BUG_ON((cp_flags & MM_CP_UFFD_WP_ALL) == MM_CP_UFFD_WP_ALL); + if (is_vm_hugetlb_page(vma)) pages = hugetlb_change_protection(vma, start, end, newprot); else diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index eaecc21806da..240de2a8492d 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -73,8 +73,12 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, goto out_release; _dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot)); - if ((dst_vma->vm_flags & VM_WRITE) && !wp_copy) - _dst_pte = pte_mkwrite(_dst_pte); + if (dst_vma->vm_flags & VM_WRITE) { + if (wp_copy) + _dst_pte = pte_mkuffd_wp(_dst_pte); + else + _dst_pte = pte_mkwrite(_dst_pte); + } dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl); if (dst_vma->vm_file) {

From patchwork Fri Apr 26 04:51:37 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918065
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Shaohua Li , Denis Plotnikov , Andrea Arcangeli , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr .
David Alan Gilbert" Subject: [PATCH v4 13/27] mm: introduce do_wp_page_cont() Date: Fri, 26 Apr 2019 12:51:37 +0800 Message-Id: <20190426045151.19556-14-peterx@redhat.com> In-Reply-To: <20190426045151.19556-1-peterx@redhat.com> References: <20190426045151.19556-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Fri, 26 Apr 2019 04:53:59 +0000 (UTC) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP The userfaultfd handling in do_wp_page() is very special comparing to the rest of the function because it only postpones the real handling of the page fault to the userspace program. Isolate the handling part of do_wp_page() into a new function called do_wp_page_cont() so that we can use it somewhere else when resolving the userfault page fault. 
Signed-off-by: Peter Xu
---
 include/linux/mm.h | 2 ++
 mm/memory.c        | 8 ++++++++
 2 files changed, 10 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a5ac81188523..a2911de04cdd 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -445,6 +445,8 @@ struct vm_fault {
 	 */
 };
 
+vm_fault_t do_wp_page_cont(struct vm_fault *vmf);
+
 /* page entry size for vm->huge_fault() */
 enum page_entry_size {
 	PE_SIZE_PTE = 0,
diff --git a/mm/memory.c b/mm/memory.c
index 64bd8075f054..ab98a1eb4702 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2497,6 +2497,14 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 		return handle_userfault(vmf, VM_UFFD_WP);
 	}
 
+	return do_wp_page_cont(vmf);
+}
+
+vm_fault_t do_wp_page_cont(struct vm_fault *vmf)
+	__releases(vmf->ptl)
+{
+	struct vm_area_struct *vma = vmf->vma;
+
 	vmf->page = vm_normal_page(vma, vmf->address, vmf->orig_pte);
 	if (!vmf->page) {
 		/*

From patchwork Fri Apr 26 04:51:38 2019
X-Patchwork-Id: 10918067
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v4 14/27] userfaultfd: wp: handle COW properly for uffd-wp
Date: Fri, 26 Apr 2019 12:51:38 +0800
Message-Id: <20190426045151.19556-15-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

This allows uffd-wp to support write-protected pages for COW. A uffd write-protected PTE can also be write-protected for other reasons, such as COW or the zero page. When that happens, we can't simply set the write bit in the PTE, since that would change the content seen by every other reference to the page. Instead, we should do the COW first if necessary, then handle the uffd-wp fault on the private copy. To copy the page correctly, we also need to carry over the _PAGE_UFFD_WP bit if it was set in the original PTE.

For huge PMDs, we always split the huge PMD on which we want to resolve an uffd-wp page fault. That matches what we do for general huge PMD write protection, and it reduces the huge PMD copy-on-write problem to the PTE copy-on-write case.
Signed-off-by: Peter Xu
---
 mm/memory.c   |  5 ++++-
 mm/mprotect.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 56 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index ab98a1eb4702..965d974bb9bd 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2299,7 +2299,10 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 	}
 	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 	entry = mk_pte(new_page, vma->vm_page_prot);
-	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+	if (pte_uffd_wp(vmf->orig_pte))
+		entry = pte_mkuffd_wp(entry);
+	else
+		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 	/*
 	 * Clear the pte entry and flush it first, before updating the
 	 * pte with the new entry. This will avoid a race condition
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 732d9b6d1d21..1f40662182f8 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -73,18 +73,18 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 	flush_tlb_batched_pending(vma->vm_mm);
 	arch_enter_lazy_mmu_mode();
 	do {
+retry_pte:
 		oldpte = *pte;
 		if (pte_present(oldpte)) {
 			pte_t ptent;
 			bool preserve_write = prot_numa && pte_write(oldpte);
+			struct page *page;
 
 			/*
 			 * Avoid trapping faults against the zero or KSM
 			 * pages. See similar comment in change_huge_pmd.
 			 */
 			if (prot_numa) {
-				struct page *page;
-
 				page = vm_normal_page(vma, addr, oldpte);
 				if (!page || PageKsm(page))
 					continue;
@@ -114,6 +114,45 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				continue;
 			}
 
+			/*
+			 * Detect whether we'll need to COW before
+			 * resolving an uffd-wp fault.  Note that this
+			 * includes detection of the zero page (where
+			 * page==NULL)
+			 */
+			if (uffd_wp_resolve) {
+				struct vm_fault vmf = {
+					.vma = vma,
+					.address = addr & PAGE_MASK,
+					.orig_pte = oldpte,
+					.pmd = pmd,
+					.pte = pte,
+					.ptl = ptl,
+				};
+				vm_fault_t ret;
+
+				/* If the fault is resolved already, skip */
+				if (!pte_uffd_wp(*pte))
+					continue;
+
+				arch_leave_lazy_mmu_mode();
+				/* With PTE lock held */
+				ret = do_wp_page_cont(&vmf);
+				if (ret != VM_FAULT_WRITE && ret != 0)
+					/* Probably OOM */
+					return pages;
+				pte = pte_offset_map_lock(vma->vm_mm, pmd,
+							  addr, &ptl);
+				arch_enter_lazy_mmu_mode();
+				if (ret == 0 || !pte_present(*pte))
+					/*
+					 * This PTE could have been modified
+					 * during or after COW before taking
+					 * the lock; retry.
+					 */
+					goto retry_pte;
+			}
+
 			oldpte = ptep_modify_prot_start(vma, addr, pte);
 			ptent = pte_modify(oldpte, newprot);
 			if (preserve_write)
@@ -183,6 +222,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 	unsigned long pages = 0;
 	unsigned long nr_huge_updates = 0;
 	struct mmu_notifier_range range;
+	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
 
 	range.start = 0;
 
@@ -202,7 +242,16 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		}
 
 		if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) {
-			if (next - addr != HPAGE_PMD_SIZE) {
+			/*
+			 * When resolving an userfaultfd write
+			 * protection fault, it's not easy to identify
+			 * whether a THP is shared with others and
+			 * whether we'll need to do copy-on-write, so
+			 * just split it always for now to simplify the
+			 * procedure.  And that's the policy too for
+			 * general THP write-protect in af9e4d5f2de2.
+			 */
+			if (next - addr != HPAGE_PMD_SIZE || uffd_wp_resolve) {
 				__split_huge_pmd(vma, pmd, addr, false, NULL);
 			} else {
 				int nr_ptes = change_huge_pmd(vma, pmd, addr,

From patchwork Fri Apr 26 04:51:39 2019
X-Patchwork-Id: 10918069
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v4 15/27] userfaultfd: wp: drop _PAGE_UFFD_WP properly when fork
Date: Fri, 26 Apr 2019 12:51:39 +0800
Message-Id: <20190426045151.19556-16-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

UFFD_EVENT_FORK support for uffd-wp should already be in place, except that we should clear the uffd-wp bit when the uffd fork event is not enabled. Detect that case, to avoid _PAGE_UFFD_WP being set even though the VMA is not tracked by VM_UFFD_WP. Do this for both small PTEs and huge PMDs.

Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 mm/huge_memory.c | 8 ++++++++
 mm/memory.c      | 8 ++++++++
 2 files changed, 16 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3885747d4901..cf8f11d6e6cd 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -976,6 +976,14 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	ret = -EAGAIN;
 	pmd = *src_pmd;
 
+	/*
+	 * Make sure the _PAGE_UFFD_WP bit is cleared if the new VMA
+	 * does not have the VM_UFFD_WP, which means that the uffd
+	 * fork event is not enabled.
+	 */
+	if (!(vma->vm_flags & VM_UFFD_WP))
+		pmd = pmd_clear_uffd_wp(pmd);
+
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 	if (unlikely(is_swap_pmd(pmd))) {
 		swp_entry_t entry = pmd_to_swp_entry(pmd);
diff --git a/mm/memory.c b/mm/memory.c
index 965d974bb9bd..2abf0934ad7f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -789,6 +789,14 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		pte = pte_mkclean(pte);
 	pte = pte_mkold(pte);
 
+	/*
+	 * Make sure the _PAGE_UFFD_WP bit is cleared if the new VMA
+	 * does not have the VM_UFFD_WP, which means that the uffd
+	 * fork event is not enabled.
+	 */
+	if (!(vm_flags & VM_UFFD_WP))
+		pte = pte_clear_uffd_wp(pte);
+
 	page = vm_normal_page(vma, addr, pte);
 	if (page) {
 		get_page(page);

From patchwork Fri Apr 26 04:51:40 2019
X-Patchwork-Id: 10918071
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v4 16/27] userfaultfd: wp: add pmd_swp_*uffd_wp() helpers
Date: Fri, 26 Apr 2019 12:51:40 +0800
Message-Id: <20190426045151.19556-17-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

Add the missing helpers for uffd-wp operations on pmd swap/migration entries.

Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 arch/x86/include/asm/pgtable.h     | 15 +++++++++++++++
 include/asm-generic/pgtable_uffd.h | 15 +++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 6863236e8484..18a815d6f4ea 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1401,6 +1401,21 @@ static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
 {
 	return pte_clear_flags(pte, _PAGE_SWP_UFFD_WP);
 }
+
+static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+{
+	return pmd_set_flags(pmd, _PAGE_SWP_UFFD_WP);
+}
+
+static inline int pmd_swp_uffd_wp(pmd_t pmd)
+{
+	return pmd_flags(pmd) & _PAGE_SWP_UFFD_WP;
+}
+
+static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd_clear_flags(pmd, _PAGE_SWP_UFFD_WP);
+}
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
 #define PKRU_AD_BIT 0x1
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
index 643d1bf559c2..828966d4c281 100644
--- a/include/asm-generic/pgtable_uffd.h
+++ b/include/asm-generic/pgtable_uffd.h
@@ -46,6 +46,21 @@ static __always_inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
 {
 	return pte;
 }
+
+static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static inline int pmd_swp_uffd_wp(pmd_t pmd)
+{
+	return 0;
+}
+
+static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
 #endif /* _ASM_GENERIC_PGTABLE_UFFD_H */

From patchwork Fri Apr 26 04:51:41 2019
X-Patchwork-Id: 10918073
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v4 17/27] userfaultfd: wp: support swap and page migration
Date: Fri, 26 Apr 2019 12:51:41 +0800
Message-Id: <20190426045151.19556-18-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

For both swap and page migration, use bit 2 of the entry to identify whether the entry is uffd write-protected.  It plays a role similar to the existing soft-dirty bit in swap entries, but it only keeps the uffd-wp tracking for a specific PTE/PMD.

One special case: when recovering the uffd-wp bit from a swap/migration entry back into the PTE, the _PAGE_RW bit must also be cleared; otherwise, even with _PAGE_UFFD_WP set, the write cannot be trapped at all.

Previously, change_pte_range() did nothing for uffd if the PTE was a swap entry.  That could lead to data mismatch if the page to be write-protected was swapped out when UFFDIO_WRITEPROTECT was sent.  This patch applies/removes the uffd-wp bit even for swap entries.
Signed-off-by: Peter Xu
---
 include/linux/swapops.h |  2 ++
 mm/huge_memory.c        |  3 +++
 mm/memory.c             |  8 ++++++++
 mm/migrate.c            |  6 ++++++
 mm/mprotect.c           | 28 +++++++++++++++++-----------
 mm/rmap.c               |  6 ++++++
 6 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 4d961668e5fc..0c2923b1cdb7 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -68,6 +68,8 @@ static inline swp_entry_t pte_to_swp_entry(pte_t pte)
 	if (pte_swp_soft_dirty(pte))
 		pte = pte_swp_clear_soft_dirty(pte);
+	if (pte_swp_uffd_wp(pte))
+		pte = pte_swp_clear_uffd_wp(pte);
 	arch_entry = __pte_to_swp_entry(pte);
 	return swp_entry(__swp_type(arch_entry), __swp_offset(arch_entry));
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index cf8f11d6e6cd..998a7e5d625e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2212,6 +2212,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		write = is_write_migration_entry(entry);
 		young = false;
 		soft_dirty = pmd_swp_soft_dirty(old_pmd);
+		uffd_wp = pmd_swp_uffd_wp(old_pmd);
 	} else {
 		page = pmd_page(old_pmd);
 		if (pmd_dirty(old_pmd))
@@ -2244,6 +2245,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 			entry = swp_entry_to_pte(swp_entry);
 			if (soft_dirty)
 				entry = pte_swp_mksoft_dirty(entry);
+			if (uffd_wp)
+				entry = pte_swp_mkuffd_wp(entry);
 		} else {
 			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
 			entry = maybe_mkwrite(entry, vma);
diff --git a/mm/memory.c b/mm/memory.c
index 2abf0934ad7f..f53f54592ddc 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -737,6 +737,8 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 			pte = swp_entry_to_pte(entry);
 			if (pte_swp_soft_dirty(*src_pte))
 				pte = pte_swp_mksoft_dirty(pte);
+			if (pte_swp_uffd_wp(*src_pte))
+				pte = pte_swp_mkuffd_wp(pte);
 			set_pte_at(src_mm, addr, src_pte, pte);
 		}
 	} else if (is_device_private_entry(entry)) {
@@ -766,6 +768,8 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		    is_cow_mapping(vm_flags)) {
 			make_device_private_entry_read(&entry);
 			pte = swp_entry_to_pte(entry);
+			if (pte_swp_uffd_wp(*src_pte))
+				pte = pte_swp_mkuffd_wp(pte);
 			set_pte_at(src_mm, addr, src_pte, pte);
 		}
 	}
@@ -2854,6 +2858,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	flush_icache_page(vma, page);
 	if (pte_swp_soft_dirty(vmf->orig_pte))
 		pte = pte_mksoft_dirty(pte);
+	if (pte_swp_uffd_wp(vmf->orig_pte)) {
+		pte = pte_mkuffd_wp(pte);
+		pte = pte_wrprotect(pte);
+	}
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
 	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
 	vmf->orig_pte = pte;
diff --git a/mm/migrate.c b/mm/migrate.c
index 663a5449367a..deff1f8c20af 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -241,11 +241,15 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		entry = pte_to_swp_entry(*pvmw.pte);
 		if (is_write_migration_entry(entry))
 			pte = maybe_mkwrite(pte, vma);
+		else if (pte_swp_uffd_wp(*pvmw.pte))
+			pte = pte_mkuffd_wp(pte);
 
 		if (unlikely(is_zone_device_page(new))) {
 			if (is_device_private_page(new)) {
 				entry = make_device_private_entry(new, pte_write(pte));
 				pte = swp_entry_to_pte(entry);
+				if (pte_swp_uffd_wp(*pvmw.pte))
+					pte = pte_mkuffd_wp(pte);
 			} else if (is_device_public_page(new)) {
 				pte = pte_mkdevmap(pte);
 			}
@@ -2306,6 +2310,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pte))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pte))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, addr, ptep, swp_pte);
 
 			/*
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 1f40662182f8..adc054d38f89 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -174,11 +174,11 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 			}
 			ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
 			pages++;
-		} else if (IS_ENABLED(CONFIG_MIGRATION)) {
+		} else if (is_swap_pte(oldpte)) {
 			swp_entry_t entry = pte_to_swp_entry(oldpte);
+			pte_t newpte;
 
 			if (is_write_migration_entry(entry)) {
-				pte_t newpte;
 				/*
 				 * A protection check is difficult so
 				 * just be safe and disable write
@@ -187,22 +187,28 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				newpte = swp_entry_to_pte(entry);
 				if (pte_swp_soft_dirty(oldpte))
 					newpte = pte_swp_mksoft_dirty(newpte);
-				set_pte_at(mm, addr, pte, newpte);
-
-				pages++;
-			}
-
-			if (is_write_device_private_entry(entry)) {
-				pte_t newpte;
-
+				if (pte_swp_uffd_wp(oldpte))
+					newpte = pte_swp_mkuffd_wp(newpte);
+			} else if (is_write_device_private_entry(entry)) {
 				/*
 				 * We do not preserve soft-dirtiness. See
 				 * copy_one_pte() for explanation.
 				 */
 				make_device_private_entry_read(&entry);
 				newpte = swp_entry_to_pte(entry);
-				set_pte_at(mm, addr, pte, newpte);
+				if (pte_swp_uffd_wp(oldpte))
+					newpte = pte_swp_mkuffd_wp(newpte);
+			} else {
+				newpte = oldpte;
+			}
+
+			if (uffd_wp)
+				newpte = pte_swp_mkuffd_wp(newpte);
+			else if (uffd_wp_resolve)
+				newpte = pte_swp_clear_uffd_wp(newpte);
+
+			if (!pte_same(oldpte, newpte)) {
+				set_pte_at(mm, addr, pte, newpte);
 				pages++;
 			}
 		}
diff --git a/mm/rmap.c b/mm/rmap.c
index b30c7c71d1d9..0b2e2f74b477 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1469,6 +1469,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pteval))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, pvmw.address, pvmw.pte, swp_pte);
 			/*
 			 * No need to invalidate here it will synchronize on
@@ -1561,6 +1563,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pteval))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, address, pvmw.pte, swp_pte);
 			/*
 			 * No need to invalidate here it will synchronize on
@@ -1627,6 +1631,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pteval))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, address, pvmw.pte, swp_pte);
 			/* Invalidate as we cleared the pte */
 			mmu_notifier_invalidate_range(mm, address,

From patchwork Fri Apr 26 04:51:42 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918075
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v4 18/27] khugepaged: skip collapse if uffd-wp detected
Date: Fri, 26 Apr 2019 12:51:42 +0800
Message-Id: <20190426045151.19556-19-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

Don't collapse into a huge PMD if any of the small PTEs is userfault write-protected.  The problem is that the write protection is tracked at small-page granularity, and there is no way to preserve that per-page information once the small pages are merged into a huge PMD.  The same applies to swap entries and migration entries, so do the check for those as well, disregarding khugepaged_max_ptes_swap.
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 include/trace/events/huge_memory.h |  1 +
 mm/khugepaged.c                    | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index dd4db334bd63..2d7bad9cb976 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -13,6 +13,7 @@
 	EM( SCAN_PMD_NULL,		"pmd_null")			\
 	EM( SCAN_EXCEED_NONE_PTE,	"exceed_none_pte")		\
 	EM( SCAN_PTE_NON_PRESENT,	"pte_non_present")		\
+	EM( SCAN_PTE_UFFD_WP,		"pte_uffd_wp")			\
 	EM( SCAN_PAGE_RO,		"no_writable_page")		\
 	EM( SCAN_LACK_REFERENCED_PAGE,	"lack_referenced_page")	\
 	EM( SCAN_PAGE_NULL,		"page_null")			\
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 449044378782..6aa9935317d4 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -29,6 +29,7 @@ enum scan_result {
 	SCAN_PMD_NULL,
 	SCAN_EXCEED_NONE_PTE,
 	SCAN_PTE_NON_PRESENT,
+	SCAN_PTE_UFFD_WP,
 	SCAN_PAGE_RO,
 	SCAN_LACK_REFERENCED_PAGE,
 	SCAN_PAGE_NULL,
@@ -1124,6 +1125,15 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 		pte_t pteval = *_pte;
 		if (is_swap_pte(pteval)) {
 			if (++unmapped <= khugepaged_max_ptes_swap) {
+				/*
+				 * Always be strict with uffd-wp
+				 * enabled swap entries.  Please see
+				 * comment below for pte_uffd_wp().
+				 */
+				if (pte_swp_uffd_wp(pteval)) {
+					result = SCAN_PTE_UFFD_WP;
+					goto out_unmap;
+				}
 				continue;
 			} else {
 				result = SCAN_EXCEED_SWAP_PTE;
@@ -1143,6 +1153,19 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 			result = SCAN_PTE_NON_PRESENT;
 			goto out_unmap;
 		}
+		if (pte_uffd_wp(pteval)) {
+			/*
+			 * Don't collapse the page if any of the small
+			 * PTEs are armed with uffd write protection.
+			 * Here we can also mark the new huge pmd as
+			 * write protected if any of the small ones is
+			 * marked but that could bring unknown
+			 * userfault messages that fall outside of
+			 * the registered range.  So, just be simple.
+			 */
+			result = SCAN_PTE_UFFD_WP;
+			goto out_unmap;
+		}
 		if (pte_write(pteval))
 			writable = true;

From patchwork Fri Apr 26 04:51:43 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918079
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v4 19/27] userfaultfd: introduce helper vma_find_uffd
Date: Fri, 26 Apr 2019 12:51:43 +0800
Message-Id: <20190426045151.19556-20-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

We have multiple places (and more coming) that need to find a userfault-enabled VMA covering a specific memory range within an mm struct.  Introduce a helper for this and apply it to the existing code.

Suggested-by: Mike Rapoport
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 mm/userfaultfd.c | 54 +++++++++++++++++++++++++++---------------------
 1 file changed, 30 insertions(+), 24 deletions(-)

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 240de2a8492d..2606409572b2 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -20,6 +20,34 @@
 #include
 #include "internal.h"
 
+/*
+ * Find a valid userfault enabled VMA region that covers the whole
+ * address range, or NULL on failure.  Must be called with mmap_sem
+ * held.
+ */
+static struct vm_area_struct *vma_find_uffd(struct mm_struct *mm,
+					    unsigned long start,
+					    unsigned long len)
+{
+	struct vm_area_struct *vma = find_vma(mm, start);
+
+	if (!vma)
+		return NULL;
+
+	/*
+	 * Check the vma is registered in uffd, this is required to
+	 * enforce the VM_MAYWRITE check done at uffd registration
+	 * time.
+	 */
+	if (!vma->vm_userfaultfd_ctx.ctx)
+		return NULL;
+
+	if (start < vma->vm_start || start + len > vma->vm_end)
+		return NULL;
+
+	return vma;
+}
+
 static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 			    pmd_t *dst_pmd,
 			    struct vm_area_struct *dst_vma,
@@ -228,20 +256,9 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 	 */
 	if (!dst_vma) {
 		err = -ENOENT;
-		dst_vma = find_vma(dst_mm, dst_start);
+		dst_vma = vma_find_uffd(dst_mm, dst_start, len);
 		if (!dst_vma || !is_vm_hugetlb_page(dst_vma))
 			goto out_unlock;
-		/*
-		 * Check the vma is registered in uffd, this is
-		 * required to enforce the VM_MAYWRITE check done at
-		 * uffd registration time.
-		 */
-		if (!dst_vma->vm_userfaultfd_ctx.ctx)
-			goto out_unlock;
-
-		if (dst_start < dst_vma->vm_start ||
-		    dst_start + len > dst_vma->vm_end)
-			goto out_unlock;
 
 		err = -EINVAL;
 		if (vma_hpagesize != vma_kernel_pagesize(dst_vma))
@@ -488,20 +505,9 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
 	 * both valid and fully within a single existing vma.
 	 */
 	err = -ENOENT;
-	dst_vma = find_vma(dst_mm, dst_start);
+	dst_vma = vma_find_uffd(dst_mm, dst_start, len);
 	if (!dst_vma)
 		goto out_unlock;
-	/*
-	 * Check the vma is registered in uffd, this is required to
-	 * enforce the VM_MAYWRITE check done at uffd registration
-	 * time.
-	 */
-	if (!dst_vma->vm_userfaultfd_ctx.ctx)
-		goto out_unlock;
-
-	if (dst_start < dst_vma->vm_start ||
-	    dst_start + len > dst_vma->vm_end)
-		goto out_unlock;
 
 	err = -EINVAL;
 	/*

From patchwork Fri Apr 26 04:51:44 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918083
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner,
peterx@redhat.com, Martin Cracauer, Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A . Shutemov", "Dr . David Alan Gilbert", Rik van Riel
Subject: [PATCH v4 20/27] userfaultfd: wp: support write protection for userfault vma range
Date: Fri, 26 Apr 2019 12:51:44 +0800
Message-Id: <20190426045151.19556-21-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

From: Shaohua Li

Add an API to enable/disable write protection on a vma range. Unlike
mprotect, this doesn't split/merge vmas.

Cc: Andrea Arcangeli
Cc: Rik van Riel
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
[peterx:
 - use the helper to find VMA;
 - return -ENOENT if not found to match mcopy case;
 - use the new MM_CP_UFFD_WP* flags for change_protection
 - check against mmap_changing for failures]
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 include/linux/userfaultfd_k.h |  3 ++
 mm/userfaultfd.c              | 54 +++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 765ce884cec0..8f6e6ed544fb 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -39,6 +39,9 @@ extern ssize_t mfill_zeropage(struct mm_struct *dst_mm,
 			      unsigned long dst_start,
 			      unsigned long len,
 			      bool *mmap_changing);
+extern int mwriteprotect_range(struct mm_struct *dst_mm,
+			       unsigned long start, unsigned long len,
+			       bool enable_wp, bool *mmap_changing);
 
 /* mm helpers */
 static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 2606409572b2..70cea2ff3960 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -639,3 +639,57 @@ ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long start,
 {
 	return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing, 0);
 }
+
+int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
+			unsigned long len, bool enable_wp, bool *mmap_changing)
+{
+	struct vm_area_struct *dst_vma;
+	pgprot_t newprot;
+	int err;
+
+	/*
+	 * Sanitize the command parameters:
+	 */
+	BUG_ON(start & ~PAGE_MASK);
+	BUG_ON(len & ~PAGE_MASK);
+
+	/* Does the address range wrap, or is the span zero-sized? */
+	BUG_ON(start + len <= start);
+
+	down_read(&dst_mm->mmap_sem);
+
+	/*
+	 * If memory mappings are changing because of non-cooperative
+	 * operation (e.g. mremap) running in parallel, bail out and
+	 * request the user to retry later
+	 */
+	err = -EAGAIN;
+	if (mmap_changing && READ_ONCE(*mmap_changing))
+		goto out_unlock;
+
+	err = -ENOENT;
+	dst_vma = vma_find_uffd(dst_mm, start, len);
+	/*
+	 * Make sure the vma is not shared, that the dst range is
+	 * both valid and fully within a single existing vma.
+	 */
+	if (!dst_vma || (dst_vma->vm_flags & VM_SHARED))
+		goto out_unlock;
+	if (!userfaultfd_wp(dst_vma))
+		goto out_unlock;
+	if (!vma_is_anonymous(dst_vma))
+		goto out_unlock;
+
+	if (enable_wp)
+		newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE));
+	else
+		newprot = vm_get_page_prot(dst_vma->vm_flags);
+
+	change_protection(dst_vma, start, start + len, newprot,
+			  enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE);
+
+	err = 0;
+out_unlock:
+	up_read(&dst_mm->mmap_sem);
+	return err;
+}

From patchwork Fri Apr 26 04:51:45 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918089
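[Editorial aside: the parameter sanitization at the top of mwriteprotect_range() — page-aligned start and len, and rejection of zero-length or wrapping spans via `start + len <= start` — can be mirrored in a small user-space sketch. `wp_range_valid` is a hypothetical helper, and a 4 KiB page size is assumed for illustration:]

```c
#include <stdbool.h>
#include <stdint.h>

#define WP_PAGE_SIZE ((uint64_t)4096)            /* assumed page size */
#define WP_PAGE_MASK (~(WP_PAGE_SIZE - 1))

/* Mirror of the kernel-side checks: start and len must be page
 * aligned, and the span must be non-empty and must not wrap around
 * the end of the address space (start + len <= start). */
static bool wp_range_valid(uint64_t start, uint64_t len)
{
	if (start & ~WP_PAGE_MASK)	/* start not page aligned */
		return false;
	if (len & ~WP_PAGE_MASK)	/* len not page aligned */
		return false;
	if (start + len <= start)	/* empty or wrapping span */
		return false;
	return true;
}
```

Note that the unsigned wraparound makes the single `start + len <= start` comparison cover both the zero-length and the overflow case.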
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A . Shutemov", "Dr . David Alan Gilbert"
Subject: [PATCH v4 21/27] userfaultfd: wp: add the writeprotect API to userfaultfd ioctl
Date: Fri, 26 Apr 2019 12:51:45 +0800
Message-Id: <20190426045151.19556-22-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

From: Andrea Arcangeli

v1: From: Shaohua Li
v2: cleanups, remove a branch.

[peterx writes up the commit message, as below...]

This patch introduces the new uffd-wp APIs for userspace.

Firstly, we'll allow to do UFFDIO_REGISTER with write protection
tracking using the new UFFDIO_REGISTER_MODE_WP flag.  Note that this
flag can co-exist with the existing UFFDIO_REGISTER_MODE_MISSING, in
which case the userspace program can not only resolve missing page
faults but also track page data changes along the way.

Secondly, we introduce the new UFFDIO_WRITEPROTECT API to do page-level
write protection tracking.  Note that we will need to register the
memory region with UFFDIO_REGISTER_MODE_WP before that.
Signed-off-by: Andrea Arcangeli
[peterx: remove useless block, write commit message, check against
 VM_MAYWRITE rather than VM_WRITE when register]
Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c                 | 82 +++++++++++++++++++++++++-------
 include/uapi/linux/userfaultfd.h | 23 +++++++++
 2 files changed, 89 insertions(+), 16 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 3092885c9d2c..81962d62520c 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -304,8 +304,11 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	if (!pmd_present(_pmd))
 		goto out;
 
-	if (pmd_trans_huge(_pmd))
+	if (pmd_trans_huge(_pmd)) {
+		if (!pmd_write(_pmd) && (reason & VM_UFFD_WP))
+			ret = true;
 		goto out;
+	}
 
 	/*
 	 * the pmd is stable (as in !pmd_trans_unstable) so we can re-read it
@@ -318,6 +321,8 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	 */
 	if (pte_none(*pte))
 		ret = true;
+	if (!pte_write(*pte) && (reason & VM_UFFD_WP))
+		ret = true;
 	pte_unmap(pte);
 
 out:
@@ -1251,10 +1256,13 @@ static __always_inline int validate_range(struct mm_struct *mm,
 	return 0;
 }
 
-static inline bool vma_can_userfault(struct vm_area_struct *vma)
+static inline bool vma_can_userfault(struct vm_area_struct *vma,
+				     unsigned long vm_flags)
 {
-	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
-	    vma_is_shmem(vma);
+	/* FIXME: add WP support to hugetlbfs and shmem */
+	return vma_is_anonymous(vma) ||
+		((is_vm_hugetlb_page(vma) || vma_is_shmem(vma)) &&
+		 !(vm_flags & VM_UFFD_WP));
 }
 
 static int userfaultfd_register(struct userfaultfd_ctx *ctx,
@@ -1286,15 +1294,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	vm_flags = 0;
 	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING)
 		vm_flags |= VM_UFFD_MISSING;
-	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) {
+	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP)
 		vm_flags |= VM_UFFD_WP;
-		/*
-		 * FIXME: remove the below error constraint by
-		 * implementing the wprotect tracking mode.
-		 */
-		ret = -EINVAL;
-		goto out;
-	}
 
 	ret = validate_range(mm, uffdio_register.range.start,
 			     uffdio_register.range.len);
@@ -1342,7 +1343,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 
 		/* check not compatible vmas */
 		ret = -EINVAL;
-		if (!vma_can_userfault(cur))
+		if (!vma_can_userfault(cur, vm_flags))
 			goto out_unlock;
 
 		/*
@@ -1370,6 +1371,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 			if (end & (vma_hpagesize - 1))
 				goto out_unlock;
 		}
+		if ((vm_flags & VM_UFFD_WP) && !(cur->vm_flags & VM_MAYWRITE))
+			goto out_unlock;
 
 		/*
 		 * Check that this vma isn't already owned by a
@@ -1399,7 +1402,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	do {
 		cond_resched();
 
-		BUG_ON(!vma_can_userfault(vma));
+		BUG_ON(!vma_can_userfault(vma, vm_flags));
 		BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
 		       vma->vm_userfaultfd_ctx.ctx != ctx);
 		WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
@@ -1534,7 +1537,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 		 * provides for more strict behavior to notice
 		 * unregistration errors.
 		 */
-		if (!vma_can_userfault(cur))
+		if (!vma_can_userfault(cur, cur->vm_flags))
 			goto out_unlock;
 
 		found = true;
@@ -1548,7 +1551,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 	do {
 		cond_resched();
 
-		BUG_ON(!vma_can_userfault(vma));
+		BUG_ON(!vma_can_userfault(vma, vma->vm_flags));
 
 		/*
 		 * Nothing to do: this vma is already registered into this
@@ -1761,6 +1764,50 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx,
 	return ret;
 }
 
+static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
+				    unsigned long arg)
+{
+	int ret;
+	struct uffdio_writeprotect uffdio_wp;
+	struct uffdio_writeprotect __user *user_uffdio_wp;
+	struct userfaultfd_wake_range range;
+
+	if (READ_ONCE(ctx->mmap_changing))
+		return -EAGAIN;
+
+	user_uffdio_wp = (struct uffdio_writeprotect __user *) arg;
+
+	if (copy_from_user(&uffdio_wp, user_uffdio_wp,
+			   sizeof(struct uffdio_writeprotect)))
+		return -EFAULT;
+
+	ret = validate_range(ctx->mm, uffdio_wp.range.start,
+			     uffdio_wp.range.len);
+	if (ret)
+		return ret;
+
+	if (uffdio_wp.mode & ~(UFFDIO_WRITEPROTECT_MODE_DONTWAKE |
+			       UFFDIO_WRITEPROTECT_MODE_WP))
+		return -EINVAL;
+	if ((uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP) &&
+	     (uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE))
+		return -EINVAL;
+
+	ret = mwriteprotect_range(ctx->mm, uffdio_wp.range.start,
+				  uffdio_wp.range.len, uffdio_wp.mode &
+				  UFFDIO_WRITEPROTECT_MODE_WP,
+				  &ctx->mmap_changing);
+	if (ret)
+		return ret;
+
+	if (!(uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) {
+		range.start = uffdio_wp.range.start;
+		range.len = uffdio_wp.range.len;
+		wake_userfault(ctx, &range);
+	}
+	return ret;
+}
+
 static inline unsigned int uffd_ctx_features(__u64 user_features)
 {
 	/*
@@ -1838,6 +1885,9 @@ static long userfaultfd_ioctl(struct file *file, unsigned cmd,
 	case UFFDIO_ZEROPAGE:
 		ret = userfaultfd_zeropage(ctx, arg);
 		break;
+	case UFFDIO_WRITEPROTECT:
+		ret = userfaultfd_writeprotect(ctx, arg);
+		break;
 	}
 	return ret;
 }
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 340f23bc251d..95c4a160e5f8 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -52,6 +52,7 @@
 #define _UFFDIO_WAKE			(0x02)
 #define _UFFDIO_COPY			(0x03)
 #define _UFFDIO_ZEROPAGE		(0x04)
+#define _UFFDIO_WRITEPROTECT		(0x06)
 #define _UFFDIO_API			(0x3F)
 
 /* userfaultfd ioctl ids */
@@ -68,6 +69,8 @@
 				      struct uffdio_copy)
 #define UFFDIO_ZEROPAGE		_IOWR(UFFDIO, _UFFDIO_ZEROPAGE,	\
 				      struct uffdio_zeropage)
+#define UFFDIO_WRITEPROTECT	_IOWR(UFFDIO, _UFFDIO_WRITEPROTECT, \
+				      struct uffdio_writeprotect)
 
 /* read() structure */
 struct uffd_msg {
@@ -232,4 +235,24 @@ struct uffdio_zeropage {
 	__s64 zeropage;
 };
 
+struct uffdio_writeprotect {
+	struct uffdio_range range;
+/*
+ * UFFDIO_WRITEPROTECT_MODE_WP: set the flag to write protect a range,
+ * unset the flag to undo protection of a range which was previously
+ * write protected.
+ *
+ * UFFDIO_WRITEPROTECT_MODE_DONTWAKE: set the flag to avoid waking up
+ * any wait thread after the operation succeeds.
+ *
+ * NOTE: Write protecting a region (WP=1) is unrelated to page faults,
+ * therefore DONTWAKE flag is meaningless with WP=1.  Removing write
+ * protection (WP=0) in response to a page fault wakes the faulting
+ * task unless DONTWAKE is set.
+ */
+#define UFFDIO_WRITEPROTECT_MODE_WP		((__u64)1<<0)
+#define UFFDIO_WRITEPROTECT_MODE_DONTWAKE	((__u64)1<<1)
+	__u64 mode;
+};
+
 #endif /* _LINUX_USERFAULTFD_H */

From patchwork Fri Apr 26 04:51:46 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918091
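[Editorial aside: the mode rules that userfaultfd_writeprotect() enforces — any bit outside WP|DONTWAKE is rejected, and DONTWAKE is only meaningful when removing protection (WP=0), so WP together with DONTWAKE also fails — can be mirrored in a small user-space sketch. `wp_mode_check` is a hypothetical helper, and `-22` stands in for `-EINVAL`:]

```c
#include <stdint.h>

#define UFFDIO_WRITEPROTECT_MODE_WP		((uint64_t)1 << 0)
#define UFFDIO_WRITEPROTECT_MODE_DONTWAKE	((uint64_t)1 << 1)

/* Mirror of the mode validation in userfaultfd_writeprotect():
 * returns 0 on an acceptable mode, -22 (-EINVAL) otherwise. */
static int wp_mode_check(uint64_t mode)
{
	/* reject any bits the kernel does not define */
	if (mode & ~(UFFDIO_WRITEPROTECT_MODE_DONTWAKE |
		     UFFDIO_WRITEPROTECT_MODE_WP))
		return -22;
	/* DONTWAKE only makes sense when resolving (WP=0) */
	if ((mode & UFFDIO_WRITEPROTECT_MODE_WP) &&
	    (mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE))
		return -22;
	return 0;
}
```

This matches the NOTE in the uapi comment: protecting a range never wakes anyone, so DONTWAKE has nothing to suppress when WP=1.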
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A . Shutemov", "Dr . David Alan Gilbert", Pavel Emelyanov, Rik van Riel
Subject: [PATCH v4 22/27] userfaultfd: wp: enabled write protection in userfaultfd API
Date: Fri, 26 Apr 2019 12:51:46 +0800
Message-Id: <20190426045151.19556-23-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

From: Shaohua Li

Now it's safe to enable write protection in the userfaultfd API.

Cc: Andrea Arcangeli
Cc: Pavel Emelyanov
Cc: Rik van Riel
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 include/uapi/linux/userfaultfd.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 95c4a160e5f8..e7e98bde221f 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -19,7 +19,8 @@
  * means the userland is reading).
  */
 #define UFFD_API ((__u64)0xAA)
-#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK |		\
+#define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP |	\
+			   UFFD_FEATURE_EVENT_FORK |		\
			   UFFD_FEATURE_EVENT_REMAP |		\
			   UFFD_FEATURE_EVENT_REMOVE |		\
			   UFFD_FEATURE_EVENT_UNMAP |		\
@@ -34,7 +35,8 @@
 #define UFFD_API_RANGE_IOCTLS			\
	((__u64)1 << _UFFDIO_WAKE |		\
	 (__u64)1 << _UFFDIO_COPY |		\
-	 (__u64)1 << _UFFDIO_ZEROPAGE)
+	 (__u64)1 << _UFFDIO_ZEROPAGE |		\
+	 (__u64)1 << _UFFDIO_WRITEPROTECT)
 #define UFFD_API_RANGE_IOCTLS_BASIC		\
	((__u64)1 << _UFFDIO_WAKE |		\
	 (__u64)1 << _UFFDIO_COPY)

From patchwork Fri Apr 26 04:51:47 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918093
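[Editorial aside: with UFFD_API_RANGE_IOCTLS expanded, the per-range ioctl bitmask that UFFDIO_REGISTER reports back now advertises _UFFDIO_WRITEPROTECT alongside WAKE/COPY/ZEROPAGE, and userspace should test that bit before issuing UFFDIO_WRITEPROTECT. A minimal sketch of that check — `range_supports` is a hypothetical helper using the ioctl ids from the uapi header:]

```c
#include <stdbool.h>
#include <stdint.h>

/* ioctl ids from include/uapi/linux/userfaultfd.h */
#define _UFFDIO_WAKE		(0x02)
#define _UFFDIO_COPY		(0x03)
#define _UFFDIO_ZEROPAGE	(0x04)
#define _UFFDIO_WRITEPROTECT	(0x06)

/* After UFFDIO_REGISTER, the kernel fills uffdio_register.ioctls with
 * a bitmask: bit N set means ioctl id N is usable on the range. */
static bool range_supports(uint64_t ioctls, int uffdio_id)
{
	return ioctls & ((uint64_t)1 << uffdio_id);
}

/* The expanded UFFD_API_RANGE_IOCTLS set from this patch: */
static const uint64_t uffd_api_range_ioctls =
	(uint64_t)1 << _UFFDIO_WAKE |
	(uint64_t)1 << _UFFDIO_COPY |
	(uint64_t)1 << _UFFDIO_ZEROPAGE |
	(uint64_t)1 << _UFFDIO_WRITEPROTECT;
```

On kernels without uffd-wp (or on hugetlbfs/shmem ranges, per the FIXME in vma_can_userfault), the writeprotect bit would simply be absent from the returned mask.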
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 23/27] userfaultfd: wp: don't wake up when doing write protect
Date: Fri, 26 Apr 2019 12:51:47 +0800
Message-Id: <20190426045151.19556-24-peterx@redhat.com>

It does not make sense to try to wake up any waiting thread when we're
write-protecting a memory region.  Only wake up when resolving a
write-protected page fault.

Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 81962d62520c..f1f61a0278c2 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1771,6 +1771,7 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
 	struct uffdio_writeprotect uffdio_wp;
 	struct uffdio_writeprotect __user *user_uffdio_wp;
 	struct userfaultfd_wake_range range;
+	bool mode_wp, mode_dontwake;

 	if (READ_ONCE(ctx->mmap_changing))
 		return -EAGAIN;
@@ -1789,18 +1790,20 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
 	if (uffdio_wp.mode & ~(UFFDIO_WRITEPROTECT_MODE_DONTWAKE |
 			       UFFDIO_WRITEPROTECT_MODE_WP))
 		return -EINVAL;
-	if ((uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP) &&
-	    (uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE))
+
+	mode_wp = uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP;
+	mode_dontwake = uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE;
+
+	if (mode_wp && mode_dontwake)
 		return -EINVAL;

 	ret = mwriteprotect_range(ctx->mm, uffdio_wp.range.start,
-				  uffdio_wp.range.len, uffdio_wp.mode &
-				  UFFDIO_WRITEPROTECT_MODE_WP,
+				  uffdio_wp.range.len, mode_wp,
 				  &ctx->mmap_changing);
 	if (ret)
 		return ret;

-	if (!(uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) {
+	if (!mode_wp && !mode_dontwake) {
 		range.start = uffdio_wp.range.start;
 		range.len = uffdio_wp.range.len;
 		wake_userfault(ctx, &range);
From patchwork Fri Apr 26 04:51:48 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918095
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 24/27] userfaultfd: wp: UFFDIO_REGISTER_MODE_WP documentation update
Date: Fri, 26 Apr 2019 12:51:48 +0800
Message-Id: <20190426045151.19556-25-peterx@redhat.com>

From: Martin Cracauer

Add documentation about the write protection support.

Signed-off-by: Martin Cracauer
Signed-off-by: Andrea Arcangeli
[peterx: rewrite in rst format; fixups here and there]
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 Documentation/admin-guide/mm/userfaultfd.rst | 51 ++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index 5048cf661a8a..c30176e67900 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -108,6 +108,57 @@ UFFDIO_COPY. They're atomic as in guaranteeing that nothing can see
 an half copied page since it'll keep userfaulting until the copy has
 finished.
+Notes:
+
+- If you requested UFFDIO_REGISTER_MODE_MISSING when registering then
+  you must provide some kind of page in your thread after reading from
+  the uffd.  You must provide either UFFDIO_COPY or UFFDIO_ZEROPAGE.
+  The normal behavior of the OS automatically providing a zero page on
+  an anonymous mapping is not in place.
+
+- None of the page-delivering ioctls default to the range that you
+  registered with.  You must fill in all fields for the appropriate
+  ioctl struct including the range.
+
+- You get the address of the access that triggered the missing page
+  event out of a struct uffd_msg that you read in the thread from the
+  uffd.  You can supply as many pages as you want with UFFDIO_COPY or
+  UFFDIO_ZEROPAGE.  Keep in mind that unless you used DONTWAKE then
+  the first of any of those IOCTLs wakes up the faulting thread.
+
+- Be sure to test for all errors including (pollfd[0].revents &
+  POLLERR).  This can happen, e.g. when ranges supplied were
+  incorrect.
+
+Write Protect Notifications
+---------------------------
+
+This is equivalent to (but faster than) using mprotect and a SIGSEGV
+signal handler.
+
+First you need to register a range with UFFDIO_REGISTER_MODE_WP.
+Instead of using mprotect(2) you use ioctl(uffd, UFFDIO_WRITEPROTECT,
+struct *uffdio_writeprotect) with mode = UFFDIO_WRITEPROTECT_MODE_WP
+in the struct passed in.  The range does not default to and does not
+have to be identical to the range you registered with.  You can
+write-protect as many ranges as you like (inside the registered
+range).  Then, in the thread reading from uffd, the struct will have
+msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP set.  Now you send
+ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect) again
+while pagefault.mode does not have UFFDIO_WRITEPROTECT_MODE_WP set.
+This wakes up the thread, which will continue to run with writes
+allowed.  This lets you do the bookkeeping about the write in the
+uffd-reading thread before the ioctl.
+
+If you registered with both UFFDIO_REGISTER_MODE_MISSING and
+UFFDIO_REGISTER_MODE_WP then you need to think about the sequence in
+which you supply a page and undo write protect.  Note that there is a
+difference between writes into a WP area and into a !WP area.  The
+former will have UFFD_PAGEFAULT_FLAG_WP set, the latter
+UFFD_PAGEFAULT_FLAG_WRITE.  The latter did not fail on protection, but
+you still need to supply a page when UFFDIO_REGISTER_MODE_MISSING was
+used.
+
 QEMU/KVM
 ========

From patchwork Fri Apr 26 04:51:49 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918097
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 25/27] userfaultfd: wp: declare _UFFDIO_WRITEPROTECT conditionally
Date: Fri, 26 Apr 2019 12:51:49 +0800
Message-Id: <20190426045151.19556-26-peterx@redhat.com>

Only declare _UFFDIO_WRITEPROTECT if the user specified
UFFDIO_REGISTER_MODE_WP and if all the checks passed.  Then when the
user registers regions with shmem/hugetlbfs we won't expose the new
ioctl to them.  Even with a completely anonymous memory range, we'll
only expose the new WP ioctl bit if the register mode has MODE_WP.

Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index f1f61a0278c2..7f87e9e4fb9b 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1456,14 +1456,24 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	up_write(&mm->mmap_sem);
 	mmput(mm);
 	if (!ret) {
+		__u64 ioctls_out;
+
+		ioctls_out = basic_ioctls ? UFFD_API_RANGE_IOCTLS_BASIC :
+		    UFFD_API_RANGE_IOCTLS;
+
+		/*
+		 * Declare the WP ioctl only if the WP mode is
+		 * specified and all checks passed with the range
+		 */
+		if (!(uffdio_register.mode & UFFDIO_REGISTER_MODE_WP))
+			ioctls_out &= ~((__u64)1 << _UFFDIO_WRITEPROTECT);
+
 		/*
 		 * Now that we scanned all vmas we can already tell
 		 * userland which ioctls methods are guaranteed to
 		 * succeed on this range.
 		 */
-		if (put_user(basic_ioctls ? UFFD_API_RANGE_IOCTLS_BASIC :
-			     UFFD_API_RANGE_IOCTLS,
-			     &user_uffdio_register->ioctls))
+		if (put_user(ioctls_out, &user_uffdio_register->ioctls))
 			ret = -EFAULT;
 	}
 out:
From patchwork Fri Apr 26 04:51:50 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918099
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 26/27] userfaultfd: selftests: refactor statistics
Date: Fri, 26 Apr 2019 12:51:50 +0800
Message-Id: <20190426045151.19556-27-peterx@redhat.com>

Introduce a uffd_stats structure for the statistics of the self test,
and at the same time refactor the code to always pass a uffd_stats
into both the read() and poll() typed fault-handling threads, instead
of using two different ways to return the statistic results.  No
functional change.

With the new structure, it's very easy to introduce new statistics.
Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- tools/testing/selftests/vm/userfaultfd.c | 76 +++++++++++++++--------- 1 file changed, 49 insertions(+), 27 deletions(-) diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c index 5d1db824f73a..e5d12c209e09 100644 --- a/tools/testing/selftests/vm/userfaultfd.c +++ b/tools/testing/selftests/vm/userfaultfd.c @@ -88,6 +88,12 @@ static char *area_src, *area_src_alias, *area_dst, *area_dst_alias; static char *zeropage; pthread_attr_t attr; +/* Userfaultfd test statistics */ +struct uffd_stats { + int cpu; + unsigned long missing_faults; +}; + /* pthread_mutex_t starts at page offset 0 */ #define area_mutex(___area, ___nr) \ ((pthread_mutex_t *) ((___area) + (___nr)*page_size)) @@ -127,6 +133,17 @@ static void usage(void) exit(1); } +static void uffd_stats_reset(struct uffd_stats *uffd_stats, + unsigned long n_cpus) +{ + int i; + + for (i = 0; i < n_cpus; i++) { + uffd_stats[i].cpu = i; + uffd_stats[i].missing_faults = 0; + } +} + static int anon_release_pages(char *rel_area) { int ret = 0; @@ -469,8 +486,8 @@ static int uffd_read_msg(int ufd, struct uffd_msg *msg) return 0; } -/* Return 1 if page fault handled by us; otherwise 0 */ -static int uffd_handle_page_fault(struct uffd_msg *msg) +static void uffd_handle_page_fault(struct uffd_msg *msg, + struct uffd_stats *stats) { unsigned long offset; @@ -485,18 +502,19 @@ static int uffd_handle_page_fault(struct uffd_msg *msg) offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst; offset &= ~(page_size-1); - return copy_page(uffd, offset); + if (copy_page(uffd, offset)) + stats->missing_faults++; } static void *uffd_poll_thread(void *arg) { - unsigned long cpu = (unsigned long) arg; + struct uffd_stats *stats = (struct uffd_stats *)arg; + unsigned long cpu = stats->cpu; struct pollfd pollfd[2]; struct uffd_msg msg; struct uffdio_register uffd_reg; int ret; char tmp_chr; - unsigned long userfaults = 0; 
pollfd[0].fd = uffd; pollfd[0].events = POLLIN; @@ -526,7 +544,7 @@ static void *uffd_poll_thread(void *arg) msg.event), exit(1); break; case UFFD_EVENT_PAGEFAULT: - userfaults += uffd_handle_page_fault(&msg); + uffd_handle_page_fault(&msg, stats); break; case UFFD_EVENT_FORK: close(uffd); @@ -545,28 +563,27 @@ static void *uffd_poll_thread(void *arg) break; } } - return (void *)userfaults; + + return NULL; } pthread_mutex_t uffd_read_mutex = PTHREAD_MUTEX_INITIALIZER; static void *uffd_read_thread(void *arg) { - unsigned long *this_cpu_userfaults; + struct uffd_stats *stats = (struct uffd_stats *)arg; struct uffd_msg msg; - this_cpu_userfaults = (unsigned long *) arg; - *this_cpu_userfaults = 0; - pthread_mutex_unlock(&uffd_read_mutex); /* from here cancellation is ok */ for (;;) { if (uffd_read_msg(uffd, &msg)) continue; - (*this_cpu_userfaults) += uffd_handle_page_fault(&msg); + uffd_handle_page_fault(&msg, stats); } - return (void *)NULL; + + return NULL; } static void *background_thread(void *arg) @@ -582,13 +599,12 @@ static void *background_thread(void *arg) return NULL; } -static int stress(unsigned long *userfaults) +static int stress(struct uffd_stats *uffd_stats) { unsigned long cpu; pthread_t locking_threads[nr_cpus]; pthread_t uffd_threads[nr_cpus]; pthread_t background_threads[nr_cpus]; - void **_userfaults = (void **) userfaults; finished = 0; for (cpu = 0; cpu < nr_cpus; cpu++) { @@ -597,12 +613,13 @@ static int stress(unsigned long *userfaults) return 1; if (bounces & BOUNCE_POLL) { if (pthread_create(&uffd_threads[cpu], &attr, - uffd_poll_thread, (void *)cpu)) + uffd_poll_thread, + (void *)&uffd_stats[cpu])) return 1; } else { if (pthread_create(&uffd_threads[cpu], &attr, uffd_read_thread, - &_userfaults[cpu])) + (void *)&uffd_stats[cpu])) return 1; pthread_mutex_lock(&uffd_read_mutex); } @@ -639,7 +656,8 @@ static int stress(unsigned long *userfaults) fprintf(stderr, "pipefd write error\n"); return 1; } - if (pthread_join(uffd_threads[cpu], 
&_userfaults[cpu])) + if (pthread_join(uffd_threads[cpu], + (void *)&uffd_stats[cpu])) return 1; } else { if (pthread_cancel(uffd_threads[cpu])) @@ -910,11 +928,11 @@ static int userfaultfd_events_test(void) { struct uffdio_register uffdio_register; unsigned long expected_ioctls; - unsigned long userfaults; pthread_t uffd_mon; int err, features; pid_t pid; char c; + struct uffd_stats stats = { 0 }; printf("testing events (fork, remap, remove): "); fflush(stdout); @@ -941,7 +959,7 @@ static int userfaultfd_events_test(void) "unexpected missing ioctl for anon memory\n"), exit(1); - if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL)) + if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats)) perror("uffd_poll_thread create"), exit(1); pid = fork(); @@ -957,13 +975,13 @@ static int userfaultfd_events_test(void) if (write(pipefd[1], &c, sizeof(c)) != sizeof(c)) perror("pipe write"), exit(1); - if (pthread_join(uffd_mon, (void **)&userfaults)) + if (pthread_join(uffd_mon, NULL)) return 1; close(uffd); - printf("userfaults: %ld\n", userfaults); + printf("userfaults: %ld\n", stats.missing_faults); - return userfaults != nr_pages; + return stats.missing_faults != nr_pages; } static int userfaultfd_sig_test(void) @@ -975,6 +993,7 @@ static int userfaultfd_sig_test(void) int err, features; pid_t pid; char c; + struct uffd_stats stats = { 0 }; printf("testing signal delivery: "); fflush(stdout); @@ -1006,7 +1025,7 @@ static int userfaultfd_sig_test(void) if (uffd_test_ops->release_pages(area_dst)) return 1; - if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL)) + if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats)) perror("uffd_poll_thread create"), exit(1); pid = fork(); @@ -1032,6 +1051,7 @@ static int userfaultfd_sig_test(void) close(uffd); return userfaults != 0; } + static int userfaultfd_stress(void) { void *area; @@ -1040,7 +1060,7 @@ static int userfaultfd_stress(void) struct uffdio_register uffdio_register; unsigned long cpu; int 
err; - unsigned long userfaults[nr_cpus]; + struct uffd_stats uffd_stats[nr_cpus]; uffd_test_ops->allocate_area((void **)&area_src); if (!area_src) return 1; @@ -1169,8 +1189,10 @@ static int userfaultfd_stress(void) if (uffd_test_ops->release_pages(area_dst)) return 1; + uffd_stats_reset(uffd_stats, nr_cpus); + /* bounce pass */ - if (stress(userfaults)) + if (stress(uffd_stats)) return 1; /* unregister */ @@ -1213,7 +1235,7 @@ static int userfaultfd_stress(void) printf("userfaults:"); for (cpu = 0; cpu < nr_cpus; cpu++) - printf(" %lu", userfaults[cpu]); + printf(" %lu", uffd_stats[cpu].missing_faults); printf("\n"); }

From patchwork Fri Apr 26 04:51:51 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10918101
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Denis Plotnikov, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr.
David Alan Gilbert"
Subject: [PATCH v4 27/27] userfaultfd: selftests: add write-protect test
Date: Fri, 26 Apr 2019 12:51:51 +0800
Message-Id: <20190426045151.19556-28-peterx@redhat.com>
In-Reply-To: <20190426045151.19556-1-peterx@redhat.com>
References: <20190426045151.19556-1-peterx@redhat.com>

This patch adds uffd tests for write protection. Instead of introducing new tests for it, let's simply squash the uffd-wp tests into the existing uffd-missing test cases. The changes are:

(1) Bouncing tests

We apply the write protection in two ways during the bouncing test:

- By using UFFDIO_COPY_MODE_WP when resolving MISSING pages: this makes sure that, for each bounce, every single page will fault at least twice: once for MISSING and once for WP.

- By calling UFFDIO_WRITEPROTECT directly on already-faulted memory: to further torture the explicit page protection procedures of uffd-wp, we split each bounce procedure into two halves (in the background thread). The first half is MISSING+WP for each page, as explained above. After the first half, we write-protect the faulted region in the background thread so that at least half of the pages will be write-protected again; this is how the first half exercises the new UFFDIO_WRITEPROTECT call. Then we continue with the second half, which will see both MISSING and WP faults for its own pages, plus WP-only faults from the first half.

(2) Event/Signal test

Mostly the previous tests, but now doing MISSING+WP for each page. The sigbus-mode test needs a standalone path to handle the write protection faults.
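The fault-handling dispatch described above boils down to two small pieces that can be sketched in isolation: classify each fault message by its WP flag (WP faults get their protection dropped and bump `wp_faults`; everything else is treated as MISSING, resolved with a copy, and bumps `missing_faults`), and round the faulting address down to a page boundary. The sketch below mirrors the patch's `uffd_stats` layout, but the flag values are illustrative stand-ins for the uapi constants, and `handle_fault` is a hypothetical helper, not a function from the patch; the real ioctl calls are elided since they need a live registered userfaultfd region.

```c
/* Stand-ins for the uapi flags in <linux/userfaultfd.h>; only the
 * bit-test logic matters here, not the actual values. */
#define UFFD_PAGEFAULT_FLAG_WRITE (1UL << 0)
#define UFFD_PAGEFAULT_FLAG_WP    (1UL << 1)

/* Mirrors the per-cpu statistics structure added by this series. */
struct uffd_stats {
	int cpu;
	unsigned long missing_faults;
	unsigned long wp_faults;
};

/* Classify one fault and account it, returning the page-aligned offset
 * into the destination area.  In the real test the WP branch calls
 * wp_range(uffd, address, page_size, false) to drop the protection, and
 * the MISSING branch calls copy_page(uffd, offset). */
static unsigned long handle_fault(unsigned long address, unsigned long flags,
				  unsigned long area_dst,
				  unsigned long page_size,
				  struct uffd_stats *stats)
{
	unsigned long offset = (address - area_dst) & ~(page_size - 1);

	if (flags & UFFD_PAGEFAULT_FLAG_WP)
		stats->wp_faults++;
	else
		stats->missing_faults++;
	return offset;
}
```

For instance, a write fault at `area_dst + 3*page_size + 5` with the WP flag set resolves to offset `3*page_size` and bumps `wp_faults` only.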
For all tests, do statistics as well for uffd-wp pages. Signed-off-by: Peter Xu --- tools/testing/selftests/vm/userfaultfd.c | 157 +++++++++++++++++++---- 1 file changed, 133 insertions(+), 24 deletions(-) diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c index e5d12c209e09..bf1e10db72f5 100644 --- a/tools/testing/selftests/vm/userfaultfd.c +++ b/tools/testing/selftests/vm/userfaultfd.c @@ -56,6 +56,7 @@ #include #include #include +#include #include "../kselftest.h" @@ -78,6 +79,8 @@ static int test_type; #define ALARM_INTERVAL_SECS 10 static volatile bool test_uffdio_copy_eexist = true; static volatile bool test_uffdio_zeropage_eexist = true; +/* Whether to test uffd write-protection */ +static bool test_uffdio_wp = false; static bool map_shared; static int huge_fd; @@ -92,6 +95,7 @@ pthread_attr_t attr; struct uffd_stats { int cpu; unsigned long missing_faults; + unsigned long wp_faults; }; /* pthread_mutex_t starts at page offset 0 */ @@ -141,9 +145,29 @@ static void uffd_stats_reset(struct uffd_stats *uffd_stats, for (i = 0; i < n_cpus; i++) { uffd_stats[i].cpu = i; uffd_stats[i].missing_faults = 0; + uffd_stats[i].wp_faults = 0; } } +static void uffd_stats_report(struct uffd_stats *stats, int n_cpus) +{ + int i; + unsigned long long miss_total = 0, wp_total = 0; + + for (i = 0; i < n_cpus; i++) { + miss_total += stats[i].missing_faults; + wp_total += stats[i].wp_faults; + } + + printf("userfaults: %llu missing (", miss_total); + for (i = 0; i < n_cpus; i++) + printf("%lu+", stats[i].missing_faults); + printf("\b), %llu wp (", wp_total); + for (i = 0; i < n_cpus; i++) + printf("%lu+", stats[i].wp_faults); + printf("\b)\n"); +} + static int anon_release_pages(char *rel_area) { int ret = 0; @@ -264,10 +288,15 @@ struct uffd_test_ops { void (*alias_mapping)(__u64 *start, size_t len, unsigned long offset); }; -#define ANON_EXPECTED_IOCTLS ((1 << _UFFDIO_WAKE) | \ +#define SHMEM_EXPECTED_IOCTLS ((1 << _UFFDIO_WAKE) | \ 
(1 << _UFFDIO_COPY) | \ (1 << _UFFDIO_ZEROPAGE)) +#define ANON_EXPECTED_IOCTLS ((1 << _UFFDIO_WAKE) | \ + (1 << _UFFDIO_COPY) | \ + (1 << _UFFDIO_ZEROPAGE) | \ + (1 << _UFFDIO_WRITEPROTECT)) + static struct uffd_test_ops anon_uffd_test_ops = { .expected_ioctls = ANON_EXPECTED_IOCTLS, .allocate_area = anon_allocate_area, @@ -276,7 +305,7 @@ static struct uffd_test_ops anon_uffd_test_ops = { }; static struct uffd_test_ops shmem_uffd_test_ops = { - .expected_ioctls = ANON_EXPECTED_IOCTLS, + .expected_ioctls = SHMEM_EXPECTED_IOCTLS, .allocate_area = shmem_allocate_area, .release_pages = shmem_release_pages, .alias_mapping = noop_alias_mapping, @@ -300,6 +329,21 @@ static int my_bcmp(char *str1, char *str2, size_t n) return 0; } +static void wp_range(int ufd, __u64 start, __u64 len, bool wp) +{ + struct uffdio_writeprotect prms = { 0 }; + + /* Write protection page faults */ + prms.range.start = start; + prms.range.len = len; + /* Undo write-protect, do wakeup after that */ + prms.mode = wp ? 
UFFDIO_WRITEPROTECT_MODE_WP : 0; + + if (ioctl(ufd, UFFDIO_WRITEPROTECT, &prms)) + fprintf(stderr, "clear WP failed for address 0x%Lx\n", + start), exit(1); +} + static void *locking_thread(void *arg) { unsigned long cpu = (unsigned long) arg; @@ -438,7 +482,10 @@ static int __copy_page(int ufd, unsigned long offset, bool retry) uffdio_copy.dst = (unsigned long) area_dst + offset; uffdio_copy.src = (unsigned long) area_src + offset; uffdio_copy.len = page_size; - uffdio_copy.mode = 0; + if (test_uffdio_wp) + uffdio_copy.mode = UFFDIO_COPY_MODE_WP; + else + uffdio_copy.mode = 0; uffdio_copy.copy = 0; if (ioctl(ufd, UFFDIO_COPY, &uffdio_copy)) { /* real retval in ufdio_copy.copy */ @@ -495,15 +542,21 @@ static void uffd_handle_page_fault(struct uffd_msg *msg, fprintf(stderr, "unexpected msg event %u\n", msg->event), exit(1); - if (bounces & BOUNCE_VERIFY && - msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE) - fprintf(stderr, "unexpected write fault\n"), exit(1); + if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP) { + wp_range(uffd, msg->arg.pagefault.address, page_size, false); + stats->wp_faults++; + } else { + /* Missing page faults */ + if (bounces & BOUNCE_VERIFY && + msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE) + fprintf(stderr, "unexpected write fault\n"), exit(1); - offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst; - offset &= ~(page_size-1); + offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst; + offset &= ~(page_size-1); - if (copy_page(uffd, offset)) - stats->missing_faults++; + if (copy_page(uffd, offset)) + stats->missing_faults++; + } } static void *uffd_poll_thread(void *arg) @@ -589,11 +642,30 @@ static void *uffd_read_thread(void *arg) static void *background_thread(void *arg) { unsigned long cpu = (unsigned long) arg; - unsigned long page_nr; + unsigned long page_nr, start_nr, mid_nr, end_nr; + + start_nr = cpu * nr_pages_per_cpu; + end_nr = (cpu+1) * nr_pages_per_cpu; + mid_nr = (start_nr + 
end_nr) / 2; + + /* Copy the first half of the pages */ + for (page_nr = start_nr; page_nr < mid_nr; page_nr++) + copy_page_retry(uffd, page_nr * page_size); - for (page_nr = cpu * nr_pages_per_cpu; - page_nr < (cpu+1) * nr_pages_per_cpu; - page_nr++) + /* + * If we need to test uffd-wp, set it up now. Then we'll have + * at least the first half of the pages mapped already which + * can be write-protected for testing + */ + if (test_uffdio_wp) + wp_range(uffd, (unsigned long)area_dst + start_nr * page_size, + nr_pages_per_cpu * page_size, true); + + /* + * Continue the 2nd half of the page copying, handling write + * protection faults if any + */ + for (page_nr = mid_nr; page_nr < end_nr; page_nr++) copy_page_retry(uffd, page_nr * page_size); return NULL; @@ -755,17 +827,31 @@ static int faulting_process(int signal_test) } for (nr = 0; nr < split_nr_pages; nr++) { + int steps = 1; + unsigned long offset = nr * page_size; + if (signal_test) { if (sigsetjmp(*sigbuf, 1) != 0) { - if (nr == lastnr) { + if (steps == 1 && nr == lastnr) { fprintf(stderr, "Signal repeated\n"); return 1; } lastnr = nr; if (signal_test == 1) { - if (copy_page(uffd, nr * page_size)) - signalled++; + if (steps == 1) { + /* This is a MISSING request */ + steps++; + if (copy_page(uffd, offset)) + signalled++; + } else { + /* This is a WP request */ + assert(steps == 2); + wp_range(uffd, + (__u64)area_dst + + offset, + page_size, false); + } } else { signalled++; continue; @@ -778,8 +864,13 @@ static int faulting_process(int signal_test) fprintf(stderr, "nr %lu memory corruption %Lu %Lu\n", nr, count, - count_verify[nr]), exit(1); - } + count_verify[nr]); + } + /* + * Trigger write protection if there is by writing + * the same value back. + */ + *area_count(area_dst, nr) = count; } if (signal_test) @@ -801,6 +892,11 @@ static int faulting_process(int signal_test) nr, count, count_verify[nr]), exit(1); } + /* + * Trigger write protection if there is by writing + * the same value back. 
+ */ + *area_count(area_dst, nr) = count; } if (uffd_test_ops->release_pages(area_dst)) @@ -904,6 +1000,8 @@ static int userfaultfd_zeropage_test(void) uffdio_register.range.start = (unsigned long) area_dst; uffdio_register.range.len = nr_pages * page_size; uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING; + if (test_uffdio_wp) + uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP; if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) fprintf(stderr, "register failure\n"), exit(1); @@ -949,6 +1047,8 @@ static int userfaultfd_events_test(void) uffdio_register.range.start = (unsigned long) area_dst; uffdio_register.range.len = nr_pages * page_size; uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING; + if (test_uffdio_wp) + uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP; if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) fprintf(stderr, "register failure\n"), exit(1); @@ -979,7 +1079,8 @@ static int userfaultfd_events_test(void) return 1; close(uffd); - printf("userfaults: %ld\n", stats.missing_faults); + + uffd_stats_report(&stats, 1); return stats.missing_faults != nr_pages; } @@ -1009,6 +1110,8 @@ static int userfaultfd_sig_test(void) uffdio_register.range.start = (unsigned long) area_dst; uffdio_register.range.len = nr_pages * page_size; uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING; + if (test_uffdio_wp) + uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP; if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) fprintf(stderr, "register failure\n"), exit(1); @@ -1141,6 +1244,8 @@ static int userfaultfd_stress(void) uffdio_register.range.start = (unsigned long) area_dst; uffdio_register.range.len = nr_pages * page_size; uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING; + if (test_uffdio_wp) + uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP; if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) { fprintf(stderr, "register failure\n"); return 1; @@ -1195,6 +1300,11 @@ static int userfaultfd_stress(void) if (stress(uffd_stats)) return 1; + /* Clear all the write 
protections if there is any */ + if (test_uffdio_wp) + wp_range(uffd, (unsigned long)area_dst, + nr_pages * page_size, false); + /* unregister */ if (ioctl(uffd, UFFDIO_UNREGISTER, &uffdio_register.range)) { fprintf(stderr, "unregister failure\n"); @@ -1233,10 +1343,7 @@ static int userfaultfd_stress(void) area_src_alias = area_dst_alias; area_dst_alias = tmp_area; - printf("userfaults:"); - for (cpu = 0; cpu < nr_cpus; cpu++) - printf(" %lu", uffd_stats[cpu].missing_faults); - printf("\n"); + uffd_stats_report(uffd_stats, nr_cpus); } if (err) @@ -1276,6 +1383,8 @@ static void set_test_type(const char *type) if (!strcmp(type, "anon")) { test_type = TEST_ANON; uffd_test_ops = &anon_uffd_test_ops; + /* Only enable write-protect test for anonymous test */ + test_uffdio_wp = true; } else if (!strcmp(type, "hugetlb")) { test_type = TEST_HUGETLB; uffd_test_ops = &hugetlb_uffd_test_ops;