From patchwork Wed Mar 20 02:06:15 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860669
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov,
    Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH v3 01/28] mm: gup: rename "nonblocking" to "locked" where proper
Date: Wed, 20 Mar 2019 10:06:15 +0800
Message-Id: <20190320020642.4000-2-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

There are plenty of places around __get_user_pages() that take a
parameter named "nonblocking" which does not really mean "it won't
block" (it can block); instead it indicates whether the mmap_sem was
released by up_read() during page fault handling, mostly when
VM_FAULT_RETRY is returned.

We already have the correct name, "locked", in e.g.
get_user_pages_locked() and get_user_pages_remote(), but many places
still use "nonblocking".  Rename those to "locked" where proper, to
better suit the function of the variable.  While at it, fix up some of
the comments accordingly.

Reviewed-by: Mike Rapoport
Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
---
 mm/gup.c     | 44 +++++++++++++++++++++-----------------------
 mm/hugetlb.c |  8 ++++----
 2 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 75029649baca..9bb3bed68ee3 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -506,12 +506,12 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
 }
 
 /*
- * mmap_sem must be held on entry.  If @nonblocking != NULL and
- * *@flags does not include FOLL_NOWAIT, the mmap_sem may be released.
- * If it is, *@nonblocking will be set to 0 and -EBUSY returned.
+ * mmap_sem must be held on entry.  If @locked != NULL and *@flags
+ * does not include FOLL_NOWAIT, the mmap_sem may be released.  If it
+ * is, *@locked will be set to 0 and -EBUSY returned.
  */
 static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
-		unsigned long address, unsigned int *flags, int *nonblocking)
+		unsigned long address, unsigned int *flags, int *locked)
 {
 	unsigned int fault_flags = 0;
 	vm_fault_t ret;
@@ -523,7 +523,7 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
 		fault_flags |= FAULT_FLAG_WRITE;
 	if (*flags & FOLL_REMOTE)
 		fault_flags |= FAULT_FLAG_REMOTE;
-	if (nonblocking)
+	if (locked)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY;
 	if (*flags & FOLL_NOWAIT)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;
@@ -549,8 +549,8 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
 	}
 
 	if (ret & VM_FAULT_RETRY) {
-		if (nonblocking && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
-			*nonblocking = 0;
+		if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
+			*locked = 0;
 		return -EBUSY;
 	}
 
@@ -627,7 +627,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
  *		only intends to ensure the pages are faulted in.
  * @vmas:	array of pointers to vmas corresponding to each page.
  *		Or NULL if the caller does not require them.
- * @nonblocking: whether waiting for disk IO or mmap_sem contention
+ * @locked:     whether we're still with the mmap_sem held
  *
  * Returns number of pages pinned. This may be fewer than the number
  * requested. If nr_pages is 0 or negative, returns 0. If no pages
@@ -656,13 +656,11 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
  * appropriate) must be called after the page is finished with, and
  * before put_page is called.
  *
- * If @nonblocking != NULL, __get_user_pages will not wait for disk IO
- * or mmap_sem contention, and if waiting is needed to pin all pages,
- * *@nonblocking will be set to 0.  Further, if @gup_flags does not
- * include FOLL_NOWAIT, the mmap_sem will be released via up_read() in
- * this case.
+ * If @locked != NULL, *@locked will be set to 0 when mmap_sem is
+ * released by an up_read().  That can happen if @gup_flags does not
+ * have FOLL_NOWAIT.
  *
- * A caller using such a combination of @nonblocking and @gup_flags
+ * A caller using such a combination of @locked and @gup_flags
  * must therefore hold the mmap_sem for reading only, and recognize
  * when it's been released.  Otherwise, it must be held for either
  * reading or writing and will not be released.
@@ -674,7 +672,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		unsigned long start, unsigned long nr_pages,
 		unsigned int gup_flags, struct page **pages,
-		struct vm_area_struct **vmas, int *nonblocking)
+		struct vm_area_struct **vmas, int *locked)
 {
 	long ret = 0, i = 0;
 	struct vm_area_struct *vma = NULL;
@@ -718,7 +716,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 			if (is_vm_hugetlb_page(vma)) {
 				i = follow_hugetlb_page(mm, vma, pages, vmas,
 						&start, &nr_pages, i,
-						gup_flags, nonblocking);
+						gup_flags, locked);
 				continue;
 			}
 		}
@@ -736,7 +734,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		page = follow_page_mask(vma, start, foll_flags, &ctx);
 		if (!page) {
 			ret = faultin_page(tsk, vma, start, &foll_flags,
-					nonblocking);
+					locked);
 			switch (ret) {
 			case 0:
 				goto retry;
@@ -1195,7 +1193,7 @@ EXPORT_SYMBOL(get_user_pages_longterm);
 * @vma:   target vma
 * @start: start address
 * @end:   end address
- * @nonblocking:
+ * @locked: whether the mmap_sem is still held
 *
 * This takes care of mlocking the pages too if VM_LOCKED is set.
 *
@@ -1203,14 +1201,14 @@ EXPORT_SYMBOL(get_user_pages_longterm);
 *
 * vma->vm_mm->mmap_sem must be held.
 *
- * If @nonblocking is NULL, it may be held for read or write and will
+ * If @locked is NULL, it may be held for read or write and will
 * be unperturbed.
 *
- * If @nonblocking is non-NULL, it must held for read only and may be
- * released.  If it's released, *@nonblocking will be set to 0.
+ * If @locked is non-NULL, it must held for read only and may be
+ * released.  If it's released, *@locked will be set to 0.
 */
 long populate_vma_page_range(struct vm_area_struct *vma,
-		unsigned long start, unsigned long end, int *nonblocking)
+		unsigned long start, unsigned long end, int *locked)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long nr_pages = (end - start) / PAGE_SIZE;
@@ -1245,7 +1243,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
 	 * not result in a stack expansion that recurses back here.
 	 */
 	return __get_user_pages(current, mm, start, nr_pages, gup_flags,
-				NULL, NULL, nonblocking);
+				NULL, NULL, locked);
 }
 
 /*
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8dfdffc34a99..52296ce4025a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4190,7 +4190,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
			 struct page **pages, struct vm_area_struct **vmas,
			 unsigned long *position, unsigned long *nr_pages,
-			 long i, unsigned int flags, int *nonblocking)
+			 long i, unsigned int flags, int *locked)
 {
 	unsigned long pfn_offset;
 	unsigned long vaddr = *position;
@@ -4261,7 +4261,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			spin_unlock(ptl);
 			if (flags & FOLL_WRITE)
 				fault_flags |= FAULT_FLAG_WRITE;
-			if (nonblocking)
+			if (locked)
 				fault_flags |= FAULT_FLAG_ALLOW_RETRY;
 			if (flags & FOLL_NOWAIT)
 				fault_flags |= FAULT_FLAG_ALLOW_RETRY |
@@ -4278,9 +4278,9 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 				break;
 			}
 			if (ret & VM_FAULT_RETRY) {
-				if (nonblocking &&
+				if (locked &&
				    !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
-					*nonblocking = 0;
+					*locked = 0;
 				*nr_pages = 0;
 				/*
				 * VM_FAULT_RETRY must not return an

From patchwork Wed Mar 20 02:06:16 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860671
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov,
    Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH v3 02/28] mm: userfault: return VM_FAULT_RETRY on signals
Date: Wed, 20 Mar 2019 10:06:16 +0800
Message-Id: <20190320020642.4000-3-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

The idea comes from the upstream discussion between Linus and Andrea:

  https://lkml.org/lkml/2017/10/30/560

A summary of the issue: in the past, handle_userfault() had a special
path that returned VM_FAULT_NOPAGE when a non-fatal signal was
detected while waiting for userfault handling, by re-acquiring the
mmap_sem before returning.  That carries a risk: the vmas might have
changed by the time we retake the mmap_sem, and we could even be
holding an invalid vma structure.

This patch removes the special path; we return VM_FAULT_RETRY via the
common path even when such signals arrive.  Then, for every
architecture that passes FAULT_FLAG_ALLOW_RETRY into
handle_mm_fault(), we check not only for SIGKILL but for all pending
userspace signals right after handle_mm_fault() returns.  This lets
userspace handle non-fatal signals faster than before.

This patch is preparation for the next patch, which finally removes
the special code path in handle_userfault() mentioned above.
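To make the behavioural change concrete, here is a small standalone
model (not kernel code; the names are hypothetical) of the decision
every architecture's fault handler makes after handle_mm_fault()
reports VM_FAULT_RETRY.  Before this patch, only a fatal signal
interrupted the retry loop; after it, any pending signal does, while a
fatal signal during a kernel-mode fault still takes the no_context
path:

```c
#include <assert.h>
#include <stdbool.h>

/* Possible outcomes after handle_mm_fault() returned VM_FAULT_RETRY. */
enum fault_action {
	FAULT_RETRY_LOOP,   /* go back and retry the fault */
	FAULT_RETURN_USER,  /* return so the signal is delivered first */
	FAULT_NO_CONTEXT,   /* fatal signal during a kernel-mode fault */
};

enum fault_action decide_after_retry(bool sig_pending,
				     bool fatal_sig_pending,
				     bool user_mode)
{
	/* No signal pending: simply retry, as before the patch. */
	if (!sig_pending)
		return FAULT_RETRY_LOOP;
	/* A fatal signal in kernel mode cannot return to userspace. */
	if (fatal_sig_pending && !user_mode)
		return FAULT_NO_CONTEXT;
	/* The patch's change: non-fatal signals also bail out here
	 * instead of looping until the fault completes. */
	return FAULT_RETURN_USER;
}
```

The mmap_sem does not need to be dropped on any of these paths because
__lock_page_or_retry() has already released it before VM_FAULT_RETRY
is returned, which is why the early return is safe.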
Suggested-by: Linus Torvalds Suggested-by: Andrea Arcangeli Reviewed-by: Jerome Glisse Signed-off-by: Peter Xu --- arch/alpha/mm/fault.c | 2 +- arch/arc/mm/fault.c | 11 ++++------- arch/arm/mm/fault.c | 6 +++--- arch/arm64/mm/fault.c | 6 +++--- arch/hexagon/mm/vm_fault.c | 2 +- arch/ia64/mm/fault.c | 2 +- arch/m68k/mm/fault.c | 2 +- arch/microblaze/mm/fault.c | 2 +- arch/mips/mm/fault.c | 2 +- arch/nds32/mm/fault.c | 6 +++--- arch/nios2/mm/fault.c | 2 +- arch/openrisc/mm/fault.c | 2 +- arch/parisc/mm/fault.c | 2 +- arch/powerpc/mm/fault.c | 2 ++ arch/riscv/mm/fault.c | 4 ++-- arch/s390/mm/fault.c | 9 ++++++--- arch/sh/mm/fault.c | 4 ++++ arch/sparc/mm/fault_32.c | 3 +++ arch/sparc/mm/fault_64.c | 3 +++ arch/um/kernel/trap.c | 5 ++++- arch/unicore32/mm/fault.c | 4 ++-- arch/x86/mm/fault.c | 6 +++++- arch/xtensa/mm/fault.c | 3 +++ 23 files changed, 56 insertions(+), 34 deletions(-) diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c index 188fc9256baf..8a2ef90b4bfc 100644 --- a/arch/alpha/mm/fault.c +++ b/arch/alpha/mm/fault.c @@ -150,7 +150,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr, the fault. 
*/ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c index 8df1638259f3..9e9e6eb1f7d0 100644 --- a/arch/arc/mm/fault.c +++ b/arch/arc/mm/fault.c @@ -141,17 +141,14 @@ void do_page_fault(unsigned long address, struct pt_regs *regs) */ fault = handle_mm_fault(vma, address, flags); - if (fatal_signal_pending(current)) { - + if (unlikely((fault & VM_FAULT_RETRY) && signal_pending(current))) { + if (fatal_signal_pending(current) && !user_mode(regs)) + goto no_context; /* * if fault retry, mmap_sem already relinquished by core mm * so OK to return to user mode (with signal handled first) */ - if (fault & VM_FAULT_RETRY) { - if (!user_mode(regs)) - goto no_context; - return; - } + return; } perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index 58f69fa07df9..c41c021bbe40 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -314,12 +314,12 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) fault = __do_page_fault(mm, addr, fsr, flags, tsk); - /* If we need to retry but a fatal signal is pending, handle the + /* If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because * it would already be released in __lock_page_or_retry in * mm/filemap.c. 
*/ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - if (!user_mode(regs)) + if (unlikely(fault & VM_FAULT_RETRY && signal_pending(current))) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; return 0; } diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index efb7b2cbead5..a38ff8c49a66 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -512,13 +512,13 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, if (fault & VM_FAULT_RETRY) { /* - * If we need to retry but a fatal signal is pending, + * If we need to retry but a signal is pending, * handle the signal first. We do not need to release * the mmap_sem because it would already be released * in __lock_page_or_retry in mm/filemap.c. */ - if (fatal_signal_pending(current)) { - if (!user_mode(regs)) + if (signal_pending(current)) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; return 0; } diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c index eb263e61daf4..be10b441d9cc 100644 --- a/arch/hexagon/mm/vm_fault.c +++ b/arch/hexagon/mm/vm_fault.c @@ -104,7 +104,7 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; /* The most common case -- we are done. 
*/ diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index 5baeb022f474..62c2d39d2bed 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -163,7 +163,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index 9b6163c05a75..d9808a807ab8 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -138,7 +138,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, fault = handle_mm_fault(vma, address, flags); pr_debug("handle_mm_fault returns %x\n", fault); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return 0; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index 202ad6a494f5..4fd2dbd0c5ca 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -217,7 +217,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c index 73d8a0f0b810..92374fd091d2 100644 --- a/arch/mips/mm/fault.c +++ b/arch/mips/mm/fault.c @@ -154,7 +154,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); diff --git a/arch/nds32/mm/fault.c 
b/arch/nds32/mm/fault.c index 68d5f2a27f38..da777de8a62e 100644 --- a/arch/nds32/mm/fault.c +++ b/arch/nds32/mm/fault.c @@ -206,12 +206,12 @@ void do_page_fault(unsigned long entry, unsigned long addr, fault = handle_mm_fault(vma, addr, flags); /* - * If we need to retry but a fatal signal is pending, handle the + * If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because it * would already be released in __lock_page_or_retry in mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - if (!user_mode(regs)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; return; } diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c index 24fd84cf6006..5939434a31ae 100644 --- a/arch/nios2/mm/fault.c +++ b/arch/nios2/mm/fault.c @@ -134,7 +134,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c index dc4dbafc1d83..873ecb5d82d7 100644 --- a/arch/openrisc/mm/fault.c +++ b/arch/openrisc/mm/fault.c @@ -165,7 +165,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index c8e8b7c05558..29422eec329d 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -303,7 +303,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, fault = handle_mm_fault(vma, address, flags); - if ((fault & 
VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 887f11bcf330..aaa853e6592f 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -591,6 +591,8 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 		 */
 		flags &= ~FAULT_FLAG_ALLOW_RETRY;
 		flags |= FAULT_FLAG_TRIED;
+		if (is_user && signal_pending(current))
+			return 0;
 		if (!fatal_signal_pending(current))
 			goto retry;
 	}

diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index 88401d5125bc..4fc8d746bec3 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -123,11 +123,11 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
 	fault = handle_mm_fault(vma, addr, flags);
 
 	/*
-	 * If we need to retry but a fatal signal is pending, handle the
+	 * If we need to retry but a signal is pending, handle the
 	 * signal first. We do not need to release the mmap_sem because it
 	 * would already be released in __lock_page_or_retry in mm/filemap.c.
 	 */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(tsk))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(tsk))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {

diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 11613362c4e7..aba1dad1efcd 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -476,9 +476,12 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
 	 * the fault.
 	 */
 	fault = handle_mm_fault(vma, address, flags);
-	/* No reason to continue if interrupted by SIGKILL. */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
-		fault = VM_FAULT_SIGNAL;
+	/* Do not continue if interrupted by signals. */
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current)) {
+		if (fatal_signal_pending(current))
+			fault = VM_FAULT_SIGNAL;
+		else
+			fault = 0;
 		if (flags & FAULT_FLAG_RETRY_NOWAIT)
 			goto out_up;
 		goto out;

diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index 6defd2c6d9b1..baf5d73df40c 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -506,6 +506,10 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
 			 * have already released it in __lock_page_or_retry
 			 * in mm/filemap.c.
 			 */
+
+			if (user_mode(regs) && signal_pending(tsk))
+				return;
+
 			goto retry;
 		}
 	}

diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index b0440b0edd97..a2c83104fe35 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -269,6 +269,9 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 			 * in mm/filemap.c.
 			 */
 
+			if (user_mode(regs) && signal_pending(tsk))
+				return;
+
 			goto retry;
 		}
 	}

diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 8f8a604c1300..cad71ec5c7b3 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -467,6 +467,9 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 			 * in mm/filemap.c.
 			 */
 
+			if (user_mode(regs) && signal_pending(current))
+				return;
+
 			goto retry;
 		}
 	}

diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index 0e8b6158f224..05dcd4c5f0d5 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -76,8 +76,11 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 		fault = handle_mm_fault(vma, address, flags);
 
-		if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+		if ((fault & VM_FAULT_RETRY) && signal_pending(current)) {
+			if (is_user && !fatal_signal_pending(current))
+				err = 0;
 			goto out_nosemaphore;
+		}
 
 		if (unlikely(fault & VM_FAULT_ERROR)) {
 			if (fault & VM_FAULT_OOM) {

diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c
index b9a3a50644c1..3611f19234a1 100644
--- a/arch/unicore32/mm/fault.c
+++ b/arch/unicore32/mm/fault.c
@@ -248,11 +248,11 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	fault = __do_pf(mm, addr, fsr, flags, tsk);
 
-	/* If we need to retry but a fatal signal is pending, handle the
+	/* If we need to retry but a signal is pending, handle the
 	 * signal first. We do not need to release the mmap_sem because
 	 * it would already be released in __lock_page_or_retry in
 	 * mm/filemap.c.
 	 */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return 0;
 
 	if (!(fault & VM_FAULT_ERROR) && (flags & FAULT_FLAG_ALLOW_RETRY)) {

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 9d5c75f02295..248ff0a28ecd 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1481,16 +1481,20 @@ void do_user_addr_fault(struct pt_regs *regs,
 	 * that we made any progress. Handle this case first.
 	 */
 	if (unlikely(fault & VM_FAULT_RETRY)) {
+		bool is_user = flags & FAULT_FLAG_USER;
+
 		/* Retry at most once */
 		if (flags & FAULT_FLAG_ALLOW_RETRY) {
 			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
+			if (is_user && signal_pending(tsk))
+				return;
 			if (!fatal_signal_pending(tsk))
 				goto retry;
 		}
 
 		/* User mode? Just return to handle the fatal exception */
-		if (flags & FAULT_FLAG_USER)
+		if (is_user)
 			return;
 
 		/* Not returning to user mode? Handle exceptions or die: */

diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index 2ab0e0dcd166..792dad5e2f12 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -136,6 +136,9 @@ void do_page_fault(struct pt_regs *regs)
 	 * in mm/filemap.c.
 	 */
 
+	if (user_mode(regs) && signal_pending(current))
+		return;
+
 	goto retry;
 	}
 }

From patchwork Wed Mar 20 02:06:17 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860673
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Shaohua Li , Andrea Arcangeli , Mike Kravetz , Denis Plotnikov , Mike Rapoport , Marty McFadden , Mel Gorman , "Kirill A . Shutemov" , "Dr .
David Alan Gilbert"
Subject: [PATCH v3 03/28] userfaultfd: don't retake mmap_sem to emulate NOPAGE
Date: Wed, 20 Mar 2019 10:06:17 +0800
Message-Id: <20190320020642.4000-4-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

The idea comes from the upstream discussion between Linus and Andrea:

https://lkml.org/lkml/2017/10/30/560

A summary of the issue: handle_userfault() used to have a special path that returned VM_FAULT_NOPAGE when a non-fatal signal was detected while waiting for the userfault to be handled. It did that by reacquiring the mmap_sem before returning. However, that is risky: the VMAs may have changed by the time we retake the mmap_sem, so we could even be holding a pointer to an invalid vma structure.

This patch removes that risky path from handle_userfault(), so the callers of handle_mm_fault() will know (via VM_FAULT_RETRY) that the VMAs may have changed. Meanwhile, with the previous patch we do not lose responsiveness either, since the core mm code can now handle non-fatal userspace signals quickly even if we return VM_FAULT_RETRY.
Suggested-by: Andrea Arcangeli
Suggested-by: Linus Torvalds
Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c | 24 ------------------------
 1 file changed, 24 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 89800fc7dc9d..b397bc3b954d 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -514,30 +514,6 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 
 	__set_current_state(TASK_RUNNING);
 
-	if (return_to_userland) {
-		if (signal_pending(current) &&
-		    !fatal_signal_pending(current)) {
-			/*
-			 * If we got a SIGSTOP or SIGCONT and this is
-			 * a normal userland page fault, just let
-			 * userland return so the signal will be
-			 * handled and gdb debugging works. The page
-			 * fault code immediately after we return from
-			 * this function is going to release the
-			 * mmap_sem and it's not depending on it
-			 * (unlike gup would if we were not to return
-			 * VM_FAULT_RETRY).
-			 *
-			 * If a fatal signal is pending we still take
-			 * the streamlined VM_FAULT_RETRY failure path
-			 * and there's no need to retake the mmap_sem
-			 * in such case.
-			 */
-			down_read(&mm->mmap_sem);
-			ret = VM_FAULT_NOPAGE;
-		}
-	}
-
 	/*
 	 * Here we race with the list_del; list_add in
 	 * userfaultfd_ctx_read(), however because we don't ever run

From patchwork Wed Mar 20 02:06:18 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860675
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin
Cracauer , Shaohua Li , Andrea Arcangeli , Mike Kravetz , Denis Plotnikov , Mike Rapoport , Marty McFadden , Mel Gorman , "Kirill A . Shutemov" , "Dr . David Alan Gilbert"
Subject: [PATCH v3 04/28] mm: allow VM_FAULT_RETRY for multiple times
Date: Wed, 20 Mar 2019 10:06:18 +0800
Message-Id: <20190320020642.4000-5-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

The idea comes from a discussion between Linus and Andrea [1].

Before this patch we allowed a page fault to retry only once, by clearing the FAULT_FLAG_ALLOW_RETRY flag when calling handle_mm_fault() the second time. This was mainly meant to avoid unexpected starvation of the system caused by looping forever on the fault of a single page. However, that should hardly happen: every code path that returns VM_FAULT_RETRY first waits for some condition (during which it should yield the CPU) before VM_FAULT_RETRY is actually returned.

This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY flag set when we receive VM_FAULT_RETRY. The page fault handler can now retry the fault multiple times if necessary without generating another page fault event. Meanwhile, we keep the FAULT_FLAG_TRIED flag so the handler can still tell whether a fault is the first attempt or not.

Considering only the ALLOW_RETRY flag and the TRIED flag, we then have these combinations of fault flags:

 - ALLOW_RETRY and !TRIED: the page fault allows retry, and this is the first try

 - ALLOW_RETRY and TRIED: the page fault allows retry, and this is not the first try

 - !ALLOW_RETRY and !TRIED: the page fault does not allow retry at all

 - !ALLOW_RETRY and TRIED: this is forbidden and should never be used

Existing code has several places that take special care of the first condition above by checking (fault_flags & FAULT_FLAG_ALLOW_RETRY). Since even the second try will now have ALLOW_RETRY set, this patch introduces a simple helper that detects the first attempt of a page fault by checking both (fault_flags & FAULT_FLAG_ALLOW_RETRY) and !(fault_flags & FAULT_FLAG_TRIED), and uses the helper in all the existing special paths. One example is __lock_page_or_retry(): we now drop the mmap_sem only on the first attempt of the page fault and keep it across follow-up retries, so the old locking behavior is retained.

This is a nice enhancement of the current code [2], and at the same time supporting material for the future userfaultfd-writeprotect work: there, a protected page always triggers an explicit userfault write-protect retry, and if that cannot resolve the page fault (e.g., when userfaultfd-writeprotect is used in conjunction with swapped pages) a third retry of the page fault may be needed. It may also benefit other potential users with similar requirements, such as userfault write-protection.

The GUP code is not touched yet and will be covered in a follow-up patch. Please read the thread below for more information.
[1] https://lkml.org/lkml/2017/11/2/833
[2] https://lkml.org/lkml/2018/12/30/64

Suggested-by: Linus Torvalds
Suggested-by: Andrea Arcangeli
Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
---
 arch/alpha/mm/fault.c           |  2 +-
 arch/arc/mm/fault.c             |  1 -
 arch/arm/mm/fault.c             |  3 ---
 arch/arm64/mm/fault.c           |  5 -----
 arch/hexagon/mm/vm_fault.c      |  1 -
 arch/ia64/mm/fault.c            |  1 -
 arch/m68k/mm/fault.c            |  3 ---
 arch/microblaze/mm/fault.c      |  1 -
 arch/mips/mm/fault.c            |  1 -
 arch/nds32/mm/fault.c           |  1 -
 arch/nios2/mm/fault.c           |  3 ---
 arch/openrisc/mm/fault.c        |  1 -
 arch/parisc/mm/fault.c          |  4 +---
 arch/powerpc/mm/fault.c         |  6 ------
 arch/riscv/mm/fault.c           |  5 -----
 arch/s390/mm/fault.c            |  5 +----
 arch/sh/mm/fault.c              |  1 -
 arch/sparc/mm/fault_32.c        |  1 -
 arch/sparc/mm/fault_64.c        |  1 -
 arch/um/kernel/trap.c           |  1 -
 arch/unicore32/mm/fault.c       |  4 +---
 arch/x86/mm/fault.c             |  2 --
 arch/xtensa/mm/fault.c          |  1 -
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 12 ++++++++---
 include/linux/mm.h              | 38 ++++++++++++++++++++++++++++++++-
 mm/filemap.c                    |  2 +-
 mm/shmem.c                      |  2 +-
 27 files changed, 52 insertions(+), 56 deletions(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index 8a2ef90b4bfc..6a02c0fb36b9 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -169,7 +169,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
 	else
 		current->min_flt++;
 	if (fault & VM_FAULT_RETRY) {
-		flags &= ~FAULT_FLAG_ALLOW_RETRY;
+		flags |= FAULT_FLAG_TRIED;
 
 		/* No need to up_read(&mm->mmap_sem) as we would
 		 * have already released it in __lock_page_or_retry

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 9e9e6eb1f7d0..e7d2947ba72c 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -167,7 +167,6 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	}
 
 	if (fault & VM_FAULT_RETRY) {
-		flags &= ~FAULT_FLAG_ALLOW_RETRY;
 		flags |= FAULT_FLAG_TRIED;
 		goto retry;
 	}

diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index c41c021bbe40..7910b4b5205d 100644 --- a/arch/arm/mm/fault.c +++
b/arch/arm/mm/fault.c @@ -342,9 +342,6 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) regs, addr); } if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index a38ff8c49a66..d1d3c98f9ffb 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -523,12 +523,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, return 0; } - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk of - * starvation. - */ if (mm_flags & FAULT_FLAG_ALLOW_RETRY) { - mm_flags &= ~FAULT_FLAG_ALLOW_RETRY; mm_flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c index be10b441d9cc..576751597e77 100644 --- a/arch/hexagon/mm/vm_fault.c +++ b/arch/hexagon/mm/vm_fault.c @@ -115,7 +115,6 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index 62c2d39d2bed..9de95d39935e 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -189,7 +189,6 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index d9808a807ab8..b1b2109e4ab4 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -162,9 +162,6 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
*/ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index 4fd2dbd0c5ca..05a4847ac0bf 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -236,7 +236,6 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c index 92374fd091d2..9953b5b571df 100644 --- a/arch/mips/mm/fault.c +++ b/arch/mips/mm/fault.c @@ -178,7 +178,6 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, tsk->min_flt++; } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c index da777de8a62e..3642bdd7909d 100644 --- a/arch/nds32/mm/fault.c +++ b/arch/nds32/mm/fault.c @@ -242,7 +242,6 @@ void do_page_fault(unsigned long entry, unsigned long addr, 1, regs, addr); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c index 5939434a31ae..9dd1c51acc22 100644 --- a/arch/nios2/mm/fault.c +++ b/arch/nios2/mm/fault.c @@ -158,9 +158,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
*/ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c index 873ecb5d82d7..ff92c5674781 100644 --- a/arch/openrisc/mm/fault.c +++ b/arch/openrisc/mm/fault.c @@ -185,7 +185,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, else tsk->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index 29422eec329d..675b221af198 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -327,14 +327,12 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; - /* * No need to up_read(&mm->mmap_sem) as we would * have already released it in __lock_page_or_retry * in mm/filemap.c. */ - + flags |= FAULT_FLAG_TRIED; goto retry; } } diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index aaa853e6592f..c831cb3ce03f 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -583,13 +583,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, * case. */ if (unlikely(fault & VM_FAULT_RETRY)) { - /* We retry only once */ if (flags & FAULT_FLAG_ALLOW_RETRY) { - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. - */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; if (is_user && signal_pending(current)) return 0; diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 4fc8d746bec3..aad2c0557d2f 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -154,11 +154,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs) 1, regs, addr); } if (fault & VM_FAULT_RETRY) { - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
-			 */
-			flags &= ~(FAULT_FLAG_ALLOW_RETRY);
 			flags |= FAULT_FLAG_TRIED;

 			/*
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index aba1dad1efcd..4e8c066964a9 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -513,10 +513,7 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
 				fault = VM_FAULT_PFAULT;
 				goto out_up;
 			}
-			/* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
-			 * of starvation. */
-			flags &= ~(FAULT_FLAG_ALLOW_RETRY |
-				   FAULT_FLAG_RETRY_NOWAIT);
+			flags &= ~FAULT_FLAG_RETRY_NOWAIT;
 			flags |= FAULT_FLAG_TRIED;
 			down_read(&mm->mmap_sem);
 			goto retry;
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index baf5d73df40c..cd710e2d7c57 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -498,7 +498,6 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
 				      regs, address);
 		}
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;

 			/*
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index a2c83104fe35..6735cd1c09b9 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -261,7 +261,6 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 				      1, regs, address);
 		}
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;

 			/* No need to up_read(&mm->mmap_sem) as we would
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index cad71ec5c7b3..28d5b4d012c6 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -459,7 +459,6 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 				      1, regs, address);
 		}
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;

 			/* No need to up_read(&mm->mmap_sem) as we would
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index 05dcd4c5f0d5..e7723c133c7f 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -99,7 +99,6 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 			else
 				current->min_flt++;
 			if (fault & VM_FAULT_RETRY) {
-				flags &= ~FAULT_FLAG_ALLOW_RETRY;
 				flags |= FAULT_FLAG_TRIED;

 				goto retry;
diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c
index 3611f19234a1..efca122b5ef7 100644
--- a/arch/unicore32/mm/fault.c
+++ b/arch/unicore32/mm/fault.c
@@ -261,9 +261,7 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 		else
 			tsk->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
-			/* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
-			 * of starvation. */
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
+			flags |= FAULT_FLAG_TRIED;
 			goto retry;
 		}
 	}
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 248ff0a28ecd..d842c3e02a50 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1483,9 +1483,7 @@ void do_user_addr_fault(struct pt_regs *regs,
 	if (unlikely(fault & VM_FAULT_RETRY)) {
 		bool is_user = flags & FAULT_FLAG_USER;

-		/* Retry at most once */
 		if (flags & FAULT_FLAG_ALLOW_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 			if (is_user && signal_pending(tsk))
 				return;
diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index 792dad5e2f12..7cd55f2d66c9 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -128,7 +128,6 @@ void do_page_fault(struct pt_regs *regs)
 		else
 			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;

 			/* No need to up_read(&mm->mmap_sem) as we would
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index a1d977fbade5..5fac635f72a5 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -61,9 +61,10 @@ static vm_fault_t ttm_bo_vm_fault_idle(struct ttm_buffer_object *bo,

 	/*
 	 * If possible, avoid waiting for GPU with mmap_sem
-	 * held.
+	 * held. We only do this if the fault allows retry and this
+	 * is the first attempt.
 	 */
-	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
+	if (fault_flag_allow_retry_first(vmf->flags)) {
 		ret = VM_FAULT_RETRY;
 		if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)
 			goto out_unlock;
@@ -136,7 +137,12 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 		if (err != -EBUSY)
 			return VM_FAULT_NOPAGE;

-		if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
+		/*
+		 * If the fault allows retry and this is the first
+		 * fault attempt, we try to release the mmap_sem
+		 * before waiting
+		 */
+		if (fault_flag_allow_retry_first(vmf->flags)) {
 			if (!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
 				ttm_bo_get(bo);
 				up_read(&vmf->vma->vm_mm->mmap_sem);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 80bb6408fe73..f73dbc4a1957 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -336,16 +336,52 @@ extern unsigned int kobjsize(const void *objp);
  */
 extern pgprot_t protection_map[16];

+/*
+ * About FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_TRIED: we can specify whether we
+ * would allow page faults to retry by specifying these two fault flags
+ * correctly. Currently there can be three legal combinations:
+ *
+ * (a) ALLOW_RETRY and !TRIED: this means the page fault allows retry, and
+ *     this is the first try
+ *
+ * (b) ALLOW_RETRY and TRIED: this means the page fault allows retry, and
+ *     we've already tried at least once
+ *
+ * (c) !ALLOW_RETRY and !TRIED: this means the page fault does not allow retry
+ *
+ * The unlisted combination (!ALLOW_RETRY && TRIED) is illegal and should never
+ * be used. Note that page faults can be allowed to retry for multiple times,
+ * in which case we'll have an initial fault with flags (a) then later on
+ * continuous faults with flags (b). We should always try to detect pending
+ * signals before a retry to make sure the continuous page faults can still be
+ * interrupted if necessary.
+ */
+
 #define FAULT_FLAG_WRITE	0x01	/* Fault was a write access */
 #define FAULT_FLAG_MKWRITE	0x02	/* Fault was mkwrite of existing pte */
 #define FAULT_FLAG_ALLOW_RETRY	0x04	/* Retry fault if blocking */
 #define FAULT_FLAG_RETRY_NOWAIT	0x08	/* Don't drop mmap_sem and wait when retrying */
 #define FAULT_FLAG_KILLABLE	0x10	/* The fault task is in SIGKILL killable region */
-#define FAULT_FLAG_TRIED	0x20	/* Second try */
+#define FAULT_FLAG_TRIED	0x20	/* We've tried once */
 #define FAULT_FLAG_USER		0x40	/* The fault originated in userspace */
 #define FAULT_FLAG_REMOTE	0x80	/* faulting for non current tsk/mm */
 #define FAULT_FLAG_INSTRUCTION  0x100	/* The fault was during an instruction fetch */

+/*
+ * Returns true if the page fault allows retry and this is the first
+ * attempt of the fault handling; false otherwise. This is mostly
+ * used for places where we want to try to avoid taking the mmap_sem
+ * for too long a time when waiting for another condition to change,
+ * in which case we can try to be polite to release the mmap_sem in
+ * the first round to avoid potential starvation of other processes
+ * that would also want the mmap_sem.
+ */
+static inline bool fault_flag_allow_retry_first(unsigned int flags)
+{
+	return (flags & FAULT_FLAG_ALLOW_RETRY) &&
+	       (!(flags & FAULT_FLAG_TRIED));
+}
+
 #define FAULT_FLAG_TRACE \
 	{ FAULT_FLAG_WRITE,		"WRITE" }, \
 	{ FAULT_FLAG_MKWRITE,		"MKWRITE" }, \
diff --git a/mm/filemap.c b/mm/filemap.c
index 9f5e323e883e..a2b5c53166de 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1351,7 +1351,7 @@ EXPORT_SYMBOL_GPL(__lock_page_killable);
 int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 			 unsigned int flags)
 {
-	if (flags & FAULT_FLAG_ALLOW_RETRY) {
+	if (fault_flag_allow_retry_first(flags)) {
 		/*
 		 * CAUTION! In this case, mmap_sem is not released
 		 * even though return 0.
 		 */
diff --git a/mm/shmem.c b/mm/shmem.c
index 2c012eee133d..ac875b79281c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1949,7 +1949,7 @@ static vm_fault_t shmem_fault(struct vm_fault *vmf)
 			DEFINE_WAIT_FUNC(shmem_fault_wait, synchronous_wake_function);

 			ret = VM_FAULT_NOPAGE;
-			if ((vmf->flags & FAULT_FLAG_ALLOW_RETRY) &&
+			if (fault_flag_allow_retry_first(vmf->flags) &&
 			    !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
 				/* It's polite to up mmap_sem if we can */
 				up_read(&vma->vm_mm->mmap_sem);

From patchwork Wed Mar 20 02:06:19 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860677
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov,
    Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH v3 05/28] mm: gup: allow VM_FAULT_RETRY for multiple times
Date: Wed, 20 Mar 2019 10:06:19 +0800
Message-Id: <20190320020642.4000-6-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

This is the gup counterpart of the change that allows VM_FAULT_RETRY to
happen for more than once.

Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
---
 mm/gup.c     | 17 +++++++++++++----
 mm/hugetlb.c |  6 ++++--
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 9bb3bed68ee3..f56dee055f26 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -528,7 +528,10 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
 	if (*flags & FOLL_NOWAIT)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;
 	if (*flags & FOLL_TRIED) {
-		VM_WARN_ON_ONCE(fault_flags & FAULT_FLAG_ALLOW_RETRY);
+		/*
+		 * Note: FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_TRIED
+		 * can co-exist
+		 */
 		fault_flags |= FAULT_FLAG_TRIED;
 	}

@@ -943,17 +946,23 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
 		/* VM_FAULT_RETRY triggered, so seek to the faulting offset */
 		pages += ret;
 		start += ret << PAGE_SHIFT;
+		lock_dropped = true;

+retry:
 		/*
 		 * Repeat on the address that fired VM_FAULT_RETRY
-		 * without FAULT_FLAG_ALLOW_RETRY but with
+		 * with both FAULT_FLAG_ALLOW_RETRY and
 		 * FAULT_FLAG_TRIED.
 		 */
 		*locked = 1;
-		lock_dropped = true;
 		down_read(&mm->mmap_sem);
 		ret = __get_user_pages(tsk, mm, start, 1, flags | FOLL_TRIED,
-				       pages, NULL, NULL);
+				       pages, NULL, locked);
+		if (!*locked) {
+			/* Continue to retry until we succeeded */
+			BUG_ON(ret != 0);
+			goto retry;
+		}
 		if (ret != 1) {
 			BUG_ON(ret > 1);
 			if (!pages_done)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 52296ce4025a..040779a7b906 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4267,8 +4267,10 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 				fault_flags |= FAULT_FLAG_ALLOW_RETRY |
 					FAULT_FLAG_RETRY_NOWAIT;
 			if (flags & FOLL_TRIED) {
-				VM_WARN_ON_ONCE(fault_flags &
-						FAULT_FLAG_ALLOW_RETRY);
+				/*
+				 * Note: FAULT_FLAG_ALLOW_RETRY and
+				 * FAULT_FLAG_TRIED can co-exist
+				 */
 				fault_flags |= FAULT_FLAG_TRIED;
 			}
 			ret = hugetlb_fault(mm, vma, vaddr, fault_flags);

From patchwork Wed Mar 20 02:06:20 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860679
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov,
    Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert", Pavel Emelyanov, Rik van Riel
Subject: [PATCH v3 06/28] userfaultfd: wp: add helper for writeprotect check
Date: Wed, 20 Mar 2019 10:06:20 +0800
Message-Id: <20190320020642.4000-7-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

From: Shaohua Li

Add a helper for the writeprotect check. It will be used later.

Cc: Andrea Arcangeli
Cc: Pavel Emelyanov
Cc: Rik van Riel
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 include/linux/userfaultfd_k.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 37c9eba75c98..38f748e7186e 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -50,6 +50,11 @@ static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_UFFD_MISSING;
 }

+static inline bool userfaultfd_wp(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & VM_UFFD_WP;
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP);
@@ -94,6 +99,11 @@ static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 	return false;
 }

+static inline bool userfaultfd_wp(struct vm_area_struct *vma)
+{
+	return false;
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return false;

From patchwork Wed Mar 20 02:06:21 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860681
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov,
    Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH v3 07/28] userfaultfd: wp: hook userfault handler to write protection fault
Date: Wed, 20 Mar 2019 10:06:21 +0800
Message-Id: <20190320020642.4000-8-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

From: Andrea Arcangeli

There are several cases in which a write-protection fault can happen: a
write to the zero page, to a swapped page, or to a userfault
write-protected page. When the fault happens, there is no way to know
whether userfaultfd write-protected the page beforehand. Here we just
blindly issue a userfault notification for any vma with VM_UFFD_WP,
regardless of whether the application has write-protected it yet. The
application should be ready to handle such wp faults.

v1: From: Shaohua Li

v2: Handle the userfault in the common do_wp_page. If we get there, a
pagetable is present and read-only, so there is no need to do further
processing until we resolve the userfault. In the swapin case, always
swap in as read-only. This will cause false positive userfaults; we need
to decide later whether to eliminate them with a flag like soft-dirty in
the swap entry (see _PAGE_SWP_SOFT_DIRTY). hugetlbfs wouldn't need to
worry about swapouts, and tmpfs would be handled by a swap entry bit
like anonymous memory.

The main problem, with no easy solution for eliminating the false
positives, will arise if/when userfaultfd is extended to real filesystem
pagecache. When the pagecache is freed by reclaim, we can't leave the
radix tree pinned if the inode and in turn the radix tree is reclaimed
as well.
The estimation is that full accuracy and lack of false positives could
easily be provided only for anonymous memory (as long as there's no
fork, or MADV_DONTFORK is used on the userfaultfd anonymous range),
tmpfs and hugetlbfs; it is most certainly worth achieving, but in a
later incremental patch.

v3: Add hooking point for THP wrprotect faults.

CC: Shaohua Li
Signed-off-by: Andrea Arcangeli
[peterx: don't conditionally drop FAULT_FLAG_WRITE in do_swap_page]
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
---
 mm/memory.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index e11ca9dd823f..567686ec086d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2483,6 +2483,11 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;

+	if (userfaultfd_wp(vma)) {
+		pte_unmap_unlock(vmf->pte, vmf->ptl);
+		return handle_userfault(vmf, VM_UFFD_WP);
+	}
+
 	vmf->page = vm_normal_page(vma, vmf->address, vmf->orig_pte);
 	if (!vmf->page) {
 		/*
@@ -3684,8 +3689,11 @@ static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf)
 /* `inline' is required to avoid gcc 4.1.2 build error */
 static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd)
 {
-	if (vma_is_anonymous(vmf->vma))
+	if (vma_is_anonymous(vmf->vma)) {
+		if (userfaultfd_wp(vmf->vma))
+			return handle_userfault(vmf, VM_UFFD_WP);
 		return do_huge_pmd_wp_page(vmf, orig_pmd);
+	}
 	if (vmf->vma->vm_ops->huge_fault)
 		return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD);

From patchwork Wed Mar 20 02:06:22 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860683
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov,
    Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov", "Dr.
David Alan Gilbert" Subject: [PATCH v3 08/28] userfaultfd: wp: add WP pagetable tracking to x86 Date: Wed, 20 Mar 2019 10:06:22 +0800 Message-Id: <20190320020642.4000-9-peterx@redhat.com> In-Reply-To: <20190320020642.4000-1-peterx@redhat.com> References: <20190320020642.4000-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Wed, 20 Mar 2019 02:07:59 +0000 (UTC) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Andrea Arcangeli Accurate userfaultfd WP tracking is possible by tracking exactly which virtual memory ranges were writeprotected by userland. We can't relay only on the RW bit of the mapped pagetable because that information is destroyed by fork() or KSM or swap. If we were to relay on that, we'd need to stay on the safe side and generate false positive wp faults for every swapped out page. 
Signed-off-by: Andrea Arcangeli
[peterx: append _PAGE_UFFD_WP to _PAGE_CHG_MASK]
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 arch/x86/Kconfig                     |  1 +
 arch/x86/include/asm/pgtable.h       | 52 ++++++++++++++++++++++++++++
 arch/x86/include/asm/pgtable_64.h    |  8 ++++-
 arch/x86/include/asm/pgtable_types.h | 11 +++++-
 include/asm-generic/pgtable.h        |  1 +
 include/asm-generic/pgtable_uffd.h   | 51 +++++++++++++++++++++++++++
 init/Kconfig                         |  5 +++
 7 files changed, 127 insertions(+), 2 deletions(-)
 create mode 100644 include/asm-generic/pgtable_uffd.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5a02dd608f74..d2947525907f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -209,6 +209,7 @@ config X86
 	select USER_STACKTRACE_SUPPORT
 	select VIRT_TO_BUS
 	select X86_FEATURE_NAMES		if PROC_FS
+	select HAVE_ARCH_USERFAULTFD_WP		if USERFAULTFD
 
 config INSTRUCTION_DECODER
 	def_bool y

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 2779ace16d23..6863236e8484 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -23,6 +23,7 @@
 #ifndef __ASSEMBLY__
 #include
+#include
 
 extern pgd_t early_top_pgt[PTRS_PER_PGD];
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
@@ -293,6 +294,23 @@ static inline pte_t pte_clear_flags(pte_t pte, pteval_t clear)
 	return native_make_pte(v & ~clear);
 }
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pte_uffd_wp(pte_t pte)
+{
+	return pte_flags(pte) & _PAGE_UFFD_WP;
+}
+
+static inline pte_t pte_mkuffd_wp(pte_t pte)
+{
+	return pte_set_flags(pte, _PAGE_UFFD_WP);
+}
+
+static inline pte_t pte_clear_uffd_wp(pte_t pte)
+{
+	return pte_clear_flags(pte, _PAGE_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 static inline pte_t pte_mkclean(pte_t pte)
 {
 	return pte_clear_flags(pte, _PAGE_DIRTY);
@@ -372,6 +390,23 @@ static inline pmd_t pmd_clear_flags(pmd_t pmd, pmdval_t clear)
 	return native_make_pmd(v & ~clear);
 }
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pmd_uffd_wp(pmd_t pmd)
+{
+	return pmd_flags(pmd) & _PAGE_UFFD_WP;
+}
+
+static inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+{
+	return pmd_set_flags(pmd, _PAGE_UFFD_WP);
+}
+
+static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd_clear_flags(pmd, _PAGE_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 static inline pmd_t pmd_mkold(pmd_t pmd)
 {
 	return pmd_clear_flags(pmd, _PAGE_ACCESSED);
@@ -1351,6 +1386,23 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
 #endif
 #endif
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+{
+	return pte_set_flags(pte, _PAGE_SWP_UFFD_WP);
+}
+
+static inline int pte_swp_uffd_wp(pte_t pte)
+{
+	return pte_flags(pte) & _PAGE_SWP_UFFD_WP;
+}
+
+static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+{
+	return pte_clear_flags(pte, _PAGE_SWP_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 #define PKRU_AD_BIT 0x1
 #define PKRU_WD_BIT 0x2
 #define PKRU_BITS_PER_PKEY 2

diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 9c85b54bf03c..e0c5d29b8685 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -189,7 +189,7 @@ extern void sync_global_pgds(unsigned long start, unsigned long end);
 *
 * |     ...            | 11| 10|  9|8|7|6|5| 4| 3|2| 1|0| <- bit number
 * |     ...            |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names
- * | TYPE (59-63) | ~OFFSET (9-58)  |0|0|X|X| X| X|X|SD|0| <- swp entry
+ * | TYPE (59-63) | ~OFFSET (9-58)  |0|0|X|X| X| X|F|SD|0| <- swp entry
 *
 * G (8) is aliased and used as a PROT_NONE indicator for
 * !present ptes.  We need to start storing swap entries above
@@ -197,9 +197,15 @@ extern void sync_global_pgds(unsigned long start, unsigned long end);
 * erratum where they can be incorrectly set by hardware on
 * non-present PTEs.
 *
+ * SD Bits 1-4 are not used in non-present format and available for
+ * special use described below:
+ *
 * SD (1) in swp entry is used to store soft dirty bit, which helps us
 * remember soft dirty over page migration
+ *
+ * F (2) in swp entry is used to record when a pagetable is
+ * writeprotected by userfaultfd WP support.
+ *
 * Bit 7 in swp entry should be 0 because pmd_present checks not only P,
 * but also L and G.
 *

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index d6ff0bbdb394..dd9c6295d610 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -32,6 +32,7 @@
 #define _PAGE_BIT_SPECIAL	_PAGE_BIT_SOFTW1
 #define _PAGE_BIT_CPA_TEST	_PAGE_BIT_SOFTW1
+#define _PAGE_BIT_UFFD_WP	_PAGE_BIT_SOFTW2 /* userfaultfd wrprotected */
 #define _PAGE_BIT_SOFT_DIRTY	_PAGE_BIT_SOFTW3 /* software dirty tracking */
 #define _PAGE_BIT_DEVMAP	_PAGE_BIT_SOFTW4
@@ -100,6 +101,14 @@
 #define _PAGE_SWP_SOFT_DIRTY	(_AT(pteval_t, 0))
 #endif
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+#define _PAGE_UFFD_WP		(_AT(pteval_t, 1) << _PAGE_BIT_UFFD_WP)
+#define _PAGE_SWP_UFFD_WP	_PAGE_USER
+#else
+#define _PAGE_UFFD_WP		(_AT(pteval_t, 0))
+#define _PAGE_SWP_UFFD_WP	(_AT(pteval_t, 0))
+#endif
+
 #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
 #define _PAGE_NX	(_AT(pteval_t, 1) << _PAGE_BIT_NX)
 #define _PAGE_DEVMAP	(_AT(u64, 1) << _PAGE_BIT_DEVMAP)
@@ -124,7 +133,7 @@
 */
 #define _PAGE_CHG_MASK	(PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT |		\
			 _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY |	\
-			 _PAGE_SOFT_DIRTY | _PAGE_DEVMAP)
+			 _PAGE_SOFT_DIRTY | _PAGE_DEVMAP | _PAGE_UFFD_WP)
 #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE)
 
 /*

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 05e61e6c843f..f49afe951711 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 
 #if 5 - defined(__PAGETABLE_P4D_FOLDED) - defined(__PAGETABLE_PUD_FOLDED) - \
	defined(__PAGETABLE_PMD_FOLDED) != CONFIG_PGTABLE_LEVELS

diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
new file mode 100644
index 000000000000..643d1bf559c2
--- /dev/null
+++ b/include/asm-generic/pgtable_uffd.h
@@ -0,0 +1,51 @@
+#ifndef _ASM_GENERIC_PGTABLE_UFFD_H
+#define _ASM_GENERIC_PGTABLE_UFFD_H
+
+#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static __always_inline int pte_uffd_wp(pte_t pte)
+{
+	return 0;
+}
+
+static __always_inline int pmd_uffd_wp(pmd_t pmd)
+{
+	return 0;
+}
+
+static __always_inline pte_t pte_mkuffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static __always_inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static __always_inline pte_t pte_clear_uffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static __always_inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static __always_inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static __always_inline int pte_swp_uffd_wp(pte_t pte)
+{
+	return 0;
+}
+
+static __always_inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+{
+	return pte;
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
+#endif /* _ASM_GENERIC_PGTABLE_UFFD_H */

diff --git a/init/Kconfig b/init/Kconfig
index c9386a365eea..892d61ddf2eb 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1424,6 +1424,11 @@ config ADVISE_SYSCALLS
	  applications use these syscalls, you can disable this option to save
	  space.
 
+config HAVE_ARCH_USERFAULTFD_WP
+	bool
+	help
+	  Arch has userfaultfd write protection support
+
 config MEMBARRIER
	bool "Enable membarrier() system call" if EXPERT
	default y

From patchwork Wed Mar 20 02:06:23 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860685
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 09/28] userfaultfd: wp: userfaultfd_pte/huge_pmd_wp() helpers
Date: Wed, 20 Mar 2019 10:06:23 +0800
Message-Id: <20190320020642.4000-10-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

From: Andrea Arcangeli

Implement helper methods to invoke userfaultfd wp faults more selectively: not only when a wp fault triggers on a vma with vma->vm_flags VM_UFFD_WP set, but only if the _PAGE_UFFD_WP bit is set in the pagetable too.
Signed-off-by: Andrea Arcangeli Reviewed-by: Jerome Glisse Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- include/linux/userfaultfd_k.h | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 38f748e7186e..c6590c58ce28 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -14,6 +14,8 @@ #include /* linux/include/uapi/linux/userfaultfd.h */ #include +#include +#include /* * CAREFUL: Check include/uapi/asm-generic/fcntl.h when defining @@ -55,6 +57,18 @@ static inline bool userfaultfd_wp(struct vm_area_struct *vma) return vma->vm_flags & VM_UFFD_WP; } +static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma, + pte_t pte) +{ + return userfaultfd_wp(vma) && pte_uffd_wp(pte); +} + +static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma, + pmd_t pmd) +{ + return userfaultfd_wp(vma) && pmd_uffd_wp(pmd); +} + static inline bool userfaultfd_armed(struct vm_area_struct *vma) { return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP); @@ -104,6 +118,19 @@ static inline bool userfaultfd_wp(struct vm_area_struct *vma) return false; } +static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma, + pte_t pte) +{ + return false; +} + +static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma, + pmd_t pmd) +{ + return false; +} + + static inline bool userfaultfd_armed(struct vm_area_struct *vma) { return false; From patchwork Wed Mar 20 02:06:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 10860687 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0E2AC1390 for ; Wed, 20 Mar 2019 02:08:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org 
(Postfix) with ESMTP id DFE5629800 for ; Wed, 20 Mar 2019 02:08:24 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D4069299AD; Wed, 20 Mar 2019 02:08:24 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 05AB5299AB for ; Wed, 20 Mar 2019 02:08:24 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id F13486B0007; Tue, 19 Mar 2019 22:08:22 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id EC3A56B0266; Tue, 19 Mar 2019 22:08:22 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DB3226B0269; Tue, 19 Mar 2019 22:08:22 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qk1-f198.google.com (mail-qk1-f198.google.com [209.85.222.198]) by kanga.kvack.org (Postfix) with ESMTP id BBA8B6B0007 for ; Tue, 19 Mar 2019 22:08:22 -0400 (EDT) Received: by mail-qk1-f198.google.com with SMTP id o135so19458329qke.11 for ; Tue, 19 Mar 2019 19:08:22 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:from:to:cc :subject:date:message-id:in-reply-to:references; bh=guN7gYgnmDjc0dZDM0nP8l0/tmhZ7IhmP3yHcVRz/U4=; b=eMM/mNeVCLsDM8a+r/SugUDJSkHPydbVvlavpk4DFlzxz322xpj7dfDRIt5YdFYwiP okMrPenzraXCLFPfMYBlvTAPX1L+sj6j7TDsZ73EWI1Y0LxCrWO/5wIq9BLH+3TSns5N 0D9simKYko59kknO2T/Wc3/dB++kXuYc2ff3XFzmTz9THgU26WTycnKD4t2WTMPLDxdZ taOm7HpZrQudwqeYQujjvBUPcjrgpCC7MM3PpUuAZvR9vtGH8MsC24WKUxtDZllpOcmL 
qv1LJo08V4ZHndq6mKk7u2wioiTwnPUEBeFa95yF/EIcGJKu9O2vzZacgzZB7/WP0Bo9 iMRw== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com X-Gm-Message-State: APjAAAXZ1NnSD1HbGhhnSaX3n3CveMsMpxBNfUpyxJCPKtH0I7bKzws5 ffJCSKw5j0+qshbLF1B444WTFn16DlbFev8sWbOk7D2wMvK+JJS75Sf7/LxypDAGqF7cKQ70JJx 41rtrkIcbniPYqUt6aTSGXB3Ec+zu9NRfRzm+Mtn+sJ5uviDiDL+MdLPBmP6vG907bw== X-Received: by 2002:a37:98c7:: with SMTP id a190mr4435407qke.308.1553047702523; Tue, 19 Mar 2019 19:08:22 -0700 (PDT) X-Google-Smtp-Source: APXvYqwgSmq3Ug2bdf8pio4DLaR3SiX/1aEaILZR2y3yWopfaeExifRDTSOfRToiGjeJxZnVtgyj X-Received: by 2002:a37:98c7:: with SMTP id a190mr4435344qke.308.1553047701173; Tue, 19 Mar 2019 19:08:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1553047701; cv=none; d=google.com; s=arc-20160816; b=un2ir8mQZxEWQSO7gSJ7w5JZ+Ddv7W3GZqTHCV2iIT0lXGWIX6IvDLxpB4LNEOGuZB VF6UspuhGwPh/FE9us96E5mp87G2Z0goKTeGGSs23icXXVItJAQV537AGiaz9mlGaINi 0NHdIcPT/MaJPARQtcTk1OAX+7kQkedLfnNDQmTCYem5VxxCgiU6z4Agpxp/1rdCgcWP aydi4ASZMlE6kNGGEf/mb3e25wCmRZeRiMZQk/lpIY7F0ZFvgGwcVuFrgxU48gOGZKo5 uAAWR2eij1o8a1tB6VH7dkcjKcXKk4FnS422+mAbLlH4a60/kQci8OKxUoXvX4VKX+E0 9lxg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:cc:to:from; bh=guN7gYgnmDjc0dZDM0nP8l0/tmhZ7IhmP3yHcVRz/U4=; b=oIoyLRSuzzgvhOaTAmTPATQmOKdDHQ05/7h7FJJbgV0XGhRQAd/maFQ0pUIjrGjTJE RgAUlQDFHzoYL5cTAAWcECgniDkU66Y0Ky2NkL+ExldtWNT4yNh3znOvlnFm8XbJ61Yb f1NWC3+oyIceURywEnllMRSYC2rsJ3zGLZqNO6apTrvmgIE31rBUWLMrWq1LHsyc9AGH iFePVC7Fn9GplSuIS9aBK2XwwTD8kYwJloFZCRkMKWFKOm3ZtKTEGQwtQPuX7cqILmvJ yojlHF2JA8p4ZjaBotXVPaw4kL1GyTQHf/wpfB5v//9V51UkNNdqtqbW9xv4lp9jp0Hh 83Ig== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted 
sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id n54si428411qtf.156.2019.03.19.19.08.20 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 19 Mar 2019 19:08:21 -0700 (PDT) Received-SPF: pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) client-ip=209.132.183.28; Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 37A2A83F42; Wed, 20 Mar 2019 02:08:20 +0000 (UTC) Received: from xz-x1.nay.redhat.com (dhcp-14-116.nay.redhat.com [10.66.14.116]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8ED18605CA; Wed, 20 Mar 2019 02:08:10 +0000 (UTC) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Shaohua Li , Andrea Arcangeli , Mike Kravetz , Denis Plotnikov , Mike Rapoport , Marty McFadden , Mel Gorman , "Kirill A . Shutemov" , "Dr . 
David Alan Gilbert" Subject: [PATCH v3 10/28] userfaultfd: wp: add UFFDIO_COPY_MODE_WP Date: Wed, 20 Mar 2019 10:06:24 +0800 Message-Id: <20190320020642.4000-11-peterx@redhat.com> In-Reply-To: <20190320020642.4000-1-peterx@redhat.com> References: <20190320020642.4000-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Wed, 20 Mar 2019 02:08:20 +0000 (UTC) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Andrea Arcangeli This allows UFFDIO_COPY to map pages write-protected. Signed-off-by: Andrea Arcangeli [peterx: switch to VM_WARN_ON_ONCE in mfill_atomic_pte; add brackets around "dst_vma->vm_flags & VM_WRITE"; fix wordings in comments and commit messages] Reviewed-by: Jerome Glisse Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- fs/userfaultfd.c | 5 +++-- include/linux/userfaultfd_k.h | 2 +- include/uapi/linux/userfaultfd.h | 11 +++++----- mm/userfaultfd.c | 36 ++++++++++++++++++++++---------- 4 files changed, 35 insertions(+), 19 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index b397bc3b954d..3092885c9d2c 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1683,11 +1683,12 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, ret = -EINVAL; if (uffdio_copy.src + uffdio_copy.len <= uffdio_copy.src) goto out; - if (uffdio_copy.mode & ~UFFDIO_COPY_MODE_DONTWAKE) + if (uffdio_copy.mode & ~(UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP)) goto out; if (mmget_not_zero(ctx->mm)) { ret = mcopy_atomic(ctx->mm, uffdio_copy.dst, uffdio_copy.src, - uffdio_copy.len, &ctx->mmap_changing); + uffdio_copy.len, &ctx->mmap_changing, + uffdio_copy.mode); mmput(ctx->mm); } else { return -ESRCH; diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 
c6590c58ce28..765ce884cec0 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -34,7 +34,7 @@ extern vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason); extern ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long src_start, unsigned long len, - bool *mmap_changing); + bool *mmap_changing, __u64 mode); extern ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long len, diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index 48f1a7c2f1f0..340f23bc251d 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -203,13 +203,14 @@ struct uffdio_copy { __u64 dst; __u64 src; __u64 len; +#define UFFDIO_COPY_MODE_DONTWAKE ((__u64)1<<0) /* - * There will be a wrprotection flag later that allows to map - * pages wrprotected on the fly. And such a flag will be - * available if the wrprotection ioctl are implemented for the - * range according to the uffdio_register.ioctls. + * UFFDIO_COPY_MODE_WP will map the page write protected on + * the fly. UFFDIO_COPY_MODE_WP is available only if the + * write protected ioctl is implemented for the range + * according to the uffdio_register.ioctls. 
*/ -#define UFFDIO_COPY_MODE_DONTWAKE ((__u64)1<<0) +#define UFFDIO_COPY_MODE_WP ((__u64)1<<1) __u64 mode; /* diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index d59b5a73dfb3..eaecc21806da 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -25,7 +25,8 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, struct vm_area_struct *dst_vma, unsigned long dst_addr, unsigned long src_addr, - struct page **pagep) + struct page **pagep, + bool wp_copy) { struct mem_cgroup *memcg; pte_t _dst_pte, *dst_pte; @@ -71,9 +72,9 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, if (mem_cgroup_try_charge(page, dst_mm, GFP_KERNEL, &memcg, false)) goto out_release; - _dst_pte = mk_pte(page, dst_vma->vm_page_prot); - if (dst_vma->vm_flags & VM_WRITE) - _dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte)); + _dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot)); + if ((dst_vma->vm_flags & VM_WRITE) && !wp_copy) + _dst_pte = pte_mkwrite(_dst_pte); dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl); if (dst_vma->vm_file) { @@ -399,7 +400,8 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm, unsigned long dst_addr, unsigned long src_addr, struct page **page, - bool zeropage) + bool zeropage, + bool wp_copy) { ssize_t err; @@ -416,11 +418,13 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm, if (!(dst_vma->vm_flags & VM_SHARED)) { if (!zeropage) err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma, - dst_addr, src_addr, page); + dst_addr, src_addr, page, + wp_copy); else err = mfill_zeropage_pte(dst_mm, dst_pmd, dst_vma, dst_addr); } else { + VM_WARN_ON_ONCE(wp_copy); if (!zeropage) err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, @@ -438,7 +442,8 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, unsigned long src_start, unsigned long len, bool zeropage, - bool *mmap_changing) + bool *mmap_changing, + __u64 mode) { struct vm_area_struct *dst_vma; ssize_t err; @@ -446,6 +451,7 
@@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, unsigned long src_addr, dst_addr; long copied; struct page *page; + bool wp_copy; /* * Sanitize the command parameters: @@ -502,6 +508,14 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, dst_vma->vm_flags & VM_SHARED)) goto out_unlock; + /* + * validate 'mode' now that we know the dst_vma: don't allow + * a wrprotect copy if the userfaultfd didn't register as WP. + */ + wp_copy = mode & UFFDIO_COPY_MODE_WP; + if (wp_copy && !(dst_vma->vm_flags & VM_UFFD_WP)) + goto out_unlock; + /* * If this is a HUGETLB vma, pass off to appropriate routine */ @@ -557,7 +571,7 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, BUG_ON(pmd_trans_huge(*dst_pmd)); err = mfill_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, - src_addr, &page, zeropage); + src_addr, &page, zeropage, wp_copy); cond_resched(); if (unlikely(err == -ENOENT)) { @@ -604,14 +618,14 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long src_start, unsigned long len, - bool *mmap_changing) + bool *mmap_changing, __u64 mode) { return __mcopy_atomic(dst_mm, dst_start, src_start, len, false, - mmap_changing); + mmap_changing, mode); } ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long start, unsigned long len, bool *mmap_changing) { - return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing); + return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing, 0); } From patchwork Wed Mar 20 02:06:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 10860689 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6FC3B15AC for ; Wed, 20 Mar 2019 02:08:31 +0000 
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman, Kirill A. Shutemov, Dr. David Alan Gilbert
Subject: [PATCH v3 11/28] mm: merge parameters for change_protection()
Date: Wed, 20 Mar 2019 10:06:25 +0800
Message-Id: <20190320020642.4000-12-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

change_protection() is used by both the NUMA balancing and the mprotect() code, and there is one parameter for each of these callers (dirty_accountable and prot_numa). Further, these parameters are passed down the whole call chain:

- change_protection_range()
- change_p4d_range()
- change_pud_range()
- change_pmd_range()
- ...

Introduce a flags argument for change_protection() and all these helpers to replace the separate parameters. This avoids passing multiple parameters multiple times along the way and, more importantly, greatly simplifies introducing any new parameter to change_protection(). In the follow-up patches, a new flag for userfaultfd write protection will be introduced. No functional change at all.
Reviewed-by: Jerome Glisse Signed-off-by: Peter Xu --- include/linux/huge_mm.h | 2 +- include/linux/mm.h | 14 +++++++++++++- mm/huge_memory.c | 3 ++- mm/mempolicy.c | 2 +- mm/mprotect.c | 29 ++++++++++++++++------------- 5 files changed, 33 insertions(+), 17 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 381e872bfde0..1550fb12dbd4 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -46,7 +46,7 @@ extern bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr, pmd_t *old_pmd, pmd_t *new_pmd); extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, pgprot_t newprot, - int prot_numa); + unsigned long cp_flags); vm_fault_t vmf_insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr, pmd_t *pmd, pfn_t pfn, bool write); vm_fault_t vmf_insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr, diff --git a/include/linux/mm.h b/include/linux/mm.h index f73dbc4a1957..937559a74dc4 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1682,9 +1682,21 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma, unsigned long old_addr, struct vm_area_struct *new_vma, unsigned long new_addr, unsigned long len, bool need_rmap_locks); + +/* + * Flags used by change_protection(). For now we make it a bitmap so + * that we can pass in multiple flags just like parameters. However + * for now all the callers are only use one of the flags at the same + * time. 
+ */ +/* Whether we should allow dirty bit accounting */ +#define MM_CP_DIRTY_ACCT (1UL << 0) +/* Whether this protection change is for NUMA hints */ +#define MM_CP_PROT_NUMA (1UL << 1) + extern unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, unsigned long end, pgprot_t newprot, - int dirty_accountable, int prot_numa); + unsigned long cp_flags); extern int mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev, unsigned long start, unsigned long end, unsigned long newflags); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index faf357eaf0ce..8d65b0f041f9 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1860,13 +1860,14 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr, * - HPAGE_PMD_NR is protections changed and TLB flush necessary */ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, - unsigned long addr, pgprot_t newprot, int prot_numa) + unsigned long addr, pgprot_t newprot, unsigned long cp_flags) { struct mm_struct *mm = vma->vm_mm; spinlock_t *ptl; pmd_t entry; bool preserve_write; int ret; + bool prot_numa = cp_flags & MM_CP_PROT_NUMA; ptl = __pmd_trans_huge_lock(pmd, vma); if (!ptl) diff --git a/mm/mempolicy.c b/mm/mempolicy.c index ee2bce59d2bf..55aed31b4f04 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -554,7 +554,7 @@ unsigned long change_prot_numa(struct vm_area_struct *vma, { int nr_updated; - nr_updated = change_protection(vma, addr, end, PAGE_NONE, 0, 1); + nr_updated = change_protection(vma, addr, end, PAGE_NONE, MM_CP_PROT_NUMA); if (nr_updated) count_vm_numa_events(NUMA_PTE_UPDATES, nr_updated); diff --git a/mm/mprotect.c b/mm/mprotect.c index 36cb358db170..a6ba448c8565 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -37,13 +37,15 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, pgprot_t newprot, - int dirty_accountable, int prot_numa) + unsigned long cp_flags) { struct mm_struct 
*mm = vma->vm_mm; pte_t *pte, oldpte; spinlock_t *ptl; unsigned long pages = 0; int target_node = NUMA_NO_NODE; + bool dirty_accountable = cp_flags & MM_CP_DIRTY_ACCT; + bool prot_numa = cp_flags & MM_CP_PROT_NUMA; /* * Can be called with only the mmap_sem for reading by @@ -164,7 +166,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, static inline unsigned long change_pmd_range(struct vm_area_struct *vma, pud_t *pud, unsigned long addr, unsigned long end, - pgprot_t newprot, int dirty_accountable, int prot_numa) + pgprot_t newprot, unsigned long cp_flags) { pmd_t *pmd; unsigned long next; @@ -194,7 +196,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, __split_huge_pmd(vma, pmd, addr, false, NULL); } else { int nr_ptes = change_huge_pmd(vma, pmd, addr, - newprot, prot_numa); + newprot, cp_flags); if (nr_ptes) { if (nr_ptes == HPAGE_PMD_NR) { @@ -209,7 +211,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, /* fall through, the trans huge pmd just split */ } this_pages = change_pte_range(vma, pmd, addr, next, newprot, - dirty_accountable, prot_numa); + cp_flags); pages += this_pages; next: cond_resched(); @@ -225,7 +227,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, static inline unsigned long change_pud_range(struct vm_area_struct *vma, p4d_t *p4d, unsigned long addr, unsigned long end, - pgprot_t newprot, int dirty_accountable, int prot_numa) + pgprot_t newprot, unsigned long cp_flags) { pud_t *pud; unsigned long next; @@ -237,7 +239,7 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma, if (pud_none_or_clear_bad(pud)) continue; pages += change_pmd_range(vma, pud, addr, next, newprot, - dirty_accountable, prot_numa); + cp_flags); } while (pud++, addr = next, addr != end); return pages; @@ -245,7 +247,7 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma, static inline unsigned long 
change_p4d_range(struct vm_area_struct *vma, pgd_t *pgd, unsigned long addr, unsigned long end, - pgprot_t newprot, int dirty_accountable, int prot_numa) + pgprot_t newprot, unsigned long cp_flags) { p4d_t *p4d; unsigned long next; @@ -257,7 +259,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma, if (p4d_none_or_clear_bad(p4d)) continue; pages += change_pud_range(vma, p4d, addr, next, newprot, - dirty_accountable, prot_numa); + cp_flags); } while (p4d++, addr = next, addr != end); return pages; @@ -265,7 +267,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma, static unsigned long change_protection_range(struct vm_area_struct *vma, unsigned long addr, unsigned long end, pgprot_t newprot, - int dirty_accountable, int prot_numa) + unsigned long cp_flags) { struct mm_struct *mm = vma->vm_mm; pgd_t *pgd; @@ -282,7 +284,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma, if (pgd_none_or_clear_bad(pgd)) continue; pages += change_p4d_range(vma, pgd, addr, next, newprot, - dirty_accountable, prot_numa); + cp_flags); } while (pgd++, addr = next, addr != end); /* Only flush the TLB if we actually modified any entries: */ @@ -295,14 +297,15 @@ static unsigned long change_protection_range(struct vm_area_struct *vma, unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, unsigned long end, pgprot_t newprot, - int dirty_accountable, int prot_numa) + unsigned long cp_flags) { unsigned long pages; if (is_vm_hugetlb_page(vma)) pages = hugetlb_change_protection(vma, start, end, newprot); else - pages = change_protection_range(vma, start, end, newprot, dirty_accountable, prot_numa); + pages = change_protection_range(vma, start, end, newprot, + cp_flags); return pages; } @@ -430,7 +433,7 @@ mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev, vma_set_page_prot(vma); change_protection(vma, start, end, vma->vm_page_prot, - dirty_accountable, 0); + 
dirty_accountable ? MM_CP_DIRTY_ACCT : 0); /* * Private VM_LOCKED VMA becoming writable: trigger COW to avoid major

From patchwork Wed Mar 20 02:06:26 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860691
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman, Kirill A. Shutemov, Dr. David Alan Gilbert
Subject: [PATCH v3 12/28] userfaultfd: wp: apply _PAGE_UFFD_WP bit
Date: Wed, 20 Mar 2019 10:06:26 +0800
Message-Id: <20190320020642.4000-13-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

First, introduce two new flags, MM_CP_UFFD_WP[_RESOLVE], for change_protection() when used with uffd-wp, and make sure the two new flags are used exclusively of each other. Then:

- For MM_CP_UFFD_WP: apply the _PAGE_UFFD_WP bit and remove _PAGE_RW when a range of memory is write-protected by uffd
- For MM_CP_UFFD_WP_RESOLVE: remove the _PAGE_UFFD_WP bit and recover _PAGE_RW when the write protection is resolved from userspace

Use this new interface in mwriteprotect_range() to replace the old MM_CP_DIRTY_ACCT. Apply the change to both PTEs and huge PMDs. We can then start to identify which PTE/PMD is write-protected for general reasons (e.g., COW or soft-dirty tracking) and which is write-protected for userfaultfd-wp. Since _PAGE_UFFD_WP should be kept across pte_modify(), add it into _PAGE_CHG_MASK as well. Meanwhile, now that we have this new bit, we can be even more strict when detecting uffd-wp page faults in both do_wp_page() and wp_huge_pmd().
Reviewed-by: Jerome Glisse Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- include/linux/mm.h | 5 +++++ mm/huge_memory.c | 14 +++++++++++++- mm/memory.c | 4 ++-- mm/mprotect.c | 12 ++++++++++++ mm/userfaultfd.c | 8 ++++++-- 5 files changed, 38 insertions(+), 5 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 937559a74dc4..b39efe5ca7f6 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1693,6 +1693,11 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma, #define MM_CP_DIRTY_ACCT (1UL << 0) /* Whether this protection change is for NUMA hints */ #define MM_CP_PROT_NUMA (1UL << 1) +/* Whether this change is for write protecting */ +#define MM_CP_UFFD_WP (1UL << 2) /* do wp */ +#define MM_CP_UFFD_WP_RESOLVE (1UL << 3) /* Resolve wp */ +#define MM_CP_UFFD_WP_ALL (MM_CP_UFFD_WP | \ + MM_CP_UFFD_WP_RESOLVE) extern unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, unsigned long end, pgprot_t newprot, diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 8d65b0f041f9..817335b443c2 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1868,6 +1868,8 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, bool preserve_write; int ret; bool prot_numa = cp_flags & MM_CP_PROT_NUMA; + bool uffd_wp = cp_flags & MM_CP_UFFD_WP; + bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE; ptl = __pmd_trans_huge_lock(pmd, vma); if (!ptl) @@ -1934,6 +1936,13 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, entry = pmd_modify(entry, newprot); if (preserve_write) entry = pmd_mk_savedwrite(entry); + if (uffd_wp) { + entry = pmd_wrprotect(entry); + entry = pmd_mkuffd_wp(entry); + } else if (uffd_wp_resolve) { + entry = pmd_mkwrite(entry); + entry = pmd_clear_uffd_wp(entry); + } ret = HPAGE_PMD_NR; set_pmd_at(mm, addr, pmd, entry); BUG_ON(vma_is_anonymous(vma) && !preserve_write && pmd_write(entry)); @@ -2083,7 +2092,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, 
pmd_t *pmd, struct page *page; pgtable_t pgtable; pmd_t old_pmd, _pmd; - bool young, write, soft_dirty, pmd_migration = false; + bool young, write, soft_dirty, pmd_migration = false, uffd_wp = false; unsigned long addr; int i; @@ -2165,6 +2174,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, write = pmd_write(old_pmd); young = pmd_young(old_pmd); soft_dirty = pmd_soft_dirty(old_pmd); + uffd_wp = pmd_uffd_wp(old_pmd); } VM_BUG_ON_PAGE(!page_count(page), page); page_ref_add(page, HPAGE_PMD_NR - 1); @@ -2198,6 +2208,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, entry = pte_mkold(entry); if (soft_dirty) entry = pte_mksoft_dirty(entry); + if (uffd_wp) + entry = pte_mkuffd_wp(entry); } pte = pte_offset_map(&_pmd, addr); BUG_ON(!pte_none(*pte)); diff --git a/mm/memory.c b/mm/memory.c index 567686ec086d..50c2990648ab 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2483,7 +2483,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; - if (userfaultfd_wp(vma)) { + if (userfaultfd_pte_wp(vma, *vmf->pte)) { pte_unmap_unlock(vmf->pte, vmf->ptl); return handle_userfault(vmf, VM_UFFD_WP); } @@ -3690,7 +3690,7 @@ static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf) static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd) { if (vma_is_anonymous(vmf->vma)) { - if (userfaultfd_wp(vmf->vma)) + if (userfaultfd_huge_pmd_wp(vmf->vma, orig_pmd)) return handle_userfault(vmf, VM_UFFD_WP); return do_huge_pmd_wp_page(vmf, orig_pmd); } diff --git a/mm/mprotect.c b/mm/mprotect.c index a6ba448c8565..9d4433044c21 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -46,6 +46,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, int target_node = NUMA_NO_NODE; bool dirty_accountable = cp_flags & MM_CP_DIRTY_ACCT; bool prot_numa = cp_flags & MM_CP_PROT_NUMA; + bool uffd_wp = cp_flags & MM_CP_UFFD_WP; + bool uffd_wp_resolve = cp_flags & 
MM_CP_UFFD_WP_RESOLVE; /* * Can be called with only the mmap_sem for reading by @@ -117,6 +119,14 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, if (preserve_write) ptent = pte_mk_savedwrite(ptent); + if (uffd_wp) { + ptent = pte_wrprotect(ptent); + ptent = pte_mkuffd_wp(ptent); + } else if (uffd_wp_resolve) { + ptent = pte_mkwrite(ptent); + ptent = pte_clear_uffd_wp(ptent); + } + /* Avoid taking write faults for known dirty pages */ if (dirty_accountable && pte_dirty(ptent) && (pte_soft_dirty(ptent) || @@ -301,6 +311,8 @@ unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, { unsigned long pages; + BUG_ON((cp_flags & MM_CP_UFFD_WP_ALL) == MM_CP_UFFD_WP_ALL); + if (is_vm_hugetlb_page(vma)) pages = hugetlb_change_protection(vma, start, end, newprot); else diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index eaecc21806da..240de2a8492d 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -73,8 +73,12 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, goto out_release; _dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot)); - if ((dst_vma->vm_flags & VM_WRITE) && !wp_copy) - _dst_pte = pte_mkwrite(_dst_pte); + if (dst_vma->vm_flags & VM_WRITE) { + if (wp_copy) + _dst_pte = pte_mkuffd_wp(_dst_pte); + else + _dst_pte = pte_mkwrite(_dst_pte); + } dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl); if (dst_vma->vm_file) {

From patchwork Wed Mar 20 02:06:27 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860693
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov", "Dr.
David Alan Gilbert"
Subject: [PATCH v3 13/28] mm: export wp_page_copy()
Date: Wed, 20 Mar 2019 10:06:27 +0800
Message-Id: <20190320020642.4000-14-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

Export this function for use outside the page fault handlers.

Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
---
 include/linux/mm.h | 2 ++
 mm/memory.c        | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b39efe5ca7f6..00b040e0358d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -441,6 +441,8 @@ struct vm_fault {
 	 */
 };
 
+vm_fault_t wp_page_copy(struct vm_fault *vmf);
+
 /* page entry size for vm->huge_fault() */
 enum page_entry_size {
 	PE_SIZE_PTE = 0,
diff --git a/mm/memory.c b/mm/memory.c
index 50c2990648ab..e7a4b9650225 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2239,7 +2239,7 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
  * held to the old page, as well as updating the rmap.
  * - In any case, unlock the PTL and drop the reference we took to the old page.
  */
-static vm_fault_t wp_page_copy(struct vm_fault *vmf)
+vm_fault_t wp_page_copy(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	struct mm_struct *mm = vma->vm_mm;

From patchwork Wed Mar 20 02:06:28 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860695
Received: by mail-qt1-f197.google.com with SMTP id 18so879346qtw.20 for ; Tue, 19 Mar 2019 19:08:51 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:from:to:cc :subject:date:message-id:in-reply-to:references; bh=p2Ht8maPQnGA8z4sqJmE0IlCDOKKYPrI55cTbBsdp2Q=; b=F+dOArjUgBetFLiA7osGOi+/8kar2M7MHwmYTUZ5omUmMIrmW6y/cosjTKmUPHM9C+ 8ZE8x359XsJ6UnHaNfSUYs1lOWwdXSCk55MBUtzUzWIIlJcV/Tx+V2qGK823JA7+L/6y mqCvDQlr0vQKc8pTuPTsoshglZdE3rwhuAUzIGiC9WwMp0mjRw007CtbHLhce77PBuoM fQyde1weHNG3gYGPutpyfkz4tMejMod5WkZnS3Qb/HsWdjU1K5axoQhPeX6DuOzuRbyA Ha/GttSodeNtE27xvixdGJg2tz14Jjg5tWJrTcTTV3LMnetPJ2Po6I3/H5NcvVTxXk51 xCrg== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com X-Gm-Message-State: APjAAAX/7Oi57l7PfODpewkSHFZYR4kwVYE7NRefquUMO2dvlV6XJC44 Z42tNwFxpMdtYBOmNUeIa7wcluLaF+RrAAiVlhGuYX6LV5dVH3UVKFts/QH4+xDkXK/559ulveE UPT3UzFXB3pJitnSLUvp8aKyiTA1s829VQFQzuB+twAi24O7kfS9vYT6tHTbNjr3n8w== X-Received: by 2002:a0c:91f0:: with SMTP id r45mr4920392qvr.7.1553047730987; Tue, 19 Mar 2019 19:08:50 -0700 (PDT) X-Google-Smtp-Source: APXvYqxmshbnYlLaujRZWliZTamGhzru3gwfQ+QipXUHZTzfXIlYS9g//fXM0yy1J90RgI3snl76 X-Received: by 2002:a0c:91f0:: with SMTP id r45mr4920343qvr.7.1553047729690; Tue, 19 Mar 2019 19:08:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1553047729; cv=none; d=google.com; s=arc-20160816; b=YCk4lQE2ICULpbs9eNe+cIGvCb9RxXPhnh+yQ38xQy5lX0cXzttnZ7ymePSKqoDwhO YYqLT+o6bP8RRBXvtc0BlJSZq7EtC6bH6x04HurKgSBTWl2PAhICQuamDPKqRoJ09PDF QjN+95s4PFk53NNe0Fy6YIOMdsikqyUjmwV97cmXkxRIFQiDF4qA+ebKsn/WBNEjs8B/ 7cmDjzx7HpyyXyuSdpFp6O3hCehlkzBwpbM0VWKI92jJ1rrekIBFWyDvL057KuO+IyDe 7Y5pJfvJYUIwcSrMmzfdfSPS6qrlHS1Ub+QG0a58JXUNgLXus6ApsclaD2lhILsVbNVJ GNJA== ARC-Message-Signature: i=1; a=rsa-sha256; 
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner,
peterx@redhat.com, Martin Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v3 14/28] userfaultfd: wp: handle COW properly for uffd-wp
Date: Wed, 20 Mar 2019 10:06:28 +0800
Message-Id: <20190320020642.4000-15-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

This allows uffd-wp to support write-protected pages for COW. For example, an uffd write-protected PTE could also be write-protected for other reasons, such as COW or zero pages. When that happens, we can't simply set the write bit in the PTE, since that would change the content seen by every other reference to the page. Instead, we should do the COW first if necessary, then handle the uffd-wp fault.

To copy the page correctly, we'll also need to carry over the _PAGE_UFFD_WP bit if it was set in the original PTE.

For huge PMDs, we simply always split the huge PMD when we want to resolve an uffd-wp page fault. That matches what we do for general huge PMD write protection, and it reduces the huge PMD copy-on-write problem to PTE copy-on-write.
Signed-off-by: Peter Xu
---
 mm/memory.c   |  5 +++-
 mm/mprotect.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 65 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index e7a4b9650225..b8a4c0bab461 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2291,7 +2291,10 @@ vm_fault_t wp_page_copy(struct vm_fault *vmf)
 	}
 	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 	entry = mk_pte(new_page, vma->vm_page_prot);
-	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+	if (pte_uffd_wp(vmf->orig_pte))
+		entry = pte_mkuffd_wp(entry);
+	else
+		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 	/*
 	 * Clear the pte entry and flush it first, before updating the
 	 * pte with the new entry. This will avoid a race condition
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9d4433044c21..855dddb07ff2 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -73,18 +73,18 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 	flush_tlb_batched_pending(vma->vm_mm);
 	arch_enter_lazy_mmu_mode();
 	do {
+retry_pte:
 		oldpte = *pte;
 		if (pte_present(oldpte)) {
 			pte_t ptent;
 			bool preserve_write = prot_numa && pte_write(oldpte);
+			struct page *page;
 
 			/*
 			 * Avoid trapping faults against the zero or KSM
 			 * pages. See similar comment in change_huge_pmd.
 			 */
 			if (prot_numa) {
-				struct page *page;
-
 				page = vm_normal_page(vma, addr, oldpte);
 				if (!page || PageKsm(page))
 					continue;
@@ -114,6 +114,54 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				continue;
 			}
 
+			/*
+			 * Detect whether we'll need to COW before
+			 * resolving an uffd-wp fault.
Note that this
+			 * includes detection of the zero page (where
+			 * page==NULL)
+			 */
+			if (uffd_wp_resolve) {
+				/* If the fault is resolved already, skip */
+				if (!pte_uffd_wp(*pte))
+					continue;
+				page = vm_normal_page(vma, addr, oldpte);
+				if (!page || page_mapcount(page) > 1) {
+					struct vm_fault vmf = {
+						.vma = vma,
+						.address = addr & PAGE_MASK,
+						.page = page,
+						.orig_pte = oldpte,
+						.pmd = pmd,
+						/* pte and ptl not needed */
+					};
+					vm_fault_t ret;
+
+					if (page)
+						get_page(page);
+					arch_leave_lazy_mmu_mode();
+					pte_unmap_unlock(pte, ptl);
+					ret = wp_page_copy(&vmf);
+					/* PTE is changed, or OOM */
+					if (ret == 0)
+						/* It's done by others */
+						continue;
+					else if (WARN_ON(ret != VM_FAULT_WRITE))
+						return pages;
+					pte = pte_offset_map_lock(vma->vm_mm,
+								  pmd, addr,
+								  &ptl);
+					arch_enter_lazy_mmu_mode();
+					if (!pte_present(*pte))
+						/*
+						 * This PTE could have been
+						 * modified after COW
+						 * before we have taken the
+						 * lock; retry this PTE
+						 */
+						goto retry_pte;
+				}
+			}
+
 			ptent = ptep_modify_prot_start(mm, addr, pte);
 			ptent = pte_modify(ptent, newprot);
 			if (preserve_write)
@@ -183,6 +231,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 	unsigned long pages = 0;
 	unsigned long nr_huge_updates = 0;
 	struct mmu_notifier_range range;
+	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
 
 	range.start = 0;
@@ -202,7 +251,16 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		}
 		if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) ||
		    pmd_devmap(*pmd)) {
-			if (next - addr != HPAGE_PMD_SIZE) {
+			/*
+			 * When resolving an userfaultfd write
+			 * protection fault, it's not easy to identify
+			 * whether a THP is shared with others and
+			 * whether we'll need to do copy-on-write, so
+			 * just split it always for now to simply the
+			 * procedure. And that's the policy too for
+			 * general THP write-protect in af9e4d5f2de2.
+			 */
+			if (next - addr != HPAGE_PMD_SIZE || uffd_wp_resolve) {
 				__split_huge_pmd(vma, pmd, addr, false, NULL);
 			} else {
 				int nr_ptes = change_huge_pmd(vma, pmd, addr,

From patchwork Wed Mar 20 02:06:29 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860697
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin
Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v3 15/28] userfaultfd: wp: drop _PAGE_UFFD_WP properly when fork
Date: Wed, 20 Mar 2019 10:06:29 +0800
Message-Id: <20190320020642.4000-16-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

UFFD_EVENT_FORK support for uffd-wp should already be there, except that we should clear the uffd-wp bit when the uffd fork event is not enabled. Detect that case to avoid _PAGE_UFFD_WP being set even though the VMA is not tracked by VM_UFFD_WP. Do this for both small PTEs and huge PMDs.

Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 mm/huge_memory.c | 8 ++++++++
 mm/memory.c      | 8 ++++++++
 2 files changed, 16 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 817335b443c2..fb2234cb595a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -938,6 +938,14 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	ret = -EAGAIN;
 	pmd = *src_pmd;
 
+	/*
+	 * Make sure the _PAGE_UFFD_WP bit is cleared if the new VMA
+	 * does not have the VM_UFFD_WP, which means that the uffd
+	 * fork event is not enabled.
+	 */
+	if (!(vma->vm_flags & VM_UFFD_WP))
+		pmd = pmd_clear_uffd_wp(pmd);
+
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 	if (unlikely(is_swap_pmd(pmd))) {
 		swp_entry_t entry = pmd_to_swp_entry(pmd);
diff --git a/mm/memory.c b/mm/memory.c
index b8a4c0bab461..6405d56debee 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -788,6 +788,14 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		pte = pte_mkclean(pte);
 	pte = pte_mkold(pte);
 
+	/*
+	 * Make sure the _PAGE_UFFD_WP bit is cleared if the new VMA
+	 * does not have the VM_UFFD_WP, which means that the uffd
+	 * fork event is not enabled.
+	 */
+	if (!(vm_flags & VM_UFFD_WP))
+		pte = pte_clear_uffd_wp(pte);
+
 	page = vm_normal_page(vma, addr, pte);
 	if (page) {
 		get_page(page);

From patchwork Wed Mar 20 02:06:30 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860699
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov", "Dr.
David Alan Gilbert"
Subject: [PATCH v3 16/28] userfaultfd: wp: add pmd_swp_*uffd_wp() helpers
Date: Wed, 20 Mar 2019 10:06:30 +0800
Message-Id: <20190320020642.4000-17-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

Add the missing helpers for uffd-wp operations with pmd swap/migration entries.

Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 arch/x86/include/asm/pgtable.h     | 15 +++++++++++++++
 include/asm-generic/pgtable_uffd.h | 15 +++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 6863236e8484..18a815d6f4ea 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1401,6 +1401,21 @@ static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
 {
 	return pte_clear_flags(pte, _PAGE_SWP_UFFD_WP);
 }
+
+static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+{
+	return pmd_set_flags(pmd, _PAGE_SWP_UFFD_WP);
+}
+
+static inline int pmd_swp_uffd_wp(pmd_t pmd)
+{
+	return pmd_flags(pmd) & _PAGE_SWP_UFFD_WP;
+}
+
+static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd_clear_flags(pmd, _PAGE_SWP_UFFD_WP);
+}
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
 #define PKRU_AD_BIT 0x1
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
index 643d1bf559c2..828966d4c281 100644
--- a/include/asm-generic/pgtable_uffd.h
+++ b/include/asm-generic/pgtable_uffd.h
@@ -46,6 +46,21 @@ static __always_inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
 {
 	return pte;
 }
+
+static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static inline int pmd_swp_uffd_wp(pmd_t pmd)
+{
+	return 0;
+}
+
+static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
 #endif /* _ASM_GENERIC_PGTABLE_UFFD_H */

From patchwork Wed Mar 20 02:06:31 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860701
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v3 17/28] userfaultfd: wp: support swap and page migration
Date: Wed, 20 Mar 2019 10:06:31 +0800
Message-Id: <20190320020642.4000-18-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

For both swap and page migration entries, bit 2 of the entry identifies whether the entry is uffd write-protected. It plays a role similar to the existing soft-dirty bit in swap entries, but only tracks the uffd-wp state for a specific PTE/PMD. One special point: when recovering the uffd-wp bit from a swap/migration entry back into the PTE, we also need to take care of the _PAGE_RW bit and make sure it is cleared; otherwise, even with the _PAGE_UFFD_WP bit set, the write can never trap. Note that this patch removes two lines from "userfaultfd: wp: hook userfault handler to write protection fault", where we tried to remove VM_FAULT_WRITE from vmf->flags when uffd-wp is set for the VMA. This patch keeps the write flag there.
Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- include/linux/swapops.h | 2 ++ mm/huge_memory.c | 3 +++ mm/memory.c | 6 ++++++ mm/migrate.c | 4 ++++ mm/mprotect.c | 2 ++ mm/rmap.c | 6 ++++++ 6 files changed, 23 insertions(+) diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 4d961668e5fc..0c2923b1cdb7 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -68,6 +68,8 @@ static inline swp_entry_t pte_to_swp_entry(pte_t pte) if (pte_swp_soft_dirty(pte)) pte = pte_swp_clear_soft_dirty(pte); + if (pte_swp_uffd_wp(pte)) + pte = pte_swp_clear_uffd_wp(pte); arch_entry = __pte_to_swp_entry(pte); return swp_entry(__swp_type(arch_entry), __swp_offset(arch_entry)); } diff --git a/mm/huge_memory.c b/mm/huge_memory.c index fb2234cb595a..75de07141801 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2175,6 +2175,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, write = is_write_migration_entry(entry); young = false; soft_dirty = pmd_swp_soft_dirty(old_pmd); + uffd_wp = pmd_swp_uffd_wp(old_pmd); } else { page = pmd_page(old_pmd); if (pmd_dirty(old_pmd)) @@ -2207,6 +2208,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, entry = swp_entry_to_pte(swp_entry); if (soft_dirty) entry = pte_swp_mksoft_dirty(entry); + if (uffd_wp) + entry = pte_swp_mkuffd_wp(entry); } else { entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot)); entry = maybe_mkwrite(entry, vma); diff --git a/mm/memory.c b/mm/memory.c index 6405d56debee..c3d57fa890f2 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -736,6 +736,8 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, pte = swp_entry_to_pte(entry); if (pte_swp_soft_dirty(*src_pte)) pte = pte_swp_mksoft_dirty(pte); + if (pte_swp_uffd_wp(*src_pte)) + pte = pte_swp_mkuffd_wp(pte); set_pte_at(src_mm, addr, src_pte, pte); } } else if (is_device_private_entry(entry)) { @@ -2825,6 +2827,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) 
flush_icache_page(vma, page); if (pte_swp_soft_dirty(vmf->orig_pte)) pte = pte_mksoft_dirty(pte); + if (pte_swp_uffd_wp(vmf->orig_pte)) { + pte = pte_mkuffd_wp(pte); + pte = pte_wrprotect(pte); + } set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte); arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte); vmf->orig_pte = pte; diff --git a/mm/migrate.c b/mm/migrate.c index 181f5d2718a9..72cde187d4a1 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -241,6 +241,8 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma, entry = pte_to_swp_entry(*pvmw.pte); if (is_write_migration_entry(entry)) pte = maybe_mkwrite(pte, vma); + else if (pte_swp_uffd_wp(*pvmw.pte)) + pte = pte_mkuffd_wp(pte); if (unlikely(is_zone_device_page(new))) { if (is_device_private_page(new)) { @@ -2301,6 +2303,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, swp_pte = swp_entry_to_pte(entry); if (pte_soft_dirty(pte)) swp_pte = pte_swp_mksoft_dirty(swp_pte); + if (pte_uffd_wp(pte)) + swp_pte = pte_swp_mkuffd_wp(swp_pte); set_pte_at(mm, addr, ptep, swp_pte); /* diff --git a/mm/mprotect.c b/mm/mprotect.c index 855dddb07ff2..96c0f521099d 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -196,6 +196,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, newpte = swp_entry_to_pte(entry); if (pte_swp_soft_dirty(oldpte)) newpte = pte_swp_mksoft_dirty(newpte); + if (pte_swp_uffd_wp(oldpte)) + newpte = pte_swp_mkuffd_wp(newpte); set_pte_at(mm, addr, pte, newpte); pages++; diff --git a/mm/rmap.c b/mm/rmap.c index 0454ecc29537..3750d5a5283c 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1469,6 +1469,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, swp_pte = swp_entry_to_pte(entry); if (pte_soft_dirty(pteval)) swp_pte = pte_swp_mksoft_dirty(swp_pte); + if (pte_uffd_wp(pteval)) + swp_pte = pte_swp_mkuffd_wp(swp_pte); set_pte_at(mm, pvmw.address, pvmw.pte, swp_pte); /* * No need to invalidate here it will 
synchronize on @@ -1561,6 +1563,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, swp_pte = swp_entry_to_pte(entry); if (pte_soft_dirty(pteval)) swp_pte = pte_swp_mksoft_dirty(swp_pte); + if (pte_uffd_wp(pteval)) + swp_pte = pte_swp_mkuffd_wp(swp_pte); set_pte_at(mm, address, pvmw.pte, swp_pte); /* * No need to invalidate here it will synchronize on @@ -1627,6 +1631,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, swp_pte = swp_entry_to_pte(entry); if (pte_soft_dirty(pteval)) swp_pte = pte_swp_mksoft_dirty(swp_pte); + if (pte_uffd_wp(pteval)) + swp_pte = pte_swp_mkuffd_wp(swp_pte); set_pte_at(mm, address, pvmw.pte, swp_pte); /* Invalidate as we cleared the pte */ mmu_notifier_invalidate_range(mm, address,

From patchwork Wed Mar 20 02:06:32 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860703
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v3 18/28] khugepaged: skip collapse if uffd-wp detected
Date: Wed, 20 Mar 2019 10:06:32 +0800
Message-Id: <20190320020642.4000-19-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

Don't collapse a huge PMD if any of the small PTEs is userfault write-protected. The problem is that write protection is tracked at small-page granularity, and there is no way to keep that per-page write-protection information once the small pages are merged into a huge PMD. The same consideration applies to swap entries and migration entries, so do the check for those as well, disregarding khugepaged_max_ptes_swap.
Reviewed-by: Jerome Glisse Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- include/trace/events/huge_memory.h | 1 + mm/khugepaged.c | 23 +++++++++++++++++++++++ 2 files changed, 24 insertions(+) diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h index dd4db334bd63..2d7bad9cb976 100644 --- a/include/trace/events/huge_memory.h +++ b/include/trace/events/huge_memory.h @@ -13,6 +13,7 @@ EM( SCAN_PMD_NULL, "pmd_null") \ EM( SCAN_EXCEED_NONE_PTE, "exceed_none_pte") \ EM( SCAN_PTE_NON_PRESENT, "pte_non_present") \ + EM( SCAN_PTE_UFFD_WP, "pte_uffd_wp") \ EM( SCAN_PAGE_RO, "no_writable_page") \ EM( SCAN_LACK_REFERENCED_PAGE, "lack_referenced_page") \ EM( SCAN_PAGE_NULL, "page_null") \ diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 4f017339ddb2..396c7e4da83e 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -29,6 +29,7 @@ enum scan_result { SCAN_PMD_NULL, SCAN_EXCEED_NONE_PTE, SCAN_PTE_NON_PRESENT, + SCAN_PTE_UFFD_WP, SCAN_PAGE_RO, SCAN_LACK_REFERENCED_PAGE, SCAN_PAGE_NULL, @@ -1123,6 +1124,15 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, pte_t pteval = *_pte; if (is_swap_pte(pteval)) { if (++unmapped <= khugepaged_max_ptes_swap) { + /* + * Always be strict with uffd-wp + * enabled swap entries. Please see + * comment below for pte_uffd_wp(). + */ + if (pte_swp_uffd_wp(pteval)) { + result = SCAN_PTE_UFFD_WP; + goto out_unmap; + } continue; } else { result = SCAN_EXCEED_SWAP_PTE; @@ -1142,6 +1152,19 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, result = SCAN_PTE_NON_PRESENT; goto out_unmap; } + if (pte_uffd_wp(pteval)) { + /* + * Don't collapse the page if any of the small + * PTEs are armed with uffd write protection. + * Here we can also mark the new huge pmd as + * write protected if any of the small ones is + * marked but that could bring unknown + * userfault messages that fall outside of + * the registered range. So, just be simple.
+ */ + result = SCAN_PTE_UFFD_WP; + goto out_unmap; + } if (pte_write(pteval)) writable = true;

From patchwork Wed Mar 20 02:06:33 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860705
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Andrea
Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v3 19/28] userfaultfd: introduce helper vma_find_uffd
Date: Wed, 20 Mar 2019 10:06:33 +0800
Message-Id: <20190320020642.4000-20-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

We have multiple places (and more coming) that would like to find a userfault-enabled VMA from an mm struct that covers a specific memory range. This patch introduces a helper for it and applies the helper to the existing code.

Suggested-by: Mike Rapoport Reviewed-by: Jerome Glisse Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- mm/userfaultfd.c | 54 +++++++++++++++++++++++++++--------------------- 1 file changed, 30 insertions(+), 24 deletions(-) diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 240de2a8492d..2606409572b2 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -20,6 +20,34 @@ #include #include "internal.h" +/* + * Find a valid userfault enabled VMA region that covers the whole + * address range, or NULL on failure. Must be called with mmap_sem + * held. + */ +static struct vm_area_struct *vma_find_uffd(struct mm_struct *mm, + unsigned long start, + unsigned long len) +{ + struct vm_area_struct *vma = find_vma(mm, start); + + if (!vma) + return NULL; + + /* + * Check the vma is registered in uffd, this is required to + * enforce the VM_MAYWRITE check done at uffd registration + * time.
+ */ + if (!vma->vm_userfaultfd_ctx.ctx) + return NULL; + + if (start < vma->vm_start || start + len > vma->vm_end) + return NULL; + + return vma; +} + static int mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd, struct vm_area_struct *dst_vma, @@ -228,20 +256,9 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, */ if (!dst_vma) { err = -ENOENT; - dst_vma = find_vma(dst_mm, dst_start); + dst_vma = vma_find_uffd(dst_mm, dst_start, len); if (!dst_vma || !is_vm_hugetlb_page(dst_vma)) goto out_unlock; - /* - * Check the vma is registered in uffd, this is - * required to enforce the VM_MAYWRITE check done at - * uffd registration time. - */ - if (!dst_vma->vm_userfaultfd_ctx.ctx) - goto out_unlock; - - if (dst_start < dst_vma->vm_start || - dst_start + len > dst_vma->vm_end) - goto out_unlock; err = -EINVAL; if (vma_hpagesize != vma_kernel_pagesize(dst_vma)) @@ -488,20 +505,9 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, * both valid and fully within a single existing vma. */ err = -ENOENT; - dst_vma = find_vma(dst_mm, dst_start); + dst_vma = vma_find_uffd(dst_mm, dst_start, len); if (!dst_vma) goto out_unlock; - /* - * Check the vma is registered in uffd, this is required to - * enforce the VM_MAYWRITE check done at uffd registration - * time. 
- */ - if (!dst_vma->vm_userfaultfd_ctx.ctx) - goto out_unlock; - - if (dst_start < dst_vma->vm_start || - dst_start + len > dst_vma->vm_end) - goto out_unlock; err = -EINVAL; /* From patchwork Wed Mar 20 02:06:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 10860707 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 209691390 for ; Wed, 20 Mar 2019 02:09:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0143529480 for ; Wed, 20 Mar 2019 02:09:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E7B5A294AB; Wed, 20 Mar 2019 02:09:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5E4BF29480 for ; Wed, 20 Mar 2019 02:09:39 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6FE026B0278; Tue, 19 Mar 2019 22:09:38 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 685656B027A; Tue, 19 Mar 2019 22:09:38 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 527C76B027B; Tue, 19 Mar 2019 22:09:38 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qk1-f197.google.com (mail-qk1-f197.google.com [209.85.222.197]) by kanga.kvack.org (Postfix) with ESMTP id 2B94F6B0278 for ; Tue, 19 Mar 2019 22:09:38 -0400 (EDT) 
Received: by mail-qk1-f197.google.com with SMTP id 23so19467896qkl.16 for ; Tue, 19 Mar 2019 19:09:38 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:from:to:cc :subject:date:message-id:in-reply-to:references; bh=JJc0h+L8EKEiFRWPZySSJxSRlflCCV4CfqhuFZTjUzs=; b=RRWUCNqE24cdRyGBnrEyremQHjvWhBtWQRQUtn90I9nodlg7+eb2rMrojBj1SkjK7L 7RwrQbKiBVWEov9bUUCCNj+JM0XCNok+SZ1t9vE6wDI5O7wrWhfMsVC7uWx9m7GX95ra tBX4doD9e6gsIvFArH5DpbImB7LWjZfeGl87jttcca6SXlyZmqBIYvYzM6DDzM02+NhQ t9oc0z1OWzMhxUZeRHs0bi66tCIhK23ZCW9Ld0SX4KvlGIUDcJAMQSnJNuXiIpcbz9qb ehfcbhOk9t4YJ0MKZotI/wHegWiaiYiEYS10FY/FNriAPVYKXR2ArLvg+rWY4ZXtId8S 3U7w== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com X-Gm-Message-State: APjAAAVcJV304/302X1xr6trYxABlDYrsKBMJyBQhPD7t0/TzH6/yN8v 7vKjIPjnuGxTVBfxnmnCwVaAjXk9MPlj0YuFv2ZjRRha7Ae7RTnyDyhbh+DUN82AE8U8vHMSwZ9 1rUblIjd+1CNOTByrd7/UPjTRbQHrHdKzS4JKbzut6+tdo9SO1GF8c6Y3PCytE/ygvQ== X-Received: by 2002:ac8:2f10:: with SMTP id j16mr4838881qta.29.1553047777972; Tue, 19 Mar 2019 19:09:37 -0700 (PDT) X-Google-Smtp-Source: APXvYqyIHiqAqiZUXRl47O8F7B0f93bvnP72t1+xURHU2Ld6u0N6DqAQ31MQ5ws0XuYs89sfsDJL X-Received: by 2002:ac8:2f10:: with SMTP id j16mr4838859qta.29.1553047777339; Tue, 19 Mar 2019 19:09:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1553047777; cv=none; d=google.com; s=arc-20160816; b=y/6zL4jgc7ZFQWA76x4k02XGHjn/gn9Sf6OnggwxjirO1FkaGXU8rCh0aBzd5ZSdYe IUf+P74OLVnYi8PAlIhbRqy4/9HUG2UybJOSFxg4+lczObJrwn7rQy8qz40H8u11ibD0 VKJYOvf5xTYZm9ezkYxqidltGHBXsN8sk+27vR3JdxARTyGYjeCOQAG/1hLC/gTCy6f6 LKoqp8rBfQsX30nVOSQVYFUvjtJ8G786uU67TcQaWLWCMAgC7yze+LgwEaRji/yApO07 VguiAMzMtT5N4SE1l53257S1lLoGJsjpP/sbDVlGduJBnD569M4fBg/NUNa1oankjF/8 nGMw== ARC-Message-Signature: i=1; a=rsa-sha256; 
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner,
peterx@redhat.com, Martin Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert", Rik van Riel
Subject: [PATCH v3 20/28] userfaultfd: wp: support write protection for userfault vma range
Date: Wed, 20 Mar 2019 10:06:34 +0800
Message-Id: <20190320020642.4000-21-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

From: Shaohua Li

Add an API to enable/disable write protection on a VMA range. Unlike
mprotect, this does not split or merge VMAs.

Cc: Andrea Arcangeli
Cc: Rik van Riel
Cc: Kirill A.
Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
[peterx:
 - use the helper to find VMA;
 - return -ENOENT if not found to match mcopy case;
 - use the new MM_CP_UFFD_WP* flags for change_protection;
 - check against mmap_changing for failures]
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 include/linux/userfaultfd_k.h |  3 ++
 mm/userfaultfd.c              | 54 +++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 765ce884cec0..8f6e6ed544fb 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -39,6 +39,9 @@ extern ssize_t mfill_zeropage(struct mm_struct *dst_mm,
 			      unsigned long dst_start,
 			      unsigned long len,
 			      bool *mmap_changing);
+extern int mwriteprotect_range(struct mm_struct *dst_mm,
+			       unsigned long start, unsigned long len,
+			       bool enable_wp, bool *mmap_changing);

 /* mm helpers */
 static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 2606409572b2..70cea2ff3960 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -639,3 +639,57 @@ ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long start,
 {
 	return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing, 0);
 }
+
+int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
+			unsigned long len, bool enable_wp, bool *mmap_changing)
+{
+	struct vm_area_struct *dst_vma;
+	pgprot_t newprot;
+	int err;
+
+	/*
+	 * Sanitize the command parameters:
+	 */
+	BUG_ON(start & ~PAGE_MASK);
+	BUG_ON(len & ~PAGE_MASK);
+
+	/* Does the address range wrap, or is the span zero-sized? */
+	BUG_ON(start + len <= start);
+
+	down_read(&dst_mm->mmap_sem);
+
+	/*
+	 * If memory mappings are changing because of non-cooperative
+	 * operation (e.g. mremap) running in parallel, bail out and
+	 * request the user to retry later
+	 */
+	err = -EAGAIN;
+	if (mmap_changing && READ_ONCE(*mmap_changing))
+		goto out_unlock;
+
+	err = -ENOENT;
+	dst_vma = vma_find_uffd(dst_mm, start, len);
+	/*
+	 * Make sure the vma is not shared, that the dst range is
+	 * both valid and fully within a single existing vma.
+	 */
+	if (!dst_vma || (dst_vma->vm_flags & VM_SHARED))
+		goto out_unlock;
+	if (!userfaultfd_wp(dst_vma))
+		goto out_unlock;
+	if (!vma_is_anonymous(dst_vma))
+		goto out_unlock;
+
+	if (enable_wp)
+		newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE));
+	else
+		newprot = vm_get_page_prot(dst_vma->vm_flags);
+
+	change_protection(dst_vma, start, start + len, newprot,
+			  enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE);
+
+	err = 0;
+out_unlock:
+	up_read(&dst_mm->mmap_sem);
+	return err;
+}

From patchwork Wed Mar 20 02:06:35 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860709
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov", "Dr.
David Alan Gilbert"
Subject: [PATCH v3 21/28] userfaultfd: wp: add the writeprotect API to userfaultfd ioctl
Date: Wed, 20 Mar 2019 10:06:35 +0800
Message-Id: <20190320020642.4000-22-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

From: Andrea Arcangeli

v1: From: Shaohua Li
v2: cleanups, remove a branch.

[peterx writes up the commit message, as below...]

This patch introduces the new uffd-wp APIs for userspace.

Firstly, we'll allow UFFDIO_REGISTER with write protection tracking by
using the new UFFDIO_REGISTER_MODE_WP flag. Note that this flag can
co-exist with the existing UFFDIO_REGISTER_MODE_MISSING, in which case
the userspace program can not only resolve missing page faults but also
track page data changes along the way.

Secondly, we introduce the new UFFDIO_WRITEPROTECT API to do page-level
write protection tracking. Note that the memory region needs to be
registered with UFFDIO_REGISTER_MODE_WP before that.
Signed-off-by: Andrea Arcangeli
[peterx: remove useless block, write commit message, check against
 VM_MAYWRITE rather than VM_WRITE when register]
Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c                 | 82 +++++++++++++++++++++++++-------
 include/uapi/linux/userfaultfd.h | 23 +++++++++
 2 files changed, 89 insertions(+), 16 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 3092885c9d2c..81962d62520c 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -304,8 +304,11 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	if (!pmd_present(_pmd))
 		goto out;

-	if (pmd_trans_huge(_pmd))
+	if (pmd_trans_huge(_pmd)) {
+		if (!pmd_write(_pmd) && (reason & VM_UFFD_WP))
+			ret = true;
 		goto out;
+	}

 	/*
 	 * the pmd is stable (as in !pmd_trans_unstable) so we can re-read it
@@ -318,6 +321,8 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	 */
 	if (pte_none(*pte))
 		ret = true;
+	if (!pte_write(*pte) && (reason & VM_UFFD_WP))
+		ret = true;
 	pte_unmap(pte);

 out:
@@ -1251,10 +1256,13 @@ static __always_inline int validate_range(struct mm_struct *mm,
 	return 0;
 }

-static inline bool vma_can_userfault(struct vm_area_struct *vma)
+static inline bool vma_can_userfault(struct vm_area_struct *vma,
+				     unsigned long vm_flags)
 {
-	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
-	    vma_is_shmem(vma);
+	/* FIXME: add WP support to hugetlbfs and shmem */
+	return vma_is_anonymous(vma) ||
+		((is_vm_hugetlb_page(vma) || vma_is_shmem(vma)) &&
+		 !(vm_flags & VM_UFFD_WP));
 }

 static int userfaultfd_register(struct userfaultfd_ctx *ctx,
@@ -1286,15 +1294,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	vm_flags = 0;
 	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING)
 		vm_flags |= VM_UFFD_MISSING;
-	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) {
+	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP)
 		vm_flags |= VM_UFFD_WP;
-		/*
-		 * FIXME: remove the below error constraint by
-		 * implementing the wprotect tracking mode.
-		 */
-		ret = -EINVAL;
-		goto out;
-	}

 	ret = validate_range(mm, uffdio_register.range.start,
 			     uffdio_register.range.len);
@@ -1342,7 +1343,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,

 		/* check not compatible vmas */
 		ret = -EINVAL;
-		if (!vma_can_userfault(cur))
+		if (!vma_can_userfault(cur, vm_flags))
 			goto out_unlock;

 		/*
@@ -1370,6 +1371,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 			if (end & (vma_hpagesize - 1))
 				goto out_unlock;
 		}
+		if ((vm_flags & VM_UFFD_WP) && !(cur->vm_flags & VM_MAYWRITE))
+			goto out_unlock;

 		/*
 		 * Check that this vma isn't already owned by a
@@ -1399,7 +1402,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	do {
 		cond_resched();

-		BUG_ON(!vma_can_userfault(vma));
+		BUG_ON(!vma_can_userfault(vma, vm_flags));
 		BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
 		       vma->vm_userfaultfd_ctx.ctx != ctx);
 		WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
@@ -1534,7 +1537,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 		 * provides for more strict behavior to notice
 		 * unregistration errors.
 		 */
-		if (!vma_can_userfault(cur))
+		if (!vma_can_userfault(cur, cur->vm_flags))
 			goto out_unlock;

 		found = true;
@@ -1548,7 +1551,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 	do {
 		cond_resched();

-		BUG_ON(!vma_can_userfault(vma));
+		BUG_ON(!vma_can_userfault(vma, vma->vm_flags));

 		/*
 		 * Nothing to do: this vma is already registered into this
@@ -1761,6 +1764,50 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx,
 	return ret;
 }

+static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
+				    unsigned long arg)
+{
+	int ret;
+	struct uffdio_writeprotect uffdio_wp;
+	struct uffdio_writeprotect __user *user_uffdio_wp;
+	struct userfaultfd_wake_range range;
+
+	if (READ_ONCE(ctx->mmap_changing))
+		return -EAGAIN;
+
+	user_uffdio_wp = (struct uffdio_writeprotect __user *) arg;
+
+	if (copy_from_user(&uffdio_wp, user_uffdio_wp,
+			   sizeof(struct uffdio_writeprotect)))
+		return -EFAULT;
+
+	ret = validate_range(ctx->mm, uffdio_wp.range.start,
+			     uffdio_wp.range.len);
+	if (ret)
+		return ret;
+
+	if (uffdio_wp.mode & ~(UFFDIO_WRITEPROTECT_MODE_DONTWAKE |
+			       UFFDIO_WRITEPROTECT_MODE_WP))
+		return -EINVAL;
+	if ((uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP) &&
+	    (uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE))
+		return -EINVAL;
+
+	ret = mwriteprotect_range(ctx->mm, uffdio_wp.range.start,
+				  uffdio_wp.range.len, uffdio_wp.mode &
+				  UFFDIO_WRITEPROTECT_MODE_WP,
+				  &ctx->mmap_changing);
+	if (ret)
+		return ret;
+
+	if (!(uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) {
+		range.start = uffdio_wp.range.start;
+		range.len = uffdio_wp.range.len;
+		wake_userfault(ctx, &range);
+	}
+	return ret;
+}
+
 static inline unsigned int uffd_ctx_features(__u64 user_features)
 {
 	/*
@@ -1838,6 +1885,9 @@ static long userfaultfd_ioctl(struct file *file, unsigned cmd,
 	case UFFDIO_ZEROPAGE:
 		ret = userfaultfd_zeropage(ctx, arg);
 		break;
+	case UFFDIO_WRITEPROTECT:
+		ret = userfaultfd_writeprotect(ctx, arg);
+		break;
 	}
 	return ret;
 }
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 340f23bc251d..95c4a160e5f8 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -52,6 +52,7 @@
 #define _UFFDIO_WAKE			(0x02)
 #define _UFFDIO_COPY			(0x03)
 #define _UFFDIO_ZEROPAGE		(0x04)
+#define _UFFDIO_WRITEPROTECT		(0x06)
 #define _UFFDIO_API			(0x3F)

 /* userfaultfd ioctl ids */
@@ -68,6 +69,8 @@
 				      struct uffdio_copy)
 #define UFFDIO_ZEROPAGE		_IOWR(UFFDIO, _UFFDIO_ZEROPAGE,	\
 				      struct uffdio_zeropage)
+#define UFFDIO_WRITEPROTECT	_IOWR(UFFDIO, _UFFDIO_WRITEPROTECT, \
+				      struct uffdio_writeprotect)

 /* read() structure */
 struct uffd_msg {
@@ -232,4 +235,24 @@ struct uffdio_zeropage {
 	__s64 zeropage;
 };

+struct uffdio_writeprotect {
+	struct uffdio_range range;
+/*
+ * UFFDIO_WRITEPROTECT_MODE_WP: set the flag to write protect a range,
+ * unset the flag to undo protection of a range which was previously
+ * write protected.
+ *
+ * UFFDIO_WRITEPROTECT_MODE_DONTWAKE: set the flag to avoid waking up
+ * any wait thread after the operation succeeds.
+ *
+ * NOTE: Write protecting a region (WP=1) is unrelated to page faults,
+ * therefore DONTWAKE flag is meaningless with WP=1. Removing write
+ * protection (WP=0) in response to a page fault wakes the faulting
+ * task unless DONTWAKE is set.
+ */
+#define UFFDIO_WRITEPROTECT_MODE_WP		((__u64)1<<0)
+#define UFFDIO_WRITEPROTECT_MODE_DONTWAKE	((__u64)1<<1)
+	__u64 mode;
+};
+
 #endif /* _LINUX_USERFAULTFD_H */

From patchwork Wed Mar 20 02:06:36 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860711
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin
Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert", Pavel Emelyanov, Rik van Riel
Subject: [PATCH v3 22/28] userfaultfd: wp: enabled write protection in userfaultfd API
Date: Wed, 20 Mar 2019 10:06:36 +0800
Message-Id: <20190320020642.4000-23-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

From: Shaohua Li

Now it is safe to enable write protection in the userfaultfd API.

Cc: Andrea Arcangeli
Cc: Pavel Emelyanov
Cc: Rik van Riel
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
Reviewed-by: Mike Rapoport
---
 include/uapi/linux/userfaultfd.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 95c4a160e5f8..e7e98bde221f 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -19,7 +19,8 @@
  * means the userland is reading).
 */
 #define UFFD_API ((__u64)0xAA)
-#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK |		\
+#define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP |	\
+			   UFFD_FEATURE_EVENT_FORK |		\
 			   UFFD_FEATURE_EVENT_REMAP |		\
 			   UFFD_FEATURE_EVENT_REMOVE |		\
 			   UFFD_FEATURE_EVENT_UNMAP |		\
@@ -34,7 +35,8 @@
 #define UFFD_API_RANGE_IOCTLS			\
 	((__u64)1 << _UFFDIO_WAKE |		\
 	 (__u64)1 << _UFFDIO_COPY |		\
-	 (__u64)1 << _UFFDIO_ZEROPAGE)
+	 (__u64)1 << _UFFDIO_ZEROPAGE |		\
+	 (__u64)1 << _UFFDIO_WRITEPROTECT)
 #define UFFD_API_RANGE_IOCTLS_BASIC		\
 	((__u64)1 << _UFFDIO_WAKE |		\
 	 (__u64)1 << _UFFDIO_COPY)

From patchwork Wed Mar 20 02:06:37 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860713
[209.132.183.28]) by mx.google.com with ESMTPS id d21si274180qvd.68.2019.03.19.19.10.09 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 19 Mar 2019 19:10:10 -0700 (PDT) Received-SPF: pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) client-ip=209.132.183.28; Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3107E8535C; Wed, 20 Mar 2019 02:10:09 +0000 (UTC) Received: from xz-x1.nay.redhat.com (dhcp-14-116.nay.redhat.com [10.66.14.116]) by smtp.corp.redhat.com (Postfix) with ESMTP id A06456014C; Wed, 20 Mar 2019 02:09:57 +0000 (UTC) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Shaohua Li , Andrea Arcangeli , Mike Kravetz , Denis Plotnikov , Mike Rapoport , Marty McFadden , Mel Gorman , "Kirill A . Shutemov" , "Dr . 
David Alan Gilbert" Subject: [PATCH v3 23/28] userfaultfd: wp: don't wake up when doing write protect Date: Wed, 20 Mar 2019 10:06:37 +0800 Message-Id: <20190320020642.4000-24-peterx@redhat.com> In-Reply-To: <20190320020642.4000-1-peterx@redhat.com> References: <20190320020642.4000-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Wed, 20 Mar 2019 02:10:09 +0000 (UTC) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP It does not make sense to try to wake up any waiting thread when we're write-protecting a memory region. Only wake up when resolving a write protected page fault. Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- fs/userfaultfd.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 81962d62520c..f1f61a0278c2 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1771,6 +1771,7 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx, struct uffdio_writeprotect uffdio_wp; struct uffdio_writeprotect __user *user_uffdio_wp; struct userfaultfd_wake_range range; + bool mode_wp, mode_dontwake; if (READ_ONCE(ctx->mmap_changing)) return -EAGAIN; @@ -1789,18 +1790,20 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx, if (uffdio_wp.mode & ~(UFFDIO_WRITEPROTECT_MODE_DONTWAKE | UFFDIO_WRITEPROTECT_MODE_WP)) return -EINVAL; - if ((uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP) && - (uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) + + mode_wp = uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP; + mode_dontwake = uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE; + + if (mode_wp && mode_dontwake) return -EINVAL; ret = mwriteprotect_range(ctx->mm, uffdio_wp.range.start, - uffdio_wp.range.len, 
uffdio_wp.mode & - UFFDIO_WRITEPROTECT_MODE_WP, + uffdio_wp.range.len, mode_wp, &ctx->mmap_changing); if (ret) return ret; - if (!(uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) { + if (!mode_wp && !mode_dontwake) { range.start = uffdio_wp.range.start; range.len = uffdio_wp.range.len; wake_userfault(ctx, &range); From patchwork Wed Mar 20 02:06:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 10860715 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F3E791390 for ; Wed, 20 Mar 2019 02:10:19 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D618D29480 for ; Wed, 20 Mar 2019 02:10:19 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CA32D29482; Wed, 20 Mar 2019 02:10:19 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DF3C62966B for ; Wed, 20 Mar 2019 02:10:18 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D3EC16B0280; Tue, 19 Mar 2019 22:10:17 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id CC71A6B0282; Tue, 19 Mar 2019 22:10:17 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B67B56B0283; Tue, 19 Mar 2019 22:10:17 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qt1-f197.google.com 
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com,
    Martin Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz,
    Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman,
    "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v3 24/28] userfaultfd: wp: UFFDIO_REGISTER_MODE_WP documentation update
Date: Wed, 20 Mar 2019 10:06:38 +0800
Message-Id: <20190320020642.4000-25-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

From: Martin Cracauer

Adds documentation about the write protection support.

Signed-off-by: Martin Cracauer
Signed-off-by: Andrea Arcangeli
[peterx: rewrite in rst format; fixups here and there]
Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
Reviewed-by: Mike Rapoport
---
 Documentation/admin-guide/mm/userfaultfd.rst | 51 ++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index 5048cf661a8a..c30176e67900 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -108,6 +108,57 @@ UFFDIO_COPY. They're atomic as in guaranteeing that nothing can see an
 half copied page since it'll keep userfaulting until the copy has
 finished.
+Notes:
+
+- If you requested UFFDIO_REGISTER_MODE_MISSING when registering then
+  you must provide some kind of page in your thread after reading from
+  the uffd.  You must provide either UFFDIO_COPY or UFFDIO_ZEROPAGE.
+  The normal behavior of the OS automatically providing a zero page on
+  an anonymous mmap is not in place.
+
+- None of the page-delivering ioctls default to the range that you
+  registered with.  You must fill in all fields of the appropriate
+  ioctl struct, including the range.
+
+- You get the address of the access that triggered the missing-page
+  event out of a struct uffd_msg that you read in the thread from the
+  uffd.  You can supply as many pages as you want with UFFDIO_COPY or
+  UFFDIO_ZEROPAGE.  Keep in mind that unless you used DONTWAKE, the
+  first of any of those ioctls wakes up the faulting thread.
+
+- Be sure to test for all errors, including (pollfd[0].revents &
+  POLLERR).  This can happen, e.g., when the ranges supplied were
+  incorrect.
+
+Write Protect Notifications
+---------------------------
+
+This is equivalent to (but faster than) using mprotect and a SIGSEGV
+signal handler.
+
+First you need to register a range with UFFDIO_REGISTER_MODE_WP.
+Instead of using mprotect(2) you use ioctl(uffd, UFFDIO_WRITEPROTECT,
+struct *uffdio_writeprotect) with mode = UFFDIO_WRITEPROTECT_MODE_WP
+in the struct passed in.  The range does not default to, and does not
+have to be identical to, the range you registered with.  You can write
+protect as many ranges as you like (inside the registered range).
+Then, in the thread reading from the uffd, the struct will have
+msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP set.  Now you send
+ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect) again
+while pagefault.mode does not have UFFDIO_WRITEPROTECT_MODE_WP set.
+This wakes up the thread, which will continue to run with writes
+allowed.  This lets you do the bookkeeping about the write in the
+uffd-reading thread before the ioctl.
+
+If you registered with both UFFDIO_REGISTER_MODE_MISSING and
+UFFDIO_REGISTER_MODE_WP then you need to think about the sequence in
+which you supply a page and undo the write protect.  Note that there
+is a difference between writes into a WP area and into a !WP area.
+The former will have UFFD_PAGEFAULT_FLAG_WP set, the latter
+UFFD_PAGEFAULT_FLAG_WRITE.  The latter did not fail on protection,
+but you still need to supply a page when UFFDIO_REGISTER_MODE_MISSING
+was used.
+
 QEMU/KVM
 ========

From patchwork Wed Mar 20 02:06:39 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860717
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com,
    Martin Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz,
    Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman,
    "Kirill A. Shutemov", "Dr.
David Alan Gilbert" Subject: [PATCH v3 25/28] userfaultfd: wp: fixup swap entries in change_pte_range Date: Wed, 20 Mar 2019 10:06:39 +0800 Message-Id: <20190320020642.4000-26-peterx@redhat.com> In-Reply-To: <20190320020642.4000-1-peterx@redhat.com> References: <20190320020642.4000-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.48]); Wed, 20 Mar 2019 02:10:23 +0000 (UTC) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP In change_pte_range() we do nothing for uffd if the PTE is a swap entry. That can lead to data mismatch if the page that we are going to write protect is swapped out when sending the UFFDIO_WRITEPROTECT. This patch applies/removes the uffd-wp bit even for the swap entries. Signed-off-by: Peter Xu --- I kept this patch a standalone one majorly to make review easier. The patch can be considered as standalone or to squash into the patch "userfaultfd: wp: support swap and page migration". 
---
 mm/mprotect.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 96c0f521099d..a23e03053787 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -183,11 +183,11 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 			}
 			ptep_modify_prot_commit(mm, addr, pte, ptent);
 			pages++;
-		} else if (IS_ENABLED(CONFIG_MIGRATION)) {
+		} else if (is_swap_pte(oldpte)) {
 			swp_entry_t entry = pte_to_swp_entry(oldpte);
+			pte_t newpte;
 
 			if (is_write_migration_entry(entry)) {
-				pte_t newpte;
 				/*
 				 * A protection check is difficult so
 				 * just be safe and disable write
@@ -198,22 +198,24 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				newpte = pte_swp_mksoft_dirty(newpte);
 				if (pte_swp_uffd_wp(oldpte))
 					newpte = pte_swp_mkuffd_wp(newpte);
-				set_pte_at(mm, addr, pte, newpte);
-
-				pages++;
-			}
-
-			if (is_write_device_private_entry(entry)) {
-				pte_t newpte;
-
+			} else if (is_write_device_private_entry(entry)) {
 				/*
 				 * We do not preserve soft-dirtiness. See
 				 * copy_one_pte() for explanation.
 				 */
 				make_device_private_entry_read(&entry);
 				newpte = swp_entry_to_pte(entry);
-				set_pte_at(mm, addr, pte, newpte);
+			} else {
+				newpte = oldpte;
+			}
+
+			if (uffd_wp)
+				newpte = pte_swp_mkuffd_wp(newpte);
+			else if (uffd_wp_resolve)
+				newpte = pte_swp_clear_uffd_wp(newpte);
+
+			if (!pte_same(oldpte, newpte)) {
+				set_pte_at(mm, addr, pte, newpte);
 				pages++;
 			}
 		}

From patchwork Wed Mar 20 02:06:40 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860719
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com,
    Martin Cracauer, Shaohua Li, Andrea Arcangeli, Mike Kravetz,
    Denis Plotnikov, Mike Rapoport, Marty McFadden, Mel Gorman,
    "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v3 26/28] userfaultfd: wp: declare _UFFDIO_WRITEPROTECT conditionally
Date: Wed, 20 Mar 2019 10:06:40 +0800
Message-Id: <20190320020642.4000-27-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

Only declare _UFFDIO_WRITEPROTECT if the user specified
UFFDIO_REGISTER_MODE_WP and if all the checks passed.  Then when the
user registers regions with shmem/hugetlbfs we won't expose the new
ioctl to them.  Even with a completely anonymous memory range, we'll
only expose the new WP ioctl bit if the register mode has MODE_WP.

Signed-off-by: Peter Xu
Reviewed-by: Mike Rapoport
---
 fs/userfaultfd.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index f1f61a0278c2..7f87e9e4fb9b 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1456,14 +1456,24 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	up_write(&mm->mmap_sem);
 	mmput(mm);
 	if (!ret) {
+		__u64 ioctls_out;
+
+		ioctls_out = basic_ioctls ?
UFFD_API_RANGE_IOCTLS_BASIC :
+			UFFD_API_RANGE_IOCTLS;
+
+		/*
+		 * Declare the WP ioctl only if the WP mode is
+		 * specified and all checks passed with the range
+		 */
+		if (!(uffdio_register.mode & UFFDIO_REGISTER_MODE_WP))
+			ioctls_out &= ~((__u64)1 << _UFFDIO_WRITEPROTECT);
+
 		/*
 		 * Now that we scanned all vmas we can already tell
 		 * userland which ioctls methods are guaranteed to
 		 * succeed on this range.
 		 */
-		if (put_user(basic_ioctls ? UFFD_API_RANGE_IOCTLS_BASIC :
-			     UFFD_API_RANGE_IOCTLS,
-			     &user_uffdio_register->ioctls))
+		if (put_user(ioctls_out, &user_uffdio_register->ioctls))
 			ret = -EFAULT;
 	}
 out:

From patchwork Wed Mar 20 02:06:41 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860721
[209.132.183.28]) by mx.google.com with ESMTPS id m5si222194qvi.208.2019.03.19.19.10.40 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 19 Mar 2019 19:10:40 -0700 (PDT) Received-SPF: pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) client-ip=209.132.183.28; Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A99ED3082231; Wed, 20 Mar 2019 02:10:39 +0000 (UTC) Received: from xz-x1.nay.redhat.com (dhcp-14-116.nay.redhat.com [10.66.14.116]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4873C605CA; Wed, 20 Mar 2019 02:10:32 +0000 (UTC) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Shaohua Li , Andrea Arcangeli , Mike Kravetz , Denis Plotnikov , Mike Rapoport , Marty McFadden , Mel Gorman , "Kirill A . Shutemov" , "Dr . 
David Alan Gilbert"
Subject: [PATCH v3 27/28] userfaultfd: selftests: refactor statistics
Date: Wed, 20 Mar 2019 10:06:41 +0800
Message-Id: <20190320020642.4000-28-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

Introduce a uffd_stats structure to hold the self-test statistics, and at
the same time refactor the code so that both the read()- and poll()-based
fault handling threads take a uffd_stats pointer, instead of returning
their results through two different mechanisms. No functional change.

With the new structure it is easy to introduce new statistics.
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 tools/testing/selftests/vm/userfaultfd.c | 76 +++++++++++++++---------
 1 file changed, 49 insertions(+), 27 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 5d1db824f73a..e5d12c209e09 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -88,6 +88,12 @@ static char *area_src, *area_src_alias, *area_dst, *area_dst_alias;
 static char *zeropage;
 pthread_attr_t attr;
 
+/* Userfaultfd test statistics */
+struct uffd_stats {
+	int cpu;
+	unsigned long missing_faults;
+};
+
 /* pthread_mutex_t starts at page offset 0 */
 #define area_mutex(___area, ___nr)					\
 	((pthread_mutex_t *) ((___area) + (___nr)*page_size))
@@ -127,6 +133,17 @@ static void usage(void)
 	exit(1);
 }
 
+static void uffd_stats_reset(struct uffd_stats *uffd_stats,
+			     unsigned long n_cpus)
+{
+	int i;
+
+	for (i = 0; i < n_cpus; i++) {
+		uffd_stats[i].cpu = i;
+		uffd_stats[i].missing_faults = 0;
+	}
+}
+
 static int anon_release_pages(char *rel_area)
 {
 	int ret = 0;
@@ -469,8 +486,8 @@ static int uffd_read_msg(int ufd, struct uffd_msg *msg)
 	return 0;
 }
 
-/* Return 1 if page fault handled by us; otherwise 0 */
-static int uffd_handle_page_fault(struct uffd_msg *msg)
+static void uffd_handle_page_fault(struct uffd_msg *msg,
+				   struct uffd_stats *stats)
 {
 	unsigned long offset;
 
@@ -485,18 +502,19 @@ static int uffd_handle_page_fault(struct uffd_msg *msg)
 	offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
 	offset &= ~(page_size-1);
 
-	return copy_page(uffd, offset);
+	if (copy_page(uffd, offset))
+		stats->missing_faults++;
 }
 
 static void *uffd_poll_thread(void *arg)
 {
-	unsigned long cpu = (unsigned long) arg;
+	struct uffd_stats *stats = (struct uffd_stats *)arg;
+	unsigned long cpu = stats->cpu;
 	struct pollfd pollfd[2];
 	struct uffd_msg msg;
 	struct uffdio_register uffd_reg;
 	int ret;
 	char tmp_chr;
-	unsigned long userfaults = 0;
 
 	pollfd[0].fd = uffd;
 	pollfd[0].events = POLLIN;
@@ -526,7 +544,7 @@ static void *uffd_poll_thread(void *arg)
 				msg.event), exit(1);
 			break;
 		case UFFD_EVENT_PAGEFAULT:
-			userfaults += uffd_handle_page_fault(&msg);
+			uffd_handle_page_fault(&msg, stats);
 			break;
 		case UFFD_EVENT_FORK:
 			close(uffd);
@@ -545,28 +563,27 @@
 			break;
 		}
 	}
-	return (void *)userfaults;
+
+	return NULL;
 }
 
 pthread_mutex_t uffd_read_mutex = PTHREAD_MUTEX_INITIALIZER;
 
 static void *uffd_read_thread(void *arg)
 {
-	unsigned long *this_cpu_userfaults;
+	struct uffd_stats *stats = (struct uffd_stats *)arg;
 	struct uffd_msg msg;
 
-	this_cpu_userfaults = (unsigned long *) arg;
-	*this_cpu_userfaults = 0;
-
 	pthread_mutex_unlock(&uffd_read_mutex);
 	/* from here cancellation is ok */
 
 	for (;;) {
 		if (uffd_read_msg(uffd, &msg))
 			continue;
-		(*this_cpu_userfaults) += uffd_handle_page_fault(&msg);
+		uffd_handle_page_fault(&msg, stats);
 	}
-	return (void *)NULL;
+
+	return NULL;
 }
 
 static void *background_thread(void *arg)
@@ -582,13 +599,12 @@
 	return NULL;
 }
 
-static int stress(unsigned long *userfaults)
+static int stress(struct uffd_stats *uffd_stats)
 {
 	unsigned long cpu;
 	pthread_t locking_threads[nr_cpus];
 	pthread_t uffd_threads[nr_cpus];
 	pthread_t background_threads[nr_cpus];
-	void **_userfaults = (void **) userfaults;
 
 	finished = 0;
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
@@ -597,12 +613,13 @@
 			return 1;
 		if (bounces & BOUNCE_POLL) {
 			if (pthread_create(&uffd_threads[cpu], &attr,
-					   uffd_poll_thread, (void *)cpu))
+					   uffd_poll_thread,
+					   (void *)&uffd_stats[cpu]))
 				return 1;
 		} else {
 			if (pthread_create(&uffd_threads[cpu], &attr,
 					   uffd_read_thread,
-					   &_userfaults[cpu]))
+					   (void *)&uffd_stats[cpu]))
 				return 1;
 			pthread_mutex_lock(&uffd_read_mutex);
 		}
@@ -639,7 +656,8 @@
 			fprintf(stderr, "pipefd write error\n");
 			return 1;
 		}
-		if (pthread_join(uffd_threads[cpu], &_userfaults[cpu]))
+		if (pthread_join(uffd_threads[cpu],
+				 (void *)&uffd_stats[cpu]))
 			return 1;
 	} else {
 		if (pthread_cancel(uffd_threads[cpu]))
@@ -910,11 +928,11 @@ static int userfaultfd_events_test(void)
 {
 	struct uffdio_register uffdio_register;
 	unsigned long expected_ioctls;
-	unsigned long userfaults;
 	pthread_t uffd_mon;
 	int err, features;
 	pid_t pid;
 	char c;
+	struct uffd_stats stats = { 0 };
 
 	printf("testing events (fork, remap, remove): ");
 	fflush(stdout);
@@ -941,7 +959,7 @@
 			"unexpected missing ioctl for anon memory\n"),
 			exit(1);
 
-	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL))
+	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats))
 		perror("uffd_poll_thread create"), exit(1);
 
 	pid = fork();
@@ -957,13 +975,13 @@
 
 	if (write(pipefd[1], &c, sizeof(c)) != sizeof(c))
 		perror("pipe write"), exit(1);
-	if (pthread_join(uffd_mon, (void **)&userfaults))
+	if (pthread_join(uffd_mon, NULL))
 		return 1;
 
 	close(uffd);
-	printf("userfaults: %ld\n", userfaults);
+	printf("userfaults: %ld\n", stats.missing_faults);
 
-	return userfaults != nr_pages;
+	return stats.missing_faults != nr_pages;
 }
 
 static int userfaultfd_sig_test(void)
@@ -975,6 +993,7 @@
 	int err, features;
 	pid_t pid;
 	char c;
+	struct uffd_stats stats = { 0 };
 
 	printf("testing signal delivery: ");
 	fflush(stdout);
@@ -1006,7 +1025,7 @@
 	if (uffd_test_ops->release_pages(area_dst))
 		return 1;
 
-	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL))
+	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats))
 		perror("uffd_poll_thread create"), exit(1);
 
 	pid = fork();
@@ -1032,6 +1051,7 @@
 	close(uffd);
 	return userfaults != 0;
 }
+
 static int userfaultfd_stress(void)
 {
 	void *area;
@@ -1040,7 +1060,7 @@
 	struct uffdio_register uffdio_register;
 	unsigned long cpu;
 	int err;
-	unsigned long userfaults[nr_cpus];
+	struct uffd_stats uffd_stats[nr_cpus];
 
 	uffd_test_ops->allocate_area((void **)&area_src);
 	if (!area_src)
@@ -1169,8 +1189,10 @@
 		if (uffd_test_ops->release_pages(area_dst))
 			return 1;
 
+		uffd_stats_reset(uffd_stats, nr_cpus);
+
 		/* bounce pass */
-		if (stress(userfaults))
+		if (stress(uffd_stats))
 			return 1;
 
 		/* unregister */
@@ -1213,7 +1235,7 @@
 
 			printf("userfaults:");
 			for (cpu = 0; cpu < nr_cpus; cpu++)
-				printf(" %lu", userfaults[cpu]);
+				printf(" %lu", uffd_stats[cpu].missing_faults);
 			printf("\n");
 		}

From patchwork Wed Mar 20 02:06:42 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10860723
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov,
    Mike Rapoport, Marty McFadden, Mel Gorman, "Kirill A. Shutemov", "Dr.
David Alan Gilbert"
Subject: [PATCH v3 28/28] userfaultfd: selftests: add write-protect test
Date: Wed, 20 Mar 2019 10:06:42 +0800
Message-Id: <20190320020642.4000-29-peterx@redhat.com>
In-Reply-To: <20190320020642.4000-1-peterx@redhat.com>
References: <20190320020642.4000-1-peterx@redhat.com>

This patch adds uffd tests for write protection. Instead of introducing
new test cases, we simply squash the uffd-wp tests into the existing
uffd-missing test cases. The changes are:

(1) Bouncing tests

    We exercise write protection in two ways during the bouncing test:

    - By using UFFDIO_COPY_MODE_WP when resolving MISSING pages: this
      makes sure that during each bounce every single page will fault at
      least twice: once for MISSING and once for WP.

    - By calling UFFDIO_WRITEPROTECT directly on already-faulted memory:
      to further torture the explicit page protection procedures of
      uffd-wp, we split each bounce procedure into two halves (in the
      background thread). The first half is MISSING+WP for each page, as
      explained above. After the first half, we write-protect the faulted
      region in the background thread, so that at least half of the pages
      will be write-protected again; this exercises the new
      UFFDIO_WRITEPROTECT call. We then continue with the second half,
      which sees both MISSING and WP faults for its own pages, plus the
      WP-only faults left over from the first half.

(2) Event/Signal test

    Mostly the previous tests, but doing MISSING+WP for each page. For
    the sigbus-mode test we need a standalone path to handle the write
    protection faults.
For all tests, do statistics for uffd-wp pages as well.

Signed-off-by: Peter Xu
---
 tools/testing/selftests/vm/userfaultfd.c | 157 +++++++++++++++----
 1 file changed, 133 insertions(+), 24 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index e5d12c209e09..bf1e10db72f5 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -56,6 +56,7 @@
 #include
 #include
 #include
+#include <assert.h>
 
 #include "../kselftest.h"
 
@@ -78,6 +79,8 @@ static int test_type;
 #define ALARM_INTERVAL_SECS 10
 static volatile bool test_uffdio_copy_eexist = true;
 static volatile bool test_uffdio_zeropage_eexist = true;
+/* Whether to test uffd write-protection */
+static bool test_uffdio_wp = false;
 
 static bool map_shared;
 static int huge_fd;
@@ -92,6 +95,7 @@ pthread_attr_t attr;
 struct uffd_stats {
 	int cpu;
 	unsigned long missing_faults;
+	unsigned long wp_faults;
 };
 
 /* pthread_mutex_t starts at page offset 0 */
@@ -141,9 +145,29 @@ static void uffd_stats_reset(struct uffd_stats *uffd_stats,
 	for (i = 0; i < n_cpus; i++) {
 		uffd_stats[i].cpu = i;
 		uffd_stats[i].missing_faults = 0;
+		uffd_stats[i].wp_faults = 0;
 	}
 }
 
+static void uffd_stats_report(struct uffd_stats *stats, int n_cpus)
+{
+	int i;
+	unsigned long long miss_total = 0, wp_total = 0;
+
+	for (i = 0; i < n_cpus; i++) {
+		miss_total += stats[i].missing_faults;
+		wp_total += stats[i].wp_faults;
+	}
+
+	printf("userfaults: %llu missing (", miss_total);
+	for (i = 0; i < n_cpus; i++)
+		printf("%lu+", stats[i].missing_faults);
+	printf("\b), %llu wp (", wp_total);
+	for (i = 0; i < n_cpus; i++)
+		printf("%lu+", stats[i].wp_faults);
+	printf("\b)\n");
+}
+
 static int anon_release_pages(char *rel_area)
 {
 	int ret = 0;
@@ -264,10 +288,15 @@ struct uffd_test_ops {
 	void (*alias_mapping)(__u64 *start, size_t len, unsigned long offset);
 };
 
-#define ANON_EXPECTED_IOCTLS		((1 << _UFFDIO_WAKE) | \
+#define SHMEM_EXPECTED_IOCTLS		((1 << _UFFDIO_WAKE) | \
 					 (1 << _UFFDIO_COPY) | \
 					 (1 << _UFFDIO_ZEROPAGE))
 
+#define ANON_EXPECTED_IOCTLS		((1 << _UFFDIO_WAKE) | \
+					 (1 << _UFFDIO_COPY) | \
+					 (1 << _UFFDIO_ZEROPAGE) | \
+					 (1 << _UFFDIO_WRITEPROTECT))
+
 static struct uffd_test_ops anon_uffd_test_ops = {
 	.expected_ioctls = ANON_EXPECTED_IOCTLS,
 	.allocate_area = anon_allocate_area,
@@ -276,7 +305,7 @@ static struct uffd_test_ops anon_uffd_test_ops = {
 };
 
 static struct uffd_test_ops shmem_uffd_test_ops = {
-	.expected_ioctls = ANON_EXPECTED_IOCTLS,
+	.expected_ioctls = SHMEM_EXPECTED_IOCTLS,
 	.allocate_area = shmem_allocate_area,
 	.release_pages = shmem_release_pages,
 	.alias_mapping = noop_alias_mapping,
@@ -300,6 +329,21 @@ static int my_bcmp(char *str1, char *str2, size_t n)
 	return 0;
 }
 
+static void wp_range(int ufd, __u64 start, __u64 len, bool wp)
+{
+	struct uffdio_writeprotect prms = { 0 };
+
+	/* Write protection page faults */
+	prms.range.start = start;
+	prms.range.len = len;
+	/* Undo write-protect, do wakeup after that */
+	prms.mode = wp ? UFFDIO_WRITEPROTECT_MODE_WP : 0;
+
+	if (ioctl(ufd, UFFDIO_WRITEPROTECT, &prms))
+		fprintf(stderr, "clear WP failed for address 0x%Lx\n",
+			start), exit(1);
+}
+
 static void *locking_thread(void *arg)
 {
 	unsigned long cpu = (unsigned long) arg;
@@ -438,7 +482,10 @@ static int __copy_page(int ufd, unsigned long offset, bool retry)
 	uffdio_copy.dst = (unsigned long) area_dst + offset;
 	uffdio_copy.src = (unsigned long) area_src + offset;
 	uffdio_copy.len = page_size;
-	uffdio_copy.mode = 0;
+	if (test_uffdio_wp)
+		uffdio_copy.mode = UFFDIO_COPY_MODE_WP;
+	else
+		uffdio_copy.mode = 0;
 	uffdio_copy.copy = 0;
 	if (ioctl(ufd, UFFDIO_COPY, &uffdio_copy)) {
 		/* real retval in ufdio_copy.copy */
@@ -495,15 +542,21 @@ static void uffd_handle_page_fault(struct uffd_msg *msg,
 		fprintf(stderr, "unexpected msg event %u\n",
 			msg->event), exit(1);
 
-	if (bounces & BOUNCE_VERIFY &&
-	    msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
-		fprintf(stderr, "unexpected write fault\n"), exit(1);
+	if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP) {
+		wp_range(uffd, msg->arg.pagefault.address, page_size, false);
+		stats->wp_faults++;
+	} else {
+		/* Missing page faults */
+		if (bounces & BOUNCE_VERIFY &&
+		    msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
+			fprintf(stderr, "unexpected write fault\n"), exit(1);
 
-	offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
-	offset &= ~(page_size-1);
+		offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
+		offset &= ~(page_size-1);
 
-	if (copy_page(uffd, offset))
-		stats->missing_faults++;
+		if (copy_page(uffd, offset))
+			stats->missing_faults++;
+	}
 }
 
 static void *uffd_poll_thread(void *arg)
@@ -589,11 +642,30 @@ static void *uffd_read_thread(void *arg)
 static void *background_thread(void *arg)
 {
 	unsigned long cpu = (unsigned long) arg;
-	unsigned long page_nr;
+	unsigned long page_nr, start_nr, mid_nr, end_nr;
+
+	start_nr = cpu * nr_pages_per_cpu;
+	end_nr = (cpu+1) * nr_pages_per_cpu;
+	mid_nr = (start_nr + end_nr) / 2;
+
+	/* Copy the first half of the pages */
+	for (page_nr = start_nr; page_nr < mid_nr; page_nr++)
+		copy_page_retry(uffd, page_nr * page_size);
 
-	for (page_nr = cpu * nr_pages_per_cpu;
-	     page_nr < (cpu+1) * nr_pages_per_cpu;
-	     page_nr++)
+	/*
+	 * If we need to test uffd-wp, set it up now.  Then we'll have
+	 * at least the first half of the pages mapped already which
+	 * can be write-protected for testing
+	 */
+	if (test_uffdio_wp)
+		wp_range(uffd, (unsigned long)area_dst + start_nr * page_size,
+			 nr_pages_per_cpu * page_size, true);
+
+	/*
+	 * Continue the 2nd half of the page copying, handling write
+	 * protection faults if any
+	 */
+	for (page_nr = mid_nr; page_nr < end_nr; page_nr++)
 		copy_page_retry(uffd, page_nr * page_size);
 
 	return NULL;
@@ -755,17 +827,31 @@ static int faulting_process(int signal_test)
 	}
 
 	for (nr = 0; nr < split_nr_pages; nr++) {
+		int steps = 1;
+		unsigned long offset = nr * page_size;
+
 		if (signal_test) {
 			if (sigsetjmp(*sigbuf, 1) != 0) {
-				if (nr == lastnr) {
+				if (steps == 1 && nr == lastnr) {
 					fprintf(stderr, "Signal repeated\n");
 					return 1;
 				}
 
 				lastnr = nr;
 				if (signal_test == 1) {
-					if (copy_page(uffd, nr * page_size))
-						signalled++;
+					if (steps == 1) {
+						/* This is a MISSING request */
+						steps++;
+						if (copy_page(uffd, offset))
+							signalled++;
+					} else {
+						/* This is a WP request */
+						assert(steps == 2);
+						wp_range(uffd,
+							 (__u64)area_dst +
+							 offset,
+							 page_size, false);
+					}
 				} else {
 					signalled++;
 					continue;
@@ -778,8 +864,13 @@
 				fprintf(stderr,
 					"nr %lu memory corruption %Lu %Lu\n",
 					nr, count,
-					count_verify[nr]), exit(1);
-		}
+					count_verify[nr]);
+		}
+		/*
+		 * Trigger write protection (if any) by writing
+		 * the same value back.
+		 */
+		*area_count(area_dst, nr) = count;
 	}
 
 	if (signal_test)
@@ -801,6 +892,11 @@
 				nr, count,
 				count_verify[nr]), exit(1);
 		}
+		/*
+		 * Trigger write protection (if any) by writing
+		 * the same value back.
+		 */
+		*area_count(area_dst, nr) = count;
 	}
 
 	if (uffd_test_ops->release_pages(area_dst))
@@ -904,6 +1000,8 @@ static int userfaultfd_zeropage_test(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		fprintf(stderr, "register failure\n"), exit(1);
 
@@ -949,6 +1047,8 @@ static int userfaultfd_events_test(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		fprintf(stderr, "register failure\n"), exit(1);
 
@@ -979,7 +1079,8 @@
 		return 1;
 
 	close(uffd);
-	printf("userfaults: %ld\n", stats.missing_faults);
+
+	uffd_stats_report(&stats, 1);
 
 	return stats.missing_faults != nr_pages;
 }
@@ -1009,6 +1110,8 @@ static int userfaultfd_sig_test(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		fprintf(stderr, "register failure\n"), exit(1);
 
@@ -1141,6 +1244,8 @@ static int userfaultfd_stress(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) {
 		fprintf(stderr, "register failure\n");
 		return 1;
@@ -1195,6 +1300,11 @@ static int userfaultfd_stress(void)
 		if (stress(uffd_stats))
 			return 1;
 
+		/* Clear all the write protections if there are any */
+		if (test_uffdio_wp)
+			wp_range(uffd, (unsigned long)area_dst,
+				 nr_pages * page_size, false);
+
 		/* unregister */
 		if (ioctl(uffd, UFFDIO_UNREGISTER, &uffdio_register.range)) {
 			fprintf(stderr, "unregister failure\n");
@@ -1233,10 +1343,7 @@
 		area_src_alias = area_dst_alias;
 		area_dst_alias = tmp_area;
 
-		printf("userfaults:");
-		for (cpu = 0; cpu < nr_cpus; cpu++)
-			printf(" %lu", uffd_stats[cpu].missing_faults);
-		printf("\n");
+		uffd_stats_report(uffd_stats, nr_cpus);
 	}
 
 	if (err)
@@ -1276,6 +1383,8 @@ static void set_test_type(const char *type)
 	if (!strcmp(type, "anon")) {
 		test_type = TEST_ANON;
 		uffd_test_ops = &anon_uffd_test_ops;
+		/* Only enable write-protect test for anonymous test */
+		test_uffdio_wp = true;
 	} else if (!strcmp(type, "hugetlb")) {
 		test_type = TEST_HUGETLB;
 		uffd_test_ops = &hugetlb_uffd_test_ops;