From patchwork Thu Jun 20 02:19:44 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005677
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli,
Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v5 01/25] mm: gup: rename "nonblocking" to "locked" where proper
Date: Thu, 20 Jun 2019 10:19:44 +0800
Message-Id: <20190620022008.19172-2-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

There are plenty of places around __get_user_pages() that take a parameter named "nonblocking" which does not really mean "it won't block" (it can block) but instead indicates whether the mmap_sem was released by up_read() during page fault handling, mostly when VM_FAULT_RETRY is returned.

We already use the correct name "locked" in e.g. get_user_pages_locked() and get_user_pages_remote(), but many places still use "nonblocking". Rename those to "locked" where proper, to better suit the functionality of the variable. While at it, fix up some of the comments accordingly.

Reviewed-by: Mike Rapoport
Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
---
 mm/gup.c     | 44 +++++++++++++++++++++-----------------------
 mm/hugetlb.c |  8 ++++----
 2 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index ddde097cf9e4..58d282115d9b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -625,12 +625,12 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
 }
 
 /*
- * mmap_sem must be held on entry. If @nonblocking != NULL and
- * *@flags does not include FOLL_NOWAIT, the mmap_sem may be released.
- * If it is, *@nonblocking will be set to 0 and -EBUSY returned.
+ * mmap_sem must be held on entry. If @locked != NULL and *@flags
+ * does not include FOLL_NOWAIT, the mmap_sem may be released. If it
+ * is, *@locked will be set to 0 and -EBUSY returned.
 */
 static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
-		unsigned long address, unsigned int *flags, int *nonblocking)
+		unsigned long address, unsigned int *flags, int *locked)
 {
	unsigned int fault_flags = 0;
	vm_fault_t ret;
@@ -642,7 +642,7 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
		fault_flags |= FAULT_FLAG_WRITE;
	if (*flags & FOLL_REMOTE)
		fault_flags |= FAULT_FLAG_REMOTE;
-	if (nonblocking)
+	if (locked)
		fault_flags |= FAULT_FLAG_ALLOW_RETRY;
	if (*flags & FOLL_NOWAIT)
		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;
@@ -668,8 +668,8 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
	}
 
	if (ret & VM_FAULT_RETRY) {
-		if (nonblocking && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
-			*nonblocking = 0;
+		if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
+			*locked = 0;
		return -EBUSY;
	}
 
@@ -746,7 +746,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 *		only intends to ensure the pages are faulted in.
 * @vmas:	array of pointers to vmas corresponding to each page.
 *		Or NULL if the caller does not require them.
- * @nonblocking: whether waiting for disk IO or mmap_sem contention
+ * @locked:	whether we're still with the mmap_sem held
 *
 * Returns number of pages pinned. This may be fewer than the number
 * requested. If nr_pages is 0 or negative, returns 0. If no pages
@@ -775,13 +775,11 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 * appropriate) must be called after the page is finished with, and
 * before put_page is called.
 *
- * If @nonblocking != NULL, __get_user_pages will not wait for disk IO
- * or mmap_sem contention, and if waiting is needed to pin all pages,
- * *@nonblocking will be set to 0. Further, if @gup_flags does not
- * include FOLL_NOWAIT, the mmap_sem will be released via up_read() in
- * this case.
+ * If @locked != NULL, *@locked will be set to 0 when mmap_sem is
+ * released by an up_read(). That can happen if @gup_flags does not
+ * have FOLL_NOWAIT.
 *
- * A caller using such a combination of @nonblocking and @gup_flags
+ * A caller using such a combination of @locked and @gup_flags
 * must therefore hold the mmap_sem for reading only, and recognize
 * when it's been released. Otherwise, it must be held for either
 * reading or writing and will not be released.
@@ -793,7 +791,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
		unsigned long start, unsigned long nr_pages,
		unsigned int gup_flags, struct page **pages,
-		struct vm_area_struct **vmas, int *nonblocking)
+		struct vm_area_struct **vmas, int *locked)
 {
	long ret = 0, i = 0;
	struct vm_area_struct *vma = NULL;
@@ -837,7 +835,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
		if (is_vm_hugetlb_page(vma)) {
			i = follow_hugetlb_page(mm, vma, pages, vmas,
					&start, &nr_pages, i,
-					gup_flags, nonblocking);
+					gup_flags, locked);
			continue;
		}
	}
@@ -855,7 +853,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
		page = follow_page_mask(vma, start, foll_flags, &ctx);
		if (!page) {
			ret = faultin_page(tsk, vma, start, &foll_flags,
-					nonblocking);
+					locked);
			switch (ret) {
			case 0:
				goto retry;
@@ -1508,7 +1506,7 @@ EXPORT_SYMBOL(get_user_pages);
 * @vma:   target vma
 * @start: start address
 * @end:   end address
- * @nonblocking:
+ * @locked: whether the mmap_sem is still held
 *
 * This takes care of mlocking the pages too if VM_LOCKED is set.
 *
@@ -1516,14 +1514,14 @@ EXPORT_SYMBOL(get_user_pages);
 *
 * vma->vm_mm->mmap_sem must be held.
 *
- * If @nonblocking is NULL, it may be held for read or write and will
+ * If @locked is NULL, it may be held for read or write and will
 * be unperturbed.
 *
- * If @nonblocking is non-NULL, it must held for read only and may be
- * released. If it's released, *@nonblocking will be set to 0.
+ * If @locked is non-NULL, it must held for read only and may be
+ * released. If it's released, *@locked will be set to 0.
 */
 long populate_vma_page_range(struct vm_area_struct *vma,
-		unsigned long start, unsigned long end, int *nonblocking)
+		unsigned long start, unsigned long end, int *locked)
 {
	struct mm_struct *mm = vma->vm_mm;
	unsigned long nr_pages = (end - start) / PAGE_SIZE;
@@ -1558,7 +1556,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
	 * not result in a stack expansion that recurses back here.
	 */
	return __get_user_pages(current, mm, start, nr_pages, gup_flags,
-				NULL, NULL, nonblocking);
+				NULL, NULL, locked);
 }
 
 /*

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ac843d32b019..ba179c2fa8fb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4240,7 +4240,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
			 struct page **pages, struct vm_area_struct **vmas,
			 unsigned long *position, unsigned long *nr_pages,
-			 long i, unsigned int flags, int *nonblocking)
+			 long i, unsigned int flags, int *locked)
 {
	unsigned long pfn_offset;
	unsigned long vaddr = *position;
@@ -4311,7 +4311,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
			spin_unlock(ptl);
			if (flags & FOLL_WRITE)
				fault_flags |= FAULT_FLAG_WRITE;
-			if (nonblocking)
+			if (locked)
				fault_flags |= FAULT_FLAG_ALLOW_RETRY;
			if (flags & FOLL_NOWAIT)
				fault_flags |= FAULT_FLAG_ALLOW_RETRY |
@@ -4328,9 +4328,9 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
				break;
			}
			if (ret & VM_FAULT_RETRY) {
-				if (nonblocking &&
+				if (locked &&
				    !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
-					*nonblocking = 0;
+					*locked = 0;
				*nr_pages = 0;
				/*
				 * VM_FAULT_RETRY must not return an

From patchwork Thu Jun 20 02:19:45 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005679
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins,
Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert", Linus Torvalds
Subject: [PATCH v5 02/25] mm: userfault: return VM_FAULT_RETRY on signals
Date: Thu, 20 Jun 2019 10:19:45 +0800
Message-Id: <20190620022008.19172-3-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

The idea comes from the upstream discussion between Linus and Andrea: https://lkml.org/lkml/2017/10/30/560

A summary of the issue: handle_userfault() used to have a special path that returned VM_FAULT_NOPAGE when a non-fatal signal was detected while waiting for userfault handling, which it did by reacquiring the mmap_sem before returning. That brings a risk: the vmas might have changed by the time we retake the mmap_sem, and we could even be holding an invalid vma structure.

This patch removes the special path, so we return VM_FAULT_RETRY via the common path even when such signals are pending. Then, for all architectures that pass FAULT_FLAG_ALLOW_RETRY into handle_mm_fault(), we check not only for SIGKILL but for all pending userspace signals right after handle_mm_fault() returns. This allows userspace to handle non-fatal signals faster than before.
This patch is preparation for the next patch, which finally removes the special code path mentioned above from handle_userfault().

Suggested-by: Linus Torvalds
Suggested-by: Andrea Arcangeli
Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
---
 arch/alpha/mm/fault.c      |  2 +-
 arch/arc/mm/fault.c        | 11 ++++-------
 arch/arm/mm/fault.c        |  6 +++---
 arch/arm64/mm/fault.c      |  6 +++---
 arch/hexagon/mm/vm_fault.c |  2 +-
 arch/ia64/mm/fault.c       |  2 +-
 arch/m68k/mm/fault.c       |  2 +-
 arch/microblaze/mm/fault.c |  2 +-
 arch/mips/mm/fault.c       |  2 +-
 arch/nds32/mm/fault.c      |  6 +++---
 arch/nios2/mm/fault.c      |  2 +-
 arch/openrisc/mm/fault.c   |  2 +-
 arch/parisc/mm/fault.c     |  2 +-
 arch/powerpc/mm/fault.c    |  2 ++
 arch/riscv/mm/fault.c      |  4 ++--
 arch/s390/mm/fault.c       |  9 ++++++---
 arch/sh/mm/fault.c         |  4 ++++
 arch/sparc/mm/fault_32.c   |  3 +++
 arch/sparc/mm/fault_64.c   |  3 +++
 arch/um/kernel/trap.c      |  5 ++++-
 arch/unicore32/mm/fault.c  |  4 ++--
 arch/x86/mm/fault.c        |  6 +++++-
 arch/xtensa/mm/fault.c     |  3 +++
 23 files changed, 56 insertions(+), 34 deletions(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index 188fc9256baf..8a2ef90b4bfc 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -150,7 +150,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
	   the fault.
*/ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c index 6836095251ed..3517820aea07 100644 --- a/arch/arc/mm/fault.c +++ b/arch/arc/mm/fault.c @@ -139,17 +139,14 @@ void do_page_fault(unsigned long address, struct pt_regs *regs) */ fault = handle_mm_fault(vma, address, flags); - if (fatal_signal_pending(current)) { - + if (unlikely((fault & VM_FAULT_RETRY) && signal_pending(current))) { + if (fatal_signal_pending(current) && !user_mode(regs)) + goto no_context; /* * if fault retry, mmap_sem already relinquished by core mm * so OK to return to user mode (with signal handled first) */ - if (fault & VM_FAULT_RETRY) { - if (!user_mode(regs)) - goto no_context; - return; - } + return; } perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index 58f69fa07df9..c41c021bbe40 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -314,12 +314,12 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) fault = __do_page_fault(mm, addr, fsr, flags, tsk); - /* If we need to retry but a fatal signal is pending, handle the + /* If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because * it would already be released in __lock_page_or_retry in * mm/filemap.c. 
*/ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - if (!user_mode(regs)) + if (unlikely(fault & VM_FAULT_RETRY && signal_pending(current))) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; return 0; } diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index a30818ed9c60..890ec3a693e6 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -513,13 +513,13 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, if (fault & VM_FAULT_RETRY) { /* - * If we need to retry but a fatal signal is pending, + * If we need to retry but a signal is pending, * handle the signal first. We do not need to release * the mmap_sem because it would already be released * in __lock_page_or_retry in mm/filemap.c. */ - if (fatal_signal_pending(current)) { - if (!user_mode(regs)) + if (signal_pending(current)) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; return 0; } diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c index b7a99aa5b0ba..febb4f96ba6f 100644 --- a/arch/hexagon/mm/vm_fault.c +++ b/arch/hexagon/mm/vm_fault.c @@ -91,7 +91,7 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; /* The most common case -- we are done. 
*/ diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index 5baeb022f474..62c2d39d2bed 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -163,7 +163,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index 9b6163c05a75..d9808a807ab8 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -138,7 +138,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, fault = handle_mm_fault(vma, address, flags); pr_debug("handle_mm_fault returns %x\n", fault); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return 0; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index 202ad6a494f5..4fd2dbd0c5ca 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -217,7 +217,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c index 73d8a0f0b810..92374fd091d2 100644 --- a/arch/mips/mm/fault.c +++ b/arch/mips/mm/fault.c @@ -154,7 +154,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); diff --git a/arch/nds32/mm/fault.c 
b/arch/nds32/mm/fault.c index 68d5f2a27f38..da777de8a62e 100644 --- a/arch/nds32/mm/fault.c +++ b/arch/nds32/mm/fault.c @@ -206,12 +206,12 @@ void do_page_fault(unsigned long entry, unsigned long addr, fault = handle_mm_fault(vma, addr, flags); /* - * If we need to retry but a fatal signal is pending, handle the + * If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because it * would already be released in __lock_page_or_retry in mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - if (!user_mode(regs)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; return; } diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c index 6a2e716b959f..bdb1f9db75ba 100644 --- a/arch/nios2/mm/fault.c +++ b/arch/nios2/mm/fault.c @@ -133,7 +133,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c index 9eee5bf3db27..f9f47dc32f94 100644 --- a/arch/openrisc/mm/fault.c +++ b/arch/openrisc/mm/fault.c @@ -161,7 +161,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index c8e8b7c05558..29422eec329d 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -303,7 +303,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, fault = handle_mm_fault(vma, address, flags); - if ((fault & 
VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index ec6b7ad70659..c2168b298c82 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -616,6 +616,8 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, */ flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; + if (is_user && signal_pending(current)) + return 0; if (!fatal_signal_pending(current)) goto retry; } diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 3e2708c626a8..4aa7a2343353 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -111,11 +111,11 @@ asmlinkage void do_page_fault(struct pt_regs *regs) fault = handle_mm_fault(vma, addr, flags); /* - * If we need to retry but a fatal signal is pending, handle the + * If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because it * would already be released in __lock_page_or_retry in mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(tsk)) + if ((fault & VM_FAULT_RETRY) && signal_pending(tsk)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c index df75d574246d..94087ba285be 100644 --- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -493,9 +493,12 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) * the fault. */ fault = handle_mm_fault(vma, address, flags); - /* No reason to continue if interrupted by SIGKILL. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - fault = VM_FAULT_SIGNAL; + /* Do not continue if interrupted by signals. 
*/ + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) { + if (fatal_signal_pending(current)) + fault = VM_FAULT_SIGNAL; + else + fault = 0; if (flags & FAULT_FLAG_RETRY_NOWAIT) goto out_up; goto out; diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c index 6defd2c6d9b1..baf5d73df40c 100644 --- a/arch/sh/mm/fault.c +++ b/arch/sh/mm/fault.c @@ -506,6 +506,10 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, * have already released it in __lock_page_or_retry * in mm/filemap.c. */ + + if (user_mode(regs) && signal_pending(tsk)) + return; + goto retry; } } diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c index b0440b0edd97..a2c83104fe35 100644 --- a/arch/sparc/mm/fault_32.c +++ b/arch/sparc/mm/fault_32.c @@ -269,6 +269,9 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write, * in mm/filemap.c. */ + if (user_mode(regs) && signal_pending(tsk)) + return; + goto retry; } } diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c index 8f8a604c1300..cad71ec5c7b3 100644 --- a/arch/sparc/mm/fault_64.c +++ b/arch/sparc/mm/fault_64.c @@ -467,6 +467,9 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs) * in mm/filemap.c. 
*/ + if (user_mode(regs) && signal_pending(current)) + return; + goto retry; } } diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c index 0e8b6158f224..05dcd4c5f0d5 100644 --- a/arch/um/kernel/trap.c +++ b/arch/um/kernel/trap.c @@ -76,8 +76,11 @@ int handle_page_fault(unsigned long address, unsigned long ip, fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) { + if (is_user && !fatal_signal_pending(current)) + err = 0; goto out_nosemaphore; + } if (unlikely(fault & VM_FAULT_ERROR)) { if (fault & VM_FAULT_OOM) { diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c index b9a3a50644c1..3611f19234a1 100644 --- a/arch/unicore32/mm/fault.c +++ b/arch/unicore32/mm/fault.c @@ -248,11 +248,11 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs) fault = __do_pf(mm, addr, fsr, flags, tsk); - /* If we need to retry but a fatal signal is pending, handle the + /* If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because * it would already be released in __lock_page_or_retry in * mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return 0; if (!(fault & VM_FAULT_ERROR) && (flags & FAULT_FLAG_ALLOW_RETRY)) { diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 46df4c6aae46..dcd7c1393be3 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1463,16 +1463,20 @@ void do_user_addr_fault(struct pt_regs *regs, * that we made any progress. Handle this case first. 
*/ if (unlikely(fault & VM_FAULT_RETRY)) { + bool is_user = flags & FAULT_FLAG_USER; + /* Retry at most once */ if (flags & FAULT_FLAG_ALLOW_RETRY) { flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; + if (is_user && signal_pending(tsk)) + return; if (!fatal_signal_pending(tsk)) goto retry; } /* User mode? Just return to handle the fatal exception */ - if (flags & FAULT_FLAG_USER) + if (is_user) return; /* Not returning to user mode? Handle exceptions or die: */ diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c index 2ab0e0dcd166..792dad5e2f12 100644 --- a/arch/xtensa/mm/fault.c +++ b/arch/xtensa/mm/fault.c @@ -136,6 +136,9 @@ void do_page_fault(struct pt_regs *regs) * in mm/filemap.c. */ + if (user_mode(regs) && signal_pending(current)) + return; + goto retry; } }

From patchwork Thu Jun 20 02:19:46 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005681
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert", Linus Torvalds
Subject: [PATCH v5 03/25] userfaultfd: don't retake mmap_sem to emulate NOPAGE
Date: Thu, 20 Jun 2019 10:19:46 +0800
Message-Id: <20190620022008.19172-4-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

The idea comes from the upstream discussion between Linus and Andrea:

https://lkml.org/lkml/2017/10/30/560

A summary of the issue: handle_userfault() used to have a special path that returned VM_FAULT_NOPAGE when a non-fatal signal was detected while waiting for userfault handling, and it did so by reacquiring the mmap_sem before returning. That is risky, because the VMAs may have changed by the time we retake the mmap_sem; we could even be holding a stale, invalid vma structure. This patch removes that risky path from handle_userfault(), so the callers of handle_mm_fault() will know that the VMAs may have changed after a retry. Meanwhile, with the previous patch we do not lose responsiveness either, since the core mm code can now handle non-fatal userspace signals quickly even when we return VM_FAULT_RETRY.
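The behavioral pivot in this part of the series is the difference between signal_pending() and fatal_signal_pending(). A rough userspace sketch of that distinction (the mock_* names and the single-word pending mask are illustrative assumptions only; the real kernel predicates walk the task's private and shared signal queues):

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_SIGKILL 9 /* signal number of SIGKILL */

/* Mock: is any signal at all pending? (cf. signal_pending()) */
static bool mock_signal_pending(unsigned long pending_mask)
{
    return pending_mask != 0;
}

/* Mock: is a fatal signal pending? Here only SIGKILL counts as
 * fatal. (cf. fatal_signal_pending()) */
static bool mock_fatal_signal_pending(unsigned long pending_mask)
{
    return (pending_mask >> MOCK_SIGKILL) & 1;
}
```

With only the fatal_signal_pending() check, a retrying fault keeps the task stuck while e.g. SIGSTOP or a debugger's signal is pending; checking signal_pending() lets user-mode faults bail out and deliver any pending signal first, which the previous patch made safe to do.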
Suggested-by: Andrea Arcangeli Suggested-by: Linus Torvalds Reviewed-by: Jerome Glisse Signed-off-by: Peter Xu --- fs/userfaultfd.c | 24 ------------------------ 1 file changed, 24 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 3b30301c90ec..5dbef45ecbf5 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -516,30 +516,6 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) __set_current_state(TASK_RUNNING); - if (return_to_userland) { - if (signal_pending(current) && - !fatal_signal_pending(current)) { - /* - * If we got a SIGSTOP or SIGCONT and this is - * a normal userland page fault, just let - * userland return so the signal will be - * handled and gdb debugging works. The page - * fault code immediately after we return from - * this function is going to release the - * mmap_sem and it's not depending on it - * (unlike gup would if we were not to return - * VM_FAULT_RETRY). - * - * If a fatal signal is pending we still take - * the streamlined VM_FAULT_RETRY failure path - * and there's no need to retake the mmap_sem - * in such case. 
- */ - down_read(&mm->mmap_sem); - ret = VM_FAULT_NOPAGE; - } - } - /* * Here we race with the list_del; list_add in * userfaultfd_ctx_read(), however because we don't ever run

From patchwork Thu Jun 20 02:19:47 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005683
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert", Linus Torvalds
Subject: [PATCH v5 04/25] mm: allow VM_FAULT_RETRY for multiple times
Date: Thu, 20 Jun 2019 10:19:47 +0800
Message-Id: <20190620022008.19172-5-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

The idea comes from a discussion between Linus and Andrea [1]. Before this patch we only allowed a page fault to retry once, which we achieved by clearing the FAULT_FLAG_ALLOW_RETRY flag before calling handle_mm_fault() the second time. This was mainly meant to avoid unexpected starvation of the system caused by looping forever on the page fault for a single page. However, that should rarely happen: every code path that returns VM_FAULT_RETRY first waits for some condition (during which it can yield the CPU) before VM_FAULT_RETRY is actually returned. This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY flag set when we receive VM_FAULT_RETRY, which means the page fault handler can now retry the fault multiple times if necessary, without generating another page fault event. Meanwhile we still keep the FAULT_FLAG_TRIED flag so the page fault handler can identify whether a fault is the first attempt or not.
We then have these combinations of fault flags (considering only the ALLOW_RETRY and TRIED flags):

- ALLOW_RETRY and !TRIED: the page fault allows retry, and this is the first try
- ALLOW_RETRY and TRIED: the page fault allows retry, and this is not the first try
- !ALLOW_RETRY and !TRIED: the page fault does not allow retry at all
- !ALLOW_RETRY and TRIED: this combination is forbidden and should never be used

Existing code has multiple places that take special care of the first condition above by checking (fault_flags & FAULT_FLAG_ALLOW_RETRY). Since now even the second try will have ALLOW_RETRY set, this patch introduces a simple helper that detects the first attempt of a page fault by checking both (fault_flags & FAULT_FLAG_ALLOW_RETRY) and !(fault_flags & FAULT_FLAG_TRIED), and uses that helper in all the existing special paths. One example is __lock_page_or_retry(): we now drop the mmap_sem only on the first attempt of the page fault and keep it across follow-up retries, so the old locking behavior is retained.

This is a nice enhancement to the current code [2] and at the same time supporting material for the future userfaultfd-writeprotect work, since that work will always issue an explicit userfault write-protect retry for protected pages; if that retry cannot resolve the page fault (e.g., when userfaultfd-writeprotect is used in conjunction with swapped pages), a third retry of the page fault may be needed. It might also benefit other potential users with similar requirements, such as userfault write-protection. GUP code is not touched yet and will be covered in a follow-up patch. Please read the thread below for more information.
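The combinations above reduce to the helper this patch adds in include/linux/mm.h; condensed into a userspace-compilable sketch (flag values copied from the patch's mm.h hunk):

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined by this patch in include/linux/mm.h. */
#define FAULT_FLAG_ALLOW_RETRY 0x04 /* Retry fault if blocking */
#define FAULT_FLAG_TRIED       0x20 /* We've tried once */

/*
 * True only for the first combination: retry is allowed and this is
 * the first attempt. Checking ALLOW_RETRY alone is no longer enough,
 * because the flag now stays set across retries.
 */
static bool fault_flag_allow_retry_first(unsigned int flags)
{
    return (flags & FAULT_FLAG_ALLOW_RETRY) &&
           !(flags & FAULT_FLAG_TRIED);
}
```

__lock_page_or_retry() and the ttm fault handler switch to this helper so the mmap_sem is released only on the first attempt; follow-up retries keep holding it, preserving the old locking behavior.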
[1] https://lkml.org/lkml/2017/11/2/833 [2] https://lkml.org/lkml/2018/12/30/64 Suggested-by: Linus Torvalds Suggested-by: Andrea Arcangeli Reviewed-by: Jerome Glisse Signed-off-by: Peter Xu --- arch/alpha/mm/fault.c | 2 +- arch/arc/mm/fault.c | 1 - arch/arm/mm/fault.c | 3 --- arch/arm64/mm/fault.c | 5 ---- arch/hexagon/mm/vm_fault.c | 1 - arch/ia64/mm/fault.c | 1 - arch/m68k/mm/fault.c | 3 --- arch/microblaze/mm/fault.c | 1 - arch/mips/mm/fault.c | 1 - arch/nds32/mm/fault.c | 1 - arch/nios2/mm/fault.c | 3 --- arch/openrisc/mm/fault.c | 1 - arch/parisc/mm/fault.c | 4 +--- arch/powerpc/mm/fault.c | 6 ----- arch/riscv/mm/fault.c | 5 ---- arch/s390/mm/fault.c | 5 +--- arch/sh/mm/fault.c | 1 - arch/sparc/mm/fault_32.c | 1 - arch/sparc/mm/fault_64.c | 1 - arch/um/kernel/trap.c | 1 - arch/unicore32/mm/fault.c | 4 +--- arch/x86/mm/fault.c | 2 -- arch/xtensa/mm/fault.c | 1 - drivers/gpu/drm/ttm/ttm_bo_vm.c | 12 +++++++--- include/linux/mm.h | 41 ++++++++++++++++++++++++++++++++- mm/filemap.c | 2 +- mm/shmem.c | 2 +- 27 files changed, 55 insertions(+), 56 deletions(-) diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c index 8a2ef90b4bfc..6a02c0fb36b9 100644 --- a/arch/alpha/mm/fault.c +++ b/arch/alpha/mm/fault.c @@ -169,7 +169,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; + flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would * have already released it in __lock_page_or_retry diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c index 3517820aea07..144d25b2e044 100644 --- a/arch/arc/mm/fault.c +++ b/arch/arc/mm/fault.c @@ -165,7 +165,6 @@ void do_page_fault(unsigned long address, struct pt_regs *regs) } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index c41c021bbe40..7910b4b5205d 100644 --- a/arch/arm/mm/fault.c +++ 
b/arch/arm/mm/fault.c @@ -342,9 +342,6 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) regs, addr); } if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 890ec3a693e6..c36da19d9098 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -524,12 +524,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, return 0; } - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk of - * starvation. - */ if (mm_flags & FAULT_FLAG_ALLOW_RETRY) { - mm_flags &= ~FAULT_FLAG_ALLOW_RETRY; mm_flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c index febb4f96ba6f..21b6e9d8f2a1 100644 --- a/arch/hexagon/mm/vm_fault.c +++ b/arch/hexagon/mm/vm_fault.c @@ -102,7 +102,6 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index 62c2d39d2bed..9de95d39935e 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -189,7 +189,6 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index d9808a807ab8..b1b2109e4ab4 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -162,9 +162,6 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
*/ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index 4fd2dbd0c5ca..05a4847ac0bf 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -236,7 +236,6 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c index 92374fd091d2..9953b5b571df 100644 --- a/arch/mips/mm/fault.c +++ b/arch/mips/mm/fault.c @@ -178,7 +178,6 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, tsk->min_flt++; } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c index da777de8a62e..3642bdd7909d 100644 --- a/arch/nds32/mm/fault.c +++ b/arch/nds32/mm/fault.c @@ -242,7 +242,6 @@ void do_page_fault(unsigned long entry, unsigned long addr, 1, regs, addr); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c index bdb1f9db75ba..9d4961d51db4 100644 --- a/arch/nios2/mm/fault.c +++ b/arch/nios2/mm/fault.c @@ -157,9 +157,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
*/ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c index f9f47dc32f94..05c754664fcb 100644 --- a/arch/openrisc/mm/fault.c +++ b/arch/openrisc/mm/fault.c @@ -181,7 +181,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, else tsk->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index 29422eec329d..675b221af198 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -327,14 +327,12 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; - /* * No need to up_read(&mm->mmap_sem) as we would * have already released it in __lock_page_or_retry * in mm/filemap.c. */ - + flags |= FAULT_FLAG_TRIED; goto retry; } } diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index c2168b298c82..63564afb24ed 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -608,13 +608,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, * case. */ if (unlikely(fault & VM_FAULT_RETRY)) { - /* We retry only once */ if (flags & FAULT_FLAG_ALLOW_RETRY) { - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. - */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; if (is_user && signal_pending(current)) return 0; diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 4aa7a2343353..cc76c8766951 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -142,11 +142,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs) 1, regs, addr); } if (fault & VM_FAULT_RETRY) { - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
- */ - flags &= ~(FAULT_FLAG_ALLOW_RETRY); flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c index 94087ba285be..e460043776f3 100644 --- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -530,10 +530,7 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) fault = VM_FAULT_PFAULT; goto out_up; } - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. */ - flags &= ~(FAULT_FLAG_ALLOW_RETRY | - FAULT_FLAG_RETRY_NOWAIT); + flags &= ~FAULT_FLAG_RETRY_NOWAIT; flags |= FAULT_FLAG_TRIED; down_read(&mm->mmap_sem); goto retry; diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c index baf5d73df40c..cd710e2d7c57 100644 --- a/arch/sh/mm/fault.c +++ b/arch/sh/mm/fault.c @@ -498,7 +498,6 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, regs, address); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c index a2c83104fe35..6735cd1c09b9 100644 --- a/arch/sparc/mm/fault_32.c +++ b/arch/sparc/mm/fault_32.c @@ -261,7 +261,6 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write, 1, regs, address); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c index cad71ec5c7b3..28d5b4d012c6 100644 --- a/arch/sparc/mm/fault_64.c +++ b/arch/sparc/mm/fault_64.c @@ -459,7 +459,6 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs) 1, regs, address); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c index 05dcd4c5f0d5..e7723c133c7f 100644 --- a/arch/um/kernel/trap.c +++ b/arch/um/kernel/trap.c @@ -99,7 +99,6 @@ int handle_page_fault(unsigned long 
address, unsigned long ip, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c index 3611f19234a1..efca122b5ef7 100644 --- a/arch/unicore32/mm/fault.c +++ b/arch/unicore32/mm/fault.c @@ -261,9 +261,7 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs) else tsk->min_flt++; if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; + flags |= FAULT_FLAG_TRIED; goto retry; } } diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index dcd7c1393be3..8d3fbd3dca75 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1465,9 +1465,7 @@ void do_user_addr_fault(struct pt_regs *regs, if (unlikely(fault & VM_FAULT_RETRY)) { bool is_user = flags & FAULT_FLAG_USER; - /* Retry at most once */ if (flags & FAULT_FLAG_ALLOW_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; if (is_user && signal_pending(tsk)) return; diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c index 792dad5e2f12..7cd55f2d66c9 100644 --- a/arch/xtensa/mm/fault.c +++ b/arch/xtensa/mm/fault.c @@ -128,7 +128,6 @@ void do_page_fault(struct pt_regs *regs) else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c index 6dacff49c1cc..8f2f9ee6effa 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c @@ -61,9 +61,10 @@ static vm_fault_t ttm_bo_vm_fault_idle(struct ttm_buffer_object *bo, /* * If possible, avoid waiting for GPU with mmap_sem - * held. + * held. We only do this if the fault allows retry and this + * is the first attempt. 
*/ - if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) { + if (fault_flag_allow_retry_first(vmf->flags)) { ret = VM_FAULT_RETRY; if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT) goto out_unlock; @@ -132,7 +133,12 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf) * for the buffer to become unreserved. */ if (unlikely(!reservation_object_trylock(bo->resv))) { - if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) { + /* + * If the fault allows retry and this is the first + * fault attempt, we try to release the mmap_sem + * before waiting + */ + if (fault_flag_allow_retry_first(vmf->flags)) { if (!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) { ttm_bo_get(bo); up_read(&vmf->vma->vm_mm->mmap_sem); diff --git a/include/linux/mm.h b/include/linux/mm.h index dd0b5f4e1e45..dcaca899e4a8 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -383,16 +383,55 @@ extern unsigned int kobjsize(const void *objp); */ extern pgprot_t protection_map[16]; +/* + * About FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_TRIED: we can specify whether we + * would allow page faults to retry by specifying these two fault flags + * correctly. Currently there can be three legal combinations: + * + * (a) ALLOW_RETRY and !TRIED: this means the page fault allows retry, and + * this is the first try + * + * (b) ALLOW_RETRY and TRIED: this means the page fault allows retry, and + * we've already tried at least once + * + * (c) !ALLOW_RETRY and !TRIED: this means the page fault does not allow retry + * + * The unlisted combination (!ALLOW_RETRY && TRIED) is illegal and should never + * be used. Note that page faults can be allowed to retry for multiple times, + * in which case we'll have an initial fault with flags (a) then later on + * continuous faults with flags (b). We should always try to detect pending + * signals before a retry to make sure the continuous page faults can still be + * interrupted if necessary. 
+ */ + #define FAULT_FLAG_WRITE 0x01 /* Fault was a write access */ #define FAULT_FLAG_MKWRITE 0x02 /* Fault was mkwrite of existing pte */ #define FAULT_FLAG_ALLOW_RETRY 0x04 /* Retry fault if blocking */ #define FAULT_FLAG_RETRY_NOWAIT 0x08 /* Don't drop mmap_sem and wait when retrying */ #define FAULT_FLAG_KILLABLE 0x10 /* The fault task is in SIGKILL killable region */ -#define FAULT_FLAG_TRIED 0x20 /* Second try */ +#define FAULT_FLAG_TRIED 0x20 /* We've tried once */ #define FAULT_FLAG_USER 0x40 /* The fault originated in userspace */ #define FAULT_FLAG_REMOTE 0x80 /* faulting for non current tsk/mm */ #define FAULT_FLAG_INSTRUCTION 0x100 /* The fault was during an instruction fetch */ +/** + * fault_flag_allow_retry_first - check ALLOW_RETRY the first time + * + * This is mostly used for places where we want to try to avoid taking + * the mmap_sem for too long a time when waiting for another condition + * to change, in which case we can try to be polite to release the + * mmap_sem in the first round to avoid potential starvation of other + * processes that would also want the mmap_sem. + * + * Return: true if the page fault allows retry and this is the first + * attempt of the fault handling; false otherwise. + */ +static inline bool fault_flag_allow_retry_first(unsigned int flags) +{ + return (flags & FAULT_FLAG_ALLOW_RETRY) && + (!(flags & FAULT_FLAG_TRIED)); +} + #define FAULT_FLAG_TRACE \ { FAULT_FLAG_WRITE, "WRITE" }, \ { FAULT_FLAG_MKWRITE, "MKWRITE" }, \ diff --git a/mm/filemap.c b/mm/filemap.c index df2006ba0cfa..83fdf429f795 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1381,7 +1381,7 @@ EXPORT_SYMBOL_GPL(__lock_page_killable); int __lock_page_or_retry(struct page *page, struct mm_struct *mm, unsigned int flags) { - if (flags & FAULT_FLAG_ALLOW_RETRY) { + if (fault_flag_allow_retry_first(flags)) { /* * CAUTION! In this case, mmap_sem is not released * even though return 0. 
diff --git a/mm/shmem.c b/mm/shmem.c index 1bb3b8dc8bb2..ef3a19c83927 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -2009,7 +2009,7 @@ static vm_fault_t shmem_fault(struct vm_fault *vmf) DEFINE_WAIT_FUNC(shmem_fault_wait, synchronous_wake_function); ret = VM_FAULT_NOPAGE; - if ((vmf->flags & FAULT_FLAG_ALLOW_RETRY) && + if (fault_flag_allow_retry_first(vmf->flags) && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) { /* It's polite to up mmap_sem if we can */ up_read(&vma->vm_mm->mmap_sem);
From patchwork Thu Jun 20 02:19:48 2019 X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11005685
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Denis Plotnikov , Shaohua Li , Andrea Arcangeli , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr .
David Alan Gilbert" Subject: [PATCH v5 05/25] mm: gup: allow VM_FAULT_RETRY for multiple times Date: Thu, 20 Jun 2019 10:19:48 +0800 Message-Id: <20190620022008.19172-6-peterx@redhat.com> In-Reply-To: <20190620022008.19172-1-peterx@redhat.com> References: <20190620022008.19172-1-peterx@redhat.com>
This is the gup counterpart of the change that allows VM_FAULT_RETRY to happen more than once.
Reviewed-by: Jerome Glisse Signed-off-by: Peter Xu --- mm/gup.c | 17 +++++++++++++---- mm/hugetlb.c | 6 ++++-- 2 files changed, 17 insertions(+), 6 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index 58d282115d9b..ac8d5b73c212 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -647,7 +647,10 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma, if (*flags & FOLL_NOWAIT) fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT; if (*flags & FOLL_TRIED) { - VM_WARN_ON_ONCE(fault_flags & FAULT_FLAG_ALLOW_RETRY); + /* + * Note: FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_TRIED + * can co-exist + */ fault_flags |= FAULT_FLAG_TRIED; } @@ -1062,17 +1065,23 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk, if (likely(pages)) pages += ret; start += ret << PAGE_SHIFT; + lock_dropped = true; +retry: /* * Repeat on the address that fired VM_FAULT_RETRY - * without FAULT_FLAG_ALLOW_RETRY but with + * with both FAULT_FLAG_ALLOW_RETRY and * FAULT_FLAG_TRIED.
*/ *locked = 1; - lock_dropped = true; down_read(&mm->mmap_sem); ret = __get_user_pages(tsk, mm, start, 1, flags | FOLL_TRIED, - pages, NULL, NULL); + pages, NULL, locked); + if (!*locked) { + /* Continue to retry until we succeeded */ + BUG_ON(ret != 0); + goto retry; + } if (ret != 1) { BUG_ON(ret > 1); if (!pages_done) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index ba179c2fa8fb..d9c739f9a28e 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4317,8 +4317,10 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT; if (flags & FOLL_TRIED) { - VM_WARN_ON_ONCE(fault_flags & - FAULT_FLAG_ALLOW_RETRY); + /* + * Note: FAULT_FLAG_ALLOW_RETRY and + * FAULT_FLAG_TRIED can co-exist + */ fault_flags |= FAULT_FLAG_TRIED; } ret = hugetlb_fault(mm, vma, vaddr, fault_flags);
From patchwork Thu Jun 20 02:19:49 2019 X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11005687
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Denis Plotnikov , Shaohua Li , Andrea Arcangeli , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr .
David Alan Gilbert" , Pavel Emelyanov , Rik van Riel Subject: [PATCH v5 06/25] userfaultfd: wp: add helper for writeprotect check Date: Thu, 20 Jun 2019 10:19:49 +0800 Message-Id: <20190620022008.19172-7-peterx@redhat.com> In-Reply-To: <20190620022008.19172-1-peterx@redhat.com> References: <20190620022008.19172-1-peterx@redhat.com>
From: Shaohua Li
Add a helper for the writeprotect check; it will be used later.
Cc: Andrea Arcangeli Cc: Pavel Emelyanov Cc: Rik van Riel Cc: Kirill A. Shutemov Cc: Mel Gorman Cc: Hugh Dickins Cc: Johannes Weiner Signed-off-by: Shaohua Li Signed-off-by: Andrea Arcangeli Reviewed-by: Jerome Glisse Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- include/linux/userfaultfd_k.h | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index ac9d71e24b81..5dc247af0f2e 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -52,6 +52,11 @@ static inline bool userfaultfd_missing(struct vm_area_struct *vma) return vma->vm_flags & VM_UFFD_MISSING; } +static inline bool userfaultfd_wp(struct vm_area_struct *vma) +{ + return vma->vm_flags & VM_UFFD_WP; +} + static inline bool userfaultfd_armed(struct vm_area_struct *vma) { return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP); @@ -96,6 +101,11 @@ static inline bool userfaultfd_missing(struct vm_area_struct *vma) return false; } +static inline bool userfaultfd_wp(struct vm_area_struct *vma) +{ + return false; +} + static inline bool userfaultfd_armed(struct vm_area_struct *vma) { return false;
From patchwork Thu Jun 20 02:19:50 2019 X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11005689
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Denis Plotnikov , Shaohua Li , Andrea Arcangeli , Mike Kravetz , Marty McFadden ,
Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . David Alan Gilbert" Subject: [PATCH v5 07/25] userfaultfd: wp: hook userfault handler to write protection fault Date: Thu, 20 Jun 2019 10:19:50 +0800 Message-Id: <20190620022008.19172-8-peterx@redhat.com> In-Reply-To: <20190620022008.19172-1-peterx@redhat.com> References: <20190620022008.19172-1-peterx@redhat.com>
From: Andrea Arcangeli
There are several cases in which a write-protection fault can happen: a write to the zero page, to a swapped-out page, or to a userfault write-protected page. When the fault happens there is no way to know whether userfaultfd write-protected the page beforehand, so for any vma with VM_UFFD_WP we blindly issue a userfault notification regardless of whether the application has write-protected that page yet. The application should be ready to handle such wp faults.
v1: From: Shaohua Li
v2: Handle the userfault in the common do_wp_page. If we get there, a pagetable entry is present and read-only, so no further processing is needed until the userfault is resolved. In the swapin case, always swap in as read-only. This will cause false-positive userfaults; we can decide later whether to eliminate them with a flag like soft-dirty in the swap entry (see _PAGE_SWP_SOFT_DIRTY). hugetlbfs wouldn't need to worry about swapouts, and tmpfs would be handled by a swap entry bit like anonymous memory. The main problem, with no easy solution for eliminating the false positives, will come if/when userfaultfd is extended to real filesystem pagecache.
When the pagecache is freed by reclaim we can't leave the radix tree pinned if the inode and in turn the radix tree is reclaimed as well. The estimation is that full accuracy and lack of false positives could be easily provided only to anonymous memory (as long as there's no fork or as long as MADV_DONTFORK is used on the userfaultfd anonymous range) tmpfs and hugetlbfs, it's most certainly worth to achieve it but in a later incremental patch. v3: Add hooking point for THP wrprotect faults. CC: Shaohua Li Signed-off-by: Andrea Arcangeli [peterx: don't conditionally drop FAULT_FLAG_WRITE in do_swap_page] Reviewed-by: Mike Rapoport Reviewed-by: Jerome Glisse Signed-off-by: Peter Xu --- mm/memory.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/mm/memory.c b/mm/memory.c index ddf20bd0c317..05bcd741855b 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2579,6 +2579,11 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; + if (userfaultfd_wp(vma)) { + pte_unmap_unlock(vmf->pte, vmf->ptl); + return handle_userfault(vmf, VM_UFFD_WP); + } + vmf->page = vm_normal_page(vma, vmf->address, vmf->orig_pte); if (!vmf->page) { /* @@ -3794,8 +3799,11 @@ static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf) /* `inline' is required to avoid gcc 4.1.2 build error */ static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd) { - if (vma_is_anonymous(vmf->vma)) + if (vma_is_anonymous(vmf->vma)) { + if (userfaultfd_wp(vmf->vma)) + return handle_userfault(vmf, VM_UFFD_WP); return do_huge_pmd_wp_page(vmf, orig_pmd); + } if (vmf->vma->vm_ops->huge_fault) return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD); From patchwork Thu Jun 20 02:19:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11005691 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) 
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com,
    Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli,
    Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman,
    "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v5 08/25] userfaultfd: wp: add WP pagetable tracking to x86
Date: Thu, 20 Jun 2019 10:19:51 +0800
Message-Id: <20190620022008.19172-9-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

From: Andrea Arcangeli

Accurate userfaultfd WP tracking is possible by tracking exactly which
virtual memory ranges were writeprotected by userland. We can't rely
only on the RW bit of the mapped pagetable because that information is
destroyed by fork() or KSM or swap. If we were to rely on that, we'd
need to stay on the safe side and generate false positive wp faults
for every swapped out page.
Signed-off-by: Andrea Arcangeli
[peterx: append _PAGE_UFFD_WP to _PAGE_CHG_MASK]
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 arch/x86/Kconfig                     |  1 +
 arch/x86/include/asm/pgtable.h       | 52 ++++++++++++++++++++++++++++
 arch/x86/include/asm/pgtable_64.h    |  8 ++++-
 arch/x86/include/asm/pgtable_types.h | 11 +++++-
 include/asm-generic/pgtable.h        |  1 +
 include/asm-generic/pgtable_uffd.h   | 51 +++++++++++++++++++++++++++
 init/Kconfig                         |  5 +++
 7 files changed, 127 insertions(+), 2 deletions(-)
 create mode 100644 include/asm-generic/pgtable_uffd.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2bbbd4d1ba31..3e06f679126d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -217,6 +217,7 @@ config X86
 	select USER_STACKTRACE_SUPPORT
 	select VIRT_TO_BUS
 	select X86_FEATURE_NAMES if PROC_FS
+	select HAVE_ARCH_USERFAULTFD_WP if USERFAULTFD
 
 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5e0509b41986..5b254b851082 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -25,6 +25,7 @@
 #include
 #include
 #include
+#include <asm-generic/pgtable_uffd.h>
 
 extern pgd_t early_top_pgt[PTRS_PER_PGD];
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
@@ -310,6 +311,23 @@ static inline pte_t pte_clear_flags(pte_t pte, pteval_t clear)
 	return native_make_pte(v & ~clear);
 }
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pte_uffd_wp(pte_t pte)
+{
+	return pte_flags(pte) & _PAGE_UFFD_WP;
+}
+
+static inline pte_t pte_mkuffd_wp(pte_t pte)
+{
+	return pte_set_flags(pte, _PAGE_UFFD_WP);
+}
+
+static inline pte_t pte_clear_uffd_wp(pte_t pte)
+{
+	return pte_clear_flags(pte, _PAGE_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 static inline pte_t pte_mkclean(pte_t pte)
 {
 	return pte_clear_flags(pte, _PAGE_DIRTY);
@@ -389,6 +407,23 @@ static inline pmd_t pmd_clear_flags(pmd_t pmd, pmdval_t clear)
 	return native_make_pmd(v & ~clear);
 }
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pmd_uffd_wp(pmd_t pmd)
+{
+	return pmd_flags(pmd) & _PAGE_UFFD_WP;
+}
+
+static inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+{
+	return pmd_set_flags(pmd, _PAGE_UFFD_WP);
+}
+
+static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd_clear_flags(pmd, _PAGE_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 static inline pmd_t pmd_mkold(pmd_t pmd)
 {
 	return pmd_clear_flags(pmd, _PAGE_ACCESSED);
@@ -1371,6 +1406,23 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
 #endif
 #endif
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+{
+	return pte_set_flags(pte, _PAGE_SWP_UFFD_WP);
+}
+
+static inline int pte_swp_uffd_wp(pte_t pte)
+{
+	return pte_flags(pte) & _PAGE_SWP_UFFD_WP;
+}
+
+static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+{
+	return pte_clear_flags(pte, _PAGE_SWP_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 #define PKRU_AD_BIT 0x1
 #define PKRU_WD_BIT 0x2
 #define PKRU_BITS_PER_PKEY 2
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 0bb566315621..627666b1c3c0 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -189,7 +189,7 @@ extern void sync_global_pgds(unsigned long start, unsigned long end);
  *
  * | ...            | 11| 10|  9|8|7|6|5| 4| 3|2| 1|0| <- bit number
  * | ...            |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names
- * | TYPE (59-63) | ~OFFSET (9-58)  |0|0|X|X| X| X|X|SD|0| <- swp entry
+ * | TYPE (59-63) | ~OFFSET (9-58)  |0|0|X|X| X| X|F|SD|0| <- swp entry
  *
  * G (8) is aliased and used as a PROT_NONE indicator for
  * !present ptes.  We need to start storing swap entries above
@@ -197,9 +197,15 @@ extern void sync_global_pgds(unsigned long start, unsigned long end);
  * erratum where they can be incorrectly set by hardware on
  * non-present PTEs.
  *
+ * SD Bits 1-4 are not used in non-present format and available for
+ * special use described below:
+ *
  * SD (1) in swp entry is used to store soft dirty bit, which helps us
  * remember soft dirty over page migration
  *
+ * F (2) in swp entry is used to record when a pagetable is
+ * writeprotected by userfaultfd WP support.
+ *
  * Bit 7 in swp entry should be 0 because pmd_present checks not only P,
  * but also L and G.
  *
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index d6ff0bbdb394..dd9c6295d610 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -32,6 +32,7 @@
 #define _PAGE_BIT_SPECIAL	_PAGE_BIT_SOFTW1
 #define _PAGE_BIT_CPA_TEST	_PAGE_BIT_SOFTW1
+#define _PAGE_BIT_UFFD_WP	_PAGE_BIT_SOFTW2 /* userfaultfd wrprotected */
 #define _PAGE_BIT_SOFT_DIRTY	_PAGE_BIT_SOFTW3 /* software dirty tracking */
 #define _PAGE_BIT_DEVMAP	_PAGE_BIT_SOFTW4
 
@@ -100,6 +101,14 @@
 #define _PAGE_SWP_SOFT_DIRTY	(_AT(pteval_t, 0))
 #endif
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+#define _PAGE_UFFD_WP		(_AT(pteval_t, 1) << _PAGE_BIT_UFFD_WP)
+#define _PAGE_SWP_UFFD_WP	_PAGE_USER
+#else
+#define _PAGE_UFFD_WP		(_AT(pteval_t, 0))
+#define _PAGE_SWP_UFFD_WP	(_AT(pteval_t, 0))
+#endif
+
 #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
 #define _PAGE_NX	(_AT(pteval_t, 1) << _PAGE_BIT_NX)
 #define _PAGE_DEVMAP	(_AT(u64, 1) << _PAGE_BIT_DEVMAP)
@@ -124,7 +133,7 @@
  */
 #define _PAGE_CHG_MASK	(PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT |		\
			 _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY |	\
-			 _PAGE_SOFT_DIRTY | _PAGE_DEVMAP)
+			 _PAGE_SOFT_DIRTY | _PAGE_DEVMAP | _PAGE_UFFD_WP)
 #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE)
 
 /*
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 75d9d68a6de7..1e979845e1cb 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include <asm-generic/pgtable_uffd.h>
 
 #if 5 - defined(__PAGETABLE_P4D_FOLDED) - defined(__PAGETABLE_PUD_FOLDED) - \
	defined(__PAGETABLE_PMD_FOLDED) != CONFIG_PGTABLE_LEVELS
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
new file mode 100644
index 000000000000..643d1bf559c2
--- /dev/null
+++ b/include/asm-generic/pgtable_uffd.h
@@ -0,0 +1,51 @@
+#ifndef _ASM_GENERIC_PGTABLE_UFFD_H
+#define _ASM_GENERIC_PGTABLE_UFFD_H
+
+#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static __always_inline int pte_uffd_wp(pte_t pte)
+{
+	return 0;
+}
+
+static __always_inline int pmd_uffd_wp(pmd_t pmd)
+{
+	return 0;
+}
+
+static __always_inline pte_t pte_mkuffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static __always_inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static __always_inline pte_t pte_clear_uffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static __always_inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static __always_inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static __always_inline int pte_swp_uffd_wp(pte_t pte)
+{
+	return 0;
+}
+
+static __always_inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+{
+	return pte;
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
+#endif /* _ASM_GENERIC_PGTABLE_UFFD_H */
diff --git a/init/Kconfig b/init/Kconfig
index 0e2344389501..763dc7fcf361 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1453,6 +1453,11 @@ config ADVISE_SYSCALLS
	  applications use these syscalls, you can disable this option to save
	  space.
 
+config HAVE_ARCH_USERFAULTFD_WP
+	bool
+	help
+	  Arch has userfaultfd write protection support
+
 config MEMBARRIER
	bool "Enable membarrier() system call" if EXPERT
	default y

From patchwork Thu Jun 20 02:19:52 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005693
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com,
    Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli,
    Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman,
    "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v5 09/25] userfaultfd: wp: userfaultfd_pte/huge_pmd_wp() helpers
Date: Thu, 20 Jun 2019 10:19:52 +0800
Message-Id: <20190620022008.19172-10-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

From: Andrea Arcangeli

Implement helper methods to invoke userfaultfd wp faults more
selectively: not whenever a wp fault triggers on a vma with
VM_UFFD_WP set in vma->vm_flags, but only if the _PAGE_UFFD_WP bit
is also set in the pagetable.
Signed-off-by: Andrea Arcangeli
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 include/linux/userfaultfd_k.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 5dc247af0f2e..7b91b76aac58 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -14,6 +14,8 @@
 #include /* linux/include/uapi/linux/userfaultfd.h */
 
 #include
+#include
+#include
 
 /*
  * CAREFUL: Check include/uapi/asm-generic/fcntl.h when defining
@@ -57,6 +59,18 @@ static inline bool userfaultfd_wp(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_UFFD_WP;
 }
 
+static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma,
+				      pte_t pte)
+{
+	return userfaultfd_wp(vma) && pte_uffd_wp(pte);
+}
+
+static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma,
+					   pmd_t pmd)
+{
+	return userfaultfd_wp(vma) && pmd_uffd_wp(pmd);
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP);
@@ -106,6 +120,19 @@ static inline bool userfaultfd_wp(struct vm_area_struct *vma)
 	return false;
 }
 
+static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma,
+				      pte_t pte)
+{
+	return false;
+}
+
+static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma,
+					   pmd_t pmd)
+{
+	return false;
+}
+
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return false;

From patchwork Thu Jun 20 02:19:53 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005695
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com,
    Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli,
    Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman,
    "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v5 10/25] userfaultfd: wp: add UFFDIO_COPY_MODE_WP
Date: Thu, 20 Jun 2019 10:19:53 +0800
Message-Id: <20190620022008.19172-11-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

From: Andrea Arcangeli

This allows UFFDIO_COPY to map pages write-protected.

Signed-off-by: Andrea Arcangeli
[peterx: switch to VM_WARN_ON_ONCE in mfill_atomic_pte; add brackets
 around "dst_vma->vm_flags & VM_WRITE"; fix wordings in comments and
 commit messages]
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c                 |  5 +++--
 include/linux/userfaultfd_k.h    |  2 +-
 include/uapi/linux/userfaultfd.h | 11 +++++-----
 mm/userfaultfd.c                 | 36 ++++++++++++++++++++++----------
 4 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 5dbef45ecbf5..c594945ad5bf 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1694,11 +1694,12 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx,
 	ret = -EINVAL;
 	if (uffdio_copy.src + uffdio_copy.len <= uffdio_copy.src)
 		goto out;
-	if (uffdio_copy.mode & ~UFFDIO_COPY_MODE_DONTWAKE)
+	if (uffdio_copy.mode & ~(UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP))
 		goto out;
 	if (mmget_not_zero(ctx->mm)) {
 		ret = mcopy_atomic(ctx->mm, uffdio_copy.dst, uffdio_copy.src,
-				   uffdio_copy.len, &ctx->mmap_changing);
+				   uffdio_copy.len, &ctx->mmap_changing,
+				   uffdio_copy.mode);
 		mmput(ctx->mm);
 	} else {
 		return -ESRCH;
diff --git a/include/linux/userfaultfd_k.h
b/include/linux/userfaultfd_k.h index 7b91b76aac58..dcd33172b728 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -36,7 +36,7 @@ extern vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason); extern ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long src_start, unsigned long len, - bool *mmap_changing); + bool *mmap_changing, __u64 mode); extern ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long len, diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index 48f1a7c2f1f0..340f23bc251d 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -203,13 +203,14 @@ struct uffdio_copy { __u64 dst; __u64 src; __u64 len; +#define UFFDIO_COPY_MODE_DONTWAKE ((__u64)1<<0) /* - * There will be a wrprotection flag later that allows to map - * pages wrprotected on the fly. And such a flag will be - * available if the wrprotection ioctl are implemented for the - * range according to the uffdio_register.ioctls. + * UFFDIO_COPY_MODE_WP will map the page write protected on + * the fly. UFFDIO_COPY_MODE_WP is available only if the + * write protected ioctl is implemented for the range + * according to the uffdio_register.ioctls. 
*/ -#define UFFDIO_COPY_MODE_DONTWAKE ((__u64)1<<0) +#define UFFDIO_COPY_MODE_WP ((__u64)1<<1) __u64 mode; /* diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 9932d5755e4c..c8e7846e9b7e 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -25,7 +25,8 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, struct vm_area_struct *dst_vma, unsigned long dst_addr, unsigned long src_addr, - struct page **pagep) + struct page **pagep, + bool wp_copy) { struct mem_cgroup *memcg; pte_t _dst_pte, *dst_pte; @@ -71,9 +72,9 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, if (mem_cgroup_try_charge(page, dst_mm, GFP_KERNEL, &memcg, false)) goto out_release; - _dst_pte = mk_pte(page, dst_vma->vm_page_prot); - if (dst_vma->vm_flags & VM_WRITE) - _dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte)); + _dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot)); + if ((dst_vma->vm_flags & VM_WRITE) && !wp_copy) + _dst_pte = pte_mkwrite(_dst_pte); dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl); if (dst_vma->vm_file) { @@ -398,7 +399,8 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm, unsigned long dst_addr, unsigned long src_addr, struct page **page, - bool zeropage) + bool zeropage, + bool wp_copy) { ssize_t err; @@ -415,11 +417,13 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm, if (!(dst_vma->vm_flags & VM_SHARED)) { if (!zeropage) err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma, - dst_addr, src_addr, page); + dst_addr, src_addr, page, + wp_copy); else err = mfill_zeropage_pte(dst_mm, dst_pmd, dst_vma, dst_addr); } else { + VM_WARN_ON_ONCE(wp_copy); if (!zeropage) err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, @@ -437,7 +441,8 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, unsigned long src_start, unsigned long len, bool zeropage, - bool *mmap_changing) + bool *mmap_changing, + __u64 mode) { struct vm_area_struct *dst_vma; ssize_t err; @@ -445,6 +450,7 
@@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, unsigned long src_addr, dst_addr; long copied; struct page *page; + bool wp_copy; /* * Sanitize the command parameters: @@ -501,6 +507,14 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, dst_vma->vm_flags & VM_SHARED)) goto out_unlock; + /* + * validate 'mode' now that we know the dst_vma: don't allow + * a wrprotect copy if the userfaultfd didn't register as WP. + */ + wp_copy = mode & UFFDIO_COPY_MODE_WP; + if (wp_copy && !(dst_vma->vm_flags & VM_UFFD_WP)) + goto out_unlock; + /* * If this is a HUGETLB vma, pass off to appropriate routine */ @@ -556,7 +570,7 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, BUG_ON(pmd_trans_huge(*dst_pmd)); err = mfill_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, - src_addr, &page, zeropage); + src_addr, &page, zeropage, wp_copy); cond_resched(); if (unlikely(err == -ENOENT)) { @@ -603,14 +617,14 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long src_start, unsigned long len, - bool *mmap_changing) + bool *mmap_changing, __u64 mode) { return __mcopy_atomic(dst_mm, dst_start, src_start, len, false, - mmap_changing); + mmap_changing, mode); } ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long start, unsigned long len, bool *mmap_changing) { - return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing); + return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing, 0); }

From patchwork Thu Jun 20 02:19:54 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005697
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v5 11/25] mm: merge parameters for change_protection()
Date: Thu, 20 Jun 2019 10:19:54 +0800
Message-Id: <20190620022008.19172-12-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

change_protection() is used by both the NUMA balancing and the mprotect() code, and there is one parameter for each of the two callers (dirty_accountable and prot_numa). Further, these parameters are passed down the whole call chain:

- change_protection_range()
- change_p4d_range()
- change_pud_range()
- change_pmd_range()
- ...

Now introduce a flags argument for change_protection() and all of these helpers to replace the separate parameters. This avoids passing multiple parameters through every level of the walk and, more importantly, greatly simplifies adding any new parameter to change_protection() later. In the follow-up patches, a new flag for userfaultfd write protection will be introduced.

No functional change at all.
Reviewed-by: Jerome Glisse Signed-off-by: Peter Xu --- include/linux/huge_mm.h | 2 +- include/linux/mm.h | 14 +++++++++++++- mm/huge_memory.c | 3 ++- mm/mempolicy.c | 2 +- mm/mprotect.c | 29 ++++++++++++++++------------- 5 files changed, 33 insertions(+), 17 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 7cd5c150c21d..a81a6ed609ac 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -46,7 +46,7 @@ extern bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr, pmd_t *old_pmd, pmd_t *new_pmd); extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, pgprot_t newprot, - int prot_numa); + unsigned long cp_flags); vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write); vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write); enum transparent_hugepage_flag { diff --git a/include/linux/mm.h b/include/linux/mm.h index dcaca899e4a8..a93ac1c37940 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1708,9 +1708,21 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma, unsigned long old_addr, struct vm_area_struct *new_vma, unsigned long new_addr, unsigned long len, bool need_rmap_locks); + +/* + * Flags used by change_protection(). For now we make it a bitmap so + * that we can pass in multiple flags just like parameters. However + * for now all the callers only use one of the flags at the same + * time.
+ */ +/* Whether we should allow dirty bit accounting */ +#define MM_CP_DIRTY_ACCT (1UL << 0) +/* Whether this protection change is for NUMA hints */ +#define MM_CP_PROT_NUMA (1UL << 1) + extern unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, unsigned long end, pgprot_t newprot, - int dirty_accountable, int prot_numa); + unsigned long cp_flags); extern int mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev, unsigned long start, unsigned long end, unsigned long newflags); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 9f8bce9a6b32..b7149a0acac1 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1903,13 +1903,14 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr, * - HPAGE_PMD_NR is protections changed and TLB flush necessary */ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, - unsigned long addr, pgprot_t newprot, int prot_numa) + unsigned long addr, pgprot_t newprot, unsigned long cp_flags) { struct mm_struct *mm = vma->vm_mm; spinlock_t *ptl; pmd_t entry; bool preserve_write; int ret; + bool prot_numa = cp_flags & MM_CP_PROT_NUMA; ptl = __pmd_trans_huge_lock(pmd, vma); if (!ptl) diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 01600d80ae01..dea6a49573e3 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -575,7 +575,7 @@ unsigned long change_prot_numa(struct vm_area_struct *vma, { int nr_updated; - nr_updated = change_protection(vma, addr, end, PAGE_NONE, 0, 1); + nr_updated = change_protection(vma, addr, end, PAGE_NONE, MM_CP_PROT_NUMA); if (nr_updated) count_vm_numa_events(NUMA_PTE_UPDATES, nr_updated); diff --git a/mm/mprotect.c b/mm/mprotect.c index bf38dfbbb4b4..ae9caa4c6562 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -37,12 +37,14 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, pgprot_t newprot, - int dirty_accountable, int prot_numa) + unsigned long cp_flags) { pte_t *pte, 
oldpte; spinlock_t *ptl; unsigned long pages = 0; int target_node = NUMA_NO_NODE; + bool dirty_accountable = cp_flags & MM_CP_DIRTY_ACCT; + bool prot_numa = cp_flags & MM_CP_PROT_NUMA; /* * Can be called with only the mmap_sem for reading by @@ -163,7 +165,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, static inline unsigned long change_pmd_range(struct vm_area_struct *vma, pud_t *pud, unsigned long addr, unsigned long end, - pgprot_t newprot, int dirty_accountable, int prot_numa) + pgprot_t newprot, unsigned long cp_flags) { pmd_t *pmd; unsigned long next; @@ -195,7 +197,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, __split_huge_pmd(vma, pmd, addr, false, NULL); } else { int nr_ptes = change_huge_pmd(vma, pmd, addr, - newprot, prot_numa); + newprot, cp_flags); if (nr_ptes) { if (nr_ptes == HPAGE_PMD_NR) { @@ -210,7 +212,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, /* fall through, the trans huge pmd just split */ } this_pages = change_pte_range(vma, pmd, addr, next, newprot, - dirty_accountable, prot_numa); + cp_flags); pages += this_pages; next: cond_resched(); @@ -226,7 +228,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, static inline unsigned long change_pud_range(struct vm_area_struct *vma, p4d_t *p4d, unsigned long addr, unsigned long end, - pgprot_t newprot, int dirty_accountable, int prot_numa) + pgprot_t newprot, unsigned long cp_flags) { pud_t *pud; unsigned long next; @@ -238,7 +240,7 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma, if (pud_none_or_clear_bad(pud)) continue; pages += change_pmd_range(vma, pud, addr, next, newprot, - dirty_accountable, prot_numa); + cp_flags); } while (pud++, addr = next, addr != end); return pages; @@ -246,7 +248,7 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma, static inline unsigned long change_p4d_range(struct vm_area_struct *vma, 
pgd_t *pgd, unsigned long addr, unsigned long end, - pgprot_t newprot, int dirty_accountable, int prot_numa) + pgprot_t newprot, unsigned long cp_flags) { p4d_t *p4d; unsigned long next; @@ -258,7 +260,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma, if (p4d_none_or_clear_bad(p4d)) continue; pages += change_pud_range(vma, p4d, addr, next, newprot, - dirty_accountable, prot_numa); + cp_flags); } while (p4d++, addr = next, addr != end); return pages; @@ -266,7 +268,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma, static unsigned long change_protection_range(struct vm_area_struct *vma, unsigned long addr, unsigned long end, pgprot_t newprot, - int dirty_accountable, int prot_numa) + unsigned long cp_flags) { struct mm_struct *mm = vma->vm_mm; pgd_t *pgd; @@ -283,7 +285,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma, if (pgd_none_or_clear_bad(pgd)) continue; pages += change_p4d_range(vma, pgd, addr, next, newprot, - dirty_accountable, prot_numa); + cp_flags); } while (pgd++, addr = next, addr != end); /* Only flush the TLB if we actually modified any entries: */ @@ -296,14 +298,15 @@ static unsigned long change_protection_range(struct vm_area_struct *vma, unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, unsigned long end, pgprot_t newprot, - int dirty_accountable, int prot_numa) + unsigned long cp_flags) { unsigned long pages; if (is_vm_hugetlb_page(vma)) pages = hugetlb_change_protection(vma, start, end, newprot); else - pages = change_protection_range(vma, start, end, newprot, dirty_accountable, prot_numa); + pages = change_protection_range(vma, start, end, newprot, + cp_flags); return pages; } @@ -431,7 +434,7 @@ mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev, vma_set_page_prot(vma); change_protection(vma, start, end, vma->vm_page_prot, - dirty_accountable, 0); + dirty_accountable ? 
MM_CP_DIRTY_ACCT : 0); /* * Private VM_LOCKED VMA becoming writable: trigger COW to avoid major

From patchwork Thu Jun 20 02:19:55 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005699
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v5 12/25] userfaultfd: wp: apply _PAGE_UFFD_WP bit
Date: Thu, 20 Jun 2019 10:19:55 +0800
Message-Id: <20190620022008.19172-13-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

Firstly, introduce two new flags, MM_CP_UFFD_WP and MM_CP_UFFD_WP_RESOLVE, for change_protection() when used with uffd-wp, and make sure the two new flags are mutually exclusive. Then:

- For MM_CP_UFFD_WP: apply the _PAGE_UFFD_WP bit and remove _PAGE_RW when a range of memory is write protected by uffd

- For MM_CP_UFFD_WP_RESOLVE: remove the _PAGE_UFFD_WP bit and recover _PAGE_RW when the write protection is resolved from userspace

Use this new interface in mwriteprotect_range() to replace the old MM_CP_DIRTY_ACCT. Do this for both PTEs and huge PMDs. Then we can start to tell apart a PTE/PMD that is write protected by general means (e.g., COW or soft-dirty tracking) from one that is write protected for userfaultfd-wp.

Since we should keep _PAGE_UFFD_WP across pte_modify(), add it into _PAGE_CHG_MASK as well. Meanwhile, since we have this new bit, we can be even more strict when detecting uffd-wp page faults in either do_wp_page() or wp_huge_pmd().

With _PAGE_UFFD_WP in place, a special case arises when a page is protected both by the general COW logic and by userfault-wp. The userfault-wp has the higher priority and is handled first; only after the uffd-wp bit is cleared on the PTE/PMD will we continue to handle the general COW. The steps for such a page are:

1. A CPU writes to the write-protected shared page (so it is protected by both general COW and uffd-wp). The access blocks on uffd-wp first, because do_wp_page() handles uffd-wp before general COW, giving it the higher priority.

2. The uffd service thread receives the request and issues UFFDIO_WRITEPROTECT to remove the uffd-wp bit from the PTE/PMD. The write bit is still kept cleared. The blocked CPU is notified.

3. The blocked CPU resumes the page fault process with a fault retry. During the retry it notices the uffd-wp bit is gone this time, but the page is still write protected by general COW, so it goes through the COW path in the fault handler, copies the page, applies the write bit where necessary, and retries again.

4. The CPU can now access the page with the write bit set.

Suggested-by: Andrea Arcangeli
Signed-off-by: Peter Xu
--- include/linux/mm.h | 5 +++++ mm/huge_memory.c | 18 +++++++++++++++++- mm/memory.c | 4 ++-- mm/mprotect.c | 17 +++++++++++++++++ mm/userfaultfd.c | 8 ++++++-- 5 files changed, 47 insertions(+), 5 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index a93ac1c37940..beca76650271 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1719,6 +1719,11 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma, #define MM_CP_DIRTY_ACCT (1UL << 0) /* Whether this protection change is for NUMA hints */ #define MM_CP_PROT_NUMA (1UL << 1) +/* Whether this change is for write protecting */ +#define MM_CP_UFFD_WP (1UL << 2) /* do wp */ +#define MM_CP_UFFD_WP_RESOLVE (1UL << 3) /* Resolve wp */ +#define MM_CP_UFFD_WP_ALL (MM_CP_UFFD_WP | \ + MM_CP_UFFD_WP_RESOLVE) extern unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, unsigned long end, pgprot_t newprot, diff --git a/mm/huge_memory.c b/mm/huge_memory.c index b7149a0acac1..3fda79f6746b 100644 ---
a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1911,6 +1911,8 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, bool preserve_write; int ret; bool prot_numa = cp_flags & MM_CP_PROT_NUMA; + bool uffd_wp = cp_flags & MM_CP_UFFD_WP; + bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE; ptl = __pmd_trans_huge_lock(pmd, vma); if (!ptl) @@ -1977,6 +1979,17 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, entry = pmd_modify(entry, newprot); if (preserve_write) entry = pmd_mk_savedwrite(entry); + if (uffd_wp) { + entry = pmd_wrprotect(entry); + entry = pmd_mkuffd_wp(entry); + } else if (uffd_wp_resolve) { + /* + * Leave the write bit to be handled by PF interrupt + * handler, then things like COW could be properly + * handled. + */ + entry = pmd_clear_uffd_wp(entry); + } ret = HPAGE_PMD_NR; set_pmd_at(mm, addr, pmd, entry); BUG_ON(vma_is_anonymous(vma) && !preserve_write && pmd_write(entry)); @@ -2125,7 +2138,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, struct page *page; pgtable_t pgtable; pmd_t old_pmd, _pmd; - bool young, write, soft_dirty, pmd_migration = false; + bool young, write, soft_dirty, pmd_migration = false, uffd_wp = false; unsigned long addr; int i; @@ -2207,6 +2220,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, write = pmd_write(old_pmd); young = pmd_young(old_pmd); soft_dirty = pmd_soft_dirty(old_pmd); + uffd_wp = pmd_uffd_wp(old_pmd); } VM_BUG_ON_PAGE(!page_count(page), page); page_ref_add(page, HPAGE_PMD_NR - 1); @@ -2240,6 +2254,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, entry = pte_mkold(entry); if (soft_dirty) entry = pte_mksoft_dirty(entry); + if (uffd_wp) + entry = pte_mkuffd_wp(entry); } pte = pte_offset_map(&_pmd, addr); BUG_ON(!pte_none(*pte)); diff --git a/mm/memory.c b/mm/memory.c index 05bcd741855b..d79e6d1f8c62 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2579,7 +2579,7 @@ static vm_fault_t 
do_wp_page(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; - if (userfaultfd_wp(vma)) { + if (userfaultfd_pte_wp(vma, *vmf->pte)) { pte_unmap_unlock(vmf->pte, vmf->ptl); return handle_userfault(vmf, VM_UFFD_WP); } @@ -3800,7 +3800,7 @@ static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf) static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd) { if (vma_is_anonymous(vmf->vma)) { - if (userfaultfd_wp(vmf->vma)) + if (userfaultfd_huge_pmd_wp(vmf->vma, orig_pmd)) return handle_userfault(vmf, VM_UFFD_WP); return do_huge_pmd_wp_page(vmf, orig_pmd); } diff --git a/mm/mprotect.c b/mm/mprotect.c index ae9caa4c6562..c7066d7384e3 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -45,6 +45,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, int target_node = NUMA_NO_NODE; bool dirty_accountable = cp_flags & MM_CP_DIRTY_ACCT; bool prot_numa = cp_flags & MM_CP_PROT_NUMA; + bool uffd_wp = cp_flags & MM_CP_UFFD_WP; + bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE; /* * Can be called with only the mmap_sem for reading by @@ -116,6 +118,19 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, if (preserve_write) ptent = pte_mk_savedwrite(ptent); + if (uffd_wp) { + ptent = pte_wrprotect(ptent); + ptent = pte_mkuffd_wp(ptent); + } else if (uffd_wp_resolve) { + /* + * Leave the write bit to be handled + * by PF interrupt handler, then + * things like COW could be properly + * handled. 
+ */ + ptent = pte_clear_uffd_wp(ptent); + } + /* Avoid taking write faults for known dirty pages */ if (dirty_accountable && pte_dirty(ptent) && (pte_soft_dirty(ptent) || @@ -302,6 +317,8 @@ unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, { unsigned long pages; + BUG_ON((cp_flags & MM_CP_UFFD_WP_ALL) == MM_CP_UFFD_WP_ALL); + if (is_vm_hugetlb_page(vma)) pages = hugetlb_change_protection(vma, start, end, newprot); else diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index c8e7846e9b7e..5363376cb07a 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -73,8 +73,12 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, goto out_release; _dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot)); - if ((dst_vma->vm_flags & VM_WRITE) && !wp_copy) - _dst_pte = pte_mkwrite(_dst_pte); + if (dst_vma->vm_flags & VM_WRITE) { + if (wp_copy) + _dst_pte = pte_mkuffd_wp(_dst_pte); + else + _dst_pte = pte_mkwrite(_dst_pte); + } dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl); if (dst_vma->vm_file) { From patchwork Thu Jun 20 02:19:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11005701 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6305C13AF for ; Thu, 20 Jun 2019 02:23:07 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 55B01209CD for ; Thu, 20 Jun 2019 02:23:07 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 494BF26E49; Thu, 20 Jun 2019 02:23:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, Kirill A. Shutemov, Dr. David Alan Gilbert
Subject: [PATCH v5 13/25] userfaultfd: wp: drop _PAGE_UFFD_WP properly when fork
Date: Thu, 20 Jun 2019 10:19:56 +0800
Message-Id: <20190620022008.19172-14-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>

UFFD_EVENT_FORK support for uffd-wp is already in place, except that we
must clear the uffd-wp bit when the uffd fork event is not enabled.
Detect that case to avoid _PAGE_UFFD_WP being set on a VMA that is not
tracked by VM_UFFD_WP. Do this for both small PTEs and huge PMDs.

Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 mm/huge_memory.c | 8 ++++++++
 mm/memory.c      | 8 ++++++++
 2 files changed, 16 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3fda79f6746b..757975920df8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -980,6 +980,14 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	ret = -EAGAIN;
 	pmd = *src_pmd;
 
+	/*
+	 * Make sure the _PAGE_UFFD_WP bit is cleared if the new VMA
+	 * does not have the VM_UFFD_WP, which means that the uffd
+	 * fork event is not enabled.
+	 */
+	if (!(vma->vm_flags & VM_UFFD_WP))
+		pmd = pmd_clear_uffd_wp(pmd);
+
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 	if (unlikely(is_swap_pmd(pmd))) {
 		swp_entry_t entry = pmd_to_swp_entry(pmd);
diff --git a/mm/memory.c b/mm/memory.c
index d79e6d1f8c62..8c69257d6ef1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -790,6 +790,14 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		pte = pte_mkclean(pte);
 	pte = pte_mkold(pte);
 
+	/*
+	 * Make sure the _PAGE_UFFD_WP bit is cleared if the new VMA
+	 * does not have the VM_UFFD_WP, which means that the uffd
+	 * fork event is not enabled.
+	 */
+	if (!(vm_flags & VM_UFFD_WP))
+		pte = pte_clear_uffd_wp(pte);
+
 	page = vm_normal_page(vma, addr, pte);
 	if (page) {
 		get_page(page);

From patchwork Thu Jun 20 02:19:57 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005703
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, Kirill A. Shutemov, Dr. David Alan Gilbert
Subject: [PATCH v5 14/25] userfaultfd: wp: add pmd_swp_*uffd_wp() helpers
Date: Thu, 20 Jun 2019 10:19:57 +0800
Message-Id: <20190620022008.19172-15-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>

Add the missing helpers for uffd-wp operations on pmd swap/migration
entries.

Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 arch/x86/include/asm/pgtable.h     | 15 +++++++++++++++
 include/asm-generic/pgtable_uffd.h | 15 +++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5b254b851082..0120fa671914 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1421,6 +1421,21 @@ static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
 {
 	return pte_clear_flags(pte, _PAGE_SWP_UFFD_WP);
 }
+
+static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+{
+	return pmd_set_flags(pmd, _PAGE_SWP_UFFD_WP);
+}
+
+static inline int pmd_swp_uffd_wp(pmd_t pmd)
+{
+	return pmd_flags(pmd) & _PAGE_SWP_UFFD_WP;
+}
+
+static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd_clear_flags(pmd, _PAGE_SWP_UFFD_WP);
+}
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
 #define PKRU_AD_BIT 0x1
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
index 643d1bf559c2..828966d4c281 100644
--- a/include/asm-generic/pgtable_uffd.h
+++ b/include/asm-generic/pgtable_uffd.h
@@ -46,6 +46,21 @@ static __always_inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
 {
 	return pte;
 }
+
+static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static inline int pmd_swp_uffd_wp(pmd_t pmd)
+{
+	return 0;
+}
+
+static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
 #endif /* _ASM_GENERIC_PGTABLE_UFFD_H */

From patchwork Thu Jun 20 02:19:58 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005705
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, Kirill A. Shutemov, Dr. David Alan Gilbert
Subject: [PATCH v5 15/25] userfaultfd: wp: support swap and page migration
Date: Thu, 20 Jun 2019 10:19:58 +0800
Message-Id: <20190620022008.19172-16-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>

For both swap and page migration, we use bit 2 of the entry to identify
whether the entry is uffd write-protected. It plays a role similar to
the existing soft-dirty bit in swap entries, but it only keeps the
uffd-wp tracking for a specific PTE/PMD.

One special case: when recovering the uffd-wp bit from a swap/migration
entry back into the PTE, we must also make sure the _PAGE_RW bit is
cleared; otherwise, even with _PAGE_UFFD_WP set, the write cannot be
trapped at all.

Previously, change_pte_range() did nothing for uffd if the PTE was a
swap entry. That could lead to a data mismatch if the page to be
write-protected was swapped out when UFFDIO_WRITEPROTECT was sent.
This patch therefore applies/removes the uffd-wp bit for swap entries
as well.
Signed-off-by: Peter Xu
---
 include/linux/swapops.h |  2 ++
 mm/huge_memory.c        |  3 +++
 mm/memory.c             |  8 ++++++++
 mm/migrate.c            |  6 ++++++
 mm/mprotect.c           | 28 +++++++++++++++++-----------
 mm/rmap.c               |  6 ++++++
 6 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 4d961668e5fc..0c2923b1cdb7 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -68,6 +68,8 @@ static inline swp_entry_t pte_to_swp_entry(pte_t pte)
 
 	if (pte_swp_soft_dirty(pte))
 		pte = pte_swp_clear_soft_dirty(pte);
+	if (pte_swp_uffd_wp(pte))
+		pte = pte_swp_clear_uffd_wp(pte);
 	arch_entry = __pte_to_swp_entry(pte);
 	return swp_entry(__swp_type(arch_entry), __swp_offset(arch_entry));
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 757975920df8..eae25c58db9d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2221,6 +2221,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		write = is_write_migration_entry(entry);
 		young = false;
 		soft_dirty = pmd_swp_soft_dirty(old_pmd);
+		uffd_wp = pmd_swp_uffd_wp(old_pmd);
 	} else {
 		page = pmd_page(old_pmd);
 		if (pmd_dirty(old_pmd))
@@ -2253,6 +2254,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 			entry = swp_entry_to_pte(swp_entry);
 			if (soft_dirty)
 				entry = pte_swp_mksoft_dirty(entry);
+			if (uffd_wp)
+				entry = pte_swp_mkuffd_wp(entry);
 		} else {
 			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
 			entry = maybe_mkwrite(entry, vma);
diff --git a/mm/memory.c b/mm/memory.c
index 8c69257d6ef1..28e9342d00cc 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -738,6 +738,8 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 				pte = swp_entry_to_pte(entry);
 				if (pte_swp_soft_dirty(*src_pte))
 					pte = pte_swp_mksoft_dirty(pte);
+				if (pte_swp_uffd_wp(*src_pte))
+					pte = pte_swp_mkuffd_wp(pte);
 				set_pte_at(src_mm, addr, src_pte, pte);
 			}
 		} else if (is_device_private_entry(entry)) {
@@ -767,6 +769,8 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 			    is_cow_mapping(vm_flags)) {
 				make_device_private_entry_read(&entry);
 				pte = swp_entry_to_pte(entry);
+				if (pte_swp_uffd_wp(*src_pte))
+					pte = pte_swp_mkuffd_wp(pte);
 				set_pte_at(src_mm, addr, src_pte, pte);
 			}
 		}
@@ -2930,6 +2934,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	flush_icache_page(vma, page);
 	if (pte_swp_soft_dirty(vmf->orig_pte))
 		pte = pte_mksoft_dirty(pte);
+	if (pte_swp_uffd_wp(vmf->orig_pte)) {
+		pte = pte_mkuffd_wp(pte);
+		pte = pte_wrprotect(pte);
+	}
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
 	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
 	vmf->orig_pte = pte;
diff --git a/mm/migrate.c b/mm/migrate.c
index f2ecc2855a12..d8f1f6d13960 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -241,11 +241,15 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 	entry = pte_to_swp_entry(*pvmw.pte);
 	if (is_write_migration_entry(entry))
 		pte = maybe_mkwrite(pte, vma);
+	else if (pte_swp_uffd_wp(*pvmw.pte))
+		pte = pte_mkuffd_wp(pte);
 
 	if (unlikely(is_zone_device_page(new))) {
 		if (is_device_private_page(new)) {
 			entry = make_device_private_entry(new, pte_write(pte));
 			pte = swp_entry_to_pte(entry);
+			if (pte_swp_uffd_wp(*pvmw.pte))
+				pte = pte_mkuffd_wp(pte);
 		} else if (is_device_public_page(new)) {
 			pte = pte_mkdevmap(pte);
 		}
@@ -2306,6 +2310,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pte))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pte))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, addr, ptep, swp_pte);
 
 			/*
diff --git a/mm/mprotect.c b/mm/mprotect.c
index c7066d7384e3..a63737d9884e 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -139,11 +139,11 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 			}
 			ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
 			pages++;
-		} else if (IS_ENABLED(CONFIG_MIGRATION)) {
+		} else if (is_swap_pte(oldpte)) {
 			swp_entry_t entry = pte_to_swp_entry(oldpte);
+			pte_t newpte;
 
 			if (is_write_migration_entry(entry)) {
-				pte_t newpte;
 				/*
 				 * A protection check is difficult so
 				 * just be safe and disable write
@@ -152,22 +152,28 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				newpte = swp_entry_to_pte(entry);
 				if (pte_swp_soft_dirty(oldpte))
 					newpte = pte_swp_mksoft_dirty(newpte);
-				set_pte_at(vma->vm_mm, addr, pte, newpte);
-
-				pages++;
-			}
-
-			if (is_write_device_private_entry(entry)) {
-				pte_t newpte;
-
+				if (pte_swp_uffd_wp(oldpte))
+					newpte = pte_swp_mkuffd_wp(newpte);
+			} else if (is_write_device_private_entry(entry)) {
 				/*
 				 * We do not preserve soft-dirtiness. See
 				 * copy_one_pte() for explanation.
 				 */
 				make_device_private_entry_read(&entry);
 				newpte = swp_entry_to_pte(entry);
-				set_pte_at(vma->vm_mm, addr, pte, newpte);
+				if (pte_swp_uffd_wp(oldpte))
+					newpte = pte_swp_mkuffd_wp(newpte);
+			} else {
+				newpte = oldpte;
+			}
+
+			if (uffd_wp)
+				newpte = pte_swp_mkuffd_wp(newpte);
+			else if (uffd_wp_resolve)
+				newpte = pte_swp_clear_uffd_wp(newpte);
+
+			if (!pte_same(oldpte, newpte)) {
+				set_pte_at(vma->vm_mm, addr, pte, newpte);
 				pages++;
 			}
 		}
diff --git a/mm/rmap.c b/mm/rmap.c
index e5dfe2ae6b0d..dedde54dadb7 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1471,6 +1471,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pteval))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, pvmw.address, pvmw.pte, swp_pte);
 			/*
 			 * No need to invalidate here it will synchronize on
@@ -1563,6 +1565,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pteval))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, address, pvmw.pte, swp_pte);
 			/*
 			 * No need to invalidate here it will synchronize on
@@ -1629,6 +1633,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pteval))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, address, pvmw.pte, swp_pte);
 			/* Invalidate as we cleared the pte */
 			mmu_notifier_invalidate_range(mm, address,

From patchwork Thu Jun 20 02:19:59 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005707
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v5 16/25] khugepaged: skip collapse if uffd-wp detected
Date: Thu, 20 Jun 2019 10:19:59 +0800
Message-Id: <20190620022008.19172-17-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

Don't collapse the huge PMD if there are any userfault write-protected small PTEs. The problem is that the write protection is tracked at small-page granularity, and there is no way to keep all of that write-protection information if the small pages are merged into a huge PMD.

The same thing needs to be considered for swap entries and migration entries, so do the check for those as well, disregarding khugepaged_max_ptes_swap.
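The rule above can be sketched with a minimal user-space model (illustrative only: `fake_pte`, `scan_pmd_model`, and the two-value result set are invented for this sketch and are not the kernel's types). The point it demonstrates is the collapse decision this patch adds: as soon as one small PTE under the PMD carries the uffd-wp marker, the whole collapse is refused, because that per-4K bit has no representation in a single huge PMD.

```c
/* User-space model of the khugepaged scan decision added by this
 * patch.  Not kernel code: fake_pte and scan_pmd_model are invented
 * names for illustration. */
#include <stdbool.h>
#include <stddef.h>

enum scan_result { SCAN_SUCCEED, SCAN_PTE_UFFD_WP };

struct fake_pte {
	bool present;	/* models pte_present() / is_swap_pte() */
	bool uffd_wp;	/* models pte_uffd_wp() or pte_swp_uffd_wp() */
};

/* Walk all small PTEs under one PMD; refuse to collapse if any of
 * them is armed with userfaultfd write protection, since that
 * per-small-page information cannot survive the merge. */
static enum scan_result scan_pmd_model(const struct fake_pte *ptes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (ptes[i].uffd_wp)
			return SCAN_PTE_UFFD_WP;
	return SCAN_SUCCEED;
}
```

A single wp-armed entry, present or swapped out, is enough to abort, which is why the patch checks swap entries strictly as well instead of counting them against khugepaged_max_ptes_swap.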
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 include/trace/events/huge_memory.h |  1 +
 mm/khugepaged.c                    | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index dd4db334bd63..2d7bad9cb976 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -13,6 +13,7 @@
 	EM( SCAN_PMD_NULL,		"pmd_null")			\
 	EM( SCAN_EXCEED_NONE_PTE,	"exceed_none_pte")		\
 	EM( SCAN_PTE_NON_PRESENT,	"pte_non_present")		\
+	EM( SCAN_PTE_UFFD_WP,		"pte_uffd_wp")			\
 	EM( SCAN_PAGE_RO,		"no_writable_page")		\
 	EM( SCAN_LACK_REFERENCED_PAGE,	"lack_referenced_page")		\
 	EM( SCAN_PAGE_NULL,		"page_null")			\

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 0f7419938008..fc40aa214be7 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -29,6 +29,7 @@ enum scan_result {
 	SCAN_PMD_NULL,
 	SCAN_EXCEED_NONE_PTE,
 	SCAN_PTE_NON_PRESENT,
+	SCAN_PTE_UFFD_WP,
 	SCAN_PAGE_RO,
 	SCAN_LACK_REFERENCED_PAGE,
 	SCAN_PAGE_NULL,
@@ -1128,6 +1129,15 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 		pte_t pteval = *_pte;
 		if (is_swap_pte(pteval)) {
 			if (++unmapped <= khugepaged_max_ptes_swap) {
+				/*
+				 * Always be strict with uffd-wp
+				 * enabled swap entries.  Please see
+				 * comment below for pte_uffd_wp().
+				 */
+				if (pte_swp_uffd_wp(pteval)) {
+					result = SCAN_PTE_UFFD_WP;
+					goto out_unmap;
+				}
 				continue;
 			} else {
 				result = SCAN_EXCEED_SWAP_PTE;
@@ -1147,6 +1157,19 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 			result = SCAN_PTE_NON_PRESENT;
 			goto out_unmap;
 		}
+		if (pte_uffd_wp(pteval)) {
+			/*
+			 * Don't collapse the page if any of the small
+			 * PTEs are armed with uffd write protection.
+			 * Here we can also mark the new huge pmd as
+			 * write protected if any of the small ones is
+			 * marked but that could bring unknown
+			 * userfault messages that fall outside of
+			 * the registered range.  So, just be simple.
+			 */
+			result = SCAN_PTE_UFFD_WP;
+			goto out_unmap;
+		}
 		if (pte_write(pteval))
 			writable = true;

From patchwork Thu Jun 20 02:20:00 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005709
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v5 17/25] userfaultfd: introduce helper vma_find_uffd
Date: Thu, 20 Jun 2019 10:20:00 +0800
Message-Id: <20190620022008.19172-18-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

We have multiple places (and more coming) that would like to find a userfault-enabled VMA from an mm struct that covers a specific memory range. This patch introduces a helper for that and applies it to the existing code.

Suggested-by: Mike Rapoport
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 mm/userfaultfd.c | 54 +++++++++++++++++++++++++++---------------------
 1 file changed, 30 insertions(+), 24 deletions(-)

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 5363376cb07a..6b9dd5b66f64 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -20,6 +20,34 @@
 #include
 #include "internal.h"

+/*
+ * Find a valid userfault enabled VMA region that covers the whole
+ * address range, or NULL on failure.  Must be called with mmap_sem
+ * held.
+ */
+static struct vm_area_struct *vma_find_uffd(struct mm_struct *mm,
+					    unsigned long start,
+					    unsigned long len)
+{
+	struct vm_area_struct *vma = find_vma(mm, start);
+
+	if (!vma)
+		return NULL;
+
+	/*
+	 * Check the vma is registered in uffd, this is required to
+	 * enforce the VM_MAYWRITE check done at uffd registration
+	 * time.
+	 */
+	if (!vma->vm_userfaultfd_ctx.ctx)
+		return NULL;
+
+	if (start < vma->vm_start || start + len > vma->vm_end)
+		return NULL;
+
+	return vma;
+}
+
 static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 			    pmd_t *dst_pmd,
 			    struct vm_area_struct *dst_vma,
@@ -228,20 +256,9 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 	 */
 	if (!dst_vma) {
 		err = -ENOENT;
-		dst_vma = find_vma(dst_mm, dst_start);
+		dst_vma = vma_find_uffd(dst_mm, dst_start, len);
 		if (!dst_vma || !is_vm_hugetlb_page(dst_vma))
 			goto out_unlock;
-		/*
-		 * Check the vma is registered in uffd, this is
-		 * required to enforce the VM_MAYWRITE check done at
-		 * uffd registration time.
-		 */
-		if (!dst_vma->vm_userfaultfd_ctx.ctx)
-			goto out_unlock;
-
-		if (dst_start < dst_vma->vm_start ||
-		    dst_start + len > dst_vma->vm_end)
-			goto out_unlock;

 		err = -EINVAL;
 		if (vma_hpagesize != vma_kernel_pagesize(dst_vma))
@@ -487,20 +504,9 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
 	 * both valid and fully within a single existing vma.
 	 */
 	err = -ENOENT;
-	dst_vma = find_vma(dst_mm, dst_start);
+	dst_vma = vma_find_uffd(dst_mm, dst_start, len);
 	if (!dst_vma)
 		goto out_unlock;
-	/*
-	 * Check the vma is registered in uffd, this is required to
-	 * enforce the VM_MAYWRITE check done at uffd registration
-	 * time.
-	 */
-	if (!dst_vma->vm_userfaultfd_ctx.ctx)
-		goto out_unlock;
-
-	if (dst_start < dst_vma->vm_start ||
-	    dst_start + len > dst_vma->vm_end)
-		goto out_unlock;

 	err = -EINVAL;
 	/*

From patchwork Thu Jun 20 02:20:01 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005711
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert", Rik van Riel
Subject: [PATCH v5 18/25] userfaultfd: wp: support write protection for userfault vma range
Date: Thu, 20 Jun 2019 10:20:01 +0800
Message-Id: <20190620022008.19172-19-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

From: Shaohua Li

Add an API to enable/disable write protection on a VMA range. Unlike mprotect, this doesn't split or merge VMAs.

Cc: Andrea Arcangeli
Cc: Rik van Riel
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
[peterx:
 - use the helper to find VMA;
 - return -ENOENT if not found to match mcopy case;
 - use the new MM_CP_UFFD_WP* flags for change_protection;
 - check against mmap_changing for failures]
Reviewed-by: Jerome Glisse
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 include/linux/userfaultfd_k.h |  3 ++
 mm/userfaultfd.c              | 54 +++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index dcd33172b728..a8e5f3ea9bb2 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -41,6 +41,9 @@ extern ssize_t mfill_zeropage(struct mm_struct *dst_mm,
 			      unsigned long dst_start,
 			      unsigned long len,
 			      bool *mmap_changing);
+extern int mwriteprotect_range(struct mm_struct *dst_mm,
+			       unsigned long start, unsigned long len,
+			       bool enable_wp, bool *mmap_changing);

 /* mm helpers */
 static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 6b9dd5b66f64..4208592c7ca3 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -638,3 +638,57 @@ ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long start,
 {
 	return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing, 0);
 }
+
+int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
+			unsigned long len, bool enable_wp, bool *mmap_changing)
+{
+	struct vm_area_struct *dst_vma;
+	pgprot_t newprot;
+	int err;
+
+	/*
+	 * Sanitize the command parameters:
+	 */
+	BUG_ON(start & ~PAGE_MASK);
+	BUG_ON(len & ~PAGE_MASK);
+
+	/* Does the address range wrap, or is the span zero-sized? */
+	BUG_ON(start + len <= start);
+
+	down_read(&dst_mm->mmap_sem);
+
+	/*
+	 * If memory mappings are changing because of non-cooperative
+	 * operation (e.g. mremap) running in parallel, bail out and
+	 * request the user to retry later
+	 */
+	err = -EAGAIN;
+	if (mmap_changing && READ_ONCE(*mmap_changing))
+		goto out_unlock;
+
+	err = -ENOENT;
+	dst_vma = vma_find_uffd(dst_mm, start, len);
+	/*
+	 * Make sure the vma is not shared, that the dst range is
+	 * both valid and fully within a single existing vma.
+	 */
+	if (!dst_vma || (dst_vma->vm_flags & VM_SHARED))
+		goto out_unlock;
+	if (!userfaultfd_wp(dst_vma))
+		goto out_unlock;
+	if (!vma_is_anonymous(dst_vma))
+		goto out_unlock;
+
+	if (enable_wp)
+		newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE));
+	else
+		newprot = vm_get_page_prot(dst_vma->vm_flags);
+
+	change_protection(dst_vma, start, start + len, newprot,
+			  enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE);
+
+	err = 0;
+out_unlock:
+	up_read(&dst_mm->mmap_sem);
+	return err;
+}

From patchwork Thu Jun 20 02:20:02 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005713
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v5 19/25] userfaultfd: wp: add the writeprotect API to userfaultfd ioctl
Date: Thu, 20 Jun 2019 10:20:02 +0800
Message-Id: <20190620022008.19172-20-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

From: Andrea Arcangeli

v1: From: Shaohua Li
v2: cleanups, remove a branch.

[peterx writes up the commit message, as below...]

This patch introduces the new uffd-wp APIs for userspace.

Firstly, we'll allow UFFDIO_REGISTER with write protection tracking using the new UFFDIO_REGISTER_MODE_WP flag. Note that this flag can co-exist with the existing UFFDIO_REGISTER_MODE_MISSING, in which case the userspace program can not only resolve missing page faults but also track page data changes along the way.

Secondly, we introduce the new UFFDIO_WRITEPROTECT API to do page-level write protection tracking. Note that we will need to register the memory region with UFFDIO_REGISTER_MODE_WP before that.
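One subtle rule of the new ioctl, visible in the diff below, is its mode validation: UFFDIO_WRITEPROTECT_MODE_DONTWAKE is rejected when combined with UFFDIO_WRITEPROTECT_MODE_WP, because suppressing the wakeup only makes sense when *removing* protection (there is no blocked faulting thread to wake when you are arming it). Here is a small user-space model of just those checks (the bit values match the final uAPI header, but since the uapi hunk is truncated in this archive, treat them as an assumption; `check_wp_mode` is an invented name, not a kernel function):

```c
/* User-space model of the mode validation in userfaultfd_writeprotect().
 * Bit values assumed to match include/uapi/linux/userfaultfd.h. */
#include <stdint.h>

#define UFFDIO_WRITEPROTECT_MODE_WP		((uint64_t)1 << 0)
#define UFFDIO_WRITEPROTECT_MODE_DONTWAKE	((uint64_t)1 << 1)

static int check_wp_mode(uint64_t mode)
{
	/* Reject any bits outside the two defined mode flags. */
	if (mode & ~(UFFDIO_WRITEPROTECT_MODE_DONTWAKE |
		     UFFDIO_WRITEPROTECT_MODE_WP))
		return -22;	/* -EINVAL */
	/* DONTWAKE is only meaningful when resolving (un-protecting). */
	if ((mode & UFFDIO_WRITEPROTECT_MODE_WP) &&
	    (mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE))
		return -22;	/* -EINVAL */
	return 0;
}
```

Seen from user space, the typical sequence under these rules is: UFFDIO_REGISTER with MODE_WP, then UFFDIO_WRITEPROTECT with MODE_WP to arm a range, and on each write fault a UFFDIO_WRITEPROTECT with mode 0 (optionally with MODE_DONTWAKE) to resolve it.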
Signed-off-by: Andrea Arcangeli
[peterx: remove useless block, write commit message, check against
 VM_MAYWRITE rather than VM_WRITE when register]
Reviewed-by: Jerome Glisse
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c                 | 82 +++++++++++++++++++++++++-------
 include/uapi/linux/userfaultfd.h | 23 +++++++++
 2 files changed, 89 insertions(+), 16 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index c594945ad5bf..3cf19aeaa0e0 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -306,8 +306,11 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	if (!pmd_present(_pmd))
 		goto out;
-	if (pmd_trans_huge(_pmd))
+	if (pmd_trans_huge(_pmd)) {
+		if (!pmd_write(_pmd) && (reason & VM_UFFD_WP))
+			ret = true;
 		goto out;
+	}

 	/*
 	 * the pmd is stable (as in !pmd_trans_unstable) so we can re-read it
@@ -320,6 +323,8 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	 */
 	if (pte_none(*pte))
 		ret = true;
+	if (!pte_write(*pte) && (reason & VM_UFFD_WP))
+		ret = true;
 	pte_unmap(pte);
 out:
@@ -1258,10 +1263,13 @@ static __always_inline int validate_range(struct mm_struct *mm,
 	return 0;
 }

-static inline bool vma_can_userfault(struct vm_area_struct *vma)
+static inline bool vma_can_userfault(struct vm_area_struct *vma,
+				     unsigned long vm_flags)
 {
-	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
-		vma_is_shmem(vma);
+	/* FIXME: add WP support to hugetlbfs and shmem */
+	return vma_is_anonymous(vma) ||
+		((is_vm_hugetlb_page(vma) || vma_is_shmem(vma)) &&
+		 !(vm_flags & VM_UFFD_WP));
 }

 static int userfaultfd_register(struct userfaultfd_ctx *ctx,
@@ -1293,15 +1301,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	vm_flags = 0;
 	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING)
 		vm_flags |= VM_UFFD_MISSING;
-	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) {
+	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP)
 		vm_flags |= VM_UFFD_WP;
-		/*
-		 * FIXME: remove the below error constraint by
-		 * implementing the wprotect tracking mode.
-		 */
-		ret = -EINVAL;
-		goto out;
-	}

 	ret = validate_range(mm, uffdio_register.range.start,
 			     uffdio_register.range.len);
@@ -1351,7 +1352,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 		/* check not compatible vmas */
 		ret = -EINVAL;
-		if (!vma_can_userfault(cur))
+		if (!vma_can_userfault(cur, vm_flags))
 			goto out_unlock;

 		/*
@@ -1379,6 +1380,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 			if (end & (vma_hpagesize - 1))
 				goto out_unlock;
 		}
+		if ((vm_flags & VM_UFFD_WP) && !(cur->vm_flags & VM_MAYWRITE))
+			goto out_unlock;

 		/*
 		 * Check that this vma isn't already owned by a
@@ -1408,7 +1411,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	do {
 		cond_resched();
-		BUG_ON(!vma_can_userfault(vma));
+		BUG_ON(!vma_can_userfault(vma, vm_flags));
 		BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
 		       vma->vm_userfaultfd_ctx.ctx != ctx);
 		WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
@@ -1545,7 +1548,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 		 * provides for more strict behavior to notice
 		 * unregistration errors.
 		 */
-		if (!vma_can_userfault(cur))
+		if (!vma_can_userfault(cur, cur->vm_flags))
 			goto out_unlock;

 		found = true;
@@ -1559,7 +1562,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 	do {
 		cond_resched();
-		BUG_ON(!vma_can_userfault(vma));
+		BUG_ON(!vma_can_userfault(vma, vma->vm_flags));

 		/*
 		 * Nothing to do: this vma is already registered into this
@@ -1772,6 +1775,50 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx,
 	return ret;
 }

+static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
+				    unsigned long arg)
+{
+	int ret;
+	struct uffdio_writeprotect uffdio_wp;
+	struct uffdio_writeprotect __user *user_uffdio_wp;
+	struct userfaultfd_wake_range range;
+
+	if (READ_ONCE(ctx->mmap_changing))
+		return -EAGAIN;
+
+	user_uffdio_wp = (struct uffdio_writeprotect __user *) arg;
+
+	if (copy_from_user(&uffdio_wp, user_uffdio_wp,
+			   sizeof(struct uffdio_writeprotect)))
+		return -EFAULT;
+
+	ret = validate_range(ctx->mm, uffdio_wp.range.start,
+			     uffdio_wp.range.len);
+	if (ret)
+		return ret;
+
+	if (uffdio_wp.mode & ~(UFFDIO_WRITEPROTECT_MODE_DONTWAKE |
+			       UFFDIO_WRITEPROTECT_MODE_WP))
+		return -EINVAL;
+	if ((uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP) &&
+	    (uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE))
+		return -EINVAL;
+
+	ret = mwriteprotect_range(ctx->mm, uffdio_wp.range.start,
+				  uffdio_wp.range.len, uffdio_wp.mode &
+				  UFFDIO_WRITEPROTECT_MODE_WP,
+				  &ctx->mmap_changing);
+	if (ret)
+		return ret;
+
+	if (!(uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) {
+		range.start = uffdio_wp.range.start;
+		range.len = uffdio_wp.range.len;
+		wake_userfault(ctx, &range);
+	}
+	return ret;
+}
+
 static inline unsigned int uffd_ctx_features(__u64 user_features)
 {
 	/*
@@ -1849,6 +1896,9 @@ static long userfaultfd_ioctl(struct file *file, unsigned cmd,
 	case UFFDIO_ZEROPAGE:
 		ret = userfaultfd_zeropage(ctx, arg);
 		break;
+	case UFFDIO_WRITEPROTECT:
+		ret = userfaultfd_writeprotect(ctx, arg);
+		break;
 	}
 	return ret;
 }

diff --git
a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index 340f23bc251d..95c4a160e5f8 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -52,6 +52,7 @@ #define _UFFDIO_WAKE (0x02) #define _UFFDIO_COPY (0x03) #define _UFFDIO_ZEROPAGE (0x04) +#define _UFFDIO_WRITEPROTECT (0x06) #define _UFFDIO_API (0x3F) /* userfaultfd ioctl ids */ @@ -68,6 +69,8 @@ struct uffdio_copy) #define UFFDIO_ZEROPAGE _IOWR(UFFDIO, _UFFDIO_ZEROPAGE, \ struct uffdio_zeropage) +#define UFFDIO_WRITEPROTECT _IOWR(UFFDIO, _UFFDIO_WRITEPROTECT, \ + struct uffdio_writeprotect) /* read() structure */ struct uffd_msg { @@ -232,4 +235,24 @@ struct uffdio_zeropage { __s64 zeropage; }; +struct uffdio_writeprotect { + struct uffdio_range range; +/* + * UFFDIO_WRITEPROTECT_MODE_WP: set the flag to write protect a range, + * unset the flag to undo protection of a range which was previously + * write protected. + * + * UFFDIO_WRITEPROTECT_MODE_DONTWAKE: set the flag to avoid waking up + * any wait thread after the operation succeeds. + * + * NOTE: Write protecting a region (WP=1) is unrelated to page faults, + * therefore DONTWAKE flag is meaningless with WP=1. Removing write + * protection (WP=0) in response to a page fault wakes the faulting + * task unless DONTWAKE is set. 
+ */ +#define UFFDIO_WRITEPROTECT_MODE_WP ((__u64)1<<0) +#define UFFDIO_WRITEPROTECT_MODE_DONTWAKE ((__u64)1<<1) + __u64 mode; +}; + #endif /* _LINUX_USERFAULTFD_H */
From patchwork Thu Jun 20 02:20:03 2019 X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11005715
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v5 20/25] userfaultfd: wp: enabled write protection in userfaultfd API Date: Thu, 20 Jun 2019 10:20:03 +0800 Message-Id: <20190620022008.19172-21-peterx@redhat.com> In-Reply-To: <20190620022008.19172-1-peterx@redhat.com> References: <20190620022008.19172-1-peterx@redhat.com>
From: Shaohua Li
Now it's safe to enable write protection in userfaultfd API
Cc: Andrea Arcangeli Cc: Pavel Emelyanov Cc: Rik van Riel Cc: Kirill A. Shutemov Cc: Mel Gorman Cc: Hugh Dickins Cc: Johannes Weiner Signed-off-by: Shaohua Li Signed-off-by: Andrea Arcangeli Reviewed-by: Jerome Glisse Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- include/uapi/linux/userfaultfd.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index 95c4a160e5f8..e7e98bde221f 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -19,7 +19,8 @@ * means the userland is reading).
*/ #define UFFD_API ((__u64)0xAA) -#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK | \ +#define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP | \ + UFFD_FEATURE_EVENT_FORK | \ UFFD_FEATURE_EVENT_REMAP | \ UFFD_FEATURE_EVENT_REMOVE | \ UFFD_FEATURE_EVENT_UNMAP | \ @@ -34,7 +35,8 @@ #define UFFD_API_RANGE_IOCTLS \ ((__u64)1 << _UFFDIO_WAKE | \ (__u64)1 << _UFFDIO_COPY | \ - (__u64)1 << _UFFDIO_ZEROPAGE) + (__u64)1 << _UFFDIO_ZEROPAGE | \ + (__u64)1 << _UFFDIO_WRITEPROTECT) #define UFFD_API_RANGE_IOCTLS_BASIC \ ((__u64)1 << _UFFDIO_WAKE | \ (__u64)1 << _UFFDIO_COPY)
From patchwork Thu Jun 20 02:20:04 2019 X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11005717
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v5 21/25] userfaultfd: wp: don't wake up when doing write protect Date: Thu, 20 Jun 2019 10:20:04 +0800 Message-Id: <20190620022008.19172-22-peterx@redhat.com> In-Reply-To: <20190620022008.19172-1-peterx@redhat.com> References: <20190620022008.19172-1-peterx@redhat.com>
It does not make sense to try to wake up any waiting thread when we're write-protecting a memory region. Only wake up when resolving a write protected page fault.
Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- fs/userfaultfd.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 3cf19aeaa0e0..498971fa9163 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1782,6 +1782,7 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx, struct uffdio_writeprotect uffdio_wp; struct uffdio_writeprotect __user *user_uffdio_wp; struct userfaultfd_wake_range range; + bool mode_wp, mode_dontwake; if (READ_ONCE(ctx->mmap_changing)) return -EAGAIN; @@ -1800,18 +1801,20 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx, if (uffdio_wp.mode & ~(UFFDIO_WRITEPROTECT_MODE_DONTWAKE | UFFDIO_WRITEPROTECT_MODE_WP)) return -EINVAL; - if ((uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP) && - (uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) + + mode_wp = uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP; + mode_dontwake = uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE; + + if (mode_wp && mode_dontwake) return -EINVAL; ret = mwriteprotect_range(ctx->mm, uffdio_wp.range.start, -
uffdio_wp.range.len, uffdio_wp.mode & - UFFDIO_WRITEPROTECT_MODE_WP, + uffdio_wp.range.len, mode_wp, &ctx->mmap_changing); if (ret) return ret; - if (!(uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) { + if (!mode_wp && !mode_dontwake) { range.start = uffdio_wp.range.start; range.len = uffdio_wp.range.len; wake_userfault(ctx, &range);
From patchwork Thu Jun 20 02:20:05 2019 X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11005719
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v5 22/25] userfaultfd: wp: UFFDIO_REGISTER_MODE_WP documentation update Date: Thu, 20 Jun 2019 10:20:05 +0800 Message-Id: <20190620022008.19172-23-peterx@redhat.com> In-Reply-To: <20190620022008.19172-1-peterx@redhat.com> References: <20190620022008.19172-1-peterx@redhat.com>
From: Martin Cracauer
Adds documentation about the write protection support.
Signed-off-by: Martin Cracauer Signed-off-by: Andrea Arcangeli [peterx: rewrite in rst format; fixups here and there] Reviewed-by: Jerome Glisse Reviewed-by: Mike Rapoport Signed-off-by: Peter Xu --- Documentation/admin-guide/mm/userfaultfd.rst | 51 ++++++++++++++++++++ 1 file changed, 51 insertions(+) diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst index 5048cf661a8a..c30176e67900 100644 --- a/Documentation/admin-guide/mm/userfaultfd.rst +++ b/Documentation/admin-guide/mm/userfaultfd.rst @@ -108,6 +108,57 @@ UFFDIO_COPY. They're atomic as in guaranteeing that nothing can see an half copied page since it'll keep userfaulting until the copy has finished. +Notes: + +- If you requested UFFDIO_REGISTER_MODE_MISSING when registering then + you must provide some kind of page in your thread after reading from + the uffd. You must provide either UFFDIO_COPY or UFFDIO_ZEROPAGE. + The normal behavior of the OS automatically providing a zero page on + an anonymous mmaping is not in place. + +- None of the page-delivering ioctls default to the range that you registered with.
You must fill in all fields for the appropriate + ioctl struct including the range. + +- You get the address of the access that triggered the missing page + event out of a struct uffd_msg that you read in the thread from the + uffd. You can supply as many pages as you want with UFFDIO_COPY or + UFFDIO_ZEROPAGE. Keep in mind that unless you used DONTWAKE then + the first of any of those IOCTLs wakes up the faulting thread. + +- Be sure to test for all errors including (pollfd[0].revents & + POLLERR). This can happen, e.g. when ranges supplied were + incorrect. + +Write Protect Notifications +--------------------------- + +This is equivalent to (but faster than) using mprotect and a SIGSEGV +signal handler. + +Firstly you need to register a range with UFFDIO_REGISTER_MODE_WP. +Instead of using mprotect(2) you use ioctl(uffd, UFFDIO_WRITEPROTECT, +struct *uffdio_writeprotect) while mode = UFFDIO_WRITEPROTECT_MODE_WP +in the struct passed in. The range does not default to and does not +have to be identical to the range you registered with. You can write +protect as many ranges as you like (inside the registered range). +Then, in the thread reading from uffd the struct will have +msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP set. Now you send +ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect) again +while pagefault.mode does not have UFFDIO_WRITEPROTECT_MODE_WP set. +This wakes up the thread which will continue to run with writes. This +allows you to do the bookkeeping about the write in the uffd reading +thread before the ioctl. + +If you registered with both UFFDIO_REGISTER_MODE_MISSING and +UFFDIO_REGISTER_MODE_WP then you need to think about the sequence in +which you supply a page and undo write protect. Note that there is a +difference between writes into a WP area and into a !WP area. The +former will have UFFD_PAGEFAULT_FLAG_WP set, the latter +UFFD_PAGEFAULT_FLAG_WRITE. 
The latter did not fail on protection but +you still need to supply a page when UFFDIO_REGISTER_MODE_MISSING was +used. + QEMU/KVM ======== From patchwork Thu Jun 20 02:20:06 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11005721 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BA3AA14DB for ; Thu, 20 Jun 2019 02:25:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AE916209CD for ; Thu, 20 Jun 2019 02:25:13 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A2C2126E49; Thu, 20 Jun 2019 02:25:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 36F87209CD for ; Thu, 20 Jun 2019 02:25:13 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 551178E000E; Wed, 19 Jun 2019 22:25:12 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 526B88E0001; Wed, 19 Jun 2019 22:25:12 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 415148E000E; Wed, 19 Jun 2019 22:25:12 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qk1-f199.google.com (mail-qk1-f199.google.com [209.85.222.199]) by kanga.kvack.org (Postfix) with ESMTP id 2206B8E0001 for ; Wed, 19 Jun 2019 22:25:12 -0400 (EDT) Received: by mail-qk1-f199.google.com with 
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel
Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v5 23/25] userfaultfd: wp: declare _UFFDIO_WRITEPROTECT conditionally
Date: Thu, 20 Jun 2019 10:20:06 +0800
Message-Id: <20190620022008.19172-24-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

Only declare _UFFDIO_WRITEPROTECT if the user specified
UFFDIO_REGISTER_MODE_WP and if all the checks passed.  Then when the
user registers regions with shmem/hugetlbfs we won't expose the new
ioctl to them.  Even with a completely anonymous memory range, we'll
only expose the new WP ioctl bit if the register mode has MODE_WP.

Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 498971fa9163..4e1d7748224a 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1465,14 +1465,24 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	up_write(&mm->mmap_sem);
 	mmput(mm);
 	if (!ret) {
+		__u64 ioctls_out;
+
+		ioctls_out = basic_ioctls ?
			UFFD_API_RANGE_IOCTLS_BASIC :
+			UFFD_API_RANGE_IOCTLS;
+
+		/*
+		 * Declare the WP ioctl only if the WP mode is
+		 * specified and all checks passed with the range
+		 */
+		if (!(uffdio_register.mode & UFFDIO_REGISTER_MODE_WP))
+			ioctls_out &= ~((__u64)1 << _UFFDIO_WRITEPROTECT);
+
 		/*
 		 * Now that we scanned all vmas we can already tell
 		 * userland which ioctls methods are guaranteed to
 		 * succeed on this range.
 		 */
-		if (put_user(basic_ioctls ? UFFD_API_RANGE_IOCTLS_BASIC :
-			     UFFD_API_RANGE_IOCTLS,
-			     &user_uffdio_register->ioctls))
+		if (put_user(ioctls_out, &user_uffdio_register->ioctls))
 			ret = -EFAULT;
 	}
 out:

From patchwork Thu Jun 20 02:20:07 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005723
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr.
David Alan Gilbert"
Subject: [PATCH v5 24/25] userfaultfd: selftests: refactor statistics
Date: Thu, 20 Jun 2019 10:20:07 +0800
Message-Id: <20190620022008.19172-25-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

Introduce a uffd_stats structure for the statistics of the self test,
and at the same time refactor the code to always pass in the
uffd_stats for both read() and poll() typed fault handling threads,
instead of using two different ways to return the statistic results.
No functional change.

With the new structure, it's very easy to introduce new statistics.
Reviewed-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 tools/testing/selftests/vm/userfaultfd.c | 76 +++++++++++++++---------
 1 file changed, 49 insertions(+), 27 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index b3e6497b080c..417dbdf4d379 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -88,6 +88,12 @@ static char *area_src, *area_src_alias, *area_dst, *area_dst_alias;
 static char *zeropage;
 pthread_attr_t attr;
 
+/* Userfaultfd test statistics */
+struct uffd_stats {
+	int cpu;
+	unsigned long missing_faults;
+};
+
 /* pthread_mutex_t starts at page offset 0 */
 #define area_mutex(___area, ___nr)					\
 	((pthread_mutex_t *) ((___area) + (___nr)*page_size))
@@ -127,6 +133,17 @@ static void usage(void)
 	exit(1);
 }
 
+static void uffd_stats_reset(struct uffd_stats *uffd_stats,
+			     unsigned long n_cpus)
+{
+	int i;
+
+	for (i = 0; i < n_cpus; i++) {
+		uffd_stats[i].cpu = i;
+		uffd_stats[i].missing_faults = 0;
+	}
+}
+
 static int anon_release_pages(char *rel_area)
 {
 	int ret = 0;
@@ -469,8 +486,8 @@ static int uffd_read_msg(int ufd, struct uffd_msg *msg)
 	return 0;
 }
 
-/* Return 1 if page fault handled by us; otherwise 0 */
-static int uffd_handle_page_fault(struct uffd_msg *msg)
+static void uffd_handle_page_fault(struct uffd_msg *msg,
+				   struct uffd_stats *stats)
 {
 	unsigned long offset;
 
@@ -485,18 +502,19 @@ static int uffd_handle_page_fault(struct uffd_msg *msg)
 	offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
 	offset &= ~(page_size-1);
 
-	return copy_page(uffd, offset);
+	if (copy_page(uffd, offset))
+		stats->missing_faults++;
 }
 
 static void *uffd_poll_thread(void *arg)
 {
-	unsigned long cpu = (unsigned long) arg;
+	struct uffd_stats *stats = (struct uffd_stats *)arg;
+	unsigned long cpu = stats->cpu;
 	struct pollfd pollfd[2];
 	struct uffd_msg msg;
 	struct uffdio_register uffd_reg;
 	int ret;
 	char tmp_chr;
-	unsigned long userfaults = 0;
 
 	pollfd[0].fd = uffd;
 	pollfd[0].events = POLLIN;
@@ -526,7 +544,7 @@ static void *uffd_poll_thread(void *arg)
 				msg.event), exit(1);
 			break;
 		case UFFD_EVENT_PAGEFAULT:
-			userfaults += uffd_handle_page_fault(&msg);
+			uffd_handle_page_fault(&msg, stats);
 			break;
 		case UFFD_EVENT_FORK:
 			close(uffd);
@@ -545,28 +563,27 @@ static void *uffd_poll_thread(void *arg)
 			break;
 		}
 	}
-	return (void *)userfaults;
+
+	return NULL;
 }
 
 pthread_mutex_t uffd_read_mutex = PTHREAD_MUTEX_INITIALIZER;
 
 static void *uffd_read_thread(void *arg)
 {
-	unsigned long *this_cpu_userfaults;
+	struct uffd_stats *stats = (struct uffd_stats *)arg;
 	struct uffd_msg msg;
 
-	this_cpu_userfaults = (unsigned long *) arg;
-	*this_cpu_userfaults = 0;
-
 	pthread_mutex_unlock(&uffd_read_mutex);
 	/* from here cancellation is ok */
 
 	for (;;) {
 		if (uffd_read_msg(uffd, &msg))
 			continue;
-		(*this_cpu_userfaults) += uffd_handle_page_fault(&msg);
+		uffd_handle_page_fault(&msg, stats);
 	}
-	return (void *)NULL;
+
+	return NULL;
 }
 
 static void *background_thread(void *arg)
@@ -582,13 +599,12 @@ static void *background_thread(void *arg)
 	return NULL;
 }
 
-static int stress(unsigned long *userfaults)
+static int stress(struct uffd_stats *uffd_stats)
 {
 	unsigned long cpu;
 	pthread_t locking_threads[nr_cpus];
 	pthread_t uffd_threads[nr_cpus];
 	pthread_t background_threads[nr_cpus];
-	void **_userfaults = (void **) userfaults;
 
 	finished = 0;
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
@@ -597,12 +613,13 @@ static int stress(unsigned long *userfaults)
 			return 1;
 		if (bounces & BOUNCE_POLL) {
 			if (pthread_create(&uffd_threads[cpu], &attr,
-					   uffd_poll_thread, (void *)cpu))
+					   uffd_poll_thread,
+					   (void *)&uffd_stats[cpu]))
 				return 1;
 		} else {
 			if (pthread_create(&uffd_threads[cpu], &attr,
 					   uffd_read_thread,
-					   &_userfaults[cpu]))
+					   (void *)&uffd_stats[cpu]))
 				return 1;
 			pthread_mutex_lock(&uffd_read_mutex);
 		}
@@ -639,7 +656,8 @@ static int stress(unsigned long *userfaults)
 			fprintf(stderr, "pipefd write error\n");
 			return 1;
 		}
-		if (pthread_join(uffd_threads[cpu], &_userfaults[cpu]))
+		if (pthread_join(uffd_threads[cpu],
+				 (void *)&uffd_stats[cpu]))
 			return 1;
 	} else {
 		if (pthread_cancel(uffd_threads[cpu]))
@@ -910,11 +928,11 @@ static int userfaultfd_events_test(void)
 {
 	struct uffdio_register uffdio_register;
 	unsigned long expected_ioctls;
-	unsigned long userfaults;
 	pthread_t uffd_mon;
 	int err, features;
 	pid_t pid;
 	char c;
+	struct uffd_stats stats = { 0 };
 
 	printf("testing events (fork, remap, remove): ");
 	fflush(stdout);
@@ -941,7 +959,7 @@ static int userfaultfd_events_test(void)
 			"unexpected missing ioctl for anon memory\n"),
 			exit(1);
 
-	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL))
+	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats))
 		perror("uffd_poll_thread create"), exit(1);
 
 	pid = fork();
@@ -957,13 +975,13 @@ static int userfaultfd_events_test(void)
 	if (write(pipefd[1], &c, sizeof(c)) != sizeof(c))
 		perror("pipe write"), exit(1);
-	if (pthread_join(uffd_mon, (void **)&userfaults))
+	if (pthread_join(uffd_mon, NULL))
 		return 1;
 
 	close(uffd);
-	printf("userfaults: %ld\n", userfaults);
+	printf("userfaults: %ld\n", stats.missing_faults);
 
-	return userfaults != nr_pages;
+	return stats.missing_faults != nr_pages;
 }
 
 static int userfaultfd_sig_test(void)
@@ -975,6 +993,7 @@ static int userfaultfd_sig_test(void)
 	int err, features;
 	pid_t pid;
 	char c;
+	struct uffd_stats stats = { 0 };
 
 	printf("testing signal delivery: ");
 	fflush(stdout);
@@ -1006,7 +1025,7 @@ static int userfaultfd_sig_test(void)
 	if (uffd_test_ops->release_pages(area_dst))
 		return 1;
 
-	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL))
+	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats))
 		perror("uffd_poll_thread create"), exit(1);
 
 	pid = fork();
@@ -1032,6 +1051,7 @@ static int userfaultfd_sig_test(void)
 	close(uffd);
 	return userfaults != 0;
 }
+
 static int userfaultfd_stress(void)
 {
 	void *area;
@@ -1040,7 +1060,7 @@ static int userfaultfd_stress(void)
 	struct uffdio_register uffdio_register;
 	unsigned long cpu;
 	int
err;
-	unsigned long userfaults[nr_cpus];
+	struct uffd_stats uffd_stats[nr_cpus];
 
 	uffd_test_ops->allocate_area((void **)&area_src);
 	if (!area_src)
@@ -1169,8 +1189,10 @@ static int userfaultfd_stress(void)
 		if (uffd_test_ops->release_pages(area_dst))
 			return 1;
 
+		uffd_stats_reset(uffd_stats, nr_cpus);
+
 		/* bounce pass */
-		if (stress(userfaults))
+		if (stress(uffd_stats))
 			return 1;
 
 		/* unregister */
@@ -1213,7 +1235,7 @@ static int userfaultfd_stress(void)
 			printf("userfaults:");
 			for (cpu = 0; cpu < nr_cpus; cpu++)
-				printf(" %lu", userfaults[cpu]);
+				printf(" %lu", uffd_stats[cpu].missing_faults);
 			printf("\n");
 		}

From patchwork Thu Jun 20 02:20:08 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11005725
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr.
David Alan Gilbert"
Subject: [PATCH v5 25/25] userfaultfd: selftests: add write-protect test
Date: Thu, 20 Jun 2019 10:20:08 +0800
Message-Id: <20190620022008.19172-26-peterx@redhat.com>
In-Reply-To: <20190620022008.19172-1-peterx@redhat.com>
References: <20190620022008.19172-1-peterx@redhat.com>

This patch adds uffd tests for write protection.  Instead of
introducing new tests, squash the uffd-wp tests into the existing
uffd-missing test cases.  Changes are:

(1) Bouncing tests

  We do the write-protection in two ways during the bouncing test:

  - By using UFFDIO_COPY_MODE_WP when resolving MISSING pages: then
    we'll make sure that for each bounce process every single page
    will be faulted at least twice: once for MISSING, once for WP.

  - By directly calling UFFDIO_WRITEPROTECT on existing faulted
    memories: to further torture the explicit page protection
    procedures of uffd-wp, we split each bounce procedure into two
    halves (in the background thread): the first half will be
    MISSING+WP for each page as explained above.  After the first
    half, we write protect the faulted region in the background
    thread to make sure at least half of the pages will be write
    protected again, which is the first half to test the new
    UFFDIO_WRITEPROTECT call.  Then we continue with the 2nd half,
    which will contain both MISSING and WP faulting tests for the
    2nd half and WP-only faults from the 1st half.

(2) Event/Signal test

  Mostly the previous tests, but will do MISSING+WP for each page.
  For the sigbus-mode test we'll need to provide a standalone path
  to handle the write protection faults.
For all tests, do statistics as well for uffd-wp pages.

Signed-off-by: Peter Xu
---
 tools/testing/selftests/vm/userfaultfd.c | 157 +++++++++++++++++++----
 1 file changed, 133 insertions(+), 24 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 417dbdf4d379..fa362fe311e3 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -56,6 +56,7 @@
 #include
 #include
 #include
+#include
 
 #include "../kselftest.h"
@@ -78,6 +79,8 @@ static int test_type;
 #define ALARM_INTERVAL_SECS 10
 static volatile bool test_uffdio_copy_eexist = true;
 static volatile bool test_uffdio_zeropage_eexist = true;
+/* Whether to test uffd write-protection */
+static bool test_uffdio_wp = false;
 
 static bool map_shared;
 static int huge_fd;
@@ -92,6 +95,7 @@ pthread_attr_t attr;
 struct uffd_stats {
 	int cpu;
 	unsigned long missing_faults;
+	unsigned long wp_faults;
 };
 
 /* pthread_mutex_t starts at page offset 0 */
@@ -141,9 +145,29 @@ static void uffd_stats_reset(struct uffd_stats *uffd_stats,
 	for (i = 0; i < n_cpus; i++) {
 		uffd_stats[i].cpu = i;
 		uffd_stats[i].missing_faults = 0;
+		uffd_stats[i].wp_faults = 0;
 	}
 }
 
+static void uffd_stats_report(struct uffd_stats *stats, int n_cpus)
+{
+	int i;
+	unsigned long long miss_total = 0, wp_total = 0;
+
+	for (i = 0; i < n_cpus; i++) {
+		miss_total += stats[i].missing_faults;
+		wp_total += stats[i].wp_faults;
+	}
+
+	printf("userfaults: %llu missing (", miss_total);
+	for (i = 0; i < n_cpus; i++)
+		printf("%lu+", stats[i].missing_faults);
+	printf("\b), %llu wp (", wp_total);
+	for (i = 0; i < n_cpus; i++)
+		printf("%lu+", stats[i].wp_faults);
+	printf("\b)\n");
+}
+
 static int anon_release_pages(char *rel_area)
 {
 	int ret = 0;
@@ -264,10 +288,15 @@ struct uffd_test_ops {
 	void (*alias_mapping)(__u64 *start, size_t len, unsigned long offset);
 };
 
-#define ANON_EXPECTED_IOCTLS		((1 << _UFFDIO_WAKE) | \
+#define SHMEM_EXPECTED_IOCTLS		((1 << _UFFDIO_WAKE) | \
 					 (1 << _UFFDIO_COPY) | \
 					 (1 << _UFFDIO_ZEROPAGE))
 
+#define ANON_EXPECTED_IOCTLS		((1 << _UFFDIO_WAKE) | \
+					 (1 << _UFFDIO_COPY) | \
+					 (1 << _UFFDIO_ZEROPAGE) | \
+					 (1 << _UFFDIO_WRITEPROTECT))
+
 static struct uffd_test_ops anon_uffd_test_ops = {
 	.expected_ioctls = ANON_EXPECTED_IOCTLS,
 	.allocate_area = anon_allocate_area,
@@ -276,7 +305,7 @@ static struct uffd_test_ops anon_uffd_test_ops = {
 };
 
 static struct uffd_test_ops shmem_uffd_test_ops = {
-	.expected_ioctls = ANON_EXPECTED_IOCTLS,
+	.expected_ioctls = SHMEM_EXPECTED_IOCTLS,
 	.allocate_area = shmem_allocate_area,
 	.release_pages = shmem_release_pages,
 	.alias_mapping = noop_alias_mapping,
@@ -300,6 +329,21 @@ static int my_bcmp(char *str1, char *str2, size_t n)
 	return 0;
 }
 
+static void wp_range(int ufd, __u64 start, __u64 len, bool wp)
+{
+	struct uffdio_writeprotect prms = { 0 };
+
+	/* Write protection page faults */
+	prms.range.start = start;
+	prms.range.len = len;
+	/* Undo write-protect, do wakeup after that */
+	prms.mode = wp ? UFFDIO_WRITEPROTECT_MODE_WP : 0;
+
+	if (ioctl(ufd, UFFDIO_WRITEPROTECT, &prms))
+		fprintf(stderr, "clear WP failed for address 0x%Lx\n",
+			start), exit(1);
+}
+
 static void *locking_thread(void *arg)
 {
 	unsigned long cpu = (unsigned long) arg;
@@ -438,7 +482,10 @@ static int __copy_page(int ufd, unsigned long offset, bool retry)
 	uffdio_copy.dst = (unsigned long) area_dst + offset;
 	uffdio_copy.src = (unsigned long) area_src + offset;
 	uffdio_copy.len = page_size;
-	uffdio_copy.mode = 0;
+	if (test_uffdio_wp)
+		uffdio_copy.mode = UFFDIO_COPY_MODE_WP;
+	else
+		uffdio_copy.mode = 0;
 	uffdio_copy.copy = 0;
 	if (ioctl(ufd, UFFDIO_COPY, &uffdio_copy)) {
 		/* real retval in ufdio_copy.copy */
@@ -495,15 +542,21 @@ static void uffd_handle_page_fault(struct uffd_msg *msg,
 		fprintf(stderr, "unexpected msg event %u\n",
 			msg->event), exit(1);
 
-	if (bounces & BOUNCE_VERIFY &&
-	    msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
-		fprintf(stderr, "unexpected write fault\n"), exit(1);
+	if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP) {
+		wp_range(uffd, msg->arg.pagefault.address, page_size, false);
+		stats->wp_faults++;
+	} else {
+		/* Missing page faults */
+		if (bounces & BOUNCE_VERIFY &&
+		    msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
+			fprintf(stderr, "unexpected write fault\n"), exit(1);
 
-	offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
-	offset &= ~(page_size-1);
+		offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
+		offset &= ~(page_size-1);
 
-	if (copy_page(uffd, offset))
-		stats->missing_faults++;
+		if (copy_page(uffd, offset))
+			stats->missing_faults++;
+	}
 }
 
 static void *uffd_poll_thread(void *arg)
@@ -589,11 +642,30 @@ static void *uffd_read_thread(void *arg)
 static void *background_thread(void *arg)
 {
 	unsigned long cpu = (unsigned long) arg;
-	unsigned long page_nr;
+	unsigned long page_nr, start_nr, mid_nr, end_nr;
+
+	start_nr = cpu * nr_pages_per_cpu;
+	end_nr = (cpu+1) * nr_pages_per_cpu;
+	mid_nr = (start_nr + end_nr) / 2;
+
+	/* Copy the first half of the pages */
+	for (page_nr = start_nr; page_nr < mid_nr; page_nr++)
+		copy_page_retry(uffd, page_nr * page_size);
 
-	for (page_nr = cpu * nr_pages_per_cpu;
-	     page_nr < (cpu+1) * nr_pages_per_cpu;
-	     page_nr++)
+	/*
+	 * If we need to test uffd-wp, set it up now.  Then we'll have
+	 * at least the first half of the pages mapped already which
+	 * can be write-protected for testing
+	 */
+	if (test_uffdio_wp)
+		wp_range(uffd, (unsigned long)area_dst + start_nr * page_size,
+			 nr_pages_per_cpu * page_size, true);
+
+	/*
+	 * Continue the 2nd half of the page copying, handling write
+	 * protection faults if any
+	 */
+	for (page_nr = mid_nr; page_nr < end_nr; page_nr++)
 		copy_page_retry(uffd, page_nr * page_size);
 
 	return NULL;
@@ -755,17 +827,31 @@ static int faulting_process(int signal_test)
 	}
 
 	for (nr = 0; nr < split_nr_pages; nr++) {
+		int steps = 1;
+		unsigned long offset = nr * page_size;
+
 		if (signal_test) {
 			if (sigsetjmp(*sigbuf, 1) != 0) {
-				if (nr == lastnr) {
+				if (steps == 1 && nr == lastnr) {
 					fprintf(stderr, "Signal repeated\n");
 					return 1;
 				}
 
 				lastnr = nr;
 				if (signal_test == 1) {
-					if (copy_page(uffd, nr * page_size))
-						signalled++;
+					if (steps == 1) {
+						/* This is a MISSING request */
+						steps++;
+						if (copy_page(uffd, offset))
+							signalled++;
+					} else {
+						/* This is a WP request */
+						assert(steps == 2);
+						wp_range(uffd,
+							 (__u64)area_dst +
+							 offset,
+							 page_size, false);
+					}
 				} else {
 					signalled++;
 					continue;
@@ -778,8 +864,13 @@ static int faulting_process(int signal_test)
 				fprintf(stderr,
 					"nr %lu memory corruption %Lu %Lu\n",
 					nr, count,
-					count_verify[nr]), exit(1);
-		}
+					count_verify[nr]);
+		}
+		/*
+		 * Trigger write protection if there is by writing
+		 * the same value back.
+		 */
+		*area_count(area_dst, nr) = count;
 	}
 
 	if (signal_test)
@@ -801,6 +892,11 @@ static int faulting_process(int signal_test)
 				nr, count, count_verify[nr]), exit(1);
 		}
+		/*
+		 * Trigger write protection if there is by writing
+		 * the same value back.
+		 */
+		*area_count(area_dst, nr) = count;
 	}
 
 	if (uffd_test_ops->release_pages(area_dst))
@@ -904,6 +1000,8 @@ static int userfaultfd_zeropage_test(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		fprintf(stderr, "register failure\n"), exit(1);
@@ -949,6 +1047,8 @@ static int userfaultfd_events_test(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		fprintf(stderr, "register failure\n"), exit(1);
@@ -979,7 +1079,8 @@ static int userfaultfd_events_test(void)
 		return 1;
 
 	close(uffd);
-	printf("userfaults: %ld\n", stats.missing_faults);
+
+	uffd_stats_report(&stats, 1);
 
 	return stats.missing_faults != nr_pages;
 }
@@ -1009,6 +1110,8 @@ static int userfaultfd_sig_test(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		fprintf(stderr, "register failure\n"), exit(1);
@@ -1141,6 +1244,8 @@ static int userfaultfd_stress(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) {
 		fprintf(stderr, "register failure\n");
 		return 1;
@@ -1195,6 +1300,11 @@ static int userfaultfd_stress(void)
 		if (stress(uffd_stats))
 			return 1;
 
+		/* Clear all the write
protections if there is any */ + if (test_uffdio_wp) + wp_range(uffd, (unsigned long)area_dst, + nr_pages * page_size, false); + /* unregister */ if (ioctl(uffd, UFFDIO_UNREGISTER, &uffdio_register.range)) { fprintf(stderr, "unregister failure\n"); @@ -1233,10 +1343,7 @@ static int userfaultfd_stress(void) area_src_alias = area_dst_alias; area_dst_alias = tmp_area; - printf("userfaults:"); - for (cpu = 0; cpu < nr_cpus; cpu++) - printf(" %lu", uffd_stats[cpu].missing_faults); - printf("\n"); + uffd_stats_report(uffd_stats, nr_cpus); } if (err) @@ -1276,6 +1383,8 @@ static void set_test_type(const char *type) if (!strcmp(type, "anon")) { test_type = TEST_ANON; uffd_test_ops = &anon_uffd_test_ops; + /* Only enable write-protect test for anonymous test */ + test_uffdio_wp = true; } else if (!strcmp(type, "hugetlb")) { test_type = TEST_HUGETLB; uffd_test_ops = &hugetlb_uffd_test_ops;