From patchwork Tue Feb 12 02:56:07 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807241
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz,
    Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH v2 01/26] mm: gup: rename "nonblocking" to "locked" where proper
Date: Tue, 12 Feb 2019 10:56:07 +0800
Message-Id: <20190212025632.28946-2-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

There are plenty of places around __get_user_pages() that take a
parameter named "nonblocking" which does not really mean "it won't
block" (it can block); what it actually indicates is whether the
mmap_sem was released by up_read() during page fault handling, mostly
when VM_FAULT_RETRY is returned.  We already have the correct name
"locked" in e.g. get_user_pages_locked() and get_user_pages_remote(),
but many places still use "nonblocking".  Rename those to "locked"
where proper, to better suit the functionality of the variable.  While
at it, fix up some of the comments accordingly.

Reviewed-by: Mike Rapoport
Reviewed-by: Jérôme Glisse
Signed-off-by: Peter Xu
---
 mm/gup.c     | 44 +++++++++++++++++++++-----------------------
 mm/hugetlb.c |  8 ++++----
 2 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 05acd7e2eb22..fa75a03204c1 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -506,12 +506,12 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
 }
 
 /*
- * mmap_sem must be held on entry.  If @nonblocking != NULL and
- * *@flags does not include FOLL_NOWAIT, the mmap_sem may be released.
- * If it is, *@nonblocking will be set to 0 and -EBUSY returned.
+ * mmap_sem must be held on entry.  If @locked != NULL and *@flags
+ * does not include FOLL_NOWAIT, the mmap_sem may be released.  If it
+ * is, *@locked will be set to 0 and -EBUSY returned.
  */
 static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
-		unsigned long address, unsigned int *flags, int *nonblocking)
+		unsigned long address, unsigned int *flags, int *locked)
 {
 	unsigned int fault_flags = 0;
 	vm_fault_t ret;
@@ -523,7 +523,7 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
 		fault_flags |= FAULT_FLAG_WRITE;
 	if (*flags & FOLL_REMOTE)
 		fault_flags |= FAULT_FLAG_REMOTE;
-	if (nonblocking)
+	if (locked)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY;
 	if (*flags & FOLL_NOWAIT)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;
@@ -549,8 +549,8 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
 	}
 
 	if (ret & VM_FAULT_RETRY) {
-		if (nonblocking && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
-			*nonblocking = 0;
+		if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
+			*locked = 0;
 		return -EBUSY;
 	}
 
@@ -627,7 +627,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
  *		only intends to ensure the pages are faulted in.
  * @vmas:	array of pointers to vmas corresponding to each page.
  *		Or NULL if the caller does not require them.
- * @nonblocking: whether waiting for disk IO or mmap_sem contention
+ * @locked:	whether we're still with the mmap_sem held
  *
  * Returns number of pages pinned. This may be fewer than the number
  * requested. If nr_pages is 0 or negative, returns 0. If no pages
@@ -656,13 +656,11 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
  * appropriate) must be called after the page is finished with, and
  * before put_page is called.
  *
- * If @nonblocking != NULL, __get_user_pages will not wait for disk IO
- * or mmap_sem contention, and if waiting is needed to pin all pages,
- * *@nonblocking will be set to 0.  Further, if @gup_flags does not
- * include FOLL_NOWAIT, the mmap_sem will be released via up_read() in
- * this case.
+ * If @locked != NULL, *@locked will be set to 0 when mmap_sem is
+ * released by an up_read().  That can happen if @gup_flags does not
+ * have FOLL_NOWAIT.
  *
- * A caller using such a combination of @nonblocking and @gup_flags
+ * A caller using such a combination of @locked and @gup_flags
  * must therefore hold the mmap_sem for reading only, and recognize
  * when it's been released.  Otherwise, it must be held for either
  * reading or writing and will not be released.
@@ -674,7 +672,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		unsigned long start, unsigned long nr_pages,
 		unsigned int gup_flags, struct page **pages,
-		struct vm_area_struct **vmas, int *nonblocking)
+		struct vm_area_struct **vmas, int *locked)
 {
 	long ret = 0, i = 0;
 	struct vm_area_struct *vma = NULL;
@@ -718,7 +716,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		if (is_vm_hugetlb_page(vma)) {
 			i = follow_hugetlb_page(mm, vma, pages, vmas,
 					&start, &nr_pages, i,
-					gup_flags, nonblocking);
+					gup_flags, locked);
 			continue;
 		}
 	}
@@ -736,7 +734,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		page = follow_page_mask(vma, start, foll_flags, &ctx);
 		if (!page) {
 			ret = faultin_page(tsk, vma, start, &foll_flags,
-					nonblocking);
+					locked);
 			switch (ret) {
 			case 0:
 				goto retry;
@@ -1195,7 +1193,7 @@ EXPORT_SYMBOL(get_user_pages_longterm);
  * @vma:   target vma
  * @start: start address
  * @end:   end address
- * @nonblocking:
+ * @locked: whether the mmap_sem is still held
  *
  * This takes care of mlocking the pages too if VM_LOCKED is set.
  *
@@ -1203,14 +1201,14 @@ EXPORT_SYMBOL(get_user_pages_longterm);
  *
  * vma->vm_mm->mmap_sem must be held.
  *
- * If @nonblocking is NULL, it may be held for read or write and will
+ * If @locked is NULL, it may be held for read or write and will
  * be unperturbed.
  *
- * If @nonblocking is non-NULL, it must held for read only and may be
- * released.  If it's released, *@nonblocking will be set to 0.
+ * If @locked is non-NULL, it must held for read only and may be
+ * released.  If it's released, *@locked will be set to 0.
  */
 long populate_vma_page_range(struct vm_area_struct *vma,
-		unsigned long start, unsigned long end, int *nonblocking)
+		unsigned long start, unsigned long end, int *locked)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long nr_pages = (end - start) / PAGE_SIZE;
@@ -1245,7 +1243,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
 	 * not result in a stack expansion that recurses back here.
 	 */
 	return __get_user_pages(current, mm, start, nr_pages, gup_flags,
-				NULL, NULL, nonblocking);
+				NULL, NULL, locked);
 }
 
 /*
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index afef61656c1e..e3c738bde72e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4180,7 +4180,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			 struct page **pages, struct vm_area_struct **vmas,
 			 unsigned long *position, unsigned long *nr_pages,
-			 long i, unsigned int flags, int *nonblocking)
+			 long i, unsigned int flags, int *locked)
 {
 	unsigned long pfn_offset;
 	unsigned long vaddr = *position;
@@ -4251,7 +4251,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			spin_unlock(ptl);
 			if (flags & FOLL_WRITE)
 				fault_flags |= FAULT_FLAG_WRITE;
-			if (nonblocking)
+			if (locked)
 				fault_flags |= FAULT_FLAG_ALLOW_RETRY;
 			if (flags & FOLL_NOWAIT)
 				fault_flags |= FAULT_FLAG_ALLOW_RETRY |
@@ -4268,9 +4268,9 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 				break;
 			}
 			if (ret & VM_FAULT_RETRY) {
-				if (nonblocking &&
+				if (locked &&
 				    !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
-					*nonblocking = 0;
+					*locked = 0;
 				*nr_pages = 0;
 				/*
 				 * VM_FAULT_RETRY must not return an

From patchwork Tue Feb 12 02:56:08 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807243
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz,
    Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH v2 02/26] mm: userfault: return VM_FAULT_RETRY on signals
Date: Tue, 12 Feb 2019 10:56:08 +0800
Message-Id: <20190212025632.28946-3-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

The idea comes from the upstream discussion between Linus and Andrea:

  https://lkml.org/lkml/2017/10/30/560

A summary of the issue: handle_userfault() used to have a special path
that returned VM_FAULT_NOPAGE when a non-fatal signal was detected
while waiting for userfault handling, reacquiring the mmap_sem before
returning.  That is risky because the vmas might have changed by the
time we retake the mmap_sem, and we could even be holding an invalid
vma structure.

This patch removes the special path, so we return VM_FAULT_RETRY via
the common path even when such signals are pending.  Then, for every
architecture that passes FAULT_FLAG_ALLOW_RETRY into handle_mm_fault(),
we check not only for SIGKILL but for all pending userspace signals
right after handle_mm_fault() returns.  This allows userspace to handle
non-fatal signals faster than before.

This patch is preparation for the next patch, which finally removes the
special code path mentioned above from handle_userfault().
Suggested-by: Linus Torvalds
Suggested-by: Andrea Arcangeli
Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
---
 arch/alpha/mm/fault.c      |  2 +-
 arch/arc/mm/fault.c        | 11 ++++-------
 arch/arm/mm/fault.c        |  6 +++---
 arch/arm64/mm/fault.c      |  6 +++---
 arch/hexagon/mm/vm_fault.c |  2 +-
 arch/ia64/mm/fault.c       |  2 +-
 arch/m68k/mm/fault.c       |  2 +-
 arch/microblaze/mm/fault.c |  2 +-
 arch/mips/mm/fault.c       |  2 +-
 arch/nds32/mm/fault.c      |  6 +++---
 arch/nios2/mm/fault.c      |  2 +-
 arch/openrisc/mm/fault.c   |  2 +-
 arch/parisc/mm/fault.c     |  2 +-
 arch/powerpc/mm/fault.c    |  2 ++
 arch/riscv/mm/fault.c      |  4 ++--
 arch/s390/mm/fault.c       |  9 ++++++---
 arch/sh/mm/fault.c         |  4 ++++
 arch/sparc/mm/fault_32.c   |  3 +++
 arch/sparc/mm/fault_64.c   |  3 +++
 arch/um/kernel/trap.c      |  5 ++++-
 arch/unicore32/mm/fault.c  |  4 ++--
 arch/x86/mm/fault.c        |  6 +++++-
 arch/xtensa/mm/fault.c     |  3 +++
 23 files changed, 56 insertions(+), 34 deletions(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index d73dc473fbb9..46e5e420ad2a 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -150,7 +150,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
 	   the fault.
 	 */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 8df1638259f3..dc5f1b8859d2 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -141,17 +141,14 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	 */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if (fatal_signal_pending(current)) {
-
+	if (unlikely(fault & VM_FAULT_RETRY && signal_pending(current))) {
+		if (fatal_signal_pending(current) && !user_mode(regs))
+			goto no_context;
 		/*
 		 * if fault retry, mmap_sem already relinquished by core mm
 		 * so OK to return to user mode (with signal handled first)
 		 */
-		if (fault & VM_FAULT_RETRY) {
-			if (!user_mode(regs))
-				goto no_context;
-			return;
-		}
+		return;
 	}
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 58f69fa07df9..c41c021bbe40 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -314,12 +314,12 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 
 	fault = __do_page_fault(mm, addr, fsr, flags, tsk);
 
-	/* If we need to retry but a fatal signal is pending, handle the
+	/* If we need to retry but a signal is pending, handle the
 	 * signal first. We do not need to release the mmap_sem because
 	 * it would already be released in __lock_page_or_retry in
 	 * mm/filemap.c.
 	 */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
-		if (!user_mode(regs))
+	if (unlikely(fault & VM_FAULT_RETRY && signal_pending(current))) {
+		if (fatal_signal_pending(current) && !user_mode(regs))
 			goto no_context;
 		return 0;
 	}
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index efb7b2cbead5..a38ff8c49a66 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -512,13 +512,13 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
 
 	if (fault & VM_FAULT_RETRY) {
 		/*
-		 * If we need to retry but a fatal signal is pending,
+		 * If we need to retry but a signal is pending,
 		 * handle the signal first. We do not need to release
 		 * the mmap_sem because it would already be released
 		 * in __lock_page_or_retry in mm/filemap.c.
 		 */
-		if (fatal_signal_pending(current)) {
-			if (!user_mode(regs))
+		if (signal_pending(current)) {
+			if (fatal_signal_pending(current) && !user_mode(regs))
 				goto no_context;
 			return 0;
 		}
diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c
index eb263e61daf4..be10b441d9cc 100644
--- a/arch/hexagon/mm/vm_fault.c
+++ b/arch/hexagon/mm/vm_fault.c
@@ -104,7 +104,7 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs)
 
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	/* The most common case -- we are done.
 	 */
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index 5baeb022f474..62c2d39d2bed 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -163,7 +163,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 	 */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index 9b6163c05a75..d9808a807ab8 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -138,7 +138,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	fault = handle_mm_fault(vma, address, flags);
 	pr_debug("handle_mm_fault returns %x\n", fault);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return 0;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c
index 202ad6a494f5..4fd2dbd0c5ca 100644
--- a/arch/microblaze/mm/fault.c
+++ b/arch/microblaze/mm/fault.c
@@ -217,7 +217,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long address,
 	 */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 73d8a0f0b810..92374fd091d2 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -154,7 +154,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write,
 	 */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c
index 68d5f2a27f38..9f6e477b9e30 100644
--- a/arch/nds32/mm/fault.c
+++ b/arch/nds32/mm/fault.c
@@ -206,12 +206,12 @@ void do_page_fault(unsigned long entry, unsigned long addr,
 	fault = handle_mm_fault(vma, addr, flags);
 
 	/*
-	 * If we need to retry but a fatal signal is pending, handle the
+	 * If we need to retry but a signal is pending, handle the
 	 * signal first. We do not need to release the mmap_sem because it
 	 * would already be released in __lock_page_or_retry in mm/filemap.c.
 	 */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
-		if (!user_mode(regs))
+	if (fault & VM_FAULT_RETRY && signal_pending(current)) {
+		if (fatal_signal_pending(current) && !user_mode(regs))
 			goto no_context;
 		return;
 	}
diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
index 24fd84cf6006..5939434a31ae 100644
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -134,7 +134,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
 	 */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c
index dc4dbafc1d83..873ecb5d82d7 100644
--- a/arch/openrisc/mm/fault.c
+++ b/arch/openrisc/mm/fault.c
@@ -165,7 +165,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
index c8e8b7c05558..29422eec329d 100644
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -303,7 +303,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
 
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 887f11bcf330..aaa853e6592f 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -591,6 +591,8 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 			 */
 			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
+			if (is_user && signal_pending(current))
+				return 0;
 			if (!fatal_signal_pending(current))
 				goto retry;
 		}
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index 88401d5125bc..4fc8d746bec3 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -123,11 +123,11 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
 	fault = handle_mm_fault(vma, addr, flags);
 
 	/*
-	 * If we need to retry but a fatal signal is pending, handle the
+	 * If we need to retry but a signal is pending, handle the
 	 * signal first. We do not need to release the mmap_sem because it
 	 * would already be released in __lock_page_or_retry in mm/filemap.c.
 	 */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(tsk))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(tsk))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 11613362c4e7..aba1dad1efcd 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -476,9 +476,12 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
 	 * the fault.
 	 */
 	fault = handle_mm_fault(vma, address, flags);
-	/* No reason to continue if interrupted by SIGKILL. */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
-		fault = VM_FAULT_SIGNAL;
+	/* Do not continue if interrupted by signals. */
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current)) {
+		if (fatal_signal_pending(current))
+			fault = VM_FAULT_SIGNAL;
+		else
+			fault = 0;
 		if (flags & FAULT_FLAG_RETRY_NOWAIT)
 			goto out_up;
 		goto out;
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index 6defd2c6d9b1..baf5d73df40c 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -506,6 +506,10 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
 			 * have already released it in __lock_page_or_retry
 			 * in mm/filemap.c.
 			 */
+
+			if (user_mode(regs) && signal_pending(tsk))
+				return;
+
 			goto retry;
 		}
 	}
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index b0440b0edd97..a2c83104fe35 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -269,6 +269,9 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 			 * in mm/filemap.c.
 			 */
+
+			if (user_mode(regs) && signal_pending(tsk))
+				return;
+
 			goto retry;
 		}
 	}
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 8f8a604c1300..cad71ec5c7b3 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -467,6 +467,9 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 			 * in mm/filemap.c.
 			 */
+
+			if (user_mode(regs) && signal_pending(current))
+				return;
+
 			goto retry;
 		}
 	}
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index 0e8b6158f224..09baf37b65b9 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -76,8 +76,11 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 
 		fault = handle_mm_fault(vma, address, flags);
 
-		if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+		if (fault & VM_FAULT_RETRY && signal_pending(current)) {
+			if (is_user && !fatal_signal_pending(current))
+				err = 0;
 			goto out_nosemaphore;
+		}
 
 		if (unlikely(fault & VM_FAULT_ERROR)) {
 			if (fault & VM_FAULT_OOM) {
diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c
index b9a3a50644c1..3611f19234a1 100644
--- a/arch/unicore32/mm/fault.c
+++ b/arch/unicore32/mm/fault.c
@@ -248,11 +248,11 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 
 	fault = __do_pf(mm, addr, fsr, flags, tsk);
 
-	/* If we need to retry but a fatal signal is pending, handle the
+	/* If we need to retry but a signal is pending, handle the
 	 * signal first. We do not need to release the mmap_sem because
 	 * it would already be released in __lock_page_or_retry in
 	 * mm/filemap.c.
 	 */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return 0;
 
 	if (!(fault & VM_FAULT_ERROR) && (flags & FAULT_FLAG_ALLOW_RETRY)) {
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 9d5c75f02295..248ff0a28ecd 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1481,16 +1481,20 @@ void do_user_addr_fault(struct pt_regs *regs,
 	 * that we made any progress. Handle this case first.
*/ if (unlikely(fault & VM_FAULT_RETRY)) { + bool is_user = flags & FAULT_FLAG_USER; + /* Retry at most once */ if (flags & FAULT_FLAG_ALLOW_RETRY) { flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; + if (is_user && signal_pending(tsk)) + return; if (!fatal_signal_pending(tsk)) goto retry; } /* User mode? Just return to handle the fatal exception */ - if (flags & FAULT_FLAG_USER) + if (is_user) return; /* Not returning to user mode? Handle exceptions or die: */ diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c index 2ab0e0dcd166..792dad5e2f12 100644 --- a/arch/xtensa/mm/fault.c +++ b/arch/xtensa/mm/fault.c @@ -136,6 +136,9 @@ void do_page_fault(struct pt_regs *regs) * in mm/filemap.c. */ + if (user_mode(regs) && signal_pending(current)) + return; + goto retry; } }
From patchwork Tue Feb 12 02:56:09 2019 X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 10807245
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Shaohua Li , Marty McFadden , Andrea Arcangeli , Mike Kravetz , Denis Plotnikov , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr .
David Alan Gilbert" Subject: [PATCH v2 03/26] userfaultfd: don't retake mmap_sem to emulate NOPAGE Date: Tue, 12 Feb 2019 10:56:09 +0800 Message-Id: <20190212025632.28946-4-peterx@redhat.com> In-Reply-To: <20190212025632.28946-1-peterx@redhat.com> References: <20190212025632.28946-1-peterx@redhat.com>

The idea comes from the upstream discussion between Linus and Andrea: https://lkml.org/lkml/2017/10/30/560

A summary of the issue: handle_userfault() used to have a special path that returned VM_FAULT_NOPAGE when a non-fatal signal was detected while waiting for userfault handling; it did so by reacquiring the mmap_sem before returning. That is risky, because the VMAs may have changed by the time we retake the mmap_sem, so we could even end up holding a pointer to an invalid vma structure. This patch removes that risky path from handle_userfault(), so the callers of handle_mm_fault() can rely on the fact that the VMAs may have changed after a VM_FAULT_RETRY. Meanwhile, with the previous patch we do not lose responsiveness either, since the core mm code can now handle non-fatal userspace signals quickly even when we return VM_FAULT_RETRY.
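As an illustration only (a user-space model, not kernel code; every name below is invented for the sketch), the control flow this series converges on looks like the following: after handle_mm_fault() reports VM_FAULT_RETRY, the arch fault handler checks for a pending signal and returns so the signal is delivered first, instead of handle_userfault() retaking the mmap_sem to fake a VM_FAULT_NOPAGE.

```c
#include <assert.h>
#include <stdbool.h>

#define VM_FAULT_RETRY 0x1u

/* Illustrative stand-ins for kernel state; none of these are real
 * kernel APIs. */
static bool signal_pending_model;   /* models signal_pending(current) */
static int faults_completed;

/* Models handle_mm_fault(): the first attempt returns VM_FAULT_RETRY
 * (e.g. a userfaultfd region still waiting on its monitor); a later
 * attempt succeeds. */
static unsigned int handle_mm_fault_model(int attempt)
{
	if (attempt == 0)
		return VM_FAULT_RETRY;
	faults_completed++;
	return 0;
}

/* Models an arch page-fault handler after this series: on retry with
 * a pending (even non-fatal) signal, return so the signal is handled
 * first; the fault will simply be taken again afterwards. */
static int do_page_fault_model(void)
{
	int attempt = 0;

	for (;;) {
		unsigned int fault = handle_mm_fault_model(attempt);

		if (fault & VM_FAULT_RETRY) {
			if (signal_pending_model)
				return -1; /* back to user mode to run the handler */
			attempt++;
			continue;          /* otherwise retry the fault */
		}
		return 0;
	}
}
```

With no signal pending the model loops once and completes the fault; with a signal pending it bails out without touching any mm state, which is the whole point of dropping the NOPAGE emulation path.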
Suggested-by: Andrea Arcangeli Suggested-by: Linus Torvalds Signed-off-by: Peter Xu Reviewed-by: Jérôme Glisse --- fs/userfaultfd.c | 24 ------------------------ 1 file changed, 24 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 89800fc7dc9d..b397bc3b954d 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -514,30 +514,6 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) __set_current_state(TASK_RUNNING); - if (return_to_userland) { - if (signal_pending(current) && - !fatal_signal_pending(current)) { - /* - * If we got a SIGSTOP or SIGCONT and this is - * a normal userland page fault, just let - * userland return so the signal will be - * handled and gdb debugging works. The page - * fault code immediately after we return from - * this function is going to release the - * mmap_sem and it's not depending on it - * (unlike gup would if we were not to return - * VM_FAULT_RETRY). - * - * If a fatal signal is pending we still take - * the streamlined VM_FAULT_RETRY failure path - * and there's no need to retake the mmap_sem - * in such case. 
- */ - down_read(&mm->mmap_sem); - ret = VM_FAULT_NOPAGE; - } - } - /* * Here we race with the list_del; list_add in * userfaultfd_ctx_read(), however because we don't ever run
From patchwork Tue Feb 12 02:56:10 2019 X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 10807247
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Hugh Dickins , Maya Gokhale , Jerome Glisse , Pavel Emelyanov , Johannes Weiner , peterx@redhat.com, Martin
Cracauer , Shaohua Li , Marty McFadden , Andrea Arcangeli , Mike Kravetz , Denis Plotnikov , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . David Alan Gilbert" Subject: [PATCH v2 04/26] mm: allow VM_FAULT_RETRY for multiple times Date: Tue, 12 Feb 2019 10:56:10 +0800 Message-Id: <20190212025632.28946-5-peterx@redhat.com> In-Reply-To: <20190212025632.28946-1-peterx@redhat.com> References: <20190212025632.28946-1-peterx@redhat.com>

The idea comes from a discussion between Linus and Andrea [1]. Before this patch we only allowed a page fault to retry once, by clearing the FAULT_FLAG_ALLOW_RETRY flag before calling handle_mm_fault() the second time. This was mainly meant to avoid unexpected starvation of the system caused by looping forever over the fault on a single page. That should hardly happen, though: every code path that returns VM_FAULT_RETRY first waits for some condition to happen (and should yield the CPU during that wait) before VM_FAULT_RETRY is actually returned. This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY flag set when we receive VM_FAULT_RETRY, which means the page fault handler can now retry the fault multiple times if necessary, without generating another page fault event. Meanwhile we still set the FAULT_FLAG_TRIED flag, so the handler can still tell whether a fault is the first attempt or a retry.
One example is __lock_page_or_retry(): we now drop the mmap_sem only on the first attempt of the page fault and keep holding it across follow-up retries, so the old locking behavior is retained. GUP code is not touched yet and will be covered in a follow-up patch. This is a nice enhancement to the current code [2] and at the same time supporting material for the future userfaultfd-writeprotect work, since in that work there will always be an explicit userfault writeprotect retry for protected pages, and if that cannot resolve the page fault (e.g., when userfaultfd-writeprotect is used in conjunction with swapped pages) we will possibly need a third retry of the page fault. It might also benefit other potential users with similar requirements, such as userfault write-protection. Please read the thread below for more information. [1] https://lkml.org/lkml/2017/11/2/833 [2] https://lkml.org/lkml/2018/12/30/64 Suggested-by: Linus Torvalds Suggested-by: Andrea Arcangeli Signed-off-by: Peter Xu --- arch/alpha/mm/fault.c | 2 +- arch/arc/mm/fault.c | 1 - arch/arm/mm/fault.c | 3 --- arch/arm64/mm/fault.c | 5 ----- arch/hexagon/mm/vm_fault.c | 1 - arch/ia64/mm/fault.c | 1 - arch/m68k/mm/fault.c | 3 --- arch/microblaze/mm/fault.c | 1 - arch/mips/mm/fault.c | 1 - arch/nds32/mm/fault.c | 1 - arch/nios2/mm/fault.c | 3 --- arch/openrisc/mm/fault.c | 1 - arch/parisc/mm/fault.c | 2 -- arch/powerpc/mm/fault.c | 5 ----- arch/riscv/mm/fault.c | 5 ----- arch/s390/mm/fault.c | 5 +---- arch/sh/mm/fault.c | 1 - arch/sparc/mm/fault_32.c | 1 - arch/sparc/mm/fault_64.c | 1 - arch/um/kernel/trap.c | 1 - arch/unicore32/mm/fault.c | 6 +----- arch/x86/mm/fault.c | 1 - arch/xtensa/mm/fault.c | 1 - mm/filemap.c | 2 +- 24 files changed, 4 insertions(+), 50 deletions(-) diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c index 46e5e420ad2a..deae82bb83c1 100644 --- a/arch/alpha/mm/fault.c +++ b/arch/alpha/mm/fault.c @@ -169,7 +169,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; + flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would * have already released it in __lock_page_or_retry diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c index dc5f1b8859d2..664e18a8749f 100644 --- a/arch/arc/mm/fault.c +++ b/arch/arc/mm/fault.c @@ -167,7 +167,6 @@ void do_page_fault(unsigned long address, struct pt_regs *regs) } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index c41c021bbe40..7910b4b5205d 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -342,9 +342,6 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) regs, addr); } if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index a38ff8c49a66..d1d3c98f9ffb 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -523,12 +523,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, return 0; } - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk of - * starvation. 
- */ if (mm_flags & FAULT_FLAG_ALLOW_RETRY) { - mm_flags &= ~FAULT_FLAG_ALLOW_RETRY; mm_flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c index be10b441d9cc..576751597e77 100644 --- a/arch/hexagon/mm/vm_fault.c +++ b/arch/hexagon/mm/vm_fault.c @@ -115,7 +115,6 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index 62c2d39d2bed..9de95d39935e 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -189,7 +189,6 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index d9808a807ab8..b1b2109e4ab4 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -162,9 +162,6 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
*/ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index 4fd2dbd0c5ca..05a4847ac0bf 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -236,7 +236,6 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c index 92374fd091d2..9953b5b571df 100644 --- a/arch/mips/mm/fault.c +++ b/arch/mips/mm/fault.c @@ -178,7 +178,6 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, tsk->min_flt++; } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c index 9f6e477b9e30..32259afc751a 100644 --- a/arch/nds32/mm/fault.c +++ b/arch/nds32/mm/fault.c @@ -242,7 +242,6 @@ void do_page_fault(unsigned long entry, unsigned long addr, 1, regs, addr); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c index 5939434a31ae..9dd1c51acc22 100644 --- a/arch/nios2/mm/fault.c +++ b/arch/nios2/mm/fault.c @@ -158,9 +158,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
*/ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c index 873ecb5d82d7..ff92c5674781 100644 --- a/arch/openrisc/mm/fault.c +++ b/arch/openrisc/mm/fault.c @@ -185,7 +185,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, else tsk->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index 29422eec329d..7d3e96a9a7ab 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -327,8 +327,6 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; - /* * No need to up_read(&mm->mmap_sem) as we would * have already released it in __lock_page_or_retry diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index aaa853e6592f..becebfe67e32 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -585,11 +585,6 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, if (unlikely(fault & VM_FAULT_RETRY)) { /* We retry only once */ if (flags & FAULT_FLAG_ALLOW_RETRY) { - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. - */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; if (is_user && signal_pending(current)) return 0; diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 4fc8d746bec3..aad2c0557d2f 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -154,11 +154,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs) 1, regs, addr); } if (fault & VM_FAULT_RETRY) { - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
- */ - flags &= ~(FAULT_FLAG_ALLOW_RETRY); flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c index aba1dad1efcd..4e8c066964a9 100644 --- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -513,10 +513,7 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) fault = VM_FAULT_PFAULT; goto out_up; } - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. */ - flags &= ~(FAULT_FLAG_ALLOW_RETRY | - FAULT_FLAG_RETRY_NOWAIT); + flags &= ~FAULT_FLAG_RETRY_NOWAIT; flags |= FAULT_FLAG_TRIED; down_read(&mm->mmap_sem); goto retry; diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c index baf5d73df40c..cd710e2d7c57 100644 --- a/arch/sh/mm/fault.c +++ b/arch/sh/mm/fault.c @@ -498,7 +498,6 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, regs, address); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c index a2c83104fe35..6735cd1c09b9 100644 --- a/arch/sparc/mm/fault_32.c +++ b/arch/sparc/mm/fault_32.c @@ -261,7 +261,6 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write, 1, regs, address); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c index cad71ec5c7b3..28d5b4d012c6 100644 --- a/arch/sparc/mm/fault_64.c +++ b/arch/sparc/mm/fault_64.c @@ -459,7 +459,6 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs) 1, regs, address); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c index 09baf37b65b9..c63fc292aea0 100644 --- a/arch/um/kernel/trap.c +++ b/arch/um/kernel/trap.c @@ -99,7 +99,6 @@ int handle_page_fault(unsigned long 
address, unsigned long ip, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c index 3611f19234a1..fdf577956f5f 100644 --- a/arch/unicore32/mm/fault.c +++ b/arch/unicore32/mm/fault.c @@ -260,12 +260,8 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs) tsk->maj_flt++; else tsk->min_flt++; - if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; + if (fault & VM_FAULT_RETRY) goto retry; - } } up_read(&mm->mmap_sem); diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 248ff0a28ecd..71d68aa03e43 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1485,7 +1485,6 @@ void do_user_addr_fault(struct pt_regs *regs, /* Retry at most once */ if (flags & FAULT_FLAG_ALLOW_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; if (is_user && signal_pending(tsk)) return; diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c index 792dad5e2f12..7cd55f2d66c9 100644 --- a/arch/xtensa/mm/fault.c +++ b/arch/xtensa/mm/fault.c @@ -128,7 +128,6 @@ void do_page_fault(struct pt_regs *regs) else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/mm/filemap.c b/mm/filemap.c index 9f5e323e883e..44942c78bb92 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1351,7 +1351,7 @@ EXPORT_SYMBOL_GPL(__lock_page_killable); int __lock_page_or_retry(struct page *page, struct mm_struct *mm, unsigned int flags) { - if (flags & FAULT_FLAG_ALLOW_RETRY) { + if (!(flags & FAULT_FLAG_TRIED)) { /* * CAUTION! In this case, mmap_sem is not released * even though return 0.
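The semantics change above can be modeled in plain user-space C (the flag values and helper names below are illustrative, not the kernel's): before this patch, clearing FAULT_FLAG_ALLOW_RETRY capped a fault at a single retry; after it, the flag stays set and FAULT_FLAG_TRIED merely records that at least one attempt already happened.

```c
#include <assert.h>

/* Illustrative flag values; the real definitions live in the kernel's
 * include/linux/mm.h. */
#define FAULT_FLAG_ALLOW_RETRY 0x04u
#define FAULT_FLAG_TRIED       0x08u

/* Old behavior: the first retry clears ALLOW_RETRY, so a second
 * VM_FAULT_RETRY cannot be retried again. */
static int may_retry_old(unsigned int *flags)
{
	if (!(*flags & FAULT_FLAG_ALLOW_RETRY))
		return 0;
	*flags &= ~FAULT_FLAG_ALLOW_RETRY;
	*flags |= FAULT_FLAG_TRIED;
	return 1;
}

/* New behavior: ALLOW_RETRY stays set, so retries are unbounded;
 * TRIED only records history (e.g. so __lock_page_or_retry() drops
 * the mmap_sem only on the first attempt). */
static int may_retry_new(unsigned int *flags)
{
	if (!(*flags & FAULT_FLAG_ALLOW_RETRY))
		return 0;
	*flags |= FAULT_FLAG_TRIED;
	return 1;
}
```

The old helper permits exactly one retry per fault; the new one keeps permitting retries while still letting callers distinguish first attempts from repeats via FAULT_FLAG_TRIED.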
From patchwork Tue Feb 12 02:56:11 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807249
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A.
Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v2 05/26] mm: gup: allow VM_FAULT_RETRY for multiple times
Date: Tue, 12 Feb 2019 10:56:11 +0800
Message-Id: <20190212025632.28946-6-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

This is the gup counterpart of the change that allows VM_FAULT_RETRY to happen more than once.

Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
---
 mm/gup.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index fa75a03204c1..ba387aec0d80 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -528,7 +528,10 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
 	if (*flags & FOLL_NOWAIT)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;
 	if (*flags & FOLL_TRIED) {
-		VM_WARN_ON_ONCE(fault_flags & FAULT_FLAG_ALLOW_RETRY);
+		/*
+		 * Note: FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_TRIED
+		 * can co-exist
+		 */
 		fault_flags |= FAULT_FLAG_TRIED;
 	}

@@ -943,17 +946,23 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
 		/* VM_FAULT_RETRY triggered, so seek to the faulting offset */
 		pages += ret;
 		start += ret << PAGE_SHIFT;
+		lock_dropped = true;

+retry:
 		/*
 		 * Repeat on the address that fired VM_FAULT_RETRY
-		 * without FAULT_FLAG_ALLOW_RETRY but with
+		 * with both FAULT_FLAG_ALLOW_RETRY and
 		 * FAULT_FLAG_TRIED.
 		 */
 		*locked = 1;
-		lock_dropped = true;
 		down_read(&mm->mmap_sem);
 		ret = __get_user_pages(tsk, mm, start, 1, flags | FOLL_TRIED,
-				       pages, NULL, NULL);
+				       pages, NULL, locked);
+		if (!*locked) {
+			/* Continue to retry until we succeed */
+			BUG_ON(ret != 0);
+			goto retry;
+		}
 		if (ret != 1) {
 			BUG_ON(ret > 1);
 			if (!pages_done)

From patchwork Tue Feb 12 02:56:12 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807251
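The new retry label in __get_user_pages_locked() keeps re-faulting the same address until the fault handler comes back with mmap_sem still held. A standalone model of that control flow, with a stub standing in for __get_user_pages() (all names here are hypothetical, invented for illustration):

```c
#include <assert.h>

/* Stub for faulting in one page: while retries remain, pretend the
 * handler took the VM_FAULT_RETRY path (dropped the lock, returned 0);
 * afterwards, succeed with the lock still held. */
static int fault_one_page_stub(int *locked, int *retries_left)
{
	if (*retries_left > 0) {
		(*retries_left)--;
		*locked = 0;	/* mmap_sem was dropped */
		return 0;
	}
	*locked = 1;
	return 1;		/* one page faulted in */
}

/* Mirrors the retry: loop -- keep re-faulting the same address until
 * we return with the lock held. */
static int attempts_until_done(int retries_needed)
{
	int locked = 1;
	int attempts = 0;
	int ret;

	do {
		ret = fault_one_page_stub(&locked, &retries_needed);
		attempts++;
	} while (!locked);
	assert(ret == 1);
	return attempts;
}
```

Before this patch the loop body could run at most twice; with multiple retries allowed, the number of passes is bounded only by the handler eventually keeping the lock.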
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert", Pavel Emelyanov, Rik van Riel
Subject: [PATCH v2 06/26] userfaultfd: wp: add helper for writeprotect check
Date: Tue, 12 Feb 2019 10:56:12 +0800
Message-Id: <20190212025632.28946-7-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

From: Shaohua Li

Add a helper for the writeprotect check. It will be used by later patches.

Cc: Andrea Arcangeli
Cc: Pavel Emelyanov
Cc: Rik van Riel
Cc: Kirill A.
Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
Reviewed-by: Mike Rapoport
---
 include/linux/userfaultfd_k.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 37c9eba75c98..38f748e7186e 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -50,6 +50,11 @@ static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_UFFD_MISSING;
 }

+static inline bool userfaultfd_wp(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & VM_UFFD_WP;
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP);
@@ -94,6 +99,11 @@ static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 	return false;
 }

+static inline bool userfaultfd_wp(struct vm_area_struct *vma)
+{
+	return false;
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return false;

From patchwork Tue Feb 12 02:56:13 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807253
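The new userfaultfd_wp() helper is a one-line test of a vm_flags bit, in the same style as userfaultfd_missing(). A userspace sketch of that check (the flag values and struct below are illustrative stand-ins; the real VM_UFFD_* flags live in include/linux/mm.h):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values, not the kernel's actual VM_UFFD_* bits. */
#define VM_UFFD_MISSING 0x200UL
#define VM_UFFD_WP      0x1000UL

/* Stand-in for the relevant part of struct vm_area_struct. */
struct vma_model {
	unsigned long vm_flags;
};

/* Mirrors the new userfaultfd_wp() helper. */
static bool model_userfaultfd_wp(const struct vma_model *vma)
{
	return vma->vm_flags & VM_UFFD_WP;
}

/* Mirrors userfaultfd_armed(): true if any uffd mode is registered. */
static bool model_userfaultfd_armed(const struct vma_model *vma)
{
	return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP);
}
```

Callers in later patches can then gate the write-protect notification on this single predicate instead of open-coding the flag test.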
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr.
David Alan Gilbert"
Subject: [PATCH v2 07/26] userfaultfd: wp: hook userfault handler to write protection fault
Date: Tue, 12 Feb 2019 10:56:13 +0800
Message-Id: <20190212025632.28946-8-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

From: Andrea Arcangeli

There are several cases in which a write-protection fault can happen: a write to the zero page, to a swapped-out page, or to a userfaultfd write-protected page. When the fault happens there is no way to know whether userfaultfd had write-protected the page beforehand, so we blindly issue a userfault notification for any vma with VM_UFFD_WP set, regardless of whether the application has actually write-protected that page yet. The application should be ready to handle such wp faults.

v1: From: Shaohua Li

v2: Handle the userfault in the common do_wp_page. If we get there, a pagetable is present and read-only, so no further processing is needed until we resolve the userfault.

In the swapin case, always swap in as read-only. This will cause false positive userfaults. We need to decide later whether to eliminate them with a flag like soft-dirty in the swap entry (see _PAGE_SWP_SOFT_DIRTY). hugetlbfs wouldn't need to worry about swapouts, and tmpfs would be handled by a swap entry bit like anonymous memory.

The main problem, with no easy solution for eliminating the false positives, will arise if/when userfaultfd is extended to real filesystem pagecache: when the pagecache is freed by reclaim we can't leave the radix tree pinned if the inode, and in turn the radix tree, is reclaimed as well.
The estimate is that full accuracy and lack of false positives can easily be provided only for anonymous memory (as long as there's no fork, or as long as MADV_DONTFORK is used on the userfaultfd anonymous range), tmpfs and hugetlbfs; it is most certainly worth achieving, but in a later incremental patch.

v3: Add hooking point for THP wrprotect faults.

CC: Shaohua Li
Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
Reviewed-by: Mike Rapoport
---
 mm/memory.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index e11ca9dd823f..00781c43407b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2483,6 +2483,11 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;

+	if (userfaultfd_wp(vma)) {
+		pte_unmap_unlock(vmf->pte, vmf->ptl);
+		return handle_userfault(vmf, VM_UFFD_WP);
+	}
+
 	vmf->page = vm_normal_page(vma, vmf->address, vmf->orig_pte);
 	if (!vmf->page) {
 		/*
@@ -2800,6 +2805,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
 	pte = mk_pte(page, vma->vm_page_prot);
+	if (userfaultfd_wp(vma))
+		vmf->flags &= ~FAULT_FLAG_WRITE;
 	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
 		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
 		vmf->flags &= ~FAULT_FLAG_WRITE;
@@ -3684,8 +3691,11 @@ static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf)
 /* `inline' is required to avoid gcc 4.1.2 build error */
 static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd)
 {
-	if (vma_is_anonymous(vmf->vma))
+	if (vma_is_anonymous(vmf->vma)) {
+		if (userfaultfd_wp(vmf->vma))
+			return handle_userfault(vmf, VM_UFFD_WP);
 		return do_huge_pmd_wp_page(vmf, orig_pmd);
+	}
 	if (vmf->vma->vm_ops->huge_fault)
 		return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD);

From patchwork Tue Feb 12 02:56:14 2019
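The hooks added to do_wp_page() and wp_huge_pmd() route a write-protect fault to the userfault handler before any copy-on-write logic runs. A toy model of that dispatch decision (names and flag value invented for illustration, not the kernel's):

```c
#include <assert.h>

#define MODEL_VM_UFFD_WP 0x1000UL	/* illustrative flag value */

enum wp_action {
	WP_HANDLE_USERFAULT,	/* notify the userfaultfd monitor first */
	WP_DO_COW		/* fall through to normal copy-on-write */
};

/* Mirrors the check added at the top of do_wp_page(): if the vma is
 * registered for uffd-wp, the fault is handed to userspace before the
 * kernel touches the page. */
static enum wp_action model_wp_fault_action(unsigned long vm_flags)
{
	if (vm_flags & MODEL_VM_UFFD_WP)
		return WP_HANDLE_USERFAULT;
	return WP_DO_COW;
}
```

Because the test is on the vma flag, not on per-page state, every wp fault in a registered range is reported, which is exactly the source of the false positives the commit message discusses.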
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807255
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr.
David Alan Gilbert"
Subject: [PATCH v2 08/26] userfaultfd: wp: add WP pagetable tracking to x86
Date: Tue, 12 Feb 2019 10:56:14 +0800
Message-Id: <20190212025632.28946-9-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

From: Andrea Arcangeli

Accurate userfaultfd WP tracking is possible by tracking exactly which virtual memory ranges were writeprotected by userland. We can't rely only on the RW bit of the mapped pagetable because that information is destroyed by fork() or KSM or swap. If we were to rely on that, we'd need to stay on the safe side and generate false positive wp faults for every swapped out page.
Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
Reviewed-by: Mike Rapoport
---
 arch/x86/Kconfig                     |  1 +
 arch/x86/include/asm/pgtable.h       | 52 ++++++++++++++++++++++++++++
 arch/x86/include/asm/pgtable_64.h    |  8 ++++-
 arch/x86/include/asm/pgtable_types.h |  9 +++++
 include/asm-generic/pgtable.h        |  1 +
 include/asm-generic/pgtable_uffd.h   | 51 +++++++++++++++++++++++++++
 init/Kconfig                         |  5 +++
 7 files changed, 126 insertions(+), 1 deletion(-)
 create mode 100644 include/asm-generic/pgtable_uffd.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 68261430fe6e..cb43bc008675 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -209,6 +209,7 @@ config X86
 	select USER_STACKTRACE_SUPPORT
 	select VIRT_TO_BUS
 	select X86_FEATURE_NAMES		if PROC_FS
+	select HAVE_ARCH_USERFAULTFD_WP		if USERFAULTFD

 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 2779ace16d23..6863236e8484 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -23,6 +23,7 @@
 #ifndef __ASSEMBLY__
 #include <asm/x86_init.h>
+#include <asm-generic/pgtable_uffd.h>

 extern pgd_t early_top_pgt[PTRS_PER_PGD];
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
@@ -293,6 +294,23 @@ static inline pte_t pte_clear_flags(pte_t pte, pteval_t clear)
 	return native_make_pte(v & ~clear);
 }

+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pte_uffd_wp(pte_t pte)
+{
+	return pte_flags(pte) & _PAGE_UFFD_WP;
+}
+
+static inline pte_t pte_mkuffd_wp(pte_t pte)
+{
+	return pte_set_flags(pte, _PAGE_UFFD_WP);
+}
+
+static inline pte_t pte_clear_uffd_wp(pte_t pte)
+{
+	return pte_clear_flags(pte, _PAGE_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 static inline pte_t pte_mkclean(pte_t pte)
 {
 	return pte_clear_flags(pte, _PAGE_DIRTY);
@@ -372,6 +390,23 @@ static inline pmd_t pmd_clear_flags(pmd_t pmd, pmdval_t clear)
 	return native_make_pmd(v & ~clear);
 }

+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pmd_uffd_wp(pmd_t pmd)
+{
+	return pmd_flags(pmd) & _PAGE_UFFD_WP;
+}
+
+static inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+{
+	return pmd_set_flags(pmd, _PAGE_UFFD_WP);
+}
+
+static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd_clear_flags(pmd, _PAGE_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 static inline pmd_t pmd_mkold(pmd_t pmd)
 {
 	return pmd_clear_flags(pmd, _PAGE_ACCESSED);
@@ -1351,6 +1386,23 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
 #endif
 #endif

+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+{
+	return pte_set_flags(pte, _PAGE_SWP_UFFD_WP);
+}
+
+static inline int pte_swp_uffd_wp(pte_t pte)
+{
+	return pte_flags(pte) & _PAGE_SWP_UFFD_WP;
+}
+
+static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+{
+	return pte_clear_flags(pte, _PAGE_SWP_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 #define PKRU_AD_BIT 0x1
 #define PKRU_WD_BIT 0x2
 #define PKRU_BITS_PER_PKEY 2
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 9c85b54bf03c..e0c5d29b8685 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -189,7 +189,7 @@ extern void sync_global_pgds(unsigned long start, unsigned long end);
  *
  * | ...            | 11| 10|  9|8|7|6|5| 4| 3|2| 1|0| <- bit number
  * | ...            |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names
- * | TYPE (59-63) | ~OFFSET (9-58)  |0|0|X|X| X| X|X|SD|0| <- swp entry
+ * | TYPE (59-63) | ~OFFSET (9-58)  |0|0|X|X| X| X|F|SD|0| <- swp entry
  *
  * G (8) is aliased and used as a PROT_NONE indicator for
  * !present ptes.  We need to start storing swap entries above
@@ -197,9 +197,15 @@ extern void sync_global_pgds(unsigned long start, unsigned long end);
  * erratum where they can be incorrectly set by hardware on
  * non-present PTEs.
  *
+ * SD Bits 1-4 are not used in non-present format and available for
+ * special use described below:
+ *
  * SD (1) in swp entry is used to store soft dirty bit, which helps us
  * remember soft dirty over page migration
  *
+ * F (2) in swp entry is used to record when a pagetable is
+ * writeprotected by userfaultfd WP support.
+ *
  * Bit 7 in swp entry should be 0 because pmd_present checks not only P,
  * but also L and G.
  */
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index d6ff0bbdb394..8cebcff91e57 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -32,6 +32,7 @@
 #define _PAGE_BIT_SPECIAL	_PAGE_BIT_SOFTW1
 #define _PAGE_BIT_CPA_TEST	_PAGE_BIT_SOFTW1
+#define _PAGE_BIT_UFFD_WP	_PAGE_BIT_SOFTW2 /* userfaultfd wrprotected */
 #define _PAGE_BIT_SOFT_DIRTY	_PAGE_BIT_SOFTW3 /* software dirty tracking */
 #define _PAGE_BIT_DEVMAP	_PAGE_BIT_SOFTW4

@@ -100,6 +101,14 @@
 #define _PAGE_SWP_SOFT_DIRTY	(_AT(pteval_t, 0))
 #endif

+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+#define _PAGE_UFFD_WP		(_AT(pteval_t, 1) << _PAGE_BIT_UFFD_WP)
+#define _PAGE_SWP_UFFD_WP	_PAGE_USER
+#else
+#define _PAGE_UFFD_WP		(_AT(pteval_t, 0))
+#define _PAGE_SWP_UFFD_WP	(_AT(pteval_t, 0))
+#endif
+
 #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
 #define _PAGE_NX	(_AT(pteval_t, 1) << _PAGE_BIT_NX)
 #define _PAGE_DEVMAP	(_AT(u64, 1) << _PAGE_BIT_DEVMAP)
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 05e61e6c843f..f49afe951711 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -10,6 +10,7 @@
 #include <linux/mm_types.h>
 #include <linux/bug.h>
 #include <linux/errno.h>
+#include <asm-generic/pgtable_uffd.h>

 #if 5 - defined(__PAGETABLE_P4D_FOLDED) - defined(__PAGETABLE_PUD_FOLDED) - \
 	defined(__PAGETABLE_PMD_FOLDED) != CONFIG_PGTABLE_LEVELS
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
new file mode 100644
index 000000000000..643d1bf559c2
--- /dev/null
+++ b/include/asm-generic/pgtable_uffd.h
@@
-0,0 +1,51 @@ +#ifndef _ASM_GENERIC_PGTABLE_UFFD_H +#define _ASM_GENERIC_PGTABLE_UFFD_H + +#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP +static __always_inline int pte_uffd_wp(pte_t pte) +{ + return 0; +} + +static __always_inline int pmd_uffd_wp(pmd_t pmd) +{ + return 0; +} + +static __always_inline pte_t pte_mkuffd_wp(pte_t pte) +{ + return pte; +} + +static __always_inline pmd_t pmd_mkuffd_wp(pmd_t pmd) +{ + return pmd; +} + +static __always_inline pte_t pte_clear_uffd_wp(pte_t pte) +{ + return pte; +} + +static __always_inline pmd_t pmd_clear_uffd_wp(pmd_t pmd) +{ + return pmd; +} + +static __always_inline pte_t pte_swp_mkuffd_wp(pte_t pte) +{ + return pte; +} + +static __always_inline int pte_swp_uffd_wp(pte_t pte) +{ + return 0; +} + +static __always_inline pte_t pte_swp_clear_uffd_wp(pte_t pte) +{ + return pte; +} +#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */ + +#endif /* _ASM_GENERIC_PGTABLE_UFFD_H */ diff --git a/init/Kconfig b/init/Kconfig index c9386a365eea..892d61ddf2eb 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1424,6 +1424,11 @@ config ADVISE_SYSCALLS applications use these syscalls, you can disable this option to save space. 
+config HAVE_ARCH_USERFAULTFD_WP
+	bool
+	help
+	  Arch has userfaultfd write protection support
+
 config MEMBARRIER
 	bool "Enable membarrier() system call" if EXPERT
 	default y

From patchwork Tue Feb 12 02:56:15 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807257
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v2 09/26] userfaultfd: wp: userfaultfd_pte/huge_pmd_wp() helpers
Date: Tue, 12 Feb 2019 10:56:15 +0800
Message-Id: <20190212025632.28946-10-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

From: Andrea Arcangeli

Implement helper methods to invoke userfaultfd wp faults more
selectively: not merely when a wp fault triggers on a vma with
VM_UFFD_WP set in vma->vm_flags, but only if the _PAGE_UFFD_WP bit is
set in the pagetable too.
Signed-off-by: Andrea Arcangeli Signed-off-by: Peter Xu Reviewed-by: Jérôme Glisse Reviewed-by: Mike Rapoport --- include/linux/userfaultfd_k.h | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 38f748e7186e..c6590c58ce28 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -14,6 +14,8 @@ #include /* linux/include/uapi/linux/userfaultfd.h */ #include +#include +#include /* * CAREFUL: Check include/uapi/asm-generic/fcntl.h when defining @@ -55,6 +57,18 @@ static inline bool userfaultfd_wp(struct vm_area_struct *vma) return vma->vm_flags & VM_UFFD_WP; } +static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma, + pte_t pte) +{ + return userfaultfd_wp(vma) && pte_uffd_wp(pte); +} + +static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma, + pmd_t pmd) +{ + return userfaultfd_wp(vma) && pmd_uffd_wp(pmd); +} + static inline bool userfaultfd_armed(struct vm_area_struct *vma) { return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP); @@ -104,6 +118,19 @@ static inline bool userfaultfd_wp(struct vm_area_struct *vma) return false; } +static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma, + pte_t pte) +{ + return false; +} + +static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma, + pmd_t pmd) +{ + return false; +} + + static inline bool userfaultfd_armed(struct vm_area_struct *vma) { return false;

From patchwork Tue Feb 12 02:56:16 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807259
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr.
David Alan Gilbert"
Subject: [PATCH v2 10/26] userfaultfd: wp: add UFFDIO_COPY_MODE_WP
Date: Tue, 12 Feb 2019 10:56:16 +0800
Message-Id: <20190212025632.28946-11-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

From: Andrea Arcangeli

This allows UFFDIO_COPY to map pages wrprotected.

Signed-off-by: Andrea Arcangeli Signed-off-by: Peter Xu Reviewed-by: Jérôme Glisse Reviewed-by: Mike Rapoport --- fs/userfaultfd.c | 5 +++-- include/linux/userfaultfd_k.h | 2 +- include/uapi/linux/userfaultfd.h | 11 +++++----- mm/userfaultfd.c | 36 ++++++++++++++++++++++---------- 4 files changed, 35 insertions(+), 19 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index b397bc3b954d..3092885c9d2c 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1683,11 +1683,12 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, ret = -EINVAL; if (uffdio_copy.src + uffdio_copy.len <= uffdio_copy.src) goto out; - if (uffdio_copy.mode & ~UFFDIO_COPY_MODE_DONTWAKE) + if (uffdio_copy.mode & ~(UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP)) goto out; if (mmget_not_zero(ctx->mm)) { ret = mcopy_atomic(ctx->mm, uffdio_copy.dst, uffdio_copy.src, - uffdio_copy.len, &ctx->mmap_changing); + uffdio_copy.len, &ctx->mmap_changing, + uffdio_copy.mode); mmput(ctx->mm); } else { return -ESRCH; diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index c6590c58ce28..765ce884cec0 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -34,7 +34,7 @@ extern vm_fault_t
handle_userfault(struct vm_fault *vmf, unsigned long reason); extern ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long src_start, unsigned long len, - bool *mmap_changing); + bool *mmap_changing, __u64 mode); extern ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long len, diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index 48f1a7c2f1f0..297cb044c03f 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -203,13 +203,14 @@ struct uffdio_copy { __u64 dst; __u64 src; __u64 len; +#define UFFDIO_COPY_MODE_DONTWAKE ((__u64)1<<0) /* - * There will be a wrprotection flag later that allows to map - * pages wrprotected on the fly. And such a flag will be - * available if the wrprotection ioctl are implemented for the - * range according to the uffdio_register.ioctls. + * UFFDIO_COPY_MODE_WP will map the page wrprotected on the + * fly. UFFDIO_COPY_MODE_WP is available only if the + * wrprotection ioctl are implemented for the range according + * to the uffdio_register.ioctls. 
*/ -#define UFFDIO_COPY_MODE_DONTWAKE ((__u64)1<<0) +#define UFFDIO_COPY_MODE_WP ((__u64)1<<1) __u64 mode; /* diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index d59b5a73dfb3..73a208c5c1e7 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -25,7 +25,8 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, struct vm_area_struct *dst_vma, unsigned long dst_addr, unsigned long src_addr, - struct page **pagep) + struct page **pagep, + bool wp_copy) { struct mem_cgroup *memcg; pte_t _dst_pte, *dst_pte; @@ -71,9 +72,9 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, if (mem_cgroup_try_charge(page, dst_mm, GFP_KERNEL, &memcg, false)) goto out_release; - _dst_pte = mk_pte(page, dst_vma->vm_page_prot); - if (dst_vma->vm_flags & VM_WRITE) - _dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte)); + _dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot)); + if (dst_vma->vm_flags & VM_WRITE && !wp_copy) + _dst_pte = pte_mkwrite(_dst_pte); dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl); if (dst_vma->vm_file) { @@ -399,7 +400,8 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm, unsigned long dst_addr, unsigned long src_addr, struct page **page, - bool zeropage) + bool zeropage, + bool wp_copy) { ssize_t err; @@ -416,11 +418,13 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm, if (!(dst_vma->vm_flags & VM_SHARED)) { if (!zeropage) err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma, - dst_addr, src_addr, page); + dst_addr, src_addr, page, + wp_copy); else err = mfill_zeropage_pte(dst_mm, dst_pmd, dst_vma, dst_addr); } else { + VM_WARN_ON(wp_copy); /* WP only available for anon */ if (!zeropage) err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, @@ -438,7 +442,8 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, unsigned long src_start, unsigned long len, bool zeropage, - bool *mmap_changing) + bool *mmap_changing, + __u64 mode) { struct vm_area_struct *dst_vma; 
ssize_t err; @@ -446,6 +451,7 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, unsigned long src_addr, dst_addr; long copied; struct page *page; + bool wp_copy; /* * Sanitize the command parameters: @@ -502,6 +508,14 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, dst_vma->vm_flags & VM_SHARED)) goto out_unlock; + /* + * validate 'mode' now that we know the dst_vma: don't allow + * a wrprotect copy if the userfaultfd didn't register as WP. + */ + wp_copy = mode & UFFDIO_COPY_MODE_WP; + if (wp_copy && !(dst_vma->vm_flags & VM_UFFD_WP)) + goto out_unlock; + /* * If this is a HUGETLB vma, pass off to appropriate routine */ @@ -557,7 +571,7 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, BUG_ON(pmd_trans_huge(*dst_pmd)); err = mfill_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, - src_addr, &page, zeropage); + src_addr, &page, zeropage, wp_copy); cond_resched(); if (unlikely(err == -ENOENT)) { @@ -604,14 +618,14 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long src_start, unsigned long len, - bool *mmap_changing) + bool *mmap_changing, __u64 mode) { return __mcopy_atomic(dst_mm, dst_start, src_start, len, false, - mmap_changing); + mmap_changing, mode); } ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long start, unsigned long len, bool *mmap_changing) { - return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing); + return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing, 0); }

From patchwork Tue Feb 12 02:56:17 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807261
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v2 11/26] mm: merge parameters for change_protection()
Date: Tue, 12 Feb 2019 10:56:17 +0800
Message-Id: <20190212025632.28946-12-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

change_protection() is used by both the NUMA balancing and the
mprotect() code, and there is one parameter for each of the callers
(dirty_accountable and prot_numa). Further, these parameters are
passed along the call chain:

- change_protection_range()
- change_p4d_range()
- change_pud_range()
- change_pmd_range()
- ...

Now we introduce a flag argument for change_protection() and all these
helpers to replace those parameters, so we can avoid passing multiple
parameters multiple times along the way. More importantly, it will
greatly simplify the work if we want to introduce any new parameter to
change_protection(). In the follow-up patches, a new parameter for
userfaultfd write protection will be introduced.

No functional change at all.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jérôme Glisse
---
 include/linux/huge_mm.h |  2 +-
 include/linux/mm.h      | 14 +++++++++++++-
 mm/huge_memory.c        |  3 ++-
 mm/mempolicy.c          |  2 +-
 mm/mprotect.c           | 29 ++++++++++++++++-------------
 5 files changed, 33 insertions(+), 17 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 381e872bfde0..1550fb12dbd4 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -46,7 +46,7 @@ extern bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 			 pmd_t *old_pmd, pmd_t *new_pmd);
 extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 			unsigned long addr, pgprot_t newprot,
-			int prot_numa);
+			unsigned long cp_flags);
 vm_fault_t vmf_insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 			pmd_t *pmd, pfn_t pfn, bool write);
 vm_fault_t vmf_insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 80bb6408fe73..9fe3b0066324 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1646,9 +1646,21 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma,
 			      unsigned long old_addr, struct vm_area_struct *new_vma,
 			      unsigned long new_addr, unsigned long len,
 			      bool need_rmap_locks);
+
+/*
+ * Flags used by change_protection().  For now we make it a bitmap so
+ * that we can pass in multiple flags just like parameters.  However,
+ * for now all callers only use one of the flags at a time.
+ */
+/* Whether we should allow dirty bit accounting */
+#define  MM_CP_DIRTY_ACCT                  (1UL << 0)
+/* Whether this protection change is for NUMA hints */
+#define  MM_CP_PROT_NUMA                   (1UL << 1)
+
 extern unsigned long change_protection(struct vm_area_struct *vma, unsigned long start,
 			      unsigned long end, pgprot_t newprot,
-			      int dirty_accountable, int prot_numa);
+			      unsigned long cp_flags);
 extern int mprotect_fixup(struct vm_area_struct *vma,
 			  struct vm_area_struct **pprev, unsigned long start,
 			  unsigned long end, unsigned long newflags);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index faf357eaf0ce..8d65b0f041f9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1860,13 +1860,14 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
  *      - HPAGE_PMD_NR is protections changed and TLB flush necessary
  */
 int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
-		unsigned long addr, pgprot_t newprot, int prot_numa)
+		unsigned long addr, pgprot_t newprot, unsigned long cp_flags)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	spinlock_t *ptl;
 	pmd_t entry;
 	bool preserve_write;
 	int ret;
+	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
 
 	ptl = __pmd_trans_huge_lock(pmd, vma);
 	if (!ptl)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index d4496d9d34f5..233194f3d69a 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -554,7 +554,7 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
 {
 	int nr_updated;
 
-	nr_updated = change_protection(vma, addr, end, PAGE_NONE, 0, 1);
+	nr_updated = change_protection(vma, addr, end, PAGE_NONE, MM_CP_PROT_NUMA);
 	if (nr_updated)
 		count_vm_numa_events(NUMA_PTE_UPDATES, nr_updated);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 36cb358db170..a6ba448c8565 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -37,13 +37,15 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 		unsigned long addr, unsigned long end, pgprot_t newprot,
-		int dirty_accountable, int prot_numa)
+		unsigned long cp_flags)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pte_t *pte, oldpte;
 	spinlock_t *ptl;
 	unsigned long pages = 0;
 	int target_node = NUMA_NO_NODE;
+	bool dirty_accountable = cp_flags & MM_CP_DIRTY_ACCT;
+	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
 
 	/*
 	 * Can be called with only the mmap_sem for reading by
@@ -164,7 +166,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 
 static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		pud_t *pud, unsigned long addr, unsigned long end,
-		pgprot_t newprot, int dirty_accountable, int prot_numa)
+		pgprot_t newprot, unsigned long cp_flags)
 {
 	pmd_t *pmd;
 	unsigned long next;
@@ -194,7 +196,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 				__split_huge_pmd(vma, pmd, addr, false, NULL);
 			} else {
 				int nr_ptes = change_huge_pmd(vma, pmd, addr,
-						newprot, prot_numa);
+						newprot, cp_flags);
 
 				if (nr_ptes) {
 					if (nr_ptes == HPAGE_PMD_NR) {
@@ -209,7 +211,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 			/* fall through, the trans huge pmd just split */
 		}
 		this_pages = change_pte_range(vma, pmd, addr, next, newprot,
-					      dirty_accountable, prot_numa);
+					      cp_flags);
 		pages += this_pages;
 next:
 		cond_resched();
@@ -225,7 +227,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 
 static inline unsigned long change_pud_range(struct vm_area_struct *vma,
 		p4d_t *p4d, unsigned long addr, unsigned long end,
-		pgprot_t newprot, int dirty_accountable, int prot_numa)
+		pgprot_t newprot, unsigned long cp_flags)
 {
 	pud_t *pud;
 	unsigned long next;
@@ -237,7 +239,7 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma,
 		if (pud_none_or_clear_bad(pud))
 			continue;
 		pages += change_pmd_range(vma, pud, addr, next, newprot,
-					  dirty_accountable, prot_numa);
+					  cp_flags);
 	} while (pud++, addr = next, addr != end);
 
 	return pages;
@@ -245,7 +247,7 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma,
 
 static inline unsigned long change_p4d_range(struct vm_area_struct *vma,
 		pgd_t *pgd, unsigned long addr, unsigned long end,
-		pgprot_t newprot, int dirty_accountable, int prot_numa)
+		pgprot_t newprot, unsigned long cp_flags)
 {
 	p4d_t *p4d;
 	unsigned long next;
@@ -257,7 +259,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma,
 		if (p4d_none_or_clear_bad(p4d))
 			continue;
 		pages += change_pud_range(vma, p4d, addr, next, newprot,
-					  dirty_accountable, prot_numa);
+					  cp_flags);
 	} while (p4d++, addr = next, addr != end);
 
 	return pages;
@@ -265,7 +267,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma,
 
 static unsigned long change_protection_range(struct vm_area_struct *vma,
 		unsigned long addr, unsigned long end, pgprot_t newprot,
-		int dirty_accountable, int prot_numa)
+		unsigned long cp_flags)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pgd_t *pgd;
@@ -282,7 +284,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
 		if (pgd_none_or_clear_bad(pgd))
 			continue;
 		pages += change_p4d_range(vma, pgd, addr, next, newprot,
-					  dirty_accountable, prot_numa);
+					  cp_flags);
 	} while (pgd++, addr = next, addr != end);
 
 	/* Only flush the TLB if we actually modified any entries: */
@@ -295,14 +297,15 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
 
 unsigned long change_protection(struct vm_area_struct *vma, unsigned long start,
 		       unsigned long end, pgprot_t newprot,
-		       int dirty_accountable, int prot_numa)
+		       unsigned long cp_flags)
 {
 	unsigned long pages;
 
 	if (is_vm_hugetlb_page(vma))
 		pages = hugetlb_change_protection(vma, start, end, newprot);
 	else
-		pages = change_protection_range(vma, start, end, newprot, dirty_accountable, prot_numa);
+		pages = change_protection_range(vma, start, end, newprot,
+						cp_flags);
 
 	return pages;
 }
@@ -430,7 +433,7 @@ mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev,
 	vma_set_page_prot(vma);
 
 	change_protection(vma, start, end, vma->vm_page_prot,
-			  dirty_accountable, 0);
+			  dirty_accountable ? MM_CP_DIRTY_ACCT : 0);
 
 	/*
 	 * Private VM_LOCKED VMA becoming writable: trigger COW to avoid major

From patchwork Tue Feb 12 02:56:18 2019
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz,
    Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH v2 12/26] userfaultfd: wp: apply _PAGE_UFFD_WP bit
Date: Tue, 12 Feb 2019 10:56:18 +0800
Message-Id: <20190212025632.28946-13-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

First, introduce two new flags, MM_CP_UFFD_WP and MM_CP_UFFD_WP_RESOLVE,
for change_protection() when used with uffd-wp, and make sure the two
flags are mutually exclusive.  Then:

  - For MM_CP_UFFD_WP: apply the _PAGE_UFFD_WP bit and remove _PAGE_RW
    when a range of memory is write protected by uffd

  - For MM_CP_UFFD_WP_RESOLVE: remove the _PAGE_UFFD_WP bit and recover
    _PAGE_RW when write protection is resolved from userspace

Use this new interface in mwriteprotect_range() to replace the old
MM_CP_DIRTY_ACCT, and make the change for both PTEs and huge PMDs.  With
this we can start to distinguish PTEs/PMDs that are write protected for
general reasons (e.g., COW or soft dirty tracking) from those write
protected for userfaultfd-wp.

Since _PAGE_UFFD_WP should be preserved across pte_modify(), add it to
_PAGE_CHG_MASK as well.  Meanwhile, with this new bit we can be stricter
when detecting uffd-wp page faults in do_wp_page() and wp_huge_pmd().
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jérôme Glisse
Reviewed-by: Mike Rapoport
---
 arch/x86/include/asm/pgtable_types.h |  2 +-
 include/linux/mm.h                   |  5 +++++
 mm/huge_memory.c                     | 14 +++++++++++++-
 mm/memory.c                          |  4 ++--
 mm/mprotect.c                        | 12 ++++++++++++
 mm/userfaultfd.c                     |  8 ++++++--
 6 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 8cebcff91e57..dd9c6295d610 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -133,7 +133,7 @@
  */
 #define _PAGE_CHG_MASK	(PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT |		\
			 _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY |	\
-			 _PAGE_SOFT_DIRTY | _PAGE_DEVMAP)
+			 _PAGE_SOFT_DIRTY | _PAGE_DEVMAP | _PAGE_UFFD_WP)
 #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE)
 
 /*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9fe3b0066324..f38fbe9c8bc9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1657,6 +1657,11 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma,
 #define  MM_CP_DIRTY_ACCT                  (1UL << 0)
 /* Whether this protection change is for NUMA hints */
 #define  MM_CP_PROT_NUMA                   (1UL << 1)
+/* Whether this change is for write protecting */
+#define  MM_CP_UFFD_WP                     (1UL << 2) /* do wp */
+#define  MM_CP_UFFD_WP_RESOLVE             (1UL << 3) /* Resolve wp */
+#define  MM_CP_UFFD_WP_ALL                 (MM_CP_UFFD_WP | \
					    MM_CP_UFFD_WP_RESOLVE)
 
 extern unsigned long change_protection(struct vm_area_struct *vma, unsigned long start,
 			      unsigned long end, pgprot_t newprot,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8d65b0f041f9..817335b443c2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1868,6 +1868,8 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 	bool preserve_write;
 	int ret;
 	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
+	bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
+	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
 
 	ptl = __pmd_trans_huge_lock(pmd, vma);
 	if (!ptl)
@@ -1934,6 +1936,13 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 	entry = pmd_modify(entry, newprot);
 	if (preserve_write)
 		entry = pmd_mk_savedwrite(entry);
+	if (uffd_wp) {
+		entry = pmd_wrprotect(entry);
+		entry = pmd_mkuffd_wp(entry);
+	} else if (uffd_wp_resolve) {
+		entry = pmd_mkwrite(entry);
+		entry = pmd_clear_uffd_wp(entry);
+	}
 	ret = HPAGE_PMD_NR;
 	set_pmd_at(mm, addr, pmd, entry);
 	BUG_ON(vma_is_anonymous(vma) && !preserve_write && pmd_write(entry));
@@ -2083,7 +2092,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 	struct page *page;
 	pgtable_t pgtable;
 	pmd_t old_pmd, _pmd;
-	bool young, write, soft_dirty, pmd_migration = false;
+	bool young, write, soft_dirty, pmd_migration = false, uffd_wp = false;
 	unsigned long addr;
 	int i;
 
@@ -2165,6 +2174,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		write = pmd_write(old_pmd);
 		young = pmd_young(old_pmd);
 		soft_dirty = pmd_soft_dirty(old_pmd);
+		uffd_wp = pmd_uffd_wp(old_pmd);
 	}
 	VM_BUG_ON_PAGE(!page_count(page), page);
 	page_ref_add(page, HPAGE_PMD_NR - 1);
@@ -2198,6 +2208,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 			entry = pte_mkold(entry);
 			if (soft_dirty)
 				entry = pte_mksoft_dirty(entry);
+			if (uffd_wp)
+				entry = pte_mkuffd_wp(entry);
 		}
 		pte = pte_offset_map(&_pmd, addr);
 		BUG_ON(!pte_none(*pte));
diff --git a/mm/memory.c b/mm/memory.c
index 00781c43407b..f8d83ae16eff 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2483,7 +2483,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 
-	if (userfaultfd_wp(vma)) {
+	if (userfaultfd_pte_wp(vma, *vmf->pte)) {
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 		return handle_userfault(vmf, VM_UFFD_WP);
 	}
@@ -3692,7 +3692,7 @@ static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf)
 static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd)
 {
 	if (vma_is_anonymous(vmf->vma)) {
-		if (userfaultfd_wp(vmf->vma))
+		if (userfaultfd_huge_pmd_wp(vmf->vma, orig_pmd))
 			return handle_userfault(vmf, VM_UFFD_WP);
 		return do_huge_pmd_wp_page(vmf, orig_pmd);
 	}
diff --git a/mm/mprotect.c b/mm/mprotect.c
index a6ba448c8565..9d4433044c21 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -46,6 +46,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 	int target_node = NUMA_NO_NODE;
 	bool dirty_accountable = cp_flags & MM_CP_DIRTY_ACCT;
 	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
+	bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
+	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
 
 	/*
 	 * Can be called with only the mmap_sem for reading by
@@ -117,6 +119,14 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 			if (preserve_write)
 				ptent = pte_mk_savedwrite(ptent);
 
+			if (uffd_wp) {
+				ptent = pte_wrprotect(ptent);
+				ptent = pte_mkuffd_wp(ptent);
+			} else if (uffd_wp_resolve) {
+				ptent = pte_mkwrite(ptent);
+				ptent = pte_clear_uffd_wp(ptent);
+			}
+
 			/* Avoid taking write faults for known dirty pages */
 			if (dirty_accountable && pte_dirty(ptent) &&
 					(pte_soft_dirty(ptent) ||
@@ -301,6 +311,8 @@ unsigned long change_protection(struct vm_area_struct *vma, unsigned long start,
 {
 	unsigned long pages;
 
+	BUG_ON((cp_flags & MM_CP_UFFD_WP_ALL) == MM_CP_UFFD_WP_ALL);
+
 	if (is_vm_hugetlb_page(vma))
 		pages = hugetlb_change_protection(vma, start, end, newprot);
 	else
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 73a208c5c1e7..80bcd642911d 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -73,8 +73,12 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 		goto out_release;
 
 	_dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot));
-	if (dst_vma->vm_flags & VM_WRITE && !wp_copy)
-		_dst_pte = pte_mkwrite(_dst_pte);
+	if (dst_vma->vm_flags & VM_WRITE) {
+		if (wp_copy)
+			_dst_pte = pte_mkuffd_wp(_dst_pte);
+		else
+			_dst_pte = pte_mkwrite(_dst_pte);
+	}
 
 	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
 	if (dst_vma->vm_file) {
From patchwork Tue Feb 12 02:56:19 2019
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz,
    Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH v2 13/26] mm: export wp_page_copy()
Date: Tue, 12 Feb 2019 10:56:19 +0800
Message-Id: <20190212025632.28946-14-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

Export this function for use outside of the page fault handlers.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jérôme Glisse
---
 include/linux/mm.h | 2 ++
 mm/memory.c        | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index f38fbe9c8bc9..2fd14a62324b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -405,6 +405,8 @@ struct vm_fault {
 					 */
 };
 
+vm_fault_t wp_page_copy(struct vm_fault *vmf);
+
 /* page entry size for vm->huge_fault() */
 enum page_entry_size {
 	PE_SIZE_PTE = 0,
diff --git a/mm/memory.c b/mm/memory.c
index f8d83ae16eff..32d32b6e6339 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2239,7 +2239,7 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
  *   held to the old page, as well as updating the rmap.
  * - In any case, unlock the PTL and drop the reference we took to the old page.
  */
-static vm_fault_t wp_page_copy(struct vm_fault *vmf)
+vm_fault_t wp_page_copy(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	struct mm_struct *mm = vma->vm_mm;

From patchwork Tue Feb 12 02:56:20 2019
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz,
    Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH v2 14/26] userfaultfd: wp: handle COW properly for uffd-wp
Date: Tue, 12 Feb 2019 10:56:20 +0800
Message-Id: <20190212025632.28946-15-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

This allows uffd-wp to support write-protected pages for COW.

For example, a PTE that is write protected by uffd may also be write
protected for other reasons, such as COW or zero pages.  When that
happens, we cannot simply set the write bit in the PTE, since doing so
would change the content seen by every other reference to the page.
Instead, we should first do the COW if necessary, and then handle the
uffd-wp fault.  To copy the page correctly, we also need to carry over
the _PAGE_UFFD_WP bit if it was set in the original PTE.

For huge PMDs, we simply split the huge PMD whenever we want to resolve
an uffd-wp page fault, which matches what we do for general huge PMD
write protections.  In that way, the huge PMD copy-on-write problem is
reduced to the PTE copy-on-write problem.
Signed-off-by: Peter Xu
---
 mm/memory.c   |  2 ++
 mm/mprotect.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 32d32b6e6339..b5d67bafae35 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2291,6 +2291,8 @@ vm_fault_t wp_page_copy(struct vm_fault *vmf)
 	}
 	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 	entry = mk_pte(new_page, vma->vm_page_prot);
+	if (pte_uffd_wp(vmf->orig_pte))
+		entry = pte_mkuffd_wp(entry);
 	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 	/*
 	 * Clear the pte entry and flush it first, before updating the
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9d4433044c21..ae93721f3795 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -77,14 +77,13 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 		if (pte_present(oldpte)) {
 			pte_t ptent;
 			bool preserve_write = prot_numa && pte_write(oldpte);
+			struct page *page;

 			/*
 			 * Avoid trapping faults against the zero or KSM
 			 * pages. See similar comment in change_huge_pmd.
 			 */
 			if (prot_numa) {
-				struct page *page;
-
 				page = vm_normal_page(vma, addr, oldpte);
 				if (!page || PageKsm(page))
 					continue;
@@ -114,6 +113,46 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				continue;
 			}

+			/*
+			 * Detect whether we'll need to COW before
+			 * resolving an uffd-wp fault.  Note that this
+			 * includes detection of the zero page (where
+			 * page==NULL)
+			 */
+			if (uffd_wp_resolve) {
+				/* If the fault is resolved already, skip */
+				if (!pte_uffd_wp(*pte))
+					continue;
+				page = vm_normal_page(vma, addr, oldpte);
+				if (!page || page_mapcount(page) > 1) {
+					struct vm_fault vmf = {
+						.vma = vma,
+						.address = addr & PAGE_MASK,
+						.page = page,
+						.orig_pte = oldpte,
+						.pmd = pmd,
+						/* pte and ptl not needed */
+					};
+					vm_fault_t ret;
+
+					if (page)
+						get_page(page);
+					arch_leave_lazy_mmu_mode();
+					pte_unmap_unlock(pte, ptl);
+					ret = wp_page_copy(&vmf);
+					/* PTE is changed, or OOM */
+					if (ret == 0)
+						/* It's done by others */
+						continue;
+					else if (WARN_ON(ret != VM_FAULT_WRITE))
+						return pages;
+					pte = pte_offset_map_lock(vma->vm_mm,
+								  pmd, addr,
+								  &ptl);
+					arch_enter_lazy_mmu_mode();
+				}
+			}
+
 			ptent = ptep_modify_prot_start(mm, addr, pte);
 			ptent = pte_modify(ptent, newprot);
 			if (preserve_write)
@@ -183,6 +222,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 	unsigned long pages = 0;
 	unsigned long nr_huge_updates = 0;
 	struct mmu_notifier_range range;
+	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;

 	range.start = 0;
@@ -202,7 +242,16 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		}
 		if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) {
-			if (next - addr != HPAGE_PMD_SIZE) {
+			/*
+			 * When resolving a userfaultfd write
+			 * protection fault, it's not easy to identify
+			 * whether a THP is shared with others and
+			 * whether we'll need to do copy-on-write, so
+			 * just split it always for now to simplify
+			 * the procedure.  That is also the policy for
+			 * general THP write-protect in af9e4d5f2de2.
+			 */
+			if (next - addr != HPAGE_PMD_SIZE || uffd_wp_resolve) {
 				__split_huge_pmd(vma, pmd, addr, false, NULL);
 			} else {
 				int nr_ptes = change_huge_pmd(vma, pmd, addr,

From patchwork Tue Feb 12 02:56:21 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807269
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com,
    Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli,
    Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman,
    "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v2 15/26] userfaultfd: wp: drop _PAGE_UFFD_WP properly when fork
Date: Tue, 12 Feb 2019 10:56:21 +0800
Message-Id: <20190212025632.28946-16-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

UFFD_EVENT_FORK support for uffd-wp should already be there, except
that we should clear the uffd-wp bit when the uffd fork event is not
enabled.  Detect that case to avoid _PAGE_UFFD_WP being set on a VMA
that is not tracked by VM_UFFD_WP.  Do this for both small PTEs and
huge PMDs.

Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
Reviewed-by: Mike Rapoport
---
 mm/huge_memory.c | 8 ++++++++
 mm/memory.c      | 8 ++++++++
 2 files changed, 16 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 817335b443c2..fb2234cb595a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -938,6 +938,14 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	ret = -EAGAIN;
 	pmd = *src_pmd;

+	/*
+	 * Make sure the _PAGE_UFFD_WP bit is cleared if the new VMA
+	 * does not have VM_UFFD_WP, which means that the uffd fork
+	 * event is not enabled.
+	 */
+	if (!(vma->vm_flags & VM_UFFD_WP))
+		pmd = pmd_clear_uffd_wp(pmd);
+
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 	if (unlikely(is_swap_pmd(pmd))) {
 		swp_entry_t entry = pmd_to_swp_entry(pmd);
diff --git a/mm/memory.c b/mm/memory.c
index b5d67bafae35..c2035539e9fd 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -788,6 +788,14 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		pte = pte_mkclean(pte);
 	pte = pte_mkold(pte);

+	/*
+	 * Make sure the _PAGE_UFFD_WP bit is cleared if the new VMA
+	 * does not have VM_UFFD_WP, which means that the uffd fork
+	 * event is not enabled.
+	 */
+	if (!(vm_flags & VM_UFFD_WP))
+		pte = pte_clear_uffd_wp(pte);
+
 	page = vm_normal_page(vma, addr, pte);
 	if (page) {
 		get_page(page);

From patchwork Tue Feb 12 02:56:22 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807271
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com,
    Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli,
    Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman,
    "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v2 16/26] userfaultfd: wp: add pmd_swp_*uffd_wp() helpers
Date: Tue, 12 Feb 2019 10:56:22 +0800
Message-Id: <20190212025632.28946-17-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

Add the missing helpers for uffd-wp operations on pmd swap/migration
entries.

Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
Reviewed-by: Mike Rapoport
---
 arch/x86/include/asm/pgtable.h     | 15 +++++++++++++++
 include/asm-generic/pgtable_uffd.h | 15 +++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 6863236e8484..18a815d6f4ea 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1401,6 +1401,21 @@ static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
 {
 	return pte_clear_flags(pte, _PAGE_SWP_UFFD_WP);
 }
+
+static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+{
+	return pmd_set_flags(pmd, _PAGE_SWP_UFFD_WP);
+}
+
+static inline int pmd_swp_uffd_wp(pmd_t pmd)
+{
+	return pmd_flags(pmd) & _PAGE_SWP_UFFD_WP;
+}
+
+static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd_clear_flags(pmd, _PAGE_SWP_UFFD_WP);
+}
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */

 #define PKRU_AD_BIT 0x1
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
index 643d1bf559c2..828966d4c281 100644
--- a/include/asm-generic/pgtable_uffd.h
+++ b/include/asm-generic/pgtable_uffd.h
@@ -46,6 +46,21 @@ static __always_inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
 {
 	return pte;
 }
+
+static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static inline int pmd_swp_uffd_wp(pmd_t pmd)
+{
+	return 0;
+}
+
+static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */

 #endif /* _ASM_GENERIC_PGTABLE_UFFD_H */

From patchwork Tue Feb 12 02:56:23 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807273
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com,
    Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli,
    Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman,
    "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH v2 17/26] userfaultfd: wp: support swap and page migration
Date: Tue, 12 Feb 2019 10:56:23 +0800
Message-Id: <20190212025632.28946-18-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

For both swap and page migration, we use bit 2 of the entry to identify
whether the entry is uffd write-protected.  It plays a role similar to
the existing soft-dirty bit in swap entries, but only for keeping the
uffd-wp tracking of a specific PTE/PMD.

Something special here is that when we want to recover the uffd-wp bit
from a swap/migration entry into the PTE bit, we'll also need to make
sure the _PAGE_RW bit is cleared; otherwise, even with the
_PAGE_UFFD_WP bit set, we can't trap the write at all.

Note that this patch removes two lines from "userfaultfd: wp: hook
userfault handler to write protection fault", where we tried to remove
FAULT_FLAG_WRITE from vmf->flags when uffd-wp was set for the VMA.
This patch keeps the write flag there.
Signed-off-by: Peter Xu
Reviewed-by: Mike Rapoport
---
 include/linux/swapops.h | 2 ++
 mm/huge_memory.c        | 3 +++
 mm/memory.c             | 8 ++++++--
 mm/migrate.c            | 7 +++++++
 mm/mprotect.c           | 2 ++
 mm/rmap.c               | 6 ++++++
 6 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 4d961668e5fc..0c2923b1cdb7 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -68,6 +68,8 @@ static inline swp_entry_t pte_to_swp_entry(pte_t pte)
 	if (pte_swp_soft_dirty(pte))
 		pte = pte_swp_clear_soft_dirty(pte);
+	if (pte_swp_uffd_wp(pte))
+		pte = pte_swp_clear_uffd_wp(pte);
 	arch_entry = __pte_to_swp_entry(pte);
 	return swp_entry(__swp_type(arch_entry), __swp_offset(arch_entry));
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index fb2234cb595a..75de07141801 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2175,6 +2175,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		write = is_write_migration_entry(entry);
 		young = false;
 		soft_dirty = pmd_swp_soft_dirty(old_pmd);
+		uffd_wp = pmd_swp_uffd_wp(old_pmd);
 	} else {
 		page = pmd_page(old_pmd);
 		if (pmd_dirty(old_pmd))
@@ -2207,6 +2208,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 			entry = swp_entry_to_pte(swp_entry);
 			if (soft_dirty)
 				entry = pte_swp_mksoft_dirty(entry);
+			if (uffd_wp)
+				entry = pte_swp_mkuffd_wp(entry);
 		} else {
 			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
 			entry = maybe_mkwrite(entry, vma);
diff --git a/mm/memory.c b/mm/memory.c
index c2035539e9fd..7cee990d67cf 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -736,6 +736,8 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 				pte = swp_entry_to_pte(entry);
 				if (pte_swp_soft_dirty(*src_pte))
 					pte = pte_swp_mksoft_dirty(pte);
+				if (pte_swp_uffd_wp(*src_pte))
+					pte = pte_swp_mkuffd_wp(pte);
 				set_pte_at(src_mm, addr, src_pte, pte);
 			}
 		} else if (is_device_private_entry(entry)) {
@@ -2815,8 +2817,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
 	pte = mk_pte(page, vma->vm_page_prot);
-	if (userfaultfd_wp(vma))
-		vmf->flags &= ~FAULT_FLAG_WRITE;
 	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
 		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
 		vmf->flags &= ~FAULT_FLAG_WRITE;
@@ -2826,6 +2826,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	flush_icache_page(vma, page);
 	if (pte_swp_soft_dirty(vmf->orig_pte))
 		pte = pte_mksoft_dirty(pte);
+	if (pte_swp_uffd_wp(vmf->orig_pte)) {
+		pte = pte_mkuffd_wp(pte);
+		pte = pte_wrprotect(pte);
+	}
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
 	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
 	vmf->orig_pte = pte;
diff --git a/mm/migrate.c b/mm/migrate.c
index d4fd680be3b0..605ccd1f5c64 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -242,6 +242,11 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		if (is_write_migration_entry(entry))
 			pte = maybe_mkwrite(pte, vma);
+		if (pte_swp_uffd_wp(*pvmw.pte)) {
+			pte = pte_mkuffd_wp(pte);
+			pte = pte_wrprotect(pte);
+		}
+
 		if (unlikely(is_zone_device_page(new))) {
 			if (is_device_private_page(new)) {
 				entry = make_device_private_entry(new, pte_write(pte));
@@ -2290,6 +2295,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pte))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pte))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, addr, ptep, swp_pte);

 			/*
diff --git a/mm/mprotect.c b/mm/mprotect.c
index ae93721f3795..73a65f07fe41 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -187,6 +187,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 			newpte = swp_entry_to_pte(entry);
 			if (pte_swp_soft_dirty(oldpte))
 				newpte = pte_swp_mksoft_dirty(newpte);
+			if (pte_swp_uffd_wp(oldpte))
+				newpte = pte_swp_mkuffd_wp(newpte);
 			set_pte_at(mm, addr, pte, newpte);

 			pages++;
diff --git a/mm/rmap.c b/mm/rmap.c
index 0454ecc29537..3750d5a5283c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1469,6 +1469,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pteval))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, pvmw.address, pvmw.pte, swp_pte);
 			/*
 			 * No need to invalidate here it will synchronize on
@@ -1561,6 +1563,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pteval))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, address, pvmw.pte, swp_pte);
 			/*
 			 * No need to invalidate here it will synchronize on
@@ -1627,6 +1631,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pteval))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, address, pvmw.pte, swp_pte);
 			/* Invalidate as we cleared the pte */
 			mmu_notifier_invalidate_range(mm, address,

From patchwork Tue Feb 12 02:56:24 2019
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 18/26] khugepaged: skip collapse if uffd-wp detected
Date: Tue, 12 Feb 2019 10:56:24 +0800
Message-Id: <20190212025632.28946-19-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

Don't collapse the huge PMD if any of the small PTEs are userfault
write-protected.  The problem is that the write protection is tracked
at small-page granularity, and there is no way to keep all of that
per-page information once the small pages are merged into a huge PMD.

The same consideration applies to swap entries and migration entries,
so do the check for those as well, disregarding
khugepaged_max_ptes_swap.
Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
Reviewed-by: Mike Rapoport
---
 include/trace/events/huge_memory.h |  1 +
 mm/khugepaged.c                    | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index dd4db334bd63..2d7bad9cb976 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -13,6 +13,7 @@
 	EM( SCAN_PMD_NULL,		"pmd_null")			\
 	EM( SCAN_EXCEED_NONE_PTE,	"exceed_none_pte")		\
 	EM( SCAN_PTE_NON_PRESENT,	"pte_non_present")		\
+	EM( SCAN_PTE_UFFD_WP,		"pte_uffd_wp")			\
 	EM( SCAN_PAGE_RO,		"no_writable_page")		\
 	EM( SCAN_LACK_REFERENCED_PAGE,	"lack_referenced_page")	\
 	EM( SCAN_PAGE_NULL,		"page_null")			\
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 4f017339ddb2..396c7e4da83e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -29,6 +29,7 @@ enum scan_result {
 	SCAN_PMD_NULL,
 	SCAN_EXCEED_NONE_PTE,
 	SCAN_PTE_NON_PRESENT,
+	SCAN_PTE_UFFD_WP,
 	SCAN_PAGE_RO,
 	SCAN_LACK_REFERENCED_PAGE,
 	SCAN_PAGE_NULL,
@@ -1123,6 +1124,15 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 		pte_t pteval = *_pte;
 		if (is_swap_pte(pteval)) {
 			if (++unmapped <= khugepaged_max_ptes_swap) {
+				/*
+				 * Always be strict with uffd-wp
+				 * enabled swap entries.  Please see
+				 * comment below for pte_uffd_wp().
+				 */
+				if (pte_swp_uffd_wp(pteval)) {
+					result = SCAN_PTE_UFFD_WP;
+					goto out_unmap;
+				}
 				continue;
 			} else {
 				result = SCAN_EXCEED_SWAP_PTE;
@@ -1142,6 +1152,19 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 			result = SCAN_PTE_NON_PRESENT;
 			goto out_unmap;
 		}
+		if (pte_uffd_wp(pteval)) {
+			/*
+			 * Don't collapse the page if any of the small
+			 * PTEs are armed with uffd write protection.
+			 * Here we could also mark the new huge pmd as
+			 * write protected if any of the small ones is
+			 * marked, but that could bring unknown
+			 * userfault messages that fall outside of
+			 * the registered range.  So, just be simple.
+			 */
+			result = SCAN_PTE_UFFD_WP;
+			goto out_unmap;
+		}
 		if (pte_write(pteval))
 			writable = true;

From patchwork Tue Feb 12 02:56:25 2019
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 19/26] userfaultfd: introduce helper vma_find_uffd
Date: Tue, 12 Feb 2019 10:56:25 +0800
Message-Id: <20190212025632.28946-20-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

We have multiple places (and more coming) that need to find a
userfault-enabled VMA from an mm struct that covers a specific memory
range.  This patch introduces a helper for it and applies it to the
existing code.

Suggested-by: Mike Rapoport
Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
Reviewed-by: Mike Rapoport
---
 mm/userfaultfd.c | 54 +++++++++++++++++++++++++++---------------------
 1 file changed, 30 insertions(+), 24 deletions(-)

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 80bcd642911d..fefa81c301b7 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -20,6 +20,34 @@
 #include "internal.h"

+/*
+ * Find a valid userfault enabled VMA region that covers the whole
+ * address range, or NULL on failure.  Must be called with mmap_sem
+ * held.
+ */
+static struct vm_area_struct *vma_find_uffd(struct mm_struct *mm,
+					    unsigned long start,
+					    unsigned long len)
+{
+	struct vm_area_struct *vma = find_vma(mm, start);
+
+	if (!vma)
+		return NULL;
+
+	/*
+	 * Check the vma is registered in uffd, this is required to
+	 * enforce the VM_MAYWRITE check done at uffd registration
+	 * time.
+	 */
+	if (!vma->vm_userfaultfd_ctx.ctx)
+		return NULL;
+
+	if (start < vma->vm_start || start + len > vma->vm_end)
+		return NULL;
+
+	return vma;
+}
+
 static int mcopy_atomic_pte(struct mm_struct *dst_mm,
			    pmd_t *dst_pmd,
			    struct vm_area_struct *dst_vma,
@@ -228,20 +256,9 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
	 */
	if (!dst_vma) {
		err = -ENOENT;
-		dst_vma = find_vma(dst_mm, dst_start);
+		dst_vma = vma_find_uffd(dst_mm, dst_start, len);
		if (!dst_vma || !is_vm_hugetlb_page(dst_vma))
			goto out_unlock;
-		/*
-		 * Check the vma is registered in uffd, this is
-		 * required to enforce the VM_MAYWRITE check done at
-		 * uffd registration time.
-		 */
-		if (!dst_vma->vm_userfaultfd_ctx.ctx)
-			goto out_unlock;
-
-		if (dst_start < dst_vma->vm_start ||
-		    dst_start + len > dst_vma->vm_end)
-			goto out_unlock;

		err = -EINVAL;
		if (vma_hpagesize != vma_kernel_pagesize(dst_vma))
@@ -488,20 +505,9 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
	 * both valid and fully within a single existing vma.
	 */
	err = -ENOENT;
-	dst_vma = find_vma(dst_mm, dst_start);
+	dst_vma = vma_find_uffd(dst_mm, dst_start, len);
	if (!dst_vma)
		goto out_unlock;
-	/*
-	 * Check the vma is registered in uffd, this is required to
-	 * enforce the VM_MAYWRITE check done at uffd registration
-	 * time.
-	 */
-	if (!dst_vma->vm_userfaultfd_ctx.ctx)
-		goto out_unlock;
-
-	if (dst_start < dst_vma->vm_start ||
-	    dst_start + len > dst_vma->vm_end)
-		goto out_unlock;

	err = -EINVAL;
	/*

From patchwork Tue Feb 12 02:56:26 2019
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 20/26] userfaultfd: wp: support write protection for userfault vma range
Date: Tue, 12 Feb 2019 10:56:26 +0800
Message-Id: <20190212025632.28946-21-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

From: Shaohua Li

Add an API to enable/disable write protection for a vma range.  Unlike
mprotect, this doesn't split/merge vmas.

Cc: Andrea Arcangeli
Cc: Rik van Riel
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
[peterx:
 - use the helper to find VMA;
 - return -ENOENT if not found to match mcopy case;
 - use the new MM_CP_UFFD_WP* flags for change_protection;
 - check against mmap_changing for failures]
Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
Reviewed-by: Mike Rapoport
---
 include/linux/userfaultfd_k.h |  3 ++
 mm/userfaultfd.c              | 54 +++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 765ce884cec0..8f6e6ed544fb 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -39,6 +39,9 @@ extern ssize_t mfill_zeropage(struct mm_struct *dst_mm,
			      unsigned long dst_start,
			      unsigned long len,
			      bool *mmap_changing);
+extern int mwriteprotect_range(struct mm_struct *dst_mm,
+			       unsigned long start, unsigned long len,
+			       bool enable_wp, bool *mmap_changing);

 /* mm helpers */
 static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index fefa81c301b7..529d180bb4d7 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -639,3 +639,57 @@ ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long start,
 {
	return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing, 0);
 }
+
+int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
+			unsigned long len, bool enable_wp, bool *mmap_changing)
+{
+	struct vm_area_struct *dst_vma;
+	pgprot_t newprot;
+	int err;
+
+	/*
+	 * Sanitize the command parameters:
+	 */
+	BUG_ON(start & ~PAGE_MASK);
+	BUG_ON(len & ~PAGE_MASK);
+
+	/* Does the address range wrap, or is the span zero-sized? */
+	BUG_ON(start + len <= start);
+
+	down_read(&dst_mm->mmap_sem);
+
+	/*
+	 * If memory mappings are changing because of non-cooperative
+	 * operation (e.g. mremap) running in parallel, bail out and
+	 * request the user to retry later
+	 */
+	err = -EAGAIN;
+	if (mmap_changing && READ_ONCE(*mmap_changing))
+		goto out_unlock;
+
+	err = -ENOENT;
+	dst_vma = vma_find_uffd(dst_mm, start, len);
+	/*
+	 * Make sure the vma is not shared, that the dst range is
+	 * both valid and fully within a single existing vma.
+	 */
+	if (!dst_vma || (dst_vma->vm_flags & VM_SHARED))
+		goto out_unlock;
+	if (!userfaultfd_wp(dst_vma))
+		goto out_unlock;
+	if (!vma_is_anonymous(dst_vma))
+		goto out_unlock;
+
+	if (enable_wp)
+		newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE));
+	else
+		newprot = vm_get_page_prot(dst_vma->vm_flags);
+
+	change_protection(dst_vma, start, start + len, newprot,
+			  enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE);
+
+	err = 0;
+out_unlock:
+	up_read(&dst_mm->mmap_sem);
+	return err;
+}

From patchwork Tue Feb 12 02:56:27 2019
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr .
David Alan Gilbert"
Subject: [PATCH v2 21/26] userfaultfd: wp: add the writeprotect API to userfaultfd ioctl
Date: Tue, 12 Feb 2019 10:56:27 +0800
Message-Id: <20190212025632.28946-22-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

From: Andrea Arcangeli

v1: From: Shaohua Li
v2: cleanups, remove a branch.

[peterx writes up the commit message, as below...]

This patch introduces the new uffd-wp APIs for userspace. First, we allow UFFDIO_REGISTER with write protection tracking using the new UFFDIO_REGISTER_MODE_WP flag. Note that this flag can co-exist with the existing UFFDIO_REGISTER_MODE_MISSING, in which case the userspace program can not only resolve missing page faults but also track page data changes along the way. Second, we introduce the new UFFDIO_WRITEPROTECT API to do page-level write protection tracking. Note that the memory region must be registered with UFFDIO_REGISTER_MODE_WP before that.
Signed-off-by: Andrea Arcangeli
[peterx: remove useless block, write commit message, check against
 VM_MAYWRITE rather than VM_WRITE when register]
Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
---
 fs/userfaultfd.c                 | 82 +++++++++++++++++++++++++-------
 include/uapi/linux/userfaultfd.h | 11 +++++
 2 files changed, 77 insertions(+), 16 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 3092885c9d2c..81962d62520c 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -304,8 +304,11 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	if (!pmd_present(_pmd))
 		goto out;
-	if (pmd_trans_huge(_pmd))
+	if (pmd_trans_huge(_pmd)) {
+		if (!pmd_write(_pmd) && (reason & VM_UFFD_WP))
+			ret = true;
 		goto out;
+	}
 
 	/*
 	 * the pmd is stable (as in !pmd_trans_unstable) so we can re-read it
@@ -318,6 +321,8 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	 */
 	if (pte_none(*pte))
 		ret = true;
+	if (!pte_write(*pte) && (reason & VM_UFFD_WP))
+		ret = true;
 	pte_unmap(pte);
 
 out:
@@ -1251,10 +1256,13 @@ static __always_inline int validate_range(struct mm_struct *mm,
 	return 0;
 }
 
-static inline bool vma_can_userfault(struct vm_area_struct *vma)
+static inline bool vma_can_userfault(struct vm_area_struct *vma,
+				     unsigned long vm_flags)
 {
-	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
-	    vma_is_shmem(vma);
+	/* FIXME: add WP support to hugetlbfs and shmem */
+	return vma_is_anonymous(vma) ||
+		((is_vm_hugetlb_page(vma) || vma_is_shmem(vma)) &&
+		 !(vm_flags & VM_UFFD_WP));
 }
 
 static int userfaultfd_register(struct userfaultfd_ctx *ctx,
@@ -1286,15 +1294,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	vm_flags = 0;
 	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING)
 		vm_flags |= VM_UFFD_MISSING;
-	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) {
+	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP)
 		vm_flags |= VM_UFFD_WP;
-		/*
-		 * FIXME: remove the below error constraint by
-		 * implementing the wprotect tracking mode.
-		 */
-		ret = -EINVAL;
-		goto out;
-	}
 
 	ret = validate_range(mm, uffdio_register.range.start,
 			     uffdio_register.range.len);
@@ -1342,7 +1343,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 		/* check not compatible vmas */
 		ret = -EINVAL;
-		if (!vma_can_userfault(cur))
+		if (!vma_can_userfault(cur, vm_flags))
 			goto out_unlock;
 
 		/*
@@ -1370,6 +1371,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 			if (end & (vma_hpagesize - 1))
 				goto out_unlock;
 		}
+		if ((vm_flags & VM_UFFD_WP) && !(cur->vm_flags & VM_MAYWRITE))
+			goto out_unlock;
 
 		/*
 		 * Check that this vma isn't already owned by a
@@ -1399,7 +1402,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	do {
 		cond_resched();
-		BUG_ON(!vma_can_userfault(vma));
+		BUG_ON(!vma_can_userfault(vma, vm_flags));
 		BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
 		       vma->vm_userfaultfd_ctx.ctx != ctx);
 		WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
@@ -1534,7 +1537,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 		 * provides for more strict behavior to notice
 		 * unregistration errors.
 		 */
-		if (!vma_can_userfault(cur))
+		if (!vma_can_userfault(cur, cur->vm_flags))
 			goto out_unlock;
 
 		found = true;
@@ -1548,7 +1551,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 	do {
 		cond_resched();
-		BUG_ON(!vma_can_userfault(vma));
+		BUG_ON(!vma_can_userfault(vma, vma->vm_flags));
 
 		/*
 		 * Nothing to do: this vma is already registered into this
@@ -1761,6 +1764,50 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx,
 	return ret;
 }
 
+static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
+				    unsigned long arg)
+{
+	int ret;
+	struct uffdio_writeprotect uffdio_wp;
+	struct uffdio_writeprotect __user *user_uffdio_wp;
+	struct userfaultfd_wake_range range;
+
+	if (READ_ONCE(ctx->mmap_changing))
+		return -EAGAIN;
+
+	user_uffdio_wp = (struct uffdio_writeprotect __user *) arg;
+
+	if (copy_from_user(&uffdio_wp, user_uffdio_wp,
+			   sizeof(struct uffdio_writeprotect)))
+		return -EFAULT;
+
+	ret = validate_range(ctx->mm, uffdio_wp.range.start,
+			     uffdio_wp.range.len);
+	if (ret)
+		return ret;
+
+	if (uffdio_wp.mode & ~(UFFDIO_WRITEPROTECT_MODE_DONTWAKE |
+			       UFFDIO_WRITEPROTECT_MODE_WP))
+		return -EINVAL;
+	if ((uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP) &&
+	    (uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE))
+		return -EINVAL;
+
+	ret = mwriteprotect_range(ctx->mm, uffdio_wp.range.start,
+				  uffdio_wp.range.len, uffdio_wp.mode &
+				  UFFDIO_WRITEPROTECT_MODE_WP,
+				  &ctx->mmap_changing);
+	if (ret)
+		return ret;
+
+	if (!(uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) {
+		range.start = uffdio_wp.range.start;
+		range.len = uffdio_wp.range.len;
+		wake_userfault(ctx, &range);
+	}
+	return ret;
+}
+
 static inline unsigned int uffd_ctx_features(__u64 user_features)
 {
 	/*
@@ -1838,6 +1885,9 @@ static long userfaultfd_ioctl(struct file *file, unsigned cmd,
 	case UFFDIO_ZEROPAGE:
 		ret = userfaultfd_zeropage(ctx, arg);
 		break;
+	case UFFDIO_WRITEPROTECT:
+		ret = userfaultfd_writeprotect(ctx, arg);
+		break;
 	}
 	return ret;
 }
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 297cb044c03f..1b977a7a4435 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -52,6 +52,7 @@
 #define _UFFDIO_WAKE			(0x02)
 #define _UFFDIO_COPY			(0x03)
 #define _UFFDIO_ZEROPAGE		(0x04)
+#define _UFFDIO_WRITEPROTECT		(0x06)
 #define _UFFDIO_API			(0x3F)
 
 /* userfaultfd ioctl ids */
@@ -68,6 +69,8 @@
 			      struct uffdio_copy)
 #define UFFDIO_ZEROPAGE		_IOWR(UFFDIO, _UFFDIO_ZEROPAGE,	\
 			      struct uffdio_zeropage)
+#define UFFDIO_WRITEPROTECT	_IOWR(UFFDIO, _UFFDIO_WRITEPROTECT, \
+				      struct uffdio_writeprotect)
 
 /* read() structure */
 struct uffd_msg {
@@ -232,4 +235,12 @@ struct uffdio_zeropage {
 	__s64 zeropage;
 };
 
+struct uffdio_writeprotect {
+	struct uffdio_range range;
+	/* !WP means undo writeprotect. DONTWAKE is valid only with !WP */
+#define UFFDIO_WRITEPROTECT_MODE_WP		((__u64)1<<0)
+#define UFFDIO_WRITEPROTECT_MODE_DONTWAKE	((__u64)1<<1)
+	__u64 mode;
+};
+
 #endif /* _LINUX_USERFAULTFD_H */

From patchwork Tue Feb 12 02:56:28 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807283
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr .
David Alan Gilbert", Pavel Emelyanov, Rik van Riel
Subject: [PATCH v2 22/26] userfaultfd: wp: enabled write protection in userfaultfd API
Date: Tue, 12 Feb 2019 10:56:28 +0800
Message-Id: <20190212025632.28946-23-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

From: Shaohua Li

Now it's safe to enable write protection in the userfaultfd API.

Cc: Andrea Arcangeli
Cc: Pavel Emelyanov
Cc: Rik van Riel
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
---
 include/uapi/linux/userfaultfd.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 1b977a7a4435..a50f1ed24d23 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -19,7 +19,8 @@
  * means the userland is reading).
  */
 #define UFFD_API			((__u64)0xAA)
-#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK |		\
+#define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP |	\
+			   UFFD_FEATURE_EVENT_FORK |		\
 			   UFFD_FEATURE_EVENT_REMAP |		\
 			   UFFD_FEATURE_EVENT_REMOVE |		\
 			   UFFD_FEATURE_EVENT_UNMAP |		\
@@ -34,7 +35,8 @@
 #define UFFD_API_RANGE_IOCTLS			\
	((__u64)1 << _UFFDIO_WAKE |		\
	 (__u64)1 << _UFFDIO_COPY |		\
-	 (__u64)1 << _UFFDIO_ZEROPAGE)
+	 (__u64)1 << _UFFDIO_ZEROPAGE |		\
+	 (__u64)1 << _UFFDIO_WRITEPROTECT)
 #define UFFD_API_RANGE_IOCTLS_BASIC		\
	((__u64)1 << _UFFDIO_WAKE |		\
	 (__u64)1 << _UFFDIO_COPY)

From patchwork Tue Feb 12 02:56:29 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807285
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr .
David Alan Gilbert"
Subject: [PATCH v2 23/26] userfaultfd: wp: don't wake up when doing write protect
Date: Tue, 12 Feb 2019 10:56:29 +0800
Message-Id: <20190212025632.28946-24-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

It does not make sense to try to wake up any waiting thread when we're write-protecting a memory region. Only wake up when resolving a write protected page fault.

Signed-off-by: Peter Xu
Reviewed-by: Mike Rapoport
---
 fs/userfaultfd.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 81962d62520c..f1f61a0278c2 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1771,6 +1771,7 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
 	struct uffdio_writeprotect uffdio_wp;
 	struct uffdio_writeprotect __user *user_uffdio_wp;
 	struct userfaultfd_wake_range range;
+	bool mode_wp, mode_dontwake;
 
 	if (READ_ONCE(ctx->mmap_changing))
 		return -EAGAIN;
@@ -1789,18 +1790,20 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
 	if (uffdio_wp.mode & ~(UFFDIO_WRITEPROTECT_MODE_DONTWAKE |
 			       UFFDIO_WRITEPROTECT_MODE_WP))
 		return -EINVAL;
-	if ((uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP) &&
-	    (uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE))
+
+	mode_wp = uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP;
+	mode_dontwake = uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE;
+
+	if (mode_wp && mode_dontwake)
 		return -EINVAL;
 
 	ret = mwriteprotect_range(ctx->mm, uffdio_wp.range.start,
-				  uffdio_wp.range.len, uffdio_wp.mode &
-				  UFFDIO_WRITEPROTECT_MODE_WP,
+				  uffdio_wp.range.len, mode_wp,
 				  &ctx->mmap_changing);
 	if (ret)
 		return ret;
 
-	if (!(uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) {
+	if (!mode_wp && !mode_dontwake) {
 		range.start = uffdio_wp.range.start;
 		range.len = uffdio_wp.range.len;
 		wake_userfault(ctx, &range);

From patchwork Tue Feb 12 02:56:30 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807287
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, peterx@redhat.com, Martin Cracauer,
    Shaohua Li, Marty McFadden, Andrea Arcangeli, Mike Kravetz,
    Denis Plotnikov, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH v2 24/26] userfaultfd: wp: UFFDIO_REGISTER_MODE_WP documentation update
Date: Tue, 12 Feb 2019 10:56:30 +0800
Message-Id: <20190212025632.28946-25-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

From: Martin Cracauer

Adds documentation about the write protection support.

Signed-off-by: Andrea Arcangeli
[peterx: rewrite in rst format; fixups here and there]
Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
Reviewed-by: Mike Rapoport
---
 Documentation/admin-guide/mm/userfaultfd.rst | 51 ++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index 5048cf661a8a..c30176e67900 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -108,6 +108,57 @@ UFFDIO_COPY. They're atomic as in guaranteeing that nothing can see an
 half copied page since it'll keep userfaulting until the copy has
 finished.
 
+Notes:
+
+- If you requested UFFDIO_REGISTER_MODE_MISSING when registering then
+  you must provide some kind of page in your thread after reading from
+  the uffd.
+  You must provide either UFFDIO_COPY or UFFDIO_ZEROPAGE.
+  The normal behavior of the OS automatically providing a zero page on
+  an anonymous mapping is not in place.
+
+- None of the page-delivering ioctls default to the range that you
+  registered with.  You must fill in all fields for the appropriate
+  ioctl struct including the range.
+
+- You get the address of the access that triggered the missing page
+  event out of a struct uffd_msg that you read in the thread from the
+  uffd.  You can supply as many pages as you want with UFFDIO_COPY or
+  UFFDIO_ZEROPAGE.  Keep in mind that unless you used DONTWAKE then
+  the first of any of those IOCTLs wakes up the faulting thread.
+
+- Be sure to test for all errors including (pollfd[0].revents &
+  POLLERR).  This can happen, e.g. when ranges supplied were
+  incorrect.
+
+Write Protect Notifications
+---------------------------
+
+This is equivalent to (but faster than) using mprotect and a SIGSEGV
+signal handler.
+
+Firstly you need to register a range with UFFDIO_REGISTER_MODE_WP.
+Instead of using mprotect(2) you use ioctl(uffd, UFFDIO_WRITEPROTECT,
+struct uffdio_writeprotect *) with mode = UFFDIO_WRITEPROTECT_MODE_WP
+in the struct passed in.  The range does not default to and does not
+have to be identical to the range you registered with.  You can write
+protect as many ranges as you like (inside the registered range).
+Then, in the thread reading from uffd the struct will have
+msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP set.  Now you send
+ioctl(uffd, UFFDIO_WRITEPROTECT, struct uffdio_writeprotect *) again
+while pagefault.mode does not have UFFDIO_WRITEPROTECT_MODE_WP set.
+This wakes up the thread which will continue to run with writes.  This
+allows you to do the bookkeeping about the write in the uffd reading
+thread before the ioctl.
+
+If you registered with both UFFDIO_REGISTER_MODE_MISSING and
+UFFDIO_REGISTER_MODE_WP then you need to think about the sequence in
+which you supply a page and undo write protect.  Note that there is a
+difference between writes into a WP area and into a !WP area.  The
+former will have UFFD_PAGEFAULT_FLAG_WP set, the latter
+UFFD_PAGEFAULT_FLAG_WRITE.  The latter did not fail on protection but
+you still need to supply a page when UFFDIO_REGISTER_MODE_MISSING was
+used.
+
 QEMU/KVM
 ========

From patchwork Tue Feb 12 02:56:31 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807289
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, Martin Cracauer, Shaohua Li,
    Marty McFadden, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov,
    Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr.
    David Alan Gilbert"
Subject: [PATCH v2 25/26] userfaultfd: selftests: refactor statistics
Date: Tue, 12 Feb 2019 10:56:31 +0800
Message-Id: <20190212025632.28946-26-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

Introduce a uffd_stats structure for the statistics of the self test,
and at the same time refactor the code to always pass a uffd_stats
into both the read()- and poll()-based fault handling threads, instead
of using two different ways to return the statistic results.  No
functional change.

With the new structure, it's very easy to introduce new statistics.
Signed-off-by: Peter Xu
Reviewed-by: Mike Rapoport
---
 tools/testing/selftests/vm/userfaultfd.c | 76 +++++++++++++++---------
 1 file changed, 49 insertions(+), 27 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 5d1db824f73a..e5d12c209e09 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -88,6 +88,12 @@ static char *area_src, *area_src_alias, *area_dst, *area_dst_alias;
 static char *zeropage;
 pthread_attr_t attr;
 
+/* Userfaultfd test statistics */
+struct uffd_stats {
+	int cpu;
+	unsigned long missing_faults;
+};
+
 /* pthread_mutex_t starts at page offset 0 */
 #define area_mutex(___area, ___nr)					\
 	((pthread_mutex_t *) ((___area) + (___nr)*page_size))
@@ -127,6 +133,17 @@ static void usage(void)
 	exit(1);
 }
 
+static void uffd_stats_reset(struct uffd_stats *uffd_stats,
+			     unsigned long n_cpus)
+{
+	int i;
+
+	for (i = 0; i < n_cpus; i++) {
+		uffd_stats[i].cpu = i;
+		uffd_stats[i].missing_faults = 0;
+	}
+}
+
 static int anon_release_pages(char *rel_area)
 {
 	int ret = 0;
@@ -469,8 +486,8 @@ static int uffd_read_msg(int ufd, struct uffd_msg *msg)
 	return 0;
 }
 
-/* Return 1 if page fault handled by us; otherwise 0 */
-static int uffd_handle_page_fault(struct uffd_msg *msg)
+static void uffd_handle_page_fault(struct uffd_msg *msg,
+				   struct uffd_stats *stats)
 {
 	unsigned long offset;
 
@@ -485,18 +502,19 @@ static int uffd_handle_page_fault(struct uffd_msg *msg)
 	offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
 	offset &= ~(page_size-1);
 
-	return copy_page(uffd, offset);
+	if (copy_page(uffd, offset))
+		stats->missing_faults++;
 }
 
 static void *uffd_poll_thread(void *arg)
 {
-	unsigned long cpu = (unsigned long) arg;
+	struct uffd_stats *stats = (struct uffd_stats *)arg;
+	unsigned long cpu = stats->cpu;
 	struct pollfd pollfd[2];
 	struct uffd_msg msg;
 	struct uffdio_register uffd_reg;
 	int ret;
 	char tmp_chr;
-	unsigned long userfaults = 0;
 
 	pollfd[0].fd = uffd;
 	pollfd[0].events = POLLIN;
@@ -526,7 +544,7 @@ static void *uffd_poll_thread(void *arg)
 				msg.event), exit(1);
 			break;
 		case UFFD_EVENT_PAGEFAULT:
-			userfaults += uffd_handle_page_fault(&msg);
+			uffd_handle_page_fault(&msg, stats);
 			break;
 		case UFFD_EVENT_FORK:
 			close(uffd);
@@ -545,28 +563,27 @@ static void *uffd_poll_thread(void *arg)
 			break;
 		}
 	}
-	return (void *)userfaults;
+
+	return NULL;
 }
 
 pthread_mutex_t uffd_read_mutex = PTHREAD_MUTEX_INITIALIZER;
 
 static void *uffd_read_thread(void *arg)
 {
-	unsigned long *this_cpu_userfaults;
+	struct uffd_stats *stats = (struct uffd_stats *)arg;
 	struct uffd_msg msg;
 
-	this_cpu_userfaults = (unsigned long *) arg;
-	*this_cpu_userfaults = 0;
-
 	pthread_mutex_unlock(&uffd_read_mutex);
 	/* from here cancellation is ok */
 
 	for (;;) {
 		if (uffd_read_msg(uffd, &msg))
 			continue;
-		(*this_cpu_userfaults) += uffd_handle_page_fault(&msg);
+		uffd_handle_page_fault(&msg, stats);
 	}
-	return (void *)NULL;
+
+	return NULL;
 }
 
 static void *background_thread(void *arg)
@@ -582,13 +599,12 @@ static void *background_thread(void *arg)
 	return NULL;
 }
 
-static int stress(unsigned long *userfaults)
+static int stress(struct uffd_stats *uffd_stats)
 {
 	unsigned long cpu;
 	pthread_t locking_threads[nr_cpus];
 	pthread_t uffd_threads[nr_cpus];
 	pthread_t background_threads[nr_cpus];
-	void **_userfaults = (void **) userfaults;
 
 	finished = 0;
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
@@ -597,12 +613,13 @@ static int stress(unsigned long *userfaults)
 			return 1;
 		if (bounces & BOUNCE_POLL) {
 			if (pthread_create(&uffd_threads[cpu], &attr,
-					   uffd_poll_thread, (void *)cpu))
+					   uffd_poll_thread,
+					   (void *)&uffd_stats[cpu]))
 				return 1;
 		} else {
 			if (pthread_create(&uffd_threads[cpu], &attr,
 					   uffd_read_thread,
-					   &_userfaults[cpu]))
+					   (void *)&uffd_stats[cpu]))
 				return 1;
 			pthread_mutex_lock(&uffd_read_mutex);
 		}
@@ -639,7 +656,8 @@ static int stress(unsigned long *userfaults)
 				fprintf(stderr, "pipefd write error\n");
 				return 1;
 			}
-			if (pthread_join(uffd_threads[cpu], &_userfaults[cpu]))
+			if (pthread_join(uffd_threads[cpu],
+					 (void *)&uffd_stats[cpu]))
 				return 1;
 		} else {
 			if (pthread_cancel(uffd_threads[cpu]))
@@ -910,11 +928,11 @@ static int userfaultfd_events_test(void)
 {
 	struct uffdio_register uffdio_register;
 	unsigned long expected_ioctls;
-	unsigned long userfaults;
 	pthread_t uffd_mon;
 	int err, features;
 	pid_t pid;
 	char c;
+	struct uffd_stats stats = { 0 };
 
 	printf("testing events (fork, remap, remove): ");
 	fflush(stdout);
@@ -941,7 +959,7 @@ static int userfaultfd_events_test(void)
 			"unexpected missing ioctl for anon memory\n"),
 			exit(1);
 
-	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL))
+	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats))
 		perror("uffd_poll_thread create"), exit(1);
 
 	pid = fork();
@@ -957,13 +975,13 @@ static int userfaultfd_events_test(void)
 	if (write(pipefd[1], &c, sizeof(c)) != sizeof(c))
 		perror("pipe write"), exit(1);
-	if (pthread_join(uffd_mon, (void **)&userfaults))
+	if (pthread_join(uffd_mon, NULL))
 		return 1;
 
 	close(uffd);
-	printf("userfaults: %ld\n", userfaults);
+	printf("userfaults: %ld\n", stats.missing_faults);
 
-	return userfaults != nr_pages;
+	return stats.missing_faults != nr_pages;
 }
 
 static int userfaultfd_sig_test(void)
@@ -975,6 +993,7 @@ static int userfaultfd_sig_test(void)
 	int err, features;
 	pid_t pid;
 	char c;
+	struct uffd_stats stats = { 0 };
 
 	printf("testing signal delivery: ");
 	fflush(stdout);
@@ -1006,7 +1025,7 @@ static int userfaultfd_sig_test(void)
 	if (uffd_test_ops->release_pages(area_dst))
 		return 1;
 
-	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL))
+	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats))
 		perror("uffd_poll_thread create"), exit(1);
 
 	pid = fork();
@@ -1032,6 +1051,7 @@ static int userfaultfd_sig_test(void)
 	close(uffd);
 	return userfaults != 0;
 }
+
 static int userfaultfd_stress(void)
 {
 	void *area;
@@ -1040,7 +1060,7 @@ static int userfaultfd_stress(void)
 	unsigned long cpu;
 	int err;
-	unsigned long userfaults[nr_cpus];
+	struct uffd_stats uffd_stats[nr_cpus];
 
 	uffd_test_ops->allocate_area((void **)&area_src);
 	if (!area_src)
@@ -1169,8 +1189,10 @@ static int userfaultfd_stress(void)
 		if (uffd_test_ops->release_pages(area_dst))
 			return 1;
 
+		uffd_stats_reset(uffd_stats, nr_cpus);
+
 		/* bounce pass */
-		if (stress(userfaults))
+		if (stress(uffd_stats))
 			return 1;
 
 		/* unregister */
@@ -1213,7 +1235,7 @@ static int userfaultfd_stress(void)
 			printf("userfaults:");
 			for (cpu = 0; cpu < nr_cpus; cpu++)
-				printf(" %lu", userfaults[cpu]);
+				printf(" %lu", uffd_stats[cpu].missing_faults);
 			printf("\n");
 		}

From patchwork Tue Feb 12 02:56:32 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10807291
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Hugh Dickins, Maya Gokhale, Jerome Glisse,
    Pavel Emelyanov, Johannes Weiner, Martin Cracauer, Shaohua Li,
    Marty McFadden, Andrea Arcangeli, Mike Kravetz, Denis Plotnikov,
    Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr.
    David Alan Gilbert"
Subject: [PATCH v2 26/26] userfaultfd: selftests: add write-protect test
Date: Tue, 12 Feb 2019 10:56:32 +0800
Message-Id: <20190212025632.28946-27-peterx@redhat.com>
In-Reply-To: <20190212025632.28946-1-peterx@redhat.com>
References: <20190212025632.28946-1-peterx@redhat.com>

This patch adds uffd tests for write protection.  Instead of
introducing new tests for it, simply squash the uffd-wp tests into the
existing uffd-missing test cases.  The changes are:

(1) Bouncing tests

  We do the write protection in two ways during the bouncing test:

  - By using UFFDIO_COPY_MODE_WP when resolving MISSING pages: then
    we'll make sure that for each bounce process every single page
    will fault at least twice: once for MISSING, once for WP.

  - By directly calling UFFDIO_WRITEPROTECT on existing faulted
    memory: to further torture the explicit page protection
    procedures of uffd-wp, we split each bounce procedure into two
    halves (in the background thread).  The first half will be
    MISSING+WP for each page as explained above.  After the first
    half, we write protect the faulted region in the background
    thread to make sure at least half of the pages will be write
    protected again, which exercises the new UFFDIO_WRITEPROTECT
    call.  Then we continue with the second half, which will contain
    both MISSING and WP faulting tests for the second half and
    WP-only faults from the first half.

(2) Event/signal test

  Mostly the previous tests, but will do MISSING+WP for each page.
  For the sigbus-mode test we'll need to provide a standalone path to
  handle the write protection faults.
For all tests, do statistics as well for uffd-wp pages.

Signed-off-by: Peter Xu
---
 tools/testing/selftests/vm/userfaultfd.c | 154 ++++++++++++++++++-----
 1 file changed, 126 insertions(+), 28 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index e5d12c209e09..57b5ac02080a 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -56,6 +56,7 @@
 #include
 #include
 #include
+#include
 
 #include "../kselftest.h"
 
@@ -78,6 +79,8 @@ static int test_type;
 #define ALARM_INTERVAL_SECS 10
 static volatile bool test_uffdio_copy_eexist = true;
 static volatile bool test_uffdio_zeropage_eexist = true;
+/* Whether to test uffd write-protection */
+static bool test_uffdio_wp = false;
 
 static bool map_shared;
 static int huge_fd;
@@ -92,6 +95,7 @@ pthread_attr_t attr;
 struct uffd_stats {
 	int cpu;
 	unsigned long missing_faults;
+	unsigned long wp_faults;
 };
 
 /* pthread_mutex_t starts at page offset 0 */
@@ -141,9 +145,29 @@ static void uffd_stats_reset(struct uffd_stats *uffd_stats,
 	for (i = 0; i < n_cpus; i++) {
 		uffd_stats[i].cpu = i;
 		uffd_stats[i].missing_faults = 0;
+		uffd_stats[i].wp_faults = 0;
 	}
 }
 
+static void uffd_stats_report(struct uffd_stats *stats, int n_cpus)
+{
+	int i;
+	unsigned long long miss_total = 0, wp_total = 0;
+
+	for (i = 0; i < n_cpus; i++) {
+		miss_total += stats[i].missing_faults;
+		wp_total += stats[i].wp_faults;
+	}
+
+	printf("userfaults: %llu missing (", miss_total);
+	for (i = 0; i < n_cpus; i++)
+		printf("%lu+", stats[i].missing_faults);
+	printf("\b), %llu wp (", wp_total);
+	for (i = 0; i < n_cpus; i++)
+		printf("%lu+", stats[i].wp_faults);
+	printf("\b)\n");
+}
+
 static int anon_release_pages(char *rel_area)
 {
 	int ret = 0;
@@ -264,19 +288,15 @@ struct uffd_test_ops {
 	void (*alias_mapping)(__u64 *start, size_t len, unsigned long offset);
 };
 
-#define ANON_EXPECTED_IOCTLS		((1 << _UFFDIO_WAKE) | \
-					 (1 << _UFFDIO_COPY) | \
-					 (1 << _UFFDIO_ZEROPAGE))
-
 static struct uffd_test_ops anon_uffd_test_ops = {
-	.expected_ioctls = ANON_EXPECTED_IOCTLS,
+	.expected_ioctls = UFFD_API_RANGE_IOCTLS,
 	.allocate_area = anon_allocate_area,
 	.release_pages = anon_release_pages,
 	.alias_mapping = noop_alias_mapping,
 };
 
 static struct uffd_test_ops shmem_uffd_test_ops = {
-	.expected_ioctls = ANON_EXPECTED_IOCTLS,
+	.expected_ioctls = UFFD_API_RANGE_IOCTLS,
 	.allocate_area = shmem_allocate_area,
 	.release_pages = shmem_release_pages,
 	.alias_mapping = noop_alias_mapping,
@@ -300,6 +320,21 @@ static int my_bcmp(char *str1, char *str2, size_t n)
 	return 0;
 }
 
+static void wp_range(int ufd, __u64 start, __u64 len, bool wp)
+{
+	struct uffdio_writeprotect prms = { 0 };
+
+	/* Write protection page faults */
+	prms.range.start = start;
+	prms.range.len = len;
+	/* Undo write-protect, do wakeup after that */
+	prms.mode = wp ? UFFDIO_WRITEPROTECT_MODE_WP : 0;
+
+	if (ioctl(ufd, UFFDIO_WRITEPROTECT, &prms))
+		fprintf(stderr, "clear WP failed for address 0x%Lx\n",
+			start), exit(1);
+}
+
 static void *locking_thread(void *arg)
 {
 	unsigned long cpu = (unsigned long) arg;
@@ -438,7 +473,10 @@ static int __copy_page(int ufd, unsigned long offset, bool retry)
 	uffdio_copy.dst = (unsigned long) area_dst + offset;
 	uffdio_copy.src = (unsigned long) area_src + offset;
 	uffdio_copy.len = page_size;
-	uffdio_copy.mode = 0;
+	if (test_uffdio_wp)
+		uffdio_copy.mode = UFFDIO_COPY_MODE_WP;
+	else
+		uffdio_copy.mode = 0;
 	uffdio_copy.copy = 0;
 	if (ioctl(ufd, UFFDIO_COPY, &uffdio_copy)) {
 		/* real retval in ufdio_copy.copy */
@@ -495,15 +533,21 @@ static void uffd_handle_page_fault(struct uffd_msg *msg,
 		fprintf(stderr, "unexpected msg event %u\n",
 			msg->event), exit(1);
 
-	if (bounces & BOUNCE_VERIFY &&
-	    msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
-		fprintf(stderr, "unexpected write fault\n"), exit(1);
+	if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP) {
+		wp_range(uffd, msg->arg.pagefault.address, page_size, false);
+		stats->wp_faults++;
+	} else {
+		/* Missing page faults */
+		if (bounces & BOUNCE_VERIFY &&
+		    msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
+			fprintf(stderr, "unexpected write fault\n"), exit(1);
 
-	offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
-	offset &= ~(page_size-1);
+		offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
+		offset &= ~(page_size-1);
 
-	if (copy_page(uffd, offset))
-		stats->missing_faults++;
+		if (copy_page(uffd, offset))
+			stats->missing_faults++;
+	}
 }
 
 static void *uffd_poll_thread(void *arg)
@@ -589,11 +633,30 @@ static void *uffd_read_thread(void *arg)
 static void *background_thread(void *arg)
 {
 	unsigned long cpu = (unsigned long) arg;
-	unsigned long page_nr;
+	unsigned long page_nr, start_nr, mid_nr, end_nr;
 
-	for (page_nr = cpu * nr_pages_per_cpu;
-	     page_nr < (cpu+1) * nr_pages_per_cpu;
-	     page_nr++)
+	start_nr = cpu * nr_pages_per_cpu;
+	end_nr = (cpu+1) * nr_pages_per_cpu;
+	mid_nr = (start_nr + end_nr) / 2;
+
+	/* Copy the first half of the pages */
+	for (page_nr = start_nr; page_nr < mid_nr; page_nr++)
+		copy_page_retry(uffd, page_nr * page_size);
+
+	/*
+	 * If we need to test uffd-wp, set it up now.  Then we'll have
+	 * at least the first half of the pages mapped already which
+	 * can be write-protected for testing
+	 */
+	if (test_uffdio_wp)
+		wp_range(uffd, (unsigned long)area_dst + start_nr * page_size,
+			 nr_pages_per_cpu * page_size, true);
+
+	/*
+	 * Continue the 2nd half of the page copying, handling write
+	 * protection faults if any
+	 */
+	for (page_nr = mid_nr; page_nr < end_nr; page_nr++)
 		copy_page_retry(uffd, page_nr * page_size);
 
 	return NULL;
@@ -755,17 +818,31 @@ static int faulting_process(int signal_test)
 	}
 
 	for (nr = 0; nr < split_nr_pages; nr++) {
+		int steps = 1;
+		unsigned long offset = nr * page_size;
+
 		if (signal_test) {
 			if (sigsetjmp(*sigbuf, 1) != 0) {
-				if (nr == lastnr) {
+				if (steps == 1 && nr == lastnr) {
 					fprintf(stderr, "Signal repeated\n");
 					return 1;
 				}
 
 				lastnr = nr;
 				if (signal_test == 1) {
-					if (copy_page(uffd, nr * page_size))
-						signalled++;
+					if (steps == 1) {
+						/* This is a MISSING request */
+						steps++;
+						if (copy_page(uffd, offset))
+							signalled++;
+					} else {
+						/* This is a WP request */
+						assert(steps == 2);
+						wp_range(uffd,
+							 (__u64)area_dst +
+							 offset,
+							 page_size, false);
+					}
 				} else {
 					signalled++;
 					continue;
@@ -778,8 +855,13 @@ static int faulting_process(int signal_test)
 				fprintf(stderr,
 					"nr %lu memory corruption %Lu %Lu\n",
 					nr, count,
-					count_verify[nr]), exit(1);
-		}
+					count_verify[nr]);
+		}
+		/*
+		 * Trigger write protection if there is by writing
+		 * the same value back.
+		 */
+		*area_count(area_dst, nr) = count;
 	}
 
 	if (signal_test)
@@ -801,6 +883,11 @@ static int faulting_process(int signal_test)
 				nr, count, count_verify[nr]), exit(1);
 		}
+		/*
+		 * Trigger write protection if there is by writing
+		 * the same value back.
+		 */
+		*area_count(area_dst, nr) = count;
 	}
 
 	if (uffd_test_ops->release_pages(area_dst))
@@ -949,6 +1036,8 @@ static int userfaultfd_events_test(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		fprintf(stderr, "register failure\n"), exit(1);
@@ -979,7 +1068,8 @@ static int userfaultfd_events_test(void)
 		return 1;
 
 	close(uffd);
-	printf("userfaults: %ld\n", stats.missing_faults);
+
+	uffd_stats_report(&stats, 1);
 
 	return stats.missing_faults != nr_pages;
 }
@@ -1009,6 +1099,8 @@ static int userfaultfd_sig_test(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		fprintf(stderr, "register failure\n"), exit(1);
@@ -1141,6 +1233,8 @@ static int userfaultfd_stress(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) {
 		fprintf(stderr, "register failure\n");
 		return 1;
@@ -1195,6 +1289,11 @@ static int userfaultfd_stress(void)
 		if (stress(uffd_stats))
 			return 1;
 
+		/* Clear all the write protections if there is any */
+		if (test_uffdio_wp)
+			wp_range(uffd, (unsigned long)area_dst,
+				 nr_pages * page_size, false);
+
 		/* unregister */
 		if (ioctl(uffd, UFFDIO_UNREGISTER, &uffdio_register.range)) {
 			fprintf(stderr, "unregister failure\n");
@@ -1233,10 +1332,7 @@ static int userfaultfd_stress(void)
 		area_src_alias = area_dst_alias;
 		area_dst_alias = tmp_area;
 
-		printf("userfaults:");
-		for
(cpu = 0; cpu < nr_cpus; cpu++) - printf(" %lu", uffd_stats[cpu].missing_faults); - printf("\n"); + uffd_stats_report(uffd_stats, nr_cpus); } if (err) @@ -1276,6 +1372,8 @@ static void set_test_type(const char *type) if (!strcmp(type, "anon")) { test_type = TEST_ANON; uffd_test_ops = &anon_uffd_test_ops; + /* Only enable write-protect test for anonymous test */ + test_uffdio_wp = true; } else if (!strcmp(type, "hugetlb")) { test_type = TEST_HUGETLB; uffd_test_ops = &hugetlb_uffd_test_ops;