From patchwork Mon Jan 21 07:56:59 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772785
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Pavel Emelyanov, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A . Shutemov", "Dr .
David Alan Gilbert"
Subject: [PATCH RFC 01/24] mm: gup: rename "nonblocking" to "locked" where proper
Date: Mon, 21 Jan 2019 15:56:59 +0800
Message-Id: <20190121075722.7945-2-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

There are plenty of places around __get_user_pages() that have a
parameter "nonblocking" which does not really mean "it won't block"
(it can in fact block); instead, it indicates whether the mmap_sem is
released by up_read() during the page fault handling, mostly when
VM_FAULT_RETRY is returned.  We have the correct naming in e.g.
get_user_pages_locked() and get_user_pages_remote() as "locked", but
there are still many places that use the name "nonblocking".  Rename
those places to "locked" where proper, to better suit the
functionality of the variable.  While at it, fix up some of the
comments accordingly.

Signed-off-by: Peter Xu
Reviewed-by: Mike Rapoport
---
 mm/gup.c     | 44 +++++++++++++++++++++-----------------------
 mm/hugetlb.c |  8 ++++----
 2 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 8cb68a50dbdf..7b1f452cc2ef 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -506,12 +506,12 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
 }
 
 /*
- * mmap_sem must be held on entry. If @nonblocking != NULL and
- * *@flags does not include FOLL_NOWAIT, the mmap_sem may be released.
- * If it is, *@nonblocking will be set to 0 and -EBUSY returned.
+ * mmap_sem must be held on entry.
If @locked != NULL and *@flags + * does not include FOLL_NOWAIT, the mmap_sem may be released. If it + * is, *@locked will be set to 0 and -EBUSY returned. */ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma, - unsigned long address, unsigned int *flags, int *nonblocking) + unsigned long address, unsigned int *flags, int *locked) { unsigned int fault_flags = 0; vm_fault_t ret; @@ -523,7 +523,7 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma, fault_flags |= FAULT_FLAG_WRITE; if (*flags & FOLL_REMOTE) fault_flags |= FAULT_FLAG_REMOTE; - if (nonblocking) + if (locked) fault_flags |= FAULT_FLAG_ALLOW_RETRY; if (*flags & FOLL_NOWAIT) fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT; @@ -549,8 +549,8 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma, } if (ret & VM_FAULT_RETRY) { - if (nonblocking && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT)) - *nonblocking = 0; + if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT)) + *locked = 0; return -EBUSY; } @@ -627,7 +627,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags) * only intends to ensure the pages are faulted in. * @vmas: array of pointers to vmas corresponding to each page. * Or NULL if the caller does not require them. - * @nonblocking: whether waiting for disk IO or mmap_sem contention + * @locked: whether we're still with the mmap_sem held * * Returns number of pages pinned. This may be fewer than the number * requested. If nr_pages is 0 or negative, returns 0. If no pages @@ -656,13 +656,11 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags) * appropriate) must be called after the page is finished with, and * before put_page is called. * - * If @nonblocking != NULL, __get_user_pages will not wait for disk IO - * or mmap_sem contention, and if waiting is needed to pin all pages, - * *@nonblocking will be set to 0. 
Further, if @gup_flags does not - * include FOLL_NOWAIT, the mmap_sem will be released via up_read() in - * this case. + * If @locked != NULL, *@locked will be set to 0 when mmap_sem is + * released by an up_read(). That can happen if @gup_flags does not + * have FOLL_NOWAIT. + * - * A caller using such a combination of @nonblocking and @gup_flags + * A caller using such a combination of @locked and @gup_flags * must therefore hold the mmap_sem for reading only, and recognize * when it's been released. Otherwise, it must be held for either * reading or writing and will not be released. @@ -674,7 +672,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags) static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, unsigned long start, unsigned long nr_pages, unsigned int gup_flags, struct page **pages, - struct vm_area_struct **vmas, int *nonblocking) + struct vm_area_struct **vmas, int *locked) { long ret = 0, i = 0; struct vm_area_struct *vma = NULL; @@ -718,7 +716,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, if (is_vm_hugetlb_page(vma)) { i = follow_hugetlb_page(mm, vma, pages, vmas, &start, &nr_pages, i, - gup_flags, nonblocking); + gup_flags, locked); continue; } } @@ -736,7 +734,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, page = follow_page_mask(vma, start, foll_flags, &ctx); if (!page) { ret = faultin_page(tsk, vma, start, &foll_flags, - nonblocking); + locked); switch (ret) { case 0: goto retry; @@ -1195,7 +1193,7 @@ EXPORT_SYMBOL(get_user_pages_longterm); * @vma: target vma * @start: start address * @end: end address - * @nonblocking: + * @locked: whether the mmap_sem is still held * * This takes care of mlocking the pages too if VM_LOCKED is set. * @@ -1203,14 +1201,14 @@ EXPORT_SYMBOL(get_user_pages_longterm); * * vma->vm_mm->mmap_sem must be held.
* - * If @nonblocking is NULL, it may be held for read or write and will + * If @locked is NULL, it may be held for read or write and will * be unperturbed. - * If @nonblocking is non-NULL, it must held for read only and may be - * released. If it's released, *@nonblocking will be set to 0. + * If @locked is non-NULL, it must be held for read only and may be + * released. If it's released, *@locked will be set to 0. */ long populate_vma_page_range(struct vm_area_struct *vma, - unsigned long start, unsigned long end, int *nonblocking) + unsigned long start, unsigned long end, int *locked) { struct mm_struct *mm = vma->vm_mm; unsigned long nr_pages = (end - start) / PAGE_SIZE; @@ -1245,7 +1243,7 @@ long populate_vma_page_range(struct vm_area_struct *vma, * not result in a stack expansion that recurses back here. */ return __get_user_pages(current, mm, start, nr_pages, gup_flags, - NULL, NULL, nonblocking); + NULL, NULL, locked); } /* diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 705a3e9cc910..05b879bda10a 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4181,7 +4181,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, struct page **pages, struct vm_area_struct **vmas, unsigned long *position, unsigned long *nr_pages, - long i, unsigned int flags, int *nonblocking) + long i, unsigned int flags, int *locked) { unsigned long pfn_offset; unsigned long vaddr = *position; @@ -4252,7 +4252,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, spin_unlock(ptl); if (flags & FOLL_WRITE) fault_flags |= FAULT_FLAG_WRITE; - if (nonblocking) + if (locked) fault_flags |= FAULT_FLAG_ALLOW_RETRY; if (flags & FOLL_NOWAIT) fault_flags |= FAULT_FLAG_ALLOW_RETRY | @@ -4269,8 +4269,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, break; } if (ret & VM_FAULT_RETRY) { - if (nonblocking) - *nonblocking = 0; + if (locked) + *locked = 0; *nr_pages = 0; /* *
VM_FAULT_RETRY must not return an

From patchwork Mon Jan 21 07:57:00 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772787
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Pavel Emelyanov, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel
Gorman, "Kirill A . Shutemov", "Dr . David Alan Gilbert"
Subject: [PATCH RFC 02/24] mm: userfault: return VM_FAULT_RETRY on signals
Date: Mon, 21 Jan 2019 15:57:00 +0800
Message-Id: <20190121075722.7945-3-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

In the past, handle_userfault() had a special path that returned
VM_FAULT_NOPAGE when non-fatal signals were detected while waiting for
userfault handling; it did so by reacquiring the mmap_sem before
returning.  However, that brings the risk that the vmas might have
changed by the time we retake the mmap_sem, and we could even be
holding an invalid vma structure.  The problem was reported by syzbot.

This patch removes the special path, so we return VM_FAULT_RETRY via
the common path even when such signals are pending.  Then, for all the
architectures that pass FAULT_FLAG_ALLOW_RETRY into handle_mm_fault(),
we check not only for SIGKILL but for all pending userspace signals
right after we return from handle_mm_fault().
The idea comes from the upstream discussion between Linus and Andrea:

  https://lkml.org/lkml/2017/10/30/560

(This patch contains a potential fix for a double-free of mmap_sem on
ARC architecture; please see https://lkml.org/lkml/2018/11/1/723 for
more information)

Suggested-by: Linus Torvalds
Suggested-by: Andrea Arcangeli
Signed-off-by: Peter Xu
---
 arch/alpha/mm/fault.c      |  2 +-
 arch/arc/mm/fault.c        | 11 +++++++----
 arch/arm/mm/fault.c        | 14 ++++++++++----
 arch/arm64/mm/fault.c      |  6 +++---
 arch/hexagon/mm/vm_fault.c |  2 +-
 arch/ia64/mm/fault.c       |  2 +-
 arch/m68k/mm/fault.c       |  2 +-
 arch/microblaze/mm/fault.c |  2 +-
 arch/mips/mm/fault.c       |  2 +-
 arch/nds32/mm/fault.c      |  6 +++---
 arch/nios2/mm/fault.c      |  2 +-
 arch/openrisc/mm/fault.c   |  2 +-
 arch/parisc/mm/fault.c     |  2 +-
 arch/powerpc/mm/fault.c    |  4 +++-
 arch/riscv/mm/fault.c      |  4 ++--
 arch/s390/mm/fault.c       |  9 ++++++---
 arch/sh/mm/fault.c         |  4 ++++
 arch/sparc/mm/fault_32.c   |  3 +++
 arch/sparc/mm/fault_64.c   |  3 +++
 arch/um/kernel/trap.c      |  5 ++++-
 arch/unicore32/mm/fault.c  |  4 ++--
 arch/x86/mm/fault.c        | 12 +++++++++++-
 arch/xtensa/mm/fault.c     |  3 +++
 fs/userfaultfd.c           | 24 ------------------------
 24 files changed, 73 insertions(+), 57 deletions(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index d73dc473fbb9..46e5e420ad2a 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -150,7 +150,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr, the fault.
*/ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c index e2d9fc3fea01..91492d244ea6 100644 --- a/arch/arc/mm/fault.c +++ b/arch/arc/mm/fault.c @@ -142,11 +142,14 @@ void do_page_fault(unsigned long address, struct pt_regs *regs) fault = handle_mm_fault(vma, address, flags); /* If Pagefault was interrupted by SIGKILL, exit page fault "early" */ - if (unlikely(fatal_signal_pending(current))) { - if ((fault & VM_FAULT_ERROR) && !(fault & VM_FAULT_RETRY)) + if (unlikely(fatal_signal_pending(current) && user_mode(regs))) { + /* + * VM_FAULT_RETRY means we have released the mmap_sem, + * otherwise we need to drop it before leaving + */ + if (!(fault & VM_FAULT_RETRY)) up_read(&mm->mmap_sem); - if (user_mode(regs)) - return; + return; } perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index f4ea4c62c613..743077d19669 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -308,14 +308,20 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) fault = __do_page_fault(mm, addr, fsr, flags, tsk); - /* If we need to retry but a fatal signal is pending, handle the + /* If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because * it would already be released in __lock_page_or_retry in * mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - if (!user_mode(regs)) + if (fault & VM_FAULT_RETRY) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; - return 0; + else if (signal_pending(current)) + /* + * It's either a common signal, or a fatal + * signal but for the userspace, we return + * immediately. 
+ */ + return 0; } /* diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 7d9571f4ae3d..744d6451ea83 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -499,13 +499,13 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, if (fault & VM_FAULT_RETRY) { /* - * If we need to retry but a fatal signal is pending, + * If we need to retry but a signal is pending, * handle the signal first. We do not need to release * the mmap_sem because it would already be released * in __lock_page_or_retry in mm/filemap.c. */ - if (fatal_signal_pending(current)) { - if (!user_mode(regs)) + if (signal_pending(current)) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; return 0; } diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c index eb263e61daf4..be10b441d9cc 100644 --- a/arch/hexagon/mm/vm_fault.c +++ b/arch/hexagon/mm/vm_fault.c @@ -104,7 +104,7 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; /* The most common case -- we are done. 
*/ diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index 5baeb022f474..62c2d39d2bed 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -163,7 +163,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index 9b6163c05a75..d9808a807ab8 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -138,7 +138,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, fault = handle_mm_fault(vma, address, flags); pr_debug("handle_mm_fault returns %x\n", fault); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return 0; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index 202ad6a494f5..4fd2dbd0c5ca 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -217,7 +217,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c index 73d8a0f0b810..92374fd091d2 100644 --- a/arch/mips/mm/fault.c +++ b/arch/mips/mm/fault.c @@ -154,7 +154,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); diff --git a/arch/nds32/mm/fault.c 
b/arch/nds32/mm/fault.c index b740534b152c..72461745d3e1 100644 --- a/arch/nds32/mm/fault.c +++ b/arch/nds32/mm/fault.c @@ -207,12 +207,12 @@ void do_page_fault(unsigned long entry, unsigned long addr, fault = handle_mm_fault(vma, addr, flags); /* - * If we need to retry but a fatal signal is pending, handle the + * If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because it * would already be released in __lock_page_or_retry in mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - if (!user_mode(regs)) + if (fault & VM_FAULT_RETRY && signal_pending(current)) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; return; } diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c index 24fd84cf6006..5939434a31ae 100644 --- a/arch/nios2/mm/fault.c +++ b/arch/nios2/mm/fault.c @@ -134,7 +134,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c index dc4dbafc1d83..873ecb5d82d7 100644 --- a/arch/openrisc/mm/fault.c +++ b/arch/openrisc/mm/fault.c @@ -165,7 +165,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index c8e8b7c05558..29422eec329d 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -303,7 +303,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, fault = handle_mm_fault(vma, address, flags); - if ((fault & 
VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 1697e903bbf2..8bc0d091f13c 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -575,8 +575,10 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, */ flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; - if (!fatal_signal_pending(current)) + if (!signal_pending(current)) goto retry; + else if (!fatal_signal_pending(current) && is_user) + return 0; } /* diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 88401d5125bc..4fc8d746bec3 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -123,11 +123,11 @@ asmlinkage void do_page_fault(struct pt_regs *regs) fault = handle_mm_fault(vma, addr, flags); /* - * If we need to retry but a fatal signal is pending, handle the + * If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because it * would already be released in __lock_page_or_retry in mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(tsk)) + if ((fault & VM_FAULT_RETRY) && signal_pending(tsk)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c index 2b8f32f56e0c..19b4fb2fafab 100644 --- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -500,9 +500,12 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) * the fault. */ fault = handle_mm_fault(vma, address, flags); - /* No reason to continue if interrupted by SIGKILL. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - fault = VM_FAULT_SIGNAL; + /* Do not continue if interrupted by signals. 
*/ + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) { + if (fatal_signal_pending(current)) + fault = VM_FAULT_SIGNAL; + else + fault = 0; if (flags & FAULT_FLAG_RETRY_NOWAIT) goto out_up; goto out; diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c index 6defd2c6d9b1..baf5d73df40c 100644 --- a/arch/sh/mm/fault.c +++ b/arch/sh/mm/fault.c @@ -506,6 +506,10 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, * have already released it in __lock_page_or_retry * in mm/filemap.c. */ + + if (user_mode(regs) && signal_pending(tsk)) + return; + goto retry; } } diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c index b0440b0edd97..a2c83104fe35 100644 --- a/arch/sparc/mm/fault_32.c +++ b/arch/sparc/mm/fault_32.c @@ -269,6 +269,9 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write, * in mm/filemap.c. */ + if (user_mode(regs) && signal_pending(tsk)) + return; + goto retry; } } diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c index 8f8a604c1300..cad71ec5c7b3 100644 --- a/arch/sparc/mm/fault_64.c +++ b/arch/sparc/mm/fault_64.c @@ -467,6 +467,9 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs) * in mm/filemap.c. 
*/ + if (user_mode(regs) && signal_pending(current)) + return; + goto retry; } } diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c index 0e8b6158f224..09baf37b65b9 100644 --- a/arch/um/kernel/trap.c +++ b/arch/um/kernel/trap.c @@ -76,8 +76,11 @@ int handle_page_fault(unsigned long address, unsigned long ip, fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if (fault & VM_FAULT_RETRY && signal_pending(current)) { + if (is_user && !fatal_signal_pending(current)) + err = 0; goto out_nosemaphore; + } if (unlikely(fault & VM_FAULT_ERROR)) { if (fault & VM_FAULT_OOM) { diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c index b9a3a50644c1..3611f19234a1 100644 --- a/arch/unicore32/mm/fault.c +++ b/arch/unicore32/mm/fault.c @@ -248,11 +248,11 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs) fault = __do_pf(mm, addr, fsr, flags, tsk); - /* If we need to retry but a fatal signal is pending, handle the + /* If we need to retry but a signal is pending, handle the * signal first. We do not need to release the mmap_sem because * it would already be released in __lock_page_or_retry in * mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) return 0; if (!(fault & VM_FAULT_ERROR) && (flags & FAULT_FLAG_ALLOW_RETRY)) { diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 71d4b9d4d43f..b94ef0c2b98c 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1433,8 +1433,18 @@ void do_user_addr_fault(struct pt_regs *regs, if (flags & FAULT_FLAG_ALLOW_RETRY) { flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; - if (!fatal_signal_pending(tsk)) + if (!signal_pending(tsk)) goto retry; + else if (!fatal_signal_pending(tsk)) + /* + * There is a signal for the task but + * it's not fatal, let's return + * directly to the userspace. 
This + * gives chance for signals like + * SIGSTOP/SIGCONT to be handled + * faster, e.g., with GDB. + */ + return; } /* User mode? Just return to handle the fatal exception */ diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c index 2ab0e0dcd166..792dad5e2f12 100644 --- a/arch/xtensa/mm/fault.c +++ b/arch/xtensa/mm/fault.c @@ -136,6 +136,9 @@ void do_page_fault(struct pt_regs *regs) * in mm/filemap.c. */ + if (user_mode(regs) && signal_pending(current)) + return; + goto retry; } } diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 270d4888c6d5..bc9f6230a3f0 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -515,30 +515,6 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) __set_current_state(TASK_RUNNING); - if (return_to_userland) { - if (signal_pending(current) && - !fatal_signal_pending(current)) { - /* - * If we got a SIGSTOP or SIGCONT and this is - * a normal userland page fault, just let - * userland return so the signal will be - * handled and gdb debugging works. The page - * fault code immediately after we return from - * this function is going to release the - * mmap_sem and it's not depending on it - * (unlike gup would if we were not to return - * VM_FAULT_RETRY). - * - * If a fatal signal is pending we still take - * the streamlined VM_FAULT_RETRY failure path - * and there's no need to retake the mmap_sem - * in such case. 
- */ - down_read(&mm->mmap_sem); - ret = VM_FAULT_NOPAGE; - } - } - /* * Here we race with the list_del; list_add in * userfaultfd_ctx_read(), however because we don't ever run

From patchwork Mon Jan 21 07:57:01 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772789
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins , Maya Gokhale , Jerome Glisse , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Denis
Plotnikov , Shaohua Li , Andrea Arcangeli , Pavel Emelyanov , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . David Alan Gilbert"
Subject: [PATCH RFC 03/24] mm: allow VM_FAULT_RETRY for multiple times
Date: Mon, 21 Jan 2019 15:57:01 +0800
Message-Id: <20190121075722.7945-4-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

The idea comes from a discussion between Linus and Andrea [1].

Before this patch we only allowed a page fault to be retried once: we cleared the FAULT_FLAG_ALLOW_RETRY flag when calling handle_mm_fault() the second time. This was mainly meant to avoid starving the system by looping forever on the page fault of a single page. That should hardly happen, though: every code path that returns VM_FAULT_RETRY first waits for a condition (possibly yielding the CPU while it does) before VM_FAULT_RETRY is actually returned.

This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY flag set when we receive VM_FAULT_RETRY, so the page fault handler can now retry the fault multiple times if necessary without generating another page fault event. We still set the FAULT_FLAG_TRIED flag, so the handler can still tell whether a fault is the first attempt or not.

GUP code is not touched yet and will be covered in a follow-up patch.
This will be a nice enhancement to the current code, and at the same time supporting material for the future userfaultfd-writeprotect work: in that work there will always be an explicit userfault write-protect retry for protected pages, and if that cannot resolve the page fault (e.g., when userfaultfd-writeprotect is used in conjunction with shared memory) we may need a third retry of the page fault. It might also benefit other potential users with requirements similar to userfault write-protection. Please read the thread below for more information.

[1] https://lkml.org/lkml/2017/11/2/833

Suggested-by: Linus Torvalds
Suggested-by: Andrea Arcangeli
Signed-off-by: Peter Xu
---
 arch/alpha/mm/fault.c      | 2 +-
 arch/arc/mm/fault.c        | 1 -
 arch/arm/mm/fault.c        | 3 ---
 arch/arm64/mm/fault.c      | 5 -----
 arch/hexagon/mm/vm_fault.c | 1 -
 arch/ia64/mm/fault.c       | 1 -
 arch/m68k/mm/fault.c       | 3 ---
 arch/microblaze/mm/fault.c | 1 -
 arch/mips/mm/fault.c       | 1 -
 arch/nds32/mm/fault.c      | 1 -
 arch/nios2/mm/fault.c      | 3 ---
 arch/openrisc/mm/fault.c   | 1 -
 arch/parisc/mm/fault.c     | 2 --
 arch/powerpc/mm/fault.c    | 5 -----
 arch/riscv/mm/fault.c      | 5 -----
 arch/s390/mm/fault.c       | 5 +----
 arch/sh/mm/fault.c         | 1 -
 arch/sparc/mm/fault_32.c   | 1 -
 arch/sparc/mm/fault_64.c   | 1 -
 arch/um/kernel/trap.c      | 1 -
 arch/unicore32/mm/fault.c  | 6 +-----
 arch/x86/mm/fault.c        | 1 -
 arch/xtensa/mm/fault.c     | 1 -
 23 files changed, 3 insertions(+), 49 deletions(-)
diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c index 46e5e420ad2a..deae82bb83c1 100644 --- a/arch/alpha/mm/fault.c +++ b/arch/alpha/mm/fault.c @@ -169,7 +169,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; + flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would * have already released it in __lock_page_or_retry diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c index 91492d244ea6..7f48b377028c 100644 ---
a/arch/arc/mm/fault.c +++ b/arch/arc/mm/fault.c @@ -168,7 +168,6 @@ void do_page_fault(unsigned long address, struct pt_regs *regs) } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index 743077d19669..377781d8491a 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -342,9 +342,6 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) regs, addr); } if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 744d6451ea83..8a26e03fc2bf 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -510,12 +510,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, return 0; } - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk of - * starvation. 
- */ if (mm_flags & FAULT_FLAG_ALLOW_RETRY) { - mm_flags &= ~FAULT_FLAG_ALLOW_RETRY; mm_flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c index be10b441d9cc..576751597e77 100644 --- a/arch/hexagon/mm/vm_fault.c +++ b/arch/hexagon/mm/vm_fault.c @@ -115,7 +115,6 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; } diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index 62c2d39d2bed..9de95d39935e 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -189,7 +189,6 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index d9808a807ab8..b1b2109e4ab4 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -162,9 +162,6 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
*/ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index 4fd2dbd0c5ca..05a4847ac0bf 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -236,7 +236,6 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c index 92374fd091d2..9953b5b571df 100644 --- a/arch/mips/mm/fault.c +++ b/arch/mips/mm/fault.c @@ -178,7 +178,6 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, tsk->min_flt++; } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c index 72461745d3e1..f0b775cb5cdf 100644 --- a/arch/nds32/mm/fault.c +++ b/arch/nds32/mm/fault.c @@ -237,7 +237,6 @@ void do_page_fault(unsigned long entry, unsigned long addr, else tsk->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c index 5939434a31ae..9dd1c51acc22 100644 --- a/arch/nios2/mm/fault.c +++ b/arch/nios2/mm/fault.c @@ -158,9 +158,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
*/ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c index 873ecb5d82d7..ff92c5674781 100644 --- a/arch/openrisc/mm/fault.c +++ b/arch/openrisc/mm/fault.c @@ -185,7 +185,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, else tsk->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index 29422eec329d..7d3e96a9a7ab 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -327,8 +327,6 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; - /* * No need to up_read(&mm->mmap_sem) as we would * have already released it in __lock_page_or_retry diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 8bc0d091f13c..8bdc7e75d2e5 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -569,11 +569,6 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, if (unlikely(fault & VM_FAULT_RETRY)) { /* We retry only once */ if (flags & FAULT_FLAG_ALLOW_RETRY) { - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. - */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; if (!signal_pending(current)) goto retry; diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 4fc8d746bec3..aad2c0557d2f 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -154,11 +154,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs) 1, regs, addr); } if (fault & VM_FAULT_RETRY) { - /* - * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. 
- */ - flags &= ~(FAULT_FLAG_ALLOW_RETRY); flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c index 19b4fb2fafab..819f87169ee1 100644 --- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -537,10 +537,7 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) fault = VM_FAULT_PFAULT; goto out_up; } - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. */ - flags &= ~(FAULT_FLAG_ALLOW_RETRY | - FAULT_FLAG_RETRY_NOWAIT); + flags &= ~FAULT_FLAG_RETRY_NOWAIT; flags |= FAULT_FLAG_TRIED; down_read(&mm->mmap_sem); goto retry; diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c index baf5d73df40c..cd710e2d7c57 100644 --- a/arch/sh/mm/fault.c +++ b/arch/sh/mm/fault.c @@ -498,7 +498,6 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, regs, address); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c index a2c83104fe35..6735cd1c09b9 100644 --- a/arch/sparc/mm/fault_32.c +++ b/arch/sparc/mm/fault_32.c @@ -261,7 +261,6 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write, 1, regs, address); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c index cad71ec5c7b3..28d5b4d012c6 100644 --- a/arch/sparc/mm/fault_64.c +++ b/arch/sparc/mm/fault_64.c @@ -459,7 +459,6 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs) 1, regs, address); } if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c index 09baf37b65b9..c63fc292aea0 100644 --- a/arch/um/kernel/trap.c +++ b/arch/um/kernel/trap.c @@ -99,7 +99,6 @@ int handle_page_fault(unsigned long 
address, unsigned long ip, else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; goto retry; diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c index 3611f19234a1..fdf577956f5f 100644 --- a/arch/unicore32/mm/fault.c +++ b/arch/unicore32/mm/fault.c @@ -260,12 +260,8 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs) tsk->maj_flt++; else tsk->min_flt++; - if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk - * of starvation. */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; + if (fault & VM_FAULT_RETRY) goto retry; - } } up_read(&mm->mmap_sem); diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index b94ef0c2b98c..645b1365a72d 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1431,7 +1431,6 @@ void do_user_addr_fault(struct pt_regs *regs, if (unlikely(fault & VM_FAULT_RETRY)) { /* Retry at most once */ if (flags & FAULT_FLAG_ALLOW_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; if (!signal_pending(tsk)) goto retry; diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c index 792dad5e2f12..7cd55f2d66c9 100644 --- a/arch/xtensa/mm/fault.c +++ b/arch/xtensa/mm/fault.c @@ -128,7 +128,6 @@ void do_page_fault(struct pt_regs *regs) else current->min_flt++; if (fault & VM_FAULT_RETRY) { - flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; /* No need to up_read(&mm->mmap_sem) as we would

From patchwork Mon Jan 21 07:57:02 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772791
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins , Maya Gokhale , Jerome Glisse , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Denis Plotnikov , Shaohua Li , Andrea Arcangeli , Pavel Emelyanov , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr .
David Alan Gilbert"
Subject: [PATCH RFC 04/24] mm: gup: allow VM_FAULT_RETRY for multiple times
Date: Mon, 21 Jan 2019 15:57:02 +0800
Message-Id: <20190121075722.7945-5-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

This is the gup counterpart of the change that allows VM_FAULT_RETRY to happen more than once.

Signed-off-by: Peter Xu
Reviewed-by: Jérôme Glisse
---
 mm/gup.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/mm/gup.c b/mm/gup.c index 7b1f452cc2ef..22f1d419a849 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -528,7 +528,10 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma, if (*flags & FOLL_NOWAIT) fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT; if (*flags & FOLL_TRIED) { - VM_WARN_ON_ONCE(fault_flags & FAULT_FLAG_ALLOW_RETRY); + /* + * Note: FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_TRIED + * can co-exist + */ fault_flags |= FAULT_FLAG_TRIED; } @@ -943,17 +946,23 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk, /* VM_FAULT_RETRY triggered, so seek to the faulting offset */ pages += ret; start += ret << PAGE_SHIFT; + lock_dropped = true; +retry: /* * Repeat on the address that fired VM_FAULT_RETRY - * without FAULT_FLAG_ALLOW_RETRY but with + * with both FAULT_FLAG_ALLOW_RETRY and * FAULT_FLAG_TRIED.
*/ *locked = 1; - lock_dropped = true; down_read(&mm->mmap_sem); ret = __get_user_pages(tsk, mm, start, 1, flags | FOLL_TRIED, - pages, NULL, NULL); + pages, NULL, locked); + if (!*locked) { + /* Continue to retry until we succeeded */ + BUG_ON(ret != 0); + goto retry; + } if (ret != 1) { BUG_ON(ret > 1); if (!pages_done) From patchwork Mon Jan 21 07:57:03 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 10772793 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 90CD413BF for ; Mon, 21 Jan 2019 07:58:15 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 81B0429D39 for ; Mon, 21 Jan 2019 07:58:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 73FF429D3D; Mon, 21 Jan 2019 07:58:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BC61629D39 for ; Mon, 21 Jan 2019 07:58:14 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E4C658E0008; Mon, 21 Jan 2019 02:58:13 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id E22768E0001; Mon, 21 Jan 2019 02:58:13 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D39398E0008; Mon, 21 Jan 2019 02:58:13 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qk1-f199.google.com 
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner,
    peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li,
    Andrea Arcangeli, Pavel Emelyanov, Mike Kravetz, Marty McFadden,
    Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert", Rik van Riel
Subject: [PATCH RFC 05/24] userfaultfd: wp: add helper for writeprotect check
Date: Mon, 21 Jan 2019 15:57:03 +0800
Message-Id: <20190121075722.7945-6-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

From: Shaohua Li

Add a helper for the writeprotect check. It will be used later.

Cc: Andrea Arcangeli
Cc: Pavel Emelyanov
Cc: Rik van Riel
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
---
 include/linux/userfaultfd_k.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 37c9eba75c98..38f748e7186e 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -50,6 +50,11 @@ static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_UFFD_MISSING;
 }
 
+static inline bool userfaultfd_wp(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & VM_UFFD_WP;
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP);
@@ -94,6 +99,11 @@ static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 	return false;
 }
 
+static inline bool userfaultfd_wp(struct vm_area_struct *vma)
+{
+	return false;
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return false;

From patchwork Mon Jan 21 07:57:04 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772795
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner,
    peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li,
    Andrea Arcangeli, Pavel Emelyanov, Mike Kravetz, Marty McFadden,
    Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert", Rik van Riel
Subject: [PATCH RFC 06/24] userfaultfd: wp: support write protection for userfault vma range
Date: Mon, 21 Jan 2019 15:57:04 +0800
Message-Id: <20190121075722.7945-7-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

From: Shaohua Li

Add an API to enable/disable write protection for a vma range. Unlike
mprotect, this doesn't split or merge vmas.

Cc: Andrea Arcangeli
Cc: Pavel Emelyanov
Cc: Rik van Riel
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
---
 include/linux/userfaultfd_k.h |  2 ++
 mm/userfaultfd.c              | 52 +++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 38f748e7186e..e82f3156f4e9 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -37,6 +37,8 @@ extern ssize_t mfill_zeropage(struct mm_struct *dst_mm,
 			      unsigned long dst_start,
 			      unsigned long len,
 			      bool *mmap_changing);
+extern int mwriteprotect_range(struct mm_struct *dst_mm,
+			       unsigned long start, unsigned long len, bool enable_wp);
 
 /* mm helpers */
 static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 458acda96f20..c38903f501c7 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -615,3 +615,55 @@ ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long start,
 {
 	return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing);
 }
+
+int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
+			unsigned long len, bool enable_wp)
+{
+	struct vm_area_struct *dst_vma;
+	pgprot_t newprot;
+	int err;
+
+	/*
+	 * Sanitize the command parameters:
+	 */
+	BUG_ON(start & ~PAGE_MASK);
+	BUG_ON(len & ~PAGE_MASK);
+
+	/* Does the address range wrap, or is the span zero-sized? */
+	BUG_ON(start + len <= start);
+
+	down_read(&dst_mm->mmap_sem);
+
+	/*
+	 * Make sure the vma is not shared, that the dst range is
+	 * both valid and fully within a single existing vma.
+	 */
+	err = -EINVAL;
+	dst_vma = find_vma(dst_mm, start);
+	if (!dst_vma || (dst_vma->vm_flags & VM_SHARED))
+		goto out_unlock;
+	if (start < dst_vma->vm_start ||
+	    start + len > dst_vma->vm_end)
+		goto out_unlock;
+
+	if (!dst_vma->vm_userfaultfd_ctx.ctx)
+		goto out_unlock;
+	if (!userfaultfd_wp(dst_vma))
+		goto out_unlock;
+
+	if (!vma_is_anonymous(dst_vma))
+		goto out_unlock;
+
+	if (enable_wp)
+		newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE));
+	else
+		newprot = vm_get_page_prot(dst_vma->vm_flags);
+
+	change_protection(dst_vma, start, start + len, newprot,
+			  !enable_wp, 0);
+
+	err = 0;
+out_unlock:
+	up_read(&dst_mm->mmap_sem);
+	return err;
+}

From patchwork Mon Jan 21 07:57:05 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772797
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner,
    peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li,
    Andrea Arcangeli, Pavel Emelyanov, Mike Kravetz, Marty McFadden,
    Mike Rapoport, Mel Gorman, "Kirill A. Shutemov",
    "Dr. David Alan Gilbert"
Subject: [PATCH RFC 07/24] userfaultfd: wp: add the writeprotect API to userfaultfd ioctl
Date: Mon, 21 Jan 2019 15:57:05 +0800
Message-Id: <20190121075722.7945-8-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

From: Andrea Arcangeli

v1: From: Shaohua Li
v2: cleanups, remove a branch.

[peterx writes up the commit message, as below...]

This patch introduces the new uffd-wp APIs for userspace.

Firstly, we'll allow UFFDIO_REGISTER to be done with write protection
tracking using the new UFFDIO_REGISTER_MODE_WP flag. Note that this flag
can co-exist with the existing UFFDIO_REGISTER_MODE_MISSING, in which
case the userspace program can not only resolve missing page faults but
also track page data changes along the way.

Secondly, we introduce the new UFFDIO_WRITEPROTECT API to do page-level
write protection tracking. Note that the memory region needs to be
registered with UFFDIO_REGISTER_MODE_WP before that.
Signed-off-by: Andrea Arcangeli
[peterx: remove useless block, write commit message]
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c                 | 78 +++++++++++++++++++++++++-------
 include/uapi/linux/userfaultfd.h | 11 +++++
 2 files changed, 73 insertions(+), 16 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index bc9f6230a3f0..6ff8773d6797 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -305,8 +305,11 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	if (!pmd_present(_pmd))
 		goto out;
-	if (pmd_trans_huge(_pmd))
+	if (pmd_trans_huge(_pmd)) {
+		if (!pmd_write(_pmd) && (reason & VM_UFFD_WP))
+			ret = true;
 		goto out;
+	}
 
 	/*
 	 * the pmd is stable (as in !pmd_trans_unstable) so we can re-read it
@@ -319,6 +322,8 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	 */
 	if (pte_none(*pte))
 		ret = true;
+	if (!pte_write(*pte) && (reason & VM_UFFD_WP))
+		ret = true;
 	pte_unmap(pte);
 
 out:
@@ -1252,10 +1257,13 @@ static __always_inline int validate_range(struct mm_struct *mm,
 	return 0;
 }
 
-static inline bool vma_can_userfault(struct vm_area_struct *vma)
+static inline bool vma_can_userfault(struct vm_area_struct *vma,
+				     unsigned long vm_flags)
 {
-	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
-	       vma_is_shmem(vma);
+	/* FIXME: add WP support to hugetlbfs and shmem */
+	return vma_is_anonymous(vma) ||
+	       ((is_vm_hugetlb_page(vma) || vma_is_shmem(vma)) &&
+		!(vm_flags & VM_UFFD_WP));
 }
 
 static int userfaultfd_register(struct userfaultfd_ctx *ctx,
@@ -1287,15 +1295,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	vm_flags = 0;
 	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING)
 		vm_flags |= VM_UFFD_MISSING;
-	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) {
+	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP)
 		vm_flags |= VM_UFFD_WP;
-		/*
-		 * FIXME: remove the below error constraint by
-		 * implementing the wprotect tracking mode.
-		 */
-		ret = -EINVAL;
-		goto out;
-	}
 
 	ret = validate_range(mm, uffdio_register.range.start,
 			     uffdio_register.range.len);
@@ -1343,7 +1344,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 		/* check not compatible vmas */
 		ret = -EINVAL;
-		if (!vma_can_userfault(cur))
+		if (!vma_can_userfault(cur, vm_flags))
 			goto out_unlock;
 
 		/*
@@ -1371,6 +1372,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 			if (end & (vma_hpagesize - 1))
 				goto out_unlock;
 		}
+		if ((vm_flags & VM_UFFD_WP) && !(cur->vm_flags & VM_WRITE))
+			goto out_unlock;
 
 		/*
 		 * Check that this vma isn't already owned by a
@@ -1400,7 +1403,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	do {
 		cond_resched();
 
-		BUG_ON(!vma_can_userfault(vma));
+		BUG_ON(!vma_can_userfault(vma, vm_flags));
 		BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
 		       vma->vm_userfaultfd_ctx.ctx != ctx);
 		WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
@@ -1535,7 +1538,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 		 * provides for more strict behavior to notice
 		 * unregistration errors.
 		 */
-		if (!vma_can_userfault(cur))
+		if (!vma_can_userfault(cur, cur->vm_flags))
 			goto out_unlock;
 
 		found = true;
@@ -1549,7 +1552,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 	do {
 		cond_resched();
 
-		BUG_ON(!vma_can_userfault(vma));
+		BUG_ON(!vma_can_userfault(vma, vma->vm_flags));
 		WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
 
 		/*
@@ -1760,6 +1763,46 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx,
 	return ret;
 }
 
+static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
+				    unsigned long arg)
+{
+	int ret;
+	struct uffdio_writeprotect uffdio_wp;
+	struct uffdio_writeprotect __user *user_uffdio_wp;
+	struct userfaultfd_wake_range range;
+
+	user_uffdio_wp = (struct uffdio_writeprotect __user *) arg;
+
+	if (copy_from_user(&uffdio_wp, user_uffdio_wp,
+			   sizeof(struct uffdio_writeprotect)))
+		return -EFAULT;
+
+	ret = validate_range(ctx->mm, uffdio_wp.range.start,
+			     uffdio_wp.range.len);
+	if (ret)
+		return ret;
+
+	if (uffdio_wp.mode & ~(UFFDIO_WRITEPROTECT_MODE_DONTWAKE |
+			       UFFDIO_WRITEPROTECT_MODE_WP))
+		return -EINVAL;
+	if ((uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP) &&
+	    (uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE))
+		return -EINVAL;
+
+	ret = mwriteprotect_range(ctx->mm, uffdio_wp.range.start,
+				  uffdio_wp.range.len, uffdio_wp.mode &
+				  UFFDIO_WRITEPROTECT_MODE_WP);
+	if (ret)
+		return ret;
+
+	if (!(uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) {
+		range.start = uffdio_wp.range.start;
+		range.len = uffdio_wp.range.len;
+		wake_userfault(ctx, &range);
+	}
+	return ret;
+}
+
 static inline unsigned int uffd_ctx_features(__u64 user_features)
 {
 	/*
@@ -1837,6 +1880,9 @@ static long userfaultfd_ioctl(struct file *file, unsigned cmd,
 	case UFFDIO_ZEROPAGE:
 		ret = userfaultfd_zeropage(ctx, arg);
 		break;
+	case UFFDIO_WRITEPROTECT:
+		ret = userfaultfd_writeprotect(ctx, arg);
+		break;
 	}
 	return ret;
 }
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 48f1a7c2f1f0..11517f796275 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -52,6 +52,7 @@
 #define _UFFDIO_WAKE			(0x02)
 #define _UFFDIO_COPY			(0x03)
 #define _UFFDIO_ZEROPAGE		(0x04)
+#define _UFFDIO_WRITEPROTECT		(0x06)
 #define _UFFDIO_API			(0x3F)
 
 /* userfaultfd ioctl ids */
@@ -68,6 +69,8 @@
 			      struct uffdio_copy)
 #define UFFDIO_ZEROPAGE		_IOWR(UFFDIO, _UFFDIO_ZEROPAGE,	\
 			      struct uffdio_zeropage)
+#define UFFDIO_WRITEPROTECT	_IOWR(UFFDIO, _UFFDIO_WRITEPROTECT, \
+				      struct uffdio_writeprotect)
 
 /* read() structure */
 struct uffd_msg {
@@ -231,4 +234,12 @@ struct uffdio_zeropage {
 	__s64 zeropage;
 };
 
+struct uffdio_writeprotect {
+	struct uffdio_range range;
+	/* !WP means undo writeprotect. DONTWAKE is valid only with !WP */
+#define UFFDIO_WRITEPROTECT_MODE_WP		((__u64)1<<0)
+#define UFFDIO_WRITEPROTECT_MODE_DONTWAKE	((__u64)1<<1)
+	__u64 mode;
+};
+
 #endif /* _LINUX_USERFAULTFD_H */

From patchwork Mon Jan 21 07:57:06 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772799
[209.132.183.28]) by mx.google.com with ESMTPS id j25si3179253qtr.152.2019.01.20.23.58.34 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 20 Jan 2019 23:58:34 -0800 (PST) Received-SPF: pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) client-ip=209.132.183.28; Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 85DD386677; Mon, 21 Jan 2019 07:58:33 +0000 (UTC) Received: from xz-x1.nay.redhat.com (dhcp-14-116.nay.redhat.com [10.66.14.116]) by smtp.corp.redhat.com (Postfix) with ESMTP id 000C56090E; Mon, 21 Jan 2019 07:58:27 +0000 (UTC) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Hugh Dickins , Maya Gokhale , Jerome Glisse , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Denis Plotnikov , Shaohua Li , Andrea Arcangeli , Pavel Emelyanov , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . 
Subject: [PATCH RFC 08/24] userfaultfd: wp: hook userfault handler to write protection fault
Date: Mon, 21 Jan 2019 15:57:06 +0800
Message-Id: <20190121075722.7945-9-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

From: Andrea Arcangeli

There are several cases in which a write protection fault can happen. It could be a write to the zero page, to a swapped page, or to a userfaultfd write-protected page. When the fault happens, there is no way to know whether userfaultfd write-protected the page before. Here we just blindly issue a userfault notification for VMAs with VM_UFFD_WP set, regardless of whether the application has write-protected the page yet. The application should be ready to handle such wp faults.

v1: From: Shaohua Li

v2: Handle the userfault in the common do_wp_page. If we get there, a pagetable entry is present and read-only, so no further processing is needed until we resolve the userfault. In the swapin case, always swap in as read-only. This will cause false positive userfaults. We need to decide later whether to eliminate them with a flag like soft-dirty in the swap entry (see _PAGE_SWP_SOFT_DIRTY). hugetlbfs wouldn't need to worry about swapouts, and tmpfs would be handled by a swap entry bit like anonymous memory.

The main problem, with no easy solution to eliminate the false positives, will arise if/when userfaultfd is extended to real filesystem pagecache. When the pagecache is freed by reclaim, we can't leave the radix tree pinned if the inode, and in turn the radix tree, is reclaimed as well.
The estimation is that full accuracy and lack of false positives could easily be provided only for anonymous memory (as long as there's no fork, or as long as MADV_DONTFORK is used on the userfaultfd anonymous range), tmpfs, and hugetlbfs; it's most certainly worth achieving, but in a later incremental patch.

v3: Add hooking point for THP wrprotect faults.

CC: Shaohua Li
Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
---
 mm/memory.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 4ad2d293ddc2..89d51d1650e4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2482,6 +2482,11 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 
+	if (userfaultfd_wp(vma)) {
+		pte_unmap_unlock(vmf->pte, vmf->ptl);
+		return handle_userfault(vmf, VM_UFFD_WP);
+	}
+
 	vmf->page = vm_normal_page(vma, vmf->address, vmf->orig_pte);
 	if (!vmf->page) {
 		/*
@@ -2799,6 +2804,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
 	pte = mk_pte(page, vma->vm_page_prot);
+	if (userfaultfd_wp(vma))
+		vmf->flags &= ~FAULT_FLAG_WRITE;
 	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
 		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
 		vmf->flags &= ~FAULT_FLAG_WRITE;
@@ -3662,8 +3669,11 @@ static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf)
 /* `inline' is required to avoid gcc 4.1.2 build error */
 static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd)
 {
-	if (vma_is_anonymous(vmf->vma))
+	if (vma_is_anonymous(vmf->vma)) {
+		if (userfaultfd_wp(vmf->vma))
+			return handle_userfault(vmf, VM_UFFD_WP);
 		return do_huge_pmd_wp_page(vmf, orig_pmd);
+	}
 	if (vmf->vma->vm_ops->huge_fault)
 		return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD);

From patchwork Mon Jan 21 07:57:07 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772801
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH RFC 09/24] userfaultfd: wp: enabled write protection in userfaultfd API
Date: Mon, 21 Jan 2019 15:57:07 +0800
Message-Id: <20190121075722.7945-10-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

From: Shaohua Li

Now it's safe to enable write protection in the userfaultfd API.

Cc: Andrea Arcangeli
Cc: Pavel Emelyanov
Cc: Rik van Riel
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Shaohua Li
Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
---
 include/uapi/linux/userfaultfd.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 11517f796275..9de61cd8e228 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -19,7 +19,8 @@
  * means the userland is reading).
  */
 #define UFFD_API ((__u64)0xAA)
-#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK |		\
+#define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP |	\
+			   UFFD_FEATURE_EVENT_FORK |		\
 			   UFFD_FEATURE_EVENT_REMAP |		\
 			   UFFD_FEATURE_EVENT_REMOVE |		\
 			   UFFD_FEATURE_EVENT_UNMAP |		\
@@ -34,7 +35,8 @@
 #define UFFD_API_RANGE_IOCTLS			\
 	((__u64)1 << _UFFDIO_WAKE |		\
 	 (__u64)1 << _UFFDIO_COPY |		\
-	 (__u64)1 << _UFFDIO_ZEROPAGE)
+	 (__u64)1 << _UFFDIO_ZEROPAGE |		\
+	 (__u64)1 << _UFFDIO_WRITEPROTECT)
 #define UFFD_API_RANGE_IOCTLS_BASIC		\
 	((__u64)1 << _UFFDIO_WAKE |		\
 	 (__u64)1 << _UFFDIO_COPY)

From patchwork Mon Jan 21 07:57:08 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772803
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH RFC 10/24] userfaultfd: wp: add WP pagetable tracking to x86
Date: Mon, 21 Jan 2019 15:57:08 +0800
Message-Id: <20190121075722.7945-11-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

From: Andrea Arcangeli

Accurate userfaultfd WP tracking is possible by tracking exactly which virtual memory ranges were writeprotected by userland. We can't rely only on the RW bit of the mapped pagetable because that information is destroyed by fork() or KSM or swap. If we were to rely on that, we'd need to stay on the safe side and generate false positive wp faults for every swapped out page.
Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
---
 arch/x86/Kconfig                     |  1 +
 arch/x86/include/asm/pgtable.h       | 52 ++++++++++++++++++++++++++++
 arch/x86/include/asm/pgtable_64.h    |  8 ++++-
 arch/x86/include/asm/pgtable_types.h |  9 +++++
 include/asm-generic/pgtable.h        |  1 +
 include/asm-generic/pgtable_uffd.h   | 51 +++++++++++++++++++++++++++
 init/Kconfig                         |  5 +++
 7 files changed, 126 insertions(+), 1 deletion(-)
 create mode 100644 include/asm-generic/pgtable_uffd.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8689e794a43c..096c773452d0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -207,6 +207,7 @@ config X86
 	select USER_STACKTRACE_SUPPORT
 	select VIRT_TO_BUS
 	select X86_FEATURE_NAMES		if PROC_FS
+	select HAVE_ARCH_USERFAULTFD_WP		if USERFAULTFD
 
 config INSTRUCTION_DECODER
 	def_bool y

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 40616e805292..7a71158982f4 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -23,6 +23,7 @@
 #ifndef __ASSEMBLY__
 #include
+#include
 
 extern pgd_t early_top_pgt[PTRS_PER_PGD];
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
@@ -293,6 +294,23 @@ static inline pte_t pte_clear_flags(pte_t pte, pteval_t clear)
 	return native_make_pte(v & ~clear);
 }
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pte_uffd_wp(pte_t pte)
+{
+	return pte_flags(pte) & _PAGE_UFFD_WP;
+}
+
+static inline pte_t pte_mkuffd_wp(pte_t pte)
+{
+	return pte_set_flags(pte, _PAGE_UFFD_WP);
+}
+
+static inline pte_t pte_clear_uffd_wp(pte_t pte)
+{
+	return pte_clear_flags(pte, _PAGE_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 static inline pte_t pte_mkclean(pte_t pte)
 {
 	return pte_clear_flags(pte, _PAGE_DIRTY);
@@ -372,6 +390,23 @@ static inline pmd_t pmd_clear_flags(pmd_t pmd, pmdval_t clear)
 	return native_make_pmd(v & ~clear);
 }
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pmd_uffd_wp(pmd_t pmd)
+{
+	return pmd_flags(pmd) & _PAGE_UFFD_WP;
+}
+
+static inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+{
+	return pmd_set_flags(pmd, _PAGE_UFFD_WP);
+}
+
+static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd_clear_flags(pmd, _PAGE_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 static inline pmd_t pmd_mkold(pmd_t pmd)
 {
 	return pmd_clear_flags(pmd, _PAGE_ACCESSED);
@@ -1351,6 +1386,23 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
 #endif
 #endif
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+{
+	return pte_set_flags(pte, _PAGE_SWP_UFFD_WP);
+}
+
+static inline int pte_swp_uffd_wp(pte_t pte)
+{
+	return pte_flags(pte) & _PAGE_SWP_UFFD_WP;
+}
+
+static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+{
+	return pte_clear_flags(pte, _PAGE_SWP_UFFD_WP);
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 #define PKRU_AD_BIT 0x1
 #define PKRU_WD_BIT 0x2
 #define PKRU_BITS_PER_PKEY 2

diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 9c85b54bf03c..e0c5d29b8685 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -189,7 +189,7 @@ extern void sync_global_pgds(unsigned long start, unsigned long end);
  *
  * | ...            | 11| 10|  9|8|7|6|5| 4| 3|2| 1|0| <- bit number
  * | ...            |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names
- * | TYPE (59-63) | ~OFFSET (9-58)  |0|0|X|X| X| X|X|SD|0| <- swp entry
+ * | TYPE (59-63) | ~OFFSET (9-58)  |0|0|X|X| X| X|F|SD|0| <- swp entry
  *
  * G (8) is aliased and used as a PROT_NONE indicator for
  * !present ptes.  We need to start storing swap entries above
@@ -197,9 +197,15 @@ extern void sync_global_pgds(unsigned long start, unsigned long end);
  * erratum where they can be incorrectly set by hardware on
  * non-present PTEs.
  *
+ * SD Bits 1-4 are not used in non-present format and available for
+ * special use described below:
+ *
  * SD (1) in swp entry is used to store soft dirty bit, which helps us
  * remember soft dirty over page migration
  *
+ * F (2) in swp entry is used to record when a pagetable is
+ * writeprotected by userfaultfd WP support.
+ *
  * Bit 7 in swp entry should be 0 because pmd_present checks not only P,
  * but also L and G.
  *

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 106b7d0e2dae..163043ab142d 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -32,6 +32,7 @@
 #define _PAGE_BIT_SPECIAL	_PAGE_BIT_SOFTW1
 #define _PAGE_BIT_CPA_TEST	_PAGE_BIT_SOFTW1
+#define _PAGE_BIT_UFFD_WP	_PAGE_BIT_SOFTW2 /* userfaultfd wrprotected */
 #define _PAGE_BIT_SOFT_DIRTY	_PAGE_BIT_SOFTW3 /* software dirty tracking */
 #define _PAGE_BIT_DEVMAP	_PAGE_BIT_SOFTW4
@@ -100,6 +101,14 @@
 #define _PAGE_SWP_SOFT_DIRTY	(_AT(pteval_t, 0))
 #endif
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+#define _PAGE_UFFD_WP		(_AT(pteval_t, 1) << _PAGE_BIT_UFFD_WP)
+#define _PAGE_SWP_UFFD_WP	_PAGE_USER
+#else
+#define _PAGE_UFFD_WP		(_AT(pteval_t, 0))
+#define _PAGE_SWP_UFFD_WP	(_AT(pteval_t, 0))
+#endif
+
 #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
 #define _PAGE_NX	(_AT(pteval_t, 1) << _PAGE_BIT_NX)
 #define _PAGE_DEVMAP	(_AT(u64, 1) << _PAGE_BIT_DEVMAP)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 359fb935ded6..0e1470ecf7b5 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 
 #if 5 - defined(__PAGETABLE_P4D_FOLDED) - defined(__PAGETABLE_PUD_FOLDED) - \
	defined(__PAGETABLE_PMD_FOLDED) != CONFIG_PGTABLE_LEVELS

diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
new file mode 100644
index 000000000000..643d1bf559c2
--- /dev/null
+++ b/include/asm-generic/pgtable_uffd.h
@@ -0,0 +1,51 @@
+#ifndef _ASM_GENERIC_PGTABLE_UFFD_H
+#define _ASM_GENERIC_PGTABLE_UFFD_H
+
+#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static __always_inline int pte_uffd_wp(pte_t pte)
+{
+	return 0;
+}
+
+static __always_inline int pmd_uffd_wp(pmd_t pmd)
+{
+	return 0;
+}
+
+static __always_inline pte_t pte_mkuffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static __always_inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static __always_inline pte_t pte_clear_uffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static __always_inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static __always_inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static __always_inline int pte_swp_uffd_wp(pte_t pte)
+{
+	return 0;
+}
+
+static __always_inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+{
+	return pte;
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
+#endif /* _ASM_GENERIC_PGTABLE_UFFD_H */

diff --git a/init/Kconfig b/init/Kconfig
index cf5b5a0dcbc2..2a02e004874e 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1418,6 +1418,11 @@ config ADVISE_SYSCALLS
 	  applications use these syscalls, you can disable this option to save
 	  space.
 
+config HAVE_ARCH_USERFAULTFD_WP
+	bool
+	help
+	  Arch has userfaultfd write protection support
+
 config MEMBARRIER
 	bool "Enable membarrier() system call" if EXPERT
 	default y

From patchwork Mon Jan 21 07:57:09 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772805
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Andrea Arcangeli , Pavel Emelyanov , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . David Alan Gilbert" Subject: [PATCH RFC 11/24] userfaultfd: wp: userfaultfd_pte/huge_pmd_wp() helpers Date: Mon, 21 Jan 2019 15:57:09 +0800 Message-Id: <20190121075722.7945-12-peterx@redhat.com> In-Reply-To: <20190121075722.7945-1-peterx@redhat.com> References: <20190121075722.7945-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Mon, 21 Jan 2019 07:59:05 +0000 (UTC) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Andrea Arcangeli Implement helpers methods to invoke userfaultfd wp faults more selectively: not only when a wp fault triggers on a vma with vma->vm_flags VM_UFFD_WP set, but only if the _PAGE_UFFD_WP bit is set in the pagetable too. 
Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
---
 include/linux/userfaultfd_k.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index e82f3156f4e9..0d3b32b54e2a 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -14,6 +14,8 @@
 #include /* linux/include/uapi/linux/userfaultfd.h */

 #include
+#include
+#include

 /*
  * CAREFUL: Check include/uapi/asm-generic/fcntl.h when defining
@@ -57,6 +59,18 @@ static inline bool userfaultfd_wp(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_UFFD_WP;
 }

+static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma,
+				      pte_t pte)
+{
+	return userfaultfd_wp(vma) && pte_uffd_wp(pte);
+}
+
+static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma,
+					   pmd_t pmd)
+{
+	return userfaultfd_wp(vma) && pmd_uffd_wp(pmd);
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP);
@@ -106,6 +120,19 @@ static inline bool userfaultfd_wp(struct vm_area_struct *vma)
 	return false;
 }

+static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma,
+				      pte_t pte)
+{
+	return false;
+}
+
+static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma,
+					   pmd_t pmd)
+{
+	return false;
+}
+
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return false;

From patchwork Mon Jan 21 07:57:10 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772807
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Pavel Emelyanov, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH RFC 12/24] userfaultfd: wp: add UFFDIO_COPY_MODE_WP
Date: Mon, 21 Jan 2019 15:57:10 +0800
Message-Id: <20190121075722.7945-13-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

From: Andrea Arcangeli

This allows UFFDIO_COPY to map pages wrprotected.

Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c                 |  5 +++--
 include/linux/userfaultfd_k.h    |  2 +-
 include/uapi/linux/userfaultfd.h | 11 +++++-----
 mm/userfaultfd.c                 | 36 ++++++++++++++++++++++----------
 4 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 6ff8773d6797..455b87c0596f 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1686,11 +1686,12 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx,
 	ret = -EINVAL;
 	if (uffdio_copy.src + uffdio_copy.len <= uffdio_copy.src)
 		goto out;
-	if (uffdio_copy.mode & ~UFFDIO_COPY_MODE_DONTWAKE)
+	if (uffdio_copy.mode & ~(UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP))
 		goto out;
 	if (mmget_not_zero(ctx->mm)) {
 		ret = mcopy_atomic(ctx->mm, uffdio_copy.dst, uffdio_copy.src,
-				   uffdio_copy.len, &ctx->mmap_changing);
+				   uffdio_copy.len, &ctx->mmap_changing,
+				   uffdio_copy.mode);
 		mmput(ctx->mm);
 	} else {
 		return -ESRCH;
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 0d3b32b54e2a..7d870e9a5761 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -34,7 +34,7 @@ extern vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason);
 extern ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start,
 			    unsigned long src_start, unsigned long len,
-			    bool *mmap_changing);
+			    bool *mmap_changing, __u64 mode);
 extern ssize_t mfill_zeropage(struct mm_struct *dst_mm,
 			      unsigned long dst_start,
 			      unsigned long len,
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 9de61cd8e228..a50f1ed24d23 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -208,13 +208,14 @@ struct uffdio_copy {
 	__u64 dst;
 	__u64 src;
 	__u64 len;
+#define UFFDIO_COPY_MODE_DONTWAKE	((__u64)1<<0)
 	/*
-	 * There will be a wrprotection flag later that allows to map
-	 * pages wrprotected on the fly. And such a flag will be
-	 * available if the wrprotection ioctl are implemented for the
-	 * range according to the uffdio_register.ioctls.
+	 * UFFDIO_COPY_MODE_WP will map the page wrprotected on the
+	 * fly. UFFDIO_COPY_MODE_WP is available only if the
+	 * wrprotection ioctl are implemented for the range according
+	 * to the uffdio_register.ioctls.
 	 */
-#define UFFDIO_COPY_MODE_DONTWAKE	((__u64)1<<0)
+#define UFFDIO_COPY_MODE_WP		((__u64)1<<1)
 	__u64 mode;

 	/*
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index c38903f501c7..005291b9b62f 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -25,7 +25,8 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 			    struct vm_area_struct *dst_vma,
 			    unsigned long dst_addr,
 			    unsigned long src_addr,
-			    struct page **pagep)
+			    struct page **pagep,
+			    bool wp_copy)
 {
 	struct mem_cgroup *memcg;
 	pte_t _dst_pte, *dst_pte;
@@ -71,9 +72,9 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 	if (mem_cgroup_try_charge(page, dst_mm, GFP_KERNEL, &memcg, false))
 		goto out_release;

-	_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
-	if (dst_vma->vm_flags & VM_WRITE)
-		_dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
+	_dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot));
+	if (dst_vma->vm_flags & VM_WRITE && !wp_copy)
+		_dst_pte = pte_mkwrite(_dst_pte);

 	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
 	if (dst_vma->vm_file) {
@@ -399,7 +400,8 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
 						unsigned long dst_addr,
 						unsigned long src_addr,
 						struct page **page,
-						bool zeropage)
+						bool zeropage,
+						bool wp_copy)
 {
 	ssize_t err;
@@ -416,11 +418,13 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
 	if (!(dst_vma->vm_flags & VM_SHARED)) {
 		if (!zeropage)
 			err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
-					       dst_addr, src_addr, page);
+					       dst_addr, src_addr, page,
+					       wp_copy);
 		else
 			err = mfill_zeropage_pte(dst_mm, dst_pmd,
 						 dst_vma, dst_addr);
 	} else {
+		VM_WARN_ON(wp_copy); /* WP only available for anon */
 		if (!zeropage)
 			err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd,
 						     dst_vma, dst_addr,
@@ -438,7 +442,8 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
 					      unsigned long src_start,
 					      unsigned long len,
 					      bool zeropage,
-					      bool *mmap_changing)
+					      bool *mmap_changing,
+					      __u64 mode)
 {
 	struct vm_area_struct *dst_vma;
 	ssize_t err;
@@ -446,6 +451,7 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
 	unsigned long src_addr, dst_addr;
 	long copied;
 	struct page *page;
+	bool wp_copy;

 	/*
 	 * Sanitize the command parameters:
@@ -502,6 +508,14 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
 		    dst_vma->vm_flags & VM_SHARED))
 		goto out_unlock;

+	/*
+	 * validate 'mode' now that we know the dst_vma: don't allow
+	 * a wrprotect copy if the userfaultfd didn't register as WP.
+	 */
+	wp_copy = mode & UFFDIO_COPY_MODE_WP;
+	if (wp_copy && !(dst_vma->vm_flags & VM_UFFD_WP))
+		goto out_unlock;
+
 	/*
 	 * If this is a HUGETLB vma, pass off to appropriate routine
 	 */
@@ -557,7 +571,7 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
 		BUG_ON(pmd_trans_huge(*dst_pmd));

 		err = mfill_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr,
-				       src_addr, &page, zeropage);
+				       src_addr, &page, zeropage, wp_copy);
 		cond_resched();

 		if (unlikely(err == -ENOENT)) {
@@ -604,16 +618,16 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
 ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start,
 		     unsigned long src_start, unsigned long len,
-		     bool *mmap_changing)
+		     bool *mmap_changing, __u64 mode)
 {
 	return __mcopy_atomic(dst_mm, dst_start, src_start, len, false,
-			      mmap_changing);
+			      mmap_changing, mode);
 }

 ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long start,
 		       unsigned long len, bool *mmap_changing)
 {
-	return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing);
+	return __mcopy_atomic(dst_mm, start, 0, len, true, mmap_changing, 0);
 }

 int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
From patchwork Mon Jan 21 07:57:11 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772811

From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Pavel Emelyanov, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH RFC 13/24] mm: merge parameters for change_protection()
Date: Mon, 21 Jan 2019 15:57:11 +0800
Message-Id: <20190121075722.7945-14-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

change_protection() is used by both the NUMA and the mprotect() code,
and there is one parameter for each of these callers
(dirty_accountable and prot_numa). Furthermore, these parameters are
passed all the way down the call chain:

- change_protection_range()
- change_p4d_range()
- change_pud_range()
- change_pmd_range()
- ...

Introduce a single flags argument for change_protection() and all of
these helpers to replace the two parameters, so we can avoid passing
multiple parameters multiple times along the way. More importantly,
this greatly simplifies introducing any new parameter to
change_protection(): the follow-up patches add one for userfaultfd
write protection.

No functional change at all.
Signed-off-by: Peter Xu
---
 include/linux/huge_mm.h |  2 +-
 include/linux/mm.h      | 14 +++++++++++++-
 mm/huge_memory.c        |  3 ++-
 mm/mempolicy.c          |  2 +-
 mm/mprotect.c           | 30 ++++++++++++++++--------------
 mm/userfaultfd.c        |  2 +-
 6 files changed, 34 insertions(+), 19 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 4663ee96cf59..a8845eed6958 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -46,7 +46,7 @@ extern bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 			 pmd_t *old_pmd, pmd_t *new_pmd);
 extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 			unsigned long addr, pgprot_t newprot,
-			int prot_numa);
+			unsigned long cp_flags);
 vm_fault_t vmf_insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 			pmd_t *pmd, pfn_t pfn, bool write);
 vm_fault_t vmf_insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5411de93a363..452fcc31fa29 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1588,9 +1588,21 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma,
 		unsigned long old_addr, struct vm_area_struct *new_vma,
 		unsigned long new_addr, unsigned long len,
 		bool need_rmap_locks);
+
+/*
+ * Flags used by change_protection(). For now we make it a bitmap so
+ * that we can pass in multiple flags just like parameters. However
+ * for now all the callers are only use one of the flags at the same
+ * time.
+ */
+/* Whether we should allow dirty bit accounting */
+#define  MM_CP_DIRTY_ACCT		(1UL << 0)
+/* Whether this protection change is for NUMA hints */
+#define  MM_CP_PROT_NUMA		(1UL << 1)
+
 extern unsigned long change_protection(struct vm_area_struct *vma, unsigned long start,
 			      unsigned long end, pgprot_t newprot,
-			      int dirty_accountable, int prot_numa);
+			      unsigned long cp_flags);
 extern int mprotect_fixup(struct vm_area_struct *vma,
 			  struct vm_area_struct **pprev, unsigned long start,
 			  unsigned long end, unsigned long newflags);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e84a10b0d310..be8160bb7cac 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1856,13 +1856,14 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
  *  - HPAGE_PMD_NR is protections changed and TLB flush necessary
  */
 int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
-		unsigned long addr, pgprot_t newprot, int prot_numa)
+		unsigned long addr, pgprot_t newprot, unsigned long cp_flags)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	spinlock_t *ptl;
 	pmd_t entry;
 	bool preserve_write;
 	int ret;
+	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;

 	ptl = __pmd_trans_huge_lock(pmd, vma);
 	if (!ptl)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index d4496d9d34f5..233194f3d69a 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -554,7 +554,7 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
 {
 	int nr_updated;

-	nr_updated = change_protection(vma, addr, end, PAGE_NONE, 0, 1);
+	nr_updated = change_protection(vma, addr, end, PAGE_NONE, MM_CP_PROT_NUMA);
 	if (nr_updated)
 		count_vm_numa_events(NUMA_PTE_UPDATES, nr_updated);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 6d331620b9e5..416ede326c03 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -37,13 +37,15 @@
 static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 		unsigned long addr, unsigned long end, pgprot_t newprot,
-		int dirty_accountable, int prot_numa)
+		unsigned long cp_flags)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pte_t *pte, oldpte;
 	spinlock_t *ptl;
 	unsigned long pages = 0;
 	int target_node = NUMA_NO_NODE;
+	bool dirty_accountable = cp_flags & MM_CP_DIRTY_ACCT;
+	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;

 	/*
 	 * Can be called with only the mmap_sem for reading by
@@ -164,7 +166,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		pud_t *pud, unsigned long addr, unsigned long end,
-		pgprot_t newprot, int dirty_accountable, int prot_numa)
+		pgprot_t newprot, unsigned long cp_flags)
 {
 	pmd_t *pmd;
 	struct mm_struct *mm = vma->vm_mm;
@@ -193,7 +195,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 				__split_huge_pmd(vma, pmd, addr, false, NULL);
 			} else {
 				int nr_ptes = change_huge_pmd(vma, pmd, addr,
-						newprot, prot_numa);
+						newprot, cp_flags);

 				if (nr_ptes) {
 					if (nr_ptes == HPAGE_PMD_NR) {
@@ -208,7 +210,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 			/* fall through, the trans huge pmd just split */
 		}
 		this_pages = change_pte_range(vma, pmd, addr, next, newprot,
-					      dirty_accountable, prot_numa);
+					      cp_flags);
 		pages += this_pages;
 next:
 		cond_resched();
@@ -224,7 +226,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 static inline unsigned long change_pud_range(struct vm_area_struct *vma,
 		p4d_t *p4d, unsigned long addr, unsigned long end,
-		pgprot_t newprot, int dirty_accountable, int prot_numa)
+		pgprot_t newprot, unsigned long cp_flags)
 {
 	pud_t *pud;
 	unsigned long next;
@@ -236,7 +238,7 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma,
 		if (pud_none_or_clear_bad(pud))
 			continue;
 		pages += change_pmd_range(vma, pud, addr, next, newprot,
-					  dirty_accountable, prot_numa);
+					  cp_flags);
 	} while (pud++, addr = next, addr != end);

 	return pages;
@@ -244,7 +246,7 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma,
 static inline unsigned long change_p4d_range(struct vm_area_struct *vma,
 		pgd_t *pgd, unsigned long addr, unsigned long end,
-		pgprot_t newprot, int dirty_accountable, int prot_numa)
+		pgprot_t newprot, unsigned long cp_flags)
 {
 	p4d_t *p4d;
 	unsigned long next;
@@ -256,7 +258,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma,
 		if (p4d_none_or_clear_bad(p4d))
 			continue;
 		pages += change_pud_range(vma, p4d, addr, next, newprot,
-					  dirty_accountable, prot_numa);
+					  cp_flags);
 	} while (p4d++, addr = next, addr != end);

 	return pages;
@@ -264,7 +266,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma,
 static unsigned long change_protection_range(struct vm_area_struct *vma,
 		unsigned long addr, unsigned long end, pgprot_t newprot,
-		int dirty_accountable, int prot_numa)
+		unsigned long cp_flags)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pgd_t *pgd;
@@ -281,7 +283,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
 		if (pgd_none_or_clear_bad(pgd))
 			continue;
 		pages += change_p4d_range(vma, pgd, addr, next, newprot,
-					  dirty_accountable, prot_numa);
+					  cp_flags);
 	} while (pgd++, addr = next, addr != end);

 	/* Only flush the TLB if we actually modified any entries: */
@@ -294,14 +296,15 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
 unsigned long change_protection(struct vm_area_struct *vma, unsigned long start,
 		       unsigned long end, pgprot_t newprot,
-		       int dirty_accountable, int prot_numa)
+		       unsigned long cp_flags)
 {
 	unsigned long pages;

 	if (is_vm_hugetlb_page(vma))
 		pages = hugetlb_change_protection(vma, start, end, newprot);
 	else
-		pages = change_protection_range(vma, start, end, newprot, dirty_accountable, prot_numa);
+		pages = change_protection_range(vma, start, end, newprot,
+						cp_flags);

 	return pages;
 }
@@ -428,8 +431,7 @@ mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev,
 	dirty_accountable = vma_wants_writenotify(vma, vma->vm_page_prot);
 	vma_set_page_prot(vma);

-	change_protection(vma, start, end, vma->vm_page_prot,
-			  dirty_accountable, 0);
+	change_protection(vma, start, end, vma->vm_page_prot, MM_CP_DIRTY_ACCT);

 	/*
 	 * Private VM_LOCKED VMA becoming writable: trigger COW to avoid major
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 005291b9b62f..23d4bbd117ee 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -674,7 +674,7 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
 	newprot = vm_get_page_prot(dst_vma->vm_flags);

 	change_protection(dst_vma, start, start + len, newprot,
-			  !enable_wp, 0);
+			  enable_wp ? 0 : MM_CP_DIRTY_ACCT);

 	err = 0;
out_unlock:

From patchwork Mon Jan 21 07:57:12 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772813
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Pavel Emelyanov, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A . Shutemov" , "Dr .
David Alan Gilbert" Subject: [PATCH RFC 14/24] userfaultfd: wp: apply _PAGE_UFFD_WP bit Date: Mon, 21 Jan 2019 15:57:12 +0800 Message-Id: <20190121075722.7945-15-peterx@redhat.com> In-Reply-To: <20190121075722.7945-1-peterx@redhat.com> References: <20190121075722.7945-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Mon, 21 Jan 2019 07:59:26 +0000 (UTC) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Firstly, introduce two new flags MM_CP_UFFD_WP[_RESOLVE] for change_protection() when used with uffd-wp and make sure the two new flags are exclusively used. Then, - For MM_CP_UFFD_WP: apply the _PAGE_UFFD_WP bit and remove _PAGE_RW when a range of memory is write protected by uffd - For MM_CP_UFFD_WP_RESOLVE: remove the _PAGE_UFFD_WP bit and recover _PAGE_RW when write protection is resolved from userspace And use this new interface in mwriteprotect_range() to replace the old MM_CP_DIRTY_ACCT. Do this change for both PTEs and huge PMDs. Then we can start to identify which PTE/PMD is write protected by general (e.g., COW or soft dirty tracking), and which is for userfaultfd-wp. Since we should keep the _PAGE_UFFD_WP when doing pte_modify(), add it into _PAGE_CHG_MASK as well. Meanwhile, since we have this new bit, we can be even more strict when detecting uffd-wp page faults in either do_wp_page() or wp_huge_pmd(). 
Signed-off-by: Peter Xu
---
 arch/x86/include/asm/pgtable_types.h |  2 +-
 include/linux/mm.h                   |  5 +++++
 mm/huge_memory.c                     | 14 +++++++++++++-
 mm/memory.c                          |  4 ++--
 mm/mprotect.c                        | 12 ++++++++++++
 mm/userfaultfd.c                     | 10 +++++++---
 6 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 163043ab142d..d6972b4c6abc 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -133,7 +133,7 @@
  */
 #define _PAGE_CHG_MASK	(PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT |		\
 			 _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY |	\
-			 _PAGE_SOFT_DIRTY | _PAGE_DEVMAP)
+			 _PAGE_SOFT_DIRTY | _PAGE_DEVMAP | _PAGE_UFFD_WP)
 #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE)

 /*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 452fcc31fa29..89345b51d8bd 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1599,6 +1599,11 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma,
 #define MM_CP_DIRTY_ACCT	(1UL << 0)
 /* Whether this protection change is for NUMA hints */
 #define MM_CP_PROT_NUMA		(1UL << 1)
+/* Whether this change is for write protecting */
+#define MM_CP_UFFD_WP		(1UL << 2)	/* do wp */
+#define MM_CP_UFFD_WP_RESOLVE	(1UL << 3)	/* Resolve wp */
+#define MM_CP_UFFD_WP_ALL	(MM_CP_UFFD_WP | \
+				 MM_CP_UFFD_WP_RESOLVE)

 extern unsigned long change_protection(struct vm_area_struct *vma,
 				       unsigned long start, unsigned long end,
 				       pgprot_t newprot,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index be8160bb7cac..169795c8e56c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1864,6 +1864,8 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 	bool preserve_write;
 	int ret;
 	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
+	bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
+	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;

 	ptl = __pmd_trans_huge_lock(pmd, vma);
 	if (!ptl)
@@ -1930,6 +1932,13 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 	entry = pmd_modify(entry, newprot);
 	if (preserve_write)
 		entry = pmd_mk_savedwrite(entry);
+	if (uffd_wp) {
+		entry = pmd_wrprotect(entry);
+		entry = pmd_mkuffd_wp(entry);
+	} else if (uffd_wp_resolve) {
+		entry = pmd_mkwrite(entry);
+		entry = pmd_clear_uffd_wp(entry);
+	}
 	ret = HPAGE_PMD_NR;
 	set_pmd_at(mm, addr, pmd, entry);
 	BUG_ON(vma_is_anonymous(vma) && !preserve_write && pmd_write(entry));
@@ -2079,7 +2088,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 	struct page *page;
 	pgtable_t pgtable;
 	pmd_t old_pmd, _pmd;
-	bool young, write, soft_dirty, pmd_migration = false;
+	bool young, write, soft_dirty, pmd_migration = false, uffd_wp = false;
 	unsigned long addr;
 	int i;

@@ -2161,6 +2170,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		write = pmd_write(old_pmd);
 		young = pmd_young(old_pmd);
 		soft_dirty = pmd_soft_dirty(old_pmd);
+		uffd_wp = pmd_uffd_wp(old_pmd);
 	}
 	VM_BUG_ON_PAGE(!page_count(page), page);
 	page_ref_add(page, HPAGE_PMD_NR - 1);
@@ -2194,6 +2204,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 			entry = pte_mkold(entry);
 			if (soft_dirty)
 				entry = pte_mksoft_dirty(entry);
+			if (uffd_wp)
+				entry = pte_mkuffd_wp(entry);
 		}
 		pte = pte_offset_map(&_pmd, addr);
 		BUG_ON(!pte_none(*pte));
diff --git a/mm/memory.c b/mm/memory.c
index 89d51d1650e4..7f276158683b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2482,7 +2482,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;

-	if (userfaultfd_wp(vma)) {
+	if (userfaultfd_pte_wp(vma, *vmf->pte)) {
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 		return handle_userfault(vmf, VM_UFFD_WP);
 	}
@@ -3670,7 +3670,7 @@ static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf)
 static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd)
 {
 	if (vma_is_anonymous(vmf->vma)) {
-		if (userfaultfd_wp(vmf->vma))
+		if (userfaultfd_huge_pmd_wp(vmf->vma, orig_pmd))
 			return handle_userfault(vmf, VM_UFFD_WP);
 		return do_huge_pmd_wp_page(vmf, orig_pmd);
 	}
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 416ede326c03..000e246c163b 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -46,6 +46,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 	int target_node = NUMA_NO_NODE;
 	bool dirty_accountable = cp_flags & MM_CP_DIRTY_ACCT;
 	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
+	bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
+	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;

 	/*
 	 * Can be called with only the mmap_sem for reading by
@@ -117,6 +119,14 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 			if (preserve_write)
 				ptent = pte_mk_savedwrite(ptent);

+			if (uffd_wp) {
+				ptent = pte_wrprotect(ptent);
+				ptent = pte_mkuffd_wp(ptent);
+			} else if (uffd_wp_resolve) {
+				ptent = pte_mkwrite(ptent);
+				ptent = pte_clear_uffd_wp(ptent);
+			}
+
 			/* Avoid taking write faults for known dirty pages */
 			if (dirty_accountable && pte_dirty(ptent) &&
 					(pte_soft_dirty(ptent) ||
@@ -300,6 +310,8 @@ unsigned long change_protection(struct vm_area_struct *vma, unsigned long start,
 {
 	unsigned long pages;

+	BUG_ON((cp_flags & MM_CP_UFFD_WP_ALL) == MM_CP_UFFD_WP_ALL);
+
 	if (is_vm_hugetlb_page(vma))
 		pages = hugetlb_change_protection(vma, start, end, newprot);
 	else
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 23d4bbd117ee..902247ca1474 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -73,8 +73,12 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 		goto out_release;

 	_dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot));
-	if (dst_vma->vm_flags & VM_WRITE && !wp_copy)
-		_dst_pte = pte_mkwrite(_dst_pte);
+	if (dst_vma->vm_flags & VM_WRITE) {
+		if (wp_copy)
+			_dst_pte = pte_mkuffd_wp(_dst_pte);
+		else
+			_dst_pte = pte_mkwrite(_dst_pte);
+	}

 	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
 	if (dst_vma->vm_file) {
@@ -674,7 +678,7 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,

 	newprot = vm_get_page_prot(dst_vma->vm_flags);
 	change_protection(dst_vma, start, start + len, newprot,
-			  enable_wp ? 0 : MM_CP_DIRTY_ACCT);
+			  enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE);

 	err = 0;
 out_unlock:

From patchwork Mon Jan 21 07:57:13 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772815
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Pavel Emelyanov, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH RFC 15/24] mm: export wp_page_copy()
Date: Mon, 21 Jan 2019 15:57:13 +0800
Message-Id: <20190121075722.7945-16-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

Export this function for use outside of the page fault handlers.

Signed-off-by: Peter Xu
---
 include/linux/mm.h | 2 ++
 mm/memory.c        | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 89345b51d8bd..bf04e187fafe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -378,6 +378,8 @@ struct vm_fault {
 					 */
 };

+vm_fault_t wp_page_copy(struct vm_fault *vmf);
+
 /* page entry size for vm->huge_fault() */
 enum page_entry_size {
 	PE_SIZE_PTE = 0,
diff --git a/mm/memory.c b/mm/memory.c
index 7f276158683b..ef823c07f635 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2239,7 +2239,7 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
  *   held to the old page, as well as updating the rmap.
  * - In any case, unlock the PTL and drop the reference we took to the old page.
  */
-static vm_fault_t wp_page_copy(struct vm_fault *vmf)
+vm_fault_t wp_page_copy(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	struct mm_struct *mm = vma->vm_mm;

From patchwork Mon Jan 21 07:57:14 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772817
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Pavel Emelyanov, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH RFC 16/24] userfaultfd: wp: handle COW properly for uffd-wp
Date: Mon, 21 Jan 2019 15:57:14 +0800
Message-Id: <20190121075722.7945-17-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

This allows uffd-wp to support write-protected pages for COW. For
example, a PTE that is write-protected by uffd may also be
write-protected for other reasons, such as COW or zero pages. When that
happens, we can't simply set the write bit in the PTE, since doing so
would change the content seen by every other reference to the page.
Instead, we should do the COW first if necessary, and only then handle
the uffd-wp fault. To copy the page correctly, we also need to carry
over the _PAGE_UFFD_WP bit if it was set in the original PTE.

For huge PMDs, we simply split the huge PMD whenever we want to resolve
an uffd-wp page fault. That matches what we do for general huge PMD
write protection, and it reduces the huge PMD copy-on-write problem to
the PTE copy-on-write one.
Signed-off-by: Peter Xu
---
 mm/memory.c   |  2 ++
 mm/mprotect.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index ef823c07f635..a3de13b728f4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2290,6 +2290,8 @@ vm_fault_t wp_page_copy(struct vm_fault *vmf)
 	}
 	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 	entry = mk_pte(new_page, vma->vm_page_prot);
+	if (pte_uffd_wp(vmf->orig_pte))
+		entry = pte_mkuffd_wp(entry);
 	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 	/*
 	 * Clear the pte entry and flush it first, before updating the
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 000e246c163b..c37c9aa7a54e 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -77,14 +77,13 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 		if (pte_present(oldpte)) {
 			pte_t ptent;
 			bool preserve_write = prot_numa && pte_write(oldpte);
+			struct page *page;

 			/*
 			 * Avoid trapping faults against the zero or KSM
 			 * pages. See similar comment in change_huge_pmd.
 			 */
 			if (prot_numa) {
-				struct page *page;
-
 				page = vm_normal_page(vma, addr, oldpte);
 				if (!page || PageKsm(page))
 					continue;
@@ -114,6 +113,46 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				continue;
 			}

+			/*
+			 * Detect whether we'll need to COW before
+			 * resolving an uffd-wp fault. Note that this
+			 * includes detection of the zero page (where
+			 * page==NULL)
+			 */
+			if (uffd_wp_resolve) {
+				/* If the fault is resolved already, skip */
+				if (!pte_uffd_wp(*pte))
+					continue;
+				page = vm_normal_page(vma, addr, oldpte);
+				if (!page || page_mapcount(page) > 1) {
+					struct vm_fault vmf = {
+						.vma = vma,
+						.address = addr & PAGE_MASK,
+						.page = page,
+						.orig_pte = oldpte,
+						.pmd = pmd,
+						/* pte and ptl not needed */
+					};
+					vm_fault_t ret;
+
+					if (page)
+						get_page(page);
+					arch_leave_lazy_mmu_mode();
+					pte_unmap_unlock(pte, ptl);
+					ret = wp_page_copy(&vmf);
+					/* PTE is changed, or OOM */
+					if (ret == 0)
+						/* It's done by others */
+						continue;
+					else if (WARN_ON(ret != VM_FAULT_WRITE))
+						return pages;
+					pte = pte_offset_map_lock(vma->vm_mm,
+								  pmd, addr,
+								  &ptl);
+					arch_enter_lazy_mmu_mode();
+				}
+			}
+
 			ptent = ptep_modify_prot_start(mm, addr, pte);
 			ptent = pte_modify(ptent, newprot);
 			if (preserve_write)
@@ -184,6 +223,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 	unsigned long pages = 0;
 	unsigned long nr_huge_updates = 0;
 	unsigned long mni_start = 0;
+	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;

 	pmd = pmd_offset(pud, addr);
 	do {
@@ -201,7 +241,16 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		}

 		if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) {
-			if (next - addr != HPAGE_PMD_SIZE) {
+			/*
+			 * When resolving a userfaultfd write
+			 * protection fault, it's not easy to identify
+			 * whether a THP is shared with others and
+			 * whether we'll need to do copy-on-write, so
+			 * just split it always for now to simplify the
+			 * procedure. And that's the policy too for
+			 * general THP write-protect in af9e4d5f2de2.
+			 */
+			if (next - addr != HPAGE_PMD_SIZE || uffd_wp_resolve) {
 				__split_huge_pmd(vma, pmd, addr, false, NULL);
 			} else {
 				int nr_ptes = change_huge_pmd(vma, pmd, addr,

From patchwork Mon Jan 21 07:57:15 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772819
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li,
Andrea Arcangeli , Pavel Emelyanov , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . David Alan Gilbert" Subject: [PATCH RFC 17/24] userfaultfd: wp: drop _PAGE_UFFD_WP properly when fork Date: Mon, 21 Jan 2019 15:57:15 +0800 Message-Id: <20190121075722.7945-18-peterx@redhat.com> In-Reply-To: <20190121075722.7945-1-peterx@redhat.com> References: <20190121075722.7945-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Mon, 21 Jan 2019 07:59:47 +0000 (UTC) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP UFFD_EVENT_FORK support for uffd-wp should be already there, except that we should clean the uffd-wp bit if uffd fork event is not enabled. Detect that to avoid _PAGE_UFFD_WP being set even if the VMA is not being tracked by VM_UFFD_WP. Do this for both small PTEs and huge PMDs. Signed-off-by: Peter Xu --- mm/huge_memory.c | 8 ++++++++ mm/memory.c | 8 ++++++++ 2 files changed, 16 insertions(+) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 169795c8e56c..2a3ec62e83b6 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -928,6 +928,14 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm, ret = -EAGAIN; pmd = *src_pmd; + /* + * Make sure the _PAGE_UFFD_WP bit is cleared if the new VMA + * does not have the VM_UFFD_WP, which means that the uffd + * fork event is not enabled. 
+ */ + if (!(vma->vm_flags & VM_UFFD_WP)) + pmd = pmd_clear_uffd_wp(pmd); + #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION if (unlikely(is_swap_pmd(pmd))) { swp_entry_t entry = pmd_to_swp_entry(pmd); diff --git a/mm/memory.c b/mm/memory.c index a3de13b728f4..f5497752d2a3 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -788,6 +788,14 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, pte = pte_mkclean(pte); pte = pte_mkold(pte); + /* + * Make sure the _PAGE_UFFD_WP bit is cleared if the new VMA + * does not have the VM_UFFD_WP, which means that the uffd + * fork event is not enabled. + */ + if (!(vm_flags & VM_UFFD_WP)) + pte = pte_clear_uffd_wp(pte); + page = vm_normal_page(vma, addr, pte); if (page) { get_page(page); From patchwork Mon Jan 21 07:57:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 10772821 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D9DD213BF for ; Mon, 21 Jan 2019 07:59:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CC46B29CB1 for ; Mon, 21 Jan 2019 07:59:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C01DF29CBB; Mon, 21 Jan 2019 07:59:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6337729CB1 for ; Mon, 21 Jan 2019 07:59:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6F77B8E000A; Mon, 21 Jan 2019 02:59:55 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: 
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH RFC 18/24] userfaultfd: wp: add pmd_swp_*uffd_wp() helpers
Date: Mon, 21 Jan 2019 15:57:16 +0800
Message-Id: <20190121075722.7945-19-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

Add the missing helpers for uffd-wp operations on pmd swap/migration
entries.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 arch/x86/include/asm/pgtable.h     | 15 +++++++++++++++
 include/asm-generic/pgtable_uffd.h | 15 +++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 7a71158982f4..aa2eb36d7edf 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1401,6 +1401,21 @@ static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
 {
 	return pte_clear_flags(pte, _PAGE_SWP_UFFD_WP);
 }
+
+static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+{
+	return pmd_set_flags(pmd, _PAGE_SWP_UFFD_WP);
+}
+
+static inline int pmd_swp_uffd_wp(pmd_t pmd)
+{
+	return pmd_flags(pmd) & _PAGE_SWP_UFFD_WP;
+}
+
+static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd_clear_flags(pmd, _PAGE_SWP_UFFD_WP);
+}
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
 #define PKRU_AD_BIT 0x1
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
index 643d1bf559c2..828966d4c281 100644
--- a/include/asm-generic/pgtable_uffd.h
+++ b/include/asm-generic/pgtable_uffd.h
@@ -46,6 +46,21 @@ static __always_inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
 {
 	return pte;
 }
+
+static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
+
+static inline int pmd_swp_uffd_wp(pmd_t pmd)
+{
+	return 0;
+}
+
+static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+{
+	return pmd;
+}
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
 #endif /* _ASM_GENERIC_PGTABLE_UFFD_H */

From patchwork Mon Jan 21 07:57:17 2019
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH RFC 19/24] userfaultfd: wp: support swap and page migration
Date: Mon, 21 Jan 2019 15:57:17 +0800
Message-Id: <20190121075722.7945-20-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

For both swap and page migration entries, bit 2 of the entry identifies
whether the entry is uffd write-protected.  It plays a role similar to
the existing soft-dirty bit in a swap entry, but it only keeps the
uffd-wp tracking for a specific PTE/PMD.

One special case: when recovering the uffd-wp bit from a swap/migration
entry back into the PTE, the _PAGE_RW bit must also be cleared,
otherwise the access cannot be trapped at all even with _PAGE_UFFD_WP
set.

Note that this patch removes two lines from "userfaultfd: wp: hook
userfault handler to write protection fault", where FAULT_FLAG_WRITE
was removed from vmf->flags when uffd-wp is set for the VMA.  This
patch keeps the write flag there.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/swapops.h | 2 ++
 mm/huge_memory.c        | 3 +++
 mm/memory.c             | 8 ++++++--
 mm/migrate.c            | 7 +++++++
 mm/mprotect.c           | 2 ++
 mm/rmap.c               | 6 ++++++
 6 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 4d961668e5fc..0c2923b1cdb7 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -68,6 +68,8 @@ static inline swp_entry_t pte_to_swp_entry(pte_t pte)
 	if (pte_swp_soft_dirty(pte))
 		pte = pte_swp_clear_soft_dirty(pte);
+	if (pte_swp_uffd_wp(pte))
+		pte = pte_swp_clear_uffd_wp(pte);
 	arch_entry = __pte_to_swp_entry(pte);
 	return swp_entry(__swp_type(arch_entry), __swp_offset(arch_entry));
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2a3ec62e83b6..682f1427da1a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2171,6 +2171,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		write = is_write_migration_entry(entry);
 		young = false;
 		soft_dirty = pmd_swp_soft_dirty(old_pmd);
+		uffd_wp = pmd_swp_uffd_wp(old_pmd);
 	} else {
 		page = pmd_page(old_pmd);
 		if (pmd_dirty(old_pmd))
@@ -2203,6 +2204,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 			entry = swp_entry_to_pte(swp_entry);
 			if (soft_dirty)
 				entry = pte_swp_mksoft_dirty(entry);
+			if (uffd_wp)
+				entry = pte_swp_mkuffd_wp(entry);
 		} else {
 			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
 			entry = maybe_mkwrite(entry, vma);
diff --git a/mm/memory.c b/mm/memory.c
index f5497752d2a3..ac7d659e40fe 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -736,6 +736,8 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 			pte = swp_entry_to_pte(entry);
 			if (pte_swp_soft_dirty(*src_pte))
 				pte = pte_swp_mksoft_dirty(pte);
+			if (pte_swp_uffd_wp(*src_pte))
+				pte = pte_swp_mkuffd_wp(pte);
 			set_pte_at(src_mm, addr, src_pte, pte);
 		}
 	} else if (is_device_private_entry(entry)) {
@@ -2814,8 +2816,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
 	pte = mk_pte(page, vma->vm_page_prot);
-	if (userfaultfd_wp(vma))
-		vmf->flags &= ~FAULT_FLAG_WRITE;
 	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
 		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
 		vmf->flags &= ~FAULT_FLAG_WRITE;
@@ -2825,6 +2825,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	flush_icache_page(vma, page);
 	if (pte_swp_soft_dirty(vmf->orig_pte))
 		pte = pte_mksoft_dirty(pte);
+	if (pte_swp_uffd_wp(vmf->orig_pte)) {
+		pte = pte_mkuffd_wp(pte);
+		pte = pte_wrprotect(pte);
+	}
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
 	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
 	vmf->orig_pte = pte;
diff --git a/mm/migrate.c b/mm/migrate.c
index f7e4bfdc13b7..963d3dd65cf0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -242,6 +242,11 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		if (is_write_migration_entry(entry))
 			pte = maybe_mkwrite(pte, vma);
 
+		if (pte_swp_uffd_wp(*pvmw.pte)) {
+			pte = pte_mkuffd_wp(pte);
+			pte = pte_wrprotect(pte);
+		}
+
 		if (unlikely(is_zone_device_page(new))) {
 			if (is_device_private_page(new)) {
 				entry = make_device_private_entry(new, pte_write(pte));
@@ -2265,6 +2270,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pte))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pte))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, addr, ptep, swp_pte);
 
 			/*
diff --git a/mm/mprotect.c b/mm/mprotect.c
index c37c9aa7a54e..2ce62d806108 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -187,6 +187,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 			newpte = swp_entry_to_pte(entry);
 			if (pte_swp_soft_dirty(oldpte))
 				newpte = pte_swp_mksoft_dirty(newpte);
+			if (pte_swp_uffd_wp(oldpte))
+				newpte = pte_swp_mkuffd_wp(newpte);
 			set_pte_at(mm, addr, pte, newpte);
 
 			pages++;
diff --git a/mm/rmap.c b/mm/rmap.c
index 85b7f9423352..e1cf191db4f3 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1463,6 +1463,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pteval))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, pvmw.address, pvmw.pte, swp_pte);
 			/*
 			 * No need to invalidate here it will synchronize on
@@ -1555,6 +1557,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pteval))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, address, pvmw.pte, swp_pte);
 			/*
 			 * No need to invalidate here it will synchronize on
@@ -1621,6 +1625,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
+			if (pte_uffd_wp(pteval))
+				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			set_pte_at(mm, address, pvmw.pte, swp_pte);
 			/* Invalidate as we cleared the pte */
 			mmu_notifier_invalidate_range(mm, address,

From patchwork Mon Jan 21 07:57:18 2019
(2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AAE7629CBB for ; Mon, 21 Jan 2019 08:00:13 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C75A68E0010; Mon, 21 Jan 2019 03:00:12 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id BD7108E0001; Mon, 21 Jan 2019 03:00:12 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A9F778E0010; Mon, 21 Jan 2019 03:00:12 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qk1-f199.google.com (mail-qk1-f199.google.com [209.85.222.199]) by kanga.kvack.org (Postfix) with ESMTP id 7E8008E0001 for ; Mon, 21 Jan 2019 03:00:12 -0500 (EST) Received: by mail-qk1-f199.google.com with SMTP id y83so18337742qka.7 for ; Mon, 21 Jan 2019 00:00:12 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:from:to:cc :subject:date:message-id:in-reply-to:references; bh=jxlEHcEGe3kHekh99WP30Nd+yDOg4/Ym/8a55AV+thA=; b=dMTT9Lw+U+mmuArrPAAYqpG7M63jFCJBqEVusiFH3ZChGLIXeVlMe5rew26wblDwvi 8d6ia9q//UgsmvJq1/Wy5MNZQhCBJpxHCzHhbYnN4Q4kbotIYbGmxc9fGOyBMmNd5nZm X/sIc5RnYuUrJbo7rcmC4uyVcJLVNJZzftmlQUeSzgSD1fBRSPBkaqPeBirgM72k+CgN mriY8iKR9ynMq9MjcWVCJElKSrFnyqM6O4bB17nwxkjlkjk0w+9coKNM9xXe/Ux//dFA Bxg5J85w9A6hSgtQ3VMzQni8vg2kDwGpX/vjSS56CSBZ2wDQr54H5IAmWqM27FhjWtbR HA5Q== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) 
header.from=redhat.com X-Gm-Message-State: AJcUukfZl9r3UlqmrbgwvOKQUaj832VcZWz941qYAqi1FY/KFgGOM6Tc 7wBlCiDcKlncIlEZ0ouecTrGGCC6wYgC6gsxjHHsBOZhouFmNR7s0n9V1ytZkEoydnXwKjtPx98 3Kzw8JBoryxL2DlvV2qxVHNST5NBsOuyVCQ4pdH9u12QY4X9KxAYiBt7BY23eSB1p+Q== X-Received: by 2002:ac8:3038:: with SMTP id f53mr25430157qte.45.1548057612330; Mon, 21 Jan 2019 00:00:12 -0800 (PST) X-Google-Smtp-Source: ALg8bN60KTtye2EtrPJ5SsIrq7tCLhnLpQ28ljDC75hCPgsVNi1FInjiO8jSBJcfeEmk+8jc3yaj X-Received: by 2002:ac8:3038:: with SMTP id f53mr25430126qte.45.1548057611738; Mon, 21 Jan 2019 00:00:11 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1548057611; cv=none; d=google.com; s=arc-20160816; b=Jsk4uo4RIYnG/FMHYAS9TYxDW6p6iGUOPe2ArMQjyjjj8cri1G5Q9MlJRungGsb/if wUWBDG/lMIzV+QgDRJWoklGqmCRr6+9e/4VdUxSdpKrHYZvJj7dbA+W3LoyTnVV3beyt OBayKTtjWE3JTQb6GONZN4tyLrlcDsXssRfpwO7F4qd9/pjezLAnEyB5LT3v67fjq5Rt JhRiNnL4XGbfjMzypDkDkSNBYnRLQTf+cm4g066n+YZij0mIDdIPTPzu8Wicl3MiHYPk wZabauZvPYZAPL/2MuoiGel5CCuxjW8VjjYtveu8ey7mcOgBaZvdOjMxWDBKb0maW0Im woHw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:cc:to:from; bh=jxlEHcEGe3kHekh99WP30Nd+yDOg4/Ym/8a55AV+thA=; b=QzsEk5fZun6nciloD0NZsSpfEf3sn3RKIsdv0Cn6cIp9lIbdDGR2/uhN+MSMyvkJIQ xkcrm8HYDfYWYOrxKENqPOpHVycn6vGKlPQmSoDfMecmmTQPZgl+m/bNr2LZ7GaUFUFR ymUWErkLItrn/aT+CpbAj32BH7L9/p43ogpJc5JKVJuKgQzfV+uUMhc66/1FEMdgZLU2 l4z9Bc7t7UQeaD5qrIYy9wgPCC6MXGfNsipTGFMn5a4oUiGmyfQTf7JSvmuysakZya7d Jj8SnnjynEBKwLotbHzDEssqIXnoMNiA4Fb/+eu00xOutSUqgLz1cw/X3SKTE+s2ZCbw Oxxg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from mx1.redhat.com (mx1.redhat.com. 
[209.132.183.28]) by mx.google.com with ESMTPS id p42si2997749qtc.174.2019.01.21.00.00.11 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 21 Jan 2019 00:00:11 -0800 (PST) Received-SPF: pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) client-ip=209.132.183.28; Authentication-Results: mx.google.com; spf=pass (google.com: domain of peterx@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id BC7C187620; Mon, 21 Jan 2019 08:00:10 +0000 (UTC) Received: from xz-x1.nay.redhat.com (dhcp-14-116.nay.redhat.com [10.66.14.116]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3EF28608C7; Mon, 21 Jan 2019 07:59:59 +0000 (UTC) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Hugh Dickins , Maya Gokhale , Jerome Glisse , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Denis Plotnikov , Shaohua Li , Andrea Arcangeli , Pavel Emelyanov , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . 
David Alan Gilbert" Subject: [PATCH RFC 20/24] userfaultfd: wp: don't wake up when doing write protect Date: Mon, 21 Jan 2019 15:57:18 +0800 Message-Id: <20190121075722.7945-21-peterx@redhat.com> In-Reply-To: <20190121075722.7945-1-peterx@redhat.com> References: <20190121075722.7945-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Mon, 21 Jan 2019 08:00:11 +0000 (UTC) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP It does not make sense to try to wake up any waiting thread when we're write-protecting a memory region. Only wake up when resolving a write protected page fault. Signed-off-by: Peter Xu --- fs/userfaultfd.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 455b87c0596f..e54ab6076e13 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1771,6 +1771,7 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx, struct uffdio_writeprotect uffdio_wp; struct uffdio_writeprotect __user *user_uffdio_wp; struct userfaultfd_wake_range range; + bool mode_wp, mode_dontwake; user_uffdio_wp = (struct uffdio_writeprotect __user *) arg; @@ -1786,17 +1787,19 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx, if (uffdio_wp.mode & ~(UFFDIO_WRITEPROTECT_MODE_DONTWAKE | UFFDIO_WRITEPROTECT_MODE_WP)) return -EINVAL; - if ((uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP) && - (uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) + + mode_wp = uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP; + mode_dontwake = uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE; + + if (mode_wp && mode_dontwake) return -EINVAL; ret = mwriteprotect_range(ctx->mm, uffdio_wp.range.start, - uffdio_wp.range.len, uffdio_wp.mode & - 
-				  UFFDIO_WRITEPROTECT_MODE_WP);
+				  uffdio_wp.range.len, mode_wp);
 	if (ret)
 		return ret;

-	if (!(uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)) {
+	if (!mode_wp && !mode_dontwake) {
 		range.start = uffdio_wp.range.start;
 		range.len = uffdio_wp.range.len;
 		wake_userfault(ctx, &range);

From patchwork Mon Jan 21 07:57:19 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772827
Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) by 
kanga.kvack.org
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org 
Cc: Hugh Dickins , Maya Gokhale , Jerome Glisse , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Denis Plotnikov , Shaohua Li , Andrea Arcangeli , Pavel Emelyanov , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . David Alan Gilbert"
Subject: [PATCH RFC 21/24] khugepaged: skip collapse if uffd-wp detected
Date: Mon, 21 Jan 2019 15:57:19 +0800
Message-Id: <20190121075722.7945-22-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

Don't collapse into a huge PMD if any of the small PTEs is userfault
write-protected.  The problem is that write protection is tracked at
small-page granularity, and there is no way to keep that per-page
information once the small pages are merged into a huge PMD.

The same consideration applies to swap entries and migration entries,
so do the check for those as well, disregarding
khugepaged_max_ptes_swap.
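The rule described above can be sketched as a small userspace mock.  This is only an illustration of the scan decision, not the kernel code: the flag bits and names here (MOCK_PTE_UFFD_WP, mock_scan_pmd) are hypothetical stand-ins for pte_uffd_wp() and the real pte_t encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, not the kernel's real pte_t bits. */
#define MOCK_PTE_PRESENT  0x1UL
#define MOCK_PTE_UFFD_WP  0x2UL  /* stand-in for pte_uffd_wp() */

enum mock_scan_result {
	MOCK_SCAN_SUCCEED,
	MOCK_SCAN_PTE_UFFD_WP,
};

/*
 * Refuse to collapse a range of small pages into one huge PMD if any
 * of them is userfault write-protected: a single huge PMD cannot carry
 * per-small-page write-protect state, so collapsing would lose it.
 */
static enum mock_scan_result mock_scan_pmd(const unsigned long *ptes,
					   size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (ptes[i] & MOCK_PTE_UFFD_WP)
			return MOCK_SCAN_PTE_UFFD_WP;
	}
	return MOCK_SCAN_SUCCEED;
}
```

One armed PTE anywhere in the range is enough to abort the collapse, which mirrors how the patch bails out with SCAN_PTE_UFFD_WP instead of trying to propagate the bit to the new huge PMD.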
Signed-off-by: Peter Xu 
---
 include/trace/events/huge_memory.h |  1 +
 mm/khugepaged.c                    | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index dd4db334bd63..2d7bad9cb976 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -13,6 +13,7 @@
 	EM( SCAN_PMD_NULL,		"pmd_null")			\
 	EM( SCAN_EXCEED_NONE_PTE,	"exceed_none_pte")		\
 	EM( SCAN_PTE_NON_PRESENT,	"pte_non_present")		\
+	EM( SCAN_PTE_UFFD_WP,		"pte_uffd_wp")			\
 	EM( SCAN_PAGE_RO,		"no_writable_page")		\
 	EM( SCAN_LACK_REFERENCED_PAGE,	"lack_referenced_page")		\
 	EM( SCAN_PAGE_NULL,		"page_null")			\
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 8e2ff195ecb3..92f06e1c941e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -29,6 +29,7 @@ enum scan_result {
 	SCAN_PMD_NULL,
 	SCAN_EXCEED_NONE_PTE,
 	SCAN_PTE_NON_PRESENT,
+	SCAN_PTE_UFFD_WP,
 	SCAN_PAGE_RO,
 	SCAN_LACK_REFERENCED_PAGE,
 	SCAN_PAGE_NULL,
@@ -1125,6 +1126,15 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 		pte_t pteval = *_pte;
 		if (is_swap_pte(pteval)) {
 			if (++unmapped <= khugepaged_max_ptes_swap) {
+				/*
+				 * Always be strict with uffd-wp
+				 * enabled swap entries.  Please see
+				 * comment below for pte_uffd_wp().
+				 */
+				if (pte_swp_uffd_wp(pteval)) {
+					result = SCAN_PTE_UFFD_WP;
+					goto out_unmap;
+				}
 				continue;
 			} else {
 				result = SCAN_EXCEED_SWAP_PTE;
@@ -1144,6 +1154,19 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 			result = SCAN_PTE_NON_PRESENT;
 			goto out_unmap;
 		}
+		if (pte_uffd_wp(pteval)) {
+			/*
+			 * Don't collapse the page if any of the small
+			 * PTEs are armed with uffd write protection.
+			 * Here we can also mark the new huge pmd as
+			 * write protected if any of the small ones is
+			 * marked but that could bring unknown
+			 * userfault messages that fall outside of
+			 * the registered range.  So, just be simple.
+			 */
+			result = SCAN_PTE_UFFD_WP;
+			goto out_unmap;
+		}
 		if (pte_write(pteval))
 			writable = true;

From patchwork Mon Jan 21 07:57:20 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772829
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Hugh Dickins , Maya Gokhale , Jerome Glisse , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Denis Plotnikov , Shaohua Li , Andrea Arcangeli , Pavel 
Emelyanov , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . David Alan Gilbert"
Subject: [PATCH RFC 22/24] userfaultfd: wp: UFFDIO_REGISTER_MODE_WP documentation update
Date: Mon, 21 Jan 2019 15:57:20 +0800
Message-Id: <20190121075722.7945-23-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

From: Martin Cracauer 

Add documentation about the write protection support.

Signed-off-by: Andrea Arcangeli 
[peterx: rewrite in rst format; fixups here and there]
Signed-off-by: Peter Xu 
---
 Documentation/admin-guide/mm/userfaultfd.rst | 51 ++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index 5048cf661a8a..c30176e67900 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -108,6 +108,57 @@
 UFFDIO_COPY. They're atomic as in guaranteeing that nothing can see an
 half copied page since it'll keep userfaulting until the copy has
 finished.

+Notes:
+
+- If you requested UFFDIO_REGISTER_MODE_MISSING when registering then
+  you must provide some kind of page in your thread after reading from
+  the uffd.  You must provide either UFFDIO_COPY or UFFDIO_ZEROPAGE.
+  The normal behavior of the OS automatically providing a zero page on
+  an anonymous mapping is not in place.
+
+- None of the page-delivering ioctls default to the range that you
+  registered with. 
You must fill in all fields for the appropriate
+  ioctl struct including the range.
+
+- You get the address of the access that triggered the missing page
+  event out of a struct uffd_msg that you read in the thread from the
+  uffd.  You can supply as many pages as you want with UFFDIO_COPY or
+  UFFDIO_ZEROPAGE.  Keep in mind that unless you used DONTWAKE then
+  the first of any of those IOCTLs wakes up the faulting thread.
+
+- Be sure to test for all errors including (pollfd[0].revents &
+  POLLERR).  This can happen, e.g. when ranges supplied were
+  incorrect.
+
+Write Protect Notifications
+---------------------------
+
+This is equivalent to (but faster than) using mprotect and a SIGSEGV
+signal handler.
+
+Firstly you need to register a range with UFFDIO_REGISTER_MODE_WP.
+Instead of using mprotect(2) you use ioctl(uffd, UFFDIO_WRITEPROTECT,
+struct *uffdio_writeprotect) while mode = UFFDIO_WRITEPROTECT_MODE_WP
+in the struct passed in.  The range does not default to and does not
+have to be identical to the range you registered with.  You can write
+protect as many ranges as you like (inside the registered range).
+Then, in the thread reading from uffd the struct will have
+msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP set.  Now you send
+ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect) again
+while pagefault.mode does not have UFFDIO_WRITEPROTECT_MODE_WP set.
+This wakes up the thread which will continue to run with writes.  This
+allows you to do the bookkeeping about the write in the uffd reading
+thread before the ioctl.
+
+If you registered with both UFFDIO_REGISTER_MODE_MISSING and
+UFFDIO_REGISTER_MODE_WP then you need to think about the sequence in
+which you supply a page and undo write protect.  Note that there is a
+difference between writes into a WP area and into a !WP area.  The
+former will have UFFD_PAGEFAULT_FLAG_WP set, the latter
+UFFD_PAGEFAULT_FLAG_WRITE. 
The latter did not fail on protection but
+you still need to supply a page when UFFDIO_REGISTER_MODE_MISSING was
+used.
+
 QEMU/KVM
 ========

From patchwork Mon Jan 21 07:57:21 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772831
From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Hugh Dickins , Maya Gokhale , Jerome Glisse , Johannes Weiner , peterx@redhat.com, Martin Cracauer , Denis Plotnikov , Shaohua Li , Andrea Arcangeli , Pavel 
Emelyanov , Mike Kravetz , Marty McFadden , Mike Rapoport , Mel Gorman , "Kirill A . Shutemov" , "Dr . David Alan Gilbert"
Subject: [PATCH RFC 23/24] userfaultfd: selftests: refactor statistics
Date: Mon, 21 Jan 2019 15:57:21 +0800
Message-Id: <20190121075722.7945-24-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

Introduce a uffd_stats structure for the self-test statistics, and at
the same time refactor the code to always pass a uffd_stats pointer to
both the read()- and poll()-style fault handling threads, instead of
using two different ways to return the results.  No functional change.

With the new structure it is easy to introduce new statistics.
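The pattern described above can be sketched in isolation: instead of each thread smuggling its counter back through pthread_join()'s void * result, every thread gets a pointer to its own slot in a shared per-cpu stats array. This is only an illustrative sketch; the names (demo_stats, demo_fault_thread) are hypothetical, not the selftest's real ones.

```c
#include <pthread.h>

#define NR_DEMO_THREADS 4

/* Per-thread statistics slot, analogous to struct uffd_stats. */
struct demo_stats {
	int cpu;
	unsigned long missing_faults;
};

/* Each thread records its results in its own slot; no return value. */
static void *demo_fault_thread(void *arg)
{
	struct demo_stats *stats = arg;

	/* Pretend this "cpu" handled (cpu + 1) missing faults. */
	stats->missing_faults += stats->cpu + 1;
	return NULL;
}

/* Spawn the threads, join them, and sum the per-slot counters. */
static unsigned long demo_run(void)
{
	pthread_t threads[NR_DEMO_THREADS];
	struct demo_stats stats[NR_DEMO_THREADS];
	unsigned long total = 0;
	int i;

	for (i = 0; i < NR_DEMO_THREADS; i++) {
		stats[i].cpu = i;
		stats[i].missing_faults = 0;
		if (pthread_create(&threads[i], NULL,
				   demo_fault_thread, &stats[i]))
			return 0;
	}
	for (i = 0; i < NR_DEMO_THREADS; i++) {
		pthread_join(threads[i], NULL);
		total += stats[i].missing_faults;
	}
	return total;
}
```

Because both thread styles write into the same structure, the caller aggregates results one way regardless of how the thread was driven, which is the point of the refactor.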
Signed-off-by: Peter Xu 
---
 tools/testing/selftests/vm/userfaultfd.c | 76 +++++++++++++++---------
 1 file changed, 49 insertions(+), 27 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 5d1db824f73a..e5d12c209e09 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -88,6 +88,12 @@
 static char *area_src, *area_src_alias, *area_dst, *area_dst_alias;
 static char *zeropage;
 pthread_attr_t attr;

+/* Userfaultfd test statistics */
+struct uffd_stats {
+	int cpu;
+	unsigned long missing_faults;
+};
+
 /* pthread_mutex_t starts at page offset 0 */
 #define area_mutex(___area, ___nr)					\
 	((pthread_mutex_t *) ((___area) + (___nr)*page_size))
@@ -127,6 +133,17 @@ static void usage(void)
 	exit(1);
 }

+static void uffd_stats_reset(struct uffd_stats *uffd_stats,
+			     unsigned long n_cpus)
+{
+	int i;
+
+	for (i = 0; i < n_cpus; i++) {
+		uffd_stats[i].cpu = i;
+		uffd_stats[i].missing_faults = 0;
+	}
+}
+
 static int anon_release_pages(char *rel_area)
 {
 	int ret = 0;
@@ -469,8 +486,8 @@ static int uffd_read_msg(int ufd, struct uffd_msg *msg)
 	return 0;
 }

-/* Return 1 if page fault handled by us; otherwise 0 */
-static int uffd_handle_page_fault(struct uffd_msg *msg)
+static void uffd_handle_page_fault(struct uffd_msg *msg,
+				   struct uffd_stats *stats)
 {
 	unsigned long offset;

@@ -485,18 +502,19 @@
 	offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
 	offset &= ~(page_size-1);

-	return copy_page(uffd, offset);
+	if (copy_page(uffd, offset))
+		stats->missing_faults++;
 }

 static void *uffd_poll_thread(void *arg)
 {
-	unsigned long cpu = (unsigned long) arg;
+	struct uffd_stats *stats = (struct uffd_stats *)arg;
+	unsigned long cpu = stats->cpu;
 	struct pollfd pollfd[2];
 	struct uffd_msg msg;
 	struct uffdio_register uffd_reg;
 	int ret;
 	char tmp_chr;
-	unsigned long userfaults = 0;

 	pollfd[0].fd = uffd;
 	pollfd[0].events = POLLIN;
@@ -526,7 +544,7 @@ static void *uffd_poll_thread(void *arg)
 				msg.event), exit(1);
 			break;
 		case UFFD_EVENT_PAGEFAULT:
-			userfaults += uffd_handle_page_fault(&msg);
+			uffd_handle_page_fault(&msg, stats);
 			break;
 		case UFFD_EVENT_FORK:
 			close(uffd);
@@ -545,28 +563,27 @@
 			break;
 		}
 	}
-	return (void *)userfaults;
+
+	return NULL;
 }

 pthread_mutex_t uffd_read_mutex = PTHREAD_MUTEX_INITIALIZER;

 static void *uffd_read_thread(void *arg)
 {
-	unsigned long *this_cpu_userfaults;
+	struct uffd_stats *stats = (struct uffd_stats *)arg;
 	struct uffd_msg msg;

-	this_cpu_userfaults = (unsigned long *) arg;
-	*this_cpu_userfaults = 0;
-
 	pthread_mutex_unlock(&uffd_read_mutex);
 	/* from here cancellation is ok */

 	for (;;) {
 		if (uffd_read_msg(uffd, &msg))
 			continue;
-		(*this_cpu_userfaults) += uffd_handle_page_fault(&msg);
+		uffd_handle_page_fault(&msg, stats);
 	}
-	return (void *)NULL;
+
+	return NULL;
 }

 static void *background_thread(void *arg)
@@ -582,13 +599,12 @@ static void *background_thread(void *arg)
 	return NULL;
 }

-static int stress(unsigned long *userfaults)
+static int stress(struct uffd_stats *uffd_stats)
 {
 	unsigned long cpu;
 	pthread_t locking_threads[nr_cpus];
 	pthread_t uffd_threads[nr_cpus];
 	pthread_t background_threads[nr_cpus];
-	void **_userfaults = (void **) userfaults;

 	finished = 0;
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
@@ -597,12 +613,13 @@ static int stress(unsigned long *userfaults)
 			return 1;
 		if (bounces & BOUNCE_POLL) {
 			if (pthread_create(&uffd_threads[cpu], &attr,
-					   uffd_poll_thread, (void *)cpu))
+					   uffd_poll_thread,
+					   (void *)&uffd_stats[cpu]))
 				return 1;
 		} else {
 			if (pthread_create(&uffd_threads[cpu], &attr,
 					   uffd_read_thread,
-					   &_userfaults[cpu]))
+					   (void *)&uffd_stats[cpu]))
 				return 1;
 			pthread_mutex_lock(&uffd_read_mutex);
 		}
@@ -639,7 +656,8 @@ static int stress(unsigned long *userfaults)
 				fprintf(stderr, "pipefd write error\n");
 				return 1;
 			}
-			if (pthread_join(uffd_threads[cpu], &_userfaults[cpu]))
+			if (pthread_join(uffd_threads[cpu],
+					 (void *)&uffd_stats[cpu]))
 				return 1;
 		} else {
 			if (pthread_cancel(uffd_threads[cpu]))
@@ -910,11 +928,11 @@ static int userfaultfd_events_test(void)
 {
 	struct uffdio_register uffdio_register;
 	unsigned long expected_ioctls;
-	unsigned long userfaults;
 	pthread_t uffd_mon;
 	int err, features;
 	pid_t pid;
 	char c;
+	struct uffd_stats stats = { 0 };

 	printf("testing events (fork, remap, remove): ");
 	fflush(stdout);
@@ -941,7 +959,7 @@ static int userfaultfd_events_test(void)
 		"unexpected missing ioctl for anon memory\n"),
 		exit(1);

-	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL))
+	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats))
 		perror("uffd_poll_thread create"), exit(1);

 	pid = fork();
@@ -957,13 +975,13 @@ static int userfaultfd_events_test(void)
 	if (write(pipefd[1], &c, sizeof(c)) != sizeof(c))
 		perror("pipe write"), exit(1);
-	if (pthread_join(uffd_mon, (void **)&userfaults))
+	if (pthread_join(uffd_mon, NULL))
 		return 1;

 	close(uffd);
-	printf("userfaults: %ld\n", userfaults);
+	printf("userfaults: %ld\n", stats.missing_faults);

-	return userfaults != nr_pages;
+	return stats.missing_faults != nr_pages;
 }

 static int userfaultfd_sig_test(void)
@@ -975,6 +993,7 @@ static int userfaultfd_sig_test(void)
 	int err, features;
 	pid_t pid;
 	char c;
+	struct uffd_stats stats = { 0 };

 	printf("testing signal delivery: ");
 	fflush(stdout);
@@ -1006,7 +1025,7 @@ static int userfaultfd_sig_test(void)
 	if (uffd_test_ops->release_pages(area_dst))
 		return 1;

-	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL))
+	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats))
 		perror("uffd_poll_thread create"), exit(1);

 	pid = fork();
@@ -1032,6 +1051,7 @@ static int userfaultfd_sig_test(void)
 	close(uffd);
 	return userfaults != 0;
 }
+
 static int userfaultfd_stress(void)
 {
 	void *area;
@@ -1040,7 +1060,7 @@ static int userfaultfd_stress(void)
 	struct uffdio_register uffdio_register;
 	unsigned long cpu;
 	int err;
-	unsigned long userfaults[nr_cpus];
+	struct uffd_stats uffd_stats[nr_cpus];

 	uffd_test_ops->allocate_area((void **)&area_src);
 	if (!area_src)
@@ -1169,8 +1189,10 @@ static int userfaultfd_stress(void)
 	if (uffd_test_ops->release_pages(area_dst))
 		return 1;

+	uffd_stats_reset(uffd_stats, nr_cpus);
+
 	/* bounce pass */
-	if (stress(userfaults))
+	if (stress(uffd_stats))
 		return 1;

 	/* unregister */
@@ -1213,7 +1235,7 @@ static int userfaultfd_stress(void)
 		printf("userfaults:");
 		for (cpu = 0; cpu < nr_cpus; cpu++)
-			printf(" %lu", userfaults[cpu]);
+			printf(" %lu", uffd_stats[cpu].missing_faults);
 		printf("\n");
 	}

From patchwork Mon Jan 21 07:57:22 2019
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772835
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner, peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li, Andrea Arcangeli, Pavel Emelyanov, Mike Kravetz, Marty McFadden, Mike Rapoport, Mel Gorman, "Kirill A. Shutemov", "Dr. David Alan Gilbert"
Subject: [PATCH RFC 24/24] userfaultfd: selftests: add write-protect test
Date: Mon, 21 Jan 2019 15:57:22 +0800
Message-Id: <20190121075722.7945-25-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

This patch adds uffd tests for write protection. Instead of introducing new tests, simply squash the uffd-wp tests into the existing uffd-missing test cases. The changes are:

(1) Bouncing tests

We do the write protection in two ways during the bouncing test:

- By using UFFDIO_COPY_MODE_WP when resolving MISSING pages: this makes sure that during each bounce every single page will fault at least twice: once for MISSING and once for WP.

- By calling UFFDIO_WRITEPROTECT directly on already-faulted memory: to further torture the explicit page protection procedures of uffd-wp, we split each bounce procedure into two halves (in the background thread). The first half is MISSING+WP for each page, as explained above. After the first half, we write-protect the faulted region in the background thread, so at least half of the pages will be write protected again; this exercises the new UFFDIO_WRITEPROTECT call. Then we continue with the second half, which will see both MISSING and WP faults for the second half of the pages, plus WP-only faults from the first half.

(2) Event/Signal test

Mostly the previous tests, but now doing MISSING+WP for each page. The sigbus-mode test needs a standalone path to handle the write protection faults.
For all tests, collect statistics for uffd-wp pages as well.

Signed-off-by: Peter Xu
---
 tools/testing/selftests/vm/userfaultfd.c | 154 ++++++++++++++++++-----
 1 file changed, 126 insertions(+), 28 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index e5d12c209e09..57b5ac02080a 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -56,6 +56,7 @@
 #include
 #include
 #include
+#include
 
 #include "../kselftest.h"
@@ -78,6 +79,8 @@ static int test_type;
 #define ALARM_INTERVAL_SECS 10
 static volatile bool test_uffdio_copy_eexist = true;
 static volatile bool test_uffdio_zeropage_eexist = true;
+/* Whether to test uffd write-protection */
+static bool test_uffdio_wp = false;
 
 static bool map_shared;
 static int huge_fd;
@@ -92,6 +95,7 @@ pthread_attr_t attr;
 struct uffd_stats {
 	int cpu;
 	unsigned long missing_faults;
+	unsigned long wp_faults;
 };
 
 /* pthread_mutex_t starts at page offset 0 */
@@ -141,9 +145,29 @@ static void uffd_stats_reset(struct uffd_stats *uffd_stats,
 	for (i = 0; i < n_cpus; i++) {
 		uffd_stats[i].cpu = i;
 		uffd_stats[i].missing_faults = 0;
+		uffd_stats[i].wp_faults = 0;
 	}
 }
 
+static void uffd_stats_report(struct uffd_stats *stats, int n_cpus)
+{
+	int i;
+	unsigned long long miss_total = 0, wp_total = 0;
+
+	for (i = 0; i < n_cpus; i++) {
+		miss_total += stats[i].missing_faults;
+		wp_total += stats[i].wp_faults;
+	}
+
+	printf("userfaults: %llu missing (", miss_total);
+	for (i = 0; i < n_cpus; i++)
+		printf("%lu+", stats[i].missing_faults);
+	printf("\b), %llu wp (", wp_total);
+	for (i = 0; i < n_cpus; i++)
+		printf("%lu+", stats[i].wp_faults);
+	printf("\b)\n");
+}
+
 static int anon_release_pages(char *rel_area)
 {
 	int ret = 0;
@@ -264,19 +288,15 @@ struct uffd_test_ops {
 	void (*alias_mapping)(__u64 *start, size_t len, unsigned long offset);
 };
 
-#define ANON_EXPECTED_IOCTLS		((1 << _UFFDIO_WAKE) | \
-					 (1 << _UFFDIO_COPY) | \
-					 (1 << _UFFDIO_ZEROPAGE))
-
 static struct uffd_test_ops anon_uffd_test_ops = {
-	.expected_ioctls = ANON_EXPECTED_IOCTLS,
+	.expected_ioctls = UFFD_API_RANGE_IOCTLS,
 	.allocate_area = anon_allocate_area,
 	.release_pages = anon_release_pages,
 	.alias_mapping = noop_alias_mapping,
 };
 
 static struct uffd_test_ops shmem_uffd_test_ops = {
-	.expected_ioctls = ANON_EXPECTED_IOCTLS,
+	.expected_ioctls = UFFD_API_RANGE_IOCTLS,
 	.allocate_area = shmem_allocate_area,
 	.release_pages = shmem_release_pages,
 	.alias_mapping = noop_alias_mapping,
@@ -300,6 +320,21 @@ static int my_bcmp(char *str1, char *str2, size_t n)
 	return 0;
 }
 
+static void wp_range(int ufd, __u64 start, __u64 len, bool wp)
+{
+	struct uffdio_writeprotect prms = { 0 };
+
+	/* Write protection page faults */
+	prms.range.start = start;
+	prms.range.len = len;
+	/* Undo write-protect, do wakeup after that */
+	prms.mode = wp ? UFFDIO_WRITEPROTECT_MODE_WP : 0;
+
+	if (ioctl(ufd, UFFDIO_WRITEPROTECT, &prms))
+		fprintf(stderr, "clear WP failed for address 0x%Lx\n",
+			start), exit(1);
+}
+
 static void *locking_thread(void *arg)
 {
 	unsigned long cpu = (unsigned long) arg;
@@ -438,7 +473,10 @@ static int __copy_page(int ufd, unsigned long offset, bool retry)
 	uffdio_copy.dst = (unsigned long) area_dst + offset;
 	uffdio_copy.src = (unsigned long) area_src + offset;
 	uffdio_copy.len = page_size;
-	uffdio_copy.mode = 0;
+	if (test_uffdio_wp)
+		uffdio_copy.mode = UFFDIO_COPY_MODE_WP;
+	else
+		uffdio_copy.mode = 0;
 	uffdio_copy.copy = 0;
 	if (ioctl(ufd, UFFDIO_COPY, &uffdio_copy)) {
 		/* real retval in ufdio_copy.copy */
@@ -495,15 +533,21 @@ static void uffd_handle_page_fault(struct uffd_msg *msg,
 		fprintf(stderr, "unexpected msg event %u\n",
 			msg->event), exit(1);
 
-	if (bounces & BOUNCE_VERIFY &&
-	    msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
-		fprintf(stderr, "unexpected write fault\n"), exit(1);
+	if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP) {
+		wp_range(uffd, msg->arg.pagefault.address, page_size, false);
+		stats->wp_faults++;
+	} else {
+		/* Missing page faults */
+		if (bounces & BOUNCE_VERIFY &&
+		    msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
+			fprintf(stderr, "unexpected write fault\n"), exit(1);
 
-	offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
-	offset &= ~(page_size-1);
+		offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
+		offset &= ~(page_size-1);
 
-	if (copy_page(uffd, offset))
-		stats->missing_faults++;
+		if (copy_page(uffd, offset))
+			stats->missing_faults++;
+	}
 }
 
 static void *uffd_poll_thread(void *arg)
@@ -589,11 +633,30 @@ static void *uffd_read_thread(void *arg)
 static void *background_thread(void *arg)
 {
 	unsigned long cpu = (unsigned long) arg;
-	unsigned long page_nr;
+	unsigned long page_nr, start_nr, mid_nr, end_nr;
 
-	for (page_nr = cpu * nr_pages_per_cpu;
-	     page_nr < (cpu+1) * nr_pages_per_cpu;
-	     page_nr++)
+	start_nr = cpu * nr_pages_per_cpu;
+	end_nr = (cpu+1) * nr_pages_per_cpu;
+	mid_nr = (start_nr + end_nr) / 2;
+
+	/* Copy the first half of the pages */
+	for (page_nr = start_nr; page_nr < mid_nr; page_nr++)
+		copy_page_retry(uffd, page_nr * page_size);
+
+	/*
+	 * If we need to test uffd-wp, set it up now.  Then we'll have
+	 * at least the first half of the pages mapped already which
+	 * can be write-protected for testing
+	 */
+	if (test_uffdio_wp)
+		wp_range(uffd, (unsigned long)area_dst + start_nr * page_size,
+			nr_pages_per_cpu * page_size, true);
+
+	/*
+	 * Continue the 2nd half of the page copying, handling write
+	 * protection faults if any
+	 */
+	for (page_nr = mid_nr; page_nr < end_nr; page_nr++)
 		copy_page_retry(uffd, page_nr * page_size);
 
 	return NULL;
@@ -755,17 +818,31 @@ static int faulting_process(int signal_test)
 	}
 
 	for (nr = 0; nr < split_nr_pages; nr++) {
+		int steps = 1;
+		unsigned long offset = nr * page_size;
+
 		if (signal_test) {
 			if (sigsetjmp(*sigbuf, 1) != 0) {
-				if (nr == lastnr) {
+				if (steps == 1 && nr == lastnr) {
 					fprintf(stderr, "Signal repeated\n");
 					return 1;
 				}
 
 				lastnr = nr;
 				if (signal_test == 1) {
-					if (copy_page(uffd, nr * page_size))
-						signalled++;
+					if (steps == 1) {
+						/* This is a MISSING request */
+						steps++;
+						if (copy_page(uffd, offset))
+							signalled++;
+					} else {
+						/* This is a WP request */
+						assert(steps == 2);
+						wp_range(uffd,
							 (__u64)area_dst +
							 offset,
							 page_size, false);
+					}
 				} else {
 					signalled++;
 					continue;
@@ -778,8 +855,13 @@ static int faulting_process(int signal_test)
 				fprintf(stderr,
					"nr %lu memory corruption %Lu %Lu\n",
					nr, count,
-					count_verify[nr]), exit(1);
-			}
+					count_verify[nr]);
+			}
+			/*
+			 * Trigger a write protection fault, if there is
+			 * one, by writing the same value back.
+			 */
+			*area_count(area_dst, nr) = count;
 		}
 
 		if (signal_test)
@@ -801,6 +883,11 @@ static int faulting_process(int signal_test)
				nr, count,
				count_verify[nr]), exit(1);
 		}
+		/*
+		 * Trigger a write protection fault, if there is
+		 * one, by writing the same value back.
+		 */
+		*area_count(area_dst, nr) = count;
 	}
 
 	if (uffd_test_ops->release_pages(area_dst))
@@ -949,6 +1036,8 @@ static int userfaultfd_events_test(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		fprintf(stderr, "register failure\n"), exit(1);
@@ -979,7 +1068,8 @@ static int userfaultfd_events_test(void)
 		return 1;
 
 	close(uffd);
-	printf("userfaults: %ld\n", stats.missing_faults);
+
+	uffd_stats_report(&stats, 1);
 
 	return stats.missing_faults != nr_pages;
 }
@@ -1009,6 +1099,8 @@ static int userfaultfd_sig_test(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		fprintf(stderr, "register failure\n"), exit(1);
@@ -1141,6 +1233,8 @@ static int userfaultfd_stress(void)
 	uffdio_register.range.start = (unsigned long) area_dst;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) {
 		fprintf(stderr, "register failure\n");
 		return 1;
@@ -1195,6 +1289,11 @@ static int userfaultfd_stress(void)
 		if (stress(uffd_stats))
 			return 1;
 
+		/* Clear all the write protections, if any */
+		if (test_uffdio_wp)
+			wp_range(uffd, (unsigned long)area_dst,
				 nr_pages * page_size, false);
+
 		/* unregister */
 		if (ioctl(uffd, UFFDIO_UNREGISTER, &uffdio_register.range)) {
 			fprintf(stderr, "unregister failure\n");
@@ -1233,10 +1332,7 @@ static int userfaultfd_stress(void)
 		area_src_alias = area_dst_alias;
 		area_dst_alias = tmp_area;
 
-		printf("userfaults:");
-		for (cpu = 0; cpu < nr_cpus; cpu++)
-			printf(" %lu", uffd_stats[cpu].missing_faults);
-		printf("\n");
+		uffd_stats_report(uffd_stats, nr_cpus);
 	}
 
 	if (err)
@@ -1276,6 +1372,8 @@ static void set_test_type(const char *type)
 	if (!strcmp(type, "anon")) {
 		test_type = TEST_ANON;
 		uffd_test_ops = &anon_uffd_test_ops;
+		/* Only enable write-protect test for anonymous test */
+		test_uffdio_wp = true;
 	} else if (!strcmp(type, "hugetlb")) {
 		test_type = TEST_HUGETLB;
 		uffd_test_ops = &hugetlb_uffd_test_ops;