From patchwork Wed Nov 7 06:06:40 2018
X-Patchwork-Submitter: Peter Xu <peterx@redhat.com>
X-Patchwork-Id: 10671873
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: Keith Busch, Linus Torvalds, peterx@redhat.com, Dan Williams,
 linux-mm@kvack.org, Matthew Wilcox, Al Viro, Andrea Arcangeli,
 Huang Ying, Mike Kravetz, Mike Rapoport, Jerome Glisse,
 "Michael S. Tsirkin", "Kirill A. Shutemov", Michal Hocko,
 Vlastimil Babka, Pavel Tatashin, Andrew Morton
Subject: [PATCH RFC v2 1/4] mm: gup: rename "nonblocking" to "locked" where proper
Date: Wed, 7 Nov 2018 14:06:40 +0800
Message-Id: <20181107060643.10950-2-peterx@redhat.com>
In-Reply-To: <20181107060643.10950-1-peterx@redhat.com>
References: <20181107060643.10950-1-peterx@redhat.com>

There are plenty of places around __get_user_pages() with a parameter
named "nonblocking" which does not really mean "it won't block"
(because it can really block); instead, it indicates whether the
mmap_sem is released by up_read() during the page fault handling,
mostly when VM_FAULT_RETRY is returned.  We already have the correct
name, "locked", in e.g. get_user_pages_locked() and
get_user_pages_remote(), but many places still use "nonblocking".
Rename those to "locked" where proper, to better suit the
functionality of the variable.  While at it, fix up some of the
comments accordingly.
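To illustrate the convention the rename codifies, a minimal caller-side
sketch follows (illustrative only: pin_one_page_sketch() is not part of
this patch, and __get_user_pages() is static to mm/gup.c, so this only
mirrors what internal callers such as get_user_pages_locked() do):

	/*
	 * Sketch of the "locked" contract (not part of the patch): the
	 * caller takes mmap_sem for read and passes &locked; if the fault
	 * path had to drop the lock (VM_FAULT_RETRY), *locked is cleared
	 * so the caller knows not to unlock it again.
	 */
	static long pin_one_page_sketch(struct task_struct *tsk,
					struct mm_struct *mm,
					unsigned long addr,
					struct page **page)
	{
		int locked = 1;
		long ret;

		down_read(&mm->mmap_sem);
		ret = __get_user_pages(tsk, mm, addr, 1, FOLL_TOUCH,
				       page, NULL, &locked);
		if (locked)
			up_read(&mm->mmap_sem);
		/* else: already released via up_read() in the fault path */
		return ret;
	}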
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/gup.c     | 44 +++++++++++++++++++++-----------------------
 mm/hugetlb.c |  8 ++++----
 2 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 841d7ef53591..6faff46cd409 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -500,12 +500,12 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
 }
 
 /*
- * mmap_sem must be held on entry.  If @nonblocking != NULL and
- * *@flags does not include FOLL_NOWAIT, the mmap_sem may be released.
- * If it is, *@nonblocking will be set to 0 and -EBUSY returned.
+ * mmap_sem must be held on entry.  If @locked != NULL and *@flags
+ * does not include FOLL_NOWAIT, the mmap_sem may be released.  If it
+ * is, *@locked will be set to 0 and -EBUSY returned.
  */
 static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
-		unsigned long address, unsigned int *flags, int *nonblocking)
+		unsigned long address, unsigned int *flags, int *locked)
 {
 	unsigned int fault_flags = 0;
 	vm_fault_t ret;
@@ -517,7 +517,7 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
 		fault_flags |= FAULT_FLAG_WRITE;
 	if (*flags & FOLL_REMOTE)
 		fault_flags |= FAULT_FLAG_REMOTE;
-	if (nonblocking)
+	if (locked)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY;
 	if (*flags & FOLL_NOWAIT)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;
@@ -543,8 +543,8 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
 	}
 
 	if (ret & VM_FAULT_RETRY) {
-		if (nonblocking && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
-			*nonblocking = 0;
+		if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
+			*locked = 0;
 		return -EBUSY;
 	}
 
@@ -621,7 +621,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
  *		only intends to ensure the pages are faulted in.
  * @vmas:	array of pointers to vmas corresponding to each page.
  *		Or NULL if the caller does not require them.
- * @nonblocking: whether waiting for disk IO or mmap_sem contention
+ * @locked:	whether we're still with the mmap_sem held
  *
  * Returns number of pages pinned. This may be fewer than the number
  * requested. If nr_pages is 0 or negative, returns 0. If no pages
@@ -650,13 +650,11 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
  * appropriate) must be called after the page is finished with, and
  * before put_page is called.
  *
- * If @nonblocking != NULL, __get_user_pages will not wait for disk IO
- * or mmap_sem contention, and if waiting is needed to pin all pages,
- * *@nonblocking will be set to 0.  Further, if @gup_flags does not
- * include FOLL_NOWAIT, the mmap_sem will be released via up_read() in
- * this case.
+ * If @locked != NULL, *@locked will be set to 0 when mmap_sem is
+ * released by an up_read().  That can happen if @gup_flags does not
+ * have FOLL_NOWAIT.
  *
- * A caller using such a combination of @nonblocking and @gup_flags
+ * A caller using such a combination of @locked and @gup_flags
  * must therefore hold the mmap_sem for reading only, and recognize
  * when it's been released.  Otherwise, it must be held for either
  * reading or writing and will not be released.
@@ -668,7 +666,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		unsigned long start, unsigned long nr_pages,
 		unsigned int gup_flags, struct page **pages,
-		struct vm_area_struct **vmas, int *nonblocking)
+		struct vm_area_struct **vmas, int *locked)
 {
 	long ret = 0, i = 0;
 	struct vm_area_struct *vma = NULL;
@@ -713,7 +711,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 			if (is_vm_hugetlb_page(vma)) {
 				i = follow_hugetlb_page(mm, vma, pages, vmas,
 						&start, &nr_pages, i,
-						gup_flags, nonblocking);
+						gup_flags, locked);
 				continue;
 			}
 		}
@@ -731,7 +729,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		page = follow_page_mask(vma, start, foll_flags, &ctx);
 		if (!page) {
 			ret = faultin_page(tsk, vma, start, &foll_flags,
-					nonblocking);
+					locked);
 			switch (ret) {
 			case 0:
 				goto retry;
@@ -1190,7 +1188,7 @@ EXPORT_SYMBOL(get_user_pages_longterm);
 * @vma:   target vma
 * @start: start address
 * @end:   end address
- * @nonblocking:
+ * @locked: whether the mmap_sem is still held
 *
 * This takes care of mlocking the pages too if VM_LOCKED is set.
 *
@@ -1198,14 +1196,14 @@ EXPORT_SYMBOL(get_user_pages_longterm);
 *
 * vma->vm_mm->mmap_sem must be held.
 *
- * If @nonblocking is NULL, it may be held for read or write and will
+ * If @locked is NULL, it may be held for read or write and will
 * be unperturbed.
 *
- * If @nonblocking is non-NULL, it must held for read only and may be
- * released.  If it's released, *@nonblocking will be set to 0.
+ * If @locked is non-NULL, it must be held for read only and may be
+ * released.  If it's released, *@locked will be set to 0.
 */
 long populate_vma_page_range(struct vm_area_struct *vma,
-		unsigned long start, unsigned long end, int *nonblocking)
+		unsigned long start, unsigned long end, int *locked)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long nr_pages = (end - start) / PAGE_SIZE;
@@ -1240,7 +1238,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
 	 * not result in a stack expansion that recurses back here.
 	 */
 	return __get_user_pages(current, mm, start, nr_pages, gup_flags,
-				NULL, NULL, nonblocking);
+				NULL, NULL, locked);
 }
 
 /*
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 7b5c0ad9a6bd..c700d4dfbbc3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4166,7 +4166,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			 struct page **pages, struct vm_area_struct **vmas,
 			 unsigned long *position, unsigned long *nr_pages,
-			 long i, unsigned int flags, int *nonblocking)
+			 long i, unsigned int flags, int *locked)
 {
 	unsigned long pfn_offset;
 	unsigned long vaddr = *position;
@@ -4237,7 +4237,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			spin_unlock(ptl);
 			if (flags & FOLL_WRITE)
 				fault_flags |= FAULT_FLAG_WRITE;
-			if (nonblocking)
+			if (locked)
 				fault_flags |= FAULT_FLAG_ALLOW_RETRY;
 			if (flags & FOLL_NOWAIT)
 				fault_flags |= FAULT_FLAG_ALLOW_RETRY |
@@ -4254,8 +4254,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 				break;
 			}
 			if (ret & VM_FAULT_RETRY) {
-				if (nonblocking)
-					*nonblocking = 0;
+				if (locked)
+					*locked = 0;
 				*nr_pages = 0;
 				/*
 				 * VM_FAULT_RETRY must not return an

From patchwork Wed Nov 7 06:06:41 2018
X-Patchwork-Submitter: Peter Xu <peterx@redhat.com>
X-Patchwork-Id: 10671875
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: Keith Busch, Linus Torvalds, peterx@redhat.com, Dan Williams,
 linux-mm@kvack.org, Matthew Wilcox, Al Viro, Andrea Arcangeli,
 Huang Ying, Mike Kravetz, Mike Rapoport, Jerome Glisse,
 "Michael S. Tsirkin", "Kirill A. Shutemov", Michal Hocko,
 Vlastimil Babka, Pavel Tatashin, Andrew Morton
Subject: [PATCH RFC v2 2/4] mm: userfault: return VM_FAULT_RETRY on signals
Date: Wed, 7 Nov 2018 14:06:41 +0800
Message-Id: <20181107060643.10950-3-peterx@redhat.com>
In-Reply-To: <20181107060643.10950-1-peterx@redhat.com>
References: <20181107060643.10950-1-peterx@redhat.com>

In the past, handle_userfault() had a special path that returned
VM_FAULT_NOPAGE when we detected non-fatal signals while waiting for
userfault handling.  We did that by reacquiring the mmap_sem before
returning.  However, that brings a risk: the vmas might have changed
when we retake the mmap_sem, and we could even be holding an invalid
vma structure.  The problem was reported by syzbot.

This patch removes the special path; we now return VM_FAULT_RETRY via
the common path even when such signals are detected.  Then, for all
the architectures that pass FAULT_FLAG_ALLOW_RETRY into
handle_mm_fault(), we check not only for SIGKILL but for all pending
userspace signals right after we return from handle_mm_fault().

The idea comes from the upstream discussion between Linus and Andrea:

  https://lkml.org/lkml/2017/10/30/560

(This patch contains a potential fix for a double-free of mmap_sem on
the ARC architecture; please see https://lkml.org/lkml/2018/11/1/723
for more information.)
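The per-architecture hunks below all follow one pattern; here is a
condensed, illustrative version of the fault-handler flow after this
patch (a sketch only: the `retry` and `no_context` labels are generic
names, not taken from any single architecture):

	fault = handle_mm_fault(vma, address, flags);

	if (fault & VM_FAULT_RETRY) {
		/*
		 * mmap_sem was already dropped by __lock_page_or_retry(),
		 * so no up_read() is needed on these paths.
		 */
		if (fatal_signal_pending(current) && !user_mode(regs))
			goto no_context;  /* fatal signal, kernel-mode fault */
		if (signal_pending(current))
			return;  /* let userspace handle the signal first */
		flags |= FAULT_FLAG_TRIED;
		goto retry;  /* a real handler retakes mmap_sem first */
	}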
Suggested-by: Linus Torvalds
Suggested-by: Andrea Arcangeli
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 arch/alpha/mm/fault.c      |  2 +-
 arch/arc/mm/fault.c        | 11 +++++++----
 arch/arm/mm/fault.c        | 14 ++++++++++----
 arch/arm64/mm/fault.c      |  6 +++---
 arch/hexagon/mm/vm_fault.c |  2 +-
 arch/ia64/mm/fault.c       |  2 +-
 arch/m68k/mm/fault.c       |  2 +-
 arch/microblaze/mm/fault.c |  2 +-
 arch/mips/mm/fault.c       |  2 +-
 arch/nds32/mm/fault.c      |  6 +++---
 arch/nios2/mm/fault.c      |  2 +-
 arch/openrisc/mm/fault.c   |  2 +-
 arch/parisc/mm/fault.c     |  2 +-
 arch/powerpc/mm/fault.c    |  4 +++-
 arch/riscv/mm/fault.c      |  4 ++--
 arch/s390/mm/fault.c       |  9 ++++++---
 arch/sh/mm/fault.c         |  4 ++++
 arch/sparc/mm/fault_32.c   |  3 +++
 arch/sparc/mm/fault_64.c   |  3 +++
 arch/um/kernel/trap.c      |  5 ++++-
 arch/unicore32/mm/fault.c  |  4 ++--
 arch/x86/mm/fault.c        | 12 +++++++++++-
 arch/xtensa/mm/fault.c     |  3 +++
 fs/userfaultfd.c           | 24 ------------------------
 24 files changed, 73 insertions(+), 57 deletions(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index d73dc473fbb9..46e5e420ad2a 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -150,7 +150,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
 	   the fault.  */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index c9da6102eb4f..52b6b71b74e2 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -142,11 +142,14 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	fault = handle_mm_fault(vma, address, flags);
 
 	/* If Pagefault was interrupted by SIGKILL, exit page fault "early" */
-	if (unlikely(fatal_signal_pending(current))) {
-		if ((fault & VM_FAULT_ERROR) && !(fault & VM_FAULT_RETRY))
+	if (unlikely(fatal_signal_pending(current) && user_mode(regs))) {
+		/*
+		 * VM_FAULT_RETRY means we have released the mmap_sem,
+		 * otherwise we need to drop it before leaving
+		 */
+		if (!(fault & VM_FAULT_RETRY))
 			up_read(&mm->mmap_sem);
-		if (user_mode(regs))
-			return;
+		return;
 	}
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index f4ea4c62c613..743077d19669 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -308,14 +308,20 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 
 	fault = __do_page_fault(mm, addr, fsr, flags, tsk);
 
-	/* If we need to retry but a fatal signal is pending, handle the
+	/* If we need to retry but a signal is pending, handle the
 	 * signal first. We do not need to release the mmap_sem because
 	 * it would already be released in __lock_page_or_retry in
 	 * mm/filemap.c. */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
-		if (!user_mode(regs))
+	if (fault & VM_FAULT_RETRY) {
+		if (fatal_signal_pending(current) && !user_mode(regs))
 			goto no_context;
-		return 0;
+		else if (signal_pending(current))
+			/*
+			 * It's either a common signal, or a fatal
+			 * signal but for the userspace, we return
+			 * immediately.
+			 */
+			return 0;
 	}
 
 	/*
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 7d9571f4ae3d..744d6451ea83 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -499,13 +499,13 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
 
 	if (fault & VM_FAULT_RETRY) {
 		/*
-		 * If we need to retry but a fatal signal is pending,
+		 * If we need to retry but a signal is pending,
 		 * handle the signal first. We do not need to release
 		 * the mmap_sem because it would already be released
 		 * in __lock_page_or_retry in mm/filemap.c.
 		 */
-		if (fatal_signal_pending(current)) {
-			if (!user_mode(regs))
+		if (signal_pending(current)) {
+			if (fatal_signal_pending(current) && !user_mode(regs))
 				goto no_context;
 			return 0;
 		}
diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c
index eb263e61daf4..be10b441d9cc 100644
--- a/arch/hexagon/mm/vm_fault.c
+++ b/arch/hexagon/mm/vm_fault.c
@@ -104,7 +104,7 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs)
 
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	/* The most common case -- we are done. */
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index 5baeb022f474..62c2d39d2bed 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -163,7 +163,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 	 */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index 9b6163c05a75..d9808a807ab8 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -138,7 +138,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	fault = handle_mm_fault(vma, address, flags);
 	pr_debug("handle_mm_fault returns %x\n", fault);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return 0;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c
index 202ad6a494f5..4fd2dbd0c5ca 100644
--- a/arch/microblaze/mm/fault.c
+++ b/arch/microblaze/mm/fault.c
@@ -217,7 +217,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long address,
 	 */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 73d8a0f0b810..92374fd091d2 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -154,7 +154,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write,
 	 */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c
index b740534b152c..72461745d3e1 100644
--- a/arch/nds32/mm/fault.c
+++ b/arch/nds32/mm/fault.c
@@ -207,12 +207,12 @@ void do_page_fault(unsigned long entry, unsigned long addr,
 	fault = handle_mm_fault(vma, addr, flags);
 
 	/*
-	 * If we need to retry but a fatal signal is pending, handle the
+	 * If we need to retry but a signal is pending, handle the
 	 * signal first. We do not need to release the mmap_sem because it
 	 * would already be released in __lock_page_or_retry in mm/filemap.c.
 	 */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
-		if (!user_mode(regs))
+	if (fault & VM_FAULT_RETRY && signal_pending(current)) {
+		if (fatal_signal_pending(current) && !user_mode(regs))
 			goto no_context;
 		return;
 	}
diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
index 24fd84cf6006..5939434a31ae 100644
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -134,7 +134,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
 	 */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c
index dc4dbafc1d83..873ecb5d82d7 100644
--- a/arch/openrisc/mm/fault.c
+++ b/arch/openrisc/mm/fault.c
@@ -165,7 +165,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
index c8e8b7c05558..29422eec329d 100644
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -303,7 +303,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
 
 	fault = handle_mm_fault(vma, address, flags);
 
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 1697e903bbf2..ef5b0cf0a902 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -575,8 +575,10 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 			 */
 			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
-			if (!fatal_signal_pending(current))
+			if (!signal_pending(current))
 				goto retry;
+			else if (!fatal_signal_pending(current) && is_user)
+				return;
 		}
 
 	/*
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index 88401d5125bc..4fc8d746bec3 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -123,11 +123,11 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
 	fault = handle_mm_fault(vma, addr, flags);
 
 	/*
-	 * If we need to retry but a fatal signal is pending, handle the
+	 * If we need to retry but a signal is pending, handle the
 	 * signal first. We do not need to release the mmap_sem because it
 	 * would already be released in __lock_page_or_retry in mm/filemap.c.
 	 */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(tsk))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(tsk))
 		return;
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 2b8f32f56e0c..19b4fb2fafab 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -500,9 +500,12 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
 	 * the fault.
 	 */
 	fault = handle_mm_fault(vma, address, flags);
-	/* No reason to continue if interrupted by SIGKILL. */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
-		fault = VM_FAULT_SIGNAL;
+	/* Do not continue if interrupted by signals. */
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current)) {
+		if (fatal_signal_pending(current))
+			fault = VM_FAULT_SIGNAL;
+		else
+			fault = 0;
 		if (flags & FAULT_FLAG_RETRY_NOWAIT)
 			goto out_up;
 		goto out;
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index 6defd2c6d9b1..baf5d73df40c 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -506,6 +506,10 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
 			 * have already released it in __lock_page_or_retry
 			 * in mm/filemap.c.
 			 */
+
+			if (user_mode(regs) && signal_pending(tsk))
+				return;
+
 			goto retry;
 		}
 	}
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index b0440b0edd97..a2c83104fe35 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -269,6 +269,9 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 			 * in mm/filemap.c.
 			 */
 
+			if (user_mode(regs) && signal_pending(tsk))
+				return;
+
 			goto retry;
 		}
 	}
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 8f8a604c1300..3345fa7234e2 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -467,6 +467,9 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 			 * in mm/filemap.c.
 			 */
 
+			if (user_mode(regs) && signal_pending(tsk))
+				return;
+
 			goto retry;
 		}
 	}
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index cced82946042..bf12032f703d 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -76,8 +76,11 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 
 		fault = handle_mm_fault(vma, address, flags);
 
-		if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+		if (fault & VM_FAULT_RETRY && signal_pending(current)) {
+			if (is_user && !fatal_signal_pending(current))
+				err = 0;
 			goto out_nosemaphore;
+		}
 
 		if (unlikely(fault & VM_FAULT_ERROR)) {
 			if (fault & VM_FAULT_OOM) {
diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c
index b9a3a50644c1..3611f19234a1 100644
--- a/arch/unicore32/mm/fault.c
+++ b/arch/unicore32/mm/fault.c
@@ -248,11 +248,11 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 
 	fault = __do_pf(mm, addr, fsr, flags, tsk);
 
-	/* If we need to retry but a fatal signal is pending, handle the
+	/* If we need to retry but a signal is pending, handle the
 	 * signal first. We do not need to release the mmap_sem because
 	 * it would already be released in __lock_page_or_retry in
 	 * mm/filemap.c. */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && signal_pending(current))
 		return 0;
 
 	if (!(fault & VM_FAULT_ERROR) && (flags & FAULT_FLAG_ALLOW_RETRY)) {
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index b24eb4eb9984..e05bdc2602d6 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1433,8 +1433,18 @@ void do_user_addr_fault(struct pt_regs *regs,
 		if (flags & FAULT_FLAG_ALLOW_RETRY) {
 			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
-			if (!fatal_signal_pending(tsk))
+			if (!signal_pending(tsk))
 				goto retry;
+			else if (!fatal_signal_pending(tsk))
+				/*
+				 * There is a signal for the task but
+				 * it's not fatal, let's return
+				 * directly to the userspace. This
+				 * gives chance for signals like
+				 * SIGSTOP/SIGCONT to be handled
+				 * faster, e.g., with GDB.
+				 */
+				return;
 		}
 
 	/* User mode? Just return to handle the fatal exception */
diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index 2ab0e0dcd166..792dad5e2f12 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -136,6 +136,9 @@ void do_page_fault(struct pt_regs *regs)
 			 * in mm/filemap.c.
 			 */
 
+			if (user_mode(regs) && signal_pending(current))
+				return;
+
 			goto retry;
 		}
 	}
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 356d2b8568c1..249c9211b2ed 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -515,30 +515,6 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 
 	__set_current_state(TASK_RUNNING);
 
-	if (return_to_userland) {
-		if (signal_pending(current) &&
-		    !fatal_signal_pending(current)) {
-			/*
-			 * If we got a SIGSTOP or SIGCONT and this is
-			 * a normal userland page fault, just let
-			 * userland return so the signal will be
-			 * handled and gdb debugging works. The page
-			 * fault code immediately after we return from
-			 * this function is going to release the
-			 * mmap_sem and it's not depending on it
-			 * (unlike gup would if we were not to return
-			 * VM_FAULT_RETRY).
-			 *
-			 * If a fatal signal is pending we still take
-			 * the streamlined VM_FAULT_RETRY failure path
-			 * and there's no need to retake the mmap_sem
-			 * in such case.
-			 */
-			down_read(&mm->mmap_sem);
-			ret = VM_FAULT_NOPAGE;
-		}
-	}
-
 	/*
 	 * Here we race with the list_del; list_add in
 	 * userfaultfd_ctx_read(), however because we don't ever run

From patchwork Wed Nov 7 06:06:42 2018
X-Patchwork-Submitter: Peter Xu <peterx@redhat.com>
X-Patchwork-Id: 10671877
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: Keith Busch, Linus Torvalds, peterx@redhat.com, Dan Williams,
 linux-mm@kvack.org, Matthew Wilcox, Al Viro, Andrea Arcangeli,
 Huang Ying, Mike Kravetz, Mike Rapoport, Jerome Glisse,
 "Michael S. Tsirkin", "Kirill A. Shutemov", Michal Hocko,
 Vlastimil Babka, Pavel Tatashin, Andrew Morton
Subject: [PATCH RFC v2 3/4] mm: allow VM_FAULT_RETRY for multiple times
Date: Wed, 7 Nov 2018 14:06:42 +0800
Message-Id: <20181107060643.10950-4-peterx@redhat.com>
In-Reply-To: <20181107060643.10950-1-peterx@redhat.com>
References: <20181107060643.10950-1-peterx@redhat.com>

The idea comes from a discussion between Linus and Andrea [1].

Before this patch we only allowed a page fault to retry once, by
clearing the FAULT_FLAG_ALLOW_RETRY flag when calling
handle_mm_fault() the second time.  This was mainly used to avoid
unexpected starvation of the system by looping forever to handle the
page fault on a single page.  That should hardly happen, though: for
each code path to return VM_FAULT_RETRY, we first wait for a condition
(during which time we should possibly yield the CPU) before
VM_FAULT_RETRY is really returned.

This patch removes the restriction by keeping the
FAULT_FLAG_ALLOW_RETRY flag set when we receive VM_FAULT_RETRY.  The
page fault handler can now retry the page fault multiple times if
necessary, without needing to generate another page fault event.
Meanwhile we still keep the FAULT_FLAG_TRIED flag, so the page fault
handler can still identify whether a page fault is the first attempt
or not.

The GUP code is not touched yet and will be covered in a follow-up
patch.

This is a nice enhancement for the current code and, at the same time,
supporting material for the future userfaultfd-writeprotect work: in
that work there will always be an explicit userfault write-protect
retry for protected pages, and if that cannot resolve the page fault
(e.g., when userfaultfd-writeprotect is used in conjunction with
shared memory) we will possibly need a third retry of the page fault.
It might also benefit other potential users with similar requirements,
like userfault write-protection.

Please read the thread below for more information.

[1] https://lkml.org/lkml/2017/11/2/833
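Concretely, the common arch-side pattern changes as sketched below
(an illustrative fragment, not a hunk from this series; the flag
values are the existing FAULT_FLAG_* bits from include/linux/mm.h):

	/* first attempt: retry allowed, not tried yet */
	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;

retry:
	fault = handle_mm_fault(vma, address, flags);
	if (fault & VM_FAULT_RETRY) {
		/*
		 * Previously: flags &= ~FAULT_FLAG_ALLOW_RETRY here, so a
		 * second VM_FAULT_RETRY was impossible.  Now ALLOW_RETRY
		 * stays set and only TRIED is added, so the fault can be
		 * retried any number of times.
		 */
		flags |= FAULT_FLAG_TRIED;
		goto retry;	/* after retaking mmap_sem */
	}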
Suggested-by: Linus Torvalds
Suggested-by: Andrea Arcangeli
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 arch/alpha/mm/fault.c      | 2 +-
 arch/arc/mm/fault.c        | 1 -
 arch/arm/mm/fault.c        | 3 ---
 arch/arm64/mm/fault.c      | 5 -----
 arch/hexagon/mm/vm_fault.c | 1 -
 arch/ia64/mm/fault.c       | 1 -
 arch/m68k/mm/fault.c       | 3 ---
 arch/microblaze/mm/fault.c | 1 -
 arch/mips/mm/fault.c       | 1 -
 arch/nds32/mm/fault.c      | 1 -
 arch/nios2/mm/fault.c      | 3 ---
 arch/openrisc/mm/fault.c   | 1 -
 arch/parisc/mm/fault.c     | 2 --
 arch/powerpc/mm/fault.c    | 5 -----
 arch/riscv/mm/fault.c      | 5 -----
 arch/s390/mm/fault.c       | 5 +----
 arch/sh/mm/fault.c         | 1 -
 arch/sparc/mm/fault_32.c   | 1 -
 arch/sparc/mm/fault_64.c   | 1 -
 arch/um/kernel/trap.c      | 1 -
 arch/unicore32/mm/fault.c  | 6 +-----
 arch/x86/mm/fault.c        | 1 -
 arch/xtensa/mm/fault.c     | 1 -
 23 files changed, 3 insertions(+), 49 deletions(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index 46e5e420ad2a..deae82bb83c1 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -169,7 +169,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
 		else
 			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
+			flags |= FAULT_FLAG_TRIED;
 
 			 /* No need to up_read(&mm->mmap_sem) as we would
 			 * have already released it in __lock_page_or_retry
diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 52b6b71b74e2..a8b02db45324 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -168,7 +168,6 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 			}
 
 			if (fault & VM_FAULT_RETRY) {
-				flags &= ~FAULT_FLAG_ALLOW_RETRY;
 				flags |= FAULT_FLAG_TRIED;
 				goto retry;
 			}
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 743077d19669..377781d8491a 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -342,9 +342,6 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 					regs, addr);
 		}
 		if (fault & VM_FAULT_RETRY) {
-			/* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
-			 * of starvation. */
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 			goto retry;
 		}
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 744d6451ea83..8a26e03fc2bf 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -510,12 +510,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
 			return 0;
 		}
 
-		/*
-		 * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk of
-		 * starvation.
-		 */
 		if (mm_flags & FAULT_FLAG_ALLOW_RETRY) {
-			mm_flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			mm_flags |= FAULT_FLAG_TRIED;
 			goto retry;
 		}
diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c
index be10b441d9cc..576751597e77 100644
--- a/arch/hexagon/mm/vm_fault.c
+++ b/arch/hexagon/mm/vm_fault.c
@@ -115,7 +115,6 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs)
 			else
 				current->min_flt++;
 			if (fault & VM_FAULT_RETRY) {
-				flags &= ~FAULT_FLAG_ALLOW_RETRY;
 				flags |= FAULT_FLAG_TRIED;
 				goto retry;
 			}
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index 62c2d39d2bed..9de95d39935e 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -189,7 +189,6 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 		else
 			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 
 			 /* No need to up_read(&mm->mmap_sem) as we would
diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index d9808a807ab8..b1b2109e4ab4 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -162,9 +162,6 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 		else
 			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
-			/* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
-			 * of starvation. */
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 
 			/*
diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c
index 4fd2dbd0c5ca..05a4847ac0bf 100644
--- a/arch/microblaze/mm/fault.c
+++ b/arch/microblaze/mm/fault.c
@@ -236,7 +236,6 @@ void do_page_fault(struct pt_regs *regs, unsigned long address,
 		else
 			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 
 			/*
diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 92374fd091d2..9953b5b571df 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -178,7 +178,6 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write,
 			tsk->min_flt++;
 		}
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 
 			/*
diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c
index 72461745d3e1..f0b775cb5cdf 100644
--- a/arch/nds32/mm/fault.c
+++ b/arch/nds32/mm/fault.c
@@ -237,7 +237,6 @@ void do_page_fault(unsigned long entry, unsigned long addr,
 		else
 			tsk->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 
 			/* No need to up_read(&mm->mmap_sem) as we would
diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
index 5939434a31ae..9dd1c51acc22 100644
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -158,9 +158,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
 		else
 			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
-			/* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
-			 * of starvation. */
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 
 			/*
diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c
index 873ecb5d82d7..ff92c5674781 100644
--- a/arch/openrisc/mm/fault.c
+++ b/arch/openrisc/mm/fault.c
@@ -185,7 +185,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address,
 		else
 			tsk->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 
 			 /* No need to up_read(&mm->mmap_sem) as we would
diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
index 29422eec329d..7d3e96a9a7ab 100644
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -327,8 +327,6 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
 		else
 			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
-
 			/*
 			 * No need to up_read(&mm->mmap_sem) as we would
 			 * have already released it in __lock_page_or_retry
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index ef5b0cf0a902..4563cb44107e 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -569,11 +569,6 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 	if (unlikely(fault & VM_FAULT_RETRY)) {
 		/* We retry only once */
 		if (flags & FAULT_FLAG_ALLOW_RETRY) {
-			/*
-			 * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
-			 * of starvation.
-			 */
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 			if (!signal_pending(current))
 				goto retry;
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index 4fc8d746bec3..aad2c0557d2f 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -154,11 +154,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
 				      1, regs, addr);
 		}
 		if (fault & VM_FAULT_RETRY) {
-			/*
-			 * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
-			 * of starvation.
-			 */
-			flags &= ~(FAULT_FLAG_ALLOW_RETRY);
 			flags |= FAULT_FLAG_TRIED;
 
 			/*
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 19b4fb2fafab..819f87169ee1 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -537,10 +537,7 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
 				fault = VM_FAULT_PFAULT;
 				goto out_up;
 			}
-			/* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
-			 * of starvation. */
-			flags &= ~(FAULT_FLAG_ALLOW_RETRY |
-				   FAULT_FLAG_RETRY_NOWAIT);
+			flags &= ~FAULT_FLAG_RETRY_NOWAIT;
 			flags |= FAULT_FLAG_TRIED;
 			down_read(&mm->mmap_sem);
 			goto retry;
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index baf5d73df40c..cd710e2d7c57 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -498,7 +498,6 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
 				      regs, address);
 		}
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 
 			/*
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index a2c83104fe35..6735cd1c09b9 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -261,7 +261,6 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 				      1, regs, address);
 		}
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 
 			/* No need to up_read(&mm->mmap_sem) as we would
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 3345fa7234e2..3996fd0bc092 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -459,7 +459,6 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 				      1, regs, address);
 		}
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 
 			/* No need to up_read(&mm->mmap_sem) as we would
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index bf12032f703d..b11d8af569f1 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -99,7 +99,6 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 			else
 				current->min_flt++;
 			if (fault & VM_FAULT_RETRY) {
-				flags &= ~FAULT_FLAG_ALLOW_RETRY;
 				flags |= FAULT_FLAG_TRIED;
 
 				goto retry;
diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c
index 3611f19234a1..fdf577956f5f 100644
--- a/arch/unicore32/mm/fault.c
+++ b/arch/unicore32/mm/fault.c
@@ -260,12 +260,8 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 			tsk->maj_flt++;
 		else
 			tsk->min_flt++;
-	if (fault & VM_FAULT_RETRY) {
-		/* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
-		 * of starvation. */
-		flags &= ~FAULT_FLAG_ALLOW_RETRY;
+	if (fault & VM_FAULT_RETRY)
 		goto retry;
-	}
 	}
 
 	up_read(&mm->mmap_sem);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e05bdc2602d6..ed230afbc02a 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1431,7 +1431,6 @@ void do_user_addr_fault(struct pt_regs *regs,
 	if (unlikely(fault & VM_FAULT_RETRY)) {
 		/* Retry at most once */
 		if (flags & FAULT_FLAG_ALLOW_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 			if (!signal_pending(tsk))
 				goto retry;
diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index 792dad5e2f12..7cd55f2d66c9 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -128,7 +128,6 @@ void do_page_fault(struct pt_regs *regs)
 		else
 			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
 			flags |= FAULT_FLAG_TRIED;
 
 			 /* No need to up_read(&mm->mmap_sem) as we would

From patchwork Wed Nov 7 06:06:43 2018
X-Patchwork-Submitter: Peter Xu <peterx@redhat.com>
X-Patchwork-Id: 10671879
Shutemov" , Michal Hocko , Vlastimil Babka , Pavel Tatashin , Andrew Morton Subject: [PATCH RFC v2 4/4] mm: gup: allow VM_FAULT_RETRY for multiple times Date: Wed, 7 Nov 2018 14:06:43 +0800 Message-Id: <20181107060643.10950-5-peterx@redhat.com> In-Reply-To: <20181107060643.10950-1-peterx@redhat.com> References: <20181107060643.10950-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Wed, 07 Nov 2018 06:07:44 +0000 (UTC) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP This is the gup counterpart of the change that allows the VM_FAULT_RETRY to happen for more than once. Signed-off-by: Peter Xu --- mm/gup.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index 6faff46cd409..8a0e7f9bd29a 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -522,7 +522,10 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma, if (*flags & FOLL_NOWAIT) fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT; if (*flags & FOLL_TRIED) { - VM_WARN_ON_ONCE(fault_flags & FAULT_FLAG_ALLOW_RETRY); + /* + * Note: FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_TRIED + * can co-exist + */ fault_flags |= FAULT_FLAG_TRIED; } @@ -938,17 +941,23 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk, /* VM_FAULT_RETRY triggered, so seek to the faulting offset */ pages += ret; start += ret << PAGE_SHIFT; + lock_dropped = true; +retry: /* * Repeat on the address that fired VM_FAULT_RETRY - * without FAULT_FLAG_ALLOW_RETRY but with + * with both FAULT_FLAG_ALLOW_RETRY and * FAULT_FLAG_TRIED. */ *locked = 1; - lock_dropped = true; down_read(&mm->mmap_sem); ret = __get_user_pages(tsk, mm, start, 1, flags | FOLL_TRIED, - pages, NULL, NULL); + pages, NULL, locked); + if (!*locked) { + /* Continue to retry until we succeeded */ + BUG_ON(ret != 0); + goto retry; + } if (ret != 1) { BUG_ON(ret > 1); if (!pages_done)