From patchwork Tue Jun 11 14:40:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987125 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 15FC114B6 for ; Tue, 11 Jun 2019 14:41:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0631422A2A for ; Tue, 11 Jun 2019 14:41:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id ED70428889; Tue, 11 Jun 2019 14:41:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DC60328783 for ; Tue, 11 Jun 2019 14:41:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391572AbfFKOli (ORCPT ); Tue, 11 Jun 2019 10:41:38 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37478 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390512AbfFKOlh (ORCPT ); Tue, 11 Jun 2019 10:41:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=pU76vjF549TgrGMDxTP+px3yFTf1TEh0R8Swc6QhucM=; b=hksldD9HOj5evI3EEsvNNZserQ 9QfL5wF9gT7egY+bR9UULwLlKaDcw+8UV7m2/8Nx0LXz3UkxV9NozCDfZ44qqcS/1MjRrVmQH691W F+pa5IeoK2lMHw7zdweUhqWjsXhgXBNiVCgccTrkGw+N4DTtTTb6rdNpAcyxs2QNryR0hJKmc2s/b YhzCLeQqFaGRjbitn3ZJ5vYl9VFojSvxwCaSFpt+Sy1VKPlpBJYsDKNs5rLGgMg2fBrAGZ19Pbgp0 SeqyCW789tSA3PxnOlwKgzjeyFyE2oxk6A62waWOoKoHh27uuHpxZYxoZuPf046XmMbKVyOakaQh1 ZCwcvfmw==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahxV-0005Nh-VL; Tue, 11 Jun 2019 14:41:10 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 01/16] mm: use untagged_addr() for get_user_pages_fast addresses Date: Tue, 11 Jun 2019 16:40:47 +0200 Message-Id: <20190611144102.8848-2-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This will allow sparc64 to override its ADI tags for get_user_pages and get_user_pages_fast. Signed-off-by: Christoph Hellwig Reviewed-by: Khalid Aziz Reviewed-by: Jason Gunthorpe --- mm/gup.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index ddde097cf9e4..6bb521db67ec 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2146,7 +2146,7 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write, unsigned long flags; int nr = 0; - start &= PAGE_MASK; + start = untagged_addr(start) & PAGE_MASK; len = (unsigned long) nr_pages << PAGE_SHIFT; end = start + len; @@ -2219,7 +2219,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages, unsigned long addr, len, end; int nr = 0, ret = 0; - start &= PAGE_MASK; + start = untagged_addr(start) & PAGE_MASK; addr = start; len = (unsigned long) nr_pages << PAGE_SHIFT; end = start + len; From patchwork Tue Jun 11 14:40:48 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987219 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D3B3514B6 for ; Tue, 11 Jun 2019 14:42:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C4D0428884 for ; Tue, 11 Jun 2019 14:42:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B91D32888C; Tue, 11 Jun 2019 14:42:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3369528889 for ; Tue, 11 Jun 2019 14:42:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391499AbfFKOli (ORCPT ); Tue, 11 Jun 2019 10:41:38 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37474 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388535AbfFKOlh (ORCPT ); Tue, 11 Jun 2019 10:41:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=gzNr8cC+RcaEr7a/7B73Y02JTGGTEGr38/f2ngv8wBc=; b=Qsrqx7Gjl2GC8vf87J1kV0OBS9 Tt5zTsC9u5FMT4aBrJhjCtE6a9K5hEj6huEHcTGHIZTp0lPnrD3O3q6Taf6l5zAkrj04NvtpjMFZz QSbVSiRFTFj1K1DWEpoIo8npKLbZp/icZRb4FmghwPVhUl2hQQ3GZ8V/iVwFa0djVedxsNvetvYQ/ U8rrAK2RSgtkuzMw1qKgToiMIX3BNteGcXa0JARj45dk/04PT26mlULgR6rFGQSTQUrd9alXQpxPZ v2Ok1HiUy9mTgFSunZi7dGsxAIdFawyIN5trdPG1sAjbfv2nBSw2rK5taBkGiygEsnDU61sdcdRaz phR5Tu6w==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahxZ-0005O7-RH; Tue, 11 Jun 2019 14:41:14 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 02/16] mm: simplify gup_fast_permitted Date: Tue, 11 Jun 2019 16:40:48 +0200 Message-Id: <20190611144102.8848-3-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Pass in the already calculated end value instead of recomputing it, and leave the end > start check in the callers instead of duplicating them in the arch code. Signed-off-by: Christoph Hellwig Reviewed-by: Jason Gunthorpe --- arch/s390/include/asm/pgtable.h | 8 +------- arch/x86/include/asm/pgtable_64.h | 8 +------- mm/gup.c | 17 +++++++---------- 3 files changed, 9 insertions(+), 24 deletions(-) diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h index 9f0195d5fa16..9b274fcaacb6 100644 --- a/arch/s390/include/asm/pgtable.h +++ b/arch/s390/include/asm/pgtable.h @@ -1270,14 +1270,8 @@ static inline pte_t *pte_offset(pmd_t *pmd, unsigned long address) #define pte_offset_map(pmd, address) pte_offset_kernel(pmd, address) #define pte_unmap(pte) do { } while (0) -static inline bool gup_fast_permitted(unsigned long start, int nr_pages) +static inline bool gup_fast_permitted(unsigned long start, unsigned long end) { - unsigned long len, end; - - len = (unsigned long) nr_pages << PAGE_SHIFT; - end = start + len; - if (end < start) - return false; return end <= current->mm->context.asce_limit; } #define gup_fast_permitted gup_fast_permitted diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h index 0bb566315621..4990d26dfc73 100644 --- a/arch/x86/include/asm/pgtable_64.h +++ b/arch/x86/include/asm/pgtable_64.h @@ -259,14 +259,8 @@ extern void init_extra_mapping_uc(unsigned long phys, unsigned long size); extern void init_extra_mapping_wb(unsigned long phys, unsigned long size); #define gup_fast_permitted gup_fast_permitted -static inline bool gup_fast_permitted(unsigned long start, int nr_pages) +static inline bool gup_fast_permitted(unsigned long start, unsigned long end) { - unsigned long len, end; - - len = (unsigned long)nr_pages << PAGE_SHIFT; - end = start + len; - if (end < start) - return false; if (end >> __VIRTUAL_MASK_SHIFT) return false; return true; diff --git a/mm/gup.c b/mm/gup.c index 6bb521db67ec..3237f33792e6 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2123,13 +2123,9 @@ static void gup_pgd_range(unsigned long addr, unsigned long end, * Check if it's allowed to use __get_user_pages_fast() for the range, or * we need to fall back to the slow version: */ -bool gup_fast_permitted(unsigned long start, int nr_pages) +static bool gup_fast_permitted(unsigned long start, unsigned long end) { - unsigned long len, end; - - len = (unsigned long) nr_pages << PAGE_SHIFT; - end = start + len; - return end >= start; + return true; } #endif @@ -2150,6 +2146,8 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write, len = (unsigned long) nr_pages << PAGE_SHIFT; end = start + len; + if (end <= start) + return 0; if (unlikely(!access_ok((void __user *)start, len))) return 0; @@ -2165,7 +2163,7 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write, * block IPIs that come from THPs splitting. */ - if (gup_fast_permitted(start, nr_pages)) { + if (gup_fast_permitted(start, end)) { local_irq_save(flags); gup_pgd_range(start, end, write ? FOLL_WRITE : 0, pages, &nr); local_irq_restore(flags); @@ -2224,13 +2222,12 @@ int get_user_pages_fast(unsigned long start, int nr_pages, len = (unsigned long) nr_pages << PAGE_SHIFT; end = start + len; - if (nr_pages <= 0) + if (end <= start) return 0; - if (unlikely(!access_ok((void __user *)start, len))) return -EFAULT; - if (gup_fast_permitted(start, nr_pages)) { + if (gup_fast_permitted(start, end)) { local_irq_disable(); gup_pgd_range(addr, end, gup_flags, pages, &nr); local_irq_enable(); From patchwork Tue Jun 11 14:40:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987215 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 96875924 for ; Tue, 11 Jun 2019 14:42:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 85895283BF for ; Tue, 11 Jun 2019 14:42:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 77FA62877B; Tue, 11 Jun 2019 14:42:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BA90E2888D for ; Tue, 11 Jun 2019 14:42:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391574AbfFKOmq (ORCPT ); Tue, 11 Jun 2019 10:42:46 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37502 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391576AbfFKOli (ORCPT ); Tue, 11 Jun 2019 10:41:38 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=g4MxpUkGP+hxy8ku71ZEZt00ts6l03REsi5/bfFDlbU=; b=X22/uN/3aqQ1VzBdZFYnIpfrIJ JgQOzg3WSjfNhYKznXgvyyQPBF9PGAwq6jJ+1Hl6cDJMxwNjBGvNc5s3srm+ocnQmrf4M8kkhOO/W gUs3R2xrlOGvqs5y2CBuev39K1ER5sRtYrGoq4y0dvrKATsVsJpdUPRmS687iV9RkUotsvw15Mh/S 4xZTVyXHovu/nPgEBWXHfcBFWVWgBaRnr9MDCEclbB2bKjUsn3GX8AOIK006JtuGdhzZjag1gYo4f 8nPJijR2K316Q4pSDbA33p2yfBWo6yWTrnofL9AuquV2WMhI6Gu2jUVnjmOBUfBlIATdgpoy+HhA9 kLOu+C6w==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahxc-0005Pc-Qw; Tue, 11 Jun 2019 14:41:17 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 03/16] mm: lift the x86_32 PAE version of gup_get_pte to common code Date: Tue, 11 Jun 2019 16:40:49 +0200 Message-Id: <20190611144102.8848-4-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The split low/high access is the only non-READ_ONCE version of gup_get_pte that did show up in the various arch implemenations. Lift it to common code and drop the ifdef based arch override. Signed-off-by: Christoph Hellwig Reviewed-by: Jason Gunthorpe --- arch/x86/Kconfig | 1 + arch/x86/include/asm/pgtable-3level.h | 47 ------------------------ arch/x86/kvm/mmu.c | 2 +- mm/Kconfig | 3 ++ mm/gup.c | 51 ++++++++++++++++++++++++--- 5 files changed, 52 insertions(+), 52 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 2bbbd4d1ba31..7cd53cc59f0f 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -121,6 +121,7 @@ config X86 select GENERIC_STRNCPY_FROM_USER select GENERIC_STRNLEN_USER select GENERIC_TIME_VSYSCALL + select GUP_GET_PTE_LOW_HIGH if X86_PAE select HARDLOCKUP_CHECK_TIMESTAMP if X86_64 select HAVE_ACPI_APEI if ACPI select HAVE_ACPI_APEI_NMI if ACPI diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h index f8b1ad2c3828..e3633795fb22 100644 --- a/arch/x86/include/asm/pgtable-3level.h +++ b/arch/x86/include/asm/pgtable-3level.h @@ -285,53 +285,6 @@ static inline pud_t native_pudp_get_and_clear(pud_t *pudp) #define __pte_to_swp_entry(pte) (__swp_entry(__pteval_swp_type(pte), \ __pteval_swp_offset(pte))) -#define gup_get_pte gup_get_pte -/* - * WARNING: only to be used in the get_user_pages_fast() implementation. - * - * With get_user_pages_fast(), we walk down the pagetables without taking - * any locks. For this we would like to load the pointers atomically, - * but that is not possible (without expensive cmpxchg8b) on PAE. What - * we do have is the guarantee that a PTE will only either go from not - * present to present, or present to not present or both -- it will not - * switch to a completely different present page without a TLB flush in - * between; something that we are blocking by holding interrupts off. - * - * Setting ptes from not present to present goes: - * - * ptep->pte_high = h; - * smp_wmb(); - * ptep->pte_low = l; - * - * And present to not present goes: - * - * ptep->pte_low = 0; - * smp_wmb(); - * ptep->pte_high = 0; - * - * We must ensure here that the load of pte_low sees 'l' iff pte_high - * sees 'h'. We load pte_high *after* loading pte_low, which ensures we - * don't see an older value of pte_high. *Then* we recheck pte_low, - * which ensures that we haven't picked up a changed pte high. We might - * have gotten rubbish values from pte_low and pte_high, but we are - * guaranteed that pte_low will not have the present bit set *unless* - * it is 'l'. Because get_user_pages_fast() only operates on present ptes - * we're safe. - */ -static inline pte_t gup_get_pte(pte_t *ptep) -{ - pte_t pte; - - do { - pte.pte_low = ptep->pte_low; - smp_rmb(); - pte.pte_high = ptep->pte_high; - smp_rmb(); - } while (unlikely(pte.pte_low != ptep->pte_low)); - - return pte; -} - #include #endif /* _ASM_X86_PGTABLE_3LEVEL_H */ diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 1e9ba81accba..3f7cd11168f9 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -653,7 +653,7 @@ static u64 __update_clear_spte_slow(u64 *sptep, u64 spte) /* * The idea using the light way get the spte on x86_32 guest is from - * gup_get_pte(arch/x86/mm/gup.c). + * gup_get_pte (mm/gup.c). * * An spte tlb flush may be pending, because kvm_set_pte_rmapp * coalesces them and we are running out of the MMU lock. Therefore diff --git a/mm/Kconfig b/mm/Kconfig index f0c76ba47695..fe51f104a9e0 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -762,6 +762,9 @@ config GUP_BENCHMARK See tools/testing/selftests/vm/gup_benchmark.c +config GUP_GET_PTE_LOW_HIGH + bool + config ARCH_HAS_PTE_SPECIAL bool diff --git a/mm/gup.c b/mm/gup.c index 3237f33792e6..9b72f2ea3471 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1684,17 +1684,60 @@ struct page *get_dump_page(unsigned long addr) * This code is based heavily on the PowerPC implementation by Nick Piggin. */ #ifdef CONFIG_HAVE_GENERIC_GUP +#ifdef CONFIG_GUP_GET_PTE_LOW_HIGH +/* + * WARNING: only to be used in the get_user_pages_fast() implementation. + * + * With get_user_pages_fast(), we walk down the pagetables without taking any + * locks. For this we would like to load the pointers atomically, but sometimes + * that is not possible (e.g. without expensive cmpxchg8b on x86_32 PAE). What + * we do have is the guarantee that a PTE will only either go from not present + * to present, or present to not present or both -- it will not switch to a + * completely different present page without a TLB flush in between; something + * that we are blocking by holding interrupts off. + * + * Setting ptes from not present to present goes: + * + * ptep->pte_high = h; + * smp_wmb(); + * ptep->pte_low = l; + * + * And present to not present goes: + * + * ptep->pte_low = 0; + * smp_wmb(); + * ptep->pte_high = 0; + * + * We must ensure here that the load of pte_low sees 'l' IFF pte_high sees 'h'. + * We load pte_high *after* loading pte_low, which ensures we don't see an older + * value of pte_high. *Then* we recheck pte_low, which ensures that we haven't + * picked up a changed pte high. We might have gotten rubbish values from + * pte_low and pte_high, but we are guaranteed that pte_low will not have the + * present bit set *unless* it is 'l'. Because get_user_pages_fast() only + * operates on present ptes we're safe. + */ +static inline pte_t gup_get_pte(pte_t *ptep) +{ + pte_t pte; -#ifndef gup_get_pte + do { + pte.pte_low = ptep->pte_low; + smp_rmb(); + pte.pte_high = ptep->pte_high; + smp_rmb(); + } while (unlikely(pte.pte_low != ptep->pte_low)); + + return pte; +} +#else /* CONFIG_GUP_GET_PTE_LOW_HIGH */ /* - * We assume that the PTE can be read atomically. If this is not the case for - * your architecture, please provide the helper. + * We require that the PTE can be read atomically. */ static inline pte_t gup_get_pte(pte_t *ptep) { return READ_ONCE(*ptep); } -#endif +#endif /* CONFIG_GUP_GET_PTE_LOW_HIGH */ static void undo_dev_pagemap(int *nr, int nr_start, struct page **pages) { From patchwork Tue Jun 11 14:40:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987211 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 91BDA1708 for ; Tue, 11 Jun 2019 14:42:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8226A2887D for ; Tue, 11 Jun 2019 14:42:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7582527F97; Tue, 11 Jun 2019 14:42:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 82B6928880 for ; Tue, 11 Jun 2019 14:42:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391589AbfFKOli (ORCPT ); Tue, 11 Jun 2019 10:41:38 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37486 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391528AbfFKOli (ORCPT ); Tue, 11 Jun 2019 10:41:38 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=YdgQvYMgqOUQwEvGH17qBF0YrrxMqW6F+rGIWz8dtc8=; b=ZURuWGVuu8robHBBLAFnrfoXA1 3nEbA75Eot/QHKe01UAt3GRvyP3MaANX7kIj1156NUNOqzDWHfLifgPOpACTIhxY5zThJ4/Yuo30s SUvmYPVHlXiI8R1SYFNCriL33BKP7fi1zZM0xrz4HRElky5PGOcl56Pw6qviU9ASt5PugN2/So9F1 CcUgFtnaC/Zut0sPLBcCZUj1yseVIfmsO8zMpvIcgP9fED1u7pqhk68OQvC4iRR5fvwz/6W8SxrHl gd3K2Kf6WAEZ9pJvBtBj57E0scbQ6tZP8jQmV/Yt5FTYtbjG06xkf+G6FL/jlyNMHgKdKnaCsKcSP dNHuBkew==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahxf-0005Pu-L2; Tue, 11 Jun 2019 14:41:20 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 04/16] MIPS: use the generic get_user_pages_fast code Date: Tue, 11 Jun 2019 16:40:50 +0200 Message-Id: <20190611144102.8848-5-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The mips code is mostly equivalent to the generic one, minus various bugfixes and an arch override for gup_fast_permitted. Note that this defines ARCH_HAS_PTE_SPECIAL for mips as mips has pte_special and pte_mkspecial implemented and used in the existing gup code. They are no-op stubs, though which makes me a little unsure if this is really right thing to do. Signed-off-by: Christoph Hellwig Reviewed-by: Jason Gunthorpe --- arch/mips/Kconfig | 3 + arch/mips/include/asm/pgtable.h | 3 + arch/mips/mm/Makefile | 1 - arch/mips/mm/gup.c | 303 -------------------------------- 4 files changed, 6 insertions(+), 304 deletions(-) delete mode 100644 arch/mips/mm/gup.c diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index 70d3200476bf..64108a2a16d4 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -6,6 +6,7 @@ config MIPS select ARCH_BINFMT_ELF_STATE if MIPS_FP_SUPPORT select ARCH_CLOCKSOURCE_DATA select ARCH_HAS_ELF_RANDOMIZE + select ARCH_HAS_PTE_SPECIAL select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST select ARCH_HAS_UBSAN_SANITIZE_ALL select ARCH_SUPPORTS_UPROBES @@ -34,6 +35,7 @@ config MIPS select GENERIC_SCHED_CLOCK if !CAVIUM_OCTEON_SOC select GENERIC_SMP_IDLE_THREAD select GENERIC_TIME_VSYSCALL + select GUP_GET_PTE_LOW_HIGH if CPU_MIPS32 && PHYS_ADDR_T_64BIT select HANDLE_DOMAIN_IRQ select HAVE_ARCH_COMPILER_H select HAVE_ARCH_JUMP_LABEL @@ -55,6 +57,7 @@ config MIPS select HAVE_FTRACE_MCOUNT_RECORD select HAVE_FUNCTION_GRAPH_TRACER select HAVE_FUNCTION_TRACER + select HAVE_GENERIC_GUP select HAVE_IDE select HAVE_IOREMAP_PROT select HAVE_IRQ_EXIT_ON_IRQ_STACK diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h index 4ccb465ef3f2..7d27194e3b45 100644 --- a/arch/mips/include/asm/pgtable.h +++ b/arch/mips/include/asm/pgtable.h @@ -20,6 +20,7 @@ #include #include #include +#include struct mm_struct; struct vm_area_struct; @@ -626,6 +627,8 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ +#define gup_fast_permitted(start, end) (!cpu_has_dc_aliases) + #include /* diff --git a/arch/mips/mm/Makefile b/arch/mips/mm/Makefile index f34d7ff5eb60..1e8d335025d7 100644 --- a/arch/mips/mm/Makefile +++ b/arch/mips/mm/Makefile @@ -7,7 +7,6 @@ obj-y += cache.o obj-y += context.o obj-y += extable.o obj-y += fault.o -obj-y += gup.o obj-y += init.o obj-y += mmap.o obj-y += page.o diff --git a/arch/mips/mm/gup.c b/arch/mips/mm/gup.c deleted file mode 100644 index 4c2b4483683c..000000000000 --- a/arch/mips/mm/gup.c +++ /dev/null @@ -1,303 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * Lockless get_user_pages_fast for MIPS - * - * Copyright (C) 2008 Nick Piggin - * Copyright (C) 2008 Novell Inc. - * Copyright (C) 2011 Ralf Baechle - */ -#include -#include -#include -#include -#include -#include - -#include -#include - -static inline pte_t gup_get_pte(pte_t *ptep) -{ -#if defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32) - pte_t pte; - -retry: - pte.pte_low = ptep->pte_low; - smp_rmb(); - pte.pte_high = ptep->pte_high; - smp_rmb(); - if (unlikely(pte.pte_low != ptep->pte_low)) - goto retry; - - return pte; -#else - return READ_ONCE(*ptep); -#endif -} - -static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, - int write, struct page **pages, int *nr) -{ - pte_t *ptep = pte_offset_map(&pmd, addr); - do { - pte_t pte = gup_get_pte(ptep); - struct page *page; - - if (!pte_present(pte) || - pte_special(pte) || (write && !pte_write(pte))) { - pte_unmap(ptep); - return 0; - } - VM_BUG_ON(!pfn_valid(pte_pfn(pte))); - page = pte_page(pte); - get_page(page); - SetPageReferenced(page); - pages[*nr] = page; - (*nr)++; - - } while (ptep++, addr += PAGE_SIZE, addr != end); - - pte_unmap(ptep - 1); - return 1; -} - -static inline void get_head_page_multiple(struct page *page, int nr) -{ - VM_BUG_ON(page != compound_head(page)); - VM_BUG_ON(page_count(page) == 0); - page_ref_add(page, nr); - SetPageReferenced(page); -} - -static int gup_huge_pmd(pmd_t pmd, unsigned long addr, unsigned long end, - int write, struct page **pages, int *nr) -{ - pte_t pte = *(pte_t *)&pmd; - struct page *head, *page; - int refs; - - if (write && !pte_write(pte)) - return 0; - /* hugepages are never "special" */ - VM_BUG_ON(pte_special(pte)); - VM_BUG_ON(!pfn_valid(pte_pfn(pte))); - - refs = 0; - head = pte_page(pte); - page = head + ((addr & ~PMD_MASK) >> PAGE_SHIFT); - do { - VM_BUG_ON(compound_head(page) != head); - pages[*nr] = page; - (*nr)++; - page++; - refs++; - } while (addr += PAGE_SIZE, addr != end); - - get_head_page_multiple(head, refs); - return 1; -} - -static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end, - int write, struct page **pages, int *nr) -{ - unsigned long next; - pmd_t *pmdp; - - pmdp = pmd_offset(&pud, addr); - do { - pmd_t pmd = *pmdp; - - next = pmd_addr_end(addr, end); - if (pmd_none(pmd)) - return 0; - if (unlikely(pmd_huge(pmd))) { - if (!gup_huge_pmd(pmd, addr, next, write, pages,nr)) - return 0; - } else { - if (!gup_pte_range(pmd, addr, next, write, pages,nr)) - return 0; - } - } while (pmdp++, addr = next, addr != end); - - return 1; -} - -static int gup_huge_pud(pud_t pud, unsigned long addr, unsigned long end, - int write, struct page **pages, int *nr) -{ - pte_t pte = *(pte_t *)&pud; - struct page *head, *page; - int refs; - - if (write && !pte_write(pte)) - return 0; - /* hugepages are never "special" */ - VM_BUG_ON(pte_special(pte)); - VM_BUG_ON(!pfn_valid(pte_pfn(pte))); - - refs = 0; - head = pte_page(pte); - page = head + ((addr & ~PUD_MASK) >> PAGE_SHIFT); - do { - VM_BUG_ON(compound_head(page) != head); - pages[*nr] = page; - (*nr)++; - page++; - refs++; - } while (addr += PAGE_SIZE, addr != end); - - get_head_page_multiple(head, refs); - return 1; -} - -static int gup_pud_range(pgd_t pgd, unsigned long addr, unsigned long end, - int write, struct page **pages, int *nr) -{ - unsigned long next; - pud_t *pudp; - - pudp = pud_offset(&pgd, addr); - do { - pud_t pud = *pudp; - - next = pud_addr_end(addr, end); - if (pud_none(pud)) - return 0; - if (unlikely(pud_huge(pud))) { - if (!gup_huge_pud(pud, addr, next, write, pages,nr)) - return 0; - } else { - if (!gup_pmd_range(pud, addr, next, write, pages,nr)) - return 0; - } - } while (pudp++, addr = next, addr != end); - - return 1; -} - -/* - * Like get_user_pages_fast() except its IRQ-safe in that it won't fall - * back to the regular GUP. - * Note a difference with get_user_pages_fast: this always returns the - * number of pages pinned, 0 if no pages were pinned. - */ -int __get_user_pages_fast(unsigned long start, int nr_pages, int write, - struct page **pages) -{ - struct mm_struct *mm = current->mm; - unsigned long addr, len, end; - unsigned long next; - unsigned long flags; - pgd_t *pgdp; - int nr = 0; - - start &= PAGE_MASK; - addr = start; - len = (unsigned long) nr_pages << PAGE_SHIFT; - end = start + len; - if (unlikely(!access_ok((void __user *)start, len))) - return 0; - - /* - * XXX: batch / limit 'nr', to avoid large irq off latency - * needs some instrumenting to determine the common sizes used by - * important workloads (eg. DB2), and whether limiting the batch - * size will decrease performance. - * - * It seems like we're in the clear for the moment. Direct-IO is - * the main guy that batches up lots of get_user_pages, and even - * they are limited to 64-at-a-time which is not so many. - */ - /* - * This doesn't prevent pagetable teardown, but does prevent - * the pagetables and pages from being freed. - * - * So long as we atomically load page table pointers versus teardown, - * we can follow the address down to the page and take a ref on it. - */ - local_irq_save(flags); - pgdp = pgd_offset(mm, addr); - do { - pgd_t pgd = *pgdp; - - next = pgd_addr_end(addr, end); - if (pgd_none(pgd)) - break; - if (!gup_pud_range(pgd, addr, next, write, pages, &nr)) - break; - } while (pgdp++, addr = next, addr != end); - local_irq_restore(flags); - - return nr; -} - -/** - * get_user_pages_fast() - pin user pages in memory - * @start: starting user address - * @nr_pages: number of pages from start to pin - * @gup_flags: flags modifying pin behaviour - * @pages: array that receives pointers to the pages pinned. - * Should be at least nr_pages long. - * - * Attempt to pin user pages in memory without taking mm->mmap_sem. - * If not successful, it will fall back to taking the lock and - * calling get_user_pages(). - * - * Returns number of pages pinned. This may be fewer than the number - * requested. If nr_pages is 0 or negative, returns 0. If no pages - * were pinned, returns -errno. - */ -int get_user_pages_fast(unsigned long start, int nr_pages, - unsigned int gup_flags, struct page **pages) -{ - struct mm_struct *mm = current->mm; - unsigned long addr, len, end; - unsigned long next; - pgd_t *pgdp; - int ret, nr = 0; - - start &= PAGE_MASK; - addr = start; - len = (unsigned long) nr_pages << PAGE_SHIFT; - - end = start + len; - if (end < start || cpu_has_dc_aliases) - goto slow_irqon; - - /* XXX: batch / limit 'nr' */ - local_irq_disable(); - pgdp = pgd_offset(mm, addr); - do { - pgd_t pgd = *pgdp; - - next = pgd_addr_end(addr, end); - if (pgd_none(pgd)) - goto slow; - if (!gup_pud_range(pgd, addr, next, gup_flags & FOLL_WRITE, - pages, &nr)) - goto slow; - } while (pgdp++, addr = next, addr != end); - local_irq_enable(); - - VM_BUG_ON(nr != (end - start) >> PAGE_SHIFT); - return nr; -slow: - local_irq_enable(); - -slow_irqon: - /* Try to get the remaining pages with get_user_pages */ - start += nr << PAGE_SHIFT; - pages += nr; - - ret = get_user_pages_unlocked(start, (end - start) >> PAGE_SHIFT, - pages, gup_flags); - - /* Have to be a bit careful with return values */ - if (nr > 0) { - if (ret < 0) - ret = nr; - else - ret += nr; - } - return ret; -} From patchwork Tue Jun 11 14:40:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987199 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AEF08924 for ; Tue, 11 Jun 2019 14:42:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A193328783 for ; Tue, 11 Jun 2019 14:42:36 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9E5AB2888C; Tue, 11 Jun 2019 14:42:36 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5E88128871 for ; Tue, 11 Jun 2019 14:42:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391773AbfFKOmf (ORCPT ); Tue, 11 Jun 2019 10:42:35 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37510 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391595AbfFKOlj (ORCPT ); Tue, 11 Jun 2019 10:41:39 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=NpWcgZzp3PVAHx6KOALQeB51c2SoNPG0lnYZA6y2f1k=; b=Elx6a7oB+cG63hmgYxilqyyn8c 41Dy5+a7mSEbpr74nkmh4p8U4+Csa9fSuC+Gn2NIci0h+zE1AJ2oCVBGO13LLiFb0b5dtSmQVG46v 9u67LZN22/J37Gq6Pw0EdVcHmtYQmz4D+jPa3xzMxshR3N3w3jFih5D2yKc/uCU1s1tzAUO98HmhR Eb4uYc9Q63+/RNoh85FinfAQMq2GbkMiPqDoguuctm3ABd08mxoDvUh7pRodjX/6Qrk1P4MO1lvCS cb+WEkPCrmQqVuObuLcxvqmcclY+k9OvtBq6GcKL8ErofhZRUpsV0D0m18ViO9NmB/7DHIAQrxxoM YaFeTNdQ==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahxi-0005Q5-7q; Tue, 11 Jun 2019 14:41:22 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 05/16] sh: add the missing pud_page definition Date: Tue, 11 Jun 2019 16:40:51 +0200 Message-Id: <20190611144102.8848-6-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP sh only had pud_page_vaddr, but not pud_page. Signed-off-by: Christoph Hellwig --- arch/sh/include/asm/pgtable-3level.h | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/sh/include/asm/pgtable-3level.h b/arch/sh/include/asm/pgtable-3level.h index 7d8587eb65ff..3c7ff20f3f94 100644 --- a/arch/sh/include/asm/pgtable-3level.h +++ b/arch/sh/include/asm/pgtable-3level.h @@ -37,6 +37,7 @@ static inline unsigned long pud_page_vaddr(pud_t pud) { return pud_val(pud); } +#define pud_page(pud) pfn_to_page(pud_pfn(pud)) #define pmd_index(address) (((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1)) static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address) From patchwork Tue Jun 11 14:40:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987203 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F0EC114B6 for ; Tue, 11 Jun 2019 14:42:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E2F8528872 for ; Tue, 11 Jun 2019 14:42:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D622728884; Tue, 11 Jun 2019 14:42:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0BFB428872 for ; Tue, 11 Jun 2019 14:42:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391778AbfFKOmg (ORCPT ); Tue, 11 Jun 2019 10:42:36 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37500 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391574AbfFKOli (ORCPT ); Tue, 11 Jun 2019 10:41:38 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=+LfGuMfatgroW0jiS20tFtJAd2NqewOpaHqEsGODANw=; b=MzyPzl48/ml62a4ZrQyR9L5KBy WglwoTV9DQOuHwXk3Ec2QFcXT7X2AaARQNidYyxlEUN13Jyb5VW0f4ZVwb/4DrA1ZAAU1/nYpUSJj ibLq9BrcOGov9rKpSxDGFV6bnwHpi+QXt7nHhcCYWjo9+PAuoC8JBMc+2TD9BtS4sORtarGy6seQf JjKDbCrpPlxOUZfx4slD2Zb5kpyz7ZE12Bqbx1cWazNeCgkrltmNQP5S82l8XUehMiWYUisQMLoBm wIqkPZBDkpGzaBkiUqjyULHsotNmsGWCG6pNmNi+ZSkdYFlhQwemunxrro+VplNHOtCmhxt6ozkDa LgKKWlzw==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahxl-0005Qh-6z; Tue, 11 Jun 2019 14:41:25 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 06/16] sh: use the generic get_user_pages_fast code Date: Tue, 11 Jun 2019 16:40:52 +0200 Message-Id: <20190611144102.8848-7-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The sh code is mostly equivalent to the generic one, minus various bugfixes and two arch overrides that this patch adds to pgtable.h. Signed-off-by: Christoph Hellwig --- arch/sh/Kconfig | 2 + arch/sh/include/asm/pgtable.h | 37 +++++ arch/sh/mm/Makefile | 2 +- arch/sh/mm/gup.c | 277 ---------------------------------- 4 files changed, 40 insertions(+), 278 deletions(-) delete mode 100644 arch/sh/mm/gup.c diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig index b77f512bb176..6fddfc3c9710 100644 --- a/arch/sh/Kconfig +++ b/arch/sh/Kconfig @@ -14,6 +14,7 @@ config SUPERH select HAVE_ARCH_TRACEHOOK select HAVE_PERF_EVENTS select HAVE_DEBUG_BUGVERBOSE + select HAVE_GENERIC_GUP select ARCH_HAVE_CUSTOM_GPIO_H select ARCH_HAVE_NMI_SAFE_CMPXCHG if (GUSA_RB || CPU_SH4A) select ARCH_HAS_GCOV_PROFILE_ALL @@ -63,6 +64,7 @@ config SUPERH config SUPERH32 def_bool "$(ARCH)" = "sh" select ARCH_32BIT_OFF_T + select GUP_GET_PTE_LOW_HIGH if X2TLB select HAVE_KPROBES select HAVE_KRETPROBES select HAVE_IOREMAP_PROT if MMU && !X2TLB diff --git a/arch/sh/include/asm/pgtable.h b/arch/sh/include/asm/pgtable.h index 3587103afe59..9085d1142fa3 100644 --- a/arch/sh/include/asm/pgtable.h +++ b/arch/sh/include/asm/pgtable.h @@ -149,6 +149,43 @@ extern void paging_init(void); extern void page_table_range_init(unsigned long start, unsigned long end, pgd_t *pgd); +static inline bool __pte_access_permitted(pte_t pte, u64 prot) +{ + return (pte_val(pte) & (prot | _PAGE_SPECIAL)) == prot; +} + +#ifdef CONFIG_X2TLB +static inline bool pte_access_permitted(pte_t pte, bool write) +{ + u64 prot = _PAGE_PRESENT; + + prot |= _PAGE_EXT(_PAGE_EXT_KERN_READ | _PAGE_EXT_USER_READ); + if (write) + prot |= _PAGE_EXT(_PAGE_EXT_KERN_WRITE | _PAGE_EXT_USER_WRITE); + return __pte_access_permitted(pte, prot); +} +#elif defined(CONFIG_SUPERH64) +static inline bool pte_access_permitted(pte_t pte, bool write) +{ + u64 prot = _PAGE_PRESENT | _PAGE_USER | _PAGE_READ; + + if (write) + prot |= _PAGE_WRITE; + return __pte_access_permitted(pte, prot); +} +#else +static inline bool pte_access_permitted(pte_t pte, bool write) +{ + u64 prot = _PAGE_PRESENT | _PAGE_USER; + + if (write) + prot |= _PAGE_RW; + return __pte_access_permitted(pte, prot); +} +#endif + +#define pte_access_permitted pte_access_permitted + /* arch/sh/mm/mmap.c */ #define HAVE_ARCH_UNMAPPED_AREA #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN diff --git a/arch/sh/mm/Makefile b/arch/sh/mm/Makefile index fbe5e79751b3..5051b38fd5b6 100644 --- a/arch/sh/mm/Makefile +++ b/arch/sh/mm/Makefile @@ -17,7 +17,7 @@ cacheops-$(CONFIG_CPU_SHX3) += cache-shx3.o obj-y += $(cacheops-y) mmu-y := nommu.o extable_32.o -mmu-$(CONFIG_MMU) := extable_$(BITS).o fault.o gup.o ioremap.o kmap.o \ +mmu-$(CONFIG_MMU) := extable_$(BITS).o fault.o ioremap.o kmap.o \ pgtable.o tlbex_$(BITS).o tlbflush_$(BITS).o obj-y += $(mmu-y) diff --git a/arch/sh/mm/gup.c b/arch/sh/mm/gup.c deleted file mode 100644 index 277c882f7489..000000000000 --- a/arch/sh/mm/gup.c +++ /dev/null @@ -1,277 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * Lockless get_user_pages_fast for SuperH - * - * Copyright (C) 2009 - 2010 Paul Mundt - * - * Cloned from the x86 and PowerPC versions, by: - * - * Copyright (C) 2008 Nick Piggin - * Copyright (C) 2008 Novell Inc. - */ -#include -#include -#include -#include -#include - -static inline pte_t gup_get_pte(pte_t *ptep) -{ -#ifndef CONFIG_X2TLB - return READ_ONCE(*ptep); -#else - /* - * With get_user_pages_fast, we walk down the pagetables without - * taking any locks. For this we would like to load the pointers - * atomically, but that is not possible with 64-bit PTEs. What - * we do have is the guarantee that a pte will only either go - * from not present to present, or present to not present or both - * -- it will not switch to a completely different present page - * without a TLB flush in between; something that we are blocking - * by holding interrupts off. - * - * Setting ptes from not present to present goes: - * ptep->pte_high = h; - * smp_wmb(); - * ptep->pte_low = l; - * - * And present to not present goes: - * ptep->pte_low = 0; - * smp_wmb(); - * ptep->pte_high = 0; - * - * We must ensure here that the load of pte_low sees l iff pte_high - * sees h. We load pte_high *after* loading pte_low, which ensures we - * don't see an older value of pte_high. *Then* we recheck pte_low, - * which ensures that we haven't picked up a changed pte high. We might - * have got rubbish values from pte_low and pte_high, but we are - * guaranteed that pte_low will not have the present bit set *unless* - * it is 'l'. And get_user_pages_fast only operates on present ptes, so - * we're safe. - * - * gup_get_pte should not be used or copied outside gup.c without being - * very careful -- it does not atomically load the pte or anything that - * is likely to be useful for you. - */ - pte_t pte; - -retry: - pte.pte_low = ptep->pte_low; - smp_rmb(); - pte.pte_high = ptep->pte_high; - smp_rmb(); - if (unlikely(pte.pte_low != ptep->pte_low)) - goto retry; - - return pte; -#endif -} - -/* - * The performance critical leaf functions are made noinline otherwise gcc - * inlines everything into a single function which results in too much - * register pressure. - */ -static noinline int gup_pte_range(pmd_t pmd, unsigned long addr, - unsigned long end, int write, struct page **pages, int *nr) -{ - u64 mask, result; - pte_t *ptep; - -#ifdef CONFIG_X2TLB - result = _PAGE_PRESENT | _PAGE_EXT(_PAGE_EXT_KERN_READ | _PAGE_EXT_USER_READ); - if (write) - result |= _PAGE_EXT(_PAGE_EXT_KERN_WRITE | _PAGE_EXT_USER_WRITE); -#elif defined(CONFIG_SUPERH64) - result = _PAGE_PRESENT | _PAGE_USER | _PAGE_READ; - if (write) - result |= _PAGE_WRITE; -#else - result = _PAGE_PRESENT | _PAGE_USER; - if (write) - result |= _PAGE_RW; -#endif - - mask = result | _PAGE_SPECIAL; - - ptep = pte_offset_map(&pmd, addr); - do { - pte_t pte = gup_get_pte(ptep); - struct page *page; - - if ((pte_val(pte) & mask) != result) { - pte_unmap(ptep); - return 0; - } - VM_BUG_ON(!pfn_valid(pte_pfn(pte))); - page = pte_page(pte); - get_page(page); - __flush_anon_page(page, addr); - flush_dcache_page(page); - pages[*nr] = page; - (*nr)++; - - } while (ptep++, addr += PAGE_SIZE, addr != end); - pte_unmap(ptep - 1); - - return 1; -} - -static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end, - int write, struct page **pages, int *nr) -{ - unsigned long next; - pmd_t *pmdp; - - pmdp = pmd_offset(&pud, addr); - do { - pmd_t pmd = *pmdp; - - next = pmd_addr_end(addr, end); - if (pmd_none(pmd)) - return 0; - if (!gup_pte_range(pmd, addr, next, write, pages, nr)) - return 0; - } while (pmdp++, addr = next, addr != end); - - return 1; -} - -static int gup_pud_range(pgd_t pgd, unsigned long addr, unsigned long end, - int write, struct page **pages, int *nr) -{ - unsigned long next; - pud_t *pudp; - - pudp = pud_offset(&pgd, addr); - do { - pud_t pud = *pudp; - - next = pud_addr_end(addr, end); - if (pud_none(pud)) - return 0; - if (!gup_pmd_range(pud, addr, next, write, pages, nr)) - return 0; - } while (pudp++, addr = next, addr != end); - - return 1; -} - -/* - * Like get_user_pages_fast() except its IRQ-safe in that it won't fall - * back to the regular GUP. - * Note a difference with get_user_pages_fast: this always returns the - * number of pages pinned, 0 if no pages were pinned. - */ -int __get_user_pages_fast(unsigned long start, int nr_pages, int write, - struct page **pages) -{ - struct mm_struct *mm = current->mm; - unsigned long addr, len, end; - unsigned long next; - unsigned long flags; - pgd_t *pgdp; - int nr = 0; - - start &= PAGE_MASK; - addr = start; - len = (unsigned long) nr_pages << PAGE_SHIFT; - end = start + len; - if (unlikely(!access_ok((void __user *)start, len))) - return 0; - - /* - * This doesn't prevent pagetable teardown, but does prevent - * the pagetables and pages from being freed. - */ - local_irq_save(flags); - pgdp = pgd_offset(mm, addr); - do { - pgd_t pgd = *pgdp; - - next = pgd_addr_end(addr, end); - if (pgd_none(pgd)) - break; - if (!gup_pud_range(pgd, addr, next, write, pages, &nr)) - break; - } while (pgdp++, addr = next, addr != end); - local_irq_restore(flags); - - return nr; -} - -/** - * get_user_pages_fast() - pin user pages in memory - * @start: starting user address - * @nr_pages: number of pages from start to pin - * @gup_flags: flags modifying pin behaviour - * @pages: array that receives pointers to the pages pinned. - * Should be at least nr_pages long. - * - * Attempt to pin user pages in memory without taking mm->mmap_sem. - * If not successful, it will fall back to taking the lock and - * calling get_user_pages(). - * - * Returns number of pages pinned. This may be fewer than the number - * requested. If nr_pages is 0 or negative, returns 0. If no pages - * were pinned, returns -errno. - */ -int get_user_pages_fast(unsigned long start, int nr_pages, - unsigned int gup_flags, struct page **pages) -{ - struct mm_struct *mm = current->mm; - unsigned long addr, len, end; - unsigned long next; - pgd_t *pgdp; - int nr = 0; - - start &= PAGE_MASK; - addr = start; - len = (unsigned long) nr_pages << PAGE_SHIFT; - - end = start + len; - if (end < start) - goto slow_irqon; - - local_irq_disable(); - pgdp = pgd_offset(mm, addr); - do { - pgd_t pgd = *pgdp; - - next = pgd_addr_end(addr, end); - if (pgd_none(pgd)) - goto slow; - if (!gup_pud_range(pgd, addr, next, gup_flags & FOLL_WRITE, - pages, &nr)) - goto slow; - } while (pgdp++, addr = next, addr != end); - local_irq_enable(); - - VM_BUG_ON(nr != (end - start) >> PAGE_SHIFT); - return nr; - - { - int ret; - -slow: - local_irq_enable(); -slow_irqon: - /* Try to get the remaining pages with get_user_pages */ - start += nr << PAGE_SHIFT; - pages += nr; - - ret = get_user_pages_unlocked(start, - (end - start) >> PAGE_SHIFT, pages, - gup_flags); - - /* Have to be a bit careful with return values */ - if (nr > 0) { - if (ret < 0) - ret = nr; - else - ret += nr; - } - - return ret; - } -} From patchwork Tue Jun 11 14:40:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987195 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6D3CB14B6 for ; Tue, 11 Jun 2019 14:42:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5ED1628889 for ; Tue, 11 Jun 2019 14:42:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5D01C2888B; Tue, 11 Jun 2019 14:42:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0068B28872 for ; Tue, 11 Jun 2019 14:42:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391473AbfFKOm1 (ORCPT ); Tue, 11 Jun 2019 10:42:27 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37524 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391614AbfFKOlj (ORCPT ); Tue, 11 Jun 2019 10:41:39 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=aoM0W68wUr2XaaE/a2NJlPCtCZVNUAc490vBn4tMbec=; b=gDnBG4EJhorlcxzh8n3Q6192lS oxcv3q6eGknVP5wRVlITasIuZSJKr++EHCSVxOVhjVGn2cwkHjWGCmhBBQKUBElIcT9hruj4PtePO MHFxV+dlrAFJz1NPfBmMiKvrFq/MRfT7uYbat0vilb1MVShRUuxMpBhiL3YUJMzmll051/NWCIPfw fVxD5UZKDeeqxPGLKsBzgs99hbXHTDPXIDaEdBomeYM4pa2U55YSA/5KJVGWqy5P9EUg3aF1jdVmR NDp08Vb3hrxotUTnHlQuHiVKToQQT8DsKhT+zU8JUVij0alNbApDW3J1pFC6FZhCh8HRIB6gk/bsD CmmlBGzg==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahxo-0005R7-B5; Tue, 11 Jun 2019 14:41:28 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 07/16] sparc64: add the missing pgd_page definition Date: Tue, 11 Jun 2019 16:40:53 +0200 Message-Id: <20190611144102.8848-8-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP sparc64 only had pgd_page_vaddr, but not pgd_page. Signed-off-by: Christoph Hellwig --- arch/sparc/include/asm/pgtable_64.h | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index 22500c3be7a9..f0dcf991d27f 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -861,6 +861,7 @@ static inline unsigned long pud_page_vaddr(pud_t pud) #define pud_clear(pudp) (pud_val(*(pudp)) = 0UL) #define pgd_page_vaddr(pgd) \ ((unsigned long) __va(pgd_val(pgd))) +#define pgd_page(pgd) pfn_to_page(pgd_pfn(pgd)) #define pgd_present(pgd) (pgd_val(pgd) != 0U) #define pgd_clear(pgdp) (pgd_val(*(pgdp)) = 0UL) From patchwork Tue Jun 11 14:40:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987135 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7D9FE924 for ; Tue, 11 Jun 2019 14:41:51 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6F50E2887B for ; Tue, 11 Jun 2019 14:41:51 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6BA53281DB; Tue, 11 Jun 2019 14:41:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 654642888A for ; Tue, 11 Jun 2019 14:41:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391635AbfFKOln (ORCPT ); Tue, 11 Jun 2019 10:41:43 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37580 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391528AbfFKOln (ORCPT ); Tue, 11 Jun 2019 10:41:43 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=YLjfzK43wQ9CU00al+Ym+V5yBUHCJAJzq91hYzEVV+8=; b=lctxf60o+qeZ56iUVs4sFHqM9g Bhl9Dv3SAKW713Ypki/OIDIOKgshgcusVFDaJY04xE9PEL+nj6W+vPTcq7JYNGL6xmdw2OKqVHSZA wJ6OQJcmfOFC0wdk6AGsIWyjU3jowYGGcpltD1WM9o0bAv+v5iEFLlAgDzcA9g64YCZnPb3LMbiPy wfa1Qmx12MM5qG3ouc5kp6B6jJ7pltFZOfe7Y+qDODepS8DylwA9QQ6maqbjNTcnhamI0AxDdQ8Jo bTNehbXDNjVFbP2iwArNFZoxouMJxnrPETVIANVvrT3NyE4+eQyNI4FET7Z9TwvgVWAqjvhPnaJEp KTHj52cQ==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahxr-0005RJ-A2; Tue, 11 Jun 2019 14:41:31 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 08/16] sparc64: define untagged_addr() Date: Tue, 11 Jun 2019 16:40:54 +0200 Message-Id: <20190611144102.8848-9-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add a helper to untag a user pointer. This is needed for ADI support in get_user_pages_fast. Signed-off-by: Christoph Hellwig Reviewed-by: Khalid Aziz --- arch/sparc/include/asm/pgtable_64.h | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index f0dcf991d27f..1904782dcd39 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -1076,6 +1076,28 @@ static inline int io_remap_pfn_range(struct vm_area_struct *vma, } #define io_remap_pfn_range io_remap_pfn_range +static inline unsigned long untagged_addr(unsigned long start) +{ + if (adi_capable()) { + long addr = start; + + /* If userspace has passed a versioned address, kernel + * will not find it in the VMAs since it does not store + * the version tags in the list of VMAs. Storing version + * tags in list of VMAs is impractical since they can be + * changed any time from userspace without dropping into + * kernel. Any address search in VMAs will be done with + * non-versioned addresses. Ensure the ADI version bits + * are dropped here by sign extending the last bit before + * ADI bits. IOMMU does not implement version tags. + */ + return (addr << (long)adi_nbits()) >> (long)adi_nbits(); + } + + return start; +} +#define untagged_addr untagged_addr + #include #include From patchwork Tue Jun 11 14:40:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987187 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AA6481708 for ; Tue, 11 Jun 2019 14:42:27 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9D267281DB for ; Tue, 11 Jun 2019 14:42:27 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8FFD12877B; Tue, 11 Jun 2019 14:42:27 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B867328889 for ; Tue, 11 Jun 2019 14:42:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391649AbfFKOls (ORCPT ); Tue, 11 Jun 2019 10:41:48 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37662 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391528AbfFKOls (ORCPT ); Tue, 11 Jun 2019 10:41:48 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=CCBHIDdzGw8dGT9zTCEQtOYrgJc8ZgxZZtlN4zHEVIg=; b=QjUN4GXs9WS8NwKVyrN/1jfeyT 3JfuLGB1+nUhhITWtvXF8SBH4lyEcYO6uDwD65K79Me0wivesJlXkrfs5T57LWLMU9iQm4xFHLbVd N1rJtvtrWFaqA+CTkEy00BvVIAGlu+kLv0CNclptcLbddV9z0NDcgkg5NODv0TTOG+2ZSGfbseBqD XeLA7vHyIKnjNaCfUK495QHfnc9SPjvCMgdXYkHxwHnY3IlPaN33oaWheojAID9VApvkf5DXalh/8 J73XphRdNfXhsay33ktGM9xVYkdyXj+dMzrQXtknZ+uo5PNXfc1+eULr8rnEjLy2wv4OiIhJaysG8 l64kLMew==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahxu-0005Rg-3u; Tue, 11 Jun 2019 14:41:34 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 09/16] sparc64: use the generic get_user_pages_fast code Date: Tue, 11 Jun 2019 16:40:55 +0200 Message-Id: <20190611144102.8848-10-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The sparc64 code is mostly equivalent to the generic one, minus various bugfixes and two arch overrides that this patch adds to pgtable.h. Signed-off-by: Christoph Hellwig Reviewed-by: Khalid Aziz --- arch/sparc/Kconfig | 1 + arch/sparc/include/asm/pgtable_64.h | 18 ++ arch/sparc/mm/Makefile | 2 +- arch/sparc/mm/gup.c | 340 ---------------------------- 4 files changed, 20 insertions(+), 341 deletions(-) delete mode 100644 arch/sparc/mm/gup.c diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index 26ab6f5bbaaf..22435471f942 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -28,6 +28,7 @@ config SPARC select RTC_DRV_M48T59 select RTC_SYSTOHC select HAVE_ARCH_JUMP_LABEL if SPARC64 + select HAVE_GENERIC_GUP if SPARC64 select GENERIC_IRQ_SHOW select ARCH_WANT_IPC_PARSE_VERSION select GENERIC_PCI_IOMAP diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index 1904782dcd39..547ff96fb228 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -1098,6 +1098,24 @@ static inline unsigned long untagged_addr(unsigned long start) } #define untagged_addr untagged_addr +static inline bool pte_access_permitted(pte_t pte, bool write) +{ + u64 prot; + + if (tlb_type == hypervisor) { + prot = _PAGE_PRESENT_4V | _PAGE_P_4V; + if (write) + prot |= _PAGE_WRITE_4V; + } else { + prot = _PAGE_PRESENT_4U | _PAGE_P_4U; + if (write) + prot |= _PAGE_WRITE_4U; + } + + return (pte_val(pte) & (prot | _PAGE_SPECIAL)) == prot; +} +#define pte_access_permitted pte_access_permitted + #include #include diff --git a/arch/sparc/mm/Makefile b/arch/sparc/mm/Makefile index d39075b1e3b7..b078205b70e0 100644 --- a/arch/sparc/mm/Makefile +++ b/arch/sparc/mm/Makefile @@ -5,7 +5,7 @@ asflags-y := -ansi ccflags-y := -Werror -obj-$(CONFIG_SPARC64) += ultra.o tlb.o tsb.o gup.o +obj-$(CONFIG_SPARC64) += ultra.o tlb.o tsb.o obj-y += fault_$(BITS).o obj-y += init_$(BITS).o obj-$(CONFIG_SPARC32) += extable.o srmmu.o iommu.o io-unit.o diff --git a/arch/sparc/mm/gup.c b/arch/sparc/mm/gup.c deleted file mode 100644 index 1e770a517d4a..000000000000 --- a/arch/sparc/mm/gup.c +++ /dev/null @@ -1,340 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * Lockless get_user_pages_fast for sparc, cribbed from powerpc - * - * Copyright (C) 2008 Nick Piggin - * Copyright (C) 2008 Novell Inc. - */ - -#include -#include -#include -#include -#include -#include -#include - -/* - * The performance critical leaf functions are made noinline otherwise gcc - * inlines everything into a single function which results in too much - * register pressure. - */ -static noinline int gup_pte_range(pmd_t pmd, unsigned long addr, - unsigned long end, int write, struct page **pages, int *nr) -{ - unsigned long mask, result; - pte_t *ptep; - - if (tlb_type == hypervisor) { - result = _PAGE_PRESENT_4V|_PAGE_P_4V; - if (write) - result |= _PAGE_WRITE_4V; - } else { - result = _PAGE_PRESENT_4U|_PAGE_P_4U; - if (write) - result |= _PAGE_WRITE_4U; - } - mask = result | _PAGE_SPECIAL; - - ptep = pte_offset_kernel(&pmd, addr); - do { - struct page *page, *head; - pte_t pte = *ptep; - - if ((pte_val(pte) & mask) != result) - return 0; - VM_BUG_ON(!pfn_valid(pte_pfn(pte))); - - /* The hugepage case is simplified on sparc64 because - * we encode the sub-page pfn offsets into the - * hugepage PTEs. We could optimize this in the future - * use page_cache_add_speculative() for the hugepage case. - */ - page = pte_page(pte); - head = compound_head(page); - if (!page_cache_get_speculative(head)) - return 0; - if (unlikely(pte_val(pte) != pte_val(*ptep))) { - put_page(head); - return 0; - } - - pages[*nr] = page; - (*nr)++; - } while (ptep++, addr += PAGE_SIZE, addr != end); - - return 1; -} - -static int gup_huge_pmd(pmd_t *pmdp, pmd_t pmd, unsigned long addr, - unsigned long end, int write, struct page **pages, - int *nr) -{ - struct page *head, *page; - int refs; - - if (!(pmd_val(pmd) & _PAGE_VALID)) - return 0; - - if (write && !pmd_write(pmd)) - return 0; - - refs = 0; - page = pmd_page(pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); - head = compound_head(page); - do { - VM_BUG_ON(compound_head(page) != head); - pages[*nr] = page; - (*nr)++; - page++; - refs++; - } while (addr += PAGE_SIZE, addr != end); - - if (!page_cache_add_speculative(head, refs)) { - *nr -= refs; - return 0; - } - - if (unlikely(pmd_val(pmd) != pmd_val(*pmdp))) { - *nr -= refs; - while (refs--) - put_page(head); - return 0; - } - - return 1; -} - -static int gup_huge_pud(pud_t *pudp, pud_t pud, unsigned long addr, - unsigned long end, int write, struct page **pages, - int *nr) -{ - struct page *head, *page; - int refs; - - if (!(pud_val(pud) & _PAGE_VALID)) - return 0; - - if (write && !pud_write(pud)) - return 0; - - refs = 0; - page = pud_page(pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); - head = compound_head(page); - do { - VM_BUG_ON(compound_head(page) != head); - pages[*nr] = page; - (*nr)++; - page++; - refs++; - } while (addr += PAGE_SIZE, addr != end); - - if (!page_cache_add_speculative(head, refs)) { - *nr -= refs; - return 0; - } - - if (unlikely(pud_val(pud) != pud_val(*pudp))) { - *nr -= refs; - while (refs--) - put_page(head); - return 0; - } - - return 1; -} - -static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end, - int write, struct page **pages, int *nr) -{ - unsigned long next; - pmd_t *pmdp; - - pmdp = pmd_offset(&pud, addr); - do { - pmd_t pmd = *pmdp; - - next = pmd_addr_end(addr, end); - if (pmd_none(pmd)) - return 0; - if (unlikely(pmd_large(pmd))) { - if (!gup_huge_pmd(pmdp, pmd, addr, next, - write, pages, nr)) - return 0; - } else if (!gup_pte_range(pmd, addr, next, write, - pages, nr)) - return 0; - } while (pmdp++, addr = next, addr != end); - - return 1; -} - -static int gup_pud_range(pgd_t pgd, unsigned long addr, unsigned long end, - int write, struct page **pages, int *nr) -{ - unsigned long next; - pud_t *pudp; - - pudp = pud_offset(&pgd, addr); - do { - pud_t pud = *pudp; - - next = pud_addr_end(addr, end); - if (pud_none(pud)) - return 0; - if (unlikely(pud_large(pud))) { - if (!gup_huge_pud(pudp, pud, addr, next, - write, pages, nr)) - return 0; - } else if (!gup_pmd_range(pud, addr, next, write, pages, nr)) - return 0; - } while (pudp++, addr = next, addr != end); - - return 1; -} - -/* - * Note a difference with get_user_pages_fast: this always returns the - * number of pages pinned, 0 if no pages were pinned. - */ -int __get_user_pages_fast(unsigned long start, int nr_pages, int write, - struct page **pages) -{ - struct mm_struct *mm = current->mm; - unsigned long addr, len, end; - unsigned long next, flags; - pgd_t *pgdp; - int nr = 0; - -#ifdef CONFIG_SPARC64 - if (adi_capable()) { - long addr = start; - - /* If userspace has passed a versioned address, kernel - * will not find it in the VMAs since it does not store - * the version tags in the list of VMAs. Storing version - * tags in list of VMAs is impractical since they can be - * changed any time from userspace without dropping into - * kernel. Any address search in VMAs will be done with - * non-versioned addresses. Ensure the ADI version bits - * are dropped here by sign extending the last bit before - * ADI bits. IOMMU does not implement version tags. - */ - addr = (addr << (long)adi_nbits()) >> (long)adi_nbits(); - start = addr; - } -#endif - start &= PAGE_MASK; - addr = start; - len = (unsigned long) nr_pages << PAGE_SHIFT; - end = start + len; - - local_irq_save(flags); - pgdp = pgd_offset(mm, addr); - do { - pgd_t pgd = *pgdp; - - next = pgd_addr_end(addr, end); - if (pgd_none(pgd)) - break; - if (!gup_pud_range(pgd, addr, next, write, pages, &nr)) - break; - } while (pgdp++, addr = next, addr != end); - local_irq_restore(flags); - - return nr; -} - -int get_user_pages_fast(unsigned long start, int nr_pages, - unsigned int gup_flags, struct page **pages) -{ - struct mm_struct *mm = current->mm; - unsigned long addr, len, end; - unsigned long next; - pgd_t *pgdp; - int nr = 0; - -#ifdef CONFIG_SPARC64 - if (adi_capable()) { - long addr = start; - - /* If userspace has passed a versioned address, kernel - * will not find it in the VMAs since it does not store - * the version tags in the list of VMAs. Storing version - * tags in list of VMAs is impractical since they can be - * changed any time from userspace without dropping into - * kernel. Any address search in VMAs will be done with - * non-versioned addresses. Ensure the ADI version bits - * are dropped here by sign extending the last bit before - * ADI bits. IOMMU does not implements version tags, - */ - addr = (addr << (long)adi_nbits()) >> (long)adi_nbits(); - start = addr; - } -#endif - start &= PAGE_MASK; - addr = start; - len = (unsigned long) nr_pages << PAGE_SHIFT; - end = start + len; - - /* - * XXX: batch / limit 'nr', to avoid large irq off latency - * needs some instrumenting to determine the common sizes used by - * important workloads (eg. DB2), and whether limiting the batch size - * will decrease performance. - * - * It seems like we're in the clear for the moment. Direct-IO is - * the main guy that batches up lots of get_user_pages, and even - * they are limited to 64-at-a-time which is not so many. - */ - /* - * This doesn't prevent pagetable teardown, but does prevent - * the pagetables from being freed on sparc. - * - * So long as we atomically load page table pointers versus teardown, - * we can follow the address down to the the page and take a ref on it. - */ - local_irq_disable(); - - pgdp = pgd_offset(mm, addr); - do { - pgd_t pgd = *pgdp; - - next = pgd_addr_end(addr, end); - if (pgd_none(pgd)) - goto slow; - if (!gup_pud_range(pgd, addr, next, gup_flags & FOLL_WRITE, - pages, &nr)) - goto slow; - } while (pgdp++, addr = next, addr != end); - - local_irq_enable(); - - VM_BUG_ON(nr != (end - start) >> PAGE_SHIFT); - return nr; - - { - int ret; - -slow: - local_irq_enable(); - - /* Try to get the remaining pages with get_user_pages */ - start += nr << PAGE_SHIFT; - pages += nr; - - ret = get_user_pages_unlocked(start, - (end - start) >> PAGE_SHIFT, pages, - gup_flags); - - /* Have to be a bit careful with return values */ - if (nr > 0) { - if (ret < 0) - ret = nr; - else - ret += nr; - } - - return ret; - } -} From patchwork Tue Jun 11 14:40:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987141 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 98F6D924 for ; Tue, 11 Jun 2019 14:41:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8888428884 for ; Tue, 11 Jun 2019 14:41:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 789B92887B; Tue, 11 Jun 2019 14:41:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CF27C2887B for ; Tue, 11 Jun 2019 14:41:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391666AbfFKOlv (ORCPT ); Tue, 11 Jun 2019 10:41:51 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37712 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391662AbfFKOlv (ORCPT ); Tue, 11 Jun 2019 10:41:51 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=t2d9JxQzJuEgLYY/v71JevoOvDwte7pey1Fa/i1JLIA=; b=eoQkf68iEuTlysDEZ3s3bGGu6a AM3fALgVNwvh2wvApu1EevuZqvsFjSpqWQ84K3tustAgNWu7XbVr6AOhWTS+csaPefUlNcacSeswZ MWp+TbPRz07d4DHLB4b2ihSDa+cf9PwOm+bZmhMCnsYWvGo2Vf0h4FJhRqrtzbMUdKEzaU9o/Nmw6 nGy0Sy/0e/gDvXNoX9ArIpOOTkY6ggxm+aRa9LMP9xhRhx15brmok/7ZWIwr9jEC83BnIgK3pW36l 2TXLHfZNCxbiB8u4Vz6+d9KrbtsJAcnZ7HM7RtgW9Rgmt6grYLsJDCSFsU+vd8n7eu3XAC1BdFxuP mHY9VW2Q==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahxw-0005S0-SZ; Tue, 11 Jun 2019 14:41:37 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 10/16] mm: rename CONFIG_HAVE_GENERIC_GUP to CONFIG_HAVE_FAST_GUP Date: Tue, 11 Jun 2019 16:40:56 +0200 Message-Id: <20190611144102.8848-11-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We only support the generic GUP now, so rename the config option to be more clear, and always use the mm/Kconfig definition of the symbol and select it from the arch Kconfigs. Signed-off-by: Christoph Hellwig Reviewed-by: Khalid Aziz Reviewed-by: Jason Gunthorpe --- arch/arm/Kconfig | 5 +---- arch/arm64/Kconfig | 4 +--- arch/mips/Kconfig | 2 +- arch/powerpc/Kconfig | 2 +- arch/s390/Kconfig | 2 +- arch/sh/Kconfig | 2 +- arch/sparc/Kconfig | 2 +- arch/x86/Kconfig | 4 +--- mm/Kconfig | 2 +- mm/gup.c | 4 ++-- 10 files changed, 11 insertions(+), 18 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 8869742a85df..3879a3e2c511 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -73,6 +73,7 @@ config ARM select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU select HAVE_EXIT_THREAD + select HAVE_FAST_GUP if ARM_LPAE select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL select HAVE_FUNCTION_GRAPH_TRACER if !THUMB2_KERNEL && !CC_IS_CLANG select HAVE_FUNCTION_TRACER if !XIP_KERNEL @@ -1596,10 +1597,6 @@ config ARCH_SELECT_MEMORY_MODEL config HAVE_ARCH_PFN_VALID def_bool ARCH_HAS_HOLES_MEMORYMODEL || !SPARSEMEM -config HAVE_GENERIC_GUP - def_bool y - depends on ARM_LPAE - config HIGHMEM bool "High Memory Support" depends on MMU diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 697ea0510729..4a6ee3e92757 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -140,6 +140,7 @@ config ARM64 select HAVE_DMA_CONTIGUOUS select HAVE_DYNAMIC_FTRACE select HAVE_EFFICIENT_UNALIGNED_ACCESS + select HAVE_FAST_GUP select HAVE_FTRACE_MCOUNT_RECORD select HAVE_FUNCTION_TRACER select HAVE_FUNCTION_GRAPH_TRACER @@ -262,9 +263,6 @@ config GENERIC_CALIBRATE_DELAY config ZONE_DMA32 def_bool y -config HAVE_GENERIC_GUP - def_bool y - config ARCH_ENABLE_MEMORY_HOTPLUG def_bool y diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index 64108a2a16d4..b1e42f0e4ed0 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -54,10 +54,10 @@ config MIPS select HAVE_DMA_CONTIGUOUS select HAVE_DYNAMIC_FTRACE select HAVE_EXIT_THREAD + select HAVE_FAST_GUP select HAVE_FTRACE_MCOUNT_RECORD select HAVE_FUNCTION_GRAPH_TRACER select HAVE_FUNCTION_TRACER - select HAVE_GENERIC_GUP select HAVE_IDE select HAVE_IOREMAP_PROT select HAVE_IRQ_EXIT_ON_IRQ_STACK diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 8c1c636308c8..992a04796e56 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -185,12 +185,12 @@ config PPC select HAVE_DYNAMIC_FTRACE_WITH_REGS if MPROFILE_KERNEL select HAVE_EBPF_JIT if PPC64 select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU) + select HAVE_FAST_GUP select HAVE_FTRACE_MCOUNT_RECORD select HAVE_FUNCTION_ERROR_INJECTION select HAVE_FUNCTION_GRAPH_TRACER select HAVE_FUNCTION_TRACER select HAVE_GCC_PLUGINS if GCC_VERSION >= 50200 # plugin support on gcc <= 5.1 is buggy on PPC - select HAVE_GENERIC_GUP select HAVE_HW_BREAKPOINT if PERF_EVENTS && (PPC_BOOK3S || PPC_8xx) select HAVE_IDE select HAVE_IOREMAP_PROT diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 109243fdb6ec..aaff0376bf53 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -137,6 +137,7 @@ config S390 select HAVE_DMA_CONTIGUOUS select HAVE_DYNAMIC_FTRACE select HAVE_DYNAMIC_FTRACE_WITH_REGS + select HAVE_FAST_GUP select HAVE_EFFICIENT_UNALIGNED_ACCESS select HAVE_FENTRY select HAVE_FTRACE_MCOUNT_RECORD @@ -144,7 +145,6 @@ config S390 select HAVE_FUNCTION_TRACER select HAVE_FUTEX_CMPXCHG if FUTEX select HAVE_GCC_PLUGINS - select HAVE_GENERIC_GUP select HAVE_KERNEL_BZIP2 select HAVE_KERNEL_GZIP select HAVE_KERNEL_LZ4 diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig index 6fddfc3c9710..56712f3c9838 100644 --- a/arch/sh/Kconfig +++ b/arch/sh/Kconfig @@ -14,7 +14,7 @@ config SUPERH select HAVE_ARCH_TRACEHOOK select HAVE_PERF_EVENTS select HAVE_DEBUG_BUGVERBOSE - select HAVE_GENERIC_GUP + select HAVE_FAST_GUP select ARCH_HAVE_CUSTOM_GPIO_H select ARCH_HAVE_NMI_SAFE_CMPXCHG if (GUSA_RB || CPU_SH4A) select ARCH_HAS_GCOV_PROFILE_ALL diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index 22435471f942..659232b760e1 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -28,7 +28,7 @@ config SPARC select RTC_DRV_M48T59 select RTC_SYSTOHC select HAVE_ARCH_JUMP_LABEL if SPARC64 - select HAVE_GENERIC_GUP if SPARC64 + select HAVE_FAST_GUP if SPARC64 select GENERIC_IRQ_SHOW select ARCH_WANT_IPC_PARSE_VERSION select GENERIC_PCI_IOMAP diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 7cd53cc59f0f..44500e0ed630 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -157,6 +157,7 @@ config X86 select HAVE_EFFICIENT_UNALIGNED_ACCESS select HAVE_EISA select HAVE_EXIT_THREAD + select HAVE_FAST_GUP select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE select HAVE_FTRACE_MCOUNT_RECORD select HAVE_FUNCTION_GRAPH_TRACER @@ -2874,9 +2875,6 @@ config HAVE_ATOMIC_IOMAP config X86_DEV_DMA_OPS bool -config HAVE_GENERIC_GUP - def_bool y - source "drivers/firmware/Kconfig" source "arch/x86/kvm/Kconfig" diff --git a/mm/Kconfig b/mm/Kconfig index fe51f104a9e0..98dffb0f2447 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -132,7 +132,7 @@ config HAVE_MEMBLOCK_NODE_MAP config HAVE_MEMBLOCK_PHYS_MAP bool -config HAVE_GENERIC_GUP +config HAVE_FAST_GUP bool config ARCH_KEEP_MEMBLOCK diff --git a/mm/gup.c b/mm/gup.c index 9b72f2ea3471..7328890ad8d3 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1651,7 +1651,7 @@ struct page *get_dump_page(unsigned long addr) #endif /* CONFIG_ELF_CORE */ /* - * Generic Fast GUP + * Fast GUP * * get_user_pages_fast attempts to pin user pages by walking the page * tables directly and avoids taking locks. Thus the walker needs to be @@ -1683,7 +1683,7 @@ struct page *get_dump_page(unsigned long addr) * * This code is based heavily on the PowerPC implementation by Nick Piggin. */ -#ifdef CONFIG_HAVE_GENERIC_GUP +#ifdef CONFIG_HAVE_FAST_GUP #ifdef CONFIG_GUP_GET_PTE_LOW_HIGH /* * WARNING: only to be used in the get_user_pages_fast() implementation. From patchwork Tue Jun 11 14:40:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987183 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 615D414B6 for ; Tue, 11 Jun 2019 14:42:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 522512888B for ; Tue, 11 Jun 2019 14:42:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3EE2128896; Tue, 11 Jun 2019 14:42:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5278828884 for ; Tue, 11 Jun 2019 14:42:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391681AbfFKOl4 (ORCPT ); Tue, 11 Jun 2019 10:41:56 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37734 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391662AbfFKOlx (ORCPT ); Tue, 11 Jun 2019 10:41:53 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=Di5x9I+PmFbp3wQMXLFAFlDyJOmNB4FZD0AGEZH9ycM=; b=TNWLYOuKb4VsBbWmGlVRczEKVd Oc8fAb9Laqz/d2m6dxerNyqhqKK6NDR64Vp7M3aMej2vN/b1FX9x89y0D9h9xCZHG0FU5RlFKW5FX o3sy6OMgatLwY8+ZFFoZWJCOQRNVTwcefC7wxvZpmZ8AAAD1p+yl+dvVlwK5t3LmSL93sboYQjHop yZEAtlogrNrNYIQMRMEoPJza/+5vuExm681yLDlnq9mVjJuzZ0oII1MMnX458Rr76GSWalTonkUOd 3vbwVHdH6xvEuyni2brMTM/Ge49rp2TozJmoAsinSWEfz1zu/tAoJLbg1gpZZeGan87BQ88Zd/lq5 BfVIU2uA==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahxz-0005ST-Hi; Tue, 11 Jun 2019 14:41:40 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 11/16] mm: consolidate the get_user_pages* implementations Date: Tue, 11 Jun 2019 16:40:57 +0200 Message-Id: <20190611144102.8848-12-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Always build mm/gup.c, and move the nommu versions and replace the separate stubs for various functions by the default ones, with the _fast version always falling back to the slow path because gup_fast_permitted always returns false now if HAVE_FAST_GUP is not set, and we use the nommu version of __get_user_pages while keeping all the wrappers common. This also ensures the new put_user_pages* helpers are available for nommu, as those are currently missing, which would create a problem as soon as we actually grew users for it. Signed-off-by: Christoph Hellwig --- mm/Kconfig | 1 + mm/Makefile | 4 +- mm/gup.c | 476 +++++++++++++++++++++++++++++----------------------- mm/nommu.c | 88 ---------- mm/util.c | 47 ------ 5 files changed, 269 insertions(+), 347 deletions(-) diff --git a/mm/Kconfig b/mm/Kconfig index 98dffb0f2447..5c41409557da 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -133,6 +133,7 @@ config HAVE_MEMBLOCK_PHYS_MAP bool config HAVE_FAST_GUP + depends on MMU bool config ARCH_KEEP_MEMBLOCK diff --git a/mm/Makefile b/mm/Makefile index ac5e5ba78874..dc0746ca1109 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -22,7 +22,7 @@ KCOV_INSTRUMENT_mmzone.o := n KCOV_INSTRUMENT_vmstat.o := n mmu-y := nommu.o -mmu-$(CONFIG_MMU) := gup.o highmem.o memory.o mincore.o \ +mmu-$(CONFIG_MMU) := highmem.o memory.o mincore.o \ mlock.o mmap.o mmu_gather.o mprotect.o mremap.o \ msync.o page_vma_mapped.o pagewalk.o \ pgtable-generic.o rmap.o vmalloc.o @@ -39,7 +39,7 @@ obj-y := filemap.o mempool.o oom_kill.o fadvise.o \ mm_init.o mmu_context.o percpu.o slab_common.o \ compaction.o vmacache.o \ interval_tree.o list_lru.o workingset.o \ - debug.o $(mmu-y) + debug.o gup.o $(mmu-y) # Give 'page_alloc' its own module-parameter namespace page-alloc-y := page_alloc.o diff --git a/mm/gup.c b/mm/gup.c index 7328890ad8d3..fe4f205651fd 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -134,6 +134,7 @@ void put_user_pages(struct page **pages, unsigned long npages) } EXPORT_SYMBOL(put_user_pages); +#ifdef CONFIG_MMU static struct page *no_page_table(struct vm_area_struct *vma, unsigned int flags) { @@ -1100,86 +1101,6 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk, return pages_done; } -/* - * We can leverage the VM_FAULT_RETRY functionality in the page fault - * paths better by using either get_user_pages_locked() or - * get_user_pages_unlocked(). - * - * get_user_pages_locked() is suitable to replace the form: - * - * down_read(&mm->mmap_sem); - * do_something() - * get_user_pages(tsk, mm, ..., pages, NULL); - * up_read(&mm->mmap_sem); - * - * to: - * - * int locked = 1; - * down_read(&mm->mmap_sem); - * do_something() - * get_user_pages_locked(tsk, mm, ..., pages, &locked); - * if (locked) - * up_read(&mm->mmap_sem); - */ -long get_user_pages_locked(unsigned long start, unsigned long nr_pages, - unsigned int gup_flags, struct page **pages, - int *locked) -{ - /* - * FIXME: Current FOLL_LONGTERM behavior is incompatible with - * FAULT_FLAG_ALLOW_RETRY because of the FS DAX check requirement on - * vmas. As there are no users of this flag in this call we simply - * disallow this option for now. - */ - if (WARN_ON_ONCE(gup_flags & FOLL_LONGTERM)) - return -EINVAL; - - return __get_user_pages_locked(current, current->mm, start, nr_pages, - pages, NULL, locked, - gup_flags | FOLL_TOUCH); -} -EXPORT_SYMBOL(get_user_pages_locked); - -/* - * get_user_pages_unlocked() is suitable to replace the form: - * - * down_read(&mm->mmap_sem); - * get_user_pages(tsk, mm, ..., pages, NULL); - * up_read(&mm->mmap_sem); - * - * with: - * - * get_user_pages_unlocked(tsk, mm, ..., pages); - * - * It is functionally equivalent to get_user_pages_fast so - * get_user_pages_fast should be used instead if specific gup_flags - * (e.g. FOLL_FORCE) are not required. - */ -long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages, - struct page **pages, unsigned int gup_flags) -{ - struct mm_struct *mm = current->mm; - int locked = 1; - long ret; - - /* - * FIXME: Current FOLL_LONGTERM behavior is incompatible with - * FAULT_FLAG_ALLOW_RETRY because of the FS DAX check requirement on - * vmas. As there are no users of this flag in this call we simply - * disallow this option for now. - */ - if (WARN_ON_ONCE(gup_flags & FOLL_LONGTERM)) - return -EINVAL; - - down_read(&mm->mmap_sem); - ret = __get_user_pages_locked(current, mm, start, nr_pages, pages, NULL, - &locked, gup_flags | FOLL_TOUCH); - if (locked) - up_read(&mm->mmap_sem); - return ret; -} -EXPORT_SYMBOL(get_user_pages_unlocked); - /* * get_user_pages_remote() - pin user pages in memory * @tsk: the task_struct to use for page fault accounting, or @@ -1256,6 +1177,199 @@ long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm, } EXPORT_SYMBOL(get_user_pages_remote); +/** + * populate_vma_page_range() - populate a range of pages in the vma. + * @vma: target vma + * @start: start address + * @end: end address + * @nonblocking: + * + * This takes care of mlocking the pages too if VM_LOCKED is set. + * + * return 0 on success, negative error code on error. + * + * vma->vm_mm->mmap_sem must be held. + * + * If @nonblocking is NULL, it may be held for read or write and will + * be unperturbed. + * + * If @nonblocking is non-NULL, it must held for read only and may be + * released. If it's released, *@nonblocking will be set to 0. + */ +long populate_vma_page_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end, int *nonblocking) +{ + struct mm_struct *mm = vma->vm_mm; + unsigned long nr_pages = (end - start) / PAGE_SIZE; + int gup_flags; + + VM_BUG_ON(start & ~PAGE_MASK); + VM_BUG_ON(end & ~PAGE_MASK); + VM_BUG_ON_VMA(start < vma->vm_start, vma); + VM_BUG_ON_VMA(end > vma->vm_end, vma); + VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm); + + gup_flags = FOLL_TOUCH | FOLL_POPULATE | FOLL_MLOCK; + if (vma->vm_flags & VM_LOCKONFAULT) + gup_flags &= ~FOLL_POPULATE; + /* + * We want to touch writable mappings with a write fault in order + * to break COW, except for shared mappings because these don't COW + * and we would not want to dirty them for nothing. + */ + if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE) + gup_flags |= FOLL_WRITE; + + /* + * We want mlock to succeed for regions that have any permissions + * other than PROT_NONE. + */ + if (vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) + gup_flags |= FOLL_FORCE; + + /* + * We made sure addr is within a VMA, so the following will + * not result in a stack expansion that recurses back here. + */ + return __get_user_pages(current, mm, start, nr_pages, gup_flags, + NULL, NULL, nonblocking); +} + +/* + * __mm_populate - populate and/or mlock pages within a range of address space. + * + * This is used to implement mlock() and the MAP_POPULATE / MAP_LOCKED mmap + * flags. VMAs must be already marked with the desired vm_flags, and + * mmap_sem must not be held. + */ +int __mm_populate(unsigned long start, unsigned long len, int ignore_errors) +{ + struct mm_struct *mm = current->mm; + unsigned long end, nstart, nend; + struct vm_area_struct *vma = NULL; + int locked = 0; + long ret = 0; + + end = start + len; + + for (nstart = start; nstart < end; nstart = nend) { + /* + * We want to fault in pages for [nstart; end) address range. + * Find first corresponding VMA. + */ + if (!locked) { + locked = 1; + down_read(&mm->mmap_sem); + vma = find_vma(mm, nstart); + } else if (nstart >= vma->vm_end) + vma = vma->vm_next; + if (!vma || vma->vm_start >= end) + break; + /* + * Set [nstart; nend) to intersection of desired address + * range with the first VMA. Also, skip undesirable VMA types. + */ + nend = min(end, vma->vm_end); + if (vma->vm_flags & (VM_IO | VM_PFNMAP)) + continue; + if (nstart < vma->vm_start) + nstart = vma->vm_start; + /* + * Now fault in a range of pages. populate_vma_page_range() + * double checks the vma flags, so that it won't mlock pages + * if the vma was already munlocked. + */ + ret = populate_vma_page_range(vma, nstart, nend, &locked); + if (ret < 0) { + if (ignore_errors) { + ret = 0; + continue; /* continue at next VMA */ + } + break; + } + nend = nstart + ret * PAGE_SIZE; + ret = 0; + } + if (locked) + up_read(&mm->mmap_sem); + return ret; /* 0 or negative error code */ +} + +/** + * get_dump_page() - pin user page in memory while writing it to core dump + * @addr: user address + * + * Returns struct page pointer of user page pinned for dump, + * to be freed afterwards by put_page(). + * + * Returns NULL on any kind of failure - a hole must then be inserted into + * the corefile, to preserve alignment with its headers; and also returns + * NULL wherever the ZERO_PAGE, or an anonymous pte_none, has been found - + * allowing a hole to be left in the corefile to save diskspace. + * + * Called without mmap_sem, but after all other threads have been killed. + */ +#ifdef CONFIG_ELF_CORE +struct page *get_dump_page(unsigned long addr) +{ + struct vm_area_struct *vma; + struct page *page; + + if (__get_user_pages(current, current->mm, addr, 1, + FOLL_FORCE | FOLL_DUMP | FOLL_GET, &page, &vma, + NULL) < 1) + return NULL; + flush_cache_page(vma, addr, page_to_pfn(page)); + return page; +} +#endif /* CONFIG_ELF_CORE */ + +#else /* CONFIG_MMU */ +static long __get_user_pages_locked(struct task_struct *tsk, + struct mm_struct *mm, unsigned long start, + unsigned long nr_pages, struct page **pages, + struct vm_area_struct **vmas, int *locked, + unsigned int foll_flags) +{ + struct vm_area_struct *vma; + unsigned long vm_flags; + int i; + + /* calculate required read or write permissions. + * If FOLL_FORCE is set, we only require the "MAY" flags. + */ + vm_flags = (foll_flags & FOLL_WRITE) ? + (VM_WRITE | VM_MAYWRITE) : (VM_READ | VM_MAYREAD); + vm_flags &= (foll_flags & FOLL_FORCE) ? + (VM_MAYREAD | VM_MAYWRITE) : (VM_READ | VM_WRITE); + + for (i = 0; i < nr_pages; i++) { + vma = find_vma(mm, start); + if (!vma) + goto finish_or_fault; + + /* protect what we can, including chardevs */ + if ((vma->vm_flags & (VM_IO | VM_PFNMAP)) || + !(vm_flags & vma->vm_flags)) + goto finish_or_fault; + + if (pages) { + pages[i] = virt_to_page(start); + if (pages[i]) + get_page(pages[i]); + } + if (vmas) + vmas[i] = vma; + start = (start + PAGE_SIZE) & PAGE_MASK; + } + + return i; + +finish_or_fault: + return i ? : -EFAULT; +} +#endif /* !CONFIG_MMU */ + #if defined(CONFIG_FS_DAX) || defined (CONFIG_CMA) static bool check_dax_vmas(struct vm_area_struct **vmas, long nr_pages) { @@ -1417,7 +1531,7 @@ static long check_and_migrate_cma_pages(struct task_struct *tsk, { return nr_pages; } -#endif +#endif /* CONFIG_CMA */ /* * __gup_longterm_locked() is a wrapper for __get_user_pages_locked which @@ -1503,152 +1617,85 @@ long get_user_pages(unsigned long start, unsigned long nr_pages, } EXPORT_SYMBOL(get_user_pages); -/** - * populate_vma_page_range() - populate a range of pages in the vma. - * @vma: target vma - * @start: start address - * @end: end address - * @nonblocking: - * - * This takes care of mlocking the pages too if VM_LOCKED is set. +/* + * We can leverage the VM_FAULT_RETRY functionality in the page fault + * paths better by using either get_user_pages_locked() or + * get_user_pages_unlocked(). * - * return 0 on success, negative error code on error. + * get_user_pages_locked() is suitable to replace the form: * - * vma->vm_mm->mmap_sem must be held. + * down_read(&mm->mmap_sem); + * do_something() + * get_user_pages(tsk, mm, ..., pages, NULL); + * up_read(&mm->mmap_sem); * - * If @nonblocking is NULL, it may be held for read or write and will - * be unperturbed. + * to: * - * If @nonblocking is non-NULL, it must held for read only and may be - * released. If it's released, *@nonblocking will be set to 0. + * int locked = 1; + * down_read(&mm->mmap_sem); + * do_something() + * get_user_pages_locked(tsk, mm, ..., pages, &locked); + * if (locked) + * up_read(&mm->mmap_sem); */ -long populate_vma_page_range(struct vm_area_struct *vma, - unsigned long start, unsigned long end, int *nonblocking) +long get_user_pages_locked(unsigned long start, unsigned long nr_pages, + unsigned int gup_flags, struct page **pages, + int *locked) { - struct mm_struct *mm = vma->vm_mm; - unsigned long nr_pages = (end - start) / PAGE_SIZE; - int gup_flags; - - VM_BUG_ON(start & ~PAGE_MASK); - VM_BUG_ON(end & ~PAGE_MASK); - VM_BUG_ON_VMA(start < vma->vm_start, vma); - VM_BUG_ON_VMA(end > vma->vm_end, vma); - VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm); - - gup_flags = FOLL_TOUCH | FOLL_POPULATE | FOLL_MLOCK; - if (vma->vm_flags & VM_LOCKONFAULT) - gup_flags &= ~FOLL_POPULATE; - /* - * We want to touch writable mappings with a write fault in order - * to break COW, except for shared mappings because these don't COW - * and we would not want to dirty them for nothing. - */ - if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE) - gup_flags |= FOLL_WRITE; - /* - * We want mlock to succeed for regions that have any permissions - * other than PROT_NONE. + * FIXME: Current FOLL_LONGTERM behavior is incompatible with + * FAULT_FLAG_ALLOW_RETRY because of the FS DAX check requirement on + * vmas. As there are no users of this flag in this call we simply + * disallow this option for now. */ - if (vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) - gup_flags |= FOLL_FORCE; + if (WARN_ON_ONCE(gup_flags & FOLL_LONGTERM)) + return -EINVAL; - /* - * We made sure addr is within a VMA, so the following will - * not result in a stack expansion that recurses back here. - */ - return __get_user_pages(current, mm, start, nr_pages, gup_flags, - NULL, NULL, nonblocking); + return __get_user_pages_locked(current, current->mm, start, nr_pages, + pages, NULL, locked, + gup_flags | FOLL_TOUCH); } +EXPORT_SYMBOL(get_user_pages_locked); /* - * __mm_populate - populate and/or mlock pages within a range of address space. + * get_user_pages_unlocked() is suitable to replace the form: * - * This is used to implement mlock() and the MAP_POPULATE / MAP_LOCKED mmap - * flags. VMAs must be already marked with the desired vm_flags, and - * mmap_sem must not be held. + * down_read(&mm->mmap_sem); + * get_user_pages(tsk, mm, ..., pages, NULL); + * up_read(&mm->mmap_sem); + * + * with: + * + * get_user_pages_unlocked(tsk, mm, ..., pages); + * + * It is functionally equivalent to get_user_pages_fast so + * get_user_pages_fast should be used instead if specific gup_flags + * (e.g. FOLL_FORCE) are not required. */ -int __mm_populate(unsigned long start, unsigned long len, int ignore_errors) +long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages, + struct page **pages, unsigned int gup_flags) { struct mm_struct *mm = current->mm; - unsigned long end, nstart, nend; - struct vm_area_struct *vma = NULL; - int locked = 0; - long ret = 0; + int locked = 1; + long ret; - end = start + len; + /* + * FIXME: Current FOLL_LONGTERM behavior is incompatible with + * FAULT_FLAG_ALLOW_RETRY because of the FS DAX check requirement on + * vmas. As there are no users of this flag in this call we simply + * disallow this option for now. + */ + if (WARN_ON_ONCE(gup_flags & FOLL_LONGTERM)) + return -EINVAL; - for (nstart = start; nstart < end; nstart = nend) { - /* - * We want to fault in pages for [nstart; end) address range. - * Find first corresponding VMA. - */ - if (!locked) { - locked = 1; - down_read(&mm->mmap_sem); - vma = find_vma(mm, nstart); - } else if (nstart >= vma->vm_end) - vma = vma->vm_next; - if (!vma || vma->vm_start >= end) - break; - /* - * Set [nstart; nend) to intersection of desired address - * range with the first VMA. Also, skip undesirable VMA types. - */ - nend = min(end, vma->vm_end); - if (vma->vm_flags & (VM_IO | VM_PFNMAP)) - continue; - if (nstart < vma->vm_start) - nstart = vma->vm_start; - /* - * Now fault in a range of pages. populate_vma_page_range() - * double checks the vma flags, so that it won't mlock pages - * if the vma was already munlocked. - */ - ret = populate_vma_page_range(vma, nstart, nend, &locked); - if (ret < 0) { - if (ignore_errors) { - ret = 0; - continue; /* continue at next VMA */ - } - break; - } - nend = nstart + ret * PAGE_SIZE; - ret = 0; - } + down_read(&mm->mmap_sem); + ret = __get_user_pages_locked(current, mm, start, nr_pages, pages, NULL, + &locked, gup_flags | FOLL_TOUCH); if (locked) up_read(&mm->mmap_sem); - return ret; /* 0 or negative error code */ -} - -/** - * get_dump_page() - pin user page in memory while writing it to core dump - * @addr: user address - * - * Returns struct page pointer of user page pinned for dump, - * to be freed afterwards by put_page(). - * - * Returns NULL on any kind of failure - a hole must then be inserted into - * the corefile, to preserve alignment with its headers; and also returns - * NULL wherever the ZERO_PAGE, or an anonymous pte_none, has been found - - * allowing a hole to be left in the corefile to save diskspace. - * - * Called without mmap_sem, but after all other threads have been killed. - */ -#ifdef CONFIG_ELF_CORE -struct page *get_dump_page(unsigned long addr) -{ - struct vm_area_struct *vma; - struct page *page; - - if (__get_user_pages(current, current->mm, addr, 1, - FOLL_FORCE | FOLL_DUMP | FOLL_GET, &page, &vma, - NULL) < 1) - return NULL; - flush_cache_page(vma, addr, page_to_pfn(page)); - return page; + return ret; } -#endif /* CONFIG_ELF_CORE */ +EXPORT_SYMBOL(get_user_pages_unlocked); /* * Fast GUP @@ -1683,7 +1730,7 @@ struct page *get_dump_page(unsigned long addr) * * This code is based heavily on the PowerPC implementation by Nick Piggin. */ -#ifdef CONFIG_HAVE_FAST_GUP +#if defined(CONFIG_MMU) && defined(CONFIG_HAVE_FAST_GUP) #ifdef CONFIG_GUP_GET_PTE_LOW_HIGH /* * WARNING: only to be used in the get_user_pages_fast() implementation. @@ -2160,6 +2207,12 @@ static void gup_pgd_range(unsigned long addr, unsigned long end, return; } while (pgdp++, addr = next, addr != end); } +#else +static inline void gup_pgd_range(unsigned long addr, unsigned long end, + unsigned int flags, struct page **pages, int *nr) +{ +} +#endif /* CONFIG_HAVE_FAST_GUP */ #ifndef gup_fast_permitted /* @@ -2168,7 +2221,7 @@ static void gup_pgd_range(unsigned long addr, unsigned long end, */ static bool gup_fast_permitted(unsigned long start, unsigned long end) { - return true; + return IS_ENABLED(CONFIG_HAVE_FAST_GUP) ? true : false; } #endif @@ -2177,6 +2230,9 @@ static bool gup_fast_permitted(unsigned long start, unsigned long end) * the regular GUP. * Note a difference with get_user_pages_fast: this always returns the * number of pages pinned, 0 if no pages were pinned. + * + * If the architecture does not support this function, simply return with no + * pages pinned. */ int __get_user_pages_fast(unsigned long start, int nr_pages, int write, struct page **pages) @@ -2214,6 +2270,7 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write, return nr; } +EXPORT_SYMBOL_GPL(__get_user_pages_fast); static int __gup_longterm_unlocked(unsigned long start, int nr_pages, unsigned int gup_flags, struct page **pages) @@ -2296,5 +2353,4 @@ int get_user_pages_fast(unsigned long start, int nr_pages, return ret; } - -#endif /* CONFIG_HAVE_GENERIC_GUP */ +EXPORT_SYMBOL_GPL(get_user_pages_fast); diff --git a/mm/nommu.c b/mm/nommu.c index d8c02fbe03b5..07165ad2e548 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -111,94 +111,6 @@ unsigned int kobjsize(const void *objp) return PAGE_SIZE << compound_order(page); } -static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, - unsigned long start, unsigned long nr_pages, - unsigned int foll_flags, struct page **pages, - struct vm_area_struct **vmas, int *nonblocking) -{ - struct vm_area_struct *vma; - unsigned long vm_flags; - int i; - - /* calculate required read or write permissions. - * If FOLL_FORCE is set, we only require the "MAY" flags. - */ - vm_flags = (foll_flags & FOLL_WRITE) ? - (VM_WRITE | VM_MAYWRITE) : (VM_READ | VM_MAYREAD); - vm_flags &= (foll_flags & FOLL_FORCE) ? - (VM_MAYREAD | VM_MAYWRITE) : (VM_READ | VM_WRITE); - - for (i = 0; i < nr_pages; i++) { - vma = find_vma(mm, start); - if (!vma) - goto finish_or_fault; - - /* protect what we can, including chardevs */ - if ((vma->vm_flags & (VM_IO | VM_PFNMAP)) || - !(vm_flags & vma->vm_flags)) - goto finish_or_fault; - - if (pages) { - pages[i] = virt_to_page(start); - if (pages[i]) - get_page(pages[i]); - } - if (vmas) - vmas[i] = vma; - start = (start + PAGE_SIZE) & PAGE_MASK; - } - - return i; - -finish_or_fault: - return i ? : -EFAULT; -} - -/* - * get a list of pages in an address range belonging to the specified process - * and indicate the VMA that covers each page - * - this is potentially dodgy as we may end incrementing the page count of a - * slab page or a secondary page from a compound page - * - don't permit access to VMAs that don't support it, such as I/O mappings - */ -long get_user_pages(unsigned long start, unsigned long nr_pages, - unsigned int gup_flags, struct page **pages, - struct vm_area_struct **vmas) -{ - return __get_user_pages(current, current->mm, start, nr_pages, - gup_flags, pages, vmas, NULL); -} -EXPORT_SYMBOL(get_user_pages); - -long get_user_pages_locked(unsigned long start, unsigned long nr_pages, - unsigned int gup_flags, struct page **pages, - int *locked) -{ - return get_user_pages(start, nr_pages, gup_flags, pages, NULL); -} -EXPORT_SYMBOL(get_user_pages_locked); - -static long __get_user_pages_unlocked(struct task_struct *tsk, - struct mm_struct *mm, unsigned long start, - unsigned long nr_pages, struct page **pages, - unsigned int gup_flags) -{ - long ret; - down_read(&mm->mmap_sem); - ret = __get_user_pages(tsk, mm, start, nr_pages, gup_flags, pages, - NULL, NULL); - up_read(&mm->mmap_sem); - return ret; -} - -long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages, - struct page **pages, unsigned int gup_flags) -{ - return __get_user_pages_unlocked(current, current->mm, start, nr_pages, - pages, gup_flags); -} -EXPORT_SYMBOL(get_user_pages_unlocked); - /** * follow_pfn - look up PFN at a user virtual address * @vma: memory mapping diff --git a/mm/util.c b/mm/util.c index 9834c4ab7d8e..68575a315dc5 100644 --- a/mm/util.c +++ b/mm/util.c @@ -300,53 +300,6 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack) } #endif -/* - * Like get_user_pages_fast() except its IRQ-safe in that it won't fall - * back to the regular GUP. - * Note a difference with get_user_pages_fast: this always returns the - * number of pages pinned, 0 if no pages were pinned. - * If the architecture does not support this function, simply return with no - * pages pinned. - */ -int __weak __get_user_pages_fast(unsigned long start, - int nr_pages, int write, struct page **pages) -{ - return 0; -} -EXPORT_SYMBOL_GPL(__get_user_pages_fast); - -/** - * get_user_pages_fast() - pin user pages in memory - * @start: starting user address - * @nr_pages: number of pages from start to pin - * @gup_flags: flags modifying pin behaviour - * @pages: array that receives pointers to the pages pinned. - * Should be at least nr_pages long. - * - * get_user_pages_fast provides equivalent functionality to get_user_pages, - * operating on current and current->mm, with force=0 and vma=NULL. However - * unlike get_user_pages, it must be called without mmap_sem held. - * - * get_user_pages_fast may take mmap_sem and page table locks, so no - * assumptions can be made about lack of locking. get_user_pages_fast is to be - * implemented in a way that is advantageous (vs get_user_pages()) when the - * user memory area is already faulted in and present in ptes. However if the - * pages have to be faulted in, it may turn out to be slightly slower so - * callers need to carefully consider what to use. On many architectures, - * get_user_pages_fast simply falls back to get_user_pages. - * - * Return: number of pages pinned. This may be fewer than the number - * requested. If nr_pages is 0 or negative, returns 0. If no pages - * were pinned, returns -errno. - */ -int __weak get_user_pages_fast(unsigned long start, - int nr_pages, unsigned int gup_flags, - struct page **pages) -{ - return get_user_pages_unlocked(start, nr_pages, pages, gup_flags); -} -EXPORT_SYMBOL_GPL(get_user_pages_fast); - unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr, unsigned long len, unsigned long prot, unsigned long flag, unsigned long pgoff) From patchwork Tue Jun 11 14:40:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987173 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9933C18A6 for ; Tue, 11 Jun 2019 14:42:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8A99C2887B for ; Tue, 11 Jun 2019 14:42:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 890462888B; Tue, 11 Jun 2019 14:42:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 413C42887B for ; Tue, 11 Jun 2019 14:42:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391686AbfFKOl4 (ORCPT ); Tue, 11 Jun 2019 10:41:56 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37754 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391234AbfFKOly (ORCPT ); Tue, 11 Jun 2019 10:41:54 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=19P6HKT8gx+qvtagjEoAfgixjiXJTHCy2J5lWoKb+RI=; b=RvMLHQLM0EqS4QeoOT3XL4S8Cn bxrsz4T2AKHyg/Jnz9ysLo0v7PvYXCdSxmLvfSIpHzfHqH9NklV5iCu1CJz+7sk5NP0HxWHhNDbVS APpwXXhK22mKDDmkVGTsQl6GmckrjXQaHs33F/LL3JxSB4TnsvLYz7ZZIfzWiwRU6CjT/SINtqap4 868A4nTBLbFV2gFiY5ook73YDF/6b+QARegN2/rb9yAqWpqUDNB0D9CjTXtsogPm0omdTGL9Ajdmp roiPNf+qsDk//cZFte1jfmP7N7hAUk6jNYUyMBU3Oxq/RYBPF/lvSKYiKGZQIUGuymNMKWpU8qPg8 8Ld3TJGg==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahy2-0005YC-6o; Tue, 11 Jun 2019 14:41:42 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 12/16] mm: validate get_user_pages_fast flags Date: Tue, 11 Jun 2019 16:40:58 +0200 Message-Id: <20190611144102.8848-13-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We can only deal with FOLL_WRITE and/or FOLL_LONGTERM in get_user_pages_fast, so reject all other flags. Signed-off-by: Christoph Hellwig --- mm/gup.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/mm/gup.c b/mm/gup.c index fe4f205651fd..78dc1871b3d4 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2317,6 +2317,9 @@ int get_user_pages_fast(unsigned long start, int nr_pages, unsigned long addr, len, end; int nr = 0, ret = 0; + if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM))) + return -EINVAL; + start = untagged_addr(start) & PAGE_MASK; addr = start; len = (unsigned long) nr_pages << PAGE_SHIFT; From patchwork Tue Jun 11 14:40:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987165 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2D07D18E8 for ; Tue, 11 Jun 2019 14:42:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1FC2C28783 for ; Tue, 11 Jun 2019 14:42:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1D9B528884; Tue, 11 Jun 2019 14:42:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8700428872 for ; Tue, 11 Jun 2019 14:42:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391706AbfFKOmA (ORCPT ); Tue, 11 Jun 2019 10:42:00 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37814 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391234AbfFKOmA (ORCPT ); Tue, 11 Jun 2019 10:42:00 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=BBXXT9xa9mrRhIGkEBsSxPqH41gwjewyswp0iGthWuw=; b=G8LOT4PF2yvCs1mHkYytZqaKeQ xEujP5jOt56yiaI+d8fgDk4Xmq480Lul6ae1hHIRTkyZ4l7d6Fsri/dNIUK9LffjZeavJwiShIBZe Q44n5eKUC+iTdz7NZezLbHWpMZoll7MPwJGN4b+uFUK/T3Wp6fklHarEf9lPLlQsbnIV2Ppjy6PC2 aEKuoSGKW0MabJ7AF8PN1lRrSYS8/EUd9PN10WcpUVfYQl5IsiolX/4XUXYwV+6XNGn1jowk49CzR O7mStspTewveegDCsMZIAXSzkHajsbpdPgODgDrKkW1Ef97kDEoc2nmsBeNk3X+3EarOGJxgFUb9c vjWvJ/5A==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahy5-0005dm-5p; Tue, 11 Jun 2019 14:41:45 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 13/16] mm: move the powerpc hugepd code to mm/gup.c Date: Tue, 11 Jun 2019 16:40:59 +0200 Message-Id: <20190611144102.8848-14-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP While only powerpc supports the hugepd case, the code is pretty generic and I'd like to keep all GUP internals in one place. Signed-off-by: Christoph Hellwig --- arch/powerpc/Kconfig | 1 + arch/powerpc/mm/hugetlbpage.c | 72 ------------------------------ include/linux/hugetlb.h | 18 -------- mm/Kconfig | 10 +++++ mm/gup.c | 82 +++++++++++++++++++++++++++++++++++ 5 files changed, 93 insertions(+), 90 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 992a04796e56..4f1b00979cde 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -125,6 +125,7 @@ config PPC select ARCH_HAS_FORTIFY_SOURCE select ARCH_HAS_GCOV_PROFILE_ALL select ARCH_HAS_KCOV + select ARCH_HAS_HUGEPD if HUGETLB_PAGE select ARCH_HAS_MMIOWB if PPC64 select ARCH_HAS_PHYS_TO_DMA select ARCH_HAS_PMEM_API if PPC64 diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index b5d92dc32844..51716c11d0fb 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -511,13 +511,6 @@ struct page *follow_huge_pd(struct vm_area_struct *vma, return page; } -static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end, - unsigned long sz) -{ - unsigned long __boundary = (addr + sz) & ~(sz-1); - return (__boundary - 1 < end - 1) ? __boundary : end; -} - #ifdef CONFIG_PPC_MM_SLICES unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, unsigned long pgoff, @@ -665,68 +658,3 @@ void flush_dcache_icache_hugepage(struct page *page) } } } - -static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr, - unsigned long end, int write, struct page **pages, int *nr) -{ - unsigned long pte_end; - struct page *head, *page; - pte_t pte; - int refs; - - pte_end = (addr + sz) & ~(sz-1); - if (pte_end < end) - end = pte_end; - - pte = READ_ONCE(*ptep); - - if (!pte_access_permitted(pte, write)) - return 0; - - /* hugepages are never "special" */ - VM_BUG_ON(!pfn_valid(pte_pfn(pte))); - - refs = 0; - head = pte_page(pte); - - page = head + ((addr & (sz-1)) >> PAGE_SHIFT); - do { - VM_BUG_ON(compound_head(page) != head); - pages[*nr] = page; - (*nr)++; - page++; - refs++; - } while (addr += PAGE_SIZE, addr != end); - - if (!page_cache_add_speculative(head, refs)) { - *nr -= refs; - return 0; - } - - if (unlikely(pte_val(pte) != pte_val(*ptep))) { - /* Could be optimized better */ - *nr -= refs; - while (refs--) - put_page(head); - return 0; - } - - return 1; -} - -int gup_huge_pd(hugepd_t hugepd, unsigned long addr, unsigned int pdshift, - unsigned long end, int write, struct page **pages, int *nr) -{ - pte_t *ptep; - unsigned long sz = 1UL << hugepd_shift(hugepd); - unsigned long next; - - ptep = hugepte_offset(hugepd, addr, pdshift); - do { - next = hugepte_addr_end(addr, end, sz); - if (!gup_hugepte(ptep, sz, addr, end, write, pages, nr)) - return 0; - } while (ptep++, addr = next, addr != end); - - return 1; -} diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index edf476c8cfb9..0f91761e2c53 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -16,29 +16,11 @@ struct user_struct; struct mmu_gather; #ifndef is_hugepd -/* - * Some architectures requires a hugepage directory format that is - * required to support multiple hugepage sizes. For example - * a4fe3ce76 "powerpc/mm: Allow more flexible layouts for hugepage pagetables" - * introduced the same on powerpc. This allows for a more flexible hugepage - * pagetable layout. - */ typedef struct { unsigned long pd; } hugepd_t; #define is_hugepd(hugepd) (0) #define __hugepd(x) ((hugepd_t) { (x) }) -static inline int gup_huge_pd(hugepd_t hugepd, unsigned long addr, - unsigned pdshift, unsigned long end, - int write, struct page **pages, int *nr) -{ - return 0; -} -#else -extern int gup_huge_pd(hugepd_t hugepd, unsigned long addr, - unsigned pdshift, unsigned long end, - int write, struct page **pages, int *nr); #endif - #ifdef CONFIG_HUGETLB_PAGE #include diff --git a/mm/Kconfig b/mm/Kconfig index 5c41409557da..44be3f01a2b2 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -769,4 +769,14 @@ config GUP_GET_PTE_LOW_HIGH config ARCH_HAS_PTE_SPECIAL bool +# +# Some architectures require a special hugepage directory format that is +# required to support multiple hugepage sizes. For example a4fe3ce76 +# "powerpc/mm: Allow more flexible layouts for hugepage pagetables" +# introduced it on powerpc. This allows for a more flexible hugepage +# pagetable layouts. +# +config ARCH_HAS_HUGEPD + bool + endmenu diff --git a/mm/gup.c b/mm/gup.c index 78dc1871b3d4..494aa4c3a55e 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1967,6 +1967,88 @@ static int __gup_device_huge_pud(pud_t pud, pud_t *pudp, unsigned long addr, } #endif +#ifdef CONFIG_ARCH_HAS_HUGEPD +static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end, + unsigned long sz) +{ + unsigned long __boundary = (addr + sz) & ~(sz-1); + return (__boundary - 1 < end - 1) ? __boundary : end; +} + +static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr, + unsigned long end, int write, struct page **pages, int *nr) +{ + unsigned long pte_end; + struct page *head, *page; + pte_t pte; + int refs; + + pte_end = (addr + sz) & ~(sz-1); + if (pte_end < end) + end = pte_end; + + pte = READ_ONCE(*ptep); + + if (!pte_access_permitted(pte, write)) + return 0; + + /* hugepages are never "special" */ + VM_BUG_ON(!pfn_valid(pte_pfn(pte))); + + refs = 0; + head = pte_page(pte); + + page = head + ((addr & (sz-1)) >> PAGE_SHIFT); + do { + VM_BUG_ON(compound_head(page) != head); + pages[*nr] = page; + (*nr)++; + page++; + refs++; + } while (addr += PAGE_SIZE, addr != end); + + if (!page_cache_add_speculative(head, refs)) { + *nr -= refs; + return 0; + } + + if (unlikely(pte_val(pte) != pte_val(*ptep))) { + /* Could be optimized better */ + *nr -= refs; + while (refs--) + put_page(head); + return 0; + } + + return 1; +} + +static int gup_huge_pd(hugepd_t hugepd, unsigned long addr, + unsigned int pdshift, unsigned long end, int write, + struct page **pages, int *nr) +{ + pte_t *ptep; + unsigned long sz = 1UL << hugepd_shift(hugepd); + unsigned long next; + + ptep = hugepte_offset(hugepd, addr, pdshift); + do { + next = hugepte_addr_end(addr, end, sz); + if (!gup_hugepte(ptep, sz, addr, end, write, pages, nr)) + return 0; + } while (ptep++, addr = next, addr != end); + + return 1; +} +#else +static inline int gup_huge_pd(hugepd_t hugepd, unsigned long addr, + unsigned pdshift, unsigned long end, int write, + struct page **pages, int *nr) +{ + return 0; +} +#endif /* CONFIG_ARCH_HAS_HUGEPD */ + static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr, unsigned long end, unsigned int flags, struct page **pages, int *nr) { From patchwork Tue Jun 11 14:41:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987145 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 72D3B14B6 for ; Tue, 11 Jun 2019 14:42:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 658832888B for ; Tue, 11 Jun 2019 14:42:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6383B28896; Tue, 11 Jun 2019 14:42:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 14D2E2888B for ; Tue, 11 Jun 2019 14:42:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391712AbfFKOmB (ORCPT ); Tue, 11 Jun 2019 10:42:01 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37826 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391705AbfFKOmA (ORCPT ); Tue, 11 Jun 2019 10:42:00 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=92m0mWNJuHmY5cYpWlooP/cZgBBhq6eJcxducqZ88IQ=; b=rr+6i7rPu5se1K/u84C05O1tdY GzqfoR4ThRZnBYykD/g3PZVO7wt8FMplFhvZc9xK09NnOLWoT3OfPdCMgF6j8idQ5hVx9Llo1V/aA gkylDbMT/JiY3hZo9qmZ7jM0kZrPGMqnbWcr+ylPSRBycRzfkVl6yoTaZ1ZcJXTyhNuZUVuhmKdkL 39NSJ31ih14hQPhzjcZ3GtnHRqtmiZsoUO6nKeiz7ttH5Iuyf6If7xjguwIwAwHBQreLYaFo0nkc2 q9x/QDp+b7OsLJTf33M2uFsaCD+JFkB+KMn94HytN+qVUEV+oOEb6fRveAy1MReAX/mRXtPMLUcHi LLWRbdIw==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahy7-0005mk-T7; Tue, 11 Jun 2019 14:41:48 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 14/16] mm: switch gup_hugepte to use try_get_compound_head Date: Tue, 11 Jun 2019 16:41:00 +0200 Message-Id: <20190611144102.8848-15-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This applies the overflow fixes from 8fde12ca79aff ("mm: prevent get_user_pages() from overflowing page refcount") to the powerpc hugepd code and brings it back in sync with the other GUP cases. Signed-off-by: Christoph Hellwig --- mm/gup.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/gup.c b/mm/gup.c index 494aa4c3a55e..0733674b539d 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2007,7 +2007,8 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr, refs++; } while (addr += PAGE_SIZE, addr != end); - if (!page_cache_add_speculative(head, refs)) { + head = try_get_compound_head(head, refs); + if (!head) { *nr -= refs; return 0; } From patchwork Tue Jun 11 14:41:01 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987151 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8B1AE1708 for ; Tue, 11 Jun 2019 14:42:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7C5C726C9B for ; Tue, 11 Jun 2019 14:42:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7AA3D2838B; Tue, 11 Jun 2019 14:42:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 26E1E28889 for ; Tue, 11 Jun 2019 14:42:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391725AbfFKOmD (ORCPT ); Tue, 11 Jun 2019 10:42:03 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37846 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391720AbfFKOmD (ORCPT ); Tue, 11 Jun 2019 10:42:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=A1i8h65qriplI15qgyPKPV5aWfBos+CWcUOK7xSAxuY=; b=NOYU9DLYhjro838aLFWjvDoMbZ UkqZ8t//il4sKiPD2MvuzAucG5i7l8pWQ6VZyXnM1lDiBoQSz1ebEgO0BoHCTP8qbcTO53B6Bxri5 4hFy8f18Wb9wpm7g7LLyMcw6j6ZRXpvaD3DUeonwZKNckLX8QbqQvsZlKL5eHMh/7DG7/kuoYQcX/ YW+S9XokqeT/7Vw2z5+vqNVRt8eGRYlmsFbAa22ATQgqR5aJPASr2OARLcEF1Ta3Mn/k33RIaEXrm z6rmE/LSYeU+Btb4GR8sulOJQPhFjwpE69UYW3PVZcX30EaXUIHK9X6wejIo8dhub+RTZQTowcoNm oUmZ6b2A==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahyB-0005u1-RB; Tue, 11 Jun 2019 14:41:52 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 15/16] mm: mark the page referenced in gup_hugepte Date: Tue, 11 Jun 2019 16:41:01 +0200 Message-Id: <20190611144102.8848-16-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP All other get_user_page_fast cases mark the page referenced, so do this here as well. Signed-off-by: Christoph Hellwig --- mm/gup.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/gup.c b/mm/gup.c index 0733674b539d..8bcc042f933a 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2021,6 +2021,7 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr, return 0; } + SetPageReferenced(head); return 1; } From patchwork Tue Jun 11 14:41:02 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10987169 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CF79314B6 for ; Tue, 11 Jun 2019 14:42:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BC59228889 for ; Tue, 11 Jun 2019 14:42:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AF4BA2888B; Tue, 11 Jun 2019 14:42:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 415C128783 for ; Tue, 11 Jun 2019 14:42:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391747AbfFKOmP (ORCPT ); Tue, 11 Jun 2019 10:42:15 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:37902 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391473AbfFKOmP (ORCPT ); Tue, 11 Jun 2019 10:42:15 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=eRYqrBdA83ELJwzehQp5ACnwy4+Ii18V3bNuC63QHNg=; b=gyjzOdFjX2Hc07Fw3ofnTYcH1V 4motr276Y/b+VWeq+PJMXx+r8fzS5ixUMH8KzPOTBCwRPmWWvyi52ti5GBX2fi5EAvr1apreGQjxn gF2YXCRf/RJYs1iu0i28vAyitkrJPd3TGl5GmOX6bpzaMXee/5BOspGVUTsI63qNWTGuEX3DrIo02 KFBHJaCrQiVe1nxb+CkKluXla2IExJuZRuxtU+WvON5pJTvvjJojqE9/+MhkQC8ZKJv5UDK5EYwmU Srstwn4NGlvjlOwDSPHxhZW7Hm2f+wGyT1yk20gBUnkms/ozNmJEKv5vcWFuLgm6cYJ24W/8srhis AnP8HaGQ==; Received: from mpp-cp1-natpool-1-037.ethz.ch ([82.130.71.37] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hahyF-000665-83; Tue, 11 Jun 2019 14:41:55 +0000 From: Christoph Hellwig To: Linus Torvalds , Paul Burton , James Hogan , Yoshinori Sato , Rich Felker , "David S. Miller" Cc: Nicholas Piggin , Khalid Aziz , Andrey Konovalov , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-mips@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 16/16] mm: pass get_user_pages_fast iterator arguments in a structure Date: Tue, 11 Jun 2019 16:41:02 +0200 Message-Id: <20190611144102.8848-17-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190611144102.8848-1-hch@lst.de> References: <20190611144102.8848-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-sh-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sh@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Instead of passing a set of always repeated arguments down the get_user_pages_fast iterators, create a struct gup_args to hold them and pass that by reference. This leads to an over 100 byte .text size reduction for x86-64. Signed-off-by: Christoph Hellwig --- mm/gup.c | 338 ++++++++++++++++++++++++++----------------------------- 1 file changed, 158 insertions(+), 180 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index 8bcc042f933a..419a565fc998 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -24,6 +24,13 @@ #include "internal.h" +struct gup_args { + unsigned long addr; + unsigned int flags; + struct page **pages; + unsigned int nr; +}; + struct follow_page_context { struct dev_pagemap *pgmap; unsigned int page_mask; @@ -1786,10 +1793,10 @@ static inline pte_t gup_get_pte(pte_t *ptep) } #endif /* CONFIG_GUP_GET_PTE_LOW_HIGH */ -static void undo_dev_pagemap(int *nr, int nr_start, struct page **pages) +static void undo_dev_pagemap(struct gup_args *args, int nr_start) { - while ((*nr) - nr_start) { - struct page *page = pages[--(*nr)]; + while (args->nr - nr_start) { + struct page *page = args->pages[--args->nr]; ClearPageReferenced(page); put_page(page); @@ -1811,14 +1818,13 @@ static inline struct page *try_get_compound_head(struct page *page, int refs) } #ifdef CONFIG_ARCH_HAS_PTE_SPECIAL -static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, - unsigned int flags, struct page **pages, int *nr) +static int gup_pte_range(struct gup_args *args, pmd_t pmd, unsigned long end) { struct dev_pagemap *pgmap = NULL; - int nr_start = *nr, ret = 0; + int nr_start = args->nr, ret = 0; pte_t *ptep, *ptem; - ptem = ptep = pte_offset_map(&pmd, addr); + ptem = ptep = pte_offset_map(&pmd, args->addr); do { pte_t pte = gup_get_pte(ptep); struct page *head, *page; @@ -1830,16 +1836,16 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, if (pte_protnone(pte)) goto pte_unmap; - if (!pte_access_permitted(pte, flags & FOLL_WRITE)) + if (!pte_access_permitted(pte, args->flags & FOLL_WRITE)) goto pte_unmap; if (pte_devmap(pte)) { - if (unlikely(flags & FOLL_LONGTERM)) + if (unlikely(args->flags & FOLL_LONGTERM)) goto pte_unmap; pgmap = get_dev_pagemap(pte_pfn(pte), pgmap); if (unlikely(!pgmap)) { - undo_dev_pagemap(nr, nr_start, pages); + undo_dev_pagemap(args, nr_start); goto pte_unmap; } } else if (pte_special(pte)) @@ -1860,10 +1866,8 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, VM_BUG_ON_PAGE(compound_head(page) != head, page); SetPageReferenced(page); - pages[*nr] = page; - (*nr)++; - - } while (ptep++, addr += PAGE_SIZE, addr != end); + args->pages[args->nr++] = page; + } while (ptep++, args->addr += PAGE_SIZE, args->addr != end); ret = 1; @@ -1884,18 +1888,17 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, * __get_user_pages_fast implementation that can pin pages. Thus it's still * useful to have gup_huge_pmd even if we can't operate on ptes. */ -static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, - unsigned int flags, struct page **pages, int *nr) +static int gup_pte_range(struct gup_args *args, pmd_t pmd, unsigned long end) { return 0; } #endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */ #if defined(__HAVE_ARCH_PTE_DEVMAP) && defined(CONFIG_TRANSPARENT_HUGEPAGE) -static int __gup_device_huge(unsigned long pfn, unsigned long addr, - unsigned long end, struct page **pages, int *nr) +static int __gup_device_huge(struct gup_args *args, unsigned long pfn, + unsigned long end) { - int nr_start = *nr; + int nr_start = args->nr; struct dev_pagemap *pgmap = NULL; do { @@ -1903,64 +1906,63 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr, pgmap = get_dev_pagemap(pfn, pgmap); if (unlikely(!pgmap)) { - undo_dev_pagemap(nr, nr_start, pages); + undo_dev_pagemap(args, nr_start); return 0; } SetPageReferenced(page); - pages[*nr] = page; + args->pages[args->nr++] = page; get_page(page); - (*nr)++; pfn++; - } while (addr += PAGE_SIZE, addr != end); + } while (args->addr += PAGE_SIZE, args->addr != end); if (pgmap) put_dev_pagemap(pgmap); return 1; } -static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr, - unsigned long end, struct page **pages, int *nr) +static int __gup_device_huge_pmd(struct gup_args *args, pmd_t orig, pmd_t *pmdp, + unsigned long end) { unsigned long fault_pfn; - int nr_start = *nr; + int nr_start = args->nr; - fault_pfn = pmd_pfn(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); - if (!__gup_device_huge(fault_pfn, addr, end, pages, nr)) + fault_pfn = pmd_pfn(orig) + ((args->addr & ~PMD_MASK) >> PAGE_SHIFT); + if (!__gup_device_huge(args, fault_pfn, end)) return 0; if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) { - undo_dev_pagemap(nr, nr_start, pages); + undo_dev_pagemap(args, nr_start); return 0; } return 1; } -static int __gup_device_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr, - unsigned long end, struct page **pages, int *nr) +static int __gup_device_huge_pud(struct gup_args *args, pud_t orig, pud_t *pudp, + unsigned long end) { unsigned long fault_pfn; - int nr_start = *nr; + int nr_start = args->nr; - fault_pfn = pud_pfn(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); - if (!__gup_device_huge(fault_pfn, addr, end, pages, nr)) + fault_pfn = pud_pfn(orig) + ((args->addr & ~PUD_MASK) >> PAGE_SHIFT); + if (!__gup_device_huge(args, fault_pfn, end)) return 0; if (unlikely(pud_val(orig) != pud_val(*pudp))) { - undo_dev_pagemap(nr, nr_start, pages); + undo_dev_pagemap(args, nr_start); return 0; } return 1; } #else -static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr, - unsigned long end, struct page **pages, int *nr) +static int __gup_device_huge_pmd(struct gup_args *args, pmd_t orig, pmd_t *pmdp, + unsigned long end) { BUILD_BUG(); return 0; } -static int __gup_device_huge_pud(pud_t pud, pud_t *pudp, unsigned long addr, - unsigned long end, struct page **pages, int *nr) +static int __gup_device_huge_pud(struct gup_args *args, pud_t pud, pud_t *pudp, + unsigned long end) { BUILD_BUG(); return 0; @@ -1975,21 +1977,21 @@ static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end, return (__boundary - 1 < end - 1) ? __boundary : end; } -static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr, - unsigned long end, int write, struct page **pages, int *nr) +static int gup_hugepte(struct gup_args *args, pte_t *ptep, unsigned long sz, + unsigned long end) { unsigned long pte_end; struct page *head, *page; pte_t pte; int refs; - pte_end = (addr + sz) & ~(sz-1); + pte_end = (args->addr + sz) & ~(sz - 1); if (pte_end < end) end = pte_end; pte = READ_ONCE(*ptep); - if (!pte_access_permitted(pte, write)) + if (!pte_access_permitted(pte, args->flags & FOLL_WRITE)) return 0; /* hugepages are never "special" */ @@ -1998,24 +2000,23 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr, refs = 0; head = pte_page(pte); - page = head + ((addr & (sz-1)) >> PAGE_SHIFT); + page = head + ((args->addr & (sz - 1)) >> PAGE_SHIFT); do { VM_BUG_ON(compound_head(page) != head); - pages[*nr] = page; - (*nr)++; + args->pages[args->nr++] = page; page++; refs++; - } while (addr += PAGE_SIZE, addr != end); + } while (args->addr += PAGE_SIZE, args->addr != end); head = try_get_compound_head(head, refs); if (!head) { - *nr -= refs; + args->nr -= refs; return 0; } if (unlikely(pte_val(pte) != pte_val(*ptep))) { /* Could be optimized better */ - *nr -= refs; + args->nr -= refs; while (refs--) put_page(head); return 0; @@ -2025,64 +2026,61 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr, return 1; } -static int gup_huge_pd(hugepd_t hugepd, unsigned long addr, - unsigned int pdshift, unsigned long end, int write, - struct page **pages, int *nr) +static int gup_huge_pd(struct gup_args *args, hugepd_t hugepd, unsigned pdshift, + unsigned long end) { pte_t *ptep; unsigned long sz = 1UL << hugepd_shift(hugepd); unsigned long next; - ptep = hugepte_offset(hugepd, addr, pdshift); + ptep = hugepte_offset(hugepd, args->addr, pdshift); do { - next = hugepte_addr_end(addr, end, sz); - if (!gup_hugepte(ptep, sz, addr, end, write, pages, nr)) + next = hugepte_addr_end(args->addr, end, sz); + if (!gup_hugepte(args, ptep, sz, next)) return 0; - } while (ptep++, addr = next, addr != end); + } while (ptep++, args->addr != end); return 1; } #else -static inline int gup_huge_pd(hugepd_t hugepd, unsigned long addr, - unsigned pdshift, unsigned long end, int write, - struct page **pages, int *nr) +static inline int gup_huge_pd(struct gup_args *args, hugepd_t hugepd, + unsigned pdshift, unsigned long end) { return 0; } #endif /* CONFIG_ARCH_HAS_HUGEPD */ -static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr, - unsigned long end, unsigned int flags, struct page **pages, int *nr) +static int gup_huge_pmd(struct gup_args *args, pmd_t orig, pmd_t *pmdp, + unsigned long end) { struct page *head, *page; int refs; - if (!pmd_access_permitted(orig, flags & FOLL_WRITE)) + if (!pmd_access_permitted(orig, args->flags & FOLL_WRITE)) return 0; if (pmd_devmap(orig)) { - if (unlikely(flags & FOLL_LONGTERM)) + if (unlikely(args->flags & FOLL_LONGTERM)) return 0; - return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr); + return __gup_device_huge_pmd(args, orig, pmdp, end); } refs = 0; - page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); + page = pmd_page(orig) + ((args->addr & ~PMD_MASK) >> PAGE_SHIFT); do { - pages[*nr] = page; - (*nr)++; + args->pages[args->nr++] = page; page++; refs++; - } while (addr += PAGE_SIZE, addr != end); + } while (args->addr += PAGE_SIZE, args->addr != end); head = try_get_compound_head(pmd_page(orig), refs); if (!head) { - *nr -= refs; + args->nr -= refs; return 0; } if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) { - *nr -= refs; + args->nr -= refs; while (refs--) put_page(head); return 0; @@ -2092,38 +2090,37 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr, return 1; } -static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr, - unsigned long end, unsigned int flags, struct page **pages, int *nr) +static int gup_huge_pud(struct gup_args *args, pud_t orig, pud_t *pudp, + unsigned long end) { struct page *head, *page; int refs; - if (!pud_access_permitted(orig, flags & FOLL_WRITE)) + if (!pud_access_permitted(orig, args->flags & FOLL_WRITE)) return 0; if (pud_devmap(orig)) { - if (unlikely(flags & FOLL_LONGTERM)) + if (unlikely(args->flags & FOLL_LONGTERM)) return 0; - return __gup_device_huge_pud(orig, pudp, addr, end, pages, nr); + return __gup_device_huge_pud(args, orig, pudp, end); } refs = 0; - page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); + page = pud_page(orig) + ((args->addr & ~PUD_MASK) >> PAGE_SHIFT); do { - pages[*nr] = page; - (*nr)++; + args->pages[args->nr++] = page; page++; refs++; - } while (addr += PAGE_SIZE, addr != end); + } while (args->addr += PAGE_SIZE, args->addr != end); head = try_get_compound_head(pud_page(orig), refs); if (!head) { - *nr -= refs; + args->nr -= refs; return 0; } if (unlikely(pud_val(orig) != pud_val(*pudp))) { - *nr -= refs; + args->nr -= refs; while (refs--) put_page(head); return 0; @@ -2133,34 +2130,32 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr, return 1; } -static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr, - unsigned long end, unsigned int flags, - struct page **pages, int *nr) +static int gup_huge_pgd(struct gup_args *args, pgd_t orig, pgd_t *pgdp, + unsigned long end) { - int refs; struct page *head, *page; + int refs; - if (!pgd_access_permitted(orig, flags & FOLL_WRITE)) + if (!pgd_access_permitted(orig, args->flags & FOLL_WRITE)) return 0; BUILD_BUG_ON(pgd_devmap(orig)); refs = 0; - page = pgd_page(orig) + ((addr & ~PGDIR_MASK) >> PAGE_SHIFT); + page = pgd_page(orig) + ((args->addr & ~PGDIR_MASK) >> PAGE_SHIFT); do { - pages[*nr] = page; - (*nr)++; + args->pages[args->nr++] = page; page++; refs++; - } while (addr += PAGE_SIZE, addr != end); + } while (args->addr += PAGE_SIZE, args->addr != end); head = try_get_compound_head(pgd_page(orig), refs); if (!head) { - *nr -= refs; + args->nr -= refs; return 0; } if (unlikely(pgd_val(orig) != pgd_val(*pgdp))) { - *nr -= refs; + args->nr -= refs; while (refs--) put_page(head); return 0; @@ -2170,17 +2165,16 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr, return 1; } -static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end, - unsigned int flags, struct page **pages, int *nr) +static int gup_pmd_range(struct gup_args *args, pud_t pud, unsigned long end) { unsigned long next; pmd_t *pmdp; - pmdp = pmd_offset(&pud, addr); + pmdp = pmd_offset(&pud, args->addr); do { pmd_t pmd = READ_ONCE(*pmdp); - next = pmd_addr_end(addr, end); + next = pmd_addr_end(args->addr, end); if (!pmd_present(pmd)) return 0; @@ -2194,8 +2188,7 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end, if (pmd_protnone(pmd)) return 0; - if (!gup_huge_pmd(pmd, pmdp, addr, next, flags, - pages, nr)) + if (!gup_huge_pmd(args, pmd, pmdp, next)) return 0; } else if (unlikely(is_hugepd(__hugepd(pmd_val(pmd))))) { @@ -2203,93 +2196,88 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end, * architecture have different format for hugetlbfs * pmd format and THP pmd format */ - if (!gup_huge_pd(__hugepd(pmd_val(pmd)), addr, - PMD_SHIFT, next, flags, pages, nr)) + if (!gup_huge_pd(args, __hugepd(pmd_val(pmd)), + PMD_SHIFT, next)) return 0; - } else if (!gup_pte_range(pmd, addr, next, flags, pages, nr)) + } else if (!gup_pte_range(args, pmd, next)) return 0; - } while (pmdp++, addr = next, addr != end); + } while (pmdp++, args->addr != end); return 1; } -static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end, - unsigned int flags, struct page **pages, int *nr) +static int gup_pud_range(struct gup_args *args, p4d_t p4d, unsigned long end) { unsigned long next; pud_t *pudp; - pudp = pud_offset(&p4d, addr); + pudp = pud_offset(&p4d, args->addr); do { pud_t pud = READ_ONCE(*pudp); - next = pud_addr_end(addr, end); + next = pud_addr_end(args->addr, end); if (pud_none(pud)) return 0; if (unlikely(pud_huge(pud))) { - if (!gup_huge_pud(pud, pudp, addr, next, flags, - pages, nr)) + if (!gup_huge_pud(args, pud, pudp, next)) return 0; } else if (unlikely(is_hugepd(__hugepd(pud_val(pud))))) { - if (!gup_huge_pd(__hugepd(pud_val(pud)), addr, - PUD_SHIFT, next, flags, pages, nr)) + if (!gup_huge_pd(args, __hugepd(pud_val(pud)), + PUD_SHIFT, next)) return 0; - } else if (!gup_pmd_range(pud, addr, next, flags, pages, nr)) + } else if (!gup_pmd_range(args, pud, next)) return 0; - } while (pudp++, addr = next, addr != end); + } while (pudp++, args->addr != end); return 1; } -static int gup_p4d_range(pgd_t pgd, unsigned long addr, unsigned long end, - unsigned int flags, struct page **pages, int *nr) +static int gup_p4d_range(struct gup_args *args, pgd_t pgd, unsigned long end) { unsigned long next; p4d_t *p4dp; - p4dp = p4d_offset(&pgd, addr); + p4dp = p4d_offset(&pgd, args->addr); do { p4d_t p4d = READ_ONCE(*p4dp); - next = p4d_addr_end(addr, end); + next = p4d_addr_end(args->addr, end); if (p4d_none(p4d)) return 0; BUILD_BUG_ON(p4d_huge(p4d)); if (unlikely(is_hugepd(__hugepd(p4d_val(p4d))))) { - if (!gup_huge_pd(__hugepd(p4d_val(p4d)), addr, - P4D_SHIFT, next, flags, pages, nr)) + if (!gup_huge_pd(args, __hugepd(p4d_val(p4d)), + P4D_SHIFT, next)) return 0; - } else if (!gup_pud_range(p4d, addr, next, flags, pages, nr)) + } else if (!gup_pud_range(args, p4d, next)) return 0; - } while (p4dp++, addr = next, addr != end); + } while (p4dp++, args->addr != end); return 1; } -static void gup_pgd_range(unsigned long addr, unsigned long end, - unsigned int flags, struct page **pages, int *nr) +static void gup_pgd_range(struct gup_args *args, unsigned long end) { unsigned long next; pgd_t *pgdp; - pgdp = pgd_offset(current->mm, addr); + pgdp = pgd_offset(current->mm, args->addr); do { pgd_t pgd = READ_ONCE(*pgdp); - next = pgd_addr_end(addr, end); + next = pgd_addr_end(args->addr, end); if (pgd_none(pgd)) return; if (unlikely(pgd_huge(pgd))) { - if (!gup_huge_pgd(pgd, pgdp, addr, next, flags, - pages, nr)) + if (!gup_huge_pgd(args, pgd, pgdp, next)) return; } else if (unlikely(is_hugepd(__hugepd(pgd_val(pgd))))) { - if (!gup_huge_pd(__hugepd(pgd_val(pgd)), addr, - PGDIR_SHIFT, next, flags, pages, nr)) + if (!gup_huge_pd(args, __hugepd(pgd_val(pgd)), + PGDIR_SHIFT, next)) return; - } else if (!gup_p4d_range(pgd, addr, next, flags, pages, nr)) + } else if (!gup_p4d_range(args, pgd, next)) return; - } while (pgdp++, addr = next, addr != end); + } while (pgdp++, args->addr != end); } #else static inline void gup_pgd_range(unsigned long addr, unsigned long end, @@ -2321,17 +2309,18 @@ static bool gup_fast_permitted(unsigned long start, unsigned long end) int __get_user_pages_fast(unsigned long start, int nr_pages, int write, struct page **pages) { - unsigned long len, end; + struct gup_args args = { + .addr = untagged_addr(start) & PAGE_MASK, + .flags = write ? FOLL_WRITE : 0, + .pages = pages, + }; + unsigned long len = (unsigned long)nr_pages << PAGE_SHIFT; + unsigned long end = args.addr + len; unsigned long flags; - int nr = 0; - - start = untagged_addr(start) & PAGE_MASK; - len = (unsigned long) nr_pages << PAGE_SHIFT; - end = start + len; - if (end <= start) + if (end <= args.addr) return 0; - if (unlikely(!access_ok((void __user *)start, len))) + if (unlikely(!access_ok((void __user *)args.addr, len))) return 0; /* @@ -2345,38 +2334,42 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write, * We do not adopt an rcu_read_lock(.) here as we also want to * block IPIs that come from THPs splitting. */ - - if (gup_fast_permitted(start, end)) { + if (gup_fast_permitted(args.addr, end)) { local_irq_save(flags); - gup_pgd_range(start, end, write ? FOLL_WRITE : 0, pages, &nr); + gup_pgd_range(&args, end); local_irq_restore(flags); } - return nr; + return args.nr; } EXPORT_SYMBOL_GPL(__get_user_pages_fast); -static int __gup_longterm_unlocked(unsigned long start, int nr_pages, - unsigned int gup_flags, struct page **pages) +static int get_user_pages_fallback(struct gup_args *args, int nr_pages) { + struct page **pages = args->pages + args->nr; int ret; + nr_pages -= args->nr; + /* * FIXME: FOLL_LONGTERM does not work with * get_user_pages_unlocked() (see comments in that function) */ - if (gup_flags & FOLL_LONGTERM) { + if (args->flags & FOLL_LONGTERM) { down_read(¤t->mm->mmap_sem); ret = __gup_longterm_locked(current, current->mm, - start, nr_pages, - pages, NULL, gup_flags); + args->addr, nr_pages, + pages, NULL, args->flags); up_read(¤t->mm->mmap_sem); } else { - ret = get_user_pages_unlocked(start, nr_pages, - pages, gup_flags); + ret = get_user_pages_unlocked(args->addr, nr_pages, pages, + args->flags); } - return ret; + /* Have to be a bit careful with return values */ + if (ret > 0) + args->nr += ret; + return args->nr ? args->nr : ret; } /** @@ -2398,46 +2391,31 @@ static int __gup_longterm_unlocked(unsigned long start, int nr_pages, int get_user_pages_fast(unsigned long start, int nr_pages, unsigned int gup_flags, struct page **pages) { - unsigned long addr, len, end; - int nr = 0, ret = 0; + struct gup_args args = { + .addr = untagged_addr(start) & PAGE_MASK, + .flags = gup_flags, + .pages = pages, + }; + unsigned long len = (unsigned long)nr_pages << PAGE_SHIFT; + unsigned long end = args.addr + len; if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM))) return -EINVAL; - start = untagged_addr(start) & PAGE_MASK; - addr = start; - len = (unsigned long) nr_pages << PAGE_SHIFT; - end = start + len; - - if (end <= start) + if (end <= args.addr) return 0; - if (unlikely(!access_ok((void __user *)start, len))) + if (unlikely(!access_ok((void __user *)args.addr, len))) return -EFAULT; - if (gup_fast_permitted(start, end)) { + if (gup_fast_permitted(args.addr, end)) { local_irq_disable(); - gup_pgd_range(addr, end, gup_flags, pages, &nr); + gup_pgd_range(&args, end); local_irq_enable(); - ret = nr; - } - - if (nr < nr_pages) { - /* Try to get the remaining pages with get_user_pages */ - start += nr << PAGE_SHIFT; - pages += nr; - - ret = __gup_longterm_unlocked(start, nr_pages - nr, - gup_flags, pages); - - /* Have to be a bit careful with return values */ - if (nr > 0) { - if (ret < 0) - ret = nr; - else - ret += nr; - } } - return ret; + /* Try to get the remaining pages with get_user_pages */ + if (args.nr < nr_pages) + return get_user_pages_fallback(&args, nr_pages); + return args.nr; } EXPORT_SYMBOL_GPL(get_user_pages_fast);