From patchwork Tue Sep 25 15:30:04 2018
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 10614245
From: Josef Bacik
To: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, linux-btrfs@vger.kernel.org, riel@redhat.com, hannes@cmpxchg.org, tj@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH 1/8] mm: push vm_fault into the page fault handlers
Date: Tue, 25 Sep 2018 11:30:04 -0400
Message-Id: <20180925153011.15311-2-josef@toxicpanda.com>
In-Reply-To: <20180925153011.15311-1-josef@toxicpanda.com>
References: <20180925153011.15311-1-josef@toxicpanda.com>

In preparation for caching pages during filemap faults, we need to push the struct vm_fault up a level into the arch page fault handlers, since they are the ones responsible for retrying if we unlock the mmap_sem.
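For reference, the conversion applied at every call site in the diff below follows one pattern (a minimal sketch distilled from the patch itself; vma, address, and flags are whatever the arch handler already computed):

    struct vm_fault vmf = {};

    /* old: fault = handle_mm_fault(vma, address, flags); */
    vm_fault_init(&vmf, vma, address, flags); /* fills vmf->vma, ->address, ->flags */
    fault = handle_mm_fault(&vmf);

Keeping the struct vm_fault on the arch handler's stack is what later allows a fault to be retried with the same state after the mmap_sem has been dropped.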
Signed-off-by: Josef Bacik --- arch/alpha/mm/fault.c | 4 ++- arch/arc/mm/fault.c | 2 ++ arch/arm/mm/fault.c | 18 ++++++++----- arch/arm64/mm/fault.c | 18 +++++++------ arch/hexagon/mm/vm_fault.c | 4 ++- arch/ia64/mm/fault.c | 4 ++- arch/m68k/mm/fault.c | 5 ++-- arch/microblaze/mm/fault.c | 4 ++- arch/mips/mm/fault.c | 4 ++- arch/nds32/mm/fault.c | 5 ++-- arch/nios2/mm/fault.c | 4 ++- arch/openrisc/mm/fault.c | 5 ++-- arch/parisc/mm/fault.c | 5 ++-- arch/powerpc/mm/copro_fault.c | 4 ++- arch/powerpc/mm/fault.c | 4 ++- arch/riscv/mm/fault.c | 2 ++ arch/s390/mm/fault.c | 4 ++- arch/sh/mm/fault.c | 4 ++- arch/sparc/mm/fault_32.c | 6 ++++- arch/sparc/mm/fault_64.c | 2 ++ arch/um/kernel/trap.c | 4 ++- arch/unicore32/mm/fault.c | 17 +++++++----- arch/x86/mm/fault.c | 4 ++- arch/xtensa/mm/fault.c | 4 ++- drivers/iommu/amd_iommu_v2.c | 4 ++- drivers/iommu/intel-svm.c | 6 +++-- include/linux/mm.h | 16 +++++++++--- mm/gup.c | 8 ++++-- mm/hmm.c | 4 ++- mm/ksm.c | 10 ++++--- mm/memory.c | 61 +++++++++++++++++++++---------------------- 31 files changed, 157 insertions(+), 89 deletions(-) diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c index d73dc473fbb9..3c98dfef03a9 100644 --- a/arch/alpha/mm/fault.c +++ b/arch/alpha/mm/fault.c @@ -84,6 +84,7 @@ asmlinkage void do_page_fault(unsigned long address, unsigned long mmcsr, long cause, struct pt_regs *regs) { + struct vm_fault vmf = {}; struct vm_area_struct * vma; struct mm_struct *mm = current->mm; const struct exception_table_entry *fixup; @@ -148,7 +149,8 @@ do_page_fault(unsigned long address, unsigned long mmcsr, /* If for any reason at all we couldn't handle the fault, make sure we exit gracefully rather than endlessly redo the fault. */ - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) return; diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c index db6913094be3..7aeb81ff5070 100644 --- a/arch/arc/mm/fault.c +++ b/arch/arc/mm/fault.c @@ -63,6 +63,7 @@ noinline static int handle_kernel_vaddr_fault(unsigned long address) void do_page_fault(unsigned long address, struct pt_regs *regs) { + struct vm_fault vmf = {}; struct vm_area_struct *vma = NULL; struct task_struct *tsk = current; struct mm_struct *mm = tsk->mm; @@ -141,6 +142,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs) * make sure we exit gracefully rather than endlessly redo * the fault.
*/ + vm_fault_init(&vmf, vma, address, flags); - fault = handle_mm_fault(vma, address, flags); + fault = handle_mm_fault(&vmf); /* If Pagefault was interrupted by SIGKILL, exit page fault "early" */ diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index 3232afb6fdc0..885a24385a0a 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -225,17 +225,17 @@ static inline bool access_error(unsigned int fsr, struct vm_area_struct *vma) } static vm_fault_t __kprobes -__do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr, - unsigned int flags, struct task_struct *tsk) +__do_page_fault(struct mm_struct *mm, struct vm_fault *vmf, unsigned int fsr, + struct task_struct *tsk) { struct vm_area_struct *vma; vm_fault_t fault; - vma = find_vma(mm, addr); + vma = find_vma(mm, vmf->address); fault = VM_FAULT_BADMAP; if (unlikely(!vma)) goto out; - if (unlikely(vma->vm_start > addr)) + if (unlikely(vma->vm_start > vmf->address)) goto check_stack; /* @@ -248,12 +248,14 @@ __do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr, goto out; } - return handle_mm_fault(vma, addr & PAGE_MASK, flags); + vmf->vma = vma; + return handle_mm_fault(vmf); check_stack: /* Don't allow expansion below FIRST_USER_ADDRESS */ if (vma->vm_flags & VM_GROWSDOWN && - addr >= FIRST_USER_ADDRESS && !expand_stack(vma, addr)) + vmf->address >= FIRST_USER_ADDRESS && + !expand_stack(vma, vmf->address)) goto good_area; out: return fault; @@ -262,6 +264,7 @@ __do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr, static int __kprobes do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) { + struct vm_fault vmf = {}; struct task_struct *tsk; struct mm_struct *mm; int sig, code; @@ -314,7 +317,8 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) #endif } - fault = __do_page_fault(mm, addr, fsr, flags, tsk); + vm_fault_init(&vmf, NULL, addr, flags); + fault = __do_page_fault(mm, &vmf, fsr, tsk); /* If we need to retry but a fatal signal is pending, handle the * signal first.
We do not need to release the mmap_sem because diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 50b30ff30de4..31e86a74cbe0 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -379,18 +379,17 @@ static void do_bad_area(unsigned long addr, unsigned int esr, struct pt_regs *re #define VM_FAULT_BADMAP 0x010000 #define VM_FAULT_BADACCESS 0x020000 -static vm_fault_t __do_page_fault(struct mm_struct *mm, unsigned long addr, - unsigned int mm_flags, unsigned long vm_flags, - struct task_struct *tsk) +static vm_fault_t __do_page_fault(struct mm_struct *mm, struct vm_fault *vmf, + unsigned long vm_flags, struct task_struct *tsk) { struct vm_area_struct *vma; vm_fault_t fault; - vma = find_vma(mm, addr); + vma = find_vma(mm, vmf->address); fault = VM_FAULT_BADMAP; if (unlikely(!vma)) goto out; - if (unlikely(vma->vm_start > addr)) + if (unlikely(vma->vm_start > vmf->address)) goto check_stack; /* @@ -407,10 +406,11 @@ static vm_fault_t __do_page_fault(struct mm_struct *mm, unsigned long addr, goto out; } - return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags); + vmf->vma = vma; + return handle_mm_fault(vmf); check_stack: - if (vma->vm_flags & VM_GROWSDOWN && !expand_stack(vma, addr)) + if (vma->vm_flags & VM_GROWSDOWN && !expand_stack(vma, vmf->address)) goto good_area; out: return fault; @@ -424,6 +424,7 @@ static bool is_el0_instruction_abort(unsigned int esr) static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, struct pt_regs *regs) { + struct vm_fault vmf = {}; struct task_struct *tsk; struct mm_struct *mm; struct siginfo si; @@ -493,7 +494,8 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, #endif } - fault = __do_page_fault(mm, addr, mm_flags, vm_flags, tsk); + vm_fault_init(&vmf, NULL, addr, mm_flags); + fault = __do_page_fault(mm, &vmf, vm_flags, tsk); major |= fault & VM_FAULT_MAJOR; if (fault & VM_FAULT_RETRY) { diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c index eb263e61daf4..1ee1042bb2b5 100644 --- a/arch/hexagon/mm/vm_fault.c +++ b/arch/hexagon/mm/vm_fault.c @@ -48,6 +48,7 @@ */ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) { + struct vm_fault vmf = {}; struct vm_area_struct *vma; struct mm_struct *mm = current->mm; int si_signo; @@ -102,7 +103,8 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) break; } - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) return; diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index a9d55ad8d67b..827b898adb5e 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -82,6 +82,7 @@ mapped_kernel_page_is_present (unsigned long address) void __kprobes ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *regs) { + struct vm_fault vmf = {}; int signal = SIGSEGV, code = SEGV_MAPERR; struct vm_area_struct *vma, *prev_vma; struct mm_struct *mm = current->mm; @@ -161,7 +162,8 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re * sure we exit gracefully rather than endlessly redo the * fault.
*/ - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) return; diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index 9b6163c05a75..e42eddc9c7ca 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -68,6 +68,7 @@ int send_fault_sig(struct pt_regs *regs) int do_page_fault(struct pt_regs *regs, unsigned long address, unsigned long error_code) { + struct vm_fault vmf = {}; struct mm_struct *mm = current->mm; struct vm_area_struct * vma; vm_fault_t fault; @@ -134,8 +135,8 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, * make sure we exit gracefully rather than endlessly redo * the fault. */ - - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); pr_debug("handle_mm_fault returns %x\n", fault); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index 202ad6a494f5..ade980266f65 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -86,6 +86,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig) void do_page_fault(struct pt_regs *regs, unsigned long address, unsigned long error_code) { + struct vm_fault vmf = {}; struct vm_area_struct *vma; struct mm_struct *mm = current->mm; int code = SEGV_MAPERR; @@ -215,7 +216,8 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, * make sure we exit gracefully rather than endlessly redo * the fault. */ - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) return; diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c index 73d8a0f0b810..bf212bb70f24 100644 --- a/arch/mips/mm/fault.c +++ b/arch/mips/mm/fault.c @@ -38,6 +38,7 @@ int show_unhandled_signals = 1; static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, unsigned long address) { + struct vm_fault vmf = {}; struct vm_area_struct * vma = NULL; struct task_struct *tsk = current; struct mm_struct *mm = tsk->mm; @@ -152,7 +153,8 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, * make sure we exit gracefully rather than endlessly redo * the fault. */ - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) return; diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c index b740534b152c..27ac4caa5102 100644 --- a/arch/nds32/mm/fault.c +++ b/arch/nds32/mm/fault.c @@ -69,6 +69,7 @@ void show_pte(struct mm_struct *mm, unsigned long addr) void do_page_fault(unsigned long entry, unsigned long addr, unsigned int error_code, struct pt_regs *regs) { + struct vm_fault vmf = {}; struct task_struct *tsk; struct mm_struct *mm; struct vm_area_struct *vma; @@ -203,8 +204,8 @@ void do_page_fault(unsigned long entry, unsigned long addr, * make sure we exit gracefully rather than endlessly redo * the fault. 
*/ - - fault = handle_mm_fault(vma, addr, flags); + vm_fault_init(&vmf, vma, addr, flags); + fault = handle_mm_fault(&vmf); /* * If we need to retry but a fatal signal is pending, handle the diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c index 24fd84cf6006..693472f05065 100644 --- a/arch/nios2/mm/fault.c +++ b/arch/nios2/mm/fault.c @@ -43,6 +43,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, unsigned long address) { + struct vm_fault vmf = {}; struct vm_area_struct *vma = NULL; struct task_struct *tsk = current; struct mm_struct *mm = tsk->mm; @@ -132,7 +133,8 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, * make sure we exit gracefully rather than endlessly redo * the fault. */ - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) return; diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c index dc4dbafc1d83..70eef1d9f7ed 100644 --- a/arch/openrisc/mm/fault.c +++ b/arch/openrisc/mm/fault.c @@ -49,6 +49,7 @@ extern void die(char *, struct pt_regs *, long); asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, unsigned long vector, int write_acc) { + struct vm_fault vmf = {}; struct task_struct *tsk; struct mm_struct *mm; struct vm_area_struct *vma; @@ -162,8 +163,8 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, * make sure we exit gracefully rather than endlessly redo * the fault. */ - - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) return; diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index c8e8b7c05558..83c89cada3c0 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -258,6 +258,7 @@ show_signal_msg(struct pt_regs *regs, unsigned long code, void do_page_fault(struct pt_regs *regs, unsigned long code, unsigned long address) { + struct vm_fault vmf = {}; struct vm_area_struct *vma, *prev_vma; struct task_struct *tsk; struct mm_struct *mm; @@ -300,8 +301,8 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, * sure we exit gracefully rather than endlessly redo the * fault. */ - - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) return; diff --git a/arch/powerpc/mm/copro_fault.c b/arch/powerpc/mm/copro_fault.c index c8da352e8686..02dd21a54479 100644 --- a/arch/powerpc/mm/copro_fault.c +++ b/arch/powerpc/mm/copro_fault.c @@ -36,6 +36,7 @@ int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea, unsigned long dsisr, vm_fault_t *flt) { + struct vm_fault vmf = {}; struct vm_area_struct *vma; unsigned long is_write; int ret; @@ -77,7 +78,8 @@ int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea, } ret = 0; - *flt = handle_mm_fault(vma, ea, is_write ? FAULT_FLAG_WRITE : 0); + vm_fault_init(&vmf, vma, ea, is_write ? 
FAULT_FLAG_WRITE : 0); + *flt = handle_mm_fault(&vmf); if (unlikely(*flt & VM_FAULT_ERROR)) { if (*flt & VM_FAULT_OOM) { ret = -ENOMEM; diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index d51cf5f4e45e..cc00bba104fb 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -409,6 +409,7 @@ static void sanity_check_fault(bool is_write, unsigned long error_code) { } static int __do_page_fault(struct pt_regs *regs, unsigned long address, unsigned long error_code) { + struct vm_fault vmf = {}; struct vm_area_struct * vma; struct mm_struct *mm = current->mm; unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE; @@ -538,7 +539,8 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, * make sure we exit gracefully rather than endlessly redo * the fault. */ - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); #ifdef CONFIG_PPC_MEM_KEYS /* diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 88401d5125bc..aa3db34c9eb8 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -36,6 +36,7 @@ */ asmlinkage void do_page_fault(struct pt_regs *regs) { + struct vm_fault vmf = {}; struct task_struct *tsk; struct vm_area_struct *vma; struct mm_struct *mm; @@ -120,6 +121,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs) * make sure we exit gracefully rather than endlessly redo * the fault. */ + vm_fault_init(&vmf, vma, addr, flags); - fault = handle_mm_fault(vma, addr, flags); + fault = handle_mm_fault(&vmf); /* diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c index 72af23bacbb5..14cfd6de43ed 100644 --- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -404,6 +404,7 @@ static noinline void do_fault_error(struct pt_regs *regs, int access, */ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) { + struct vm_fault vmf = {}; struct gmap *gmap; struct task_struct *tsk; struct mm_struct *mm; @@ -499,7 +500,8 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) * make sure we exit gracefully rather than endlessly redo * the fault. */ - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); /* No reason to continue if interrupted by SIGKILL. */ if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { fault = VM_FAULT_SIGNAL; diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c index 6defd2c6d9b1..31202706125c 100644 --- a/arch/sh/mm/fault.c +++ b/arch/sh/mm/fault.c @@ -392,6 +392,7 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address) { + struct vm_fault vmf = {}; unsigned long vec; struct task_struct *tsk; struct mm_struct *mm; @@ -481,7 +482,8 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, * make sure we exit gracefully rather than endlessly redo * the fault.
*/ - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); if (unlikely(fault & (VM_FAULT_RETRY | VM_FAULT_ERROR))) if (mm_fault_error(regs, error_code, address, fault)) diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c index b0440b0edd97..a9dd62393934 100644 --- a/arch/sparc/mm/fault_32.c +++ b/arch/sparc/mm/fault_32.c @@ -160,6 +160,7 @@ static noinline void do_fault_siginfo(int code, int sig, struct pt_regs *regs, asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write, unsigned long address) { + struct vm_fault vmf = {}; struct vm_area_struct *vma; struct task_struct *tsk = current; struct mm_struct *mm = tsk->mm; @@ -235,6 +236,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write, * make sure we exit gracefully rather than endlessly redo * the fault. */ + vm_fault_init(&vmf, vma, address, flags); - fault = handle_mm_fault(vma, address, flags); + fault = handle_mm_fault(&vmf); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) @@ -377,6 +379,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write, /* This always deals with user addresses. */ static void force_user_fault(unsigned long address, int write) { + struct vm_fault vmf = {}; struct vm_area_struct *vma; struct task_struct *tsk = current; struct mm_struct *mm = tsk->mm; @@ -405,7 +408,8 @@ static void force_user_fault(unsigned long address, int write) if (!(vma->vm_flags & (VM_READ | VM_EXEC))) goto bad_area; } - switch (handle_mm_fault(vma, address, flags)) { + vm_fault_init(&vmf, vma, address, flags); + switch (handle_mm_fault(&vmf)) { case VM_FAULT_SIGBUS: case VM_FAULT_OOM: goto do_sigbus; diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c index 8f8a604c1300..381ab905eb2c 100644 --- a/arch/sparc/mm/fault_64.c +++ b/arch/sparc/mm/fault_64.c @@ -274,6 +274,7 @@ static void noinline __kprobes bogus_32bit_fault_tpc(struct pt_regs *regs) asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs) { + struct vm_fault vmf = {}; enum ctx_state prev_state = exception_enter(); struct mm_struct *mm = current->mm; struct vm_area_struct *vma; @@ -433,6 +434,7 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs) goto bad_area; } + vm_fault_init(&vmf, vma, address, flags); - fault = handle_mm_fault(vma, address, flags); + fault = handle_mm_fault(&vmf); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c index cced82946042..c6d9e176c5c5 100644 --- a/arch/um/kernel/trap.c +++ b/arch/um/kernel/trap.c @@ -25,6 +25,7 @@ int handle_page_fault(unsigned long address, unsigned long ip, int is_write, int is_user, int *code_out) { + struct vm_fault vmf = {}; struct mm_struct *mm = current->mm; struct vm_area_struct *vma; pgd_t *pgd; @@ -74,7 +75,8 @@ int handle_page_fault(unsigned long address, unsigned long ip, do { vm_fault_t fault; - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) goto out_nosemaphore; diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c index 8f12a5b50a42..68c2b0a65348 100644 --- a/arch/unicore32/mm/fault.c +++ b/arch/unicore32/mm/fault.c @@ -168,17 +168,17 @@ static inline bool access_error(unsigned int fsr, struct vm_area_struct *vma) return vma->vm_flags & mask ?
false : true; } -static vm_fault_t __do_pf(struct mm_struct *mm, unsigned long addr, - unsigned int fsr, unsigned int flags, struct task_struct *tsk) +static vm_fault_t __do_pf(struct mm_struct *mm, struct vm_fault *vmf, + unsigned int fsr, struct task_struct *tsk) { struct vm_area_struct *vma; vm_fault_t fault; - vma = find_vma(mm, addr); + vma = find_vma(mm, vmf->address); fault = VM_FAULT_BADMAP; if (unlikely(!vma)) goto out; - if (unlikely(vma->vm_start > addr)) + if (unlikely(vma->vm_start > vmf->address)) goto check_stack; /* @@ -195,11 +195,12 @@ static vm_fault_t __do_pf(struct mm_struct *mm, unsigned long addr, * If for any reason at all we couldn't handle the fault, make * sure we exit gracefully rather than endlessly redo the fault. */ - fault = handle_mm_fault(vma, addr & PAGE_MASK, flags); + vmf->vma = vma; + fault = handle_mm_fault(vmf); return fault; check_stack: - if (vma->vm_flags & VM_GROWSDOWN && !expand_stack(vma, addr)) + if (vma->vm_flags & VM_GROWSDOWN && !expand_stack(vma, vmf->address)) goto good_area; out: return fault; @@ -207,6 +208,7 @@ static vm_fault_t __do_pf(struct mm_struct *mm, unsigned long addr, static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs) { + struct vm_fault vmf = {}; struct task_struct *tsk; struct mm_struct *mm; int sig, code; @@ -253,7 +255,8 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs) #endif } - fault = __do_pf(mm, addr, fsr, flags, tsk); + vm_fault_init(&vmf, NULL, addr, flags); + fault = __do_pf(mm, &vmf, fsr, tsk); /* If we need to retry but a fatal signal is pending, handle the * signal first. We do not need to release the mmap_sem because diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 47bebfe6efa7..9919a25b15e6 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1211,6 +1211,7 @@ static noinline void __do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address) { + struct vm_fault vmf = {}; struct vm_area_struct *vma; struct task_struct *tsk; struct mm_struct *mm; @@ -1392,7 +1393,8 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code, * fault, so we read the pkey beforehand. */ pkey = vma_pkey(vma); - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); major |= fault & VM_FAULT_MAJOR; /* diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c index 2ab0e0dcd166..f1b0f4f858ff 100644 --- a/arch/xtensa/mm/fault.c +++ b/arch/xtensa/mm/fault.c @@ -35,6 +35,7 @@ void bad_page_fault(struct pt_regs*, unsigned long, int); void do_page_fault(struct pt_regs *regs) { + struct vm_fault vmf = {}; struct vm_area_struct * vma; struct mm_struct *mm = current->mm; unsigned int exccause = regs->exccause; @@ -108,7 +109,8 @@ void do_page_fault(struct pt_regs *regs) * make sure we exit gracefully rather than endlessly redo * the fault. 
*/ - fault = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + fault = handle_mm_fault(&vmf); if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) return; diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c index 58da65df03f5..129e0ef68827 100644 --- a/drivers/iommu/amd_iommu_v2.c +++ b/drivers/iommu/amd_iommu_v2.c @@ -506,6 +506,7 @@ static bool access_error(struct vm_area_struct *vma, struct fault *fault) static void do_fault(struct work_struct *work) { + struct vm_fault vmf = {}; struct fault *fault = container_of(work, struct fault, work); struct vm_area_struct *vma; vm_fault_t ret = VM_FAULT_ERROR; @@ -532,7 +533,8 @@ static void do_fault(struct work_struct *work) if (access_error(vma, fault)) goto out; - ret = handle_mm_fault(vma, address, flags); + vm_fault_init(&vmf, vma, address, flags); + ret = handle_mm_fault(&vmf); out: up_read(&mm->mmap_sem); diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c index 4a03e5090952..03aa02723242 100644 --- a/drivers/iommu/intel-svm.c +++ b/drivers/iommu/intel-svm.c @@ -567,6 +567,7 @@ static bool is_canonical_address(u64 addr) static irqreturn_t prq_event_thread(int irq, void *d) { + struct vm_fault vmf = {}; struct intel_iommu *iommu = d; struct intel_svm *svm = NULL; int head, tail, handled = 0; @@ -636,8 +637,9 @@ static irqreturn_t prq_event_thread(int irq, void *d) if (access_error(vma, req)) goto invalid; - ret = handle_mm_fault(vma, address, - req->wr_req ? FAULT_FLAG_WRITE : 0); + vm_fault_init(&vmf, vma, address, + req->wr_req ? FAULT_FLAG_WRITE : 0); + ret = handle_mm_fault(&vmf); if (ret & VM_FAULT_ERROR) goto invalid; diff --git a/include/linux/mm.h b/include/linux/mm.h index a61ebe8ad4ca..e271c60af01a 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -378,6 +378,16 @@ struct vm_fault { */ }; +static inline void vm_fault_init(struct vm_fault *vmf, + struct vm_area_struct *vma, + unsigned long address, + unsigned int flags) +{ + vmf->vma = vma; + vmf->address = address; + vmf->flags = flags; +} + /* page entry size for vm->huge_fault() */ enum page_entry_size { PE_SIZE_PTE = 0, @@ -1403,8 +1413,7 @@ int generic_error_remove_page(struct address_space *mapping, struct page *page); int invalidate_inode_page(struct page *page); #ifdef CONFIG_MMU -extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma, - unsigned long address, unsigned int flags); +extern vm_fault_t handle_mm_fault(struct vm_fault *vmf); extern int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm, unsigned long address, unsigned int fault_flags, bool *unlocked); @@ -1413,8 +1422,7 @@ void unmap_mapping_pages(struct address_space *mapping, void unmap_mapping_range(struct address_space *mapping, loff_t const holebegin, loff_t const holelen, int even_cows); #else -static inline vm_fault_t handle_mm_fault(struct vm_area_struct *vma, - unsigned long address, unsigned int flags) +static inline vm_fault_t handle_mm_fault(struct vm_fault *vmf) { /* should never happen if there's no MMU */ BUG(); diff --git a/mm/gup.c b/mm/gup.c index 1abc8b4afff6..c12d1e98614b 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -496,6 +496,7 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address, static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma, unsigned long address, unsigned int *flags, int *nonblocking) { + struct vm_fault vmf = {}; unsigned int fault_flags = 0; vm_fault_t ret; @@ -515,7 +516,8 @@ static int faultin_page(struct task_struct *tsk, struct 
vm_area_struct *vma, fault_flags |= FAULT_FLAG_TRIED; } - ret = handle_mm_fault(vma, address, fault_flags); + vm_fault_init(&vmf, vma, address, fault_flags); + ret = handle_mm_fault(&vmf); if (ret & VM_FAULT_ERROR) { int err = vm_fault_to_errno(ret, *flags); @@ -817,6 +819,7 @@ int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm, unsigned long address, unsigned int fault_flags, bool *unlocked) { + struct vm_fault vmf = {}; struct vm_area_struct *vma; vm_fault_t ret, major = 0; @@ -831,7 +834,8 @@ int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm, if (!vma_permits_fault(vma, fault_flags)) return -EFAULT; - ret = handle_mm_fault(vma, address, fault_flags); + vm_fault_init(&vmf, vma, address, fault_flags); + ret = handle_mm_fault(&vmf); major |= ret & VM_FAULT_MAJOR; if (ret & VM_FAULT_ERROR) { int err = vm_fault_to_errno(ret, 0); diff --git a/mm/hmm.c b/mm/hmm.c index c968e49f7a0c..695ef184a7d0 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -298,6 +298,7 @@ struct hmm_vma_walk { static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr, bool write_fault, uint64_t *pfn) { + struct vm_fault vmf = {}; unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_REMOTE; struct hmm_vma_walk *hmm_vma_walk = walk->private; struct hmm_range *range = hmm_vma_walk->range; @@ -306,7 +307,8 @@ static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr, flags |= hmm_vma_walk->block ? 0 : FAULT_FLAG_ALLOW_RETRY; flags |= write_fault ? FAULT_FLAG_WRITE : 0; - ret = handle_mm_fault(vma, addr, flags); + vm_fault_init(&vmf, vma, addr, flags); + ret = handle_mm_fault(&vmf); if (ret & VM_FAULT_RETRY) return -EBUSY; if (ret & VM_FAULT_ERROR) { diff --git a/mm/ksm.c b/mm/ksm.c index 5b0894b45ee5..4b6d90357ee2 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -478,10 +478,12 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr) FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE); if (IS_ERR_OR_NULL(page)) break; - if (PageKsm(page)) - ret = handle_mm_fault(vma, addr, - FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE); - else + if (PageKsm(page)) { + struct vm_fault vmf = {}; + vm_fault_init(&vmf, vma, addr, + FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE); + ret = handle_mm_fault(&vmf); + } else ret = VM_FAULT_WRITE; put_page(page); } while (!(ret & (VM_FAULT_WRITE | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | VM_FAULT_OOM))); diff --git a/mm/memory.c b/mm/memory.c index c467102a5cbc..9152c2a2c9f6 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4024,36 +4024,34 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) * The mmap_sem may have been released depending on flags and our * return value. See filemap_fault() and __lock_page_or_retry(). 
*/ -static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, - unsigned long address, unsigned int flags) +static vm_fault_t __handle_mm_fault(struct vm_fault *vmf) { - struct vm_fault vmf = { - .vma = vma, - .address = address & PAGE_MASK, - .flags = flags, - .pgoff = linear_page_index(vma, address), - .gfp_mask = __get_fault_gfp_mask(vma), - }; - unsigned int dirty = flags & FAULT_FLAG_WRITE; + struct vm_area_struct *vma = vmf->vma; + unsigned long address = vmf->address; + unsigned int dirty = vmf->flags & FAULT_FLAG_WRITE; struct mm_struct *mm = vma->vm_mm; pgd_t *pgd; p4d_t *p4d; vm_fault_t ret; + vmf->address = address & PAGE_MASK; + vmf->pgoff = linear_page_index(vma, address); + vmf->gfp_mask = __get_fault_gfp_mask(vma); + pgd = pgd_offset(mm, address); p4d = p4d_alloc(mm, pgd, address); if (!p4d) return VM_FAULT_OOM; - vmf.pud = pud_alloc(mm, p4d, address); - if (!vmf.pud) + vmf->pud = pud_alloc(mm, p4d, address); + if (!vmf->pud) return VM_FAULT_OOM; - if (pud_none(*vmf.pud) && transparent_hugepage_enabled(vma)) { - ret = create_huge_pud(&vmf); + if (pud_none(*vmf->pud) && transparent_hugepage_enabled(vma)) { + ret = create_huge_pud(vmf); if (!(ret & VM_FAULT_FALLBACK)) return ret; } else { - pud_t orig_pud = *vmf.pud; + pud_t orig_pud = *vmf->pud; barrier(); if (pud_trans_huge(orig_pud) || pud_devmap(orig_pud)) { @@ -4061,50 +4059,50 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, /* NUMA case for anonymous PUDs would go here */ if (dirty && !pud_write(orig_pud)) { - ret = wp_huge_pud(&vmf, orig_pud); + ret = wp_huge_pud(vmf, orig_pud); if (!(ret & VM_FAULT_FALLBACK)) return ret; } else { - huge_pud_set_accessed(&vmf, orig_pud); + huge_pud_set_accessed(vmf, orig_pud); return 0; } } } - vmf.pmd = pmd_alloc(mm, vmf.pud, address); - if (!vmf.pmd) + vmf->pmd = pmd_alloc(mm, vmf->pud, address); + if (!vmf->pmd) return VM_FAULT_OOM; - if (pmd_none(*vmf.pmd) && transparent_hugepage_enabled(vma)) { - ret = create_huge_pmd(&vmf); + if (pmd_none(*vmf->pmd) && transparent_hugepage_enabled(vma)) { + ret = create_huge_pmd(vmf); if (!(ret & VM_FAULT_FALLBACK)) return ret; } else { - pmd_t orig_pmd = *vmf.pmd; + pmd_t orig_pmd = *vmf->pmd; barrier(); if (unlikely(is_swap_pmd(orig_pmd))) { VM_BUG_ON(thp_migration_supported() && !is_pmd_migration_entry(orig_pmd)); if (is_pmd_migration_entry(orig_pmd)) - pmd_migration_entry_wait(mm, vmf.pmd); + pmd_migration_entry_wait(mm, vmf->pmd); return 0; } if (pmd_trans_huge(orig_pmd) || pmd_devmap(orig_pmd)) { if (pmd_protnone(orig_pmd) && vma_is_accessible(vma)) - return do_huge_pmd_numa_page(&vmf, orig_pmd); + return do_huge_pmd_numa_page(vmf, orig_pmd); if (dirty && !pmd_write(orig_pmd)) { - ret = wp_huge_pmd(&vmf, orig_pmd); + ret = wp_huge_pmd(vmf, orig_pmd); if (!(ret & VM_FAULT_FALLBACK)) return ret; } else { - huge_pmd_set_accessed(&vmf, orig_pmd); + huge_pmd_set_accessed(vmf, orig_pmd); return 0; } } } - return handle_pte_fault(&vmf); + return handle_pte_fault(vmf); } /* @@ -4113,9 +4111,10 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, * The mmap_sem may have been released depending on flags and our * return value. See filemap_fault() and __lock_page_or_retry(). 
*/ -vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address, - unsigned int flags) +vm_fault_t handle_mm_fault(struct vm_fault *vmf) { + struct vm_area_struct *vma = vmf->vma; + unsigned int flags = vmf->flags; vm_fault_t ret; __set_current_state(TASK_RUNNING); @@ -4139,9 +4138,9 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address, mem_cgroup_enter_user_fault(); if (unlikely(is_vm_hugetlb_page(vma))) - ret = hugetlb_fault(vma->vm_mm, vma, address, flags); + ret = hugetlb_fault(vma->vm_mm, vma, vmf->address, flags); else - ret = __handle_mm_fault(vma, address, flags); + ret = __handle_mm_fault(vmf); if (flags & FAULT_FLAG_USER) { mem_cgroup_exit_user_fault();

From patchwork Tue Sep 25 15:30:05 2018
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 10614241
From: Josef Bacik
To: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, linux-btrfs@vger.kernel.org, riel@redhat.com, hannes@cmpxchg.org, tj@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH 2/8] mm: drop mmap_sem for page cache read IO submission
Date: Tue, 25 Sep 2018 11:30:05 -0400
Message-Id: <20180925153011.15311-3-josef@toxicpanda.com>
In-Reply-To: <20180925153011.15311-1-josef@toxicpanda.com>
References: <20180925153011.15311-1-josef@toxicpanda.com>

From: Johannes Weiner

Reads can take a long time, and if anybody needs to take a write lock on the mmap_sem it'll block any subsequent readers of the mmap_sem while the read is outstanding, which could cause long delays. Instead, drop the mmap_sem if we do any reads at all.
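For context, the retry protocol this change relies on looks roughly like the following from the arch handler's side (a hedged sketch of the pre-existing pattern, not code from this series; exact flag handling varies by architecture, and the real handlers also look the vma up again after re-taking the lock):

    fault = handle_mm_fault(&vmf);
    if ((fault & VM_FAULT_RETRY) && (vmf.flags & FAULT_FLAG_ALLOW_RETRY)) {
            /* mmap_sem was released while the read IO was submitted;
             * re-take it and retry exactly once with FAULT_FLAG_TRIED */
            vmf.flags &= ~FAULT_FLAG_ALLOW_RETRY;
            vmf.flags |= FAULT_FLAG_TRIED;
            down_read(&mm->mmap_sem);
            fault = handle_mm_fault(&vmf);
    }

With the sync and async readahead paths below returning -EAGAIN after dropping the lock, read IO submission happens without the mmap_sem held, so writers are no longer blocked behind in-flight reads.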
Signed-off-by: Johannes Weiner Signed-off-by: Josef Bacik --- mm/filemap.c | 119 ++++++++++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 90 insertions(+), 29 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 52517f28e6f4..1ed35cd99b2c 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2366,6 +2366,18 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter) EXPORT_SYMBOL(generic_file_read_iter); #ifdef CONFIG_MMU +static struct file *maybe_unlock_mmap_for_io(struct vm_area_struct *vma, int flags) +{ + if ((flags & (FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT)) == FAULT_FLAG_ALLOW_RETRY) { + struct file *file; + + file = get_file(vma->vm_file); + up_read(&vma->vm_mm->mmap_sem); + return file; + } + return NULL; +} + /** * page_cache_read - adds requested page to the page cache if not already there * @file: file to read @@ -2405,23 +2417,28 @@ static int page_cache_read(struct file *file, pgoff_t offset, gfp_t gfp_mask) * Synchronous readahead happens when we don't even find * a page in the page cache at all. */ -static void do_sync_mmap_readahead(struct vm_area_struct *vma, - struct file_ra_state *ra, - struct file *file, - pgoff_t offset) +static int do_sync_mmap_readahead(struct vm_area_struct *vma, + struct file_ra_state *ra, + struct file *file, + pgoff_t offset, + int flags) { struct address_space *mapping = file->f_mapping; + struct file *fpin; /* If we don't want any read-ahead, don't bother */ if (vma->vm_flags & VM_RAND_READ) - return; + return 0; if (!ra->ra_pages) - return; + return 0; if (vma->vm_flags & VM_SEQ_READ) { + fpin = maybe_unlock_mmap_for_io(vma, flags); page_cache_sync_readahead(mapping, ra, file, offset, ra->ra_pages); - return; + if (fpin) + fput(fpin); + return fpin ? -EAGAIN : 0; } /* Avoid banging the cache line if not needed */ @@ -2433,7 +2450,9 @@ static void do_sync_mmap_readahead(struct vm_area_struct *vma, * stop bothering with read-ahead. It will only hurt. */ if (ra->mmap_miss > MMAP_LOTSAMISS) - return; + return 0; + + fpin = maybe_unlock_mmap_for_io(vma, flags); /* * mmap read-around @@ -2442,28 +2461,40 @@ static void do_sync_mmap_readahead(struct vm_area_struct *vma, ra->size = ra->ra_pages; ra->async_size = ra->ra_pages / 4; ra_submit(ra, mapping, file); + + if (fpin) + fput(fpin); + + return fpin ? -EAGAIN : 0; } /* * Asynchronous readahead happens when we find the page and PG_readahead, * so we want to possibly extend the readahead further.. */ -static void do_async_mmap_readahead(struct vm_area_struct *vma, - struct file_ra_state *ra, - struct file *file, - struct page *page, - pgoff_t offset) +static int do_async_mmap_readahead(struct vm_area_struct *vma, + struct file_ra_state *ra, + struct file *file, + struct page *page, + pgoff_t offset, + int flags) { struct address_space *mapping = file->f_mapping; + struct file *fpin; /* If we don't want any read-ahead, don't bother */ if (vma->vm_flags & VM_RAND_READ) - return; + return 0; if (ra->mmap_miss > 0) ra->mmap_miss--; - if (PageReadahead(page)) - page_cache_async_readahead(mapping, ra, file, - page, offset, ra->ra_pages); + if (!PageReadahead(page)) + return 0; + fpin = maybe_unlock_mmap_for_io(vma, flags); + page_cache_async_readahead(mapping, ra, file, + page, offset, ra->ra_pages); + if (fpin) + fput(fpin); + return fpin ? -EAGAIN : 0; } /** @@ -2479,10 +2510,8 @@ static void do_async_mmap_readahead(struct vm_area_struct *vma, * * vma->vm_mm->mmap_sem must be held on entry. 
* - * If our return value has VM_FAULT_RETRY set, it's because - * lock_page_or_retry() returned 0. - * The mmap_sem has usually been released in this case. - * See __lock_page_or_retry() for the exception. + * If our return value has VM_FAULT_RETRY set, the mmap_sem has + * usually been released. * * If our return value does not have VM_FAULT_RETRY set, the mmap_sem * has not been released. @@ -2492,11 +2521,13 @@ static void do_async_mmap_readahead(struct vm_area_struct *vma, vm_fault_t filemap_fault(struct vm_fault *vmf) { int error; + struct mm_struct *mm = vmf->vma->vm_mm; struct file *file = vmf->vma->vm_file; struct address_space *mapping = file->f_mapping; struct file_ra_state *ra = &file->f_ra; struct inode *inode = mapping->host; pgoff_t offset = vmf->pgoff; + int flags = vmf->flags; pgoff_t max_off; struct page *page; vm_fault_t ret = 0; @@ -2509,27 +2540,44 @@ vm_fault_t filemap_fault(struct vm_fault *vmf) * Do we have something in the page cache already? */ page = find_get_page(mapping, offset); - if (likely(page) && !(vmf->flags & FAULT_FLAG_TRIED)) { + if (likely(page) && !(flags & FAULT_FLAG_TRIED)) { /* * We found the page, so try async readahead before * waiting for the lock. */ - do_async_mmap_readahead(vmf->vma, ra, file, page, offset); + error = do_async_mmap_readahead(vmf->vma, ra, file, page, offset, vmf->flags); + if (error == -EAGAIN) + goto out_retry_wait; } else if (!page) { /* No page in the page cache at all */ - do_sync_mmap_readahead(vmf->vma, ra, file, offset); - count_vm_event(PGMAJFAULT); - count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT); ret = VM_FAULT_MAJOR; + count_vm_event(PGMAJFAULT); + count_memcg_event_mm(mm, PGMAJFAULT); + error = do_sync_mmap_readahead(vmf->vma, ra, file, offset, vmf->flags); + if (error == -EAGAIN) + goto out_retry_wait; retry_find: page = find_get_page(mapping, offset); if (!page) goto no_cached_page; } - if (!lock_page_or_retry(page, vmf->vma->vm_mm, vmf->flags)) { - put_page(page); - return ret | VM_FAULT_RETRY; + if (!trylock_page(page)) { + if (flags & FAULT_FLAG_ALLOW_RETRY) { + if (flags & FAULT_FLAG_RETRY_NOWAIT) + goto out_retry; + up_read(&mm->mmap_sem); + goto out_retry_wait; + } + if (flags & FAULT_FLAG_KILLABLE) { + int ret = __lock_page_killable(page); + + if (ret) { + up_read(&mm->mmap_sem); + goto out_retry; + } + } else + __lock_page(page); } /* Did it get truncated? */ @@ -2607,6 +2655,19 @@ vm_fault_t filemap_fault(struct vm_fault *vmf) /* Things didn't work out. Return zero to tell the mm layer so. 
*/ shrink_readahead_size_eio(file, ra); return VM_FAULT_SIGBUS; + +out_retry_wait: + if (page) { + if (flags & FAULT_FLAG_KILLABLE) + wait_on_page_locked_killable(page); + else + wait_on_page_locked(page); + } + +out_retry: + if (page) + put_page(page); + return ret | VM_FAULT_RETRY; } EXPORT_SYMBOL(filemap_fault); From patchwork Tue Sep 25 15:30:06 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Josef Bacik X-Patchwork-Id: 10614249 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3CECB161F for ; Tue, 25 Sep 2018 15:30:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2D3572A7DE for ; Tue, 25 Sep 2018 15:30:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 215BD2A87F; Tue, 25 Sep 2018 15:30:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4A3A62A885 for ; Tue, 25 Sep 2018 15:30:30 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 39D858E009F; Tue, 25 Sep 2018 11:30:22 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 2D1518E009E; Tue, 25 Sep 2018 11:30:22 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 063568E009F; Tue, 25 Sep 2018 11:30:21 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) by kanga.kvack.org (Postfix) with ESMTP id B10948E009E for ; Tue, 25 Sep 2018 11:30:21 -0400 (EDT) Received: by mail-qt1-f198.google.com with SMTP id h26-v6so7646041qtp.18 for ; Tue, 25 Sep 2018 08:30:21 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:dkim-signature:from:to:cc:subject:date :message-id:in-reply-to:references; bh=CbVu8tMWjWUQaLeNEaAx14400uYFzt+wHZIzJ3Bs3Ms=; b=tF15XKQEvaIKko9SgRxmBCHgw02uByLsbxJ6kvU568G65P6ps2AkGGMQhI5QMsgtHS //ByXp2NDV/kMIXlGQExrD2c4hV0CcEu2XNct2Ph8UJSZZ+SOAxpFcqzM04fYsa2L0ir D9+cteKiLqeHTGii0WMbkyzocakz/kA+4WAH+Z7bcE9DvQ9yD/eRWcJk1HpaFQiOzxM6 dKsif4X8LcAaK2q4M61hLngEIpwml4y+hkkglr0NXeDp6GlfTwK8SIcEwTE7MbMck4Ku m6erhQJ6Peswwmaqmhw9f21ewqCCHbaflsWbRPe8yOMuMYAkqTV1kP0myuHu6kxttKqR VWGg== X-Gm-Message-State: ABuFfoj4zYc/uhYYzgfgKmr1tyEx6v9MMdhYb9775s41vdMBWkGlVXBz h5f0GEkc2qDOi5L0e+t3zj/Huwt9xnvxqtA1Y+gsJpfM/wMDUvPKukRBOLdGJBTCPi7/Vb1hmVT mrcin78YnOXidk3vl701Ad2Comc9JbUX3F8eXTIz4NtsZPnmBZXuFfwb9vHXRzdzUe5IZwd/y23 /hufizS0nyu/5C3ZSAtBzWUd5eAO6IXSek7RcdWWz8HJgND2bAvri+9jl6nqGtaC/HG/NaG79C/ F4Q4fiwT+hwHRudn1nViEu7yoai57jESYXGlqxRtqU9QP7F3BrcwhYgHW0usoFV0iGXO0rE0R5f xilxFj2BFLvf1VfR3fmmYkpTFMNUgM7clvcfwl1XmINVMBlpF42wYIaykXqS81yRvQPPDK/RHew K X-Received: by 2002:a0c:ae15:: with SMTP id y21-v6mr1141538qvc.233.1537889421481; Tue, 25 Sep 2018 08:30:21 -0700 (PDT) X-Received: by 2002:a0c:ae15:: with SMTP id y21-v6mr1141485qvc.233.1537889420678; Tue, 25 Sep 2018 08:30:20 -0700 
From patchwork Tue Sep 25 15:30:06 2018
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 10614249
From: Josef Bacik
To: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, linux-btrfs@vger.kernel.org, riel@redhat.com, hannes@cmpxchg.org, tj@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Cc: Johannes Weiner
Subject: [PATCH 3/8] mm: clean up swapcache lookup and creation function names
Date: Tue, 25 Sep 2018 11:30:06 -0400
Message-Id: <20180925153011.15311-4-josef@toxicpanda.com>
In-Reply-To: <20180925153011.15311-1-josef@toxicpanda.com>
References: <20180925153011.15311-1-josef@toxicpanda.com>
From: Johannes Weiner

__read_swap_cache_async() has a misleading name. All it does is look up
or create a page in the swapcache; it doesn't initiate any IO. The
swapcache has many parallels to the page cache, and shares naming schemes
with it elsewhere. Analogous to the cache lookup and creation API, rename
__read_swap_cache_async() to find_or_create_swap_cache() and
lookup_swap_cache() to find_swap_cache().

Signed-off-by: Johannes Weiner
Signed-off-by: Josef Bacik
---
 include/linux/swap.h | 14 ++++++++------
 mm/memory.c          |  2 +-
 mm/shmem.c           |  2 +-
 mm/swap_state.c      | 43 ++++++++++++++++++++++---------------------
 mm/zswap.c           |  8 ++++----
 5 files changed, 36 insertions(+), 33 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 8e2c11e692ba..293a84c34448 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -412,15 +412,17 @@ extern void __delete_from_swap_cache(struct page *);
 extern void delete_from_swap_cache(struct page *);
 extern void free_page_and_swap_cache(struct page *);
 extern void free_pages_and_swap_cache(struct page **, int);
-extern struct page *lookup_swap_cache(swp_entry_t entry,
-				      struct vm_area_struct *vma,
-				      unsigned long addr);
+extern struct page *find_swap_cache(swp_entry_t entry,
+				    struct vm_area_struct *vma,
+				    unsigned long addr);
+extern struct page *find_or_create_swap_cache(swp_entry_t entry,
+					      gfp_t gfp_mask,
+					      struct vm_area_struct *vma,
+					      unsigned long addr,
+					      bool *created);
 extern struct page *read_swap_cache_async(swp_entry_t, gfp_t,
 			struct vm_area_struct *vma, unsigned long addr,
 			bool do_poll);
-extern struct page *__read_swap_cache_async(swp_entry_t, gfp_t,
-			struct vm_area_struct *vma, unsigned long addr,
-			bool *new_page_allocated);
 extern struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t flag,
 				struct vm_fault *vmf);
 extern struct page *swapin_readahead(swp_entry_t entry, gfp_t flag,
diff --git a/mm/memory.c b/mm/memory.c
index 9152c2a2c9f6..f27295c1c91d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2935,7 +2935,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
-	page = lookup_swap_cache(entry, vma, vmf->address);
+	page = find_swap_cache(entry, vma, vmf->address);
 	swapcache = page;

 	if (!page) {
diff --git a/mm/shmem.c b/mm/shmem.c
index 0376c124b043..9854903ae92f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1679,7 +1679,7 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
 	if (swap.val) {
 		/* Look it up and read it in.. */
-		page = lookup_swap_cache(swap, NULL, 0);
+		page = find_swap_cache(swap, NULL, 0);
 		if (!page) {
 			/* Or update major stats only when swapin succeeds?? */
 			if (fault_type) {
diff --git a/mm/swap_state.c b/mm/swap_state.c
index ecee9c6c4cc1..bae758e19f7a 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -330,8 +330,8 @@ static inline bool swap_use_vma_readahead(void)
 * lock getting page table operations atomic even if we drop the page
 * lock before returning.
 */
-struct page *lookup_swap_cache(swp_entry_t entry, struct vm_area_struct *vma,
-			       unsigned long addr)
+struct page *find_swap_cache(swp_entry_t entry, struct vm_area_struct *vma,
+			     unsigned long addr)
 {
 	struct page *page;
@@ -374,19 +374,20 @@ struct page *lookup_swap_cache(swp_entry_t entry, struct vm_area_struct *vma,
 	return page;
 }

-struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
+struct page *find_or_create_swap_cache(swp_entry_t entry, gfp_t gfp_mask,
 			struct vm_area_struct *vma, unsigned long addr,
-			bool *new_page_allocated)
+			bool *created)
 {
 	struct page *found_page, *new_page = NULL;
 	struct address_space *swapper_space = swap_address_space(entry);
 	int err;
-	*new_page_allocated = false;
+
+	*created = false;

 	do {
 		/*
 		 * First check the swap cache. Since this is normally
-		 * called after lookup_swap_cache() failed, re-calling
+		 * called after find_swap_cache() failed, re-calling
 		 * that would confuse statistics.
 		 */
 		found_page = find_get_page(swapper_space, swp_offset(entry));
@@ -449,7 +450,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 		 * Initiate read into locked page and return.
 		 */
 		lru_cache_add_anon(new_page);
-		*new_page_allocated = true;
+		*created = true;
 		return new_page;
 	}
 	radix_tree_preload_end();
@@ -475,14 +476,14 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 			struct vm_area_struct *vma, unsigned long addr, bool do_poll)
 {
-	bool page_was_allocated;
-	struct page *retpage = __read_swap_cache_async(entry, gfp_mask,
-			vma, addr, &page_was_allocated);
+	struct page *page;
+	bool created;

-	if (page_was_allocated)
-		swap_readpage(retpage, do_poll);
+	page = find_or_create_swap_cache(entry, gfp_mask, vma, addr, &created);
+	if (created)
+		swap_readpage(page, do_poll);

-	return retpage;
+	return page;
 }

 static unsigned int __swapin_nr_pages(unsigned long prev_offset,
@@ -573,7 +574,7 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	unsigned long mask;
 	struct swap_info_struct *si = swp_swap_info(entry);
 	struct blk_plug plug;
-	bool do_poll = true, page_allocated;
+	bool do_poll = true, created;
 	struct vm_area_struct *vma = vmf->vma;
 	unsigned long addr = vmf->address;
@@ -593,12 +594,12 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	blk_start_plug(&plug);
 	for (offset = start_offset; offset <= end_offset ; offset++) {
 		/* Ok, do the async read-ahead now */
-		page = __read_swap_cache_async(
+		page = find_or_create_swap_cache(
 			swp_entry(swp_type(entry), offset),
-			gfp_mask, vma, addr, &page_allocated);
+			gfp_mask, vma, addr, &created);
 		if (!page)
 			continue;
-		if (page_allocated) {
+		if (created) {
 			swap_readpage(page, false);
 			if (offset != entry_offset) {
 				SetPageReadahead(page);
@@ -738,7 +739,7 @@ static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
 	pte_t *pte, pentry;
 	swp_entry_t entry;
 	unsigned int i;
-	bool page_allocated;
+	bool created;
 	struct vma_swap_readahead ra_info = {0,};

 	swap_ra_info(vmf, &ra_info);
@@ -756,11 +757,11 @@ static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
 		entry = pte_to_swp_entry(pentry);
 		if (unlikely(non_swap_entry(entry)))
 			continue;
-		page = __read_swap_cache_async(entry, gfp_mask, vma,
-			vmf->address, &page_allocated);
+		page = find_or_create_swap_cache(entry, gfp_mask, vma,
+			vmf->address, &created);
 		if (!page)
 			continue;
-		if (page_allocated) {
+		if (created) {
 			swap_readpage(page, false);
 			if (i != ra_info.offset) {
 				SetPageReadahead(page);
diff --git a/mm/zswap.c b/mm/zswap.c
index cd91fd9d96b8..6f05faa75766 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -823,11 +823,11 @@ enum zswap_get_swap_ret {
 static int zswap_get_swap_cache_page(swp_entry_t entry,
 				struct page **retpage)
 {
-	bool page_was_allocated;
+	bool created;

-	*retpage = __read_swap_cache_async(entry, GFP_KERNEL,
-			NULL, 0, &page_was_allocated);
-	if (page_was_allocated)
+	*retpage = find_or_create_swap_cache(entry, GFP_KERNEL,
+			NULL, 0, &created);
+	if (created)
 		return ZSWAP_SWAPCACHE_NEW;
 	if (!*retpage)
 		return ZSWAP_SWAPCACHE_FAIL;
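The split mirrors the page cache's find_or_create_page(): lookup or creation is one step, IO submission is a separate one performed only by whoever created the page. The rewritten read_swap_cache_async() above boils down to this calling convention (a sketch; the wrapper name is hypothetical):

/* Sketch of the post-rename calling convention, restating the
 * read_swap_cache_async() rewrite from the hunk above. */
static struct page *swapin_page_sketch(swp_entry_t entry, gfp_t gfp_mask,
				       struct vm_area_struct *vma,
				       unsigned long addr, bool do_poll)
{
	bool created;
	struct page *page;

	page = find_or_create_swap_cache(entry, gfp_mask, vma, addr, &created);
	if (page && created)
		swap_readpage(page, do_poll);	/* only the creator starts IO */
	return page;
}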
From patchwork Tue Sep 25 15:30:07 2018
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 10614253
From: Josef Bacik
To: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, linux-btrfs@vger.kernel.org, riel@redhat.com, hannes@cmpxchg.org, tj@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Cc: Johannes Weiner
Subject: [PATCH 4/8] mm: drop mmap_sem for swap read IO submission
Date: Tue, 25 Sep 2018 11:30:07 -0400
Message-Id: <20180925153011.15311-5-josef@toxicpanda.com>
In-Reply-To: <20180925153011.15311-1-josef@toxicpanda.com>
References: <20180925153011.15311-1-josef@toxicpanda.com>

From: Johannes Weiner

We don't need to hold the mmap_sem while we're doing the IO; simply drop
it and retry appropriately.

Signed-off-by: Johannes Weiner
Signed-off-by: Josef Bacik
---
 mm/page_io.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/mm/page_io.c b/mm/page_io.c
index aafd19ec1db4..bf21b56a964e 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -365,6 +365,20 @@ int swap_readpage(struct page *page, bool synchronous)
 		goto out;
 	}

+	/*
+	 * XXX:
+	 *
+	 * Propagate mm->mmap_sem into this function. Then:
+	 *
+	 * get_file(sis->swap_file)
+	 * up_read(mm->mmap_sem)
+	 * submit io request
+	 * fput
+	 *
+	 * After mmap_sem is dropped, sis is no longer valid. Go
+	 * through swap_file->blah->bdev.
+	 */
+
 	if (sis->flags & SWP_FILE) {
 		struct file *swap_file = sis->swap_file;
 		struct address_space *mapping = swap_file->f_mapping;
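The XXX comment sketches the intended protocol: pin the swap file so the IO target survives, drop the mmap_sem, submit the read, then release the pin. Roughly (an assumption about the direction the comment points at, not code from this series; submit_swap_io() is a hypothetical stand-in for the bio/address_space submission swap_readpage() actually does):

/* Hypothetical stand-in for the real submission path. */
extern int submit_swap_io(struct file *swap_file, struct page *page);

static int swap_read_dropping_sem_sketch(struct page *page,
					 struct swap_info_struct *sis,
					 struct mm_struct *mm)
{
	/* Pin the backing file; sis can be invalidated once mmap_sem drops. */
	struct file *swap_file = get_file(sis->swap_file);
	int ret;

	up_read(&mm->mmap_sem);			/* sis may now go away */
	ret = submit_swap_io(swap_file, page);	/* IO only via the pinned file */
	fput(swap_file);
	return ret;
}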
From patchwork Tue Sep 25 15:30:08 2018
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 10614259
From: Josef Bacik
To: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, linux-btrfs@vger.kernel.org, riel@redhat.com, hannes@cmpxchg.org, tj@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH 5/8] mm: drop the mmap_sem in all read fault cases
Date: Tue, 25 Sep 2018 11:30:08 -0400
Message-Id: <20180925153011.15311-6-josef@toxicpanda.com>
In-Reply-To: <20180925153011.15311-1-josef@toxicpanda.com>
References: <20180925153011.15311-1-josef@toxicpanda.com>

Johannes' patches didn't quite cover all of the IO cases that we need to
drop the mmap_sem for; this patch covers the rest of them.

Signed-off-by: Josef Bacik
---
 mm/filemap.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index 1ed35cd99b2c..65395ee132a0 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2523,6 +2523,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	int error;
 	struct mm_struct *mm = vmf->vma->vm_mm;
 	struct file *file = vmf->vma->vm_file;
+	struct file *fpin = NULL;
 	struct address_space *mapping = file->f_mapping;
 	struct file_ra_state *ra = &file->f_ra;
 	struct inode *inode = mapping->host;
@@ -2610,11 +2611,15 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	return ret | VM_FAULT_LOCKED;

 no_cached_page:
+	fpin = maybe_unlock_mmap_for_io(vmf->vma, vmf->flags);
+
 	/*
 	 * We're only likely to ever get here if MADV_RANDOM is in
 	 * effect.
 	 */
 	error = page_cache_read(file, offset, vmf->gfp_mask);
+	if (fpin)
+		goto out_retry;

 	/*
 	 * The page we want has now been added to the page cache.
@@ -2634,6 +2639,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	return VM_FAULT_SIGBUS;

 page_not_uptodate:
+	fpin = maybe_unlock_mmap_for_io(vmf->vma, vmf->flags);
+
 	/*
 	 * Umm, take care of errors if the page isn't up-to-date.
 	 * Try to re-read it _once_. We do this synchronously,
@@ -2647,6 +2654,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 		if (!PageUptodate(page))
 			error = -EIO;
 	}
+	if (fpin)
+		goto out_retry;
 	put_page(page);

 	if (!error || error == AOP_TRUNCATED_PAGE)
@@ -2665,6 +2674,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	}

 out_retry:
+	if (fpin)
+		fput(fpin);
 	if (page)
 		put_page(page);
 	return ret | VM_FAULT_RETRY;
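maybe_unlock_mmap_for_io() is introduced earlier in the series and its body is not part of this excerpt. Judging only from the call sites here, a plausible shape is the following (an assumption, not the series' actual definition): pin the file so the mapping stays alive, drop the mmap_sem when retries are allowed, and return the pinned file for the caller to fput() on the way out.

/* Assumed sketch of the helper used above; the real definition lives in
 * an earlier patch of this series and may differ. */
static struct file *maybe_unlock_mmap_for_io(struct vm_area_struct *vma,
					     int flags)
{
	if (!(flags & FAULT_FLAG_ALLOW_RETRY) ||
	    (flags & FAULT_FLAG_RETRY_NOWAIT))
		return NULL;	/* caller must keep the lock held */

	/* The file reference keeps the mapping alive without the sem. */
	get_file(vma->vm_file);
	up_read(&vma->vm_mm->mmap_sem);
	return vma->vm_file;
}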
From patchwork Tue Sep 25 15:30:09 2018
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 10614269
From: Josef Bacik
To: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, linux-btrfs@vger.kernel.org, riel@redhat.com, hannes@cmpxchg.org, tj@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH 6/8] mm: keep the page we read for the next loop
Date: Tue, 25 Sep 2018 11:30:09 -0400
Message-Id: <20180925153011.15311-7-josef@toxicpanda.com>
In-Reply-To: <20180925153011.15311-1-josef@toxicpanda.com>
References: <20180925153011.15311-1-josef@toxicpanda.com>

If we drop the mmap_sem we need to redo the vma lookup and then re-lookup
the page. This is a waste since we've already done the work, and in the
meantime the page could even be evicted, causing a refault. Instead just
hold a reference to the page and save it in our vm_fault. The next time
we go through filemap_fault we'll grab our page, verify that it's the one
we want, and carry on.

Signed-off-by: Josef Bacik
---
 arch/alpha/mm/fault.c         |  7 +++++--
 arch/arc/mm/fault.c           |  6 +++++-
 arch/arm/mm/fault.c           |  2 ++
 arch/arm64/mm/fault.c         |  2 ++
 arch/hexagon/mm/vm_fault.c    |  6 +++++-
 arch/ia64/mm/fault.c          |  6 +++++-
 arch/m68k/mm/fault.c          |  6 +++++-
 arch/microblaze/mm/fault.c    |  6 +++++-
 arch/mips/mm/fault.c          |  6 +++++-
 arch/nds32/mm/fault.c         |  3 +++
 arch/nios2/mm/fault.c         |  6 +++++-
 arch/openrisc/mm/fault.c      |  6 +++++-
 arch/parisc/mm/fault.c        |  6 +++++-
 arch/powerpc/mm/copro_fault.c |  3 ++-
 arch/powerpc/mm/fault.c       |  3 +++
 arch/riscv/mm/fault.c         |  6 +++++-
 arch/s390/mm/fault.c          |  1 +
 arch/sh/mm/fault.c            |  8 ++++++--
 arch/sparc/mm/fault_32.c      |  8 +++++++-
 arch/sparc/mm/fault_64.c      |  6 +++++-
 arch/um/kernel/trap.c         |  6 +++++-
 arch/unicore32/mm/fault.c     |  5 ++++-
 arch/x86/mm/fault.c           |  2 ++
 arch/xtensa/mm/fault.c        |  6 +++++-
 drivers/iommu/amd_iommu_v2.c  |  1 +
 drivers/iommu/intel-svm.c     |  1 +
 include/linux/mm.h            | 14 ++++++++++++++
 mm/filemap.c                  | 31 ++++++++++++++++++++++++++++---
 mm/gup.c                      |  3 +++
 mm/hmm.c                      |  1 +
 mm/ksm.c                      |  1 +
 31 files changed, 151 insertions(+), 23 deletions(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index 3c98dfef03a9..ed5929787d4a 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -152,10 +152,13 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
 	vm_fault_init(&vmfs, vma, flags, address);
 	fault = handle_mm_fault(&vmf);
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
+		vm_fault_cleanup(&vmf);
 		return;
+	}

 	if (unlikely(fault & VM_FAULT_ERROR)) {
+		vm_fault_cleanup(&vmf);
 		if (fault & VM_FAULT_OOM)
 			goto out_of_memory;
 		else if (fault & VM_FAULT_SIGSEGV)
@@ -181,7 +184,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
 			goto retry;
 		}
 	}
-
+	vm_fault_cleanup(&vmf);
 	up_read(&mm->mmap_sem);

 	return;
diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 7aeb81ff5070..38a6c5e94fac 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -149,8 +149,10 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	if (unlikely(fatal_signal_pending(current))) {
 		if ((fault & VM_FAULT_ERROR) && !(fault & VM_FAULT_RETRY))
 			up_read(&mm->mmap_sem);
-		if (user_mode(regs))
+		if (user_mode(regs)) {
+			vm_fault_cleanup(&vmf);
 			return;
+		}
 	}

 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
@@ -176,10 +178,12 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	}

 	/* Fault Handled Gracefully */
+	vm_fault_cleanup(&vmf);
 	up_read(&mm->mmap_sem);
 	return;
 }

+
vm_fault_cleanup(&vmf); if (fault & VM_FAULT_OOM) goto out_of_memory; else if (fault & VM_FAULT_SIGSEGV) diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index 885a24385a0a..f08946e78bd9 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -325,6 +325,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) * it would already be released in __lock_page_or_retry in * mm/filemap.c. */ if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { + vm_fault_cleanup(&vmf); if (!user_mode(regs)) goto no_context; return 0; @@ -356,6 +357,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); /* diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 31e86a74cbe0..6f3e908a3820 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -506,6 +506,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, * in __lock_page_or_retry in mm/filemap.c. */ if (fatal_signal_pending(current)) { + vm_fault_cleanup(&vmf); if (!user_mode(regs)) goto no_context; return 0; @@ -521,6 +522,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, goto retry; } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); /* diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c index 1ee1042bb2b5..d68aa9691184 100644 --- a/arch/hexagon/mm/vm_fault.c +++ b/arch/hexagon/mm/vm_fault.c @@ -106,8 +106,10 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) vm_fault_init(&vmf, vma, address, flags); fault = handle_mm_fault(&vmf); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { + vm_fault_cleanup(&vmf); return; + } /* The most common case -- we are done. */ if (likely(!(fault & VM_FAULT_ERROR))) { @@ -123,10 +125,12 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); return; } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); /* Handle copyin/out exception cases */ diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index 827b898adb5e..68b689bb619f 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -165,8 +165,10 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re vm_fault_init(&vmf, vma, address, flags); fault = handle_mm_fault(&vmf); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { + vm_fault_cleanup(&vmf); return; + } if (unlikely(fault & VM_FAULT_ERROR)) { /* @@ -174,6 +176,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re * to us that made us unable to handle the page fault * gracefully. 
*/ + vm_fault_cleanup(&vmf); if (fault & VM_FAULT_OOM) { goto out_of_memory; } else if (fault & VM_FAULT_SIGSEGV) { @@ -203,6 +206,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); return; diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index e42eddc9c7ca..7e8be4665ef9 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -139,10 +139,13 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, fault = handle_mm_fault(&vmf); pr_debug("handle_mm_fault returns %x\n", fault); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { + vm_fault_cleanup(&vmf); return 0; + } if (unlikely(fault & VM_FAULT_ERROR)) { + vm_fault_cleanup(&vmf); if (fault & VM_FAULT_OOM) goto out_of_memory; else if (fault & VM_FAULT_SIGSEGV) @@ -178,6 +181,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); return 0; diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index ade980266f65..bb320be95142 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -219,10 +219,13 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, vm_fault_init(&vmf, vma, address, flags); fault = handle_mm_fault(&vmf); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { + vm_fault_cleanup(&vmf); return; + } if (unlikely(fault & VM_FAULT_ERROR)) { + vm_fault_cleanup(&vmf); if (fault & VM_FAULT_OOM) goto out_of_memory; else if (fault & VM_FAULT_SIGSEGV) @@ -251,6 +254,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); /* diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c index bf212bb70f24..8f1cfe564987 100644 --- a/arch/mips/mm/fault.c +++ b/arch/mips/mm/fault.c @@ -156,11 +156,14 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, vm_fault_init(&vmf, vma, address, flags); fault = handle_mm_fault(&vmf); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { + vm_fault_cleanup(&vmf); return; + } perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); if (unlikely(fault & VM_FAULT_ERROR)) { + vm_fault_cleanup(&vmf); if (fault & VM_FAULT_OOM) goto out_of_memory; else if (fault & VM_FAULT_SIGSEGV) @@ -193,6 +196,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); return; diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c index 27ac4caa5102..7cb4d9f73c1a 100644 --- a/arch/nds32/mm/fault.c +++ b/arch/nds32/mm/fault.c @@ -213,12 +213,14 @@ void do_page_fault(unsigned long entry, unsigned long addr, * would already be released in __lock_page_or_retry in mm/filemap.c. 
*/ if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { + vm_fault_cleanup(&vmf); if (!user_mode(regs)) goto no_context; return; } if (unlikely(fault & VM_FAULT_ERROR)) { + vm_fault_cleanup(&vmf); if (fault & VM_FAULT_OOM) goto out_of_memory; else if (fault & VM_FAULT_SIGBUS) @@ -249,6 +251,7 @@ void do_page_fault(unsigned long entry, unsigned long addr, } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); return; diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c index 693472f05065..774035116392 100644 --- a/arch/nios2/mm/fault.c +++ b/arch/nios2/mm/fault.c @@ -136,10 +136,13 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, vm_fault_init(&vmf, vma, address, flags); fault = handle_mm_fault(&vmf); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { + vm_fault_cleanup(&vmf); return; + } if (unlikely(fault & VM_FAULT_ERROR)) { + vm_fault_cleanup(&vmf); if (fault & VM_FAULT_OOM) goto out_of_memory; else if (fault & VM_FAULT_SIGSEGV) @@ -175,6 +178,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); return; diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c index 70eef1d9f7ed..9186af1b9cdc 100644 --- a/arch/openrisc/mm/fault.c +++ b/arch/openrisc/mm/fault.c @@ -166,10 +166,13 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, vm_fault_init(&vmf, vma, address, flags); fault = handle_mm_fault(&vmf); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { + vm_fault_cleanup(&vmf); return; + } if (unlikely(fault & VM_FAULT_ERROR)) { + vm_fault_cleanup(&vmf); if (fault & VM_FAULT_OOM) goto out_of_memory; else if (fault & VM_FAULT_SIGSEGV) @@ -198,6 +201,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); return; diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index 83c89cada3c0..7ad74571407e 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -304,8 +304,10 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, vm_fault_init(&vmf, vma, address, flags); fault = handle_mm_fault(&vmf); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { + vm_fault_cleanup(&vmf); return; + } if (unlikely(fault & VM_FAULT_ERROR)) { /* @@ -313,6 +315,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, * other thing happened to us that made us unable to * handle the page fault gracefully. */ + vm_fault_cleanup(&vmf); if (fault & VM_FAULT_OOM) goto out_of_memory; else if (fault & VM_FAULT_SIGSEGV) @@ -339,6 +342,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, goto retry; } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); return; diff --git a/arch/powerpc/mm/copro_fault.c b/arch/powerpc/mm/copro_fault.c index 02dd21a54479..07ec389ac6c6 100644 --- a/arch/powerpc/mm/copro_fault.c +++ b/arch/powerpc/mm/copro_fault.c @@ -81,6 +81,7 @@ int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea, vm_fault_init(&vmf, vma, ea, is_write ? 
FAULT_FLAG_WRITE : 0); *flt = handle_mm_fault(&vmf); if (unlikely(*flt & VM_FAULT_ERROR)) { + vm_fault_cleanup(&vmf); if (*flt & VM_FAULT_OOM) { ret = -ENOMEM; goto out_unlock; @@ -95,7 +96,7 @@ int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea, current->maj_flt++; else current->min_flt++; - + vm_fault_cleanup(&vmf); out_unlock: up_read(&mm->mmap_sem); return ret; diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index cc00bba104fb..1940471c6a6f 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -552,6 +552,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, int pkey = vma_pkey(vma); + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); return bad_key_fault_exception(regs, address, pkey); } @@ -580,9 +581,11 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, * User mode? Just return to handle the fatal exception otherwise * return to bad_page_fault */ + vm_fault_cleanup(&vmf); return is_user ? 0 : SIGBUS; } + vm_fault_cleanup(&vmf); up_read(¤t->mm->mmap_sem); if (unlikely(fault & VM_FAULT_ERROR)) diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index aa3db34c9eb8..64c8de82a40b 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -129,10 +129,13 @@ asmlinkage void do_page_fault(struct pt_regs *regs) * signal first. We do not need to release the mmap_sem because it * would already be released in __lock_page_or_retry in mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(tsk)) + if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(tsk)) { + vm_fault_cleanup(&vmf); return; + } if (unlikely(fault & VM_FAULT_ERROR)) { + vm_fault_cleanup(&vmf); if (fault & VM_FAULT_OOM) goto out_of_memory; else if (fault & VM_FAULT_SIGBUS) @@ -172,6 +175,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs) } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); return; diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c index 14cfd6de43ed..a91849a7e338 100644 --- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -561,6 +561,7 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) out_up: up_read(&mm->mmap_sem); out: + vm_fault_cleanup(&vmf); return fault; } diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c index 31202706125c..ee0ad499ed53 100644 --- a/arch/sh/mm/fault.c +++ b/arch/sh/mm/fault.c @@ -485,9 +485,12 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, vm_fault_init(&vmf, vma, address, flags); fault = handle_mm_fault(&vmf); - if (unlikely(fault & (VM_FAULT_RETRY | VM_FAULT_ERROR))) - if (mm_fault_error(regs, error_code, address, fault)) + if (unlikely(fault & (VM_FAULT_RETRY | VM_FAULT_ERROR))) { + if (mm_fault_error(regs, error_code, address, fault)) { + vm_fault_cleanup(&vmf); return; + } + } if (flags & FAULT_FLAG_ALLOW_RETRY) { if (fault & VM_FAULT_MAJOR) { @@ -512,5 +515,6 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, } } + vm_fault_cleanup(&vmf); up_read(&mm->mmap_sem); } diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c index a9dd62393934..0623154163c5 100644 --- a/arch/sparc/mm/fault_32.c +++ b/arch/sparc/mm/fault_32.c @@ -239,10 +239,13 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write, vm_fault_init(&vmf, vma, address, flags); fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { + vm_fault_cleanup(&vmf); return; + } if 
(unlikely(fault & VM_FAULT_ERROR)) {
+		vm_fault_cleanup(&vmf);
 		if (fault & VM_FAULT_OOM)
 			goto out_of_memory;
 		else if (fault & VM_FAULT_SIGSEGV)
@@ -275,6 +278,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 		}
 	}
+	vm_fault_cleanup(&vmf);
 	up_read(&mm->mmap_sem);
 	return;
@@ -412,8 +416,10 @@ static void force_user_fault(unsigned long address, int write)
 	switch (handle_mm_fault(&vmf)) {
 	case VM_FAULT_SIGBUS:
 	case VM_FAULT_OOM:
+		vm_fault_cleanup(&vmf);
 		goto do_sigbus;
 	}
+	vm_fault_cleanup(&vmf);
 	up_read(&mm->mmap_sem);
 	return;
 bad_area:
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 381ab905eb2c..45107ddb8478 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -437,10 +437,13 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 	vm_fault_init(&vmf, vma, address, flags);
 	fault = handle_mm_fault(vma, address, flags);
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
+		vm_fault_cleanup(&vmf);
 		goto exit_exception;
+	}
 	if (unlikely(fault & VM_FAULT_ERROR)) {
+		vm_fault_cleanup(&vmf);
 		if (fault & VM_FAULT_OOM)
 			goto out_of_memory;
 		else if (fault & VM_FAULT_SIGSEGV)
@@ -472,6 +475,7 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 			goto retry;
 		}
 	}
+	vm_fault_cleanup(&vmf);
 	up_read(&mm->mmap_sem);
 	mm_rss = get_mm_rss(mm);
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index c6d9e176c5c5..419f4d54bf10 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -78,10 +78,13 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 		vm_fault_init(&vmf, vma, address, flags);
 		fault = handle_mm_fault(&vmf);
-		if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+		if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
+			vm_fault_cleanup(&vmf);
 			goto out_nosemaphore;
+		}
 		if (unlikely(fault & VM_FAULT_ERROR)) {
+			vm_fault_cleanup(&vmf);
 			if (fault & VM_FAULT_OOM) {
 				goto out_of_memory;
 			} else if (fault & VM_FAULT_SIGSEGV) {
@@ -109,6 +112,7 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 		pud = pud_offset(pgd, address);
 		pmd = pmd_offset(pud, address);
 		pte = pte_offset_kernel(pmd, address);
+		vm_fault_cleanup(&vmf);
 	} while (!pte_present(*pte));
 	err = 0;
 	/*
diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c
index 68c2b0a65348..0c94b8d5187d 100644
--- a/arch/unicore32/mm/fault.c
+++ b/arch/unicore32/mm/fault.c
@@ -262,8 +262,10 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	 * signal first. We do not need to release the mmap_sem because
 	 * it would already be released in __lock_page_or_retry in
 	 * mm/filemap.c. */
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
+		vm_fault_cleanup(&vmf);
 		return 0;
+	}
 	if (!(fault & VM_FAULT_ERROR) && (flags & FAULT_FLAG_ALLOW_RETRY)) {
 		if (fault & VM_FAULT_MAJOR)
@@ -278,6 +280,7 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 		}
 	}
+	vm_fault_cleanup(&vmf);
 	up_read(&mm->mmap_sem);
 	/*
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 9919a25b15e6..a8ea7b609697 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1410,6 +1410,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 			if (!fatal_signal_pending(tsk))
 				goto retry;
 		}
+		vm_fault_cleanup(&vmf);
 		/* User mode? Just return to handle the fatal exception */
 		if (flags & FAULT_FLAG_USER)
@@ -1420,6 +1421,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 		return;
 	}
+	vm_fault_cleanup(&vmf);
 	up_read(&mm->mmap_sem);
 	if (unlikely(fault & VM_FAULT_ERROR)) {
 		mm_fault_error(regs, error_code, address, &pkey, fault);
diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index f1b0f4f858ff..a577b73f9ca4 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -112,10 +112,13 @@ void do_page_fault(struct pt_regs *regs)
 	vm_fault_init(&vmf, vma, address, flags);
 	fault = handle_mm_fault(&vmf);
-	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
+		vm_fault_cleanup(&vmf);
 		return;
+	}
 	if (unlikely(fault & VM_FAULT_ERROR)) {
+		vm_fault_cleanup(&vmf);
 		if (fault & VM_FAULT_OOM)
 			goto out_of_memory;
 		else if (fault & VM_FAULT_SIGSEGV)
@@ -142,6 +145,7 @@ void do_page_fault(struct pt_regs *regs)
 		}
 	}
+	vm_fault_cleanup(&vmf);
 	up_read(&mm->mmap_sem);
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 	if (flags & VM_FAULT_MAJOR)
diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 129e0ef68827..fc20bbe1c0dc 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -535,6 +535,7 @@ static void do_fault(struct work_struct *work)
 	vm_fault_init(&vmf, vma, address, flags);
 	ret = handle_mm_fault(&vmf);
+	vm_fault_cleanup(&vmf);
out:
 	up_read(&mm->mmap_sem);
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 03aa02723242..614f6aab9615 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -640,6 +640,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 		vm_fault_init(&vmf, vma, address,
 			      req->wr_req ? FAULT_FLAG_WRITE : 0);
 		ret = handle_mm_fault(&vmf);
+		vm_fault_cleanup(&vmf);
 		if (ret & VM_FAULT_ERROR)
 			goto invalid;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e271c60af01a..724514be03b2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -360,6 +360,12 @@ struct vm_fault {
 					 * is set (which is also implied by
 					 * VM_FAULT_ERROR).
 					 */
+	struct page *cached_page;	/* ->fault handlers that return
+					 * VM_FAULT_RETRY can store their
+					 * previous page here to be reused the
+					 * next time we loop through the fault
+					 * handler for faster lookup.
+					 */
 	/* These three entries are valid only while holding ptl lock */
 	pte_t *pte;			/* Pointer to pte entry matching
 					 * the 'address'. NULL if the page
@@ -953,6 +959,14 @@ static inline void put_page(struct page *page)
 		__put_page(page);
 }
+static inline void vm_fault_cleanup(struct vm_fault *vmf)
+{
+	if (vmf->cached_page) {
+		put_page(vmf->cached_page);
+		vmf->cached_page = NULL;
+	}
+}
+
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define SECTION_IN_PAGE_FLAGS
 #endif
diff --git a/mm/filemap.c b/mm/filemap.c
index 65395ee132a0..49b35293fa95 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2530,13 +2530,38 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	pgoff_t offset = vmf->pgoff;
 	int flags = vmf->flags;
 	pgoff_t max_off;
-	struct page *page;
+	struct page *page = NULL;
+	struct page *cached_page = vmf->cached_page;
 	vm_fault_t ret = 0;
 	max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
 	if (unlikely(offset >= max_off))
 		return VM_FAULT_SIGBUS;
+	/*
+	 * We may have read in the page already and have a page from an earlier
+	 * loop. If so we need to see if this page is still valid, and if not
+	 * do the whole dance over again.
+	 */
+	if (cached_page) {
+		if (flags & FAULT_FLAG_KILLABLE) {
+			error = lock_page_killable(cached_page);
+			if (error) {
+				up_read(&mm->mmap_sem);
+				goto out_retry;
+			}
+		} else
+			lock_page(cached_page);
+		vmf->cached_page = NULL;
+		if (cached_page->mapping == mapping &&
+		    cached_page->index == offset) {
+			page = cached_page;
+			goto have_cached_page;
+		}
+		unlock_page(cached_page);
+		put_page(cached_page);
+	}
+
 	/*
 	 * Do we have something in the page cache already?
 	 */
@@ -2587,8 +2612,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 		put_page(page);
 		goto retry_find;
 	}
+have_cached_page:
 	VM_BUG_ON_PAGE(page->index != offset, page);
-
 	/*
 	 * We have a locked page in the page cache, now we need to check
 	 * that it's up-to-date. If not, it is going to be due to an error.
@@ -2677,7 +2702,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	if (fpin)
 		fput(fpin);
 	if (page)
-		put_page(page);
+		vmf->cached_page = page;
 	return ret | VM_FAULT_RETRY;
 }
 EXPORT_SYMBOL(filemap_fault);
diff --git a/mm/gup.c b/mm/gup.c
index c12d1e98614b..75f55f4f044c 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -518,6 +518,7 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
 	vm_fault_init(&vmf, vma, address, fault_flags);
 	ret = handle_mm_fault(&vmf);
+	vm_fault_cleanup(&vmf);
 	if (ret & VM_FAULT_ERROR) {
 		int err = vm_fault_to_errno(ret, *flags);
@@ -840,6 +841,7 @@ int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
 	if (ret & VM_FAULT_ERROR) {
 		int err = vm_fault_to_errno(ret, 0);
+		vm_fault_cleanup(&vmf);
 		if (err)
 			return err;
 		BUG();
@@ -854,6 +856,7 @@ int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
 			goto retry;
 		}
 	}
+	vm_fault_cleanup(&vmf);
 	if (tsk) {
 		if (major)
diff --git a/mm/hmm.c b/mm/hmm.c
index 695ef184a7d0..b803746745a5 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -309,6 +309,7 @@ static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr,
 	flags |= write_fault ? FAULT_FLAG_WRITE : 0;
 	vm_fault_init(&vmf, vma, addr, flags);
 	ret = handle_mm_fault(&vmf);
+	vm_fault_cleanup(&vmf);
 	if (ret & VM_FAULT_RETRY)
 		return -EBUSY;
 	if (ret & VM_FAULT_ERROR) {
diff --git a/mm/ksm.c b/mm/ksm.c
index 4b6d90357ee2..8404e230fdab 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -483,6 +483,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
 			vm_fault_init(&vmf, vma, addr,
 				      FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE);
 			ret = handle_mm_fault(&vmf);
+			vm_fault_cleanup(&vmf);
 		} else
 			ret = VM_FAULT_WRITE;
 		put_page(page);
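The hunks above all repeat one pattern, which is worth spelling out. A minimal
sketch of the caller contract this patch establishes, assuming a simplified
arch fault path (illustration only, not a hunk from the series):

/*
 * Hypothetical arch fault-handler skeleton: the vm_fault is initialized
 * once, survives the retry loop so vmf.cached_page can carry a referenced
 * page from one pass to the next, and is cleaned up on every exit path so
 * the cached page reference is never leaked.
 */
static void example_do_page_fault(struct vm_area_struct *vma,
				  unsigned long address, unsigned int flags)
{
	struct mm_struct *mm = vma->vm_mm;
	struct vm_fault vmf;
	vm_fault_t fault;

	vm_fault_init(&vmf, vma, address, flags);
retry:
	fault = handle_mm_fault(&vmf);
	if (fault & VM_FAULT_RETRY) {
		if (fatal_signal_pending(current)) {
			vm_fault_cleanup(&vmf);	/* drop any cached page */
			return;
		}
		/*
		 * mmap_sem was dropped; retake it and try once more,
		 * keeping vmf.cached_page so filemap_fault can reuse it.
		 */
		down_read(&mm->mmap_sem);
		vmf.flags &= ~FAULT_FLAG_ALLOW_RETRY;
		vmf.flags |= FAULT_FLAG_TRIED;
		goto retry;
	}
	vm_fault_cleanup(&vmf);	/* every non-retry exit frees the page */
	up_read(&mm->mmap_sem);
}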
From patchwork Tue Sep 25 15:30:10 2018
From: Josef Bacik
To: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, linux-btrfs@vger.kernel.org, riel@redhat.com, hannes@cmpxchg.org, tj@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH 7/8] mm: add a flag to indicate we used a cached page
Date: Tue, 25 Sep 2018 11:30:10 -0400
Message-Id: <20180925153011.15311-8-josef@toxicpanda.com>
In-Reply-To: <20180925153011.15311-1-josef@toxicpanda.com>
References: <20180925153011.15311-1-josef@toxicpanda.com>

This is preparation for dropping the mmap_sem in page_mkwrite. We need to
know whether we used our cached page, so we can be sure it is the page we
already did the page_mkwrite work on and don't have to redo all of that
work.

Signed-off-by: Josef Bacik
---
 include/linux/mm.h | 6 +++++-
 mm/filemap.c       | 5 ++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 724514be03b2..10a0118f5485 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -318,6 +318,9 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_USER		0x40	/* The fault originated in userspace */
 #define FAULT_FLAG_REMOTE	0x80	/* faulting for non current tsk/mm */
 #define FAULT_FLAG_INSTRUCTION	0x100	/* The fault was during an instruction fetch */
+#define FAULT_FLAG_USED_CACHED	0x200	/* Our vmf->page was from a previous
+					 * loop through the fault handler.
+					 */
 
 #define FAULT_FLAG_TRACE \
 	{ FAULT_FLAG_WRITE,		"WRITE" }, \
@@ -328,7 +331,8 @@ extern pgprot_t protection_map[16];
 	{ FAULT_FLAG_TRIED,		"TRIED" }, \
 	{ FAULT_FLAG_USER,		"USER" }, \
 	{ FAULT_FLAG_REMOTE,		"REMOTE" }, \
-	{ FAULT_FLAG_INSTRUCTION,	"INSTRUCTION" }
+	{ FAULT_FLAG_INSTRUCTION,	"INSTRUCTION" }, \
+	{ FAULT_FLAG_USED_CACHED,	"USED_CACHED" }
 
 /*
  * vm_fault is filled by the the pagefault handler and passed to the vma's
diff --git a/mm/filemap.c b/mm/filemap.c
index 49b35293fa95..75a8b252814a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2556,6 +2556,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 		if (cached_page->mapping == mapping &&
 		    cached_page->index == offset) {
 			page = cached_page;
+			vmf->flags |= FAULT_FLAG_USED_CACHED;
 			goto have_cached_page;
 		}
 		unlock_page(cached_page);
@@ -2618,8 +2619,10 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	 * We have a locked page in the page cache, now we need to check
 	 * that it's up-to-date. If not, it is going to be due to an error.
 	 */
-	if (unlikely(!PageUptodate(page)))
+	if (unlikely(!PageUptodate(page))) {
+		vmf->flags &= ~(FAULT_FLAG_USED_CACHED);
 		goto page_not_uptodate;
+	}
 
 	/*
 	 * Found the page and have a reference on it.
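The intended consumer of this flag is a filesystem's ->page_mkwrite handler.
As a hedged illustration (a hypothetical handler, not code from this patch;
the next patch shows the real btrfs version), the flag would be used roughly
like this:

/*
 * Hypothetical ->page_mkwrite handler: if FAULT_FLAG_USED_CACHED is set,
 * vmf->page is the page we already prepared on a previous pass through
 * the fault handler, so the expensive setup can be skipped when it is
 * still valid.
 */
static vm_fault_t example_page_mkwrite(struct vm_fault *vmf)
{
	struct page *page = vmf->page;

	if (vmf->flags & FAULT_FLAG_USED_CACHED) {
		lock_page(page);
		if (PageDirty(page))
			return VM_FAULT_LOCKED;	/* earlier work still valid */
		unlock_page(page);
	}

	/* ... the expensive reservation/writeback work goes here ... */

	lock_page(page);
	set_page_dirty(page);
	return VM_FAULT_LOCKED;
}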
From patchwork Tue Sep 25 15:30:11 2018
From: Josef Bacik
To: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, linux-btrfs@vger.kernel.org, riel@redhat.com, hannes@cmpxchg.org, tj@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH 8/8] btrfs: drop mmap_sem in mkwrite for btrfs
Date: Tue, 25 Sep 2018 11:30:11 -0400
Message-Id: <20180925153011.15311-9-josef@toxicpanda.com>
In-Reply-To: <20180925153011.15311-1-josef@toxicpanda.com>
References: <20180925153011.15311-1-josef@toxicpanda.com>

->page_mkwrite is extremely expensive in btrfs.
We have to reserve space, which can take 6 lifetimes, and we could have to
wait on writeback on the page, another several lifetimes. To avoid this,
simply drop the mmap_sem if we didn't have the cached page, do all of our
work, and return the appropriate retry error. If we have the cached page we
know we did all the right things to set this page up and we can just carry
on.

Signed-off-by: Josef Bacik
---
 fs/btrfs/inode.c   | 40 ++++++++++++++++++++++++++++++++++++++--
 include/linux/mm.h | 14 ++++++++++++++
 mm/filemap.c       |  3 ++-
 3 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3ea5339603cf..34c33b96d335 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8809,7 +8809,9 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
 {
 	struct page *page = vmf->page;
-	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct file *file = vmf->vma->vm_file, *fpin;
+	struct mm_struct *mm = vmf->vma->vm_mm;
+	struct inode *inode = file_inode(file);
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
 	struct btrfs_ordered_extent *ordered;
@@ -8828,6 +8830,29 @@ vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
 
 	reserved_space = PAGE_SIZE;
 
+	/*
+	 * We have our cached page from a previous mkwrite, check it to make
+	 * sure it's still dirty and our file size matches when we ran mkwrite
+	 * the last time. If everything is OK then return VM_FAULT_LOCKED,
+	 * otherwise do the mkwrite again.
+	 */
+	if (vmf->flags & FAULT_FLAG_USED_CACHED) {
+		lock_page(page);
+		if (vmf->cached_size == i_size_read(inode) &&
+		    PageDirty(page))
+			return VM_FAULT_LOCKED;
+		unlock_page(page);
+	}
+
+	/*
+	 * mkwrite is extremely expensive, and we are holding the mmap_sem
+	 * during this, which means we can starve out anybody trying to
+	 * down_write(mmap_sem) for a long while, especially if we throw
+	 * cgroups into the mix. So just drop the mmap_sem and do all of our
+	 * work, we'll loop back through and verify everything is ok the next
+	 * time and hopefully avoid doing the work twice.
+	 */
+	fpin = maybe_unlock_mmap_for_io(vmf->vma, vmf->flags);
 	sb_start_pagefault(inode->i_sb);
 	page_start = page_offset(page);
 	page_end = page_start + PAGE_SIZE - 1;
@@ -8844,7 +8869,7 @@ vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
 	ret2 = btrfs_delalloc_reserve_space(inode, &data_reserved, page_start,
 					    reserved_space);
 	if (!ret2) {
-		ret2 = file_update_time(vmf->vma->vm_file);
+		ret2 = file_update_time(file);
 		reserved = 1;
 	}
 	if (ret2) {
@@ -8943,6 +8968,13 @@ vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
 	btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE, true);
 	sb_end_pagefault(inode->i_sb);
 	extent_changeset_free(data_reserved);
+	if (fpin) {
+		unlock_page(page);
+		fput(fpin);
+		vmf->cached_size = size;
+		down_read(&mm->mmap_sem);
+		return VM_FAULT_RETRY;
+	}
 	return VM_FAULT_LOCKED;
 }
 
@@ -8955,6 +8987,10 @@ vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
 out_noreserve:
 	sb_end_pagefault(inode->i_sb);
 	extent_changeset_free(data_reserved);
+	if (fpin) {
+		fput(fpin);
+		down_read(&mm->mmap_sem);
+	}
 	return ret;
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 724514be03b2..b9ad6cb3de84 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -370,6 +370,13 @@ struct vm_fault {
 					 * next time we loop through the fault
 					 * handler for faster lookup.
 					 */
+	loff_t cached_size;		/* ->page_mkwrite handlers may drop
+					 * the mmap_sem to avoid starvation, in
+					 * which case they need to save the
+					 * i_size in order to verify the cached
+					 * page we're using the next loop
+					 * through hasn't changed under us.
+					 */
 	/* These three entries are valid only while holding ptl lock */
 	pte_t *pte;			/* Pointer to pte entry matching
 					 * the 'address'. NULL if the page
@@ -1435,6 +1442,8 @@ extern vm_fault_t handle_mm_fault(struct vm_fault *vmf);
 extern int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
 			    unsigned long address, unsigned int fault_flags,
 			    bool *unlocked);
+extern struct file *maybe_unlock_mmap_for_io(struct vm_area_struct *vma,
+					     int flags);
 void unmap_mapping_pages(struct address_space *mapping,
 		pgoff_t start, pgoff_t nr, bool even_cows);
 void unmap_mapping_range(struct address_space *mapping,
@@ -1454,6 +1463,11 @@ static inline int fixup_user_fault(struct task_struct *tsk,
 	BUG();
 	return -EFAULT;
 }
+static inline struct file *maybe_unlock_mmap_for_io(struct vm_area_struct *vma,
+						    int flags)
+{
+	return NULL;
+}
 static inline void unmap_mapping_pages(struct address_space *mapping,
 		pgoff_t start, pgoff_t nr, bool even_cows) { }
 static inline void unmap_mapping_range(struct address_space *mapping,
diff --git a/mm/filemap.c b/mm/filemap.c
index 75a8b252814a..748c696d23af 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2366,7 +2366,7 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 EXPORT_SYMBOL(generic_file_read_iter);
 
 #ifdef CONFIG_MMU
-static struct file *maybe_unlock_mmap_for_io(struct vm_area_struct *vma, int flags)
+struct file *maybe_unlock_mmap_for_io(struct vm_area_struct *vma, int flags)
 {
 	if ((flags & (FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT)) ==
 	    FAULT_FLAG_ALLOW_RETRY) {
 		struct file *file;
@@ -2377,6 +2377,7 @@ static struct file *maybe_unlock_mmap_for_io(struct vm_area_struct *vma, int fla
 	}
 	return NULL;
 }
+EXPORT_SYMBOL_GPL(maybe_unlock_mmap_for_io);
 
 /**
  * page_cache_read - adds requested page to the page cache if not already there
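Putting the series together, the intended round trip looks roughly like the
sketch below. It is a condensed illustration under the series' assumptions,
not the literal btrfs code: the first pass drops the mmap_sem around the
expensive work and returns VM_FAULT_RETRY; the second pass sees
FAULT_FLAG_USED_CACHED on the page filemap_fault stashed in vmf->cached_page,
revalidates it cheaply, and finishes.

/*
 * Hypothetical, condensed ->page_mkwrite retry flow (illustration only).
 */
static vm_fault_t sketch_page_mkwrite(struct vm_fault *vmf)
{
	struct inode *inode = file_inode(vmf->vma->vm_file);
	struct file *fpin;

	/* Second pass: the cached page was already set up last time. */
	if (vmf->flags & FAULT_FLAG_USED_CACHED) {
		lock_page(vmf->page);
		if (vmf->cached_size == i_size_read(inode) &&
		    PageDirty(vmf->page))
			return VM_FAULT_LOCKED;
		unlock_page(vmf->page);
	}

	/* First pass: don't hold the mmap_sem across the expensive work. */
	fpin = maybe_unlock_mmap_for_io(vmf->vma, vmf->flags);

	/* ... reserve space, wait on writeback, dirty the page ... */

	if (fpin) {
		unlock_page(vmf->page);
		fput(fpin);
		vmf->cached_size = i_size_read(inode);
		down_read(&vmf->vma->vm_mm->mmap_sem);
		return VM_FAULT_RETRY;	/* fault handler loops back to us */
	}
	return VM_FAULT_LOCKED;
}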