From patchwork Mon Jul 24 18:54:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 13325246 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BF7CDC0015E for ; Mon, 24 Jul 2023 18:54:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229461AbjGXSyS (ORCPT ); Mon, 24 Jul 2023 14:54:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55368 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229486AbjGXSyS (ORCPT ); Mon, 24 Jul 2023 14:54:18 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B1E0410E2 for ; Mon, 24 Jul 2023 11:54:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=2AZT1+3agWdrOE7nqGCVHOcFr0jkcGTHBHWs2FPeDAM=; b=n5GXwO6Dr8M+Aa3h7HKboWgfbn O6jzDiQok9xwJwtnCZMhuc2dh0YtV6FQJWpLd1iAAgzkjij5vatuLMlggbmcGT/XPMZ5rEC1vbHG+ dIDesgUfaQ31ZEBPWNRwAWCW7XVkC7uAv34PynBov/6Mr3o2ot8lA2yNDCI2J/npI4AdVaRoHR24h msmW+juCmkG/E4EQ2AvmRvbvwxI7rZZVKE+YjdaOaFJRREPwbl1O4kwNhrasi+oMpo2IRxx8u2dYI KNnMexdkEPHhY7srT/Gv6v0BYss34lJgRGAKi58np38ju4aZRpssnyqcmRTW7YggNW3h9jdjWCyBC 98GOUPDQ==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1qO0hA-004iR1-25; Mon, 24 Jul 2023 18:54:12 +0000 From: "Matthew Wilcox (Oracle)" To: Andrew Morton Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Punit Agrawal , Suren Baghdasaryan Subject: [PATCH v3 01/10] mm: Remove CONFIG_PER_VMA_LOCK ifdefs Date: Mon, 24 Jul 2023 19:54:01 +0100 Message-Id: <20230724185410.1124082-2-willy@infradead.org> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20230724185410.1124082-1-willy@infradead.org> References: <20230724185410.1124082-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Provide lock_vma_under_rcu() when CONFIG_PER_VMA_LOCK is not defined to eliminate ifdefs in the users. Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Suren Baghdasaryan --- arch/arm64/mm/fault.c | 2 -- arch/powerpc/mm/fault.c | 4 ---- arch/riscv/mm/fault.c | 4 ---- arch/s390/mm/fault.c | 2 -- arch/x86/mm/fault.c | 4 ---- include/linux/mm.h | 6 ++++++ 6 files changed, 6 insertions(+), 16 deletions(-) diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index b8c80f7b8a5f..2e5d1e238af9 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -587,7 +587,6 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr, perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr); -#ifdef CONFIG_PER_VMA_LOCK if (!(mm_flags & FAULT_FLAG_USER)) goto lock_mmap; @@ -616,7 +615,6 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr, return 0; } lock_mmap: -#endif /* CONFIG_PER_VMA_LOCK */ retry: vma = lock_mm_and_find_vma(mm, addr, regs); diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 82954d0e6906..b1723094d464 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -469,7 +469,6 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address, if (is_exec) flags |= FAULT_FLAG_INSTRUCTION; -#ifdef CONFIG_PER_VMA_LOCK if (!(flags & FAULT_FLAG_USER)) goto lock_mmap; @@ -502,7 +501,6 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address, return user_mode(regs) ? 0 : SIGBUS; lock_mmap: -#endif /* CONFIG_PER_VMA_LOCK */ /* When running in the kernel we expect faults to occur only to * addresses in user space. All other faults represent errors in the @@ -552,9 +550,7 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address, mmap_read_unlock(current->mm); -#ifdef CONFIG_PER_VMA_LOCK done: -#endif if (unlikely(fault & VM_FAULT_ERROR)) return mm_fault_error(regs, address, fault); diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 6ea2cce4cc17..046732fcb48c 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -283,7 +283,6 @@ void handle_page_fault(struct pt_regs *regs) flags |= FAULT_FLAG_WRITE; else if (cause == EXC_INST_PAGE_FAULT) flags |= FAULT_FLAG_INSTRUCTION; -#ifdef CONFIG_PER_VMA_LOCK if (!(flags & FAULT_FLAG_USER)) goto lock_mmap; @@ -311,7 +310,6 @@ void handle_page_fault(struct pt_regs *regs) return; } lock_mmap: -#endif /* CONFIG_PER_VMA_LOCK */ retry: vma = lock_mm_and_find_vma(mm, addr, regs); @@ -368,9 +366,7 @@ void handle_page_fault(struct pt_regs *regs) mmap_read_unlock(mm); -#ifdef CONFIG_PER_VMA_LOCK done: -#endif if (unlikely(fault & VM_FAULT_ERROR)) { tsk->thread.bad_cause = cause; mm_fault_error(regs, addr, fault); diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c index 2b953625a517..a063774ba584 100644 --- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -407,7 +407,6 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) access = VM_WRITE; if (access == VM_WRITE) flags |= FAULT_FLAG_WRITE; -#ifdef CONFIG_PER_VMA_LOCK if (!(flags & FAULT_FLAG_USER)) goto lock_mmap; vma = lock_vma_under_rcu(mm, address); @@ -433,7 +432,6 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) goto out; } lock_mmap: -#endif /* CONFIG_PER_VMA_LOCK */ mmap_read_lock(mm); gmap = NULL; diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 56b4f9faf8c4..2e861b9360c7 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1328,7 +1328,6 @@ void do_user_addr_fault(struct pt_regs *regs, } #endif -#ifdef CONFIG_PER_VMA_LOCK if (!(flags & FAULT_FLAG_USER)) goto lock_mmap; @@ -1359,7 +1358,6 @@ void do_user_addr_fault(struct pt_regs *regs, return; } lock_mmap: -#endif /* CONFIG_PER_VMA_LOCK */ retry: vma = lock_mm_and_find_vma(mm, address, regs); @@ -1419,9 +1417,7 @@ void do_user_addr_fault(struct pt_regs *regs, } mmap_read_unlock(mm); -#ifdef CONFIG_PER_VMA_LOCK done: -#endif if (likely(!(fault & VM_FAULT_ERROR))) return; diff --git a/include/linux/mm.h b/include/linux/mm.h index a5d68baea231..89f80e2c6ed9 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -753,6 +753,12 @@ static inline void assert_fault_locked(struct vm_fault *vmf) mmap_assert_locked(vmf->vma->vm_mm); } +static inline struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, + unsigned long address) +{ + return NULL; +} + #endif /* CONFIG_PER_VMA_LOCK */ /* From patchwork Mon Jul 24 18:54:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 13325254 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B436EB64DD for ; Mon, 24 Jul 2023 18:54:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230049AbjGXSyj (ORCPT ); Mon, 24 Jul 2023 14:54:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55476 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229598AbjGXSyi (ORCPT ); Mon, 24 Jul 2023 14:54:38 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 139AAE55 for ; Mon, 24 Jul 2023 11:54:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=3AerkyW6jQd6y4z+nJVqv9X/P9KyvfBPmyU8zyX0gIY=; b=T5WZLqosM499I02fiX0L+Oh2WN UVgaWQNODE3toYL7Hkf9TqP02b83nLnWZK0d16D8E2EBKem+n5svfi/BWGCMHr9+JqGgq9wSB2cUZ Ul2RAEhjDKUkApuBDNwSO4vEr1GQ0rCAqJg/tEc/UP5d6/RpdIbRYVXtlvVBCJdZzViD0p6t3MFTR W2zX2H9kNr37fSa1DoYb3/SnNFsASa0KmyalF7lugxtR93Ykk3scERqFti65kmwu26eMSGo7evEHg ES2YTHDMR7hSv6+QZd3gF6jnzLamcbnFtqXQuCMCsUrYKjAotodfYGgMlts3LXVKYhDeRDybzsjUI 9t1fww6Q==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1qO0hA-004iR3-4n; Mon, 24 Jul 2023 18:54:12 +0000 From: "Matthew Wilcox (Oracle)" To: Andrew Morton Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Punit Agrawal , Suren Baghdasaryan , Arjun Roy , Eric Dumazet Subject: [PATCH v3 02/10] mm: Allow per-VMA locks on file-backed VMAs Date: Mon, 24 Jul 2023 19:54:02 +0100 Message-Id: <20230724185410.1124082-3-willy@infradead.org> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20230724185410.1124082-1-willy@infradead.org> References: <20230724185410.1124082-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Remove the TCP layering violation by allowing per-VMA locks on all VMAs. The fault path will immediately fail in handle_mm_fault(). There may be a small performance reduction from this patch as a little unnecessary work will be done on each page fault. See later patches for the improvement. Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Suren Baghdasaryan Cc: Arjun Roy Cc: Eric Dumazet --- MAINTAINERS | 1 - include/linux/net_mm.h | 17 ----------------- include/net/tcp.h | 1 - mm/memory.c | 12 ++++++------ net/ipv4/tcp.c | 11 ++++------- 5 files changed, 10 insertions(+), 32 deletions(-) delete mode 100644 include/linux/net_mm.h diff --git a/MAINTAINERS b/MAINTAINERS index 220566cd8da8..f5f5134219b4 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -14874,7 +14874,6 @@ NETWORKING [TCP] M: Eric Dumazet L: netdev@vger.kernel.org S: Maintained -F: include/linux/net_mm.h F: include/linux/tcp.h F: include/net/tcp.h F: include/trace/events/tcp.h diff --git a/include/linux/net_mm.h b/include/linux/net_mm.h deleted file mode 100644 index b298998bd5a0..000000000000 --- a/include/linux/net_mm.h +++ /dev/null @@ -1,17 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0-or-later */ -#ifdef CONFIG_MMU - -#ifdef CONFIG_INET -extern const struct vm_operations_struct tcp_vm_ops; -static inline bool vma_is_tcp(const struct vm_area_struct *vma) -{ - return vma->vm_ops == &tcp_vm_ops; -} -#else -static inline bool vma_is_tcp(const struct vm_area_struct *vma) -{ - return false; -} -#endif /* CONFIG_INET*/ - -#endif /* CONFIG_MMU */ diff --git a/include/net/tcp.h b/include/net/tcp.h index d17cb8ab4c48..9d1d312c90c2 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -45,7 +45,6 @@ #include #include #include -#include extern struct inet_hashinfo tcp_hashinfo; diff --git a/mm/memory.c b/mm/memory.c index 7ff73a197cce..c7ad754dd8ed 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -77,7 +77,6 @@ #include #include #include -#include #include @@ -5362,6 +5361,11 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address, goto out; } + if ((flags & FAULT_FLAG_VMA_LOCK) && !vma_is_anonymous(vma)) { + vma_end_read(vma); + return VM_FAULT_RETRY; + } + /* * Enable the memcg OOM handling for faults triggered in user * space. Kernel faults are handled more gracefully. @@ -5533,12 +5537,8 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, if (!vma) goto inval; - /* Only anonymous and tcp vmas are supported for now */ - if (!vma_is_anonymous(vma) && !vma_is_tcp(vma)) - goto inval; - /* find_mergeable_anon_vma uses adjacent vmas which are not locked */ - if (!vma->anon_vma && !vma_is_tcp(vma)) + if (vma_is_anonymous(vma) && !vma->anon_vma) goto inval; if (!vma_start_read(vma)) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index aca5620cf3ba..f7391d017c4c 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1740,7 +1740,7 @@ void tcp_update_recv_tstamps(struct sk_buff *skb, } #ifdef CONFIG_MMU -const struct vm_operations_struct tcp_vm_ops = { +static const struct vm_operations_struct tcp_vm_ops = { }; int tcp_mmap(struct file *file, struct socket *sock, @@ -2043,13 +2043,10 @@ static struct vm_area_struct *find_tcp_vma(struct mm_struct *mm, unsigned long address, bool *mmap_locked) { - struct vm_area_struct *vma = NULL; + struct vm_area_struct *vma = lock_vma_under_rcu(mm, address); -#ifdef CONFIG_PER_VMA_LOCK - vma = lock_vma_under_rcu(mm, address); -#endif if (vma) { - if (!vma_is_tcp(vma)) { + if (vma->vm_ops != &tcp_vm_ops) { vma_end_read(vma); return NULL; } @@ -2059,7 +2056,7 @@ static struct vm_area_struct *find_tcp_vma(struct mm_struct *mm, mmap_read_lock(mm); vma = vma_lookup(mm, address); - if (!vma || !vma_is_tcp(vma)) { + if (!vma || vma->vm_ops != &tcp_vm_ops) { mmap_read_unlock(mm); return NULL; } From patchwork Mon Jul 24 18:54:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 13325252 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52240C0015E for ; Mon, 24 Jul 2023 18:54:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230126AbjGXSyh (ORCPT ); Mon, 24 Jul 2023 14:54:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55442 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230049AbjGXSyc (ORCPT ); Mon, 24 Jul 2023 14:54:32 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 27318188 for ; Mon, 24 Jul 2023 11:54:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=TWTBYufoYSHkxfGZKIaPme2dTr/Pf10i6Hfp7/PcTWM=; b=EInUKGfNGCuTTNsKCFRp6uzlB9 1fJcmDCG3/jqX0aR3pLqeB7Hfy3fyyoldC/a2lX8d4E4Z+0RA6Dqfj4qjuAvlFXSEhACCF4KIS8YA CfoD9MK242rPZ7Zi1CapBwX3l2XrXyD8sHIWtl7fQT15/dJnIuqKFa9jwI3D0IFIxQXiGB/eIyGQW MgbFEpHyXrSqrRmNd7OZcOekNHH53oejJDKO5zaI1nU3ilkddAlysN4Cviibn2M2A7528Ffzmggtq yof80Uewlt6OaQZ2zNYLFnAa9mWwu7VMTOfLdGS0Qrn4Gfz4V8UeeqDrY3cehYVPDsJXVVYK+Ty+y iaQQTgqA==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1qO0hA-004iR5-7U; Mon, 24 Jul 2023 18:54:12 +0000 From: "Matthew Wilcox (Oracle)" To: Andrew Morton Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Punit Agrawal , Suren Baghdasaryan Subject: [PATCH v3 03/10] mm: Move FAULT_FLAG_VMA_LOCK check from handle_mm_fault() Date: Mon, 24 Jul 2023 19:54:03 +0100 Message-Id: <20230724185410.1124082-4-willy@infradead.org> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20230724185410.1124082-1-willy@infradead.org> References: <20230724185410.1124082-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Handle a little more of the page fault path outside the mmap sem. The hugetlb path doesn't need to check whether the VMA is anonymous; the VM_HUGETLB flag is only set on hugetlbfs VMAs. There should be no performance change from the previous commit; this is simply a step to ease bisection of any problems. Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Suren Baghdasaryan --- mm/hugetlb.c | 6 ++++++ mm/memory.c | 18 +++++++++--------- 2 files changed, 15 insertions(+), 9 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 154cc5b31572..e327a5a7602c 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6089,6 +6089,12 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, int need_wait_lock = 0; unsigned long haddr = address & huge_page_mask(h); + /* TODO: Handle faults under the VMA lock */ + if (flags & FAULT_FLAG_VMA_LOCK) { + vma_end_read(vma); + return VM_FAULT_RETRY; + } + /* * Serialize hugepage allocation and instantiation, so that we don't * get spurious allocation failures if two CPUs race to instantiate diff --git a/mm/memory.c b/mm/memory.c index c7ad754dd8ed..5ca8902b6f67 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -5112,10 +5112,10 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) } /* - * By the time we get here, we already hold the mm semaphore - * - * The mmap_lock may have been released depending on flags and our - * return value. See filemap_fault() and __folio_lock_or_retry(). + * On entry, we hold either the VMA lock or the mmap_lock + * (FAULT_FLAG_VMA_LOCK tells you which). If VM_FAULT_RETRY is set in + * the result, the mmap_lock is not held on exit. See filemap_fault() + * and __folio_lock_or_retry(). */ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, unsigned long address, unsigned int flags) @@ -5134,6 +5134,11 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, p4d_t *p4d; vm_fault_t ret; + if ((flags & FAULT_FLAG_VMA_LOCK) && !vma_is_anonymous(vma)) { + vma_end_read(vma); + return VM_FAULT_RETRY; + } + pgd = pgd_offset(mm, address); p4d = p4d_alloc(mm, pgd, address); if (!p4d) @@ -5361,11 +5366,6 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address, goto out; } - if ((flags & FAULT_FLAG_VMA_LOCK) && !vma_is_anonymous(vma)) { - vma_end_read(vma); - return VM_FAULT_RETRY; - } - /* * Enable the memcg OOM handling for faults triggered in user * space. Kernel faults are handled more gracefully. From patchwork Mon Jul 24 18:54:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 13325256 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 946BCEB64DD for ; Mon, 24 Jul 2023 18:54:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230232AbjGXSyp (ORCPT ); Mon, 24 Jul 2023 14:54:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55504 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230197AbjGXSyo (ORCPT ); Mon, 24 Jul 2023 14:54:44 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3A07410E2 for ; Mon, 24 Jul 2023 11:54:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=/dr8E7Ysfsh/pzxHk8PrgytRhAkkfjQGUpicX+vSfvw=; b=h2urfpoJlH+/tTgmv+1TWIjF7V SHPiUiKpC2n086K/Blscs2mtWtATwrhaglAB7qSgdBQvKMNtIPQV/UFDUBNnXBCJKZqS02PMxZDhN QIpfedqmho8nh2D698VotNmU0q5qTKALJGUD/wc0E58tg4ZCE4Gn2rhzDLRFYpHGnf4jBnK69OxHr fBw6gic1/us6lWW6l7ExzoNHq56D4JqqkX3VwhzEJSKvB1yN9l0CoM6WoGz83conZI3w9cwm2x84i Zc3bYa8EqN9n54f/4wlFuqSHLh/NnBnmRxHDAymVbpKp5GzpcpfCkw2wIAESzGP7wGjHvSVs7xgCq /JBsnc2A==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1qO0hA-004iR7-AG; Mon, 24 Jul 2023 18:54:12 +0000 From: "Matthew Wilcox (Oracle)" To: Andrew Morton Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Punit Agrawal Subject: [PATCH v3 04/10] mm: Handle PUD faults under the VMA lock Date: Mon, 24 Jul 2023 19:54:04 +0100 Message-Id: <20230724185410.1124082-5-willy@infradead.org> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20230724185410.1124082-1-willy@infradead.org> References: <20230724185410.1124082-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Postpone checking the VMA_LOCK flag until we've attempted to handle faults on PUDs. There's a mild upside to this patch in that we'll allocate the page tables while under the VMA lock rather than the mmap lock, reducing the hold time on the mmap lock, since the retry will find the page tables already populated. The real purpose here is to make a commit that shows we don't call ->huge_fault under the VMA lock. We do now handle setting the accessed bit on a PUD fault under the VMA lock, but that doesn't seem likely to be a measurable difference. Signed-off-by: Matthew Wilcox (Oracle) --- mm/memory.c | 37 ++++++++++++++++++++++++------------- 1 file changed, 24 insertions(+), 13 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 5ca8902b6f67..7fec616f490b 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4987,11 +4987,17 @@ static vm_fault_t create_huge_pud(struct vm_fault *vmf) { #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && \ defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) + struct vm_area_struct *vma = vmf->vma; /* No support for anonymous transparent PUD pages yet */ - if (vma_is_anonymous(vmf->vma)) + if (vma_is_anonymous(vma)) return VM_FAULT_FALLBACK; - if (vmf->vma->vm_ops->huge_fault) - return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD); + if (vma->vm_ops->huge_fault) { + if (vmf->flags & FAULT_FLAG_VMA_LOCK) { + vma_end_read(vma); + return VM_FAULT_RETRY; + } + return vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD); + } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ return VM_FAULT_FALLBACK; } @@ -5000,21 +5006,26 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud) { #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && \ defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) + struct vm_area_struct *vma = vmf->vma; vm_fault_t ret; /* No support for anonymous transparent PUD pages yet */ - if (vma_is_anonymous(vmf->vma)) + if (vma_is_anonymous(vma)) goto split; - if (vmf->vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) { - if (vmf->vma->vm_ops->huge_fault) { - ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD); + if (vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) { + if (vma->vm_ops->huge_fault) { + if (vmf->flags & FAULT_FLAG_VMA_LOCK) { + vma_end_read(vma); + return VM_FAULT_RETRY; + } + ret = vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD); if (!(ret & VM_FAULT_FALLBACK)) return ret; } } split: /* COW or write-notify not handled on PUD level: split pud.*/ - __split_huge_pud(vmf->vma, vmf->pud, vmf->address); + __split_huge_pud(vma, vmf->pud, vmf->address); #endif /* CONFIG_TRANSPARENT_HUGEPAGE && CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */ return VM_FAULT_FALLBACK; } @@ -5134,11 +5145,6 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, p4d_t *p4d; vm_fault_t ret; - if ((flags & FAULT_FLAG_VMA_LOCK) && !vma_is_anonymous(vma)) { - vma_end_read(vma); - return VM_FAULT_RETRY; - } - pgd = pgd_offset(mm, address); p4d = p4d_alloc(mm, pgd, address); if (!p4d) @@ -5182,6 +5188,11 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, if (pud_trans_unstable(vmf.pud)) goto retry_pud; + if ((flags & FAULT_FLAG_VMA_LOCK) && !vma_is_anonymous(vma)) { + vma_end_read(vma); + return VM_FAULT_RETRY; + } + if (pmd_none(*vmf.pmd) && hugepage_vma_check(vma, vm_flags, false, true, true)) { ret = create_huge_pmd(&vmf); From patchwork Mon Jul 24 18:54:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 13325251 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0EA97C00528 for ; Mon, 24 Jul 2023 18:54:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230084AbjGXSyb (ORCPT ); Mon, 24 Jul 2023 14:54:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55422 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229995AbjGXSy3 (ORCPT ); Mon, 24 Jul 2023 14:54:29 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1181510E3 for ; Mon, 24 Jul 2023 11:54:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=hJu05/6R+2/+R58/bqnW2667sqrjvy4Vo+FjhIXzCu0=; b=EL5lMUFnNoPCZMe0eS0Hvijf+2 j32meE393cAjsq10i8jRMBUAvHcjjnw4BZyBiVirhYCjv44Xl4PLoH/sWyaqKSIwj0MCPc3w4382H sMSy+UNGoiKw8jyI5NZzdHT00TMqbw5mU8dWWw9oQPdxKe0U3T6zDJ0FWgJfc8FxlIPQcYk7vEW/o ZuTt6UHN7KrKdU+56Av7l32r4Eo7vG1SZaOKkWH+D5hZLNwYGjYrfh6DSX0Lc98neqYSZBttPU1cN Ul9efZMKPBrSxs/5R+8v5NbJUr06vyCHOrd6yf9uuxA58P9XcM8NezBphhwdnh8O+svpoAzp2Swxq rE7Bc2kA==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1qO0hA-004iR9-Cu; Mon, 24 Jul 2023 18:54:12 +0000 From: "Matthew Wilcox (Oracle)" To: Andrew Morton Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Punit Agrawal Subject: [PATCH v3 05/10] mm: Handle some PMD faults under the VMA lock Date: Mon, 24 Jul 2023 19:54:05 +0100 Message-Id: <20230724185410.1124082-6-willy@infradead.org> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20230724185410.1124082-1-willy@infradead.org> References: <20230724185410.1124082-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Push the VMA_LOCK check down from __handle_mm_fault() to handle_pte_fault(). Once again, we refuse to call ->huge_fault() with the VMA lock held, but we will wait for a PMD migration entry with the VMA lock held, handle NUMA migration and set the accessed bit. We were already doing this for anonymous VMAs, so it should be safe. Signed-off-by: Matthew Wilcox (Oracle) --- mm/memory.c | 39 +++++++++++++++++++++++++-------------- 1 file changed, 25 insertions(+), 14 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 7fec616f490b..9e4dd65e06ac 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4949,36 +4949,47 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf) { - if (vma_is_anonymous(vmf->vma)) + struct vm_area_struct *vma = vmf->vma; + if (vma_is_anonymous(vma)) return do_huge_pmd_anonymous_page(vmf); - if (vmf->vma->vm_ops->huge_fault) - return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD); + if (vma->vm_ops->huge_fault) { + if (vmf->flags & FAULT_FLAG_VMA_LOCK) { + vma_end_read(vma); + return VM_FAULT_RETRY; + } + return vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD); + } return VM_FAULT_FALLBACK; } /* `inline' is required to avoid gcc 4.1.2 build error */ static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf) { + struct vm_area_struct *vma = vmf->vma; const bool unshare = vmf->flags & FAULT_FLAG_UNSHARE; vm_fault_t ret; - if (vma_is_anonymous(vmf->vma)) { + if (vma_is_anonymous(vma)) { if (likely(!unshare) && - userfaultfd_huge_pmd_wp(vmf->vma, vmf->orig_pmd)) + userfaultfd_huge_pmd_wp(vma, vmf->orig_pmd)) return handle_userfault(vmf, VM_UFFD_WP); return do_huge_pmd_wp_page(vmf); } - if (vmf->vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) { - if (vmf->vma->vm_ops->huge_fault) { - ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD); + if (vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) { + if (vma->vm_ops->huge_fault) { + if (vmf->flags & FAULT_FLAG_VMA_LOCK) { + vma_end_read(vma); + return VM_FAULT_RETRY; + } + ret = vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD); if (!(ret & VM_FAULT_FALLBACK)) return ret; } } /* COW or write-notify handled on pte level: split pmd. */ - __split_huge_pmd(vmf->vma, vmf->pmd, vmf->address, false, NULL); + __split_huge_pmd(vma, vmf->pmd, vmf->address, false, NULL); return VM_FAULT_FALLBACK; } @@ -5049,6 +5060,11 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) { pte_t entry; + if ((vmf->flags & FAULT_FLAG_VMA_LOCK) && !vma_is_anonymous(vmf->vma)) { + vma_end_read(vmf->vma); + return VM_FAULT_RETRY; + } + if (unlikely(pmd_none(*vmf->pmd))) { /* * Leave __pte_alloc() until later: because vm_ops->fault may @@ -5188,11 +5204,6 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, if (pud_trans_unstable(vmf.pud)) goto retry_pud; - if ((flags & FAULT_FLAG_VMA_LOCK) && !vma_is_anonymous(vma)) { - vma_end_read(vma); - return VM_FAULT_RETRY; - } - if (pmd_none(*vmf.pmd) && hugepage_vma_check(vma, vm_flags, false, true, true)) { ret = create_huge_pmd(&vmf); From patchwork Mon Jul 24 18:54:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 13325255 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6E515C001DF for ; Mon, 24 Jul 2023 18:54:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230204AbjGXSyo (ORCPT ); Mon, 24 Jul 2023 14:54:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55488 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229598AbjGXSyl (ORCPT ); Mon, 24 Jul 2023 14:54:41 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2BF0BE55 for ; Mon, 24 Jul 2023 11:54:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=j/+5nvHmLcQr2gQNLZnma/P8C+KH0GEPUfvfP6aRFCA=; b=BFJKXQAH8Ld9mSwUnjlCJ9Cgoh fHUbEJMazDTd4xQQptvmIIdTFyaEH2YONUrjuOkX0UZQQ3TTYhtjroUHubNIxcQHwaE5xTUDqOCJs 0VU0ypfcwtPQHyWFq4J4BuPx+zIDlFkLgft+yQUGk+0ivxv9RpxA2BtSphe+v5RvSEm05/oiXwvrR /Af8Ms5TsZWa/9POSZcp9IVBoKd4YybnnKWboaz7rRkZcWp7CN8Ot7LXch5abDyrcgIJSZMVUI+7z zu1lLMHI/+0UzRSBkH2VswJg10brdZ4gtKKnSmFrNxMIEaU9Dx578le2dcBm6S3Rj+jhQQWKd8ygJ E9UgB4KA==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1qO0hA-004iRB-Fc; Mon, 24 Jul 2023 18:54:12 +0000 From: "Matthew Wilcox (Oracle)" To: Andrew Morton Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Punit Agrawal , Suren Baghdasaryan Subject: [PATCH v3 06/10] mm: Move FAULT_FLAG_VMA_LOCK check down in handle_pte_fault() Date: Mon, 24 Jul 2023 19:54:06 +0100 Message-Id: <20230724185410.1124082-7-willy@infradead.org> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20230724185410.1124082-1-willy@infradead.org> References: <20230724185410.1124082-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Call do_pte_missing() under the VMA lock ... then immediately retry in do_fault(). Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Suren Baghdasaryan --- mm/memory.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 9e4dd65e06ac..11b337876477 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4775,6 +4775,11 @@ static vm_fault_t do_fault(struct vm_fault *vmf) struct mm_struct *vm_mm = vma->vm_mm; vm_fault_t ret; + if (vmf->flags & FAULT_FLAG_VMA_LOCK){ + vma_end_read(vma); + return VM_FAULT_RETRY; + } + /* * The VMA was not fully populated on mmap() or missing VM_DONTEXPAND */ @@ -5060,11 +5065,6 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) { pte_t entry; - if ((vmf->flags & FAULT_FLAG_VMA_LOCK) && !vma_is_anonymous(vmf->vma)) { - vma_end_read(vmf->vma); - return VM_FAULT_RETRY; - } - if (unlikely(pmd_none(*vmf->pmd))) { /* * Leave __pte_alloc() until later: because vm_ops->fault may @@ -5097,6 +5097,12 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) if (!vmf->pte) return do_pte_missing(vmf); + if ((vmf->flags & FAULT_FLAG_VMA_LOCK) && !vma_is_anonymous(vmf->vma)) { + pte_unmap(vmf->pte); + vma_end_read(vmf->vma); + return VM_FAULT_RETRY; + } + if (!pte_present(vmf->orig_pte)) return do_swap_page(vmf); From patchwork Mon Jul 24 18:54:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 13325249 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E269EB64DD for ; Mon, 24 Jul 2023 18:54:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229806AbjGXSyZ (ORCPT ); Mon, 24 Jul 2023 14:54:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55392 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229597AbjGXSyW (ORCPT ); Mon, 24 Jul 2023 14:54:22 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2F3AF10D1 for ; Mon, 24 Jul 2023 11:54:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=S+dCr3wFmOnYx33aA8EkljEdRO/JPPjH0b8bNo8hRK8=; b=LpXGzVO1V3NFVNKSce0PI1K3mI 00HJIlamGKCVSJTU/TaXpSYtxAGCJ00H8nAvYbCONsEJF+t6kq7PAN1mvhkItteW71rqdc1PhVEbq 2gY6Z8zsNCr9W04TlSMo9Hy2HGC1WDAHIYECDH4Sbo1jdx+UmSmFhImPpi4G+6hpt3omrFGnPCioY r5OetU3asujp/vCVz0A1TrJNtGePESTSBDleEdYtuuJx/GTh05EMuZxNBZKUDjxrff0WVp9hInezm QXszgBXraH7AudLUVjQqC8IyJoZKy1QmoyTQLjTacx21fFkIGpppl2591qthTSu8Cfpv95BJwk1oO JSJPo0dQ==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1qO0hA-004iRD-Jf; Mon, 24 Jul 2023 18:54:12 +0000 From: "Matthew Wilcox (Oracle)" To: Andrew Morton Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Punit Agrawal , Suren Baghdasaryan Subject: [PATCH v3 07/10] mm: Move FAULT_FLAG_VMA_LOCK check down from do_fault() Date: Mon, 24 Jul 2023 19:54:07 +0100 Message-Id: <20230724185410.1124082-8-willy@infradead.org> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20230724185410.1124082-1-willy@infradead.org> References: <20230724185410.1124082-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Perform the check at the start of do_read_fault(), do_cow_fault() and do_shared_fault() instead. Should be no performance change from the last commit. Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Suren Baghdasaryan --- mm/memory.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 11b337876477..627a2abb969b 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4661,6 +4661,11 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf) vm_fault_t ret = 0; struct folio *folio; + if (vmf->flags & FAULT_FLAG_VMA_LOCK) { + vma_end_read(vmf->vma); + return VM_FAULT_RETRY; + } + /* * Let's call ->map_pages() first and use ->fault() as fallback * if page by the offset is not ready to be mapped (cold cache or @@ -4689,6 +4694,11 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf) struct vm_area_struct *vma = vmf->vma; vm_fault_t ret; + if (vmf->flags & FAULT_FLAG_VMA_LOCK) { + vma_end_read(vma); + return VM_FAULT_RETRY; + } + if (unlikely(anon_vma_prepare(vma))) return VM_FAULT_OOM; @@ -4729,6 +4739,11 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf) vm_fault_t ret, tmp; struct folio *folio; + if (vmf->flags & FAULT_FLAG_VMA_LOCK) { + vma_end_read(vma); + return VM_FAULT_RETRY; + } + ret = __do_fault(vmf); if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY))) return ret; @@ -4775,11 +4790,6 @@ static vm_fault_t do_fault(struct vm_fault *vmf) struct mm_struct *vm_mm = vma->vm_mm; vm_fault_t ret; - if (vmf->flags & FAULT_FLAG_VMA_LOCK){ - vma_end_read(vma); - return VM_FAULT_RETRY; - } - /* * The VMA was not fully populated on mmap() or missing VM_DONTEXPAND */ From patchwork Mon Jul 24 18:54:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 13325247 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5E792C0015E for ; Mon, 24 Jul 2023 18:54:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229747AbjGXSyX (ORCPT ); Mon, 24 Jul 2023 14:54:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55376 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229486AbjGXSyT (ORCPT ); Mon, 24 Jul 2023 14:54:19 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 14059188 for ; Mon, 24 Jul 2023 11:54:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=v1353Pyv++o8NWvrrZQCLEMMSP1M9ywuP5OB+aDM0kw=; b=dTwfpe4CSf/wNXWMmkRXZpVHoZ vKf5kVMS43ZWZIFpaRKalQSU56M1zPmuvT0cTfRjsHG1ms+id4W5sGxNd+sE6EVlBjAaiNoNxDr1h Xonm4QEVzdVvQOSZTzKKLOlWi/7AHgQcxw+xBTzLLDTPshsCp14Xt1CAf3E711aCMTCmiUsbsyKRJ E7epN1PqMGdDThFGqBD/6NTEaQk9H9FaPMFkL5DFJJwMjPDD9zI0zKEXyQrh4HK4MTkWSaztP6OC+ hH67f752GSEuWd8RT90APXA9Co/fW4lHowfXBqJRfeXhMgCMSq4BVozxlTuI59Y7oVIbvi1uR9rCv csAytTjg==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1qO0hA-004iRF-Mf; Mon, 24 Jul 2023 18:54:12 +0000 From: "Matthew Wilcox (Oracle)" To: Andrew Morton Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Punit Agrawal , Suren Baghdasaryan Subject: [PATCH v3 08/10] mm: Run the fault-around code under the VMA lock Date: Mon, 24 Jul 2023 19:54:08 +0100 Message-Id: <20230724185410.1124082-9-willy@infradead.org> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20230724185410.1124082-1-willy@infradead.org> References: <20230724185410.1124082-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The map_pages fs method should be safe to run under the VMA lock instead of the mmap lock. This should have a measurable reduction in contention on the mmap lock. Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Suren Baghdasaryan --- mm/memory.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 627a2abb969b..cf28afe7a416 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4661,11 +4661,6 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf) vm_fault_t ret = 0; struct folio *folio; - if (vmf->flags & FAULT_FLAG_VMA_LOCK) { - vma_end_read(vmf->vma); - return VM_FAULT_RETRY; - } - /* * Let's call ->map_pages() first and use ->fault() as fallback * if page by the offset is not ready to be mapped (cold cache or @@ -4677,6 +4672,11 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf) return ret; } + if (vmf->flags & FAULT_FLAG_VMA_LOCK) { + vma_end_read(vmf->vma); + return VM_FAULT_RETRY; + } + ret = __do_fault(vmf); if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY))) return ret; From patchwork Mon Jul 24 18:54:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 13325250 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D8BA9C0015E for ; Mon, 24 Jul 2023 18:54:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229912AbjGXSya (ORCPT ); Mon, 24 Jul 2023 14:54:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55408 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229922AbjGXSy1 (ORCPT ); Mon, 24 Jul 2023 14:54:27 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1254DE55 for ; Mon, 24 Jul 2023 11:54:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=Bf2WSeVVnx5PtuK0GrdZC7nqaCiGfJc1owLWRc2d9S0=; b=pq4q53ipW8imewozPRVRm9SF+m RVM+pPZTkW+xlAdqR4U8etpaEe/I/DDBVIxPfL9YcKgLPN700wsZFK/8HQ5tiM5SH9mlMAQeOGFQA pCDpj4vAGjQDG2G8pb6Xmrrw2nge32+b/8RHXob2OgRXXA53Z7FI7GZWoNnSVpT3xUtrrRsaP4FR6 tK7rBBJ1LALKGEDtjsDaIcr8sxHVwqGwW0SYlCDqoxT5OW/qwUT1M9eB4glQiild/Zb7Pqwbx9PUp /ECOtnxFdu74Lg9qhUaT959BflWAxgCMzx3rOnmDtJjeTqakbEK6eH6mVpXBfeR3DdLPiqoB/07ik DDyIR+Uw==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1qO0hA-004iRH-PM; Mon, 24 Jul 2023 18:54:12 +0000 From: "Matthew Wilcox (Oracle)" To: Andrew Morton Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Punit Agrawal Subject: [PATCH v3 09/10] mm: Handle swap and NUMA PTE faults under the VMA lock Date: Mon, 24 Jul 2023 19:54:09 +0100 Message-Id: <20230724185410.1124082-10-willy@infradead.org> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20230724185410.1124082-1-willy@infradead.org> References: <20230724185410.1124082-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Move the FAULT_FLAG_VMA_LOCK check down in handle_pte_fault(). This is probably not a huge win in its own right, but is a nicely separable bit from the next patch. Signed-off-by: Matthew Wilcox (Oracle) --- mm/memory.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index cf28afe7a416..ad77ac7f8c9b 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -5107,18 +5107,18 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) if (!vmf->pte) return do_pte_missing(vmf); - if ((vmf->flags & FAULT_FLAG_VMA_LOCK) && !vma_is_anonymous(vmf->vma)) { - pte_unmap(vmf->pte); - vma_end_read(vmf->vma); - return VM_FAULT_RETRY; - } - if (!pte_present(vmf->orig_pte)) return do_swap_page(vmf); if (pte_protnone(vmf->orig_pte) && vma_is_accessible(vmf->vma)) return do_numa_page(vmf); + if ((vmf->flags & FAULT_FLAG_VMA_LOCK) && !vma_is_anonymous(vmf->vma)) { + pte_unmap(vmf->pte); + vma_end_read(vmf->vma); + return VM_FAULT_RETRY; + } + spin_lock(vmf->ptl); entry = vmf->orig_pte; if (unlikely(!pte_same(ptep_get(vmf->pte), entry))) { From patchwork Mon Jul 24 18:54:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 13325248 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 67A37C00528 for ; Mon, 24 Jul 2023 18:54:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229770AbjGXSyY (ORCPT ); Mon, 24 Jul 2023 14:54:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55382 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229562AbjGXSyU (ORCPT ); Mon, 24 Jul 2023 14:54:20 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0757DE55 for ; Mon, 24 Jul 2023 11:54:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=Q+DaNTNbIUG3q6eDPG2po4w54RCX1zIfiIR5STu1uiQ=; b=IfZBEC2bafAFTQbS89y7UijEBE RWmQ1Ds4r4Jychulds5YnYRb3I0od+yIzxGEjRBg/sMoaW8Sgo39q598ZbCrb8yFJjLhEXLbKGw2u m6ON7coI8Vkyui8eqwVD3MAAI4b1SEwfQnyn3FGv9SQx1ziGML5H3GK5zcH+sBTO4lmFkaq0PKQyx xzo+xp/C7beutQOsWwV77qUovD+qADlO+56E/Dwyj29cWGg4lg7hX6Mo01FXPuux0JJUirC3kWWWp PlgvLemMPnSu3miVyRvcWhE2qw/Nw7f8SnbiBBb69OcutacNlt/2UeeoQC1+cppK2G6+2AMHfS7AU qU4S7HIA==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1qO0hA-004iRJ-S7; Mon, 24 Jul 2023 18:54:12 +0000 From: "Matthew Wilcox (Oracle)" To: Andrew Morton Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Punit Agrawal Subject: [PATCH v3 10/10] mm: Handle faults that merely update the accessed bit under the VMA lock Date: Mon, 24 Jul 2023 19:54:10 +0100 Message-Id: <20230724185410.1124082-11-willy@infradead.org> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20230724185410.1124082-1-willy@infradead.org> References: <20230724185410.1124082-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Move FAULT_FLAG_VMA_LOCK check out of handle_pte_fault(). This should have a significant performance improvement for mmaped files. Write faults (on read-only shared pages) still take the mmap lock as we do not want to audit all the implementations of ->pfn_mkwrite() and ->page_mkwrite(). However write-faults on private mappings are handled under the VMA lock. Signed-off-by: Matthew Wilcox (Oracle) --- mm/memory.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index ad77ac7f8c9b..20a2e9ed4aeb 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3393,6 +3393,11 @@ static vm_fault_t wp_pfn_shared(struct vm_fault *vmf) vm_fault_t ret; pte_unmap_unlock(vmf->pte, vmf->ptl); + if (vmf->flags & FAULT_FLAG_VMA_LOCK) { + vma_end_read(vmf->vma); + return VM_FAULT_RETRY; + } + vmf->flags |= FAULT_FLAG_MKWRITE; ret = vma->vm_ops->pfn_mkwrite(vmf); if (ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE)) @@ -3415,6 +3420,12 @@ static vm_fault_t wp_page_shared(struct vm_fault *vmf, struct folio *folio) vm_fault_t tmp; pte_unmap_unlock(vmf->pte, vmf->ptl); + if (vmf->flags & FAULT_FLAG_VMA_LOCK) { + folio_put(folio); + vma_end_read(vmf->vma); + return VM_FAULT_RETRY; + } + tmp = do_page_mkwrite(vmf, folio); if (unlikely(!tmp || (tmp & (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))) { @@ -5113,12 +5124,6 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) if (pte_protnone(vmf->orig_pte) && vma_is_accessible(vmf->vma)) return do_numa_page(vmf); - if ((vmf->flags & FAULT_FLAG_VMA_LOCK) && !vma_is_anonymous(vmf->vma)) { - pte_unmap(vmf->pte); - vma_end_read(vmf->vma); - return VM_FAULT_RETRY; - } - spin_lock(vmf->ptl); entry = vmf->orig_pte; if (unlikely(!pte_same(ptep_get(vmf->pte), entry))) {