[v2,33/35] arm64/mm: attempt speculative mm faults first

Message ID	20220128131006.67712-34-michel@lespinasse.org (mailing list archive)

Michel Lespinasse Jan. 28, 2022, 1:10 p.m. UTC

Attempt speculative mm fault handling first, and fall back to the
existing (non-speculative) code if that fails.

This follows the lines of the x86 speculative fault handling code,
but with some minor arch differences such as the way that the
VM_FAULT_BADACCESS case is handled.

Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
---
 arch/arm64/mm/fault.c | 62 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

Mike Rapoport Jan. 30, 2022, 9:13 a.m. UTC | #1

On Fri, Jan 28, 2022 at 05:10:04AM -0800, Michel Lespinasse wrote:
> Attempt speculative mm fault handling first, and fall back to the
> existing (non-speculative) code if that fails.
> 
> This follows the lines of the x86 speculative fault handling code,
> but with some minor arch differences such as the way that the
> VM_FAULT_BADACCESS case is handled.
> 
> Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
> ---
>  arch/arm64/mm/fault.c | 62 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 62 insertions(+)
> 
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 77341b160aca..2598795f4e70 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -25,6 +25,7 @@
>  #include <linux/perf_event.h>
>  #include <linux/preempt.h>
>  #include <linux/hugetlb.h>
> +#include <linux/vm_event_item.h>
>  
>  #include <asm/acpi.h>
>  #include <asm/bug.h>
> @@ -524,6 +525,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned int esr,
>  	unsigned long vm_flags;
>  	unsigned int mm_flags = FAULT_FLAG_DEFAULT;
>  	unsigned long addr = untagged_addr(far);
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	struct vm_area_struct *vma;
> +	struct vm_area_struct pvma;
> +	unsigned long seq;
> +#endif
>  
>  	if (kprobe_page_fault(regs, esr))
>  		return 0;
> @@ -574,6 +580,59 @@ static int __kprobes do_page_fault(unsigned long far, unsigned int esr,
>  
>  	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
>  
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	/*
> +	 * No need to try speculative faults for kernel or
> +	 * single threaded user space.
> +	 */
> +	if (!(mm_flags & FAULT_FLAG_USER) || atomic_read(&mm->mm_users) == 1)
> +		goto no_spf;
> +
> +	count_vm_event(SPF_ATTEMPT);
> +	seq = mmap_seq_read_start(mm);
> +	if (seq & 1) {
> +		count_vm_spf_event(SPF_ABORT_ODD);
> +		goto spf_abort;
> +	}
> +	rcu_read_lock();
> +	vma = __find_vma(mm, addr);
> +	if (!vma || vma->vm_start > addr) {
> +		rcu_read_unlock();
> +		count_vm_spf_event(SPF_ABORT_UNMAPPED);
> +		goto spf_abort;
> +	}
> +	if (!vma_is_anonymous(vma)) {
> +		rcu_read_unlock();
> +		count_vm_spf_event(SPF_ABORT_NO_SPECULATE);
> +		goto spf_abort;
> +	}
> +	pvma = *vma;
> +	rcu_read_unlock();
> +	if (!mmap_seq_read_check(mm, seq, SPF_ABORT_VMA_COPY))
> +		goto spf_abort;
> +	vma = &pvma;
> +	if (!(vma->vm_flags & vm_flags)) {
> +		count_vm_spf_event(SPF_ABORT_ACCESS_ERROR);
> +		goto spf_abort;
> +	}
> +	fault = do_handle_mm_fault(vma, addr & PAGE_MASK,
> +			mm_flags | FAULT_FLAG_SPECULATIVE, seq, regs);
> +
> +	/* Quick path to respond to signals */
> +	if (fault_signal_pending(fault, regs)) {
> +		if (!user_mode(regs))
> +			goto no_context;
> +		return 0;
> +	}
> +	if (!(fault & VM_FAULT_RETRY))
> +		goto done;
> +
> +spf_abort:
> +	count_vm_event(SPF_ABORT);
> +no_spf:
> +
> +#endif	/* CONFIG_SPECULATIVE_PAGE_FAULT */

The speculative page fault implementation here (and for PowerPC as well)
looks very similar to x86. Can we factor it out rather than copy 3 (or
more) times?

> +
>  	/*
>  	 * As per x86, we may deadlock here. However, since the kernel only
>  	 * validly references user space from well defined areas of the code,
> @@ -612,6 +671,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned int esr,
>  		goto retry;
>  	}
>  	mmap_read_unlock(mm);
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +done:
> +#endif
>  
>  	/*
>  	 * Handle the "normal" (no error) case first.
> -- 
> 2.20.1
> 
>
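
The speculative path in the quoted hunk follows a sequence-count pattern: read the counter, bail out if it is odd, snapshot the data, then re-check the counter before trusting the snapshot. A minimal userspace sketch of that pattern follows, assuming C11 atomics. All `toy_*` names are hypothetical stand-ins, not the kernel's `mmap_seq_read_start()` / `mmap_seq_read_check()`, and the sketch leaves out the RCU protection the patch relies on while copying the vma.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an mm with a single mapped range. */
struct toy_mm {
	atomic_ulong seq;	/* odd while a writer is changing the range */
	unsigned long start;	/* toy "vma" start (inclusive) */
	unsigned long end;	/* toy "vma" end (exclusive) */
};

static unsigned long toy_seq_read_start(struct toy_mm *mm)
{
	return atomic_load_explicit(&mm->seq, memory_order_acquire);
}

static bool toy_seq_read_check(struct toy_mm *mm, unsigned long seq)
{
	/* Order the snapshot reads before the counter re-read. */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&mm->seq, memory_order_relaxed) == seq;
}

/* Writers bump the counter before and after each modification. */
static void toy_write_begin(struct toy_mm *mm)
{
	atomic_fetch_add(&mm->seq, 1);
}

static void toy_write_end(struct toy_mm *mm)
{
	atomic_fetch_add(&mm->seq, 1);
}

/*
 * Returns true if the speculative lookup produced a trustworthy answer
 * in *mapped. Returns false if the caller should fall back to the
 * locked (non-speculative) path.
 */
static bool toy_speculative_lookup(struct toy_mm *mm, unsigned long addr,
				   bool *mapped)
{
	unsigned long seq = toy_seq_read_start(mm);
	unsigned long start, end;

	if (seq & 1)
		return false;	/* writer in progress: abort */

	/*
	 * Snapshot the range. A real concurrent implementation must also
	 * make these reads safe against a racing writer; this toy does not.
	 */
	start = mm->start;
	end = mm->end;

	if (!toy_seq_read_check(mm, seq))
		return false;	/* range changed under us: abort */

	*mapped = addr >= start && addr < end;
	return true;
}
```

A caller would try `toy_speculative_lookup()` first and take the lock only when it returns false, which mirrors the `goto spf_abort` fallthrough to the existing code in the patch.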

Michel Lespinasse Jan. 31, 2022, 8:07 a.m. UTC | #2

On Sun, Jan 30, 2022 at 11:13:26AM +0200, Mike Rapoport wrote:
> The speculative page fault implementation here (and for PowerPC as well)
> looks very similar to x86. Can we factor it out rather than copy 3 (or
> more) times?

In each arch, the speculative code was written along the lines of the
existing non-speculative code, so that behavior would be unchanged
when speculation succeeds.

Now each arch's existing, non-speculative code paths are quite similar,
but they do have small differences as to how they implement various
permission checks, protection keys and the like. The same small
differences end up being reflected in the new speculative code paths.

I agree it would be nice if this code could be unified between archs,
but IMO this should start with the existing non-speculative code -
I don't think it would make sense to try unifying the new speculative
code while trying to follow the behavior of the non-unified old
non-speculative code paths...

Mike Rapoport Feb. 1, 2022, 8:58 a.m. UTC | #3

On Mon, Jan 31, 2022 at 12:07:29AM -0800, Michel Lespinasse wrote:
> On Sun, Jan 30, 2022 at 11:13:26AM +0200, Mike Rapoport wrote:
> > The speculative page fault implementation here (and for PowerPC as well)
> > looks very similar to x86. Can we factor it out rather than copy 3 (or
> > more) times?
> 
> In each arch, the speculative code was written along the lines of the
> existing non-speculative code, so that behavior would be unchanged
> when speculation succeeds.
> 
> Now each arch's existing, non-speculative code paths are quite similar,
> but they do have small differences as to how they implement various
> permission checks, protection keys and the like. The same small
> differences end up being reflected in the new speculative code paths.
> 
> I agree it would be nice if this code could be unified between archs,
> but IMO this should start with the existing non-speculative code -
> I don't think it would make sense to try unifying the new speculative
> code while trying to follow the behavior of the non-unified old
> non-speculative code paths...

Then maybe this unification can be done as the groundwork for the
speculative page fault handling?

Michel Lespinasse Feb. 7, 2022, 5:39 p.m. UTC | #4

On Tue, Feb 01, 2022 at 10:58:03AM +0200, Mike Rapoport wrote:
> On Mon, Jan 31, 2022 at 12:07:29AM -0800, Michel Lespinasse wrote:
> > On Sun, Jan 30, 2022 at 11:13:26AM +0200, Mike Rapoport wrote:
> > > The speculative page fault implementation here (and for PowerPC as well)
> > > looks very similar to x86. Can we factor it out rather than copy 3 (or
> > > more) times?
> > 
> > In each arch, the speculative code was written along the lines of the
> > existing non-speculative code, so that behavior would be unchanged
> > when speculation succeeds.
> > 
> > Now each arch's existing, non-speculative code paths are quite similar,
> > but they do have small differences as to how they implement various
> > permission checks, protection keys and the like. The same small
> > differences end up being reflected in the new speculative code paths.
> > 
> > I agree it would be nice if this code could be unified between archs,
> > but IMO this should start with the existing non-speculative code -
> > I don't think it would make sense to try unifying the new speculative
> > code while trying to follow the behavior of the non-unified old
> > non-speculative code paths...
> 
> Then maybe this unification can be done as the ground work for the
> speculative page fault handling?

I feel like this is quite unrelated, and that introducing such
artificial dependencies is a bad work habit we have here in Linux MM...

That said, unifying the PF code between archs would be an interesting
project on its own. The way I see it, there could be a unified page
fault handler, with some arch specific parts defined as inline
functions.  I can see myself making an x86/arm64/powerpc initial
proposal if there is enough interest for it, but I'm not sure how
extending it to more exotic archs would go - I think this would have
to involve arch maintainers at least for testing purposes, and I'm not
sure if they'd have any bandwidth for such a project...
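
To make the "unified handler with arch-specific inline functions" idea concrete, here is a minimal sketch. Everything in it is hypothetical: the `toy_*` types, the `TOY_*` flags and the hook name `toy_arch_required_flags()` are made up for illustration and do not correspond to any existing kernel interface. It only shows the shape: common control flow in one place, with the permission policy delegated to a per-arch inline hook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_VM_READ	0x1u
#define TOY_VM_WRITE	0x2u

enum toy_fault_result {
	TOY_FAULT_OK,
	TOY_FAULT_BADMAP,	/* address not covered by the vma */
	TOY_FAULT_BADACCESS,	/* vma does not permit this access */
};

struct toy_vma {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	unsigned int vm_flags;
};

/*
 * Hypothetical arch hook: each architecture would supply its own
 * version, encoding its particular permission rules.
 */
static inline unsigned int toy_arch_required_flags(bool is_write)
{
	return is_write ? TOY_VM_WRITE : TOY_VM_READ;
}

/* Hypothetical generic part: the control flow shared by all archs. */
static enum toy_fault_result toy_generic_fault(const struct toy_vma *vma,
					       unsigned long addr,
					       bool is_write)
{
	if (!vma || addr < vma->start || addr >= vma->end)
		return TOY_FAULT_BADMAP;
	if (!(vma->vm_flags & toy_arch_required_flags(is_write)))
		return TOY_FAULT_BADACCESS;
	return TOY_FAULT_OK;
}
```

In this shape, a speculative variant would only need to be written once in the generic function, while differences such as protection keys would stay inside the hook.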

--
Michel "walken" Lespinasse

Mike Rapoport Feb. 8, 2022, 9:07 a.m. UTC | #5

On Mon, Feb 07, 2022 at 09:39:19AM -0800, Michel Lespinasse wrote:
> On Tue, Feb 01, 2022 at 10:58:03AM +0200, Mike Rapoport wrote:
> > On Mon, Jan 31, 2022 at 12:07:29AM -0800, Michel Lespinasse wrote:
> > > On Sun, Jan 30, 2022 at 11:13:26AM +0200, Mike Rapoport wrote:
> > > > The speculative page fault implementation here (and for PowerPC as well)
> > > > looks very similar to x86. Can we factor it out rather than copy 3 (or
> > > > more) times?
> > > 
> > > In each arch, the speculative code was written along the lines of the
> > > existing non-speculative code, so that behavior would be unchanged
> > > when speculation succeeds.
> > > 
> > > Now each arch's existing, non-speculative code paths are quite similar,
> > > but they do have small differences as to how they implement various
> > > permission checks, protection keys and the like. The same small
> > > differences end up being reflected in the new speculative code paths.
> > > 
> > > I agree it would be nice if this code could be unified between archs,
> > > but IMO this should start with the existing non-speculative code -
> > > I don't think it would make sense to try unifying the new speculative
> > > code while trying to follow the behavior of the non-unified old
> > > non-speculative code paths...
> > 
> > Then maybe this unification can be done as the groundwork for the
> > speculative page fault handling?
> 
> I feel like this is quite unrelated, and that introducing such
> artificial dependencies is a bad work habit we have here in Linux MM...

The reduction of the code duplication in page fault handlers per se is
indeed not very related to SPF work, but since the SPF patches increase
the code duplication, I believe that the refactoring that prevents this
additional code duplication is related and is in scope of this work.
 
> That said, unifying the PF code between archs would be an interesting
> project on its own. The way I see it, there could be a unified page
> fault handler, with some arch specific parts defined as inline
> functions.  I can see myself making an x86/arm64/powerpc initial
> proposal if there is enough interest for it, but I'm not sure how
> extending it to more exotic archs would go - I think this would have
> to involve arch maintainers at least for testing purposes, and I'm not
> sure if they'd have any bandwidth for such a project...

There is no need to convert all architectures and surely not at once.
The parts of page fault handler that are shared by several architectures
can live under #ifdef ARCH_WANTS_GENERIC_PAGE_FAULT or something like this.
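
A minimal sketch of such an opt-in, assuming the macro name suggested above. The `toy_fault_path()` helper is hypothetical and exists only to show which branch was compiled in. An architecture that defines the macro gets the shared code, and every other architecture keeps its own handler untouched.

```c
#include <assert.h>
#include <string.h>

/* An architecture opting in would define this, e.g. via its Kconfig. */
#define ARCH_WANTS_GENERIC_PAGE_FAULT 1

#ifdef ARCH_WANTS_GENERIC_PAGE_FAULT
/* Shared implementation, built only for architectures that opted in. */
static const char *toy_fault_path(void)
{
	return "generic";
}
#else
/* Architectures that have not been converted keep their own code. */
static const char *toy_fault_path(void)
{
	return "arch";
}
#endif
```

Removing the `#define` switches the build back to the per-arch branch, so architectures can be converted one at a time.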

> --
> Michel "walken" Lespinasse
