Message ID | 20241028010818.2487581-1-andrii@kernel.org (mailing list archive) |
---|---|
Series | uprobes,mm: speculative lockless VMA-to-uprobe lookup |
On Sun, Oct 27, 2024 at 6:09 PM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> Implement speculative (lockless) resolution of VMA to inode to uprobe,
> bypassing the need to take mmap_lock for reads, if possible. The first two
> patches, by Suren, add mm_struct helpers that help detect whether mm_struct
> was changed, which is used by uprobe logic to validate that speculative
> results can be trusted after all the lookup logic results in a valid uprobe
> instance.
>
> Patch #3 is a simplification to uprobe VMA flag checking, suggested by Oleg.
>
> And, finally, patch #4 is the speculative VMA-to-uprobe resolution logic
> itself, and is the focal point of this patch set. It makes entry uprobes in
> the common case scale very well with the number of CPUs, as we avoid any
> locking or cache line bouncing between CPUs. See the corresponding patch for
> details and benchmarking results.
>
> Note, this patch set assumes that FMODE_BACKING files were switched to have
> SLAB_TYPESAFE_BY_RCU semantics, which was recently done by Christian Brauner
> in [0]. This change can be pulled into perf/core through the stable
> tags/vfs-6.13.for-bpf.file tag from [1].
>
>   [0] https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs-6.13.for-bpf.file&id=8b1bc2590af61129b82a189e9dc7c2804c34400e
>   [1] git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
>
> v3->v4:
>   - rebased and dropped data_race(), given mm_struct uses real seqcount (Peter);
> v2->v3:
>   - dropped kfree_rcu() patch (Christian);
>   - added data_race() annotations for fields of vma and vma->vm_file which could
>     be modified during speculative lookup (Oleg);
>   - fixed int->long problem in stubs for mmap_lock_speculation_{start,end}(),
>     caught by Kernel test robot;
> v1->v2:
>   - adjusted vma_end_write_all() comment to point out it should never be called
>     manually now, but I wasn't sure how ACQUIRE/RELEASE comments should be
>     reworded (previously requested by Jann), so I'd appreciate some help there
>     (Jann);
>   - int -> long change for mm_lock_seq, as agreed at LPC2024 (Jann, Suren, Liam);
>   - kfree_rcu_mightsleep() for FMODE_BACKING (Suren, Christian);
>   - vm_flags simplification in find_active_uprobe_rcu() and
>     find_active_uprobe_speculative() (Oleg);
>   - guard(rcu)() simplified find_active_uprobe_speculative() implementation.
>
> Andrii Nakryiko (2):
>   uprobes: simplify find_active_uprobe_rcu() VMA checks
>   uprobes: add speculative lockless VMA-to-inode-to-uprobe resolution
>
> Suren Baghdasaryan (2):
>   mm: Convert mm_lock_seq to a proper seqcount
>   mm: Introduce mmap_lock_speculation_{begin|end}
>
>  include/linux/mm.h               | 12 ++---
>  include/linux/mm_types.h         |  7 ++-
>  include/linux/mmap_lock.h        | 87 ++++++++++++++++++++++++--------
>  kernel/events/uprobes.c          | 47 ++++++++++++++++-
>  kernel/fork.c                    |  5 +-
>  mm/init-mm.c                     |  2 +-
>  tools/testing/vma/vma.c          |  4 +-
>  tools/testing/vma/vma_internal.h |  4 +-
>  8 files changed, 129 insertions(+), 39 deletions(-)
>
> --
> 2.43.5
>

Hi!

What's the status of this patch set? Are there any blockers for it to be
applied to perf/core? MM folks are OK with landing the first two patches in
perf/core, so hopefully we should be good to go?
On Tue, Nov 5, 2024 at 6:01 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Sun, Oct 27, 2024 at 6:09 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > [...]
> >
>
> Hi!
>
> What's the status of this patch set? Are there any blockers for it to
> be applied to perf/core? MM folks are OK with landing the first two
> patches in perf/core, so hopefully we should be good to go?

Another week, another ping. Peter, what can I do to make this land? The MM
parts are clearly OK with Andrew Morton; the uprobe-side logic hasn't changed
(modulo inconsequential data_race() back and forth) since at least August, was
approved by Oleg, and seems to be very stable in testing. I think it's time to
let me forget about this patch set and make actual use of it in production,
please.
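
For readers following the thread, the general shape of a seqcount-validated speculative read, which is the idea the cover letter describes, can be sketched in userspace C11 roughly as below. This is only an illustration under stated assumptions, not the code from the series: `struct fake_mm`, `speculation_begin()`, `speculation_end()`, `map_file()` and `lookup_speculative()` are all hypothetical names invented here, and a single mapping stands in for the VMA tree. The reader samples a sequence counter, reads the data without any lock, and trusts the result only if the counter is unchanged afterwards; otherwise it would fall back to a locked slow path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for mm_struct: one mapping guarded by a seqcount. */
struct fake_mm {
    atomic_ulong seq;   /* odd while a writer is mid-update */
    atomic_long start;  /* mapping covers [start, end) */
    atomic_long end;
    atomic_long inode;  /* identifier of the backing file */
};

static void fake_mm_init(struct fake_mm *mm)
{
    atomic_init(&mm->seq, 0);
    atomic_init(&mm->start, 0);
    atomic_init(&mm->end, 0);
    atomic_init(&mm->inode, 0);
}

/* Writer side: make the counter odd before touching the data. */
static void write_begin(struct fake_mm *mm)
{
    atomic_fetch_add_explicit(&mm->seq, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
}

/* Writer side: make the counter even again, publishing the update. */
static void write_end(struct fake_mm *mm)
{
    assert(atomic_load_explicit(&mm->seq, memory_order_relaxed) & 1);
    atomic_fetch_add_explicit(&mm->seq, 1, memory_order_release);
}

/* Reader side: sample the counter; refuse to speculate during a write. */
static bool speculation_begin(struct fake_mm *mm, unsigned long *seq)
{
    *seq = atomic_load_explicit(&mm->seq, memory_order_acquire);
    return (*seq & 1) == 0;
}

/* Reader side: true only if no writer ran since speculation_begin(). */
static bool speculation_end(struct fake_mm *mm, unsigned long seq)
{
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&mm->seq, memory_order_relaxed) == seq;
}

/* Replace the single mapping under the write-side protocol. */
static void map_file(struct fake_mm *mm, long start, long end, long inode)
{
    write_begin(mm);
    atomic_store_explicit(&mm->start, start, memory_order_relaxed);
    atomic_store_explicit(&mm->end, end, memory_order_relaxed);
    atomic_store_explicit(&mm->inode, inode, memory_order_relaxed);
    write_end(mm);
}

/*
 * Lockless lookup. Returns true and fills *inode only when the reads were
 * validated and addr falls inside the mapping; false means "take the
 * locked slow path instead" (not shown in this sketch).
 */
static bool lookup_speculative(struct fake_mm *mm, long addr, long *inode)
{
    unsigned long seq;
    long start, end, ino;

    if (!speculation_begin(mm, &seq))
        return false;

    start = atomic_load_explicit(&mm->start, memory_order_relaxed);
    end = atomic_load_explicit(&mm->end, memory_order_relaxed);
    ino = atomic_load_explicit(&mm->inode, memory_order_relaxed);

    if (!speculation_end(mm, seq))
        return false; /* data may be torn; discard it */

    if (addr < start || addr >= end)
        return false;

    *inode = ino;
    return true;
}
```

A caller would first run `fake_mm_init()`, install a mapping with `map_file()`, and then call `lookup_speculative()` on the hot path, dropping to a lock-protected lookup whenever it returns false. The sketch leaves out everything that makes the real problem hard, notably keeping the objects reached during speculation safe to dereference, which is what the RCU and SLAB_TYPESAFE_BY_RCU assumptions in the cover letter are about.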