Message ID | 20240906051205.530219-1-andrii@kernel.org (mailing list archive) |
---|---|
Series | uprobes,mm: speculative lockless VMA-to-uprobe lookup |
On Fri, Sep 6, 2024 at 7:12 AM Andrii Nakryiko <andrii@kernel.org> wrote:
> Implement speculative (lockless) resolution of VMA to inode to uprobe,
> bypassing the need to take mmap_lock for reads, if possible. Patch #1 by Suren
> adds mm_struct helpers that help detect whether mm_struct were changed, which
> is used by uprobe logic to validate that speculative results can be trusted
> after all the lookup logic results in a valid uprobe instance.

Random thought: It would be nice if you could skip the MM stuff
entirely and instead go through the GUP-fast path, but I guess going
from a uprobe-created anon page to the corresponding uprobe is hard...
but maybe if you used the anon_vma pointer as a lookup key to find the
uprobe, it could work? Though then you'd need hooks in the anon_vma
code... maybe not such a great idea.
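The "speculate, then validate" idea in the quoted cover letter can be illustrated with a sequence counter. What follows is a minimal userspace sketch, not the code from the series: `struct fake_mm`, `spec_begin()`, `spec_read_probe()`, `spec_retry()` and `fake_mm_write()` are hypothetical names, C11 atomics stand in for the kernel primitives, and a single writer is assumed (in the kernel, writers would be serialized by holding mmap_lock for write).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an address space with a change counter. */
struct fake_mm {
    atomic_uint seq;             /* odd while a writer is mid-update */
    _Atomic(const char *) probe; /* stand-in for the state a lookup reads */
};

static void fake_mm_init(struct fake_mm *mm, const char *probe)
{
    atomic_init(&mm->seq, 0);
    atomic_init(&mm->probe, probe);
}

/* Writer side (single writer assumed): counter goes odd, data changes,
 * counter goes even again. */
static void fake_mm_write(struct fake_mm *mm, const char *probe)
{
    unsigned s = atomic_load_explicit(&mm->seq, memory_order_relaxed);

    assert((s & 1) == 0); /* no other writer may be in progress */
    atomic_store_explicit(&mm->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&mm->probe, probe, memory_order_relaxed);
    atomic_store_explicit(&mm->seq, s + 2, memory_order_release);
}

/* Reader: sample the counter; refuse to speculate if a write is in flight. */
static bool spec_begin(struct fake_mm *mm, unsigned *seq)
{
    *seq = atomic_load_explicit(&mm->seq, memory_order_acquire);
    return (*seq & 1) == 0;
}

/* Reader: the speculative, lockless read of the protected state. */
static const char *spec_read_probe(struct fake_mm *mm)
{
    return atomic_load_explicit(&mm->probe, memory_order_relaxed);
}

/* Reader: returns true if the counter moved, i.e. the result read since
 * spec_begin() cannot be trusted and the caller must fall back. */
static bool spec_retry(struct fake_mm *mm, unsigned seq)
{
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&mm->seq, memory_order_relaxed) != seq;
}
```

A caller would run `spec_begin()`, do the lookup, and use the result only if `spec_retry()` returns false; otherwise it falls back to the locked slow path.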
On Tue, Sep 10, 2024 at 9:06 AM Jann Horn <jannh@google.com> wrote:
>
> On Fri, Sep 6, 2024 at 7:12 AM Andrii Nakryiko <andrii@kernel.org> wrote:
> > Implement speculative (lockless) resolution of VMA to inode to uprobe,
> > bypassing the need to take mmap_lock for reads, if possible. Patch #1 by Suren
> > adds mm_struct helpers that help detect whether mm_struct were changed, which
> > is used by uprobe logic to validate that speculative results can be trusted
> > after all the lookup logic results in a valid uprobe instance.
>
> Random thought: It would be nice if you could skip the MM stuff
> entirely and instead go through the GUP-fast path, but I guess going
> from a uprobe-created anon page to the corresponding uprobe is hard...
> but maybe if you used the anon_vma pointer as a lookup key to find the
> uprobe, it could work? Though then you'd need hooks in the anon_vma
> code... maybe not such a great idea.

So I'm not crystal clear on all the details here, so maybe you can
elaborate a bit. But keep in mind that a) there could be multiple
uprobes within a single user page, so lookup has to take at least
offset within the page into account somehow. But also b) single uprobe
can be installed in many independent anon VMAs across many processes.
So anon vma itself can't be part of the key.

Though maybe we could have left some sort of "cookie" stashed
somewhere to help with lookup. But then again, multiple uprobes per
page.

It does feel like lockless VMA to inode resolution would be a cleaner
solution, let's see if we can get there somehow.
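Points a) and b) above both follow from what a uprobe lookup key has to contain. The sketch below assumes a key of (inode, file offset), matching the "VMA to inode to uprobe" resolution in the cover letter. The types `probe_key` and `probe`, the helper `probe_find()`, the sorted array used in place of the kernel's real data structure, and the 4 KiB page size are all assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u /* assumed page size for the example */

/* Hypothetical lookup key: which file, and where in that file. */
struct probe_key {
    uint64_t inode_id; /* stand-in for an inode pointer */
    uint64_t offset;   /* byte offset within the file */
};

struct probe {
    struct probe_key key;
    const char *name;
};

/* Order by inode first, then by offset within the file. */
static int key_cmp(const struct probe_key *a, const struct probe_key *b)
{
    if (a->inode_id != b->inode_id)
        return a->inode_id < b->inode_id ? -1 : 1;
    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    return 0;
}

/* Binary search over a table sorted by key_cmp(). The key never mentions a
 * process or a VMA, so every mapping of the file resolves to the same entry. */
static const struct probe *probe_find(const struct probe *tbl, size_t n,
                                      uint64_t inode_id, uint64_t offset)
{
    struct probe_key k = { inode_id, offset };
    size_t lo = 0, hi = n;

    assert(tbl != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = key_cmp(&k, &tbl[mid].key);

        if (c == 0)
            return &tbl[mid];
        if (c < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return NULL;
}
```

Because the offset is byte-granular, two probes in the same page of the same file stay distinct, which is why a per-page or per-anon_vma key alone would not be enough.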
On Tue, Sep 10, 2024 at 7:58 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> On Tue, Sep 10, 2024 at 9:06 AM Jann Horn <jannh@google.com> wrote:
> > On Fri, Sep 6, 2024 at 7:12 AM Andrii Nakryiko <andrii@kernel.org> wrote:
> > > Implement speculative (lockless) resolution of VMA to inode to uprobe,
> > > bypassing the need to take mmap_lock for reads, if possible. Patch #1 by Suren
> > > adds mm_struct helpers that help detect whether mm_struct were changed, which
> > > is used by uprobe logic to validate that speculative results can be trusted
> > > after all the lookup logic results in a valid uprobe instance.
> >
> > Random thought: It would be nice if you could skip the MM stuff
> > entirely and instead go through the GUP-fast path, but I guess going
> > from a uprobe-created anon page to the corresponding uprobe is hard...
> > but maybe if you used the anon_vma pointer as a lookup key to find the
> > uprobe, it could work? Though then you'd need hooks in the anon_vma
> > code... maybe not such a great idea.
>
> So I'm not crystal clear on all the details here, so maybe you can
> elaborate a bit. But keep in mind that a) there could be multiple
> uprobes within a single user page, so lookup has to take at least
> offset within the page into account somehow. But also b) single uprobe

I think anonymous pages have the same pgoff numbering as file pages;
so the page's mapping and pgoff pointers together should almost give
you the same amount of information as what you are currently looking
for (the file and the offset inside it), except that you'd get an
anon_vma pointer corresponding to the file instead of directly getting
the file.

> can be installed in many independent anon VMAs across many processes.
> So anon vma itself can't be part of the key.

Yeah, I guess to make that work you'd have to somehow track which
anon_vmas exist for which mappings. (An anon_vma is tied to one
specific file, see anon_vma_compatible().)

> Though maybe we could have left some sort of "cookie" stashed
> somewhere to help with lookup. But then again, multiple uprobes per
> page.
>
> It does feel like lockless VMA to inode resolution would be a cleaner
> solution, let's see if we can get there somehow.

Mh, yes, I was just thinking it would be nice if we could keep this
lockless complexity out of the mmap locking code... but I guess it's
not much more straightforward than what you're doing.
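The pgoff argument in this reply can be checked with a little arithmetic. The sketch below is a userspace illustration, not kernel code: `struct fake_vma` and the three helpers are hypothetical, 4 KiB pages are assumed, and `vm_start` is assumed to be page-aligned. It computes the file offset of an address once through the VMA and once from a page index plus the offset within the page, under the stated premise that an anonymous page in a file-backed mapping carries the same index a file page at that address would.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumed 4 KiB pages */
#define PAGE_SIZE (UINT64_C(1) << PAGE_SHIFT)

/* Hypothetical, trimmed-down view of a file-backed mapping. */
struct fake_vma {
    uint64_t vm_start; /* first mapped address, page-aligned */
    uint64_t vm_end;   /* one past the last mapped address */
    uint64_t vm_pgoff; /* file offset of vm_start, in pages */
};

/* File offset of vaddr, derived from the VMA. */
static uint64_t offset_via_vma(const struct fake_vma *vma, uint64_t vaddr)
{
    assert(vaddr >= vma->vm_start && vaddr < vma->vm_end);
    return (vma->vm_pgoff << PAGE_SHIFT) + (vaddr - vma->vm_start);
}

/* Index that the page backing vaddr would carry under the premise above. */
static uint64_t page_index(const struct fake_vma *vma, uint64_t vaddr)
{
    assert(vaddr >= vma->vm_start && vaddr < vma->vm_end);
    return vma->vm_pgoff + ((vaddr - vma->vm_start) >> PAGE_SHIFT);
}

/* File offset of vaddr, derived from the page index alone plus the low
 * bits of the address; no VMA is needed at this point. */
static uint64_t offset_via_page(uint64_t index, uint64_t vaddr)
{
    return (index << PAGE_SHIFT) + (vaddr & (PAGE_SIZE - 1));
}
```

Both routes yield the same offset, so the page would supply the "offset" half of the key. The "file" half is the part that remains open in the thread, since the page only leads to an anon_vma.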