Message ID: 20240731214256.3588718-3-andrii@kernel.org (mailing list archive)
State: Superseded
Series: uprobes: RCU-protected hot path optimizations
On Wed, Jul 31, 2024 at 02:42:50PM -0700, Andrii Nakryiko wrote: SNIP > static void put_uprobe(struct uprobe *uprobe) > { > - if (refcount_dec_and_test(&uprobe->ref)) { > - /* > - * If application munmap(exec_vma) before uprobe_unregister() > - * gets called, we don't get a chance to remove uprobe from > - * delayed_uprobe_list from remove_breakpoint(). Do it here. > - */ > - mutex_lock(&delayed_uprobe_lock); > - delayed_uprobe_remove(uprobe, NULL); > - mutex_unlock(&delayed_uprobe_lock); > - kfree(uprobe); > - } > + if (!refcount_dec_and_test(&uprobe->ref)) > + return; > + > + write_lock(&uprobes_treelock); > + > + if (uprobe_is_active(uprobe)) > + rb_erase(&uprobe->rb_node, &uprobes_tree); > + > + write_unlock(&uprobes_treelock); > + > + /* > + * If application munmap(exec_vma) before uprobe_unregister() > + * gets called, we don't get a chance to remove uprobe from > + * delayed_uprobe_list from remove_breakpoint(). Do it here. > + */ > + mutex_lock(&delayed_uprobe_lock); > + delayed_uprobe_remove(uprobe, NULL); > + mutex_unlock(&delayed_uprobe_lock); we should do kfree(uprobe) in here, right? I think this is fixed later on when uprobe_free_rcu is introduced SNIP > @@ -1159,27 +1180,16 @@ struct uprobe *uprobe_register(struct inode *inode, > if (!IS_ALIGNED(ref_ctr_offset, sizeof(short))) > return ERR_PTR(-EINVAL); > > - retry: > uprobe = alloc_uprobe(inode, offset, ref_ctr_offset); > if (IS_ERR(uprobe)) > return uprobe; > > - /* > - * We can race with uprobe_unregister()->delete_uprobe(). > - * Check uprobe_is_active() and retry if it is false. > - */ > down_write(&uprobe->register_rwsem); > - ret = -EAGAIN; > - if (likely(uprobe_is_active(uprobe))) { > - consumer_add(uprobe, uc); > - ret = register_for_each_vma(uprobe, uc); > - } > + consumer_add(uprobe, uc); > + ret = register_for_each_vma(uprobe, uc); > up_write(&uprobe->register_rwsem); > - put_uprobe(uprobe); > > if (ret) { > - if (unlikely(ret == -EAGAIN)) > - goto retry; nice, I like getting rid of this.. so far lgtm ;-) jirka > uprobe_unregister(uprobe, uc); > return ERR_PTR(ret); > } > @@ -1286,15 +1296,19 @@ static void build_probe_list(struct inode *inode, > u = rb_entry(t, struct uprobe, rb_node); > if (u->inode != inode || u->offset < min) > break; > + u = try_get_uprobe(u); > + if (!u) /* uprobe already went away, safe to ignore */ > + continue; > list_add(&u->pending_list, head); > - get_uprobe(u); > } > for (t = n; (t = rb_next(t)); ) { > u = rb_entry(t, struct uprobe, rb_node); > if (u->inode != inode || u->offset > max) > break; > + u = try_get_uprobe(u); > + if (!u) /* uprobe already went away, safe to ignore */ > + continue; > list_add(&u->pending_list, head); > - get_uprobe(u); > } > } > read_unlock(&uprobes_treelock); > @@ -1752,6 +1766,12 @@ static int dup_utask(struct task_struct *t, struct uprobe_task *o_utask) > return -ENOMEM; > > *n = *o; > + /* > + * uprobe's refcnt has to be positive at this point, kept by > + * utask->return_instances items; return_instances can't be > + * removed right now, as task is blocked due to duping; so > + * get_uprobe() is safe to use here. 
> + */ > get_uprobe(n->uprobe); > n->next = NULL; > > @@ -1894,7 +1914,10 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs) > } > orig_ret_vaddr = utask->return_instances->orig_ret_vaddr; > } > - > + /* > + * uprobe's refcnt is positive, held by caller, so it's safe to > + * unconditionally bump it one more time here > + */ > ri->uprobe = get_uprobe(uprobe); > ri->func = instruction_pointer(regs); > ri->stack = user_stack_pointer(regs); > -- > 2.43.0 >
On Thu, Aug 1, 2024 at 4:09 AM Jiri Olsa <olsajiri@gmail.com> wrote: > > On Wed, Jul 31, 2024 at 02:42:50PM -0700, Andrii Nakryiko wrote: > > SNIP > > > static void put_uprobe(struct uprobe *uprobe) > > { > > - if (refcount_dec_and_test(&uprobe->ref)) { > > - /* > > - * If application munmap(exec_vma) before uprobe_unregister() > > - * gets called, we don't get a chance to remove uprobe from > > - * delayed_uprobe_list from remove_breakpoint(). Do it here. > > - */ > > - mutex_lock(&delayed_uprobe_lock); > > - delayed_uprobe_remove(uprobe, NULL); > > - mutex_unlock(&delayed_uprobe_lock); > > - kfree(uprobe); > > - } > > + if (!refcount_dec_and_test(&uprobe->ref)) > > + return; > > + > > + write_lock(&uprobes_treelock); > > + > > + if (uprobe_is_active(uprobe)) > > + rb_erase(&uprobe->rb_node, &uprobes_tree); > > + > > + write_unlock(&uprobes_treelock); > > + > > + /* > > + * If application munmap(exec_vma) before uprobe_unregister() > > + * gets called, we don't get a chance to remove uprobe from > > + * delayed_uprobe_list from remove_breakpoint(). Do it here. > > + */ > > + mutex_lock(&delayed_uprobe_lock); > > + delayed_uprobe_remove(uprobe, NULL); > > + mutex_unlock(&delayed_uprobe_lock); > > we should do kfree(uprobe) in here, right? heh, yep, seems like I lost it while rebasing or something, good catch! fixed. > > I think this is fixed later on when uprobe_free_rcu is introduced > > SNIP > > > @@ -1159,27 +1180,16 @@ struct uprobe *uprobe_register(struct inode *inode, > > if (!IS_ALIGNED(ref_ctr_offset, sizeof(short))) > > return ERR_PTR(-EINVAL); > > > > - retry: > > uprobe = alloc_uprobe(inode, offset, ref_ctr_offset); > > if (IS_ERR(uprobe)) > > return uprobe; > > > > - /* > > - * We can race with uprobe_unregister()->delete_uprobe(). > > - * Check uprobe_is_active() and retry if it is false. > > - */ > > down_write(&uprobe->register_rwsem); > > - ret = -EAGAIN; > > - if (likely(uprobe_is_active(uprobe))) { > > - consumer_add(uprobe, uc); > > - ret = register_for_each_vma(uprobe, uc); > > - } > > + consumer_add(uprobe, uc); > > + ret = register_for_each_vma(uprobe, uc); > > up_write(&uprobe->register_rwsem); > > - put_uprobe(uprobe); > > > > if (ret) { > > - if (unlikely(ret == -EAGAIN)) > > - goto retry; > > nice, I like getting rid of this.. so far lgtm ;-) > > jirka > > > > uprobe_unregister(uprobe, uc); > > return ERR_PTR(ret); > > } > > @@ -1286,15 +1296,19 @@ static void build_probe_list(struct inode *inode, > > u = rb_entry(t, struct uprobe, rb_node); > > if (u->inode != inode || u->offset < min) > > break; > > + u = try_get_uprobe(u); > > + if (!u) /* uprobe already went away, safe to ignore */ > > + continue; > > list_add(&u->pending_list, head); > > - get_uprobe(u); > > } > > for (t = n; (t = rb_next(t)); ) { > > u = rb_entry(t, struct uprobe, rb_node); > > if (u->inode != inode || u->offset > max) > > break; > > + u = try_get_uprobe(u); > > + if (!u) /* uprobe already went away, safe to ignore */ > > + continue; > > list_add(&u->pending_list, head); > > - get_uprobe(u); > > } > > } > > read_unlock(&uprobes_treelock); > > @@ -1752,6 +1766,12 @@ static int dup_utask(struct task_struct *t, struct uprobe_task *o_utask) > > return -ENOMEM; > > > > *n = *o; > > + /* > > + * uprobe's refcnt has to be positive at this point, kept by > > + * utask->return_instances items; return_instances can't be > > + * removed right now, as task is blocked due to duping; so > > + * get_uprobe() is safe to use here. 
> > + */ > > get_uprobe(n->uprobe); > > n->next = NULL; > > > > @@ -1894,7 +1914,10 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs) > > } > > orig_ret_vaddr = utask->return_instances->orig_ret_vaddr; > > } > > - > > + /* > > + * uprobe's refcnt is positive, held by caller, so it's safe to > > + * unconditionally bump it one more time here > > + */ > > ri->uprobe = get_uprobe(uprobe); > > ri->func = instruction_pointer(regs); > > ri->stack = user_stack_pointer(regs); > > -- > > 2.43.0 > >
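For reference, a minimal sketch of put_uprobe() as posted in this patch, with the kfree() that the review above found missing put back in. This is reconstructed from the hunks quoted in the thread, not the final code of the next revision (which may instead free via the RCU callback mentioned above):

/* Sketch only: put_uprobe() from this patch with the lost kfree() restored. */
static void put_uprobe(struct uprobe *uprobe)
{
	if (!refcount_dec_and_test(&uprobe->ref))
		return;

	write_lock(&uprobes_treelock);

	if (uprobe_is_active(uprobe))
		rb_erase(&uprobe->rb_node, &uprobes_tree);

	write_unlock(&uprobes_treelock);

	/*
	 * If application munmap(exec_vma) before uprobe_unregister()
	 * gets called, we don't get a chance to remove uprobe from
	 * delayed_uprobe_list from remove_breakpoint(). Do it here.
	 */
	mutex_lock(&delayed_uprobe_lock);
	delayed_uprobe_remove(uprobe, NULL);
	mutex_unlock(&delayed_uprobe_lock);

	kfree(uprobe);	/* the line the review found missing */
}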
On Wed, Jul 31, 2024 at 2:43 PM Andrii Nakryiko <andrii@kernel.org> wrote: > > Revamp how struct uprobe is refcounted, and thus how its lifetime is > managed. > > Right now, there are a few possible "owners" of uprobe refcount: > - uprobes_tree RB tree assumes one refcount when uprobe is registered > and added to the lookup tree; > - while uprobe is triggered and kernel is handling it in the breakpoint > handler code, temporary refcount bump is done to keep uprobe from > being freed; > - if we have uretprobe requested on a given struct uprobe instance, we > take another refcount to keep uprobe alive until user space code > returns from the function and triggers return handler. > > The uprobe_tree's extra refcount of 1 is confusing and problematic. No > matter how many actual consumers are attached, they all share the same > refcount, and we have an extra logic to drop the "last" (which might not > really be last) refcount once uprobe's consumer list becomes empty. > > This is unconventional and has to be kept in mind as a special case all > the time. Further, because of this design we have the situations where > find_uprobe() will find uprobe, bump refcount, return it to the caller, > but that uprobe will still need uprobe_is_active() check, after which > the caller is required to drop refcount and try again. This is just too > many details leaking to the higher level logic. > > This patch changes refcounting scheme in such a way as to not have > uprobes_tree keeping extra refcount for struct uprobe. Instead, each > uprobe_consumer is assuming its own refcount, which will be dropped > when consumer is unregistered. Other than that, all the active users of > uprobe (entry and return uprobe handling code) keeps exactly the same > refcounting approach. > > With the above setup, once uprobe's refcount drops to zero, we need to > make sure that uprobe's "destructor" removes uprobe from uprobes_tree, > of course. This, though, races with uprobe entry handling code in > handle_swbp(), which, through find_active_uprobe()->find_uprobe() lookup, > can race with uprobe being destroyed after refcount drops to zero (e.g., > due to uprobe_consumer unregistering). So we add try_get_uprobe(), which > will attempt to bump refcount, unless it already is zero. Caller needs > to guarantee that uprobe instance won't be freed in parallel, which is > the case while we keep uprobes_treelock (for read or write, doesn't > matter). > > Note also, we now don't leak the race between registration and > unregistration, so we remove the retry logic completely. If > find_uprobe() returns valid uprobe, it's guaranteed to remain in > uprobes_tree with properly incremented refcount. The race is handled > inside __insert_uprobe() and put_uprobe() working together: > __insert_uprobe() will remove uprobe from RB-tree, if it can't bump > refcount and will retry to insert the new uprobe instance. put_uprobe() > won't attempt to remove uprobe from RB-tree, if it's already not there. > All that is protected by uprobes_treelock, which keeps things simple. > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org> > --- > kernel/events/uprobes.c | 163 +++++++++++++++++++++++----------------- > 1 file changed, 93 insertions(+), 70 deletions(-) > [...] 
> @@ -1094,17 +1120,12 @@ void uprobe_unregister(struct uprobe *uprobe, struct uprobe_consumer *uc) > int err; > > down_write(&uprobe->register_rwsem); > - if (WARN_ON(!consumer_del(uprobe, uc))) > + if (WARN_ON(!consumer_del(uprobe, uc))) { > err = -ENOENT; > - else > + } else { > err = register_for_each_vma(uprobe, NULL); > - > - /* TODO : cant unregister? schedule a worker thread */ > - if (!err) { > - if (!uprobe->consumers) > - delete_uprobe(uprobe); > - else > - err = -EBUSY; > + /* TODO : cant unregister? schedule a worker thread */ > + WARN(err, "leaking uprobe due to failed unregistration"); Ok, so now that I added this very loud warning if register_for_each_vma(uprobe, NULL) returns error, it turns out it's not that unusual for this unregistration to fail. If I run my uprobe-stress for just a little bit, and then terminate it with ^C, I get this splat: [ 1980.854229] leaking uprobe due to failed unregistration [ 1980.854244] WARNING: CPU: 3 PID: 23013 at kernel/events/uprobes.c:1123 uprobe_unregister_nosync+0x68/0x80 [ 1980.855356] Modules linked in: aesni_intel(E) crypto_simd(E) cryptd(E) kvm_intel(E) kvm(E) floppy(E) i2c_piix4(E) i2c_] [ 1980.856746] CPU: 3 UID: 0 PID: 23013 Comm: exe Tainted: G W OE 6.11.0-rc1-00032-g308d1f294b79 #129 [ 1980.857407] Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 1980.857788] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04 [ 1980.858489] RIP: 0010:uprobe_unregister_nosync+0x68/0x80 [ 1980.858826] Code: 6e fb ff ff 4c 89 e7 89 c3 e8 24 e8 e3 ff 85 db 75 0c 5b 48 89 ef 5d 41 5c e9 84 e5 ff ff 48 c7 c7 d0 [ 1980.860052] RSP: 0018:ffffc90002fb7e58 EFLAGS: 00010296 [ 1980.860428] RAX: 000000000000002b RBX: 00000000fffffffc RCX: 0000000000000000 [ 1980.860913] RDX: 0000000000000002 RSI: 0000000000000027 RDI: 00000000ffffffff [ 1980.861379] RBP: ffff88811159ac00 R08: 00000000fffeffff R09: 0000000000000001 [ 1980.861871] R10: 0000000000000000 R11: ffffffff83299920 R12: ffff88811159ac20 [ 1980.862340] R13: ffff88810153c7a0 R14: ffff88810c3fe000 R15: 0000000000000000 [ 1980.862830] FS: 0000000000000000(0000) GS:ffff88881ca00000(0000) knlGS:0000000000000000 [ 1980.863370] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1980.863758] CR2: 00007fa08aea8276 CR3: 000000010f59c005 CR4: 0000000000370ef0 [ 1980.864239] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1980.864708] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1980.865202] Call Trace: [ 1980.865356] <TASK> [ 1980.865524] ? __warn+0x80/0x180 [ 1980.865745] ? uprobe_unregister_nosync+0x68/0x80 [ 1980.866074] ? report_bug+0x18d/0x1c0 [ 1980.866326] ? handle_bug+0x3a/0x70 [ 1980.866568] ? exc_invalid_op+0x13/0x60 [ 1980.866836] ? asm_exc_invalid_op+0x16/0x20 [ 1980.867098] ? uprobe_unregister_nosync+0x68/0x80 [ 1980.867390] ? uprobe_unregister_nosync+0x68/0x80 [ 1980.867726] bpf_uprobe_multi_link_release+0x31/0xd0 [ 1980.868044] bpf_link_free+0x54/0xd0 [ 1980.868267] bpf_link_release+0x17/0x20 [ 1980.868542] __fput+0x102/0x2e0 [ 1980.868760] task_work_run+0x55/0xa0 [ 1980.869027] syscall_exit_to_user_mode+0x1dd/0x1f0 [ 1980.869344] do_syscall_64+0x70/0x140 [ 1980.869603] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 1980.869923] RIP: 0033:0x7fa08aea82a0 [ 1980.870171] Code: Unable to access opcode bytes at 0x7fa08aea8276. 
[ 1980.870587] RSP: 002b:00007ffe838cd030 EFLAGS: 00000202 ORIG_RAX: 000000000000003b [ 1980.871098] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 1980.871563] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 1980.872055] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 1980.872526] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 1980.873044] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 1980.873568] </TASK> I traced it a little bit with retsnoop to figure out where this is coming from, and here we go: 14:53:18.897165 -> 14:53:18.897171 TID/PID 23013/23013 (exe/exe): FUNCTION CALLS RESULT DURATION --------------------------- -------- -------- → uprobe_write_opcode → get_user_pages_remote ↔ __get_user_pages [-EINTR] 1.382us ← get_user_pages_remote [-EINTR] 4.889us ← uprobe_write_opcode [-EINTR] 6.908us entry_SYSCALL_64_after_hwframe+0x76 (entry_SYSCALL_64 @ arch/x86/entry/entry_64.S:130) do_syscall_64+0x70 (arch/x86/entry/common.c:102) syscall_exit_to_user_mode+0x1dd (kernel/entry/common.c:218) . __syscall_exit_to_user_mode_work (kernel/entry/common.c:207) . exit_to_user_mode_prepare (include/linux/entry-common.h:328) . exit_to_user_mode_loop (kernel/entry/common.c:114) . resume_user_mode_work (include/linux/resume_user_mode.h:50) task_work_run+0x55 (kernel/task_work.c:222) __fput+0x102 (fs/file_table.c:422) bpf_link_release+0x17 (kernel/bpf/syscall.c:3116) bpf_link_free+0x54 (kernel/bpf/syscall.c:3067) bpf_uprobe_multi_link_release+0x31 (kernel/trace/bpf_trace.c:3198) . bpf_uprobe_unregister (kernel/trace/bpf_trace.c:3186) uprobe_unregister_nosync+0x42 (kernel/events/uprobes.c:1120) register_for_each_vma+0x427 (kernel/events/uprobes.c:1092) 6us [-EINTR] uprobe_write_opcode+0x79 (kernel/events/uprobes.c:478) . get_user_page_vma_remote (include/linux/mm.h:2489) 4us [-EINTR] get_user_pages_remote+0x109 (mm/gup.c:2627) . __get_user_pages_locked (mm/gup.c:1762) ! 1us [-EINTR] __get_user_pages So, we do uprobe_unregister -> register_for_each_vma -> remove_breakpoint -> set_orig_insn -> uprobe_write_opcode -> get_user_page_vma_remote -> get_user_pages_remote -> __get_user_pages_locked -> __get_user_pages and I think we then hit `if (fatal_signal_pending(current)) return -EINTR;` check. So, is there something smarter we can do in this case besides leaking an uprobe (and note, my changes don't change this behavior)? I can of course just drop the WARN given it's sort of expected now, but if we can handle that more gracefully it would be good. I don't think that should block optimization work, but just something to keep in mind and maybe fix as a follow up. > } > up_write(&uprobe->register_rwsem); > [...]
On 08/01, Andrii Nakryiko wrote:
>
> > + /* TODO : cant unregister? schedule a worker thread */
> > + WARN(err, "leaking uprobe due to failed unregistration");
> Ok, so now that I added this very loud warning if
> register_for_each_vma(uprobe, NULL) returns error, it turns out it's
> not that unusual for this unregistration to fail.

...

> So, is there something smarter we can do in this case besides leaking
> an uprobe (and note, my changes don't change this behavior)?

Something like schedule_work() which retries register_for_each_vma()...

> I can of course just drop the WARN given it's sort of expected now,

Or least replace it with pr_warn() or uprobe_warn(), WARN() certainly
makes no sense imo...

> I don't
> think that should block optimization work, but just something to keep
> in mind and maybe fix as a follow up.

Agreed, lets do this separately.

Oleg.
On Wed, Jul 31, 2024 at 02:42:50PM -0700, Andrii Nakryiko wrote:

SNIP

> -/*
> - * There could be threads that have already hit the breakpoint. They
> - * will recheck the current insn and restart if find_uprobe() fails.
> - * See find_active_uprobe().
> - */
> -static void delete_uprobe(struct uprobe *uprobe)
> -{
> -	if (WARN_ON(!uprobe_is_active(uprobe)))
> -		return;
> -
> -	write_lock(&uprobes_treelock);
> -	rb_erase(&uprobe->rb_node, &uprobes_tree);
> -	write_unlock(&uprobes_treelock);
> -	RB_CLEAR_NODE(&uprobe->rb_node); /* for uprobe_is_active() */
> -}
> -
>  struct map_info {
>  	struct map_info *next;
>  	struct mm_struct *mm;
> @@ -1094,17 +1120,12 @@ void uprobe_unregister(struct uprobe *uprobe, struct uprobe_consumer *uc)
>  	int err;
>
>  	down_write(&uprobe->register_rwsem);
> -	if (WARN_ON(!consumer_del(uprobe, uc)))
> +	if (WARN_ON(!consumer_del(uprobe, uc))) {
>  		err = -ENOENT;
> -	else
> +	} else {
>  		err = register_for_each_vma(uprobe, NULL);
> -
> -	/* TODO : cant unregister? schedule a worker thread */
> -	if (!err) {
> -		if (!uprobe->consumers)
> -			delete_uprobe(uprobe);

ok, so removing this call is why the consumer test is failing, right?

IIUC the previous behaviour was to remove uprobe from the tree
even when there's active uprobe ref for installed uretprobe

so following scenario will now behaves differently:

  install uretprobe/consumer-1 on foo
  foo {
    remove uretprobe/consumer-1            (A)
    install uretprobe/consumer-2 on foo    (B)
  }

before the removal of consumer-1 (A) would remove the uprobe object
from the tree, so the installation of consumer-2 (b) would create
new uprobe object which would not be triggered at foo return because
it got installed too late (after foo uprobe was triggered)

the behaviour with this patch is that removal of consumer-1 (A) will
not remove the uprobe object (that happens only when we run out of
refs), and the following install of consumer-2 will use the existing
uprobe object so the consumer-2 will be triggered on foo return

uff ;-) but I think it's better, because we get more hits

jirka

> -		else
> -			err = -EBUSY;
> +		/* TODO : cant unregister? schedule a worker thread */
> +		WARN(err, "leaking uprobe due to failed unregistration");
>  	}
>  	up_write(&uprobe->register_rwsem);
>
> @@ -1159,27 +1180,16 @@ struct uprobe *uprobe_register(struct inode *inode,
>  	if (!IS_ALIGNED(ref_ctr_offset, sizeof(short)))
>  		return ERR_PTR(-EINVAL);
>
> - retry:
>  	uprobe = alloc_uprobe(inode, offset, ref_ctr_offset);
>  	if (IS_ERR(uprobe))
>  		return uprobe;
>
> -	/*
> -	 * We can race with uprobe_unregister()->delete_uprobe().
> -	 * Check uprobe_is_active() and retry if it is false.
> -	 */
>  	down_write(&uprobe->register_rwsem);
> -	ret = -EAGAIN;
> -	if (likely(uprobe_is_active(uprobe))) {
> -		consumer_add(uprobe, uc);
> -		ret = register_for_each_vma(uprobe, uc);
> -	}
> +	consumer_add(uprobe, uc);
> +	ret = register_for_each_vma(uprobe, uc);
>  	up_write(&uprobe->register_rwsem);
> -	put_uprobe(uprobe);
>
>  	if (ret) {
> -		if (unlikely(ret == -EAGAIN))
> -			goto retry;
>  		uprobe_unregister(uprobe, uc);
>  		return ERR_PTR(ret);

SNIP
On Fri, Aug 2, 2024 at 1:50 AM Oleg Nesterov <oleg@redhat.com> wrote:
>
> On 08/01, Andrii Nakryiko wrote:
> >
> > > + /* TODO : cant unregister? schedule a worker thread */
> > > + WARN(err, "leaking uprobe due to failed unregistration");
>
> > Ok, so now that I added this very loud warning if
> > register_for_each_vma(uprobe, NULL) returns error, it turns out it's
> > not that unusual for this unregistration to fail.
>
> ...
>
> > So, is there something smarter we can do in this case besides leaking
> > an uprobe (and note, my changes don't change this behavior)?
>
> Something like schedule_work() which retries register_for_each_vma()...

And if that fails again, what do we do? Because I don't think we even
need schedule_work(), we can just keep some list of "pending to be
retried" items and check them after each
uprobe_register()/uprobe_unregister() call. I'm just not clear how we
should handle stubborn cases (but honestly I haven't even tried to
understand all the details about this just yet).

> > I can of course just drop the WARN given it's sort of expected now,
>
> Or least replace it with pr_warn() or uprobe_warn(), WARN() certainly
> makes no sense imo...

ok, makes sense, will change to uprobe_warn()

> > I don't
> > think that should block optimization work, but just something to keep
> > in mind and maybe fix as a follow up.
>
> Agreed, lets do this separately.

yep

> Oleg.
>
On Fri, Aug 2, 2024 at 4:11 AM Jiri Olsa <olsajiri@gmail.com> wrote: > > On Wed, Jul 31, 2024 at 02:42:50PM -0700, Andrii Nakryiko wrote: > > SNIP > > > -/* > > - * There could be threads that have already hit the breakpoint. They > > - * will recheck the current insn and restart if find_uprobe() fails. > > - * See find_active_uprobe(). > > - */ > > -static void delete_uprobe(struct uprobe *uprobe) > > -{ > > - if (WARN_ON(!uprobe_is_active(uprobe))) > > - return; > > - > > - write_lock(&uprobes_treelock); > > - rb_erase(&uprobe->rb_node, &uprobes_tree); > > - write_unlock(&uprobes_treelock); > > - RB_CLEAR_NODE(&uprobe->rb_node); /* for uprobe_is_active() */ > > -} > > - > > struct map_info { > > struct map_info *next; > > struct mm_struct *mm; > > @@ -1094,17 +1120,12 @@ void uprobe_unregister(struct uprobe *uprobe, struct uprobe_consumer *uc) > > int err; > > > > down_write(&uprobe->register_rwsem); > > - if (WARN_ON(!consumer_del(uprobe, uc))) > > + if (WARN_ON(!consumer_del(uprobe, uc))) { > > err = -ENOENT; > > - else > > + } else { > > err = register_for_each_vma(uprobe, NULL); > > - > > - /* TODO : cant unregister? schedule a worker thread */ > > - if (!err) { > > - if (!uprobe->consumers) > > - delete_uprobe(uprobe); > > ok, so removing this call is why the consumer test is failing, right? > > IIUC the previous behaviour was to remove uprobe from the tree > even when there's active uprobe ref for installed uretprobe > > so following scenario will now behaves differently: > > install uretprobe/consumer-1 on foo > foo { > remove uretprobe/consumer-1 (A) > install uretprobe/consumer-2 on foo (B) > } > > before the removal of consumer-1 (A) would remove the uprobe object > from the tree, so the installation of consumer-2 (b) would create > new uprobe object which would not be triggered at foo return because > it got installed too late (after foo uprobe was triggered) > > the behaviour with this patch is that removal of consumer-1 (A) will > not remove the uprobe object (that happens only when we run out of > refs), and the following install of consumer-2 will use the existing > uprobe object so the consumer-2 will be triggered on foo return > > uff ;-) yep, something like that > > but I think it's better, because we get more hits note, with the next patch set that makes uretprobes SRCU protected (but with timeout) the behavior becomes a bit time-sensitive :) so I think we'll have to change your selftest to first attach all the new uretprobes, then detach all the uretprobes that had to be detached, and then do a bit more relaxed logic of the sort "if there were some uretprobes before and after, then we *might* get uretprobe triggered (but we might not as well, unless the same uretprobe stayed attached at all times)". Anyways, something to take care of in the bpf-next tree separately. All this is very much an implementation detail, so I think we can change these aspects freely. > > jirka > > > - else > > - err = -EBUSY; > > + /* TODO : cant unregister? schedule a worker thread */ > > + WARN(err, "leaking uprobe due to failed unregistration"); > > } > > up_write(&uprobe->register_rwsem); > > > > @@ -1159,27 +1180,16 @@ struct uprobe *uprobe_register(struct inode *inode, > > if (!IS_ALIGNED(ref_ctr_offset, sizeof(short))) > > return ERR_PTR(-EINVAL); > > > > - retry: > > uprobe = alloc_uprobe(inode, offset, ref_ctr_offset); > > if (IS_ERR(uprobe)) > > return uprobe; > > > > - /* > > - * We can race with uprobe_unregister()->delete_uprobe(). 
> > - * Check uprobe_is_active() and retry if it is false. > > - */ > > down_write(&uprobe->register_rwsem); > > - ret = -EAGAIN; > > - if (likely(uprobe_is_active(uprobe))) { > > - consumer_add(uprobe, uc); > > - ret = register_for_each_vma(uprobe, uc); > > - } > > + consumer_add(uprobe, uc); > > + ret = register_for_each_vma(uprobe, uc); > > up_write(&uprobe->register_rwsem); > > - put_uprobe(uprobe); > > > > if (ret) { > > - if (unlikely(ret == -EAGAIN)) > > - goto retry; > > uprobe_unregister(uprobe, uc); > > return ERR_PTR(ret); > > SNIP
On 08/02, Andrii Nakryiko wrote:
>
> On Fri, Aug 2, 2024 at 1:50 AM Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > On 08/01, Andrii Nakryiko wrote:
> > >
> > > > + /* TODO : cant unregister? schedule a worker thread */
> > > > + WARN(err, "leaking uprobe due to failed unregistration");
> >
> > > Ok, so now that I added this very loud warning if
> > > register_for_each_vma(uprobe, NULL) returns error, it turns out it's
> > > not that unusual for this unregistration to fail.
> >
> > ...
> >
> > > So, is there something smarter we can do in this case besides leaking
> > > an uprobe (and note, my changes don't change this behavior)?
> >
> > Something like schedule_work() which retries register_for_each_vma()...
>
> And if that fails again, what do we do?

try again after some timeout ;)

> Because I don't think we even
> need schedule_work(), we can just keep some list of "pending to be
> retried" items and check them after each
> uprobe_register()/uprobe_unregister() call.

Agreed, we need a list of "pending to be retried", but rightly or not
I think this should be done from work_struct->func.

Lets discuss this later? We seem to agree this is a known problem, and
this has nothing to do with your optimizations.

> I'm just not clear how we
> should handle stubborn cases (but honestly I haven't even tried to
> understand all the details about this just yet).

Same here.

Oleg.
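To make the deferred-retry idea being discussed above concrete, here is a rough sketch of what a "pending to be retried" list driven from a work item could look like, written as if it lived in kernel/events/uprobes.c. This is not code from this series: the uprobe_retry_* names are hypothetical, reusing uprobe->pending_list for queueing is an assumption, and only register_for_each_vma(), put_uprobe() and register_rwsem are taken from the patch; the idea is that a failed uprobe_unregister() would hand its reference off to this machinery instead of leaking it.

/* Hypothetical sketch of retrying failed unregistration from a worker. */
static LIST_HEAD(uprobe_retry_list);
static DEFINE_SPINLOCK(uprobe_retry_lock);
static void uprobe_retry_func(struct work_struct *work);
static DECLARE_DELAYED_WORK(uprobe_retry_work, uprobe_retry_func);

static void uprobe_defer_unregister(struct uprobe *uprobe)
{
	/* caller transfers its reference to the retry list */
	spin_lock(&uprobe_retry_lock);
	list_add_tail(&uprobe->pending_list, &uprobe_retry_list);
	spin_unlock(&uprobe_retry_lock);
	schedule_delayed_work(&uprobe_retry_work, HZ);
}

static void uprobe_retry_func(struct work_struct *work)
{
	struct uprobe *uprobe, *tmp;
	LIST_HEAD(retry);

	spin_lock(&uprobe_retry_lock);
	list_splice_init(&uprobe_retry_list, &retry);
	spin_unlock(&uprobe_retry_lock);

	list_for_each_entry_safe(uprobe, tmp, &retry, pending_list) {
		int err;

		list_del_init(&uprobe->pending_list);

		down_write(&uprobe->register_rwsem);
		err = register_for_each_vma(uprobe, NULL);
		up_write(&uprobe->register_rwsem);

		if (err)
			uprobe_defer_unregister(uprobe);	/* still failing, try again later */
		else
			put_uprobe(uprobe);			/* done, drop the handed-off ref */
	}
}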
On 07/31, Andrii Nakryiko wrote:
>
> @@ -732,11 +776,13 @@ static struct uprobe *alloc_uprobe(struct inode *inode, loff_t offset,
>  	uprobe->ref_ctr_offset = ref_ctr_offset;
>  	init_rwsem(&uprobe->register_rwsem);
>  	init_rwsem(&uprobe->consumer_rwsem);
> +	RB_CLEAR_NODE(&uprobe->rb_node);

I guess RB_CLEAR_NODE() is not necessary?

> @@ -1286,15 +1296,19 @@ static void build_probe_list(struct inode *inode,
>  		u = rb_entry(t, struct uprobe, rb_node);
>  		if (u->inode != inode || u->offset < min)
>  			break;
> +		u = try_get_uprobe(u);
> +		if (!u) /* uprobe already went away, safe to ignore */
> +			continue;
>  		list_add(&u->pending_list, head);

cosmetic nit, feel to ignore, but to me

	if (try_get_uprobe(u))
		list_add(&u->pending_list, head);

looks more readable.

Other than the lack of kfree() in put_uprobe() and WARN() in _unregister()
the patch looks good to me.

Oleg.
On Mon, Aug 5, 2024 at 6:44 AM Oleg Nesterov <oleg@redhat.com> wrote:
>
> On 07/31, Andrii Nakryiko wrote:
> >
> > @@ -732,11 +776,13 @@ static struct uprobe *alloc_uprobe(struct inode *inode, loff_t offset,
> >  	uprobe->ref_ctr_offset = ref_ctr_offset;
> >  	init_rwsem(&uprobe->register_rwsem);
> >  	init_rwsem(&uprobe->consumer_rwsem);
> > +	RB_CLEAR_NODE(&uprobe->rb_node);
>
> I guess RB_CLEAR_NODE() is not necessary?

I definitely needed that with my batch API changes, but it might be
that I don't need it anymore. But I'm a bit hesitant to remove it,
because if we ever get put_uprobe() on an uprobe that hasn't been
inserted into RB-tree yet, this will cause a hard to understand crash.

RB_CLEAR_NODE() in __insert_uprobe() is critical to have, this one is
kind of optional (but still feels right to initialize the field
properly). Let me know if you feel strongly about this, though.

> > @@ -1286,15 +1296,19 @@ static void build_probe_list(struct inode *inode,
> >  		u = rb_entry(t, struct uprobe, rb_node);
> >  		if (u->inode != inode || u->offset < min)
> >  			break;
> > +		u = try_get_uprobe(u);
> > +		if (!u) /* uprobe already went away, safe to ignore */
> > +			continue;
> >  		list_add(&u->pending_list, head);
>
> cosmetic nit, feel to ignore, but to me
>
> 	if (try_get_uprobe(u))
> 		list_add(&u->pending_list, head);
>
> looks more readable.

It's not my code base to enforce my preferences, but I'll at least
explain why I disagree. To me, something like `if (some condition)
<break/continue>;` is a very clear indication that this item (or even
the rest of items in case of break) won't be processed anymore.

While

  if (some inverted condition)
      <do some something useful>

  <might be some more code>
  ...

is a pattern that requires double-checking that we really are not
going to use that item later on.

So I'll invert this just to not be PITA, but I disagree :)

> Other than the lack of kfree() in put_uprobe() and WARN() in _unregister()
> the patch looks good to me.

yep, fixed that locally already. Thanks for the review!

> Oleg.
>
On 08/05, Andrii Nakryiko wrote:
>
> On Mon, Aug 5, 2024 at 6:44 AM Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > On 07/31, Andrii Nakryiko wrote:
> > >
> > > @@ -732,11 +776,13 @@ static struct uprobe *alloc_uprobe(struct inode *inode, loff_t offset,
> > >  	uprobe->ref_ctr_offset = ref_ctr_offset;
> > >  	init_rwsem(&uprobe->register_rwsem);
> > >  	init_rwsem(&uprobe->consumer_rwsem);
> > > +	RB_CLEAR_NODE(&uprobe->rb_node);
> >
> > I guess RB_CLEAR_NODE() is not necessary?
>
> I definitely needed that with my batch API changes, but it might be
> that I don't need it anymore. But I'm a bit hesitant to remove it,

OK, lets keep it, it doesn't hurt. Just it wasn't clear to me why did
you add this initialization in this patch.

> > > @@ -1286,15 +1296,19 @@ static void build_probe_list(struct inode *inode,
> > >  		u = rb_entry(t, struct uprobe, rb_node);
> > >  		if (u->inode != inode || u->offset < min)
> > >  			break;
> > > +		u = try_get_uprobe(u);
> > > +		if (!u) /* uprobe already went away, safe to ignore */
> > > +			continue;
> > >  		list_add(&u->pending_list, head);
> >
> > cosmetic nit, feel to ignore, but to me
> >
> > 	if (try_get_uprobe(u))
> > 		list_add(&u->pending_list, head);
> >
> > looks more readable.
>
> It's not my code base to enforce my preferences, but I'll at least
> explain why I disagree. To me, something like `if (some condition)
> <break/continue>;` is a very clear indication that this item (or even
> the rest of items in case of break) won't be processed anymore.
>
> While
>
>   if (some inverted condition)
>       <do some something useful>
>
>   <might be some more code>

OK, I won't insist. To me the most confusing part is

	u = try_get_uprobe(u);
	if (!u)
		...

If you read this code for the 1st time (or you are trying to recall it
after 10 years ;) it looks as if try_get_uprobe() can return another
uprobe.

> So I'll invert this just to not be PITA, but I disagree :)

If you disagree, then don't change it ;)

Oleg.
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index f88b7ff20587..23dde3ec5b09 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -587,25 +587,51 @@ set_orig_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long v *(uprobe_opcode_t *)&auprobe->insn); } +/* uprobe should have guaranteed positive refcount */ static struct uprobe *get_uprobe(struct uprobe *uprobe) { refcount_inc(&uprobe->ref); return uprobe; } +/* + * uprobe should have guaranteed lifetime, which can be either of: + * - caller already has refcount taken (and wants an extra one); + * - uprobe is RCU protected and won't be freed until after grace period; + * - we are holding uprobes_treelock (for read or write, doesn't matter). + */ +static struct uprobe *try_get_uprobe(struct uprobe *uprobe) +{ + if (refcount_inc_not_zero(&uprobe->ref)) + return uprobe; + return NULL; +} + +static inline bool uprobe_is_active(struct uprobe *uprobe) +{ + return !RB_EMPTY_NODE(&uprobe->rb_node); +} + static void put_uprobe(struct uprobe *uprobe) { - if (refcount_dec_and_test(&uprobe->ref)) { - /* - * If application munmap(exec_vma) before uprobe_unregister() - * gets called, we don't get a chance to remove uprobe from - * delayed_uprobe_list from remove_breakpoint(). Do it here. - */ - mutex_lock(&delayed_uprobe_lock); - delayed_uprobe_remove(uprobe, NULL); - mutex_unlock(&delayed_uprobe_lock); - kfree(uprobe); - } + if (!refcount_dec_and_test(&uprobe->ref)) + return; + + write_lock(&uprobes_treelock); + + if (uprobe_is_active(uprobe)) + rb_erase(&uprobe->rb_node, &uprobes_tree); + + write_unlock(&uprobes_treelock); + + /* + * If application munmap(exec_vma) before uprobe_unregister() + * gets called, we don't get a chance to remove uprobe from + * delayed_uprobe_list from remove_breakpoint(). Do it here. + */ + mutex_lock(&delayed_uprobe_lock); + delayed_uprobe_remove(uprobe, NULL); + mutex_unlock(&delayed_uprobe_lock); } static __always_inline @@ -656,7 +682,7 @@ static struct uprobe *__find_uprobe(struct inode *inode, loff_t offset) struct rb_node *node = rb_find(&key, &uprobes_tree, __uprobe_cmp_key); if (node) - return get_uprobe(__node_2_uprobe(node)); + return try_get_uprobe(__node_2_uprobe(node)); return NULL; } @@ -676,26 +702,44 @@ static struct uprobe *find_uprobe(struct inode *inode, loff_t offset) return uprobe; } +/* + * Attempt to insert a new uprobe into uprobes_tree. + * + * If uprobe already exists (for given inode+offset), we just increment + * refcount of previously existing uprobe. + * + * If not, a provided new instance of uprobe is inserted into the tree (with + * assumed initial refcount == 1). + * + * In any case, we return a uprobe instance that ends up being in uprobes_tree. + * Caller has to clean up new uprobe instance, if it ended up not being + * inserted into the tree. + * + * We assume that uprobes_treelock is held for writing. + */ static struct uprobe *__insert_uprobe(struct uprobe *uprobe) { struct rb_node *node; - +again: node = rb_find_add(&uprobe->rb_node, &uprobes_tree, __uprobe_cmp); - if (node) - return get_uprobe(__node_2_uprobe(node)); + if (node) { + struct uprobe *u = __node_2_uprobe(node); - /* get access + creation ref */ - refcount_set(&uprobe->ref, 2); - return NULL; + if (!try_get_uprobe(u)) { + rb_erase(node, &uprobes_tree); + RB_CLEAR_NODE(&u->rb_node); + goto again; + } + + return u; + } + + return uprobe; } /* - * Acquire uprobes_treelock. 
- * Matching uprobe already exists in rbtree; - * increment (access refcount) and return the matching uprobe. - * - * No matching uprobe; insert the uprobe in rb_tree; - * get a double refcount (access + creation) and return NULL. + * Acquire uprobes_treelock and insert uprobe into uprobes_tree + * (or reuse existing one, see __insert_uprobe() comments above). */ static struct uprobe *insert_uprobe(struct uprobe *uprobe) { @@ -732,11 +776,13 @@ static struct uprobe *alloc_uprobe(struct inode *inode, loff_t offset, uprobe->ref_ctr_offset = ref_ctr_offset; init_rwsem(&uprobe->register_rwsem); init_rwsem(&uprobe->consumer_rwsem); + RB_CLEAR_NODE(&uprobe->rb_node); + refcount_set(&uprobe->ref, 1); /* add to uprobes_tree, sorted on inode:offset */ cur_uprobe = insert_uprobe(uprobe); /* a uprobe exists for this inode:offset combination */ - if (cur_uprobe) { + if (cur_uprobe != uprobe) { if (cur_uprobe->ref_ctr_offset != uprobe->ref_ctr_offset) { ref_ctr_mismatch_warn(cur_uprobe, uprobe); put_uprobe(cur_uprobe); @@ -921,26 +967,6 @@ remove_breakpoint(struct uprobe *uprobe, struct mm_struct *mm, unsigned long vad return set_orig_insn(&uprobe->arch, mm, vaddr); } -static inline bool uprobe_is_active(struct uprobe *uprobe) -{ - return !RB_EMPTY_NODE(&uprobe->rb_node); -} -/* - * There could be threads that have already hit the breakpoint. They - * will recheck the current insn and restart if find_uprobe() fails. - * See find_active_uprobe(). - */ -static void delete_uprobe(struct uprobe *uprobe) -{ - if (WARN_ON(!uprobe_is_active(uprobe))) - return; - - write_lock(&uprobes_treelock); - rb_erase(&uprobe->rb_node, &uprobes_tree); - write_unlock(&uprobes_treelock); - RB_CLEAR_NODE(&uprobe->rb_node); /* for uprobe_is_active() */ -} - struct map_info { struct map_info *next; struct mm_struct *mm; @@ -1094,17 +1120,12 @@ void uprobe_unregister(struct uprobe *uprobe, struct uprobe_consumer *uc) int err; down_write(&uprobe->register_rwsem); - if (WARN_ON(!consumer_del(uprobe, uc))) + if (WARN_ON(!consumer_del(uprobe, uc))) { err = -ENOENT; - else + } else { err = register_for_each_vma(uprobe, NULL); - - /* TODO : cant unregister? schedule a worker thread */ - if (!err) { - if (!uprobe->consumers) - delete_uprobe(uprobe); - else - err = -EBUSY; + /* TODO : cant unregister? schedule a worker thread */ + WARN(err, "leaking uprobe due to failed unregistration"); } up_write(&uprobe->register_rwsem); @@ -1159,27 +1180,16 @@ struct uprobe *uprobe_register(struct inode *inode, if (!IS_ALIGNED(ref_ctr_offset, sizeof(short))) return ERR_PTR(-EINVAL); - retry: uprobe = alloc_uprobe(inode, offset, ref_ctr_offset); if (IS_ERR(uprobe)) return uprobe; - /* - * We can race with uprobe_unregister()->delete_uprobe(). - * Check uprobe_is_active() and retry if it is false. 
- */ down_write(&uprobe->register_rwsem); - ret = -EAGAIN; - if (likely(uprobe_is_active(uprobe))) { - consumer_add(uprobe, uc); - ret = register_for_each_vma(uprobe, uc); - } + consumer_add(uprobe, uc); + ret = register_for_each_vma(uprobe, uc); up_write(&uprobe->register_rwsem); - put_uprobe(uprobe); if (ret) { - if (unlikely(ret == -EAGAIN)) - goto retry; uprobe_unregister(uprobe, uc); return ERR_PTR(ret); } @@ -1286,15 +1296,19 @@ static void build_probe_list(struct inode *inode, u = rb_entry(t, struct uprobe, rb_node); if (u->inode != inode || u->offset < min) break; + u = try_get_uprobe(u); + if (!u) /* uprobe already went away, safe to ignore */ + continue; list_add(&u->pending_list, head); - get_uprobe(u); } for (t = n; (t = rb_next(t)); ) { u = rb_entry(t, struct uprobe, rb_node); if (u->inode != inode || u->offset > max) break; + u = try_get_uprobe(u); + if (!u) /* uprobe already went away, safe to ignore */ + continue; list_add(&u->pending_list, head); - get_uprobe(u); } } read_unlock(&uprobes_treelock); @@ -1752,6 +1766,12 @@ static int dup_utask(struct task_struct *t, struct uprobe_task *o_utask) return -ENOMEM; *n = *o; + /* + * uprobe's refcnt has to be positive at this point, kept by + * utask->return_instances items; return_instances can't be + * removed right now, as task is blocked due to duping; so + * get_uprobe() is safe to use here. + */ get_uprobe(n->uprobe); n->next = NULL; @@ -1894,7 +1914,10 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs) } orig_ret_vaddr = utask->return_instances->orig_ret_vaddr; } - + /* + * uprobe's refcnt is positive, held by caller, so it's safe to + * unconditionally bump it one more time here + */ ri->uprobe = get_uprobe(uprobe); ri->func = instruction_pointer(regs); ri->stack = user_stack_pointer(regs);
Revamp how struct uprobe is refcounted, and thus how its lifetime is
managed.

Right now, there are a few possible "owners" of uprobe refcount:

- uprobes_tree RB tree assumes one refcount when uprobe is registered
  and added to the lookup tree;
- while uprobe is triggered and kernel is handling it in the breakpoint
  handler code, temporary refcount bump is done to keep uprobe from
  being freed;
- if we have uretprobe requested on a given struct uprobe instance, we
  take another refcount to keep uprobe alive until user space code
  returns from the function and triggers return handler.

The uprobe_tree's extra refcount of 1 is confusing and problematic. No
matter how many actual consumers are attached, they all share the same
refcount, and we have an extra logic to drop the "last" (which might not
really be last) refcount once uprobe's consumer list becomes empty.

This is unconventional and has to be kept in mind as a special case all
the time. Further, because of this design we have the situations where
find_uprobe() will find uprobe, bump refcount, return it to the caller,
but that uprobe will still need uprobe_is_active() check, after which
the caller is required to drop refcount and try again. This is just too
many details leaking to the higher level logic.

This patch changes refcounting scheme in such a way as to not have
uprobes_tree keeping extra refcount for struct uprobe. Instead, each
uprobe_consumer is assuming its own refcount, which will be dropped
when consumer is unregistered. Other than that, all the active users of
uprobe (entry and return uprobe handling code) keeps exactly the same
refcounting approach.

With the above setup, once uprobe's refcount drops to zero, we need to
make sure that uprobe's "destructor" removes uprobe from uprobes_tree,
of course. This, though, races with uprobe entry handling code in
handle_swbp(), which, through find_active_uprobe()->find_uprobe() lookup,
can race with uprobe being destroyed after refcount drops to zero (e.g.,
due to uprobe_consumer unregistering). So we add try_get_uprobe(), which
will attempt to bump refcount, unless it already is zero. Caller needs
to guarantee that uprobe instance won't be freed in parallel, which is
the case while we keep uprobes_treelock (for read or write, doesn't
matter).

Note also, we now don't leak the race between registration and
unregistration, so we remove the retry logic completely. If
find_uprobe() returns valid uprobe, it's guaranteed to remain in
uprobes_tree with properly incremented refcount. The race is handled
inside __insert_uprobe() and put_uprobe() working together:
__insert_uprobe() will remove uprobe from RB-tree, if it can't bump
refcount and will retry to insert the new uprobe instance. put_uprobe()
won't attempt to remove uprobe from RB-tree, if it's already not there.
All that is protected by uprobes_treelock, which keeps things simple.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/events/uprobes.c | 163 +++++++++++++++++++++++-----------------
 1 file changed, 93 insertions(+), 70 deletions(-)
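To illustrate the consumer-owned refcount model described in the commit message, a hypothetical in-kernel user would pair register/unregister per consumer roughly as below. The uprobe_register()/uprobe_unregister() signatures are assumed from this series (register returns the refcounted uprobe and takes the consumer), and my_handler/my_consumer/my_attach/my_detach are placeholder names, not anything from the patch:

/* Hypothetical consumer wiring under the new scheme: each consumer owns
 * one reference to the uprobe, taken by uprobe_register() and dropped by
 * uprobe_unregister(); there is no shared "tree" refcount to reason about. */
static int my_handler(struct uprobe_consumer *uc, struct pt_regs *regs)
{
	return 0;	/* 0 == keep the probe installed */
}

static struct uprobe_consumer my_consumer = {
	.handler = my_handler,
};

static struct uprobe *my_uprobe;

static int my_attach(struct inode *inode, loff_t offset)
{
	/* signature assumed from this series: returns the refcounted uprobe */
	my_uprobe = uprobe_register(inode, offset, 0 /* no ref_ctr */, &my_consumer);
	if (IS_ERR(my_uprobe))
		return PTR_ERR(my_uprobe);
	return 0;
}

static void my_detach(void)
{
	/* drops this consumer's reference; the uprobe goes away once the
	 * last consumer and any in-flight handlers are done with it */
	uprobe_unregister(my_uprobe, &my_consumer);
}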