From patchwork Tue Aug 13 04:29:12 2024
X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13761250
From: Andrii Nakryiko
To: linux-trace-kernel@vger.kernel.org, peterz@infradead.org, oleg@redhat.com
Cc: rostedt@goodmis.org, mhiramat@kernel.org, bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, jolsa@kernel.org, paulmck@kernel.org,
 willy@infradead.org, surenb@google.com, akpm@linux-foundation.org,
 linux-mm@kvack.org, Andrii Nakryiko
Subject: [PATCH v3 08/13] uprobes: switch to RCU Tasks Trace flavor for better performance
Date: Mon, 12 Aug 2024 21:29:12 -0700
Message-ID: <20240813042917.506057-9-andrii@kernel.org>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20240813042917.506057-1-andrii@kernel.org>
References: <20240813042917.506057-1-andrii@kernel.org>

This patch switches uprobes SRCU usage to RCU Tasks Trace flavor, which
is optimized for more lightweight and quick readers (at the expense of
slower writers, which for uprobes is a fine tradeoff) and has better
performance and scalability with number of CPUs.

Similarly to baseline vs SRCU, we've benchmarked SRCU-based
implementation vs RCU Tasks Trace implementation.

SRCU
====
uprobe-nop      ( 1 cpus):    3.276 ± 0.005M/s  (  3.276M/s/cpu)
uprobe-nop      ( 2 cpus):    4.125 ± 0.002M/s  (  2.063M/s/cpu)
uprobe-nop      ( 4 cpus):    7.713 ± 0.002M/s  (  1.928M/s/cpu)
uprobe-nop      ( 8 cpus):    8.097 ± 0.006M/s  (  1.012M/s/cpu)
uprobe-nop      (16 cpus):    6.501 ± 0.056M/s  (  0.406M/s/cpu)
uprobe-nop      (32 cpus):    4.398 ± 0.084M/s  (  0.137M/s/cpu)
uprobe-nop      (64 cpus):    6.452 ± 0.000M/s  (  0.101M/s/cpu)

uretprobe-nop   ( 1 cpus):    2.055 ± 0.001M/s  (  2.055M/s/cpu)
uretprobe-nop   ( 2 cpus):    2.677 ± 0.000M/s  (  1.339M/s/cpu)
uretprobe-nop   ( 4 cpus):    4.561 ± 0.003M/s  (  1.140M/s/cpu)
uretprobe-nop   ( 8 cpus):    5.291 ± 0.002M/s  (  0.661M/s/cpu)
uretprobe-nop   (16 cpus):    5.065 ± 0.019M/s  (  0.317M/s/cpu)
uretprobe-nop   (32 cpus):    3.622 ± 0.003M/s  (  0.113M/s/cpu)
uretprobe-nop   (64 cpus):    3.723 ± 0.002M/s  (  0.058M/s/cpu)

RCU Tasks Trace
===============
uprobe-nop      ( 1 cpus):    3.396 ± 0.002M/s  (  3.396M/s/cpu)
uprobe-nop      ( 2 cpus):    4.271 ± 0.006M/s  (  2.135M/s/cpu)
uprobe-nop      ( 4 cpus):    8.499 ± 0.015M/s  (  2.125M/s/cpu)
uprobe-nop      ( 8 cpus):   10.355 ± 0.028M/s  (  1.294M/s/cpu)
uprobe-nop      (16 cpus):    7.615 ± 0.099M/s  (  0.476M/s/cpu)
uprobe-nop      (32 cpus):    4.430 ± 0.007M/s  (  0.138M/s/cpu)
uprobe-nop      (64 cpus):    6.887 ± 0.020M/s  (  0.108M/s/cpu)

uretprobe-nop   ( 1 cpus):    2.174 ± 0.001M/s  (  2.174M/s/cpu)
uretprobe-nop   ( 2 cpus):    2.853 ± 0.001M/s  (  1.426M/s/cpu)
uretprobe-nop   ( 4 cpus):    4.913 ± 0.002M/s  (  1.228M/s/cpu)
uretprobe-nop   ( 8 cpus):    5.883 ± 0.002M/s  (  0.735M/s/cpu)
uretprobe-nop   (16 cpus):    5.147 ± 0.001M/s  (  0.322M/s/cpu)
uretprobe-nop   (32 cpus):    3.738 ± 0.008M/s  (  0.117M/s/cpu)
uretprobe-nop   (64 cpus):    4.397 ± 0.002M/s  (  0.069M/s/cpu)

Peak throughput for uprobes increases from 8 mln/s to 10.3 mln/s
(+28%!), and for uretprobes from 5.3 mln/s to 5.8 mln/s (+11%), as we
have more work to do on uretprobes side.

Even single-thread (no contention) performance is slightly better: 3.276
mln/s to 3.396 mln/s (+3.7%) for uprobes, and 2.055 mln/s to 2.174 mln/s
(+5.8%) for uretprobes.

Signed-off-by: Andrii Nakryiko
---
 kernel/events/uprobes.c | 37 +++++++++++++++----------------------
 1 file changed, 15 insertions(+), 22 deletions(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 8559ca365679..0480ad841942 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -42,8 +42,6 @@ static struct rb_root uprobes_tree = RB_ROOT;
 static DEFINE_RWLOCK(uprobes_treelock);	/* serialize rbtree access */
 static seqcount_rwlock_t uprobes_seqcount = SEQCNT_RWLOCK_ZERO(uprobes_seqcount, &uprobes_treelock);
 
-DEFINE_STATIC_SRCU(uprobes_srcu);
-
 #define UPROBES_HASH_SZ	13
 /* serialize uprobe->pending_list */
 static struct mutex uprobes_mmap_mutex[UPROBES_HASH_SZ];
@@ -652,7 +650,7 @@ static void put_uprobe(struct uprobe *uprobe)
 	delayed_uprobe_remove(uprobe, NULL);
 	mutex_unlock(&delayed_uprobe_lock);
 
-	call_srcu(&uprobes_srcu, &uprobe->rcu, uprobe_free_rcu);
+	call_rcu_tasks_trace(&uprobe->rcu, uprobe_free_rcu);
 }
 
 static __always_inline
@@ -707,7 +705,7 @@ static struct uprobe *find_uprobe_rcu(struct inode *inode, loff_t offset)
 	struct rb_node *node;
 	unsigned int seq;
 
-	lockdep_assert(srcu_read_lock_held(&uprobes_srcu));
+	lockdep_assert(rcu_read_lock_trace_held());
 
 	do {
 		seq = read_seqcount_begin(&uprobes_seqcount);
@@ -924,8 +922,7 @@ static bool filter_chain(struct uprobe *uprobe, struct mm_struct *mm)
 	bool ret = false;
 
 	down_read(&uprobe->consumer_rwsem);
-	list_for_each_entry_srcu(uc, &uprobe->consumers, cons_node,
-				 srcu_read_lock_held(&uprobes_srcu)) {
+	list_for_each_entry_rcu(uc, &uprobe->consumers, cons_node, rcu_read_lock_trace_held()) {
 		ret = consumer_filter(uc, mm);
 		if (ret)
 			break;
@@ -1148,7 +1145,7 @@ void uprobe_unregister_sync(void)
 	 * unlucky enough caller can free consumer's memory and cause
 	 * handler_chain() or handle_uretprobe_chain() to do an use-after-free.
 	 */
-	synchronize_srcu(&uprobes_srcu);
+	synchronize_rcu_tasks_trace();
 }
 EXPORT_SYMBOL_GPL(uprobe_unregister_sync);
 
@@ -1232,19 +1229,18 @@ EXPORT_SYMBOL_GPL(uprobe_register);
 int uprobe_apply(struct uprobe *uprobe, struct uprobe_consumer *uc, bool add)
 {
 	struct uprobe_consumer *con;
-	int ret = -ENOENT, srcu_idx;
+	int ret = -ENOENT;
 
 	down_write(&uprobe->register_rwsem);
 
-	srcu_idx = srcu_read_lock(&uprobes_srcu);
-	list_for_each_entry_srcu(con, &uprobe->consumers, cons_node,
-				 srcu_read_lock_held(&uprobes_srcu)) {
+	rcu_read_lock_trace();
+	list_for_each_entry_rcu(con, &uprobe->consumers, cons_node, rcu_read_lock_trace_held()) {
 		if (con == uc) {
 			ret = register_for_each_vma(uprobe, add ? uc : NULL);
 			break;
 		}
 	}
-	srcu_read_unlock(&uprobes_srcu, srcu_idx);
+	rcu_read_unlock_trace();
 
 	up_write(&uprobe->register_rwsem);
 
@@ -2114,8 +2110,7 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
 
 	current->utask->auprobe = &uprobe->arch;
 
-	list_for_each_entry_srcu(uc, &uprobe->consumers, cons_node,
-				 srcu_read_lock_held(&uprobes_srcu)) {
+	list_for_each_entry_rcu(uc, &uprobe->consumers, cons_node, rcu_read_lock_trace_held()) {
 		int rc = 0;
 
 		if (uc->handler) {
@@ -2153,15 +2148,13 @@ handle_uretprobe_chain(struct return_instance *ri, struct pt_regs *regs)
 {
 	struct uprobe *uprobe = ri->uprobe;
 	struct uprobe_consumer *uc;
-	int srcu_idx;
 
-	srcu_idx = srcu_read_lock(&uprobes_srcu);
-	list_for_each_entry_srcu(uc, &uprobe->consumers, cons_node,
-				 srcu_read_lock_held(&uprobes_srcu)) {
+	rcu_read_lock_trace();
+	list_for_each_entry_rcu(uc, &uprobe->consumers, cons_node, rcu_read_lock_trace_held()) {
 		if (uc->ret_handler)
 			uc->ret_handler(uc, ri->func, regs);
 	}
-	srcu_read_unlock(&uprobes_srcu, srcu_idx);
+	rcu_read_unlock_trace();
 }
 
 static struct return_instance *find_next_ret_chain(struct return_instance *ri)
@@ -2246,13 +2239,13 @@ static void handle_swbp(struct pt_regs *regs)
 {
 	struct uprobe *uprobe;
 	unsigned long bp_vaddr;
-	int is_swbp, srcu_idx;
+	int is_swbp;
 
 	bp_vaddr = uprobe_get_swbp_addr(regs);
 	if (bp_vaddr == uprobe_get_trampoline_vaddr())
 		return uprobe_handle_trampoline(regs);
 
-	srcu_idx = srcu_read_lock(&uprobes_srcu);
+	rcu_read_lock_trace();
 
 	uprobe = find_active_uprobe_rcu(bp_vaddr, &is_swbp);
 	if (!uprobe) {
@@ -2310,7 +2303,7 @@ static void handle_swbp(struct pt_regs *regs)
 
 out:
 	/* arch_uprobe_skip_sstep() succeeded, or restart if can't singlestep */
-	srcu_read_unlock(&uprobes_srcu, srcu_idx);
+	rcu_read_unlock_trace();
 }
 
 /*
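
The relative gains quoted in the commit message can be rechecked from the benchmark tables. The following is a minimal userspace sketch, not part of the patch: `pct_gain()`, `struct sample` and `report_gains()` are hypothetical names introduced only for illustration, and the numbers are copied from the SRCU and RCU Tasks Trace tables (peak throughput at 8 CPUs, plus the single-CPU rows).

```c
#include <assert.h>
#include <stdio.h>

/* Relative throughput change, in percent, going from `before` to `after`. */
static double pct_gain(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}

/* Hypothetical container for one row pair from the two tables. */
struct sample {
	const char *name;
	double srcu;        /* M/s with SRCU */
	double tasks_trace; /* M/s with RCU Tasks Trace */
};

/* Values copied from the benchmark tables in the commit message. */
static const struct sample samples[] = {
	{ "uprobe-nop, 8 cpus (peak)",    8.097, 10.355 },
	{ "uretprobe-nop, 8 cpus (peak)", 5.291,  5.883 },
	{ "uprobe-nop, 1 cpu",            3.276,  3.396 },
	{ "uretprobe-nop, 1 cpu",         2.055,  2.174 },
};

/* Print one line per sample with the percentage change. */
static void report_gains(void)
{
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		printf("%-30s %+5.1f%%\n", samples[i].name,
		       pct_gain(samples[i].srcu, samples[i].tasks_trace));
}
```

Calling `report_gains()` from a `main()` prints one line per configuration; for the peak uprobe case the result is roughly +27.9%, in line with the rounded figure given in the commit message.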