From patchwork Fri Aug 23 21:15:12 2024
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 13776030
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org,
 riel@surriel.com, "Paul E. McKenney"
Subject: [PATCH rcu 1/4] rcu: Defer printing stall-warning backtrace when
 holding rcu_node lock
Date: Fri, 23 Aug 2024 14:15:12 -0700
Message-Id: <20240823211516.2984627-1-paulmck@kernel.org>
In-Reply-To: <415b108b-1046-4027-aa2a-c829b77f39f6@paulmck-laptop>
References: <415b108b-1046-4027-aa2a-c829b77f39f6@paulmck-laptop>
X-Mailing-List: rcu@vger.kernel.org

The rcu_dump_cpu_stacks() function holds the leaf rcu_node structure's
->lock when dumping the stacks of any CPUs stalling the current grace
period.  This lock is held to prevent confusion that would otherwise
occur when the stalled CPU reported its quiescent state (and then went
on to do unrelated things) just as the backtrace NMI was heading
towards it.

This has worked well, but on larger systems it has recently been
observed to cause severe lock contention resulting in CSD-lock stalls
and other general unhappiness.

This commit therefore does printk_deferred_enter() before acquiring the
lock and printk_deferred_exit() after releasing it, thus deferring the
overhead of actually outputting the stack trace out of that lock's
critical section.

Reported-by: Rik van Riel
Suggested-by: Rik van Riel
Signed-off-by: Paul E. McKenney
---
 kernel/rcu/tree_stall.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index cf8e5c6ed50ac..2fb40ec4b2aea 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -371,6 +371,7 @@ static void rcu_dump_cpu_stacks(void)
 	struct rcu_node *rnp;
 
 	rcu_for_each_leaf_node(rnp) {
+		printk_deferred_enter();
 		raw_spin_lock_irqsave_rcu_node(rnp, flags);
 		for_each_leaf_node_possible_cpu(rnp, cpu)
 			if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) {
@@ -380,6 +381,7 @@ static void rcu_dump_cpu_stacks(void)
 				dump_cpu_task(cpu);
 			}
 		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+		printk_deferred_exit();
 	}
 }
From patchwork Fri Aug 23 21:15:13 2024
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 13776031
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org,
 riel@surriel.com, "Paul E. McKenney"
Subject: [PATCH rcu 2/4] rcu: Delete unused rcu_gp_might_be_stalled() function
Date: Fri, 23 Aug 2024 14:15:13 -0700
Message-Id: <20240823211516.2984627-2-paulmck@kernel.org>
In-Reply-To: <415b108b-1046-4027-aa2a-c829b77f39f6@paulmck-laptop>
References: <415b108b-1046-4027-aa2a-c829b77f39f6@paulmck-laptop>
X-Mailing-List: rcu@vger.kernel.org

The rcu_gp_might_be_stalled() function is no longer used, so this
commit removes it.

Signed-off-by: Paul E. McKenney
---
 include/linux/rcutiny.h |  1 -
 include/linux/rcutree.h |  1 -
 kernel/rcu/tree_stall.h | 30 ------------------------------
 3 files changed, 32 deletions(-)

diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index cf2b5a188f783..ca43525474a9e 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -160,7 +160,6 @@ static inline bool rcu_inkernel_boot_has_ended(void) { return true; }
 static inline bool rcu_is_watching(void) { return true; }
 static inline void rcu_momentary_eqs(void) { }
 static inline void kfree_rcu_scheduler_running(void) { }
-static inline bool rcu_gp_might_be_stalled(void) { return false; }
 
 /* Avoid RCU read-side critical sections leaking across. */
 static inline void rcu_all_qs(void) { barrier(); }
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 7dbde2b6f714a..da14ad1141263 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -39,7 +39,6 @@ void kvfree_call_rcu(struct rcu_head *head, void *ptr);
 void rcu_barrier(void);
 void rcu_momentary_eqs(void);
 void kfree_rcu_scheduler_running(void);
-bool rcu_gp_might_be_stalled(void);
 
 struct rcu_gp_oldstate {
 	unsigned long rgos_norm;
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 2fb40ec4b2aea..ed065e3ce5c33 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -75,36 +75,6 @@ int rcu_jiffies_till_stall_check(void)
 }
 EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check);
 
-/**
- * rcu_gp_might_be_stalled - Is it likely that the grace period is stalled?
- *
- * Returns @true if the current grace period is sufficiently old that
- * it is reasonable to assume that it might be stalled.  This can be
- * useful when deciding whether to allocate memory to enable RCU-mediated
- * freeing on the one hand or just invoking synchronize_rcu() on the other.
- * The latter is preferable when the grace period is stalled.
- *
- * Note that sampling of the .gp_start and .gp_seq fields must be done
- * carefully to avoid false positives at the beginnings and ends of
- * grace periods.
- */
-bool rcu_gp_might_be_stalled(void)
-{
-	unsigned long d = rcu_jiffies_till_stall_check() / RCU_STALL_MIGHT_DIV;
-	unsigned long j = jiffies;
-
-	if (d < RCU_STALL_MIGHT_MIN)
-		d = RCU_STALL_MIGHT_MIN;
-	smp_mb(); // jiffies before .gp_seq to avoid false positives.
-	if (!rcu_gp_in_progress())
-		return false;
-	// Long delays at this point avoids false positive, but a delay
-	// of ULONG_MAX/4 jiffies voids your no-false-positive warranty.
-	smp_mb(); // .gp_seq before second .gp_start
-	// And ditto here.
-	return !time_before(j, READ_ONCE(rcu_state.gp_start) + d);
-}
-
 /* Don't do RCU CPU stall warnings during long sysrq printouts. */
 void rcu_sysrq_start(void)
 {
From patchwork Fri Aug 23 21:15:14 2024
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 13776029
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org,
 riel@surriel.com, "Paul E. McKenney"
Subject: [PATCH rcu 3/4] rcu: Stop stall warning from dumping stacks if grace
 period ends
Date: Fri, 23 Aug 2024 14:15:14 -0700
Message-Id: <20240823211516.2984627-3-paulmck@kernel.org>
In-Reply-To: <415b108b-1046-4027-aa2a-c829b77f39f6@paulmck-laptop>
References: <415b108b-1046-4027-aa2a-c829b77f39f6@paulmck-laptop>
X-Mailing-List: rcu@vger.kernel.org

Currently, once an RCU CPU stall warning decides to dump the stalling
CPUs' stacks, the rcu_dump_cpu_stacks() function persists until it
has gone through the full list.  Unfortunately, if the stalled grace
period ends midway through, this function will be dumping stacks of
innocent-bystander CPUs that happen to be blocking not the old grace
period, but instead the new one.  This can cause serious confusion.

This commit therefore stops dumping stacks if and when the stalled
grace period ends.

Signed-off-by: Paul E. McKenney
---
 kernel/rcu/tree_stall.h | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index ed065e3ce5c33..9c8eb4b8dfb33 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -334,13 +334,17 @@ static int rcu_print_task_stall(struct rcu_node *rnp, unsigned long flags)
  * that don't support NMI-based stack dumps.  The NMI-triggered stack
  * traces are more accurate because they are printed by the target CPU.
  */
-static void rcu_dump_cpu_stacks(void)
+static void rcu_dump_cpu_stacks(unsigned long gp_seq)
 {
 	int cpu;
 	unsigned long flags;
 	struct rcu_node *rnp;
 
 	rcu_for_each_leaf_node(rnp) {
+		if (gp_seq != rcu_state.gp_seq) {
+			pr_err("INFO: Stall ended during stack backtracing.\n");
+			return;
+		}
 		printk_deferred_enter();
 		raw_spin_lock_irqsave_rcu_node(rnp, flags);
 		for_each_leaf_node_possible_cpu(rnp, cpu)
@@ -605,7 +609,7 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 	       (long)rcu_seq_current(&rcu_state.gp_seq), totqlen,
 	       data_race(rcu_state.n_online_cpus)); // Diagnostic read
 	if (ndetected) {
-		rcu_dump_cpu_stacks();
+		rcu_dump_cpu_stacks(gp_seq);
 
 		/* Complain about tasks blocking the grace period. */
 		rcu_for_each_leaf_node(rnp)
@@ -635,7 +639,7 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 		rcu_force_quiescent_state();  /* Kick them all. */
 }
 
-static void print_cpu_stall(unsigned long gps)
+static void print_cpu_stall(unsigned long gp_seq, unsigned long gps)
 {
 	int cpu;
 	unsigned long flags;
@@ -670,7 +674,7 @@ static void print_cpu_stall(unsigned long gps)
 	rcu_check_gp_kthread_expired_fqs_timer();
 	rcu_check_gp_kthread_starvation();
 
-	rcu_dump_cpu_stacks();
+	rcu_dump_cpu_stacks(gp_seq);
 
 	raw_spin_lock_irqsave_rcu_node(rnp, flags);
 	/* Rewrite if needed in case of slow consoles. */
@@ -750,7 +754,8 @@ static void check_cpu_stall(struct rcu_data *rdp)
 	gs2 = READ_ONCE(rcu_state.gp_seq);
 	if (gs1 != gs2 ||
 	    ULONG_CMP_LT(j, js) ||
-	    ULONG_CMP_GE(gps, js))
+	    ULONG_CMP_GE(gps, js) ||
+	    !rcu_seq_state(gs2))
 		return; /* No stall or GP completed since entering function. */
 	rnp = rdp->mynode;
 	jn = jiffies + ULONG_MAX / 2;
@@ -771,7 +776,7 @@ static void check_cpu_stall(struct rcu_data *rdp)
 		pr_err("INFO: %s detected stall, but suppressed full report due to a stuck CSD-lock.\n", rcu_state.name);
 	} else if (self_detected) {
 		/* We haven't checked in, so go dump stack. */
-		print_cpu_stall(gps);
+		print_cpu_stall(gs2, gps);
 	} else {
 		/* They had a few time units to dump stack, so complain. */
 		print_other_cpu_stall(gs2, gps);
From patchwork Fri Aug 23 21:15:15 2024
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 13776032
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org,
 riel@surriel.com, "Paul E. McKenney"
Subject: [PATCH rcu 4/4] rcu: Finer-grained grace-period-end checks in
 rcu_dump_cpu_stacks()
Date: Fri, 23 Aug 2024 14:15:15 -0700
Message-Id: <20240823211516.2984627-4-paulmck@kernel.org>
In-Reply-To: <415b108b-1046-4027-aa2a-c829b77f39f6@paulmck-laptop>
References: <415b108b-1046-4027-aa2a-c829b77f39f6@paulmck-laptop>
X-Mailing-List: rcu@vger.kernel.org

This commit pushes the grace-period-end checks further down into
rcu_dump_cpu_stacks(), and also uses lockless checks coupled with
finer-grained locking.

The result is that the current leaf rcu_node structure's ->lock is
acquired only if a stack backtrace might be needed from the current
CPU, and is held across only that CPU's backtrace.  As a result,
if there are no stalled CPUs associated with a given rcu_node
structure, then its ->lock will not be acquired at all.  On large
systems, it is usually (though not always) the case that a small
number of CPUs are stalling the current grace period, which means
that the ->lock need be acquired only for a small fraction of the
rcu_node structures.

Signed-off-by: Paul E. McKenney
---
 kernel/rcu/tree_stall.h | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 9c8eb4b8dfb33..ab6848baba4f6 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -341,20 +341,24 @@ static void rcu_dump_cpu_stacks(unsigned long gp_seq)
 	struct rcu_node *rnp;
 
 	rcu_for_each_leaf_node(rnp) {
-		if (gp_seq != rcu_state.gp_seq) {
-			pr_err("INFO: Stall ended during stack backtracing.\n");
-			return;
-		}
 		printk_deferred_enter();
-		raw_spin_lock_irqsave_rcu_node(rnp, flags);
-		for_each_leaf_node_possible_cpu(rnp, cpu)
+		for_each_leaf_node_possible_cpu(rnp, cpu) {
+			if (gp_seq != data_race(rcu_state.gp_seq)) {
+				printk_deferred_exit();
+				pr_err("INFO: Stall ended during stack backtracing.\n");
+				return;
+			}
+			if (!(data_race(rnp->qsmask) & leaf_node_cpu_bit(rnp, cpu)))
+				continue;
+			raw_spin_lock_irqsave_rcu_node(rnp, flags);
 			if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) {
 				if (cpu_is_offline(cpu))
 					pr_err("Offline CPU %d blocking current GP.\n", cpu);
 				else
 					dump_cpu_task(cpu);
+				raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 			}
-		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+		}
 		printk_deferred_exit();
 	}
 }