From patchwork Wed Oct 16 16:19:29 2024
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 13838637
From: "Paul E. McKenney"
To: frederic@kernel.org, rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org,
 "Paul E. McKenney", Joel Fernandes
Subject: [PATCH v2 rcu 1/3] rcu: Delete unused rcu_gp_might_be_stalled() function
Date: Wed, 16 Oct 2024 09:19:29 -0700
Message-Id: <20241016161931.478592-1-paulmck@kernel.org>
In-Reply-To: <92193018-8624-495e-a685-320119f78db1@paulmck-laptop>
References: <92193018-8624-495e-a685-320119f78db1@paulmck-laptop>

The rcu_gp_might_be_stalled() function is no longer used, so this commit
removes it.

Signed-off-by: Paul E. McKenney
Reviewed-by: Joel Fernandes (Google)
---
 include/linux/rcutiny.h |  1 -
 include/linux/rcutree.h |  1 -
 kernel/rcu/tree_stall.h | 30 ------------------------------
 3 files changed, 32 deletions(-)

diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 0ee270b3f5ed2..fe42315f667fc 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -165,7 +165,6 @@ static inline bool rcu_inkernel_boot_has_ended(void) { return true; }
 static inline bool rcu_is_watching(void) { return true; }
 static inline void rcu_momentary_eqs(void) { }
 static inline void kfree_rcu_scheduler_running(void) { }
-static inline bool rcu_gp_might_be_stalled(void) { return false; }
 
 /* Avoid RCU read-side critical sections leaking across. */
 static inline void rcu_all_qs(void) { barrier(); }
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 90a684f94776e..27d86d9127817 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -40,7 +40,6 @@ void kvfree_rcu_barrier(void);
 void rcu_barrier(void);
 void rcu_momentary_eqs(void);
 void kfree_rcu_scheduler_running(void);
-bool rcu_gp_might_be_stalled(void);
 
 struct rcu_gp_oldstate {
 	unsigned long rgos_norm;
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 4432db6d0b99b..d7cdd535e50b1 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -76,36 +76,6 @@ int rcu_jiffies_till_stall_check(void)
 }
 EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check);
 
-/**
- * rcu_gp_might_be_stalled - Is it likely that the grace period is stalled?
- *
- * Returns @true if the current grace period is sufficiently old that
- * it is reasonable to assume that it might be stalled.  This can be
- * useful when deciding whether to allocate memory to enable RCU-mediated
- * freeing on the one hand or just invoking synchronize_rcu() on the other.
- * The latter is preferable when the grace period is stalled.
- *
- * Note that sampling of the .gp_start and .gp_seq fields must be done
- * carefully to avoid false positives at the beginnings and ends of
- * grace periods.
- */
-bool rcu_gp_might_be_stalled(void)
-{
-	unsigned long d = rcu_jiffies_till_stall_check() / RCU_STALL_MIGHT_DIV;
-	unsigned long j = jiffies;
-
-	if (d < RCU_STALL_MIGHT_MIN)
-		d = RCU_STALL_MIGHT_MIN;
-	smp_mb();  // jiffies before .gp_seq to avoid false positives.
-	if (!rcu_gp_in_progress())
-		return false;
-	// Long delays at this point avoids false positive, but a delay
-	// of ULONG_MAX/4 jiffies voids your no-false-positive warranty.
-	smp_mb();  // .gp_seq before second .gp_start
-	// And ditto here.
-	return !time_before(j, READ_ONCE(rcu_state.gp_start) + d);
-}
-
 /* Don't do RCU CPU stall warnings during long sysrq printouts. */
 void rcu_sysrq_start(void)
 {
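For context, the kerneldoc deleted above describes the intended use case: a
caller choosing between allocating memory for asynchronous RCU-mediated
freeing and simply invoking synchronize_rcu() when the grace period might be
stalled. Below is a minimal sketch of that kind of now-obsolete caller; it is
illustrative only, and release_object(), free_cb(), and the two structures are
hypothetical names invented for this sketch, not kernel code.

/* Illustrative only: assumes <linux/slab.h> and <linux/rcupdate.h>. */

struct my_obj;				/* hypothetical payload type */

struct my_deferred {
	struct rcu_head rh;
	struct my_obj *obj;
};

static void free_cb(struct rcu_head *rh)
{
	struct my_deferred *d = container_of(rh, struct my_deferred, rh);

	kfree(d->obj);
	kfree(d);
}

static void release_object(struct my_obj *p)
{
	struct my_deferred *d = NULL;

	/*
	 * Removed helper: take the memory-consuming asynchronous path only
	 * if the grace period looks healthy, otherwise fall back to a
	 * direct synchronize_rcu() rather than piling up deferred memory.
	 */
	if (!rcu_gp_might_be_stalled())
		d = kmalloc(sizeof(*d), GFP_NOWAIT);

	if (d) {
		d->obj = p;
		call_rcu(&d->rh, free_cb);	/* asynchronous, memory-backed */
	} else {
		synchronize_rcu();		/* wait for the grace period... */
		kfree(p);			/* ...then free inline */
	}
}

With the function gone, such callers would need some other heuristic (or
simply pick one of the two paths unconditionally), which is consistent with
it having no remaining users.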
McKenney" X-Patchwork-Id: 13838635 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6B21012DD8A; Wed, 16 Oct 2024 16:19:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729095573; cv=none; b=ftCHgdyculIDPth71+AOHZuMfIzUQ22CWoIwRPlrvyz2WA9R9sdEpW1oGkHyMTBqLWPvXLElRVtMxjDLrc3AsKZr/Ii2X87POCpvitjDSExEt6yg4R7+tY5JSZkAg0RVvcZlJ++TPR1Q70dU3FfeQMHX+XA87wcrpuX75+yNOBg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729095573; c=relaxed/simple; bh=sZq/5H24e81CBhBGj1qplbJtZ3lHNc2VHN4b3D53dts=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=eoobXB+FuPARnbgj5wkavvq0Mxw6IlPiqDvZ3iMHXYQ4IrFa7L0oltjQ0nGDX2EzpF/lvSI29QrVf4dbbArmAnTFX/Kw4R3YARq8bKh0nvdbf2CgQ4ioFtSHqGZV0Gwg/xDSzsCkIxeduHw4zbmzjMcQ75Tdt/C5asEc2t3T6Bk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Z/GQfDlm; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Z/GQfDlm" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 47AD6C4CECE; Wed, 16 Oct 2024 16:19:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1729095573; bh=sZq/5H24e81CBhBGj1qplbJtZ3lHNc2VHN4b3D53dts=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Z/GQfDlmviXmVr3f7aUo+z0kFYhW8H4L73q/kJMtUCxvREasPDUVw/2gjNbjZxXhv WsbtNww5N+tY007vAUjU6BQi8IFI5vD7FyQ0NEXyFtFnk9dxbRYB6vxvIjBrW5CMJX UsFlTilRPHE+g3uHouVr2cg3+MIx3Mg48Ou3orX55hiJwnf81MMBxhofsSQ2V+qwZO P6SJBK9XGfu+W55AdLaixs97D2IJ+ifodf5R+4EoXuiKJNNuD2etglzbItJOaB2VF8 ti3PUreUpuRBU4ZtIikV0w9XIrLGO9BLAQjJxpkmeQL/LmIyTgX7yIbW4/3y9RkOAy TL/SGPqU2vYoQ== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id E9209CE0E98; Wed, 16 Oct 2024 09:19:32 -0700 (PDT) From: "Paul E. McKenney" To: frederic@kernel.org, rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" Subject: [PATCH v2 rcu 2/3] rcu: Stop stall warning from dumping stacks if grace period ends Date: Wed, 16 Oct 2024 09:19:30 -0700 Message-Id: <20241016161931.478592-2-paulmck@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <92193018-8624-495e-a685-320119f78db1@paulmck-laptop> References: <92193018-8624-495e-a685-320119f78db1@paulmck-laptop> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Currently, once an RCU CPU stall warning decides to dump the stalling CPUs' stacks, the rcu_dump_cpu_stacks() function persists until it has gone through the full list. Unfortunately, if the stalled grace periods ends midway through, this function will be dumping stacks of innocent-bystander CPUs that happen to be blocking not the old grace period, but instead the new one. This can cause serious confusion. This commit therefore stops dumping stacks if and when the stalled grace period ends. [ paulmck: Apply Joel Fernandes feedback. ] Signed-off-by: Paul E. 
---
 kernel/rcu/tree_stall.h | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index d7cdd535e50b1..b530844becf85 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -335,13 +335,17 @@ static int rcu_print_task_stall(struct rcu_node *rnp, unsigned long flags)
  * that don't support NMI-based stack dumps.  The NMI-triggered stack
  * traces are more accurate because they are printed by the target CPU.
  */
-static void rcu_dump_cpu_stacks(void)
+static void rcu_dump_cpu_stacks(unsigned long gp_seq)
 {
 	int cpu;
 	unsigned long flags;
 	struct rcu_node *rnp;
 
 	rcu_for_each_leaf_node(rnp) {
+		if (gp_seq != data_race(rcu_state.gp_seq)) {
+			pr_err("INFO: Stall ended during stack backtracing.\n");
+			return;
+		}
 		printk_deferred_enter();
 		raw_spin_lock_irqsave_rcu_node(rnp, flags);
 		for_each_leaf_node_possible_cpu(rnp, cpu)
@@ -608,7 +612,7 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 	       (long)rcu_seq_current(&rcu_state.gp_seq), totqlen,
 	       data_race(rcu_state.n_online_cpus)); // Diagnostic read
 	if (ndetected) {
-		rcu_dump_cpu_stacks();
+		rcu_dump_cpu_stacks(gp_seq);
 
 		/* Complain about tasks blocking the grace period. */
 		rcu_for_each_leaf_node(rnp)
@@ -640,7 +644,7 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 	rcu_force_quiescent_state();  /* Kick them all. */
 }
 
-static void print_cpu_stall(unsigned long gps)
+static void print_cpu_stall(unsigned long gp_seq, unsigned long gps)
 {
 	int cpu;
 	unsigned long flags;
@@ -677,7 +681,7 @@ static void print_cpu_stall(unsigned long gps)
 	rcu_check_gp_kthread_expired_fqs_timer();
 	rcu_check_gp_kthread_starvation();
 
-	rcu_dump_cpu_stacks();
+	rcu_dump_cpu_stacks(gp_seq);
 
 	raw_spin_lock_irqsave_rcu_node(rnp, flags);
 	/* Rewrite if needed in case of slow consoles. */
@@ -759,7 +763,8 @@ static void check_cpu_stall(struct rcu_data *rdp)
 	gs2 = READ_ONCE(rcu_state.gp_seq);
 	if (gs1 != gs2 ||
 	    ULONG_CMP_LT(j, js) ||
-	    ULONG_CMP_GE(gps, js))
+	    ULONG_CMP_GE(gps, js) ||
+	    !rcu_seq_state(gs2))
 		return; /* No stall or GP completed since entering function. */
 	rnp = rdp->mynode;
 	jn = jiffies + ULONG_MAX / 2;
@@ -780,7 +785,7 @@ static void check_cpu_stall(struct rcu_data *rdp)
 		pr_err("INFO: %s detected stall, but suppressed full report due to a stuck CSD-lock.\n", rcu_state.name);
 	} else if (self_detected) {
 		/* We haven't checked in, so go dump stack. */
-		print_cpu_stall(gps);
+		print_cpu_stall(gs2, gps);
 	} else {
 		/* They had a few time units to dump stack, so complain. */
 		print_other_cpu_stall(gs2, gps);
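The new !rcu_seq_state(gs2) test relies on the ->gp_seq encoding used
throughout RCU: the low-order bits hold the grace-period state (zero meaning
no grace period in flight) and the remaining bits count grace periods, so a
re-read sequence number whose state field is zero means the previously
sampled, stalled grace period has already ended. The standalone snippet below
illustrates only that encoding; it is a simplified mirror of the helpers in
kernel/rcu/rcu.h, and the sample values are made up.

#include <stdio.h>

/* Simplified mirror of the gp_seq helpers in kernel/rcu/rcu.h. */
#define RCU_SEQ_CTR_SHIFT	2
#define RCU_SEQ_STATE_MASK	((1UL << RCU_SEQ_CTR_SHIFT) - 1)

/* Nonzero iff the sequence value says a grace period is in progress. */
static unsigned long rcu_seq_state(unsigned long s)
{
	return s & RCU_SEQ_STATE_MASK;
}

int main(void)
{
	/*
	 * Made-up sample values: starting a grace period sets the state
	 * bits, ending it advances the counter and clears them.
	 */
	unsigned long in_progress = (6UL << RCU_SEQ_CTR_SHIFT) | 1UL;
	unsigned long completed   = 7UL << RCU_SEQ_CTR_SHIFT;

	/*
	 * check_cpu_stall() now returns early in the "completed" case:
	 * reporting a stall for a grace period that has ended is useless.
	 */
	printf("in progress: state=%lu\n", rcu_seq_state(in_progress));
	printf("completed:   state=%lu\n", rcu_seq_state(completed));
	return 0;
}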
McKenney" X-Patchwork-Id: 13838636 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AD28420F5AA; Wed, 16 Oct 2024 16:19:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729095573; cv=none; b=X6o9C53a+5rH6Ua4NaeqjYoPsRC9iaorYphILQFC2FubTgwuQsUEh+72dlZsQwzK6iHDHRIOlaJI068c3UC7rY88lRIXsTklEA5lznqPEdi5Kx80RjqSWm7XhQK53jJvQP/4N/OmvnobT0MJXblM0kpfBtGctAhO1G4yWowbOfc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729095573; c=relaxed/simple; bh=b/Ydri4xRhgAwhSosvwIWIB9nRy/kHsOi/HMFdJm+Qk=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=uqiC2Ceh7m+rWBt9Mf8l0eGV9rRvQO5BVJYRr6okVDM8cuOQwqlK1t82+pdvO4IQrj71oxtW1HygpC26w9dYPR9KBiiAV198xzCwyNjgqXpabkGwViOsX/aJH+DyjxS4rfhJoZa23TO81KI39H0C3RcEER+HOWTMkBuFHrKK63s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=b/+H1O2n; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="b/+H1O2n" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 50B02C4CED0; Wed, 16 Oct 2024 16:19:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1729095573; bh=b/Ydri4xRhgAwhSosvwIWIB9nRy/kHsOi/HMFdJm+Qk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=b/+H1O2nyM7yqxyhnNoIRnVbd3KlUXN5OdWSyBpJdU+tcdrBcP3H8e25dNWN5dxsU WehNWhQrCXxTq7eWm96j8teeXaPSQxmmjHwAj9BrLqf9aXwttXwozD/66Uywfir4HM 2KGdkK5/BGWDYyYcCqUpvsz++LLnFcNg2m9XDr8Z35tdWefkK7zKlk1YXrGThlyKgl Es8UE4/rd6WTfPIea9RaRgNo0oIn0i9CJJc6zD7Gacz3f5Ln6FVdLKthfqPcfw9qTc OGFP/4J9Ak9APcpNAssJfObx1zgiIoUVxb/I6hEm8OYYzDHRoQpSABpCRWvFwSwZF/ dDBda1fD3S0kQ== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id EBDFFCE0F9A; Wed, 16 Oct 2024 09:19:32 -0700 (PDT) From: "Paul E. McKenney" To: frederic@kernel.org, rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" , Joel Fernandes Subject: [PATCH v2 rcu 3/3] rcu: Finer-grained grace-period-end checks in rcu_dump_cpu_stacks() Date: Wed, 16 Oct 2024 09:19:31 -0700 Message-Id: <20241016161931.478592-3-paulmck@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <92193018-8624-495e-a685-320119f78db1@paulmck-laptop> References: <92193018-8624-495e-a685-320119f78db1@paulmck-laptop> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This commit pushes the grace-period-end checks further down into rcu_dump_cpu_stacks(), and also uses lockless checks coupled with finer-grained locking. The result is that the current leaf rcu_node structure's ->lock is acquired only if a stack backtrace might be needed from the current CPU, and is held across only that CPU's backtrace. As a result, if there are no stalled CPUs associated with a given rcu_node structure, then its ->lock will not be acquired at all. 
On large systems, it is usually (though not always) the case that a
small number of CPUs are stalling the current grace period, which means
that the ->lock need be acquired only for a small fraction of the
rcu_node structures.

Signed-off-by: Paul E. McKenney
Reviewed-by: Joel Fernandes (Google)
---
 kernel/rcu/tree_stall.h | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index b530844becf85..8994391b95c76 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -342,20 +342,24 @@ static void rcu_dump_cpu_stacks(unsigned long gp_seq)
 	struct rcu_node *rnp;
 
 	rcu_for_each_leaf_node(rnp) {
-		if (gp_seq != data_race(rcu_state.gp_seq)) {
-			pr_err("INFO: Stall ended during stack backtracing.\n");
-			return;
-		}
 		printk_deferred_enter();
-		raw_spin_lock_irqsave_rcu_node(rnp, flags);
-		for_each_leaf_node_possible_cpu(rnp, cpu)
+		for_each_leaf_node_possible_cpu(rnp, cpu) {
+			if (gp_seq != data_race(rcu_state.gp_seq)) {
+				printk_deferred_exit();
+				pr_err("INFO: Stall ended during stack backtracing.\n");
+				return;
+			}
+			if (!(data_race(rnp->qsmask) & leaf_node_cpu_bit(rnp, cpu)))
+				continue;
+			raw_spin_lock_irqsave_rcu_node(rnp, flags);
 			if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) {
 				if (cpu_is_offline(cpu))
 					pr_err("Offline CPU %d blocking current GP.\n", cpu);
 				else
 					dump_cpu_task(cpu);
 			}
-		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+			raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+		}
 		printk_deferred_exit();
 	}
 }
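The resulting per-CPU loop follows the familiar check/lock/recheck pattern: a
lockless peek at ->qsmask filters out CPUs that are not blocking the grace
period, and only a likely hit takes the leaf ->lock, re-checks under it, and
dumps the stack. A condensed sketch of the loop body after this patch is shown
below for readability; it is not a literal copy, and the declarations plus the
enclosing rcu_for_each_leaf_node() loop are omitted.

for_each_leaf_node_possible_cpu(rnp, cpu) {
	/* Stop immediately if the stalled grace period has ended. */
	if (gp_seq != data_race(rcu_state.gp_seq)) {
		printk_deferred_exit();
		pr_err("INFO: Stall ended during stack backtracing.\n");
		return;
	}
	/* Lockless filter: skip CPUs not blocking this grace period. */
	if (!(data_race(rnp->qsmask) & leaf_node_cpu_bit(rnp, cpu)))
		continue;
	/* Take the leaf ->lock only for a likely hit and re-check under it. */
	raw_spin_lock_irqsave_rcu_node(rnp, flags);
	if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) {
		if (cpu_is_offline(cpu))
			pr_err("Offline CPU %d blocking current GP.\n", cpu);
		else
			dump_cpu_task(cpu);
	}
	/* Drop the lock whether or not the re-check still sees the CPU. */
	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}

The data_race() annotations mark the intentionally lockless reads for KCSAN,
and the locked re-check guards against the ->qsmask bit clearing between the
unlocked peek and the stack dump.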