From patchwork Thu Feb 13 23:25:57 2025
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13974156
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
    Paul E. McKenney, Uladzislau Rezki, Zqiang, rcu
Subject: [PATCH 1/3] rcu/exp: Protect against early QS report
Date: Fri, 14 Feb 2025 00:25:57 +0100
Message-ID: <20250213232559.34163-2-frederic@kernel.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20250213232559.34163-1-frederic@kernel.org>
References: <20250213232559.34163-1-frederic@kernel.org>

When a grace period is started, the ->expmask of each node is set up from
sync_exp_reset_tree(). Only later does each leaf node also initialize its
->exp_tasks pointer. This means that the initialization of a node's
quiescent state mask and the initialization of its list of blocked tasks
happen with a gap in between, during which the node is unlocked. This
happens to be fine because nothing is expected to report an expedited
quiescent state within this gap: no IPIs have been issued yet, and every
rdp's ->cpu_no_qs.b.exp should be false. However, if such a report were to
happen by accident, the quiescent state could be reported and propagated
while ignoring tasks that blocked _before_ the start of the grace period.
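To make that window concrete, here is a minimal interleaving sketch. This
is an editorial illustration with simplified lock/unlock names, not kernel
code, and the reporting side is hypothetical since, as noted above,
nothing currently reports within this window:

	GP initialization			Hypothetical early QS report
	-----------------			----------------------------
	sync_exp_reset_tree():
	  lock(rnp)
	  WRITE_ONCE(rnp->expmask,
	             rnp->expmaskinit)
	  unlock(rnp)
						lock(rnp)
						// ->expmask is set but
						// rnp->exp_tasks is still
						// NULL: a QS could be
						// reported and propagated,
						// ignoring tasks that
						// blocked before the GP
						unlock(rnp)
	  (later, per leaf node)
	  lock(rnp)
	  WRITE_ONCE(rnp->exp_tasks,
	             rnp->blkd_tasks.next)
	  unlock(rnp)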
Prevent such trouble from happening in the future by initializing both the
quiescent states mask to report and the blocked tasks head from within the
same node-locked block.

Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree_exp.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 8d4895c854c5..caff16e441d1 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -141,6 +141,13 @@ static void __maybe_unused sync_exp_reset_tree(void)
 		raw_spin_lock_irqsave_rcu_node(rnp, flags);
 		WARN_ON_ONCE(rnp->expmask);
 		WRITE_ONCE(rnp->expmask, rnp->expmaskinit);
+		/*
+		 * Need to wait for any blocked tasks as well. Note that
+		 * additional blocking tasks will also block the expedited GP
+		 * until such time as the ->expmask bits are cleared.
+		 */
+		if (rcu_is_leaf_node(rnp) && rcu_preempt_has_tasks(rnp))
+			WRITE_ONCE(rnp->exp_tasks, rnp->blkd_tasks.next);
 		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 	}
 }
@@ -393,13 +400,6 @@ static void __sync_rcu_exp_select_node_cpus(struct rcu_exp_work *rewp)
 	}
 	mask_ofl_ipi = rnp->expmask & ~mask_ofl_test;
-	/*
-	 * Need to wait for any blocked tasks as well. Note that
-	 * additional blocking tasks will also block the expedited GP
-	 * until such time as the ->expmask bits are cleared.
-	 */
-	if (rcu_preempt_has_tasks(rnp))
-		WRITE_ONCE(rnp->exp_tasks, rnp->blkd_tasks.next);
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);

 	/* IPI the remaining CPUs for expedited quiescent state. */
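For reference, the sync_exp_reset_tree() loop body after this change would
look roughly as follows. This is a sketch reconstructed from the hunk
above; the rcu_for_each_node_breadth_first() iterator and the variable
declarations are assumptions drawn from the surrounding, unquoted context:

	unsigned long flags;
	struct rcu_node *rnp;

	rcu_for_each_node_breadth_first(rnp) {
		raw_spin_lock_irqsave_rcu_node(rnp, flags);
		WARN_ON_ONCE(rnp->expmask);
		WRITE_ONCE(rnp->expmask, rnp->expmaskinit);
		/*
		 * The QS mask and the blocked-tasks head are now both
		 * initialized under the same node-locked section, which
		 * closes the unlocked gap described in the changelog.
		 */
		if (rcu_is_leaf_node(rnp) && rcu_preempt_has_tasks(rnp))
			WRITE_ONCE(rnp->exp_tasks, rnp->blkd_tasks.next);
		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
	}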
From patchwork Thu Feb 13 23:25:58 2025
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13974157
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
    Paul E. McKenney, Uladzislau Rezki, Zqiang, rcu
Subject: [PATCH 2/3] rcu/exp: Remove confusing needless full barrier on task unblock
Date: Fri, 14 Feb 2025 00:25:58 +0100
Message-ID: <20250213232559.34163-3-frederic@kernel.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20250213232559.34163-1-frederic@kernel.org>
References: <20250213232559.34163-1-frederic@kernel.org>

A full memory barrier in the RCU-PREEMPT task unblock path advertises
that it orders the context switch (or rather the accesses prior to
rcu_read_unlock()) against the expedited grace period fastpath. However,
the grace period cannot complete without rcu_report_exp_rnp() eventually
being called with the node locked. That call reports the quiescent state
in a fashion fully ordered against the updater's accesses, thanks to:

1) The READ-SIDE smp_mb__after_unlock_lock() barrier across node locking
   while propagating the QS up to the root.

2) The UPDATE-SIDE smp_mb__after_unlock_lock() barrier while holding the
   root rnp lock to wait for, and check, GP completion.

3) The (perhaps redundant, given steps 1) and 2)) smp_mb() in
   rcu_seq_end() before the grace period completes.

This makes the explicit barrier in this place superfluous. Therefore
remove it, as it is confusing.
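As background for the ordering argument above, the rcu_node locking
wrappers pair each lock acquisition with smp_mb__after_unlock_lock(),
which promotes the preceding unlock plus this lock into a full ordering
point. A simplified sketch of the pattern follows; the kernel's actual
raw_spin_lock_rcu_node() additionally goes through ACCESS_PRIVATE() for
the lock field:

	#define sketch_raw_spin_lock_rcu_node(rnp)		\
	do {							\
		raw_spin_lock(&(rnp)->lock);			\
		/* Promote unlock+lock to a full barrier. */	\
		smp_mb__after_unlock_lock();			\
	} while (0)

smp_mb__after_unlock_lock() is a no-op on architectures where an
unlock+lock pair already implies a full barrier, and a real smp_mb() on
those (such as PowerPC) where it does not.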
Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree_plugin.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 3c0bbbbb686f..d51cc7a5dfc7 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -534,7 +534,6 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags)
 		WARN_ON_ONCE(rnp->completedqs == rnp->gp_seq &&
 			     (!empty_norm || rnp->qsmask));
 		empty_exp = sync_rcu_exp_done(rnp);
-		smp_mb(); /* ensure expedited fastpath sees end of RCU c-s. */
 		np = rcu_next_node_entry(t, rnp);
 		list_del_init(&t->rcu_node_entry);
 		t->rcu_blocked_node = NULL;

From patchwork Thu Feb 13 23:25:59 2025
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13974158
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
    Paul E. McKenney, Uladzislau Rezki, Zqiang, rcu
Subject: [PATCH 3/3] rcu/exp: Remove needless CPU up quiescent state report
Date: Fri, 14 Feb 2025 00:25:59 +0100
Message-ID: <20250213232559.34163-4-frederic@kernel.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20250213232559.34163-1-frederic@kernel.org>
References: <20250213232559.34163-1-frederic@kernel.org>

A CPU coming online checks for an ongoing grace period and reports a
quiescent state on its own behalf if needed. This special treatment,
which shortcuts the expedited IPI, originated as an optimization in the
following commit:

	338b0f760e84 ("rcu: Better hotplug handling for synchronize_sched_expedited()")

The point was to avoid an IPI while waiting for a CPU to become online,
or while failing to become offline.
However this is pointless, and even error prone, for several reasons:

* If the CPU has been seen offline in the first round scanning offline
  and idle CPUs, no IPI is even tried and the quiescent state is
  reported on behalf of the CPU.

* This means that if the IPI fails, the CPU has just gone offline. It is
  then unlikely to come back online right away, unless the CPU hotplug
  operation failed and rolled back, which is a rare event that can
  afford to wait a jiffy for a new IPI to be issued.

* But then this "optimization", which applies on a failing CPU hotplug
  down operation, is only relevant to !PREEMPT_RCU.

* It forcibly reports a quiescent state even if ->cpu_no_qs.b.exp is not
  set. As a result it can race with remote QS reports on the same rdp
  (see the sketch below). Fortunately this happens to be OK today, but
  it is an accident waiting to happen.

For all those reasons, remove this optimization, which does not look
worth keeping around.
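To illustrate that last race, here is a hypothetical interleaving. This
is an editorial sketch: the exact call sites are assumptions, and only
the racing double report on one rdp is taken from the changelog:

	CPU-up path				Expedited GP machinery

	// an ongoing exp GP is noticed; a QS
	// is force-reported on behalf of the
	// incoming CPU even though its
	// rdp->cpu_no_qs.b.exp was never set
	report exp QS for rdp
						// IPI/scan path concurrently
						// reports a QS for the very
						// same rdp
						report exp QS for rdp

	// Two contexts race to report the same rdp's quiescent state:
	// currently harmless by accident, but fragile.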
Reported-by: Paul E. McKenney
Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree.c     |  2 --
 kernel/rcu/tree_exp.h | 45 ++-----------------------------------------
 2 files changed, 2 insertions(+), 45 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8625f616c65a..86935fe00397 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -151,7 +151,6 @@ static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp,
 			      unsigned long gps, unsigned long flags);
 static void invoke_rcu_core(void);
 static void rcu_report_exp_rdp(struct rcu_data *rdp);
-static void sync_sched_exp_online_cleanup(int cpu);
 static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp);
 static bool rcu_rdp_is_offloaded(struct rcu_data *rdp);
 static bool rcu_rdp_cpu_online(struct rcu_data *rdp);
@@ -4259,7 +4258,6 @@ int rcutree_online_cpu(unsigned int cpu)
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 	if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE)
 		return 0; /* Too early in boot for scheduler work. */
-	sync_sched_exp_online_cleanup(cpu);

 	// Stop-machine done, so allow nohz_full to disable tick.
 	tick_dep_clear(TICK_DEP_BIT_RCU);
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index caff16e441d1..a3f962eb7057 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -751,12 +751,8 @@ static void rcu_exp_handler(void *unused)
 	struct task_struct *t = current;

 	/*
-	 * First, is there no need for a quiescent state from this CPU,
-	 * or is this CPU already looking for a quiescent state for the
-	 * current grace period? If either is the case, just leave.
-	 * However, this should not happen due to the preemptible
-	 * sync_sched_exp_online_cleanup() implementation being a no-op,
-	 * so warn if this does happen.
+	 * WARN if the CPU is unexpectedly already looking for a
+	 * QS or has already reported one.
 	 */
 	ASSERT_EXCLUSIVE_WRITER_SCOPED(rdp->cpu_no_qs.b.exp);
 	if (WARN_ON_ONCE(!(READ_ONCE(rnp->expmask) & rdp->grpmask) ||
@@ -803,11 +799,6 @@ static void rcu_exp_handler(void *unused)
 	WARN_ON_ONCE(1);
 }

-/* PREEMPTION=y, so no PREEMPTION=n expedited grace period to clean up after. */
-static void sync_sched_exp_online_cleanup(int cpu)
-{
-}
-
 /*
  * Scan the current list of tasks blocked within RCU read-side critical
  * sections, printing out the tid of each that is blocking the current
@@ -885,38 +876,6 @@ static void rcu_exp_handler(void *unused)
 	rcu_exp_need_qs();
 }

-/* Send IPI for expedited cleanup if needed at end of CPU-hotplug operation. */
-static void sync_sched_exp_online_cleanup(int cpu)
-{
-	unsigned long flags;
-	int my_cpu;
-	struct rcu_data *rdp;
-	int ret;
-	struct rcu_node *rnp;
-
-	rdp = per_cpu_ptr(&rcu_data, cpu);
-	rnp = rdp->mynode;
-	my_cpu = get_cpu();
-	/* Quiescent state either not needed or already requested, leave. */
-	if (!(READ_ONCE(rnp->expmask) & rdp->grpmask) ||
-	    READ_ONCE(rdp->cpu_no_qs.b.exp)) {
-		put_cpu();
-		return;
-	}
-	/* Quiescent state needed on current CPU, so set it up locally. */
-	if (my_cpu == cpu) {
-		local_irq_save(flags);
-		rcu_exp_need_qs();
-		local_irq_restore(flags);
-		put_cpu();
-		return;
-	}
-	/* Quiescent state needed on some other CPU, send IPI. */
-	ret = smp_call_function_single(cpu, rcu_exp_handler, NULL, 0);
-	put_cpu();
-	WARN_ON_ONCE(ret);
-}
-
 /*
  * Because preemptible RCU does not exist, we never have to check for
  * tasks blocked within RCU read-side critical sections that are