From patchwork Tue Jun 4 22:23:47 2024
From: "Paul E. McKenney"
X-Patchwork-Id: 13685920
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney", Jens Axboe
Subject: [PATCH rcu 1/9] rcu: Add lockdep_assert_in_rcu_read_lock() and friends
Date: Tue, 4 Jun 2024 15:23:47 -0700
Message-Id: <20240604222355.2370768-1-paulmck@kernel.org>
In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop>

There is no direct RCU counterpart to lockdep_assert_irqs_disabled()
and friends.  Although it is possible to construct them, it would be
more convenient to have the following lockdep assertions:

	lockdep_assert_in_rcu_read_lock()
	lockdep_assert_in_rcu_read_lock_bh()
	lockdep_assert_in_rcu_read_lock_sched()
	lockdep_assert_in_rcu_reader()

This commit therefore creates them.

Reported-by: Jens Axboe
Signed-off-by: Paul E. McKenney
---
 include/linux/rcupdate.h | 60 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index dfd2399f2cde0..8470a85f65634 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -421,11 +421,71 @@ static inline void rcu_preempt_sleep_check(void) { }
 		  "Illegal context switch in RCU-sched read-side critical section"); \
 	} while (0)
 
+// See RCU_LOCKDEP_WARN() for an explanation of the double call to
+// debug_lockdep_rcu_enabled().
+static inline bool lockdep_assert_rcu_helper(bool c)
+{
+	return debug_lockdep_rcu_enabled() &&
+	       (c || !rcu_is_watching() || !rcu_lockdep_current_cpu_online()) &&
+	       debug_lockdep_rcu_enabled();
+}
+
+/**
+ * lockdep_assert_in_rcu_read_lock - WARN if not protected by rcu_read_lock()
+ *
+ * Splats if lockdep is enabled and there is no rcu_read_lock() in effect.
+ */
+#define lockdep_assert_in_rcu_read_lock() \
+	WARN_ON_ONCE(lockdep_assert_rcu_helper(!lock_is_held(&rcu_lock_map)))
+
+/**
+ * lockdep_assert_in_rcu_read_lock_bh - WARN if not protected by rcu_read_lock_bh()
+ *
+ * Splats if lockdep is enabled and there is no rcu_read_lock_bh() in effect.
+ * Note that local_bh_disable() and friends do not suffice here, instead an
+ * actual rcu_read_lock_bh() is required.
+ */
+#define lockdep_assert_in_rcu_read_lock_bh() \
+	WARN_ON_ONCE(lockdep_assert_rcu_helper(!lock_is_held(&rcu_bh_lock_map)))
+
+/**
+ * lockdep_assert_in_rcu_read_lock_sched - WARN if not protected by rcu_read_lock_sched()
+ *
+ * Splats if lockdep is enabled and there is no rcu_read_lock_sched()
+ * in effect.  Note that preempt_disable() and friends do not suffice here,
+ * instead an actual rcu_read_lock_sched() is required.
+ */
+#define lockdep_assert_in_rcu_read_lock_sched() \
+	WARN_ON_ONCE(lockdep_assert_rcu_helper(!lock_is_held(&rcu_sched_lock_map)))
+
+/**
+ * lockdep_assert_in_rcu_reader - WARN if not within some type of RCU reader
+ *
+ * Splats if lockdep is enabled and there is no RCU reader of any
+ * type in effect.  Note that regions of code protected by things like
+ * preempt_disable, local_bh_disable(), and local_irq_disable() all qualify
+ * as RCU readers.
+ *
+ * Note that this will never trigger in PREEMPT_NONE or PREEMPT_VOLUNTARY
+ * kernels that are not also built with PREEMPT_COUNT.  But if you have
+ * lockdep enabled, you might as well also enable PREEMPT_COUNT.
+ */
+#define lockdep_assert_in_rcu_reader() \
+	WARN_ON_ONCE(lockdep_assert_rcu_helper(!lock_is_held(&rcu_lock_map) && \
+					       !lock_is_held(&rcu_bh_lock_map) && \
+					       !lock_is_held(&rcu_sched_lock_map) && \
+					       preemptible()))
+
 #else /* #ifdef CONFIG_PROVE_RCU */
 
 #define RCU_LOCKDEP_WARN(c, s) do { } while (0 && (c))
 #define rcu_sleep_check() do { } while (0)
 
+#define lockdep_assert_in_rcu_read_lock() do { } while (0)
+#define lockdep_assert_in_rcu_read_lock_bh() do { } while (0)
+#define lockdep_assert_in_rcu_read_lock_sched() do { } while (0)
+#define lockdep_assert_in_rcu_reader() do { } while (0)
+
 #endif /* #else #ifdef CONFIG_PROVE_RCU */
 
 /*
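As a usage sketch (the structure, function, and list names here are
hypothetical and not part of the patch), a lookup helper can now both
document and, under CONFIG_PROVE_RCU, verify its required calling
context:

	#include <linux/rcupdate.h>
	#include <linux/rculist.h>

	/* Hypothetical example structure. */
	struct demo_entry {
		int key;
		struct list_head node;
	};

	/* Callers must invoke this within an rcu_read_lock() reader. */
	static struct demo_entry *demo_lookup(struct list_head *head, int key)
	{
		struct demo_entry *e;

		/* Splats under lockdep if the caller forgot rcu_read_lock(). */
		lockdep_assert_in_rcu_read_lock();

		list_for_each_entry_rcu(e, head, node)
			if (e->key == key)
				return e;
		return NULL;
	}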
From patchwork Tue Jun 4 22:23:48 2024
From: "Paul E. McKenney"
X-Patchwork-Id: 13685918
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Neeraj Upadhyay, "Paul E. McKenney"
Subject: [PATCH rcu 2/9] rcu: Reduce synchronize_rcu() delays when all wait heads are in use
Date: Tue, 4 Jun 2024 15:23:48 -0700
Message-Id: <20240604222355.2370768-2-paulmck@kernel.org>
In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop>

From: Neeraj Upadhyay

When all wait heads are in use, which can happen when
rcu_sr_normal_gp_cleanup_work()'s callback processing is slow, any new
synchronize_rcu() user's rcu_synchronize node's processing is deferred
to future GP periods.  This can result in a long list of
synchronize_rcu() invocations waiting for full grace-period processing,
which can delay freeing of memory.  Mitigate this problem by using the
first node in the list as the wait tail when all wait heads are in use.
While methods to speed up callback processing would be needed to recover
from this situation, allowing new nodes to complete their grace period
can help prevent delays due to the fixed number of wait-head nodes.

Signed-off-by: Neeraj Upadhyay
Signed-off-by: Paul E. McKenney
---
 kernel/rcu/tree.c | 40 +++++++++++++++++++++++-----------------
 1 file changed, 23 insertions(+), 17 deletions(-)
McKenney" X-Patchwork-Id: 13685918 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D8844A5F; Tue, 4 Jun 2024 22:23:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717539836; cv=none; b=rdxNTIPh4Bi1c0CFYcmj3phHAk6yyr2+a7FB0DjHPZEjefLqh+wghUW99PT0ld3m+C/98crF3ivpTrgpoTESaK4DPA+K81t20SJNLgkE0V0Q0esECF3rpeCT+l3Z3wjQLhCGci3t+maOq0XGZBHDPYkOuPNSMtMIvGavWWM2aB4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717539836; c=relaxed/simple; bh=RUq/x40RoZ1+Q28PHuHwZWpj7Bz1teiZmeOv5T+R3o0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=KDSUad7wwcIyF26/bHgLMEGqPMRXDZ8g4HZxFVyaQjRdPbp5JlQsvaf3o2Be9pp2PcdjhPYBNR4q49Of3xJJq8GT9yQkpXmxH2tZQSulUWHz7MCqwWBxlxZKaJx//S06WT6ky84j0M3wT3WTnQBtSpr9jGMDvAhHylUAhh+JUdo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Kt3FpNKb; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Kt3FpNKb" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B3980C3277B; Tue, 4 Jun 2024 22:23:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1717539836; bh=RUq/x40RoZ1+Q28PHuHwZWpj7Bz1teiZmeOv5T+R3o0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Kt3FpNKbTghXYwbkm+LzOtM+NWHFQdZhQcDh65ABKodYDBCJ9tOwN2z7AHbgYUgYi WZv5/SAjdvqw1xhhnzcVMLiFhj203zOi71+yPHpjJakaYR8mHfYFpqGE+QTZ7B8YkA 0Dx467tCwu8w1vNNY+sK7o6J+aA38ub5gT7xW9rLBK5WVXWYOBEi1SD8iNyURotfsA T+mX0rlyQb3Fk4MwuwrkrIlAO/+51MbITrNmrU8INEVOke6npkJWY0hqTZzZBWW5VQ dJPx4MMSvGRNIbkzsdozlziGbRLkQGbZtIpWO7SdfUgMTFyq2jXC94waG7/sEcx/2O vmNS3HSV9A35A== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 5CADECE3F0F; Tue, 4 Jun 2024 15:23:56 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Neeraj Upadhyay , "Paul E . McKenney" Subject: [PATCH rcu 2/9] rcu: Reduce synchronize_rcu() delays when all wait heads are in use Date: Tue, 4 Jun 2024 15:23:48 -0700 Message-Id: <20240604222355.2370768-2-paulmck@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> References: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Neeraj Upadhyay When all wait heads are in use, which can happen when rcu_sr_normal_gp_cleanup_work()'s callback processing is slow, any new synchronize_rcu() user's rcu_synchronize node's processing is deferred to future GP periods. This can result in long list of synchronize_rcu() invocations waiting for full grace period processing, which can delay freeing of memory. Mitigate this problem by using first node in the list as wait tail when all wait heads are in use. While methods to speed up callback processing would be needed to recover from this situation, allowing new nodes to complete their grace period can help prevent delays due to a fixed number of wait head nodes. 
From patchwork Tue Jun 4 22:23:49 2024
From: "Paul E. McKenney"
X-Patchwork-Id: 13685922
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Joel Fernandes (Google)", Uladzislau Rezki, "Paul E. McKenney"
Subject: [PATCH rcu 3/9] rcu/tree: Reduce wake up for synchronize_rcu() common case
Date: Tue, 4 Jun 2024 15:23:49 -0700
Message-Id: <20240604222355.2370768-3-paulmck@kernel.org>
In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop>

From: "Joel Fernandes (Google)"

In the synchronize_rcu() common case, we will have fewer than
SR_MAX_USERS_WAKE_FROM_GP users per GP.  Waking up the kworker is
pointless just to free the last injected wait head, since at that point
all the users have already been awakened.  Introduce a new counter to
track this and prevent the wakeup in the common case.

Signed-off-by: Joel Fernandes (Google)
Reviewed-by: Uladzislau Rezki (Sony)
Signed-off-by: Paul E. McKenney
---
 kernel/rcu/tree.c | 35 ++++++++++++++++++++++++++++++-----
 kernel/rcu/tree.h |  1 +
 2 files changed, 31 insertions(+), 5 deletions(-)
McKenney" X-Patchwork-Id: 13685922 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2834614C592; Tue, 4 Jun 2024 22:23:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717539837; cv=none; b=SdgBGIM1m0qS8sUTgOVSacE3FB0pBwVLqoORgzrWpIUla1iS/t8tk2Mph+FTFvDcivXE3eQghnvbH0KCiZY8QH3KSn5Q7Wt3u8Set4mjZHl8rbkw/dzO2/MpBTdCPjDlQScEoif0fc96mi+Zc+KmVJsxQwaPnEqkO0KQisxOTb8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717539837; c=relaxed/simple; bh=pfQQEuNAMcpIVAntByrnVCNTp3OoaecMef/sKBSFuQE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=hU33HoIDYoSxDeGBALR4VItDiPLg2iKxtlWbMv+fheKHhhz7GcRJXP5m4wRuLBtX/pkCHoaOht1KLQze0lLDHrxWBKOgSh/pnghrpru1NQhP8m1+F+VCEo1Hbo0TiE44LVmcTw8Y+le62DTLF4SSz5T/7onSWR7MBQa6AUBug0I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=W0LX1Gsk; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="W0LX1Gsk" Received: by smtp.kernel.org (Postfix) with ESMTPSA id BA6E6C4AF08; Tue, 4 Jun 2024 22:23:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1717539836; bh=pfQQEuNAMcpIVAntByrnVCNTp3OoaecMef/sKBSFuQE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=W0LX1GskxrP735W7tui7ZPtRZqg8hRjGnLsh3TQxMveQtXzar8nhQ9LN3Gf6D5mAr ppyDH57GjFGHhEjW9RK8pWV7UpC+/RveIlrAqV91QKywsS3wydUX0Q9+0aNnsNAwA4 pSUkyvPKRXEQ1W9n/cDGVR46DVmZJ0u/7dzlbNwy7DcN5DCkNm0opWliukzhC/4Hdu NWlCs7BN190+IY2/XbEyrIeFO55ApJHopcLmOV+cPLCE+k5hitWuFHfpjbk/p8XPsX UuKI5uPVH4onkDSVU4XaAXZn8XiCI8/1lxzfWZqbs8S9iZmHe8u9C1c7/ttGSmHjNo Ad/xufTVra4og== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 5F39DCE3F26; Tue, 4 Jun 2024 15:23:56 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , Uladzislau Rezki , "Paul E . McKenney" Subject: [PATCH rcu 3/9] rcu/tree: Reduce wake up for synchronize_rcu() common case Date: Tue, 4 Jun 2024 15:23:49 -0700 Message-Id: <20240604222355.2370768-3-paulmck@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> References: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: "Joel Fernandes (Google)" In the synchronize_rcu() common case, we will have less than SR_MAX_USERS_WAKE_FROM_GP number of users per GP. Waking up the kworker is pointless just to free the last injected wait head since at that point, all the users have already been awakened. Introduce a new counter to track this and prevent the wakeup in the common case. Signed-off-by: Joel Fernandes (Google) Reviewed-by: Uladzislau Rezki (Sony) Signed-off-by: Paul E. 
From patchwork Tue Jun 4 22:23:50 2024
From: "Paul E. McKenney"
X-Patchwork-Id: 13685919
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney", Dan Carpenter
Subject: [PATCH rcu 4/9] rcu: Disable interrupts directly in rcu_gp_init()
Date: Tue, 4 Jun 2024 15:23:50 -0700
Message-Id: <20240604222355.2370768-4-paulmck@kernel.org>
In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop>

Interrupts are enabled in rcu_gp_init(), so this commit switches from
local_irq_save() and local_irq_restore() to local_irq_disable() and
local_irq_enable().

Link: https://lore.kernel.org/all/febb13ab-a4bb-48b4-8e97-7e9f7749e6da@moroto.mountain/
Reported-by: Dan Carpenter
Signed-off-by: Paul E. McKenney
---
 kernel/rcu/tree.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2fe08e6186b4d..35bf4a3736765 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1841,7 +1841,7 @@ static noinline_for_stack bool rcu_gp_init(void)
 	WRITE_ONCE(rcu_state.gp_state, RCU_GP_ONOFF);
 	/* Exclude CPU hotplug operations. */
 	rcu_for_each_leaf_node(rnp) {
-		local_irq_save(flags);
+		local_irq_disable();
 		arch_spin_lock(&rcu_state.ofl_lock);
 		raw_spin_lock_rcu_node(rnp);
 		if (rnp->qsmaskinit == rnp->qsmaskinitnext &&
@@ -1849,7 +1849,7 @@ static noinline_for_stack bool rcu_gp_init(void)
 			/* Nothing to do on this leaf rcu_node structure. */
 			raw_spin_unlock_rcu_node(rnp);
 			arch_spin_unlock(&rcu_state.ofl_lock);
-			local_irq_restore(flags);
+			local_irq_enable();
 			continue;
 		}
 
@@ -1886,7 +1886,7 @@ static noinline_for_stack bool rcu_gp_init(void)
 
 		raw_spin_unlock_rcu_node(rnp);
 		arch_spin_unlock(&rcu_state.ofl_lock);
-		local_irq_restore(flags);
+		local_irq_enable();
 	}
 	rcu_gp_slow(gp_preinit_delay); /* Races with CPU hotplug. */
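The general rule being applied: local_irq_save()/local_irq_restore()
are needed only when the code may be entered with interrupts already
disabled; when interrupts are known enabled, the plain disable/enable
pair is cheaper and makes the assumption explicit.  A minimal sketch of
the two idioms (function names hypothetical):

	#include <linux/irqflags.h>
	#include <linux/lockdep.h>

	static void demo_maybe_irqs_off(void)	/* callable from any context */
	{
		unsigned long flags;

		local_irq_save(flags);		/* remembers prior IRQ state */
		/* ... critical section ... */
		local_irq_restore(flags);	/* restores prior IRQ state */
	}

	static void demo_irqs_known_on(void)	/* interrupts known enabled */
	{
		lockdep_assert_irqs_enabled();	/* documents the precondition */
		local_irq_disable();
		/* ... critical section ... */
		local_irq_enable();		/* unconditionally re-enables */
	}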
McKenney" X-Patchwork-Id: 13685919 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2824E1442FE; Tue, 4 Jun 2024 22:23:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717539837; cv=none; b=XKFCA6rbRB2koH6BuOI/543T7AnC/heUbWV1qKh9/0EPt3ibiu4CFp+MfBVx2sexYxYB9HUeF0zHt5uZPOQLf73101Vj7mraknqgVd8B/2CfVBRAW6HUna4atxcoslS1+ehAW54HmJykvJ0b3Ezt+w77eOCDFcJPePTILtBR2QQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717539837; c=relaxed/simple; bh=cSJFd+VMxpJPGTTylKf3LUw/sJdSsIpeVfT0Z+nfmlg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=ANlO6qIAJpy6wZ13c9FITFqbqm8TmtTe7qlADWxZQgHzcchr6F6ntW2kuPgvLhya9MNY4whTby6ggjGn/9Gg9IAdLg1Bc0bYiRETNpribNqSZl4oNbMOPrfX3h5/9m6LGY9wHbIupt7tZPfKt0PJCHm6vZWankGWRkOir8Mzu4s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=oTjPybFM; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="oTjPybFM" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C839CC4AF07; Tue, 4 Jun 2024 22:23:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1717539836; bh=cSJFd+VMxpJPGTTylKf3LUw/sJdSsIpeVfT0Z+nfmlg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=oTjPybFMsz8Iy9BYREe5CNlP0iHsbid6J1MzhckH7NnNXuNvG6DPvUuph59muncRX aMQ8w9r+pdiHW8iKsY696KjC8anAhA7x1FX0KSBTxvyOYqALJboSydbDNUsoJ14MLG qhmnwKY0IeiKWPiLQPDePHvhRVZaeiT+YmFGit2EGUpjukN1LvmdCWBTbqULDxFACy 7v5oA9ZqAUVFIryEc41YIn4bdMkYwTXr8kt1xoRpHnCvk63CNEO8FN8OUpWwSUmy6y BPTVWwxl45PCDYEu7ptZk/fXslfM6M9GwrO1ei3BCxpo7XCshNvMRHYTo2nqUG1xmP zslkg7oXr3iaQ== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 61E6BCE3F27; Tue, 4 Jun 2024 15:23:56 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" , Dan Carpenter Subject: [PATCH rcu 4/9] rcu: Disable interrupts directly in rcu_gp_init() Date: Tue, 4 Jun 2024 15:23:50 -0700 Message-Id: <20240604222355.2370768-4-paulmck@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> References: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Interrupts are enabled in rcu_gp_init(), so this commit switches from local_irq_save() and local_irq_restore() to local_irq_disable() and local_irq_enable(). Link: https://lore.kernel.org/all/febb13ab-a4bb-48b4-8e97-7e9f7749e6da@moroto.mountain/ Reported-by: Dan Carpenter Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 2fe08e6186b4d..35bf4a3736765 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1841,7 +1841,7 @@ static noinline_for_stack bool rcu_gp_init(void) WRITE_ONCE(rcu_state.gp_state, RCU_GP_ONOFF); /* Exclude CPU hotplug operations. 
From patchwork Tue Jun 4 22:23:52 2024
From: "Paul E. McKenney"
X-Patchwork-Id: 13685923
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney", Leonardo Bras, Sean Christopherson
Subject: [PATCH rcu 6/9] rcu: Add rcutree.nocb_patience_delay to reduce nohz_full OS jitter
Date: Tue, 4 Jun 2024 15:23:52 -0700
Message-Id: <20240604222355.2370768-6-paulmck@kernel.org>
In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop>

If a CPU is running either a userspace application or a guest OS in
nohz_full mode, it is possible for a system call to occur just as an
RCU grace period is starting.  If that CPU also has the scheduling-clock
tick enabled for any reason (such as a second runnable task), and if the
system was booted with rcutree.use_softirq=0, then RCU can add insult to
injury by awakening that CPU's rcuc kthread, resulting in yet another
task and yet more OS jitter due to switching to that task, running it,
and switching back.

In addition, in the common case where that system call is not of
excessively long duration, awakening the rcuc task is pointless.
This pointlessness is due to the fact that the CPU will enter an
extended quiescent state upon returning to the userspace application or
guest OS.  In this case, the rcuc kthread cannot do anything that the
main RCU grace-period kthread cannot do on its behalf, at least if it
is given a few additional milliseconds (for example, given the time
duration specified by rcutree.jiffies_till_first_fqs, give or take
scheduling delays).

This commit therefore adds a rcutree.nocb_patience_delay kernel boot
parameter that specifies the grace-period age (in milliseconds) before
which RCU will refrain from awakening the rcuc kthread.  Preliminary
experimentation suggests a value of 1000, that is, one second.
Increasing rcutree.nocb_patience_delay will increase grace-period
latency and in turn increase memory footprint, so systems with
constrained memory might choose a smaller value.  Systems with
less-aggressive OS-jitter requirements might choose the default value
of zero, which keeps the traditional immediate-wakeup behavior, thus
avoiding increases in grace-period latency.

[ paulmck: Apply Leonardo Bras feedback. ]

Link: https://lore.kernel.org/all/20240328171949.743211-1-leobras@redhat.com/
Reported-by: Leonardo Bras
Suggested-by: Leonardo Bras
Suggested-by: Sean Christopherson
Signed-off-by: Paul E. McKenney
Reviewed-by: Leonardo Bras
---
 Documentation/admin-guide/kernel-parameters.txt |  8 ++++++++
 kernel/rcu/tree.c                               | 10 ++++++++--
 kernel/rcu/tree_plugin.h                        | 10 ++++++++++
 3 files changed, 26 insertions(+), 2 deletions(-)
McKenney" , Dan Carpenter Subject: [PATCH rcu 5/9] srcu: Disable interrupts directly in srcu_gp_end() Date: Tue, 4 Jun 2024 15:23:51 -0700 Message-Id: <20240604222355.2370768-5-paulmck@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> References: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Interrupts are enabled in srcu_gp_end(), so this commit switches from spin_lock_irqsave_rcu_node() and spin_unlock_irqrestore_rcu_node() to spin_lock_irq_rcu_node() and spin_unlock_irq_rcu_node(). Link: https://lore.kernel.org/all/febb13ab-a4bb-48b4-8e97-7e9f7749e6da@moroto.mountain/ Reported-by: Dan Carpenter Signed-off-by: Paul E. McKenney --- kernel/rcu/srcutree.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index bc4b58b0204e9..d14d350f505f4 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -845,7 +845,6 @@ static void srcu_gp_end(struct srcu_struct *ssp) bool cbs; bool last_lvl; int cpu; - unsigned long flags; unsigned long gpseq; int idx; unsigned long mask; @@ -907,12 +906,12 @@ static void srcu_gp_end(struct srcu_struct *ssp) if (!(gpseq & counter_wrap_check)) for_each_possible_cpu(cpu) { sdp = per_cpu_ptr(ssp->sda, cpu); - spin_lock_irqsave_rcu_node(sdp, flags); + spin_lock_irq_rcu_node(sdp); if (ULONG_CMP_GE(gpseq, sdp->srcu_gp_seq_needed + 100)) sdp->srcu_gp_seq_needed = gpseq; if (ULONG_CMP_GE(gpseq, sdp->srcu_gp_seq_needed_exp + 100)) sdp->srcu_gp_seq_needed_exp = gpseq; - spin_unlock_irqrestore_rcu_node(sdp, flags); + spin_unlock_irq_rcu_node(sdp); } /* Callback initiation done, allow grace periods after next. */ From patchwork Tue Jun 4 22:23:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. 
McKenney" X-Patchwork-Id: 13685923 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5B46B14C5BA; Tue, 4 Jun 2024 22:23:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717539837; cv=none; b=QscyFfmwZiliLB1ukdcFYq945D6b7xv6CSVm2ztytL+A7N8gb5kAXo3aZ6tF7UwPqBvE9P6SXdyL2yuaMPAT6ZuRmsURfEPO2FJ8QbUttGZNnLdB/fv3OJPbOKXg3/9qTrmWfbaHfi2hipruYQ4wC998/OL9DDASIUZ6eCDANE8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717539837; c=relaxed/simple; bh=Mr0jgnXrW9nLakafqNopRYzxiUW6MmdPijZgfPRpq7Y=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=CmsjppLI5Mq0SQu52mleQH+AuFCpqmvKHJ0sFYewJezFrmYP+3KnW0Nw4p7lTEIMJvm3Sx2g3Xa5zokrp9TEwMaP4qEOhXBXq166hDmXfjS1YMmIMT+CNS01tWI0/1w0Qgm+Opxw/OvtQvzbM1PHb/BCk+Z3qTUkvssTc5J7o+g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=X7EL/eN1; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="X7EL/eN1" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 16C78C4AF0F; Tue, 4 Jun 2024 22:23:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1717539837; bh=Mr0jgnXrW9nLakafqNopRYzxiUW6MmdPijZgfPRpq7Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=X7EL/eN1SHfNEkXECubtVnDDCJtiDSOesBLMbda4eaJenNyPOBOZW+dZ/RsThju5i lmHgj0eZddck8K7R13yuNIfmjLrbD7+geOzHmyQdZFXpMD2k4zPkALiG3bPBsDUb5N zJYlTCELaZTimBbStA8G9cznf/a9jllb9gwDfkwIX4opsRsVuAxKBLMLHtRt9vQ4EB rrMPVBqT23skMVg+9fBlO+AfrBmdIh31tDvgCymnOE7VTXirr9cqLqOxFHQ5WURalQ XNukyjTTkXivfjLnV9+4UkxycIl9rBTD15FGv5AZIsN3rRNMP5IfMminSqSzUJHRld OqMZH0CWowTeg== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 67235CE3F30; Tue, 4 Jun 2024 15:23:56 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" , Leonardo Bras , Sean Christopherson Subject: [PATCH rcu 6/9] rcu: Add rcutree.nocb_patience_delay to reduce nohz_full OS jitter Date: Tue, 4 Jun 2024 15:23:52 -0700 Message-Id: <20240604222355.2370768-6-paulmck@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> References: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 If a CPU is running either a userspace application or a guest OS in nohz_full mode, it is possible for a system call to occur just as an RCU grace period is starting. If that CPU also has the scheduling-clock tick enabled for any reason (such as a second runnable task), and if the system was booted with rcutree.use_softirq=0, then RCU can add insult to injury by awakening that CPU's rcuc kthread, resulting in yet another task and yet more OS jitter due to switching to that task, running it, and switching back. In addition, in the common case where that system call is not of excessively long duration, awakening the rcuc task is pointless. 
From patchwork Tue Jun 4 22:23:54 2024
From: "Paul E. McKenney"
X-Patchwork-Id: 13685926
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Oleg Nesterov, "Paul E. McKenney"
Subject: [PATCH rcu 8/9] rcu: Eliminate lockless accesses to rcu_sync->gp_count
Date: Tue, 4 Jun 2024 15:23:54 -0700
Message-Id: <20240604222355.2370768-8-paulmck@kernel.org>
In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop>

From: Oleg Nesterov

The rcu_sync structure's ->gp_count field is always accessed under the
protection of that same structure's ->rss_lock field, with the exception
of a pair of WARN_ON_ONCE() calls just prior to acquiring that lock in
functions rcu_sync_exit() and rcu_sync_dtor().  These lockless accesses
are unnecessary and impair KCSAN's ability to catch bugs that might be
inserted via other lockless accesses.

This commit therefore moves those WARN_ON_ONCE() calls under the lock.

Signed-off-by: Oleg Nesterov
Signed-off-by: Paul E. McKenney
---
 kernel/rcu/sync.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index 6c2bd9001adcd..da60a9947c005 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
 		 * we are called at early boot time but this shouldn't happen.
 		 */
 	}
-	WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1);
+	rsp->gp_count++;
 	spin_unlock_irq(&rsp->rss_lock);
 
 	if (gp_state == GP_IDLE) {
@@ -151,15 +151,11 @@ void rcu_sync_enter(struct rcu_sync *rsp)
 */
 void rcu_sync_exit(struct rcu_sync *rsp)
 {
-	int gpc;
-
 	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
-	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
 
 	spin_lock_irq(&rsp->rss_lock);
-	gpc = rsp->gp_count - 1;
-	WRITE_ONCE(rsp->gp_count, gpc);
-	if (!gpc) {
+	WARN_ON_ONCE(rsp->gp_count == 0);
+	if (!--rsp->gp_count) {
 		if (rsp->gp_state == GP_PASSED) {
 			WRITE_ONCE(rsp->gp_state, GP_EXIT);
 			rcu_sync_call(rsp);
@@ -178,10 +174,10 @@ void rcu_sync_dtor(struct rcu_sync *rsp)
 {
 	int gp_state;
 
-	WARN_ON_ONCE(READ_ONCE(rsp->gp_count));
 	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_PASSED);
 
 	spin_lock_irq(&rsp->rss_lock);
+	WARN_ON_ONCE(rsp->gp_count);
 	if (rsp->gp_state == GP_REPLAY)
 		WRITE_ONCE(rsp->gp_state, GP_EXIT);
 	gp_state = rsp->gp_state;
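The underlying rule: a field that is always written under a lock needs
no WRITE_ONCE()/READ_ONCE(), and lockless assertions on it only add
marked accesses that hide real races from KCSAN.  A minimal sketch of
the resulting shape (demo structure and function hypothetical):

	#include <linux/spinlock.h>

	struct demo_sync {
		spinlock_t lock;
		int count;		/* protected by ->lock */
	};

	static void demo_exit(struct demo_sync *d)
	{
		/*
		 * A lockless WARN_ON_ONCE(READ_ONCE(d->count) == 0) here
		 * would be a marked access outside the lock, weakening
		 * KCSAN coverage; check under the lock instead.
		 */
		spin_lock_irq(&d->lock);
		WARN_ON_ONCE(d->count == 0);	/* checked under the lock */
		if (!--d->count) {
			/* ... last-exit processing ... */
		}
		spin_unlock_irq(&d->lock);
	}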
From patchwork Tue Jun 4 22:23:55 2024
From: "Paul E. McKenney"
X-Patchwork-Id: 13685925
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Frederic Weisbecker, "Paul E. McKenney"
Subject: [PATCH rcu 9/9] rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU invocation
Date: Tue, 4 Jun 2024 15:23:55 -0700
Message-Id: <20240604222355.2370768-9-paulmck@kernel.org>
In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop>

From: Frederic Weisbecker

When rcu_barrier() calls rcu_rdp_cpu_online() and observes a CPU off
rnp->qsmaskinitnext, it means that all accesses from the offline CPU
preceding CPUHP_TEARDOWN_CPU are visible to RCU barrier, including
callback expiration and counter updates.
McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Oleg Nesterov , "Paul E . McKenney" Subject: [PATCH rcu 8/9] rcu: Eliminate lockless accesses to rcu_sync->gp_count Date: Tue, 4 Jun 2024 15:23:54 -0700 Message-Id: <20240604222355.2370768-8-paulmck@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> References: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Oleg Nesterov The rcu_sync structure's ->gp_count field is always accessed under the protection of that same structure's ->rss_lock field, with the exception of a pair of WARN_ON_ONCE() calls just prior to acquiring that lock in functions rcu_sync_exit() and rcu_sync_dtor(). These lockless accesses are unnecessary and impair KCSAN's ability to catch bugs that might be inserted via other lockless accesses. This commit therefore moves those WARN_ON_ONCE() calls under the lock. Signed-off-by: Oleg Nesterov Signed-off-by: Paul E. McKenney --- kernel/rcu/sync.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c index 6c2bd9001adcd..da60a9947c005 100644 --- a/kernel/rcu/sync.c +++ b/kernel/rcu/sync.c @@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp) * we are called at early boot time but this shouldn't happen. */ } - WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1); + rsp->gp_count++; spin_unlock_irq(&rsp->rss_lock); if (gp_state == GP_IDLE) { @@ -151,15 +151,11 @@ void rcu_sync_enter(struct rcu_sync *rsp) */ void rcu_sync_exit(struct rcu_sync *rsp) { - int gpc; - WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE); - WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0); spin_lock_irq(&rsp->rss_lock); - gpc = rsp->gp_count - 1; - WRITE_ONCE(rsp->gp_count, gpc); - if (!gpc) { + WARN_ON_ONCE(rsp->gp_count == 0); + if (!--rsp->gp_count) { if (rsp->gp_state == GP_PASSED) { WRITE_ONCE(rsp->gp_state, GP_EXIT); rcu_sync_call(rsp); @@ -178,10 +174,10 @@ void rcu_sync_dtor(struct rcu_sync *rsp) { int gp_state; - WARN_ON_ONCE(READ_ONCE(rsp->gp_count)); WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_PASSED); spin_lock_irq(&rsp->rss_lock); + WARN_ON_ONCE(rsp->gp_count); if (rsp->gp_state == GP_REPLAY) WRITE_ONCE(rsp->gp_state, GP_EXIT); gp_state = rsp->gp_state; From patchwork Tue Jun 4 22:23:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. 
McKenney" X-Patchwork-Id: 13685925 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6DBEE14D28F; Tue, 4 Jun 2024 22:23:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717539837; cv=none; b=aWS+i2xMEcFyKeM7Edn9iPbmBFrypC+/ItQ0rUl9lhnEdkjN6uHON+Th9J1CGOG9KpHpM8bZltNuNlp4wmXeOT+NoGSm4+XtsPUqQFwnby95Dl82YN1dlxguj/SxPbWL/iiXzyunkWS3PU1spjxR+iQeK0/yX3bZV8vPZda1BC4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717539837; c=relaxed/simple; bh=yXmNx6GGZHORzw+Gi/ligi9zPZy9C9j2+D0gbKw6Oec=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=RwzeMZ0TCK19l6PNYMkyGYyu9wguOx11fZyhEkJ/5vjfUozD7gXKE6q3zIJ8hOwjQ78kTbyH0UMeoOQBhjZNcD5wDL+sUaqu2WVRlMafykIU79OoollDvNLqJYYPZE3NWtOyEaeHW0teGj6LKw8T7GFmPMhHaZLt+p76lpg3TpY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=oU6xBh0V; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="oU6xBh0V" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2D17EC4AF09; Tue, 4 Jun 2024 22:23:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1717539837; bh=yXmNx6GGZHORzw+Gi/ligi9zPZy9C9j2+D0gbKw6Oec=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=oU6xBh0VDGe7iMxicBoLC3QF6WQmFzOSJtugKtOUFyrrHafO+M+Y3cBp0yNCHOKbs SAxgBt0nbSN4vSDr2TCJ4u5ZGIz/E+HiCslFN52DUwsc/Dl14I4K/M8v1dbXSEeIu4 fD6UkNzVcucXXz0kCzGy1O/I/lu2z8pOYSykXfLTN3A/2jNKB4ku+bfGlQfuTalaDJ MToyertvOMNHwHaSS/2vDoUoP6BCIJmgcj2urVR84N7m7BSxAMnyCtLDaNpuLie7qx wsOfz2lVGdpaPDk6hJ4vawC70JgXUd2MCkT0BYA4azl40xWEjzUhPhb5VCKBmNDAXL U0AUr7aV+nbQQ== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 6ECB1CE3F35; Tue, 4 Jun 2024 15:23:56 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Frederic Weisbecker , "Paul E . McKenney" Subject: [PATCH rcu 9/9] rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU invocation Date: Tue, 4 Jun 2024 15:23:55 -0700 Message-Id: <20240604222355.2370768-9-paulmck@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> References: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Frederic Weisbecker When rcu_barrier() calls rcu_rdp_cpu_online() and observes a CPU off rnp->qsmaskinitnext, it means that all accesses from the offline CPU preceding the CPUHP_TEARDOWN_CPU are visible to RCU barrier, including callbacks expiration and counter updates. However interrupts can still fire after stop_machine() re-enables interrupts and before rcutree_report_cpu_dead(). The related accesses happening between CPUHP_TEARDOWN_CPU and rnp->qsmaskinitnext clearing are _NOT_ guaranteed to be seen by rcu_barrier() without proper ordering, especially when callbacks are invoked there to the end, making rcutree_migrate_callback() bypass barrier_lock. 
The following theoretical race example can make rcu_barrier() hang:

	CPU 0					CPU 1
	-----					-----
	// cpu_down()
	smpboot_park_threads()
	// ksoftirqd is parked now
	rcu_sched_clock_irq()
		invoke_rcu_core()
	do_softirq()
		rcu_core()
			rcu_do_batch()
				// callback storm
				// rcu_do_batch() returns
				// before completing all
				// of them
	// do_softirq also returns early because of
	// timeout. It defers to ksoftirqd but
	// it's parked
	stop_machine()
		take_cpu_down()
						rcu_barrier()
							spin_lock(barrier_lock)
							// observes rcu_segcblist_n_cbs(&rdp->cblist) != 0
	do_softirq()
		rcu_core()
			rcu_do_batch()
				// completes all pending callbacks
				// smp_mb() implied _after_ callback number dec
	rcutree_report_cpu_dead()
		rnp->qsmaskinitnext &= ~rdp->grpmask;

	rcutree_migrate_callback()
		// no callback, early return
		// without locking barrier_lock
						// observes !rcu_rdp_cpu_online(rdp)
						rcu_barrier_entrain()
							rcu_segcblist_entrain()
								// Observe rcu_segcblist_n_cbs(rsclp) == 0
								// because no barrier between reading
								// rnp->qsmaskinitnext and rsclp->len
								rcu_segcblist_add_len()
									smp_mb__before_atomic()
									// will now observe the 0 count and
									// empty list, but too late, we enqueue
									// regardless
									WRITE_ONCE(rsclp->len, rsclp->len + v);
						// ignored barrier callback
						// rcu barrier stall...

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 408b020c9501f..c58fc10fb5969 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -5147,11 +5147,15 @@ void rcutree_migrate_callbacks(int cpu)
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 	bool needwake;
 
-	if (rcu_rdp_is_offloaded(rdp) ||
-	    rcu_segcblist_empty(&rdp->cblist))
-		return;  /* No callbacks to migrate. */
+	if (rcu_rdp_is_offloaded(rdp))
+		return;
 
 	raw_spin_lock_irqsave(&rcu_state.barrier_lock, flags);
+	if (rcu_segcblist_empty(&rdp->cblist)) {
+		raw_spin_unlock_irqrestore(&rcu_state.barrier_lock, flags);
+		return;  /* No callbacks to migrate. */
+	}
+
 	WARN_ON_ONCE(rcu_rdp_cpu_online(rdp));
 	rcu_barrier_entrain(rdp);
 	my_rdp = this_cpu_ptr(&rcu_data);
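The fix follows a general rule for emptiness checks that race with a
lock-holding peer: take the lock first, then test, so the test and any
subsequent action are atomic with respect to the peer.  A plain sketch
of the before/after shape (all demo names hypothetical):

	#include <linux/spinlock.h>

	struct demo_queue { int len; };
	static bool demo_queue_empty(struct demo_queue *q) { return !q->len; }

	static DEFINE_RAW_SPINLOCK(demo_barrier_lock);

	static void demo_migrate(struct demo_queue *q)
	{
		unsigned long flags;

		/*
		 * Racy form: checking emptiness before the lock lets a
		 * peer holding demo_barrier_lock observe a state that
		 * this path will never revisit:
		 *
		 *	if (demo_queue_empty(q))
		 *		return;
		 */
		raw_spin_lock_irqsave(&demo_barrier_lock, flags);
		if (demo_queue_empty(q)) {
			raw_spin_unlock_irqrestore(&demo_barrier_lock, flags);
			return;	/* peer saw the same stable state */
		}
		/* ... migrate entries while holding the lock ... */
		raw_spin_unlock_irqrestore(&demo_barrier_lock, flags);
	}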