From patchwork Wed Oct 19 22:51:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012454 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 08596C4332F for ; Wed, 19 Oct 2022 22:51:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230199AbiJSWvx (ORCPT ); Wed, 19 Oct 2022 18:51:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51164 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230270AbiJSWvs (ORCPT ); Wed, 19 Oct 2022 18:51:48 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E88B2181CB1; Wed, 19 Oct 2022 15:51:47 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 6A17D619E8; Wed, 19 Oct 2022 22:51:47 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BC79DC433D7; Wed, 19 Oct 2022 22:51:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219906; bh=KTUh/BBOVf/jDi48jtrSyvr3uGxVW8ae61JQrk9NyJc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=QMnspMzNjGvI3eZcJ9PJnKKYTgCQgWhpSkDUKzx4lmQ5tOQf3lVjsfp2BOtXzL5Gk vu0dEyFrZXgwafcYKYjpu0Y27GSxwm1KKhl3FjaFKv5nuABUfkhHjkjKQWSxRZkYCb HM2E+cqEyFrqiVfSRTvAYhMOlQMtpcmVKbH/iXfmSW3CCCIwZCp1cWnsYvQsWOUF2B 0qr9TT+Yzb8/ra4woz0xMTP5HazBmWakNQyf1nDfed3yguzWkgUCFzQAy36bhMDgAp OWwQm2ASc6yz6oYl5r1Ue7kMaR5Xdx9A/4UK+jvcTo6GWYkT3b0z5I2amWSrjauGKx fxXkLf+1pzVRw== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 756C85C06B4; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, Zhen Lei , Joel Fernandes , Frederic Weisbecker , "Paul E . McKenney" Subject: [PATCH rcu 01/14] rcu: Simplify rcu_init_nohz() cpumask handling Date: Wed, 19 Oct 2022 15:51:31 -0700 Message-Id: <20221019225144.2500095-1-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: Zhen Lei In kernels built with either CONFIG_RCU_NOCB_CPU_DEFAULT_ALL=y or CONFIG_NO_HZ_FULL=y, additional CPUs must be added to rcu_nocb_mask. Except that kernels booted without the rcu_nocbs= will not have allocated rcu_nocb_mask. And the current rcu_init_nohz() function uses its need_rcu_nocb_mask and offload_all local variables to track the rcu_nocb and nohz_full state. But there is a much simpler approach, namely creating a cpumask pointer to track the default and then using cpumask_available() to check the rcu_nocb_mask state. This commit takes this approach, thereby simplifying and shortening the rcu_init_nohz() function. Signed-off-by: Zhen Lei Reviewed-by: Joel Fernandes (Google) Acked-by: Frederic Weisbecker Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_nocb.h | 34 +++++++++++----------------------- 1 file changed, 11 insertions(+), 23 deletions(-) diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h index 0a5f0ef414845..ce526cc2791ca 100644 --- a/kernel/rcu/tree_nocb.h +++ b/kernel/rcu/tree_nocb.h @@ -1210,45 +1210,33 @@ EXPORT_SYMBOL_GPL(rcu_nocb_cpu_offload); void __init rcu_init_nohz(void) { int cpu; - bool need_rcu_nocb_mask = false; - bool offload_all = false; struct rcu_data *rdp; - -#if defined(CONFIG_RCU_NOCB_CPU_DEFAULT_ALL) - if (!rcu_state.nocb_is_setup) { - need_rcu_nocb_mask = true; - offload_all = true; - } -#endif /* #if defined(CONFIG_RCU_NOCB_CPU_DEFAULT_ALL) */ + const struct cpumask *cpumask = NULL; #if defined(CONFIG_NO_HZ_FULL) - if (tick_nohz_full_running && !cpumask_empty(tick_nohz_full_mask)) { - need_rcu_nocb_mask = true; - offload_all = false; /* NO_HZ_FULL has its own mask. */ - } -#endif /* #if defined(CONFIG_NO_HZ_FULL) */ + if (tick_nohz_full_running && !cpumask_empty(tick_nohz_full_mask)) + cpumask = tick_nohz_full_mask; +#endif + + if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_DEFAULT_ALL) && + !rcu_state.nocb_is_setup && !cpumask) + cpumask = cpu_possible_mask; - if (need_rcu_nocb_mask) { + if (cpumask) { if (!cpumask_available(rcu_nocb_mask)) { if (!zalloc_cpumask_var(&rcu_nocb_mask, GFP_KERNEL)) { pr_info("rcu_nocb_mask allocation failed, callback offloading disabled.\n"); return; } } + + cpumask_or(rcu_nocb_mask, rcu_nocb_mask, cpumask); rcu_state.nocb_is_setup = true; } if (!rcu_state.nocb_is_setup) return; -#if defined(CONFIG_NO_HZ_FULL) - if (tick_nohz_full_running) - cpumask_or(rcu_nocb_mask, rcu_nocb_mask, tick_nohz_full_mask); -#endif /* #if defined(CONFIG_NO_HZ_FULL) */ - - if (offload_all) - cpumask_setall(rcu_nocb_mask); - if (!cpumask_subset(rcu_nocb_mask, cpu_possible_mask)) { pr_info("\tNote: kernel parameter 'rcu_nocbs=', 'nohz_full', or 'isolcpus=' contains nonexistent CPUs.\n"); cpumask_and(rcu_nocb_mask, cpu_possible_mask, From patchwork Wed Oct 19 22:51:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012466 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 48AA0C4332F for ; Wed, 19 Oct 2022 22:52:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231251AbiJSWwD (ORCPT ); Wed, 19 Oct 2022 18:52:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51268 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230452AbiJSWvx (ORCPT ); Wed, 19 Oct 2022 18:51:53 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B528C190472; Wed, 19 Oct 2022 15:51:50 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 618D9CE242E; Wed, 19 Oct 2022 22:51:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B7EA2C433D6; Wed, 19 Oct 2022 22:51:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219906; bh=cd9rgMHV5ytGRmwwPo9gYzUKZlyUq56zzJaeb9ZJrZY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=kFduOefcF3DL5P78TykY4+I41g8zwf9ipFCX1ghnGq+rZ2F7oeLDQ3t6/PGJHe5FT lhhVZ1vE522OBbGx19z+6enNorzxpeNLqUQqIB49qVbTuc4u0zKMUKcQV3o1/ebZQP IUKC2sdvWq9i9w/EgpY1/G59nv+d3Hd/RoeLr85Yreq5Dv0ELqosSmsSPMf5iakGgI a10LJ/nGnHeUBJf7EYdPQRXemru4WmQJPebsrjdXUEB+1Ua8tZibX/zkrBIBcIYA9o 5vBM4oRDM4qnD6o5tpJIRuI/AeTFdp+F0capyD+rTMI/DjP2mczY42srq6bGppUraC 30lefLU9VMigQ== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 787575C0879; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , Frederic Weisbecker , "Paul E . McKenney" Subject: [PATCH rcu 02/14] rcu: Fix late wakeup when flush of bypass cblist happens Date: Wed, 19 Oct 2022 15:51:32 -0700 Message-Id: <20221019225144.2500095-2-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: "Joel Fernandes (Google)" When the bypass cblist gets too big or its timeout has occurred, it is flushed into the main cblist. However, the bypass timer is still running and the behavior is that it would eventually expire and wake the GP thread. Since we are going to use the bypass cblist for lazy CBs, do the wakeup soon as the flush for "too big or too long" bypass list happens. Otherwise, long delays can happen for callbacks which get promoted from lazy to non-lazy. This is a good thing to do anyway (regardless of future lazy patches), since it makes the behavior consistent with behavior of other code paths where flushing into the ->cblist makes the GP kthread into a non-sleeping state quickly. [ Frederic Weisbecker: Changes to avoid unnecessary GP-thread wakeups plus comment changes. ] Reviewed-by: Frederic Weisbecker Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_nocb.h | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h index ce526cc2791ca..f77a6d7e13564 100644 --- a/kernel/rcu/tree_nocb.h +++ b/kernel/rcu/tree_nocb.h @@ -433,8 +433,9 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp, if ((ncbs && j != READ_ONCE(rdp->nocb_bypass_first)) || ncbs >= qhimark) { rcu_nocb_lock(rdp); + *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist); + if (!rcu_nocb_flush_bypass(rdp, rhp, j)) { - *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist); if (*was_alldone) trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstQ")); @@ -447,7 +448,12 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp, rcu_advance_cbs_nowake(rdp->mynode, rdp); rdp->nocb_gp_adv_time = j; } - rcu_nocb_unlock_irqrestore(rdp, flags); + + // The flush succeeded and we moved CBs into the regular list. + // Don't wait for the wake up timer as it may be too far ahead. + // Wake up the GP thread now instead, if the cblist was empty. + __call_rcu_nocb_wake(rdp, *was_alldone, flags); + return true; // Callback already enqueued. } From patchwork Wed Oct 19 22:51:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012457 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5BFCEC43217 for ; Wed, 19 Oct 2022 22:51:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230393AbiJSWv5 (ORCPT ); Wed, 19 Oct 2022 18:51:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51196 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230353AbiJSWvx (ORCPT ); Wed, 19 Oct 2022 18:51:53 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A0AEB18F0E0; Wed, 19 Oct 2022 15:51:49 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 11C4CB8260A; Wed, 19 Oct 2022 22:51:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B440EC433C1; Wed, 19 Oct 2022 22:51:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219906; bh=JgCYhsIq4xhCHA9e/N+ivpZlC63szLfECFNNQENWSoc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=BmmPHg3u79iCf77y3+BTRprS1zEoKx1IdYqgBKXj3kv7PWvc/alzqFWj7YcT9olOC GsE3TKr7s5RD0JDx5Gyd/nK2z0FHM0IO6g960wWIXKirt2Qzp4SzWVPyN9pi5JjNSR ROd8sGq9OAZb2A0N0q8V48fWZrcFhujmQk0mjtoU9pSgWcDG5jE0ZjkwNIE3im4o8i i+tKYRbtCh82xR0H7GAW30YK0omnw1aXSpnxRUOgobPAahwXsra6Bi+fSKiRW81eJl Ei+dYJTIXpPiqZqnSNAKp5gLYh7HwF6+ouEH3k6Peqxbfkd53igt/U63mMnhcvcJ6X MUrClLR0WxLag== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 7A6055C0890; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, Frederic Weisbecker , Joel Fernandes , "Paul E . McKenney" Subject: [PATCH rcu 03/14] rcu: Fix missing nocb gp wake on rcu_barrier() Date: Wed, 19 Oct 2022 15:51:33 -0700 Message-Id: <20221019225144.2500095-3-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: Frederic Weisbecker In preparation for RCU lazy changes, wake up the RCU nocb gp thread if needed after an entrain. This change prevents the RCU barrier callback from waiting in the queue for several seconds before the lazy callbacks in front of it are serviced. Reported-by: Joel Fernandes (Google) Signed-off-by: Frederic Weisbecker Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 11 +++++++++++ kernel/rcu/tree.h | 1 + kernel/rcu/tree_nocb.h | 5 +++++ 3 files changed, 17 insertions(+) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 6bb8e72bc8151..fb7a1b95af71e 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3894,6 +3894,8 @@ static void rcu_barrier_entrain(struct rcu_data *rdp) { unsigned long gseq = READ_ONCE(rcu_state.barrier_sequence); unsigned long lseq = READ_ONCE(rdp->barrier_seq_snap); + bool wake_nocb = false; + bool was_alldone = false; lockdep_assert_held(&rcu_state.barrier_lock); if (rcu_seq_state(lseq) || !rcu_seq_state(gseq) || rcu_seq_ctr(lseq) != rcu_seq_ctr(gseq)) @@ -3902,7 +3904,14 @@ static void rcu_barrier_entrain(struct rcu_data *rdp) rdp->barrier_head.func = rcu_barrier_callback; debug_rcu_head_queue(&rdp->barrier_head); rcu_nocb_lock(rdp); + /* + * Flush bypass and wakeup rcuog if we add callbacks to an empty regular + * queue. This way we don't wait for bypass timer that can reach seconds + * if it's fully lazy. + */ + was_alldone = rcu_rdp_is_offloaded(rdp) && !rcu_segcblist_pend_cbs(&rdp->cblist); WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies)); + wake_nocb = was_alldone && rcu_segcblist_pend_cbs(&rdp->cblist); if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) { atomic_inc(&rcu_state.barrier_cpu_count); } else { @@ -3910,6 +3919,8 @@ static void rcu_barrier_entrain(struct rcu_data *rdp) rcu_barrier_trace(TPS("IRQNQ"), -1, rcu_state.barrier_sequence); } rcu_nocb_unlock(rdp); + if (wake_nocb) + wake_nocb_gp(rdp, false); smp_store_release(&rdp->barrier_seq_snap, gseq); } diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index d4a97e40ea9c3..925dd98f8b23b 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -439,6 +439,7 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp); static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp); static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq); static void rcu_init_one_nocb(struct rcu_node *rnp); +static bool wake_nocb_gp(struct rcu_data *rdp, bool force); static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp, unsigned long j); static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp, diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h index f77a6d7e13564..094fd454b6c38 100644 --- a/kernel/rcu/tree_nocb.h +++ b/kernel/rcu/tree_nocb.h @@ -1558,6 +1558,11 @@ static void rcu_init_one_nocb(struct rcu_node *rnp) { } +static bool wake_nocb_gp(struct rcu_data *rdp, bool force) +{ + return false; +} + static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp, unsigned long j) { From patchwork Wed Oct 19 22:51:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012461 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA596C43217 for ; Wed, 19 Oct 2022 22:52:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230398AbiJSWwB (ORCPT ); Wed, 19 Oct 2022 18:52:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51304 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230437AbiJSWvx (ORCPT ); Wed, 19 Oct 2022 18:51:53 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C1F3F18F0E4; Wed, 19 Oct 2022 15:51:49 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 14499B82613; Wed, 19 Oct 2022 22:51:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BA5AEC433B5; Wed, 19 Oct 2022 22:51:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219906; bh=ngHOxhHcPNa1/fns45XDyAgKGnPMU0aLs17ox8uP24s=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=i2+Z3dL0N429SOE1aONKywaLHUHGWfVI7sTNYenmAnhO4lbcQbq3D0bgt3ZBPrnym xYwg1dsV1wYmrlFHc4WFKxIerFG2f1l8JUkXK7v0GDDPtCU1XHSh+7qs3kuFNzM+JL AuBBgaEaz2hpn5RtSEkEwZTQtqas6+buOB1xRbRYha255+jnfZvETsf8uvA5dc57zO zXU4eowW++eJnVtkZHjjjEaAXkK9dw+SdANZVwEa4DAeBsFIVYCN0092N9DpUqu1k2 5KgfVpaP8K68uN/A3y9MrY5Tk4SJzHUyT7lsh0A9t8WjMEZoDTCmosbIV5PqKJu330 C91F41A2TMfrg== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 7CEAF5C0920; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , Paul McKenney , Frederic Weisbecker Subject: [PATCH rcu 04/14] rcu: Make call_rcu() lazy to save power Date: Wed, 19 Oct 2022 15:51:34 -0700 Message-Id: <20221019225144.2500095-4-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: "Joel Fernandes (Google)" Implement timer-based RCU callback batching (also known as lazy callbacks). With this we save about 5-10% of power consumed due to RCU requests that happen when system is lightly loaded or idle. By default, all async callbacks (queued via call_rcu) are marked lazy. An alternate API call_rcu_flush() is provided for the few users, for example synchronize_rcu(), that need the old behavior. The batch is flushed whenever a certain amount of time has passed, or the batch on a particular CPU grows too big. Also memory pressure will flush it in a future patch. To handle several corner cases automagically (such as rcu_barrier() and hotplug), we re-use bypass lists which were originally introduced to address lock contention, to handle lazy CBs as well. The bypass list length has the lazy CB length included in it. A separate lazy CB length counter is also introduced to keep track of the number of lazy CBs. [ paulmck: Fix formatting of inline call_rcu_lazy() definition. ] Suggested-by: Paul McKenney Acked-by: Frederic Weisbecker Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- include/linux/rcupdate.h | 9 +++ kernel/rcu/Kconfig | 8 ++ kernel/rcu/rcu.h | 8 ++ kernel/rcu/tiny.c | 2 +- kernel/rcu/tree.c | 129 ++++++++++++++++++++----------- kernel/rcu/tree.h | 11 ++- kernel/rcu/tree_exp.h | 2 +- kernel/rcu/tree_nocb.h | 159 +++++++++++++++++++++++++++++++-------- 8 files changed, 246 insertions(+), 82 deletions(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 08605ce7379d7..f6288c1124425 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -108,6 +108,15 @@ static inline int rcu_preempt_depth(void) #endif /* #else #ifdef CONFIG_PREEMPT_RCU */ +#ifdef CONFIG_RCU_LAZY +void call_rcu_flush(struct rcu_head *head, rcu_callback_t func); +#else +static inline void call_rcu_flush(struct rcu_head *head, rcu_callback_t func) +{ + call_rcu(head, func); +} +#endif + /* Internal to kernel */ void rcu_init(void); extern int rcu_scheduler_active; diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig index d471d22a5e21b..d78f6181c8aad 100644 --- a/kernel/rcu/Kconfig +++ b/kernel/rcu/Kconfig @@ -311,4 +311,12 @@ config TASKS_TRACE_RCU_READ_MB Say N here if you hate read-side memory barriers. Take the default if you are unsure. +config RCU_LAZY + bool "RCU callback lazy invocation functionality" + depends on RCU_NOCB_CPU + default n + help + To save power, batch RCU callbacks and flush after delay, memory + pressure, or callback list growing too big. + endmenu # "RCU Subsystem" diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h index be5979da07f59..65704cbc9df7b 100644 --- a/kernel/rcu/rcu.h +++ b/kernel/rcu/rcu.h @@ -474,6 +474,14 @@ enum rcutorture_type { INVALID_RCU_FLAVOR }; +#if defined(CONFIG_RCU_LAZY) +unsigned long rcu_lazy_get_jiffies_till_flush(void); +void rcu_lazy_set_jiffies_till_flush(unsigned long j); +#else +static inline unsigned long rcu_lazy_get_jiffies_till_flush(void) { return 0; } +static inline void rcu_lazy_set_jiffies_till_flush(unsigned long j) { } +#endif + #if defined(CONFIG_TREE_RCU) void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags, unsigned long *gp_seq); diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c index a33a8d4942c37..810479cf17bae 100644 --- a/kernel/rcu/tiny.c +++ b/kernel/rcu/tiny.c @@ -44,7 +44,7 @@ static struct rcu_ctrlblk rcu_ctrlblk = { void rcu_barrier(void) { - wait_rcu_gp(call_rcu); + wait_rcu_gp(call_rcu_flush); } EXPORT_SYMBOL(rcu_barrier); diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index fb7a1b95af71e..6eaa020a9d289 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2728,47 +2728,8 @@ static void check_cb_ovld(struct rcu_data *rdp) raw_spin_unlock_rcu_node(rnp); } -/** - * call_rcu() - Queue an RCU callback for invocation after a grace period. - * @head: structure to be used for queueing the RCU updates. - * @func: actual callback function to be invoked after the grace period - * - * The callback function will be invoked some time after a full grace - * period elapses, in other words after all pre-existing RCU read-side - * critical sections have completed. However, the callback function - * might well execute concurrently with RCU read-side critical sections - * that started after call_rcu() was invoked. - * - * RCU read-side critical sections are delimited by rcu_read_lock() - * and rcu_read_unlock(), and may be nested. In addition, but only in - * v5.0 and later, regions of code across which interrupts, preemption, - * or softirqs have been disabled also serve as RCU read-side critical - * sections. This includes hardware interrupt handlers, softirq handlers, - * and NMI handlers. - * - * Note that all CPUs must agree that the grace period extended beyond - * all pre-existing RCU read-side critical section. On systems with more - * than one CPU, this means that when "func()" is invoked, each CPU is - * guaranteed to have executed a full memory barrier since the end of its - * last RCU read-side critical section whose beginning preceded the call - * to call_rcu(). It also means that each CPU executing an RCU read-side - * critical section that continues beyond the start of "func()" must have - * executed a memory barrier after the call_rcu() but before the beginning - * of that RCU read-side critical section. Note that these guarantees - * include CPUs that are offline, idle, or executing in user mode, as - * well as CPUs that are executing in the kernel. - * - * Furthermore, if CPU A invoked call_rcu() and CPU B invoked the - * resulting RCU callback function "func()", then both CPU A and CPU B are - * guaranteed to execute a full memory barrier during the time interval - * between the call to call_rcu() and the invocation of "func()" -- even - * if CPU A and CPU B are the same CPU (but again only if the system has - * more than one CPU). - * - * Implementation of these memory-ordering guarantees is described here: - * Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst. - */ -void call_rcu(struct rcu_head *head, rcu_callback_t func) +static void +__call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy) { static atomic_t doublefrees; unsigned long flags; @@ -2809,7 +2770,7 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func) } check_cb_ovld(rdp); - if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags)) + if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags, lazy)) return; // Enqueued onto ->nocb_bypass, so just leave. // If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock. rcu_segcblist_enqueue(&rdp->cblist, head); @@ -2831,8 +2792,84 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func) local_irq_restore(flags); } } -EXPORT_SYMBOL_GPL(call_rcu); +#ifdef CONFIG_RCU_LAZY +/** + * call_rcu_flush() - Queue RCU callback for invocation after grace period, and + * flush all lazy callbacks (including the new one) to the main ->cblist while + * doing so. + * + * @head: structure to be used for queueing the RCU updates. + * @func: actual callback function to be invoked after the grace period + * + * The callback function will be invoked some time after a full grace + * period elapses, in other words after all pre-existing RCU read-side + * critical sections have completed. + * + * Use this API instead of call_rcu() if you don't want the callback to be + * invoked after very long periods of time, which can happen on systems without + * memory pressure and on systems which are lightly loaded or mostly idle. + * This function will cause callbacks to be invoked sooner than later at the + * expense of extra power. Other than that, this function is identical to, and + * reuses call_rcu()'s logic. Refer to call_rcu() for more details about memory + * ordering and other functionality. + */ +void call_rcu_flush(struct rcu_head *head, rcu_callback_t func) +{ + return __call_rcu_common(head, func, false); +} +EXPORT_SYMBOL_GPL(call_rcu_flush); +#endif + +/** + * call_rcu() - Queue an RCU callback for invocation after a grace period. + * By default the callbacks are 'lazy' and are kept hidden from the main + * ->cblist to prevent starting of grace periods too soon. + * If you desire grace periods to start very soon, use call_rcu_flush(). + * + * @head: structure to be used for queueing the RCU updates. + * @func: actual callback function to be invoked after the grace period + * + * The callback function will be invoked some time after a full grace + * period elapses, in other words after all pre-existing RCU read-side + * critical sections have completed. However, the callback function + * might well execute concurrently with RCU read-side critical sections + * that started after call_rcu() was invoked. + * + * RCU read-side critical sections are delimited by rcu_read_lock() + * and rcu_read_unlock(), and may be nested. In addition, but only in + * v5.0 and later, regions of code across which interrupts, preemption, + * or softirqs have been disabled also serve as RCU read-side critical + * sections. This includes hardware interrupt handlers, softirq handlers, + * and NMI handlers. + * + * Note that all CPUs must agree that the grace period extended beyond + * all pre-existing RCU read-side critical section. On systems with more + * than one CPU, this means that when "func()" is invoked, each CPU is + * guaranteed to have executed a full memory barrier since the end of its + * last RCU read-side critical section whose beginning preceded the call + * to call_rcu(). It also means that each CPU executing an RCU read-side + * critical section that continues beyond the start of "func()" must have + * executed a memory barrier after the call_rcu() but before the beginning + * of that RCU read-side critical section. Note that these guarantees + * include CPUs that are offline, idle, or executing in user mode, as + * well as CPUs that are executing in the kernel. + * + * Furthermore, if CPU A invoked call_rcu() and CPU B invoked the + * resulting RCU callback function "func()", then both CPU A and CPU B are + * guaranteed to execute a full memory barrier during the time interval + * between the call to call_rcu() and the invocation of "func()" -- even + * if CPU A and CPU B are the same CPU (but again only if the system has + * more than one CPU). + * + * Implementation of these memory-ordering guarantees is described here: + * Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst. + */ +void call_rcu(struct rcu_head *head, rcu_callback_t func) +{ + return __call_rcu_common(head, func, true); +} +EXPORT_SYMBOL_GPL(call_rcu); /* Maximum number of jiffies to wait before draining a batch. */ #define KFREE_DRAIN_JIFFIES (5 * HZ) @@ -3507,7 +3544,7 @@ void synchronize_rcu(void) if (rcu_gp_is_expedited()) synchronize_rcu_expedited(); else - wait_rcu_gp(call_rcu); + wait_rcu_gp(call_rcu_flush); return; } @@ -3910,7 +3947,7 @@ static void rcu_barrier_entrain(struct rcu_data *rdp) * if it's fully lazy. */ was_alldone = rcu_rdp_is_offloaded(rdp) && !rcu_segcblist_pend_cbs(&rdp->cblist); - WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies)); + WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false)); wake_nocb = was_alldone && rcu_segcblist_pend_cbs(&rdp->cblist); if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) { atomic_inc(&rcu_state.barrier_cpu_count); @@ -4336,7 +4373,7 @@ void rcutree_migrate_callbacks(int cpu) my_rdp = this_cpu_ptr(&rcu_data); my_rnp = my_rdp->mynode; rcu_nocb_lock(my_rdp); /* irqs already disabled. */ - WARN_ON_ONCE(!rcu_nocb_flush_bypass(my_rdp, NULL, jiffies)); + WARN_ON_ONCE(!rcu_nocb_flush_bypass(my_rdp, NULL, jiffies, false)); raw_spin_lock_rcu_node(my_rnp); /* irqs already disabled. */ /* Leverage recent GPs and set GP for new callbacks. */ needwake = rcu_advance_cbs(my_rnp, rdp) || diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 925dd98f8b23b..fcb5d696eb170 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -263,14 +263,16 @@ struct rcu_data { unsigned long last_fqs_resched; /* Time of last rcu_resched(). */ unsigned long last_sched_clock; /* Jiffies of last rcu_sched_clock_irq(). */ + long lazy_len; /* Length of buffered lazy callbacks. */ int cpu; }; /* Values for nocb_defer_wakeup field in struct rcu_data. */ #define RCU_NOCB_WAKE_NOT 0 #define RCU_NOCB_WAKE_BYPASS 1 -#define RCU_NOCB_WAKE 2 -#define RCU_NOCB_WAKE_FORCE 3 +#define RCU_NOCB_WAKE_LAZY 2 +#define RCU_NOCB_WAKE 3 +#define RCU_NOCB_WAKE_FORCE 4 #define RCU_JIFFIES_TILL_FORCE_QS (1 + (HZ > 250) + (HZ > 500)) /* For jiffies_till_first_fqs and */ @@ -441,9 +443,10 @@ static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq); static void rcu_init_one_nocb(struct rcu_node *rnp); static bool wake_nocb_gp(struct rcu_data *rdp, bool force); static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp, - unsigned long j); + unsigned long j, bool lazy); static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp, - bool *was_alldone, unsigned long flags); + bool *was_alldone, unsigned long flags, + bool lazy); static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_empty, unsigned long flags); static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp, int level); diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 18e9b4cd78ef8..5cac056007982 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -937,7 +937,7 @@ void synchronize_rcu_expedited(void) /* If expedited grace periods are prohibited, fall back to normal. */ if (rcu_gp_is_normal()) { - wait_rcu_gp(call_rcu); + wait_rcu_gp(call_rcu_flush); return; } diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h index 094fd454b6c38..d6e4c076b0515 100644 --- a/kernel/rcu/tree_nocb.h +++ b/kernel/rcu/tree_nocb.h @@ -256,6 +256,31 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force) return __wake_nocb_gp(rdp_gp, rdp, force, flags); } +/* + * LAZY_FLUSH_JIFFIES decides the maximum amount of time that + * can elapse before lazy callbacks are flushed. Lazy callbacks + * could be flushed much earlier for a number of other reasons + * however, LAZY_FLUSH_JIFFIES will ensure no lazy callbacks are + * left unsubmitted to RCU after those many jiffies. + */ +#define LAZY_FLUSH_JIFFIES (10 * HZ) +static unsigned long jiffies_till_flush = LAZY_FLUSH_JIFFIES; + +#ifdef CONFIG_RCU_LAZY +// To be called only from test code. +void rcu_lazy_set_jiffies_till_flush(unsigned long jif) +{ + jiffies_till_flush = jif; +} +EXPORT_SYMBOL(rcu_lazy_set_jiffies_till_flush); + +unsigned long rcu_lazy_get_jiffies_till_flush(void) +{ + return jiffies_till_flush; +} +EXPORT_SYMBOL(rcu_lazy_get_jiffies_till_flush); +#endif + /* * Arrange to wake the GP kthread for this NOCB group at some future * time when it is safe to do so. @@ -269,10 +294,14 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype, raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags); /* - * Bypass wakeup overrides previous deferments. In case - * of callback storm, no need to wake up too early. + * Bypass wakeup overrides previous deferments. In case of + * callback storms, no need to wake up too early. */ - if (waketype == RCU_NOCB_WAKE_BYPASS) { + if (waketype == RCU_NOCB_WAKE_LAZY && + rdp->nocb_defer_wakeup == RCU_NOCB_WAKE_NOT) { + mod_timer(&rdp_gp->nocb_timer, jiffies + jiffies_till_flush); + WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype); + } else if (waketype == RCU_NOCB_WAKE_BYPASS) { mod_timer(&rdp_gp->nocb_timer, jiffies + 2); WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype); } else { @@ -293,10 +322,13 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype, * proves to be initially empty, just return false because the no-CB GP * kthread may need to be awakened in this case. * + * Return true if there was something to be flushed and it succeeded, otherwise + * false. + * * Note that this function always returns true if rhp is NULL. */ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp, - unsigned long j) + unsigned long j, bool lazy) { struct rcu_cblist rcl; @@ -310,7 +342,20 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp, /* Note: ->cblist.len already accounts for ->nocb_bypass contents. */ if (rhp) rcu_segcblist_inc_len(&rdp->cblist); /* Must precede enqueue. */ - rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp); + + /* + * If the new CB requested was a lazy one, queue it onto the main + * ->cblist so we can take advantage of a sooner grade period. + */ + if (lazy && rhp) { + rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, NULL); + rcu_cblist_enqueue(&rcl, rhp); + WRITE_ONCE(rdp->lazy_len, 0); + } else { + rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp); + WRITE_ONCE(rdp->lazy_len, 0); + } + rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl); WRITE_ONCE(rdp->nocb_bypass_first, j); rcu_nocb_bypass_unlock(rdp); @@ -326,13 +371,13 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp, * Note that this function always returns true if rhp is NULL. */ static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp, - unsigned long j) + unsigned long j, bool lazy) { if (!rcu_rdp_is_offloaded(rdp)) return true; rcu_lockdep_assert_cblist_protected(rdp); rcu_nocb_bypass_lock(rdp); - return rcu_nocb_do_flush_bypass(rdp, rhp, j); + return rcu_nocb_do_flush_bypass(rdp, rhp, j, lazy); } /* @@ -345,7 +390,7 @@ static void rcu_nocb_try_flush_bypass(struct rcu_data *rdp, unsigned long j) if (!rcu_rdp_is_offloaded(rdp) || !rcu_nocb_bypass_trylock(rdp)) return; - WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j)); + WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j, false)); } /* @@ -367,12 +412,14 @@ static void rcu_nocb_try_flush_bypass(struct rcu_data *rdp, unsigned long j) * there is only one CPU in operation. */ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp, - bool *was_alldone, unsigned long flags) + bool *was_alldone, unsigned long flags, + bool lazy) { unsigned long c; unsigned long cur_gp_seq; unsigned long j = jiffies; long ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass); + bool bypass_is_lazy = (ncbs == READ_ONCE(rdp->lazy_len)); lockdep_assert_irqs_disabled(); @@ -417,25 +464,29 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp, // If there hasn't yet been all that many ->cblist enqueues // this jiffy, tell the caller to enqueue onto ->cblist. But flush // ->nocb_bypass first. - if (rdp->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy) { + // Lazy CBs throttle this back and do immediate bypass queuing. + if (rdp->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy && !lazy) { rcu_nocb_lock(rdp); *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist); if (*was_alldone) trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstQ")); - WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j)); + + WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j, false)); WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass)); return false; // Caller must enqueue the callback. } // If ->nocb_bypass has been used too long or is too full, // flush ->nocb_bypass to ->cblist. - if ((ncbs && j != READ_ONCE(rdp->nocb_bypass_first)) || + if ((ncbs && !bypass_is_lazy && j != READ_ONCE(rdp->nocb_bypass_first)) || + (ncbs && bypass_is_lazy && + (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + jiffies_till_flush))) || ncbs >= qhimark) { rcu_nocb_lock(rdp); *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist); - if (!rcu_nocb_flush_bypass(rdp, rhp, j)) { + if (!rcu_nocb_flush_bypass(rdp, rhp, j, lazy)) { if (*was_alldone) trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstQ")); @@ -463,13 +514,24 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp, ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass); rcu_segcblist_inc_len(&rdp->cblist); /* Must precede enqueue. */ rcu_cblist_enqueue(&rdp->nocb_bypass, rhp); + + if (lazy) + WRITE_ONCE(rdp->lazy_len, rdp->lazy_len + 1); + if (!ncbs) { WRITE_ONCE(rdp->nocb_bypass_first, j); trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ")); } rcu_nocb_bypass_unlock(rdp); smp_mb(); /* Order enqueue before wake. */ - if (ncbs) { + // A wake up of the grace period kthread or timer adjustment + // needs to be done only if: + // 1. Bypass list was fully empty before (this is the first + // bypass list entry), or: + // 2. Both of these conditions are met: + // a. The bypass list previously had only lazy CBs, and: + // b. The new CB is non-lazy. + if (ncbs && (!bypass_is_lazy || lazy)) { local_irq_restore(flags); } else { // No-CBs GP kthread might be indefinitely asleep, if so, wake. @@ -497,8 +559,10 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone, unsigned long flags) __releases(rdp->nocb_lock) { + long bypass_len; unsigned long cur_gp_seq; unsigned long j; + long lazy_len; long len; struct task_struct *t; @@ -512,9 +576,16 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone, } // Need to actually to a wakeup. len = rcu_segcblist_n_cbs(&rdp->cblist); + bypass_len = rcu_cblist_n_cbs(&rdp->nocb_bypass); + lazy_len = READ_ONCE(rdp->lazy_len); if (was_alldone) { rdp->qlen_last_fqs_check = len; - if (!irqs_disabled_flags(flags)) { + // Only lazy CBs in bypass list + if (lazy_len && bypass_len == lazy_len) { + rcu_nocb_unlock_irqrestore(rdp, flags); + wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_LAZY, + TPS("WakeLazy")); + } else if (!irqs_disabled_flags(flags)) { /* ... if queue was empty ... */ rcu_nocb_unlock_irqrestore(rdp, flags); wake_nocb_gp(rdp, false); @@ -605,12 +676,12 @@ static void nocb_gp_sleep(struct rcu_data *my_rdp, int cpu) static void nocb_gp_wait(struct rcu_data *my_rdp) { bool bypass = false; - long bypass_ncbs; int __maybe_unused cpu = my_rdp->cpu; unsigned long cur_gp_seq; unsigned long flags; bool gotcbs = false; unsigned long j = jiffies; + bool lazy = false; bool needwait_gp = false; // This prevents actual uninitialized use. bool needwake; bool needwake_gp; @@ -640,24 +711,43 @@ static void nocb_gp_wait(struct rcu_data *my_rdp) * won't be ignored for long. */ list_for_each_entry(rdp, &my_rdp->nocb_head_rdp, nocb_entry_rdp) { + long bypass_ncbs; + bool flush_bypass = false; + long lazy_ncbs; + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check")); rcu_nocb_lock_irqsave(rdp, flags); lockdep_assert_held(&rdp->nocb_lock); bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass); - if (bypass_ncbs && + lazy_ncbs = READ_ONCE(rdp->lazy_len); + + if (bypass_ncbs && (lazy_ncbs == bypass_ncbs) && + (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + jiffies_till_flush) || + bypass_ncbs > 2 * qhimark)) { + flush_bypass = true; + } else if (bypass_ncbs && (lazy_ncbs != bypass_ncbs) && (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + 1) || bypass_ncbs > 2 * qhimark)) { - // Bypass full or old, so flush it. - (void)rcu_nocb_try_flush_bypass(rdp, j); - bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass); + flush_bypass = true; } else if (!bypass_ncbs && rcu_segcblist_empty(&rdp->cblist)) { rcu_nocb_unlock_irqrestore(rdp, flags); continue; /* No callbacks here, try next. */ } + + if (flush_bypass) { + // Bypass full or old, so flush it. + (void)rcu_nocb_try_flush_bypass(rdp, j); + bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass); + lazy_ncbs = READ_ONCE(rdp->lazy_len); + } + if (bypass_ncbs) { trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, - TPS("Bypass")); - bypass = true; + bypass_ncbs == lazy_ncbs ? TPS("Lazy") : TPS("Bypass")); + if (bypass_ncbs == lazy_ncbs) + lazy = true; + else + bypass = true; } rnp = rdp->mynode; @@ -705,12 +795,20 @@ static void nocb_gp_wait(struct rcu_data *my_rdp) my_rdp->nocb_gp_gp = needwait_gp; my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0; - if (bypass && !rcu_nocb_poll) { - // At least one child with non-empty ->nocb_bypass, so set - // timer in order to avoid stranding its callbacks. - wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_BYPASS, - TPS("WakeBypassIsDeferred")); + // At least one child with non-empty ->nocb_bypass, so set + // timer in order to avoid stranding its callbacks. + if (!rcu_nocb_poll) { + // If bypass list only has lazy CBs. Add a deferred lazy wake up. + if (lazy && !bypass) { + wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_LAZY, + TPS("WakeLazyIsDeferred")); + // Otherwise add a deferred bypass wake up. + } else if (bypass) { + wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_BYPASS, + TPS("WakeBypassIsDeferred")); + } } + if (rcu_nocb_poll) { /* Polling, so trace if first poll in the series. */ if (gotcbs) @@ -1036,7 +1134,7 @@ static long rcu_nocb_rdp_deoffload(void *arg) * return false, which means that future calls to rcu_nocb_try_bypass() * will refuse to put anything into the bypass. */ - WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies)); + WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false)); /* * Start with invoking rcu_core() early. This way if the current thread * happens to preempt an ongoing call to rcu_core() in the middle, @@ -1278,6 +1376,7 @@ static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp) raw_spin_lock_init(&rdp->nocb_gp_lock); timer_setup(&rdp->nocb_timer, do_nocb_deferred_wakeup_timer, 0); rcu_cblist_init(&rdp->nocb_bypass); + WRITE_ONCE(rdp->lazy_len, 0); mutex_init(&rdp->nocb_gp_kthread_mutex); } @@ -1564,13 +1663,13 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force) } static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp, - unsigned long j) + unsigned long j, bool lazy) { return true; } static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp, - bool *was_alldone, unsigned long flags) + bool *was_alldone, unsigned long flags, bool lazy) { return false; } From patchwork Wed Oct 19 22:51:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012458 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5EE7EC4332F for ; Wed, 19 Oct 2022 22:52:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229904AbiJSWwA (ORCPT ); Wed, 19 Oct 2022 18:52:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51164 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230371AbiJSWvx (ORCPT ); Wed, 19 Oct 2022 18:51:53 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E6B2518F26F; Wed, 19 Oct 2022 15:51:49 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 30935B8261B; Wed, 19 Oct 2022 22:51:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BEFD7C4347C; Wed, 19 Oct 2022 22:51:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219906; bh=U3+4ieL/lmCXWeRZRiLB8u0S5rhXMcyArw0nT7Mm3tc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=aXyfqiSSPcD8uihbcQQsOhuv8uB2FhPe5BW3m7mBtg8Njne6MornkUtJW2OjiMWKj LqyXFvjZw8GPKhvNIvuocWTS4z0aImtcd6CzrM7J8bbL5Nwvl8Ojd1et9HFJi28We1 4KXX7qt8Jwqbh29nHQ05z7eTsRLzKuaimihSOg5xwuBrlQgDG0L+etBFrn9oKA4JSR XVEDKhm27IADRU/VtEYXAf6lEKoCuHsjT/Qb+uKNV1g+C6hpDXxBj1AYSbac/GY57Q EXpWTeuMcgE2V3p8EoGJrnBshgmOPfg5kAaUEP2AJlZRtCMq+dl93qtdMDY9UvJV9b F/ePiJKHoEl8Q== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 7EC325C0A04; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , "Paul E . McKenney" Subject: [PATCH rcu 05/14] rcu: Refactor code a bit in rcu_nocb_do_flush_bypass() Date: Wed, 19 Oct 2022 15:51:35 -0700 Message-Id: <20221019225144.2500095-5-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: "Joel Fernandes (Google)" This consolidates the code a bit and makes it cleaner. Functionally it is the same. Reported-by: Paul E. McKenney Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_nocb.h | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h index d6e4c076b0515..213daf81c057f 100644 --- a/kernel/rcu/tree_nocb.h +++ b/kernel/rcu/tree_nocb.h @@ -327,10 +327,11 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype, * * Note that this function always returns true if rhp is NULL. */ -static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp, +static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_in, unsigned long j, bool lazy) { struct rcu_cblist rcl; + struct rcu_head *rhp = rhp_in; WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp)); rcu_lockdep_assert_cblist_protected(rdp); @@ -345,16 +346,16 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp, /* * If the new CB requested was a lazy one, queue it onto the main - * ->cblist so we can take advantage of a sooner grade period. + * ->cblist so that we can take advantage of the grace-period that will + * happen regardless. But queue it onto the bypass list first so that + * the lazy CB is ordered with the existing CBs in the bypass list. */ if (lazy && rhp) { - rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, NULL); - rcu_cblist_enqueue(&rcl, rhp); - WRITE_ONCE(rdp->lazy_len, 0); - } else { - rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp); - WRITE_ONCE(rdp->lazy_len, 0); + rcu_cblist_enqueue(&rdp->nocb_bypass, rhp); + rhp = NULL; } + rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp); + WRITE_ONCE(rdp->lazy_len, 0); rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl); WRITE_ONCE(rdp->nocb_bypass_first, j); From patchwork Wed Oct 19 22:51:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012464 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1FF62C43217 for ; Wed, 19 Oct 2022 22:52:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230195AbiJSWwE (ORCPT ); Wed, 19 Oct 2022 18:52:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51160 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230313AbiJSWvy (ORCPT ); Wed, 19 Oct 2022 18:51:54 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1B94F1905C2; Wed, 19 Oct 2022 15:51:50 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 37C07B82620; Wed, 19 Oct 2022 22:51:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 18ADEC4314C; Wed, 19 Oct 2022 22:51:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219907; bh=hIFNQHcIbY8YtBkmBJx5icL1buQHNTtIxJeozRTrMW8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MAbTGcy7ARkabmWioJ3weutz0OZZjs+BAUteqFV69VdQSdNqMT9FxKkjNkPkaqTwh Oru8cXB60UytRoW9394LnIayvuYlEAOTHXodoTFMC30pFaxrw9HbkmQWe4RYYdT+em dsiNokspncAC8j/16EpELJfEJL7eSmo2Pe13KcNQehD8CFPLPNvICE/CPDSAf7HK3a YiUIxk9ov1aQ/alMvF/w66EKTpR1zjGcbCGll00dDyEdW5lh0UxAr+DHfmccAQsg7x QuXbu0Di5xVKAbWESeGLUHvR0h/HuVGX1R+Wz275LVWGZu8ZLAkCG/wyZ79RwutS+m dyMmPFn8ZZOgw== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 80A4B5C0A40; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, Vineeth Pillai , Joel Fernandes , "Paul E . McKenney" Subject: [PATCH rcu 06/14] rcu: Shrinker for lazy rcu Date: Wed, 19 Oct 2022 15:51:36 -0700 Message-Id: <20221019225144.2500095-6-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: Vineeth Pillai The shrinker is used to speed up the free'ing of memory potentially held by RCU lazy callbacks. RCU kernel module test cases show this to be effective. Test is introduced in a later patch. Signed-off-by: Vineeth Pillai Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_nocb.h | 52 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h index 213daf81c057f..9e1c8caec5ceb 100644 --- a/kernel/rcu/tree_nocb.h +++ b/kernel/rcu/tree_nocb.h @@ -1312,6 +1312,55 @@ int rcu_nocb_cpu_offload(int cpu) } EXPORT_SYMBOL_GPL(rcu_nocb_cpu_offload); +static unsigned long +lazy_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc) +{ + int cpu; + unsigned long count = 0; + + /* Snapshot count of all CPUs */ + for_each_possible_cpu(cpu) { + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); + + count += READ_ONCE(rdp->lazy_len); + } + + return count ? count : SHRINK_EMPTY; +} + +static unsigned long +lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) +{ + int cpu; + unsigned long flags; + unsigned long count = 0; + + /* Snapshot count of all CPUs */ + for_each_possible_cpu(cpu) { + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); + int _count = READ_ONCE(rdp->lazy_len); + + if (_count == 0) + continue; + rcu_nocb_lock_irqsave(rdp, flags); + WRITE_ONCE(rdp->lazy_len, 0); + rcu_nocb_unlock_irqrestore(rdp, flags); + wake_nocb_gp(rdp, false); + sc->nr_to_scan -= _count; + count += _count; + if (sc->nr_to_scan <= 0) + break; + } + return count ? count : SHRINK_STOP; +} + +static struct shrinker lazy_rcu_shrinker = { + .count_objects = lazy_rcu_shrink_count, + .scan_objects = lazy_rcu_shrink_scan, + .batch = 0, + .seeks = DEFAULT_SEEKS, +}; + void __init rcu_init_nohz(void) { int cpu; @@ -1342,6 +1391,9 @@ void __init rcu_init_nohz(void) if (!rcu_state.nocb_is_setup) return; + if (register_shrinker(&lazy_rcu_shrinker, "rcu-lazy")) + pr_err("Failed to register lazy_rcu shrinker!\n"); + if (!cpumask_subset(rcu_nocb_mask, cpu_possible_mask)) { pr_info("\tNote: kernel parameter 'rcu_nocbs=', 'nohz_full', or 'isolcpus=' contains nonexistent CPUs.\n"); cpumask_and(rcu_nocb_mask, cpu_possible_mask, From patchwork Wed Oct 19 22:51:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012462 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B19F7C433FE for ; Wed, 19 Oct 2022 22:52:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231314AbiJSWv6 (ORCPT ); Wed, 19 Oct 2022 18:51:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51252 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230260AbiJSWvw (ORCPT ); Wed, 19 Oct 2022 18:51:52 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B30A118E722; Wed, 19 Oct 2022 15:51:48 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2F91D619F1; Wed, 19 Oct 2022 22:51:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 187AAC4314B; Wed, 19 Oct 2022 22:51:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219907; bh=1i6/DFTHykaCBDuDcsNQFPp5wn3kjszlN1gPZX2MEbg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=J35e/a+uDANICwfUg2NLAu3g/6j2DIc8ElKM91V5CeQzauXmLD+DxGh/QZ4pk/znX dxU2NpYPT0CiO1Jwk+eEkwyE4MAMIgiYH1Ksjj99bkxq+YncFx4yPRS3u4H+t94m9p QdJ6XETHgOZyIi3WuhGkSkfrDfW8mUpEFHxFPMKap7RS61dWraR1CcHKnvUTgGncDG BvvdpSCwWi6GRzlcQnrx9o5PM6MrvFhtoYg0abYK+tHfvaioaaGx52cYxX3AEUENh6 5obt3Jk0TgNXuJaY5eaDIs2VYC2/1tWg0gYWn7OBFWFtsr8o7/graOKipaP/HVOcse 80aZ2gDXDhd8g== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 827985C0AC5; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , "Paul E . McKenney" Subject: [PATCH rcu 07/14] rcuscale: Add laziness and kfree tests Date: Wed, 19 Oct 2022 15:51:37 -0700 Message-Id: <20221019225144.2500095-7-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: "Joel Fernandes (Google)" This commit adds 2 tests to rcuscale. The first one is a startup test to check whether we are not too lazy or too hard working. The second one causes kfree_rcu() itself to use call_rcu() and checks memory pressure. Testing indicates that the new call_rcu() keeps memory pressure under control roughly as well as does kfree_rcu(). Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- kernel/rcu/rcuscale.c | 68 +++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 66 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c index 3ef02d4a81085..bbdcac1804ec8 100644 --- a/kernel/rcu/rcuscale.c +++ b/kernel/rcu/rcuscale.c @@ -95,6 +95,7 @@ torture_param(int, verbose, 1, "Enable verbose debugging printk()s"); torture_param(int, writer_holdoff, 0, "Holdoff (us) between GPs, zero to disable"); torture_param(int, kfree_rcu_test, 0, "Do we run a kfree_rcu() scale test?"); torture_param(int, kfree_mult, 1, "Multiple of kfree_obj size to allocate."); +torture_param(int, kfree_by_call_rcu, 0, "Use call_rcu() to emulate kfree_rcu()?"); static char *scale_type = "rcu"; module_param(scale_type, charp, 0444); @@ -659,6 +660,14 @@ struct kfree_obj { struct rcu_head rh; }; +/* Used if doing RCU-kfree'ing via call_rcu(). */ +static void kfree_call_rcu(struct rcu_head *rh) +{ + struct kfree_obj *obj = container_of(rh, struct kfree_obj, rh); + + kfree(obj); +} + static int kfree_scale_thread(void *arg) { @@ -696,6 +705,11 @@ kfree_scale_thread(void *arg) if (!alloc_ptr) return -ENOMEM; + if (kfree_by_call_rcu) { + call_rcu(&(alloc_ptr->rh), kfree_call_rcu); + continue; + } + // By default kfree_rcu_test_single and kfree_rcu_test_double are // initialized to false. If both have the same value (false or true) // both are randomly tested, otherwise only the one with value true @@ -767,11 +781,59 @@ kfree_scale_shutdown(void *arg) return -EINVAL; } +// Used if doing RCU-kfree'ing via call_rcu(). +static unsigned long jiffies_at_lazy_cb; +static struct rcu_head lazy_test1_rh; +static int rcu_lazy_test1_cb_called; +static void call_rcu_lazy_test1(struct rcu_head *rh) +{ + jiffies_at_lazy_cb = jiffies; + WRITE_ONCE(rcu_lazy_test1_cb_called, 1); +} + static int __init kfree_scale_init(void) { - long i; int firsterr = 0; + long i; + unsigned long jif_start; + unsigned long orig_jif; + + // Also, do a quick self-test to ensure laziness is as much as + // expected. + if (kfree_by_call_rcu && !IS_ENABLED(CONFIG_RCU_LAZY)) { + pr_alert("CONFIG_RCU_LAZY is disabled, falling back to kfree_rcu() " + "for delayed RCU kfree'ing\n"); + kfree_by_call_rcu = 0; + } + + if (kfree_by_call_rcu) { + /* do a test to check the timeout. */ + orig_jif = rcu_lazy_get_jiffies_till_flush(); + + rcu_lazy_set_jiffies_till_flush(2 * HZ); + rcu_barrier(); + + jif_start = jiffies; + jiffies_at_lazy_cb = 0; + call_rcu(&lazy_test1_rh, call_rcu_lazy_test1); + + smp_cond_load_relaxed(&rcu_lazy_test1_cb_called, VAL == 1); + + rcu_lazy_set_jiffies_till_flush(orig_jif); + + if (WARN_ON_ONCE(jiffies_at_lazy_cb - jif_start < 2 * HZ)) { + pr_alert("ERROR: call_rcu() CBs are not being lazy as expected!\n"); + WARN_ON_ONCE(1); + return -1; + } + + if (WARN_ON_ONCE(jiffies_at_lazy_cb - jif_start > 3 * HZ)) { + pr_alert("ERROR: call_rcu() CBs are being too lazy!\n"); + WARN_ON_ONCE(1); + return -1; + } + } kfree_nrealthreads = compute_real(kfree_nthreads); /* Start up the kthreads. */ @@ -784,7 +846,9 @@ kfree_scale_init(void) schedule_timeout_uninterruptible(1); } - pr_alert("kfree object size=%zu\n", kfree_mult * sizeof(struct kfree_obj)); + pr_alert("kfree object size=%zu, kfree_by_call_rcu=%d\n", + kfree_mult * sizeof(struct kfree_obj), + kfree_by_call_rcu); kfree_reader_tasks = kcalloc(kfree_nrealthreads, sizeof(kfree_reader_tasks[0]), GFP_KERNEL); From patchwork Wed Oct 19 22:51:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012455 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 49A28C433FE for ; Wed, 19 Oct 2022 22:51:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230323AbiJSWvy (ORCPT ); Wed, 19 Oct 2022 18:51:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51210 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230287AbiJSWvu (ORCPT ); Wed, 19 Oct 2022 18:51:50 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B142218E706; Wed, 19 Oct 2022 15:51:48 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 03065619DA; Wed, 19 Oct 2022 22:51:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1870AC43147; Wed, 19 Oct 2022 22:51:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219907; bh=wIrUnsZvsb76w0mrU4ByciwxrnTRBZTUwIx2ZrTZP6w=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=DplQmogbLIlmDKw7nrrkuSAqXJXVs56ONOkM2rhX3C0kvLWKXYoh/417+tAnrpZEH /XuVKaH1KLnf6KdOFX2VmVHU1OPwbqZJCUsalOicFa/EQfbFMh8v5UCoAdMzeCxLB/ IusxzzT3D5rqNr95x3L8lhNofkrHcbYbX19fPfMGLU6vnB6hspmzVYOQLX+0TvgiIv OihWC3yew5drgwauQAe4qmjLQ7JdAXWeAfPm2HOu21tWurXsguQLGSqEm53bOZvrlR 8GTuYwQmOpX2OfLBnKqcqwJHtCXXMYGDFHTPXKQVq1kMk25UIle0wLQ/PrWeIhE/5e z62Ggq6tADgvg== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 844165C0B8F; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , "Paul E . McKenney" Subject: [PATCH rcu 08/14] percpu-refcount: Use call_rcu_flush() for atomic switch Date: Wed, 19 Oct 2022 15:51:38 -0700 Message-Id: <20221019225144.2500095-8-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: "Joel Fernandes (Google)" call_rcu() changes to save power will slow down the percpu refcounter's "per-CPU to atomic switch" path. The primitive uses RCU when switching to atomic mode. The enqueued async callback wakes up waiters waiting in the percpu_ref_switch_waitq. Due to this, per-CPU refcount users will slow down, such as blk_pre_runtime_suspend(). Use the call_rcu_flush() API instead which reverts to the old behavior. Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- lib/percpu-refcount.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c index e5c5315da2741..65c58a029297d 100644 --- a/lib/percpu-refcount.c +++ b/lib/percpu-refcount.c @@ -230,7 +230,8 @@ static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref, percpu_ref_noop_confirm_switch; percpu_ref_get(ref); /* put after confirmation */ - call_rcu(&ref->data->rcu, percpu_ref_switch_to_atomic_rcu); + call_rcu_flush(&ref->data->rcu, + percpu_ref_switch_to_atomic_rcu); } static void __percpu_ref_switch_to_percpu(struct percpu_ref *ref) From patchwork Wed Oct 19 22:51:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012460 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8F9EAC43219 for ; Wed, 19 Oct 2022 22:52:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230356AbiJSWwC (ORCPT ); Wed, 19 Oct 2022 18:52:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51314 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230467AbiJSWvx (ORCPT ); Wed, 19 Oct 2022 18:51:53 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1B6BF190477; Wed, 19 Oct 2022 15:51:50 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id ADC9DB824C1; Wed, 19 Oct 2022 22:51:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1877CC4314A; Wed, 19 Oct 2022 22:51:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219907; bh=uyDGWyzfPaiQ/+Iwibul18/2tA5ta5E2a9H+CSNE/yA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=FOIF4YT+jBvQCcjeX7q5P1woLdrAJJIJMkH/cVsGSyyZwzbRa3M4Cp3SUpwH1c/cd YzuPiSfWWB+c3rbYEIP6mZ15C58H0GHc2rkupVlCC/i9TojgYm1epLbEM6olCoMdH+ pVdGyxu1y73xPhJ1G+TE9NTs0WJvimU3uTnJB1qMAdDygUb1t3K/jlTq+Z02QxwegQ 9rv1YLiBVL+7WUbosqD/JsMq8be8XghLcaOIzRSlsAWtRO5YNTkfBff2hihW6nEdk4 g7JAbuGEsmC6T48BPyC2dZ7vUEBw/R6wA0dycUCiHf8x1W3Fo+n3cbUInLm0yDPKSp icbz2T/6l7wOA== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 860145C0BE8; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , "Paul E . McKenney" Subject: [PATCH rcu 09/14] rcu/sync: Use call_rcu_flush() instead of call_rcu Date: Wed, 19 Oct 2022 15:51:39 -0700 Message-Id: <20221019225144.2500095-9-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: "Joel Fernandes (Google)" call_rcu() changes to save power will slow down rcu sync. Use the call_rcu_flush() API instead which reverts to the old behavior. Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- kernel/rcu/sync.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c index 5cefc702158fe..bdce3b5d7f714 100644 --- a/kernel/rcu/sync.c +++ b/kernel/rcu/sync.c @@ -44,7 +44,7 @@ static void rcu_sync_func(struct rcu_head *rhp); static void rcu_sync_call(struct rcu_sync *rsp) { - call_rcu(&rsp->cb_head, rcu_sync_func); + call_rcu_flush(&rsp->cb_head, rcu_sync_func); } /** From patchwork Wed Oct 19 22:51:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012465 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 99EB5C433FE for ; Wed, 19 Oct 2022 22:52:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231382AbiJSWwF (ORCPT ); Wed, 19 Oct 2022 18:52:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51206 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231128AbiJSWv5 (ORCPT ); Wed, 19 Oct 2022 18:51:57 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DF9BF18DAA7; Wed, 19 Oct 2022 15:51:53 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 673DCB82624; Wed, 19 Oct 2022 22:51:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 25E2BC4314D; Wed, 19 Oct 2022 22:51:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219907; bh=n20MIqYs0+kWqGRzQ0GMSKtx306RkdYYsYOWA4bkwmg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=cYcIi9YT6YwtY8k9Whsejvj1ntn5NUuJrR1GY7zsIEczTN8qC7CK9urL79tCh+hFL gNaFECgsjyApupJdT/fvJgSf87WwgZ2vRtPR6DQBmn/8BqENMMpgKZ5bG/GoNARzmN jrDUxONIRlmVNlOhcF0gj/+Zsh/rBiOH6wqqs9h6fsJYc0i58UptoqDr/235lyyAbn vsI5g1q6HGqHFtXBAzGzTPcw2kPK5Jz0adC6fazXn+Y6PYYx9sT3VEGll3hv1W5zhn vSKi5A9mLPlGVy3ShgPJGj/JUK2gchfQjqx4QXGCc8Qd6uHlH/relI51B5XtA94THa 7aN0vGyPc8H4w== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 87D815C0D2B; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , "Paul E . McKenney" Subject: [PATCH rcu 10/14] rcu/rcuscale: Use call_rcu_flush() for async reader test Date: Wed, 19 Oct 2022 15:51:40 -0700 Message-Id: <20221019225144.2500095-10-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: "Joel Fernandes (Google)" rcuscale uses call_rcu() to queue async readers. With recent changes to save power, the test will have fewer async readers in flight. Use the call_rcu_flush() API instead to revert to the old behavior. Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- kernel/rcu/rcuscale.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c index bbdcac1804ec8..0385e9b123998 100644 --- a/kernel/rcu/rcuscale.c +++ b/kernel/rcu/rcuscale.c @@ -176,7 +176,7 @@ static struct rcu_scale_ops rcu_ops = { .get_gp_seq = rcu_get_gp_seq, .gp_diff = rcu_seq_diff, .exp_completed = rcu_exp_batches_completed, - .async = call_rcu, + .async = call_rcu_flush, .gp_barrier = rcu_barrier, .sync = synchronize_rcu, .exp_sync = synchronize_rcu_expedited, From patchwork Wed Oct 19 22:51:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012456 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32567C4332F for ; Wed, 19 Oct 2022 22:51:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231255AbiJSWv4 (ORCPT ); Wed, 19 Oct 2022 18:51:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51160 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230348AbiJSWvx (ORCPT ); Wed, 19 Oct 2022 18:51:53 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EAE2818E738; Wed, 19 Oct 2022 15:51:48 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 5C4C8619EC; Wed, 19 Oct 2022 22:51:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 207B4C43149; Wed, 19 Oct 2022 22:51:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219907; bh=wpKgIqAJ7kcPSavM/RzKNoKYI274ueWKeTYgl/A5jf8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XnFcmUZ2KYoW9kGNBzloi+MxX7n5i8F/9xf0y6fxNKVhySs9d/a/YOOs9lNi/H1Y5 WCif+YrOEtegJM+39/G9d8vhepALCQ44WHzOzlcZaCty4nrU/9EFhvVmMh25Pr5yld Um9mIGAVQe6GaTRgnpbQZSZQU2LV6Prj4ZXEyBf/qvolgrnXLHDqDK1w4ItaLH5cGW OvRtaBm81QyxZUszFv9oAHw62IO7ikd3dUXVZE45TqsggZ5xtZR7omER3n4OW5hTcZ pAvFHX+Ln38I9JRH7M9hXMXzddaBoZqIWNQUD2kJUu+z5Jb/47kEE8TPXB8B2kN2o9 irsVvQdlc6xmQ== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 899405C0DA6; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , "Paul E . McKenney" Subject: [PATCH rcu 11/14] rcu/rcutorture: Use call_rcu_flush() where needed Date: Wed, 19 Oct 2022 15:51:41 -0700 Message-Id: <20221019225144.2500095-11-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: "Joel Fernandes (Google)" call_rcu() changes to save power will change the behavior of rcutorture tests. Use the call_rcu_flush() API instead which reverts to the old behavior. Reported-by: Paul E. McKenney Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 503c2aa845a4a..c8ddb4b635b77 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -510,7 +510,7 @@ static unsigned long rcu_no_completed(void) static void rcu_torture_deferred_free(struct rcu_torture *p) { - call_rcu(&p->rtort_rcu, rcu_torture_cb); + call_rcu_flush(&p->rtort_rcu, rcu_torture_cb); } static void rcu_sync_torture_init(void) @@ -551,7 +551,7 @@ static struct rcu_torture_ops rcu_ops = { .start_gp_poll_exp_full = start_poll_synchronize_rcu_expedited_full, .poll_gp_state_exp = poll_state_synchronize_rcu, .cond_sync_exp = cond_synchronize_rcu_expedited, - .call = call_rcu, + .call = call_rcu_flush, .cb_barrier = rcu_barrier, .fqs = rcu_force_quiescent_state, .stats = NULL, @@ -848,7 +848,7 @@ static void rcu_tasks_torture_deferred_free(struct rcu_torture *p) static void synchronize_rcu_mult_test(void) { - synchronize_rcu_mult(call_rcu_tasks, call_rcu); + synchronize_rcu_mult(call_rcu_tasks, call_rcu_flush); } static struct rcu_torture_ops tasks_ops = { @@ -3388,13 +3388,13 @@ static void rcu_test_debug_objects(void) /* Try to queue the rh2 pair of callbacks for the same grace period. */ preempt_disable(); /* Prevent preemption from interrupting test. */ rcu_read_lock(); /* Make it impossible to finish a grace period. */ - call_rcu(&rh1, rcu_torture_leak_cb); /* Start grace period. */ + call_rcu_flush(&rh1, rcu_torture_leak_cb); /* Start grace period. */ local_irq_disable(); /* Make it harder to start a new grace period. */ - call_rcu(&rh2, rcu_torture_leak_cb); - call_rcu(&rh2, rcu_torture_err_cb); /* Duplicate callback. */ + call_rcu_flush(&rh2, rcu_torture_leak_cb); + call_rcu_flush(&rh2, rcu_torture_err_cb); /* Duplicate callback. */ if (rhp) { - call_rcu(rhp, rcu_torture_leak_cb); - call_rcu(rhp, rcu_torture_err_cb); /* Another duplicate callback. */ + call_rcu_flush(rhp, rcu_torture_leak_cb); + call_rcu_flush(rhp, rcu_torture_err_cb); /* Another duplicate callback. */ } local_irq_enable(); rcu_read_unlock(); From patchwork Wed Oct 19 22:51:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012467 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E007CC43217 for ; Wed, 19 Oct 2022 22:52:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231132AbiJSWwG (ORCPT ); Wed, 19 Oct 2022 18:52:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51376 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231252AbiJSWv4 (ORCPT ); Wed, 19 Oct 2022 18:51:56 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BC0B018DD58; Wed, 19 Oct 2022 15:51:53 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 82A9EB82625; Wed, 19 Oct 2022 22:51:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2E5DEC4314E; Wed, 19 Oct 2022 22:51:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219907; bh=mhcpLJZcHbiLjOBEVgPlVc5Zs6jp1PTb1ytPNog7Qy4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=tT1EJTN+S1m0Ws51ZJ2CUpp/35nOFOnx0Nrqc0pVhtve35sQ5QKg5oABH2+dZYKLZ RyNb2mkA1cqKxeibNUnzFKG1fEFWiia+sKd08hawKOFpC8IWO8ZfhHCpXZNL/K6O7u 2VPp3Eq6w6wzJQjfdHYniExubZP/g/wD6l6Ym1wnPRHgxKP02qp87QnouD6cXwVZPH l6zKFEZAtItER+MAkWi1Wi98W6AfmUv904rH+o3o47GOR2VMVXkr5cbCjjUMMZMIJJ 4W3x5LdBiUiryhYxg4UVe4P5+ld3HFaLo9ptC5Of3F5OAO8Z7r9+pu0E1gkUW1ODv5 uHmTOcjkhDSDA== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 8B5795C0E1E; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, Uladzislau Rezki , Joel Fernandes , "Paul E . McKenney" Subject: [PATCH rcu 12/14] scsi/scsi_error: Use call_rcu_flush() instead of call_rcu() Date: Wed, 19 Oct 2022 15:51:42 -0700 Message-Id: <20221019225144.2500095-12-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: Uladzislau Rezki Slow boot time is seen on KVM running typical Linux distributions due to SCSI layer calling call_rcu(). Recent changes to save power may be causing this slowness. Using call_rcu_flush() fixes the issue and brings the boot time back to what it originally was. Convert it. Tested-by: Joel Fernandes (Google) Signed-off-by: Uladzislau Rezki Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- drivers/scsi/scsi_error.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 6995c89792300..634672e67c81f 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -312,7 +312,7 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd) * Ensure that all tasks observe the host state change before the * host_failed change. */ - call_rcu(&scmd->rcu, scsi_eh_inc_host_failed); + call_rcu_flush(&scmd->rcu, scsi_eh_inc_host_failed); } /** From patchwork Wed Oct 19 22:51:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012459 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 13DD6C4321E for ; Wed, 19 Oct 2022 22:52:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231181AbiJSWwA (ORCPT ); Wed, 19 Oct 2022 18:52:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51272 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230387AbiJSWvx (ORCPT ); Wed, 19 Oct 2022 18:51:53 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E8ADD17C156; Wed, 19 Oct 2022 15:51:49 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CC78E619EE; Wed, 19 Oct 2022 22:51:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2E67BC43150; Wed, 19 Oct 2022 22:51:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219907; bh=MQ9LcMOdsSRTz38i4kDvf32ku4ItL7jI+4EvtYFoPC0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=UAl/iSQsuk1+2M6aDOUrrdjnjX8vuEJOxRzokcPebrX5Up7justHJnoXtapDkXiir H08aHzwHoHP7hWGyOpklSAJhMwDvlPg4eJI2BtOj1fQ03u27rQVehlmN6QMsyhd27t F03ExYx0PhDcUp//Auh8E/Xn2//+YcgvzH5YFVrMJ5FVgdcnZ1HrECn0AX3Uw7dNdS uUTzQDtaldoqtV7/RUFeGOBGi4v7t4EYHeyuRAETDRrDmvp5sX1wGTG3Zxl9wMs8ZK 8vk5sqJoRvXkOoyp7Tt6sFvwNnqMNGEuCPxtpW7TzwGXSi8pSeFv57FpHBmibahNCN 3ZupmHrjfTdww== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 8D1785C0E57; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, Uladzislau Rezki , Joel Fernandes , "Paul E . McKenney" Subject: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush() Date: Wed, 19 Oct 2022 15:51:43 -0700 Message-Id: <20221019225144.2500095-13-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: Uladzislau Rezki call_rcu() changes to save power will slow down RCU workqueue items queued via queue_rcu_work(). This may not be an issue, however we cannot assume that workqueue users are OK with long delays. Use call_rcu_flush() API instead which reverts to the old behavior. Signed-off-by: Uladzislau Rezki Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney Signed-off-by: Joel Fernandes --- kernel/workqueue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 7cd5f5e7e0a1b..b4b0e828b529e 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -1771,7 +1771,7 @@ bool queue_rcu_work(struct workqueue_struct *wq, struct rcu_work *rwork) if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) { rwork->wq = wq; - call_rcu(&rwork->rcu, rcu_work_rcufn); + call_rcu_flush(&rwork->rcu, rcu_work_rcufn); return true; } From patchwork Wed Oct 19 22:51:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13012463 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC851C4167D for ; Wed, 19 Oct 2022 22:52:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231372AbiJSWwE (ORCPT ); Wed, 19 Oct 2022 18:52:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51210 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230266AbiJSWvx (ORCPT ); Wed, 19 Oct 2022 18:51:53 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1485918F27F; Wed, 19 Oct 2022 15:51:50 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E64C6619E8; Wed, 19 Oct 2022 22:51:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 388EAC43156; Wed, 19 Oct 2022 22:51:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666219907; bh=RL61tOwimeBcU4/Vfql47BzA0BOTvlir0vUZt/Njig0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Cz96/20x4CieacmsCHfXGXf22cmpZZpQLoV+8usewU9hRJspL4ooU4Pxu+9K2EXFG mThTNKGqTjG0TsUKUUP6pEm9Cxl30fEpyAc2cGXKnIWr9glas7ieDnYS7IQXpa46wJ awcwKw4A3BWg1cHTtQRUSF5K4E6ltIQchbyCjLYldvtCQ2lzmUP2eC79z0MSn5Js6B xpxY2fV7hzdeYwOirjrx7tgVucLlQ3std5DSee86C1q8BAAoW9PVnISsiefYWcGcQZ Y2sXBgwD57W+E6szViTyIj6EfdmMOcaBG3SmJ4dr92rbvolD2jNTKCDwxVHbwzu6Az 0T7v05UiCo7JA== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 8ED6C5C0F19; Wed, 19 Oct 2022 15:51:46 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , "Paul E . McKenney" Subject: [PATCH rcu 14/14] rxrpc: Use call_rcu_flush() instead of call_rcu() Date: Wed, 19 Oct 2022 15:51:44 -0700 Message-Id: <20221019225144.2500095-14-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> References: <20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: "Joel Fernandes (Google)" call_rcu() changes to save power may cause slowness. Use the call_rcu_flush() API instead which reverts to the old behavior. We find this via inspection that the RCU callback does a wakeup of a thread. This usually indicates that something is waiting on it. To be safe, let us use call_rcu_flush() here instead. Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- net/rxrpc/conn_object.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c index 22089e37e97f0..fdcfb509cc443 100644 --- a/net/rxrpc/conn_object.c +++ b/net/rxrpc/conn_object.c @@ -253,7 +253,7 @@ void rxrpc_kill_connection(struct rxrpc_connection *conn) * must carry a ref on the connection to prevent us getting here whilst * it is queued or running. */ - call_rcu(&conn->rcu, rxrpc_destroy_connection); + call_rcu_flush(&conn->rcu, rxrpc_destroy_connection); } /*