From patchwork Tue Aug 13 22:56:40 2024
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13762647
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
    "Paul E. McKenney", Uladzislau Rezki, Zqiang, rcu, Cheng-Jui Wang
Subject: [PATCH 1/3] rcu/nocb: Fix RT throttling hrtimer armed from offline CPU
Date: Wed, 14 Aug 2024 00:56:40 +0200
Message-ID: <20240813225642.12604-2-frederic@kernel.org>
In-Reply-To: <20240813225642.12604-1-frederic@kernel.org>
References: <20240813225642.12604-1-frederic@kernel.org>

After a CPU is marked offline, and until it reaches its final trip to
idle, rcuo has several opportunities to be woken up, either because a
callback has been queued in the meantime or because
rcutree_report_cpu_dead() has issued the final deferred NOCB wake-up.

If RCU boosting is enabled, RCU kthreads are set to the SCHED_FIFO
policy, and if RT bandwidth is enabled, the related hrtimer might be
armed. However this then happens after hrtimers have been migrated away
at the CPUHP_AP_HRTIMERS_DYING stage, which is broken, as reported by
the following warning:

 Call trace:
  enqueue_hrtimer+0x7c/0xf8
  hrtimer_start_range_ns+0x2b8/0x300
  enqueue_task_rt+0x298/0x3f0
  enqueue_task+0x94/0x188
  ttwu_do_activate+0xb4/0x27c
  try_to_wake_up+0x2d8/0x79c
  wake_up_process+0x18/0x28
  __wake_nocb_gp+0x80/0x1a0
  do_nocb_deferred_wakeup_common+0x3c/0xcc
  rcu_report_dead+0x68/0x1ac
  cpuhp_report_idle_dead+0x48/0x9c
  do_idle+0x288/0x294
  cpu_startup_entry+0x34/0x3c
  secondary_start_kernel+0x138/0x158

Fix this by waking rcuo up with an IPI if necessary.
Since the existing API to deal with this situation only handles swait
queues, rcuo is only woken up from offline CPUs if it's not already
waiting on a grace period. In the worst case some callbacks will just
wait for a grace period to complete before being assigned to a
subsequent one.

Reported-by: Cheng-Jui Wang (王正睿)
Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree_nocb.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 1e92164116ef..08608fe1792c 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -216,7 +216,10 @@ static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
 	raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 	if (needwake) {
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DoWake"));
-		wake_up_process(rdp_gp->nocb_gp_kthread);
+		if (cpu_is_offline(raw_smp_processor_id()))
+			swake_up_one_online(&rdp_gp->nocb_gp_wq);
+		else
+			wake_up_process(rdp_gp->nocb_gp_kthread);
 	}

 	return needwake;

From patchwork Tue Aug 13 22:56:41 2024
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13762648
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
    "Paul E. McKenney", Uladzislau Rezki, Zqiang, rcu, Cheng-Jui Wang
Subject: [PATCH 2/3] rcu/nocb: Conditionally wake up rcuo if not already waiting on GP
Date: Wed, 14 Aug 2024 00:56:41 +0200
Message-ID: <20240813225642.12604-3-frederic@kernel.org>
In-Reply-To: <20240813225642.12604-1-frederic@kernel.org>
References: <20240813225642.12604-1-frederic@kernel.org>

A callback enqueuer currently wakes up the rcuo kthread if it is adding
the first non-done callback of a CPU, whether the kthread is waiting on
a grace period or not (unless the CPU is offline). This looks like a
desired behaviour, because then the rcuo kthread doesn't wait for the
end of the current grace period to handle the callback: it is
accelerated right away and assigned to the next grace period. The GP
kthread is notified about that fact and iterates with the upcoming GP
without sleeping in-between.

However this best-case scenario is contradicted by a few details,
depending on the situation:

1) If the callback is a non-bypass one queued with IRQs enabled, the
   wake up only occurs if no other pending callbacks are on the list.
   Therefore the theoretical "optimization" actually applies on rare
   occasions.

2) If the callback is a non-bypass one queued with IRQs disabled, the
   situation is similar, with even more uncertainty due to the deferred
   wake up.

3) If the callback is lazy, a few jiffies don't make any difference.

4) If the callback is bypass, the wake up timer is programmed 2 jiffies
   ahead by rcuo in case the regular pending queue has been handled in
   the meantime. The rare storm of callbacks can otherwise wait for the
   currently elapsing grace period to be flushed and handled.

For all those reasons, the optimization is only theoretical and
occasional.
Therefore it is reasonable that callback enqueuers only wake up the rcuo
kthread when it is not already waiting on a grace period to complete.

Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree_nocb.h | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 08608fe1792c..0c9eca1cc76e 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -216,10 +216,7 @@ static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
 	raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 	if (needwake) {
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DoWake"));
-		if (cpu_is_offline(raw_smp_processor_id()))
-			swake_up_one_online(&rdp_gp->nocb_gp_wq);
-		else
-			wake_up_process(rdp_gp->nocb_gp_kthread);
+		swake_up_one_online(&rdp_gp->nocb_gp_wq);
 	}

 	return needwake;

From patchwork Tue Aug 13 22:56:42 2024
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13762649
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
    "Paul E. McKenney", Uladzislau Rezki, Zqiang, rcu
Subject: [PATCH 3/3] rcu/nocb: Remove superfluous memory barrier after bypass enqueue
Date: Wed, 14 Aug 2024 00:56:42 +0200
Message-ID: <20240813225642.12604-4-frederic@kernel.org>
In-Reply-To: <20240813225642.12604-1-frederic@kernel.org>
References: <20240813225642.12604-1-frederic@kernel.org>

Pre-GP accesses performed by the update side must be ordered against
post-GP accesses performed by the readers.
This is ensured by the bypass or nocb locking at enqueue time, followed
by the fully ordered rnp locking initiated while callbacks are
accelerated, and then propagated throughout the whole GP lifecycle
associated with the callbacks.

Therefore the explicit barrier advertising ordering between bypass
enqueue and rcuo wakeup is superfluous. If anything, it would even only
order the first bypass callback enqueue against the rcuo wakeup and
ignore all the subsequent ones.

Remove the needless barrier.

Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree_nocb.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 0c9eca1cc76e..755ada098035 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -493,7 +493,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ"));
 		}
 		rcu_nocb_bypass_unlock(rdp);
-		smp_mb(); /* Order enqueue before wake. */
+
 		// A wake up of the grace period kthread or timer adjustment
 		// needs to be done only if:
 		// 1. Bypass list was fully empty before (this is the first