From patchwork Wed Mar 29 16:02:00 2023
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13192871
From: Frederic Weisbecker
To: "Paul E. McKenney"
Cc: LKML, Frederic Weisbecker, rcu, Uladzislau Rezki, Neeraj Upadhyay, Boqun Feng, Joel Fernandes
Subject: [PATCH 1/4] rcu/nocb: Protect lazy shrinker against concurrent (de-)offloading
Date: Wed, 29 Mar 2023 18:02:00 +0200
Message-Id: <20230329160203.191380-2-frederic@kernel.org>
In-Reply-To: <20230329160203.191380-1-frederic@kernel.org>
References: <20230329160203.191380-1-frederic@kernel.org>
X-Mailing-List: rcu@vger.kernel.org

The shrinker may run concurrently with callbacks (de-)offloading. As such,
calling rcu_nocb_lock() is very dangerous because it performs conditional
locking. The worst outcome is that rcu_nocb_lock() doesn't lock while
rcu_nocb_unlock() eventually unlocks, or the reverse, creating an imbalance.

Fix this by protecting against (de-)offloading with the barrier mutex. If
the barrier mutex is contended, which should be rare, simply step aside so
as not to trigger a mutex-vs-allocation dependency chain.

Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree_nocb.h | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index f2280616f9d5..1a86883902ce 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1336,13 +1336,33 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 	unsigned long flags;
 	unsigned long count = 0;
 
+	/*
+	 * Protect against concurrent (de-)offloading. Otherwise nocb locking
+	 * may be ignored or imbalanced.
+	 */
+	if (!mutex_trylock(&rcu_state.barrier_mutex)) {
+		/*
+		 * But really don't insist if barrier_mutex is contended since we
+		 * can't guarantee that it will never engage in a dependency
+		 * chain involving memory allocation. The lock is seldom contended
+		 * anyway.
+		 */
+		return 0;
+	}
+
 	/* Snapshot count of all CPUs */
 	for_each_possible_cpu(cpu) {
 		struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
-		int _count = READ_ONCE(rdp->lazy_len);
+		int _count;
+
+		if (!rcu_rdp_is_offloaded(rdp))
+			continue;
+
+		_count = READ_ONCE(rdp->lazy_len);
 
 		if (_count == 0)
 			continue;
+
 		rcu_nocb_lock_irqsave(rdp, flags);
 		WRITE_ONCE(rdp->lazy_len, 0);
 		rcu_nocb_unlock_irqrestore(rdp, flags);
@@ -1352,6 +1372,9 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 		if (sc->nr_to_scan <= 0)
 			break;
 	}
+
+	mutex_unlock(&rcu_state.barrier_mutex);
+
 	return count ? count : SHRINK_STOP;
 }
 
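[Illustration, not part of the patch: the "trylock and step aside" policy above can be sketched in plain userspace C. The pthread mutex and the scan_all()/state_mutex names are made-up stand-ins for rcu_state.barrier_mutex and the shrinker scan; this is only a sketch of the pattern, not the kernel code.]

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t state_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Scan shared per-CPU state; return 0 immediately if the lock is contended. */
static unsigned long scan_all(void)
{
	unsigned long count = 0;

	if (pthread_mutex_trylock(&state_mutex) != 0)
		return 0;	/* Contended: step aside instead of blocking. */

	/* ... walk state that must not be reconfigured underneath us ... */
	count = 1;		/* placeholder work */

	pthread_mutex_unlock(&state_mutex);
	return count;
}

int main(void)
{
	printf("scanned %lu item(s)\n", scan_all());
	return 0;
}
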
From patchwork Wed Mar 29 16:02:01 2023
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13192874
From: Frederic Weisbecker
To: "Paul E. McKenney"
Cc: LKML, Frederic Weisbecker, rcu, Uladzislau Rezki, Neeraj Upadhyay, Boqun Feng, Joel Fernandes
Subject: [PATCH 2/4] rcu/nocb: Fix shrinker race against callback enqueuer
Date: Wed, 29 Mar 2023 18:02:01 +0200
Message-Id: <20230329160203.191380-3-frederic@kernel.org>
In-Reply-To: <20230329160203.191380-1-frederic@kernel.org>
References: <20230329160203.191380-1-frederic@kernel.org>
X-Mailing-List: rcu@vger.kernel.org

The shrinker resets the lazy callbacks counter in order to trigger the
pending lazy queue flush through the rcuog kthread. The counter reset is
protected by the ->nocb_lock against concurrent accesses... except for one
of them. Here is the list of the existing synchronized readers/writer:

1) The first lazy enqueuer (incrementing ->lazy_len to 1) does so under
   ->nocb_lock and ->nocb_bypass_lock.

2) The further lazy enqueuers (incrementing ->lazy_len above 1) do so
   under ->nocb_bypass_lock _only_.

3) The lazy flush checks and resets to 0 under ->nocb_lock and
   ->nocb_bypass_lock.

The shrinker protects its ->lazy_len reset against cases 1) and 3) but not
against 2). As such, setting ->lazy_len to 0 under the ->nocb_lock may be
cancelled right away by an overwrite from an enqueuer, leading rcuog to
ignore the flush.

To avoid that, use the proper bypass flush API, which takes care of all
those details.

Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree_nocb.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 1a86883902ce..c321fce2af8e 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1364,7 +1364,7 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 			continue;
 
 		rcu_nocb_lock_irqsave(rdp, flags);
-		WRITE_ONCE(rdp->lazy_len, 0);
+		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
 		rcu_nocb_unlock_irqrestore(rdp, flags);
 		wake_nocb_gp(rdp, false);
 		sc->nr_to_scan -= _count;
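[Illustration, not part of the patch: the lost-reset scenario described above boils down to a read-modify-write and a reset that are serialized by two different locks, so the two critical sections never exclude each other. A minimal userspace sketch with made-up names; the two pthread mutexes only stand in for ->nocb_bypass_lock and ->nocb_lock.]

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static pthread_mutex_t bypass_lock = PTHREAD_MUTEX_INITIALIZER; /* ~ ->nocb_bypass_lock */
static pthread_mutex_t nocb_lock = PTHREAD_MUTEX_INITIALIZER;   /* ~ ->nocb_lock */
static atomic_ulong lazy_len;	/* ~ rdp->lazy_len, accessed READ_ONCE/WRITE_ONCE style */

static void *enqueuer(void *arg)
{
	pthread_mutex_lock(&bypass_lock);
	/* Non-atomic read-modify-write, like WRITE_ONCE(len, len + 1). */
	atomic_store_explicit(&lazy_len,
			      atomic_load_explicit(&lazy_len, memory_order_relaxed) + 1,
			      memory_order_relaxed);
	pthread_mutex_unlock(&bypass_lock);
	return NULL;
}

static void *shrinker(void *arg)
{
	pthread_mutex_lock(&nocb_lock);
	/* Reset under a different lock: the RMW above can overwrite it. */
	atomic_store_explicit(&lazy_len, 0, memory_order_relaxed);
	pthread_mutex_unlock(&nocb_lock);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, enqueuer, NULL);
	pthread_create(&t2, NULL, shrinker, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	/* Either 0 or 1 is possible: the reset is not guaranteed to stick. */
	printf("lazy_len = %lu\n", atomic_load(&lazy_len));
	return 0;
}
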
McKenney" Cc: LKML , Frederic Weisbecker , rcu , Uladzislau Rezki , Neeraj Upadhyay , Boqun Feng , Joel Fernandes Subject: [PATCH 2/4] rcu/nocb: Fix shrinker race against callback enqueuer Date: Wed, 29 Mar 2023 18:02:01 +0200 Message-Id: <20230329160203.191380-3-frederic@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230329160203.191380-1-frederic@kernel.org> References: <20230329160203.191380-1-frederic@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org The shrinker resets the lazy callbacks counter in order to trigger the pending lazy queue flush though the rcuog kthread. The counter reset is protected by the ->nocb_lock against concurrent accesses...except for one of them. Here is a list of existing synchronized readers/writer: 1) The first lazy enqueuer (incrementing ->lazy_len to 1) does so under ->nocb_lock and ->nocb_bypass_lock. 2) The further lazy enqueuers (incrementing ->lazy_len above 1) do so under ->nocb_bypass_lock _only_. 3) The lazy flush checks and resets to 0 under ->nocb_lock and ->nocb_bypass_lock. The shrinker protects its ->lazy_len reset against cases 1) and 3) but not against 2). As such, setting ->lazy_len to 0 under the ->nocb_lock may be cancelled right away by an overwrite from an enqueuer, leading rcuog to ignore the flush. To avoid that, use the proper bypass flush API which takes care of all those details. Signed-off-by: Frederic Weisbecker --- kernel/rcu/tree_nocb.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h index 1a86883902ce..c321fce2af8e 100644 --- a/kernel/rcu/tree_nocb.h +++ b/kernel/rcu/tree_nocb.h @@ -1364,7 +1364,7 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) continue; rcu_nocb_lock_irqsave(rdp, flags); - WRITE_ONCE(rdp->lazy_len, 0); + WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false)); rcu_nocb_unlock_irqrestore(rdp, flags); wake_nocb_gp(rdp, false); sc->nr_to_scan -= _count; From patchwork Wed Mar 29 16:02:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 13192872 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 868C2C74A5B for ; Wed, 29 Mar 2023 16:05:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231199AbjC2QEz (ORCPT ); Wed, 29 Mar 2023 12:04:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60208 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231582AbjC2QEU (ORCPT ); Wed, 29 Mar 2023 12:04:20 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AF252558E; Wed, 29 Mar 2023 09:03:10 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3F88461DAA; Wed, 29 Mar 2023 16:02:16 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DFC75C4339C; Wed, 29 Mar 2023 16:02:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680105735; 
diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index c321fce2af8e..dfa9c10d6727 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1358,12 +1358,20 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 		if (!rcu_rdp_is_offloaded(rdp))
 			continue;
 
+		if (!READ_ONCE(rdp->lazy_len))
+			continue;
+
+		rcu_nocb_lock_irqsave(rdp, flags);
+		/*
+		 * Recheck under the nocb lock. Since we are not holding the bypass
+		 * lock we may still race with increments from the enqueuer but still
+		 * we know for sure if there is at least one lazy callback.
+		 */
 		_count = READ_ONCE(rdp->lazy_len);
-
-		if (_count == 0)
+		if (!_count) {
+			rcu_nocb_unlock_irqrestore(rdp, flags);
 			continue;
-
-		rcu_nocb_lock_irqsave(rdp, flags);
+		}
 		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
 		rcu_nocb_unlock_irqrestore(rdp, flags);
 		wake_nocb_gp(rdp, false);
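[Illustration, not part of the patch: the change follows the usual check/lock/recheck pattern, where a cheap lockless test filters out the empty case and the test is repeated once the lock is held, before doing the expensive work. A minimal userspace sketch with made-up names standing in for the lazy queue and ->nocb_lock.]

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct lazy_queue {
	pthread_mutex_t lock;
	atomic_ulong len;
};

static bool flush_if_pending(struct lazy_queue *q)
{
	/* Lockless fast path: skip the lock entirely when there is no work. */
	if (atomic_load_explicit(&q->len, memory_order_relaxed) == 0)
		return false;

	pthread_mutex_lock(&q->lock);
	/* Recheck: the queue may have been flushed while we were unlocked. */
	if (atomic_load_explicit(&q->len, memory_order_relaxed) == 0) {
		pthread_mutex_unlock(&q->lock);
		return false;
	}
	/* ... expensive flush and wakeup work would go here ... */
	atomic_store_explicit(&q->len, 0, memory_order_relaxed);
	pthread_mutex_unlock(&q->lock);
	return true;
}

int main(void)
{
	static struct lazy_queue q = { .lock = PTHREAD_MUTEX_INITIALIZER };

	atomic_store(&q.len, 3);
	printf("flushed: %d\n", flush_if_pending(&q));
	return 0;
}
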
From patchwork Wed Mar 29 16:02:03 2023
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13192873
From: Frederic Weisbecker
To: "Paul E. McKenney"
Cc: LKML, Frederic Weisbecker, rcu, Uladzislau Rezki, Neeraj Upadhyay, Boqun Feng, Joel Fernandes
Subject: [PATCH 4/4] rcu/nocb: Make shrinker iterate only over NOCB CPUs
Date: Wed, 29 Mar 2023 18:02:03 +0200
Message-Id: <20230329160203.191380-5-frederic@kernel.org>
In-Reply-To: <20230329160203.191380-1-frederic@kernel.org>
References: <20230329160203.191380-1-frederic@kernel.org>
X-Mailing-List: rcu@vger.kernel.org

Callbacks can only be queued as lazy on NOCB CPUs, therefore iterating over
the NOCB mask is enough for both counting and scanning. Also take the mostly
uncontended barrier mutex on the counting side, in order to keep
rcu_nocb_mask stable.
Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree_nocb.h | 17 ++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index dfa9c10d6727..43229d2b0c44 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1319,13 +1319,22 @@ lazy_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 	int cpu;
 	unsigned long count = 0;
 
+	if (WARN_ON_ONCE(!cpumask_available(rcu_nocb_mask)))
+		return 0;
+
+	/* Protect rcu_nocb_mask against concurrent (de-)offloading. */
+	if (!mutex_trylock(&rcu_state.barrier_mutex))
+		return 0;
+
 	/* Snapshot count of all CPUs */
-	for_each_possible_cpu(cpu) {
+	for_each_cpu(cpu, rcu_nocb_mask) {
 		struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 
 		count += READ_ONCE(rdp->lazy_len);
 	}
 
+	mutex_unlock(&rcu_state.barrier_mutex);
+
 	return count ? count : SHRINK_EMPTY;
 }
 
@@ -1336,6 +1345,8 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 	unsigned long flags;
 	unsigned long count = 0;
 
+	if (WARN_ON_ONCE(!cpumask_available(rcu_nocb_mask)))
+		return 0;
 	/*
 	 * Protect against concurrent (de-)offloading. Otherwise nocb locking
 	 * may be ignored or imbalanced.
@@ -1351,11 +1362,11 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 	}
 
 	/* Snapshot count of all CPUs */
-	for_each_possible_cpu(cpu) {
+	for_each_cpu(cpu, rcu_nocb_mask) {
 		struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 		int _count;
 
-		if (!rcu_rdp_is_offloaded(rdp))
+		if (WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp)))
 			continue;
 
 		if (!READ_ONCE(rdp->lazy_len))
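
[Illustration, not part of the patch: iterating only over a CPU mask instead of every possible CPU is just a walk over the set bits. A minimal userspace sketch with a made-up 64-bit mask standing in for rcu_nocb_mask and a callback standing in for the per-CPU scan.]

#include <stdint.h>
#include <stdio.h>

#define NR_CPUS 64

/* Visit only CPUs whose bit is set in @mask, like for_each_cpu(cpu, rcu_nocb_mask). */
static void for_each_masked_cpu(uint64_t mask, void (*fn)(int cpu))
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (mask & (UINT64_C(1) << cpu))
			fn(cpu);
	}
}

static void visit(int cpu)
{
	printf("scanning cpu %d\n", cpu);
}

int main(void)
{
	/* CPUs 2, 3 and 7 are "nocb" in this example. */
	uint64_t nocb_mask = (UINT64_C(1) << 2) | (UINT64_C(1) << 3) | (UINT64_C(1) << 7);

	for_each_masked_cpu(nocb_mask, visit);
	return 0;
}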