From patchwork Wed May 25 22:10:52 2022
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 12861742
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Tejun Heo, Peter Zijlstra, "Paul E. McKenney", Paul Gortmaker, Johannes Weiner, Marcelo Tosatti, Phil Auld, Zefan Li, Waiman Long, Daniel Bristot de Oliveira, Nicolas Saenz Julienne, rcu@vger.kernel.org
Subject: [PATCH 1/4] rcu/nocb: Pass a cpumask instead of a single CPU to offload/deoffload
Date: Thu, 26 May 2022 00:10:52 +0200
Message-Id: <20220525221055.1152307-2-frederic@kernel.org>
In-Reply-To: <20220525221055.1152307-1-frederic@kernel.org>
References: <20220525221055.1152307-1-frederic@kernel.org>

Currently the interface to toggle the RCU callback offloading state takes a single CPU per call. Driving RCU NOCB through cpusets, however, requires changing the offloading state of a whole set of CPUs at once. To make this easier, extend the (de-)offloading interface to take a cpumask.

Signed-off-by: Frederic Weisbecker
Cc: Zefan Li
Cc: Tejun Heo
Cc: Johannes Weiner
Cc: Paul E. McKenney
Cc: Phil Auld
Cc: Nicolas Saenz Julienne
Cc: Marcelo Tosatti
Cc: Paul Gortmaker
Cc: Waiman Long
Cc: Daniel Bristot de Oliveira
Cc: Peter Zijlstra
---
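For illustration, a minimal sketch of how a kernel-side caller might drive the new interface; the caller function and the chosen CPUs are hypothetical, not part of this patch:

#include <linux/cpumask.h>
#include <linux/gfp.h>
#include <linux/rcupdate.h>

/* Hypothetical caller: offload RCU callbacks for CPUs 2 and 3. */
static int example_offload_two_cpus(void)
{
	cpumask_var_t mask;
	int err;

	if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
		return -ENOMEM;

	cpumask_set_cpu(2, mask);
	cpumask_set_cpu(3, mask);

	/* true == offload; on failure the whole update is rolled back. */
	err = rcu_nocb_cpumask_update(mask, true);

	free_cpumask_var(mask);
	return err;
}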
 include/linux/rcupdate.h |   9 ++--
 kernel/rcu/rcutorture.c  |   4 +-
 kernel/rcu/tree_nocb.h   | 102 ++++++++++++++++++++++++++-------------
 3 files changed, 76 insertions(+), 39 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index f9f75a3cfeb8..dc8bb7cc893a 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -114,13 +114,14 @@ static inline void rcu_user_exit(void) { }

 #ifdef CONFIG_RCU_NOCB_CPU
 void rcu_init_nohz(void);
-int rcu_nocb_cpu_offload(int cpu);
-int rcu_nocb_cpu_deoffload(int cpu);
+int rcu_nocb_cpumask_update(struct cpumask *cpumask, bool offload);
 void rcu_nocb_flush_deferred_wakeup(void);
 #else /* #ifdef CONFIG_RCU_NOCB_CPU */
 static inline void rcu_init_nohz(void) { }
-static inline int rcu_nocb_cpu_offload(int cpu) { return -EINVAL; }
-static inline int rcu_nocb_cpu_deoffload(int cpu) { return 0; }
+static inline int rcu_nocb_cpumask_update(struct cpumask *cpumask, bool offload)
+{
+	return -EINVAL;
+}
 static inline void rcu_nocb_flush_deferred_wakeup(void) { }
 #endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index faf6b4c7a757..f912ff4869b3 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -1887,10 +1887,10 @@ static int rcu_nocb_toggle(void *arg)
 		r = torture_random(&rand);
 		cpu = (r >> 4) % (maxcpu + 1);
 		if (r & 0x1) {
-			rcu_nocb_cpu_offload(cpu);
+			rcu_nocb_cpumask_update(cpumask_of(cpu), true);
 			atomic_long_inc(&n_nocb_offload);
 		} else {
-			rcu_nocb_cpu_deoffload(cpu);
+			rcu_nocb_cpumask_update(cpumask_of(cpu), false);
 			atomic_long_inc(&n_nocb_deoffload);
 		}
 		toggle_delay = torture_random(&rand) % toggle_fuzz + toggle_interval;

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index fa8e4f82e60c..428571ad11e3 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1084,29 +1084,23 @@ static long rcu_nocb_rdp_deoffload(void *arg)
 	return 0;
 }

-int rcu_nocb_cpu_deoffload(int cpu)
+static int rcu_nocb_cpu_deoffload(int cpu)
 {
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 	int ret = 0;

-	cpus_read_lock();
-	mutex_lock(&rcu_state.barrier_mutex);
-	if (rcu_rdp_is_offloaded(rdp)) {
-		if (cpu_online(cpu)) {
-			ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp);
-			if (!ret)
-				cpumask_clear_cpu(cpu, rcu_nocb_mask);
-		} else {
-			pr_info("NOCB: Can't CB-deoffload an offline CPU\n");
-			ret = -EINVAL;
-		}
-	}
-	mutex_unlock(&rcu_state.barrier_mutex);
-	cpus_read_unlock();
+	if (cpu_is_offline(cpu))
+		return -EINVAL;
+
+	if (!rcu_rdp_is_offloaded(rdp))
+		return 0;
+
+	ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp);
+	if (!ret)
+		cpumask_clear_cpu(cpu, rcu_nocb_mask);

 	return ret;
 }
-EXPORT_SYMBOL_GPL(rcu_nocb_cpu_deoffload);

 static long rcu_nocb_rdp_offload(void *arg)
 {
@@ -1117,12 +1111,6 @@ static long rcu_nocb_rdp_offload(void *arg)
 	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;

 	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
-	/*
-	 * For now we only support re-offload, ie: the rdp must have been
-	 * offloaded on boot first.
-	 */
-	if (!rdp->nocb_gp_rdp)
-		return -EINVAL;

 	if (WARN_ON_ONCE(!rdp_gp->nocb_gp_kthread))
 		return -EINVAL;
@@ -1169,29 +1157,77 @@ static long rcu_nocb_rdp_offload(void *arg)
 	return 0;
 }

-int rcu_nocb_cpu_offload(int cpu)
+static int rcu_nocb_cpu_offload(int cpu)
 {
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
-	int ret = 0;
+	int ret;
+
+	if (cpu_is_offline(cpu))
+		return -EINVAL;
+
+	if (rcu_rdp_is_offloaded(rdp))
+		return 0;
+
+	ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp);
+	if (!ret)
+		cpumask_set_cpu(cpu, rcu_nocb_mask);
+
+	return ret;
+}
+
+int rcu_nocb_cpumask_update(struct cpumask *cpumask, bool offload)
+{
+	int cpu;
+	int err = 0;
+	int err_cpu;
+	cpumask_var_t saved_nocb_mask;
+
+	if (!alloc_cpumask_var(&saved_nocb_mask, GFP_KERNEL))
+		return -ENOMEM;
+
+	cpumask_copy(saved_nocb_mask, rcu_nocb_mask);

 	cpus_read_lock();
 	mutex_lock(&rcu_state.barrier_mutex);
-	if (!rcu_rdp_is_offloaded(rdp)) {
-		if (cpu_online(cpu)) {
-			ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp);
-			if (!ret)
-				cpumask_set_cpu(cpu, rcu_nocb_mask);
+	for_each_cpu(cpu, cpumask) {
+		if (offload) {
+			err = rcu_nocb_cpu_offload(cpu);
+			if (err < 0) {
+				err_cpu = cpu;
+				pr_err("NOCB: offload cpu %d failed (%d)\n", cpu, err);
+				break;
+			}
 		} else {
-			pr_info("NOCB: Can't CB-offload an offline CPU\n");
-			ret = -EINVAL;
+			err = rcu_nocb_cpu_deoffload(cpu);
+			if (err < 0) {
+				err_cpu = cpu;
+				pr_err("NOCB: deoffload cpu %d failed (%d)\n", cpu, err);
+				break;
+			}
 		}
 	}
+
+	/* Rollback in case of error */
+	if (err < 0) {
+		err_cpu = cpu;
+		for_each_cpu(cpu, cpumask) {
+			if (err_cpu == cpu)
+				break;
+			if (cpumask_test_cpu(cpu, saved_nocb_mask))
+				WARN_ON_ONCE(rcu_nocb_cpu_offload(cpu));
+			else
+				WARN_ON_ONCE(rcu_nocb_cpu_deoffload(cpu));
+		}
+	}
+
 	mutex_unlock(&rcu_state.barrier_mutex);
 	cpus_read_unlock();

-	return ret;
+	free_cpumask_var(saved_nocb_mask);
+
+	return err;
 }
-EXPORT_SYMBOL_GPL(rcu_nocb_cpu_offload);
+EXPORT_SYMBOL_GPL(rcu_nocb_cpumask_update);

 void __init rcu_init_nohz(void)
 {

From patchwork Wed May 25 22:10:53 2022
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 12861743
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Tejun Heo, Peter Zijlstra, "Paul E. McKenney", Paul Gortmaker, Johannes Weiner, Marcelo Tosatti, Phil Auld, Zefan Li, Waiman Long, Daniel Bristot de Oliveira, Nicolas Saenz Julienne, rcu@vger.kernel.org
Subject: [PATCH 2/4] rcu/nocb: Prepare to change nocb cpumask from CPU-hotplug protected cpuset caller
Date: Thu, 26 May 2022 00:10:53 +0200
Message-Id: <20220525221055.1152307-3-frederic@kernel.org>
In-Reply-To: <20220525221055.1152307-1-frederic@kernel.org>
References: <20220525221055.1152307-1-frederic@kernel.org>

cpusets is going to use the NOCB (de-)offloading interface while already holding the hotplug lock. Therefore, move the responsibility of protecting against concurrent CPU-hotplug changes out to the callers of rcu_nocb_cpumask_update().

Signed-off-by: Frederic Weisbecker
Cc: Zefan Li
Cc: Tejun Heo
Cc: Johannes Weiner
Cc: Paul E. McKenney
Cc: Phil Auld
Cc: Nicolas Saenz Julienne
Cc: Marcelo Tosatti
Cc: Paul Gortmaker
Cc: Waiman Long
Cc: Daniel Bristot de Oliveira
Cc: Peter Zijlstra
---
 kernel/rcu/rcutorture.c | 2 ++
 kernel/rcu/tree_nocb.h  | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index f912ff4869b3..5a3029550e83 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -1886,6 +1886,7 @@ static int rcu_nocb_toggle(void *arg)
 	do {
 		r = torture_random(&rand);
 		cpu = (r >> 4) % (maxcpu + 1);
+		cpus_read_lock();
 		if (r & 0x1) {
 			rcu_nocb_cpumask_update(cpumask_of(cpu), true);
 			atomic_long_inc(&n_nocb_offload);
@@ -1893,6 +1894,7 @@ static int rcu_nocb_toggle(void *arg)
 			rcu_nocb_cpumask_update(cpumask_of(cpu), false);
 			atomic_long_inc(&n_nocb_deoffload);
 		}
+		cpus_read_unlock();
 		toggle_delay = torture_random(&rand) % toggle_fuzz + toggle_interval;
 		set_current_state(TASK_INTERRUPTIBLE);
 		schedule_hrtimeout(&toggle_delay, HRTIMER_MODE_REL);

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 428571ad11e3..6396af6c765a 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1182,12 +1182,13 @@ int rcu_nocb_cpumask_update(struct cpumask *cpumask, bool offload)
 	int err_cpu;
 	cpumask_var_t saved_nocb_mask;

+	lockdep_assert_cpus_held();
+
 	if (!alloc_cpumask_var(&saved_nocb_mask, GFP_KERNEL))
 		return -ENOMEM;

 	cpumask_copy(saved_nocb_mask, rcu_nocb_mask);

-	cpus_read_lock();
 	mutex_lock(&rcu_state.barrier_mutex);
 	for_each_cpu(cpu, cpumask) {
 		if (offload) {
@@ -1221,7 +1222,6 @@ int rcu_nocb_cpumask_update(struct cpumask *cpumask, bool offload)
 	}

 	mutex_unlock(&rcu_state.barrier_mutex);
-	cpus_read_unlock();

 	free_cpumask_var(saved_nocb_mask);
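With this change, every caller must wrap the update in cpus_read_lock()/cpus_read_unlock(), as rcutorture now does; rcu_nocb_cpumask_update() asserts the lock via lockdep_assert_cpus_held(). A minimal sketch of the resulting calling pattern (the wrapper function is hypothetical, not part of the series):

#include <linux/cpu.h>
#include <linux/rcupdate.h>

/* Hypothetical wrapper: de-offload @mask under the hotplug read lock. */
static int example_deoffload(struct cpumask *mask)
{
	int err;

	cpus_read_lock();
	err = rcu_nocb_cpumask_update(mask, false);	/* false == de-offload */
	cpus_read_unlock();

	return err;
}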
From patchwork Wed May 25 22:10:54 2022
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 12861744
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Tejun Heo, Peter Zijlstra, "Paul E. McKenney", Paul Gortmaker, Johannes Weiner, Marcelo Tosatti, Phil Auld, Zefan Li, Waiman Long, Daniel Bristot de Oliveira, Nicolas Saenz Julienne, rcu@vger.kernel.org
Subject: [PATCH 3/4] sched/isolation: Infrastructure to support rcu nocb cpumask changes
Date: Thu, 26 May 2022 00:10:54 +0200
Message-Id: <20220525221055.1152307-4-frederic@kernel.org>
In-Reply-To: <20220525221055.1152307-1-frederic@kernel.org>
References: <20220525221055.1152307-1-frederic@kernel.org>

Provide a minimal infrastructure to change the housekeeping cpumasks. For now, only the RCU NOCB cpumask is handled.

Signed-off-by: Frederic Weisbecker
Cc: Zefan Li
Cc: Tejun Heo
Cc: Johannes Weiner
Cc: Paul E. McKenney
Cc: Phil Auld
Cc: Nicolas Saenz Julienne
Cc: Marcelo Tosatti
Cc: Paul Gortmaker
Cc: Waiman Long
Cc: Daniel Bristot de Oliveira
Cc: Peter Zijlstra
---
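A minimal sketch of how the new helpers are meant to be driven; the caller function is hypothetical, and HK_TYPE_RCU is the only type wired up so far (anything else returns -EINVAL):

#include <linux/cpumask.h>
#include <linux/sched/isolation.h>

/*
 * Hypothetical caller: enable RCU callback offloading for the CPUs in
 * @mask (recording them in the HK_TYPE_RCU housekeeping mask), then
 * undo it. The hotplug read lock is assumed to be held, as required
 * by the previous patch.
 */
static int example_toggle(struct cpumask *mask)
{
	int err;

	err = housekeeping_cpumask_set(mask, HK_TYPE_RCU);
	if (err < 0)
		return err;

	return housekeeping_cpumask_clear(mask, HK_TYPE_RCU);
}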
 include/linux/sched/isolation.h | 13 +++++++++++
 kernel/sched/isolation.c        | 38 +++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index 8c15abd67aed..c6d0e3f83a20 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -25,6 +25,8 @@ extern const struct cpumask *housekeeping_cpumask(enum hk_type type);
 extern bool housekeeping_enabled(enum hk_type type);
 extern void housekeeping_affine(struct task_struct *t, enum hk_type type);
 extern bool housekeeping_test_cpu(int cpu, enum hk_type type);
+extern int housekeeping_cpumask_set(struct cpumask *cpumask, enum hk_type type);
+extern int housekeeping_cpumask_clear(struct cpumask *cpumask, enum hk_type type);
 extern void __init housekeeping_init(void);

 #else
@@ -46,6 +48,17 @@ static inline bool housekeeping_enabled(enum hk_type type)

 static inline void housekeeping_affine(struct task_struct *t, enum hk_type type) { }
+
+static inline int housekeeping_cpumask_set(struct cpumask *cpumask, enum hk_type type)
+{
+	return -EINVAL;
+}
+
+static inline int housekeeping_cpumask_clear(struct cpumask *cpumask, enum hk_type type)
+{
+	return -EINVAL;
+}
+
 static inline void housekeeping_init(void) { }
 #endif /* CONFIG_CPU_ISOLATION */

diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 373d42c707bc..ab4aba795c01 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -79,6 +79,44 @@ bool housekeeping_test_cpu(int cpu, enum hk_type type)
 }
 EXPORT_SYMBOL_GPL(housekeeping_test_cpu);

+static int housekeeping_cpumask_update(struct cpumask *cpumask,
+				       enum hk_type type, bool on)
+{
+	int err;
+
+	switch (type) {
+	case HK_TYPE_RCU:
+		err = rcu_nocb_cpumask_update(cpumask, on);
+		break;
+	default:
+		err = -EINVAL;
+	}
+
+	if (err >= 0) {
+		if (on) {
+			cpumask_or(housekeeping.cpumasks[type],
+				   housekeeping.cpumasks[type],
+				   cpumask);
+		} else {
+			cpumask_andnot(housekeeping.cpumasks[type],
+				       housekeeping.cpumasks[type],
+				       cpumask);
+		}
+	}
+
+	return err;
+}
+
+int housekeeping_cpumask_set(struct cpumask *cpumask, enum hk_type type)
+{
+	return housekeeping_cpumask_update(cpumask, type, true);
+}
+
+int housekeeping_cpumask_clear(struct cpumask *cpumask, enum hk_type type)
+{
+	return housekeeping_cpumask_update(cpumask, type, false);
+}
+
 void __init housekeeping_init(void)
 {
 	enum hk_type type;

From patchwork Wed May 25 22:10:55 2022
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 12861745
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Tejun Heo, Peter Zijlstra, "Paul E. McKenney", Paul Gortmaker, Johannes Weiner, Marcelo Tosatti, Phil Auld, Zefan Li, Waiman Long, Daniel Bristot de Oliveira, Nicolas Saenz Julienne, rcu@vger.kernel.org
Subject: [RFC PATCH 4/4] cpuset: Support RCU-NOCB toggle on v2 root partitions
Date: Thu, 26 May 2022 00:10:55 +0200
Message-Id: <20220525221055.1152307-5-frederic@kernel.org>
In-Reply-To: <20220525221055.1152307-1-frederic@kernel.org>
References: <20220525221055.1152307-1-frederic@kernel.org>

Introduce a new "isolation.rcu_nocb" file within a cgroup2/cpuset directory that allows RCU callback offloading (aka RCU NOCB) to be either enabled ("1") or disabled ("0") for a set of CPUs. This can override settings made at boot via the "rcu_nocbs=" kernel parameter.

The file is only writable on "root" type partitions, to exclude any overlap. The deepest root-type partition has the highest priority. This means that given the following setting:

	Top cpuset (CPUs: 0-7)
		cpuset.isolation.rcu_nocb = 0
			|
			|
	Subdirectory A (CPUs: 5-7)
		cpuset.cpus.partition = root
		cpuset.isolation.rcu_nocb = 0
			|
			|
	Subdirectory B (CPUs: 7)
		cpuset.cpus.partition = root
		cpuset.isolation.rcu_nocb = 1

the result is that only CPU 7 is in rcu_nocb mode.

Note that the "rcu_nocbs" kernel parameter must still be passed on boot, even without a cpulist, so that NOCB support is enabled.

Signed-off-by: Frederic Weisbecker
Cc: Zefan Li
Cc: Tejun Heo
Cc: Johannes Weiner
Cc: Paul E. McKenney
Cc: Phil Auld
Cc: Nicolas Saenz Julienne
Cc: Marcelo Tosatti
Cc: Paul Gortmaker
Cc: Waiman Long
Cc: Daniel Bristot de Oliveira
Cc: Peter Zijlstra
---
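For illustration, subdirectory B from the example above could be configured from user space roughly as follows. This is a sketch only: the cgroup2 mount point, the cgroup name, and the prior enabling of the cpuset controller are all assumptions, and error handling is omitted:

#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write a string to a cgroup control file, ignoring errors. */
static void write_str(const char *path, const char *val)
{
	int fd = open(path, O_WRONLY);

	if (fd >= 0) {
		(void)write(fd, val, strlen(val));
		close(fd);
	}
}

int main(void)
{
	/* Assumes cgroup2 is mounted at /sys/fs/cgroup. */
	mkdir("/sys/fs/cgroup/B", 0755);
	write_str("/sys/fs/cgroup/B/cpuset.cpus", "7");
	write_str("/sys/fs/cgroup/B/cpuset.cpus.partition", "root");
	/* "1" enables RCU NOCB on the partition's CPUs. */
	write_str("/sys/fs/cgroup/B/cpuset.isolation.rcu_nocb", "1");
	return 0;
}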
 kernel/cgroup/cpuset.c | 95 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 92 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 9390bfd9f1cd..2d9f019bb590 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -225,6 +225,7 @@ typedef enum {
 	CS_SCHED_LOAD_BALANCE,
 	CS_SPREAD_PAGE,
 	CS_SPREAD_SLAB,
+	CS_RCU_NOCB,
 } cpuset_flagbits_t;

 /* convenient tests for these bits */
@@ -268,6 +269,11 @@ static inline int is_spread_slab(const struct cpuset *cs)
 	return test_bit(CS_SPREAD_SLAB, &cs->flags);
 }

+static inline int is_rcu_nocb(const struct cpuset *cs)
+{
+	return test_bit(CS_RCU_NOCB, &cs->flags);
+}
+
 static inline int is_partition_root(const struct cpuset *cs)
 {
 	return cs->partition_root_state > 0;
@@ -590,6 +596,62 @@ static inline void free_cpuset(struct cpuset *cs)
 	kfree(cs);
 }

+#ifdef CONFIG_RCU_NOCB_CPU
+static int cpuset_rcu_nocb_apply(struct cpuset *root)
+{
+	int err;
+
+	if (is_rcu_nocb(root))
+		err = housekeeping_cpumask_set(root->effective_cpus, HK_TYPE_RCU);
+	else
+		err = housekeeping_cpumask_clear(root->effective_cpus, HK_TYPE_RCU);
+
+	return err;
+}
+
+static int cpuset_rcu_nocb_update(struct cpuset *cur, struct cpuset *trialcs)
+{
+	struct cgroup_subsys_state *des_css;
+	struct cpuset *des;
+	int err;
+
+	if (cur->partition_root_state != PRS_ENABLED)
+		return -EINVAL;
+
+	err = cpuset_rcu_nocb_apply(trialcs);
+	if (err < 0)
+		return err;
+
+	rcu_read_lock();
+	cpuset_for_each_descendant_pre(des, des_css, cur) {
+		if (des == cur)
+			continue;
+		if (des->partition_root_state == PRS_ENABLED)
+			break;
+		spin_lock_irq(&callback_lock);
+		if (is_rcu_nocb(trialcs))
+			set_bit(CS_RCU_NOCB, &des->flags);
+		else
+			clear_bit(CS_RCU_NOCB, &des->flags);
+		spin_unlock_irq(&callback_lock);
+	}
+	rcu_read_unlock();
+
+	return 0;
+}
+#else
+static inline int cpuset_rcu_nocb_apply(struct cpuset *root)
+{
+	return 0;
+}
+
+static inline int cpuset_rcu_nocb_update(struct cpuset *cur,
+					 struct cpuset *trialcs)
+{
+	return 0;
+}
+#endif /* #ifdef CONFIG_RCU_NOCB_CPU */
+
 /*
  * validate_change_legacy() - Validate conditions specific to legacy (v1)
  * behavior.
@@ -1655,6 +1717,9 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
 	if (cs->partition_root_state) {
 		struct cpuset *parent = parent_cs(cs);

+		WARN_ON_ONCE(cpuset_rcu_nocb_apply(parent) < 0);
+		WARN_ON_ONCE(cpuset_rcu_nocb_apply(cs) < 0);
+
 		/*
 		 * For partition root, update the cpumasks of sibling
 		 * cpusets if they use parent's effective_cpus.
@@ -2012,6 +2077,12 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
 	spread_flag_changed = ((is_spread_slab(cs) != is_spread_slab(trialcs))
 			       || (is_spread_page(cs) != is_spread_page(trialcs)));

+	if (is_rcu_nocb(cs) != is_rcu_nocb(trialcs)) {
+		err = cpuset_rcu_nocb_update(cs, trialcs);
+		if (err < 0)
+			goto out;
+	}
+
 	spin_lock_irq(&callback_lock);
 	cs->flags = trialcs->flags;
 	spin_unlock_irq(&callback_lock);
@@ -2365,6 +2436,7 @@ typedef enum {
 	FILE_MEMORY_PRESSURE,
 	FILE_SPREAD_PAGE,
 	FILE_SPREAD_SLAB,
+	FILE_RCU_NOCB,
 } cpuset_filetype_t;

 static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
@@ -2406,6 +2478,9 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 	case FILE_SPREAD_SLAB:
 		retval = update_flag(CS_SPREAD_SLAB, cs, val);
 		break;
+	case FILE_RCU_NOCB:
+		retval = update_flag(CS_RCU_NOCB, cs, val);
+		break;
 	default:
 		retval = -EINVAL;
 		break;
@@ -2573,6 +2648,8 @@ static u64 cpuset_read_u64(struct cgroup_subsys_state *css, struct cftype *cft)
 		return is_spread_page(cs);
 	case FILE_SPREAD_SLAB:
 		return is_spread_slab(cs);
+	case FILE_RCU_NOCB:
+		return is_rcu_nocb(cs);
 	default:
 		BUG();
 	}
@@ -2803,7 +2880,14 @@ static struct cftype dfl_files[] = {
 		.private = FILE_SUBPARTS_CPULIST,
 		.flags = CFTYPE_DEBUG,
 	},
-
+#ifdef CONFIG_RCU_NOCB_CPU
+	{
+		.name = "isolation.rcu_nocb",
+		.read_u64 = cpuset_read_u64,
+		.write_u64 = cpuset_write_u64,
+		.private = FILE_RCU_NOCB,
+	},
+#endif
 	{ }	/* terminate */
 };

@@ -2861,6 +2945,8 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
 		set_bit(CS_SPREAD_PAGE, &cs->flags);
 	if (is_spread_slab(parent))
 		set_bit(CS_SPREAD_SLAB, &cs->flags);
+	if (is_rcu_nocb(parent))
+		set_bit(CS_RCU_NOCB, &cs->flags);

 	cpuset_inc();

@@ -3227,12 +3313,15 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
 	if (mems_updated)
 		check_insane_mems_config(&new_mems);

-	if (is_in_v2_mode())
+	if (is_in_v2_mode()) {
 		hotplug_update_tasks(cs, &new_cpus, &new_mems,
 				     cpus_updated, mems_updated);
-	else
+		if (cpus_updated)
+			WARN_ON_ONCE(cpuset_rcu_nocb_apply(cs) < 0);
+	} else {
 		hotplug_update_tasks_legacy(cs, &new_cpus, &new_mems,
 					    cpus_updated, mems_updated);
+	}

 	percpu_up_write(&cpuset_rwsem);
 }