Message ID | 20220620224503.3841196-6-paulmck@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Commit | 8f489b4da5278fc6e5fc8f0029ae7fb51c060215 |
Series | Callback-offload (nocb) updates for v5.20 |

On 6/21/2022 4:15 AM, Paul E. McKenney wrote:
> From: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
>
> This commit introduces a RCU_NOCB_CPU_CB_BOOST Kconfig option that
> prevents rcuo kthreads from running at real-time priority, even in
> kernels built with RCU_BOOST.  This capability is important to devices
> needing low-latency (as in a few milliseconds) response from expedited
> RCU grace periods, but which are not running a classic real-time workload.
> On such devices, permitting the rcuo kthreads to run at real-time priority
> results in unacceptable latencies imposed on the application tasks,
> which run as SCHED_OTHER.
>
> See for example the following trace output:
>
> <snip>
> <...>-60 [006] d..1 2979.028717: rcu_batch_start: rcu_preempt CBs=34619 bl=270
> <snip>
>
> If that rcuop kthread were permitted to run at real-time SCHED_FIFO
> priority, it would monopolize its CPU for hundreds of milliseconds
> while invoking those 34619 RCU callback functions, which would cause an
> unacceptably long latency spike for many application stacks on Android
> platforms.
>
> However, some existing real-time workloads require that callback
> invocation run at SCHED_FIFO priority, for example, those running on
> systems with heavy SCHED_OTHER background loads.  (It is the real-time
> system's administrator's responsibility to make sure that important
> real-time tasks run at a higher priority than do RCU's kthreads.)
>
> Therefore, this new RCU_NOCB_CPU_CB_BOOST Kconfig option defaults to
> "y" on kernels built with PREEMPT_RT and defaults to "n" otherwise.
> The effect is to preserve current behavior for real-time systems, but for
> other systems to allow expedited RCU grace periods to run with real-time
> priority while continuing to invoke RCU callbacks as SCHED_OTHER.
>
> As you would expect, this RCU_NOCB_CPU_CB_BOOST Kconfig option has no
> effect except on CPUs with offloaded RCU callbacks.
>
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---

Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>

Thanks
Neeraj

>  kernel/rcu/Kconfig     | 16 ++++++++++++++++
>  kernel/rcu/tree.c      |  6 +++++-
>  kernel/rcu/tree_nocb.h |  3 ++-
>  3 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> index 27aab870ae4cf..c05ca52cdf64d 100644
> --- a/kernel/rcu/Kconfig
> +++ b/kernel/rcu/Kconfig
> @@ -275,6 +275,22 @@ config RCU_NOCB_CPU_DEFAULT_ALL
>  	  Say Y here if you want offload all CPUs by default on boot.
>  	  Say N here if you are unsure.
>
> +config RCU_NOCB_CPU_CB_BOOST
> +	bool "Offload RCU callback from real-time kthread"
> +	depends on RCU_NOCB_CPU && RCU_BOOST
> +	default y if PREEMPT_RT
> +	help
> +	  Use this option to invoke offloaded callbacks as SCHED_FIFO
> +	  to avoid starvation by heavy SCHED_OTHER background load.
> +	  Of course, running as SCHED_FIFO during callback floods will
> +	  cause the rcuo[ps] kthreads to monopolize the CPU for hundreds
> +	  of milliseconds or more.  Therefore, when enabling this option,
> +	  it is your responsibility to ensure that latency-sensitive
> +	  tasks either run with higher priority or run on some other CPU.
> +
> +	  Say Y here if you want to set RT priority for offloading kthreads.
> +	  Say N here if you are building a !PREEMPT_RT kernel and are unsure.
> +
>  config TASKS_TRACE_RCU_READ_MB
>  	bool "Tasks Trace RCU readers use memory barriers in user and idle"
>  	depends on RCU_EXPERT && TASKS_TRACE_RCU
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 74455671e6cf2..3b9f45ebb4999 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -154,7 +154,11 @@ static void sync_sched_exp_online_cleanup(int cpu);
>  static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp);
>  static bool rcu_rdp_is_offloaded(struct rcu_data *rdp);
>
> -/* rcuc/rcub/rcuop kthread realtime priority */
> +/*
> + * rcuc/rcub/rcuop kthread realtime priority. The "rcuop"
> + * real-time priority(enabling/disabling) is controlled by
> + * the extra CONFIG_RCU_NOCB_CPU_CB_BOOST configuration.
> + */
>  static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
>  module_param(kthread_prio, int, 0444);
>
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 60cc92cc66552..fa8e4f82e60c0 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -1315,8 +1315,9 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
>  	if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
>  		goto end;
>
> -	if (kthread_prio)
> +	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST) && kthread_prio)
>  		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> +
>  	WRITE_ONCE(rdp->nocb_cb_kthread, t);
>  	WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
>  	return;
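
For readers who want to see which configurations end up with a SCHED_FIFO rcuo CB kthread, the following is a minimal userspace sketch of the decision the patch encodes. It is not kernel code: the `sketch_*` names and the enum are hypothetical and exist only for illustration. It models just the Kconfig default ("default y if PREEMPT_RT", selectable only with RCU_NOCB_CPU && RCU_BOOST), the built-in default of kthread_prio, and the new test in rcu_spawn_cpu_nocb_kthread(). It assumes the option is left at its default and ignores any other adjustment of kthread_prio.

```c
#include <stdbool.h>

/* Local stand-ins for the two scheduling classes discussed in the patch. */
enum sketch_policy { SKETCH_SCHED_OTHER, SKETCH_SCHED_FIFO };

/*
 * Default value of RCU_NOCB_CPU_CB_BOOST as described by the Kconfig entry:
 * only selectable with RCU_NOCB_CPU && RCU_BOOST, "default y if PREEMPT_RT",
 * otherwise off. (Hypothetical helper, not a kernel function.)
 */
static bool sketch_cb_boost_default(bool preempt_rt, bool rcu_nocb_cpu,
				    bool rcu_boost)
{
	if (!(rcu_nocb_cpu && rcu_boost))
		return false;
	return preempt_rt;
}

/* Built-in default of kthread_prio in tree.c: 1 with RCU_BOOST, else 0. */
static int sketch_default_kthread_prio(bool rcu_boost)
{
	return rcu_boost ? 1 : 0;
}

/*
 * Mirrors the patched test in rcu_spawn_cpu_nocb_kthread():
 *   IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST) && kthread_prio
 * The kthread becomes SCHED_FIFO only when both hold; otherwise it keeps
 * the default SCHED_OTHER class.
 */
static enum sketch_policy sketch_rcuo_cb_policy(bool cb_boost, int kthread_prio)
{
	return (cb_boost && kthread_prio) ? SKETCH_SCHED_FIFO
					  : SKETCH_SCHED_OTHER;
}
```

With these helpers, a PREEMPT_RT build with RCU_NOCB_CPU and RCU_BOOST yields SKETCH_SCHED_FIFO, matching the "preserve current behavior for real-time systems" statement, while the same build without PREEMPT_RT yields SKETCH_SCHED_OTHER even though kthread_prio still defaults to 1.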