diff mbox series

[v3,13/25] context_tracking, rcu: Rename rcu_dynticks_task*() into rcu_task*()

Message ID 20240724144325.3307148-14-vschneid@redhat.com (mailing list archive)
State Accepted
Commit 0e0b77f83102be118510c62b4b4f47e8ac36d174
Headers show
Series context_tracking, rcu: Spring cleaning of dynticks references | expand

Commit Message

Valentin Schneider July 24, 2024, 2:43 p.m. UTC
The context_tracking.state RCU_DYNTICKS subvariable has been renamed to
RCU_WATCHING, and the 'dynticks' prefix can be dropped without losing any
meaning.

While at it, flip the suffixes of these helpers. We are not telling
that we are entering dynticks mode from an RCU-task perspective anymore; we
are telling that we are exiting RCU-tasks because we are in eqs mode.

Suggested-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 kernel/context_tracking.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

Comments

Frederic Weisbecker July 25, 2024, 2:32 p.m. UTC | #1
Le Wed, Jul 24, 2024 at 04:43:13PM +0200, Valentin Schneider a écrit :
> -/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> -static __always_inline void rcu_dynticks_task_trace_enter(void)
> +/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
> +static __always_inline void rcu_task_trace_exit(void)

Before I proceed on this last one, a few questions for Paul and others:

1) Why is rcu_dynticks_task_exit() not called while entering in NMI?
   Does that mean that NMIs aren't RCU-Task read side critical sections?

2) Looking further into CONFIG_TASKS_TRACE_RCU_READ_MB=y, it seems to
   allow for uses of rcu_read_[un]lock_trace() while RCU is not watching
   (EQS). Is it really a good idea to support that? Are we aware of any
   such potential usecase?

Thanks!
Paul E. McKenney July 30, 2024, 2:23 p.m. UTC | #2
On Thu, Jul 25, 2024 at 04:32:46PM +0200, Frederic Weisbecker wrote:
> Le Wed, Jul 24, 2024 at 04:43:13PM +0200, Valentin Schneider a écrit :
> > -/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> > -static __always_inline void rcu_dynticks_task_trace_enter(void)
> > +/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
> > +static __always_inline void rcu_task_trace_exit(void)
> 
> Before I proceed on this last one, a few questions for Paul and others:
> 
> 1) Why is rcu_dynticks_task_exit() not called while entering in NMI?
>    Does that mean that NMIs aren't RCU-Task read side critical sections?

Because Tasks RCU Rude handles that case currently.  So good catch,
because this might need adjustment when we get rid of Tasks RCU Rude.
And both rcu_dynticks_task_enter() and rcu_dynticks_task_exit() look safe
to invoke from NMI handlers.  Memory ordering needs checking, of course.

Except that on architectures defining CONFIG_ARCH_WANTS_NO_INSTR, Tasks
RCU should instead check the ct_kernel_enter_state(RCU_DYNTICKS_IDX)
state, right?  And on those architectures, I believe that
rcu_dynticks_task_enter() and rcu_dynticks_task_exit() can just be no-ops.
Or am I missing something here?

> 2) Looking further into CONFIG_TASKS_TRACE_RCU_READ_MB=y, it seems to
>    allow for uses of rcu_read_[un]lock_trace() while RCU is not watching
>    (EQS). Is it really a good idea to support that? Are we aware of any
>    such potential usecase?

I hope that in the longer term, there will be no reason to support this.
Right now, architectures not defining CONFIG_ARCH_WANTS_NO_INSTR must
support this because tracers really can attach probes where RCU is
not watching.

And even now, in architectures defining CONFIG_ARCH_WANTS_NO_INSTR, I
am not convinced that the early incoming and late outgoing CPU-hotplug
paths are handled correctly.  RCU is not watching them, but I am not so
sure that they are all marked noinstr as needed.

Or am I missing some implicit reason why this all works?

							Thanx, Paul
Frederic Weisbecker July 30, 2024, 10:17 p.m. UTC | #3
Le Tue, Jul 30, 2024 at 07:23:58AM -0700, Paul E. McKenney a écrit :
> On Thu, Jul 25, 2024 at 04:32:46PM +0200, Frederic Weisbecker wrote:
> > Le Wed, Jul 24, 2024 at 04:43:13PM +0200, Valentin Schneider a écrit :
> > > -/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> > > -static __always_inline void rcu_dynticks_task_trace_enter(void)
> > > +/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
> > > +static __always_inline void rcu_task_trace_exit(void)
> > 
> > Before I proceed on this last one, a few questions for Paul and others:
> > 
> > 1) Why is rcu_dynticks_task_exit() not called while entering in NMI?
> >    Does that mean that NMIs aren't RCU-Task read side critical sections?
> 
> Because Tasks RCU Rude handles that case currently.  So good catch,
> because this might need adjustment when we get rid of Tasks RCU Rude.
> And both rcu_dynticks_task_enter() and rcu_dynticks_task_exit() look safe
> to invoke from NMI handlers.  Memory ordering needs checking, of course.
> 
> Except that on architectures defining CONFIG_ARCH_WANTS_NO_INSTR, Tasks
> RCU should instead check the ct_kernel_enter_state(RCU_DYNTICKS_IDX)
> state, right?  And on those architectures, I believe that
> rcu_dynticks_task_enter() and rcu_dynticks_task_exit() can just be no-ops.
> Or am I missing something here?

I think rcu_dynticks_task_enter() and rcu_dynticks_task_exit() are
still needed anyway because the target task can migrate. So unless the rq is locked,
it's hard to match a stable task_cpu() with the corresponding RCU_DYNTICKS_IDX.

> 
> > 2) Looking further into CONFIG_TASKS_TRACE_RCU_READ_MB=y, it seems to
> >    allow for uses of rcu_read_[un]lock_trace() while RCU is not watching
> >    (EQS). Is it really a good idea to support that? Are we aware of any
> >    such potential usecase?
> 
> I hope that in the longer term, there will be no reason to support this.
> Right now, architectures not defining CONFIG_ARCH_WANTS_NO_INSTR must
> support this because tracers really can attach probes where RCU is
> not watching.
> 
> And even now, in architectures defining CONFIG_ARCH_WANTS_NO_INSTR, I
> am not convinced that the early incoming and late outgoing CPU-hotplug
> paths are handled correctly.  RCU is not watching them, but I am not so
> sure that they are all marked noinstr as needed.

Ok I see...

Thanks.
Paul E. McKenney July 30, 2024, 10:39 p.m. UTC | #4
On Wed, Jul 31, 2024 at 12:17:49AM +0200, Frederic Weisbecker wrote:
> Le Tue, Jul 30, 2024 at 07:23:58AM -0700, Paul E. McKenney a écrit :
> > On Thu, Jul 25, 2024 at 04:32:46PM +0200, Frederic Weisbecker wrote:
> > > Le Wed, Jul 24, 2024 at 04:43:13PM +0200, Valentin Schneider a écrit :
> > > > -/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> > > > -static __always_inline void rcu_dynticks_task_trace_enter(void)
> > > > +/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
> > > > +static __always_inline void rcu_task_trace_exit(void)
> > > 
> > > Before I proceed on this last one, a few questions for Paul and others:
> > > 
> > > 1) Why is rcu_dynticks_task_exit() not called while entering in NMI?
> > >    Does that mean that NMIs aren't RCU-Task read side critical sections?
> > 
> > Because Tasks RCU Rude handles that case currently.  So good catch,
> > because this might need adjustment when we get rid of Tasks RCU Rude.
> > And both rcu_dynticks_task_enter() and rcu_dynticks_task_exit() look safe
> > to invoke from NMI handlers.  Memory ordering needs checking, of course.
> > 
> > Except that on architectures defining CONFIG_ARCH_WANTS_NO_INSTR, Tasks
> > RCU should instead check the ct_kernel_enter_state(RCU_DYNTICKS_IDX)
> > state, right?  And on those architectures, I believe that
> > rcu_dynticks_task_enter() and rcu_dynticks_task_exit() can just be no-ops.
> > Or am I missing something here?
> 
> I think rcu_dynticks_task_enter() and rcu_dynticks_task_exit() are
> still needed anyway because the target task can migrate. So unless the rq is locked,
> it's hard to match a stable task_cpu() with the corresponding RCU_DYNTICKS_IDX.

Can it really migrate while in entry/exit or deep idle code?  Or am I
missing a trick here?

> > > 2) Looking further into CONFIG_TASKS_TRACE_RCU_READ_MB=y, it seems to
> > >    allow for uses of rcu_read_[un]lock_trace() while RCU is not watching
> > >    (EQS). Is it really a good idea to support that? Are we aware of any
> > >    such potential usecase?
> > 
> > I hope that in the longer term, there will be no reason to support this.
> > Right now, architectures not defining CONFIG_ARCH_WANTS_NO_INSTR must
> > support this because tracers really can attach probes where RCU is
> > not watching.
> > 
> > And even now, in architectures defining CONFIG_ARCH_WANTS_NO_INSTR, I
> > am not convinced that the early incoming and late outgoing CPU-hotplug
> > paths are handled correctly.  RCU is not watching them, but I am not so
> > sure that they are all marked noinstr as needed.
> 
> Ok I see...

If need be, the outgoing-CPU transition to RCU-not-watching could be
delayed into arch-specific code.  We already allow this for the incoming
transition.

							Thanx, Paul
Frederic Weisbecker July 31, 2024, 12:28 p.m. UTC | #5
Le Tue, Jul 30, 2024 at 03:39:44PM -0700, Paul E. McKenney a écrit :
> On Wed, Jul 31, 2024 at 12:17:49AM +0200, Frederic Weisbecker wrote:
> > Le Tue, Jul 30, 2024 at 07:23:58AM -0700, Paul E. McKenney a écrit :
> > > On Thu, Jul 25, 2024 at 04:32:46PM +0200, Frederic Weisbecker wrote:
> > > > Le Wed, Jul 24, 2024 at 04:43:13PM +0200, Valentin Schneider a écrit :
> > > > > -/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> > > > > -static __always_inline void rcu_dynticks_task_trace_enter(void)
> > > > > +/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
> > > > > +static __always_inline void rcu_task_trace_exit(void)
> > > > 
> > > > Before I proceed on this last one, a few questions for Paul and others:
> > > > 
> > > > 1) Why is rcu_dynticks_task_exit() not called while entering in NMI?
> > > >    Does that mean that NMIs aren't RCU-Task read side critical sections?
> > > 
> > > Because Tasks RCU Rude handles that case currently.  So good catch,
> > > because this might need adjustment when we get rid of Tasks RCU Rude.
> > > And both rcu_dynticks_task_enter() and rcu_dynticks_task_exit() look safe
> > > to invoke from NMI handlers.  Memory ordering needs checking, of course.
> > > 
> > > Except that on architectures defining CONFIG_ARCH_WANTS_NO_INSTR, Tasks
> > > RCU should instead check the ct_kernel_enter_state(RCU_DYNTICKS_IDX)
> > > state, right?  And on those architectures, I believe that
> > > rcu_dynticks_task_enter() and rcu_dynticks_task_exit() can just be no-ops.
> > > Or am I missing something here?
> > 
> > I think rcu_dynticks_task_enter() and rcu_dynticks_task_exit() are
> > still needed anyway because the target task can migrate. So unless the rq is locked,
> > it's hard to match a stable task_cpu() with the corresponding RCU_DYNTICKS_IDX.
> 
> Can it really migrate while in entry/exit or deep idle code?  Or am I
> missing a trick here?

No but it can migrate before or after EQS. So we need to handle situations like:

 == CPU 0 ==                      == CPU 1 ==
                                  // TASK A is on rq
                                  set_task_cpu(TASK A, 0)
// TASK B runs
ct_user_enter()
ct_user_exit()

//TASK A runs


It could be something like the following:


int rcu_tasks_nohz_full_holdout(struct task_struct *t)
{
	int cpu;
	int snap;

	cpu = task_cpu(t);

	/* Don't miss EQS exit if the task migrated out and in */
	smp_rmb()

	snap = ct_dynticks_cpu(cpu);
	if (snap & RCU_DYNTICKS_IDX)
		return true;

	/* Check if it's the actual task running */
	smp_rmb()

	if (rcu_dereference_raw(cpu_curr(cpu)) != t)
		return true;

	/* Make sure the task hasn't migrated in after the above EQS */
	smp_rmb()

	
	return ct_dynticks_cpu(cpu) != snap;
}

But there is still a risk that ct_dynticks wraps before the last test. So
we would need task_call_func() if task_cpu() is in nohz_full mode.

> 
> > > > 2) Looking further into CONFIG_TASKS_TRACE_RCU_READ_MB=y, it seems to
> > > >    allow for uses of rcu_read_[un]lock_trace() while RCU is not watching
> > > >    (EQS). Is it really a good idea to support that? Are we aware of any
> > > >    such potential usecase?
> > > 
> > > I hope that in the longer term, there will be no reason to support this.
> > > Right now, architectures not defining CONFIG_ARCH_WANTS_NO_INSTR must
> > > support this because tracers really can attach probes where RCU is
> > > not watching.
> > > 
> > > And even now, in architectures defining CONFIG_ARCH_WANTS_NO_INSTR, I
> > > am not convinced that the early incoming and late outgoing CPU-hotplug
> > > paths are handled correctly.  RCU is not watching them, but I am not so
> > > sure that they are all marked noinstr as needed.
> > 
> > Ok I see...
> 
> If need be, the outgoing-CPU transition to RCU-not-watching could be
> delayed into arch-specific code.  We already allow this for the incoming
> transition.

That's a lot of scary architectures code to handle :-)
And how do we determine which place is finally safe to stop watching?

Thanks.

> 
> 							Thanx, Paul
Frederic Weisbecker July 31, 2024, 4:07 p.m. UTC | #6
Le Wed, Jul 24, 2024 at 04:43:13PM +0200, Valentin Schneider a écrit :
> The context_tracking.state RCU_DYNTICKS subvariable has been renamed to
> RCU_WATCHING, and the 'dynticks' prefix can be dropped without losing any
> meaning.
> 
> While at it, flip the suffixes of these helpers. We are not telling
> that we are entering dynticks mode from an RCU-task perspective anymore; we
> are telling that we are exiting RCU-tasks because we are in eqs mode.
> 
> Suggested-by: Frederic Weisbecker <frederic@kernel.org>
> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> ---
>  kernel/context_tracking.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 8262f57a43636..1c16a7336360f 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -38,24 +38,24 @@ EXPORT_SYMBOL_GPL(context_tracking);
>  #ifdef CONFIG_CONTEXT_TRACKING_IDLE
>  #define TPS(x)  tracepoint_string(x)
>  
> -/* Record the current task on dyntick-idle entry. */
> -static __always_inline void rcu_dynticks_task_enter(void)
> +/* Record the current task on exiting RCU-tasks (dyntick-idle entry). */
> +static __always_inline void rcu_task_exit(void)

So this makes sense.

>  {
>  #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
>  	WRITE_ONCE(current->rcu_tasks_idle_cpu, smp_processor_id());
>  #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
>  }
>  
> -/* Record no current task on dyntick-idle exit. */
> -static __always_inline void rcu_dynticks_task_exit(void)
> +/* Record no current task on entering RCU-tasks (dyntick-idle exit). */
> +static __always_inline void rcu_task_enter(void)

That too.

>  {
>  #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
>  	WRITE_ONCE(current->rcu_tasks_idle_cpu, -1);
>  #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
>  }
>  
> -/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> -static __always_inline void rcu_dynticks_task_trace_enter(void)
> +/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
> +static __always_inline void rcu_task_trace_exit(void)

But that eventually doesn't, because it's not about not wathing anymore from
an RCU-TASKS-TRACE perspective. It's actually about adding more heavyweight
ordering to track down RCU-TASKS-TRACE read side while traditional RCU is not
watching. Sorry for understanding it that late.

Oh well. So a more accurate name here would be rcu_task_trace_heavyweight_enter().

>  {
>  #ifdef CONFIG_TASKS_TRACE_RCU
>  	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
> @@ -63,8 +63,8 @@ static __always_inline void rcu_dynticks_task_trace_enter(void)
>  #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
>  }
>  
> -/* Turn off heavyweight RCU tasks trace readers on idle/user exit. */
> -static __always_inline void rcu_dynticks_task_trace_exit(void)
> +/* Turn off heavyweight RCU tasks trace readers on kernel entry. */
> +static __always_inline void rcu_task_trace_enter(void)

And rcu_task_trace_heavyweight_exit().

Thanks!

>  {
>  #ifdef CONFIG_TASKS_TRACE_RCU
>  	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
> @@ -87,7 +87,7 @@ static noinstr void ct_kernel_exit_state(int offset)
>  	 * critical sections, and we also must force ordering with the
>  	 * next idle sojourn.
>  	 */
> -	rcu_dynticks_task_trace_enter();  // Before ->dynticks update!
> +	rcu_task_trace_exit();  // Before CT state update!
>  	seq = ct_state_inc(offset);
>  	// RCU is no longer watching.  Better be in extended quiescent state!
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING));
> @@ -109,7 +109,7 @@ static noinstr void ct_kernel_enter_state(int offset)
>  	 */
>  	seq = ct_state_inc(offset);
>  	// RCU is now watching.  Better not be in an extended quiescent state!
> -	rcu_dynticks_task_trace_exit();  // After ->dynticks update!
> +	rcu_task_trace_enter();  // After CT state update!
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & CT_RCU_WATCHING));
>  }
>  
> @@ -149,7 +149,7 @@ static void noinstr ct_kernel_exit(bool user, int offset)
>  	// RCU is watching here ...
>  	ct_kernel_exit_state(offset);
>  	// ... but is no longer watching here.
> -	rcu_dynticks_task_enter();
> +	rcu_task_exit();
>  }
>  
>  /*
> @@ -173,7 +173,7 @@ static void noinstr ct_kernel_enter(bool user, int offset)
>  		ct->nesting++;
>  		return;
>  	}
> -	rcu_dynticks_task_exit();
> +	rcu_task_enter();
>  	// RCU is not watching here ...
>  	ct_kernel_enter_state(offset);
>  	// ... but is watching here.
> @@ -240,7 +240,7 @@ void noinstr ct_nmi_exit(void)
>  	// ... but is no longer watching here.
>  
>  	if (!in_nmi())
> -		rcu_dynticks_task_enter();
> +		rcu_task_exit();
>  }
>  
>  /**
> @@ -274,7 +274,7 @@ void noinstr ct_nmi_enter(void)
>  	if (rcu_dynticks_curr_cpu_in_eqs()) {
>  
>  		if (!in_nmi())
> -			rcu_dynticks_task_exit();
> +			rcu_task_enter();
>  
>  		// RCU is not watching here ...
>  		ct_kernel_enter_state(CT_RCU_WATCHING);
> -- 
> 2.43.0
>
Paul E. McKenney Aug. 3, 2024, 4:32 a.m. UTC | #7
On Wed, Jul 31, 2024 at 02:28:16PM +0200, Frederic Weisbecker wrote:
> Le Tue, Jul 30, 2024 at 03:39:44PM -0700, Paul E. McKenney a écrit :
> > On Wed, Jul 31, 2024 at 12:17:49AM +0200, Frederic Weisbecker wrote:
> > > Le Tue, Jul 30, 2024 at 07:23:58AM -0700, Paul E. McKenney a écrit :
> > > > On Thu, Jul 25, 2024 at 04:32:46PM +0200, Frederic Weisbecker wrote:
> > > > > Le Wed, Jul 24, 2024 at 04:43:13PM +0200, Valentin Schneider a écrit :
> > > > > > -/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> > > > > > -static __always_inline void rcu_dynticks_task_trace_enter(void)
> > > > > > +/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
> > > > > > +static __always_inline void rcu_task_trace_exit(void)
> > > > > 
> > > > > Before I proceed on this last one, a few questions for Paul and others:
> > > > > 
> > > > > 1) Why is rcu_dynticks_task_exit() not called while entering in NMI?
> > > > >    Does that mean that NMIs aren't RCU-Task read side critical sections?
> > > > 
> > > > Because Tasks RCU Rude handles that case currently.  So good catch,
> > > > because this might need adjustment when we get rid of Tasks RCU Rude.
> > > > And both rcu_dynticks_task_enter() and rcu_dynticks_task_exit() look safe
> > > > to invoke from NMI handlers.  Memory ordering needs checking, of course.
> > > > 
> > > > Except that on architectures defining CONFIG_ARCH_WANTS_NO_INSTR, Tasks
> > > > RCU should instead check the ct_kernel_enter_state(RCU_DYNTICKS_IDX)
> > > > state, right?  And on those architectures, I believe that
> > > > rcu_dynticks_task_enter() and rcu_dynticks_task_exit() can just be no-ops.
> > > > Or am I missing something here?
> > > 
> > > I think rcu_dynticks_task_enter() and rcu_dynticks_task_exit() are
> > > still needed anyway because the target task can migrate. So unless the rq is locked,
> > > it's hard to match a stable task_cpu() with the corresponding RCU_DYNTICKS_IDX.
> > 
> > Can it really migrate while in entry/exit or deep idle code?  Or am I
> > missing a trick here?
> 
> No but it can migrate before or after EQS. So we need to handle situations like:
> 
>  == CPU 0 ==                      == CPU 1 ==
>                                   // TASK A is on rq
>                                   set_task_cpu(TASK A, 0)
> // TASK B runs
> ct_user_enter()
> ct_user_exit()
> 
> //TASK A runs
> 
> 
> It could be something like the following:
> 
> 
> int rcu_tasks_nohz_full_holdout(struct task_struct *t)
> {
> 	int cpu;
> 	int snap;
> 
> 	cpu = task_cpu(t);
> 
> 	/* Don't miss EQS exit if the task migrated out and in */
> 	smp_rmb()
> 
> 	snap = ct_dynticks_cpu(cpu);
> 	if (snap & RCU_DYNTICKS_IDX)
> 		return true;
> 
> 	/* Check if it's the actual task running */
> 	smp_rmb()
> 
> 	if (rcu_dereference_raw(cpu_curr(cpu)) != t)
> 		return true;
> 
> 	/* Make sure the task hasn't migrated in after the above EQS */
> 	smp_rmb()
> 
> 	
> 	return ct_dynticks_cpu(cpu) != snap;
> }
> 
> But there is still a risk that ct_dynticks wraps before the last test. So
> we would need task_call_func() if task_cpu() is in nohz_full mode.

Good point, migration just before or just after can look much the same
as migrating during..

> > > > > 2) Looking further into CONFIG_TASKS_TRACE_RCU_READ_MB=y, it seems to
> > > > >    allow for uses of rcu_read_[un]lock_trace() while RCU is not watching
> > > > >    (EQS). Is it really a good idea to support that? Are we aware of any
> > > > >    such potential usecase?
> > > > 
> > > > I hope that in the longer term, there will be no reason to support this.
> > > > Right now, architectures not defining CONFIG_ARCH_WANTS_NO_INSTR must
> > > > support this because tracers really can attach probes where RCU is
> > > > not watching.
> > > > 
> > > > And even now, in architectures defining CONFIG_ARCH_WANTS_NO_INSTR, I
> > > > am not convinced that the early incoming and late outgoing CPU-hotplug
> > > > paths are handled correctly.  RCU is not watching them, but I am not so
> > > > sure that they are all marked noinstr as needed.
> > > 
> > > Ok I see...
> > 
> > If need be, the outgoing-CPU transition to RCU-not-watching could be
> > delayed into arch-specific code.  We already allow this for the incoming
> > transition.
> 
> That's a lot of scary architectures code to handle :-)
> And how do we determine which place is finally safe to stop watching?

Huh.  One reason for the current smp_call_function_single() in
cpuhp_report_idle_dead() was some ARM32 CPUs that shut down caching on
their way out.  this made it impossible to use shared-variable-based
CPU-dead notification.  I wonder if Arnd's deprecation schedule
for ARM32-based platforms will allow us to go back to shared-memory
notification, which might make this sort of thing easier.

							Thanx, Paul
Peter Zijlstra Aug. 5, 2024, 9:01 a.m. UTC | #8
On Fri, Aug 02, 2024 at 09:32:08PM -0700, Paul E. McKenney wrote:

> Huh.  One reason for the current smp_call_function_single() in
> cpuhp_report_idle_dead() was some ARM32 CPUs that shut down caching on
> their way out.  this made it impossible to use shared-variable-based
> CPU-dead notification.  I wonder if Arnd's deprecation schedule
> for ARM32-based platforms will allow us to go back to shared-memory
> notification, which might make this sort of thing easier.

All those idle paths should be doing ct_cpuidle_enter(), which includes
telling RCU the CPU is going idle, no?

I cleaned all that up a while back.
Frederic Weisbecker Aug. 5, 2024, 12:18 p.m. UTC | #9
Le Mon, Aug 05, 2024 at 11:01:22AM +0200, Peter Zijlstra a écrit :
> On Fri, Aug 02, 2024 at 09:32:08PM -0700, Paul E. McKenney wrote:
> 
> > Huh.  One reason for the current smp_call_function_single() in
> > cpuhp_report_idle_dead() was some ARM32 CPUs that shut down caching on
> > their way out.  this made it impossible to use shared-variable-based
> > CPU-dead notification.  I wonder if Arnd's deprecation schedule
> > for ARM32-based platforms will allow us to go back to shared-memory
> > notification, which might make this sort of thing easier.
> 
> All those idle paths should be doing ct_cpuidle_enter(), which includes
> telling RCU the CPU is going idle, no?
> 
> I cleaned all that up a while back.

On every architectures? That would be a good news, as we could then remove
the CONFIG_TASKS_TRACE_RCU_READ_MB dance.

There is still some vulnerable path on CPU hotplug down path once
cpuhp_complete_idle_dead() is executed. The dead CPU may still call a bunch
of general purpose functions (the exit path of smp_call_function_single()
and the exit of cpuhp_report_idle_dead()). And this will become worse once
we remove TASK-RUDE and rely only on rcutree_report_cpu_dead().

Thanks.
Frederic Weisbecker Aug. 5, 2024, 1:26 p.m. UTC | #10
Le Fri, Aug 02, 2024 at 09:32:08PM -0700, Paul E. McKenney a écrit :
> On Wed, Jul 31, 2024 at 02:28:16PM +0200, Frederic Weisbecker wrote:
> > Le Tue, Jul 30, 2024 at 03:39:44PM -0700, Paul E. McKenney a écrit :
> > > On Wed, Jul 31, 2024 at 12:17:49AM +0200, Frederic Weisbecker wrote:
> > > > Le Tue, Jul 30, 2024 at 07:23:58AM -0700, Paul E. McKenney a écrit :
> > > > > On Thu, Jul 25, 2024 at 04:32:46PM +0200, Frederic Weisbecker wrote:
> > > > > > Le Wed, Jul 24, 2024 at 04:43:13PM +0200, Valentin Schneider a écrit :
> > > > > > > -/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> > > > > > > -static __always_inline void rcu_dynticks_task_trace_enter(void)
> > > > > > > +/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
> > > > > > > +static __always_inline void rcu_task_trace_exit(void)
> > > > > > 
> > > > > > Before I proceed on this last one, a few questions for Paul and others:
> > > > > > 
> > > > > > 1) Why is rcu_dynticks_task_exit() not called while entering in NMI?
> > > > > >    Does that mean that NMIs aren't RCU-Task read side critical sections?
> > > > > 
> > > > > Because Tasks RCU Rude handles that case currently.  So good catch,
> > > > > because this might need adjustment when we get rid of Tasks RCU Rude.
> > > > > And both rcu_dynticks_task_enter() and rcu_dynticks_task_exit() look safe
> > > > > to invoke from NMI handlers.  Memory ordering needs checking, of course.
> > > > > 
> > > > > Except that on architectures defining CONFIG_ARCH_WANTS_NO_INSTR, Tasks
> > > > > RCU should instead check the ct_kernel_enter_state(RCU_DYNTICKS_IDX)
> > > > > state, right?  And on those architectures, I believe that
> > > > > rcu_dynticks_task_enter() and rcu_dynticks_task_exit() can just be no-ops.
> > > > > Or am I missing something here?
> > > > 
> > > > I think rcu_dynticks_task_enter() and rcu_dynticks_task_exit() are
> > > > still needed anyway because the target task can migrate. So unless the rq is locked,
> > > > it's hard to match a stable task_cpu() with the corresponding RCU_DYNTICKS_IDX.
> > > 
> > > Can it really migrate while in entry/exit or deep idle code?  Or am I
> > > missing a trick here?
> > 
> > No but it can migrate before or after EQS. So we need to handle situations like:
> > 
> >  == CPU 0 ==                      == CPU 1 ==
> >                                   // TASK A is on rq
> >                                   set_task_cpu(TASK A, 0)
> > // TASK B runs
> > ct_user_enter()
> > ct_user_exit()
> > 
> > //TASK A runs
> > 
> > 
> > It could be something like the following:
> > 
> > 
> > int rcu_tasks_nohz_full_holdout(struct task_struct *t)
> > {
> > 	int cpu;
> > 	int snap;
> > 
> > 	cpu = task_cpu(t);
> > 
> > 	/* Don't miss EQS exit if the task migrated out and in */
> > 	smp_rmb()
> > 
> > 	snap = ct_dynticks_cpu(cpu);
> > 	if (snap & RCU_DYNTICKS_IDX)
> > 		return true;
> > 
> > 	/* Check if it's the actual task running */
> > 	smp_rmb()
> > 
> > 	if (rcu_dereference_raw(cpu_curr(cpu)) != t)
> > 		return true;
> > 
> > 	/* Make sure the task hasn't migrated in after the above EQS */
> > 	smp_rmb()
> > 
> > 	
> > 	return ct_dynticks_cpu(cpu) != snap;
> > }
> > 
> > But there is still a risk that ct_dynticks wraps before the last test. So
> > we would need task_call_func() if task_cpu() is in nohz_full mode.
> 
> Good point, migration just before or just after can look much the same
> as migrating during..
> 
> > > > > > 2) Looking further into CONFIG_TASKS_TRACE_RCU_READ_MB=y, it seems to
> > > > > >    allow for uses of rcu_read_[un]lock_trace() while RCU is not watching
> > > > > >    (EQS). Is it really a good idea to support that? Are we aware of any
> > > > > >    such potential usecase?
> > > > > 
> > > > > I hope that in the longer term, there will be no reason to support this.
> > > > > Right now, architectures not defining CONFIG_ARCH_WANTS_NO_INSTR must
> > > > > support this because tracers really can attach probes where RCU is
> > > > > not watching.
> > > > > 
> > > > > And even now, in architectures defining CONFIG_ARCH_WANTS_NO_INSTR, I
> > > > > am not convinced that the early incoming and late outgoing CPU-hotplug
> > > > > paths are handled correctly.  RCU is not watching them, but I am not so
> > > > > sure that they are all marked noinstr as needed.
> > > > 
> > > > Ok I see...
> > > 
> > > If need be, the outgoing-CPU transition to RCU-not-watching could be
> > > delayed into arch-specific code.  We already allow this for the incoming
> > > transition.
> > 
> > That's a lot of scary architectures code to handle :-)
> > And how do we determine which place is finally safe to stop watching?
> 
> Huh.  One reason for the current smp_call_function_single() in
> cpuhp_report_idle_dead() was some ARM32 CPUs that shut down caching on
> their way out.  this made it impossible to use shared-variable-based
> CPU-dead notification.  I wonder if Arnd's deprecation schedule
> for ARM32-based platforms will allow us to go back to shared-memory
> notification, which might make this sort of thing easier.

git blame retrieved that:

71f87b2fc64c (cpu/hotplug: Plug death reporting race)

For the reason why using smp_call_function_single():

    """
    We cant call complete after rcu_report_dead(), so instead of going back to
    busy polling, simply issue a function call to do the completion.
    """

This doesn't state exactly why as I don't see obvious use of RCU there. But
RCU-TASK-TRACE definetly doesn't want to stop watching while complete() is
called.

Another thing to consider is that rcutree_migrate_callbacks() may execute
concurrently with rcutree_report_cpu_dead() otherwise. This might work ok,
despite the WARN_ON_ONCE(rcu_rdp_cpu_online(rdp)) in
rcutree_migrate_callbacks(). But that's doesn't make it more a good idea :-)

As for the reason why calling complete() _after_ rcutree_report_cpu_dead():

    """
    Paul noticed that the conversion of the death reporting introduced a race
    where the outgoing cpu might be delayed after waking the controll processor,
    so it might not be able to call rcu_report_dead() before being physically
    removed, leading to RCU stalls.
    """

I'm lacking details but perhaps the call to __cpu_die() by the control CPU on
some archs may kill the CPU before it used to reach rcu_report_dead() ? No idea.
But we need to know if __cpu_die() can kill the CPU before it reaches
arch_cpu_idle_dead(). If it's not the case, then we can do something like this:

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 24b1e1143260..fb5b866c2fb3 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -31,7 +31,6 @@ DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
 	.dynticks_nesting = 1,
 	.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
 #endif
-	.state = ATOMIC_INIT(RCU_DYNTICKS_IDX),
 };
 EXPORT_SYMBOL_GPL(context_tracking);
 
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 6e78d071beb5..7ebaf04af37a 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -307,6 +307,7 @@ static void do_idle(void)
 
 		if (cpu_is_offline(cpu)) {
 			cpuhp_report_idle_dead();
+			ct_state_inc(RCU_DYNTICKS_IDX);
 			arch_cpu_idle_dead();
 		}
 
This mirrors rcutree_report_cpu_starting() -> rcu_dynticks_eqs_online()
And then we can make RCU-TASKS not worry about rcu_cpu_online() anymore
and only care about the context tracking.

The CPU up path looks saner with rcutree_report_cpu_starting() called early
enough on sane archs... Otherwise it's easy to fix case by case.

Thoughts?
Paul E. McKenney Aug. 5, 2024, 5:04 p.m. UTC | #11
On Mon, Aug 05, 2024 at 03:26:38PM +0200, Frederic Weisbecker wrote:
> Le Fri, Aug 02, 2024 at 09:32:08PM -0700, Paul E. McKenney a écrit :
> > On Wed, Jul 31, 2024 at 02:28:16PM +0200, Frederic Weisbecker wrote:
> > > Le Tue, Jul 30, 2024 at 03:39:44PM -0700, Paul E. McKenney a écrit :
> > > > On Wed, Jul 31, 2024 at 12:17:49AM +0200, Frederic Weisbecker wrote:
> > > > > Le Tue, Jul 30, 2024 at 07:23:58AM -0700, Paul E. McKenney a écrit :
> > > > > > On Thu, Jul 25, 2024 at 04:32:46PM +0200, Frederic Weisbecker wrote:
> > > > > > > Le Wed, Jul 24, 2024 at 04:43:13PM +0200, Valentin Schneider a écrit :
> > > > > > > > -/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> > > > > > > > -static __always_inline void rcu_dynticks_task_trace_enter(void)
> > > > > > > > +/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
> > > > > > > > +static __always_inline void rcu_task_trace_exit(void)
> > > > > > > 
> > > > > > > Before I proceed on this last one, a few questions for Paul and others:
> > > > > > > 
> > > > > > > 1) Why is rcu_dynticks_task_exit() not called while entering in NMI?
> > > > > > >    Does that mean that NMIs aren't RCU-Task read side critical sections?
> > > > > > 
> > > > > > Because Tasks RCU Rude handles that case currently.  So good catch,
> > > > > > because this might need adjustment when we get rid of Tasks RCU Rude.
> > > > > > And both rcu_dynticks_task_enter() and rcu_dynticks_task_exit() look safe
> > > > > > to invoke from NMI handlers.  Memory ordering needs checking, of course.
> > > > > > 
> > > > > > Except that on architectures defining CONFIG_ARCH_WANTS_NO_INSTR, Tasks
> > > > > > RCU should instead check the ct_kernel_enter_state(RCU_DYNTICKS_IDX)
> > > > > > state, right?  And on those architectures, I believe that
> > > > > > rcu_dynticks_task_enter() and rcu_dynticks_task_exit() can just be no-ops.
> > > > > > Or am I missing something here?
> > > > > 
> > > > > I think rcu_dynticks_task_enter() and rcu_dynticks_task_exit() are
> > > > > still needed anyway because the target task can migrate. So unless the rq is locked,
> > > > > it's hard to match a stable task_cpu() with the corresponding RCU_DYNTICKS_IDX.
> > > > 
> > > > Can it really migrate while in entry/exit or deep idle code?  Or am I
> > > > missing a trick here?
> > > 
> > > No but it can migrate before or after EQS. So we need to handle situations like:
> > > 
> > >  == CPU 0 ==                      == CPU 1 ==
> > >                                   // TASK A is on rq
> > >                                   set_task_cpu(TASK A, 0)
> > > // TASK B runs
> > > ct_user_enter()
> > > ct_user_exit()
> > > 
> > > //TASK A runs
> > > 
> > > 
> > > It could be something like the following:
> > > 
> > > 
> > > int rcu_tasks_nohz_full_holdout(struct task_struct *t)
> > > {
> > > 	int cpu;
> > > 	int snap;
> > > 
> > > 	cpu = task_cpu(t);
> > > 
> > > 	/* Don't miss EQS exit if the task migrated out and in */
> > > 	smp_rmb()
> > > 
> > > 	snap = ct_dynticks_cpu(cpu);
> > > 	if (snap & RCU_DYNTICKS_IDX)
> > > 		return true;
> > > 
> > > 	/* Check if it's the actual task running */
> > > 	smp_rmb()
> > > 
> > > 	if (rcu_dereference_raw(cpu_curr(cpu)) != t)
> > > 		return true;
> > > 
> > > 	/* Make sure the task hasn't migrated in after the above EQS */
> > > 	smp_rmb()
> > > 
> > > 	
> > > 	return ct_dynticks_cpu(cpu) != snap;
> > > }
> > > 
> > > But there is still a risk that ct_dynticks wraps before the last test. So
> > > we would need task_call_func() if task_cpu() is in nohz_full mode.
> > 
> > Good point, migration just before or just after can look much the same
> > as migrating during..
> > 
> > > > > > > 2) Looking further into CONFIG_TASKS_TRACE_RCU_READ_MB=y, it seems to
> > > > > > >    allow for uses of rcu_read_[un]lock_trace() while RCU is not watching
> > > > > > >    (EQS). Is it really a good idea to support that? Are we aware of any
> > > > > > >    such potential usecase?
> > > > > > 
> > > > > > I hope that in the longer term, there will be no reason to support this.
> > > > > > Right now, architectures not defining CONFIG_ARCH_WANTS_NO_INSTR must
> > > > > > support this because tracers really can attach probes where RCU is
> > > > > > not watching.
> > > > > > 
> > > > > > And even now, in architectures defining CONFIG_ARCH_WANTS_NO_INSTR, I
> > > > > > am not convinced that the early incoming and late outgoing CPU-hotplug
> > > > > > paths are handled correctly.  RCU is not watching them, but I am not so
> > > > > > sure that they are all marked noinstr as needed.
> > > > > 
> > > > > Ok I see...
> > > > 
> > > > If need be, the outgoing-CPU transition to RCU-not-watching could be
> > > > delayed into arch-specific code.  We already allow this for the incoming
> > > > transition.
> > > 
> > > That's a lot of scary architectures code to handle :-)
> > > And how do we determine which place is finally safe to stop watching?
> > 
> > Huh.  One reason for the current smp_call_function_single() in
> > cpuhp_report_idle_dead() was some ARM32 CPUs that shut down caching on
> > their way out.  this made it impossible to use shared-variable-based
> > CPU-dead notification.  I wonder if Arnd's deprecation schedule
> > for ARM32-based platforms will allow us to go back to shared-memory
> > notification, which might make this sort of thing easier.
> 
> git blame retrieved that:
> 
> 71f87b2fc64c (cpu/hotplug: Plug death reporting race)
> 
> For the reason why using smp_call_function_single():
> 
>     """
>     We cant call complete after rcu_report_dead(), so instead of going back to
>     busy polling, simply issue a function call to do the completion.
>     """
> 
> This doesn't state exactly why as I don't see obvious use of RCU there. But
> RCU-TASK-TRACE definetly doesn't want to stop watching while complete() is
> called.

With lockdep, the lock acquisition and release in complete_with_flags()
would splat, though this could be handled by using a low-level lock
acquisition and release that didn't mess with lockdep.  With the resulting
risks of missed deadlock detection, of course!

With tracing enabled, there is the trace_sched_waking() within
try_to_wake_up() along with the trace_sched_wakeup() within
ttwu_do_wakeup().  And more beyond that, but I am lazy this morning.  ;-)

> Another thing to consider is that rcutree_migrate_callbacks() may execute
> concurrently with rcutree_report_cpu_dead() otherwise. This might work ok,
> despite the WARN_ON_ONCE(rcu_rdp_cpu_online(rdp)) in
> rcutree_migrate_callbacks(). But that's doesn't make it more a good idea :-)
> 
> As for the reason why calling complete() _after_ rcutree_report_cpu_dead():
> 
>     """
>     Paul noticed that the conversion of the death reporting introduced a race
>     where the outgoing cpu might be delayed after waking the controll processor,
>     so it might not be able to call rcu_report_dead() before being physically
>     removed, leading to RCU stalls.
>     """

There were also ARM32 splats that have aged out of my mental cache.

> I'm lacking details but perhaps the call to __cpu_die() by the control CPU on
> some archs may kill the CPU before it used to reach rcu_report_dead() ? No idea.
> But we need to know if __cpu_die() can kill the CPU before it reaches
> arch_cpu_idle_dead(). If it's not the case, then we can do something like this:
> 
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 24b1e1143260..fb5b866c2fb3 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -31,7 +31,6 @@ DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
>  	.dynticks_nesting = 1,
>  	.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
>  #endif
> -	.state = ATOMIC_INIT(RCU_DYNTICKS_IDX),

Don't you still need the initial state for the boot CPU?

>  };
>  EXPORT_SYMBOL_GPL(context_tracking);
>  
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 6e78d071beb5..7ebaf04af37a 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -307,6 +307,7 @@ static void do_idle(void)
>  
>  		if (cpu_is_offline(cpu)) {
>  			cpuhp_report_idle_dead();
> +			ct_state_inc(RCU_DYNTICKS_IDX);
>  			arch_cpu_idle_dead();
>  		}
>  
> This mirrors rcutree_report_cpu_starting() -> rcu_dynticks_eqs_online()
> And then we can make RCU-TASKS not worry about rcu_cpu_online() anymore
> and only care about the context tracking.

For that to work, we need all architectures to be OK for Tasks Rude RCU
being removed, right?  Plus that would make all of arch_cpu_idle_dead()
untraceable.  Is that a good thing?

> The CPU up path looks saner with rcutree_report_cpu_starting() called early
> enough on sane archs... Otherwise it's easy to fix case by case.
> 
> Thoughts?

Not entirly sure yet, but I agree some fix would be good!

							Thanx, Paul
Neeraj Upadhyay Aug. 14, 2024, 12:06 p.m. UTC | #12
On Wed, Jul 31, 2024 at 06:07:32PM +0200, Frederic Weisbecker wrote:
> Le Wed, Jul 24, 2024 at 04:43:13PM +0200, Valentin Schneider a écrit :
> > The context_tracking.state RCU_DYNTICKS subvariable has been renamed to
> > RCU_WATCHING, and the 'dynticks' prefix can be dropped without losing any
> > meaning.
> > 
> > While at it, flip the suffixes of these helpers. We are not telling
> > that we are entering dynticks mode from an RCU-task perspective anymore; we
> > are telling that we are exiting RCU-tasks because we are in eqs mode.
> > 
> > Suggested-by: Frederic Weisbecker <frederic@kernel.org>
> > Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> > ---
> >  kernel/context_tracking.c | 28 ++++++++++++++--------------
> >  1 file changed, 14 insertions(+), 14 deletions(-)
> > 
> > diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> > index 8262f57a43636..1c16a7336360f 100644
> > --- a/kernel/context_tracking.c
> > +++ b/kernel/context_tracking.c
> > @@ -38,24 +38,24 @@ EXPORT_SYMBOL_GPL(context_tracking);
> >  #ifdef CONFIG_CONTEXT_TRACKING_IDLE
> >  #define TPS(x)  tracepoint_string(x)
> >  
> > -/* Record the current task on dyntick-idle entry. */
> > -static __always_inline void rcu_dynticks_task_enter(void)
> > +/* Record the current task on exiting RCU-tasks (dyntick-idle entry). */
> > +static __always_inline void rcu_task_exit(void)
> 
> So this makes sense.
> 
> >  {
> >  #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
> >  	WRITE_ONCE(current->rcu_tasks_idle_cpu, smp_processor_id());
> >  #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> >  }
> >  
> > -/* Record no current task on dyntick-idle exit. */
> > -static __always_inline void rcu_dynticks_task_exit(void)
> > +/* Record no current task on entering RCU-tasks (dyntick-idle exit). */
> > +static __always_inline void rcu_task_enter(void)
> 
> That too.
> 
> >  {
> >  #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
> >  	WRITE_ONCE(current->rcu_tasks_idle_cpu, -1);
> >  #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> >  }
> >  
> > -/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> > -static __always_inline void rcu_dynticks_task_trace_enter(void)
> > +/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
> > +static __always_inline void rcu_task_trace_exit(void)
> 
> But that eventually doesn't, because it's not about not wathing anymore from
> an RCU-TASKS-TRACE perspective. It's actually about adding more heavyweight
> ordering to track down RCU-TASKS-TRACE read side while traditional RCU is not
> watching. Sorry for understanding it that late.
> 
> Oh well. So a more accurate name here would be rcu_task_trace_heavyweight_enter().
> 
> >  {
> >  #ifdef CONFIG_TASKS_TRACE_RCU
> >  	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
> > @@ -63,8 +63,8 @@ static __always_inline void rcu_dynticks_task_trace_enter(void)
> >  #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
> >  }
> >  
> > -/* Turn off heavyweight RCU tasks trace readers on idle/user exit. */
> > -static __always_inline void rcu_dynticks_task_trace_exit(void)
> > +/* Turn off heavyweight RCU tasks trace readers on kernel entry. */
> > +static __always_inline void rcu_task_trace_enter(void)
> 
> And rcu_task_trace_heavyweight_exit().
> 

I have updated it here [1]. Please let me know if something looks
incorrect.


[1] https://git.kernel.org/pub/scm/linux/kernel/git/neeraj.upadhyay/linux-rcu.git/commit/?h=next.14.08.24a&id=cfc22b9f1572b137dd9f36da831dd7b69c9fe352

- Neeraj

> Thanks!
> 
> >  {
> >  #ifdef CONFIG_TASKS_TRACE_RCU
> >  	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
> > @@ -87,7 +87,7 @@ static noinstr void ct_kernel_exit_state(int offset)
> >  	 * critical sections, and we also must force ordering with the
> >  	 * next idle sojourn.
> >  	 */
> > -	rcu_dynticks_task_trace_enter();  // Before ->dynticks update!
> > +	rcu_task_trace_exit();  // Before CT state update!
> >  	seq = ct_state_inc(offset);
> >  	// RCU is no longer watching.  Better be in extended quiescent state!
> >  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING));
> > @@ -109,7 +109,7 @@ static noinstr void ct_kernel_enter_state(int offset)
> >  	 */
> >  	seq = ct_state_inc(offset);
> >  	// RCU is now watching.  Better not be in an extended quiescent state!
> > -	rcu_dynticks_task_trace_exit();  // After ->dynticks update!
> > +	rcu_task_trace_enter();  // After CT state update!
> >  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & CT_RCU_WATCHING));
> >  }
> >  
> > @@ -149,7 +149,7 @@ static void noinstr ct_kernel_exit(bool user, int offset)
> >  	// RCU is watching here ...
> >  	ct_kernel_exit_state(offset);
> >  	// ... but is no longer watching here.
> > -	rcu_dynticks_task_enter();
> > +	rcu_task_exit();
> >  }
> >  
> >  /*
> > @@ -173,7 +173,7 @@ static void noinstr ct_kernel_enter(bool user, int offset)
> >  		ct->nesting++;
> >  		return;
> >  	}
> > -	rcu_dynticks_task_exit();
> > +	rcu_task_enter();
> >  	// RCU is not watching here ...
> >  	ct_kernel_enter_state(offset);
> >  	// ... but is watching here.
> > @@ -240,7 +240,7 @@ void noinstr ct_nmi_exit(void)
> >  	// ... but is no longer watching here.
> >  
> >  	if (!in_nmi())
> > -		rcu_dynticks_task_enter();
> > +		rcu_task_exit();
> >  }
> >  
> >  /**
> > @@ -274,7 +274,7 @@ void noinstr ct_nmi_enter(void)
> >  	if (rcu_dynticks_curr_cpu_in_eqs()) {
> >  
> >  		if (!in_nmi())
> > -			rcu_dynticks_task_exit();
> > +			rcu_task_enter();
> >  
> >  		// RCU is not watching here ...
> >  		ct_kernel_enter_state(CT_RCU_WATCHING);
> > -- 
> > 2.43.0
> >
Frederic Weisbecker Aug. 15, 2024, 2:14 p.m. UTC | #13
Le Wed, Aug 14, 2024 at 05:36:55PM +0530, Neeraj Upadhyay a écrit :
> On Wed, Jul 31, 2024 at 06:07:32PM +0200, Frederic Weisbecker wrote:
> > Le Wed, Jul 24, 2024 at 04:43:13PM +0200, Valentin Schneider a écrit :
> > > The context_tracking.state RCU_DYNTICKS subvariable has been renamed to
> > > RCU_WATCHING, and the 'dynticks' prefix can be dropped without losing any
> > > meaning.
> > > 
> > > While at it, flip the suffixes of these helpers. We are not telling
> > > that we are entering dynticks mode from an RCU-task perspective anymore; we
> > > are telling that we are exiting RCU-tasks because we are in eqs mode.
> > > 
> > > Suggested-by: Frederic Weisbecker <frederic@kernel.org>
> > > Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> > > ---
> > >  kernel/context_tracking.c | 28 ++++++++++++++--------------
> > >  1 file changed, 14 insertions(+), 14 deletions(-)
> > > 
> > > diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> > > index 8262f57a43636..1c16a7336360f 100644
> > > --- a/kernel/context_tracking.c
> > > +++ b/kernel/context_tracking.c
> > > @@ -38,24 +38,24 @@ EXPORT_SYMBOL_GPL(context_tracking);
> > >  #ifdef CONFIG_CONTEXT_TRACKING_IDLE
> > >  #define TPS(x)  tracepoint_string(x)
> > >  
> > > -/* Record the current task on dyntick-idle entry. */
> > > -static __always_inline void rcu_dynticks_task_enter(void)
> > > +/* Record the current task on exiting RCU-tasks (dyntick-idle entry). */
> > > +static __always_inline void rcu_task_exit(void)
> > 
> > So this makes sense.
> > 
> > >  {
> > >  #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
> > >  	WRITE_ONCE(current->rcu_tasks_idle_cpu, smp_processor_id());
> > >  #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> > >  }
> > >  
> > > -/* Record no current task on dyntick-idle exit. */
> > > -static __always_inline void rcu_dynticks_task_exit(void)
> > > +/* Record no current task on entering RCU-tasks (dyntick-idle exit). */
> > > +static __always_inline void rcu_task_enter(void)
> > 
> > That too.
> > 
> > >  {
> > >  #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
> > >  	WRITE_ONCE(current->rcu_tasks_idle_cpu, -1);
> > >  #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> > >  }
> > >  
> > > -/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> > > -static __always_inline void rcu_dynticks_task_trace_enter(void)
> > > +/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
> > > +static __always_inline void rcu_task_trace_exit(void)
> > 
> > But that eventually doesn't, because it's not about not wathing anymore from
> > an RCU-TASKS-TRACE perspective. It's actually about adding more heavyweight
> > ordering to track down RCU-TASKS-TRACE read side while traditional RCU is not
> > watching. Sorry for understanding it that late.
> > 
> > Oh well. So a more accurate name here would be rcu_task_trace_heavyweight_enter().
> > 
> > >  {
> > >  #ifdef CONFIG_TASKS_TRACE_RCU
> > >  	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
> > > @@ -63,8 +63,8 @@ static __always_inline void rcu_dynticks_task_trace_enter(void)
> > >  #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
> > >  }
> > >  
> > > -/* Turn off heavyweight RCU tasks trace readers on idle/user exit. */
> > > -static __always_inline void rcu_dynticks_task_trace_exit(void)
> > > +/* Turn off heavyweight RCU tasks trace readers on kernel entry. */
> > > +static __always_inline void rcu_task_trace_enter(void)
> > 
> > And rcu_task_trace_heavyweight_exit().
> > 
> 
> I have updated it here [1]. Please let me know if something looks
> incorrect.
> 
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/neeraj.upadhyay/linux-rcu.git/commit/?h=next.14.08.24a&id=cfc22b9f1572b137dd9f36da831dd7b69c9fe352

Looks good!

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

Thanks!
Valentin Schneider Aug. 16, 2024, 10:19 a.m. UTC | #14
On 14/08/24 17:36, Neeraj Upadhyay wrote:
>
> I have updated it here [1]. Please let me know if something looks
> incorrect.
>
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/neeraj.upadhyay/linux-rcu.git/commit/?h=next.14.08.24a&id=cfc22b9f1572b137dd9f36da831dd7b69c9fe352
>
> - Neeraj

LGTM, thank you for making the changes!
diff mbox series

Patch

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 8262f57a43636..1c16a7336360f 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -38,24 +38,24 @@  EXPORT_SYMBOL_GPL(context_tracking);
 #ifdef CONFIG_CONTEXT_TRACKING_IDLE
 #define TPS(x)  tracepoint_string(x)
 
-/* Record the current task on dyntick-idle entry. */
-static __always_inline void rcu_dynticks_task_enter(void)
+/* Record the current task on exiting RCU-tasks (dyntick-idle entry). */
+static __always_inline void rcu_task_exit(void)
 {
 #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
 	WRITE_ONCE(current->rcu_tasks_idle_cpu, smp_processor_id());
 #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
 }
 
-/* Record no current task on dyntick-idle exit. */
-static __always_inline void rcu_dynticks_task_exit(void)
+/* Record no current task on entering RCU-tasks (dyntick-idle exit). */
+static __always_inline void rcu_task_enter(void)
 {
 #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
 	WRITE_ONCE(current->rcu_tasks_idle_cpu, -1);
 #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
 }
 
-/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
-static __always_inline void rcu_dynticks_task_trace_enter(void)
+/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
+static __always_inline void rcu_task_trace_exit(void)
 {
 #ifdef CONFIG_TASKS_TRACE_RCU
 	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
@@ -63,8 +63,8 @@  static __always_inline void rcu_dynticks_task_trace_enter(void)
 #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
 }
 
-/* Turn off heavyweight RCU tasks trace readers on idle/user exit. */
-static __always_inline void rcu_dynticks_task_trace_exit(void)
+/* Turn off heavyweight RCU tasks trace readers on kernel entry. */
+static __always_inline void rcu_task_trace_enter(void)
 {
 #ifdef CONFIG_TASKS_TRACE_RCU
 	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
@@ -87,7 +87,7 @@  static noinstr void ct_kernel_exit_state(int offset)
 	 * critical sections, and we also must force ordering with the
 	 * next idle sojourn.
 	 */
-	rcu_dynticks_task_trace_enter();  // Before ->dynticks update!
+	rcu_task_trace_exit();  // Before CT state update!
 	seq = ct_state_inc(offset);
 	// RCU is no longer watching.  Better be in extended quiescent state!
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING));
@@ -109,7 +109,7 @@  static noinstr void ct_kernel_enter_state(int offset)
 	 */
 	seq = ct_state_inc(offset);
 	// RCU is now watching.  Better not be in an extended quiescent state!
-	rcu_dynticks_task_trace_exit();  // After ->dynticks update!
+	rcu_task_trace_enter();  // After CT state update!
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & CT_RCU_WATCHING));
 }
 
@@ -149,7 +149,7 @@  static void noinstr ct_kernel_exit(bool user, int offset)
 	// RCU is watching here ...
 	ct_kernel_exit_state(offset);
 	// ... but is no longer watching here.
-	rcu_dynticks_task_enter();
+	rcu_task_exit();
 }
 
 /*
@@ -173,7 +173,7 @@  static void noinstr ct_kernel_enter(bool user, int offset)
 		ct->nesting++;
 		return;
 	}
-	rcu_dynticks_task_exit();
+	rcu_task_enter();
 	// RCU is not watching here ...
 	ct_kernel_enter_state(offset);
 	// ... but is watching here.
@@ -240,7 +240,7 @@  void noinstr ct_nmi_exit(void)
 	// ... but is no longer watching here.
 
 	if (!in_nmi())
-		rcu_dynticks_task_enter();
+		rcu_task_exit();
 }
 
 /**
@@ -274,7 +274,7 @@  void noinstr ct_nmi_enter(void)
 	if (rcu_dynticks_curr_cpu_in_eqs()) {
 
 		if (!in_nmi())
-			rcu_dynticks_task_exit();
+			rcu_task_enter();
 
 		// RCU is not watching here ...
 		ct_kernel_enter_state(CT_RCU_WATCHING);