
[2/2] rcu/tasks: Further comment ordering around current task snapshot on TASK-TRACE

Message ID: 20240517152303.19689-3-frederic@kernel.org
State: New
Series: rcu/tasks: Fix stale task snapshot

Commit Message

Frederic Weisbecker May 17, 2024, 3:23 p.m. UTC
Comment the current understanding of barriers and locking role around
task snapshot.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tasks.h | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

Comments

Paul E. McKenney May 20, 2024, 6:48 p.m. UTC | #1
On Fri, May 17, 2024 at 05:23:03PM +0200, Frederic Weisbecker wrote:
> Comment the current understanding of barriers and locking role around
> task snapshot.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> ---
>  kernel/rcu/tasks.h | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index 6a9ee35a282e..05413b37dd6e 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -1738,9 +1738,21 @@ static void rcu_tasks_trace_pregp_step(struct list_head *hop)
>  	for_each_online_cpu(cpu) {
>  		rcu_read_lock();
>  		/*
> -		 * RQ must be locked because no ordering exists/can be relied upon
> -		 * between rq->curr write and subsequent read sides. This ensures that
> -		 * further context switching tasks will see update side pre-GP accesses.
> +		 * RQ lock + smp_mb__after_spinlock() before reading rq->curr serve
> +		 * three purposes:
> +		 *
> +		 * 1) Ordering against previous tasks accesses (though already enforced
> +		 *    by upcoming IPIs and post-gp synchronize_rcu()).
> +		 *
> +		 * 2) Make sure not to miss latest context switch, because no ordering
> +		 *    exists/can be relied upon between rq->curr write and subsequent read
> +		 *    sides.
> +		 *
> +		 * 3) Make sure subsequent context switching tasks will see update side
> +		 *    pre-GP accesses.
> +		 *
> +		 * smp_mb() after reading rq->curr doesn't play a significant role and might
> +		 * be considered for removal in the future.
>  		 */
>  		t = cpu_curr_snapshot(cpu);
>  		if (rcu_tasks_trace_pertask_prep(t, true))

How about this for that comment?

		// Note that cpu_curr_snapshot() picks up the target
		// CPU's current task while its runqueue is locked with an
		// smp_mb__after_spinlock().  This ensures that subsequent
		// tasks running on that CPU will see the updater's pre-GP
		// accesses.  The trailng smp_mb() in cpu_curr_snapshot()
		// does not currently play a role other than simplify
		// that function's ordering semantics.  If these simplified
		// ordering semantics continue to be redundant, that smp_mb()
		// might be removed.

I left out the "ordering against previous tasks accesses" because,
as you say, this ordering is provided elsewhere.

Thoughts?

							Thanx, Paul
Frederic Weisbecker May 20, 2024, 8:41 p.m. UTC | #2
On Mon, May 20, 2024 at 11:48:54AM -0700, Paul E. McKenney wrote:
> On Fri, May 17, 2024 at 05:23:03PM +0200, Frederic Weisbecker wrote:
> > Comment the current understanding of barriers and locking role around
> > task snapshot.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > ---
> >  kernel/rcu/tasks.h | 18 +++++++++++++++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> > index 6a9ee35a282e..05413b37dd6e 100644
> > --- a/kernel/rcu/tasks.h
> > +++ b/kernel/rcu/tasks.h
> > @@ -1738,9 +1738,21 @@ static void rcu_tasks_trace_pregp_step(struct list_head *hop)
> >  	for_each_online_cpu(cpu) {
> >  		rcu_read_lock();
> >  		/*
> > -		 * RQ must be locked because no ordering exists/can be relied upon
> > -		 * between rq->curr write and subsequent read sides. This ensures that
> > -		 * further context switching tasks will see update side pre-GP accesses.
> > +		 * RQ lock + smp_mb__after_spinlock() before reading rq->curr serve
> > +		 * three purposes:
> > +		 *
> > +		 * 1) Ordering against previous tasks accesses (though already enforced
> > +		 *    by upcoming IPIs and post-gp synchronize_rcu()).
> > +		 *
> > +		 * 2) Make sure not to miss latest context switch, because no ordering
> > +		 *    exists/can be relied upon between rq->curr write and subsequent read
> > +		 *    sides.
> > +		 *
> > +		 * 3) Make sure subsequent context switching tasks will see update side
> > +		 *    pre-GP accesses.
> > +		 *
> > +		 * smp_mb() after reading rq->curr doesn't play a significant role and might
> > +		 * be considered for removal in the future.
> >  		 */
> >  		t = cpu_curr_snapshot(cpu);
> >  		if (rcu_tasks_trace_pertask_prep(t, true))
> 
> How about this for that comment?
> 
> 		// Note that cpu_curr_snapshot() picks up the target
> 		// CPU's current task while its runqueue is locked with an
> 		// smp_mb__after_spinlock().  This ensures that subsequent
> 		// tasks running on that CPU will see the updater's pre-GP
> 		// accesses.

Right, but to achieve that, the smp_mb() was already enough, courtesy of
the official full barrier in schedule() that (this one at least) we could rely on:

Updater             Reader
------             -------
X = 1              rq->curr = A
                   // another context switch later
smp_mb()           smp_mb__after_spinlock() // right after rq_lock on __schedule()
READ rq->curr      rq->curr = B
                   READ X

If the updater misses A, B will see the update on X.

So I think we still need to justify the rq locking in the comments.
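
For reference, here is roughly the shape cpu_curr_snapshot() takes after the
first patch in this series (a simplified sketch, shape only; the exact code in
the tree may differ):

	struct task_struct *cpu_curr_snapshot(int cpu)
	{
		struct rq *rq = cpu_rq(cpu);
		struct task_struct *t;
		struct rq_flags rf;

		rq_lock_irqsave(rq, &rf);
		smp_mb__after_spinlock();	/* Pairs with the smp_mb__after_spinlock() in __schedule(). */
		t = rcu_dereference(rq->curr);
		rq_unlock_irqrestore(rq, &rf);
		smp_mb();			/* The trailing barrier discussed just below. */

		return t;
	}

The rq lock + smp_mb__after_spinlock() before the rq->curr read are what the
comment needs to justify; the final smp_mb() is the one that might eventually
go away.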

>                          The trailng smp_mb() in cpu_curr_snapshot()
> 		// does not currently play a role other than simplify
> 		// that function's ordering semantics.  If these simplified
> 		// ordering semantics continue to be redundant, that smp_mb()
> 		// might be removed.

That looks good.

> 
> I left out the "ordering against previous tasks accesses" because,
> as you say, this ordering is provided elsewhere.

Right!

Thanks.
Paul E. McKenney May 20, 2024, 11:25 p.m. UTC | #3
On Mon, May 20, 2024 at 10:41:52PM +0200, Frederic Weisbecker wrote:
> On Mon, May 20, 2024 at 11:48:54AM -0700, Paul E. McKenney wrote:
> > On Fri, May 17, 2024 at 05:23:03PM +0200, Frederic Weisbecker wrote:
> > > Comment the current understanding of barriers and locking role around
> > > task snapshot.
> > > 
> > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > > ---
> > >  kernel/rcu/tasks.h | 18 +++++++++++++++---
> > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> > > index 6a9ee35a282e..05413b37dd6e 100644
> > > --- a/kernel/rcu/tasks.h
> > > +++ b/kernel/rcu/tasks.h
> > > @@ -1738,9 +1738,21 @@ static void rcu_tasks_trace_pregp_step(struct list_head *hop)
> > >  	for_each_online_cpu(cpu) {
> > >  		rcu_read_lock();
> > >  		/*
> > > -		 * RQ must be locked because no ordering exists/can be relied upon
> > > -		 * between rq->curr write and subsequent read sides. This ensures that
> > > -		 * further context switching tasks will see update side pre-GP accesses.
> > > +		 * RQ lock + smp_mb__after_spinlock() before reading rq->curr serve
> > > +		 * three purposes:
> > > +		 *
> > > +		 * 1) Ordering against previous tasks accesses (though already enforced
> > > +		 *    by upcoming IPIs and post-gp synchronize_rcu()).
> > > +		 *
> > > +		 * 2) Make sure not to miss latest context switch, because no ordering
> > > +		 *    exists/can be relied upon between rq->curr write and subsequent read
> > > +		 *    sides.
> > > +		 *
> > > +		 * 3) Make sure subsequent context switching tasks will see update side
> > > +		 *    pre-GP accesses.
> > > +		 *
> > > +		 * smp_mb() after reading rq->curr doesn't play a significant role and might
> > > +		 * be considered for removal in the future.
> > >  		 */
> > >  		t = cpu_curr_snapshot(cpu);
> > >  		if (rcu_tasks_trace_pertask_prep(t, true))
> > 
> > How about this for that comment?
> > 
> > 		// Note that cpu_curr_snapshot() picks up the target
> > 		// CPU's current task while its runqueue is locked with an
> > 		// smp_mb__after_spinlock().  This ensures that subsequent
> > 		// tasks running on that CPU will see the updater's pre-GP
> > 		// accesses.
> 
> Right, but to achieve that, the smp_mb() was already enough, courtesy of
> the official full barrier in schedule() that (this one at least) we could rely on:
> 
> Updater             Reader
> ------             -------
> X = 1              rq->curr = A
>                    // another context switch later
> smp_mb()           smp_mb__after_spinlock() // right after rq_lock on __schedule()
> READ rq->curr      rq->curr = B
>                    READ X
> 
> If the updater misses A, B will see the update on X.
> 
> So I think we still need to justify the rq locking in the comments.
> 
> >                          The trailng smp_mb() in cpu_curr_snapshot()
> > 		// does not currently play a role other than simplify
> > 		// that function's ordering semantics.  If these simplified
> > 		// ordering semantics continue to be redundant, that smp_mb()
> > 		// might be removed.
> 
> That looks good.
> 
> > 
> > I left out the "ordering against previous tasks accesses" because,
> > as you say, this ordering is provided elsewhere.
> 
> Right!

Good points!  How about the following?

		// Note that cpu_curr_snapshot() picks up the target
		// CPU's current task while its runqueue is locked with
		// an smp_mb__after_spinlock().  This ensures that either
		// the grace-period kthread will see that task's read-side
		// critical section or the task will see the updater's pre-GP
		// accesses.  The trailng smp_mb() in cpu_curr_snapshot()
		// does not currently play a role other than simplify
		// that function's ordering semantics.  If these simplified
		// ordering semantics continue to be redundant, that smp_mb()
		// might be removed.

Keeping in mind that the commit's log fully lays out the troublesome
scenario.

							Thanx, Paul
Frederic Weisbecker May 21, 2024, 9:59 a.m. UTC | #4
On Mon, May 20, 2024 at 04:25:33PM -0700, Paul E. McKenney wrote:
> Good points!  How about the following?
> 
> 		// Note that cpu_curr_snapshot() picks up the target
> 		// CPU's current task while its runqueue is locked with
> 		// an smp_mb__after_spinlock().  This ensures that either
> 		// the grace-period kthread will see that task's read-side
> 		// critical section or the task will see the updater's pre-GP
> 		// accesses.  The trailng smp_mb() in cpu_curr_snapshot()

*trailing

> 		// does not currently play a role other than simplify
> 		// that function's ordering semantics.  If these simplified
> 		// ordering semantics continue to be redundant, that smp_mb()
> 		// might be removed.
> 
> Keeping in mind that the commit's log fully lays out the troublesome
> scenario.

Yep, looks very good!

Thanks!

> 
> 							Thanx, Paul
>
Paul E. McKenney May 21, 2024, 1:38 p.m. UTC | #5
On Tue, May 21, 2024 at 11:59:57AM +0200, Frederic Weisbecker wrote:
> On Mon, May 20, 2024 at 04:25:33PM -0700, Paul E. McKenney wrote:
> > Good points!  How about the following?
> > 
> > 		// Note that cpu_curr_snapshot() picks up the target
> > 		// CPU's current task while its runqueue is locked with
> > 		// an smp_mb__after_spinlock().  This ensures that either
> > 		// the grace-period kthread will see that task's read-side
> > 		// critical section or the task will see the updater's pre-GP
> > 		// accesses.  The trailng smp_mb() in cpu_curr_snapshot()
> 
> *trailing

Good catch!

> > 		// does not currently play a role other than simplify
> > 		// that function's ordering semantics.  If these simplified
> > 		// ordering semantics continue to be redundant, that smp_mb()
> > 		// might be removed.
> > 
> > Keeping in mind that the commit's log fully lays out the troublesome
> > scenario.
> 
> Yep, looks very good!
> 
> Thanks!

Very good, I will fold this in on my next rebase.

							Thanx, Paul

Patch

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 6a9ee35a282e..05413b37dd6e 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1738,9 +1738,21 @@  static void rcu_tasks_trace_pregp_step(struct list_head *hop)
 	for_each_online_cpu(cpu) {
 		rcu_read_lock();
 		/*
-		 * RQ must be locked because no ordering exists/can be relied upon
-		 * between rq->curr write and subsequent read sides. This ensures that
-		 * further context switching tasks will see update side pre-GP accesses.
+		 * RQ lock + smp_mb__after_spinlock() before reading rq->curr serve
+		 * three purposes:
+		 *
+		 * 1) Ordering against previous tasks accesses (though already enforced
+		 *    by upcoming IPIs and post-gp synchronize_rcu()).
+		 *
+		 * 2) Make sure not to miss latest context switch, because no ordering
+		 *    exists/can be relied upon between rq->curr write and subsequent read
+		 *    sides.
+		 *
+		 * 3) Make sure subsequent context switching tasks will see update side
+		 *    pre-GP accesses.
+		 *
+		 * smp_mb() after reading rq->curr doesn't play a significant role and might
+		 * be considered for removal in the future.
 		 */
 		t = cpu_curr_snapshot(cpu);
 		if (rcu_tasks_trace_pertask_prep(t, true))