
[3/3] srcu: Explain why callback invocations can't run concurrently

Message ID 20231212174817.11919-3-neeraj.iitr10@gmail.com (mailing list archive)
State Mainlined
Commit c21357e4461f3f9c8ff93302906b5372411ee108
Series SRCU updates for v6.8

Commit Message

Neeraj Upadhyay (AMD) Dec. 12, 2023, 5:48 p.m. UTC
From: Frederic Weisbecker <frederic@kernel.org>

If an SRCU barrier is queued while callbacks are running and a new
callback-invocation work item for the same sdp were to run concurrently,
the SRCU barrier might execute too early. As this requirement is
non-obvious, record it in a comment.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>
---
 kernel/rcu/srcutree.c | 6 ++++++
 1 file changed, 6 insertions(+)
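The guard this patch documents can be sketched as a standalone model: a per-sdp "invoking" flag, tested and set under the sdp lock, lets exactly one worker drain ready callbacks at a time. Names here (`sdp_model`, `invoke_callbacks`) are illustrative, not kernel APIs, and locking is shown only as comments:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of srcu_invoke_callbacks()'s mutual-exclusion flag.
 * Not kernel code: fields and names are invented for illustration. */
struct sdp_model {
	bool cblist_invoking;	/* models sdp->srcu_cblist_invoking */
	int ready_cbs;		/* callbacks ready to be invoked */
	int invoked;		/* total callbacks invoked so far */
};

/* Returns how many callbacks this worker invoked; 0 means another
 * worker already holds the invocation role (or nothing was ready). */
static int invoke_callbacks(struct sdp_model *sdp)
{
	int n = 0;

	/* spin_lock_irq_rcu_node(sdp); */
	if (sdp->cblist_invoking || sdp->ready_cbs == 0) {
		/* spin_unlock_irq_rcu_node(sdp);
		 * Bail out: concurrent invocation is disallowed. */
		return 0;
	}
	sdp->cblist_invoking = true;
	/* spin_unlock_irq_rcu_node(sdp); callbacks run lockless. */

	while (sdp->ready_cbs > 0) {
		sdp->ready_cbs--;
		sdp->invoked++;
		n++;
	}

	/* spin_lock_irq_rcu_node(sdp); */
	sdp->cblist_invoking = false;
	/* spin_unlock_irq_rcu_node(sdp); */
	return n;
}
```

A second worker observing `cblist_invoking == true` simply returns; the first worker re-checks for newly ready callbacks (the `more` test in the real function) and reschedules itself if needed.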

Comments

Joel Fernandes Dec. 13, 2023, 2:27 p.m. UTC | #1
On Tue, Dec 12, 2023 at 12:48 PM Neeraj Upadhyay (AMD)
<neeraj.iitr10@gmail.com> wrote:
>
> From: Frederic Weisbecker <frederic@kernel.org>
>
> If an SRCU barrier is queued while callbacks are running and a new
> callbacks invocator for the same sdp were to run concurrently, the
> RCU barrier might execute too early. As this requirement is non-obvious,
> make sure to keep a record.
>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>
> ---
>  kernel/rcu/srcutree.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 2bfc8ed1eed2..0351a4e83529 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1715,6 +1715,11 @@ static void srcu_invoke_callbacks(struct work_struct *work)
>         WARN_ON_ONCE(!rcu_segcblist_segempty(&sdp->srcu_cblist, RCU_NEXT_TAIL));
>         rcu_segcblist_advance(&sdp->srcu_cblist,
>                               rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> +       /*
> +        * Although this function is theoretically re-entrant, concurrent
> +        * callbacks invocation is disallowed to avoid executing an SRCU barrier
> +        * too early.
> +        */

Side comment:
I guess even without the barrier reasoning, it is best not to allow
concurrent CB execution anyway since it diverges from the behavior of
straight RCU :)

  - Joel
Paul E. McKenney Dec. 13, 2023, 5:52 p.m. UTC | #2
On Wed, Dec 13, 2023 at 09:27:09AM -0500, Joel Fernandes wrote:
> On Tue, Dec 12, 2023 at 12:48 PM Neeraj Upadhyay (AMD)
> <neeraj.iitr10@gmail.com> wrote:
> >
> > [...]
> > +       /*
> > +        * Although this function is theoretically re-entrant, concurrent
> > +        * callbacks invocation is disallowed to avoid executing an SRCU barrier
> > +        * too early.
> > +        */
> 
> Side comment:
> I guess even without the barrier reasoning, it is best not to allow
> concurrent CB execution anyway since it diverges from the behavior of
> straight RCU :)

Good point!

But please do not forget item 12 on the list in checklist.rst.  ;-)
(Which I just updated to include the other call_rcu*() functions.)

							Thanx, Paul
Joel Fernandes Dec. 13, 2023, 6:35 p.m. UTC | #3
On Wed, Dec 13, 2023 at 12:52 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Wed, Dec 13, 2023 at 09:27:09AM -0500, Joel Fernandes wrote:
> > On Tue, Dec 12, 2023 at 12:48 PM Neeraj Upadhyay (AMD)
> > <neeraj.iitr10@gmail.com> wrote:
> > >
> > > [...]
> >
> > Side comment:
> > I guess even without the barrier reasoning, it is best not to allow
> > concurrent CB execution anyway since it diverges from the behavior of
> > straight RCU :)
>
> Good point!
>
> But please do not forget item 12 on the list in checklist.rst.  ;-)
> (Which I just updated to include the other call_rcu*() functions.)

I think this applies more to recent kernels (with the dynamic nocb
switch) than to older kernels, right? I haven't kept up with the
checklist recently (which is my bad).

My understanding comes from the fact that the RCU barrier depends on
callbacks on the same CPU executing in order with straight RCU
otherwise it breaks. Hence my comment. But as you pointed out, that's
outdated knowledge.

I should just shut up and hide in shame now.

:-/

 - Joel
Paul E. McKenney Dec. 13, 2023, 6:55 p.m. UTC | #4
On Wed, Dec 13, 2023 at 01:35:22PM -0500, Joel Fernandes wrote:
> On Wed, Dec 13, 2023 at 12:52 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Wed, Dec 13, 2023 at 09:27:09AM -0500, Joel Fernandes wrote:
> > > On Tue, Dec 12, 2023 at 12:48 PM Neeraj Upadhyay (AMD)
> > > <neeraj.iitr10@gmail.com> wrote:
> > > >
> > > > [...]
> > >
> > > Side comment:
> > > I guess even without the barrier reasoning, it is best not to allow
> > > concurrent CB execution anyway since it diverges from the behavior of
> > > straight RCU :)
> >
> > Good point!
> >
> > But please do not forget item 12 on the list in checklist.rst.  ;-)
> > (Which I just updated to include the other call_rcu*() functions.)
> 
> I think this is more so now with recent kernels (with the dynamic nocb
> switch) than with older kernels right? I haven't kept up with the
> checklist recently (which is my bad).

You are quite correct!  But even before this, I was saying that
lack of same-CPU callback concurrency was an accident of the current
implementation rather than a guarantee.  For example, there might come
a time when RCU needs to respond to callback flooding with concurrent
execution of the flooded CPU's callbacks.  Or not, but we do need to
keep this option open.

> My understanding comes from the fact that the RCU barrier depends on
> callbacks on the same CPU executing in order with straight RCU
> otherwise it breaks. Hence my comment. But as you pointed out, that's
> outdated knowledge.

That is still one motivation for ordered execution of callbacks.  For the
dynamic nocb switch, we could have chosen to make rcu_barrier() place
a callback on both lists, but we instead chose to exclude rcu_barrier()
calls during the switch.

> I should just shut up and hide in shame now.

No need for that!  After all, one motivation for Requirements.rst was
to help me keep track of all this stuff.

							Thanx, Paul

> :-/
> 
>  - Joel
Joel Fernandes Dec. 13, 2023, 7:01 p.m. UTC | #5
On Wed, Dec 13, 2023 at 1:55 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Wed, Dec 13, 2023 at 01:35:22PM -0500, Joel Fernandes wrote:
> > On Wed, Dec 13, 2023 at 12:52 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Wed, Dec 13, 2023 at 09:27:09AM -0500, Joel Fernandes wrote:
> > > > On Tue, Dec 12, 2023 at 12:48 PM Neeraj Upadhyay (AMD)
> > > > <neeraj.iitr10@gmail.com> wrote:
> > > > >
> > > > > [...]
> > > >
> > > > Side comment:
> > > > I guess even without the barrier reasoning, it is best not to allow
> > > > concurrent CB execution anyway since it diverges from the behavior of
> > > > straight RCU :)
> > >
> > > Good point!
> > >
> > > But please do not forget item 12 on the list in checklist.rst.  ;-)
> > > (Which I just updated to include the other call_rcu*() functions.)
> >
> > I think this is more so now with recent kernels (with the dynamic nocb
> > switch) than with older kernels right? I haven't kept up with the
> > checklist recently (which is my bad).
>
> You are quite correct!  But even before this, I was saying that
> lack of same-CPU callback concurrency was an accident of the current
> implementation rather than a guarantee.  For example, there might come
> a time when RCU needs to respond to callback flooding with concurrent
> execution of the flooded CPU's callbacks.  Or not, but we do need to
> keep this option open.

Got it, reminds me to focus on requirements as well along with implementation.

> > My understanding comes from the fact that the RCU barrier depends on
> > callbacks on the same CPU executing in order with straight RCU
> > otherwise it breaks. Hence my comment. But as you pointed out, that's
> > outdated knowledge.
>
> That is still one motivation for ordered execution of callbacks.  For the
> dynamic nocb switch, we could have chosen to make rcu_barrier() place
> a callback on both lists, but we instead chose to exclude rcu_barrier()
> calls during the switch.

Right!

> > I should just shut up and hide in shame now.
>
> No need for that!  After all, one motivation for Requirements.rst was
> to help me keep track of all this stuff.

Thanks!

 - Joel

Patch

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 2bfc8ed1eed2..0351a4e83529 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1715,6 +1715,11 @@  static void srcu_invoke_callbacks(struct work_struct *work)
 	WARN_ON_ONCE(!rcu_segcblist_segempty(&sdp->srcu_cblist, RCU_NEXT_TAIL));
 	rcu_segcblist_advance(&sdp->srcu_cblist,
 			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
+	/*
+	 * Although this function is theoretically re-entrant, concurrent
+	 * callbacks invocation is disallowed to avoid executing an SRCU barrier
+	 * too early.
+	 */
 	if (sdp->srcu_cblist_invoking ||
 	    !rcu_segcblist_ready_cbs(&sdp->srcu_cblist)) {
 		spin_unlock_irq_rcu_node(sdp);
@@ -1745,6 +1750,7 @@  static void srcu_invoke_callbacks(struct work_struct *work)
 	sdp->srcu_cblist_invoking = false;
 	more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist);
 	spin_unlock_irq_rcu_node(sdp);
+	/* An SRCU barrier callback or callbacks from a previous round of work may still be pending. */
 	if (more)
 		srcu_schedule_cbs_sdp(sdp, 0);
 }
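Why single-worker, front-to-back invocation protects srcu_barrier() can be shown with a toy model (purely illustrative; `BARRIER_CB`, `run_in_order`, and `run_tail_concurrently` are not kernel names): the barrier callback is enqueued behind all earlier callbacks on the sdp, so in-order invocation necessarily runs it last, whereas a hypothetical second worker starting at the tail of the list would fire the barrier while earlier callbacks are still outstanding.

```c
#include <assert.h>

/* Toy model of an sdp callback list: positive entries are ordinary
 * callbacks, BARRIER_CB marks the srcu_barrier() callback at the tail. */
#define BARRIER_CB (-1)

/* Single worker, strict FIFO: returns how many ordinary callbacks had
 * completed by the time the barrier callback fired. */
static int run_in_order(const int *cbs, int n)
{
	int completed = 0;

	for (int i = 0; i < n; i++) {
		if (cbs[i] == BARRIER_CB)
			return completed; /* all earlier callbacks done */
		completed++;
	}
	return completed;
}

/* Hypothetical second worker that starts at 'split' while the first
 * worker is still running cbs[0..split-1]: it reaches the barrier
 * callback immediately, with this count of callbacks completed by it. */
static int run_tail_concurrently(const int *cbs, int n, int split)
{
	int completed = 0;

	for (int i = split; i < n; i++) {
		if (cbs[i] == BARRIER_CB)
			return completed; /* barrier fires too early */
		completed++;
	}
	return completed;
}
```

With three callbacks ahead of the barrier, the in-order worker fires the barrier only after all three complete, while the concurrent tail worker fires it after zero, which is exactly the "SRCU barrier might execute too early" hazard the new comment records.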