Message ID | 20250219124309.463702-1-joelagnelf@nvidia.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2,-rcu] srcu: Use rcu_seq_done_exact() for polling API | expand |
On Wed, Feb 19, 2025 at 07:43:08AM -0500, Joel Fernandes wrote: > poll_state_synchronize_srcu() uses rcu_seq_done() unlike > poll_state_synchronize_rcu() which uses rcu_seq_done_exact(). > > The rcu_seq_done_exact() makes more sense for polling API, as with > this API, there is a higher chance that there is a significant delay > between the get_state..() and poll_state..() calls since a cookie > can be stored and reused at a later time. During such a delay, if > the gp_seq counter progresses more than ULONG_MAX/2 distance, then > poll_state..() may return false for a long time unwantedly. > > Fix by using the more accurate rcu_seq_done_exact() API which is > exactly what straight RCU's polling does. > > It may make sense, as future work, to add debug code here as well, where > we compare a physical timestamp between get_state..() and poll_state() > calls and yell if significant time has past but the grace period has > still not progressed. > > Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> > Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> But we should also run this by Kent Overstreet, given that bcachefs uses this. Should be OK, given that bcachefs uses this API in the same way that it does poll_state_synchronize_rcu(), but still... Thanx, Paul > --- > v1->v2: Resend with Neeraj review tag. Rebased on rcu/dev. > > kernel/rcu/srcutree.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > index d2a694944553..591371d62e89 100644 > --- a/kernel/rcu/srcutree.c > +++ b/kernel/rcu/srcutree.c > @@ -1589,7 +1589,7 @@ EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu); > bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie) > { > if (cookie != SRCU_GET_STATE_COMPLETED && > - !rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, cookie)) > + !rcu_seq_done_exact(&ssp->srcu_sup->srcu_gp_seq, cookie)) > return false; > // Ensure that the end of the SRCU grace period happens before > // any subsequent code that the caller might execute. > -- > 2.34.1 >
On 2/19/2025 8:22 AM, Paul E. McKenney wrote: > On Wed, Feb 19, 2025 at 07:43:08AM -0500, Joel Fernandes wrote: >> poll_state_synchronize_srcu() uses rcu_seq_done() unlike >> poll_state_synchronize_rcu() which uses rcu_seq_done_exact(). >> >> The rcu_seq_done_exact() makes more sense for polling API, as with >> this API, there is a higher chance that there is a significant delay >> between the get_state..() and poll_state..() calls since a cookie >> can be stored and reused at a later time. During such a delay, if >> the gp_seq counter progresses more than ULONG_MAX/2 distance, then >> poll_state..() may return false for a long time unwantedly. >> >> Fix by using the more accurate rcu_seq_done_exact() API which is >> exactly what straight RCU's polling does. >> >> It may make sense, as future work, to add debug code here as well, where >> we compare a physical timestamp between get_state..() and poll_state() >> calls and yell if significant time has past but the grace period has >> still not progressed. >> >> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> >> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> > > Reviewed-by: Paul E. McKenney <paulmck@kernel.org> > > But we should also run this by Kent Overstreet, given that bcachefs > uses this. Should be OK, given that bcachefs uses this API in the same > way that it does poll_state_synchronize_rcu(), but still... Thanks Paul! Adding Kent Overstreet to the email to raise any objections. thanks, - Joel > > Thanx, Paul > >> --- >> v1->v2: Resend with Neeraj review tag. Rebased on rcu/dev. >> >> kernel/rcu/srcutree.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c >> index d2a694944553..591371d62e89 100644 >> --- a/kernel/rcu/srcutree.c >> +++ b/kernel/rcu/srcutree.c >> @@ -1589,7 +1589,7 @@ EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu); >> bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie) >> { >> if (cookie != SRCU_GET_STATE_COMPLETED && >> - !rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, cookie)) >> + !rcu_seq_done_exact(&ssp->srcu_sup->srcu_gp_seq, cookie)) >> return false; >> // Ensure that the end of the SRCU grace period happens before >> // any subsequent code that the caller might execute. >> -- >> 2.34.1 >>
On Wed, Feb 19, 2025 at 08:29:47AM -0500, Joel Fernandes wrote: > > > On 2/19/2025 8:22 AM, Paul E. McKenney wrote: > > On Wed, Feb 19, 2025 at 07:43:08AM -0500, Joel Fernandes wrote: > >> poll_state_synchronize_srcu() uses rcu_seq_done() unlike > >> poll_state_synchronize_rcu() which uses rcu_seq_done_exact(). > >> > >> The rcu_seq_done_exact() makes more sense for polling API, as with > >> this API, there is a higher chance that there is a significant delay > >> between the get_state..() and poll_state..() calls since a cookie > >> can be stored and reused at a later time. During such a delay, if > >> the gp_seq counter progresses more than ULONG_MAX/2 distance, then > >> poll_state..() may return false for a long time unwantedly. > >> > >> Fix by using the more accurate rcu_seq_done_exact() API which is > >> exactly what straight RCU's polling does. > >> > >> It may make sense, as future work, to add debug code here as well, where > >> we compare a physical timestamp between get_state..() and poll_state() > >> calls and yell if significant time has past but the grace period has > >> still not progressed. > >> > >> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> > >> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> > > > > Reviewed-by: Paul E. McKenney <paulmck@kernel.org> > > > > But we should also run this by Kent Overstreet, given that bcachefs > > uses this. Should be OK, given that bcachefs uses this API in the same > > way that it does poll_state_synchronize_rcu(), but still... > > Thanks Paul! Adding Kent Overstreet to the email to raise any objections. It sounds like rcu_done_exact() is indeed what we want - bcachefs uses this for determining when objects may be reclaimed (as is typical with rcu), so we don't want objects to be stranded a "significant time past the grace period". Is there any additional cost? I'm not seeing rcu_done_exact() in Linus's tree yet. Minor additional overhead would be totally fine; we use this from fs/bcachefs/rcu_pending.c which doesn't call it for each object.
On 2/19/2025 4:07 PM, Kent Overstreet wrote: > On Wed, Feb 19, 2025 at 08:29:47AM -0500, Joel Fernandes wrote: >> >> >> On 2/19/2025 8:22 AM, Paul E. McKenney wrote: >>> On Wed, Feb 19, 2025 at 07:43:08AM -0500, Joel Fernandes wrote: >>>> poll_state_synchronize_srcu() uses rcu_seq_done() unlike >>>> poll_state_synchronize_rcu() which uses rcu_seq_done_exact(). >>>> >>>> The rcu_seq_done_exact() makes more sense for polling API, as with >>>> this API, there is a higher chance that there is a significant delay >>>> between the get_state..() and poll_state..() calls since a cookie >>>> can be stored and reused at a later time. During such a delay, if >>>> the gp_seq counter progresses more than ULONG_MAX/2 distance, then >>>> poll_state..() may return false for a long time unwantedly. >>>> >>>> Fix by using the more accurate rcu_seq_done_exact() API which is >>>> exactly what straight RCU's polling does. >>>> >>>> It may make sense, as future work, to add debug code here as well, where >>>> we compare a physical timestamp between get_state..() and poll_state() >>>> calls and yell if significant time has past but the grace period has >>>> still not progressed. >>>> >>>> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> >>>> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> >>> >>> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> >>> >>> But we should also run this by Kent Overstreet, given that bcachefs >>> uses this. Should be OK, given that bcachefs uses this API in the same >>> way that it does poll_state_synchronize_rcu(), but still... >> >> Thanks Paul! Adding Kent Overstreet to the email to raise any objections. > > It sounds like rcu_done_exact() is indeed what we want - bcachefs uses > this for determining when objects may be reclaimed (as is typical with > rcu), so we don't want objects to be stranded a "significant time past > the grace period". Thanks for confirming. May I add your Reviewed-by tag as well? > > Is there any additional cost? I'm not seeing rcu_done_exact() in Linus's > tree yet. Minor additional overhead would be totally fine; we use this > from fs/bcachefs/rcu_pending.c which doesn't call it for each object. The additional overhead should be minimal. rcu_seq_done_exact() is the function. thanks, -Joel
On Wed, Feb 19, 2025 at 06:45:07PM -0500, Joel Fernandes wrote: > > > On 2/19/2025 4:07 PM, Kent Overstreet wrote: > > On Wed, Feb 19, 2025 at 08:29:47AM -0500, Joel Fernandes wrote: > >> > >> > >> On 2/19/2025 8:22 AM, Paul E. McKenney wrote: > >>> On Wed, Feb 19, 2025 at 07:43:08AM -0500, Joel Fernandes wrote: > >>>> poll_state_synchronize_srcu() uses rcu_seq_done() unlike > >>>> poll_state_synchronize_rcu() which uses rcu_seq_done_exact(). > >>>> > >>>> The rcu_seq_done_exact() makes more sense for polling API, as with > >>>> this API, there is a higher chance that there is a significant delay > >>>> between the get_state..() and poll_state..() calls since a cookie > >>>> can be stored and reused at a later time. During such a delay, if > >>>> the gp_seq counter progresses more than ULONG_MAX/2 distance, then > >>>> poll_state..() may return false for a long time unwantedly. > >>>> > >>>> Fix by using the more accurate rcu_seq_done_exact() API which is > >>>> exactly what straight RCU's polling does. > >>>> > >>>> It may make sense, as future work, to add debug code here as well, where > >>>> we compare a physical timestamp between get_state..() and poll_state() > >>>> calls and yell if significant time has past but the grace period has > >>>> still not progressed. > >>>> > >>>> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> > >>>> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> > >>> > >>> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> > >>> > >>> But we should also run this by Kent Overstreet, given that bcachefs > >>> uses this. Should be OK, given that bcachefs uses this API in the same > >>> way that it does poll_state_synchronize_rcu(), but still... > >> > >> Thanks Paul! Adding Kent Overstreet to the email to raise any objections. > > > > It sounds like rcu_done_exact() is indeed what we want - bcachefs uses > > this for determining when objects may be reclaimed (as is typical with > > rcu), so we don't want objects to be stranded a "significant time past > > the grace period". > > Thanks for confirming. May I add your Reviewed-by tag as well? Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index d2a694944553..591371d62e89 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -1589,7 +1589,7 @@ EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu); bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie) { if (cookie != SRCU_GET_STATE_COMPLETED && - !rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, cookie)) + !rcu_seq_done_exact(&ssp->srcu_sup->srcu_gp_seq, cookie)) return false; // Ensure that the end of the SRCU grace period happens before // any subsequent code that the caller might execute.