diff mbox series

[v2,-rcu] srcu: Use rcu_seq_done_exact() for polling API

Message ID 20250219124309.463702-1-joelagnelf@nvidia.com (mailing list archive)
State New
Headers show
Series [v2,-rcu] srcu: Use rcu_seq_done_exact() for polling API | expand

Commit Message

Joel Fernandes Feb. 19, 2025, 12:43 p.m. UTC
poll_state_synchronize_srcu() uses rcu_seq_done() unlike
poll_state_synchronize_rcu() which uses rcu_seq_done_exact().

The  rcu_seq_done_exact() makes more sense for polling API, as with
this API, there is a higher chance that there is a significant delay
between the get_state..() and poll_state..() calls since a cookie
can be stored and reused at a later time. During such a delay, if
the gp_seq counter progresses more than ULONG_MAX/2 distance, then
poll_state..() may return false for a long time unwantedly.

Fix by using the more accurate rcu_seq_done_exact() API which is
exactly what straight RCU's polling does.

It may make sense, as future work, to add debug code here as well, where
we compare a physical timestamp between get_state..() and poll_state()
calls and yell if significant time has past but the grace period has
still not progressed.

Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
---
v1->v2: Resend with Neeraj review tag. Rebased on rcu/dev.

 kernel/rcu/srcutree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Paul E. McKenney Feb. 19, 2025, 1:22 p.m. UTC | #1
On Wed, Feb 19, 2025 at 07:43:08AM -0500, Joel Fernandes wrote:
> poll_state_synchronize_srcu() uses rcu_seq_done() unlike
> poll_state_synchronize_rcu() which uses rcu_seq_done_exact().
> 
> The  rcu_seq_done_exact() makes more sense for polling API, as with
> this API, there is a higher chance that there is a significant delay
> between the get_state..() and poll_state..() calls since a cookie
> can be stored and reused at a later time. During such a delay, if
> the gp_seq counter progresses more than ULONG_MAX/2 distance, then
> poll_state..() may return false for a long time unwantedly.
> 
> Fix by using the more accurate rcu_seq_done_exact() API which is
> exactly what straight RCU's polling does.
> 
> It may make sense, as future work, to add debug code here as well, where
> we compare a physical timestamp between get_state..() and poll_state()
> calls and yell if significant time has past but the grace period has
> still not progressed.
> 
> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

But we should also run this by Kent Overstreet, given that bcachefs
uses this.  Should be OK, given that bcachefs uses this API in the same
way that it does poll_state_synchronize_rcu(), but still...

							Thanx, Paul

> ---
> v1->v2: Resend with Neeraj review tag. Rebased on rcu/dev.
> 
>  kernel/rcu/srcutree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index d2a694944553..591371d62e89 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1589,7 +1589,7 @@ EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu);
>  bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie)
>  {
>  	if (cookie != SRCU_GET_STATE_COMPLETED &&
> -	    !rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, cookie))
> +	    !rcu_seq_done_exact(&ssp->srcu_sup->srcu_gp_seq, cookie))
>  		return false;
>  	// Ensure that the end of the SRCU grace period happens before
>  	// any subsequent code that the caller might execute.
> -- 
> 2.34.1
>
Joel Fernandes Feb. 19, 2025, 1:29 p.m. UTC | #2
On 2/19/2025 8:22 AM, Paul E. McKenney wrote:
> On Wed, Feb 19, 2025 at 07:43:08AM -0500, Joel Fernandes wrote:
>> poll_state_synchronize_srcu() uses rcu_seq_done() unlike
>> poll_state_synchronize_rcu() which uses rcu_seq_done_exact().
>>
>> The  rcu_seq_done_exact() makes more sense for polling API, as with
>> this API, there is a higher chance that there is a significant delay
>> between the get_state..() and poll_state..() calls since a cookie
>> can be stored and reused at a later time. During such a delay, if
>> the gp_seq counter progresses more than ULONG_MAX/2 distance, then
>> poll_state..() may return false for a long time unwantedly.
>>
>> Fix by using the more accurate rcu_seq_done_exact() API which is
>> exactly what straight RCU's polling does.
>>
>> It may make sense, as future work, to add debug code here as well, where
>> we compare a physical timestamp between get_state..() and poll_state()
>> calls and yell if significant time has past but the grace period has
>> still not progressed.
>>
>> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
>> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> 
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> 
> But we should also run this by Kent Overstreet, given that bcachefs
> uses this.  Should be OK, given that bcachefs uses this API in the same
> way that it does poll_state_synchronize_rcu(), but still...

Thanks Paul!  Adding Kent Overstreet to the email to raise any objections.

thanks,

 - Joel

> 
> 							Thanx, Paul
> 
>> ---
>> v1->v2: Resend with Neeraj review tag. Rebased on rcu/dev.
>>
>>  kernel/rcu/srcutree.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
>> index d2a694944553..591371d62e89 100644
>> --- a/kernel/rcu/srcutree.c
>> +++ b/kernel/rcu/srcutree.c
>> @@ -1589,7 +1589,7 @@ EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu);
>>  bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie)
>>  {
>>  	if (cookie != SRCU_GET_STATE_COMPLETED &&
>> -	    !rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, cookie))
>> +	    !rcu_seq_done_exact(&ssp->srcu_sup->srcu_gp_seq, cookie))
>>  		return false;
>>  	// Ensure that the end of the SRCU grace period happens before
>>  	// any subsequent code that the caller might execute.
>> -- 
>> 2.34.1
>>
Kent Overstreet Feb. 19, 2025, 9:07 p.m. UTC | #3
On Wed, Feb 19, 2025 at 08:29:47AM -0500, Joel Fernandes wrote:
> 
> 
> On 2/19/2025 8:22 AM, Paul E. McKenney wrote:
> > On Wed, Feb 19, 2025 at 07:43:08AM -0500, Joel Fernandes wrote:
> >> poll_state_synchronize_srcu() uses rcu_seq_done() unlike
> >> poll_state_synchronize_rcu() which uses rcu_seq_done_exact().
> >>
> >> The  rcu_seq_done_exact() makes more sense for polling API, as with
> >> this API, there is a higher chance that there is a significant delay
> >> between the get_state..() and poll_state..() calls since a cookie
> >> can be stored and reused at a later time. During such a delay, if
> >> the gp_seq counter progresses more than ULONG_MAX/2 distance, then
> >> poll_state..() may return false for a long time unwantedly.
> >>
> >> Fix by using the more accurate rcu_seq_done_exact() API which is
> >> exactly what straight RCU's polling does.
> >>
> >> It may make sense, as future work, to add debug code here as well, where
> >> we compare a physical timestamp between get_state..() and poll_state()
> >> calls and yell if significant time has past but the grace period has
> >> still not progressed.
> >>
> >> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> >> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> > 
> > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > But we should also run this by Kent Overstreet, given that bcachefs
> > uses this.  Should be OK, given that bcachefs uses this API in the same
> > way that it does poll_state_synchronize_rcu(), but still...
> 
> Thanks Paul!  Adding Kent Overstreet to the email to raise any objections.

It sounds like rcu_done_exact() is indeed what we want - bcachefs uses
this for determining when objects may be reclaimed (as is typical with
rcu), so we don't want objects to be stranded a "significant time past
the grace period".

Is there any additional cost? I'm not seeing rcu_done_exact() in Linus's
tree yet. Minor additional overhead would be totally fine; we use this
from fs/bcachefs/rcu_pending.c which doesn't call it for each object.
Joel Fernandes Feb. 19, 2025, 11:45 p.m. UTC | #4
On 2/19/2025 4:07 PM, Kent Overstreet wrote:
> On Wed, Feb 19, 2025 at 08:29:47AM -0500, Joel Fernandes wrote:
>>
>>
>> On 2/19/2025 8:22 AM, Paul E. McKenney wrote:
>>> On Wed, Feb 19, 2025 at 07:43:08AM -0500, Joel Fernandes wrote:
>>>> poll_state_synchronize_srcu() uses rcu_seq_done() unlike
>>>> poll_state_synchronize_rcu() which uses rcu_seq_done_exact().
>>>>
>>>> The  rcu_seq_done_exact() makes more sense for polling API, as with
>>>> this API, there is a higher chance that there is a significant delay
>>>> between the get_state..() and poll_state..() calls since a cookie
>>>> can be stored and reused at a later time. During such a delay, if
>>>> the gp_seq counter progresses more than ULONG_MAX/2 distance, then
>>>> poll_state..() may return false for a long time unwantedly.
>>>>
>>>> Fix by using the more accurate rcu_seq_done_exact() API which is
>>>> exactly what straight RCU's polling does.
>>>>
>>>> It may make sense, as future work, to add debug code here as well, where
>>>> we compare a physical timestamp between get_state..() and poll_state()
>>>> calls and yell if significant time has past but the grace period has
>>>> still not progressed.
>>>>
>>>> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
>>>> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
>>>
>>> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
>>>
>>> But we should also run this by Kent Overstreet, given that bcachefs
>>> uses this.  Should be OK, given that bcachefs uses this API in the same
>>> way that it does poll_state_synchronize_rcu(), but still...
>>
>> Thanks Paul!  Adding Kent Overstreet to the email to raise any objections.
> 
> It sounds like rcu_done_exact() is indeed what we want - bcachefs uses
> this for determining when objects may be reclaimed (as is typical with
> rcu), so we don't want objects to be stranded a "significant time past
> the grace period".

Thanks for confirming. May I add your Reviewed-by tag as well?

> 
> Is there any additional cost? I'm not seeing rcu_done_exact() in Linus's
> tree yet. Minor additional overhead would be totally fine; we use this
> from fs/bcachefs/rcu_pending.c which doesn't call it for each object.

The additional overhead should be minimal. rcu_seq_done_exact() is the function.

thanks,

 -Joel
Kent Overstreet Feb. 20, 2025, 12:32 a.m. UTC | #5
On Wed, Feb 19, 2025 at 06:45:07PM -0500, Joel Fernandes wrote:
> 
> 
> On 2/19/2025 4:07 PM, Kent Overstreet wrote:
> > On Wed, Feb 19, 2025 at 08:29:47AM -0500, Joel Fernandes wrote:
> >>
> >>
> >> On 2/19/2025 8:22 AM, Paul E. McKenney wrote:
> >>> On Wed, Feb 19, 2025 at 07:43:08AM -0500, Joel Fernandes wrote:
> >>>> poll_state_synchronize_srcu() uses rcu_seq_done() unlike
> >>>> poll_state_synchronize_rcu() which uses rcu_seq_done_exact().
> >>>>
> >>>> The  rcu_seq_done_exact() makes more sense for polling API, as with
> >>>> this API, there is a higher chance that there is a significant delay
> >>>> between the get_state..() and poll_state..() calls since a cookie
> >>>> can be stored and reused at a later time. During such a delay, if
> >>>> the gp_seq counter progresses more than ULONG_MAX/2 distance, then
> >>>> poll_state..() may return false for a long time unwantedly.
> >>>>
> >>>> Fix by using the more accurate rcu_seq_done_exact() API which is
> >>>> exactly what straight RCU's polling does.
> >>>>
> >>>> It may make sense, as future work, to add debug code here as well, where
> >>>> we compare a physical timestamp between get_state..() and poll_state()
> >>>> calls and yell if significant time has past but the grace period has
> >>>> still not progressed.
> >>>>
> >>>> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> >>>> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> >>>
> >>> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> >>>
> >>> But we should also run this by Kent Overstreet, given that bcachefs
> >>> uses this.  Should be OK, given that bcachefs uses this API in the same
> >>> way that it does poll_state_synchronize_rcu(), but still...
> >>
> >> Thanks Paul!  Adding Kent Overstreet to the email to raise any objections.
> > 
> > It sounds like rcu_done_exact() is indeed what we want - bcachefs uses
> > this for determining when objects may be reclaimed (as is typical with
> > rcu), so we don't want objects to be stranded a "significant time past
> > the grace period".
> 
> Thanks for confirming. May I add your Reviewed-by tag as well?

Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
diff mbox series

Patch

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index d2a694944553..591371d62e89 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1589,7 +1589,7 @@  EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu);
 bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie)
 {
 	if (cookie != SRCU_GET_STATE_COMPLETED &&
-	    !rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, cookie))
+	    !rcu_seq_done_exact(&ssp->srcu_sup->srcu_gp_seq, cookie))
 		return false;
 	// Ensure that the end of the SRCU grace period happens before
 	// any subsequent code that the caller might execute.