
[2/2] docs/RCU: Bring back smp_wmb()

Message ID 20230706102849.437687-3-mmpgouride@gmail.com (mailing list archive)
State New, archived
Series rcu: Fix rculist_nulls and doc

Commit Message

Alan Huang July 6, 2023, 10:28 a.m. UTC
The objects are allocated with SLAB_TYPESAFE_BY_RCU, and hlist_add_head_rcu()
executes n->next = first (which modifies obj->obj_node.next) before the
rcu_assign_pointer(). There may be readers holding a reference to obj in
lockless_lookup(), and when the updater modifies ->next, those readers can
see the change immediately because of SLAB_TYPESAFE_BY_RCU.

Two memory orderings are required in the insertion algorithm: we need to
make sure that obj->key is updated before both obj->obj_node.next and
obj->refcnt, and atomic_set_release() is not enough to provide the
required memory barrier.
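
To make this concrete, here is the insertion path from the second hunk of
this patch with the ordering annotated (a sketch; the block comment is an
explanatory addition, not part of the patch):

  obj = kmem_cache_alloc(...);
  lock_chain(); // typically a spin_lock()
  obj->key = key;
  /*
   * A release on ->refcnt would order ->key before ->refcnt, but it
   * would not order ->key before the later plain store n->next = first
   * inside hlist_add_head_rcu(), so a full smp_wmb() is required.
   */
  smp_wmb();
  atomic_set(&obj->refcnt, 1);
  hlist_add_head_rcu(&obj->obj_node, list);
  unlock_chain(); // typically a spin_unlock()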

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
---
 Documentation/RCU/rculist_nulls.rst | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Paul E. McKenney July 10, 2023, 7:11 p.m. UTC | #1
On Thu, Jul 06, 2023 at 10:28:49AM +0000, Alan Huang wrote:
> The objects are allocated with SLAB_TYPESAFE_BY_RCU, and hlist_add_head_rcu()
> executes n->next = first (which modifies obj->obj_node.next) before the
> rcu_assign_pointer(). There may be readers holding a reference to obj in
> lockless_lookup(), and when the updater modifies ->next, those readers can
> see the change immediately because of SLAB_TYPESAFE_BY_RCU.
> 
> Two memory orderings are required in the insertion algorithm: we need to
> make sure that obj->key is updated before both obj->obj_node.next and
> obj->refcnt, and atomic_set_release() is not enough to provide the
> required memory barrier.
> 
> Signed-off-by: Alan Huang <mmpgouride@gmail.com>
> ---
>  Documentation/RCU/rculist_nulls.rst | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst
> index 21e40fcc08de..fa3729dc7e74 100644
> --- a/Documentation/RCU/rculist_nulls.rst
> +++ b/Documentation/RCU/rculist_nulls.rst
> @@ -64,7 +64,7 @@ but a version with an additional memory barrier (smp_rmb())
>    {
>      struct hlist_node *node, *next;
>      for (pos = rcu_dereference((head)->first);
> -         pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
> +         pos && ({ next = READ_ONCE(pos->next); smp_rmb(); prefetch(next); 1; }) &&

This one looks good, though the READ_ONCE() becoming rcu_dereference()
would allow the smp_rmb() to be dropped, correct?

>           ({ obj = hlist_entry(pos, typeof(*obj), obj_node); 1; });
>           pos = rcu_dereference(next))
>        if (obj->key == key)
> @@ -112,7 +112,12 @@ detect the fact that it missed following items in original chain.
>    obj = kmem_cache_alloc(...);
>    lock_chain(); // typically a spin_lock()
>    obj->key = key;
> -  atomic_set_release(&obj->refcnt, 1); // key before refcnt
> +  /*
> +   * We need to make sure obj->key is updated before obj->obj_node.next
> +   * and obj->refcnt.
> +   */
> +  smp_wmb();
> +  atomic_set(&obj->refcnt, 1);

...but what is smp_wmb() doing that the combination of atomic_set_release()
and hlist_add_head_rcu() was not already doing?  What am I missing?

							Thanx, Paul

>    hlist_add_head_rcu(&obj->obj_node, list);
>    unlock_chain(); // typically a spin_unlock()
>  
> -- 
> 2.34.1
>
Alan Huang July 11, 2023, 2:50 p.m. UTC | #2
> On Jul 11, 2023, at 03:11, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> On Thu, Jul 06, 2023 at 10:28:49AM +0000, Alan Huang wrote:
>> The objects are allocated with SLAB_TYPESAFE_BY_RCU, and hlist_add_head_rcu()
>> executes n->next = first (which modifies obj->obj_node.next) before the
>> rcu_assign_pointer(). There may be readers holding a reference to obj in
>> lockless_lookup(), and when the updater modifies ->next, those readers can
>> see the change immediately because of SLAB_TYPESAFE_BY_RCU.
>> 
>> Two memory orderings are required in the insertion algorithm: we need to
>> make sure that obj->key is updated before both obj->obj_node.next and
>> obj->refcnt, and atomic_set_release() is not enough to provide the
>> required memory barrier.
>> 
>> Signed-off-by: Alan Huang <mmpgouride@gmail.com>
>> ---
>> Documentation/RCU/rculist_nulls.rst | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>> 
>> diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst
>> index 21e40fcc08de..fa3729dc7e74 100644
>> --- a/Documentation/RCU/rculist_nulls.rst
>> +++ b/Documentation/RCU/rculist_nulls.rst
>> @@ -64,7 +64,7 @@ but a version with an additional memory barrier (smp_rmb())
>>   {
>>     struct hlist_node *node, *next;
>>     for (pos = rcu_dereference((head)->first);
>> -         pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
>> +         pos && ({ next = READ_ONCE(pos->next); smp_rmb(); prefetch(next); 1; }) &&
> 
> This one looks good, though the READ_ONCE() becoming rcu_dereference()
> would allow the smp_rmb() to be dropped, correct?

The pattern here is:

  reader                                       updater

  // pos->next is also obj->obj_node.next
  READ_ONCE(pos->next);                        WRITE_ONCE(obj->key, key);
  smp_rmb();                                   smp_wmb();
                                               // this is n->next = first; within hlist_add_head_rcu()
  READ_ONCE(obj->key);                         WRITE_ONCE(obj->obj_node.next, h->first);
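
Rendered as kernel-style C, the pairing might look like the sketch below
(the struct layout and the helper names lookup_side()/insert_side() are
hypothetical, for exposition only):

  struct obj {
    unsigned int key;
    atomic_t refcnt;
    struct hlist_node obj_node;
  };

  /* Reader side, as in lockless_lookup(); *nextp is used to advance. */
  static bool lookup_side(struct hlist_node *pos, unsigned int key,
                          struct hlist_node **nextp)
  {
    struct obj *obj = hlist_entry(pos, struct obj, obj_node);

    *nextp = READ_ONCE(pos->next);
    smp_rmb(); /* Pairs with the updater's smp_wmb(). */
    /*
     * If the new ->next was read above, the new ->key is guaranteed
     * to be read here, so a reused object cannot present the new
     * ->next together with a stale ->key.
     */
    return READ_ONCE(obj->key) == key;
  }

  /* Updater side, under lock_chain(). */
  static void insert_side(struct obj *obj, unsigned int key,
                          struct hlist_node *first)
  {
    WRITE_ONCE(obj->key, key);
    smp_wmb(); /* Orders ->key before ->obj_node.next. */
    /* The n->next = first; step inside hlist_add_head_rcu(). */
    WRITE_ONCE(obj->obj_node.next, first);
  }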

The point here is that the objects are allocated with SLAB_TYPESAFE_BY_RCU,
and there is

	 n->next = first;

within hlist_add_head_rcu(), and that modification is visible to readers
immediately (before the invocation of rcu_assign_pointer()).

Therefore, the smp_rmb() is necessary to ensure that we won't see the new
value of ->next together with the old ->key.
(If we see the new ->next and the old ->key, we cannot detect the movement
of the object.)

But in this patch, I forgot to add READ_ONCE() to the obj->key read; I will
send a v2.
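
For reference, with that fix the lookup loop would presumably read as
follows (a sketch of the change described here, not the actual v2 patch):

  for (pos = rcu_dereference((head)->first);
       pos && ({ next = READ_ONCE(pos->next); smp_rmb(); prefetch(next); 1; }) &&
       ({ obj = hlist_entry(pos, typeof(*obj), obj_node); 1; });
       pos = rcu_dereference(next))
    if (READ_ONCE(obj->key) == key)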

> 
>>          ({ obj = hlist_entry(pos, typeof(*obj), obj_node); 1; });
>>          pos = rcu_dereference(next))
>>       if (obj->key == key)
>> @@ -112,7 +112,12 @@ detect the fact that it missed following items in original chain.
>>   obj = kmem_cache_alloc(...);
>>   lock_chain(); // typically a spin_lock()
>>   obj->key = key;
>> -  atomic_set_release(&obj->refcnt, 1); // key before refcnt
>> +  /*
>> +   * We need to make sure obj->key is updated before obj->obj_node.next
>> +   * and obj->refcnt.
>> +   */
>> +  smp_wmb();
>> +  atomic_set(&obj->refcnt, 1);
> 
> ...but what is smp_wmb() doing that the combination of atomic_set_release()
> and hlist_add_head_rcu() was not already doing?  What am I missing?

Same as explained above.

> 
> Thanx, Paul
> 
>>   hlist_add_head_rcu(&obj->obj_node, list);
>>   unlock_chain(); // typically a spin_unlock()
>> 
>> -- 
>> 2.34.1

Patch

diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst
index 21e40fcc08de..fa3729dc7e74 100644
--- a/Documentation/RCU/rculist_nulls.rst
+++ b/Documentation/RCU/rculist_nulls.rst
@@ -64,7 +64,7 @@ but a version with an additional memory barrier (smp_rmb())
   {
     struct hlist_node *node, *next;
     for (pos = rcu_dereference((head)->first);
-         pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
+         pos && ({ next = READ_ONCE(pos->next); smp_rmb(); prefetch(next); 1; }) &&
          ({ obj = hlist_entry(pos, typeof(*obj), obj_node); 1; });
          pos = rcu_dereference(next))
       if (obj->key == key)
@@ -112,7 +112,12 @@ detect the fact that it missed following items in original chain.
   obj = kmem_cache_alloc(...);
   lock_chain(); // typically a spin_lock()
   obj->key = key;
-  atomic_set_release(&obj->refcnt, 1); // key before refcnt
+  /*
+   * We need to make sure obj->key is updated before obj->obj_node.next
+   * and obj->refcnt.
+   */
+  smp_wmb();
+  atomic_set(&obj->refcnt, 1);
   hlist_add_head_rcu(&obj->obj_node, list);
   unlock_chain(); // typically a spin_unlock()