diff mbox series

[v2] rhashtable: Fix rhashtable_try_insert test

Message ID Z4FXs8vAtitHIJyl@gondor.apana.org.au (mailing list archive)
State New
Headers show
Series [v2] rhashtable: Fix rhashtable_try_insert test | expand

Checks

Context Check Description
netdev/tree_selection success Guessing tree name failed - patch did not apply

Commit Message

Herbert Xu Jan. 10, 2025, 5:24 p.m. UTC
On Fri, Jan 10, 2025 at 04:59:28PM +0000, Michael Kelley wrote:
>
> This patch fixes the problem I saw with VMs in the Azure cloud.  Thanks!

Sorry, but the test on data is needed after all (it was just
buggy).  Otherwise we will break rhlist.  So please test this
patch instead.

---8<---
The test on whether rhashtable_insert_one did an insertion relies
on the value returned by rhashtable_lookup_one.  Unfortunately that
value is overwritten after rhashtable_insert_one returns.  Fix this
by saving the old value.

Also simplify the test as only data == NULL matters.

Reported-by: Michael Kelley <mhklinux@outlook.com>
Fixes: e1d3422c95f0 ("rhashtable: Fix potential deadlock by moving schedule_work outside lock")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Comments

Michael Kelley Jan. 10, 2025, 6:22 p.m. UTC | #1
From: Herbert Xu <herbert@gondor.apana.org.au> Sent: Friday, January 10, 2025 9:24 AM
> 
> On Fri, Jan 10, 2025 at 04:59:28PM +0000, Michael Kelley wrote:
> >
> > This patch fixes the problem I saw with VMs in the Azure cloud.  Thanks!
> 
> Sorry, but the test on data is needed after all (it was just
> buggy).  Otherwise we will break rhlist.  So please test this
> patch instead.
> 
> ---8<---
> The test on whether rhashtable_insert_one did an insertion relies
> on the value returned by rhashtable_lookup_one.  Unfortunately that
> value is overwritten after rhashtable_insert_one returns.  Fix this
> by saving the old value.
> 
> Also simplify the test as only data == NULL matters.
> 
> Reported-by: Michael Kelley <mhklinux@outlook.com>
> Fixes: e1d3422c95f0 ("rhashtable: Fix potential deadlock by moving schedule_work
> outside lock")
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index bf956b85455a..e36b36f3146d 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -611,17 +611,20 @@ static void *rhashtable_try_insert(struct rhashtable *ht,
> const void *key,
>  			new_tbl = rht_dereference_rcu(tbl->future_tbl, ht);
>  			data = ERR_PTR(-EAGAIN);
>  		} else {
> +			void *odata;
> +
>  			flags = rht_lock(tbl, bkt);
>  			data = rhashtable_lookup_one(ht, bkt, tbl,
>  						     hash, key, obj);
>  			new_tbl = rhashtable_insert_one(ht, bkt, tbl,
>  							hash, obj, data);
> +			odata = data;
>  			if (PTR_ERR(new_tbl) != -EEXIST)
>  				data = ERR_CAST(new_tbl);
> 
>  			rht_unlock(tbl, bkt, flags);
> 
> -			if (PTR_ERR(data) == -ENOENT && !new_tbl) {
> +			if (odata && !new_tbl) {
>  				atomic_inc(&ht->nelems);
>  				if (rht_grow_above_75(ht, tbl))
>  					schedule_work(&ht->run_work);

This patch passes my tests. I'm doing a narrow test to verify that
the boot failure when opening the Mellanox NIC is no longer occurring.
I also unloaded/reloaded the mlx5 driver a couple of times. For good
measure, I then did a full Linux kernel build, and all is good. My testing
does not broadly verify correct operation of rhashtable except as it
gets exercised implicitly by these basic tests.

Michael
diff mbox series

Patch

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index bf956b85455a..e36b36f3146d 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -611,17 +611,20 @@  static void *rhashtable_try_insert(struct rhashtable *ht, const void *key,
 			new_tbl = rht_dereference_rcu(tbl->future_tbl, ht);
 			data = ERR_PTR(-EAGAIN);
 		} else {
+			void *odata;
+
 			flags = rht_lock(tbl, bkt);
 			data = rhashtable_lookup_one(ht, bkt, tbl,
 						     hash, key, obj);
 			new_tbl = rhashtable_insert_one(ht, bkt, tbl,
 							hash, obj, data);
+			odata = data;
 			if (PTR_ERR(new_tbl) != -EEXIST)
 				data = ERR_CAST(new_tbl);
 
 			rht_unlock(tbl, bkt, flags);
 
-			if (PTR_ERR(data) == -ENOENT && !new_tbl) {
+			if (odata && !new_tbl) {
 				atomic_inc(&ht->nelems);
 				if (rht_grow_above_75(ht, tbl))
 					schedule_work(&ht->run_work);