diff mbox series

[2/4] RDMA/rxe: Fix dead lock caused by rxe_alloc interrupted by rxe_pool_get_index

Message ID 20220422194416.983549-2-yanjun.zhu@linux.dev (mailing list archive)
State Changes Requested
Delegated to: Jason Gunthorpe
Headers show
Series [PATCHv6,1/4] RDMA/rxe: Fix dead lock caused by __rxe_add_to_pool interrupted by rxe_pool_get_index | expand

Commit Message

Zhu Yanjun April 22, 2022, 7:44 p.m. UTC
From: Zhu Yanjun <yanjun.zhu@linux.dev>

The ah_pool xa_lock first is acquired in this:

{SOFTIRQ-ON-W} state was registered at:
  lock_acquire+0x1d2/0x5a0
  _raw_spin_lock+0x33/0x80
  rxe_alloc+0x1be/0x290 [rdma_rxe]

Then ah_pool xa_lock is acquired in this:

{IN-SOFTIRQ-W}:
  <TASK>
  __lock_acquire+0x1565/0x34a0
  lock_acquire+0x1d2/0x5a0
  _raw_spin_lock_irqsave+0x42/0x90
  rxe_pool_get_index+0x72/0x1d0 [rdma_rxe]
  </TASK>

From the above, in the function rxe_alloc,
xa_lock is acquired. Then the function rxe_alloc
is interrupted by softirq. The function
rxe_pool_get_index will also acquire xa_lock.

Finally, the dead lock appears.

        CPU0
        ----
   lock(&xa->xa_lock#15);  <----- rxe_alloc
   <Interrupt>
     lock(&xa->xa_lock#15); <---- rxe_pool_get_index

    *** DEADLOCK ***

Fixes: 3225717f6dfa ("RDMA/rxe: Replace red-black trees by carrays")
Reported-and-tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
 drivers/infiniband/sw/rxe/rxe_pool.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff mbox series

Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index 67f1d4733682..7b12a52fed35 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -138,8 +138,8 @@  void *rxe_alloc(struct rxe_pool *pool)
 	elem->obj = obj;
 	kref_init(&elem->ref_cnt);
 
-	err = xa_alloc_cyclic(&pool->xa, &elem->index, elem, pool->limit,
-			      &pool->next, GFP_KERNEL);
+	err = xa_alloc_cyclic_irq(&pool->xa, &elem->index, elem, pool->limit,
+				  &pool->next, GFP_KERNEL);
 	if (err)
 		goto err_free;