[for-next,v8,0/8] RDMA/rxe: Correct race conditions

Message ID 20211216233201.14893-1-rpearsonhpe@gmail.com (mailing list archive)

Message

Bob Pearson Dec. 16, 2021, 11:31 p.m. UTC
There are several race conditions discovered in the current rdma_rxe
driver.  They mostly relate to races between normal operations and
destroying objects.  This patch series
 - Makes several minor cleanups in rxe_pool.[ch]
 - Replaces the red-black trees currently used for indices with xarrays
 - Simplifies the API for keyed objects
 - Corrects several reference counting errors
 - Adds wait for completions to the paths in verbs APIs which destroy
   objects.
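
For illustration only, the destroy-side idea in the last bullet can be
sketched in userspace C. This is not the driver code: struct pool,
struct obj and the pool_*() helpers are hypothetical names, a
mutex-protected array stands in for the xarray, and a pthread condition
variable stands in for a kernel completion. Lookups take a reference
only while the object is still published, and destroy unpublishes the
object, drops its own reference and then waits for the last user.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_MAX 16

/* Hypothetical pooled object; refcnt and done are protected by pool->lock. */
struct obj {
	int refcnt;
	bool done;		/* true once the last reference is dropped */
	pthread_cond_t done_cv;	/* stand-in for a kernel completion */
	int index;
};

/* Hypothetical pool; the array is a stand-in for an index -> object map. */
struct pool {
	pthread_mutex_t lock;
	struct obj *slots[POOL_MAX];
};

void pool_init(struct pool *p)
{
	pthread_mutex_init(&p->lock, NULL);
	for (int i = 0; i < POOL_MAX; i++)
		p->slots[i] = NULL;
}

/* Publish an object with one reference held by the creator; -1 if full. */
int pool_add(struct pool *p, struct obj *o)
{
	int idx = -1;

	pthread_mutex_lock(&p->lock);
	for (int i = 0; i < POOL_MAX; i++) {
		if (!p->slots[i]) {
			o->refcnt = 1;
			o->done = false;
			o->index = i;
			pthread_cond_init(&o->done_cv, NULL);
			p->slots[i] = o;
			idx = i;
			break;
		}
	}
	pthread_mutex_unlock(&p->lock);
	return idx;
}

/* Look up by index; take a reference only if the object is still live. */
struct obj *pool_get(struct pool *p, int index)
{
	struct obj *o = NULL;

	if (index < 0 || index >= POOL_MAX)
		return NULL;
	pthread_mutex_lock(&p->lock);
	if (p->slots[index] && p->slots[index]->refcnt > 0) {
		o = p->slots[index];
		o->refcnt++;
	}
	pthread_mutex_unlock(&p->lock);
	return o;
}

/* Drop a reference; wake the destroyer when the count reaches zero. */
void pool_put(struct pool *p, struct obj *o)
{
	pthread_mutex_lock(&p->lock);
	if (--o->refcnt == 0) {
		o->done = true;
		pthread_cond_broadcast(&o->done_cv);
	}
	pthread_mutex_unlock(&p->lock);
}

/* Unpublish, drop the creator's reference, then wait for all other users. */
void pool_destroy_obj(struct pool *p, struct obj *o)
{
	pthread_mutex_lock(&p->lock);
	p->slots[o->index] = NULL;	/* no new lookups can find it */
	if (--o->refcnt == 0)
		o->done = true;
	while (!o->done)
		pthread_cond_wait(&o->done_cv, &p->lock);
	pthread_mutex_unlock(&p->lock);
	pthread_cond_destroy(&o->done_cv);
	/* only now is it safe for the caller to free the object */
}
```

In this sketch a caller that did pool_get() must pair it with
pool_put(); pool_destroy_obj() returns only after every such user has
done so, which is the property the destroy paths need.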

This patch series applies cleanly to current for-next.
commit c8f476da84ad ("Merge branch 'mlx5-next' of
	git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux")

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
v8
  Fixed an additional race in 3/8 which was not handled correctly.
v7
  Corrected issues reported by Jason Gunthorpe
Link: https://lore.kernel.org/linux-rdma/20211207190947.GH6385@nvidia.com/
Link: https://lore.kernel.org/linux-rdma/20211207191857.GI6385@nvidia.com/
Link: https://lore.kernel.org/linux-rdma/20211207192824.GJ6385@nvidia.com/
v6
  Fixed a kzalloc flags bug.
  Fixed comment bug reported by 'Kernel Test Robot'.
  Changed type of rxe_pool.c in __rxe_fini().
v5
  Removed patches already accepted into for-next and addressed comments
  from Jason Gunthorpe.
v4
  Restructured patch series to change to xarray earlier which
  greatly simplified the changes.
  Rebased to current for-next
v3
  Changed rxe_alloc to use GFP_KERNEL
  Addressed other comments by Jason Gunthorpe
  Merged the previous 06/10 and 07/10 patches into one since they overlapped
  Added some minor cleanups as 10/10
v2
  Rebased to current for-next.
  Added 4 additional patches

Bob Pearson (8):
  RDMA/rxe: Replace RB tree by xarray for indexes
  RDMA/rxe: Reverse the sense of RXE_POOL_NO_ALLOC
  RDMA/rxe: Cleanup pool APIs for keyed objects
  RDMA/rxe: Fix ref error in rxe_av.c
  RDMA/rxe: Replace mr by rkey in responder resources
  RDMA/rxe: Minor cleanups in rxe_pool.c/rxe_pool.h
  RDMA/rxe: Replace rxe_alloc by kzalloc for rxe_mc_elem
  RDMA/rxe: Add wait for completion to obj destruct

 drivers/infiniband/sw/rxe/rxe.c       | 101 +----
 drivers/infiniband/sw/rxe/rxe_av.c    |  19 +-
 drivers/infiniband/sw/rxe/rxe_loc.h   |  10 +-
 drivers/infiniband/sw/rxe/rxe_mcast.c |  76 ++--
 drivers/infiniband/sw/rxe/rxe_mr.c    |   3 +-
 drivers/infiniband/sw/rxe/rxe_mw.c    |   7 +-
 drivers/infiniband/sw/rxe/rxe_net.c   |  17 +-
 drivers/infiniband/sw/rxe/rxe_pool.c  | 529 ++++++++++++--------------
 drivers/infiniband/sw/rxe/rxe_pool.h  | 110 ++----
 drivers/infiniband/sw/rxe/rxe_qp.c    |  10 +-
 drivers/infiniband/sw/rxe/rxe_req.c   |  55 +--
 drivers/infiniband/sw/rxe/rxe_resp.c  | 125 ++++--
 drivers/infiniband/sw/rxe/rxe_verbs.c |  72 ++--
 drivers/infiniband/sw/rxe/rxe_verbs.h |   3 -
 14 files changed, 515 insertions(+), 622 deletions(-)

Comments

Jason Gunthorpe Jan. 7, 2022, 1:12 p.m. UTC | #1
On Thu, Dec 16, 2021 at 05:31:54PM -0600, Bob Pearson wrote:
> There are several race conditions discovered in the current rdma_rxe
> driver.  They mostly relate to races between normal operations and
> destroying objects.  This patch series
>  - Makes several minor cleanups in rxe_pool.[ch]
>  - Replaces the red-black trees currently used for indices with xarrays
>  - Simplifies the API for keyed objects
>  - Corrects several reference counting errors
>  - Adds wait for completions to the paths in verbs APIs which destroy
>    objects.

I think this will work better for you if you strip out the RXE_POOL_KEY
stuff that is only used for multicast and move it to the mcast
file. They are so different now that they shouldn't be sharing much of
anything.

Jason