mbox series

[for-next,v10,00/11] Fix race conditions in rxe_pool

Message ID 20220225195750.37802-1-rpearsonhpe@gmail.com (mailing list archive)
Headers show
Series Fix race conditions in rxe_pool | expand

Message

Bob Pearson Feb. 25, 2022, 7:57 p.m. UTC
There are several race conditions discovered in the current rdma_rxe
driver.  They mostly relate to races between normal operations and
destroying objects.  This patch series
 - Makes several minor cleanups in rxe_pool.[ch]
 - Replaces the red-black trees currently used by xarrays for indices
 - Corrects several reference counting errors
 - Adds wait for completions to the paths in verbs APIs which destroy
   objects.
 - Changes read side locking to rcu.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
v10
  Rebased to current wip/jgg-for-next.
  Split some patches into smaller ones.
v9
  Corrected issues reported by Jason Gunthorpe,
  Converted locking in rxe_mcast.c and rxe_pool.c to use RCU
  Split up the patches into smaller changes
v8
  Fixed an additional race in 3/8 which was not handled correctly.
v7
  Corrected issues reported by Jason Gunthorpe
Link: https://lore.kernel.org/linux-rdma/20211207190947.GH6385@nvidia.com/
Link: https://lore.kernel.org/linux-rdma/20211207191857.GI6385@nvidia.com/
Link: https://lore.kernel.org/linux-rdma/20211207192824.GJ6385@nvidia.com/
v6
  Fixed a kzalloc flags bug.
  Fixed comment bug reported by 'Kernel Test Robot'.
  Changed type of rxe_pool.c in __rxe_fini().
v5
  Removed patches already accepted into for-next and addressed comments
  from Jason Gunthorpe.
v4
  Restructured patch series to change to xarray earlier which
  greatly simplified the changes.
  Rebased to current for-next
v3
  Changed rxe_alloc to use GFP_KERNEL
  Addressed other comments by Jason Gunthorp
  Merged the previous 06/10 and 07/10 patches into one since they overlapped
  Added some minor cleanups as 10/10
v2
  Rebased to current for-next.
  Added 4 additional patches

Bob Pearson (11):
  RDMA/rxe: Reverse the sense of RXE_POOL_NO_ALLOC
  RDMA/rxe: Delete _locked() APIs for pool objects
  RDMA/rxe: Replace obj by elem in declaration
  RDMA/rxe: Replace red-black trees by xarrays
  RDMA/rxe: Stop lookup of partially built objects
  RDMA/rxe: Add wait_for_completion to pool objects
  RDMA/rxe: Fix ref error in rxe_av.c
  RDMA/rxe: Replace mr by rkey in responder resources
  RDMA/rxe: Convert read side locking to rcu
  RDMA/rxe: Move max_elem into rxe_type_info
  RDMA/rxe: Cleanup rxe_pool.c

 drivers/infiniband/sw/rxe/rxe.c       |  87 +----
 drivers/infiniband/sw/rxe/rxe_av.c    |  19 +-
 drivers/infiniband/sw/rxe/rxe_loc.h   |   5 +-
 drivers/infiniband/sw/rxe/rxe_mr.c    |   4 +-
 drivers/infiniband/sw/rxe/rxe_mw.c    |  13 +-
 drivers/infiniband/sw/rxe/rxe_net.c   |  17 +-
 drivers/infiniband/sw/rxe/rxe_pool.c  | 453 ++++++++++++++------------
 drivers/infiniband/sw/rxe/rxe_pool.h  |  74 ++---
 drivers/infiniband/sw/rxe/rxe_qp.c    |  10 +-
 drivers/infiniband/sw/rxe/rxe_req.c   |  55 ++--
 drivers/infiniband/sw/rxe/rxe_resp.c  | 125 ++++---
 drivers/infiniband/sw/rxe/rxe_verbs.c |  55 ++--
 drivers/infiniband/sw/rxe/rxe_verbs.h |   1 -
 13 files changed, 462 insertions(+), 456 deletions(-)


base-commit: 3ac3107872b8dd4b5c4c1b598fcbc24983cd009b

Patch applies to current wip/jgg-for-next with or without the last
two (5-6/6) patches in the multicast series.

Comments

Jason Gunthorpe Feb. 25, 2022, 8:46 p.m. UTC | #1
On Fri, Feb 25, 2022 at 01:57:40PM -0600, Bob Pearson wrote:
> There are several race conditions discovered in the current rdma_rxe
> driver.  They mostly relate to races between normal operations and
> destroying objects.  This patch series
>  - Makes several minor cleanups in rxe_pool.[ch]
>  - Replaces the red-black trees currently used by xarrays for indices
>  - Corrects several reference counting errors
>  - Adds wait for completions to the paths in verbs APIs which destroy
>    objects.
>  - Changes read side locking to rcu.
> 
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
> v10
>   Rebased to current wip/jgg-for-next.
>   Split some patches into smaller ones.

Before I look at this, can I apply it without the last two mcast
patches?

Thanks,
Jason