Message ID | 20220127213755.31697-1-rpearsonhpe@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | [RFC,v9,01/26] RDMA/rxe: Move rxe_mcast_add/delete to rxe_mcast.c | expand |
On Thu, Jan 27, 2022 at 03:37:29PM -0600, Bob Pearson wrote: > There are several race conditions discovered in the current rdma_rxe > > Bob Pearson (26): > RDMA/rxe: Move rxe_mcast_add/delete to rxe_mcast.c > RDMA/rxe: Move rxe_mcast_attach/detach to rxe_mcast.c > RDMA/rxe: Rename rxe_mc_grp and rxe_mc_elem > RDMA/rxe: Enforce IBA o10-2.2.3 > RDMA/rxe: Remove rxe_drop_all_macst_groups > RDMA/rxe: Remove qp->grp_lock and qp->grp_list I took these patches to for-next > RDMA/rxe: Use kzmalloc/kfree for mca > RDMA/rxe: Rename grp to mcg and mce to mca > RDMA/rxe: Introduce RXECB(skb) > RDMA/rxe: Split rxe_rcv_mcast_pkt into two phases > RDMA/rxe: Replace locks by rxe->mcg_lock > RDMA/rxe: Replace pool key by rxe->mcg_tree > RDMA/rxe: Remove key'ed object support > RDMA/rxe: Remove mcg from rxe pools > RDMA/rxe: Add code to cleanup mcast memory > RDMA/rxe: Add comments to rxe_mcast.c > RDMA/rxe: Separate code into subroutines I think you should try to get up to here done in one series and merged, it looked OK > RDMA/rxe: Convert mca read locking to RCU Not sure this can ever work.. > RDMA/rxe: Reverse the sense of RXE_POOL_NO_ALLOC > RDMA/rxe: Delete _locked() APIs for pool objects > RDMA/rxe: Replace obj by elem in declaration > RDMA/rxe: Replace red-black trees by xarrays > RDMA/rxe: Change pool locking to RCU > RDMA/rxe: Add wait_for_completion to pool objects > RDMA/rxe: Fix ref error in rxe_av.c > RDMA/rxe: Replace mr by rkey in responder resources These also seem reasonable, didn't follow why we needed the RCU patch? Jason
On 1/28/22 12:42, Jason Gunthorpe wrote: > On Thu, Jan 27, 2022 at 03:37:29PM -0600, Bob Pearson wrote: >> There are several race conditions discovered in the current rdma_rxe >> >> Bob Pearson (26): >> RDMA/rxe: Move rxe_mcast_add/delete to rxe_mcast.c >> RDMA/rxe: Move rxe_mcast_attach/detach to rxe_mcast.c >> RDMA/rxe: Rename rxe_mc_grp and rxe_mc_elem >> RDMA/rxe: Enforce IBA o10-2.2.3 >> RDMA/rxe: Remove rxe_drop_all_macst_groups >> RDMA/rxe: Remove qp->grp_lock and qp->grp_list > > I took these patches to for-next > >> RDMA/rxe: Use kzmalloc/kfree for mca >> RDMA/rxe: Rename grp to mcg and mce to mca >> RDMA/rxe: Introduce RXECB(skb) >> RDMA/rxe: Split rxe_rcv_mcast_pkt into two phases >> RDMA/rxe: Replace locks by rxe->mcg_lock >> RDMA/rxe: Replace pool key by rxe->mcg_tree >> RDMA/rxe: Remove key'ed object support >> RDMA/rxe: Remove mcg from rxe pools >> RDMA/rxe: Add code to cleanup mcast memory >> RDMA/rxe: Add comments to rxe_mcast.c >> RDMA/rxe: Separate code into subroutines > > I think you should try to get up to here done in one series and > merged, it looked OK Jason, I have these ready again. It is a little restructured but gets to the same place. Last time I sent things in you had a complaint but it got mangled somehow so I couldn't read it. Is there anything else I should be looking at before posting these again? Bob > >> RDMA/rxe: Convert mca read locking to RCU > > Not sure this can ever work.. > >> RDMA/rxe: Reverse the sense of RXE_POOL_NO_ALLOC >> RDMA/rxe: Delete _locked() APIs for pool objects >> RDMA/rxe: Replace obj by elem in declaration >> RDMA/rxe: Replace red-black trees by xarrays >> RDMA/rxe: Change pool locking to RCU >> RDMA/rxe: Add wait_for_completion to pool objects >> RDMA/rxe: Fix ref error in rxe_av.c >> RDMA/rxe: Replace mr by rkey in responder resources > > These also seem reasonable, didn't follow why we needed the RCU patch? > > Jason
On Mon, Feb 07, 2022 at 01:20:32PM -0600, Bob Pearson wrote: > On 1/28/22 12:42, Jason Gunthorpe wrote: > > On Thu, Jan 27, 2022 at 03:37:29PM -0600, Bob Pearson wrote: > >> There are several race conditions discovered in the current rdma_rxe > >> > >> Bob Pearson (26): > >> RDMA/rxe: Move rxe_mcast_add/delete to rxe_mcast.c > >> RDMA/rxe: Move rxe_mcast_attach/detach to rxe_mcast.c > >> RDMA/rxe: Rename rxe_mc_grp and rxe_mc_elem > >> RDMA/rxe: Enforce IBA o10-2.2.3 > >> RDMA/rxe: Remove rxe_drop_all_macst_groups > >> RDMA/rxe: Remove qp->grp_lock and qp->grp_list > > > > I took these patches to for-next > > > >> RDMA/rxe: Use kzmalloc/kfree for mca > >> RDMA/rxe: Rename grp to mcg and mce to mca > >> RDMA/rxe: Introduce RXECB(skb) > >> RDMA/rxe: Split rxe_rcv_mcast_pkt into two phases > >> RDMA/rxe: Replace locks by rxe->mcg_lock > >> RDMA/rxe: Replace pool key by rxe->mcg_tree > >> RDMA/rxe: Remove key'ed object support > >> RDMA/rxe: Remove mcg from rxe pools > >> RDMA/rxe: Add code to cleanup mcast memory > >> RDMA/rxe: Add comments to rxe_mcast.c > >> RDMA/rxe: Separate code into subroutines > > > > I think you should try to get up to here done in one series and > > merged, it looked OK > > Jason, > > I have these ready again. It is a little restructured but gets to the same place. > Last time I sent things in you had a complaint but it got mangled somehow so I > couldn't read it. Is there anything else I should be looking at before posting these > again? I think I said you shouldn't re-send patches I've already applied? Jason
There are several race conditions discovered in the current rdma_rxe driver. They mostly relate to races between normal operations and destroying objects. This patch series - Makes several minor cleanups in rxe_pool.[ch] - Replaces the red-black trees currently used by xarrays for indices - Simplifies the API for keyed objects - Corrects several reference counting errors - Adds wait for completions to the paths in verbs APIs which destroy objects. The patch series has been changed to RFC PATCH instead of PATCH for-next because I have little experience with rcu locking and would like someone else to review this code (in 18/26 and 24/26). RCU locking should improve performance at large scale but this has not been tested yet. This patch series applies cleanly to current for-next. commit e783362eb54cd99b2cac8b3a9aeac942e6f6ac07 (tag: v5.17-rc1, origin/wip/jgg-for-rc, origin/wip/jgg-for-next, origin/wip/for-testing, origin/for-rc, origin/for-next, origin/HEAD, for-next) Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> --- v9 Corrected issues reported by Jason Gunthorpe, Converted locking in rxe_mcast.c and rxe_pool.c to use RCU Split up the patches into smaller changes v8 Fixed an additional race in 3/8 which was not handled correctly. v7 Corrected issues reported by Jason Gunthorpe Link: https://lore.kernel.org/linux-rdma/20211207190947.GH6385@nvidia.com/ Link: https://lore.kernel.org/linux-rdma/20211207191857.GI6385@nvidia.com/ Link: https://lore.kernel.org/linux-rdma/20211207192824.GJ6385@nvidia.com/ v6 Fixed a kzalloc flags bug. Fixed comment bug reported by 'Kernel Test Robot'. Changed type of rxe_pool.c in __rxe_fini(). v5 Removed patches already accepted into for-next and addressed comments from Jason Gunthorpe. v4 Restructured patch series to change to xarray earlier which greatly simplified the changes. Rebased to current for-next v3 Changed rxe_alloc to use GFP_KERNEL Addressed other comments by Jason Gunthorp Merged the previous 06/10 and 07/10 patches into one since they overlapped Added some minor cleanups as 10/10 v2 Rebased to current for-next. Added 4 additional patches Bob Pearson (26): RDMA/rxe: Move rxe_mcast_add/delete to rxe_mcast.c RDMA/rxe: Move rxe_mcast_attach/detach to rxe_mcast.c RDMA/rxe: Rename rxe_mc_grp and rxe_mc_elem RDMA/rxe: Enforce IBA o10-2.2.3 RDMA/rxe: Remove rxe_drop_all_macst_groups RDMA/rxe: Remove qp->grp_lock and qp->grp_list RDMA/rxe: Use kzmalloc/kfree for mca RDMA/rxe: Rename grp to mcg and mce to mca RDMA/rxe: Introduce RXECB(skb) RDMA/rxe: Split rxe_rcv_mcast_pkt into two phases RDMA/rxe: Replace locks by rxe->mcg_lock RDMA/rxe: Replace pool key by rxe->mcg_tree RDMA/rxe: Remove key'ed object support RDMA/rxe: Remove mcg from rxe pools RDMA/rxe: Add code to cleanup mcast memory RDMA/rxe: Add comments to rxe_mcast.c RDMA/rxe: Separate code into subroutines RDMA/rxe: Convert mca read locking to RCU RDMA/rxe: Reverse the sense of RXE_POOL_NO_ALLOC RDMA/rxe: Delete _locked() APIs for pool objects RDMA/rxe: Replace obj by elem in declaration RDMA/rxe: Replace red-black trees by xarrays RDMA/rxe: Change pool locking to RCU RDMA/rxe: Add wait_for_completion to pool objects RDMA/rxe: Fix ref error in rxe_av.c RDMA/rxe: Replace mr by rkey in responder resources drivers/infiniband/sw/rxe/rxe.c | 107 +--- drivers/infiniband/sw/rxe/rxe_av.c | 19 +- drivers/infiniband/sw/rxe/rxe_hdr.h | 3 + drivers/infiniband/sw/rxe/rxe_loc.h | 33 +- drivers/infiniband/sw/rxe/rxe_mcast.c | 678 ++++++++++++++++------ drivers/infiniband/sw/rxe/rxe_mr.c | 2 +- drivers/infiniband/sw/rxe/rxe_mw.c | 11 +- drivers/infiniband/sw/rxe/rxe_net.c | 35 +- drivers/infiniband/sw/rxe/rxe_pool.c | 798 ++++++++++---------------- drivers/infiniband/sw/rxe/rxe_pool.h | 233 +++----- drivers/infiniband/sw/rxe/rxe_qp.c | 29 +- drivers/infiniband/sw/rxe/rxe_recv.c | 98 ++-- drivers/infiniband/sw/rxe/rxe_req.c | 55 +- drivers/infiniband/sw/rxe/rxe_resp.c | 125 ++-- drivers/infiniband/sw/rxe/rxe_verbs.c | 54 +- drivers/infiniband/sw/rxe/rxe_verbs.h | 26 +- 16 files changed, 1159 insertions(+), 1147 deletions(-) rewrite drivers/infiniband/sw/rxe/rxe_mcast.c (86%) rewrite drivers/infiniband/sw/rxe/rxe_pool.c (67%) rewrite drivers/infiniband/sw/rxe/rxe_pool.h (73%)