[for-next,v2,0/8] RDMA/rxe: Correct qp reference counting

Message ID 20230301045154.23733-1-rpearsonhpe@gmail.com (mailing list archive)

Bob Pearson March 1, 2023, 4:51 a.m. UTC
This patch series corrects qp reference counting issues
related to deferred execution of tasklets. These issues were
discovered while attempting to resolve soft lockups of the rxe
driver observed by Daisuke Matsuda in a version of the driver
using work queues, where the workqueue implementation was based
on the current tasklet-based driver. An attempt to find the
root cause of those lockups led to an error in the tasklet
implementation that has been present since the driver went
upstream. This patch series corrects that error.

With this patch series applied, the rxe driver is more stable and
has run the test cases reported by Matsuda for over 24 hours without
errors.

The series also corrects some errors in qp reference counting
related to qp cleanup.

This series depends on the "RDMA/rxe: Add error logging to rxe"
series as a prerequisite.

Link: https://lore.kernel.org/linux-rdma/TYCPR01MB845522FD536170D75068DD41E5099@TYCPR01MB8455.jpnprd01.prod.outlook.com/
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>

v2:
  This version of the series splits off the changes to rxe debug code,
  which have been submitted as "RDMA/rxe: Add error logging to rxe".
  One unrelated patch was dropped, and other patches previously included
  in a series to convert from tasklets to workqueues were moved into
  this series because they are relevant to both the tasklet version
  and the workqueue version of the driver.

Bob Pearson (8):
  RDMA/rxe: Convert tasklet args to queue pairs
  RDMA/rxe: warn if refcnt zero in rxe_put
  RDMA/rxe: Cleanup reset state handling in rxe_resp.c
  RDMA/rxe: Cleanup error state handling in rxe_comp.c
  RDMA/rxe: Remove qp reference counting in tasks
  RDMA/rxe: Remove __rxe_do_task()
  RDMA/rxe: Make tasks schedule each other
  RDMA/rxe: Rewrite rxe_task.c

 drivers/infiniband/sw/rxe/rxe.h      |   1 -
 drivers/infiniband/sw/rxe/rxe_comp.c |  71 +++++--
 drivers/infiniband/sw/rxe/rxe_loc.h  |   6 +-
 drivers/infiniband/sw/rxe/rxe_pool.c |   2 +
 drivers/infiniband/sw/rxe/rxe_qp.c   |  56 ++----
 drivers/infiniband/sw/rxe/rxe_req.c  |  12 +-
 drivers/infiniband/sw/rxe/rxe_resp.c | 114 ++++++------
 drivers/infiniband/sw/rxe/rxe_task.c | 268 +++++++++++++++++++++------
 drivers/infiniband/sw/rxe/rxe_task.h |  23 ++-
 9 files changed, 352 insertions(+), 201 deletions(-)


base-commit: bceed5834cd43a0ed67e35ec16197a5c882d3a6d