Message ID | 20230612155032.17036-1-rpearsonhpe@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | f5d494e2e082f28fb6aea1bdba0b7ac43d6a9d6c |
Delegated to: | Jason Gunthorpe |
Headers | show |
Series | [for-next] RDMA/rxe: Fix rxe_cq_post | expand |
On Mon, Jun 12, 2023 at 10:50:33AM -0500, Bob Pearson wrote: > A recent patch replaced a tasklet execution of cq->comp_handler > by a direct call. While this made sense it let changes to cq->notify > state be unprotected and assumed that the cq completion machinery > and the ulp done callbacks were reentrant. The result is that in > some cases completion events can be lost. This patch moves the > cq->comp_handler call inside of the spinlock in rxe_cq_post which > solves both issues. This is compatible with the matching code in > the request notify verb. > > Fixes: 78b26a335310 ("RDMA/rxe: Remove tasklet call from rxe_cq.c") > Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> > --- > drivers/infiniband/sw/rxe/rxe_cq.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) Applied to for-rc, thanks Jason
Jason, thanks. - Bob -----Original Message----- From: Jason Gunthorpe <jgg@nvidia.com> Sent: Monday, June 12, 2023 1:28 PM To: Bob Pearson <rpearsonhpe@gmail.com> Cc: zyjzyj2000@gmail.com; linux-rdma@vger.kernel.org Subject: Re: [PATCH for-next] RDMA/rxe: Fix rxe_cq_post On Mon, Jun 12, 2023 at 10:50:33AM -0500, Bob Pearson wrote: > A recent patch replaced a tasklet execution of cq->comp_handler by a > direct call. While this made sense it let changes to cq->notify state > be unprotected and assumed that the cq completion machinery and the > ulp done callbacks were reentrant. The result is that in some cases > completion events can be lost. This patch moves the > cq->comp_handler call inside of the spinlock in rxe_cq_post which > solves both issues. This is compatible with the matching code in the > request notify verb. > > Fixes: 78b26a335310 ("RDMA/rxe: Remove tasklet call from rxe_cq.c") > Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> > --- > drivers/infiniband/sw/rxe/rxe_cq.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) Applied to for-rc, thanks Jason
diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c index 20ff0c0c4605..6ca2a05b6a2a 100644 --- a/drivers/infiniband/sw/rxe/rxe_cq.c +++ b/drivers/infiniband/sw/rxe/rxe_cq.c @@ -113,8 +113,6 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited) queue_advance_producer(cq->queue, QUEUE_TYPE_TO_CLIENT); - spin_unlock_irqrestore(&cq->cq_lock, flags); - if ((cq->notify == IB_CQ_NEXT_COMP) || (cq->notify == IB_CQ_SOLICITED && solicited)) { cq->notify = 0; @@ -122,6 +120,8 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited) cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context); } + spin_unlock_irqrestore(&cq->cq_lock, flags); + return 0; }
A recent patch replaced a tasklet execution of cq->comp_handler by a direct call. While this made sense it let changes to cq->notify state be unprotected and assumed that the cq completion machinery and the ulp done callbacks were reentrant. The result is that in some cases completion events can be lost. This patch moves the cq->comp_handler call inside of the spinlock in rxe_cq_post which solves both issues. This is compatible with the matching code in the request notify verb. Fixes: 78b26a335310 ("RDMA/rxe: Remove tasklet call from rxe_cq.c") Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> --- drivers/infiniband/sw/rxe/rxe_cq.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) base-commit: 830f93f47068b1632cc127871fbf27e918efdf46