
[for-next,01/11] RDMA/rxe: Fix seg fault in rxe_comp_queue_pkt

Message ID 20240326174325.300849-3-rpearsonhpe@gmail.com (mailing list archive)
State Superseded
Series RDMA/rxe: Various fixes and cleanups

Commit Message

Bob Pearson March 26, 2024, 5:43 p.m. UTC
In rxe_comp_queue_pkt() an incoming response packet skb is enqueued
to the resp_pkts queue and then a decision is made whether to run the
completer task inline or schedule it. Finally the skb is dereferenced
to bump a 'hw' performance counter. This is wrong because, if the
completer task is already running in a separate thread, it may have
already processed the skb and freed it, which can cause a seg fault.
This has been observed infrequently in testing at high scale.

This patch fixes this by deferring the enqueue of the packet until
after the counter has been accessed.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Fixes: 0b1e5b99a48b ("IB/rxe: Add port protocol stats")
---
 drivers/infiniband/sw/rxe/rxe_comp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
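
The ordering described in the commit message can be sketched in user-space C.
This is only an illustration under stated assumptions: struct pkt,
struct pkt_queue and the pkt_*() helpers are hypothetical stand-ins for
sk_buff, qp->resp_pkts and the skb queue API, and there is no locking or real
second thread. It only shows that every access to the packet completes before
the packet is published to the consumer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for sk_buff and the resp_pkts queue. */
struct pkt {
	int *counter;		/* stands in for the counter reached through the skb */
	struct pkt *next;
};

struct pkt_queue {
	struct pkt *head;
	struct pkt *tail;
	int len;
};

struct pkt *pkt_alloc(int *counter)
{
	struct pkt *p = malloc(sizeof(*p));

	assert(p != NULL);
	p->counter = counter;
	p->next = NULL;
	return p;
}

void pkt_enqueue(struct pkt_queue *q, struct pkt *p)
{
	p->next = NULL;
	if (q->tail)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->len++;
}

/* Consumer side: dequeue and free everything, as the completer may do. */
void pkt_drain(struct pkt_queue *q)
{
	while (q->head) {
		struct pkt *p = q->head;

		q->head = p->next;
		free(p);
		q->len--;
	}
	q->tail = NULL;
}

/*
 * Producer side, in the patched order: finish every access to p first,
 * then publish it. Returns nonzero when the consumer should be scheduled.
 */
int pkt_queue_safe(struct pkt_queue *q, struct pkt *p)
{
	/* Length is sampled before the enqueue, hence "> 0" rather than "> 1". */
	int must_sched = q->len > 0;

	if (must_sched)
		(*p->counter)++;

	pkt_enqueue(q, p);	/* p may be freed by the consumer from here on */
	return must_sched;
}
```

In the original order the enqueue came first, so the counter update read
through a pointer that the consumer was already entitled to free.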

Comments

Leon Romanovsky April 2, 2024, 12:31 p.m. UTC | #1
On Tue, Mar 26, 2024 at 12:43:16PM -0500, Bob Pearson wrote:
> In rxe_comp_queue_pkt() an incoming response packet skb is enqueued
> to the resp_pkts queue and then a decision is made whether to run the
> completer task inline or schedule it. Finally the skb is dereferenced
> to bump a 'hw' performance counter. This is wrong because if the
> completer task is already running in a separate thread it may have
> already processed the skb and freed it which can cause a seg fault.
> This has been observed infrequently in testing at high scale.
> 
> This patch fixes this by changing the order of enqueuing the packet
> until after the counter is accessed.
> 
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> Fixes: 0b1e5b99a48b ("IB/rxe: Add port protocol stats")

The Signed-off-by line needs to come after the Fixes line.
This applies to all patches in this series.

Thanks

Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index b78b8c0856ab..c997b7cbf2a9 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -131,12 +131,12 @@  void rxe_comp_queue_pkt(struct rxe_qp *qp, struct sk_buff *skb)
 {
 	int must_sched;
 
-	skb_queue_tail(&qp->resp_pkts, skb);
-
-	must_sched = skb_queue_len(&qp->resp_pkts) > 1;
+	must_sched = skb_queue_len(&qp->resp_pkts) > 0;
 	if (must_sched != 0)
 		rxe_counter_inc(SKB_TO_PKT(skb)->rxe, RXE_CNT_COMPLETER_SCHED);
 
+	skb_queue_tail(&qp->resp_pkts, skb);
+
 	if (must_sched)
 		rxe_sched_task(&qp->comp.task);
 	else