[for-next,v2] Revert "RDMA/rxe: Remove unnecessary mr testing"

Message ID 20221209045926.531689-1-matsuda-daisuke@fujitsu.com (mailing list archive)
State Accepted
Delegated to: Jason Gunthorpe

Commit Message

Daisuke Matsuda (Fujitsu) Dec. 9, 2022, 4:59 a.m. UTC
Commit 686d348476ee ("RDMA/rxe: Remove unnecessary mr testing") causes
a kernel crash. If the responder gets a zero-byte RDMA Read request,
qp->resp.mr is not set in check_rkey() [1]. The mr is therefore NULL, and
a NULL pointer dereference occurs in __rxe_put() as shown below.

 BUG: kernel NULL pointer dereference, address: 0000000000000010
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] PREEMPT SMP PTI
 CPU: 2 PID: 3622 Comm: python3 Kdump: loaded Not tainted 6.1.0-rc3+ #34
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
 RIP: 0010:__rxe_put+0xc/0x60 [rdma_rxe]
 Code: cc cc cc 31 f6 e8 64 36 1b d3 41 b8 01 00 00 00 44 89 c0 c3 cc cc cc cc 41 89 c0 eb c1 90 0f 1f 44 00 00 41 54 b8 ff ff ff ff <f0> 0f c1 47 10 83 f8 01 74 11 45 31 e4 85 c0 7e 20 44 89 e0 41 5c
 RSP: 0018:ffffb27bc012ce78 EFLAGS: 00010246
 RAX: 00000000ffffffff RBX: ffff9790857b0580 RCX: 0000000000000000
 RDX: ffff979080fe145a RSI: 000055560e3e0000 RDI: 0000000000000000
 RBP: ffff97909c7dd800 R08: 0000000000000001 R09: e7ce43d97f7bed0f
 R10: ffff97908b29c300 R11: 0000000000000000 R12: 0000000000000000
 R13: 0000000000000000 R14: ffff97908b29c300 R15: 0000000000000000
 FS:  00007f276f7bd740(0000) GS:ffff9792b5c80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000010 CR3: 0000000114230002 CR4: 0000000000060ee0
 Call Trace:
  <IRQ>
  read_reply+0xda/0x310 [rdma_rxe]
  rxe_responder+0x82d/0xe50 [rdma_rxe]
  do_task+0x84/0x170 [rdma_rxe]
  tasklet_action_common.constprop.0+0xa7/0x120
  __do_softirq+0xcb/0x2ac
  do_softirq+0x63/0x90
  </IRQ>
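
The faulting address in the trace can be cross-checked by hand. The bytes at the marked instruction, `<f0> 0f c1 47 10`, decode as `lock xadd %eax,0x10(%rdi)`, and RDI is 0, so the write lands at address 0x10, which matches CR2. RAX holds 0xffffffff (-1), consistent with an atomic decrement of a reference count. The sketch below reproduces that arithmetic in userspace. `struct model_elem` and `ref_cnt_address()` are hypothetical names, and the field layout is an assumption chosen to place the counter at offset 0x10; it is not the kernel's definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the pool element embedded at the start of an
 * rxe object. Field names and sizes are assumptions picked so that the
 * reference count sits at offset 0x10; this is not the kernel struct.
 */
struct model_elem {
	uint64_t pool;    /* stands in for a pointer, offset 0x00 */
	uint64_t obj;     /* stands in for a pointer, offset 0x08 */
	int32_t  ref_cnt; /* offset 0x10 */
};

/* Compile-time check that the assumed layout really puts ref_cnt at 0x10. */
static_assert(offsetof(struct model_elem, ref_cnt) == 0x10,
	      "assumed layout must place ref_cnt at offset 0x10");

/* Address that a put would modify for an element located at 'base'. */
static uintptr_t ref_cnt_address(uintptr_t base)
{
	return base + offsetof(struct model_elem, ref_cnt);
}
```

Under this assumed layout, `ref_cnt_address(0)` evaluates to 0x10: a put on a NULL object writes to the counter's offset rather than to address 0.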

[1] InfiniBand Architecture Specification Volume 1, Release 1.5, C9-88.
    Available from https://www.infinibandta.org/

Link: https://lore.kernel.org/lkml/1666582315-2-1-git-send-email-lizhijian@fujitsu.com/
Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
---
v2:
  Modified the commit message:
  - Removed timestamps from the calltrace.
  - Added a reference to IBA spec [1].

 drivers/infiniband/sw/rxe/rxe_resp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Jason Gunthorpe Dec. 9, 2022, 7:56 p.m. UTC | #1
On Fri, Dec 09, 2022 at 01:59:26PM +0900, Daisuke Matsuda wrote:
> Commit 686d348476ee ("RDMA/rxe: Remove unnecessary mr testing") causes
> a kernel crash. If the responder gets a zero-byte RDMA Read request,
> qp->resp.mr is not set in check_rkey() [1]. The mr is therefore NULL, and
> a NULL pointer dereference occurs in __rxe_put() as shown below.
> 
>  BUG: kernel NULL pointer dereference, address: 0000000000000010
>  #PF: supervisor write access in kernel mode
>  #PF: error_code(0x0002) - not-present page
>  PGD 0 P4D 0
>  Oops: 0002 [#1] PREEMPT SMP PTI
>  CPU: 2 PID: 3622 Comm: python3 Kdump: loaded Not tainted 6.1.0-rc3+ #34
>  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
>  RIP: 0010:__rxe_put+0xc/0x60 [rdma_rxe]
>  Code: cc cc cc 31 f6 e8 64 36 1b d3 41 b8 01 00 00 00 44 89 c0 c3 cc cc cc cc 41 89 c0 eb c1 90 0f 1f 44 00 00 41 54 b8 ff ff ff ff <f0> 0f c1 47 10 83 f8 01 74 11 45 31 e4 85 c0 7e 20 44 89 e0 41 5c
>  RSP: 0018:ffffb27bc012ce78 EFLAGS: 00010246
>  RAX: 00000000ffffffff RBX: ffff9790857b0580 RCX: 0000000000000000
>  RDX: ffff979080fe145a RSI: 000055560e3e0000 RDI: 0000000000000000
>  RBP: ffff97909c7dd800 R08: 0000000000000001 R09: e7ce43d97f7bed0f
>  R10: ffff97908b29c300 R11: 0000000000000000 R12: 0000000000000000
>  R13: 0000000000000000 R14: ffff97908b29c300 R15: 0000000000000000
>  FS:  00007f276f7bd740(0000) GS:ffff9792b5c80000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 0000000000000010 CR3: 0000000114230002 CR4: 0000000000060ee0
>  Call Trace:
>   <IRQ>
>   read_reply+0xda/0x310 [rdma_rxe]
>   rxe_responder+0x82d/0xe50 [rdma_rxe]
>   do_task+0x84/0x170 [rdma_rxe]
>   tasklet_action_common.constprop.0+0xa7/0x120
>   __do_softirq+0xcb/0x2ac
>   do_softirq+0x63/0x90
>   </IRQ>
> 
> [1] InfiniBand Architecture Specification Volume 1, Release 1.5, C9-88.
>     Available from https://www.infinibandta.org/
> 
> Link: https://lore.kernel.org/lkml/1666582315-2-1-git-send-email-lizhijian@fujitsu.com/
> Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
> v2:
>   Modified the commit message:
>   - Removed timestamps from the calltrace.
>   - Added a reference to IBA spec [1].

I squished this with the other patch fixing the other error unwind

Thanks,
Jason
Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 6761bcd1d4d8..5d3a4c6f81a3 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -832,7 +832,8 @@  static enum resp_states read_reply(struct rxe_qp *qp,
 
 	err = rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt),
 			  payload, RXE_FROM_MR_OBJ);
-	rxe_put(mr);
+	if (mr)
+		rxe_put(mr);
 	if (err) {
 		kfree_skb(skb);
 		return RESPST_ERR_RKEY_VIOLATION;
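
The guard restored by this hunk follows a common pattern: drop a reference only when the pointer is actually set. The following is a minimal userspace sketch of that pattern, not the rxe implementation. `struct model_mr`, `model_put()` and `model_put_checked()` are hypothetical names, and the `assert()` in `model_put()` only stands in for the crash a NULL argument would cause in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference-counted object; not struct rxe_mr. */
struct model_mr {
	int ref_cnt;
};

/*
 * Drop one reference. Returns true when the last reference is gone.
 * The caller must pass a valid pointer; the assert models the oops
 * that an unguarded put on NULL would trigger.
 */
static bool model_put(struct model_mr *mr)
{
	assert(mr != NULL);
	return --mr->ref_cnt == 0;
}

/*
 * Guarded variant mirroring "if (mr) rxe_put(mr);": a NULL pointer,
 * as in the zero-byte read case, is simply skipped.
 */
static bool model_put_checked(struct model_mr *mr)
{
	if (!mr)
		return false;
	return model_put(mr);
}
```

With an object holding two references, two calls to `model_put_checked()` return false and then true, while `model_put_checked(NULL)` returns false without touching memory.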