IB/core: correctly handle rdma_rw_init_mrs() failure

Message ID 20160929143217.F2C8DE0BD1@smtp.ogc.us (mailing list archive)
State Accepted

Commit Message

Steve Wise Sept. 29, 2016, 2:31 p.m. UTC
Function ib_create_qp() was failing to return an error when
rdma_rw_init_mrs() fails, causing a crash further down in ib_create_qp()
when trying to dereference the qp pointer, which was actually a negative
errno.

The crash:

crash> log|grep BUG
[  136.458121] BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
crash> bt
PID: 3736   TASK: ffff8808543215c0  CPU: 2   COMMAND: "kworker/u64:2"
 #0 [ffff88084d323340] machine_kexec at ffffffff8105fbb0
 #1 [ffff88084d3233b0] __crash_kexec at ffffffff81116758
 #2 [ffff88084d323480] crash_kexec at ffffffff8111682d
 #3 [ffff88084d3234b0] oops_end at ffffffff81032bd6
 #4 [ffff88084d3234e0] no_context at ffffffff8106e431
 #5 [ffff88084d323530] __bad_area_nosemaphore at ffffffff8106e610
 #6 [ffff88084d323590] bad_area_nosemaphore at ffffffff8106e6f4
 #7 [ffff88084d3235a0] __do_page_fault at ffffffff8106ebdc
 #8 [ffff88084d323620] do_page_fault at ffffffff8106f057
 #9 [ffff88084d323660] page_fault at ffffffff816e3148
    [exception RIP: ib_create_qp+427]
    RIP: ffffffffa02554fb  RSP: ffff88084d323718  RFLAGS: 00010246
    RAX: 0000000000000004  RBX: fffffffffffffff4  RCX: 000000018020001f
    RDX: ffff880830997fc0  RSI: 0000000000000001  RDI: ffff88085f407200
    RBP: ffff88084d323778   R8: 0000000000000001   R9: ffffea0020bae210
    R10: ffffea0020bae218  R11: 0000000000000001  R12: ffff88084d3237c8
    R13: 00000000fffffff4  R14: ffff880859fa5000  R15: ffff88082eb89800
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#10 [ffff88084d323780] rdma_create_qp at ffffffffa0782681 [rdma_cm]
#11 [ffff88084d3237b0] nvmet_rdma_create_queue_ib at ffffffffa07c43f3 [nvmet_rdma]
#12 [ffff88084d323860] nvmet_rdma_alloc_queue at ffffffffa07c5ba9 [nvmet_rdma]
#13 [ffff88084d323900] nvmet_rdma_queue_connect at ffffffffa07c5c96 [nvmet_rdma]
#14 [ffff88084d323980] nvmet_rdma_cm_handler at ffffffffa07c6450 [nvmet_rdma]
#15 [ffff88084d3239b0] iw_conn_req_handler at ffffffffa0787480 [rdma_cm]
#16 [ffff88084d323a60] cm_conn_req_handler at ffffffffa0775f06 [iw_cm]
#17 [ffff88084d323ab0] process_event at ffffffffa0776019 [iw_cm]
#18 [ffff88084d323af0] cm_work_handler at ffffffffa0776170 [iw_cm]
#19 [ffff88084d323cb0] process_one_work at ffffffff810a1483
#20 [ffff88084d323d90] worker_thread at ffffffff810a211d
#21 [ffff88084d323ec0] kthread at ffffffff810a6c5c
#22 [ffff88084d323f50] ret_from_fork at ffffffff816e1ebf

Fixes: 632bc3f65081 ("IB/core, RDMA RW API: Do not exceed QP SGE send limit")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: stable@vger.kernel.org
---
 drivers/infiniband/core/verbs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
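
A note on reading the register dump above: RBX holds fffffffffffffff4, which as a signed 64-bit value is -12 (ENOMEM on Linux), and R13 holds the same value in its low 32 bits. That is consistent with qp having been set to ERR_PTR(-ENOMEM), although the trace by itself does not say which variable each register held. A minimal user-space sketch of the decoding, assuming a 64-bit register; the helper names reg_to_signed() and looks_like_errno() are made up for illustration, and the 4095 bound is the kernel's MAX_ERRNO:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustration only: these helpers are hypothetical, not kernel API. */

/* Reinterpret a raw 64-bit register value as a signed integer. */
static int64_t reg_to_signed(uint64_t reg)
{
	int64_t v;

	memcpy(&v, &reg, sizeof(v));
	return v;
}

/*
 * Same range check the kernel uses for encoded errors: the top
 * 4095 values of the address space are reserved for -errno.
 */
static int looks_like_errno(uint64_t reg)
{
	return reg >= (uint64_t)-4095;
}
```

Fed the RBX value from the trace, reg_to_signed() yields -12 and looks_like_errno() is true, whereas an ordinary kernel address such as the RDX value ffff880830997fc0 falls outside the errno range.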

Comments

Bart Van Assche Sept. 29, 2016, 2:39 p.m. UTC | #1
On 09/29/16 07:32, Steve Wise wrote:
> Fixes: 632bc3f65081 ("IB/core, RDMA RW API: Do not exceed QP SGE send limit")

This is not correct. I think the "qp = ERR_PTR(ret)" code was introduced 
by commit a060b5629ab06 ("IB/core: generic RDMA READ/WRITE API").

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise Sept. 29, 2016, 2:41 p.m. UTC | #2
> -----Original Message-----
> From: Bart Van Assche [mailto:bart.vanassche@sandisk.com]
> Sent: Thursday, September 29, 2016 9:40 AM
> To: Steve Wise; dledford@redhat.com
> Cc: linux-rdma@vger.kernel.org; Christoph Hellwig
> Subject: Re: [PATCH] IB/core: correctly handle rdma_rw_init_mrs() failure
> 
> On 09/29/16 07:32, Steve Wise wrote:
> > Fixes: 632bc3f65081 ("IB/core, RDMA RW API: Do not exceed QP SGE send limit")
> 
> This is not correct. I think the "qp = ERR_PTR(ret)" code was introduced
> by commit a060b5629ab06 ("IB/core: generic RDMA READ/WRITE API").
> 
> Bart.

It was, but at that time the only thing that happened next was:

return qp;

With 632bc3f65081, qp is dereferenced after that point, causing the crash.

Bart Van Assche Sept. 30, 2016, 3:18 a.m. UTC | #3
On 09/29/16 07:32, Steve Wise wrote:
> Function ib_create_qp() was failing to return an error when
> rdma_rw_init_mrs() fails, causing a crash further down in ib_create_qp()
> when trying to dereference the qp pointer, which was actually a negative
> errno.

Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>

Doug Ledford Oct. 3, 2016, 2:44 p.m. UTC | #4
On 9/29/2016 10:31 AM, Steve Wise wrote:
> Function ib_create_qp() was failing to return an error when
> rdma_rw_init_mrs() fails, causing a crash further down in ib_create_qp()
> when trying to dereference the qp pointer, which was actually a negative
> errno.
> 
> Fixes: 632bc3f65081 ("IB/core, RDMA RW API: Do not exceed QP SGE send limit")
> Signed-off-by: Steve Wise <swise@opengridcomputing.com>
> Cc: stable@vger.kernel.org

Thanks, applied.

Patch

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index f2b776e..5f88ccd 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -821,7 +821,7 @@  struct ib_qp *ib_create_qp(struct ib_pd *pd,
 		if (ret) {
 			pr_err("failed to init MR pool ret= %d\n", ret);
 			ib_destroy_qp(qp);
-			qp = ERR_PTR(ret);
+			return ERR_PTR(ret);
 		}
 	}
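
The failure mode this one-line change addresses can be sketched in user space. The ERR_PTR(), PTR_ERR() and IS_ERR() helpers below are simplified re-implementations modelled on the kernel's include/linux/err.h. Everything prefixed demo_ is hypothetical and only stands in for the real ib_create_qp() and rdma_rw_init_mrs() code paths, and the max_send_sge field and the value 16 are arbitrary placeholders for "some later access through qp":

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Simplified user-space versions of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct ib_qp. */
struct demo_qp {
	int max_send_sge;
};

/* Hypothetical stand-in for rdma_rw_init_mrs(); fails on request. */
static int demo_init_mrs(struct demo_qp *qp, int fail)
{
	(void)qp;
	return fail ? -ENOMEM : 0;
}

static struct demo_qp *demo_create_qp(int fail)
{
	struct demo_qp *qp = malloc(sizeof(*qp));
	int ret;

	if (!qp)
		return ERR_PTR(-ENOMEM);

	ret = demo_init_mrs(qp, fail);
	if (ret) {
		free(qp);
		/*
		 * The pre-fix pattern was "qp = ERR_PTR(ret);" followed by
		 * falling through to the dereference below. Returning here
		 * keeps the encoded errno away from that code.
		 */
		return ERR_PTR(ret);
	}

	/* Reached only with a valid qp. */
	qp->max_send_sge = 16;
	return qp;
}
```

A caller checks the result with IS_ERR() before touching it: demo_create_qp(0) returns a usable object, while demo_create_qp(1) returns a pointer for which PTR_ERR() gives -ENOMEM.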