Message ID | 20160929143217.F2C8DE0BD1@smtp.ogc.us (mailing list archive) |
---|---|
State | Accepted |
On 09/29/16 07:32, Steve Wise wrote:
> Fixes: 632bc3f65081 ("IB/core, RDMA RW API: Do not exceed QP SGE send limit")
This is not correct. I think the "qp = ERR_PTR(ret)" code was introduced
by commit a060b5629ab06 ("IB/core: generic RDMA READ/WRITE API").
Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message-----
> From: Bart Van Assche [mailto:bart.vanassche@sandisk.com]
> Sent: Thursday, September 29, 2016 9:40 AM
> To: Steve Wise; dledford@redhat.com
> Cc: linux-rdma@vger.kernel.org; Christoph Hellwig
> Subject: Re: [PATCH] IB/core: correctly handle rdma_rw_init_mrs() failure
>
> On 09/29/16 07:32, Steve Wise wrote:
> > Fixes: 632bc3f65081 ("IB/core, RDMA RW API: Do not exceed QP SGE send limit")
>
> This is not correct. I think the "qp = ERR_PTR(ret)" code was introduced
> by commit a060b5629ab06 ("IB/core: generic RDMA READ/WRITE API").
>
> Bart.

It was, but at that time, the only thing that happened next was:

	return qp;

With 632bc3f65081, qp is dereferenced causing the crash...
On 09/29/16 07:32, Steve Wise wrote:
> Function ib_create_qp() was failing to return an error when
> rdma_rw_init_mrs() fails, causing a crash further down in ib_create_qp()
> when trying to dereference the qp pointer which was actually a negative
> errno.

Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
On 9/29/2016 10:31 AM, Steve Wise wrote:
> Function ib_create_qp() was failing to return an error when
> rdma_rw_init_mrs() fails, causing a crash further down in ib_create_qp()
> when trying to dereference the qp pointer which was actually a negative
> errno.
>
> The crash:
>
> crash> log|grep BUG
> [ 136.458121] BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
> crash> bt
> PID: 3736 TASK: ffff8808543215c0 CPU: 2 COMMAND: "kworker/u64:2"
>  #0 [ffff88084d323340] machine_kexec at ffffffff8105fbb0
>  #1 [ffff88084d3233b0] __crash_kexec at ffffffff81116758
>  #2 [ffff88084d323480] crash_kexec at ffffffff8111682d
>  #3 [ffff88084d3234b0] oops_end at ffffffff81032bd6
>  #4 [ffff88084d3234e0] no_context at ffffffff8106e431
>  #5 [ffff88084d323530] __bad_area_nosemaphore at ffffffff8106e610
>  #6 [ffff88084d323590] bad_area_nosemaphore at ffffffff8106e6f4
>  #7 [ffff88084d3235a0] __do_page_fault at ffffffff8106ebdc
>  #8 [ffff88084d323620] do_page_fault at ffffffff8106f057
>  #9 [ffff88084d323660] page_fault at ffffffff816e3148
>     [exception RIP: ib_create_qp+427]
>     RIP: ffffffffa02554fb RSP: ffff88084d323718 RFLAGS: 00010246
>     RAX: 0000000000000004 RBX: fffffffffffffff4 RCX: 000000018020001f
>     RDX: ffff880830997fc0 RSI: 0000000000000001 RDI: ffff88085f407200
>     RBP: ffff88084d323778 R8: 0000000000000001 R9: ffffea0020bae210
>     R10: ffffea0020bae218 R11: 0000000000000001 R12: ffff88084d3237c8
>     R13: 00000000fffffff4 R14: ffff880859fa5000 R15: ffff88082eb89800
>     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #10 [ffff88084d323780] rdma_create_qp at ffffffffa0782681 [rdma_cm]
> #11 [ffff88084d3237b0] nvmet_rdma_create_queue_ib at ffffffffa07c43f3 [nvmet_rdma]
> #12 [ffff88084d323860] nvmet_rdma_alloc_queue at ffffffffa07c5ba9 [nvmet_rdma]
> #13 [ffff88084d323900] nvmet_rdma_queue_connect at ffffffffa07c5c96 [nvmet_rdma]
> #14 [ffff88084d323980] nvmet_rdma_cm_handler at ffffffffa07c6450 [nvmet_rdma]
> #15 [ffff88084d3239b0] iw_conn_req_handler at ffffffffa0787480 [rdma_cm]
> #16 [ffff88084d323a60] cm_conn_req_handler at ffffffffa0775f06 [iw_cm]
> #17 [ffff88084d323ab0] process_event at ffffffffa0776019 [iw_cm]
> #18 [ffff88084d323af0] cm_work_handler at ffffffffa0776170 [iw_cm]
> #19 [ffff88084d323cb0] process_one_work at ffffffff810a1483
> #20 [ffff88084d323d90] worker_thread at ffffffff810a211d
> #21 [ffff88084d323ec0] kthread at ffffffff810a6c5c
> #22 [ffff88084d323f50] ret_from_fork at ffffffff816e1ebf
>
> Fixes: 632bc3f65081 ("IB/core, RDMA RW API: Do not exceed QP SGE send limit")
> Signed-off-by: Steve Wise <swise@opengridcomputing.com>
> Cc: stable@vger.kernel.org

Thanks, applied.
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index f2b776e..5f88ccd 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -821,7 +821,7 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
 		if (ret) {
 			pr_err("failed to init MR pool ret= %d\n", ret);
 			ib_destroy_qp(qp);
-			qp = ERR_PTR(ret);
+			return ERR_PTR(ret);
 		}
 	}
Function ib_create_qp() was failing to return an error when
rdma_rw_init_mrs() fails, causing a crash further down in ib_create_qp()
when trying to dereference the qp pointer which was actually a negative
errno.

The crash:

crash> log|grep BUG
[ 136.458121] BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
crash> bt
PID: 3736 TASK: ffff8808543215c0 CPU: 2 COMMAND: "kworker/u64:2"
 #0 [ffff88084d323340] machine_kexec at ffffffff8105fbb0
 #1 [ffff88084d3233b0] __crash_kexec at ffffffff81116758
 #2 [ffff88084d323480] crash_kexec at ffffffff8111682d
 #3 [ffff88084d3234b0] oops_end at ffffffff81032bd6
 #4 [ffff88084d3234e0] no_context at ffffffff8106e431
 #5 [ffff88084d323530] __bad_area_nosemaphore at ffffffff8106e610
 #6 [ffff88084d323590] bad_area_nosemaphore at ffffffff8106e6f4
 #7 [ffff88084d3235a0] __do_page_fault at ffffffff8106ebdc
 #8 [ffff88084d323620] do_page_fault at ffffffff8106f057
 #9 [ffff88084d323660] page_fault at ffffffff816e3148
    [exception RIP: ib_create_qp+427]
    RIP: ffffffffa02554fb RSP: ffff88084d323718 RFLAGS: 00010246
    RAX: 0000000000000004 RBX: fffffffffffffff4 RCX: 000000018020001f
    RDX: ffff880830997fc0 RSI: 0000000000000001 RDI: ffff88085f407200
    RBP: ffff88084d323778 R8: 0000000000000001 R9: ffffea0020bae210
    R10: ffffea0020bae218 R11: 0000000000000001 R12: ffff88084d3237c8
    R13: 00000000fffffff4 R14: ffff880859fa5000 R15: ffff88082eb89800
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#10 [ffff88084d323780] rdma_create_qp at ffffffffa0782681 [rdma_cm]
#11 [ffff88084d3237b0] nvmet_rdma_create_queue_ib at ffffffffa07c43f3 [nvmet_rdma]
#12 [ffff88084d323860] nvmet_rdma_alloc_queue at ffffffffa07c5ba9 [nvmet_rdma]
#13 [ffff88084d323900] nvmet_rdma_queue_connect at ffffffffa07c5c96 [nvmet_rdma]
#14 [ffff88084d323980] nvmet_rdma_cm_handler at ffffffffa07c6450 [nvmet_rdma]
#15 [ffff88084d3239b0] iw_conn_req_handler at ffffffffa0787480 [rdma_cm]
#16 [ffff88084d323a60] cm_conn_req_handler at ffffffffa0775f06 [iw_cm]
#17 [ffff88084d323ab0] process_event at ffffffffa0776019 [iw_cm]
#18 [ffff88084d323af0] cm_work_handler at ffffffffa0776170 [iw_cm]
#19 [ffff88084d323cb0] process_one_work at ffffffff810a1483
#20 [ffff88084d323d90] worker_thread at ffffffff810a211d
#21 [ffff88084d323ec0] kthread at ffffffff810a6c5c
#22 [ffff88084d323f50] ret_from_fork at ffffffff816e1ebf

Fixes: 632bc3f65081 ("IB/core, RDMA RW API: Do not exceed QP SGE send limit")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: stable@vger.kernel.org
---
 drivers/infiniband/core/verbs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)