From patchwork Tue Jun 18 18:10:19 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Himanshu Madhani X-Patchwork-Id: 11002469 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 928736C5 for ; Tue, 18 Jun 2019 18:11:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 85D3528B45 for ; Tue, 18 Jun 2019 18:11:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 79C6928B35; Tue, 18 Jun 2019 18:11:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8AF9728B27 for ; Tue, 18 Jun 2019 18:11:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729947AbfFRSLO (ORCPT ); Tue, 18 Jun 2019 14:11:14 -0400 Received: from mx0b-0016f401.pphosted.com ([67.231.156.173]:18214 "EHLO mx0b-0016f401.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727616AbfFRSLO (ORCPT ); Tue, 18 Jun 2019 14:11:14 -0400 Received: from pps.filterd (m0045851.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x5II5DIO006579; Tue, 18 Jun 2019 11:11:11 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type; s=pfpt0818; bh=VzIHvVHplbtxWvCXNv67xB2KpLL3aCCRBBaCHDxU3+U=; b=cteiL3X4bDN0Ti6DHqvmY7VdFmAEaOilyTNf6+2KJPc4B2uLaevjzlFDR4F6+5wSOP6b 
S1+XQrjW2OePKod9G/BjemyQKbyIT1e5RozHgmdilLGjo42qyMX+snQ2mUMzjQ4y07Lm ICjlU/ailchdpiWYutNhdacOG/egDguqhh6HEZRd5jwAJIjBKvcEQeinE4qYeMOnrCxF Z++Ddr8o9U3G3nvS0LDFh49nBcZ+Ns3osDZzjOqeCqXS/DlkWTUyrc2GtH+NIrM3Fjjh g1EVB0jz0AdEvnJS1Ul+IAcy+mpdM6vdEBH/xWchnUbYjEOBX8UlQWEaDZ9uWGz0z8Xm 8w== Received: from sc-exch03.marvell.com ([199.233.58.183]) by mx0b-0016f401.pphosted.com with ESMTP id 2t6xkh1ppx-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Tue, 18 Jun 2019 11:11:11 -0700 Received: from SC-EXCH01.marvell.com (10.93.176.81) by SC-EXCH03.marvell.com (10.93.176.83) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 18 Jun 2019 11:11:10 -0700 Received: from maili.marvell.com (10.93.176.43) by SC-EXCH01.marvell.com (10.93.176.81) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 18 Jun 2019 11:11:10 -0700 Received: from dut1171.mv.qlogic.com (unknown [10.112.88.18]) by maili.marvell.com (Postfix) with ESMTP id C2E353F703F; Tue, 18 Jun 2019 11:11:09 -0700 (PDT) Received: from dut1171.mv.qlogic.com (localhost [127.0.0.1]) by dut1171.mv.qlogic.com (8.14.7/8.14.7) with ESMTP id x5IIB98i016594; Tue, 18 Jun 2019 11:11:09 -0700 Received: (from root@localhost) by dut1171.mv.qlogic.com (8.14.7/8.14.7/Submit) id x5IIB9a7016593; Tue, 18 Jun 2019 11:11:09 -0700 From: Himanshu Madhani To: , CC: , Subject: [PATCH v2 1/3] qla2xxx: Fix kernel crash after disconnecting NVMe devices Date: Tue, 18 Jun 2019 11:10:19 -0700 Message-ID: <20190618181021.16547-2-hmadhani@marvell.com> X-Mailer: git-send-email 2.12.0 In-Reply-To: <20190618181021.16547-1-hmadhani@marvell.com> References: <20190618181021.16547-1-hmadhani@marvell.com> MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2019-06-18_08:,, signatures=0 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Arun Easi BUG: unable to handle kernel NULL 
pointer dereference at (null)
IP: [] qla_nvme_unregister_remote_port+0x6c/0xf0 [qla2xxx]
PGD 800000084cf41067 PUD 84d288067 PMD 0
Oops: 0000 [#1] SMP
Call Trace:
 [] process_one_work+0x17f/0x440
 [] worker_thread+0x126/0x3c0
 [] ? manage_workers.isra.26+0x2a0/0x2a0
 [] kthread+0xd1/0xe0
 [] ? insert_kthread_work+0x40/0x40
 [] ret_from_fork_nospec_begin+0x21/0x21
 [] ? insert_kthread_work+0x40/0x40
RIP [] qla_nvme_unregister_remote_port+0x6c/0xf0 [qla2xxx]

The crash is due to a bad entry in the nvme_rport_list. This list is not
protected, so when a remoteport_delete callback is called, the driver
traverses the list and crashes. The list can be removed, and the driver can
traverse the main fcport list instead. This fix does exactly that.

Signed-off-by: Arun Easi
Signed-off-by: Himanshu Madhani
---
 drivers/scsi/qla2xxx/qla_def.h  |  1 -
 drivers/scsi/qla2xxx/qla_nvme.c | 37 ++++++++++---------------------------
 drivers/scsi/qla2xxx/qla_nvme.h |  1 -
 drivers/scsi/qla2xxx/qla_os.c   |  1 -
 4 files changed, 10 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index 1a4095c56eee..602ed24bb806 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -4376,7 +4376,6 @@ typedef struct scsi_qla_host { struct nvme_fc_local_port *nvme_local_port; struct completion nvme_del_done; - struct list_head nvme_rport_list; uint16_t fcoe_vlan_id; uint16_t fcoe_fcf_idx;
diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
index 22e3fba28e51..b43c62758cec 100644
--- a/drivers/scsi/qla2xxx/qla_nvme.c
+++ b/drivers/scsi/qla2xxx/qla_nvme.c
@@ -74,7 +74,6 @@ int qla_nvme_register_remote(struct scsi_qla_host *vha, struct fc_port *fcport) rport = fcport->nvme_remote_port->private; rport->fcport = fcport; - list_add_tail(&rport->list, &vha->nvme_rport_list); fcport->nvme_flag |= NVME_FLAG_REGISTERED; return 0;
@@ -542,19 +541,12 @@ static void qla_nvme_localport_delete(struct nvme_fc_local_port
*lport) static void qla_nvme_remoteport_delete(struct nvme_fc_remote_port *rport) { fc_port_t *fcport; - struct qla_nvme_rport *qla_rport = rport->private, *trport; + struct qla_nvme_rport *qla_rport = rport->private; fcport = qla_rport->fcport; fcport->nvme_remote_port = NULL; fcport->nvme_flag &= ~NVME_FLAG_REGISTERED; - list_for_each_entry_safe(qla_rport, trport, - &fcport->vha->nvme_rport_list, list) { - if (qla_rport->fcport == fcport) { - list_del(&qla_rport->list); - break; - } - } complete(&fcport->nvme_del_done); if (!test_bit(UNLOADING, &fcport->vha->dpc_flags)) { @@ -590,7 +582,7 @@ static void qla_nvme_unregister_remote_port(struct work_struct *work) { struct fc_port *fcport = container_of(work, struct fc_port, nvme_del_work); - struct qla_nvme_rport *qla_rport, *trport; + int ret; if (!IS_ENABLED(CONFIG_NVME_FC)) return; @@ -598,23 +590,14 @@ static void qla_nvme_unregister_remote_port(struct work_struct *work) ql_log(ql_log_warn, NULL, 0x2112, "%s: unregister remoteport on %p\n",__func__, fcport); - list_for_each_entry_safe(qla_rport, trport, - &fcport->vha->nvme_rport_list, list) { - if (qla_rport->fcport == fcport) { - ql_log(ql_log_info, fcport->vha, 0x2113, - "%s: fcport=%p\n", __func__, fcport); - nvme_fc_set_remoteport_devloss - (fcport->nvme_remote_port, 0); - init_completion(&fcport->nvme_del_done); - if (nvme_fc_unregister_remoteport - (fcport->nvme_remote_port)) - ql_log(ql_log_info, fcport->vha, 0x2114, - "%s: Failed to unregister nvme_remote_port\n", - __func__); - wait_for_completion(&fcport->nvme_del_done); - break; - } - } + nvme_fc_set_remoteport_devloss(fcport->nvme_remote_port, 0); + init_completion(&fcport->nvme_del_done); + ret = nvme_fc_unregister_remoteport(fcport->nvme_remote_port); + if (ret) + ql_log(ql_log_info, fcport->vha, 0x2114, + "%s: Failed to unregister nvme_remote_port (%d)\n", + __func__, ret); + wait_for_completion(&fcport->nvme_del_done); } void qla_nvme_delete(struct scsi_qla_host *vha) diff --git 
a/drivers/scsi/qla2xxx/qla_nvme.h b/drivers/scsi/qla2xxx/qla_nvme.h index d3b8a6440113..2d088add7011 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.h +++ b/drivers/scsi/qla2xxx/qla_nvme.h @@ -37,7 +37,6 @@ struct nvme_private { }; struct qla_nvme_rport { - struct list_head list; struct fc_port *fcport; }; diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 00fee5bf4de1..ae93ae2b6090 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -4789,7 +4789,6 @@ struct scsi_qla_host *qla2x00_create_host(struct scsi_host_template *sht, INIT_LIST_HEAD(&vha->plogi_ack_list); INIT_LIST_HEAD(&vha->qp_list); INIT_LIST_HEAD(&vha->gnl.fcports); - INIT_LIST_HEAD(&vha->nvme_rport_list); INIT_LIST_HEAD(&vha->gpnid_list); INIT_WORK(&vha->iocb_work, qla2x00_iocb_work_fn); From patchwork Tue Jun 18 18:10:20 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Himanshu Madhani X-Patchwork-Id: 11002471 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1A9A914DB for ; Tue, 18 Jun 2019 18:11:45 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 09F7D28A86 for ; Tue, 18 Jun 2019 18:11:45 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id F282C28B3E; Tue, 18 Jun 2019 18:11:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 90BC228B38 for ; Tue, 18 Jun 2019 18:11:44 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729898AbfFRSLo (ORCPT ); Tue, 18 Jun 2019 14:11:44 -0400 Received: from mx0a-0016f401.pphosted.com ([67.231.148.174]:8132 "EHLO mx0b-0016f401.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727616AbfFRSLo (ORCPT ); Tue, 18 Jun 2019 14:11:44 -0400 Received: from pps.filterd (m0045849.ppops.net [127.0.0.1]) by mx0a-0016f401.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x5II4cOq015568; Tue, 18 Jun 2019 11:11:35 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type; s=pfpt0818; bh=NHRx3xOMR22e4vz2pkoNFJNlxcqARZlkcWVkTlcDHGk=; b=eZ88DeBX1F4JNJ1nKfuDLe1ukgRrugPy2B5yqKvmK9bMzRFrGjqoN7r633XiqrqeVgdl 2ibtF44lF2jt6vpfT+6u57y0a+onnbNhqRIJ36wQbNK1onIB++F3clSH+/ITlUDIhdQ6 TOHsiBzlNIj9Ch3LRVnd+MuNjwkfOBaMozXk/0FVjPtqa0wBN6Wvmg2D8q4XIQUCQ4JP ABpoN74ht7qex3ejEl35SZEqFepgqNqhhvoDtiKZVtOg0sGD4M09rjrzpDLg4yxrj+H7 /LTyWH1092NWNLCOrZ3jJOFUYS1zspt7UYHxK9S2l1sdRNhiu2LBbASwOl/1V/jljjWG Lg== Received: from sc-exch01.marvell.com ([199.233.58.181]) by mx0a-0016f401.pphosted.com with ESMTP id 2t73vqgcrh-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Tue, 18 Jun 2019 11:11:34 -0700 Received: from SC-EXCH03.marvell.com (10.93.176.83) by SC-EXCH01.marvell.com (10.93.176.81) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 18 Jun 2019 11:11:34 -0700 Received: from maili.marvell.com (10.93.176.43) by SC-EXCH03.marvell.com (10.93.176.83) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 18 Jun 2019 11:11:34 -0700 Received: from dut1171.mv.qlogic.com (unknown [10.112.88.18]) by maili.marvell.com (Postfix) with ESMTP id E2D873F703F; Tue, 18 Jun 2019 11:11:33 -0700 (PDT) Received: from dut1171.mv.qlogic.com (localhost [127.0.0.1]) by dut1171.mv.qlogic.com (8.14.7/8.14.7) with ESMTP id x5IIBXRX016606; Tue, 18 Jun 2019 11:11:33 -0700 
Received: (from root@localhost) by dut1171.mv.qlogic.com (8.14.7/8.14.7/Submit) id x5IIBXP6016597; Tue, 18 Jun 2019 11:11:33 -0700
From: Himanshu Madhani
To: ,
CC: ,
Subject: [PATCH v2 2/3] qla2xxx: on session delete return nvme cmd
Date: Tue, 18 Jun 2019 11:10:20 -0700
Message-ID: <20190618181021.16547-3-hmadhani@marvell.com>
X-Mailer: git-send-email 2.12.0
In-Reply-To: <20190618181021.16547-1-hmadhani@marvell.com>
References: <20190618181021.16547-1-hmadhani@marvell.com>
MIME-Version: 1.0
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2019-06-18_08:,, signatures=0
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-scsi@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Quinn Tran

- on session delete or chip reset, reject all NVME commands.
- on NVME command submission error, free srb resource.

Signed-off-by: Quinn Tran
Signed-off-by: Himanshu Madhani
Reported-by: kbuild test robot
Reported-by: Dan Carpenter
---
 drivers/scsi/qla2xxx/qla_nvme.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
index b43c62758cec..2c64457ce713 100644
--- a/drivers/scsi/qla2xxx/qla_nvme.c
+++ b/drivers/scsi/qla2xxx/qla_nvme.c
@@ -241,6 +241,10 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport, vha = fcport->vha; ha = vha->hw; + + if (!ha->flags.fw_started || (fcport && fcport->deleted)) + return rval; + /* Alloc SRB structure */ sp = qla2x00_get_sp(vha, fcport, GFP_ATOMIC); if (!sp)
@@ -272,6 +276,7 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport, "qla2x00_start_sp failed = %d\n", rval); atomic_dec(&sp->ref_count); wake_up(&sp->nvme_ls_waitq); + sp->free(sp); return rval; }
@@ -488,7 +493,7 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport, vha = fcport->vha; - if (test_bit(ABORT_ISP_ACTIVE, &vha->dpc_flags)) + if ((qpair && !qpair->fw_started) || (fcport &&
fcport->deleted)) return rval; /* @@ -523,6 +528,7 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport, "qla2x00_start_nvme_mq failed = %d\n", rval); atomic_dec(&sp->ref_count); wake_up(&sp->nvme_ls_waitq); + sp->free(sp); } return rval; @@ -549,14 +555,13 @@ static void qla_nvme_remoteport_delete(struct nvme_fc_remote_port *rport) complete(&fcport->nvme_del_done); - if (!test_bit(UNLOADING, &fcport->vha->dpc_flags)) { - INIT_WORK(&fcport->free_work, qlt_free_session_done); - schedule_work(&fcport->free_work); - } + INIT_WORK(&fcport->free_work, qlt_free_session_done); + schedule_work(&fcport->free_work); fcport->nvme_flag &= ~NVME_FLAG_DELETING; ql_log(ql_log_info, fcport->vha, 0x2110, - "remoteport_delete of %p completed.\n", fcport); + "remoteport_delete of %p %8phN completed.\n", + fcport, fcport->port_name); } static struct nvme_fc_port_template qla_nvme_fc_transport = { @@ -588,7 +593,8 @@ static void qla_nvme_unregister_remote_port(struct work_struct *work) return; ql_log(ql_log_warn, NULL, 0x2112, - "%s: unregister remoteport on %p\n",__func__, fcport); + "%s: unregister remoteport on %p %8phN\n", + __func__, fcport, fcport->port_name); nvme_fc_set_remoteport_devloss(fcport->nvme_remote_port, 0); init_completion(&fcport->nvme_del_done); From patchwork Tue Jun 18 18:10:21 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Himanshu Madhani X-Patchwork-Id: 11002473 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C55DC14DB for ; Tue, 18 Jun 2019 18:12:03 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B6A2928B36 for ; Tue, 18 Jun 2019 18:12:03 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AB17128B30; Tue, 18 Jun 2019 18:12:03 +0000 (UTC) 
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7796828B35 for ; Tue, 18 Jun 2019 18:12:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730169AbfFRSMB (ORCPT ); Tue, 18 Jun 2019 14:12:01 -0400 Received: from mx0a-0016f401.pphosted.com ([67.231.148.174]:20558 "EHLO mx0b-0016f401.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727616AbfFRSMB (ORCPT ); Tue, 18 Jun 2019 14:12:01 -0400 Received: from pps.filterd (m0045849.ppops.net [127.0.0.1]) by mx0a-0016f401.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x5II4cjk015571; Tue, 18 Jun 2019 11:11:59 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type; s=pfpt0818; bh=1Rhg3meYJ3OxnNZ2ZPE32LQrbyAsZ5erPqj9zbBW3d8=; b=DCcjLBScT023ZKsXDXvgiWktHyTx4OEQ5+nxg24aN5LzECb9DB3BXrJND+rm7oE6Plah cQ22+kwk6WKOvSROpYxiABLFxe+J6l2Kh70P3Jzhmxg+c072UtMpuUBvDitOJXEt/H57 8NMJ1vCaRxnOBt4L5zoHBHzUkefXgETbtIVKa8OBoSaNb7ORysifwfal79+t4/z0Mvtk bkYPtyb2fSFuQoHKg1/DHtyM9DI1IEidOoANB9IBta41nKl5tcJlDU1JQNB3+usANCgE mDYxHNWUAdCP79/gOdruOrT2rvJmZscTLn8hYhRtMEtcX/oIHhnYyWnQXUhGfxc0/Mgg ug== Received: from sc-exch04.marvell.com ([199.233.58.184]) by mx0a-0016f401.pphosted.com with ESMTP id 2t73vqgcuy-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Tue, 18 Jun 2019 11:11:59 -0700 Received: from SC-EXCH01.marvell.com (10.93.176.81) by SC-EXCH04.marvell.com (10.93.176.84) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 18 Jun 2019 11:11:58 -0700 Received: from maili.marvell.com (10.93.176.43) by 
SC-EXCH01.marvell.com (10.93.176.81) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 18 Jun 2019 11:11:58 -0700
Received: from dut1171.mv.qlogic.com (unknown [10.112.88.18]) by maili.marvell.com (Postfix) with ESMTP id 1B9563F7041; Tue, 18 Jun 2019 11:11:58 -0700 (PDT)
Received: from dut1171.mv.qlogic.com (localhost [127.0.0.1]) by dut1171.mv.qlogic.com (8.14.7/8.14.7) with ESMTP id x5IIBvne016610; Tue, 18 Jun 2019 11:11:57 -0700
Received: (from root@localhost) by dut1171.mv.qlogic.com (8.14.7/8.14.7/Submit) id x5IIBv78016609; Tue, 18 Jun 2019 11:11:57 -0700
From: Himanshu Madhani
To: ,
CC: ,
Subject: [PATCH v2 3/3] qla2xxx: Fix NVME cmd and LS cmd timeout race condition
Date: Tue, 18 Jun 2019 11:10:21 -0700
Message-ID: <20190618181021.16547-4-hmadhani@marvell.com>
X-Mailer: git-send-email 2.12.0
In-Reply-To: <20190618181021.16547-1-hmadhani@marvell.com>
References: <20190618181021.16547-1-hmadhani@marvell.com>
MIME-Version: 1.0
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2019-06-18_08:,, signatures=0
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-scsi@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Quinn Tran

This patch uses a kref to protect access between the fcp_abort path and the
nvme command and LS command completion paths. The stack trace below shows
the abort path accessing stale memory (nvme_private->sp). When the command
kref reaches 0, the nvme_private and srb resources are disconnected from
each other. Any subsequent nvme abort request will not be able to reference
the original srb.

[ 5631.003998] BUG: unable to handle kernel paging request at 00000010000005d8
[ 5631.004016] IP: [] qla_nvme_abort_work+0x22/0x100 [qla2xxx]
[ 5631.004086] Workqueue: events qla_nvme_abort_work [qla2xxx]
[ 5631.004097] RIP: 0010:[] [] qla_nvme_abort_work+0x22/0x100 [qla2xxx]
[ 5631.004109] Call Trace:
[ 5631.004115] [] ?
pwq_dec_nr_in_flight+0x64/0xb0 [ 5631.004117] [] process_one_work+0x17f/0x440 [ 5631.004120] [] worker_thread+0x126/0x3c0 Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani --- drivers/scsi/qla2xxx/qla_def.h | 2 + drivers/scsi/qla2xxx/qla_nvme.c | 164 ++++++++++++++++++++++++++++------------ drivers/scsi/qla2xxx/qla_nvme.h | 1 + 3 files changed, 117 insertions(+), 50 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index 602ed24bb806..85a27ee5d647 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -532,6 +532,8 @@ typedef struct srb { uint8_t cmd_type; uint8_t pad[3]; atomic_t ref_count; + struct kref cmd_kref; /* need to migrate ref_count over to this */ + void *priv; wait_queue_head_t nvme_ls_waitq; struct fc_port *fcport; struct scsi_qla_host *vha; diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c index 2c64457ce713..78df476e80a1 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.c +++ b/drivers/scsi/qla2xxx/qla_nvme.c @@ -123,53 +123,91 @@ static int qla_nvme_alloc_queue(struct nvme_fc_local_port *lport, return 0; } +static void qla_nvme_release_fcp_cmd_kref(struct kref *kref) +{ + struct srb *sp = container_of(kref, struct srb, cmd_kref); + struct nvme_private *priv = (struct nvme_private *)sp->priv; + struct nvmefc_fcp_req *fd; + struct srb_iocb *nvme; + unsigned long flags; + + if (!priv) + goto out; + + nvme = &sp->u.iocb_cmd; + fd = nvme->u.nvme.desc; + + spin_lock_irqsave(&priv->cmd_lock, flags); + priv->sp = NULL; + sp->priv = NULL; + if (priv->comp_status == QLA_SUCCESS) { + fd->rcv_rsplen = nvme->u.nvme.rsp_pyld_len; + } else { + fd->rcv_rsplen = 0; + fd->transferred_length = 0; + } + fd->status = 0; + spin_unlock_irqrestore(&priv->cmd_lock, flags); + + fd->done(fd); +out: + qla2xxx_rel_qpair_sp(sp->qpair, sp); +} + +static void qla_nvme_release_ls_cmd_kref(struct kref *kref) +{ + struct srb *sp = container_of(kref, struct srb, cmd_kref); + struct 
nvme_private *priv = (struct nvme_private *)sp->priv; + struct nvmefc_ls_req *fd; + unsigned long flags; + + if (!priv) + goto out; + + spin_lock_irqsave(&priv->cmd_lock, flags); + priv->sp = NULL; + sp->priv = NULL; + spin_unlock_irqrestore(&priv->cmd_lock, flags); + + fd = priv->fd; + fd->done(fd, priv->comp_status); +out: + qla2x00_rel_sp(sp); +} + +static void qla_nvme_ls_complete(struct work_struct *work) +{ + struct nvme_private *priv = + container_of(work, struct nvme_private, ls_work); + + kref_put(&priv->sp->cmd_kref, qla_nvme_release_ls_cmd_kref); +} + static void qla_nvme_sp_ls_done(void *ptr, int res) { srb_t *sp = ptr; - struct srb_iocb *nvme; - struct nvmefc_ls_req *fd; struct nvme_private *priv; - if (WARN_ON_ONCE(atomic_read(&sp->ref_count) == 0)) + if (WARN_ON_ONCE(kref_read(&sp->cmd_kref) == 0)) return; - atomic_dec(&sp->ref_count); - if (res) res = -EINVAL; - nvme = &sp->u.iocb_cmd; - fd = nvme->u.nvme.desc; - priv = fd->private; + priv = (struct nvme_private *)sp->priv; priv->comp_status = res; + INIT_WORK(&priv->ls_work, qla_nvme_ls_complete); schedule_work(&priv->ls_work); - /* work schedule doesn't need the sp */ - qla2x00_rel_sp(sp); } +/* it assumed that QPair lock is held. 
*/ static void qla_nvme_sp_done(void *ptr, int res) { srb_t *sp = ptr; - struct srb_iocb *nvme; - struct nvmefc_fcp_req *fd; + struct nvme_private *priv = (struct nvme_private *)sp->priv; - nvme = &sp->u.iocb_cmd; - fd = nvme->u.nvme.desc; - - if (WARN_ON_ONCE(atomic_read(&sp->ref_count) == 0)) - return; - - atomic_dec(&sp->ref_count); - - if (res == QLA_SUCCESS) { - fd->rcv_rsplen = nvme->u.nvme.rsp_pyld_len; - } else { - fd->rcv_rsplen = 0; - fd->transferred_length = 0; - } - fd->status = 0; - fd->done(fd); - qla2xxx_rel_qpair_sp(sp->qpair, sp); + priv->comp_status = res; + kref_put(&sp->cmd_kref, qla_nvme_release_fcp_cmd_kref); return; } @@ -188,44 +226,53 @@ static void qla_nvme_abort_work(struct work_struct *work) __func__, sp, sp->handle, fcport, fcport->deleted); if (!ha->flags.fw_started && (fcport && fcport->deleted)) - return; + goto out; if (ha->flags.host_shutting_down) { ql_log(ql_log_info, sp->fcport->vha, 0xffff, "%s Calling done on sp: %p, type: 0x%x, sp->ref_count: 0x%x\n", __func__, sp, sp->type, atomic_read(&sp->ref_count)); sp->done(sp, 0); - return; + goto out; } - if (WARN_ON_ONCE(atomic_read(&sp->ref_count) == 0)) - return; - rval = ha->isp_ops->abort_command(sp); ql_dbg(ql_dbg_io, fcport->vha, 0x212b, "%s: %s command for sp=%p, handle=%x on fcport=%p rval=%x\n", __func__, (rval != QLA_SUCCESS) ? "Failed to abort" : "Aborted", sp, sp->handle, fcport, rval); + +out: + /* kref_get was done before work was schedule. 
*/ + if (sp->type == SRB_NVME_CMD) + kref_put(&sp->cmd_kref, qla_nvme_release_fcp_cmd_kref); + else if (sp->type == SRB_NVME_LS) + kref_put(&sp->cmd_kref, qla_nvme_release_ls_cmd_kref); } static void qla_nvme_ls_abort(struct nvme_fc_local_port *lport, struct nvme_fc_remote_port *rport, struct nvmefc_ls_req *fd) { struct nvme_private *priv = fd->private; + unsigned long flags; + + spin_lock_irqsave(&priv->cmd_lock, flags); + if (!priv->sp) { + spin_unlock_irqrestore(&priv->cmd_lock, flags); + return; + } + + if (!kref_get_unless_zero(&priv->sp->cmd_kref)) { + spin_unlock_irqrestore(&priv->cmd_lock, flags); + return; + } + spin_unlock_irqrestore(&priv->cmd_lock, flags); INIT_WORK(&priv->abort_work, qla_nvme_abort_work); schedule_work(&priv->abort_work); } -static void qla_nvme_ls_complete(struct work_struct *work) -{ - struct nvme_private *priv = - container_of(work, struct nvme_private, ls_work); - struct nvmefc_ls_req *fd = priv->fd; - - fd->done(fd, priv->comp_status); -} static int qla_nvme_ls_req(struct nvme_fc_local_port *lport, struct nvme_fc_remote_port *rport, struct nvmefc_ls_req *fd) @@ -253,11 +300,12 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport, sp->type = SRB_NVME_LS; sp->name = "nvme_ls"; sp->done = qla_nvme_sp_ls_done; - atomic_set(&sp->ref_count, 1); - nvme = &sp->u.iocb_cmd; + sp->priv = (void *)priv; priv->sp = sp; + kref_init(&sp->cmd_kref); + spin_lock_init(&priv->cmd_lock); + nvme = &sp->u.iocb_cmd; priv->fd = fd; - INIT_WORK(&priv->ls_work, qla_nvme_ls_complete); nvme->u.nvme.desc = fd; nvme->u.nvme.dir = 0; nvme->u.nvme.dl = 0; @@ -274,9 +322,10 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport, if (rval != QLA_SUCCESS) { ql_log(ql_log_warn, vha, 0x700e, "qla2x00_start_sp failed = %d\n", rval); - atomic_dec(&sp->ref_count); wake_up(&sp->nvme_ls_waitq); - sp->free(sp); + sp->priv = NULL; + priv->sp = NULL; + qla2x00_rel_sp(sp); return rval; } @@ -288,6 +337,18 @@ static void qla_nvme_fcp_abort(struct 
nvme_fc_local_port *lport, struct nvmefc_fcp_req *fd) { struct nvme_private *priv = fd->private; + unsigned long flags; + + spin_lock_irqsave(&priv->cmd_lock, flags); + if (!priv->sp) { + spin_unlock_irqrestore(&priv->cmd_lock, flags); + return; + } + if (!kref_get_unless_zero(&priv->sp->cmd_kref)) { + spin_unlock_irqrestore(&priv->cmd_lock, flags); + return; + } + spin_unlock_irqrestore(&priv->cmd_lock, flags); INIT_WORK(&priv->abort_work, qla_nvme_abort_work); schedule_work(&priv->abort_work); @@ -511,8 +572,10 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport, if (!sp) return -EBUSY; - atomic_set(&sp->ref_count, 1); init_waitqueue_head(&sp->nvme_ls_waitq); + kref_init(&sp->cmd_kref); + spin_lock_init(&priv->cmd_lock); + sp->priv = (void *)priv; priv->sp = sp; sp->type = SRB_NVME_CMD; sp->name = "nvme_cmd"; @@ -526,9 +589,10 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport, if (rval != QLA_SUCCESS) { ql_log(ql_log_warn, vha, 0x212d, "qla2x00_start_nvme_mq failed = %d\n", rval); - atomic_dec(&sp->ref_count); wake_up(&sp->nvme_ls_waitq); - sp->free(sp); + sp->priv = NULL; + priv->sp = NULL; + qla2xxx_rel_qpair_sp(sp->qpair, sp); } return rval; diff --git a/drivers/scsi/qla2xxx/qla_nvme.h b/drivers/scsi/qla2xxx/qla_nvme.h index 2d088add7011..67bb4a2a3742 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.h +++ b/drivers/scsi/qla2xxx/qla_nvme.h @@ -34,6 +34,7 @@ struct nvme_private { struct work_struct ls_work; struct work_struct abort_work; int comp_status; + spinlock_t cmd_lock; }; struct qla_nvme_rport {
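
For readers following patch 3/3, the lifetime rule it introduces can be sketched outside the kernel. What follows is a minimal userspace illustration under stated assumptions, not driver code: struct cmd and struct cmd_private are hypothetical stand-ins for struct srb and struct nvme_private, an atomic_int stands in for cmd_kref, and a small atomic_flag spinlock stands in for cmd_lock. All helper names are invented for the example, and C11 <stdatomic.h> is assumed. The abort side may take a reference only while holding the lock and only if the count is still nonzero; the release side clears both back-pointers under the same lock before freeing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct cmd;

/* Hypothetical stand-in for struct nvme_private (owned by the upper layer). */
struct cmd_private {
	atomic_flag lock;	/* plays the role of cmd_lock */
	struct cmd *sp;		/* NULL once the command has been released */
	int done_calls;		/* demo only: counts completions reported upward */
};

/* Hypothetical stand-in for struct srb. */
struct cmd {
	atomic_int refs;	/* plays the role of cmd_kref */
	struct cmd_private *priv;
};

static void lock_acquire(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void lock_release(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Submission: link both objects and start with one reference. */
static struct cmd *cmd_start(struct cmd_private *priv)
{
	struct cmd *sp = malloc(sizeof(*sp));

	if (!sp)
		return NULL;
	atomic_init(&sp->refs, 1);
	sp->priv = priv;
	atomic_flag_clear(&priv->lock);
	priv->sp = sp;
	priv->done_calls = 0;
	return sp;
}

/* Runs once, when the last reference is dropped. */
static void cmd_release(struct cmd *sp)
{
	struct cmd_private *priv = sp->priv;

	lock_acquire(&priv->lock);
	priv->sp = NULL;	/* later aborts can no longer find sp */
	sp->priv = NULL;
	lock_release(&priv->lock);

	priv->done_calls++;	/* stands in for the fd->done() callback */
	free(sp);
}

static void cmd_put(struct cmd *sp)
{
	int prev = atomic_fetch_sub(&sp->refs, 1);

	assert(prev > 0);	/* catch reference underflow */
	if (prev == 1)
		cmd_release(sp);
}

/* Take a reference only if the count has not already reached zero. */
static bool cmd_get_unless_zero(struct cmd *sp)
{
	int old = atomic_load(&sp->refs);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&sp->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Abort entry point: returns a referenced command, or NULL if it is gone. */
static struct cmd *cmd_abort_begin(struct cmd_private *priv)
{
	struct cmd *sp;

	lock_acquire(&priv->lock);
	sp = priv->sp;
	if (sp && !cmd_get_unless_zero(sp))
		sp = NULL;
	lock_release(&priv->lock);
	return sp;
}
```

With these helpers, a completion that calls cmd_put() while an abort still holds its reference does not free the command. The final cmd_put() from the abort side does, and any later cmd_abort_begin() returns NULL instead of a stale pointer.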