From patchwork Tue Nov 21 00:00:40 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Smart
X-Patchwork-Id: 10067507
From: James Smart
To: linux-scsi@vger.kernel.org
Cc: James Smart, Dick Kennedy, James Smart
Subject: [PATCH v3 13/17] lpfc: Correct driver deregistrations with host nvme transport
Date: Mon, 20 Nov 2017 16:00:40 -0800
Message-Id: <20171121000044.27702-14-jsmart2021@gmail.com>
X-Mailer: git-send-email 2.13.1
In-Reply-To: <20171121000044.27702-1-jsmart2021@gmail.com>
References: <20171121000044.27702-1-jsmart2021@gmail.com>
X-Mailing-List: linux-scsi@vger.kernel.org

The driver's interaction with the host nvme transport has been incorrect for a while.
The driver did not wait for the unregister callbacks (it waited only
5 jiffies). Thus the driver may remove objects that may be referenced
by subsequent abort commands from the transport, and the actual
unregister callback was effectively a no-op. This was especially
problematic if the driver was unloaded.

The driver now waits for the unregister callbacks, as it should,
before continuing with teardown.

Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
Reviewed-by: Hannes Reinecke
---
v3: per review: clear NLP_WAIT_FOR_UNREG in all cases
---
 drivers/scsi/lpfc/lpfc_disc.h |   2 +
 drivers/scsi/lpfc/lpfc_nvme.c | 116 +++++++++++++++++++++++++++++++++++++++---
 drivers/scsi/lpfc/lpfc_nvme.h |   2 +
 3 files changed, 114 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_disc.h b/drivers/scsi/lpfc/lpfc_disc.h
index f9a566eaef04..5a7547f9d8d8 100644
--- a/drivers/scsi/lpfc/lpfc_disc.h
+++ b/drivers/scsi/lpfc/lpfc_disc.h
@@ -134,6 +134,8 @@ struct lpfc_nodelist {
 	struct lpfc_scsicmd_bkt *lat_data; /* Latency data */
 	uint32_t fc4_prli_sent;
 	uint32_t upcall_flags;
+#define NLP_WAIT_FOR_UNREG 0x1
+
 	uint32_t nvme_fb_size; /* NVME target's supported byte cnt */
 #define NVME_FB_BIT_SHIFT 9 /* PRLI Rsp first burst in 512B units. */
 };
diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index d3ada630b427..3aa3b889b4cf 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -154,6 +154,10 @@ lpfc_nvme_localport_delete(struct nvme_fc_local_port *localport)
 {
 	struct lpfc_nvme_lport *lport = localport->private;
 
+	lpfc_printf_vlog(lport->vport, KERN_INFO, LOG_NVME,
+			 "6173 localport %p delete complete\n",
+			 lport);
+
 	/* release any threads waiting for the unreg to complete */
 	complete(&lport->lport_unreg_done);
 }
@@ -946,10 +950,19 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn,
 	freqpriv->nvme_buf = NULL;
 
 	/* NVME targets need completion held off until the abort exchange
-	 * completes.
+	 * completes unless the NVME Rport is getting unregistered.
 	 */
-	if (!(lpfc_ncmd->flags & LPFC_SBUF_XBUSY))
+	if (!(lpfc_ncmd->flags & LPFC_SBUF_XBUSY) ||
+	    ndlp->upcall_flags & NLP_WAIT_FOR_UNREG) {
+		/* Clear the XBUSY flag to prevent double completions.
+		 * The nvme rport is getting unregistered and there is
+		 * no need to defer the IO.
+		 */
+		if (lpfc_ncmd->flags & LPFC_SBUF_XBUSY)
+			lpfc_ncmd->flags &= ~LPFC_SBUF_XBUSY;
+
 		nCmd->done(nCmd);
+	}
 
 	spin_lock_irqsave(&phba->hbalock, flags);
 	lpfc_ncmd->nrport = NULL;
@@ -2234,6 +2247,47 @@ lpfc_nvme_create_localport(struct lpfc_vport *vport)
 	return ret;
 }
 
+/* lpfc_nvme_lport_unreg_wait - Wait for the host to complete an lport unreg.
+ *
+ * The driver has to wait for the host nvme transport to callback
+ * indicating the localport has successfully unregistered all
+ * resources. Since this is an uninterruptible wait, loop every ten
+ * seconds and print a message indicating no progress.
+ *
+ * An uninterruptible wait is used because of the risk of transport-to-
+ * driver state mismatch.
+ */
+void
+lpfc_nvme_lport_unreg_wait(struct lpfc_vport *vport,
+			   struct lpfc_nvme_lport *lport)
+{
+#if (IS_ENABLED(CONFIG_NVME_FC))
+	u32 wait_tmo;
+	int ret;
+
+	/* Host transport has to clean up and confirm requiring an indefinite
+	 * wait. Print a message if a 10 second wait expires and renew the
+	 * wait. This is unexpected.
+	 */
+	wait_tmo = msecs_to_jiffies(LPFC_NVME_WAIT_TMO * 1000);
+	while (true) {
+		ret = wait_for_completion_timeout(&lport->lport_unreg_done,
+						  wait_tmo);
+		if (unlikely(!ret)) {
+			lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_IOERR,
+					 "6176 Lport %p Localport %p wait "
+					 "timed out. Renewing.\n",
+					 lport, vport->localport);
+			continue;
+		}
+		break;
+	}
+	lpfc_printf_vlog(vport, KERN_INFO, LOG_NVME_IOERR,
+			 "6177 Lport %p Localport %p Complete Success\n",
+			 lport, vport->localport);
+#endif
+}
+
 /**
  * lpfc_nvme_destroy_localport - Destroy lpfc_nvme bound to nvme transport.
  * @pnvme: pointer to lpfc nvme data structure.
@@ -2268,7 +2322,11 @@ lpfc_nvme_destroy_localport(struct lpfc_vport *vport)
 	 */
 	init_completion(&lport->lport_unreg_done);
 	ret = nvme_fc_unregister_localport(localport);
-	wait_for_completion_timeout(&lport->lport_unreg_done, 5);
+
+	/* Wait for completion. This either blocks
+	 * indefinitely or succeeds
+	 */
+	lpfc_nvme_lport_unreg_wait(vport, lport);
 
 	/* Regardless of the unregister upcall response, clear
 	 * nvmei_support. All rports are unregistered and the
@@ -2424,6 +2482,47 @@ lpfc_nvme_register_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
 #endif
 }
 
+/* lpfc_nvme_rport_unreg_wait - Wait for the host to complete an rport unreg.
+ *
+ * The driver has to wait for the host nvme transport to callback
+ * indicating the remoteport has successfully unregistered all
+ * resources. Since this is an uninterruptible wait, loop every ten
+ * seconds and print a message indicating no progress.
+ *
+ * An uninterruptible wait is used because of the risk of transport-to-
+ * driver state mismatch.
+ */
+void
+lpfc_nvme_rport_unreg_wait(struct lpfc_vport *vport,
+			   struct lpfc_nvme_rport *rport)
+{
+#if (IS_ENABLED(CONFIG_NVME_FC))
+	u32 wait_tmo;
+	int ret;
+
+	/* Host transport has to clean up and confirm requiring an indefinite
+	 * wait. Print a message if a 10 second wait expires and renew the
+	 * wait. This is unexpected.
+	 */
+	wait_tmo = msecs_to_jiffies(LPFC_NVME_WAIT_TMO * 1000);
+	while (true) {
+		ret = wait_for_completion_timeout(&rport->rport_unreg_done,
+						  wait_tmo);
+		if (unlikely(!ret)) {
+			lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_IOERR,
+					 "6174 Rport %p Remoteport %p wait "
+					 "timed out. Renewing.\n",
+					 rport, rport->remoteport);
+			continue;
+		}
+		break;
+	}
+	lpfc_printf_vlog(vport, KERN_INFO, LOG_NVME_IOERR,
+			 "6175 Rport %p Remoteport %p Complete Success\n",
+			 rport, rport->remoteport);
+#endif
+}
+
 /* lpfc_nvme_unregister_port - unbind the DID and port_role from this rport.
  *
  * There is no notion of Devloss or rport recovery from the current
@@ -2480,14 +2579,19 @@ lpfc_nvme_unregister_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
 		/* No concern about the role change on the nvme remoteport.
 		 * The transport will update it.
 		 */
+		ndlp->upcall_flags |= NLP_WAIT_FOR_UNREG;
 		ret = nvme_fc_unregister_remoteport(remoteport);
-		if (ret != 0) {
+		if (ret != 0)
 			lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_DISC,
 					 "6167 NVME unregister failed %d "
 					 "port_state x%x\n",
 					 ret, remoteport->port_state);
-		}
-
+		else
+			/* Wait for completion. This either blocks
+			 * indefinitely or succeeds
+			 */
+			lpfc_nvme_rport_unreg_wait(vport, rport);
+		ndlp->upcall_flags &= ~NLP_WAIT_FOR_UNREG;
 	}
 
 	return;
diff --git a/drivers/scsi/lpfc/lpfc_nvme.h b/drivers/scsi/lpfc/lpfc_nvme.h
index fbfc1786cd04..903ec37f465f 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.h
+++ b/drivers/scsi/lpfc/lpfc_nvme.h
@@ -27,6 +27,8 @@
 
 #define LPFC_NVME_ERSP_LEN 0x20
 
+#define LPFC_NVME_WAIT_TMO 10
+
 struct lpfc_nvme_qhandle {
 	uint32_t index; /* WQ index to use */
 	uint32_t qidx; /* queue index passed to create */
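
Illustration only, not part of the patch: the renewing wait used by lpfc_nvme_lport_unreg_wait() and lpfc_nvme_rport_unreg_wait() can be sketched in userspace as below. This is a rough analogue under stated assumptions, not the driver code. Every sketch_* name is hypothetical, a pthread mutex and condition variable stand in for the kernel's struct completion, and the timeout is given in milliseconds rather than LPFC_NVME_WAIT_TMO seconds so the example finishes quickly.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct sketch_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void sketch_init_completion(struct sketch_completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Analogue of complete(): mark done and wake any waiter. */
static void sketch_complete(struct sketch_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Analogue of wait_for_completion_timeout(): true if completed,
 * false if tmo_ms elapsed first.
 */
static bool sketch_wait_timeout(struct sketch_completion *c, long tmo_ms)
{
	struct timespec deadline;
	bool done;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += tmo_ms / 1000;
	deadline.tv_nsec += (tmo_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&c->lock);
	while (!c->done) {
		if (pthread_cond_timedwait(&c->cond, &c->lock,
					   &deadline) == ETIMEDOUT)
			break;
	}
	done = c->done;
	pthread_mutex_unlock(&c->lock);
	return done;
}

/* Renew the wait until the callback fires; log each expiry.
 * Returns how many times the wait had to be renewed.
 */
static int sketch_unreg_wait(struct sketch_completion *c, long tmo_ms)
{
	int renewals = 0;

	while (!sketch_wait_timeout(c, tmo_ms)) {
		fprintf(stderr, "wait timed out. Renewing.\n");
		renewals++;
	}
	return renewals;
}

struct sketch_cb_arg {
	struct sketch_completion *c;
	long delay_ms;
};

/* Plays the role of the transport's delete callback, fired late. */
static void *sketch_delayed_callback(void *p)
{
	struct sketch_cb_arg *a = p;
	struct timespec d = {
		.tv_sec = a->delay_ms / 1000,
		.tv_nsec = (a->delay_ms % 1000) * 1000000L,
	};

	nanosleep(&d, NULL);
	sketch_complete(a->c);
	return NULL;
}

/* Start a callback that fires after delay_ms, then wait for it in
 * tmo_ms slices. Returns the number of renewals that were needed.
 */
int sketch_demo(long delay_ms, long tmo_ms)
{
	struct sketch_completion c;
	struct sketch_cb_arg arg = { .c = &c, .delay_ms = delay_ms };
	pthread_t t;
	int rc, renewals;

	sketch_init_completion(&c);
	rc = pthread_create(&t, NULL, sketch_delayed_callback, &arg);
	assert(rc == 0);
	(void)rc;
	renewals = sketch_unreg_wait(&c, tmo_ms);
	pthread_join(t, NULL);
	return renewals;
}
```

Calling sketch_demo(250, 50) should log a few renewals before returning, while sketch_demo(0, 1000) should normally return without any. As in the patch, the waiter never gives up: a timeout only produces a message, and teardown continues once the callback has actually run.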