From patchwork Tue Jan 30 23:58:51 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Smart
X-Patchwork-Id: 10192919
From: James Smart
To: linux-scsi@vger.kernel.org
Cc: James Smart, Dick Kennedy, James Smart
Subject: [PATCH v3 07/19] lpfc: Fix IO failure during hba reset testing with nvme io.
Date: Tue, 30 Jan 2018 15:58:51 -0800
Message-Id: <20180130235903.5316-8-jsmart2021@gmail.com>
X-Mailer: git-send-email 2.13.1
In-Reply-To: <20180130235903.5316-1-jsmart2021@gmail.com>
References: <20180130235903.5316-1-jsmart2021@gmail.com>
X-Mailing-List: linux-scsi@vger.kernel.org

A stress test repeatedly resetting the adapter while performing io
would eventually report I/O failures and missing nvme namespaces.

The driver was setting the nvmefc_fcp_req->private pointer to NULL
during the IO completion routine before upcalling done(). If the
transport was also running an abort for that IO, the driver would
fail the abort with message 6140. Failing the abort is not allowed
by the nvme-fc transport, as it mandates that the io must be returned
to the transport. As that does not happen, the transport controller
delete has an outstanding reference and can't complete teardown.

The NULL-ing of the private pointer should be done only when the io
is considered complete. It's complete when the adapter returns the
exchange with the "exchange busy" flag clear.

Move the NULL-ing of the structure to the done case. This leaves the
io contexts set while the exchange is busy, until the subsequent
XRI_ABORTED completion, which returns the exchange, is received.

Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
v3:
  Address review comment that preferred to continue to NULL if the
  io did complete in order to protect stale conditions. After review,
  we agreed, thus the above change.
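
For reviewers, here is a minimal user-space sketch of the ordering the
description above is after. It is not driver code: every sketch_* name
is hypothetical and only loosely models nCmd->private,
freqpriv->nvme_buf, lpfc_ncmd->nvmeCmd and LPFC_SBUF_XBUSY. It assumes
an abort can only be serviced while the request still points at its
buffer, and it folds the XRI_ABORTED handling into a single helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the lpfc or nvme-fc structures. */
#define SKETCH_XBUSY 0x1u		/* models LPFC_SBUF_XBUSY */

struct sketch_req;

struct sketch_buf {
	unsigned int flags;
	struct sketch_req *req;		/* models lpfc_ncmd->nvmeCmd */
};

struct sketch_priv {
	struct sketch_buf *buf;		/* models freqpriv->nvme_buf */
};

struct sketch_req {
	struct sketch_priv *priv;	/* models nCmd->private */
	int done_calls;			/* number of done() upcalls seen */
};

static void sketch_done(struct sketch_req *req)
{
	req->done_calls++;
}

/* Completion path with the patch applied: the linkage is cleared only
 * in the branch that also hands the io back through done().
 */
static void sketch_io_cmpl(struct sketch_buf *buf)
{
	struct sketch_req *req = buf->req;

	assert(req != NULL);
	if (!(buf->flags & SKETCH_XBUSY)) {
		req->priv->buf = NULL;
		sketch_done(req);
		buf->req = NULL;
	}
}

/* Assumed abort rule: the abort can be serviced only while the request
 * still points at its buffer.
 */
static bool sketch_abort(const struct sketch_req *req)
{
	return req->priv->buf != NULL;
}

/* Models the later XRI_ABORTED completion: the exchange is returned,
 * so the held-off completion can now finish.
 */
static void sketch_xri_aborted(struct sketch_buf *buf)
{
	buf->flags &= ~SKETCH_XBUSY;
	if (buf->req != NULL)
		sketch_io_cmpl(buf);
}
```

Calling sketch_io_cmpl() on a buffer with SKETCH_XBUSY set leaves the
linkage intact, so a racing sketch_abort() still finds the io. A later
sketch_xri_aborted() clears the linkage and makes the single done()
upcall.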
---
 drivers/scsi/lpfc/lpfc_nvme.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index 81e3a4f10c3c..c6e5b9972585 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -980,14 +980,14 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn,
 			phba->cpucheck_cmpl_io[lpfc_ncmd->cpu]++;
 	}
 #endif
-	freqpriv = nCmd->private;
-	freqpriv->nvme_buf = NULL;
 
 	/* NVME targets need completion held off until the abort exchange
 	 * completes unless the NVME Rport is getting unregistered.
 	 */
 
 	if (!(lpfc_ncmd->flags & LPFC_SBUF_XBUSY)) {
+		freqpriv = nCmd->private;
+		freqpriv->nvme_buf = NULL;
 		nCmd->done(nCmd);
 		lpfc_ncmd->nvmeCmd = NULL;
 	}