From patchwork Thu Aug 9 20:04:14 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 10561885 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6734013BB for ; Thu, 9 Aug 2018 20:04:18 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 520D72B408 for ; Thu, 9 Aug 2018 20:04:18 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 45B082B700; Thu, 9 Aug 2018 20:04:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DEA812B408 for ; Thu, 9 Aug 2018 20:04:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727177AbeHIWaj (ORCPT ); Thu, 9 Aug 2018 18:30:39 -0400 Received: from mx1.redhat.com ([209.132.183.28]:50422 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726894AbeHIWaj (ORCPT ); Thu, 9 Aug 2018 18:30:39 -0400 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D8C99308FBAD; Thu, 9 Aug 2018 20:04:16 +0000 (UTC) Received: from gimli.home (ovpn-116-35.phx2.redhat.com [10.3.116.35]) by smtp.corp.redhat.com (Postfix) with ESMTP id A52B510018F8; Thu, 9 Aug 2018 20:04:14 +0000 (UTC) Subject: [PATCH v4 1/3] PCI: Export pcie_has_flr() From: Alex Williamson To: linux-pci@vger.kernel.org Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org Date: Thu, 09 Aug 2018 14:04:14 -0600 Message-ID: <153384505430.15793.4988988835441214121.stgit@gimli.home> In-Reply-To: <153384487209.15793.15203778129263981368.stgit@gimli.home> References: <153384487209.15793.15203778129263981368.stgit@gimli.home> User-Agent: StGit/0.18-136-gffd7-dirty MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.43]); Thu, 09 Aug 2018 20:04:16 +0000 (UTC) Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP pcie_flr() suggests pcie_has_flr() to ensure that PCIe FLR support is present prior to calling. pcie_flr() is exported while pcie_has_flr() is not. Resolve this. Signed-off-by: Alex Williamson --- drivers/pci/pci.c | 3 ++- include/linux/pci.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 2bec76c9d9a7..52fe2d72a99c 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4071,7 +4071,7 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout) * Returns true if the device advertises support for PCIe function level * resets. */ -static bool pcie_has_flr(struct pci_dev *dev) +bool pcie_has_flr(struct pci_dev *dev) { u32 cap; @@ -4081,6 +4081,7 @@ static bool pcie_has_flr(struct pci_dev *dev) pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap); return cap & PCI_EXP_DEVCAP_FLR; } +EXPORT_SYMBOL_GPL(pcie_has_flr); /** * pcie_flr - initiate a PCIe function level reset diff --git a/include/linux/pci.h b/include/linux/pci.h index 04c7ea6ed67b..bbe030d7814f 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1092,6 +1092,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev, enum pci_bus_speed *speed, enum pcie_link_width *width); void pcie_print_link_status(struct pci_dev *dev); +bool pcie_has_flr(struct pci_dev *dev); int pcie_flr(struct pci_dev *dev); int __pci_reset_function_locked(struct pci_dev *dev); int pci_reset_function(struct pci_dev *dev); From patchwork Thu Aug 9 20:04:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 10561887 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5763E13AC for ; Thu, 9 Aug 2018 20:04:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 45DBC2B408 for ; Thu, 9 Aug 2018 20:04:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 397DA2B700; Thu, 9 Aug 2018 20:04:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C12C92B408 for ; Thu, 9 Aug 2018 20:04:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727117AbeHIWat (ORCPT ); Thu, 9 Aug 2018 18:30:49 -0400 Received: from mx1.redhat.com ([209.132.183.28]:33284 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726894AbeHIWat (ORCPT ); Thu, 9 Aug 2018 18:30:49 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 11AF0307D86C; Thu, 9 Aug 2018 20:04:26 +0000 (UTC) Received: from gimli.home (ovpn-116-35.phx2.redhat.com [10.3.116.35]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4241762929; Thu, 9 Aug 2018 20:04:22 +0000 (UTC) Subject: [PATCH v4 2/3] PCI: Samsung SM961/PM961 NVMe disable before FLR quirk From: Alex Williamson To: linux-pci@vger.kernel.org Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org Date: Thu, 09 Aug 2018 14:04:22 -0600 Message-ID: <153384506190.15793.5959667641864701895.stgit@gimli.home> In-Reply-To: <153384487209.15793.15203778129263981368.stgit@gimli.home> References: <153384487209.15793.15203778129263981368.stgit@gimli.home> User-Agent: StGit/0.18-136-gffd7-dirty MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.48]); Thu, 09 Aug 2018 20:04:26 +0000 (UTC) Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The Samsung SM961/PM961 (960 EVO) sometimes fails to return from FLR with the PCI config space reading back as -1. A reproducible instance of this behavior is resolved by clearing the enable bit in the NVMe configuration register and waiting for the ready status to clear (disabling the NVMe controller) prior to FLR. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1542494 Signed-off-by: Alex Williamson --- drivers/pci/quirks.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 83 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index e72c8742aafa..0a4d802cb307 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -28,6 +28,7 @@ #include #include #include +#include #include /* isa_dma_bridge_buggy */ #include "pci.h" @@ -3669,6 +3670,87 @@ static int reset_chelsio_generic_dev(struct pci_dev *dev, int probe) #define PCI_DEVICE_ID_INTEL_IVB_M_VGA 0x0156 #define PCI_DEVICE_ID_INTEL_IVB_M2_VGA 0x0166 +/* + * The Samsung SM961/PM961 controller can sometimes enter a fatal state after + * FLR where config space reads from the device return -1. We seem to be + * able to avoid this condition if we disable the NVMe controller prior to + * FLR. This quirk is generic for any NVMe class device requiring similar + * assistance to quiesce the device prior to FLR. + * + * NVMe specification: https://nvmexpress.org/resources/specifications/ + * Revision 1.0e: + * Chapter 2: Required and optional PCI config registers + * Chapter 3: NVMe control registers + * Chapter 7.3: Reset behavior + */ +static int nvme_disable_and_flr(struct pci_dev *dev, int probe) +{ + void __iomem *bar; + u16 cmd; + u32 cfg; + + if (dev->class != PCI_CLASS_STORAGE_EXPRESS || + !pcie_has_flr(dev) || !pci_resource_start(dev, 0)) + return -ENOTTY; + + if (probe) + return 0; + + bar = pci_iomap(dev, 0, NVME_REG_CC + sizeof(cfg)); + if (!bar) + return -ENOTTY; + + pci_read_config_word(dev, PCI_COMMAND, &cmd); + pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY); + + cfg = readl(bar + NVME_REG_CC); + + /* Disable controller if enabled */ + if (cfg & NVME_CC_ENABLE) { + u32 cap = readl(bar + NVME_REG_CAP); + unsigned long timeout; + + /* + * Per nvme_disable_ctrl() skip shutdown notification as it + * could complete commands to the admin queue. We only intend + * to quiesce the device before reset. + */ + cfg &= ~(NVME_CC_SHN_MASK | NVME_CC_ENABLE); + + writel(cfg, bar + NVME_REG_CC); + + /* + * Some controllers require an additional delay here, see + * NVME_QUIRK_DELAY_BEFORE_CHK_RDY. None of those are yet + * supported by this quirk. + */ + + /* Cap register provides max timeout in 500ms increments */ + timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies; + + for (;;) { + u32 status = readl(bar + NVME_REG_CSTS); + + /* Ready status becomes zero on disable complete */ + if (!(status & NVME_CSTS_RDY)) + break; + + msleep(100); + + if (time_after(jiffies, timeout)) { + pci_warn(dev, "Timeout waiting for NVMe ready status to clear after disable\n"); + break; + } + } + } + + pci_iounmap(dev, bar); + + pcie_flr(dev); + + return 0; +} + static const struct pci_dev_reset_methods pci_dev_reset_methods[] = { { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF, reset_intel_82599_sfp_virtfn }, @@ -3676,6 +3758,7 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = { reset_ivb_igd }, { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA, reset_ivb_igd }, + { PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr }, { PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID, reset_chelsio_generic_dev }, { 0 } From patchwork Thu Aug 9 20:04:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 10561889 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5FA8B13BB for ; Thu, 9 Aug 2018 20:04:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4CB7C2B700 for ; Thu, 9 Aug 2018 20:04:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4049A2B7D4; Thu, 9 Aug 2018 20:04:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E28502B700 for ; Thu, 9 Aug 2018 20:04:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727253AbeHIWa5 (ORCPT ); Thu, 9 Aug 2018 18:30:57 -0400 Received: from mx1.redhat.com ([209.132.183.28]:59972 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726894AbeHIWa4 (ORCPT ); Thu, 9 Aug 2018 18:30:56 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D3071308626E; Thu, 9 Aug 2018 20:04:33 +0000 (UTC) Received: from gimli.home (ovpn-116-35.phx2.redhat.com [10.3.116.35]) by smtp.corp.redhat.com (Postfix) with ESMTP id 724AE45BF; Thu, 9 Aug 2018 20:04:31 +0000 (UTC) Subject: [PATCH v4 3/3] PCI: Intel DC P3700 NVMe delay after FLR quirk From: Alex Williamson To: linux-pci@vger.kernel.org Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org Date: Thu, 09 Aug 2018 14:04:31 -0600 Message-ID: <153384507110.15793.12191081187690448438.stgit@gimli.home> In-Reply-To: <153384487209.15793.15203778129263981368.stgit@gimli.home> References: <153384487209.15793.15203778129263981368.stgit@gimli.home> User-Agent: StGit/0.18-136-gffd7-dirty MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Thu, 09 Aug 2018 20:04:33 +0000 (UTC) Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add a device specific reset for Intel DC P3700 NVMe device which exhibits a timeout failure in drivers waiting for the ready status to update after NVMe enable if the driver interacts with the device too quickly after FLR. As this has been observed in device assignment scenarios, resolve this with a device specific reset quirk to add an additional, heuristically determined, delay after the FLR completes. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1592654 Signed-off-by: Alex Williamson --- drivers/pci/quirks.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 0a4d802cb307..93791bd31a55 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3751,6 +3751,27 @@ static int nvme_disable_and_flr(struct pci_dev *dev, int probe) return 0; } +/* + * Intel DC P3700 NVMe controller will timeout waiting for ready status + * to change after NVMe enable if the driver starts interacting with the + * device too quickly after FLR. A 250ms delay after FLR has heuristically + * proven to produce reliably working results for device assignment cases. + */ +static int delay_250ms_after_flr(struct pci_dev *dev, int probe) +{ + if (!pcie_has_flr(dev)) + return -ENOTTY; + + if (probe) + return 0; + + pcie_flr(dev); + + msleep(250); + + return 0; +} + static const struct pci_dev_reset_methods pci_dev_reset_methods[] = { { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF, reset_intel_82599_sfp_virtfn }, @@ -3759,6 +3780,7 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = { { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA, reset_ivb_igd }, { PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr }, + { PCI_VENDOR_ID_INTEL, 0x0953, delay_250ms_after_flr }, { PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID, reset_chelsio_generic_dev }, { 0 }