From patchwork Fri Feb 28 00:59:43 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kuppuswamy Sathyanarayanan
X-Patchwork-Id: 11411333
From: sathyanarayanan.kuppuswamy@linux.intel.com
To: bhelgaas@google.com
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
 ashok.raj@intel.com, Kuppuswamy Sathyanarayanan, Keith Busch
Subject: [PATCH v16 1/9] PCI/ERR: Update error status after reset_link()
Date: Thu, 27 Feb 2020 16:59:43 -0800
Message-Id: <15e702a33cc27314f9d43a06ccb408086a229cef.1582850766.git.sathyanarayanan.kuppuswamy@linux.intel.com>
X-Mailer: git-send-email 2.21.0
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-pci@vger.kernel.org

From: Kuppuswamy Sathyanarayanan

Commit
bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") uses reset_link()
to recover from fatal errors. During fatal error recovery, if the
initial error status is PCI_ERS_RESULT_DISCONNECT or
PCI_ERS_RESULT_NO_AER_DRIVER, pcie_do_recovery() reports the recovery
as failed even after reset_link() succeeds, because the return value of
reset_link() is only compared and never stored. So update the error
status after reset_link().

You can reproduce this issue by triggering a SW DPC using the "DPC
Software Trigger" bit in the "DPC Control Register". You should see a
"device recovery failed" message in dmesg, as below:

[  164.659982] pcieport 0000:00:16.0: DPC: containment event, status:0x1f27 source:0x0000
[  164.659989] pcieport 0000:00:16.0: DPC: software trigger detected
[  164.659994] pci 0000:04:00.0: AER: can't recover (no error_detected callback)
[  164.794300] pcieport 0000:00:16.0: AER: device recovery failed

Fixes: bdb5ac85777d ("PCI/ERR: Handle fatal error recovery")
Cc: Ashok Raj
Cc: Keith Busch
Signed-off-by: Kuppuswamy Sathyanarayanan
Acked-by: Keith Busch
---
 drivers/pci/pcie/err.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 01dfc8bb7ca0..eefefe03857a 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -208,9 +208,11 @@ void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
 	else
 		pci_walk_bus(bus, report_normal_detected, &status);
 
-	if (state == pci_channel_io_frozen &&
-	    reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
-		goto failed;
+	if (state == pci_channel_io_frozen) {
+		status = reset_link(dev, service);
+		if (status != PCI_ERS_RESULT_RECOVERED)
+			goto failed;
+	}
 
 	if (status == PCI_ERS_RESULT_CAN_RECOVER) {
 		status = PCI_ERS_RESULT_RECOVERED;
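
The status flow described in the commit message can be illustrated with a
small userspace model. This is only a sketch, not kernel code: the enum, the
model_*() helpers and the recover_before()/recover_after() functions are
hypothetical names invented for illustration, and the tail of the recovery
path is assumed to treat anything other than "recovered" as a failure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's pci_ers_result_t values. */
enum model_result {
	MODEL_CAN_RECOVER,
	MODEL_DISCONNECT,
	MODEL_RECOVERED,
	MODEL_NO_AER_DRIVER,
};

/* Models a reset_link() call; link_ok says whether the reset worked. */
static enum model_result model_reset_link(bool link_ok)
{
	return link_ok ? MODEL_RECOVERED : MODEL_DISCONNECT;
}

/* Assumed tail of the recovery path: only RECOVERED counts as success. */
static bool model_finish(enum model_result status)
{
	if (status == MODEL_CAN_RECOVER)
		status = MODEL_RECOVERED;
	return status == MODEL_RECOVERED;
}

/* Before the patch: a successful reset leaves the stale status in place. */
static bool recover_before(enum model_result status, bool frozen, bool link_ok)
{
	if (frozen && model_reset_link(link_ok) != MODEL_RECOVERED)
		return false;
	return model_finish(status);
}

/* After the patch: the reset result replaces the stale status. */
static bool recover_after(enum model_result status, bool frozen, bool link_ok)
{
	if (frozen) {
		status = model_reset_link(link_ok);
		if (status != MODEL_RECOVERED)
			return false;
	}
	return model_finish(status);
}

/* Contrasts the two flows for a frozen channel with a working link reset. */
static void check_model(void)
{
	/* Stale NO_AER_DRIVER status makes the old flow report failure. */
	assert(!recover_before(MODEL_NO_AER_DRIVER, true, true));
	assert(!recover_before(MODEL_DISCONNECT, true, true));

	/* The new flow reports success once the reset has succeeded. */
	assert(recover_after(MODEL_NO_AER_DRIVER, true, true));
	assert(recover_after(MODEL_DISCONNECT, true, true));

	/* A failed reset is still a failure in both flows. */
	assert(!recover_before(MODEL_NO_AER_DRIVER, true, false));
	assert(!recover_after(MODEL_NO_AER_DRIVER, true, false));
}
```

Calling check_model() from a test program exercises both flows. The only
behavioural difference in this model is the frozen-channel case where the
initial status is not already recoverable, which matches the scenario in the
dmesg log above.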