From patchwork Fri Jul 26 01:43:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiaofei Tan X-Patchwork-Id: 11060117 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5A47213AC for ; Fri, 26 Jul 2019 01:45:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2FAE828A33 for ; Fri, 26 Jul 2019 01:45:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1D44B28A6A; Fri, 26 Jul 2019 01:45:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0DD6528A33 for ; Fri, 26 Jul 2019 01:45:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725901AbfGZBpl (ORCPT ); Thu, 25 Jul 2019 21:45:41 -0400 Received: from szxga07-in.huawei.com ([45.249.212.35]:41058 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725852AbfGZBpl (ORCPT ); Thu, 25 Jul 2019 21:45:41 -0400 Received: from DGGEMS412-HUB.china.huawei.com (unknown [172.30.72.58]) by Forcepoint Email with ESMTP id 8EF18370B50BB262F715; Fri, 26 Jul 2019 09:45:38 +0800 (CST) Received: from localhost.localdomain (10.67.212.75) by DGGEMS412-HUB.china.huawei.com (10.3.19.212) with Microsoft SMTP Server id 14.3.439.0; Fri, 26 Jul 2019 09:45:31 +0800 From: Xiaofei Tan To: CC: Xiaofei Tan , , , , , , , , , , Subject: [PATCH v2 1/1] efi: cper: print AER info of PCIe fatal error Date: Fri, 26 Jul 2019 09:43:37 +0800 Message-ID: <1564105417-232048-1-git-send-email-tanxiaofei@huawei.com> X-Mailer: git-send-email 2.8.1 MIME-Version: 1.0 X-Originating-IP: [10.67.212.75] X-CFilter-Loop: Reflected Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP AER info of PCIe fatal error is not printed in the current driver. Because APEI driver will panic directly for fatal error, and can't run to the place of printing AER info. An example log is as following: {763}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 11 {763}[Hardware Error]: event severity: fatal {763}[Hardware Error]: Error 0, type: fatal {763}[Hardware Error]: section_type: PCIe error {763}[Hardware Error]: port_type: 0, PCIe end point {763}[Hardware Error]: version: 4.0 {763}[Hardware Error]: command: 0x0000, status: 0x0010 {763}[Hardware Error]: device_id: 0000:82:00.0 {763}[Hardware Error]: slot: 0 {763}[Hardware Error]: secondary_bus: 0x00 {763}[Hardware Error]: vendor_id: 0x8086, device_id: 0x10fb {763}[Hardware Error]: class_code: 000002 Kernel panic - not syncing: Fatal hardware error! This issue was imported by the patch, '37448adfc7ce ("aerdrv: Move cper_print_aer() call out of interrupt context")'. To fix this issue, this patch adds print of AER info in cper_print_pcie() for fatal error. Here is the example log after this patch applied: {24}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 10 {24}[Hardware Error]: event severity: fatal {24}[Hardware Error]: Error 0, type: fatal {24}[Hardware Error]: section_type: PCIe error {24}[Hardware Error]: port_type: 0, PCIe end point {24}[Hardware Error]: version: 4.0 {24}[Hardware Error]: command: 0x0546, status: 0x4010 {24}[Hardware Error]: device_id: 0000:01:00.0 {24}[Hardware Error]: slot: 0 {24}[Hardware Error]: secondary_bus: 0x00 {24}[Hardware Error]: vendor_id: 0x15b3, device_id: 0x1019 {24}[Hardware Error]: class_code: 000002 {24}[Hardware Error]: aer_uncor_status: 0x00040000, aer_uncor_mask: 0x00000000 {24}[Hardware Error]: aer_uncor_severity: 0x00062010 {24}[Hardware Error]: TLP Header: 000000c0 01010000 00000001 00000000 Kernel panic - not syncing: Fatal hardware error! Fixes: 37448adfc7ce ("aerdrv: Move cper_print_aer() call out of interrupt context") Signed-off-by: Xiaofei Tan Reviewed-by: James Morse --- drivers/firmware/efi/cper.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c index 8fa977c..78b8922 100644 --- a/drivers/firmware/efi/cper.c +++ b/drivers/firmware/efi/cper.c @@ -390,6 +390,21 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie, printk( "%s""bridge: secondary_status: 0x%04x, control: 0x%04x\n", pfx, pcie->bridge.secondary_status, pcie->bridge.control); + + /* Fatal errors call __ghes_panic() before AER handler prints this */ + if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO && + gdata->error_severity & CPER_SEV_FATAL) { + struct aer_capability_regs *aer; + + aer = (struct aer_capability_regs *)pcie->aer_info; + printk("%saer_uncor_status: 0x%08x, aer_uncor_mask: 0x%08x\n", + pfx, aer->uncor_status, aer->uncor_mask); + printk("%saer_uncor_severity: 0x%08x\n", + pfx, aer->uncor_severity); + printk("%sTLP Header: %08x %08x %08x %08x\n", pfx, + aer->header_log.dw0, aer->header_log.dw1, + aer->header_log.dw2, aer->header_log.dw3); + } } static void cper_print_tstamp(const char *pfx,