From patchwork Thu Sep 1 19:43:08 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tony Luck
X-Patchwork-Id: 12963183
From: Tony Luck
To: linux-edac@vger.kernel.org
Cc: Qiuxu Zhuo, Youquan Song, Tony Luck, Aristeu Rozanski,
 Borislav Petkov, Mauro Carvalho Chehab,
 linux-kernel@vger.kernel.org, patches@lists.linux.dev
Subject: [PATCH 1/3] EDAC/skx_common: Use driver decoder first
Date: Thu, 1 Sep 2022 12:43:08 -0700
Message-Id: <20220901194310.115427-2-tony.luck@intel.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220901194310.115427-1-tony.luck@intel.com>
References: <20220901194310.115427-1-tony.luck@intel.com>
X-Mailing-List: linux-edac@vger.kernel.org

From: Qiuxu Zhuo

The driver decoder[1] performs better than the firmware decoder[2],
especially when correctable errors are frequent. So use the driver
decoder first and fall back to the firmware decoder if the driver
decoder is unavailable.

Also rename the function pointer skx_decode to driver_decode (a better
name to contrast with adxl_decode).

[1] Decode errors by extracting error information from registers of
    memory controllers and/or MCA bank registers.
[2] Decode errors by calling ACPI DSM methods.

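The ordering described above can be sketched in user-space C. This is a
minimal illustration, not the kernel code: struct decoded_addr_sketch,
decode_fn, decode_with_fallback() and the two toy decoders are hypothetical
names, and setting decoded_by_adxl in the wrapper is a simplification (the
patch sets it inside skx_adxl_decode()).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct decoded_addr. */
struct decoded_addr_sketch {
	unsigned long long addr;
	bool decoded_by_adxl;
};

typedef bool (*decode_fn)(struct decoded_addr_sketch *res);

/*
 * Try the driver decoder first. Fall back to the firmware decoder only
 * when the driver decoder is absent or declines the error. Returns false
 * when neither decoder handled the address.
 */
bool decode_with_fallback(decode_fn driver, decode_fn firmware,
			  struct decoded_addr_sketch *res)
{
	if (driver && driver(res))
		return true;

	if (firmware && firmware(res)) {
		/* Simplification: the patch sets this in skx_adxl_decode(). */
		res->decoded_by_adxl = true;
		return true;
	}

	return false;
}

/* Toy decoders, used only to exercise the ordering. */
bool decoder_accepts(struct decoded_addr_sketch *res)
{
	(void)res;
	return true;
}

bool decoder_declines(struct decoded_addr_sketch *res)
{
	(void)res;
	return false;
}
```

With a driver decoder that accepts the error, the firmware path is never
taken and decoded_by_adxl stays false, which is what lets the output code
pick the matching message format.
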
Co-developed-by: Youquan Song
Signed-off-by: Youquan Song
Signed-off-by: Qiuxu Zhuo
Signed-off-by: Tony Luck
---
 drivers/edac/skx_common.h |  1 +
 drivers/edac/skx_base.c   |  9 +++++++--
 drivers/edac/skx_common.c | 16 +++++++++-------
 3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/edac/skx_common.h b/drivers/edac/skx_common.h
index 03ac067a80b9..880ecd15ca42 100644
--- a/drivers/edac/skx_common.h
+++ b/drivers/edac/skx_common.h
@@ -136,6 +136,7 @@ struct decoded_addr {
 	int column;
 	int bank_address;
 	int bank_group;
+	bool decoded_by_adxl;
 };
 
 struct res_config {
diff --git a/drivers/edac/skx_base.c b/drivers/edac/skx_base.c
index 1abc020d49ab..7e2762f62eec 100644
--- a/drivers/edac/skx_base.c
+++ b/drivers/edac/skx_base.c
@@ -714,8 +714,13 @@ static int __init skx_init(void)
 
 	skx_set_decode(skx_decode, skx_show_retry_rd_err_log);
 
-	if (nvdimm_count && skx_adxl_get() == -ENODEV)
-		skx_printk(KERN_NOTICE, "Only decoding DDR4 address!\n");
+	if (nvdimm_count && skx_adxl_get() != -ENODEV) {
+		skx_set_decode(NULL, skx_show_retry_rd_err_log);
+	} else {
+		if (nvdimm_count)
+			skx_printk(KERN_NOTICE, "Only decoding DDR4 address!\n");
+		skx_set_decode(skx_decode, skx_show_retry_rd_err_log);
+	}
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
diff --git a/drivers/edac/skx_common.c b/drivers/edac/skx_common.c
index 19c17c5198c5..9b10c359849b 100644
--- a/drivers/edac/skx_common.c
+++ b/drivers/edac/skx_common.c
@@ -40,7 +40,7 @@ static char *adxl_msg;
 static unsigned long adxl_nm_bitmap;
 
 static char skx_msg[MSG_SIZE];
-static skx_decode_f skx_decode;
+static skx_decode_f driver_decode;
 static skx_show_retry_log_f skx_show_retry_rd_err_log;
 static u64 skx_tolm, skx_tohm;
 static LIST_HEAD(dev_edac_list);
@@ -173,6 +173,8 @@ static bool skx_adxl_decode(struct decoded_addr *res, bool error_in_1st_level_me
 		break;
 	}
 
+	res->decoded_by_adxl = true;
+
 	return true;
 }
 
@@ -183,7 +185,7 @@ void skx_set_mem_cfg(bool mem_cfg_2lm)
 
 void skx_set_decode(skx_decode_f decode, skx_show_retry_log_f show_retry_log)
 {
-	skx_decode = decode;
+	driver_decode = decode;
 	skx_show_retry_rd_err_log = show_retry_log;
 }
 
@@ -591,7 +593,7 @@ static void skx_mce_output_error(struct mem_ctl_info *mci,
 			break;
 		}
 	}
-	if (adxl_component_count) {
+	if (res->decoded_by_adxl) {
 		len = snprintf(skx_msg, MSG_SIZE, "%s%s err_code:0x%04x:0x%04x %s",
 			 overflow ? " OVERFLOW" : "",
 			 (uncorrected_error && recoverable) ? " recoverable" : "",
@@ -651,11 +653,11 @@ int skx_mce_check_error(struct notifier_block *nb, unsigned long val,
 	memset(&res, 0, sizeof(res));
 	res.addr = mce->addr;
 
-	if (adxl_component_count) {
-		if (!skx_adxl_decode(&res, skx_error_in_1st_level_mem(mce)))
+	/* Try driver decoder first */
+	if (!(driver_decode && driver_decode(&res))) {
+		/* Then try firmware decoder (ACPI DSM methods) */
+		if (!(adxl_component_count && skx_adxl_decode(&res, skx_error_in_1st_level_mem(mce))))
 			return NOTIFY_DONE;
-	} else if (!skx_decode || !skx_decode(&res)) {
-		return NOTIFY_DONE;
 	}
 
 	mci = res.dev->imc[res.imc].mci;

From patchwork Thu Sep 1 19:43:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tony Luck
X-Patchwork-Id: 12963184
From: Tony Luck
To: linux-edac@vger.kernel.org
Cc: Qiuxu Zhuo, Tony Luck, Aristeu Rozanski, Borislav Petkov,
 Mauro Carvalho Chehab, Youquan Song,
 linux-kernel@vger.kernel.org, patches@lists.linux.dev
Subject: [PATCH 2/3] EDAC/skx_common: Make output format similar
Date: Thu, 1 Sep 2022 12:43:09 -0700
Message-Id: <20220901194310.115427-3-tony.luck@intel.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220901194310.115427-1-tony.luck@intel.com>
References: <20220901194310.115427-1-tony.luck@intel.com>
X-Mailing-List: linux-edac@vger.kernel.org

From: Qiuxu Zhuo

The decoded output format of the driver decoder differs from the output
format of the firmware decoder.
Make the output format similar regardless of the decode function (align
the driver decoder's format with the firmware decoder's).

Signed-off-by: Qiuxu Zhuo
Signed-off-by: Tony Luck
---
 drivers/edac/skx_common.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/edac/skx_common.c b/drivers/edac/skx_common.c
index 9b10c359849b..16ca3de57c24 100644
--- a/drivers/edac/skx_common.c
+++ b/drivers/edac/skx_common.c
@@ -600,12 +600,12 @@ static void skx_mce_output_error(struct mem_ctl_info *mci,
 			 mscod, errcode, adxl_msg);
 	} else {
 		len = snprintf(skx_msg, MSG_SIZE,
-			 "%s%s err_code:0x%04x:0x%04x socket:%d imc:%d rank:%d bg:%d ba:%d row:0x%x col:0x%x",
+			 "%s%s err_code:0x%04x:0x%04x ProcessorSocketId:0x%x MemoryControllerId:0x%x PhysicalRankId:0x%x Row:0x%x Column:0x%x Bank:0x%x BankGroup:0x%x",
 			 overflow ? " OVERFLOW" : "",
 			 (uncorrected_error && recoverable) ? " recoverable" : "",
 			 mscod, errcode,
 			 res->socket, res->imc, res->rank,
-			 res->bank_group, res->bank_address, res->row, res->column);
+			 res->row, res->column, res->bank_address, res->bank_group);
 	}
 
 	if (skx_show_retry_rd_err_log)

From patchwork Thu Sep 1 19:43:10 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tony Luck
X-Patchwork-Id: 12963185
From: Tony Luck
To: linux-edac@vger.kernel.org
Cc: Youquan Song, Qiuxu Zhuo, Tony Luck, Aristeu Rozanski,
 Borislav Petkov, Mauro Carvalho Chehab,
 linux-kernel@vger.kernel.org, patches@lists.linux.dev
Subject: [PATCH 3/3] EDAC/i10nm: Add driver decoder for Ice Lake and Tremont CPUs
Date: Thu, 1 Sep 2022 12:43:10 -0700
Message-Id: <20220901194310.115427-4-tony.luck@intel.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220901194310.115427-1-tony.luck@intel.com>
References: <20220901194310.115427-1-tony.luck@intel.com>
X-Mailing-List: linux-edac@vger.kernel.org

From: Youquan Song

Currently i10nm_edac only supports the firmware decoder (ACPI DSM methods).
MCA bank registers of Ice Lake or Tremont CPUs contain the information
needed to decode DDR memory errors. To get better decoding performance,
add the driver decoder (decoding DDR memory errors by extracting error
information from MCA bank registers) for Ice Lake and Tremont CPUs.

Co-developed-by: Qiuxu Zhuo
Signed-off-by: Qiuxu Zhuo
Signed-off-by: Youquan Song
Signed-off-by: Tony Luck
---
 arch/x86/include/asm/mce.h |   1 +
 drivers/edac/skx_common.h  |   5 ++
 drivers/edac/i10nm_base.c  | 134 ++++++++++++++++++++++++++++++++++++-
 drivers/edac/skx_common.c  |   1 +
 4 files changed, 139 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index cc73061e7255..6e986088817d 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -42,6 +42,7 @@
 #define MCI_STATUS_CEC_SHIFT 38 /* Corrected Error Count */
 #define MCI_STATUS_CEC_MASK GENMASK_ULL(52,38)
 #define MCI_STATUS_CEC(c) (((c) & MCI_STATUS_CEC_MASK) >> MCI_STATUS_CEC_SHIFT)
+#define MCI_STATUS_MSCOD(m) (((m) >> 16) & 0xffff)
 
 /* AMD-specific bits */
 #define MCI_STATUS_TCC BIT_ULL(55) /* Task context corrupt */
diff --git a/drivers/edac/skx_common.h b/drivers/edac/skx_common.h
index 880ecd15ca42..c542f1562825 100644
--- a/drivers/edac/skx_common.h
+++ b/drivers/edac/skx_common.h
@@ -10,6 +10,7 @@
 #define _SKX_COMM_EDAC_H
 
 #include
+#include
 
 #define MSG_SIZE 1024
 
@@ -52,6 +53,9 @@
 #define IS_DIMM_PRESENT(r) GET_BITFIELD(r, 15, 15)
 #define IS_NVDIMM_PRESENT(r, i) GET_BITFIELD(r, i, i)
 
+#define MCI_MISC_ECC_MODE(m) (((m) >> 59) & 15)
+#define MCI_MISC_ECC_DDRT 8 /* read from DDRT */
+
 /*
  * Each cpu socket contains some pci devices that provide global
  * information, and also some that are local to each of the two
@@ -120,6 +124,7 @@ enum {
 #define BIT_NM_DIMM BIT_ULL(INDEX_NM_DIMM)
 
 struct decoded_addr {
+	struct mce *mce;
 	struct skx_dev *dev;
 	u64 addr;
 	int socket;
diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c
index 6cf50ee0b77c..817f618fcff0 100644
--- a/drivers/edac/i10nm_base.c
+++ b/drivers/edac/i10nm_base.c
@@ -74,6 +74,8 @@ static struct list_head *i10nm_edac_list;
 
 static struct res_config *res_cfg;
 static int retry_rd_err_log;
+static int decoding_via_mca;
+static bool mem_cfg_2lm;
 
 static u32 offsets_scrub_icx[] = {0x22c60, 0x22c54, 0x22c5c, 0x22c58, 0x22c28, 0x20ed8};
 static u32 offsets_scrub_spr[] = {0x22c60, 0x22c54, 0x22f08, 0x22c58, 0x22c28, 0x20ed8};
@@ -231,6 +233,103 @@ static bool i10nm_check_2lm(struct res_config *cfg)
 	return false;
 }
 
+/*
+ * Check whether the error comes from DDRT by ICX/Tremont model specific error code.
+ * Refer to SDM vol3B 16.11.3 Intel IMC MC error codes for IA32_MCi_STATUS.
+ */
+static bool i10nm_mscod_is_ddrt(u32 mscod)
+{
+	switch (mscod) {
+	case 0x0106: case 0x0107:
+	case 0x0800: case 0x0804:
+	case 0x0806 ... 0x0808:
+	case 0x080a ... 0x080e:
+	case 0x0810: case 0x0811:
+	case 0x0816: case 0x081e:
+	case 0x081f:
+		return true;
+	}
+
+	return false;
+}
+
+static bool i10nm_mc_decode_available(struct mce *mce)
+{
+	u8 bank;
+
+	if (!decoding_via_mca || mem_cfg_2lm)
+		return false;
+
+	if ((mce->status & (MCI_STATUS_MISCV | MCI_STATUS_ADDRV))
+			!= (MCI_STATUS_MISCV | MCI_STATUS_ADDRV))
+		return false;
+
+	bank = mce->bank;
+
+	switch (res_cfg->type) {
+	case I10NM:
+		if (bank < 13 || bank > 26)
+			return false;
+
+		/* DDRT errors can't be decoded from MCA bank registers */
+		if (MCI_MISC_ECC_MODE(mce->misc) == MCI_MISC_ECC_DDRT)
+			return false;
+
+		if (i10nm_mscod_is_ddrt(MCI_STATUS_MSCOD(mce->status)))
+			return false;
+
+		/* Check whether one of {13,14,17,18,21,22,25,26} */
+		return ((bank - 13) & BIT(1)) == 0;
+	default:
+		return false;
+	}
+}
+
+static bool i10nm_mc_decode(struct decoded_addr *res)
+{
+	struct mce *m = res->mce;
+	struct skx_dev *d;
+	u8 bank;
+
+	if (!i10nm_mc_decode_available(m))
+		return false;
+
+	list_for_each_entry(d, i10nm_edac_list, list) {
+		if (d->imc[0].src_id == m->socketid) {
+			res->socket = m->socketid;
+			res->dev = d;
+			break;
+		}
+	}
+
+	switch (res_cfg->type) {
+	case I10NM:
+		bank = m->bank - 13;
+		res->imc = bank / 4;
+		res->channel = bank % 2;
+		break;
+	default:
+		return false;
+	}
+
+	if (!res->dev) {
+		skx_printk(KERN_ERR, "No device for src_id %d imc %d\n",
+			   m->socketid, res->imc);
+		return false;
+	}
+
+	res->column = GET_BITFIELD(m->misc, 9, 18) << 2;
+	res->row = GET_BITFIELD(m->misc, 19, 39);
+	res->bank_group = GET_BITFIELD(m->misc, 40, 41);
+	res->bank_address = GET_BITFIELD(m->misc, 42, 43);
+	res->bank_group |= GET_BITFIELD(m->misc, 44, 44) << 2;
+	res->rank = GET_BITFIELD(m->misc, 56, 58);
+	res->dimm = res->rank >> 2;
+	res->rank = res->rank % 4;
+
+	return true;
+}
+
 static int i10nm_get_ddr_munits(void)
 {
 	struct pci_dev *mdev;
@@ -574,7 +673,8 @@ static int __init i10nm_init(void)
 		return -ENODEV;
 	}
 
-	skx_set_mem_cfg(i10nm_check_2lm(cfg));
+	mem_cfg_2lm = i10nm_check_2lm(cfg);
+	skx_set_mem_cfg(mem_cfg_2lm);
 
 	rc = i10nm_get_ddr_munits();
 
@@ -626,9 +726,11 @@ static int __init i10nm_init(void)
 	setup_i10nm_debug();
 
 	if (retry_rd_err_log && res_cfg->offsets_scrub && res_cfg->offsets_demand) {
-		skx_set_decode(NULL, show_retry_rd_err_log);
+		skx_set_decode(i10nm_mc_decode, show_retry_rd_err_log);
 		if (retry_rd_err_log == 2)
 			enable_retry_rd_err_log(true);
+	} else {
+		skx_set_decode(i10nm_mc_decode, NULL);
 	}
 
 	i10nm_printk(KERN_INFO, "%s\n", I10NM_REVISION);
@@ -658,6 +760,34 @@ static void __exit i10nm_exit(void)
 module_init(i10nm_init);
 module_exit(i10nm_exit);
 
+static int set_decoding_via_mca(const char *buf, const struct kernel_param *kp)
+{
+	unsigned long val;
+	int ret;
+
+	ret = kstrtoul(buf, 0, &val);
+
+	if (ret || val > 1)
+		return -EINVAL;
+
+	if (val && mem_cfg_2lm) {
+		i10nm_printk(KERN_NOTICE, "Decoding errors via MCA banks for 2LM isn't supported yet\n");
+		return -EIO;
+	}
+
+	ret = param_set_int(buf, kp);
+
+	return ret;
+}
+
+static const struct kernel_param_ops decoding_via_mca_param_ops = {
+	.set = set_decoding_via_mca,
+	.get = param_get_int,
+};
+
+module_param_cb(decoding_via_mca, &decoding_via_mca_param_ops, &decoding_via_mca, 0644);
+MODULE_PARM_DESC(decoding_via_mca, "decoding_via_mca: 0=off(default), 1=enable");
+
 module_param(retry_rd_err_log, int, 0444);
 MODULE_PARM_DESC(retry_rd_err_log, "retry_rd_err_log: 0=off(default), 1=bios(Linux doesn't reset any control bits, but just reports values.), 2=linux(Linux tries to take control and resets mode bits, clear valid/UC bits after reading.)");
 
diff --git a/drivers/edac/skx_common.c b/drivers/edac/skx_common.c
index 16ca3de57c24..7276ce3a33e1 100644
--- a/drivers/edac/skx_common.c
+++ b/drivers/edac/skx_common.c
@@ -651,6 +651,7 @@ int skx_mce_check_error(struct notifier_block *nb, unsigned long val,
 		return NOTIFY_DONE;
 
 	memset(&res, 0, sizeof(res));
+	res.mce = mce;
 	res.addr = mce->addr;
 
 	/* Try driver decoder first */
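
For readers who want to check the bank filter and the MISC field layout
used by i10nm_mc_decode() outside the kernel, here is a minimal user-space
sketch. It is an illustration under stated assumptions, not the driver:
struct ddr_location, get_bitfield(), bank_is_decodable() and decode_misc()
are hypothetical names, get_bitfield() is a local stand-in for the kernel's
GET_BITFIELD(), and the decoding_via_mca, 2LM, MISCV/ADDRV and DDRT checks
from the patch are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type; the driver fills struct decoded_addr instead. */
struct ddr_location {
	int imc, channel, dimm, rank;
	int row, column, bank_address, bank_group;
};

/* Extract bits lo..hi (inclusive) of v; assumes hi - lo < 63. */
uint64_t get_bitfield(uint64_t v, unsigned int lo, unsigned int hi)
{
	return (v >> lo) & ((1ULL << (hi - lo + 1)) - 1);
}

/* True for banks {13,14,17,18,21,22,25,26}, the set named in the patch. */
bool bank_is_decodable(unsigned int bank)
{
	if (bank < 13 || bank > 26)
		return false;

	return ((bank - 13) & 2) == 0;
}

/* Apply the same bit ranges the patch reads from the MISC register. */
bool decode_misc(unsigned int bank, uint64_t misc, struct ddr_location *loc)
{
	unsigned int rank;

	if (!bank_is_decodable(bank))
		return false;

	bank -= 13;
	loc->imc = (int)(bank / 4);
	loc->channel = (int)(bank % 2);

	loc->column = (int)(get_bitfield(misc, 9, 18) << 2);
	loc->row = (int)get_bitfield(misc, 19, 39);
	loc->bank_address = (int)get_bitfield(misc, 42, 43);
	/* Bank group: low two bits at 40..41, third bit at 44. */
	loc->bank_group = (int)(get_bitfield(misc, 40, 41) |
				(get_bitfield(misc, 44, 44) << 2));

	/* Bits 56..58 hold DIMM (top bit) and rank within the DIMM. */
	rank = (unsigned int)get_bitfield(misc, 56, 58);
	loc->dimm = (int)(rank >> 2);
	loc->rank = (int)(rank % 4);

	return true;
}
```

For example, bank 18 gives 18 - 13 = 5, so imc = 5 / 4 = 1 and
channel = 5 % 2 = 1, while bank 16 is rejected because (16 - 13) & 2 is
non-zero.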