From patchwork Thu Apr 17 15:07:18 2025
X-Patchwork-Submitter: "Zhuo, Qiuxu"
X-Patchwork-Id: 14055765
From: Qiuxu Zhuo
To: Tony Luck
Cc: Qiuxu Zhuo, Feng Xu, Borislav Petkov, James Morse,
 Mauro Carvalho Chehab, Robert Richter, Yi Lai, Shawn Fan,
 linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/7] EDAC/skx_common: Fix general protection fault
Date: Thu, 17 Apr 2025 23:07:18 +0800
Message-ID: <20250417150724.1170168-2-qiuxu.zhuo@intel.com>
In-Reply-To: <20250417150724.1170168-1-qiuxu.zhuo@intel.com>
References: <20250417150724.1170168-1-qiuxu.zhuo@intel.com>

After loading i10nm_edac (which automatically loads skx_edac_common),
if only i10nm_edac is unloaded and then reloaded, a general protection
fault may occur during error injection testing:

  mce: [Hardware Error]: Machine check events logged
  Oops: general protection fault ...
  ...
  Workqueue: events mce_gen_pool_process
  RIP: 0010:string+0x53/0xe0
  ...
  Call Trace:
  ? die_addr+0x37/0x90
  ? exc_general_protection+0x1e7/0x3f0
  ? asm_exc_general_protection+0x26/0x30
  ? string+0x53/0xe0
  vsnprintf+0x23e/0x4c0
  snprintf+0x4d/0x70
  skx_adxl_decode+0x16a/0x330 [skx_edac_common]
  skx_mce_check_error.part.0+0xf8/0x220 [skx_edac_common]
  skx_mce_check_error+0x17/0x20 [skx_edac_common]
  ...

The issue arose because the variable 'adxl_component_count' (inside
skx_edac_common), which counts the ADXL components, was not reset.
During the reloading of i10nm_edac, the count was incremented by the
actual number of ADXL components again, resulting in a count that was
double the real number of ADXL components. This led to an out-of-bounds
reference to the ADXL component array, causing the general protection
fault above.

Fix this issue by resetting 'adxl_component_count' in skx_adxl_put(),
which is called during the unloading of {skx,i10nm}_edac.

Fixes: 123b15863550 ("EDAC, i10nm: make skx_common.o a separate module")
Reported-by: Feng Xu
Tested-by: Feng Xu
Signed-off-by: Qiuxu Zhuo
---
 drivers/edac/skx_common.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/edac/skx_common.c b/drivers/edac/skx_common.c
index fa5b442b1844..c9ade45c1a99 100644
--- a/drivers/edac/skx_common.c
+++ b/drivers/edac/skx_common.c
@@ -116,6 +116,7 @@ EXPORT_SYMBOL_GPL(skx_adxl_get);
 
 void skx_adxl_put(void)
 {
+	adxl_component_count = 0;
 	kfree(adxl_values);
 	kfree(adxl_msg);
 }

From patchwork Thu Apr 17 15:07:19 2025
X-Patchwork-Submitter: "Zhuo, Qiuxu"
X-Patchwork-Id: 14055766
From: Qiuxu Zhuo
To: Tony Luck
Cc: Qiuxu Zhuo, Feng Xu, Borislav Petkov, James Morse,
 Mauro Carvalho Chehab, Robert Richter, Yi Lai, Shawn Fan,
 linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 2/7] EDAC/{skx_common,i10nm}: Fix the loss of saved RRL for
 HBM pseudo channel 0
Date: Thu, 17 Apr 2025 23:07:19 +0800
Message-ID: <20250417150724.1170168-3-qiuxu.zhuo@intel.com>
In-Reply-To: <20250417150724.1170168-1-qiuxu.zhuo@intel.com>
References: <20250417150724.1170168-1-qiuxu.zhuo@intel.com>

When enabling the retry_rd_err_log (RRL) feature during the loading of
the i10nm_edac driver with the module parameter retry_rd_err_log=2
(Linux RRL control mode), the default values of the control bits of RRL
are saved so that they can be restored during the unloading of the
driver.

In the current code, the RRL of pseudo channel 1 of HBM overwrites
pseudo channel 0 during the loading of the driver, resulting in the loss
of saved RRL for pseudo channel 0. This causes the RRL of pseudo channel
0 of HBM to be wrongly restored with the values from pseudo channel 1
when unloading the driver.

Fix this issue by creating two separate groups of RRL control registers
per channel to save default RRL settings of two {sub-,pseudo-}channels.

Fixes: acd4cf68fefe ("EDAC/i10nm: Retrieve and print retry_rd_err_log registers for HBM")
Tested-by: Feng Xu
Signed-off-by: Qiuxu Zhuo
---
 drivers/edac/i10nm_base.c | 35 +++++++++++++++++++----------------
 drivers/edac/skx_common.h | 11 ++++++++---
 2 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c
index 355a977019e9..355b527d839e 100644
--- a/drivers/edac/i10nm_base.c
+++ b/drivers/edac/i10nm_base.c
@@ -95,7 +95,7 @@ static u32 offsets_demand2_spr[] = {0x22c70, 0x22d80, 0x22f18, 0x22d58, 0x22c64,
 static u32 offsets_demand_spr_hbm0[] = {0x2a54, 0x2a60, 0x2b10, 0x2a58, 0x2a5c, 0x0ee0};
 static u32 offsets_demand_spr_hbm1[] = {0x2e54, 0x2e60, 0x2f10, 0x2e58, 0x2e5c, 0x0fb0};
 
-static void __enable_retry_rd_err_log(struct skx_imc *imc, int chan, bool enable,
+static void __enable_retry_rd_err_log(struct skx_imc *imc, int chan, bool enable, u32 *rrl_ctl,
 				      u32 *offsets_scrub, u32 *offsets_demand,
 				      u32 *offsets_demand2)
 {
@@ -108,10 +108,10 @@ static void __enable_retry_rd_err_log(struct skx_imc *imc, int chan, bool enable
 
 	if (enable) {
 		/* Save default configurations */
-		imc->chan[chan].retry_rd_err_log_s = s;
-		imc->chan[chan].retry_rd_err_log_d = d;
+		rrl_ctl[0] = s;
+		rrl_ctl[1] = d;
 		if (offsets_demand2)
-			imc->chan[chan].retry_rd_err_log_d2 = d2;
+			rrl_ctl[2] = d2;
 
 		s &= ~RETRY_RD_ERR_LOG_NOOVER_UC;
 		s |= RETRY_RD_ERR_LOG_EN;
@@ -125,25 +125,25 @@ static void __enable_retry_rd_err_log(struct skx_imc *imc, int chan, bool enable
 		}
 	} else {
 		/* Restore default configurations */
-		if (imc->chan[chan].retry_rd_err_log_s & RETRY_RD_ERR_LOG_UC)
+		if (rrl_ctl[0] & RETRY_RD_ERR_LOG_UC)
 			s |= RETRY_RD_ERR_LOG_UC;
-		if (imc->chan[chan].retry_rd_err_log_s & RETRY_RD_ERR_LOG_NOOVER)
+		if (rrl_ctl[0] & RETRY_RD_ERR_LOG_NOOVER)
 			s |= RETRY_RD_ERR_LOG_NOOVER;
-		if (!(imc->chan[chan].retry_rd_err_log_s & RETRY_RD_ERR_LOG_EN))
+		if (!(rrl_ctl[0] & RETRY_RD_ERR_LOG_EN))
 			s &= ~RETRY_RD_ERR_LOG_EN;
-		if (imc->chan[chan].retry_rd_err_log_d & RETRY_RD_ERR_LOG_UC)
+		if (rrl_ctl[1] & RETRY_RD_ERR_LOG_UC)
 			d |= RETRY_RD_ERR_LOG_UC;
-		if (imc->chan[chan].retry_rd_err_log_d & RETRY_RD_ERR_LOG_NOOVER)
+		if (rrl_ctl[1] & RETRY_RD_ERR_LOG_NOOVER)
 			d |= RETRY_RD_ERR_LOG_NOOVER;
-		if (!(imc->chan[chan].retry_rd_err_log_d & RETRY_RD_ERR_LOG_EN))
+		if (!(rrl_ctl[1] & RETRY_RD_ERR_LOG_EN))
 			d &= ~RETRY_RD_ERR_LOG_EN;
 
 		if (offsets_demand2) {
-			if (imc->chan[chan].retry_rd_err_log_d2 & RETRY_RD_ERR_LOG_UC)
+			if (rrl_ctl[2] & RETRY_RD_ERR_LOG_UC)
 				d2 |= RETRY_RD_ERR_LOG_UC;
-			if (!(imc->chan[chan].retry_rd_err_log_d2 & RETRY_RD_ERR_LOG_NOOVER))
+			if (!(rrl_ctl[2] & RETRY_RD_ERR_LOG_NOOVER))
 				d2 &= ~RETRY_RD_ERR_LOG_NOOVER;
-			if (!(imc->chan[chan].retry_rd_err_log_d2 & RETRY_RD_ERR_LOG_EN))
+			if (!(rrl_ctl[2] & RETRY_RD_ERR_LOG_EN))
 				d2 &= ~RETRY_RD_ERR_LOG_EN;
 		}
 	}
@@ -157,6 +157,7 @@ static void __enable_retry_rd_err_log(struct skx_imc *imc, int chan, bool enable
 static void enable_retry_rd_err_log(bool enable)
 {
 	int i, j, imc_num, chan_num;
+	struct skx_channel *chan;
 	struct skx_imc *imc;
 	struct skx_dev *d;
 
@@ -171,8 +172,9 @@ static void enable_retry_rd_err_log(bool enable)
 			if (!imc->mbase)
 				continue;
 
+			chan = d->imc[i].chan;
 			for (j = 0; j < chan_num; j++)
-				__enable_retry_rd_err_log(imc, j, enable,
+				__enable_retry_rd_err_log(imc, j, enable, chan[j].rrl_ctl[0],
 							  res_cfg->offsets_scrub,
 							  res_cfg->offsets_demand,
 							  res_cfg->offsets_demand2);
@@ -186,12 +188,13 @@ static void enable_retry_rd_err_log(bool enable)
 			if (!imc->mbase || !imc->hbm_mc)
 				continue;
 
+			chan = d->imc[i].chan;
 			for (j = 0; j < chan_num; j++) {
-				__enable_retry_rd_err_log(imc, j, enable,
+				__enable_retry_rd_err_log(imc, j, enable, chan[j].rrl_ctl[0],
 							  res_cfg->offsets_scrub_hbm0,
 							  res_cfg->offsets_demand_hbm0,
 							  NULL);
-				__enable_retry_rd_err_log(imc, j, enable,
+				__enable_retry_rd_err_log(imc, j, enable, chan[j].rrl_ctl[1],
 							  res_cfg->offsets_scrub_hbm1,
 							  res_cfg->offsets_demand_hbm1,
 							  NULL);
diff --git a/drivers/edac/skx_common.h b/drivers/edac/skx_common.h
index ca5408803f87..5afd425f3b4f 100644
--- a/drivers/edac/skx_common.h
+++ b/drivers/edac/skx_common.h
@@ -79,6 +79,9 @@
  */
 #define MCACOD_EXT_MEM_ERR	0x280
 
+/* Max RRL register sets per {,sub-,pseudo-}channel. */
+#define NUM_RRL_SET		3
+
 /*
  * Each cpu socket contains some pci devices that provide global
  * information, and also some that are local to each of the two
@@ -117,9 +120,11 @@ struct skx_dev {
 		struct skx_channel {
 			struct pci_dev	*cdev;
 			struct pci_dev	*edev;
-			u32 retry_rd_err_log_s;
-			u32 retry_rd_err_log_d;
-			u32 retry_rd_err_log_d2;
+			/*
+			 * Two groups of RRL control registers per channel to save default RRL
+			 * settings of two {sub-,pseudo-}channels in Linux RRL control mode.
+			 */
+			u32 rrl_ctl[2][NUM_RRL_SET];
 			struct skx_dimm {
 				u8 close_pg;
 				u8 bank_xor_enable;

From patchwork Thu Apr 17 15:07:20 2025
X-Patchwork-Submitter: "Zhuo, Qiuxu"
X-Patchwork-Id: 14055767
From: Qiuxu Zhuo
To: Tony Luck
Cc: Qiuxu Zhuo, Feng Xu, Borislav Petkov, James Morse,
 Mauro Carvalho Chehab, Robert Richter, Yi Lai, Shawn Fan,
 linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 3/7] EDAC/i10nm: Explicitly set the modes of the RRL register
 sets
Date: Thu, 17 Apr 2025 23:07:20 +0800
Message-ID: <20250417150724.1170168-4-qiuxu.zhuo@intel.com>
In-Reply-To: <20250417150724.1170168-1-qiuxu.zhuo@intel.com>
References: <20250417150724.1170168-1-qiuxu.zhuo@intel.com>

The i10nm_edac driver uses the default modes (either patrol scrub read
or on-demand read) of the RRL register sets configured by the BIOS.
Explicitly set the modes during the loading of the i10nm_edac driver
with the module parameter retry_rd_err_log=2.

Tested-by: Feng Xu
Signed-off-by: Qiuxu Zhuo
---
 drivers/edac/i10nm_base.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c
index 355b527d839e..50a16ce0aa22 100644
--- a/drivers/edac/i10nm_base.c
+++ b/drivers/edac/i10nm_base.c
@@ -73,6 +73,7 @@
 #define I10NM_SAD_NM_CACHEABLE(reg)	GET_BITFIELD(reg, 5, 5)
 
 #define RETRY_RD_ERR_LOG_UC		BIT(1)
+#define RETRY_RD_ERR_LOG_EN_PATSPR	BIT(13)
 #define RETRY_RD_ERR_LOG_NOOVER		BIT(14)
 #define RETRY_RD_ERR_LOG_EN		BIT(15)
 #define RETRY_RD_ERR_LOG_NOOVER_UC	(BIT(14) | BIT(1))
@@ -114,12 +115,15 @@ static void __enable_retry_rd_err_log(struct skx_imc *imc, int chan, bool enable
 			rrl_ctl[2] = d2;
 
 		s &= ~RETRY_RD_ERR_LOG_NOOVER_UC;
+		s |= RETRY_RD_ERR_LOG_EN_PATSPR;
 		s |= RETRY_RD_ERR_LOG_EN;
 		d &= ~RETRY_RD_ERR_LOG_NOOVER_UC;
+		d &= ~RETRY_RD_ERR_LOG_EN_PATSPR;
 		d |= RETRY_RD_ERR_LOG_EN;
 
 		if (offsets_demand2) {
 			d2 &= ~RETRY_RD_ERR_LOG_UC;
+			d2 &= ~RETRY_RD_ERR_LOG_EN_PATSPR;
 			d2 |= RETRY_RD_ERR_LOG_NOOVER;
 			d2 |= RETRY_RD_ERR_LOG_EN;
 		}
@@ -129,18 +133,24 @@ static void __enable_retry_rd_err_log(struct skx_imc *imc, int chan, bool enable
 			s |= RETRY_RD_ERR_LOG_UC;
 		if (rrl_ctl[0] & RETRY_RD_ERR_LOG_NOOVER)
 			s |= RETRY_RD_ERR_LOG_NOOVER;
+		if (!(rrl_ctl[0] & RETRY_RD_ERR_LOG_EN_PATSPR))
+			s &= ~RETRY_RD_ERR_LOG_EN_PATSPR;
 		if (!(rrl_ctl[0] & RETRY_RD_ERR_LOG_EN))
 			s &= ~RETRY_RD_ERR_LOG_EN;
 		if (rrl_ctl[1] & RETRY_RD_ERR_LOG_UC)
 			d |= RETRY_RD_ERR_LOG_UC;
 		if (rrl_ctl[1] & RETRY_RD_ERR_LOG_NOOVER)
 			d |= RETRY_RD_ERR_LOG_NOOVER;
+		if (rrl_ctl[1] & RETRY_RD_ERR_LOG_EN_PATSPR)
+			d |= RETRY_RD_ERR_LOG_EN_PATSPR;
 		if (!(rrl_ctl[1] & RETRY_RD_ERR_LOG_EN))
 			d &= ~RETRY_RD_ERR_LOG_EN;
 
 		if (offsets_demand2) {
 			if (rrl_ctl[2] & RETRY_RD_ERR_LOG_UC)
 				d2 |= RETRY_RD_ERR_LOG_UC;
+			if (rrl_ctl[2] & RETRY_RD_ERR_LOG_EN_PATSPR)
+				d2 |= RETRY_RD_ERR_LOG_EN_PATSPR;
 			if (!(rrl_ctl[2] & RETRY_RD_ERR_LOG_NOOVER))
 				d2 &= ~RETRY_RD_ERR_LOG_NOOVER;
 			if (!(rrl_ctl[2] & RETRY_RD_ERR_LOG_EN))

From patchwork Thu Apr 17 15:07:21 2025
X-Patchwork-Submitter: "Zhuo, Qiuxu"
X-Patchwork-Id: 14055768
From: Qiuxu Zhuo
To: Tony Luck
Cc: Qiuxu Zhuo, Feng Xu, Borislav Petkov, James Morse,
 Mauro Carvalho Chehab, Robert Richter, Yi Lai, Shawn Fan,
 linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 4/7] EDAC/{skx_common,i10nm}: Structure the per-channel RRL
 registers
Date: Thu, 17 Apr 2025 23:07:21 +0800
Message-ID: <20250417150724.1170168-5-qiuxu.zhuo@intel.com>
In-Reply-To: <20250417150724.1170168-1-qiuxu.zhuo@intel.com>
References: <20250417150724.1170168-1-qiuxu.zhuo@intel.com>

As the number of RRL (retry_rd_err_log) registers per memory channel
increases, the positions of the RRL control bits and the widths of the
RRL registers vary across different
CPU generations. Adding RRL support for a new CPU requires handling these differences throughout the RRL-related code. Structure the offsets, widths, control bit positions, set numbers, modes, etc., of the per-channel RRL registers and make them configurable to facilitate easier RRL support for new CPUs. No functional changes are intended. Tested-by: Feng Xu Signed-off-by: Qiuxu Zhuo --- drivers/edac/i10nm_base.c | 92 ++++++++++++++++++++++++--------------- drivers/edac/skx_common.h | 21 +++++---- 2 files changed, 69 insertions(+), 44 deletions(-) diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c index 50a16ce0aa22..b47da970510c 100644 --- a/drivers/edac/i10nm_base.c +++ b/drivers/edac/i10nm_base.c @@ -86,15 +86,38 @@ static int retry_rd_err_log; static int decoding_via_mca; static bool mem_cfg_2lm; -static u32 offsets_scrub_icx[] = {0x22c60, 0x22c54, 0x22c5c, 0x22c58, 0x22c28, 0x20ed8}; -static u32 offsets_scrub_spr[] = {0x22c60, 0x22c54, 0x22f08, 0x22c58, 0x22c28, 0x20ed8}; -static u32 offsets_scrub_spr_hbm0[] = {0x2860, 0x2854, 0x2b08, 0x2858, 0x2828, 0x0ed8}; -static u32 offsets_scrub_spr_hbm1[] = {0x2c60, 0x2c54, 0x2f08, 0x2c58, 0x2c28, 0x0fa8}; -static u32 offsets_demand_icx[] = {0x22e54, 0x22e60, 0x22e64, 0x22e58, 0x22e5c, 0x20ee0}; -static u32 offsets_demand_spr[] = {0x22e54, 0x22e60, 0x22f10, 0x22e58, 0x22e5c, 0x20ee0}; -static u32 offsets_demand2_spr[] = {0x22c70, 0x22d80, 0x22f18, 0x22d58, 0x22c64, 0x20f10}; -static u32 offsets_demand_spr_hbm0[] = {0x2a54, 0x2a60, 0x2b10, 0x2a58, 0x2a5c, 0x0ee0}; -static u32 offsets_demand_spr_hbm1[] = {0x2e54, 0x2e60, 0x2f10, 0x2e58, 0x2e5c, 0x0fb0}; +static struct reg_rrl icx_reg_rrl_ddr = { + .set_num = 2, + .offsets = { + {0x22c60, 0x22c54, 0x22c5c, 0x22c58, 0x22c28, 0x20ed8}, + {0x22e54, 0x22e60, 0x22e64, 0x22e58, 0x22e5c, 0x20ee0}, + }, +}; + +static struct reg_rrl spr_reg_rrl_ddr = { + .set_num = 3, + .offsets = { + {0x22c60, 0x22c54, 0x22f08, 0x22c58, 0x22c28, 0x20ed8}, + {0x22e54, 0x22e60, 
0x22f10, 0x22e58, 0x22e5c, 0x20ee0}, + {0x22c70, 0x22d80, 0x22f18, 0x22d58, 0x22c64, 0x20f10}, + }, +}; + +static struct reg_rrl spr_reg_rrl_hbm_pch0 = { + .set_num = 2, + .offsets = { + {0x2860, 0x2854, 0x2b08, 0x2858, 0x2828, 0x0ed8}, + {0x2a54, 0x2a60, 0x2b10, 0x2a58, 0x2a5c, 0x0ee0}, + }, +}; + +static struct reg_rrl spr_reg_rrl_hbm_pch1 = { + .set_num = 2, + .offsets = { + {0x2c60, 0x2c54, 0x2f08, 0x2c58, 0x2c28, 0x0fa8}, + {0x2e54, 0x2e60, 0x2f10, 0x2e58, 0x2e5c, 0x0fb0}, + }, +}; static void __enable_retry_rd_err_log(struct skx_imc *imc, int chan, bool enable, u32 *rrl_ctl, u32 *offsets_scrub, u32 *offsets_demand, @@ -185,9 +208,11 @@ static void enable_retry_rd_err_log(bool enable) chan = d->imc[i].chan; for (j = 0; j < chan_num; j++) __enable_retry_rd_err_log(imc, j, enable, chan[j].rrl_ctl[0], - res_cfg->offsets_scrub, - res_cfg->offsets_demand, - res_cfg->offsets_demand2); + res_cfg->reg_rrl_ddr->offsets[0], + res_cfg->reg_rrl_ddr->offsets[1], + res_cfg->reg_rrl_ddr->set_num > 2 ? + res_cfg->reg_rrl_ddr->offsets[2] : NULL); + } imc_num += res_cfg->hbm_imc_num; @@ -201,12 +226,12 @@ static void enable_retry_rd_err_log(bool enable) chan = d->imc[i].chan; for (j = 0; j < chan_num; j++) { __enable_retry_rd_err_log(imc, j, enable, chan[j].rrl_ctl[0], - res_cfg->offsets_scrub_hbm0, - res_cfg->offsets_demand_hbm0, + res_cfg->reg_rrl_hbm[0]->offsets[0], + res_cfg->reg_rrl_hbm[0]->offsets[1], NULL); __enable_retry_rd_err_log(imc, j, enable, chan[j].rrl_ctl[1], - res_cfg->offsets_scrub_hbm1, - res_cfg->offsets_demand_hbm1, + res_cfg->reg_rrl_hbm[1]->offsets[0], + res_cfg->reg_rrl_hbm[1]->offsets[1], NULL); } } @@ -233,17 +258,18 @@ static void show_retry_rd_err_log(struct decoded_addr *res, char *msg, pch = res->cs & 1; if (pch) - offsets = scrub_err ? res_cfg->offsets_scrub_hbm1 : - res_cfg->offsets_demand_hbm1; + offsets = scrub_err ? res_cfg->reg_rrl_hbm[1]->offsets[0] : + res_cfg->reg_rrl_hbm[1]->offsets[1]; else - offsets = scrub_err ? 
res_cfg->offsets_scrub_hbm0 : - res_cfg->offsets_demand_hbm0; + offsets = scrub_err ? res_cfg->reg_rrl_hbm[0]->offsets[0] : + res_cfg->reg_rrl_hbm[0]->offsets[1]; } else { if (scrub_err) { - offsets = res_cfg->offsets_scrub; + offsets = res_cfg->reg_rrl_ddr->offsets[0]; } else { - offsets = res_cfg->offsets_demand; - xffsets = res_cfg->offsets_demand2; + offsets = res_cfg->reg_rrl_ddr->offsets[1]; + if (res_cfg->reg_rrl_ddr->set_num > 2) + xffsets = res_cfg->reg_rrl_ddr->offsets[2]; } } @@ -883,8 +909,7 @@ static struct res_config i10nm_cfg0 = { .ddr_mdev_bdf = {0, 12, 0}, .hbm_mdev_bdf = {0, 12, 1}, .sad_all_offset = 0x108, - .offsets_scrub = offsets_scrub_icx, - .offsets_demand = offsets_demand_icx, + .reg_rrl_ddr = &icx_reg_rrl_ddr, }; static struct res_config i10nm_cfg1 = { @@ -902,8 +927,7 @@ static struct res_config i10nm_cfg1 = { .ddr_mdev_bdf = {0, 12, 0}, .hbm_mdev_bdf = {0, 12, 1}, .sad_all_offset = 0x108, - .offsets_scrub = offsets_scrub_icx, - .offsets_demand = offsets_demand_icx, + .reg_rrl_ddr = &icx_reg_rrl_ddr, }; static struct res_config spr_cfg = { @@ -926,13 +950,9 @@ static struct res_config spr_cfg = { .ddr_mdev_bdf = {0, 12, 0}, .hbm_mdev_bdf = {0, 12, 1}, .sad_all_offset = 0x300, - .offsets_scrub = offsets_scrub_spr, - .offsets_scrub_hbm0 = offsets_scrub_spr_hbm0, - .offsets_scrub_hbm1 = offsets_scrub_spr_hbm1, - .offsets_demand = offsets_demand_spr, - .offsets_demand2 = offsets_demand2_spr, - .offsets_demand_hbm0 = offsets_demand_spr_hbm0, - .offsets_demand_hbm1 = offsets_demand_spr_hbm1, + .reg_rrl_ddr = &spr_reg_rrl_ddr, + .reg_rrl_hbm[0] = &spr_reg_rrl_hbm_pch0, + .reg_rrl_hbm[1] = &spr_reg_rrl_hbm_pch1, }; static struct res_config gnr_cfg = { @@ -1121,7 +1141,7 @@ static int __init i10nm_init(void) mce_register_decode_chain(&i10nm_mce_dec); skx_setup_debug("i10nm_test"); - if (retry_rd_err_log && res_cfg->offsets_scrub && res_cfg->offsets_demand) { + if (retry_rd_err_log && res_cfg->reg_rrl_ddr) { skx_set_decode(i10nm_mc_decode, 
show_retry_rd_err_log); if (retry_rd_err_log == 2) enable_retry_rd_err_log(true); @@ -1141,7 +1161,7 @@ static void __exit i10nm_exit(void) { edac_dbg(2, "\n"); - if (retry_rd_err_log && res_cfg->offsets_scrub && res_cfg->offsets_demand) { + if (retry_rd_err_log && res_cfg->reg_rrl_ddr) { skx_set_decode(NULL, NULL); if (retry_rd_err_log == 2) enable_retry_rd_err_log(false); diff --git a/drivers/edac/skx_common.h b/drivers/edac/skx_common.h index 5afd425f3b4f..5833fbe7c0fb 100644 --- a/drivers/edac/skx_common.h +++ b/drivers/edac/skx_common.h @@ -81,6 +81,15 @@ /* Max RRL register sets per {,sub-,pseudo-}channel. */ #define NUM_RRL_SET 3 +/* Max RRL registers per set. */ +#define NUM_RRL_REG 6 + +/* RRL registers per {,sub-,pseudo-}channel. */ +struct reg_rrl { + /* RRL register parts. */ + int set_num; + u32 offsets[NUM_RRL_SET][NUM_RRL_REG]; +}; /* * Each cpu socket contains some pci devices that provide global @@ -237,14 +246,10 @@ struct res_config { /* HBM mdev device BDF */ struct pci_bdf hbm_mdev_bdf; int sad_all_offset; - /* Offsets of retry_rd_err_log registers */ - u32 *offsets_scrub; - u32 *offsets_scrub_hbm0; - u32 *offsets_scrub_hbm1; - u32 *offsets_demand; - u32 *offsets_demand2; - u32 *offsets_demand_hbm0; - u32 *offsets_demand_hbm1; + /* RRL register sets per DDR channel */ + struct reg_rrl *reg_rrl_ddr; + /* RRL register sets per HBM channel */ + struct reg_rrl *reg_rrl_hbm[2]; }; typedef int (*get_dimm_config_f)(struct mem_ctl_info *mci, From patchwork Thu Apr 17 15:07:22 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Zhuo, Qiuxu" X-Patchwork-Id: 14055769 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BDBCD253955; Thu, 17 Apr 2025 15:09:17 +0000 (UTC) Authentication-Results: 
smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.13 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744902560; cv=none; b=trKQJrxoy9FsJbg9Fktj7nvcyWYiC+Z714bWkUVYTPQqkN0iGQ3WQl2yu40SXwWJrgAWftB6IT5E4lUq58zLoGkyLF0lY6wpnBaKHYiFSPuAdyrK+dgybb9uxx7iwvzHypLsZJZJzmUza2GxYGn7XXzEHAQ3t2zlRHaKC12xtns= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744902560; c=relaxed/simple; bh=HskvPqh0fp+Xs08GCGluLWoFMFTiGIhDxXetKAaR2zo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=sro1Ijqcj/cmIoEQTluajz07y6hhbHBtVPOpN3uWph8psmY5D9ol/ad5IgzD8QpkSsBcU8/t3SSkTMzHl3MjebXdqZ1GmLUtazPyPOHXyQhdRfb4eK1K2T817kg3gxL48jhNWzSOsWxSCHVA4S5iBzGaCmb4D8tEMGIyRnIFtlY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=IcNZstgU; arc=none smtp.client-ip=198.175.65.13 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="IcNZstgU" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1744902557; x=1776438557; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=HskvPqh0fp+Xs08GCGluLWoFMFTiGIhDxXetKAaR2zo=; b=IcNZstgUZQi1JiTeElGnVrIsE2dYt6rblFejKJwK2Awsb4Im3dPHQItd BkBzTh99PKlGrdPwDZIGa8zwoUW8toWWYbf7GmZEeAMNQV/SqmRm8Ux76 JoElhbNQAj+dDzg5pxMRFjIAPn+OirENHknHFbwnIiXbc5hyndtKP9LSB p3ghz+tQZy6RrfE1yhxtavK4SGlPaz9vE7HtNvpD+p7V5b1QsxdNl8IQh asQD1PaoCRFnCxo65H4+7tlvP7W9PfSiwYvrRO+JtVZ6cqMKHsNsyj0+i oj66/oGBld/vuunSR5lxOQClzM73GFwewL13vXqg+yCs4Jp2H/YYMW6s5 A==; X-CSE-ConnectionGUID: 
atgH34e5TZG8cdm20ASeQw== X-CSE-MsgGUID: 3/5DysP8TdawPpnKNVjOyg== X-IronPort-AV: E=McAfee;i="6700,10204,11406"; a="57488732" X-IronPort-AV: E=Sophos;i="6.15,219,1739865600"; d="scan'208";a="57488732" Received: from fmviesa001.fm.intel.com ([10.60.135.141]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Apr 2025 08:09:17 -0700 X-CSE-ConnectionGUID: z9KLyd6+SsyicaVLGaPR1Q== X-CSE-MsgGUID: NYORa+7LQcipEQVt690L0g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,219,1739865600"; d="scan'208";a="161876913" Received: from qiuxu-clx.sh.intel.com ([10.239.53.109]) by smtpauth.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Apr 2025 08:09:14 -0700 From: Qiuxu Zhuo To: Tony Luck Cc: Qiuxu Zhuo , Feng Xu , Borislav Petkov , James Morse , Mauro Carvalho Chehab , Robert Richter , Yi Lai , Shawn Fan , linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 5/7] EDAC/{skx_common,i10nm}: Refactor enable_retry_rd_err_log() Date: Thu, 17 Apr 2025 23:07:22 +0800 Message-ID: <20250417150724.1170168-6-qiuxu.zhuo@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250417150724.1170168-1-qiuxu.zhuo@intel.com> References: <20250417150724.1170168-1-qiuxu.zhuo@intel.com> Precedence: bulk X-Mailing-List: linux-edac@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Refactor enable_retry_rd_err_log() using helper functions for both DDR and HBM, making the RRL control bits configurable instead of hard-coded. Additionally, explicitly define the four RRL modes for better readability. No functional changes intended. 
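
For illustration, the mapping from the four RRL modes to the control bits can be sketched in plain C outside the kernel. This is a minimal sketch under stated assumptions, not the driver code: struct rrl_masks and rrl_ctl_enable() are hypothetical names introduced here, and only the enable path of enable_rrl() is modelled (no register I/O, no save/restore of the default value).

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors enum rrl_mode as added to skx_common.h by this patch. */
enum rrl_mode { LRE_SCRUB, LRE_DEMAND, FRE_SCRUB, FRE_DEMAND };

/* Hypothetical stand-in for the control-bit mask fields of struct reg_rrl. */
struct rrl_masks {
	uint32_t uc_mask;
	uint32_t en_patspr_mask;
	uint32_t noover_mask;
	uint32_t en_mask;
};

/*
 * Hypothetical helper: given the current control register value v,
 * return the value the enable path would write back for the given mode.
 */
static uint32_t rrl_ctl_enable(uint32_t v, enum rrl_mode mode,
			       const struct rrl_masks *m)
{
	/* First or last read error. */
	bool first = (mode == FRE_SCRUB || mode == FRE_DEMAND);
	/* Patrol scrub or on-demand read error. */
	bool scrub = (mode == FRE_SCRUB || mode == LRE_SCRUB);

	/* The UC-only filter is always cleared. */
	v &= ~m->uc_mask;

	/* "First" modes must not be overwritten; "last" modes may be. */
	if (first)
		v |= m->noover_mask;
	else
		v &= ~m->noover_mask;

	/* Scrub modes log patrol-scrub errors; demand modes do not. */
	if (scrub)
		v |= m->en_patspr_mask;
	else
		v &= ~m->en_patspr_mask;

	/* Logging is enabled for every mode. */
	return v | m->en_mask;
}
```

With the ICX/SPR masks from the tables in this patch (uc = BIT(1), en_patspr = BIT(13), noover = BIT(14), en = BIT(15)), the modes LRE_SCRUB, LRE_DEMAND and FRE_DEMAND yield the same bit settings the old code applied to s, d and d2 respectively, which is consistent with "no functional changes intended".
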
Tested-by: Feng Xu Signed-off-by: Qiuxu Zhuo --- drivers/edac/i10nm_base.c | 233 ++++++++++++++++++++++---------------- drivers/edac/skx_common.h | 20 ++++ 2 files changed, 154 insertions(+), 99 deletions(-) diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c index b47da970510c..2a03db86883c 100644 --- a/drivers/edac/i10nm_base.c +++ b/drivers/edac/i10nm_base.c @@ -72,11 +72,6 @@ #define I10NM_SAD_ENABLE(reg) GET_BITFIELD(reg, 0, 0) #define I10NM_SAD_NM_CACHEABLE(reg) GET_BITFIELD(reg, 5, 5) -#define RETRY_RD_ERR_LOG_UC BIT(1) -#define RETRY_RD_ERR_LOG_EN_PATSPR BIT(13) -#define RETRY_RD_ERR_LOG_NOOVER BIT(14) -#define RETRY_RD_ERR_LOG_EN BIT(15) -#define RETRY_RD_ERR_LOG_NOOVER_UC (BIT(14) | BIT(1)) #define RETRY_RD_ERR_LOG_OVER_UC_V (BIT(2) | BIT(1) | BIT(0)) static struct list_head *i10nm_edac_list; @@ -88,153 +83,193 @@ static bool mem_cfg_2lm; static struct reg_rrl icx_reg_rrl_ddr = { .set_num = 2, + .modes = {LRE_SCRUB, LRE_DEMAND}, .offsets = { {0x22c60, 0x22c54, 0x22c5c, 0x22c58, 0x22c28, 0x20ed8}, {0x22e54, 0x22e60, 0x22e64, 0x22e58, 0x22e5c, 0x20ee0}, }, + .widths = {4, 4, 4, 4, 4, 8}, + .uc_mask = BIT(1), + .en_patspr_mask = BIT(13), + .noover_mask = BIT(14), + .en_mask = BIT(15), }; static struct reg_rrl spr_reg_rrl_ddr = { .set_num = 3, + .modes = {LRE_SCRUB, LRE_DEMAND, FRE_DEMAND}, .offsets = { {0x22c60, 0x22c54, 0x22f08, 0x22c58, 0x22c28, 0x20ed8}, {0x22e54, 0x22e60, 0x22f10, 0x22e58, 0x22e5c, 0x20ee0}, {0x22c70, 0x22d80, 0x22f18, 0x22d58, 0x22c64, 0x20f10}, }, + .widths = {4, 4, 8, 4, 4, 8}, + .uc_mask = BIT(1), + .en_patspr_mask = BIT(13), + .noover_mask = BIT(14), + .en_mask = BIT(15), }; static struct reg_rrl spr_reg_rrl_hbm_pch0 = { .set_num = 2, + .modes = {LRE_SCRUB, LRE_DEMAND}, .offsets = { {0x2860, 0x2854, 0x2b08, 0x2858, 0x2828, 0x0ed8}, {0x2a54, 0x2a60, 0x2b10, 0x2a58, 0x2a5c, 0x0ee0}, }, + .widths = {4, 4, 8, 4, 4, 8}, + .uc_mask = BIT(1), + .en_patspr_mask = BIT(13), + .noover_mask = BIT(14), + .en_mask = BIT(15), }; 
static struct reg_rrl spr_reg_rrl_hbm_pch1 = { .set_num = 2, + .modes = {LRE_SCRUB, LRE_DEMAND}, .offsets = { {0x2c60, 0x2c54, 0x2f08, 0x2c58, 0x2c28, 0x0fa8}, {0x2e54, 0x2e60, 0x2f10, 0x2e58, 0x2e5c, 0x0fb0}, }, + .widths = {4, 4, 8, 4, 4, 8}, + .uc_mask = BIT(1), + .en_patspr_mask = BIT(13), + .noover_mask = BIT(14), + .en_mask = BIT(15), }; -static void __enable_retry_rd_err_log(struct skx_imc *imc, int chan, bool enable, u32 *rrl_ctl, - u32 *offsets_scrub, u32 *offsets_demand, - u32 *offsets_demand2) +static u64 read_imc_reg(struct skx_imc *imc, int chan, u32 offset, u8 width) { - u32 s, d, d2; + switch (width) { + case 4: + return I10NM_GET_REG32(imc, chan, offset); + case 8: + return I10NM_GET_REG64(imc, chan, offset); + default: + i10nm_printk(KERN_ERR, "Invalid read RRL 0x%x width %d\n", offset, width); + return 0; + } +} - s = I10NM_GET_REG32(imc, chan, offsets_scrub[0]); - d = I10NM_GET_REG32(imc, chan, offsets_demand[0]); - if (offsets_demand2) - d2 = I10NM_GET_REG32(imc, chan, offsets_demand2[0]); +static void write_imc_reg(struct skx_imc *imc, int chan, u32 offset, u8 width, u64 val) +{ + switch (width) { + case 4: + return I10NM_SET_REG32(imc, chan, offset, (u32)val); + default: + i10nm_printk(KERN_ERR, "Invalid write RRL 0x%x width %d\n", offset, width); + } +} + +static void enable_rrl(struct skx_imc *imc, int chan, struct reg_rrl *rrl, + int rrl_set, bool enable, u32 *rrl_ctl) +{ + enum rrl_mode mode = rrl->modes[rrl_set]; + u32 offset = rrl->offsets[rrl_set][0], v; + u8 width = rrl->widths[0]; + bool first, scrub; + + /* First or last read error. */ + first = (mode == FRE_SCRUB || mode == FRE_DEMAND); + /* Patrol scrub or on-demand read error. */ + scrub = (mode == FRE_SCRUB || mode == LRE_SCRUB); + + v = read_imc_reg(imc, chan, offset, width); if (enable) { - /* Save default configurations */ - rrl_ctl[0] = s; - rrl_ctl[1] = d; - if (offsets_demand2) - rrl_ctl[2] = d2; + /* Save default configurations. 
*/ + *rrl_ctl = v; + v &= ~rrl->uc_mask; - s &= ~RETRY_RD_ERR_LOG_NOOVER_UC; - s |= RETRY_RD_ERR_LOG_EN_PATSPR; - s |= RETRY_RD_ERR_LOG_EN; - d &= ~RETRY_RD_ERR_LOG_NOOVER_UC; - d &= ~RETRY_RD_ERR_LOG_EN_PATSPR; - d |= RETRY_RD_ERR_LOG_EN; + if (first) + v |= rrl->noover_mask; + else + v &= ~rrl->noover_mask; - if (offsets_demand2) { - d2 &= ~RETRY_RD_ERR_LOG_UC; - d2 &= ~RETRY_RD_ERR_LOG_EN_PATSPR; - d2 |= RETRY_RD_ERR_LOG_NOOVER; - d2 |= RETRY_RD_ERR_LOG_EN; - } + if (scrub) + v |= rrl->en_patspr_mask; + else + v &= ~rrl->en_patspr_mask; + + v |= rrl->en_mask; } else { - /* Restore default configurations */ - if (rrl_ctl[0] & RETRY_RD_ERR_LOG_UC) - s |= RETRY_RD_ERR_LOG_UC; - if (rrl_ctl[0] & RETRY_RD_ERR_LOG_NOOVER) - s |= RETRY_RD_ERR_LOG_NOOVER; - if (!(rrl_ctl[0] & RETRY_RD_ERR_LOG_EN_PATSPR)) - s &= ~RETRY_RD_ERR_LOG_EN_PATSPR; - if (!(rrl_ctl[0] & RETRY_RD_ERR_LOG_EN)) - s &= ~RETRY_RD_ERR_LOG_EN; - if (rrl_ctl[1] & RETRY_RD_ERR_LOG_UC) - d |= RETRY_RD_ERR_LOG_UC; - if (rrl_ctl[1] & RETRY_RD_ERR_LOG_NOOVER) - d |= RETRY_RD_ERR_LOG_NOOVER; - if (rrl_ctl[1] & RETRY_RD_ERR_LOG_EN_PATSPR) - d |= RETRY_RD_ERR_LOG_EN_PATSPR; - if (!(rrl_ctl[1] & RETRY_RD_ERR_LOG_EN)) - d &= ~RETRY_RD_ERR_LOG_EN; - - if (offsets_demand2) { - if (rrl_ctl[2] & RETRY_RD_ERR_LOG_UC) - d2 |= RETRY_RD_ERR_LOG_UC; - if (rrl_ctl[2] & RETRY_RD_ERR_LOG_EN_PATSPR) - d2 |= RETRY_RD_ERR_LOG_EN_PATSPR; - if (!(rrl_ctl[2] & RETRY_RD_ERR_LOG_NOOVER)) - d2 &= ~RETRY_RD_ERR_LOG_NOOVER; - if (!(rrl_ctl[2] & RETRY_RD_ERR_LOG_EN)) - d2 &= ~RETRY_RD_ERR_LOG_EN; + /* Restore default configurations. 
*/ + if (*rrl_ctl & rrl->uc_mask) + v |= rrl->uc_mask; + + if (first) { + if (!(*rrl_ctl & rrl->noover_mask)) + v &= ~rrl->noover_mask; + } else { + if (*rrl_ctl & rrl->noover_mask) + v |= rrl->noover_mask; + } + + if (scrub) { + if (!(*rrl_ctl & rrl->en_patspr_mask)) + v &= ~rrl->en_patspr_mask; + } else { + if (*rrl_ctl & rrl->en_patspr_mask) + v |= rrl->en_patspr_mask; } + + if (!(*rrl_ctl & rrl->en_mask)) + v &= ~rrl->en_mask; } - I10NM_SET_REG32(imc, chan, offsets_scrub[0], s); - I10NM_SET_REG32(imc, chan, offsets_demand[0], d); - if (offsets_demand2) - I10NM_SET_REG32(imc, chan, offsets_demand2[0], d2); + write_imc_reg(imc, chan, offset, width, v); +} + +static void enable_rrls(struct skx_imc *imc, int chan, struct reg_rrl *rrl, + bool enable, u32 *rrl_ctl) +{ + for (int i = 0; i < rrl->set_num; i++) + enable_rrl(imc, chan, rrl, i, enable, rrl_ctl + i); +} + +static void enable_rrls_ddr(struct skx_imc *imc, bool enable) +{ + struct reg_rrl *rrl_ddr = res_cfg->reg_rrl_ddr; + int i, chan_num = res_cfg->ddr_chan_num; + struct skx_channel *chan = imc->chan; + + if (!imc->mbase) + return; + + for (i = 0; i < chan_num; i++) + enable_rrls(imc, i, rrl_ddr, enable, chan[i].rrl_ctl[0]); +} + +static void enable_rrls_hbm(struct skx_imc *imc, bool enable) +{ + struct reg_rrl **rrl_hbm = res_cfg->reg_rrl_hbm; + int i, chan_num = res_cfg->hbm_chan_num; + struct skx_channel *chan = imc->chan; + + if (!imc->mbase || !imc->hbm_mc || !rrl_hbm[0] || !rrl_hbm[1]) + return; + + for (i = 0; i < chan_num; i++) { + enable_rrls(imc, i, rrl_hbm[0], enable, chan[i].rrl_ctl[0]); + enable_rrls(imc, i, rrl_hbm[1], enable, chan[i].rrl_ctl[1]); + } } static void enable_retry_rd_err_log(bool enable) { - int i, j, imc_num, chan_num; - struct skx_channel *chan; - struct skx_imc *imc; struct skx_dev *d; + int i, imc_num; edac_dbg(2, "\n"); list_for_each_entry(d, i10nm_edac_list, list) { imc_num = res_cfg->ddr_imc_num; - chan_num = res_cfg->ddr_chan_num; - - for (i = 0; i < imc_num; i++) { - imc 
= &d->imc[i]; - if (!imc->mbase) - continue; - - chan = d->imc[i].chan; - for (j = 0; j < chan_num; j++) - __enable_retry_rd_err_log(imc, j, enable, chan[j].rrl_ctl[0], - res_cfg->reg_rrl_ddr->offsets[0], - res_cfg->reg_rrl_ddr->offsets[1], - res_cfg->reg_rrl_ddr->set_num > 2 ? - res_cfg->reg_rrl_ddr->offsets[2] : NULL); - - } + for (i = 0; i < imc_num; i++) + enable_rrls_ddr(&d->imc[i], enable); imc_num += res_cfg->hbm_imc_num; - chan_num = res_cfg->hbm_chan_num; - - for (; i < imc_num; i++) { - imc = &d->imc[i]; - if (!imc->mbase || !imc->hbm_mc) - continue; - - chan = d->imc[i].chan; - for (j = 0; j < chan_num; j++) { - __enable_retry_rd_err_log(imc, j, enable, chan[j].rrl_ctl[0], - res_cfg->reg_rrl_hbm[0]->offsets[0], - res_cfg->reg_rrl_hbm[0]->offsets[1], - NULL); - __enable_retry_rd_err_log(imc, j, enable, chan[j].rrl_ctl[1], - res_cfg->reg_rrl_hbm[1]->offsets[0], - res_cfg->reg_rrl_hbm[1]->offsets[1], - NULL); - } - } + for (; i < imc_num; i++) + enable_rrls_hbm(&d->imc[i], enable); } } diff --git a/drivers/edac/skx_common.h b/drivers/edac/skx_common.h index 5833fbe7c0fb..cf3d0aac035a 100644 --- a/drivers/edac/skx_common.h +++ b/drivers/edac/skx_common.h @@ -84,11 +84,31 @@ /* Max RRL registers per set. */ #define NUM_RRL_REG 6 +/* Modes of RRL register set. */ +enum rrl_mode { + /* Last read error from patrol scrub. */ + LRE_SCRUB, + /* Last read error from demand. */ + LRE_DEMAND, + /* First read error from patrol scrub. */ + FRE_SCRUB, + /* First read error from demand. */ + FRE_DEMAND, +}; + /* RRL registers per {,sub-,pseudo-}channel. */ struct reg_rrl { /* RRL register parts. */ int set_num; + enum rrl_mode modes[NUM_RRL_SET]; u32 offsets[NUM_RRL_SET][NUM_RRL_REG]; + /* RRL register widths in byte per set. */ + u8 widths[NUM_RRL_REG]; + /* RRL control bits of the first register per set. 
*/ + u32 uc_mask; + u32 en_patspr_mask; + u32 noover_mask; + u32 en_mask; }; /* From patchwork Thu Apr 17 15:07:23 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Zhuo, Qiuxu" X-Patchwork-Id: 14055770 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3BB4E253B66; Thu, 17 Apr 2025 15:09:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.13 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744902564; cv=none; b=q8c7oNV/Fxr5LZOn+ISakwj/LkPhXNYY5Yr+1uMUOshPFkRFnHshBLKrVjHNUqwk3YDaenlnjR1TvdjpahMMI9iCQGPaAho2TwxeYDWzERiWpga56Wy8xyph893g5a5AUaHZBAisbvl47X5EmuYoaxtMYbvfKn7QN+nSZp9SGoE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744902564; c=relaxed/simple; bh=2+rdq708IgOQ8Y9SrhY0PXGXgcLtCeRE4b9uBqxRRyA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Ym+rHLluqO04zSTodhLd+jpNxS7c0ecyU8LO6RaMqYgCPV/DGxZLkMkago/35UPsqL9l33qvDjOaVxBfkUpArFhYR9DtHJYIgYG2KXbRAgFEF83tHA3sNJgh7MGH6LXgIxok+g3L2UrcIsg3+Hn11jx8zg/a0kW/kZmv/hMTjCA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=X/rHbQiW; arc=none smtp.client-ip=198.175.65.13 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="X/rHbQiW" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; 
s=Intel; t=1744902562; x=1776438562; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2+rdq708IgOQ8Y9SrhY0PXGXgcLtCeRE4b9uBqxRRyA=; b=X/rHbQiWOBkQuUiE0qwjPAO/wadOKcE78T+y7S6qom03ZDK2Zi9gfrlo 254fYT+Tgd+zY8XUUVIGEWsoPppFQeTIOnVNgQiEtdhDQ1kgsOMSbhFOY C6j3ooBKJPivfY+kmoU0Tk+M/rW9AuVX8nmaZJyvibeopveParWrvv/5t kxN3Duo0gtF/7S1Q/BcH3dnsMjtmXYFA3TdsBRFmGPd9ZxnuQIYzJ5fw3 rQQHG/HV90WmAG30sKHkOI25/53cN6+GuJFGvkpCMVcsx9AdAkLLlQa3q 1irTZ3RR/iSt8TyZx1LvTqB62Cg37p91j0KBf6znZmlq7M2WASqZOR87o w==; X-CSE-ConnectionGUID: pcZQe3nbQlmowMrTxFnLnQ== X-CSE-MsgGUID: 1UBPZomBSci5LLZT4UX5Jg== X-IronPort-AV: E=McAfee;i="6700,10204,11406"; a="57488743" X-IronPort-AV: E=Sophos;i="6.15,219,1739865600"; d="scan'208";a="57488743" Received: from fmviesa001.fm.intel.com ([10.60.135.141]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Apr 2025 08:09:22 -0700 X-CSE-ConnectionGUID: iEpU18jaToyCyaaAXg9N7g== X-CSE-MsgGUID: dbxj0U2wQGaJaTbvt1xWiQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,219,1739865600"; d="scan'208";a="161876933" Received: from qiuxu-clx.sh.intel.com ([10.239.53.109]) by smtpauth.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Apr 2025 08:09:19 -0700 From: Qiuxu Zhuo To: Tony Luck Cc: Qiuxu Zhuo , Feng Xu , Borislav Petkov , James Morse , Mauro Carvalho Chehab , Robert Richter , Yi Lai , Shawn Fan , linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 6/7] EDAC/{skx_common,i10nm}: Refactor show_retry_rd_err_log() Date: Thu, 17 Apr 2025 23:07:23 +0800 Message-ID: <20250417150724.1170168-7-qiuxu.zhuo@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250417150724.1170168-1-qiuxu.zhuo@intel.com> References: <20250417150724.1170168-1-qiuxu.zhuo@intel.com> Precedence: bulk X-Mailing-List: linux-edac@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Make the {valid bit, overwritten status, number} of RRL registers and the {number, offsets, 
widths} of per-channel CORRERRCNT registers configurable. Refactor show_retry_rd_err_log() to use the configurable fields of struct reg_rrl, making the code more scalable and simpler. No functional changes intended. Tested-by: Feng Xu Signed-off-by: Qiuxu Zhuo --- drivers/edac/i10nm_base.c | 158 +++++++++++++++++--------------------- drivers/edac/skx_common.h | 11 ++- 2 files changed, 79 insertions(+), 90 deletions(-) diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c index 2a03db86883c..aefc448283d3 100644 --- a/drivers/edac/i10nm_base.c +++ b/drivers/edac/i10nm_base.c @@ -72,8 +72,6 @@ #define I10NM_SAD_ENABLE(reg) GET_BITFIELD(reg, 0, 0) #define I10NM_SAD_NM_CACHEABLE(reg) GET_BITFIELD(reg, 5, 5) -#define RETRY_RD_ERR_LOG_OVER_UC_V (BIT(2) | BIT(1) | BIT(0)) - static struct list_head *i10nm_edac_list; static struct res_config *res_cfg; @@ -83,20 +81,28 @@ static bool mem_cfg_2lm; static struct reg_rrl icx_reg_rrl_ddr = { .set_num = 2, + .reg_num = 6, .modes = {LRE_SCRUB, LRE_DEMAND}, .offsets = { {0x22c60, 0x22c54, 0x22c5c, 0x22c58, 0x22c28, 0x20ed8}, {0x22e54, 0x22e60, 0x22e64, 0x22e58, 0x22e5c, 0x20ee0}, }, .widths = {4, 4, 4, 4, 4, 8}, + .v_mask = BIT(0), .uc_mask = BIT(1), + .over_mask = BIT(2), .en_patspr_mask = BIT(13), .noover_mask = BIT(14), .en_mask = BIT(15), + + .cecnt_num = 4, + .cecnt_offsets = {0x22c18, 0x22c1c, 0x22c20, 0x22c24}, + .cecnt_widths = {4, 4, 4, 4}, }; static struct reg_rrl spr_reg_rrl_ddr = { .set_num = 3, + .reg_num = 6, .modes = {LRE_SCRUB, LRE_DEMAND, FRE_DEMAND}, .offsets = { {0x22c60, 0x22c54, 0x22f08, 0x22c58, 0x22c28, 0x20ed8}, @@ -104,38 +110,58 @@ static struct reg_rrl spr_reg_rrl_ddr = { {0x22c70, 0x22d80, 0x22f18, 0x22d58, 0x22c64, 0x20f10}, }, .widths = {4, 4, 8, 4, 4, 8}, + .v_mask = BIT(0), .uc_mask = BIT(1), + .over_mask = BIT(2), .en_patspr_mask = BIT(13), .noover_mask = BIT(14), .en_mask = BIT(15), + + .cecnt_num = 4, + .cecnt_offsets = {0x22c18, 0x22c1c, 0x22c20, 0x22c24}, + .cecnt_widths = {4, 4, 4, 
4}, }; static struct reg_rrl spr_reg_rrl_hbm_pch0 = { .set_num = 2, + .reg_num = 6, .modes = {LRE_SCRUB, LRE_DEMAND}, .offsets = { {0x2860, 0x2854, 0x2b08, 0x2858, 0x2828, 0x0ed8}, {0x2a54, 0x2a60, 0x2b10, 0x2a58, 0x2a5c, 0x0ee0}, }, .widths = {4, 4, 8, 4, 4, 8}, + .v_mask = BIT(0), .uc_mask = BIT(1), + .over_mask = BIT(2), .en_patspr_mask = BIT(13), .noover_mask = BIT(14), .en_mask = BIT(15), + + .cecnt_num = 4, + .cecnt_offsets = {0x2818, 0x281c, 0x2820, 0x2824}, + .cecnt_widths = {4, 4, 4, 4}, }; static struct reg_rrl spr_reg_rrl_hbm_pch1 = { .set_num = 2, + .reg_num = 6, .modes = {LRE_SCRUB, LRE_DEMAND}, .offsets = { {0x2c60, 0x2c54, 0x2f08, 0x2c58, 0x2c28, 0x0fa8}, {0x2e54, 0x2e60, 0x2f10, 0x2e58, 0x2e5c, 0x0fb0}, }, .widths = {4, 4, 8, 4, 4, 8}, + .v_mask = BIT(0), .uc_mask = BIT(1), + .over_mask = BIT(2), .en_patspr_mask = BIT(13), .noover_mask = BIT(14), .en_mask = BIT(15), + + .cecnt_num = 4, + .cecnt_offsets = {0x2c18, 0x2c1c, 0x2c20, 0x2c24}, + .cecnt_widths = {4, 4, 4, 4}, }; static u64 read_imc_reg(struct skx_imc *imc, int chan, u32 offset, u8 width) @@ -276,110 +302,64 @@ static void enable_retry_rd_err_log(bool enable) static void show_retry_rd_err_log(struct decoded_addr *res, char *msg, int len, bool scrub_err) { + int i, j, n, ch = res->channel, pch = res->cs & 1; struct skx_imc *imc = &res->dev->imc[res->imc]; - u32 log0, log1, log2, log3, log4; - u32 corr0, corr1, corr2, corr3; - u32 lxg0, lxg1, lxg3, lxg4; - u32 *xffsets = NULL; - u64 log2a, log5; - u64 lxg2a, lxg5; - u32 *offsets; - int n, pch; + u32 offset, status_mask; + struct reg_rrl *rrl; + u64 log, corr; + bool scrub; + u8 width; if (!imc->mbase) return; - if (imc->hbm_mc) { - pch = res->cs & 1; + rrl = imc->hbm_mc ? res_cfg->reg_rrl_hbm[pch] : res_cfg->reg_rrl_ddr; - if (pch) - offsets = scrub_err ? res_cfg->reg_rrl_hbm[1]->offsets[0] : - res_cfg->reg_rrl_hbm[1]->offsets[1]; - else - offsets = scrub_err ? 
res_cfg->reg_rrl_hbm[0]->offsets[0] : - res_cfg->reg_rrl_hbm[0]->offsets[1]; - } else { - if (scrub_err) { - offsets = res_cfg->reg_rrl_ddr->offsets[0]; - } else { - offsets = res_cfg->reg_rrl_ddr->offsets[1]; - if (res_cfg->reg_rrl_ddr->set_num > 2) - xffsets = res_cfg->reg_rrl_ddr->offsets[2]; - } - } + if (!rrl) + return; - log0 = I10NM_GET_REG32(imc, res->channel, offsets[0]); - log1 = I10NM_GET_REG32(imc, res->channel, offsets[1]); - log3 = I10NM_GET_REG32(imc, res->channel, offsets[3]); - log4 = I10NM_GET_REG32(imc, res->channel, offsets[4]); - log5 = I10NM_GET_REG64(imc, res->channel, offsets[5]); + status_mask = rrl->over_mask | rrl->uc_mask | rrl->v_mask; - if (xffsets) { - lxg0 = I10NM_GET_REG32(imc, res->channel, xffsets[0]); - lxg1 = I10NM_GET_REG32(imc, res->channel, xffsets[1]); - lxg3 = I10NM_GET_REG32(imc, res->channel, xffsets[3]); - lxg4 = I10NM_GET_REG32(imc, res->channel, xffsets[4]); - lxg5 = I10NM_GET_REG64(imc, res->channel, xffsets[5]); - } + n = snprintf(msg, len, " retry_rd_err_log["); + for (i = 0; i < rrl->set_num; i++) { + scrub = (rrl->modes[i] == FRE_SCRUB || rrl->modes[i] == LRE_SCRUB); + if (scrub_err != scrub) + continue; - if (res_cfg->type == SPR) { - log2a = I10NM_GET_REG64(imc, res->channel, offsets[2]); - n = snprintf(msg, len, " retry_rd_err_log[%.8x %.8x %.16llx %.8x %.8x %.16llx", - log0, log1, log2a, log3, log4, log5); + for (j = 0; j < rrl->reg_num && len - n > 0; j++) { + offset = rrl->offsets[i][j]; + width = rrl->widths[j]; + log = read_imc_reg(imc, ch, offset, width); - if (len - n > 0) { - if (xffsets) { - lxg2a = I10NM_GET_REG64(imc, res->channel, xffsets[2]); - n += snprintf(msg + n, len - n, " %.8x %.8x %.16llx %.8x %.8x %.16llx]", - lxg0, lxg1, lxg2a, lxg3, lxg4, lxg5); - } else { - n += snprintf(msg + n, len - n, "]"); - } - } - } else { - log2 = I10NM_GET_REG32(imc, res->channel, offsets[2]); - n = snprintf(msg, len, " retry_rd_err_log[%.8x %.8x %.8x %.8x %.8x %.16llx]", - log0, log1, log2, log3, log4, log5); - 
} + if (width == 4) + n += snprintf(msg + n, len - n, "%.8llx ", log); + else + n += snprintf(msg + n, len - n, "%.16llx ", log); - if (imc->hbm_mc) { - if (pch) { - corr0 = I10NM_GET_REG32(imc, res->channel, 0x2c18); - corr1 = I10NM_GET_REG32(imc, res->channel, 0x2c1c); - corr2 = I10NM_GET_REG32(imc, res->channel, 0x2c20); - corr3 = I10NM_GET_REG32(imc, res->channel, 0x2c24); - } else { - corr0 = I10NM_GET_REG32(imc, res->channel, 0x2818); - corr1 = I10NM_GET_REG32(imc, res->channel, 0x281c); - corr2 = I10NM_GET_REG32(imc, res->channel, 0x2820); - corr3 = I10NM_GET_REG32(imc, res->channel, 0x2824); + /* Clear RRL status if RRL in Linux control mode. */ + if (retry_rd_err_log == 2 && !j && (log & status_mask)) + write_imc_reg(imc, ch, offset, width, log & ~status_mask); } - } else { - corr0 = I10NM_GET_REG32(imc, res->channel, 0x22c18); - corr1 = I10NM_GET_REG32(imc, res->channel, 0x22c1c); - corr2 = I10NM_GET_REG32(imc, res->channel, 0x22c20); - corr3 = I10NM_GET_REG32(imc, res->channel, 0x22c24); } - if (len - n > 0) - snprintf(msg + n, len - n, - " correrrcnt[%.4x %.4x %.4x %.4x %.4x %.4x %.4x %.4x]", - corr0 & 0xffff, corr0 >> 16, - corr1 & 0xffff, corr1 >> 16, - corr2 & 0xffff, corr2 >> 16, - corr3 & 0xffff, corr3 >> 16); + /* Move back one space. 
*/ + n--; + n += snprintf(msg + n, len - n, "]"); - /* Clear status bits */ - if (retry_rd_err_log == 2) { - if (log0 & RETRY_RD_ERR_LOG_OVER_UC_V) { - log0 &= ~RETRY_RD_ERR_LOG_OVER_UC_V; - I10NM_SET_REG32(imc, res->channel, offsets[0], log0); - } + if (len - n > 0) { + n += snprintf(msg + n, len - n, " correrrcnt["); + for (i = 0; i < rrl->cecnt_num && len - n > 0; i++) { + offset = rrl->cecnt_offsets[i]; + width = rrl->cecnt_widths[i]; + corr = read_imc_reg(imc, ch, offset, width); - if (xffsets && (lxg0 & RETRY_RD_ERR_LOG_OVER_UC_V)) { - lxg0 &= ~RETRY_RD_ERR_LOG_OVER_UC_V; - I10NM_SET_REG32(imc, res->channel, xffsets[0], lxg0); + n += snprintf(msg + n, len - n, "%.4llx %.4llx ", + corr & 0xffff, corr >> 16); } + + /* Move back one space. */ + n--; + n += snprintf(msg + n, len - n, "]"); } } diff --git a/drivers/edac/skx_common.h b/drivers/edac/skx_common.h index cf3d0aac035a..8f0f4af2cb27 100644 --- a/drivers/edac/skx_common.h +++ b/drivers/edac/skx_common.h @@ -83,6 +83,8 @@ #define NUM_RRL_SET 3 /* Max RRL registers per set. */ #define NUM_RRL_REG 6 +/* Max correctable error count registers. */ +#define NUM_CECNT_REG 4 /* Modes of RRL register set. */ enum rrl_mode { @@ -99,16 +101,23 @@ enum rrl_mode { /* RRL registers per {,sub-,pseudo-}channel. */ struct reg_rrl { /* RRL register parts. */ - int set_num; + int set_num, reg_num; enum rrl_mode modes[NUM_RRL_SET]; u32 offsets[NUM_RRL_SET][NUM_RRL_REG]; /* RRL register widths in byte per set. */ u8 widths[NUM_RRL_REG]; /* RRL control bits of the first register per set. */ + u32 v_mask; u32 uc_mask; + u32 over_mask; u32 en_patspr_mask; u32 noover_mask; u32 en_mask; + + /* CORRERRCNT register parts. 
*/ + int cecnt_num; + u32 cecnt_offsets[NUM_CECNT_REG]; + u8 cecnt_widths[NUM_CECNT_REG]; }; /* From patchwork Thu Apr 17 15:07:24 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Zhuo, Qiuxu" X-Patchwork-Id: 14055771 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 412B1253B78; Thu, 17 Apr 2025 15:09:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.13 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744902568; cv=none; b=PVXLSZZ14kEWiRX9er0LIUlXOiNxtvsPXy3VmE/REVpOhbL2lx9F9hYIu3shqq/10XipmPmCuti62qNdFLbrWzqk1QsQ8bvcVK/IAk4bbKl2U7/Bbnglx7R2r9zPPZbAxs/AmbjIZn3qpkgo1RGYYTPNvLcAo8M39Nq7hfUnMmc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744902568; c=relaxed/simple; bh=AdFC9QPAdrG+ZuiF0LMjbfb7m0Ibpa1So+c1GMQJ9Zc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=fbEBockRAhHBbiS7JUQjNJ+eCr+BagbbuFTfir6wsQZYkDBoLVv3CcJt35Ne8t1i/nkIzcNlS3cxeqMwA9nb0Yx9CEJHFdzd2b6BMYeL5g2p65afpV46jg4UE8F0m6+kVRCEsB2QYAMiUzVBk4szdyPtIUseo9dep77vhipk8eY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=OIfd5xhx; arc=none smtp.client-ip=198.175.65.13 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="OIfd5xhx" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; 
q=dns/txt; s=Intel; t=1744902567; x=1776438567; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=AdFC9QPAdrG+ZuiF0LMjbfb7m0Ibpa1So+c1GMQJ9Zc=; b=OIfd5xhxf8vDbgbr31zSFqxCHDLhIUJpxasKBLdWqfdykwTFwNa98UMX +1wT9LN92MPNbuoA+kvXzHjQrKleFFJEq8jKC1A4MWaxxEIRIzMjWnNVs HYW7JDZqms7C5Jas60jtrWxgKzjYvw+25y93b49zwDnLJ+tHOZaEebtPW zw2MmtHikTyVRhlEH2Aka7yoD7g4i/fLlUUpEOeTpGc1AxRQy6vBSxEB4 mTdR59Ggwky0QnIGPQAwiRi7XwMKESx3S2dkQaYwMrqQ6x5itCE2NLdjC eMx+Z68b/z73wvVE6wX0PabMmsTQN+xaG73d1VV5/OeYoj39T0tRWuc3+ Q==; X-CSE-ConnectionGUID: atWwOe9GTiikVPDfalxlYw== X-CSE-MsgGUID: 4MyW9Dx+TO+9sdNr6qjhgw== X-IronPort-AV: E=McAfee;i="6700,10204,11406"; a="57488756" X-IronPort-AV: E=Sophos;i="6.15,219,1739865600"; d="scan'208";a="57488756" Received: from fmviesa001.fm.intel.com ([10.60.135.141]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Apr 2025 08:09:27 -0700 X-CSE-ConnectionGUID: BXXrn+CxTlak/hFwHn9DmA== X-CSE-MsgGUID: 8thGXoyRTOeuF5R7/Ihm/w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,219,1739865600"; d="scan'208";a="161876948" Received: from qiuxu-clx.sh.intel.com ([10.239.53.109]) by smtpauth.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Apr 2025 08:09:24 -0700 From: Qiuxu Zhuo To: Tony Luck Cc: Qiuxu Zhuo , Feng Xu , Borislav Petkov , James Morse , Mauro Carvalho Chehab , Robert Richter , Yi Lai , Shawn Fan , linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 7/7] EDAC/{skx_common,i10nm}: Add RRL support for Intel Granite Rapids server Date: Thu, 17 Apr 2025 23:07:24 +0800 Message-ID: <20250417150724.1170168-8-qiuxu.zhuo@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250417150724.1170168-1-qiuxu.zhuo@intel.com> References: <20250417150724.1170168-1-qiuxu.zhuo@intel.com> Precedence: bulk X-Mailing-List: linux-edac@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Compared to previous generations, Granite Rapids defines the RRL 
control bits {en_patspr, noover, en} in different positions, adds an extra RRL set for the new mode of the first patrol-scrub read error, and extends the number of CORRERRCNT registers from 4 to 8, encoding one counter per CORRERRCNT register. Add a Granite Rapids reg_rrl configuration table and adjust the code to accommodate the differences mentioned above for RRL support. Tested-by: Feng Xu Signed-off-by: Qiuxu Zhuo --- drivers/edac/i10nm_base.c | 37 +++++++++++++++++++++++++++++++++++-- drivers/edac/skx_common.h | 4 ++-- 2 files changed, 37 insertions(+), 4 deletions(-) diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c index aefc448283d3..8863f1fb4caf 100644 --- a/drivers/edac/i10nm_base.c +++ b/drivers/edac/i10nm_base.c @@ -164,6 +164,29 @@ static struct reg_rrl spr_reg_rrl_hbm_pch1 = { .cecnt_widths = {4, 4, 4, 4}, }; +static struct reg_rrl gnr_reg_rrl_ddr = { + .set_num = 4, + .reg_num = 6, + .modes = {FRE_SCRUB, FRE_DEMAND, LRE_SCRUB, LRE_DEMAND}, + .offsets = { + {0x2f10, 0x2f20, 0x2f30, 0x2f50, 0x2f60, 0xba0}, + {0x2f14, 0x2f24, 0x2f38, 0x2f54, 0x2f64, 0xba8}, + {0x2f18, 0x2f28, 0x2f40, 0x2f58, 0x2f68, 0xbb0}, + {0x2f1c, 0x2f2c, 0x2f48, 0x2f5c, 0x2f6c, 0xbb8}, + }, + .widths = {4, 4, 8, 4, 4, 8}, + .v_mask = BIT(0), + .uc_mask = BIT(1), + .over_mask = BIT(2), + .en_patspr_mask = BIT(14), + .noover_mask = BIT(15), + .en_mask = BIT(12), + + .cecnt_num = 8, + .cecnt_offsets = {0x2c10, 0x2c14, 0x2c18, 0x2c1c, 0x2c20, 0x2c24, 0x2c28, 0x2c2c}, + .cecnt_widths = {4, 4, 4, 4, 4, 4, 4, 4}, +}; + static u64 read_imc_reg(struct skx_imc *imc, int chan, u32 offset, u8 width) { switch (width) { @@ -353,8 +376,17 @@ static void show_retry_rd_err_log(struct decoded_addr *res, char *msg, width = rrl->cecnt_widths[i]; corr = read_imc_reg(imc, ch, offset, width); - n += snprintf(msg + n, len - n, "%.4llx %.4llx ", - corr & 0xffff, corr >> 16); + /* CPUs {ICX,SPR} encode two counters per 4-byte CORRERRCNT register. 
*/ + if (res_cfg->type <= SPR) { + n += snprintf(msg + n, len - n, "%.4llx %.4llx ", + corr & 0xffff, corr >> 16); + } else { + /* CPUs {GNR} encode one counter per CORRERRCNT register. */ + if (width == 4) + n += snprintf(msg + n, len - n, "%.8llx ", corr); + else + n += snprintf(msg + n, len - n, "%.16llx ", corr); + } } /* Move back one space. */ @@ -985,6 +1017,7 @@ static struct res_config gnr_cfg = { .uracu_bdf = {0, 0, 1}, .ddr_mdev_bdf = {0, 5, 1}, .sad_all_offset = 0x300, + .reg_rrl_ddr = &gnr_reg_rrl_ddr, }; static const struct x86_cpu_id i10nm_cpuids[] = { diff --git a/drivers/edac/skx_common.h b/drivers/edac/skx_common.h index 8f0f4af2cb27..ec4966f7ea40 100644 --- a/drivers/edac/skx_common.h +++ b/drivers/edac/skx_common.h @@ -80,11 +80,11 @@ #define MCACOD_EXT_MEM_ERR 0x280 /* Max RRL register sets per {,sub-,pseudo-}channel. */ -#define NUM_RRL_SET 3 +#define NUM_RRL_SET 4 /* Max RRL registers per set. */ #define NUM_RRL_REG 6 /* Max correctable error count registers. */ -#define NUM_CECNT_REG 4 +#define NUM_CECNT_REG 8 /* Modes of RRL register set. */ enum rrl_mode {