Message ID | 20200214123407.4184-1-prarit@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/mce: Do not log spurious corrected mce errors | expand |
On Fri, Feb 14, 2020 at 07:34:07AM -0500, Prarit Bhargava wrote: > #ifdef CONFIG_X86_MCE_AMD > extern bool amd_filter_mce(struct mce *m); > +extern bool intel_filter_mce(struct mce *m); > #else Something very weird is going on here. Why does CONFIG_X86_MCE_AMD have to be set to enable some *Intel* filter operation? -Tony
On 2/14/20 12:36 PM, Luck, Tony wrote: > On Fri, Feb 14, 2020 at 07:34:07AM -0500, Prarit Bhargava wrote: >> #ifdef CONFIG_X86_MCE_AMD >> extern bool amd_filter_mce(struct mce *m); >> +extern bool intel_filter_mce(struct mce *m); >> #else > > Something very weird is going on here. Why does > CONFIG_X86_MCE_AMD have to be set to enable some > *Intel* filter operation? That's a mistake. I'll fix that in v2. P. > > -Tony >
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 2c4f949611e4..fe3983d551cc 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -1877,6 +1877,8 @@ bool filter_mce(struct mce *m) { if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) return amd_filter_mce(m); + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) + return intel_filter_mce(m); return false; } diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c index 5627b1091b85..989148e6746c 100644 --- a/arch/x86/kernel/cpu/mce/intel.c +++ b/arch/x86/kernel/cpu/mce/intel.c @@ -520,3 +520,20 @@ void mce_intel_feature_clear(struct cpuinfo_x86 *c) { intel_clear_lmce(); } + +bool intel_filter_mce(struct mce *m) +{ + struct cpuinfo_x86 *c = &boot_cpu_data; + + /* MCE errata HSD131, HSM142, HSW131, BDM48, and HSM142 */ + if ((c->x86 == 6) && + ((c->x86_model == INTEL_FAM6_HASWELL) || + (c->x86_model == INTEL_FAM6_HASWELL_L) || + (c->x86_model == INTEL_FAM6_BROADWELL) || + (c->x86_model == INTEL_FAM6_HASWELL_G)) && + (m->bank == 0) && + ((m->status & 0xa0000000ffffffff) == 0x80000000000f0005)) + return true; + + return false; +} diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h index b785c0d0b590..2729db355b36 100644 --- a/arch/x86/kernel/cpu/mce/internal.h +++ b/arch/x86/kernel/cpu/mce/internal.h @@ -172,8 +172,10 @@ extern bool filter_mce(struct mce *m); #ifdef CONFIG_X86_MCE_AMD extern bool amd_filter_mce(struct mce *m); +extern bool intel_filter_mce(struct mce *m); #else static inline bool amd_filter_mce(struct mce *m) { return false; }; +static inline bool intel_filter_mce(struct mce *m) { return false; }; #endif #endif /* __X86_MCE_INTERNAL_H__ */
Alex has reported that he is seeing spurious corrected errors on his hardware. Intel Errata HSD131, HSM142, HSW131, and BDM48 report that "spurious corrected errors may be logged in the IA32_MC0_STATUS register with the valid field (bit 63) set, the uncorrected error field (bit 61) not set, a Model Specific Error Code (bits [31:16]) of 0x000F, and an MCA Error Code (bits [15:0]) of 0x0005." Block these spurious errors from the console and logs. Links to Intel Specification updates: HSD131: https://www.intel.com/content/www/us/en/products/docs/processors/core/4th-gen-core-family-desktop-specification-update.html HSM142: https://www.intel.com/content/www/us/en/products/docs/processors/core/4th-gen-core-family-mobile-specification-update.html HSW131: https://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200v3-spec-update.html BDM48: https://www.intel.com/content/www/us/en/products/docs/processors/core/5th-gen-core-family-spec-update.html Signed-off-by: Prarit Bhargava <prarit@redhat.com> Co-developed-by: Alexander Krupp <centos@akr.yagii.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-edac@vger.kernel.org --- arch/x86/kernel/cpu/mce/core.c | 2 ++ arch/x86/kernel/cpu/mce/intel.c | 17 +++++++++++++++++ arch/x86/kernel/cpu/mce/internal.h | 2 ++ 3 files changed, 21 insertions(+)