Message ID | 20240404151359.47970-7-yazen.ghannam@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | MCA Updates | expand |
On Thu, Apr 04, 2024 at 10:13:49AM -0500, Yazen Ghannam wrote: > Scalable MCA systems use the per-bank MCA_CONFIG register to enable > deferred error interrupts. This is done as part of SMCA configuration. > > Currently, the deferred error interrupt handler is set up after SMCA > configuration. > > Move the deferred error interrupt handler set up before SMCA > configuration. This ensures the kernel is ready to receive the > interrupts before the hardware is configured to send them. > > Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> > --- > > Notes: > Link: > https://lkml.kernel.org/r/20231118193248.1296798-11-yazen.ghannam@amd.com > > v1->v2: > * No change. > > arch/x86/kernel/cpu/mce/amd.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c > index 3093fed06194..e8e78d91082b 100644 > --- a/arch/x86/kernel/cpu/mce/amd.c > +++ b/arch/x86/kernel/cpu/mce/amd.c > @@ -589,6 +589,9 @@ static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c) > u32 low = 0, high = 0; > int def_offset = -1, def_new; > > + if (!mce_flags.succor) Does succor imply smca? Because you have now this order: deferred_error_interrupt_enable(c); ... configure_smca(bank); and that one tests mce_flags.smca Now, if succor didn't imply smca, we'll enable the DF irq handler for no good reason on (succor=true && smca=false) systems. If the implication is given: Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
On 4/24/2024 2:34 PM, Borislav Petkov wrote: > On Thu, Apr 04, 2024 at 10:13:49AM -0500, Yazen Ghannam wrote: >> Scalable MCA systems use the per-bank MCA_CONFIG register to enable >> deferred error interrupts. This is done as part of SMCA configuration. >> >> Currently, the deferred error interrupt handler is set up after SMCA >> configuration. >> >> Move the deferred error interrupt handler set up before SMCA >> configuration. This ensures the kernel is ready to receive the >> interrupts before the hardware is configured to send them. >> >> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> >> --- >> >> Notes: >> Link: >> https://lkml.kernel.org/r/20231118193248.1296798-11-yazen.ghannam@amd.com >> >> v1->v2: >> * No change. >> >> arch/x86/kernel/cpu/mce/amd.c | 7 ++++--- >> 1 file changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c >> index 3093fed06194..e8e78d91082b 100644 >> --- a/arch/x86/kernel/cpu/mce/amd.c >> +++ b/arch/x86/kernel/cpu/mce/amd.c >> @@ -589,6 +589,9 @@ static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c) >> u32 low = 0, high = 0; >> int def_offset = -1, def_new; >> >> + if (!mce_flags.succor) > > Does succor imply smca? > > Because you have now this order: > > deferred_error_interrupt_enable(c); > > ... > > configure_smca(bank); > > and that one tests mce_flags.smca > > Now, if succor didn't imply smca, we'll enable the DF irq handler for > no good reason on (succor=true && smca=false) systems. > > If the implication is given: > > Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> > They are independent features. SUCCOR is the feature that defines the deferred error interrupt and data poisoning. This predates SMCA. SUCCOR was introduced in the later Family 15h systems. Thanks, Yazen
On Thu, Apr 25, 2024 at 09:31:58AM -0400, Yazen Ghannam wrote: > They are independent features. SUCCOR is the feature that defines the deferred > error interrupt and data poisoning. This predates SMCA. SUCCOR was introduced > in the later Family 15h systems. ... and as we've established, it is not really enabled on them for whatever reason so for simplicity's sake, we'll simply assume that it was enabled together with SMCA and that would simplify a lot of things in the code. Please put that in the commit message so that we know. Thx.
On 4/29/2024 8:38 AM, Borislav Petkov wrote: > On Thu, Apr 25, 2024 at 09:31:58AM -0400, Yazen Ghannam wrote: >> They are independent features. SUCCOR is the feature that defines the deferred >> error interrupt and data poisoning. This predates SMCA. SUCCOR was introduced >> in the later Family 15h systems. > > ... and as we've established, it is not really enabled on them for > whatever reason so for simplicity's sake, we'll simply assume that it > was enabled together with SMCA and that would simplify a lot of things > in the code. > > Please put that in the commit message so that we know. > > Thx. > Okay, will do. Thanks, Yazen
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 3093fed06194..e8e78d91082b 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -589,6 +589,9 @@ static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c) u32 low = 0, high = 0; int def_offset = -1, def_new; + if (!mce_flags.succor) + return; + if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high)) return; @@ -768,6 +771,7 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c) u32 low = 0, high = 0, address = 0; int offset = -1; + deferred_error_interrupt_enable(c); for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) { if (mce_flags.smca) @@ -794,9 +798,6 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c) offset = prepare_threshold_block(bank, block, address, offset, high); } } - - if (mce_flags.succor) - deferred_error_interrupt_enable(c); } /*
Scalable MCA systems use the per-bank MCA_CONFIG register to enable deferred error interrupts. This is done as part of SMCA configuration. Currently, the deferred error interrupt handler is set up after SMCA configuration. Move the deferred error interrupt handler set up before SMCA configuration. This ensures the kernel is ready to receive the interrupts before the hardware is configured to send them. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> --- Notes: Link: https://lkml.kernel.org/r/20231118193248.1296798-11-yazen.ghannam@amd.com v1->v2: * No change. arch/x86/kernel/cpu/mce/amd.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)