Message ID | 20250213-wip-mca-updates-v2-15-3636547fe05f@amd.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | AMD MCA interrupts rework | expand |
On Thu, Feb 13, 2025 at 04:46:04PM +0000, Yazen Ghannam wrote: > @@ -306,6 +306,11 @@ static void smca_configure(unsigned int bank, unsigned int cpu) > high |= BIT(5); > } > > + if ((low & BIT(10)) && data->thr_intr_en) { > + __set_bit(bank, data->thr_intr_banks); > + high |= BIT(8); Maybe someday add some #define names for these BITs? > + } > + > this_cpu_ptr(mce_banks_array)[bank].lsb_in_status = !!(low & BIT(8)); > > wrmsr(smca_config, low, high); -Tony
On Thu, Feb 13, 2025 at 02:34:59PM -0800, Luck, Tony wrote: > On Thu, Feb 13, 2025 at 04:46:04PM +0000, Yazen Ghannam wrote: > > @@ -306,6 +306,11 @@ static void smca_configure(unsigned int bank, unsigned int cpu) > > high |= BIT(5); > > } > > > > + if ((low & BIT(10)) && data->thr_intr_en) { > > + __set_bit(bank, data->thr_intr_banks); > > + high |= BIT(8); > > Maybe someday add some #define names for these BITs? > Yep, I have something like that: https://lore.kernel.org/r/20240404151359.47970-6-yazen.ghannam@amd.com https://lore.kernel.org/r/20240404151359.47970-13-yazen.ghannam@amd.com These sets keep growing, so I'm trying to do better about breaking things up and prioritizing. :) Thanks, Yazen
> From: Yazen Ghannam <yazen.ghannam@amd.com> > [...] > --- a/arch/x86/kernel/cpu/mce/amd.c > +++ b/arch/x86/kernel/cpu/mce/amd.c > @@ -306,6 +306,11 @@ static void smca_configure(unsigned int bank, > unsigned int cpu) > high |= BIT(5); > } > Do you think it would be better to have some comments on the thresholding interrupt here (as the bits 10/8 look like magic numbers), similar to the comments above for the deferred interrupt? > + if ((low & BIT(10)) && data->thr_intr_en) { > + __set_bit(bank, data->thr_intr_banks); > + high |= BIT(8); > + } > + > [...]
On Tue, Feb 18, 2025 at 01:27:04PM +0000, Zhuo, Qiuxu wrote: > > From: Yazen Ghannam <yazen.ghannam@amd.com> > > [...] > > --- a/arch/x86/kernel/cpu/mce/amd.c > > +++ b/arch/x86/kernel/cpu/mce/amd.c > > @@ -306,6 +306,11 @@ static void smca_configure(unsigned int bank, > > unsigned int cpu) > > high |= BIT(5); > > } > > > > Do you think it would be better to have some comments on the thresholding > interrupt here (as the bits 10/8 look like magic numbers), similar to the > comments above for the deferred interrupt? > > > + if ((low & BIT(10)) && data->thr_intr_en) { > > + __set_bit(bank, data->thr_intr_banks); > > + high |= BIT(8); > > + } > > + > > [...] Yes, I'd like to convert these to defines. I left that idea, for now, to simplify the changes. Thanks, Yazen
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 75a48f484815..404e0c38f9d8 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -306,6 +306,11 @@ static void smca_configure(unsigned int bank, unsigned int cpu) high |= BIT(5); } + if ((low & BIT(10)) && data->thr_intr_en) { + __set_bit(bank, data->thr_intr_banks); + high |= BIT(8); + } + this_cpu_ptr(mce_banks_array)[bank].lsb_in_status = !!(low & BIT(8)); wrmsr(smca_config, low, high);
AMD systems optionally support MCA thresholding which provides the ability for hardware to send an interrupt when a set error threshold is reached. This feature counts errors of all severities, but it is commonly used to report correctable errors with an interrupt rather than polling. Scalable MCA systems allow the Platform to take control of this feature. In this case, the OS will not see the feature configuration and control bits in the MCA_MISC* registers. The OS will not receive the MCA thresholding interrupt, and it will need to poll for correctable errors. A "corrected error interrupt" will be available on Scalable MCA systems. This will be used in the same configuration where the Platform controls MCA thresholding. However, the Platform will now be able to send the MCA thresholding interrupt to the OS. Check for the feature bit in the MCA_CONFIG register and confirm that the MCA thresholding interrupt handler is already enabled. If successful, set the feature enable bit in the MCA_CONFIG register to indicate to the Platform that the OS is ready for the interrupt. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> --- Notes: Link: https://lore.kernel.org/r/20240523155641.2805411-10-yazen.ghannam@amd.com v1->v2: * Use new per-CPU struct. arch/x86/kernel/cpu/mce/amd.c | 5 +++++ 1 file changed, 5 insertions(+)