diff mbox series

[v2,15/16] x86/mce/amd: Support SMCA Corrected Error Interrupt

Message ID 20250213-wip-mca-updates-v2-15-3636547fe05f@amd.com (mailing list archive)
State New
Headers show
Series AMD MCA interrupts rework | expand

Commit Message

Yazen Ghannam Feb. 13, 2025, 4:46 p.m. UTC
AMD systems optionally support MCA thresholding which provides the
ability for hardware to send an interrupt when a set error threshold is
reached. This feature counts errors of all severities, but it is
commonly used to report correctable errors with an interrupt rather than
polling.

Scalable MCA systems allow the Platform to take control of this feature.
In this case, the OS will not see the feature configuration and control
bits in the MCA_MISC* registers. The OS will not receive the MCA
thresholding interrupt, and it will need to poll for correctable errors.

A "corrected error interrupt" will be available on Scalable MCA systems.
This will be used in the same configuration where the Platform controls
MCA thresholding. However, the Platform will now be able to send the
MCA thresholding interrupt to the OS.

Check for the feature bit in the MCA_CONFIG register and confirm that
the MCA thresholding interrupt handler is already enabled. If successful,
set the feature enable bit in the MCA_CONFIG register to indicate to the
Platform that the OS is ready for the interrupt.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---

Notes:
    Link:
    https://lore.kernel.org/r/20240523155641.2805411-10-yazen.ghannam@amd.com
    
    v1->v2:
    * Use new per-CPU struct.

 arch/x86/kernel/cpu/mce/amd.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Luck, Tony Feb. 13, 2025, 10:34 p.m. UTC | #1
On Thu, Feb 13, 2025 at 04:46:04PM +0000, Yazen Ghannam wrote:
> @@ -306,6 +306,11 @@ static void smca_configure(unsigned int bank, unsigned int cpu)
>  			high |= BIT(5);
>  		}
>  
> +		if ((low & BIT(10)) && data->thr_intr_en) {
> +			__set_bit(bank, data->thr_intr_banks);
> +			high |= BIT(8);

Maybe someday add some #define names for these BITs?

> +		}
> +
>  		this_cpu_ptr(mce_banks_array)[bank].lsb_in_status = !!(low & BIT(8));
>  
>  		wrmsr(smca_config, low, high);

-Tony
Yazen Ghannam Feb. 17, 2025, 2:06 p.m. UTC | #2
On Thu, Feb 13, 2025 at 02:34:59PM -0800, Luck, Tony wrote:
> On Thu, Feb 13, 2025 at 04:46:04PM +0000, Yazen Ghannam wrote:
> > @@ -306,6 +306,11 @@ static void smca_configure(unsigned int bank, unsigned int cpu)
> >  			high |= BIT(5);
> >  		}
> >  
> > +		if ((low & BIT(10)) && data->thr_intr_en) {
> > +			__set_bit(bank, data->thr_intr_banks);
> > +			high |= BIT(8);
> 
> Maybe someday add some #define names for these BITs?
> 

Yep, I have something like that:
https://lore.kernel.org/r/20240404151359.47970-6-yazen.ghannam@amd.com
https://lore.kernel.org/r/20240404151359.47970-13-yazen.ghannam@amd.com

These sets keep growing, so I'm trying to do better about breaking
things up and prioritizing. :)

Thanks,
Yazen
Zhuo, Qiuxu Feb. 18, 2025, 1:27 p.m. UTC | #3
> From: Yazen Ghannam <yazen.ghannam@amd.com>
> [...]
> --- a/arch/x86/kernel/cpu/mce/amd.c
> +++ b/arch/x86/kernel/cpu/mce/amd.c
> @@ -306,6 +306,11 @@ static void smca_configure(unsigned int bank,
> unsigned int cpu)
>  			high |= BIT(5);
>  		}
> 

Do you think it would be better to have some comments on the thresholding 
interrupt here (as the bits 10/8 look like magic numbers), similar to the 
comments above for the deferred interrupt?

> +		if ((low & BIT(10)) && data->thr_intr_en) {
> +			__set_bit(bank, data->thr_intr_banks);
> +			high |= BIT(8);
> +		}
> +
> [...]
Yazen Ghannam Feb. 19, 2025, 4:19 p.m. UTC | #4
On Tue, Feb 18, 2025 at 01:27:04PM +0000, Zhuo, Qiuxu wrote:
> > From: Yazen Ghannam <yazen.ghannam@amd.com>
> > [...]
> > --- a/arch/x86/kernel/cpu/mce/amd.c
> > +++ b/arch/x86/kernel/cpu/mce/amd.c
> > @@ -306,6 +306,11 @@ static void smca_configure(unsigned int bank,
> > unsigned int cpu)
> >  			high |= BIT(5);
> >  		}
> > 
> 
> Do you think it would be better to have some comments on the thresholding 
> interrupt here (as the bits 10/8 look like magic numbers), similar to the 
> comments above for the deferred interrupt?
> 
> > +		if ((low & BIT(10)) && data->thr_intr_en) {
> > +			__set_bit(bank, data->thr_intr_banks);
> > +			high |= BIT(8);
> > +		}
> > +
> > [...]

Yes, I'd like to convert these to defines. I left that idea, for now, to
simplify the changes.

Thanks,
Yazen
diff mbox series

Patch

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 75a48f484815..404e0c38f9d8 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -306,6 +306,11 @@  static void smca_configure(unsigned int bank, unsigned int cpu)
 			high |= BIT(5);
 		}
 
+		if ((low & BIT(10)) && data->thr_intr_en) {
+			__set_bit(bank, data->thr_intr_banks);
+			high |= BIT(8);
+		}
+
 		this_cpu_ptr(mce_banks_array)[bank].lsb_in_status = !!(low & BIT(8));
 
 		wrmsr(smca_config, low, high);