diff mbox series

[02/20] x86/mce: Define mce_setup() helpers for global and per-CPU fields

Message ID 20231118193248.1296798-3-yazen.ghannam@amd.com (mailing list archive)
State Handled Elsewhere
Series MCA Updates

Commit Message

Yazen Ghannam Nov. 18, 2023, 7:32 p.m. UTC
Generally, MCA information for an error is gathered on the CPU that
reported the error. In this case, CPU-specific information from the
running CPU will be correct.

However, that information will be incorrect if the MCA information is
gathered while running on a CPU that didn't report the error. One example
is creating an MCA record using mce_setup() for errors reported from ACPI.

Split mce_setup() so that there is a helper function to gather global,
i.e. not CPU-specific, information and another helper for CPU-specific
information.

Don't set the CPU number in either helper function. This will be set
appropriately for each call site of the helpers.

Leave mce_setup() defined as-is for the common case when running on the
reporting CPU.

Get MCG_CAP in the global helper even though the register is per-CPU.
Unlike the other values, it is not already cached per-CPU, and it does
not assist with any per-CPU decoding or handling.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
 arch/x86/kernel/cpu/mce/core.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

Comments

Borislav Petkov Nov. 22, 2023, 6:24 p.m. UTC | #1
On Sat, Nov 18, 2023 at 01:32:30PM -0600, Yazen Ghannam wrote:
> +void mce_setup_global(struct mce *m)

We usually call those things "common":

mce_setup_common().

> +{
> +	memset(m, 0, sizeof(struct mce));
> +
> +	m->cpuid	= cpuid_eax(1);
> +	m->cpuvendor	= boot_cpu_data.x86_vendor;
> +	m->mcgcap	= __rdmsr(MSR_IA32_MCG_CAP);
> +	/* need the internal __ version to avoid deadlocks */
> +	m->time		= __ktime_get_real_seconds();
> +}
> +
> +void mce_setup_per_cpu(struct mce *m)

And call this

	mce_setup_for_cpu(unsigned int cpu, struct mce *m);

so that it doesn't look like some per_cpu helper.

And yes, you should supply the CPU number as an argument. Because
otherwise, when you look at your next change:


+       mce_setup_global(&m);
+       m.cpu = m.extcpu = cpu;
+       mce_setup_per_cpu(&m);

This contains the "hidden" requirement that the m.extcpu assignment *always*
happens *before* the mce_setup_per_cpu() call, and that is flaky and error prone.

So make that:

	mce_setup_common(&m);
	mce_setup_for_cpu(m.extcpu, &m);

and do m.cpu = m.extcpu = cpu inside the second function.

And then it JustWorks(tm) and you can't "forget" assigning m.extcpu and
there's no subtlety.

Ok?
Yazen Ghannam Nov. 27, 2023, 2:52 p.m. UTC | #2
On 11/22/2023 1:24 PM, Borislav Petkov wrote:
> On Sat, Nov 18, 2023 at 01:32:30PM -0600, Yazen Ghannam wrote:
>> +void mce_setup_global(struct mce *m)
> 
> We usually call those things "common":
> 
> mce_setup_common().
> 
>> +{
>> +	memset(m, 0, sizeof(struct mce));
>> +
>> +	m->cpuid	= cpuid_eax(1);
>> +	m->cpuvendor	= boot_cpu_data.x86_vendor;
>> +	m->mcgcap	= __rdmsr(MSR_IA32_MCG_CAP);
>> +	/* need the internal __ version to avoid deadlocks */
>> +	m->time		= __ktime_get_real_seconds();
>> +}
>> +
>> +void mce_setup_per_cpu(struct mce *m)
> 
> And call this
> 
> 	mce_setup_for_cpu(unsigned int cpu, struct mce *m);
> 
> so that it doesn't look like some per_cpu helper.
> 
> And yes, you should supply the CPU number as an argument. Because
> otherwise, when you look at your next change:
> 
> 
> +       mce_setup_global(&m);
> +       m.cpu = m.extcpu = cpu;
> +       mce_setup_per_cpu(&m);
> 
> This contains the "hidden" requirement that the m.extcpu assignment *always*
> happens *before* the mce_setup_per_cpu() call, and that is flaky and error prone.
> 
> So make that:
> 
> 	mce_setup_common(&m);
> 	mce_setup_for_cpu(m.extcpu, &m);
> 
> and do m.cpu = m.extcpu = cpu inside the second function.
> 
> And then it JustWorks(tm) and you can't "forget" assigning m.extcpu and
> there's no subtlety.
> 
> Ok?
> 

Yep, understood. Thanks!

-Yazen
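
For illustration, the interface agreed on in the review can be sketched in
userspace C. This is only a sketch under stated assumptions, not the kernel
implementation: struct mce_sketch, struct cpu_info_sketch, sketch_cpu_data[]
and SKETCH_NR_CPUS are hypothetical stand-ins for struct mce, cpu_data() and
the real topology data, and the table values are made up. The common helper
only zeroes the record here, because the CPUID, MSR and timekeeping reads are
not available outside the kernel. The CPU number is passed as an argument and
assigned inside the per-CPU helper, so no caller can forget to set it first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Hypothetical stand-in for the per-CPU data behind cpu_data(cpu). */
struct cpu_info_sketch {
	uint32_t apicid;
	uint32_t microcode;
	uint64_t ppin;
	uint32_t socketid;
};

/* Made-up values, two CPUs per socket. */
const struct cpu_info_sketch sketch_cpu_data[SKETCH_NR_CPUS] = {
	{ 0x00, 0x100, 0xaaaa, 0 },
	{ 0x01, 0x100, 0xaaaa, 0 },
	{ 0x10, 0x101, 0xbbbb, 1 },
	{ 0x11, 0x101, 0xbbbb, 1 },
};

/* Hypothetical subset of struct mce holding only the per-CPU fields. */
struct mce_sketch {
	uint32_t cpu;
	uint32_t extcpu;
	uint32_t apicid;
	uint32_t microcode;
	uint64_t ppin;
	uint32_t socketid;
};

/* Common part: clear the record. The kernel helper also fills global fields. */
void mce_setup_common_sketch(struct mce_sketch *m)
{
	memset(m, 0, sizeof(*m));
}

/* Per-CPU part: the CPU number is set here, never by the caller. */
void mce_setup_for_cpu_sketch(unsigned int cpu, struct mce_sketch *m)
{
	assert(cpu < SKETCH_NR_CPUS);

	m->cpu = m->extcpu = cpu;
	m->apicid	= sketch_cpu_data[cpu].apicid;
	m->microcode	= sketch_cpu_data[cpu].microcode;
	m->ppin		= sketch_cpu_data[cpu].ppin;
	m->socketid	= sketch_cpu_data[cpu].socketid;
}
```

A call site then reduces to mce_setup_common_sketch(&m) followed by
mce_setup_for_cpu_sketch(cpu, &m), with no ordering requirement left between
an open-coded assignment and the helper.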

Patch

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 1642018dd6c9..7e86086aa19c 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -115,20 +115,31 @@  static struct irq_work mce_irq_work;
  */
 BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain);
 
+void mce_setup_global(struct mce *m)
+{
+	memset(m, 0, sizeof(struct mce));
+
+	m->cpuid	= cpuid_eax(1);
+	m->cpuvendor	= boot_cpu_data.x86_vendor;
+	m->mcgcap	= __rdmsr(MSR_IA32_MCG_CAP);
+	/* need the internal __ version to avoid deadlocks */
+	m->time		= __ktime_get_real_seconds();
+}
+
+void mce_setup_per_cpu(struct mce *m)
+{
+	m->apicid		= cpu_data(m->extcpu).topo.initial_apicid;
+	m->microcode		= cpu_data(m->extcpu).microcode;
+	m->ppin			= cpu_data(m->extcpu).ppin;
+	m->socketid		= cpu_data(m->extcpu).topo.pkg_id;
+}
+
 /* Do initial initialization of a struct mce */
 void mce_setup(struct mce *m)
 {
-	memset(m, 0, sizeof(struct mce));
+	mce_setup_global(m);
 	m->cpu = m->extcpu = smp_processor_id();
-	/* need the internal __ version to avoid deadlocks */
-	m->time = __ktime_get_real_seconds();
-	m->cpuvendor = boot_cpu_data.x86_vendor;
-	m->cpuid = cpuid_eax(1);
-	m->socketid = cpu_data(m->extcpu).topo.pkg_id;
-	m->apicid = cpu_data(m->extcpu).topo.initial_apicid;
-	m->mcgcap = __rdmsr(MSR_IA32_MCG_CAP);
-	m->ppin = cpu_data(m->extcpu).ppin;
-	m->microcode = boot_cpu_data.microcode;
+	mce_setup_per_cpu(m);
 }
 
 DEFINE_PER_CPU(struct mce, injectm);