diff mbox series

[v3] x86/amd: Address AMD erratum #1485

Message ID 20231013153801.188324-1-alejandro.vallejo@cloud.com (mailing list archive)
State New, archived
Headers show
Series [v3] x86/amd: Address AMD erratum #1485 | expand

Commit Message

Alejandro Vallejo Oct. 13, 2023, 3:38 p.m. UTC
Fix adapted off Linux's mailing list:
  https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Release-acked-by: Henry Wang <Henry.Wang@arm.com>
---
v3:
  * Factored MSR shuffling into a separate function and moved the call to
    the end of init_amd() rather than having it in the middle
  * Style changes
  * Added comment about the unavoidable wrmsr race
  * Avoid wrmsrl() if the chickenbit already set
  * Removed the "ull" suffix as it's pointless with a local variable and it
    violates a MISRA 7.3.

Pipeline pending:
    https://gitlab.com/xen-project/people/agvallejo/xen/-/pipelines/1036125869
---
 xen/arch/x86/cpu/amd.c               | 23 +++++++++++++++++++++++
 xen/arch/x86/include/asm/amd.h       |  5 +++++
 xen/arch/x86/include/asm/msr-index.h |  1 +
 3 files changed, 29 insertions(+)

Comments

Jan Beulich Oct. 17, 2023, 7:44 a.m. UTC | #1
On 13.10.2023 17:38, Alejandro Vallejo wrote:
> Fix adapted off Linux's mailing list:
>   https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u

Why reference the bug report when there's a proper commit (f454b18e07f5) now?
Plus in any event a short summary of the erratum would help if put right here
(without needing to look up any documents or follow any links).

> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
>  }
>  
> +static void amd_check_erratum_1485(void)
> +{
> +	uint64_t val, chickenbit = (1 << 5);

Linux gives the bit a name. Any reason you don't?

Everything else lgtm.

Jan
Andrew Cooper Oct. 17, 2023, 7:50 a.m. UTC | #2
On 17/10/2023 8:44 am, Jan Beulich wrote:
> On 13.10.2023 17:38, Alejandro Vallejo wrote:
>> Fix adapted off Linux's mailing list:
>>   https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u
> Why reference the bug report when there's a proper commit (f454b18e07f5) now?
> Plus in any event a short summary of the erratum would help if put right here
> (without needing to look up any documents or follow any links).

That is not public information yet.  The erratum number alone is the
best we can do at this juncture.
>> --- a/xen/arch/x86/cpu/amd.c
>> +++ b/xen/arch/x86/cpu/amd.c
>> @@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
>>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
>>  }
>>  
>> +static void amd_check_erratum_1485(void)
>> +{
>> +	uint64_t val, chickenbit = (1 << 5);
> Linux gives the bit a name. Any reason you don't?

There are multiple different names depending on where you look, and none
are particularly relevant here.

~Andrew
Roger Pau Monné Oct. 17, 2023, 9:13 a.m. UTC | #3
On Tue, Oct 17, 2023 at 08:50:45AM +0100, Andrew Cooper wrote:
> On 17/10/2023 8:44 am, Jan Beulich wrote:
> > On 13.10.2023 17:38, Alejandro Vallejo wrote:
> >> Fix adapted off Linux's mailing list:
> >>   https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u
> > Why reference the bug report when there's a proper commit (f454b18e07f5) now?
> > Plus in any event a short summary of the erratum would help if put right here
> > (without needing to look up any documents or follow any links).
> 
> That is not public information yet.  The erratum number alone is the
> best we can do at this juncture.
> >> --- a/xen/arch/x86/cpu/amd.c
> >> +++ b/xen/arch/x86/cpu/amd.c
> >> @@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
> >>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
> >>  }
> >>  
> >> +static void amd_check_erratum_1485(void)
> >> +{
> >> +	uint64_t val, chickenbit = (1 << 5);
> > Linux gives the bit a name. Any reason you don't?
> 
> There are multiple different names depending on where you look, and none
> are particularly relevant here.

Could we make chickenbit const static?

I would also use ULL just to be on the safe side, because we then copy
this for a different bit and it explodes.

(not strong requirements, but if a resend is needed it would be nice
to have).

Thanks, Roger.
Jan Beulich Oct. 17, 2023, 9:21 a.m. UTC | #4
On 17.10.2023 11:13, Roger Pau Monné wrote:
> On Tue, Oct 17, 2023 at 08:50:45AM +0100, Andrew Cooper wrote:
>> On 17/10/2023 8:44 am, Jan Beulich wrote:
>>> On 13.10.2023 17:38, Alejandro Vallejo wrote:
>>>> Fix adapted off Linux's mailing list:
>>>>   https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u
>>> Why reference the bug report when there's a proper commit (f454b18e07f5) now?
>>> Plus in any event a short summary of the erratum would help if put right here
>>> (without needing to look up any documents or follow any links).
>>
>> That is not public information yet.  The erratum number alone is the
>> best we can do at this juncture.
>>>> --- a/xen/arch/x86/cpu/amd.c
>>>> +++ b/xen/arch/x86/cpu/amd.c
>>>> @@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
>>>>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
>>>>  }
>>>>  
>>>> +static void amd_check_erratum_1485(void)
>>>> +{
>>>> +	uint64_t val, chickenbit = (1 << 5);
>>> Linux gives the bit a name. Any reason you don't?
>>
>> There are multiple different names depending on where you look, and none
>> are particularly relevant here.
> 
> Could we make chickenbit const static?
> 
> I would also use ULL just to be on the safe side, because we then copy
> this for a different bit and it explodes.

I guess the way it is resembles what we already have in amd_check_zenbleed().
Also it's not clear to me why besides "const" you also ask for "static".

Jan
Andrew Cooper Oct. 17, 2023, 9:23 a.m. UTC | #5
On 17/10/2023 10:13 am, Roger Pau Monné wrote:
> On Tue, Oct 17, 2023 at 08:50:45AM +0100, Andrew Cooper wrote:
>> On 17/10/2023 8:44 am, Jan Beulich wrote:
>>> On 13.10.2023 17:38, Alejandro Vallejo wrote:
>>>> Fix adapted off Linux's mailing list:
>>>>   https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u
>>> Why reference the bug report when there's a proper commit (f454b18e07f5) now?
>>> Plus in any event a short summary of the erratum would help if put right here
>>> (without needing to look up any documents or follow any links).
>> That is not public information yet.  The erratum number alone is the
>> best we can do at this juncture.
>>>> --- a/xen/arch/x86/cpu/amd.c
>>>> +++ b/xen/arch/x86/cpu/amd.c
>>>> @@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
>>>>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
>>>>  }
>>>>  
>>>> +static void amd_check_erratum_1485(void)
>>>> +{
>>>> +	uint64_t val, chickenbit = (1 << 5);
>>> Linux gives the bit a name. Any reason you don't?
>> There are multiple different names depending on where you look, and none
>> are particularly relevant here.
> Could we make chickenbit const static?

Why would we want to force something that's optimised to an instruction
immediate into a .data variable?

~Andrew
Roger Pau Monné Oct. 17, 2023, 9:33 a.m. UTC | #6
On Tue, Oct 17, 2023 at 11:21:47AM +0200, Jan Beulich wrote:
> On 17.10.2023 11:13, Roger Pau Monné wrote:
> > On Tue, Oct 17, 2023 at 08:50:45AM +0100, Andrew Cooper wrote:
> >> On 17/10/2023 8:44 am, Jan Beulich wrote:
> >>> On 13.10.2023 17:38, Alejandro Vallejo wrote:
> >>>> Fix adapted off Linux's mailing list:
> >>>>   https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u
> >>> Why reference the bug report when there's a proper commit (f454b18e07f5) now?
> >>> Plus in any event a short summary of the erratum would help if put right here
> >>> (without needing to look up any documents or follow any links).
> >>
> >> That is not public information yet.  The erratum number alone is the
> >> best we can do at this juncture.
> >>>> --- a/xen/arch/x86/cpu/amd.c
> >>>> +++ b/xen/arch/x86/cpu/amd.c
> >>>> @@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
> >>>>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
> >>>>  }
> >>>>  
> >>>> +static void amd_check_erratum_1485(void)
> >>>> +{
> >>>> +	uint64_t val, chickenbit = (1 << 5);
> >>> Linux gives the bit a name. Any reason you don't?
> >>
> >> There are multiple different names depending on where you look, and none
> >> are particularly relevant here.
> > 
> > Could we make chickenbit const static?
> > 
> > I would also use ULL just to be on the safe side, because we then copy
> > this for a different bit and it explodes.
> 
> I guess the way it is resembles what we already have in amd_check_zenbleed().
> Also it's not clear to me why besides "const" you also ask for "static".

Yes, makes no sense to put in .rodata, sorry, just const.

Roger.
diff mbox series

Patch

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 4f27187f92..cb77c23f1a 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -1004,6 +1004,28 @@  static void cf_check zen2_disable_c6(void *arg)
 	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
 }
 
+static void amd_check_erratum_1485(void)
+{
+	uint64_t val, chickenbit = (1 << 5);
+
+	if (cpu_has_hypervisor || boot_cpu_data.x86 != 0x19 || !is_zen4_uarch())
+		return;
+
+	rdmsrl(MSR_AMD64_BP_CFG, val);
+
+	if (val & chickenbit)
+		return;
+
+	/*
+	 * BP_CFG is a core-scoped MSR. There's a benign race on this write
+	 * on the case where 2 threads perform the previous check at the
+	 * same time before the chickenbit is set. It's benign because the
+	 * value being written is the same on both.
+	 */
+	wrmsrl(MSR_AMD64_BP_CFG, val | chickenbit);
+
+}
+
 static void cf_check init_amd(struct cpuinfo_x86 *c)
 {
 	u32 l, h;
@@ -1271,6 +1293,7 @@  static void cf_check init_amd(struct cpuinfo_x86 *c)
 		disable_c1_ramping();
 
 	amd_check_zenbleed();
+	amd_check_erratum_1485();
 
 	if (zen2_c6_disabled)
 		zen2_disable_c6(NULL);
diff --git a/xen/arch/x86/include/asm/amd.h b/xen/arch/x86/include/asm/amd.h
index d862cb7972..0700827561 100644
--- a/xen/arch/x86/include/asm/amd.h
+++ b/xen/arch/x86/include/asm/amd.h
@@ -145,11 +145,16 @@ 
  * Hygon (Fam18h) but without simple model number rules.  Instead, use STIBP
  * as a heuristic that distinguishes the two.
  *
+ * For Zen3 and Zen4 (Fam19h) the heuristic is the presence of AutoIBRS, as
+ * it's Zen4-specific.
+ *
  * The caller is required to perform the appropriate vendor/family checks
  * first.
  */
 #define is_zen1_uarch() (!boot_cpu_has(X86_FEATURE_AMD_STIBP))
 #define is_zen2_uarch()   boot_cpu_has(X86_FEATURE_AMD_STIBP)
+#define is_zen3_uarch() (!boot_cpu_has(X86_FEATURE_AUTO_IBRS))
+#define is_zen4_uarch()   boot_cpu_has(X86_FEATURE_AUTO_IBRS)
 
 struct cpuinfo_x86;
 int cpu_has_amd_erratum(const struct cpuinfo_x86 *, int, ...);
diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
index 11ffed543a..7b3490bfb1 100644
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -403,6 +403,7 @@ 
 #define MSR_AMD64_DE_CFG		0xc0011029
 #define AMD64_DE_CFG_LFENCE_SERIALISE	(_AC(1, ULL) << 1)
 #define MSR_AMD64_EX_CFG		0xc001102c
+#define MSR_AMD64_BP_CFG		0xc001102e
 #define MSR_AMD64_DE_CFG2		0xc00110e3
 
 #define MSR_AMD64_DR0_ADDRESS_MASK	0xc0011027