diff mbox series

amd: disable C6 after 1000 days on Fam17h models 30-3fh

Message ID 20230605151036.18085-1-roger.pau@citrix.com (mailing list archive)
State New, archived
Headers show
Series amd: disable C6 after 1000 days on Fam17h models 30-3fh | expand

Commit Message

Roger Pau Monné June 5, 2023, 3:10 p.m. UTC
As specified on Errata 1474:

"A core will fail to exit CC6 after about 1044 days after the last
system reset. The time of failure may vary depending on the spread
spectrum and REFCLK frequency."

Detect when running on AMD Fam17h models 30h-3fh and setup a timer to
prevent entering C6 after 1000 days have elapsed.  Take into account
the TSC value at boot in order to account for any time elapsed before
Xen has been booted.

Print a message once C6 is disabled in order to let the user know.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
I think the only 30-3fh model is 31h (Rome/Castle Peak), but I've
coded the check as to allow the whole range.
---
 xen/arch/x86/acpi/cpu_idle.c   |  3 ++-
 xen/arch/x86/cpu/amd.c         | 42 ++++++++++++++++++++++++++++++++++
 xen/arch/x86/include/asm/amd.h |  2 ++
 xen/include/xen/time.h         |  1 +
 4 files changed, 47 insertions(+), 1 deletion(-)

Comments

Andrew Cooper June 5, 2023, 3:51 p.m. UTC | #1
On 05/06/2023 4:10 pm, Roger Pau Monne wrote:
> diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
> index 0d3143031b5b..728fa61a54bb 100644
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -1189,3 +1190,44 @@ const struct cpu_dev amd_cpu_dev = {
>  	.c_early_init	= early_init_amd,
>  	.c_init		= init_amd,
>  };
> +
> +static void cf_check disable_c6(void *arg)
> +{
> +	printk(XENLOG_WARNING
> +	       "Disabling C6 after 1000 days uptime due to AMD errata 1474\n");
> +	amd_disable_c6 = true;

I don't think this is good enough.

AMD CPUs can enter C6 from HLT and IOCstate as well as via MWAIT. 
You're going to need to modify the MSRs as described in the revision
guide, which will inhibit all ways of entering C6.

~Andrew
Jan Beulich June 5, 2023, 3:54 p.m. UTC | #2
On 05.06.2023 17:10, Roger Pau Monne wrote:
> As specified on Errata 1474:
> 
> "A core will fail to exit CC6 after about 1044 days after the last
> system reset. The time of failure may vary depending on the spread
> spectrum and REFCLK frequency."
> 
> Detect when running on AMD Fam17h models 30h-3fh and setup a timer to
> prevent entering C6 after 1000 days have elapsed.  Take into account
> the TSC value at boot in order to account for any time elapsed before
> Xen has been booted.

Models 6x are also affected as per their RG. I have some trouble with
the site, so it's too slow going to actually try and fish out the RGs
for the other possible models.

Given more than one set of models is affected I of course also wonder
whether Hygon CPUs wouldn't be affected, too. But I realize we have
hardly any means to find out.

> @@ -1189,3 +1190,44 @@ const struct cpu_dev amd_cpu_dev = {
>  	.c_early_init	= early_init_amd,
>  	.c_init		= init_amd,
>  };
> +
> +static void cf_check disable_c6(void *arg)
> +{
> +	printk(XENLOG_WARNING
> +	       "Disabling C6 after 1000 days uptime due to AMD errata 1474\n");
> +	amd_disable_c6 = true;
> +}
> +
> +static int __init cf_check amd_c6_errata(void)
> +{
> +	/*
> +	 * Errata #1474: A Core May Hang After About 1044 Days
> +	 * Set up a timer to disable C6 after 1000 days uptime.
> +	 */
> +	s_time_t;
> +
> +	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD ||
> +	    boot_cpu_data.x86 != 0x17 ||
> +	    (boot_cpu_data.x86_model & 0xf0) != 0x30)

Perhaps better ... & ~0xf, just to be future-proof?

> +		return 0;
> +
> +	/*
> +	 * Deduct current TSC value, this would be relevant if
> +	 * kexec'ed for example.
> +	 */
> +	delta = DAYS(1000) - tsc_ticks2ns(rdtsc());

Generally the TSC can be written (and we actually do so under certain
circumstances), so deriving any absolute time from the TSC value is of
at best questionable value.

Furthermore, perhaps better overall to suppress all of this logic when
we're running virtualized ourselves?

> +	if (delta > 0) {
> +		static struct timer errata_c6;
> +
> +		init_timer(&errata_c6, disable_c6, NULL, 0);
> +		set_timer(&errata_c6, NOW() + delta);
> +	} else
> +		disable_c6(NULL);

The log message is going to be misleading in this case. Maybe pass
that as the so far unused handler argument? Albeit I realize that this
will mean casting away const-ness, which isn't very nice.

> --- a/xen/include/xen/time.h
> +++ b/xen/include/xen/time.h
> @@ -53,6 +53,7 @@ struct tm wallclock_time(uint64_t *ns);
>  
>  #define SYSTEM_TIME_HZ  1000000000ULL
>  #define NOW()           ((s_time_t)get_s_time())
> +#define DAYS(_d)        ((s_time_t)((_d)  * 86400000000000ULL))
>  #define SECONDS(_s)     ((s_time_t)((_s)  * 1000000000ULL))
>  #define MILLISECS(_ms)  ((s_time_t)((_ms) * 1000000ULL))
>  #define MICROSECS(_us)  ((s_time_t)((_us) * 1000ULL))

While consistent with the other macros we have here, I think this would
be neater as

#define DAYS(_d)        SECONDS((_d) * 86400ULL))

especially if considering that yet later someone may want to add YEARS().

Jan
Andrew Cooper June 5, 2023, 4:07 p.m. UTC | #3
On 05/06/2023 4:54 pm, Jan Beulich wrote:
> On 05.06.2023 17:10, Roger Pau Monne wrote:
>> As specified on Errata 1474:
>>
>> "A core will fail to exit CC6 after about 1044 days after the last
>> system reset. The time of failure may vary depending on the spread
>> spectrum and REFCLK frequency."
>>
>> Detect when running on AMD Fam17h models 30h-3fh and setup a timer to
>> prevent entering C6 after 1000 days have elapsed.  Take into account
>> the TSC value at boot in order to account for any time elapsed before
>> Xen has been booted.
> Models 6x are also affected as per their RG. I have some trouble with
> the site, so it's too slow going to actually try and fish out the RGs
> for the other possible models.
>
> Given more than one set of models is affected I of course also wonder
> whether Hygon CPUs wouldn't be affected, too. But I realize we have
> hardly any means to find out.

I'd say it's more likely than unlikely, and ...

>> @@ -1189,3 +1190,44 @@ const struct cpu_dev amd_cpu_dev = {
>>  	.c_early_init	= early_init_amd,
>>  	.c_init		= init_amd,
>>  };
>> +
>> +static void cf_check disable_c6(void *arg)
>> +{
>> +	printk(XENLOG_WARNING
>> +	       "Disabling C6 after 1000 days uptime due to AMD errata 1474\n");
>> +	amd_disable_c6 = true;
>> +}
>> +
>> +static int __init cf_check amd_c6_errata(void)
>> +{
>> +	/*
>> +	 * Errata #1474: A Core May Hang After About 1044 Days
>> +	 * Set up a timer to disable C6 after 1000 days uptime.
>> +	 */
>> +	s_time_t;
>> +
>> +	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD ||
>> +	    boot_cpu_data.x86 != 0x17 ||
>> +	    (boot_cpu_data.x86_model & 0xf0) != 0x30)
> Perhaps better ... & ~0xf, just to be future-proof?

... this wants to follow the same logic as for Branch Type Confusion. 
See amd_init_spectral_chicken() looking for STIBP.

It's very likely all Zen2 models, given that it will have taken nearly 3
years to be discovered in the first place...

~Andrew
Roger Pau Monné June 6, 2023, 12:47 p.m. UTC | #4
On Mon, Jun 05, 2023 at 05:54:50PM +0200, Jan Beulich wrote:
> On 05.06.2023 17:10, Roger Pau Monne wrote:
> > As specified on Errata 1474:
> > 
> > "A core will fail to exit CC6 after about 1044 days after the last
> > system reset. The time of failure may vary depending on the spread
> > spectrum and REFCLK frequency."
> > 
> > Detect when running on AMD Fam17h models 30h-3fh and setup a timer to
> > prevent entering C6 after 1000 days have elapsed.  Take into account
> > the TSC value at boot in order to account for any time elapsed before
> > Xen has been booted.
> 
> Models 6x are also affected as per their RG. I have some trouble with
> the site, so it's too slow going to actually try and fish out the RGs
> for the other possible models.
> 
> Given more than one set of models is affected I of course also wonder
> whether Hygon CPUs wouldn't be affected, too. But I realize we have
> hardly any means to find out.

I've considered Hygon, but as you say I have no way I know to figure
out what models are affected.

> > @@ -1189,3 +1190,44 @@ const struct cpu_dev amd_cpu_dev = {
> >  	.c_early_init	= early_init_amd,
> >  	.c_init		= init_amd,
> >  };
> > +
> > +static void cf_check disable_c6(void *arg)
> > +{
> > +	printk(XENLOG_WARNING
> > +	       "Disabling C6 after 1000 days uptime due to AMD errata 1474\n");
> > +	amd_disable_c6 = true;
> > +}
> > +
> > +static int __init cf_check amd_c6_errata(void)
> > +{
> > +	/*
> > +	 * Errata #1474: A Core May Hang After About 1044 Days
> > +	 * Set up a timer to disable C6 after 1000 days uptime.
> > +	 */
> > +	s_time_t;
> > +
> > +	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD ||
> > +	    boot_cpu_data.x86 != 0x17 ||
> > +	    (boot_cpu_data.x86_model & 0xf0) != 0x30)
> 
> Perhaps better ... & ~0xf, just to be future-proof?

Right, will need to change anyway to account for 0x60 models.  But
x86_model is a char, and hence the mask is 0xff only (8 bits).

> > +		return 0;
> > +
> > +	/*
> > +	 * Deduct current TSC value, this would be relevant if
> > +	 * kexec'ed for example.
> > +	 */
> > +	delta = DAYS(1000) - tsc_ticks2ns(rdtsc());
> 
> Generally the TSC can be written (and we actually do so under certain
> circumstances), so deriving any absolute time from the TSC value is of
> at best questionable value.

It's IMO better than not accounting for the TSC at all. Worst
case is we end up disabling C6 before actually required, but that
would leave us in safe position anyway.

> Furthermore, perhaps better overall to suppress all of this logic when
> we're running virtualized ourselves?

Indeed, will do that.

> > +	if (delta > 0) {
> > +		static struct timer errata_c6;
> > +
> > +		init_timer(&errata_c6, disable_c6, NULL, 0);
> > +		set_timer(&errata_c6, NOW() + delta);
> > +	} else
> > +		disable_c6(NULL);
> 
> The log message is going to be misleading in this case. Maybe pass
> that as the so far unused handler argument? Albeit I realize that this
> will mean casting away const-ness, which isn't very nice.

I think we could use the same message in both cases, what about:

"Disabling C6 after 1000 days hardware uptime due to AMD errata 1474"

> > --- a/xen/include/xen/time.h
> > +++ b/xen/include/xen/time.h
> > @@ -53,6 +53,7 @@ struct tm wallclock_time(uint64_t *ns);
> >  
> >  #define SYSTEM_TIME_HZ  1000000000ULL
> >  #define NOW()           ((s_time_t)get_s_time())
> > +#define DAYS(_d)        ((s_time_t)((_d)  * 86400000000000ULL))
> >  #define SECONDS(_s)     ((s_time_t)((_s)  * 1000000000ULL))
> >  #define MILLISECS(_ms)  ((s_time_t)((_ms) * 1000000ULL))
> >  #define MICROSECS(_us)  ((s_time_t)((_us) * 1000ULL))
> 
> While consistent with the other macros we have here, I think this would
> be neater as
> 
> #define DAYS(_d)        SECONDS((_d) * 86400ULL))

Hm, it would make more sense to introduce the missing macros between
DAYS() and SECONDS() (MINUTES() and HOURS()) and use HOURS() in DAYS()?

Thanks, Roger.
Jan Beulich June 7, 2023, 6:24 a.m. UTC | #5
On 06.06.2023 14:47, Roger Pau Monné wrote:
> On Mon, Jun 05, 2023 at 05:54:50PM +0200, Jan Beulich wrote:
>> On 05.06.2023 17:10, Roger Pau Monne wrote:
>>> As specified on Errata 1474:
>>>
>>> "A core will fail to exit CC6 after about 1044 days after the last
>>> system reset. The time of failure may vary depending on the spread
>>> spectrum and REFCLK frequency."
>>>
>>> Detect when running on AMD Fam17h models 30h-3fh and setup a timer to
>>> prevent entering C6 after 1000 days have elapsed.  Take into account
>>> the TSC value at boot in order to account for any time elapsed before
>>> Xen has been booted.
>>
>> Models 6x are also affected as per their RG. I have some trouble with
>> the site, so it's too slow going to actually try and fish out the RGs
>> for the other possible models.
>>
>> Given more than one set of models is affected I of course also wonder
>> whether Hygon CPUs wouldn't be affected, too. But I realize we have
>> hardly any means to find out.
> 
> I've considered Hygon, but as you say I have no way I know to figure
> out what models are affected.

Well, see also Andrew's reply.

>>> +static int __init cf_check amd_c6_errata(void)
>>> +{
>>> +	/*
>>> +	 * Errata #1474: A Core May Hang After About 1044 Days
>>> +	 * Set up a timer to disable C6 after 1000 days uptime.
>>> +	 */
>>> +	s_time_t;
>>> +
>>> +	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD ||
>>> +	    boot_cpu_data.x86 != 0x17 ||
>>> +	    (boot_cpu_data.x86_model & 0xf0) != 0x30)
>>
>> Perhaps better ... & ~0xf, just to be future-proof?
> 
> Right, will need to change anyway to account for 0x60 models.  But
> x86_model is a char, and hence the mask is 0xff only (8 bits).

I understand it is right now, hence why I said future proof. I consider
it possible that upon widening the field an expression like the one
above might be overlooked.

>>> +		return 0;
>>> +
>>> +	/*
>>> +	 * Deduct current TSC value, this would be relevant if
>>> +	 * kexec'ed for example.
>>> +	 */
>>> +	delta = DAYS(1000) - tsc_ticks2ns(rdtsc());
>>
>> Generally the TSC can be written (and we actually do so under certain
>> circumstances), so deriving any absolute time from the TSC value is of
>> at best questionable value.
> 
> It's IMO better than not accounting for the TSC at all. Worst
> case is we end up disabling C6 before actually required, but that
> would leave us in safe position anyway.

Hmm, yes, fair point.

>>> +	if (delta > 0) {
>>> +		static struct timer errata_c6;
>>> +
>>> +		init_timer(&errata_c6, disable_c6, NULL, 0);
>>> +		set_timer(&errata_c6, NOW() + delta);
>>> +	} else
>>> +		disable_c6(NULL);
>>
>> The log message is going to be misleading in this case. Maybe pass
>> that as the so far unused handler argument? Albeit I realize that this
>> will mean casting away const-ness, which isn't very nice.
> 
> I think we could use the same message in both cases, what about:
> 
> "Disabling C6 after 1000 days hardware uptime due to AMD errata 1474"

But that's still misleading if TSC was ever written. The message is
pretty long already, but inserting "apparent" would address this.

>>> --- a/xen/include/xen/time.h
>>> +++ b/xen/include/xen/time.h
>>> @@ -53,6 +53,7 @@ struct tm wallclock_time(uint64_t *ns);
>>>  
>>>  #define SYSTEM_TIME_HZ  1000000000ULL
>>>  #define NOW()           ((s_time_t)get_s_time())
>>> +#define DAYS(_d)        ((s_time_t)((_d)  * 86400000000000ULL))
>>>  #define SECONDS(_s)     ((s_time_t)((_s)  * 1000000000ULL))
>>>  #define MILLISECS(_ms)  ((s_time_t)((_ms) * 1000000ULL))
>>>  #define MICROSECS(_us)  ((s_time_t)((_us) * 1000ULL))
>>
>> While consistent with the other macros we have here, I think this would
>> be neater as
>>
>> #define DAYS(_d)        SECONDS((_d) * 86400ULL))
> 
> Hm, it would make more sense to introduce the missing macros between
> DAYS() and SECONDS() (MINUTES() and HOURS()) and use HOURS() in DAYS()?

Probably, but I didn't want to go as far as asking for this.

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c
index 427c8c89c5c4..452cba3fb953 100644
--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -50,6 +50,7 @@ 
 #include <public/platform.h>
 #include <public/sysctl.h>
 #include <acpi/cpufreq/cpufreq.h>
+#include <asm/amd.h>
 #include <asm/apic.h>
 #include <asm/cpuidle.h>
 #include <asm/mwait.h>
@@ -643,7 +644,7 @@  bool errata_c6_workaround(void)
                       x86_match_cpu(isr_errata));
     }
 
-    return (fix_needed && cpu_has_pending_apic_eoi());
+    return (fix_needed && cpu_has_pending_apic_eoi()) || amd_disable_c6;
 }
 
 void update_last_cx_stat(struct acpi_processor_power *power,
diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 0d3143031b5b..728fa61a54bb 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -50,6 +50,7 @@  boolean_param("allow_unsafe", opt_allow_unsafe);
 bool __read_mostly amd_acpi_c1e_quirk;
 bool __ro_after_init amd_legacy_ssbd;
 bool __initdata amd_virt_spec_ctrl;
+bool __read_mostly amd_disable_c6;
 
 static inline int rdmsr_amd_safe(unsigned int msr, unsigned int *lo,
 				 unsigned int *hi)
@@ -1189,3 +1190,44 @@  const struct cpu_dev amd_cpu_dev = {
 	.c_early_init	= early_init_amd,
 	.c_init		= init_amd,
 };
+
+static void cf_check disable_c6(void *arg)
+{
+	printk(XENLOG_WARNING
+	       "Disabling C6 after 1000 days uptime due to AMD errata 1474\n");
+	amd_disable_c6 = true;
+}
+
+static int __init cf_check amd_c6_errata(void)
+{
+	/*
+	 * Errata #1474: A Core May Hang After About 1044 Days
+	 * Set up a timer to disable C6 after 1000 days uptime.
+	 */
+	s_time_t;
+
+	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD ||
+	    boot_cpu_data.x86 != 0x17 ||
+	    (boot_cpu_data.x86_model & 0xf0) != 0x30)
+		return 0;
+
+	/*
+	 * Deduct current TSC value, this would be relevant if
+	 * kexec'ed for example.
+	 */
+	delta = DAYS(1000) - tsc_ticks2ns(rdtsc());
+	if (delta > 0) {
+		static struct timer errata_c6;
+
+		init_timer(&errata_c6, disable_c6, NULL, 0);
+		set_timer(&errata_c6, NOW() + delta);
+	} else
+		disable_c6(NULL);
+
+	return 0;
+}
+/*
+ * Must be executed after early_time_init() for tsc_ticks2ns() to have been
+ * calibrated.  That prevents us doing the check in init_amd().
+ */
+presmp_initcall(amd_c6_errata);
diff --git a/xen/arch/x86/include/asm/amd.h b/xen/arch/x86/include/asm/amd.h
index 09ee52dc1c09..c54bc6a8903f 100644
--- a/xen/arch/x86/include/asm/amd.h
+++ b/xen/arch/x86/include/asm/amd.h
@@ -157,4 +157,6 @@  bool amd_setup_legacy_ssbd(void);
 void amd_set_legacy_ssbd(bool enable);
 void amd_set_cpuid_user_dis(bool enable);
 
+extern bool amd_disable_c6;
+
 #endif /* __AMD_H__ */
diff --git a/xen/include/xen/time.h b/xen/include/xen/time.h
index b7427460dd13..99a91579438e 100644
--- a/xen/include/xen/time.h
+++ b/xen/include/xen/time.h
@@ -53,6 +53,7 @@  struct tm wallclock_time(uint64_t *ns);
 
 #define SYSTEM_TIME_HZ  1000000000ULL
 #define NOW()           ((s_time_t)get_s_time())
+#define DAYS(_d)        ((s_time_t)((_d)  * 86400000000000ULL))
 #define SECONDS(_s)     ((s_time_t)((_s)  * 1000000000ULL))
 #define MILLISECS(_ms)  ((s_time_t)((_ms) * 1000000ULL))
 #define MICROSECS(_us)  ((s_time_t)((_us) * 1000ULL))