Message ID | da80b8dd-177c-d27a-9a00-c9538a139d37@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86: HPET PC10 quirk workaround and some cleanup | expand |
On Tue, Oct 19, 2021 at 09:07:39AM +0200, Jan Beulich wrote: > From: Thomas Gleixner <tglx@linutronix.de> > > On recent Intel systems the HPET stops working when the system reaches PC10 > idle state. > > The approach of adding PCI ids to the early quirks to disable HPET on > these systems is a whack a mole game which makes no sense. > > Check for PC10 instead and force disable HPET if supported. The check is > overbroad as it does not take ACPI, mwait-idle enablement and command > line parameters into account. That's fine as long as there is at least > PMTIMER available to calibrate the TSC frequency. The decision can be > overruled by adding "clocksource=hpet" on the Xen command line. > > Remove the related PCI quirks for affected Coffee Lake systems as they > are not longer required. That should also cover all other systems, i.e. > Ice Lake, Tiger Lake, and newer generations, which are most likely > affected by this as well. > > Fixes: Yet another hardware trainwreck > Reported-by: Jakub Kicinski <kuba@kernel.org> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > [Linux commit: 6e3cd95234dc1eda488f4f487c281bac8fef4d9b] > > I have to admit that the purpose of checking CPUID5_ECX_INTERRUPT_BREAK > is unclear to me, but I didn't want to diverge in technical aspects from > the Linux commit. > > In mwait_pc10_supported(), besides some cosmetic adjustments, avoid UB > from shifting left a signed 4-bit constant by 28 bits. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Just one comment below. > --- > v2: Fully different replacement of "x86: avoid HPET use also on certain > Coffee Lake H". > > --- a/xen/arch/x86/time.c > +++ b/xen/arch/x86/time.c > @@ -34,6 +34,7 @@ > #include <asm/fixmap.h> > #include <asm/guest.h> > #include <asm/mc146818rtc.h> > +#include <asm/mwait.h> > #include <asm/div64.h> > #include <asm/acpi.h> > #include <asm/hpet.h> > @@ -395,14 +396,43 @@ static int64_t __init init_hpet(struct p > } > > /* > - * Some Coffee Lake platforms have a skewed HPET timer once the SoCs > - * entered PC10. > + * Some Coffee Lake and later platforms have a skewed HPET timer once > + * they entered PC10. > + * > + * Check whether the system supports PC10. If so force disable HPET as > + * that stops counting in PC10. This check is overbroad as it does not > + * take any of the following into account: > + * > + * - ACPI tables > + * - Enablement of mwait-idle > + * - Command line arguments which limit mwait-idle C-state support > + * > + * That's perfectly fine. HPET is a piece of hardware designed by > + * committee and the only reasons why it is still in use on modern > + * systems is the fact that it is impossible to reliably query TSC and > + * CPU frequency via CPUID or firmware. > + * > + * If HPET is functional it is useful for calibrating TSC, but this can > + * be done via PMTIMER as well which seems to be the last remaining > + * timer on X86/INTEL platforms that has not been completely wreckaged > + * by feature creep. > + * > + * In theory HPET support should be removed altogether, but there are > + * older systems out there which depend on it because TSC and APIC timer > + * are dysfunctional in deeper C-states. > */ > - if ( pci_conf_read16(PCI_SBDF(0, 0, 0, 0), > - PCI_VENDOR_ID) == PCI_VENDOR_ID_INTEL && > - pci_conf_read16(PCI_SBDF(0, 0, 0, 0), > - PCI_DEVICE_ID) == 0x3ec4 ) > - hpet_address = 0; > + if ( mwait_pc10_supported() ) > + { > + uint64_t pcfg; > + > + rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, pcfg); > + if ( (pcfg & 0xf) < 8 ) > + /* nothing */; > + else if ( !strcmp(opt_clocksource, pts->id) ) > + printk("HPET use requested via command line, but dysfunctional in PC10\n"); > + else > + hpet_address = 0; Should we print a message that HPET is being disabled? Thanks, Roger.
On 19.10.2021 13:30, Roger Pau Monné wrote: > On Tue, Oct 19, 2021 at 09:07:39AM +0200, Jan Beulich wrote: >> From: Thomas Gleixner <tglx@linutronix.de> >> >> On recent Intel systems the HPET stops working when the system reaches PC10 >> idle state. >> >> The approach of adding PCI ids to the early quirks to disable HPET on >> these systems is a whack a mole game which makes no sense. >> >> Check for PC10 instead and force disable HPET if supported. The check is >> overbroad as it does not take ACPI, mwait-idle enablement and command >> line parameters into account. That's fine as long as there is at least >> PMTIMER available to calibrate the TSC frequency. The decision can be >> overruled by adding "clocksource=hpet" on the Xen command line. >> >> Remove the related PCI quirks for affected Coffee Lake systems as they >> are not longer required. That should also cover all other systems, i.e. >> Ice Lake, Tiger Lake, and newer generations, which are most likely >> affected by this as well. >> >> Fixes: Yet another hardware trainwreck >> Reported-by: Jakub Kicinski <kuba@kernel.org> >> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> >> [Linux commit: 6e3cd95234dc1eda488f4f487c281bac8fef4d9b] >> >> I have to admit that the purpose of checking CPUID5_ECX_INTERRUPT_BREAK >> is unclear to me, but I didn't want to diverge in technical aspects from >> the Linux commit. >> >> In mwait_pc10_supported(), besides some cosmetic adjustments, avoid UB >> from shifting left a signed 4-bit constant by 28 bits. >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > > Acked-by: Roger Pau Monné <roger.pau@citrix.com> Thanks. >> @@ -395,14 +396,43 @@ static int64_t __init init_hpet(struct p >> } >> >> /* >> - * Some Coffee Lake platforms have a skewed HPET timer once the SoCs >> - * entered PC10. >> + * Some Coffee Lake and later platforms have a skewed HPET timer once >> + * they entered PC10. >> + * >> + * Check whether the system supports PC10. If so force disable HPET as >> + * that stops counting in PC10. This check is overbroad as it does not >> + * take any of the following into account: >> + * >> + * - ACPI tables >> + * - Enablement of mwait-idle >> + * - Command line arguments which limit mwait-idle C-state support >> + * >> + * That's perfectly fine. HPET is a piece of hardware designed by >> + * committee and the only reasons why it is still in use on modern >> + * systems is the fact that it is impossible to reliably query TSC and >> + * CPU frequency via CPUID or firmware. >> + * >> + * If HPET is functional it is useful for calibrating TSC, but this can >> + * be done via PMTIMER as well which seems to be the last remaining >> + * timer on X86/INTEL platforms that has not been completely wreckaged >> + * by feature creep. >> + * >> + * In theory HPET support should be removed altogether, but there are >> + * older systems out there which depend on it because TSC and APIC timer >> + * are dysfunctional in deeper C-states. >> */ >> - if ( pci_conf_read16(PCI_SBDF(0, 0, 0, 0), >> - PCI_VENDOR_ID) == PCI_VENDOR_ID_INTEL && >> - pci_conf_read16(PCI_SBDF(0, 0, 0, 0), >> - PCI_DEVICE_ID) == 0x3ec4 ) >> - hpet_address = 0; >> + if ( mwait_pc10_supported() ) >> + { >> + uint64_t pcfg; >> + >> + rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, pcfg); >> + if ( (pcfg & 0xf) < 8 ) >> + /* nothing */; >> + else if ( !strcmp(opt_clocksource, pts->id) ) >> + printk("HPET use requested via command line, but dysfunctional in PC10\n"); >> + else >> + hpet_address = 0; > > Should we print a message that HPET is being disabled? There is one, and it was even visible in patch context that you did strip from your reply: if ( !hpet_address ) printk("Disabling HPET for being unreliable\n"); Jan
On Tue, Oct 19, 2021 at 02:08:38PM +0200, Jan Beulich wrote: > On 19.10.2021 13:30, Roger Pau Monné wrote: > > On Tue, Oct 19, 2021 at 09:07:39AM +0200, Jan Beulich wrote: > >> From: Thomas Gleixner <tglx@linutronix.de> > >> > >> On recent Intel systems the HPET stops working when the system reaches PC10 > >> idle state. > >> > >> The approach of adding PCI ids to the early quirks to disable HPET on > >> these systems is a whack a mole game which makes no sense. > >> > >> Check for PC10 instead and force disable HPET if supported. The check is > >> overbroad as it does not take ACPI, mwait-idle enablement and command > >> line parameters into account. That's fine as long as there is at least > >> PMTIMER available to calibrate the TSC frequency. The decision can be > >> overruled by adding "clocksource=hpet" on the Xen command line. > >> > >> Remove the related PCI quirks for affected Coffee Lake systems as they > >> are not longer required. That should also cover all other systems, i.e. > >> Ice Lake, Tiger Lake, and newer generations, which are most likely > >> affected by this as well. > >> > >> Fixes: Yet another hardware trainwreck > >> Reported-by: Jakub Kicinski <kuba@kernel.org> > >> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > >> [Linux commit: 6e3cd95234dc1eda488f4f487c281bac8fef4d9b] > >> > >> I have to admit that the purpose of checking CPUID5_ECX_INTERRUPT_BREAK > >> is unclear to me, but I didn't want to diverge in technical aspects from > >> the Linux commit. > >> > >> In mwait_pc10_supported(), besides some cosmetic adjustments, avoid UB > >> from shifting left a signed 4-bit constant by 28 bits. > >> > >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > > > > Acked-by: Roger Pau Monné <roger.pau@citrix.com> > > Thanks. > > >> @@ -395,14 +396,43 @@ static int64_t __init init_hpet(struct p > >> } > >> > >> /* > >> - * Some Coffee Lake platforms have a skewed HPET timer once the SoCs > >> - * entered PC10. > >> + * Some Coffee Lake and later platforms have a skewed HPET timer once > >> + * they entered PC10. > >> + * > >> + * Check whether the system supports PC10. If so force disable HPET as > >> + * that stops counting in PC10. This check is overbroad as it does not > >> + * take any of the following into account: > >> + * > >> + * - ACPI tables > >> + * - Enablement of mwait-idle > >> + * - Command line arguments which limit mwait-idle C-state support > >> + * > >> + * That's perfectly fine. HPET is a piece of hardware designed by > >> + * committee and the only reasons why it is still in use on modern > >> + * systems is the fact that it is impossible to reliably query TSC and > >> + * CPU frequency via CPUID or firmware. > >> + * > >> + * If HPET is functional it is useful for calibrating TSC, but this can > >> + * be done via PMTIMER as well which seems to be the last remaining > >> + * timer on X86/INTEL platforms that has not been completely wreckaged > >> + * by feature creep. > >> + * > >> + * In theory HPET support should be removed altogether, but there are > >> + * older systems out there which depend on it because TSC and APIC timer > >> + * are dysfunctional in deeper C-states. > >> */ > >> - if ( pci_conf_read16(PCI_SBDF(0, 0, 0, 0), > >> - PCI_VENDOR_ID) == PCI_VENDOR_ID_INTEL && > >> - pci_conf_read16(PCI_SBDF(0, 0, 0, 0), > >> - PCI_DEVICE_ID) == 0x3ec4 ) > >> - hpet_address = 0; > >> + if ( mwait_pc10_supported() ) > >> + { > >> + uint64_t pcfg; > >> + > >> + rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, pcfg); > >> + if ( (pcfg & 0xf) < 8 ) > >> + /* nothing */; > >> + else if ( !strcmp(opt_clocksource, pts->id) ) > >> + printk("HPET use requested via command line, but dysfunctional in PC10\n"); > >> + else > >> + hpet_address = 0; > > > > Should we print a message that HPET is being disabled? > > There is one, and it was even visible in patch context that you > did strip from your reply: I meant something about being disabled for PC10, but I think the generic one is fine enough. Thanks, Roger.
On 19/10/2021 08:07, Jan Beulich wrote: > From: Thomas Gleixner <tglx@linutronix.de> > > On recent Intel systems the HPET stops working when the system reaches PC10 > idle state. > > The approach of adding PCI ids to the early quirks to disable HPET on > these systems is a whack a mole game which makes no sense. > > Check for PC10 instead and force disable HPET if supported. The check is > overbroad as it does not take ACPI, mwait-idle enablement and command > line parameters into account. That's fine as long as there is at least > PMTIMER available to calibrate the TSC frequency. The decision can be > overruled by adding "clocksource=hpet" on the Xen command line. > > Remove the related PCI quirks for affected Coffee Lake systems as they > are not longer required. That should also cover all other systems, i.e. > Ice Lake, Tiger Lake, and newer generations, which are most likely > affected by this as well. > > Fixes: Yet another hardware trainwreck > Reported-by: Jakub Kicinski <kuba@kernel.org> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > [Linux commit: 6e3cd95234dc1eda488f4f487c281bac8fef4d9b] > > I have to admit that the purpose of checking CPUID5_ECX_INTERRUPT_BREAK > is unclear to me, but I didn't want to diverge in technical aspects from > the Linux commit. > > In mwait_pc10_supported(), besides some cosmetic adjustments, avoid UB > from shifting left a signed 4-bit constant by 28 bits. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> This breaks booting on recent Intel platforms. Ian: Complete blocker for the release. ~Andrew
--- a/xen/arch/x86/time.c +++ b/xen/arch/x86/time.c @@ -34,6 +34,7 @@ #include <asm/fixmap.h> #include <asm/guest.h> #include <asm/mc146818rtc.h> +#include <asm/mwait.h> #include <asm/div64.h> #include <asm/acpi.h> #include <asm/hpet.h> @@ -395,14 +396,43 @@ static int64_t __init init_hpet(struct p } /* - * Some Coffee Lake platforms have a skewed HPET timer once the SoCs - * entered PC10. + * Some Coffee Lake and later platforms have a skewed HPET timer once + * they entered PC10. + * + * Check whether the system supports PC10. If so force disable HPET as + * that stops counting in PC10. This check is overbroad as it does not + * take any of the following into account: + * + * - ACPI tables + * - Enablement of mwait-idle + * - Command line arguments which limit mwait-idle C-state support + * + * That's perfectly fine. HPET is a piece of hardware designed by + * committee and the only reasons why it is still in use on modern + * systems is the fact that it is impossible to reliably query TSC and + * CPU frequency via CPUID or firmware. + * + * If HPET is functional it is useful for calibrating TSC, but this can + * be done via PMTIMER as well which seems to be the last remaining + * timer on X86/INTEL platforms that has not been completely wreckaged + * by feature creep. + * + * In theory HPET support should be removed altogether, but there are + * older systems out there which depend on it because TSC and APIC timer + * are dysfunctional in deeper C-states. */ - if ( pci_conf_read16(PCI_SBDF(0, 0, 0, 0), - PCI_VENDOR_ID) == PCI_VENDOR_ID_INTEL && - pci_conf_read16(PCI_SBDF(0, 0, 0, 0), - PCI_DEVICE_ID) == 0x3ec4 ) - hpet_address = 0; + if ( mwait_pc10_supported() ) + { + uint64_t pcfg; + + rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, pcfg); + if ( (pcfg & 0xf) < 8 ) + /* nothing */; + else if ( !strcmp(opt_clocksource, pts->id) ) + printk("HPET use requested via command line, but dysfunctional in PC10\n"); + else + hpet_address = 0; + } if ( !hpet_address ) printk("Disabling HPET for being unreliable\n"); --- a/xen/arch/x86/cpu/mwait-idle.c +++ b/xen/arch/x86/cpu/mwait-idle.c @@ -1308,3 +1308,20 @@ int __init mwait_idle_init(struct notifi return err; } + +/* Helper function for HPET. */ +bool __init mwait_pc10_supported(void) +{ + unsigned int ecx, edx, dummy; + + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL || + !cpu_has_monitor || + boot_cpu_data.cpuid_level < CPUID_MWAIT_LEAF) + return false; + + cpuid(CPUID_MWAIT_LEAF, &dummy, &dummy, &ecx, &edx); + + return (ecx & CPUID5_ECX_EXTENSIONS_SUPPORTED) && + (ecx & CPUID5_ECX_INTERRUPT_BREAK) && + (edx >> 28); +} --- a/xen/include/asm-x86/mwait.h +++ b/xen/include/asm-x86/mwait.h @@ -1,6 +1,8 @@ #ifndef __ASM_X86_MWAIT_H__ #define __ASM_X86_MWAIT_H__ +#include <xen/types.h> + #define MWAIT_SUBSTATE_MASK 0xf #define MWAIT_CSTATE_MASK 0xf #define MWAIT_SUBSTATE_SIZE 4 @@ -12,5 +14,6 @@ #define MWAIT_ECX_INTERRUPT_BREAK 0x1 void mwait_idle_with_hints(unsigned int eax, unsigned int ecx); +bool mwait_pc10_supported(void); #endif /* __ASM_X86_MWAIT_H__ */