Message ID | 20241125132029.7241-6-patryk.wlazlyn@linux.intel.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
Series | SRF: Fix offline CPU preventing pc6 entry | expand |
On Mon, Nov 25, 2024 at 02:20:25PM +0100, Patryk Wlazlyn wrote: > The algorithm based on cpuid leaf 0x5 in the mwait_play_dead(), > for looking up the mwait hint for the deepest cstate may calculate the > wrong hint on recent Intel x86 platforms. > > Prefer cpuidle_play_dead() over the generic mwait_play_dead() loop and > only fallback to the later in case of missing appropriate enter_dead() > handler. So 7/8 should go first, such that picking cpuidle doesn't end up in some weird state. > > Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> > --- > arch/x86/kernel/smpboot.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c > index d0464c7a0af5..2627b56fb9bc 100644 > --- a/arch/x86/kernel/smpboot.c > +++ b/arch/x86/kernel/smpboot.c > @@ -1413,9 +1413,10 @@ void native_play_dead(void) > play_dead_common(); > tboot_shutdown(TB_SHUTDOWN_WFS); > > + /* The first successful play_dead() will not return */ > + cpuidle_play_dead(); > mwait_play_dead(); > - if (cpuidle_play_dead()) > - hlt_play_dead(); > + hlt_play_dead(); > } > > #else /* ... !CONFIG_HOTPLUG_CPU */ > -- > 2.47.0 >
On Mon, Nov 25, 2024 at 2:21 PM Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> wrote: > > The algorithm based on cpuid leaf 0x5 in the mwait_play_dead(), > for looking up the mwait hint for the deepest cstate may calculate the > wrong hint on recent Intel x86 platforms. > > Prefer cpuidle_play_dead() over the generic mwait_play_dead() loop and > only fallback to the later in case of missing appropriate enter_dead() > handler. > > Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> > --- > arch/x86/kernel/smpboot.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c > index d0464c7a0af5..2627b56fb9bc 100644 > --- a/arch/x86/kernel/smpboot.c > +++ b/arch/x86/kernel/smpboot.c > @@ -1413,9 +1413,10 @@ void native_play_dead(void) > play_dead_common(); > tboot_shutdown(TB_SHUTDOWN_WFS); > > + /* The first successful play_dead() will not return */ > + cpuidle_play_dead(); > mwait_play_dead(); > - if (cpuidle_play_dead()) > - hlt_play_dead(); > + hlt_play_dead(); > } > > #else /* ... !CONFIG_HOTPLUG_CPU */ > -- If you first make intel_idle provide :enter_dead() for all CPUs on all platforms and implement it by calling mwait_play_dead_with_hint(), you won't need mwait_play_dead() any more.
> If you first make intel_idle provide :enter_dead() for all CPUs on all > platforms and implement it by calling mwait_play_dead_with_hint(), you > won't need mwait_play_dead() any more. Crossed my mind, but because mwait_play_dead doesn't filter on Intel vendor specifically, I thought that I might break some edge case. I am all for simplifying.
On Mon, Nov 25, 2024 at 3:43 PM Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> wrote: > > > If you first make intel_idle provide :enter_dead() for all CPUs on all > > platforms and implement it by calling mwait_play_dead_with_hint(), you > > won't need mwait_play_dead() any more. > Crossed my mind, but because mwait_play_dead doesn't filter on Intel > vendor specifically, In practice, it does. The vendor check in it is equivalent to "if Intel". > I thought that I might break some edge case. > > > I am all for simplifying.
>>> If you first make intel_idle provide :enter_dead() for all CPUs on all >>> platforms and implement it by calling mwait_play_dead_with_hint(), you >>> won't need mwait_play_dead() any more. >> Crossed my mind, but because mwait_play_dead doesn't filter on Intel >> vendor specifically, > > In practice, it does. > > The vendor check in it is equivalent to "if Intel". Actually, what about INTEL_IDLE=n? We might hit acpi_idle, which would call mwait_play_dead_with_hint() now, but if we don't, don't we want to try mwait_play_dead before hlt or is it too unrealistic to happen?
On Tue, Nov 26, 2024 at 12:56 PM Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> wrote: > > >>> If you first make intel_idle provide :enter_dead() for all CPUs on all > >>> platforms and implement it by calling mwait_play_dead_with_hint(), you > >>> won't need mwait_play_dead() any more. > >> Crossed my mind, but because mwait_play_dead doesn't filter on Intel > >> vendor specifically, > > > > In practice, it does. > > > > The vendor check in it is equivalent to "if Intel". > > Actually, what about INTEL_IDLE=n? > We might hit acpi_idle, which would call mwait_play_dead_with_hint() now, but > if we don't, don't we want to try mwait_play_dead before hlt or is it too > unrealistic to happen? In that case the hint to use would not be known anyway, so hlt_play_dead() is the right choice IMV.
>>>>> If you first make intel_idle provide :enter_dead() for all CPUs on all >>>>> platforms and implement it by calling mwait_play_dead_with_hint(), you >>>>> won't need mwait_play_dead() any more. >>>> Crossed my mind, but because mwait_play_dead doesn't filter on Intel >>>> vendor specifically, >>> >>> In practice, it does. >>> >>> The vendor check in it is equivalent to "if Intel". >> >> Actually, what about INTEL_IDLE=n? >> We might hit acpi_idle, which would call mwait_play_dead_with_hint() now, but >> if we don't, don't we want to try mwait_play_dead before hlt or is it too >> unrealistic to happen? > > In that case the hint to use would not be known anyway, so > hlt_play_dead() is the right choice IMV. Fair, it's not known, but we could fallback to the old, cpuid leaf 0x5 based algorithm, which works on a lot of hardware. That being said, I think it's cleaner to get rid of the old algorithm entirely and rely on idle drivers to do the right thing from this point forward. If we could bring the CPU out of the mwait loop into the hlt loop somehow (via interrupt for example) we could remove the old kexec hack altogether.
On Tue, Nov 26, 2024 at 2:10 PM Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> wrote: > > >>>>> If you first make intel_idle provide :enter_dead() for all CPUs on all > >>>>> platforms and implement it by calling mwait_play_dead_with_hint(), you > >>>>> won't need mwait_play_dead() any more. > >>>> Crossed my mind, but because mwait_play_dead doesn't filter on Intel > >>>> vendor specifically, > >>> > >>> In practice, it does. > >>> > >>> The vendor check in it is equivalent to "if Intel". > >> > >> Actually, what about INTEL_IDLE=n? > >> We might hit acpi_idle, which would call mwait_play_dead_with_hint() now, but > >> if we don't, don't we want to try mwait_play_dead before hlt or is it too > >> unrealistic to happen? > > > > In that case the hint to use would not be known anyway, so > > hlt_play_dead() is the right choice IMV. > > Fair, it's not known, but we could fallback to the old, cpuid leaf 0x5 based > algorithm, which works on a lot of hardware. > > That being said, I think it's cleaner to get rid of the old algorithm entirely > and rely on idle drivers to do the right thing from this point forward. > > If we could bring the CPU out of the mwait loop into the hlt loop somehow (via > interrupt for example) we could remove the old kexec hack altogether. Well, the problem is that CPUs may be woken up from MWAIT waits for various reasons and the reason for a given wakeup is not known a priori.
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index d0464c7a0af5..2627b56fb9bc 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1413,9 +1413,10 @@ void native_play_dead(void) play_dead_common(); tboot_shutdown(TB_SHUTDOWN_WFS); + /* The first successful play_dead() will not return */ + cpuidle_play_dead(); mwait_play_dead(); - if (cpuidle_play_dead()) - hlt_play_dead(); + hlt_play_dead(); } #else /* ... !CONFIG_HOTPLUG_CPU */
The algorithm based on cpuid leaf 0x5 in the mwait_play_dead(), for looking up the mwait hint for the deepest cstate may calculate the wrong hint on recent Intel x86 platforms. Prefer cpuidle_play_dead() over the generic mwait_play_dead() loop and only fallback to the later in case of missing appropriate enter_dead() handler. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> --- arch/x86/kernel/smpboot.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)