Message ID | 20241108122909.763663-4-patryk.wlazlyn@linux.intel.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
Series | SRF: Fix offline CPU preventing pc6 entry | expand |
On 11/8/24 04:29, Patryk Wlazlyn wrote: > Intel's Sierra Forest report two C6 substates in cpuid leaf 5: > C6S (hint 0x22) > C6SP (hint 0x23) > > Hints 0x20 and 0x21 are skipped entirely, causing the generic > implementation in mwait_play_dead() to compute the wrong hint, when > looking for the deepest cstate. As a result, package with an offlined > CPU can never reach PC6. This series has said multiple times how the old algorithm is wrong. But it never actually _fixed_ the bad algorithm, only worked around it. Does mwait_play_dead() itself need to get fixed? > Define the enter_dead() handler for SRF. This effectively gets the mwait hints from ______ instead of using the calculation in mwait_play_dead(). > diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c > index 9aab7abc2ae9..bd67959e5e8b 100644 > --- a/drivers/idle/intel_idle.c > +++ b/drivers/idle/intel_idle.c > @@ -56,6 +56,7 @@ > #include <asm/mwait.h> > #include <asm/spec-ctrl.h> > #include <asm/fpu/api.h> > +#include <asm/smp.h> > > #define INTEL_IDLE_VERSION "0.5.1" > > @@ -221,6 +222,17 @@ static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev, > return 0; > } > > +static __cpuidle int intel_idle_enter_dead(struct cpuidle_device *dev, > + int index) > +{ > + struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev); > + struct cpuidle_state *state = &drv->states[index]; > + unsigned long eax = flg2MWAIT(state->flags); > + > + /* Retruns only in case of an error. */ ^ returns? > + return mwait_play_dead_with_hint(eax); > +} > + > /* > * States are indexed by the cstate number, > * which is also the index into the MWAIT hint array. > @@ -1303,6 +1315,7 @@ static struct cpuidle_state srf_cstates[] __initdata = { > .exit_latency = 1, > .target_residency = 1, > .enter = &intel_idle, > + .enter_dead = &intel_idle_enter_dead, > .enter_s2idle = intel_idle_s2idle, }, > { > .name = "C1E", > @@ -1311,6 +1324,7 @@ static struct cpuidle_state srf_cstates[] __initdata = { > .exit_latency = 2, > .target_residency = 10, > .enter = &intel_idle, > + .enter_dead = &intel_idle_enter_dead, > .enter_s2idle = intel_idle_s2idle, }, > { > .name = "C6S", > @@ -1319,6 +1333,7 @@ static struct cpuidle_state srf_cstates[] __initdata = { > .exit_latency = 270, > .target_residency = 700, > .enter = &intel_idle, > + .enter_dead = &intel_idle_enter_dead, > .enter_s2idle = intel_idle_s2idle, }, > { > .name = "C6SP", > @@ -1327,6 +1342,7 @@ static struct cpuidle_state srf_cstates[] __initdata = { > .exit_latency = 310, > .target_residency = 900, > .enter = &intel_idle, > + .enter_dead = &intel_idle_enter_dead, > .enter_s2idle = intel_idle_s2idle, }, > { > .enter = NULL }
Hi Patryk,
kernel test robot noticed the following build warnings:
[auto build test WARNING on acpi/next]
[also build test WARNING on tip/master tip/x86/core linus/master v6.12-rc6 next-20241108]
[cannot apply to tip/auto-latest]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Patryk-Wlazlyn/x86-smp-Allow-calling-mwait_play_dead-with-arbitrary-hint/20241108-203137
base: https://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux.git next
patch link: https://lore.kernel.org/r/20241108122909.763663-4-patryk.wlazlyn%40linux.intel.com
patch subject: [PATCH v3 3/3] intel_idle: Provide enter_dead() handler for SRF
config: x86_64-randconfig-161-20241109 (https://download.01.org/0day-ci/archive/20241109/202411090651.FNnbtLe2-lkp@intel.com/config)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241109/202411090651.FNnbtLe2-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411090651.FNnbtLe2-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> vmlinux.o: warning: objtool: intel_idle_enter_dead+0x9: call to cpuidle_get_cpu_driver() leaves .noinstr.text section
> This series has said multiple times how the old algorithm is wrong. But > it never actually _fixed_ the bad algorithm, only worked around it. > > Does mwait_play_dead() itself need to get fixed? I don't think so. The old algorithm gives fairly good heuristic for computing the mwait hint for the deepest cstate. Even though it's not guaranteed to work, it does work on most of the platforms that don't early return. I think we should leave it, but prefer idle_driver. >> Define the enter_dead() handler for SRF. > > This effectively gets the mwait hints from ______ instead of using the > calculation in mwait_play_dead(). Ok. >> +static __cpuidle int intel_idle_enter_dead(struct cpuidle_device *dev, >> + int index) >> +{ >> + struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev); >> + struct cpuidle_state *state = &drv->states[index]; >> + unsigned long eax = flg2MWAIT(state->flags); >> + >> + /* Retruns only in case of an error. */ > > ^ returns? Yup. Will fix.
On Tue, Nov 12, 2024 at 12:18 PM Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> wrote: > > > This series has said multiple times how the old algorithm is wrong. But > > it never actually _fixed_ the bad algorithm, only worked around it. > > > > Does mwait_play_dead() itself need to get fixed? > > I don't think so. The old algorithm gives fairly good heuristic for computing > the mwait hint for the deepest cstate. Even though it's not guaranteed to work, > it does work on most of the platforms that don't early return. I think we should > leave it, but prefer idle_driver. IOW, as a fallback mechanism, it is as good as it gets. As the primary source of information though, not quite. > >> Define the enter_dead() handler for SRF. > > > > This effectively gets the mwait hints from ______ instead of using the > > calculation in mwait_play_dead(). > > Ok. > > >> +static __cpuidle int intel_idle_enter_dead(struct cpuidle_device *dev, > >> + int index) > >> +{ > >> + struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev); > >> + struct cpuidle_state *state = &drv->states[index]; > >> + unsigned long eax = flg2MWAIT(state->flags); > >> + > >> + /* Retruns only in case of an error. */ > > > > ^ returns? > > Yup. Will fix. > >
On 11/12/24 03:28, Rafael J. Wysocki wrote: > On Tue, Nov 12, 2024 at 12:18 PM Patryk Wlazlyn > <patryk.wlazlyn@linux.intel.com> wrote: >>> This series has said multiple times how the old algorithm is wrong. But >>> it never actually _fixed_ the bad algorithm, only worked around it. >>> >>> Does mwait_play_dead() itself need to get fixed? >> I don't think so. The old algorithm gives fairly good heuristic for computing >> the mwait hint for the deepest cstate. Even though it's not guaranteed to work, >> it does work on most of the platforms that don't early return. I think we should >> leave it, but prefer idle_driver. > IOW, as a fallback mechanism, it is as good as it gets. > > As the primary source of information though, not quite. Could we also work to make this a bit more clear when it should be used and where the primary sources of information are?
On Tue, Nov 12 2024 at 12:28, Rafael J. Wysocki wrote: > On Tue, Nov 12, 2024 at 12:18 PM Patryk Wlazlyn > <patryk.wlazlyn@linux.intel.com> wrote: >> I don't think so. The old algorithm gives fairly good heuristic for computing >> the mwait hint for the deepest cstate. Even though it's not guaranteed to work, >> it does work on most of the platforms that don't early return. I think we should >> leave it, but prefer idle_driver. > > IOW, as a fallback mechanism, it is as good as it gets. > > As the primary source of information though, not quite. So we have at least 5 places in the kernel which evaluate CPUID leaf 0x5 in different ways. Can we please have _ONE_ function which evaluates the leaf correctly once and provides the required information for all places ready to use? Thanks, tglx
On Tue, Nov 12, 2024 at 8:18 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > On Tue, Nov 12 2024 at 12:28, Rafael J. Wysocki wrote: > > On Tue, Nov 12, 2024 at 12:18 PM Patryk Wlazlyn > > <patryk.wlazlyn@linux.intel.com> wrote: > >> I don't think so. The old algorithm gives fairly good heuristic for computing > >> the mwait hint for the deepest cstate. Even though it's not guaranteed to work, > >> it does work on most of the platforms that don't early return. I think we should > >> leave it, but prefer idle_driver. > > > > IOW, as a fallback mechanism, it is as good as it gets. > > > > As the primary source of information though, not quite. > > So we have at least 5 places in the kernel which evaluate CPUID leaf 0x5 > in different ways. > > Can we please have _ONE_ function which evaluates the leaf correctly > once and provides the required information for all places ready to use? Yup, that's what needs to be done.
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 9aab7abc2ae9..bd67959e5e8b 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -56,6 +56,7 @@ #include <asm/mwait.h> #include <asm/spec-ctrl.h> #include <asm/fpu/api.h> +#include <asm/smp.h> #define INTEL_IDLE_VERSION "0.5.1" @@ -221,6 +222,17 @@ static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev, return 0; } +static __cpuidle int intel_idle_enter_dead(struct cpuidle_device *dev, + int index) +{ + struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev); + struct cpuidle_state *state = &drv->states[index]; + unsigned long eax = flg2MWAIT(state->flags); + + /* Retruns only in case of an error. */ + return mwait_play_dead_with_hint(eax); +} + /* * States are indexed by the cstate number, * which is also the index into the MWAIT hint array. @@ -1303,6 +1315,7 @@ static struct cpuidle_state srf_cstates[] __initdata = { .exit_latency = 1, .target_residency = 1, .enter = &intel_idle, + .enter_dead = &intel_idle_enter_dead, .enter_s2idle = intel_idle_s2idle, }, { .name = "C1E", @@ -1311,6 +1324,7 @@ static struct cpuidle_state srf_cstates[] __initdata = { .exit_latency = 2, .target_residency = 10, .enter = &intel_idle, + .enter_dead = &intel_idle_enter_dead, .enter_s2idle = intel_idle_s2idle, }, { .name = "C6S", @@ -1319,6 +1333,7 @@ static struct cpuidle_state srf_cstates[] __initdata = { .exit_latency = 270, .target_residency = 700, .enter = &intel_idle, + .enter_dead = &intel_idle_enter_dead, .enter_s2idle = intel_idle_s2idle, }, { .name = "C6SP", @@ -1327,6 +1342,7 @@ static struct cpuidle_state srf_cstates[] __initdata = { .exit_latency = 310, .target_residency = 900, .enter = &intel_idle, + .enter_dead = &intel_idle_enter_dead, .enter_s2idle = intel_idle_s2idle, }, { .enter = NULL }
Intel's Sierra Forest report two C6 substates in cpuid leaf 5: C6S (hint 0x22) C6SP (hint 0x23) Hints 0x20 and 0x21 are skipped entirely, causing the generic implementation in mwait_play_dead() to compute the wrong hint, when looking for the deepest cstate. As a result, package with an offlined CPU can never reach PC6. Define the enter_dead() handler for SRF. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> --- drivers/idle/intel_idle.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)