
[RFT,v7.3,5/8] cpuidle: Return nohz hint from cpuidle_select()

Message ID 7336733.EjMJpVF2xN@aspire.rjw.lan (mailing list archive)
State Superseded, archived

Commit Message

Rafael J. Wysocki March 22, 2018, 5:40 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Add a new pointer argument to cpuidle_select() and to the ->select
cpuidle governor callback to allow a boolean value indicating
whether or not the tick should be stopped before entering the
selected state to be returned from there.

Make the ladder governor ignore that pointer (to preserve its
current behavior) and make the menu governor return 'false' through
it if:
 (1) the idle exit latency is constrained at 0, or
 (2) the selected state is a polling one, or
 (3) the expected idle period duration is within the tick period
     range.

In addition to that, the correction factor computations in the menu
governor need to take the possibility that the tick may not be
stopped into account to avoid artificially small correction factor
values.  To that end, add a mechanism to record tick wakeups, as
suggested by Peter Zijlstra, and use it to modify the menu_update()
behavior when tick wakeup occurs.  Namely, if the CPU is woken up by
the tick and the return value of tick_nohz_get_sleep_length() is not
within the tick boundary, the predicted idle duration is likely too
short, so make menu_update() try to compensate for that by updating
the governor statistics as though the CPU was idle for a long time.

Since the value returned through the new argument pointer of
cpuidle_select() is not used by its caller yet, this change by
itself is not expected to alter the functionality of the code.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

One more revision here.

From Thomas Ilsche's testing on the Skylake server, it looks like
data->intervals[] needs to be updated along with the correction factor
on tick wakeups that occur when next_timer_us is above the tick boundary.

The difference between this and the original v7 (of patch [5/8]) is
what happens in menu_update().  This time next_timer_us is checked
properly, and if it is above the tick boundary and a tick wakeup occurs,
the function simply sets measured_us to a large constant and uses that to
update both the correction factor and data->intervals[] (the particular
value used in this patch was found through a bit of experimentation).

Let's see how this works for Thomas and Doug.

For easier testing there is a git branch containing this patch (and the
rest of the series) at:

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 idle-loop-v7.3

Thanks!

---
 drivers/cpuidle/cpuidle.c          |   10 +++++-
 drivers/cpuidle/governors/ladder.c |    3 +
 drivers/cpuidle/governors/menu.c   |   59 +++++++++++++++++++++++++++++--------
 include/linux/cpuidle.h            |    8 +++--
 include/linux/tick.h               |    2 +
 kernel/sched/idle.c                |    4 +-
 kernel/time/tick-sched.c           |   20 ++++++++++++
 7 files changed, 87 insertions(+), 19 deletions(-)

Comments

Thomas Ilsche March 28, 2018, 9:14 a.m. UTC | #1
On 2018-03-22 18:40, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Add a new pointer argument to cpuidle_select() and to the ->select
> cpuidle governor callback to allow a boolean value indicating
> whether or not the tick should be stopped before entering the
> selected state to be returned from there.
> 
> Make the ladder governor ignore that pointer (to preserve its
> current behavior) and make the menu governor return 'false' through
> it if:
>   (1) the idle exit latency is constrained at 0, or
>   (2) the selected state is a polling one, or
>   (3) the expected idle period duration is within the tick period
>       range.
> 
> In addition to that, the correction factor computations in the menu
> governor need to take the possibility that the tick may not be
> stopped into account to avoid artificially small correction factor
> values.  To that end, add a mechanism to record tick wakeups, as
> suggested by Peter Zijlstra, and use it to modify the menu_update()
> behavior when tick wakeup occurs.  Namely, if the CPU is woken up by
> the tick and the return value of tick_nohz_get_sleep_length() is not
> within the tick boundary, the predicted idle duration is likely too
> short, so make menu_update() try to compensate for that by updating
> the governor statistics as though the CPU was idle for a long time.
> 
> Since the value returned through the new argument pointer of
> cpuidle_select() is not used by its caller yet, this change by
> itself is not expected to alter the functionality of the code.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> One more revision here.
> 
> From Thomas Ilsche's testing on the Skylake server, it looks like
> data->intervals[] needs to be updated along with the correction factor
> on tick wakeups that occur when next_timer_us is above the tick boundary.
> 
> The difference between this and the original v7 (of patch [5/8]) is
> what happens in menu_update().  This time next_timer_us is checked
> properly and if that is above the tick boundary and a tick wakeup occurs,
> the function simply sets measured_us to a large constant and uses that to
> update both the correction factor and data->intervals[] (the particular
> value used in this patch was found through a bit of experimentation).
> 
> Let's see how this works for Thomas and Doug.
> 
> For easier testing there is a git branch containing this patch (and the
> rest of the series) at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
>   idle-loop-v7.3
> 
> Thanks!

Besides the other issue with tick_nohz_get_sleep_length, v7.3
generally works well in idle. So far I don't see anything
statistically noticeable, but I saw one peculiar anomaly. After all
cores woke up simultaneously to schedule some kworker task, some of
them kept the sched tick running and even stayed in a shallow sleep
state for a while, without having any tasks scheduled. Gradually they
chose deeper sleep states and stopped their sched ticks. After 23 ms
(1000 Hz kernel), they all went back to deep sleep.

https://wwwpub.zih.tu-dresden.de/~tilsche/powernightmares/v7_3_skl_sp_anomaly.png

I have only seen this once so far and can't reproduce it yet, so this
particular instance may not be an issue in practice. However, my
fundamental concerns about the policy of whether to disable the sched
tick remain:

Mixing the precise timer with a vague heuristic for this decision is
dangerous. The timer should not be wrong; the heuristic may be.

Decisions should use actual time points rather than the generic tick
duration and residency time. e.g.
       expected_interval < delta_next_us
vs
       expected_interval < TICK_USEC

In some cases the unmodified sched tick is not an efficient fallback.
Is it feasible to
1) enable the sched tick when it's currently disabled instead of
blindly choosing a different C state?
2) modify the next upcoming sched tick to be better suitable as
fallback timer?

I think with the infrastructure changes it should be possible to
implement the policy I envisioned previously
(https://marc.info/?l=linux-pm&m=151384941425947&w=2), which is based
on the ordering of timers and the heuristically predicted idle time.
If the sleep_length issue is fixed and I have some mechanism for a
modifiable fallback timer, I'll try to demonstrate that on top of your
changes.

Rafael J. Wysocki March 30, 2018, 9:39 a.m. UTC | #2
On Wednesday, March 28, 2018 11:14:36 AM CEST Thomas Ilsche wrote:
> On 2018-03-22 18:40, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Add a new pointer argument to cpuidle_select() and to the ->select
> > cpuidle governor callback to allow a boolean value indicating
> > whether or not the tick should be stopped before entering the
> > selected state to be returned from there.
> > 
> > Make the ladder governor ignore that pointer (to preserve its
> > current behavior) and make the menu governor return 'false' through
> > it if:
> >   (1) the idle exit latency is constrained at 0, or
> >   (2) the selected state is a polling one, or
> >   (3) the expected idle period duration is within the tick period
> >       range.
> > 
> > In addition to that, the correction factor computations in the menu
> > governor need to take the possibility that the tick may not be
> > stopped into account to avoid artificially small correction factor
> > values.  To that end, add a mechanism to record tick wakeups, as
> > suggested by Peter Zijlstra, and use it to modify the menu_update()
> > behavior when tick wakeup occurs.  Namely, if the CPU is woken up by
> > the tick and the return value of tick_nohz_get_sleep_length() is not
> > within the tick boundary, the predicted idle duration is likely too
> > short, so make menu_update() try to compensate for that by updating
> > the governor statistics as though the CPU was idle for a long time.
> > 
> > Since the value returned through the new argument pointer of
> > cpuidle_select() is not used by its caller yet, this change by
> > itself is not expected to alter the functionality of the code.
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> > 
> > One more revision here.
> > 
> > From Thomas Ilsche's testing on the Skylake server, it looks like
> > data->intervals[] needs to be updated along with the correction factor
> > on tick wakeups that occur when next_timer_us is above the tick boundary.
> > 
> > The difference between this and the original v7 (of patch [5/8]) is
> > what happens in menu_update().  This time next_timer_us is checked
> > properly and if that is above the tick boundary and a tick wakeup occurs,
> > the function simply sets measured_us to a large constant and uses that to
> > update both the correction factor and data->intervals[] (the particular
> > value used in this patch was found through a bit of experimentation).
> > 
> > Let's see how this works for Thomas and Doug.
> > 
> > For easier testing there is a git branch containing this patch (and the
> > rest of the series) at:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
> >   idle-loop-v7.3
> > 
> > Thanks!
> 
> Besides the other issue with tick_nohz_get_sleep_length, v7.3
> generally works well in idle.

Great, thanks!

> So far I don't see anything
> statistically noticeable, but I saw one peculiar anomaly. After all
> cores woke up simultaneously to schedule some kworker task, some of
> them kept the sched tick up, even stayed in shallow sleep state for a
> while, without having any tasks scheduled. Gradually they chose deeper
> sleep states and stopped their sched ticks. After 23 ms (1000 Hz
> kernel), they all went back to deep sleep.
> 
> https://wwwpub.zih.tu-dresden.de/~tilsche/powernightmares/v7_3_skl_sp_anomaly.png
> 
> I have only seen this once so far and can't reproduce it yet, so this
> particular instance may not be an issue in practice.

OK

> However, my fundamental concerns about the policy of whether to disable the
> sched tick remain:
> 
> Mixing the precise timer with a vague heuristic for this decision is
> dangerous. The timer should not be wrong; the heuristic may be.

Well, I wouldn't say "dangerous".  It may be suboptimal, but even that is not
a given.  Besides ->

> Decisions should use actual time points rather than the generic tick
> duration and residency time. e.g.
>        expected_interval < delta_next_us
> vs
>        expected_interval < TICK_USEC

-> the role of this check is to justify taking the overhead of stopping the
tick and it certainly is justifiable if the governor doesn't anticipate any
wakeups (timer or not) in the TICK_USEC range.  It may be justifiable in
other cases too, but that's a matter of some more complex checks and may not
be worth the extra complexity at all.

> In some cases the unmodified sched tick is not an efficient fallback.
> Is it feasible to
> 1) enable the sched tick when it's currently disabled instead of
> blindly choosing a different C state?

It is not "blindly" if you will.  It is very much "consciously". :-)

Restarting the tick from within menu_select() wouldn't work IMO and
restarting it from cpuidle_idle_call() every time it was stopped might
be wasteful.

It could be done, but AFAICS if the CPU in deep idle is woken up by an
occasional interrupt that doesn't set need_resched, it is more likely
to go into deep idle again than to go into shallow idle at that point.

> 2) modify the next upcoming sched tick to be better suitable as
> fallback timer?

I'm not sure what you mean.

> I think with the infrastructure changes it should be possible to
> implement the policy I envisioned previously
> (https://marc.info/?l=linux-pm&m=151384941425947&w=2), which is based
> on the ordering of timers and the heuristically predicted idle time.
> If the sleep_length issue is fixed and I have some mechanism for a
> modifiable fallback timer, I'll try to demonstrate that on top of your
> changes.

Sure.  I'm not against adding more complexity to this in principle, but there
needs to be a good enough justification for it.

As I said in one of the previous messages, if simple code gets the job done,
the extra complexity may just not be worth it.  That's why I went for very
simple code here.  Still, if there is a clear case for making it more complex,
that can be done.

Thanks!

Patch

Index: linux-pm/include/linux/cpuidle.h
===================================================================
--- linux-pm.orig/include/linux/cpuidle.h
+++ linux-pm/include/linux/cpuidle.h
@@ -135,7 +135,8 @@  extern bool cpuidle_not_available(struct
 				  struct cpuidle_device *dev);
 
 extern int cpuidle_select(struct cpuidle_driver *drv,
-			  struct cpuidle_device *dev);
+			  struct cpuidle_device *dev,
+			  bool *stop_tick);
 extern int cpuidle_enter(struct cpuidle_driver *drv,
 			 struct cpuidle_device *dev, int index);
 extern void cpuidle_reflect(struct cpuidle_device *dev, int index);
@@ -167,7 +168,7 @@  static inline bool cpuidle_not_available
 					 struct cpuidle_device *dev)
 {return true; }
 static inline int cpuidle_select(struct cpuidle_driver *drv,
-				 struct cpuidle_device *dev)
+				 struct cpuidle_device *dev, bool *stop_tick)
 {return -ENODEV; }
 static inline int cpuidle_enter(struct cpuidle_driver *drv,
 				struct cpuidle_device *dev, int index)
@@ -250,7 +251,8 @@  struct cpuidle_governor {
 					struct cpuidle_device *dev);
 
 	int  (*select)		(struct cpuidle_driver *drv,
-					struct cpuidle_device *dev);
+					struct cpuidle_device *dev,
+					bool *stop_tick);
 	void (*reflect)		(struct cpuidle_device *dev, int index);
 };
 
Index: linux-pm/kernel/sched/idle.c
===================================================================
--- linux-pm.orig/kernel/sched/idle.c
+++ linux-pm/kernel/sched/idle.c
@@ -188,13 +188,15 @@  static void cpuidle_idle_call(void)
 		next_state = cpuidle_find_deepest_state(drv, dev);
 		call_cpuidle(drv, dev, next_state);
 	} else {
+		bool stop_tick = true;
+
 		tick_nohz_idle_stop_tick();
 		rcu_idle_enter();
 
 		/*
 		 * Ask the cpuidle framework to choose a convenient idle state.
 		 */
-		next_state = cpuidle_select(drv, dev);
+		next_state = cpuidle_select(drv, dev, &stop_tick);
 		entered_state = call_cpuidle(drv, dev, next_state);
 		/*
 		 * Give the governor an opportunity to reflect on the outcome
Index: linux-pm/drivers/cpuidle/cpuidle.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/cpuidle.c
+++ linux-pm/drivers/cpuidle/cpuidle.c
@@ -272,12 +272,18 @@  int cpuidle_enter_state(struct cpuidle_d
  *
  * @drv: the cpuidle driver
  * @dev: the cpuidle device
+ * @stop_tick: indication on whether or not to stop the tick
  *
  * Returns the index of the idle state.  The return value must not be negative.
+ *
+ * The memory location pointed to by @stop_tick is expected to be written the
+ * 'false' boolean value if the scheduler tick should not be stopped before
+ * entering the returned state.
  */
-int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
+int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
+		   bool *stop_tick)
 {
-	return cpuidle_curr_governor->select(drv, dev);
+	return cpuidle_curr_governor->select(drv, dev, stop_tick);
 }
 
 /**
Index: linux-pm/drivers/cpuidle/governors/ladder.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/governors/ladder.c
+++ linux-pm/drivers/cpuidle/governors/ladder.c
@@ -63,9 +63,10 @@  static inline void ladder_do_selection(s
  * ladder_select_state - selects the next state to enter
  * @drv: cpuidle driver
  * @dev: the CPU
+ * @dummy: not used
  */
 static int ladder_select_state(struct cpuidle_driver *drv,
-				struct cpuidle_device *dev)
+			       struct cpuidle_device *dev, bool *dummy)
 {
 	struct ladder_device *ldev = this_cpu_ptr(&ladder_devices);
 	struct device *device = get_cpu_device(dev->cpu);
Index: linux-pm/drivers/cpuidle/governors/menu.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/governors/menu.c
+++ linux-pm/drivers/cpuidle/governors/menu.c
@@ -123,6 +123,7 @@ 
 struct menu_device {
 	int		last_state_idx;
 	int             needs_update;
+	int             tick_wakeup;
 
 	unsigned int	next_timer_us;
 	unsigned int	predicted_us;
@@ -279,8 +280,10 @@  again:
  * menu_select - selects the next idle state to enter
  * @drv: cpuidle driver containing state data
  * @dev: the CPU
+ * @stop_tick: indication on whether or not to stop the tick
  */
-static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
+static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
+		       bool *stop_tick)
 {
 	struct menu_device *data = this_cpu_ptr(&menu_devices);
 	struct device *device = get_cpu_device(dev->cpu);
@@ -303,8 +306,10 @@  static int menu_select(struct cpuidle_dr
 		latency_req = resume_latency;
 
 	/* Special case when user has set very strict latency requirement */
-	if (unlikely(latency_req == 0))
+	if (unlikely(latency_req == 0)) {
+		*stop_tick = false;
 		return 0;
+	}
 
 	/* determine the expected residency time, round up */
 	data->next_timer_us = ktime_to_us(tick_nohz_get_sleep_length());
@@ -354,6 +359,7 @@  static int menu_select(struct cpuidle_dr
 	if (latency_req > interactivity_req)
 		latency_req = interactivity_req;
 
+	expected_interval = data->predicted_us;
 	/*
 	 * Find the idle state with the lowest power while satisfying
 	 * our constraints.
@@ -369,15 +375,30 @@  static int menu_select(struct cpuidle_dr
 			idx = i; /* first enabled state */
 		if (s->target_residency > data->predicted_us)
 			break;
-		if (s->exit_latency > latency_req)
+		if (s->exit_latency > latency_req) {
+			/*
+			 * If we break out of the loop for latency reasons, use
+			 * the target residency of the selected state as the
+			 * expected idle duration so that the tick is retained
+			 * as long as that target residency is low enough.
+			 */
+			expected_interval = drv->states[idx].target_residency;
 			break;
-
+		}
 		idx = i;
 	}
 
 	if (idx == -1)
 		idx = 0; /* No states enabled. Must use 0. */
 
+	/*
+	 * Don't stop the tick if the selected state is a polling one or if the
+	 * expected idle duration is shorter than the tick period length.
+	 */
+	if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) ||
+	    expected_interval < TICK_USEC)
+		*stop_tick = false;
+
 	data->last_state_idx = idx;
 
 	return data->last_state_idx;
@@ -397,6 +418,7 @@  static void menu_reflect(struct cpuidle_
 
 	data->last_state_idx = index;
 	data->needs_update = 1;
+	data->tick_wakeup = tick_nohz_idle_got_tick();
 }
 
 /**
@@ -427,14 +449,27 @@  static void menu_update(struct cpuidle_d
 	 * assume the state was never reached and the exit latency is 0.
 	 */
 
-	/* measured value */
-	measured_us = cpuidle_get_last_residency(dev);
-
-	/* Deduct exit latency */
-	if (measured_us > 2 * target->exit_latency)
-		measured_us -= target->exit_latency;
-	else
-		measured_us /= 2;
+	if (data->tick_wakeup && data->next_timer_us > TICK_USEC) {
+		/*
+		 * The nohz code said that there wouldn't be any events within
+		 * the tick boundary (if the tick was stopped), but the idle
+		 * duration predictor had a differing opinion.  Since the CPU
+		 * was woken up by a tick (that wasn't stopped after all), the
+		 * predictor was not quite right, so assume that the CPU could
+		 * have been idle long (but not forever) to help the idle
+		 * duration predictor do a better job next time.
+		 */
+		measured_us = 9 * MAX_INTERESTING / 10;
+	} else {
+		/* measured value */
+		measured_us = cpuidle_get_last_residency(dev);
+
+		/* Deduct exit latency */
+		if (measured_us > 2 * target->exit_latency)
+			measured_us -= target->exit_latency;
+		else
+			measured_us /= 2;
+	}
 
 	/* Make sure our coefficients do not exceed unity */
 	if (measured_us > data->next_timer_us)
Index: linux-pm/kernel/time/tick-sched.c
===================================================================
--- linux-pm.orig/kernel/time/tick-sched.c
+++ linux-pm/kernel/time/tick-sched.c
@@ -991,6 +991,20 @@  void tick_nohz_irq_exit(void)
 }
 
 /**
+ * tick_nohz_idle_got_tick - Check whether or not the tick handler has run
+ */
+bool tick_nohz_idle_got_tick(void)
+{
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
+
+	if (ts->inidle > 1) {
+		ts->inidle = 1;
+		return true;
+	}
+	return false;
+}
+
+/**
  * tick_nohz_get_sleep_length - return the length of the current sleep
  *
  * Called from power state control code with interrupts disabled
@@ -1101,6 +1115,9 @@  static void tick_nohz_handler(struct clo
 	struct pt_regs *regs = get_irq_regs();
 	ktime_t now = ktime_get();
 
+	if (ts->inidle)
+		ts->inidle = 2;
+
 	dev->next_event = KTIME_MAX;
 
 	tick_sched_do_timer(now);
@@ -1198,6 +1215,9 @@  static enum hrtimer_restart tick_sched_t
 	struct pt_regs *regs = get_irq_regs();
 	ktime_t now = ktime_get();
 
+	if (ts->inidle)
+		ts->inidle = 2;
+
 	tick_sched_do_timer(now);
 
 	/*
Index: linux-pm/include/linux/tick.h
===================================================================
--- linux-pm.orig/include/linux/tick.h
+++ linux-pm/include/linux/tick.h
@@ -119,6 +119,7 @@  extern void tick_nohz_idle_restart_tick(
 extern void tick_nohz_idle_enter(void);
 extern void tick_nohz_idle_exit(void);
 extern void tick_nohz_irq_exit(void);
+extern bool tick_nohz_idle_got_tick(void);
 extern ktime_t tick_nohz_get_sleep_length(void);
 extern unsigned long tick_nohz_get_idle_calls(void);
 extern unsigned long tick_nohz_get_idle_calls_cpu(int cpu);
@@ -139,6 +140,7 @@  static inline void tick_nohz_idle_stop_t
 static inline void tick_nohz_idle_restart_tick(void) { }
 static inline void tick_nohz_idle_enter(void) { }
 static inline void tick_nohz_idle_exit(void) { }
+static inline bool tick_nohz_idle_got_tick(void) { return false; }
 
 static inline ktime_t tick_nohz_get_sleep_length(void)
 {