diff mbox

[v9,10/10] cpuidle: menu: Avoid selecting shallow states with stopped tick

Message ID 6542020.eHGLEK9V0J@aspire.rjw.lan (mailing list archive)
State Mainlined
Delegated to: Rafael Wysocki
Headers show

Commit Message

Rafael J. Wysocki April 4, 2018, 8:50 a.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

If the scheduler tick has been stopped already and the governor
selects a shallow idle state, the CPU can spend a long time in that
state if the selection is based on an inaccurate prediction of idle
time.  That effect turns out to be relevant, so it needs to be
mitigated.

To that end, modify the menu governor to discard the result of the
idle time prediction if the tick is stopped and the predicted idle
time is less than the tick period length, unless the tick timer is
going to expire soon.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v8 -> v9: No changes.

---
 drivers/cpuidle/governors/menu.c |   29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

Comments

Peter Zijlstra April 5, 2018, 12:47 p.m. UTC | #1
On Wed, Apr 04, 2018 at 10:50:36AM +0200, Rafael J. Wysocki wrote:
> +	if (tick_nohz_tick_stopped()) {
> +		/*
> +		 * If the tick is already stopped, the cost of possible short
> +		 * idle duration misprediction is much higher, because the CPU
> +		 * may be stuck in a shallow idle state for a long time as a
> +		 * result of it.  In that case say we might mispredict and try
> +		 * to force the CPU into a state for which we would have stopped
> +		 * the tick, unless the tick timer is going to expire really
> +		 * soon anyway.

Wait what; the tick was stopped, therefore it _cannot_ expire soon.

*confused*

Did you mean s/tick/a/ ?

> +		 */
> +		if (data->predicted_us < TICK_USEC)
> +			data->predicted_us = min_t(unsigned int, TICK_USEC,
> +						   ktime_to_us(delta_next));
> +	} else {
> +		/*
> +		 * Use the performance multiplier and the user-configurable
> +		 * latency_req to determine the maximum exit latency.
> +		 */
> +		interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
> +		if (latency_req > interactivity_req)
> +			latency_req = interactivity_req;
> +	}
>  
>  	expected_interval = data->predicted_us;
>  	/*
>
Rafael J. Wysocki April 5, 2018, 1:49 p.m. UTC | #2
On Thu, Apr 5, 2018 at 2:47 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Apr 04, 2018 at 10:50:36AM +0200, Rafael J. Wysocki wrote:
>> +     if (tick_nohz_tick_stopped()) {
>> +             /*
>> +              * If the tick is already stopped, the cost of possible short
>> +              * idle duration misprediction is much higher, because the CPU
>> +              * may be stuck in a shallow idle state for a long time as a
>> +              * result of it.  In that case say we might mispredict and try
>> +              * to force the CPU into a state for which we would have stopped
>> +              * the tick, unless the tick timer is going to expire really
>> +              * soon anyway.
>
> Wait what; the tick was stopped, therefore it _cannot_ expire soon.
>
> *confused*
>
> Did you mean s/tick/a/ ?

Yeah, that should be "a timer".
Peter Zijlstra April 5, 2018, 2:11 p.m. UTC | #3
On Thu, Apr 05, 2018 at 03:49:32PM +0200, Rafael J. Wysocki wrote:
> On Thu, Apr 5, 2018 at 2:47 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Wed, Apr 04, 2018 at 10:50:36AM +0200, Rafael J. Wysocki wrote:
> >> +     if (tick_nohz_tick_stopped()) {
> >> +             /*
> >> +              * If the tick is already stopped, the cost of possible short
> >> +              * idle duration misprediction is much higher, because the CPU
> >> +              * may be stuck in a shallow idle state for a long time as a
> >> +              * result of it.  In that case say we might mispredict and try
> >> +              * to force the CPU into a state for which we would have stopped
> >> +              * the tick, unless the tick timer is going to expire really
> >> +              * soon anyway.
> >
> > Wait what; the tick was stopped, therefore it _cannot_ expire soon.
> >
> > *confused*
> >
> > Did you mean s/tick/a/ ?
> 
> Yeah, that should be "a timer".

*phew* ok, that makes a lot more sense ;-)

My only concern with this is that we can now be overly pessimistic. The
predictor might know that statistically it's very likely a device
interrupt will arrive soon, but because the tick is already disabled, we
don't dare trust it, causing possible excessive latencies.

Would an alternative be to make @stop_tick be an enum capable of forcing
the tick back on?

enum tick_action {
	NOHZ_TICK_STOP,
	NOHZ_TICK_RETAIN,
	NOHZ_TICK_START,
};

	enum tick_action tick_action = NOHZ_TICK_STOP;

	state = cpuidle_select(..., &tick_action);

	switch (tick_action) {
	case NOHZ_TICK_STOP:
		tick_nohz_stop_tick();
		break;

	case NOHZ_TICK_RETAIN:
		tick_nozh_retain_tick();
		break;

	case NOHZ_TICK_START:
		tick_nohz_start_tick();
		break;
	};


Or something along those lines?
Peter Zijlstra April 5, 2018, 2:13 p.m. UTC | #4
On Thu, Apr 05, 2018 at 04:11:27PM +0200, Peter Zijlstra wrote:
> On Thu, Apr 05, 2018 at 03:49:32PM +0200, Rafael J. Wysocki wrote:
> > On Thu, Apr 5, 2018 at 2:47 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> > > On Wed, Apr 04, 2018 at 10:50:36AM +0200, Rafael J. Wysocki wrote:
> > >> +     if (tick_nohz_tick_stopped()) {
> > >> +             /*
> > >> +              * If the tick is already stopped, the cost of possible short
> > >> +              * idle duration misprediction is much higher, because the CPU
> > >> +              * may be stuck in a shallow idle state for a long time as a
> > >> +              * result of it.  In that case say we might mispredict and try
> > >> +              * to force the CPU into a state for which we would have stopped
> > >> +              * the tick, unless the tick timer is going to expire really
> > >> +              * soon anyway.
> > >
> > > Wait what; the tick was stopped, therefore it _cannot_ expire soon.
> > >
> > > *confused*
> > >
> > > Did you mean s/tick/a/ ?
> > 
> > Yeah, that should be "a timer".
> 
> *phew* ok, that makes a lot more sense ;-)
> 
> My only concern with this is that we can now be overly pessimistic. The
> predictor might know that statistically it's very likely a device
> interrupt will arrive soon, but because the tick is already disabled, we
> don't dare trust it, causing possible excessive latencies.
> 
> Would an alternative be to make @stop_tick be an enum capable of forcing
> the tick back on?
> 
> enum tick_action {
> 	NOHZ_TICK_STOP,
> 	NOHZ_TICK_RETAIN,
> 	NOHZ_TICK_START,
> };
> 
> 	enum tick_action tick_action = NOHZ_TICK_STOP;
> 
> 	state = cpuidle_select(..., &tick_action);
> 
> 	switch (tick_action) {
> 	case NOHZ_TICK_STOP:
> 		tick_nohz_stop_tick();
> 		break;
> 
> 	case NOHZ_TICK_RETAIN:
> 		tick_nozh_retain_tick();
> 		break;
> 
> 	case NOHZ_TICK_START:
> 		tick_nohz_start_tick();
> 		break;
> 	};
> 
> 
> Or something along those lines?

To clarify, RETAIN keeps the status quo, if its off, it stays off, if
its on it stays on.
Rafael J. Wysocki April 5, 2018, 2:29 p.m. UTC | #5
On Thu, Apr 5, 2018 at 4:13 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Apr 05, 2018 at 04:11:27PM +0200, Peter Zijlstra wrote:
>> On Thu, Apr 05, 2018 at 03:49:32PM +0200, Rafael J. Wysocki wrote:
>> > On Thu, Apr 5, 2018 at 2:47 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>> > > On Wed, Apr 04, 2018 at 10:50:36AM +0200, Rafael J. Wysocki wrote:
>> > >> +     if (tick_nohz_tick_stopped()) {
>> > >> +             /*
>> > >> +              * If the tick is already stopped, the cost of possible short
>> > >> +              * idle duration misprediction is much higher, because the CPU
>> > >> +              * may be stuck in a shallow idle state for a long time as a
>> > >> +              * result of it.  In that case say we might mispredict and try
>> > >> +              * to force the CPU into a state for which we would have stopped
>> > >> +              * the tick, unless the tick timer is going to expire really
>> > >> +              * soon anyway.
>> > >
>> > > Wait what; the tick was stopped, therefore it _cannot_ expire soon.
>> > >
>> > > *confused*
>> > >
>> > > Did you mean s/tick/a/ ?
>> >
>> > Yeah, that should be "a timer".
>>
>> *phew* ok, that makes a lot more sense ;-)
>>
>> My only concern with this is that we can now be overly pessimistic. The
>> predictor might know that statistically it's very likely a device
>> interrupt will arrive soon, but because the tick is already disabled, we
>> don't dare trust it, causing possible excessive latencies.

That's possible, but then we already stopped the tick before and the
CPU was in a deep idle state (or we wouldn't have got here with the
tick stopped), so the extra bit of latency coming from this is not
likely to matter IMO.

And the code can stay simpler this way. :-)

>> Would an alternative be to make @stop_tick be an enum capable of forcing
>> the tick back on?
>>
>> enum tick_action {
>>       NOHZ_TICK_STOP,
>>       NOHZ_TICK_RETAIN,
>>       NOHZ_TICK_START,
>> };
>>
>>       enum tick_action tick_action = NOHZ_TICK_STOP;
>>
>>       state = cpuidle_select(..., &tick_action);
>>
>>       switch (tick_action) {
>>       case NOHZ_TICK_STOP:
>>               tick_nohz_stop_tick();
>>               break;
>>
>>       case NOHZ_TICK_RETAIN:
>>               tick_nozh_retain_tick();
>>               break;
>>
>>       case NOHZ_TICK_START:
>>               tick_nohz_start_tick();
>>               break;
>>       };
>>
>>
>> Or something along those lines?
>
> To clarify, RETAIN keeps the status quo, if its off, it stays off, if
> its on it stays on.

That could be done, but I'm not sure if the menu governor has a good
way to decide whether to use NOHZ_TICK_RETAIN or NOHZ_TICK_START.

Doing NOHZ_TICK_START every time the predicted idle duration is within
the tick range may be wasteful.

Besides, enum tick_action and so on can be introduced later if really
needed, but for now it looks like the simpler code gets the job done.
diff mbox

Patch

Index: linux-pm/drivers/cpuidle/governors/menu.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/governors/menu.c
+++ linux-pm/drivers/cpuidle/governors/menu.c
@@ -352,13 +352,28 @@  static int menu_select(struct cpuidle_dr
 	 */
 	data->predicted_us = min(data->predicted_us, expected_interval);
 
-	/*
-	 * Use the performance multiplier and the user-configurable
-	 * latency_req to determine the maximum exit latency.
-	 */
-	interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
-	if (latency_req > interactivity_req)
-		latency_req = interactivity_req;
+	if (tick_nohz_tick_stopped()) {
+		/*
+		 * If the tick is already stopped, the cost of possible short
+		 * idle duration misprediction is much higher, because the CPU
+		 * may be stuck in a shallow idle state for a long time as a
+		 * result of it.  In that case say we might mispredict and try
+		 * to force the CPU into a state for which we would have stopped
+		 * the tick, unless the tick timer is going to expire really
+		 * soon anyway.
+		 */
+		if (data->predicted_us < TICK_USEC)
+			data->predicted_us = min_t(unsigned int, TICK_USEC,
+						   ktime_to_us(delta_next));
+	} else {
+		/*
+		 * Use the performance multiplier and the user-configurable
+		 * latency_req to determine the maximum exit latency.
+		 */
+		interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
+		if (latency_req > interactivity_req)
+			latency_req = interactivity_req;
+	}
 
 	expected_interval = data->predicted_us;
 	/*