Message ID | 6542020.eHGLEK9V0J@aspire.rjw.lan (mailing list archive) |
---|---|
State | Mainlined |
Delegated to: | Rafael Wysocki |
Headers | show |
On Wed, Apr 04, 2018 at 10:50:36AM +0200, Rafael J. Wysocki wrote: > + if (tick_nohz_tick_stopped()) { > + /* > + * If the tick is already stopped, the cost of possible short > + * idle duration misprediction is much higher, because the CPU > + * may be stuck in a shallow idle state for a long time as a > + * result of it. In that case say we might mispredict and try > + * to force the CPU into a state for which we would have stopped > + * the tick, unless the tick timer is going to expire really > + * soon anyway. Wait what; the tick was stopped, therefore it _cannot_ expire soon. *confused* Did you mean s/tick/a/ ? > + */ > + if (data->predicted_us < TICK_USEC) > + data->predicted_us = min_t(unsigned int, TICK_USEC, > + ktime_to_us(delta_next)); > + } else { > + /* > + * Use the performance multiplier and the user-configurable > + * latency_req to determine the maximum exit latency. > + */ > + interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load); > + if (latency_req > interactivity_req) > + latency_req = interactivity_req; > + } > > expected_interval = data->predicted_us; > /* >
On Thu, Apr 5, 2018 at 2:47 PM, Peter Zijlstra <peterz@infradead.org> wrote: > On Wed, Apr 04, 2018 at 10:50:36AM +0200, Rafael J. Wysocki wrote: >> + if (tick_nohz_tick_stopped()) { >> + /* >> + * If the tick is already stopped, the cost of possible short >> + * idle duration misprediction is much higher, because the CPU >> + * may be stuck in a shallow idle state for a long time as a >> + * result of it. In that case say we might mispredict and try >> + * to force the CPU into a state for which we would have stopped >> + * the tick, unless the tick timer is going to expire really >> + * soon anyway. > > Wait what; the tick was stopped, therefore it _cannot_ expire soon. > > *confused* > > Did you mean s/tick/a/ ? Yeah, that should be "a timer".
On Thu, Apr 05, 2018 at 03:49:32PM +0200, Rafael J. Wysocki wrote: > On Thu, Apr 5, 2018 at 2:47 PM, Peter Zijlstra <peterz@infradead.org> wrote: > > On Wed, Apr 04, 2018 at 10:50:36AM +0200, Rafael J. Wysocki wrote: > >> + if (tick_nohz_tick_stopped()) { > >> + /* > >> + * If the tick is already stopped, the cost of possible short > >> + * idle duration misprediction is much higher, because the CPU > >> + * may be stuck in a shallow idle state for a long time as a > >> + * result of it. In that case say we might mispredict and try > >> + * to force the CPU into a state for which we would have stopped > >> + * the tick, unless the tick timer is going to expire really > >> + * soon anyway. > > > > Wait what; the tick was stopped, therefore it _cannot_ expire soon. > > > > *confused* > > > > Did you mean s/tick/a/ ? > > Yeah, that should be "a timer". *phew* ok, that makes a lot more sense ;-) My only concern with this is that we can now be overly pessimistic. The predictor might know that statistically it's very likely a device interrupt will arrive soon, but because the tick is already disabled, we don't dare trust it, causing possible excessive latencies. Would an alternative be to make @stop_tick be an enum capable of forcing the tick back on? enum tick_action { NOHZ_TICK_STOP, NOHZ_TICK_RETAIN, NOHZ_TICK_START, }; enum tick_action tick_action = NOHZ_TICK_STOP; state = cpuidle_select(..., &tick_action); switch (tick_action) { case NOHZ_TICK_STOP: tick_nohz_stop_tick(); break; case NOHZ_TICK_RETAIN: tick_nozh_retain_tick(); break; case NOHZ_TICK_START: tick_nohz_start_tick(); break; }; Or something along those lines?
On Thu, Apr 05, 2018 at 04:11:27PM +0200, Peter Zijlstra wrote: > On Thu, Apr 05, 2018 at 03:49:32PM +0200, Rafael J. Wysocki wrote: > > On Thu, Apr 5, 2018 at 2:47 PM, Peter Zijlstra <peterz@infradead.org> wrote: > > > On Wed, Apr 04, 2018 at 10:50:36AM +0200, Rafael J. Wysocki wrote: > > >> + if (tick_nohz_tick_stopped()) { > > >> + /* > > >> + * If the tick is already stopped, the cost of possible short > > >> + * idle duration misprediction is much higher, because the CPU > > >> + * may be stuck in a shallow idle state for a long time as a > > >> + * result of it. In that case say we might mispredict and try > > >> + * to force the CPU into a state for which we would have stopped > > >> + * the tick, unless the tick timer is going to expire really > > >> + * soon anyway. > > > > > > Wait what; the tick was stopped, therefore it _cannot_ expire soon. > > > > > > *confused* > > > > > > Did you mean s/tick/a/ ? > > > > Yeah, that should be "a timer". > > *phew* ok, that makes a lot more sense ;-) > > My only concern with this is that we can now be overly pessimistic. The > predictor might know that statistically it's very likely a device > interrupt will arrive soon, but because the tick is already disabled, we > don't dare trust it, causing possible excessive latencies. > > Would an alternative be to make @stop_tick be an enum capable of forcing > the tick back on? > > enum tick_action { > NOHZ_TICK_STOP, > NOHZ_TICK_RETAIN, > NOHZ_TICK_START, > }; > > enum tick_action tick_action = NOHZ_TICK_STOP; > > state = cpuidle_select(..., &tick_action); > > switch (tick_action) { > case NOHZ_TICK_STOP: > tick_nohz_stop_tick(); > break; > > case NOHZ_TICK_RETAIN: > tick_nozh_retain_tick(); > break; > > case NOHZ_TICK_START: > tick_nohz_start_tick(); > break; > }; > > > Or something along those lines? To clarify, RETAIN keeps the status quo, if its off, it stays off, if its on it stays on.
On Thu, Apr 5, 2018 at 4:13 PM, Peter Zijlstra <peterz@infradead.org> wrote: > On Thu, Apr 05, 2018 at 04:11:27PM +0200, Peter Zijlstra wrote: >> On Thu, Apr 05, 2018 at 03:49:32PM +0200, Rafael J. Wysocki wrote: >> > On Thu, Apr 5, 2018 at 2:47 PM, Peter Zijlstra <peterz@infradead.org> wrote: >> > > On Wed, Apr 04, 2018 at 10:50:36AM +0200, Rafael J. Wysocki wrote: >> > >> + if (tick_nohz_tick_stopped()) { >> > >> + /* >> > >> + * If the tick is already stopped, the cost of possible short >> > >> + * idle duration misprediction is much higher, because the CPU >> > >> + * may be stuck in a shallow idle state for a long time as a >> > >> + * result of it. In that case say we might mispredict and try >> > >> + * to force the CPU into a state for which we would have stopped >> > >> + * the tick, unless the tick timer is going to expire really >> > >> + * soon anyway. >> > > >> > > Wait what; the tick was stopped, therefore it _cannot_ expire soon. >> > > >> > > *confused* >> > > >> > > Did you mean s/tick/a/ ? >> > >> > Yeah, that should be "a timer". >> >> *phew* ok, that makes a lot more sense ;-) >> >> My only concern with this is that we can now be overly pessimistic. The >> predictor might know that statistically it's very likely a device >> interrupt will arrive soon, but because the tick is already disabled, we >> don't dare trust it, causing possible excessive latencies. That's possible, but then we already stopped the tick before and the CPU was in a deep idle state (or we wouldn't have got here with the tick stopped), so the extra bit of latency coming from this is not likely to matter IMO. And the code can stay simpler this way. :-) >> Would an alternative be to make @stop_tick be an enum capable of forcing >> the tick back on? >> >> enum tick_action { >> NOHZ_TICK_STOP, >> NOHZ_TICK_RETAIN, >> NOHZ_TICK_START, >> }; >> >> enum tick_action tick_action = NOHZ_TICK_STOP; >> >> state = cpuidle_select(..., &tick_action); >> >> switch (tick_action) { >> case NOHZ_TICK_STOP: >> tick_nohz_stop_tick(); >> break; >> >> case NOHZ_TICK_RETAIN: >> tick_nozh_retain_tick(); >> break; >> >> case NOHZ_TICK_START: >> tick_nohz_start_tick(); >> break; >> }; >> >> >> Or something along those lines? > > To clarify, RETAIN keeps the status quo, if its off, it stays off, if > its on it stays on. That could be done, but I'm not sure if the menu governor has a good way to decide whether to use NOHZ_TICK_RETAIN or NOHZ_TICK_START. Doing NOHZ_TICK_START every time the predicted idle duration is within the tick range may be wasteful. Besides, enum tick_action and so on can be introduced later if really needed, but for now it looks like the simpler code gets the job done.
Index: linux-pm/drivers/cpuidle/governors/menu.c =================================================================== --- linux-pm.orig/drivers/cpuidle/governors/menu.c +++ linux-pm/drivers/cpuidle/governors/menu.c @@ -352,13 +352,28 @@ static int menu_select(struct cpuidle_dr */ data->predicted_us = min(data->predicted_us, expected_interval); - /* - * Use the performance multiplier and the user-configurable - * latency_req to determine the maximum exit latency. - */ - interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load); - if (latency_req > interactivity_req) - latency_req = interactivity_req; + if (tick_nohz_tick_stopped()) { + /* + * If the tick is already stopped, the cost of possible short + * idle duration misprediction is much higher, because the CPU + * may be stuck in a shallow idle state for a long time as a + * result of it. In that case say we might mispredict and try + * to force the CPU into a state for which we would have stopped + * the tick, unless the tick timer is going to expire really + * soon anyway. + */ + if (data->predicted_us < TICK_USEC) + data->predicted_us = min_t(unsigned int, TICK_USEC, + ktime_to_us(delta_next)); + } else { + /* + * Use the performance multiplier and the user-configurable + * latency_req to determine the maximum exit latency. + */ + interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load); + if (latency_req > interactivity_req) + latency_req = interactivity_req; + } expected_interval = data->predicted_us; /*