diff mbox

[RFT,v4,3/7] sched: idle: Do not stop the tick before cpuidle_idle_call()

Message ID 3550231.xNMJN5JcLx@aspire.rjw.lan (mailing list archive)
State Superseded, archived
Delegated to: Rafael Wysocki
Headers show

Commit Message

Rafael J. Wysocki March 12, 2018, 9:53 a.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Make cpuidle_idle_call() decide whether or not to stop the tick.

First, the cpuidle_enter_s2idle() path deals with the tick (and with
the entire timekeeping for that matter) by itself and it doesn't need
the tick to be stopped beforehand.

Second, to address the issue with short idle duration predictions
by the idle governor after the tick has been stopped, it will be
necessary to change the ordering of cpuidle_select() with respect
to tick_nohz_idle_stop_tick().  To prepare for that, put a
tick_nohz_idle_stop_tick() call in the same branch in which
cpuidle_select() is called.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 kernel/sched/idle.c |   25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

Comments

Frederic Weisbecker March 15, 2018, 6:19 p.m. UTC | #1
On Mon, Mar 12, 2018 at 10:53:25AM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Make cpuidle_idle_call() decide whether or not to stop the tick.
> 
> First, the cpuidle_enter_s2idle() path deals with the tick (and with
> the entire timekeeping for that matter) by itself and it doesn't need
> the tick to be stopped beforehand.

Not sure you meant timekeeping either :)

>  	if (idle_should_enter_s2idle() || dev->use_deepest_state) {
>  		if (idle_should_enter_s2idle()) {
> +			rcu_idle_enter();
> +
>  			entered_state = cpuidle_enter_s2idle(drv, dev);
>  			if (entered_state > 0) {
>  				local_irq_enable();
>  				goto exit_idle;
>  			}
> +
> +			rcu_idle_exit();
>  		}

I'm not sure how the tick is stopped on suspend to idle. Perhaps through
hrtimer (tick_cancel_sched_timer()) or clockevents code.

But we may have a similar problem than with idle_poll() called right after
call_cpuidle(). Ie: we arrive in cpuidle_enter_s2idle() with a tick that
should be reprogrammed while it is not. No idea if that can hurt somehow.

I guess it depends what happens to the tick on s2idle, I'm not clear with that.
Rafael J. Wysocki March 15, 2018, 8:41 p.m. UTC | #2
On Thu, Mar 15, 2018 at 7:19 PM, Frederic Weisbecker
<frederic@kernel.org> wrote:
> On Mon, Mar 12, 2018 at 10:53:25AM +0100, Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> Make cpuidle_idle_call() decide whether or not to stop the tick.
>>
>> First, the cpuidle_enter_s2idle() path deals with the tick (and with
>> the entire timekeeping for that matter) by itself and it doesn't need
>> the tick to be stopped beforehand.
>
> Not sure you meant timekeeping either :)

Yeah, I meant nohz.

>>       if (idle_should_enter_s2idle() || dev->use_deepest_state) {
>>               if (idle_should_enter_s2idle()) {
>> +                     rcu_idle_enter();
>> +
>>                       entered_state = cpuidle_enter_s2idle(drv, dev);
>>                       if (entered_state > 0) {
>>                               local_irq_enable();
>>                               goto exit_idle;
>>                       }
>> +
>> +                     rcu_idle_exit();
>>               }
>
> I'm not sure how the tick is stopped on suspend to idle. Perhaps through
> hrtimer (tick_cancel_sched_timer()) or clockevents code.

The latter.

It does clockevents_shutdown() down the road, eventually.

IOW, it couldn't care less. :-)

> But we may have a similar problem than with idle_poll() called right after
> call_cpuidle(). Ie: we arrive in cpuidle_enter_s2idle() with a tick that
> should be reprogrammed while it is not. No idea if that can hurt somehow.
>
> I guess it depends what happens to the tick on s2idle, I'm not clear with that.

No problem there, AFAICS.
Rafael J. Wysocki March 15, 2018, 9:12 p.m. UTC | #3
On Thu, Mar 15, 2018 at 9:41 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Thu, Mar 15, 2018 at 7:19 PM, Frederic Weisbecker
> <frederic@kernel.org> wrote:
>> On Mon, Mar 12, 2018 at 10:53:25AM +0100, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> Make cpuidle_idle_call() decide whether or not to stop the tick.
>>>
>>> First, the cpuidle_enter_s2idle() path deals with the tick (and with
>>> the entire timekeeping for that matter) by itself and it doesn't need
>>> the tick to be stopped beforehand.
>>
>> Not sure you meant timekeeping either :)
>
> Yeah, I meant nohz.

Well, not really. :-)

It is the entire timekeeping this time, as it can be suspended
entirely in that code path.
Frederic Weisbecker March 16, 2018, 2:16 p.m. UTC | #4
On Thu, Mar 15, 2018 at 09:41:10PM +0100, Rafael J. Wysocki wrote:
> On Thu, Mar 15, 2018 at 7:19 PM, Frederic Weisbecker
> <frederic@kernel.org> wrote:
> > On Mon, Mar 12, 2018 at 10:53:25AM +0100, Rafael J. Wysocki wrote:
> >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>
> >> Make cpuidle_idle_call() decide whether or not to stop the tick.
> >>
> >> First, the cpuidle_enter_s2idle() path deals with the tick (and with
> >> the entire timekeeping for that matter) by itself and it doesn't need
> >> the tick to be stopped beforehand.
> >
> > Not sure you meant timekeeping either :)
> 
> Yeah, I meant nohz.
> 
> >>       if (idle_should_enter_s2idle() || dev->use_deepest_state) {
> >>               if (idle_should_enter_s2idle()) {
> >> +                     rcu_idle_enter();
> >> +
> >>                       entered_state = cpuidle_enter_s2idle(drv, dev);
> >>                       if (entered_state > 0) {
> >>                               local_irq_enable();
> >>                               goto exit_idle;
> >>                       }
> >> +
> >> +                     rcu_idle_exit();
> >>               }
> >
> > I'm not sure how the tick is stopped on suspend to idle. Perhaps through
> > hrtimer (tick_cancel_sched_timer()) or clockevents code.
> 
> The latter.
> 
> It does clockevents_shutdown() down the road, eventually.

Ah good. And I see tick_resume_oneshot() takes care of restoring if necessary.

> 
> IOW, it couldn't care less. :-)
> 
> > But we may have a similar problem than with idle_poll() called right after
> > call_cpuidle(). Ie: we arrive in cpuidle_enter_s2idle() with a tick that
> > should be reprogrammed while it is not. No idea if that can hurt somehow.
> >
> > I guess it depends what happens to the tick on s2idle, I'm not clear with that.
> 
> No problem there, AFAICS.

Yep, all good.

Thanks!
Frederic Weisbecker March 16, 2018, 2:17 p.m. UTC | #5
On Thu, Mar 15, 2018 at 10:12:57PM +0100, Rafael J. Wysocki wrote:
> On Thu, Mar 15, 2018 at 9:41 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> > On Thu, Mar 15, 2018 at 7:19 PM, Frederic Weisbecker
> > <frederic@kernel.org> wrote:
> >> On Mon, Mar 12, 2018 at 10:53:25AM +0100, Rafael J. Wysocki wrote:
> >>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>
> >>> Make cpuidle_idle_call() decide whether or not to stop the tick.
> >>>
> >>> First, the cpuidle_enter_s2idle() path deals with the tick (and with
> >>> the entire timekeeping for that matter) by itself and it doesn't need
> >>> the tick to be stopped beforehand.
> >>
> >> Not sure you meant timekeeping either :)
> >
> > Yeah, I meant nohz.
> 
> Well, not really. :-)
> 
> It is the entire timekeeping this time, as it can be suspended
> entirely in that code path.

Fair point.

Thanks!
diff mbox

Patch

Index: linux-pm/kernel/sched/idle.c
===================================================================
--- linux-pm.orig/kernel/sched/idle.c
+++ linux-pm/kernel/sched/idle.c
@@ -146,13 +146,15 @@  static void cpuidle_idle_call(void)
 	}
 
 	/*
-	 * Tell the RCU framework we are entering an idle section,
-	 * so no more rcu read side critical sections and one more
+	 * The RCU framework needs to be told that we are entering an idle
+	 * section, so no more rcu read side critical sections and one more
 	 * step to the grace period
 	 */
-	rcu_idle_enter();
 
 	if (cpuidle_not_available(drv, dev)) {
+		tick_nohz_idle_stop_tick();
+		rcu_idle_enter();
+
 		default_idle_call();
 		goto exit_idle;
 	}
@@ -169,16 +171,26 @@  static void cpuidle_idle_call(void)
 
 	if (idle_should_enter_s2idle() || dev->use_deepest_state) {
 		if (idle_should_enter_s2idle()) {
+			rcu_idle_enter();
+
 			entered_state = cpuidle_enter_s2idle(drv, dev);
 			if (entered_state > 0) {
 				local_irq_enable();
 				goto exit_idle;
 			}
+
+			rcu_idle_exit();
 		}
 
+		tick_nohz_idle_stop_tick();
+		rcu_idle_enter();
+
 		next_state = cpuidle_find_deepest_state(drv, dev);
 		call_cpuidle(drv, dev, next_state);
 	} else {
+		tick_nohz_idle_stop_tick();
+		rcu_idle_enter();
+
 		/*
 		 * Ask the cpuidle framework to choose a convenient idle state.
 		 */
@@ -241,12 +253,11 @@  static void do_idle(void)
 		 * broadcast device expired for us, we don't want to go deep
 		 * idle as we know that the IPI is going to arrive right away.
 		 */
-		if (cpu_idle_force_poll || tick_check_broadcast_expired()) {
+		if (cpu_idle_force_poll || tick_check_broadcast_expired())
 			cpu_idle_poll();
-		} else {
-			tick_nohz_idle_stop_tick();
+		else
 			cpuidle_idle_call();
-		}
+
 		arch_cpu_idle_exit();
 	}