Message ID | 20170420130813.h7dycr5cptbrvdkz@hirez.programming.kicks-ass.net (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Thu, Apr 20, 2017 at 03:08:13PM +0200, Peter Zijlstra wrote: > On Thu, Apr 20, 2017 at 03:44:47PM +0300, ville.syrjala@linux.intel.com wrote: > > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > > This reverts commit e93e59ce5b85e6c2b444f09fd1f707274ec066dc. > > > > The TSC stops in deeper C states, > > On some old hardware (Core2 era and before) only. You've forgotten to > mention what hardware you've observed problems with. Yeah, Core2 is what I used when I finally decided to bisect this. I've been plagued by the bogus powertop numbers on many machines, most likely all of them were of some older vintage. > > > so using local_clock() in cpuidle > > But on said hardware, local_clock() isn't an immediate TSC user. > > > to track the C state residency seems like a bad idea. With local_clock() > > powertop is reporting mostly 0% residency for C states here. Presumably > > the core is still spending most of its time in some deep C-state since > > the totals typically add up to only 5% or so, so perhaps the governor > > isn't getting totally confused by these bogus numbers. But let's go > > back to using ktime_get() as that at least works correctly across the > > board. > > Does this cure it? It does indeed. Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > > --- > drivers/cpuidle/cpuidle.c | 2 ++ > kernel/sched/clock.c | 7 +++---- > 2 files changed, 5 insertions(+), 4 deletions(-) > > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c > index 548b90be7685..e0d4ad108887 100644 > --- a/drivers/cpuidle/cpuidle.c > +++ b/drivers/cpuidle/cpuidle.c > @@ -219,6 +219,8 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, > entered_state = target_state->enter(dev, drv, index); > start_critical_timings(); > > + sched_clock_idle_wakeup_event(0); > + > time_end = ns_to_ktime(local_clock()); > trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu); > > diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c > index 00a45c45beca..15e848706be4 100644 > --- a/kernel/sched/clock.c > +++ b/kernel/sched/clock.c > @@ -347,6 +347,9 @@ void sched_clock_tick(void) > { > struct sched_clock_data *scd; > > + if (timekeeping_suspended) > + return; > + > WARN_ON_ONCE(!irqs_disabled()); > > /* > @@ -378,11 +381,7 @@ EXPORT_SYMBOL_GPL(sched_clock_idle_sleep_event); > */ > void sched_clock_idle_wakeup_event(u64 delta_ns) > { > - if (timekeeping_suspended) > - return; > - > sched_clock_tick(); > - touch_softlockup_watchdog_sched(); > } > EXPORT_SYMBOL_GPL(sched_clock_idle_wakeup_event); >
On Thu, Apr 20, 2017 at 04:43:45PM +0300, Ville Syrjälä wrote: > On Thu, Apr 20, 2017 at 03:08:13PM +0200, Peter Zijlstra wrote: > > On Thu, Apr 20, 2017 at 03:44:47PM +0300, ville.syrjala@linux.intel.com wrote: > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > > > > This reverts commit e93e59ce5b85e6c2b444f09fd1f707274ec066dc. > > > > > > The TSC stops in deeper C states, > > > > On some old hardware (Core2 era and before) only. You've forgotten to > > mention what hardware you've observed problems with. > > Yeah, Core2 is what I used when I finally decided to bisect this. I've > been plagued by the bogus powertop numbers on many machines, most > likely all of them were of some older vintage. > Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> OK, thanks. I'm currently chasing some other Core2 issue that is somewhat related. See: http://lkml.kernel.org/r/20170413132349.thxkwptdymsfsyxb@hirez.programming.kicks-ass.net Once I have that sorted I'll post both patches.
On Thu, Apr 20, 2017 at 03:08:13PM +0200, Peter Zijlstra wrote: > On Thu, Apr 20, 2017 at 03:44:47PM +0300, ville.syrjala@linux.intel.com wrote: > > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > > This reverts commit e93e59ce5b85e6c2b444f09fd1f707274ec066dc. > > > > The TSC stops in deeper C states, > > On some old hardware (Core2 era and before) only. You've forgotten to > mention what hardware you've observed problems with. > > > so using local_clock() in cpuidle > > But on said hardware, local_clock() isn't an immediate TSC user. > > > to track the C state residency seems like a bad idea. With local_clock() > > powertop is reporting mostly 0% residency for C states here. Presumably > > the core is still spending most of its time in some deep C-state since > > the totals typically add up to only 5% or so, so perhaps the governor > > isn't getting totally confused by these bogus numbers. But let's go > > back to using ktime_get() as that at least works correctly across the > > board. > > Does this cure it? > > --- > drivers/cpuidle/cpuidle.c | 2 ++ > kernel/sched/clock.c | 7 +++---- > 2 files changed, 5 insertions(+), 4 deletions(-) > > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c > index 548b90be7685..e0d4ad108887 100644 > --- a/drivers/cpuidle/cpuidle.c > +++ b/drivers/cpuidle/cpuidle.c > @@ -219,6 +219,8 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, > entered_state = target_state->enter(dev, drv, index); > start_critical_timings(); > > + sched_clock_idle_wakeup_event(0); > + Is it planned to skip this if the tsc is reliable? > time_end = ns_to_ktime(local_clock()); > trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu); > > diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c > index 00a45c45beca..15e848706be4 100644 > --- a/kernel/sched/clock.c > +++ b/kernel/sched/clock.c > @@ -347,6 +347,9 @@ void sched_clock_tick(void) > { > struct sched_clock_data *scd; > > + if (timekeeping_suspended) > + return; > + > WARN_ON_ONCE(!irqs_disabled()); > > /* > @@ -378,11 +381,7 @@ EXPORT_SYMBOL_GPL(sched_clock_idle_sleep_event); > */ > void sched_clock_idle_wakeup_event(u64 delta_ns) > { > - if (timekeeping_suspended) > - return; > - > sched_clock_tick(); > - touch_softlockup_watchdog_sched(); > } > EXPORT_SYMBOL_GPL(sched_clock_idle_wakeup_event); >
On Thu, Apr 20, 2017 at 04:37:51PM +0200, Daniel Lezcano wrote: > > > > + sched_clock_idle_wakeup_event(0); > > + > > Is it planned to skip this if the tsc is reliable? Yes. Current code doesn't quite do that, but if you follow that link I send earlier, I'm about to fix that (again). Now, if only this Core2 piece of crap would actually boot a recent kernel, I could go figure out wth is wrong.. :/
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 548b90be7685..e0d4ad108887 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -219,6 +219,8 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, entered_state = target_state->enter(dev, drv, index); start_critical_timings(); + sched_clock_idle_wakeup_event(0); + time_end = ns_to_ktime(local_clock()); trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu); diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c index 00a45c45beca..15e848706be4 100644 --- a/kernel/sched/clock.c +++ b/kernel/sched/clock.c @@ -347,6 +347,9 @@ void sched_clock_tick(void) { struct sched_clock_data *scd; + if (timekeeping_suspended) + return; + WARN_ON_ONCE(!irqs_disabled()); /* @@ -378,11 +381,7 @@ EXPORT_SYMBOL_GPL(sched_clock_idle_sleep_event); */ void sched_clock_idle_wakeup_event(u64 delta_ns) { - if (timekeeping_suspended) - return; - sched_clock_tick(); - touch_softlockup_watchdog_sched(); } EXPORT_SYMBOL_GPL(sched_clock_idle_wakeup_event);