Message ID | 20180410103440.24048-1-tvrtko.ursulin@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Quoting Tvrtko Ursulin (2018-04-10 11:34:40) > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > While thinking about sporadic failures of perf_pmu/rc6-runtime-pm* tests > on some CI machines I have concluded that: a) the PMU readout of RC6 can > race against runtime PM transitions, and b) there are other reasons than > being runtime suspended which can cause intel_runtime_pm_get_if_in_use to > fail. > > Therefore when estimating RC6 the code needs to assert we are indeed in > suspended state and if not the best we can do is return the last known RC6 > value. > > v2: > * Re-arrange the code a bit to avoid second unlock and return branch. > (Chris Wilson) > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > Fixes: 1fe699e30113 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout") > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105010 > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Imre Deak <imre.deak@intel.com> > --- > drivers/gpu/drm/i915/i915_pmu.c | 39 +++++++++++++++++++++++++++------------ > 1 file changed, 27 insertions(+), 12 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c > index bd7e695fc663..247a050f816e 100644 > --- a/drivers/gpu/drm/i915/i915_pmu.c > +++ b/drivers/gpu/drm/i915/i915_pmu.c > @@ -473,20 +473,35 @@ static u64 get_rc6(struct drm_i915_private *i915) > spin_lock_irqsave(&i915->pmu.lock, flags); > spin_lock(&kdev->power.lock); > > - if (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) > - i915->pmu.suspended_jiffies_last = > - kdev->power.suspended_jiffies; > - > - val = kdev->power.suspended_jiffies - > - i915->pmu.suspended_jiffies_last; > - val += jiffies - kdev->power.accounting_timestamp; > + /* > + * After the above branch intel_runtime_pm_get_if_in_use failed > + * to get the runtime PM reference we cannot assume we are in > + * runtime suspend since we can either: a) race with coming out > + * of it before we took the power.lock, or b) there are other > + * states than suspended which can bring us here. > + * > + * We need to double-check that we are indeed currently runtime > + * suspended and if not we cannot do better than report the last > + * known RC6 value. > + */ > + if (kdev->power.runtime_status == RPM_SUSPENDED) { > + if (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) > + i915->pmu.suspended_jiffies_last = > + kdev->power.suspended_jiffies; > + > + val = kdev->power.suspended_jiffies - > + i915->pmu.suspended_jiffies_last; > + val += jiffies - kdev->power.accounting_timestamp; Keep the line of white space here? Breaks up the computation of val from the power timestamps with the accumulation of sample. > + val = jiffies_to_nsecs(val); > + val += i915->pmu.sample[__I915_SAMPLE_RC6].cur; Then probably one more line of whitespace here to break up the computation of val with it's assignment. We definitely want the assignment on this branch to stand out from the no assignments later. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> -Chris
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index bd7e695fc663..247a050f816e 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -473,20 +473,35 @@ static u64 get_rc6(struct drm_i915_private *i915) spin_lock_irqsave(&i915->pmu.lock, flags); spin_lock(&kdev->power.lock); - if (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) - i915->pmu.suspended_jiffies_last = - kdev->power.suspended_jiffies; - - val = kdev->power.suspended_jiffies - - i915->pmu.suspended_jiffies_last; - val += jiffies - kdev->power.accounting_timestamp; + /* + * After the above branch intel_runtime_pm_get_if_in_use failed + * to get the runtime PM reference we cannot assume we are in + * runtime suspend since we can either: a) race with coming out + * of it before we took the power.lock, or b) there are other + * states than suspended which can bring us here. + * + * We need to double-check that we are indeed currently runtime + * suspended and if not we cannot do better than report the last + * known RC6 value. + */ + if (kdev->power.runtime_status == RPM_SUSPENDED) { + if (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) + i915->pmu.suspended_jiffies_last = + kdev->power.suspended_jiffies; + + val = kdev->power.suspended_jiffies - + i915->pmu.suspended_jiffies_last; + val += jiffies - kdev->power.accounting_timestamp; + val = jiffies_to_nsecs(val); + val += i915->pmu.sample[__I915_SAMPLE_RC6].cur; + i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val; + } else if (i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) { + val = i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur; + } else { + val = i915->pmu.sample[__I915_SAMPLE_RC6].cur; + } spin_unlock(&kdev->power.lock); - - val = jiffies_to_nsecs(val); - val += i915->pmu.sample[__I915_SAMPLE_RC6].cur; - i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val; - spin_unlock_irqrestore(&i915->pmu.lock, flags); }