Message ID | 20161118133647.4868-4-david.weinehall@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Nov 18, 2016 at 03:36:47PM +0200, David Weinehall wrote: > Benchmarking shows that on resume we spend quite a bit of time > just taking and dropping these references, leaving us two options; > either rewriting the code not to take these references more than > once, which would be a rather invasive change since the involved > functions are used from other places, or to optimise > intel_runtime_pm_{get,put}(). This patch does the latter. > Initial benchmarking indicate improvements of a couple > of milliseconds on resume. > > Original patch by Chris, with slight fixes by me. > > v2: Fix missing return value (Patchwork) > Remove extra atomic_dec() (Chris) > > Signed-off-by: David Weinehall <david.weinehall@linux.intel.com> > CC: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@linux.intel.com> I'm happy with this. Not amused that it apparently saves quite a bit of overhead with frequent pm_runtime calls. Imre? -Chris
On Fri, Nov 18, 2016 at 02:00:40PM +0000, Chris Wilson wrote: > On Fri, Nov 18, 2016 at 03:36:47PM +0200, David Weinehall wrote: > > Benchmarking shows that on resume we spend quite a bit of time > > just taking and dropping these references, leaving us two options; > > either rewriting the code not to take these references more than > > once, which would be a rather invasive change since the involved > > functions are used from other places, or to optimise > > intel_runtime_pm_{get,put}(). This patch does the latter. > > Initial benchmarking indicate improvements of a couple > > of milliseconds on resume. > > > > Original patch by Chris, with slight fixes by me. > > > > v2: Fix missing return value (Patchwork) > > Remove extra atomic_dec() (Chris) > > > > Signed-off-by: David Weinehall <david.weinehall@linux.intel.com> > > CC: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Imre Deak <imre.deak@linux.intel.com> > > I'm happy with this. Not amused that it apparently saves quite a bit of > overhead with frequent pm_runtime calls. We could eliminate some of those calls entirely by moving them from intel_display_power_{get,put}() into the always on well enable/disable hooks. But I'm not sure how much this overhead originates from the power well code as opposed to some gem/etc. stuff. > > Imre? > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
On pe, 2016-11-18 at 14:00 +0000, Chris Wilson wrote: > On Fri, Nov 18, 2016 at 03:36:47PM +0200, David Weinehall wrote: > > Benchmarking shows that on resume we spend quite a bit of time > > just taking and dropping these references, leaving us two options; > > either rewriting the code not to take these references more than > > once, which would be a rather invasive change since the involved > > functions are used from other places, or to optimise > > intel_runtime_pm_{get,put}(). This patch does the latter. > > Initial benchmarking indicate improvements of a couple > > of milliseconds on resume. > > > > Original patch by Chris, with slight fixes by me. > > > > v2: Fix missing return value (Patchwork) > > Remove extra atomic_dec() (Chris) > > > > Signed-off-by: David Weinehall <david.weinehall@linux.intel.com> > > CC: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Imre Deak <imre.deak@linux.intel.com> > > I'm happy with this. Not amused that it apparently saves quite a bit > of > overhead with frequent pm_runtime calls. > > Imre? I think the overhead is because the RPM core takes a lock and checks if the device needs to be woken up even if the runtime_usage count is 0. But for us the device is awake whenever wakeref_count > 0, so yes we can optimize things. > -Chris >
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c index 356c662ad453..831fde343d10 100644 --- a/drivers/gpu/drm/i915/intel_runtime_pm.c +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c @@ -2632,6 +2632,9 @@ void intel_runtime_pm_get(struct drm_i915_private *dev_priv) struct pci_dev *pdev = dev_priv->drm.pdev; struct device *kdev = &pdev->dev; + if (atomic_inc_not_zero(&dev_priv->pm.wakeref_count)) + return; + pm_runtime_get_sync(kdev); atomic_inc(&dev_priv->pm.wakeref_count); @@ -2653,6 +2656,9 @@ bool intel_runtime_pm_get_if_in_use(struct drm_i915_private *dev_priv) struct pci_dev *pdev = dev_priv->drm.pdev; struct device *kdev = &pdev->dev; + if (atomic_inc_not_zero(&dev_priv->pm.wakeref_count)) + return true; + if (IS_ENABLED(CONFIG_PM)) { int ret = pm_runtime_get_if_in_use(kdev); @@ -2695,6 +2701,9 @@ void intel_runtime_pm_get_noresume(struct drm_i915_private *dev_priv) struct pci_dev *pdev = dev_priv->drm.pdev; struct device *kdev = &pdev->dev; + if (atomic_inc_not_zero(&dev_priv->pm.wakeref_count)) + return; + assert_rpm_wakelock_held(dev_priv); pm_runtime_get_noresume(kdev); @@ -2714,8 +2723,10 @@ void intel_runtime_pm_put(struct drm_i915_private *dev_priv) struct pci_dev *pdev = dev_priv->drm.pdev; struct device *kdev = &pdev->dev; + if (!atomic_dec_and_test(&dev_priv->pm.wakeref_count)) + return; + assert_rpm_wakelock_held(dev_priv); - atomic_dec(&dev_priv->pm.wakeref_count); pm_runtime_mark_last_busy(kdev); pm_runtime_put_autosuspend(kdev);
Benchmarking shows that on resume we spend quite a bit of time just taking and dropping these references, leaving us two options; either rewriting the code not to take these references more than once, which would be a rather invasive change since the involved functions are used from other places, or to optimise intel_runtime_pm_{get,put}(). This patch does the latter. Initial benchmarking indicate improvements of a couple of milliseconds on resume. Original patch by Chris, with slight fixes by me. v2: Fix missing return value (Patchwork) Remove extra atomic_dec() (Chris) Signed-off-by: David Weinehall <david.weinehall@linux.intel.com> CC: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/intel_runtime_pm.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)