diff mbox

[1/5] drm/i915: Fix forcewake counts for gen8

Message ID 1392996723-8886-2-git-send-email-mika.kuoppala@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Mika Kuoppala Feb. 21, 2014, 3:31 p.m. UTC
From: Mika Kuoppala <mika.kuoppala@intel.com>

Sometimes generic driver code gets forcewake explicitly by
gen6_gt_force_wake_get(), which check forcewake_count before accessing
hardware. However the register access with gen8_write function access
low level hw accessors directly, ignoring the forcewake_count. This
leads to nested forcewake get from hardware, in ring init and possibly
elsewhere, causing forcewake ack clear errors and/or hangs.

Fix this by checking the forcewake count also in gen8_write

v2: Read side doesn't care about shadowed registers,
    Remove __needs_put funkiness from gen8_write. (Ville)
    Improved commit message.

References: https://bugs.freedesktop.org/show_bug.cgi?id=74007
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/intel_uncore.c |   19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

Comments

Ben Widawsky Feb. 21, 2014, 7:38 p.m. UTC | #1
On Fri, Feb 21, 2014 at 05:31:59PM +0200, mika.kuoppala@intel.com wrote:
> From: Mika Kuoppala <mika.kuoppala@intel.com>
> 
> Sometimes generic driver code gets forcewake explicitly by
> gen6_gt_force_wake_get(), which check forcewake_count before accessing
> hardware. However the register access with gen8_write function access
> low level hw accessors directly, ignoring the forcewake_count. This
> leads to nested forcewake get from hardware, in ring init and possibly
> elsewhere, causing forcewake ack clear errors and/or hangs.
> 
> Fix this by checking the forcewake count also in gen8_write
> 
> v2: Read side doesn't care about shadowed registers,
>     Remove __needs_put funkiness from gen8_write. (Ville)
>     Improved commit message.
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=74007
> Cc: Ben Widawsky <benjamin.widawsky@intel.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

For those concerned with the performance implication of the extra if
(if anyone at all cares, it's Chris) - I suppose we could also just add
the lock to gen6_gt_force_wake_get/put.

[snip]
Ben Widawsky Feb. 21, 2014, 7:42 p.m. UTC | #2
On Fri, Feb 21, 2014 at 11:38:48AM -0800, Ben Widawsky wrote:
> On Fri, Feb 21, 2014 at 05:31:59PM +0200, mika.kuoppala@intel.com wrote:
> > From: Mika Kuoppala <mika.kuoppala@intel.com>
> > 
> > Sometimes generic driver code gets forcewake explicitly by
> > gen6_gt_force_wake_get(), which check forcewake_count before accessing
> > hardware. However the register access with gen8_write function access
> > low level hw accessors directly, ignoring the forcewake_count. This
> > leads to nested forcewake get from hardware, in ring init and possibly
> > elsewhere, causing forcewake ack clear errors and/or hangs.
> > 
> > Fix this by checking the forcewake count also in gen8_write
> > 
> > v2: Read side doesn't care about shadowed registers,
> >     Remove __needs_put funkiness from gen8_write. (Ville)
> >     Improved commit message.
> > 
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=74007
> > Cc: Ben Widawsky <benjamin.widawsky@intel.com>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> > Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> 
> For those concerned with the performance implication of the extra if
> (if anyone at all cares, it's Chris) - I suppose we could also just add
> the lock to gen6_gt_force_wake_get/put.
> 
> [snip]

I should have also said, we have very few cases where it's actually a
problem (ie. not just done at init). And in fact, I am wondering now how
one could hit this at all without enabling rps.
Chris Wilson Feb. 21, 2014, 7:54 p.m. UTC | #3
On Fri, Feb 21, 2014 at 11:38:48AM -0800, Ben Widawsky wrote:
> On Fri, Feb 21, 2014 at 05:31:59PM +0200, mika.kuoppala@intel.com wrote:
> > From: Mika Kuoppala <mika.kuoppala@intel.com>
> > 
> > Sometimes generic driver code gets forcewake explicitly by
> > gen6_gt_force_wake_get(), which check forcewake_count before accessing
> > hardware. However the register access with gen8_write function access
> > low level hw accessors directly, ignoring the forcewake_count. This
> > leads to nested forcewake get from hardware, in ring init and possibly
> > elsewhere, causing forcewake ack clear errors and/or hangs.
> > 
> > Fix this by checking the forcewake count also in gen8_write
> > 
> > v2: Read side doesn't care about shadowed registers,
> >     Remove __needs_put funkiness from gen8_write. (Ville)
> >     Improved commit message.
> > 
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=74007
> > Cc: Ben Widawsky <benjamin.widawsky@intel.com>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> > Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> 
> For those concerned with the performance implication of the extra if
> (if anyone at all cares, it's Chris) - I suppose we could also just add
> the lock to gen6_gt_force_wake_get/put.

Nah, my goal is to minimise the number of register reads (and even the
writes since they are not light but end up involving register reads) out
of hotpaths (execbuf, interrupts, fences).
-Chris
Ville Syrjälä Feb. 24, 2014, 11:27 a.m. UTC | #4
On Fri, Feb 21, 2014 at 11:38:48AM -0800, Ben Widawsky wrote:
> On Fri, Feb 21, 2014 at 05:31:59PM +0200, mika.kuoppala@intel.com wrote:
> > From: Mika Kuoppala <mika.kuoppala@intel.com>
> > 
> > Sometimes generic driver code gets forcewake explicitly by
> > gen6_gt_force_wake_get(), which check forcewake_count before accessing
> > hardware. However the register access with gen8_write function access
> > low level hw accessors directly, ignoring the forcewake_count. This
> > leads to nested forcewake get from hardware, in ring init and possibly
> > elsewhere, causing forcewake ack clear errors and/or hangs.
> > 
> > Fix this by checking the forcewake count also in gen8_write
> > 
> > v2: Read side doesn't care about shadowed registers,
> >     Remove __needs_put funkiness from gen8_write. (Ville)
> >     Improved commit message.
> > 
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=74007
> > Cc: Ben Widawsky <benjamin.widawsky@intel.com>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> > Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> 
> For those concerned with the performance implication of the extra if
> (if anyone at all cares, it's Chris) - I suppose we could also just add
> the lock to gen6_gt_force_wake_get/put.

Two things:
a) I don't understand what you mean here. uncore.lock already protects
the forcewake count in gen6_gt_force_wake_get/put. Also there's no way
to avoid the branch here since doing a .force_wake_put() w/o checking
the forcewake count is never ok.

b) The cost of branch should be a drop in the ocean compared to the
cost of the register reads/writes in .forcewake_get/put.
Ben Widawsky Feb. 24, 2014, 8:29 p.m. UTC | #5
On Mon, Feb 24, 2014 at 01:27:09PM +0200, Ville Syrjälä wrote:
> On Fri, Feb 21, 2014 at 11:38:48AM -0800, Ben Widawsky wrote:
> > On Fri, Feb 21, 2014 at 05:31:59PM +0200, mika.kuoppala@intel.com wrote:
> > > From: Mika Kuoppala <mika.kuoppala@intel.com>
> > > 
> > > Sometimes generic driver code gets forcewake explicitly by
> > > gen6_gt_force_wake_get(), which check forcewake_count before accessing
> > > hardware. However the register access with gen8_write function access
> > > low level hw accessors directly, ignoring the forcewake_count. This
> > > leads to nested forcewake get from hardware, in ring init and possibly
> > > elsewhere, causing forcewake ack clear errors and/or hangs.
> > > 
> > > Fix this by checking the forcewake count also in gen8_write
> > > 
> > > v2: Read side doesn't care about shadowed registers,
> > >     Remove __needs_put funkiness from gen8_write. (Ville)
> > >     Improved commit message.
> > > 
> > > References: https://bugs.freedesktop.org/show_bug.cgi?id=74007
> > > Cc: Ben Widawsky <benjamin.widawsky@intel.com>
> > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> > > Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> > 
> > For those concerned with the performance implication of the extra if
> > (if anyone at all cares, it's Chris) - I suppose we could also just add
> > the lock to gen6_gt_force_wake_get/put.
> 
> Two things:
> a) I don't understand what you mean here. uncore.lock already protects
> the forcewake count in gen6_gt_force_wake_get/put. Also there's no way
> to avoid the branch here since doing a .force_wake_put() w/o checking
> the forcewake count is never ok.
> 

I meant acquire the lock in get() release the lock in put()

> b) The cost of branch should be a drop in the ocean compared to the
> cost of the register reads/writes in .forcewake_get/put.
> 
> -- 
> Ville Syrjälä
> Intel OTC
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index c628414..d1e9d63 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -634,16 +634,17 @@  static bool is_gen8_shadowed(struct drm_i915_private *dev_priv, u32 reg)
 #define __gen8_write(x) \
 static void \
 gen8_write##x(struct drm_i915_private *dev_priv, off_t reg, u##x val, bool trace) { \
-	bool __needs_put = reg < 0x40000 && !is_gen8_shadowed(dev_priv, reg); \
 	REG_WRITE_HEADER; \
-	if (__needs_put) { \
-		dev_priv->uncore.funcs.force_wake_get(dev_priv, \
-							FORCEWAKE_ALL); \
-	} \
-	__raw_i915_write##x(dev_priv, reg, val); \
-	if (__needs_put) { \
-		dev_priv->uncore.funcs.force_wake_put(dev_priv, \
-							FORCEWAKE_ALL); \
+	if (reg < 0x40000 && !is_gen8_shadowed(dev_priv, reg)) { \
+		if (dev_priv->uncore.forcewake_count == 0) \
+			dev_priv->uncore.funcs.force_wake_get(dev_priv,	\
+							      FORCEWAKE_ALL); \
+		__raw_i915_write##x(dev_priv, reg, val); \
+		if (dev_priv->uncore.forcewake_count == 0) \
+			dev_priv->uncore.funcs.force_wake_put(dev_priv, \
+							      FORCEWAKE_ALL); \
+	} else { \
+		__raw_i915_write##x(dev_priv, reg, val); \
 	} \
 	REG_WRITE_FOOTER; \
 }