Message ID | 1392996723-8886-2-git-send-email-mika.kuoppala@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Feb 21, 2014 at 05:31:59PM +0200, mika.kuoppala@intel.com wrote: > From: Mika Kuoppala <mika.kuoppala@intel.com> > > Sometimes generic driver code gets forcewake explicitly by > gen6_gt_force_wake_get(), which check forcewake_count before accessing > hardware. However the register access with gen8_write function access > low level hw accessors directly, ignoring the forcewake_count. This > leads to nested forcewake get from hardware, in ring init and possibly > elsewhere, causing forcewake ack clear errors and/or hangs. > > Fix this by checking the forcewake count also in gen8_write > > v2: Read side doesn't care about shadowed registers, > Remove __needs_put funkiness from gen8_write. (Ville) > Improved commit message. > > References: https://bugs.freedesktop.org/show_bug.cgi?id=74007 > Cc: Ben Widawsky <benjamin.widawsky@intel.com> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> > Reviewed-by: Ben Widawsky <ben@bwidawsk.net> For those concerned with the performance implication of the extra if (if anyone at all cares, it's Chris) - I suppose we could also just add the lock to gen6_gt_force_wake_get/put. [snip]
On Fri, Feb 21, 2014 at 11:38:48AM -0800, Ben Widawsky wrote: > On Fri, Feb 21, 2014 at 05:31:59PM +0200, mika.kuoppala@intel.com wrote: > > From: Mika Kuoppala <mika.kuoppala@intel.com> > > > > Sometimes generic driver code gets forcewake explicitly by > > gen6_gt_force_wake_get(), which check forcewake_count before accessing > > hardware. However the register access with gen8_write function access > > low level hw accessors directly, ignoring the forcewake_count. This > > leads to nested forcewake get from hardware, in ring init and possibly > > elsewhere, causing forcewake ack clear errors and/or hangs. > > > > Fix this by checking the forcewake count also in gen8_write > > > > v2: Read side doesn't care about shadowed registers, > > Remove __needs_put funkiness from gen8_write. (Ville) > > Improved commit message. > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=74007 > > Cc: Ben Widawsky <benjamin.widawsky@intel.com> > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> > > Reviewed-by: Ben Widawsky <ben@bwidawsk.net> > > For those concerned with the performance implication of the extra if > (if anyone at all cares, it's Chris) - I suppose we could also just add > the lock to gen6_gt_force_wake_get/put. > > [snip] I should have also said, we have very few cases where it's actually a problem (ie. not just done at init). And in fact, I am wondering now how one could hit this at all without enabling rps.
On Fri, Feb 21, 2014 at 11:38:48AM -0800, Ben Widawsky wrote: > On Fri, Feb 21, 2014 at 05:31:59PM +0200, mika.kuoppala@intel.com wrote: > > From: Mika Kuoppala <mika.kuoppala@intel.com> > > > > Sometimes generic driver code gets forcewake explicitly by > > gen6_gt_force_wake_get(), which check forcewake_count before accessing > > hardware. However the register access with gen8_write function access > > low level hw accessors directly, ignoring the forcewake_count. This > > leads to nested forcewake get from hardware, in ring init and possibly > > elsewhere, causing forcewake ack clear errors and/or hangs. > > > > Fix this by checking the forcewake count also in gen8_write > > > > v2: Read side doesn't care about shadowed registers, > > Remove __needs_put funkiness from gen8_write. (Ville) > > Improved commit message. > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=74007 > > Cc: Ben Widawsky <benjamin.widawsky@intel.com> > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> > > Reviewed-by: Ben Widawsky <ben@bwidawsk.net> > > For those concerned with the performance implication of the extra if > (if anyone at all cares, it's Chris) - I suppose we could also just add > the lock to gen6_gt_force_wake_get/put. Nah, my goal is to minimise the number of register reads (and even the writes since they are not light but end up involving register reads) out of hotpaths (execbuf, interrupts, fences). -Chris
On Fri, Feb 21, 2014 at 11:38:48AM -0800, Ben Widawsky wrote: > On Fri, Feb 21, 2014 at 05:31:59PM +0200, mika.kuoppala@intel.com wrote: > > From: Mika Kuoppala <mika.kuoppala@intel.com> > > > > Sometimes generic driver code gets forcewake explicitly by > > gen6_gt_force_wake_get(), which check forcewake_count before accessing > > hardware. However the register access with gen8_write function access > > low level hw accessors directly, ignoring the forcewake_count. This > > leads to nested forcewake get from hardware, in ring init and possibly > > elsewhere, causing forcewake ack clear errors and/or hangs. > > > > Fix this by checking the forcewake count also in gen8_write > > > > v2: Read side doesn't care about shadowed registers, > > Remove __needs_put funkiness from gen8_write. (Ville) > > Improved commit message. > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=74007 > > Cc: Ben Widawsky <benjamin.widawsky@intel.com> > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> > > Reviewed-by: Ben Widawsky <ben@bwidawsk.net> > > For those concerned with the performance implication of the extra if > (if anyone at all cares, it's Chris) - I suppose we could also just add > the lock to gen6_gt_force_wake_get/put. Two things: a) I don't understand what you mean here. uncore.lock already protects the forcewake count in gen6_gt_force_wake_get/put. Also there's no way to avoid the branch here since doing a .force_wake_put() w/o checking the forcewake count is never ok. b) The cost of branch should be a drop in the ocean compared to the cost of the register reads/writes in .forcewake_get/put.
On Mon, Feb 24, 2014 at 01:27:09PM +0200, Ville Syrjälä wrote: > On Fri, Feb 21, 2014 at 11:38:48AM -0800, Ben Widawsky wrote: > > On Fri, Feb 21, 2014 at 05:31:59PM +0200, mika.kuoppala@intel.com wrote: > > > From: Mika Kuoppala <mika.kuoppala@intel.com> > > > > > > Sometimes generic driver code gets forcewake explicitly by > > > gen6_gt_force_wake_get(), which check forcewake_count before accessing > > > hardware. However the register access with gen8_write function access > > > low level hw accessors directly, ignoring the forcewake_count. This > > > leads to nested forcewake get from hardware, in ring init and possibly > > > elsewhere, causing forcewake ack clear errors and/or hangs. > > > > > > Fix this by checking the forcewake count also in gen8_write > > > > > > v2: Read side doesn't care about shadowed registers, > > > Remove __needs_put funkiness from gen8_write. (Ville) > > > Improved commit message. > > > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=74007 > > > Cc: Ben Widawsky <benjamin.widawsky@intel.com> > > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> > > > Reviewed-by: Ben Widawsky <ben@bwidawsk.net> > > > > For those concerned with the performance implication of the extra if > > (if anyone at all cares, it's Chris) - I suppose we could also just add > > the lock to gen6_gt_force_wake_get/put. > > Two things: > a) I don't understand what you mean here. uncore.lock already protects > the forcewake count in gen6_gt_force_wake_get/put. Also there's no way > to avoid the branch here since doing a .force_wake_put() w/o checking > the forcewake count is never ok. > I meant acquire the lock in get() release the lock in put() > b) The cost of branch should be a drop in the ocean compared to the > cost of the register reads/writes in .forcewake_get/put. > > -- > Ville Syrjälä > Intel OTC
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index c628414..d1e9d63 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -634,16 +634,17 @@ static bool is_gen8_shadowed(struct drm_i915_private *dev_priv, u32 reg) #define __gen8_write(x) \ static void \ gen8_write##x(struct drm_i915_private *dev_priv, off_t reg, u##x val, bool trace) { \ - bool __needs_put = reg < 0x40000 && !is_gen8_shadowed(dev_priv, reg); \ REG_WRITE_HEADER; \ - if (__needs_put) { \ - dev_priv->uncore.funcs.force_wake_get(dev_priv, \ - FORCEWAKE_ALL); \ - } \ - __raw_i915_write##x(dev_priv, reg, val); \ - if (__needs_put) { \ - dev_priv->uncore.funcs.force_wake_put(dev_priv, \ - FORCEWAKE_ALL); \ + if (reg < 0x40000 && !is_gen8_shadowed(dev_priv, reg)) { \ + if (dev_priv->uncore.forcewake_count == 0) \ + dev_priv->uncore.funcs.force_wake_get(dev_priv, \ + FORCEWAKE_ALL); \ + __raw_i915_write##x(dev_priv, reg, val); \ + if (dev_priv->uncore.forcewake_count == 0) \ + dev_priv->uncore.funcs.force_wake_put(dev_priv, \ + FORCEWAKE_ALL); \ + } else { \ + __raw_i915_write##x(dev_priv, reg, val); \ } \ REG_WRITE_FOOTER; \ }