diff mbox series

drm/i915: Kick waiters on resetting legacy rings

Message ID 20180814104056.27001-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show
Series drm/i915: Kick waiters on resetting legacy rings | expand

Commit Message

Chris Wilson Aug. 14, 2018, 10:40 a.m. UTC
This reapplies commit 39f3be162c46 ("drm/i915: Kick waiters on resetting
legacy rings") after the improved gem_eio was run across all machines we
found that gen3 and early gen4 still lost the immediate interrupt
following reset, and the HWSTAM w/a applied to gen6+ is inadequate.

Unlike the later gen, on gen3/4 the principle (and only tests to fail so
far) are the wait vs reset test cases, whereas the reset stress case
works fine (which was the predominantly failing case for gen6+). That is
enough to suggest the underlying issue is sufficiently different to
support the difference in HWSTAM efficacy.

Testcase: igt/gem_eio/wait-10ms
References: 39f3be162c46 ("drm/i915: Kick waiters on resetting legacy rings")
References: a69ab52b0358 ("drm/i915: Remove extra waiter kick on legacy resets")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Mika Kuoppala Aug. 14, 2018, 11:38 a.m. UTC | #1
Chris Wilson <chris@chris-wilson.co.uk> writes:

> This reapplies commit 39f3be162c46 ("drm/i915: Kick waiters on resetting
> legacy rings") after the improved gem_eio was run across all machines we
> found that gen3 and early gen4 still lost the immediate interrupt
> following reset, and the HWSTAM w/a applied to gen6+ is inadequate.
>
> Unlike the later gen, on gen3/4 the principle (and only tests to fail so
> far) are the wait vs reset test cases, whereas the reset stress case
> works fine (which was the predominantly failing case for gen6+). That is
> enough to suggest the underlying issue is sufficiently different to
> support the difference in HWSTAM efficacy.
>
> Testcase: igt/gem_eio/wait-10ms
> References: 39f3be162c46 ("drm/i915: Kick waiters on resetting legacy rings")
> References: a69ab52b0358 ("drm/i915: Remove extra waiter kick on legacy resets")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index b65cf7832b39..d40f55a8dc34 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -537,6 +537,8 @@ static int init_ring_common(struct intel_engine_cs *engine)
>  	if (INTEL_GEN(dev_priv) > 2)
>  		I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING));
>  
> +	/* Papering over lost _interrupts_ immediately following the restart */
> +	intel_engine_wakeup(engine);

We came, we toggled, we learned. Onwards!

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

>  out:
>  	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>  
> -- 
> 2.18.0
Chris Wilson Aug. 14, 2018, 11:46 a.m. UTC | #2
Quoting Mika Kuoppala (2018-08-14 12:38:18)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > This reapplies commit 39f3be162c46 ("drm/i915: Kick waiters on resetting
> > legacy rings") after the improved gem_eio was run across all machines we
> > found that gen3 and early gen4 still lost the immediate interrupt
> > following reset, and the HWSTAM w/a applied to gen6+ is inadequate.
> >
> > Unlike the later gen, on gen3/4 the principle (and only tests to fail so
> > far) are the wait vs reset test cases, whereas the reset stress case
> > works fine (which was the predominantly failing case for gen6+). That is
> > enough to suggest the underlying issue is sufficiently different to
> > support the difference in HWSTAM efficacy.
> >
> > Testcase: igt/gem_eio/wait-10ms
> > References: 39f3be162c46 ("drm/i915: Kick waiters on resetting legacy rings")
> > References: a69ab52b0358 ("drm/i915: Remove extra waiter kick on legacy resets")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Matthew Auld <matthew.auld@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index b65cf7832b39..d40f55a8dc34 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -537,6 +537,8 @@ static int init_ring_common(struct intel_engine_cs *engine)
> >       if (INTEL_GEN(dev_priv) > 2)
> >               I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING));
> >  
> > +     /* Papering over lost _interrupts_ immediately following the restart */
> > +     intel_engine_wakeup(engine);
> 
> We came, we toggled, we learned. Onwards!

The pessimist in me says if this again boils down to a timing bug, we
need to find another w/a. It's a shame that HWSTAM wasn't it (even
remembering the bit differences between gen, as a last resort I tested
with HWSTAM==0 i.e. no mask).

Until then, ignorance is bliss.
-Chris
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index b65cf7832b39..d40f55a8dc34 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -537,6 +537,8 @@  static int init_ring_common(struct intel_engine_cs *engine)
 	if (INTEL_GEN(dev_priv) > 2)
 		I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING));
 
+	/* Papering over lost _interrupts_ immediately following the restart */
+	intel_engine_wakeup(engine);
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);