diff mbox

drm/i915: add flags to i915_ring_stop

Message ID 1396023498-8635-1-git-send-email-mika.kuoppala@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Mika Kuoppala March 28, 2014, 4:18 p.m. UTC
Piglit runner and QA are both looking at the dmesg for
DRM_ERRORs with test cases. Add a flag to control those
when we they are expected from related test cases.

Also add flag to control if contexts should be banned
that introduced the hang. Hangcheck is timer based and
preventing bans by adding sleeps to testcases makes
testing slower.

v2: intel_ring_stopped(), readable comment (Chris)
v3: keep compatibility (Daniel)

References: https://bugs.freedesktop.org/show_bug.cgi?id=75876
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |   20 ++++++++++++++++++--
 drivers/gpu/drm/i915/i915_gem.c         |    5 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.c |    8 ++++++--
 3 files changed, 27 insertions(+), 6 deletions(-)

Comments

Chris Wilson March 28, 2014, 5:04 p.m. UTC | #1
On Fri, Mar 28, 2014 at 06:18:18PM +0200, Mika Kuoppala wrote:
> Piglit runner and QA are both looking at the dmesg for
> DRM_ERRORs with test cases. Add a flag to control those
> when we they are expected from related test cases.
> 
> Also add flag to control if contexts should be banned
> that introduced the hang. Hangcheck is timer based and
> preventing bans by adding sleeps to testcases makes
> testing slower.
> 
> v2: intel_ring_stopped(), readable comment (Chris)
> v3: keep compatibility (Daniel)

Do we really want to keep backwards compatibility with the igt testsuite
that currently fails through debugfs?
 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=75876
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>

However, it is not a battle worth fighting, so either way,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Daniel Vetter April 9, 2014, 1:09 p.m. UTC | #2
On Fri, Mar 28, 2014 at 05:04:25PM +0000, Chris Wilson wrote:
> On Fri, Mar 28, 2014 at 06:18:18PM +0200, Mika Kuoppala wrote:
> > Piglit runner and QA are both looking at the dmesg for
> > DRM_ERRORs with test cases. Add a flag to control those
> > when we they are expected from related test cases.
> > 
> > Also add flag to control if contexts should be banned
> > that introduced the hang. Hangcheck is timer based and
> > preventing bans by adding sleeps to testcases makes
> > testing slower.
> > 
> > v2: intel_ring_stopped(), readable comment (Chris)
> > v3: keep compatibility (Daniel)
> 
> Do we really want to keep backwards compatibility with the igt testsuite
> that currently fails through debugfs?

I'm leaning towards keeping things working if it's not too much fuzz.
Yeah, currently igt is a bit botched, but with this negative flag you can
still run older igt on newer kernels and the other way round, withing
limits of things simply being broken.

>  
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=75876
> > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> 
> However, it is not a battle worth fighting, so either way,
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Sorry that I've missed this, please ping me next time around there's a
patch ready to be picked up. Merged to dinq, thanks.

Please push the relevant igt patches, last time I've looked the interfaces
and docs and all that looked excellent.

Cheers, Daniel
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c697204..7b80a55 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1102,8 +1102,12 @@  struct i915_gpu_error {
 	 */
 	wait_queue_head_t reset_queue;
 
-	/* For gpu hang simulation. */
-	unsigned int stop_rings;
+	/* Userspace knobs for gpu hang simulation;
+	 * combines both a ring mask, and extra flags
+	 */
+	u32 stop_rings;
+#define I915_STOP_RING_ALLOW_BAN       (1 << 31)
+#define I915_STOP_RING_ALLOW_WARN      (1 << 30)
 
 	/* For missed irq/seqno simulation. */
 	unsigned int test_irq_rings;
@@ -2149,6 +2153,18 @@  static inline u32 i915_reset_count(struct i915_gpu_error *error)
 	return ((atomic_read(&error->reset_counter) & ~I915_WEDGED) + 1) / 2;
 }
 
+static inline bool i915_stop_ring_allow_ban(struct drm_i915_private *dev_priv)
+{
+	return dev_priv->gpu_error.stop_rings == 0 ||
+		dev_priv->gpu_error.stop_rings & I915_STOP_RING_ALLOW_BAN;
+}
+
+static inline bool i915_stop_ring_allow_warn(struct drm_i915_private *dev_priv)
+{
+	return dev_priv->gpu_error.stop_rings == 0 ||
+		dev_priv->gpu_error.stop_rings & I915_STOP_RING_ALLOW_WARN;
+}
+
 void i915_gem_reset(struct drm_device *dev);
 bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
 int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 33bbaa0..4dde5b1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2277,8 +2277,9 @@  static bool i915_context_is_banned(struct drm_i915_private *dev_priv,
 		if (!i915_gem_context_is_default(ctx)) {
 			DRM_DEBUG("context hanging too fast, banning!\n");
 			return true;
-		} else if (dev_priv->gpu_error.stop_rings == 0) {
-			DRM_ERROR("gpu hanging too fast, banning!\n");
+		} else if (i915_stop_ring_allow_ban(dev_priv)) {
+			if (i915_stop_ring_allow_warn(dev_priv))
+				DRM_ERROR("gpu hanging too fast, banning!\n");
 			return true;
 		}
 	}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 913b8ab..9c0c292 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -41,12 +41,16 @@  static inline int ring_space(struct intel_ring_buffer *ring)
 	return space;
 }
 
-void __intel_ring_advance(struct intel_ring_buffer *ring)
+static bool intel_ring_stopped(struct intel_ring_buffer *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	return dev_priv->gpu_error.stop_rings & intel_ring_flag(ring);
+}
 
+void __intel_ring_advance(struct intel_ring_buffer *ring)
+{
 	ring->tail &= ring->size - 1;
-	if (dev_priv->gpu_error.stop_rings & intel_ring_flag(ring))
+	if (intel_ring_stopped(ring))
 		return;
 	ring->write_tail(ring, ring->tail);
 }