diff mbox series

[1/2] drm/i915/execlists: Clear STOP_RING bit before restoring the context

Message ID 20180814144120.18714-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show
Series [1/2] drm/i915/execlists: Clear STOP_RING bit before restoring the context | expand

Commit Message

Chris Wilson Aug. 14, 2018, 2:41 p.m. UTC
Before a reset, we set the STOP_RING bit of RING_MI_MODE to freeze the
engine. However, sometimes we observe that upon restart, the engine
freezes again with the STOP_RING bit still asserted. By inspection, we
know that the register itself is cleared by the GPU reset, so that bit
must be preserved inside the context image and reloaded from there. A
simple fix (as the RING_MI_MODE is at a fixed offset in a valid context)
is to clobber the STOP_RING bit inside the image before replaying the
request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: MichaƂ Winiarski <michal.winiarski@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c     | 17 +++++++++++++++--
 drivers/gpu/drm/i915/intel_lrc_reg.h |  2 ++
 2 files changed, 17 insertions(+), 2 deletions(-)

Comments

Chris Wilson Aug. 14, 2018, 4:08 p.m. UTC | #1
Quoting Patchwork (2018-08-14 16:53:29)
> == Series Details ==
> 
> Series: series starting with [1/2] drm/i915/execlists: Clear STOP_RING bit before restoring the context
> URL   : https://patchwork.freedesktop.org/series/48195/
> State : success
> 
> == Summary ==
> 
> = CI Bug Log - changes from CI_DRM_4666 -> Patchwork_9942 =
> 
> == Summary - SUCCESS ==
> 
>   No regressions found.
> 
>   External URL: https://patchwork.freedesktop.org/api/1.0/series/48195/revisions/1/mbox/
> 
> == Known issues ==
> 
>   Here are the changes found in Patchwork_9942 that come from known issues:
> 
>   === IGT changes ===
> 
>     ==== Issues hit ====
> 
>     igt@drv_selftest@live_hangcheck:
>       {fi-bdw-samus}:     PASS -> DMESG-FAIL (fdo#106560)

Barking up the wrong tree. Hah, case of the mistaken reset.
-Chris
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3f90c74038ef..37fe842de639 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1918,6 +1918,20 @@  static void execlists_reset(struct intel_engine_cs *engine,
 
 	spin_unlock_irqrestore(&engine->timeline.lock, flags);
 
+	if (!request)
+		return;
+
+	regs = request->hw_context->lrc_reg_state;
+
+	/*
+	 * After a reset, the context may have preserved the STOP bit
+	 * of RING_MI_MODE we used to freeze the active engine before
+	 * the reset. If that bit is restored the ring stops instead
+	 * of being executed.
+	 */
+	regs[CTX_MI_MODE + 1] |= STOP_RING << 16;
+	regs[CTX_MI_MODE + 1] &= ~STOP_RING;
+
 	/*
 	 * If the request was innocent, we leave the request in the ELSP
 	 * and will try to replay it on restarting. The context image may
@@ -1929,7 +1943,7 @@  static void execlists_reset(struct intel_engine_cs *engine,
 	 * and have to at least restore the RING register in the context
 	 * image back to the expected values to skip over the guilty request.
 	 */
-	if (!request || request->fence.error != -EIO)
+	if (request->fence.error != -EIO)
 		return;
 
 	/*
@@ -1940,7 +1954,6 @@  static void execlists_reset(struct intel_engine_cs *engine,
 	 * future request will be after userspace has had the opportunity
 	 * to recreate its own state.
 	 */
-	regs = request->hw_context->lrc_reg_state;
 	if (engine->pinned_default_state) {
 		memcpy(regs, /* skip restoring the vanilla PPHWSP */
 		       engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE,
diff --git a/drivers/gpu/drm/i915/intel_lrc_reg.h b/drivers/gpu/drm/i915/intel_lrc_reg.h
index 5ef932d810a7..3b155ecbfa92 100644
--- a/drivers/gpu/drm/i915/intel_lrc_reg.h
+++ b/drivers/gpu/drm/i915/intel_lrc_reg.h
@@ -39,6 +39,8 @@ 
 #define CTX_R_PWR_CLK_STATE		0x42
 #define CTX_END				0x44
 
+#define CTX_MI_MODE			0x54
+
 #define CTX_REG(reg_state, pos, reg, val) do { \
 	u32 *reg_state__ = (reg_state); \
 	const u32 pos__ = (pos); \