diff mbox series

[RFC/CI] drm/i915/icl: account for context save/restore removed bits

Message ID 20180809235852.24516-1-paulo.r.zanoni@intel.com (mailing list archive)
State New, archived
Headers show
Series [RFC/CI] drm/i915/icl: account for context save/restore removed bits | expand

Commit Message

Zanoni, Paulo R Aug. 9, 2018, 11:58 p.m. UTC
The RS_CTX_ENABLE and CTX_SAVE_INHIBIT bits are not present on ICL
anymore, but we still try to set them and then check them with
GEM_BUG_ON, resulting in a BUG() call. The bug can be reproduced by
igt/drv_selftest/live_hangcheck/others-priority and our CI was able
to catch it. The machine hangs, which prevents further testing on it.

It is worth noticing that commit 05f0addd9b10 ("drm/i915/icl: Enhanced
execution list support") already tried to avoid the save/restore bits
on ICL, but only inside populate_lr_context().

TODO: Should we also avoid CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT on ICL
for execlists_init_reg_state()? We already avoid it inside
populate_lr_context().

TODO: Shouldn't a new problem surface when we remove these registers?
What should we do to replace the functionality that was provided by
them?

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Testcase: igt/drv_selftest/live_hangcheck/others-priority
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107399
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

As you may notice from the TODO comments, my GEM-fu is still not
strong yet. Any help on the pending questions would be really
appreciated.

Comments

Chris Wilson Aug. 10, 2018, 12:11 a.m. UTC | #1
Quoting Paulo Zanoni (2018-08-10 00:58:52)
> The RS_CTX_ENABLE and CTX_SAVE_INHIBIT bits are not present on ICL
> anymore, but we still try to set them and then check them with
> GEM_BUG_ON, resulting in a BUG() call. The bug can be reproduced by
> igt/drv_selftest/live_hangcheck/others-priority and our CI was able
> to catch it. The machine hangs, which prevents further testing on it.

Non-sequitur. Capture the bug report, move on after the panic. How does
that prevent testing? What you might note is that CI's pstore is
reporting -EIO, that is what has prevented remote debugging of this.
 
> It is worth noticing that commit 05f0addd9b10 ("drm/i915/icl: Enhanced
> execution list support") already tried to avoid the save/restore bits
> on ICL, but only inside populate_lr_context().
> 
> TODO: Should we also avoid CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT on ICL
> for execlists_init_reg_state()? We already avoid it inside
> populate_lr_context().

RESTORE_INHIBIT is still very much required. It's just the SAVE_INHIBIT
that they removed for whimsy.
 
> TODO: Shouldn't a new problem surface when we remove these registers?
> What should we do to replace the functionality that was provided by
> them?

Shed a tear. Resource Streamer doesn't exist and would require userspace
support for it anyway. The SAVE_INHIBIT is the one that shaves a few
cycles on preemption, but is supposed to be replaced by a bit in ELSQ at
the expense of forking process_csb/preemption. No one has demonstrated
that the cost would be worth it.
-Chris
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e5385dbfcdda..bcd0a1758f13 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -541,7 +541,8 @@  static void inject_preempt_context(struct intel_engine_cs *engine)
 
 	GEM_BUG_ON(execlists->preempt_complete_status !=
 		   upper_32_bits(ce->lrc_desc));
-	GEM_BUG_ON((ce->lrc_reg_state[CTX_CONTEXT_CONTROL + 1] &
+	GEM_BUG_ON(INTEL_GEN(engine->i915) < 11 &&
+		   (ce->lrc_reg_state[CTX_CONTEXT_CONTROL + 1] &
 		    _MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT |
 				       CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT)) !=
 		   _MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT |
@@ -2569,6 +2570,7 @@  static void execlists_init_reg_state(u32 *regs,
 	struct drm_i915_private *dev_priv = engine->i915;
 	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt ?: dev_priv->mm.aliasing_ppgtt;
 	u32 base = engine->mmio_base;
+	u32 ctx_sr_ctl;
 	bool rcs = engine->class == RENDER_CLASS;
 
 	/* A context is actually a big batch buffer with several
@@ -2581,10 +2583,12 @@  static void execlists_init_reg_state(u32 *regs,
 	regs[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(rcs ? 14 : 11) |
 				 MI_LRI_FORCE_POSTED;
 
+	ctx_sr_ctl = CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT;
+	if (INTEL_GEN(dev_priv) < 11)
+		ctx_sr_ctl |= CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT |
+			      CTX_CTRL_RS_CTX_ENABLE;
 	CTX_REG(regs, CTX_CONTEXT_CONTROL, RING_CONTEXT_CONTROL(engine),
-		_MASKED_BIT_DISABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT |
-				    CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT |
-				    CTX_CTRL_RS_CTX_ENABLE) |
+		_MASKED_BIT_DISABLE(ctx_sr_ctl) |
 		_MASKED_BIT_ENABLE(CTX_CTRL_INHIBIT_SYN_CTX_SWITCH));
 	CTX_REG(regs, CTX_RING_HEAD, RING_HEAD(base), 0);
 	CTX_REG(regs, CTX_RING_TAIL, RING_TAIL(base), 0);