diff mbox

[v3] drm/i915/cnl: WaPipeControlBefore3DStateSamplePattern

Message ID 20180205233330.14973-1-rafael.antognolli@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Rafael Antognolli Feb. 5, 2018, 11:33 p.m. UTC
This workaround should prevent a bug that can be hit on a context
restore. To avoid the issue, we must emit a PIPE_CONTROL with CS stall
(0x7a000004 0x00100000 0x00000000 0x00000000) followed by 12DW's of
NOOP(0x0) in the indirect context batch buffer, to ensure the engine is
idle prior to programming 3DSTATE_SAMPLE_PATTERN.

It's also not clear whether we should add those extra dwords because of
the workaround itself, or if that's just padding for the WA BB (and next
commands could come right after the PIPE_CONTROL). We keep them for now.

References: HSD#1939868

 v2: More descriptive changelog and comments.
 v3: Explain that PIPE_CONTROL is actually 6 dwords, and that we advance
     10 more dwords because of that.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_lrc.c | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

Comments

Chris Wilson Feb. 6, 2018, 9:03 a.m. UTC | #1
Quoting Rafael Antognolli (2018-02-05 23:33:30)
> This workaround should prevent a bug that can be hit on a context
> restore. To avoid the issue, we must emit a PIPE_CONTROL with CS stall
> (0x7a000004 0x00100000 0x00000000 0x00000000) followed by 12DW's of
> NOOP(0x0) in the indirect context batch buffer, to ensure the engine is
> idle prior to programming 3DSTATE_SAMPLE_PATTERN.
> 
> It's also not clear whether we should add those extra dwords because of
> the workaround itself, or if that's just padding for the WA BB (and next
> commands could come right after the PIPE_CONTROL). We keep them for now.
> 
> References: HSD#1939868
> 
>  v2: More descriptive changelog and comments.
>  v3: Explain that PIPE_CONTROL is actually 6 dwords, and that we advance
>      10 more dwords because of that.
> 
> Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Acked-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks for the clear comment elucidating our doubt about what is
actually required for the w/a. Pushed.
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index adf257dfa5e0..4bca4bc47cca 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1328,6 +1328,40 @@  static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
 	return batch;
 }
 
+static u32 *gen10_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
+{
+	int i;
+
+	/*
+	 * WaPipeControlBefore3DStateSamplePattern: cnl
+	 *
+	 * Ensure the engine is idle prior to programming a
+	 * 3DSTATE_SAMPLE_PATTERN during a context restore.
+	 */
+	batch = gen8_emit_pipe_control(batch,
+				       PIPE_CONTROL_CS_STALL,
+				       0);
+	/*
+	 * WaPipeControlBefore3DStateSamplePattern says we need 4 dwords for
+	 * the PIPE_CONTROL followed by 12 dwords of 0x0, so 16 dwords in
+	 * total. However, a PIPE_CONTROL is 6 dwords long, not 4, which is
+	 * confusing. Since gen8_emit_pipe_control() already advances the
+	 * batch by 6 dwords, we advance the other 10 here, completing a
+	 * cacheline. It's not clear if the workaround requires this padding
+	 * before other commands, or if it's just the regular padding we would
+	 * already have for the workaround bb, so leave it here for now.
+	 */
+	for (i = 0; i < 10; i++)
+		*batch++ = MI_NOOP;
+
+	/* Pad to end of cacheline */
+	while ((unsigned long)batch % CACHELINE_BYTES)
+		*batch++ = MI_NOOP;
+
+	return batch;
+}
+
+
 #define CTX_WA_BB_OBJ_SIZE (PAGE_SIZE)
 
 static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
@@ -1381,7 +1415,9 @@  static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 
 	switch (INTEL_GEN(engine->i915)) {
 	case 10:
-		return 0;
+		wa_bb_fn[0] = gen10_init_indirectctx_bb;
+		wa_bb_fn[1] = NULL;
+		break;
 	case 9:
 		wa_bb_fn[0] = gen9_init_indirectctx_bb;
 		wa_bb_fn[1] = NULL;