diff mbox series

[4/4] drm/i915/gen12: Invalidate aux table entries forcibly

Message ID 20200506144734.29297-4-mika.kuoppala@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series [1/4] Revert "drm/i915/tgl: Include ro parts of l3 to invalidate" | expand

Commit Message

Mika Kuoppala May 6, 2020, 2:47 p.m. UTC
Aux table invalidation can fail on update. So
next access may cause memory access to be into stale entry.

Proposed workaround is to invalidate entries between
all batchbuffers.

References bspec#43904, hsdes#1809175790
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chuansheng Liu <chuansheng.liu@intel.com>
Cc: Rafael ANtognolli <rafael.antognolli@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 16 +++++++++++++++-
 drivers/gpu/drm/i915/i915_reg.h     |  2 ++
 2 files changed, 17 insertions(+), 1 deletion(-)

Comments

Chris Wilson May 6, 2020, 2:59 p.m. UTC | #1
Quoting Mika Kuoppala (2020-05-06 15:47:34)
> Aux table invalidation can fail on update. So
> next access may cause memory access to be into stale entry.
> 
> Proposed workaround is to invalidate entries between
> all batchbuffers.

This sounds like it should have a sunset clause. Do we anticipate being
able to drop this w/a in future?
-Chris
Mika Kuoppala May 6, 2020, 3:20 p.m. UTC | #2
Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2020-05-06 15:47:34)
>> Aux table invalidation can fail on update. So
>> next access may cause memory access to be into stale entry.
>> 
>> Proposed workaround is to invalidate entries between
>> all batchbuffers.
>
> This sounds like it should have a sunset clause. Do we anticipate being
> able to drop this w/a in future?

Rafael kindly pointed out that Mesa already does this:
https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/gallium/drivers/iris/iris_state.c#L5131
So I would say we can drop this patch.

But it makes me wonder that is that LRI dropped for not being whitelisted.

-Mika
Chris Wilson May 6, 2020, 3:32 p.m. UTC | #3
Quoting Mika Kuoppala (2020-05-06 16:20:22)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Quoting Mika Kuoppala (2020-05-06 15:47:34)
> >> Aux table invalidation can fail on update. So
> >> next access may cause memory access to be into stale entry.
> >> 
> >> Proposed workaround is to invalidate entries between
> >> all batchbuffers.
> >
> > This sounds like it should have a sunset clause. Do we anticipate being
> > able to drop this w/a in future?
> 
> Rafael kindly pointed out that Mesa already does this:
> https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/gallium/drivers/iris/iris_state.c#L5131
> So I would say we can drop this patch.

Is the false hit self-contained? Is it caused by PTE update of the
address or by a 3D state change i.e. is it a potential isolation issue?
-Chris
Rafael Antognolli May 7, 2020, 10:28 p.m. UTC | #4
On Wed, May 06, 2020 at 04:32:02PM +0100, Chris Wilson wrote:
> Quoting Mika Kuoppala (2020-05-06 16:20:22)
> > Chris Wilson <chris@chris-wilson.co.uk> writes:
> > 
> > > Quoting Mika Kuoppala (2020-05-06 15:47:34)
> > >> Aux table invalidation can fail on update. So
> > >> next access may cause memory access to be into stale entry.
> > >> 
> > >> Proposed workaround is to invalidate entries between
> > >> all batchbuffers.
> > >
> > > This sounds like it should have a sunset clause. Do we anticipate being
> > > able to drop this w/a in future?
> > 
> > Rafael kindly pointed out that Mesa already does this:
> > https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/gallium/drivers/iris/iris_state.c#L5131
> > So I would say we can drop this patch.
> 
> Is the false hit self-contained? Is it caused by PTE update of the
> address or by a 3D state change i.e. is it a potential isolation issue?
> -Chris

There's no 3D state for the aux table. What we do in Iris is:
 - allocate buffers and cpu map them
 - write the multiple levels of the table (main buffers -> CCS buffers)
   and keep track of when it changes
 - whenever it changes, we emit LRI to 0x4208 to invalidate the cache.

Anv does something similar except it writes to the table from the GPU.

I tried removing the heuristics and invalidating the table on the
beginning of every batch, but it doesn't help. We can probably get
preempted mid-batch and then another context with a different aux table
and the wrong cache is probably causing the hang. The kernel
invalidating it seems to fix the problem.

> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index e1235d504837..bbdb0e2a4571 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -4539,6 +4539,17 @@  static u32 preparser_disable(bool state)
 	return MI_ARB_CHECK | 1 << 8 | state;
 }
 
+static u32 *
+gen12_emit_aux_table_inv(struct i915_request *rq, u32 *cs)
+{
+	*cs++ = MI_LOAD_REGISTER_IMM(1);
+	*cs++ = i915_mmio_reg_offset(GEN12_GFX_CCS_AUX_NV);
+	*cs++ = AUX_INV;
+	*cs++ = MI_NOOP;
+
+	return cs;
+}
+
 static int gen12_emit_flush_render(struct i915_request *request,
 				   u32 mode)
 {
@@ -4587,7 +4598,7 @@  static int gen12_emit_flush_render(struct i915_request *request,
 
 		flags |= PIPE_CONTROL_CS_STALL;
 
-		cs = intel_ring_begin(request, 8);
+		cs = intel_ring_begin(request, 8 + 4);
 		if (IS_ERR(cs))
 			return PTR_ERR(cs);
 
@@ -4600,6 +4611,9 @@  static int gen12_emit_flush_render(struct i915_request *request,
 
 		cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
 
+		/* hsdes: 1809175790 */
+		cs = gen12_emit_aux_table_inv(request, cs);
+
 		*cs++ = preparser_disable(false);
 		intel_ring_advance(request, cs);
 	}
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index fde54b86ea20..07e702bd16b5 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2526,6 +2526,8 @@  static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define HSW_GTT_CACHE_EN	_MMIO(0x4024)
 #define   GTT_CACHE_EN_ALL	0xF0007FFF
 #define GEN7_WR_WATERMARK	_MMIO(0x4028)
+#define GEN12_GFX_CCS_AUX_NV	_MMIO(0x4028)
+#define   AUX_INV		REG_BIT(0)
 #define GEN7_GFX_PRIO_CTRL	_MMIO(0x402C)
 #define ARB_MODE		_MMIO(0x4030)
 #define   ARB_MODE_SWIZZLE_SNB	(1 << 4)