Message ID | 20200513085934.9859-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915/gt: Reset execlists registers before HWSP |
Chris Wilson <chris@chris-wilson.co.uk> writes:

> Upon gt resume, we first poison then sanitize the engine. However, our
> testing shows that gen9 will very rarely retain the poisoned value from
> the HWSP mappings of the execlists status registers. This suggests that
> it is reading back from the HWSP, so rejig the register reset.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_lrc.c | 19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 3d0e0894c015..a7d644a21f14 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -3924,6 +3924,14 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
>
>  	ring_set_paused(engine, 0);
>
> +	/*
> +	 * Sometimes Icelake forgets to reset its pointers on a GPU reset.
> +	 * Bludgeon them with a mmio update to be sure.
> +	 */
> +	ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
> +		     reset_value << 8 | reset_value);
> +	ENGINE_POSTING_READ(engine, RING_CONTEXT_STATUS_PTR);
> +
>  	/*
>  	 * After a reset, the HW starts writing into CSB entry [0]. We
>  	 * therefore have to set our HEAD pointer back one entry so that
> @@ -3937,16 +3945,15 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
>  	WRITE_ONCE(*execlists->csb_write, reset_value);
>  	wmb(); /* Make sure this is visible to HW (paranoia?) */
>
> -	/*
> -	 * Sometimes Icelake forgets to reset its pointers on a GPU reset.
> -	 * Bludgeon them with a mmio update to be sure.
> -	 */
> +	invalidate_csb_entries(&execlists->csb_status[0],
> +			       &execlists->csb_status[reset_value]);
> +
> +	/* Once more for luck and our trusty paranoia */
>  	ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
>  		     reset_value << 8 | reset_value);
>  	ENGINE_POSTING_READ(engine, RING_CONTEXT_STATUS_PTR);
>
> -	invalidate_csb_entries(&execlists->csb_status[0],
> -			       &execlists->csb_status[reset_value]);
> +	GEM_BUG_ON(READ_ONCE(*execlists->csb_write) != reset_value);
>  }
>
>  static void execlists_sanitize(struct intel_engine_cs *engine)
> --
> 2.20.1
Quoting Mika Kuoppala (2020-05-13 10:32:37)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > Upon gt resume, we first poison then sanitize the engine. However, our
> > testing shows that gen9 will very rarely retain the poisoned value from
> > the HWSP mappings of the execlists status registers. This suggests that
> > it is reading back from the HWSP, so rejig the register reset.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

It failed in exactly the same way, got past the
GEM_BUG_ON(*csb_write != reset_value) and still ended up with
*csb_write == 0x5a [90] in process_csb.

How it's able to see 0x5a at all is a mystery. We poison, we sanitize,
we reset the GPU. The value comes back from out of nowhere.
-Chris
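For readers following the numbers in the reply above: 0x5a is 90 in decimal, which is far outside any plausible CSB index. The following is a minimal userspace sketch, not driver code. It assumes the poison fills every byte of the HWSP slot with 0x5a, that the consumer keeps the tail in an 8-bit variable, and that the CSB has 6 entries; all `sketch_*` and `SKETCH_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_POISON_BYTE 0x5a /* assumed poison pattern, as seen in the report */
#define SKETCH_CSB_ENTRIES 6    /* assumed CSB size, for illustration only */

/* Read the write pointer from a 32-bit HWSP slot, truncated to 8 bits. */
static uint8_t sketch_read_tail(const uint32_t *csb_write)
{
	return (uint8_t)*csb_write;
}

/* A tail is only usable if it indexes into the CSB. */
static bool sketch_tail_is_valid(uint8_t tail)
{
	return tail < SKETCH_CSB_ENTRIES;
}
```

With a poisoned slot, `sketch_read_tail()` yields 0x5a and `sketch_tail_is_valid()` rejects it, so a consumer that sees this value is reading memory that was never rewritten after the poison (or a stale copy of it).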
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3d0e0894c015..a7d644a21f14 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -3924,6 +3924,14 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
 
 	ring_set_paused(engine, 0);
 
+	/*
+	 * Sometimes Icelake forgets to reset its pointers on a GPU reset.
+	 * Bludgeon them with a mmio update to be sure.
+	 */
+	ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
+		     reset_value << 8 | reset_value);
+	ENGINE_POSTING_READ(engine, RING_CONTEXT_STATUS_PTR);
+
 	/*
 	 * After a reset, the HW starts writing into CSB entry [0]. We
 	 * therefore have to set our HEAD pointer back one entry so that
@@ -3937,16 +3945,15 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
 	WRITE_ONCE(*execlists->csb_write, reset_value);
 	wmb(); /* Make sure this is visible to HW (paranoia?) */
 
-	/*
-	 * Sometimes Icelake forgets to reset its pointers on a GPU reset.
-	 * Bludgeon them with a mmio update to be sure.
-	 */
+	invalidate_csb_entries(&execlists->csb_status[0],
+			       &execlists->csb_status[reset_value]);
+
+	/* Once more for luck and our trusty paranoia */
 	ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
 		     reset_value << 8 | reset_value);
 	ENGINE_POSTING_READ(engine, RING_CONTEXT_STATUS_PTR);
 
-	invalidate_csb_entries(&execlists->csb_status[0],
-			       &execlists->csb_status[reset_value]);
+	GEM_BUG_ON(READ_ONCE(*execlists->csb_write) != reset_value);
 }
 
 static void execlists_sanitize(struct intel_engine_cs *engine)
Upon gt resume, we first poison then sanitize the engine. However, our
testing shows that gen9 will very rarely retain the poisoned value from
the HWSP mappings of the execlists status registers. This suggests that
it is reading back from the HWSP, so rejig the register reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)
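
The ordering the patch introduces (register write, then the HWSP copy, then the register write once more) can be modelled in plain userspace C. This is only a sketch under stated assumptions: `struct sketch_engine`, its fields, and the 6-entry CSB size are hypothetical stand-ins for the driver state, the cache invalidation and posting reads are omitted, and a software model like this cannot reproduce the hardware behaviour discussed in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_POISON_BYTE 0x5a /* assumed poison pattern */
#define SKETCH_CSB_ENTRIES 6    /* assumed CSB size, for illustration only */

/* Hypothetical stand-ins for the engine state touched by the patch. */
struct sketch_engine {
	uint32_t status_ptr_reg; /* models RING_CONTEXT_STATUS_PTR */
	uint32_t hwsp_csb_write; /* models *execlists->csb_write in the HWSP */
	uint8_t csb_head;        /* models the driver-side read pointer */
};

/* Same packing as the patch: the index in bits 15:8 and again in bits 7:0. */
static uint32_t sketch_status_ptr_value(uint32_t reset_value)
{
	return reset_value << 8 | reset_value;
}

/* Fill the HWSP slot with the poison byte, as a debug build might on resume. */
static void sketch_poison(struct sketch_engine *e)
{
	memset(&e->hwsp_csb_write, SKETCH_POISON_BYTE, sizeof(e->hwsp_csb_write));
}

static void sketch_reset_csb_pointers(struct sketch_engine *e)
{
	const uint32_t reset_value = SKETCH_CSB_ENTRIES - 1;

	/* 1. Reset the register first, before touching the HWSP copy. */
	e->status_ptr_reg = sketch_status_ptr_value(reset_value);

	/* 2. Point both the driver head and the HWSP write pointer at the last entry. */
	e->csb_head = (uint8_t)reset_value;
	e->hwsp_csb_write = reset_value;

	/* 3. Write the register once more, mirroring the patch's second write. */
	e->status_ptr_reg = sketch_status_ptr_value(reset_value);

	/* Mirrors the GEM_BUG_ON added by the patch. */
	assert(e->hwsp_csb_write == reset_value);
}
```

In this model the final assertion always holds, which matches the report in the thread that the GEM_BUG_ON was passed; the poisoned value observed later in process_csb must therefore come from something the model does not capture.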