Message ID | 20190912070925.11526-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset | expand |
Chris Wilson <chris@chris-wilson.co.uk> writes: > After a GPU reset, we need to drain all the CS events so that we have an > accurate picture of the execlists state at the time of the reset. Be > paranoid and force a read of the CSB write pointer from memory. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> > --- > drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c > index 3d83c7e0d9de..61a38a4ccbca 100644 > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled) > struct i915_request *rq; > u32 *regs; > > + mb(); /* paranoia: read the CSB pointers from after the reset */ > + clflush(execlists->csb_write); > + mb(); > + We know there is always a cost. We do invalidate the csb on each pass on process_csb. Add csb_write in to invalidate_csb entries along with mbs. Rename it to invalidate_csb and use it always? By doing so, we could prolly throw out the rmb() at the start of the process_csb as we would have invalidated the write pointer along with the entries we read, on previous pass. -Mika > process_csb(engine); /* drain preemption events */ > > /* Following the reset, we need to reload the CSB read/write pointers */ > -- > 2.23.0
Quoting Mika Kuoppala (2019-09-12 08:51:38) > Chris Wilson <chris@chris-wilson.co.uk> writes: > > > After a GPU reset, we need to drain all the CS events so that we have an > > accurate picture of the execlists state at the time of the reset. Be > > paranoid and force a read of the CSB write pointer from memory. > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> > > --- > > drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c > > index 3d83c7e0d9de..61a38a4ccbca 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > > @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled) > > struct i915_request *rq; > > u32 *regs; > > > > + mb(); /* paranoia: read the CSB pointers from after the reset */ > > + clflush(execlists->csb_write); > > + mb(); > > + > > We know there is always a cost. We do invalidate the csb > on each pass on process_csb. > > Add csb_write in to invalidate_csb entries along > with mbs. Rename it to invalidate_csb and use it > always? > > By doing so, we could prolly throw out the rmb() at > the start of the process_csb as we would have invalidated > the write pointer along with the entries we read, > on previous pass. No. That rmb is essential for the read ordering at that moment in time. All I have in mind here is a delay, not really a barrier per se, just this is a nice way of saying no speculation either. -Chris
Chris Wilson <chris@chris-wilson.co.uk> writes: > Quoting Mika Kuoppala (2019-09-12 08:51:38) >> Chris Wilson <chris@chris-wilson.co.uk> writes: >> >> > After a GPU reset, we need to drain all the CS events so that we have an >> > accurate picture of the execlists state at the time of the reset. Be >> > paranoid and force a read of the CSB write pointer from memory. >> > >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> >> > --- >> > drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++ >> > 1 file changed, 4 insertions(+) >> > >> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c >> > index 3d83c7e0d9de..61a38a4ccbca 100644 >> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c >> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c >> > @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled) >> > struct i915_request *rq; >> > u32 *regs; >> > >> > + mb(); /* paranoia: read the CSB pointers from after the reset */ >> > + clflush(execlists->csb_write); >> > + mb(); >> > + >> >> We know there is always a cost. We do invalidate the csb >> on each pass on process_csb. >> >> Add csb_write in to invalidate_csb entries along >> with mbs. Rename it to invalidate_csb and use it >> always? >> >> By doing so, we could prolly throw out the rmb() at >> the start of the process_csb as we would have invalidated >> the write pointer along with the entries we read, >> on previous pass. > > No. That rmb is essential for the read ordering at that moment in time. Ah yes indeed it is. head vs entries coherency. > > All I have in mind here is a delay, not really a barrier per se, just > this is a nice way of saying no speculation either. Forgetting the rmb(), there is similar pattern of mb()+flush elsewhere. Just saw the profiliferation and opportunity to converge. But syncing with the hardware on moment of reset, this should do. Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Quoting Mika Kuoppala (2019-09-12 09:27:56) > Chris Wilson <chris@chris-wilson.co.uk> writes: > > > Quoting Mika Kuoppala (2019-09-12 08:51:38) > >> Chris Wilson <chris@chris-wilson.co.uk> writes: > >> > >> > After a GPU reset, we need to drain all the CS events so that we have an > >> > accurate picture of the execlists state at the time of the reset. Be > >> > paranoid and force a read of the CSB write pointer from memory. > >> > > >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > >> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> > >> > --- > >> > drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++ > >> > 1 file changed, 4 insertions(+) > >> > > >> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c > >> > index 3d83c7e0d9de..61a38a4ccbca 100644 > >> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > >> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > >> > @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled) > >> > struct i915_request *rq; > >> > u32 *regs; > >> > > >> > + mb(); /* paranoia: read the CSB pointers from after the reset */ > >> > + clflush(execlists->csb_write); > >> > + mb(); > >> > + > >> > >> We know there is always a cost. We do invalidate the csb > >> on each pass on process_csb. > >> > >> Add csb_write in to invalidate_csb entries along > >> with mbs. Rename it to invalidate_csb and use it > >> always? > >> > >> By doing so, we could prolly throw out the rmb() at > >> the start of the process_csb as we would have invalidated > >> the write pointer along with the entries we read, > >> on previous pass. > > > > No. That rmb is essential for the read ordering at that moment in time. > > Ah yes indeed it is. head vs entries coherency. > > > > > All I have in mind here is a delay, not really a barrier per se, just > > this is a nice way of saying no speculation either. > > Forgetting the rmb(), there is similar pattern of mb()+flush > elsewhere. Just saw the profiliferation and opportunity to converge. I understood. I think your barrier-less w/a works pretty well and I haven't yet poked a hole in how I think it works ;) > But syncing with the hardware on moment of reset, this should > do. I looked at reusing invalidate_csb_entries() and I think the key part here is that we do want to invalidate the execlists->csb_write itself, so a subtly different location/reason (not sure if it's the same cacheline or the neighbouring one). -Chris
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 3d83c7e0d9de..61a38a4ccbca 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled) struct i915_request *rq; u32 *regs; + mb(); /* paranoia: read the CSB pointers from after the reset */ + clflush(execlists->csb_write); + mb(); + process_csb(engine); /* drain preemption events */ /* Following the reset, we need to reload the CSB read/write pointers */
After a GPU reset, we need to drain all the CS events so that we have an accurate picture of the execlists state at the time of the reset. Be paranoid and force a read of the CSB write pointer from memory. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> --- drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++ 1 file changed, 4 insertions(+)