[1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset

Message ID 20190912070925.11526-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson Sept. 12, 2019, 7:09 a.m. UTC
After a GPU reset, we need to drain all the CS events so that we have an
accurate picture of the execlists state at the time of the reset. Be
paranoid and force a read of the CSB write pointer from memory.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Mika Kuoppala Sept. 12, 2019, 7:51 a.m. UTC | #1
Chris Wilson <chris@chris-wilson.co.uk> writes:

> After a GPU reset, we need to drain all the CS events so that we have an
> accurate picture of the execlists state at the time of the reset. Be
> paranoid and force a read of the CSB write pointer from memory.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 3d83c7e0d9de..61a38a4ccbca 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
>  	struct i915_request *rq;
>  	u32 *regs;
>  
> +	mb(); /* paranoia: read the CSB pointers from after the reset */
> +	clflush(execlists->csb_write);
> +	mb();
> +

We know there is always a cost. We do invalidate the csb
on each pass on process_csb.

Add csb_write in to invalidate_csb entries along
with mbs. Rename it to invalidate_csb and use it
always?

By doing so, we could prolly throw out the rmb() at
the start of the process_csb as we would have invalidated
the write pointer along with the entries we read,
on previous pass.

-Mika


>  	process_csb(engine); /* drain preemption events */
>  
>  	/* Following the reset, we need to reload the CSB read/write pointers */
> -- 
> 2.23.0
Chris Wilson Sept. 12, 2019, 8:04 a.m. UTC | #2
Quoting Mika Kuoppala (2019-09-12 08:51:38)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > After a GPU reset, we need to drain all the CS events so that we have an
> > accurate picture of the execlists state at the time of the reset. Be
> > paranoid and force a read of the CSB write pointer from memory.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index 3d83c7e0d9de..61a38a4ccbca 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
> >       struct i915_request *rq;
> >       u32 *regs;
> >  
> > +     mb(); /* paranoia: read the CSB pointers from after the reset */
> > +     clflush(execlists->csb_write);
> > +     mb();
> > +
> 
> We know there is always a cost. We do invalidate the csb
> on each pass on process_csb.
> 
> Add csb_write in to invalidate_csb entries along
> with mbs. Rename it to invalidate_csb and use it
> always?
> 
> By doing so, we could prolly throw out the rmb() at
> the start of the process_csb as we would have invalidated
> the write pointer along with the entries we read,
> on previous pass.

No. That rmb is essential for the read ordering at that moment in time.

All I have in mind here is a delay, not really a barrier per se, just
this is a nice way of saying no speculation either.
-Chris
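
For readers following along, the read ordering that the rmb() protects can be sketched in userspace with C11 atomics. This is only an analogue under stated assumptions, not the i915 code: struct event_ring, ring_publish() and ring_drain() are hypothetical names, the write pointer is simplified to a monotonically increasing count, and overflow is not handled. The acquire fence plays the role of the rmb(): the entries must not be read before the write pointer that says they are valid.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_ENTRIES 8u /* hypothetical ring size, not the real CSB size */

struct event_ring {
	uint32_t entries[RING_ENTRIES];
	_Atomic uint32_t write; /* number of entries published by the producer */
	uint32_t read;          /* number of entries consumed so far */
};

static void ring_init(struct event_ring *r)
{
	atomic_init(&r->write, 0);
	r->read = 0;
}

/* Producer: store the entry first, then publish the new write pointer. */
static void ring_publish(struct event_ring *r, uint32_t event)
{
	uint32_t w = atomic_load_explicit(&r->write, memory_order_relaxed);

	r->entries[w % RING_ENTRIES] = event;
	atomic_store_explicit(&r->write, w + 1, memory_order_release);
}

/* Consumer: sample the write pointer, fence, and only then read entries. */
static unsigned int ring_drain(struct event_ring *r, uint32_t *out,
			       unsigned int max)
{
	unsigned int n = 0;
	uint32_t tail;

	assert(r && out);

	tail = atomic_load_explicit(&r->write, memory_order_relaxed);

	/* Analogue of the rmb(): order the tail read before the entry reads. */
	atomic_thread_fence(memory_order_acquire);

	while (r->read != tail && n < max)
		out[n++] = r->entries[r->read++ % RING_ENTRIES];

	return n;
}
```

Without the fence, the entry loads could in principle be satisfied ahead of the tail load, so the consumer might pair a new tail with stale entries; that is the "head vs entries" coherency the barrier is there for.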
Mika Kuoppala Sept. 12, 2019, 8:27 a.m. UTC | #3
Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2019-09-12 08:51:38)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > After a GPU reset, we need to drain all the CS events so that we have an
>> > accurate picture of the execlists state at the time of the reset. Be
>> > paranoid and force a read of the CSB write pointer from memory.
>> >
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++
>> >  1 file changed, 4 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > index 3d83c7e0d9de..61a38a4ccbca 100644
>> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
>> >       struct i915_request *rq;
>> >       u32 *regs;
>> >  
>> > +     mb(); /* paranoia: read the CSB pointers from after the reset */
>> > +     clflush(execlists->csb_write);
>> > +     mb();
>> > +
>> 
>> We know there is always a cost. We do invalidate the csb
>> on each pass on process_csb.
>> 
>> Add csb_write in to invalidate_csb entries along
>> with mbs. Rename it to invalidate_csb and use it
>> always?
>> 
>> By doing so, we could prolly throw out the rmb() at
>> the start of the process_csb as we would have invalidated
>> the write pointer along with the entries we read,
>> on previous pass.
>
> No. That rmb is essential for the read ordering at that moment in time.

Ah yes indeed it is. head vs entries coherency.

>
> All I have in mind here is a delay, not really a barrier per se, just
> this is a nice way of saying no speculation either.

Forgetting the rmb(), there is a similar pattern of mb()+flush
elsewhere. Just saw the proliferation and an opportunity to converge.

But syncing with the hardware on moment of reset, this should
do.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Chris Wilson Sept. 12, 2019, 8:38 a.m. UTC | #4
Quoting Mika Kuoppala (2019-09-12 09:27:56)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Quoting Mika Kuoppala (2019-09-12 08:51:38)
> >> Chris Wilson <chris@chris-wilson.co.uk> writes:
> >> 
> >> > After a GPU reset, we need to drain all the CS events so that we have an
> >> > accurate picture of the execlists state at the time of the reset. Be
> >> > paranoid and force a read of the CSB write pointer from memory.
> >> >
> >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >> > ---
> >> >  drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++
> >> >  1 file changed, 4 insertions(+)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> > index 3d83c7e0d9de..61a38a4ccbca 100644
> >> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> > @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
> >> >       struct i915_request *rq;
> >> >       u32 *regs;
> >> >  
> >> > +     mb(); /* paranoia: read the CSB pointers from after the reset */
> >> > +     clflush(execlists->csb_write);
> >> > +     mb();
> >> > +
> >> 
> >> We know there is always a cost. We do invalidate the csb
> >> on each pass on process_csb.
> >> 
> >> Add csb_write in to invalidate_csb entries along
> >> with mbs. Rename it to invalidate_csb and use it
> >> always?
> >> 
> >> By doing so, we could prolly throw out the rmb() at
> >> the start of the process_csb as we would have invalidated
> >> the write pointer along with the entries we read,
> >> on previous pass.
> >
> > No. That rmb is essential for the read ordering at that moment in time.
> 
> Ah yes indeed it is. head vs entries coherency.
> 
> >
> > All I have in mind here is a delay, not really a barrier per se, just
> > this is a nice way of saying no speculation either.
> 
> Forgetting the rmb(), there is a similar pattern of mb()+flush
> elsewhere. Just saw the proliferation and an opportunity to converge.

I understood. I think your barrier-less w/a works pretty well and I
haven't yet poked a hole in how I think it works ;)

> But syncing with the hardware on moment of reset, this should
> do.

I looked at reusing invalidate_csb_entries() and I think the key part
here is that we do want to invalidate the execlists->csb_write itself,
so a subtly different location/reason (not sure if it's the same
cacheline or the neighbouring one).
-Chris
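
The open question above (same cacheline or the neighbouring one) comes down to address arithmetic. A minimal sketch, assuming 64-byte cachelines; CACHELINE_BYTES and same_cacheline() are hypothetical names for illustration and are not taken from the driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHELINE_BYTES 64u /* assumption: 64-byte lines, typical for x86 */

static_assert((CACHELINE_BYTES & (CACHELINE_BYTES - 1)) == 0,
	      "cacheline size must be a power of two");

/* True if both addresses fall within the same cacheline-sized block. */
static bool same_cacheline(const void *a, const void *b)
{
	return ((uintptr_t)a / CACHELINE_BYTES) ==
	       ((uintptr_t)b / CACHELINE_BYTES);
}
```

If the write pointer and the entries that get flushed share a line, a single flush covers both; if not, the write pointer needs its own flush, which is what the patch does explicitly.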

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3d83c7e0d9de..61a38a4ccbca 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2836,6 +2836,10 @@  static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
 	struct i915_request *rq;
 	u32 *regs;
 
+	mb(); /* paranoia: read the CSB pointers from after the reset */
+	clflush(execlists->csb_write);
+	mb();
+
 	process_csb(engine); /* drain preemption events */
 
 	/* Following the reset, we need to reload the CSB read/write pointers */
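
The barrier/flush/barrier sequence added by the patch can be approximated in userspace for experimentation. The following is a sketch under stated assumptions, not kernel code: read_write_pointer_paranoid(), full_barrier() and flush_line() are hypothetical names, the SSE2 intrinsics _mm_mfence() and _mm_clflush() stand in for the kernel's mb() and clflush(), and on builds without SSE2 the flush degrades to a no-op.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
static inline void full_barrier(void) { _mm_mfence(); }
static inline void flush_line(const void *p) { _mm_clflush(p); }
#else
/* Fallback for non-x86 builds: full barrier only, no cacheline flush. */
static inline void full_barrier(void) { __sync_synchronize(); }
static inline void flush_line(const void *p) { (void)p; }
#endif

/*
 * Evict the cacheline holding the write pointer, fenced on both sides,
 * so that the following load is served from memory rather than from a
 * possibly stale cached copy.
 */
static uint32_t read_write_pointer_paranoid(const volatile uint32_t *csb_write)
{
	assert(csb_write);

	full_barrier(); /* complete earlier loads and stores first */
	flush_line((const void *)csb_write);
	full_barrier(); /* the flush must finish before the reload */

	return *csb_write;
}
```

On a cache-coherent CPU the returned value is the same with or without the flush; the sequence only matters when another agent, such as the GPU, may have written the line behind the CPU's back.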