drm/i915: Apply post-sync write for pipe control invalidates

Message ID 1344590290-5206-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson Aug. 10, 2012, 9:18 a.m. UTC
When invalidating the TLBs it is documented as requiring a post-sync
write. Failure to do so seems to result in a GPU hang.

Exposure to this hang on IVB seems to be a result of removing the extra
stalls required for SNB pipecontrol workarounds:

commit 6c6cf5aa9c583478b19e23149feaa92d01fb8c2d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jul 20 18:02:28 2012 +0100

    drm/i915: Only apply the SNB pipe control w/a to gen6

Reported-by: yex.tian@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53322
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c |   35 ++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 14 deletions(-)

Comments

Jani Nikula Aug. 10, 2012, 9:57 a.m. UTC | #1
On Fri, 10 Aug 2012, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> When invalidating the TLBs it is documented as requiring a post-sync
> write. Failure to do so seems to result in a GPU hang.
>
> Exposure to this hang on IVB seems to be a result of removing the extra
> stalls required for SNB pipecontrol workarounds:

Hi Chris, AFAICT TLB invalidate requires PIPE_CONTROL_CS_STALL set per
the spec. I can't find a mention of the post-sync write, though. Could
you double check, please?

BR,
Jani.


Chris Wilson Aug. 10, 2012, 10:07 a.m. UTC | #2
On Fri, 10 Aug 2012 12:57:59 +0300, Jani Nikula <jani.nikula@linux.intel.com> wrote:
> On Fri, 10 Aug 2012, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > When invalidating the TLBs it is documented as requiring a post-sync
> > write. Failure to do so seems to result in a GPU hang.
> >
> > Exposure to this hang on IVB seems to be a result of removing the extra
> > stalls required for SNB pipecontrol workarounds:
> 
> Hi Chris, AFAICT TLB invalidate requires PIPE_CONTROL_CS_STALL set per
> the spec. I can't find a mention of the post-sync write, though. Could
> you double check, please?

Considering replacing it with a CS_STALL just hard hung my box, I remain
unconvinced. :-p
-Chris
Chris Wilson Aug. 10, 2012, 10:11 a.m. UTC | #3
On Fri, 10 Aug 2012 11:07:47 +0100, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Fri, 10 Aug 2012 12:57:59 +0300, Jani Nikula <jani.nikula@linux.intel.com> wrote:
> > On Fri, 10 Aug 2012, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > When invalidating the TLBs it is documented as requiring a post-sync
> > > write. Failure to do so seems to result in a GPU hang.
> > >
> > > Exposure to this hang on IVB seems to be a result of removing the extra
> > > stalls required for SNB pipecontrol workarounds:
> > 
> > Hi Chris, AFAICT TLB invalidate requires PIPE_CONTROL_CS_STALL set per
> > the spec. I can't find a mention of the post-sync write, though. Could
> > you double check, please?

To be clear, the w/a is mentioned for DevGT-A (but presumably still
required):

For all PIPE_CONTROLs that *only* have RO cache invalidation, software
must set the post-sync operation field to something other than 0
-Chris
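
The workaround text quoted above can be read as a simple predicate on the PIPE_CONTROL flags dword. The following is a minimal sketch of that reading, not driver code: the bit positions, the post-sync field mask, the grouping of the TLB bit with the read-only cache invalidates, and the function name `violates_post_sync_rule` are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Placeholder bit positions for this sketch only; the driver's register
 * header holds the authoritative definitions.
 */
#define PIPE_CONTROL_TLB_INVALIDATE			(1u << 18)
#define PIPE_CONTROL_POST_SYNC_OP_MASK			(3u << 14)
#define PIPE_CONTROL_QW_WRITE				(1u << 14)
#define PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH		(1u << 12)
#define PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE	(1u << 11)
#define PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE		(1u << 10)
#define PIPE_CONTROL_VF_CACHE_INVALIDATE		(1u << 4)
#define PIPE_CONTROL_CONST_CACHE_INVALIDATE		(1u << 3)
#define PIPE_CONTROL_STATE_CACHE_INVALIDATE		(1u << 2)
#define PIPE_CONTROL_DEPTH_CACHE_FLUSH			(1u << 0)

/* Assumption: the invalidate bits set by the patch count as "RO". */
#define RO_INVALIDATE_BITS \
	(PIPE_CONTROL_TLB_INVALIDATE | \
	 PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE | \
	 PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | \
	 PIPE_CONTROL_VF_CACHE_INVALIDATE | \
	 PIPE_CONTROL_CONST_CACHE_INVALIDATE | \
	 PIPE_CONTROL_STATE_CACHE_INVALIDATE)

#define WRITE_FLUSH_BITS \
	(PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH | \
	 PIPE_CONTROL_DEPTH_CACHE_FLUSH)

/*
 * Returns true when the flags request only read-only cache invalidation
 * and leave the post-sync operation field at 0, which is the combination
 * the quoted workaround forbids.
 */
bool violates_post_sync_rule(uint32_t flags)
{
	bool invalidates = (flags & RO_INVALIDATE_BITS) != 0;
	bool flushes = (flags & WRITE_FLUSH_BITS) != 0;
	bool post_sync = (flags & PIPE_CONTROL_POST_SYNC_OP_MASK) != 0;

	return invalidates && !flushes && !post_sync;
}
```

Under these assumptions, a lone `PIPE_CONTROL_TLB_INVALIDATE` trips the check, and OR-ing in `PIPE_CONTROL_QW_WRITE` clears it.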
Jani Nikula Aug. 10, 2012, 10:46 a.m. UTC | #4
On Fri, 10 Aug 2012, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Fri, 10 Aug 2012 12:57:59 +0300, Jani Nikula <jani.nikula@linux.intel.com> wrote:
>> On Fri, 10 Aug 2012, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> > When invalidating the TLBs it is documented as requiring a post-sync
>> > write. Failure to do so seems to result in a GPU hang.
>> >
>> > Exposure to this hang on IVB seems to be a result of removing the extra
>> > stalls required for SNB pipecontrol workarounds:
>> 
>> Hi Chris, AFAICT TLB invalidate requires PIPE_CONTROL_CS_STALL set per
>> the spec. I can't find a mention of the post-sync write, though. Could
>> you double check, please?
>
> Considering replacing it with a CS_STALL just hard hung my box, I remain
> unconvinced. :-p

I meant you could check the spec, not actually try it! ;) But I accept
that's a good reason not to use it.

BR,
Jani.
Ben Widawsky Aug. 11, 2012, 7:20 p.m. UTC | #5
On Fri, 10 Aug 2012 10:18:10 +0100
Chris Wilson <chris@chris-wilson.co.uk> wrote:

> When invalidating the TLBs it is documented as requiring a post-sync
> write. Failure to do so seems to result in a GPU hang.
> 
> Exposure to this hang on IVB seems to be a result of removing the
> extra stalls required for SNB pipecontrol workarounds:
> 
> commit 6c6cf5aa9c583478b19e23149feaa92d01fb8c2d
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Jul 20 18:02:28 2012 +0100
> 
>     drm/i915: Only apply the SNB pipe control w/a to gen6
> 
> Reported-by: yex.tian@intel.com
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53322
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

This is the moral equivalent of my patch to make the simulator happy
on IVB. Daniel, I'll settle for either patch.
Therefore,
Acked-by: Ben Widawsky <ben@bwidawsk.net>

Daniel Vetter Aug. 11, 2012, 7:47 p.m. UTC | #6
On Sat, Aug 11, 2012 at 12:20:19PM -0700, Ben Widawsky wrote:
> On Fri, 10 Aug 2012 10:18:10 +0100
> Chris Wilson <chris@chris-wilson.co.uk> wrote:
> 
> > When invalidating the TLBs it is documented as requiring a post-sync
> > write. Failure to do so seems to result in a GPU hang.
> > 
> > Exposure to this hang on IVB seems to be a result of removing the
> > extra stalls required for SNB pipecontrol workarounds:
> > 
> > commit 6c6cf5aa9c583478b19e23149feaa92d01fb8c2d
> > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > Date:   Fri Jul 20 18:02:28 2012 +0100
> > 
> >     drm/i915: Only apply the SNB pipe control w/a to gen6
> > 
> > Reported-by: yex.tian@intel.com
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53322
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> This is the moral equivalent of my patch to make the simulator happy
> on IVB. Daniel, I'll settle for either patch.
> Therefore,
> Acked-by: Ben Widawsky <ben@bwidawsk.net>

Ok, I'll wait until we have testing feedback from the bug report and then
either merge this to -fixes or -next.
-Daniel
Daniel Vetter Aug. 14, 2012, 7:57 a.m. UTC | #7
On Sat, Aug 11, 2012 at 12:20:19PM -0700, Ben Widawsky wrote:
> On Fri, 10 Aug 2012 10:18:10 +0100
> Chris Wilson <chris@chris-wilson.co.uk> wrote:
> 
> > When invalidating the TLBs it is documented as requiring a post-sync
> > write. Failure to do so seems to result in a GPU hang.
> > 
> > Exposure to this hang on IVB seems to be a result of removing the
> > extra stalls required for SNB pipecontrol workarounds:
> > 
> > commit 6c6cf5aa9c583478b19e23149feaa92d01fb8c2d
> > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > Date:   Fri Jul 20 18:02:28 2012 +0100
> > 
> >     drm/i915: Only apply the SNB pipe control w/a to gen6
> > 
> > Reported-by: yex.tian@intel.com
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53322
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> This is the moral equivalent of my patch to make the simulator happy
> on IVB. Daniel, I'll settle for either patch.
> Therefore,
> Acked-by: Ben Widawsky <ben@bwidawsk.net>

Patch merged to -fixes, with some manual frobbery to ensure we get a loud
conflict instead of a silent one.
-Daniel


Patch

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 13318a0..7608bc2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -213,20 +213,27 @@  gen6_render_ring_flush(struct intel_ring_buffer *ring,
 	 * number of bits based on the write domains has little performance
 	 * impact.
 	 */
-	flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
-	flags |= PIPE_CONTROL_TLB_INVALIDATE;
-	flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
-	flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
-	flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
-	flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
-	flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
-	flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
-	/*
-	 * Ensure that any following seqno writes only happen when the render
-	 * cache is indeed flushed (but only if the caller actually wants that).
-	 */
-	if (flush_domains)
+	if (flush_domains) {
+		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
+		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+		/*
+		 * Ensure that any following seqno writes only happen
+		 * when the render cache is indeed flushed.
+		 */
 		flags |= PIPE_CONTROL_CS_STALL;
+	}
+	if (invalidate_domains) {
+		flags |= PIPE_CONTROL_TLB_INVALIDATE;
+		flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
+		/*
+		 * TLB invalidate requires a post-sync write.
+		 */
+		flags |= PIPE_CONTROL_QW_WRITE;
+	}
 
 	ret = intel_ring_begin(ring, 4);
 	if (ret)
@@ -234,7 +241,7 @@  gen6_render_ring_flush(struct intel_ring_buffer *ring,
 
 	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
 	intel_ring_emit(ring, flags);
-	intel_ring_emit(ring, 0);
+	intel_ring_emit(ring, (u32)ring->status_page.gfx_addr+2048);
 	intel_ring_emit(ring, 0);
 	intel_ring_advance(ring);
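
For readers who want to experiment with the flag selection outside the kernel, the logic of the patched function can be sketched as a standalone C fragment. This is an illustration under stated assumptions, not the driver's implementation: the bit positions and the `GFX_OP_PIPE_CONTROL` encoding are placeholders, and `flush_flags`, `build_pipe_control` and `SCRATCH_OFFSET` are hypothetical names standing in for the ring-emission calls and the `+2048` offset in the patch.

```c
#include <stdint.h>

/*
 * Placeholder bit positions and opcode encoding for this sketch only;
 * the driver's register header holds the authoritative definitions.
 */
#define PIPE_CONTROL_CS_STALL				(1u << 20)
#define PIPE_CONTROL_TLB_INVALIDATE			(1u << 18)
#define PIPE_CONTROL_QW_WRITE				(1u << 14)
#define PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH		(1u << 12)
#define PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE	(1u << 11)
#define PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE		(1u << 10)
#define PIPE_CONTROL_VF_CACHE_INVALIDATE		(1u << 4)
#define PIPE_CONTROL_CONST_CACHE_INVALIDATE		(1u << 3)
#define PIPE_CONTROL_STATE_CACHE_INVALIDATE		(1u << 2)
#define PIPE_CONTROL_DEPTH_CACHE_FLUSH			(1u << 0)
#define GFX_OP_PIPE_CONTROL(len) \
	((0x3u << 29) | (0x3u << 27) | (0x2u << 24) | ((len) - 2))

/* Hypothetical name for the scratch offset used by the patch. */
#define SCRATCH_OFFSET	2048u

/* Mirror of the flag selection introduced by the patch. */
uint32_t flush_flags(uint32_t invalidate_domains, uint32_t flush_domains)
{
	uint32_t flags = 0;

	if (flush_domains) {
		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
		/* Later seqno writes must wait for the render cache flush. */
		flags |= PIPE_CONTROL_CS_STALL;
	}
	if (invalidate_domains) {
		flags |= PIPE_CONTROL_TLB_INVALIDATE;
		flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
		flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
		flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
		flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
		flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
		/* TLB invalidate requires a post-sync write. */
		flags |= PIPE_CONTROL_QW_WRITE;
	}
	return flags;
}

/*
 * Fill the four dwords the patch emits. As in the patch, the address
 * dword is written unconditionally, even when no post-sync write is
 * requested.
 */
void build_pipe_control(uint32_t out[4], uint32_t invalidate_domains,
			uint32_t flush_domains, uint32_t status_page_gfx_addr)
{
	out[0] = GFX_OP_PIPE_CONTROL(4);
	out[1] = flush_flags(invalidate_domains, flush_domains);
	out[2] = status_page_gfx_addr + SCRATCH_OFFSET;
	out[3] = 0;
}
```

Splitting the flags by domain means an invalidate-only request no longer carries `PIPE_CONTROL_CS_STALL`, and a flush-only request no longer carries the post-sync write.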