diff mbox

[1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore'

Message ID 1395643764-24353-2-git-send-email-sourab.gupta@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

sourab.gupta@intel.com March 24, 2014, 6:49 a.m. UTC
From: Akash Goel <akash.goel@intel.com>

Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
Store data commands.

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

Comments

Chris Wilson March 24, 2014, 9:32 a.m. UTC | #1
On Mon, Mar 24, 2014 at 12:19:19PM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> Store data commands.
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 87d1a2d..2812384 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2207,6 +2207,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
>  	uint32_t flush_domains;
>  	int ret;
>  
> +	if (IS_VALLEYVIEW(ring->dev)) {
The ring flushes are vfuncs, so why is this here and not in a special
vlv ring flush?

> +		/*
> +		 * WaTlbInvalidateStoreDataBefore:vlv
> +		 * Before pipecontrol with TLB invalidate set, need 2 store
> +		 * data commands (such as MI_STORE_DATA_IMM or MI_STORE_DATA_INDEX)
> +		 * Without this, hardware cannot guarantee the command after the
> +		 * PIPE_CONTROL with TLB inv will not use the old TLB values.

Crumbs, it sounds like our i-g-t are not sensitive enough. This bug
crops up in many disguises over the years, do you have any suggestion on
how we can improve our tests?

> +		 */
> +		int i;
> +		ret = intel_ring_begin(ring, 4 * 2);

This can be we written to use 6 dwords.

> +		if (ret)
> +			return ret;
> +		for (i = 0; i < 2; i++) {
> +			intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
> +			intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX <<
> +						MI_STORE_DWORD_INDEX_SHIFT);

This is I915_GEM_HWS_SCRATCH_ADDR

> +			intel_ring_emit(ring, 0);
> +			intel_ring_emit(ring, MI_NOOP);
> +		}
> +		intel_ring_advance(ring);
> +	}
> +
>  	flush_domains = 0;
>  	if (ring->gpu_caches_dirty)
>  		flush_domains = I915_GEM_GPU_DOMAINS;
sourab.gupta@intel.com March 24, 2014, 11:20 a.m. UTC | #2
On Mon, 2014-03-24 at 09:32 +0000, Chris Wilson wrote:
> On Mon, Mar 24, 2014 at 12:19:19PM +0530, sourab.gupta@intel.com wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> > In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> > Store data commands.
> > 
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 87d1a2d..2812384 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -2207,6 +2207,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
> >  	uint32_t flush_domains;
> >  	int ret;
> >  
> > +	if (IS_VALLEYVIEW(ring->dev)) {
> The ring flushes are vfuncs, so why is this here and not in a special
> vlv ring flush?

Yes, we can as well put it in the platform specific vlv flush. Since we
apply this WA only for invalidate_all_caches function, we have to
differentiate in the vlv flush function regarding where the flush
originated from. For this we plan to check the 'invalidate_domains'
field of flush function. (This field will be non-zero in case the call
originated from invalidate_all_caches function). So, we'll have a
vlv_render_ring_flush something like this:
	if(invalidate_domains)
		apply_our_wa;
	gen7_render_ring_flush();

Does this look okay?

Regards,
Sourab

> 
> > +		/*
> > +		 * WaTlbInvalidateStoreDataBefore:vlv
> > +		 * Before pipecontrol with TLB invalidate set, need 2 store
> > +		 * data commands (such as MI_STORE_DATA_IMM or MI_STORE_DATA_INDEX)
> > +		 * Without this, hardware cannot guarantee the command after the
> > +		 * PIPE_CONTROL with TLB inv will not use the old TLB values.
> 
> Crumbs, it sounds like our i-g-t are not sensitive enough. This bug
> crops up in many disguises over the years, do you have any suggestion on
> how we can improve our tests?
> 
We'll think of how to capture the scenario in the i-g-t testcases and
come back with suggestions.

> > +		 */
> > +		int i;
> > +		ret = intel_ring_begin(ring, 4 * 2);
> 
> This can be we written to use 6 dwords.
> 
Agreed. We'll have this in our next version
> > +		if (ret)
> > +			return ret;
> > +		for (i = 0; i < 2; i++) {
> > +			intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
> > +			intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX <<
> > +						MI_STORE_DWORD_INDEX_SHIFT);
> 
> This is I915_GEM_HWS_SCRATCH_ADDR

Agreed. We'll have this in our next version

> 
> > +			intel_ring_emit(ring, 0);
> > +			intel_ring_emit(ring, MI_NOOP);
> > +		}
> > +		intel_ring_advance(ring);
> > +	}
> > +
> >  	flush_domains = 0;
> >  	if (ring->gpu_caches_dirty)
> >  		flush_domains = I915_GEM_GPU_DOMAINS;
>
Ville Syrjälä March 24, 2014, 6:32 p.m. UTC | #3
On Mon, Mar 24, 2014 at 11:20:40AM +0000, Gupta, Sourab wrote:
> On Mon, 2014-03-24 at 09:32 +0000, Chris Wilson wrote:
> > On Mon, Mar 24, 2014 at 12:19:19PM +0530, sourab.gupta@intel.com wrote:
> > > From: Akash Goel <akash.goel@intel.com>
> > > 
> > > Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> > > In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> > > Store data commands.
> > > 
> > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
> > >  1 file changed, 22 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > index 87d1a2d..2812384 100644
> > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > @@ -2207,6 +2207,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
> > >  	uint32_t flush_domains;
> > >  	int ret;
> > >  
> > > +	if (IS_VALLEYVIEW(ring->dev)) {
> > The ring flushes are vfuncs, so why is this here and not in a special
> > vlv ring flush?
> 
> Yes, we can as well put it in the platform specific vlv flush. Since we
> apply this WA only for invalidate_all_caches function, we have to
> differentiate in the vlv flush function regarding where the flush
> originated from. For this we plan to check the 'invalidate_domains'
> field of flush function. (This field will be non-zero in case the call
> originated from invalidate_all_caches function). So, we'll have a
> vlv_render_ring_flush something like this:
> 	if(invalidate_domains)
> 		apply_our_wa;
> 	gen7_render_ring_flush();
> 
> Does this look okay?

Since we supposdely need this for all gen6/gen7, I'd just add a new func
(eg. gen6_tlb_invalidate_wa()) and call that from gen6_render_ring_flush(),
gen7_render_ring_flush(), gen6_bsd_ring_flush() and gen6_ring_flush().
Chris Wilson March 24, 2014, 6:47 p.m. UTC | #4
On Mon, Mar 24, 2014 at 08:32:30PM +0200, Ville Syrjälä wrote:
> On Mon, Mar 24, 2014 at 11:20:40AM +0000, Gupta, Sourab wrote:
> > On Mon, 2014-03-24 at 09:32 +0000, Chris Wilson wrote:
> > > On Mon, Mar 24, 2014 at 12:19:19PM +0530, sourab.gupta@intel.com wrote:
> > > > From: Akash Goel <akash.goel@intel.com>
> > > > 
> > > > Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> > > > In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> > > > Store data commands.
> > > > 
> > > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > > > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
> > > >  1 file changed, 22 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > index 87d1a2d..2812384 100644
> > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > @@ -2207,6 +2207,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
> > > >  	uint32_t flush_domains;
> > > >  	int ret;
> > > >  
> > > > +	if (IS_VALLEYVIEW(ring->dev)) {
> > > The ring flushes are vfuncs, so why is this here and not in a special
> > > vlv ring flush?
> > 
> > Yes, we can as well put it in the platform specific vlv flush. Since we
> > apply this WA only for invalidate_all_caches function, we have to
> > differentiate in the vlv flush function regarding where the flush
> > originated from. For this we plan to check the 'invalidate_domains'
> > field of flush function. (This field will be non-zero in case the call
> > originated from invalidate_all_caches function). So, we'll have a
> > vlv_render_ring_flush something like this:
> > 	if(invalidate_domains)
> > 		apply_our_wa;
> > 	gen7_render_ring_flush();
> > 
> > Does this look okay?
> 
> Since we supposdely need this for all gen6/gen7, I'd just add a new func
> (eg. gen6_tlb_invalidate_wa()) and call that from gen6_render_ring_flush(),
> gen7_render_ring_flush(), gen6_bsd_ring_flush() and gen6_ring_flush().

Now, I am extremely curious as to what the exact bug symptoms are. We
seem to have an absence of bug reports since SNB regarding random
corruption.
-Chris
sourab.gupta@intel.com March 25, 2014, 5:17 a.m. UTC | #5
On Mon, 2014-03-24 at 18:47 +0000, Chris Wilson wrote:
> On Mon, Mar 24, 2014 at 08:32:30PM +0200, Ville Syrjälä wrote:

> > On Mon, Mar 24, 2014 at 11:20:40AM +0000, Gupta, Sourab wrote:

> > > On Mon, 2014-03-24 at 09:32 +0000, Chris Wilson wrote:

> > > > On Mon, Mar 24, 2014 at 12:19:19PM +0530, sourab.gupta@intel.com wrote:

> > > > > From: Akash Goel <akash.goel@intel.com>

> > > > > 

> > > > > Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.

> > > > > In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI

> > > > > Store data commands.

> > > > > 

> > > > > Signed-off-by: Akash Goel <akash.goel@intel.com>

> > > > > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>

> > > > > ---

> > > > >  drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++

> > > > >  1 file changed, 22 insertions(+)

> > > > > 

> > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c

> > > > > index 87d1a2d..2812384 100644

> > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c

> > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c

> > > > > @@ -2207,6 +2207,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)

> > > > >  	uint32_t flush_domains;

> > > > >  	int ret;

> > > > >  

> > > > > +	if (IS_VALLEYVIEW(ring->dev)) {

> > > > The ring flushes are vfuncs, so why is this here and not in a special

> > > > vlv ring flush?

> > > 

> > > Yes, we can as well put it in the platform specific vlv flush. Since we

> > > apply this WA only for invalidate_all_caches function, we have to

> > > differentiate in the vlv flush function regarding where the flush

> > > originated from. For this we plan to check the 'invalidate_domains'

> > > field of flush function. (This field will be non-zero in case the call

> > > originated from invalidate_all_caches function). So, we'll have a

> > > vlv_render_ring_flush something like this:

> > > 	if(invalidate_domains)

> > > 		apply_our_wa;

> > > 	gen7_render_ring_flush();

> > > 

> > > Does this look okay?

> > 

> > Since we supposdely need this for all gen6/gen7, I'd just add a new func

> > (eg. gen6_tlb_invalidate_wa()) and call that from gen6_render_ring_flush(),

> > gen7_render_ring_flush(), gen6_bsd_ring_flush() and gen6_ring_flush().

> 

> Now, I am extremely curious as to what the exact bug symptoms are. We

> seem to have an absence of bug reports since SNB regarding random

> corruption.

> -Chris

> 

Hi Chris,
We had applied this WA in a preemptive way, as this was amongst the
recommended list of WA's applicable. So, we don't have any specific bug
characteristics which is prevented by this particular WA.
Generally, we have been very cautious wrt WA's as we have seen hangs
(which may be very difficult to debug in wild), if we miss out on some
WA's recommended.
Regards,
Sourab
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 87d1a2d..2812384 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2207,6 +2207,28 @@  intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
 	uint32_t flush_domains;
 	int ret;
 
+	if (IS_VALLEYVIEW(ring->dev)) {
+		/*
+		 * WaTlbInvalidateStoreDataBefore:vlv
+		 * Before pipecontrol with TLB invalidate set, need 2 store
+		 * data commands (such as MI_STORE_DATA_IMM or MI_STORE_DATA_INDEX)
+		 * Without this, hardware cannot guarantee the command after the
+		 * PIPE_CONTROL with TLB inv will not use the old TLB values.
+		 */
+		int i;
+		ret = intel_ring_begin(ring, 4 * 2);
+		if (ret)
+			return ret;
+		for (i = 0; i < 2; i++) {
+			intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
+			intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX <<
+						MI_STORE_DWORD_INDEX_SHIFT);
+			intel_ring_emit(ring, 0);
+			intel_ring_emit(ring, MI_NOOP);
+		}
+		intel_ring_advance(ring);
+	}
+
 	flush_domains = 0;
 	if (ring->gpu_caches_dirty)
 		flush_domains = I915_GEM_GPU_DOMAINS;