diff mbox

drm/i915: Flush chipset caches after GGTT writes

Message ID 20180717092655.28417-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson July 17, 2018, 9:26 a.m. UTC
Our I915g (early gen3, the oldest machine we have in the farm) is still
reporting occasional incoherency performing the following operations:

  1) write through GGTT (indirect write into memory)
  2) write through either CPU or WC (direct write into memory)
  3) read from GGTT (indirect read)

Instead of reporting the value from (2), the read from GGTT reports the
earlier value written via the GGTT. We have made sure that the writes are
flushed from the CPU (commit 3a32497f0dbe ("drm/i915/selftests: Provide
full mb() around clflush") and commit add00e6d896f ("drm/i915: Flush the
WCB following a WC write")), but still see the error, just less
frequently. The only remaining cache that might be affected here is a
chipset cache, so flush that as well.

Testcase: igt/drv_selftest/live_coherency #gdg
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Rodrigo Vivi July 17, 2018, 2:32 p.m. UTC | #1
On Tue, Jul 17, 2018 at 10:26:55AM +0100, Chris Wilson wrote:
> Our I915g (early gen3, the oldest machine we have in the farm) is still
> reporting occasional incoherency performing the following operations:
> 
>   1) write through GGTT (indirect write into memory)
>   2) write through either CPU or WC (direct write into memory)
>   3) read from GGTT (indirect read)
> 
> Instead of reporting the value from (2), the read from GGTT reports the
> earlier value written via the GGTT. We have made sure that the writes are
> flushed from the CPU (commit 3a32497f0dbe ("drm/i915/selftests: Provide
> full mb() around clflush") and commit add00e6d896f ("drm/i915: Flush the
> WCB following a WC write")), but still see the error, just less
> frequently. The only remaining cache that might be affected here is a
> chipset cache, so flush that as well.
> 
> Testcase: igt/drv_selftest/live_coherency #gdg
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 42d24410a98c..08266791801e 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -802,7 +802,7 @@ void i915_gem_flush_ggtt_writes(struct drm_i915_private *dev_priv)
>  	 * that was!).
>  	 */
>  
> -	wmb();
> +	i915_gem_chipset_flush(dev_priv);

this seems a void change for me... because I couldn't find the implementation
of intel_gtt_chipset_flush() so it seems that we are just replacing wmb per wmb

But I'm probably missing something here... please point me to the part that I'm
missing...

>  
>  	intel_runtime_pm_get(dev_priv);
>  	spin_lock_irq(&dev_priv->uncore.lock);
> -- 
> 2.18.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Chris Wilson July 17, 2018, 2:37 p.m. UTC | #2
Quoting Rodrigo Vivi (2018-07-17 15:32:08)
> On Tue, Jul 17, 2018 at 10:26:55AM +0100, Chris Wilson wrote:
> > Our I915g (early gen3, the oldest machine we have in the farm) is still
> > reporting occasional incoherency performing the following operations:
> > 
> >   1) write through GGTT (indirect write into memory)
> >   2) write through either CPU or WC (direct write into memory)
> >   3) read from GGTT (indirect read)
> > 
> > Instead of reporting the value from (2), the read from GGTT reports the
> > earlier value written via the GGTT. We have made sure that the writes are
> > flushed from the CPU (commit 3a32497f0dbe ("drm/i915/selftests: Provide
> > full mb() around clflush") and commit add00e6d896f ("drm/i915: Flush the
> > WCB following a WC write")), but still see the error, just less
> > frequently. The only remaining cache that might be affected here is a
> > chipset cache, so flush that as well.
> > 
> > Testcase: igt/drv_selftest/live_coherency #gdg
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 42d24410a98c..08266791801e 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -802,7 +802,7 @@ void i915_gem_flush_ggtt_writes(struct drm_i915_private *dev_priv)
> >        * that was!).
> >        */
> >  
> > -     wmb();
> > +     i915_gem_chipset_flush(dev_priv);
> 
> this seems a void change for me... because I couldn't find the implementation
> of intel_gtt_chipset_flush() so it seems that we are just replacing wmb per wmb

For gen3, it triggers a write into the igfx flush page via
agp/intel-gtt.c, see i9xx_chipset_flush().
-Chris
Rodrigo Vivi July 17, 2018, 3:13 p.m. UTC | #3
On Tue, Jul 17, 2018 at 03:37:13PM +0100, Chris Wilson wrote:
> Quoting Rodrigo Vivi (2018-07-17 15:32:08)
> > On Tue, Jul 17, 2018 at 10:26:55AM +0100, Chris Wilson wrote:
> > > Our I915g (early gen3, the oldest machine we have in the farm) is still
> > > reporting occasional incoherency performing the following operations:
> > > 
> > >   1) write through GGTT (indirect write into memory)
> > >   2) write through either CPU or WC (direct write into memory)
> > >   3) read from GGTT (indirect read)
> > > 
> > > Instead of reporting the value from (2), the read from GGTT reports the
> > > earlier value written via the GGTT. We have made sure that the writes are
> > > flushed from the CPU (commit 3a32497f0dbe ("drm/i915/selftests: Provide
> > > full mb() around clflush") and commit add00e6d896f ("drm/i915: Flush the
> > > WCB following a WC write")), but still see the error, just less
> > > frequently. The only remaining cache that might be affected here is a
> > > chipset cache, so flush that as well.
> > > 
> > > Testcase: igt/drv_selftest/live_coherency #gdg
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 42d24410a98c..08266791801e 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -802,7 +802,7 @@ void i915_gem_flush_ggtt_writes(struct drm_i915_private *dev_priv)
> > >        * that was!).
> > >        */
> > >  
> > > -     wmb();
> > > +     i915_gem_chipset_flush(dev_priv);
> > 
> > this seems a void change for me... because I couldn't find the implementation
> > of intel_gtt_chipset_flush() so it seems that we are just replacing wmb per wmb
> 
> For gen3, it triggers a write into the igfx flush page via
> agp/intel-gtt.c, see i9xx_chipset_flush().

thanks for the pointer... now it makes sense

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

> -Chris
Chris Wilson July 17, 2018, 4:34 p.m. UTC | #4
Quoting Rodrigo Vivi (2018-07-17 16:13:41)
> On Tue, Jul 17, 2018 at 03:37:13PM +0100, Chris Wilson wrote:
> > Quoting Rodrigo Vivi (2018-07-17 15:32:08)
> > > On Tue, Jul 17, 2018 at 10:26:55AM +0100, Chris Wilson wrote:
> > > > Our I915g (early gen3, the oldest machine we have in the farm) is still
> > > > reporting occasional incoherency performing the following operations:
> > > > 
> > > >   1) write through GGTT (indirect write into memory)
> > > >   2) write through either CPU or WC (direct write into memory)
> > > >   3) read from GGTT (indirect read)
> > > > 
> > > > Instead of reporting the value from (2), the read from GGTT reports the
> > > > earlier value written via the GGTT. We have made sure that the writes are
> > > > flushed from the CPU (commit 3a32497f0dbe ("drm/i915/selftests: Provide
> > > > full mb() around clflush") and commit add00e6d896f ("drm/i915: Flush the
> > > > WCB following a WC write")), but still see the error, just less
> > > > frequently. The only remaining cache that might be affected here is a
> > > > chipset cache, so flush that as well.
> > > > 
> > > > Testcase: igt/drv_selftest/live_coherency #gdg
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_gem.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index 42d24410a98c..08266791801e 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -802,7 +802,7 @@ void i915_gem_flush_ggtt_writes(struct drm_i915_private *dev_priv)
> > > >        * that was!).
> > > >        */
> > > >  
> > > > -     wmb();
> > > > +     i915_gem_chipset_flush(dev_priv);
> > > 
> > > this seems a void change for me... because I couldn't find the implementation
> > > of intel_gtt_chipset_flush() so it seems that we are just replacing wmb per wmb
> > 
> > For gen3, it triggers a write into the igfx flush page via
> > agp/intel-gtt.c, see i9xx_chipset_flush().
> 
> thanks for the pointer... now it makes sense
> 
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Sadly the only way to find out if this is the final piece of the puzzle
is to soak test it in CI, so in goes.

Thanks,
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 42d24410a98c..08266791801e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -802,7 +802,7 @@  void i915_gem_flush_ggtt_writes(struct drm_i915_private *dev_priv)
 	 * that was!).
 	 */
 
-	wmb();
+	i915_gem_chipset_flush(dev_priv);
 
 	intel_runtime_pm_get(dev_priv);
 	spin_lock_irq(&dev_priv->uncore.lock);