drm/i915: Invalidate TLBs for the rings after a reset

Message ID 1375812074-2665-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson Aug. 6, 2013, 6:01 p.m. UTC
After any "soft gfx reset" we must manually invalidate the TLBs
associated with each ring. Empirically, it seems that a
suspend/resume or D3-D0 cycle counts as a "soft reset". The symptom is
that the hardware fails to note the new address for its status
page, and so it continues to write the shadow registers and
breadcrumbs into the old physical address (now used by something
completely different, scary). Meanwhile, the driver reads the new
status page and never sees any progress, so it appears that the GPU
hung immediately upon resume.

Based on a patch by naresh kumar kachhi <naresh.kumar.kacchi@intel.com>

Reported-by: Thiago Macieira <thiago@kde.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64725
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_reg.h         |  2 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 12 ++++++++++++
 2 files changed, 14 insertions(+)

Comments

Chris Wilson Aug. 16, 2013, 7:31 a.m. UTC | #1
On Tue, Aug 06, 2013 at 07:01:14PM +0100, Chris Wilson wrote:
> After any "soft gfx reset" we must manually invalidate the TLBs
> associated with each ring. Empirically, it seems that a
> suspend/resume or D3-D0 cycle count as a "soft reset". The symptom is
> that the hardware would fail to note the new address for its status
> page, and so it would continue to write the shadow registers and
> breadcrumbs into the old physical address (now used by something
> completely different, scary). Whereas the driver would read the new
> status page and never see any progress, it would appear that the GPU
> hung immediately upon resume.
> 
> Based on a patch by naresh kumar kachhi <naresh.kumar.kacchi@intel.com>
> 
> Reported-by: Thiago Macieira <thiago@kde.org>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64725
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Thiago reports that early testing indicates success.

Anyone fancy acking this and sending it on to stable@?
-Chris

Daniel Vetter Aug. 16, 2013, 12:17 p.m. UTC | #2
On Fri, Aug 16, 2013 at 08:31:35AM +0100, Chris Wilson wrote:
> On Tue, Aug 06, 2013 at 07:01:14PM +0100, Chris Wilson wrote:
> > After any "soft gfx reset" we must manually invalidate the TLBs
> > associated with each ring. Empirically, it seems that a
> > suspend/resume or D3-D0 cycle count as a "soft reset". The symptom is
> > that the hardware would fail to note the new address for its status
> > page, and so it would continue to write the shadow registers and
> > breadcrumbs into the old physical address (now used by something
> > completely different, scary). Whereas the driver would read the new
> > status page and never see any progress, it would appear that the GPU
> > hung immediately upon resume.
> > 
> > Based on a patch by naresh kumar kachhi <naresh.kumar.kacchi@intel.com>
> > 
> > Reported-by: Thiago Macieira <thiago@kde.org>
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64725
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Thiago reports that early testing indicates success.
> 
> Anyone fancy acking this and sending this onto to stable@?

Picked up for -fixes, thanks for the patch. I'll let it hang there a bit
though before forwarding, so I don't plan to update the -fixes pull
request I've just recently sent out.
-Daniel

Patch

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 714a909..ca82e5f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -752,6 +752,8 @@ 
 					will not assert AGPBUSY# and will only
 					be delivered when out of C3. */
 #define   INSTPM_FORCE_ORDERING				(1<<7) /* GEN6+ */
+#define   INSTPM_TLB_INVALIDATE	(1<<9)
+#define   INSTPM_SYNC_FLUSH	(1<<5)
 #define ACTHD	        0x020c8
 #define FW_BLC		0x020d8
 #define FW_BLC2		0x020dc
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index dbc1f7c..58eb6a0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -972,6 +972,18 @@  void intel_ring_setup_status_page(struct intel_ring_buffer *ring)
 
 	I915_WRITE(mmio, (u32)ring->status_page.gfx_addr);
 	POSTING_READ(mmio);
+
+	/* Flush the TLB for this page */
+	if (INTEL_INFO(dev)->gen >= 6) {
+		u32 reg = RING_INSTPM(ring->mmio_base);
+		I915_WRITE(reg,
+			   _MASKED_BIT_ENABLE(INSTPM_TLB_INVALIDATE |
+					      INSTPM_SYNC_FLUSH));
+		if (wait_for((I915_READ(reg) & INSTPM_SYNC_FLUSH) == 0,
+			     1000))
+			DRM_ERROR("%s: wait for SyncFlush to complete for TLB invalidation timed out\n",
+				  ring->name);
+	}
 }
 
 static int
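
The hunk above relies on two idioms: a masked register write and a bounded poll for the SYNC_FLUSH bit to clear. The following is a minimal user-space sketch of that sequence, not driver code. The bit positions come from the patch; fake_instpm, fake_write(), fake_hw_complete() and the hw_done_after parameter are hypothetical stand-ins for the real register and hardware, and the model assumes (as the wait_for() in the patch implies) that the hardware clears SYNC_FLUSH once the flush has completed.

```c
#include <stdint.h>

/* Bit positions taken from the patch above. */
#define INSTPM_TLB_INVALIDATE (1u << 9)
#define INSTPM_SYNC_FLUSH     (1u << 5)

/* Masked-write encoding: the upper 16 bits select which of the
 * lower 16 bits the write is allowed to change. */
#define MASKED_BIT_ENABLE(a)  (((uint32_t)(a) << 16) | (a))

/* Hypothetical stand-in for the INSTPM register; not real hardware. */
static uint32_t fake_instpm;

/* Model a masked write: only bits whose mask bit is set are updated. */
static void fake_write(uint32_t val)
{
	uint32_t mask = val >> 16;

	fake_instpm = (fake_instpm & ~mask) | (val & mask & 0xffffu);
}

/* Assumption: the hardware clears SYNC_FLUSH when the flush is done. */
static void fake_hw_complete(void)
{
	fake_instpm &= ~INSTPM_SYNC_FLUSH;
}

/* Poll until SYNC_FLUSH clears or the poll budget runs out.
 * Returns 0 on success, -1 on timeout. */
static int wait_for_sync_flush(int max_polls, int hw_done_after)
{
	for (int i = 0; i < max_polls; i++) {
		if (i == hw_done_after)
			fake_hw_complete();
		if ((fake_instpm & INSTPM_SYNC_FLUSH) == 0)
			return 0;
	}
	return -1;
}

/* Request the invalidation, then wait for the simulated hardware. */
int invalidate_tlb_demo(int max_polls, int hw_done_after)
{
	fake_instpm = 0;
	fake_write(MASKED_BIT_ENABLE(INSTPM_TLB_INVALIDATE |
				     INSTPM_SYNC_FLUSH));
	return wait_for_sync_flush(max_polls, hw_done_after);
}
```

Calling invalidate_tlb_demo(1000, 3) models hardware that completes on the fourth poll and returns 0. Calling invalidate_tlb_demo(5, 10) models hardware that does not respond within the poll budget and returns -1, which corresponds to the DRM_ERROR path in the patch.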