diff mbox

[CI,1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance()

Message ID 20170206170502.30944-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson Feb. 6, 2017, 5:05 p.m. UTC
It is required that the caller declare the exact number of dwords they
wish to write into the ring. This is required for two reasons, we need
to allocate sufficient space for the entire command packet and we need
to be sure that the contents are completely written to avoid executing
stale data. The current interface requires for any bug to be caught in
review, the reader has to carefully count the number of
intel_ring_emit() between intel_ring_begin() and intel_ring_advance().
If we record the end of the packet of each intel_ring_begin() we can
also have CI check for us.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.h         | 9 +++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h | 3 +++
 3 files changed, 13 insertions(+)

Comments

Chris Wilson Feb. 6, 2017, 8:44 p.m. UTC | #1
On Mon, Feb 06, 2017 at 07:52:15PM -0000, Patchwork wrote:
> == Series Details ==
> 
> Series: series starting with [CI,1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance()
> URL   : https://patchwork.freedesktop.org/series/19169/
> State : success
> 
> == Summary ==
> 
> Series 19169v1 Series without cover letter
> https://patchwork.freedesktop.org/api/1.0/series/19169/revisions/1/mbox/
> 
> Test kms_force_connector_basic:
>         Subgroup force-connector-state:
>                 skip       -> PASS       (fi-snb-2520m)

Ok, I pushed this as given Mika's feedback this may be responsible for
skl (at least) sporadically dieing in CI. I'm sure the macros for the
conditional code I added can and will be improved.
-Chris
Mika Kuoppala Feb. 7, 2017, 11:02 a.m. UTC | #2
Chris Wilson <chris@chris-wilson.co.uk> writes:

> On Mon, Feb 06, 2017 at 07:52:15PM -0000, Patchwork wrote:
>> == Series Details ==
>> 
>> Series: series starting with [CI,1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance()
>> URL   : https://patchwork.freedesktop.org/series/19169/
>> State : success
>> 
>> == Summary ==
>> 
>> Series 19169v1 Series without cover letter
>> https://patchwork.freedesktop.org/api/1.0/series/19169/revisions/1/mbox/
>> 
>> Test kms_force_connector_basic:
>>         Subgroup force-connector-state:
>>                 skip       -> PASS       (fi-snb-2520m)
>
> Ok, I pushed this as given Mika's feedback this may be responsible for
> skl (at least) sporadically dieing in CI. I'm sure the macros for the
> conditional code I added can and will be improved.

Observations with my kbl indicate that the retirement freed the request's
context, which then got referenced by the GEM_BUG_ON. Instead of neat
trace, full system hang ensued.

With this patch, my kbl seems very stable again.

I suspect that atleast some incomplete CI results we have seen,
on execlist machines, were due to this.

-Mika

> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index a585d47c420a..412fa2794cbb 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -28,9 +28,18 @@ 
 #ifdef CONFIG_DRM_I915_DEBUG_GEM
 #define GEM_BUG_ON(expr) BUG_ON(expr)
 #define GEM_WARN_ON(expr) WARN_ON(expr)
+
+#define GEM_BUG_ONLY(expr) expr
+#define GEM_BUG_ONLY_DECLARE(var) var
+#define GEM_BUG_ONLY_ON(expr) GEM_BUG_ON(expr)
+
 #else
 #define GEM_BUG_ON(expr) BUILD_BUG_ON_INVALID(expr)
 #define GEM_WARN_ON(expr) (BUILD_BUG_ON_INVALID(expr), 0)
+
+#define GEM_BUG_ONLY(expr) do { } while (0)
+#define GEM_BUG_ONLY_DECLARE(var)
+#define GEM_BUG_ONLY_ON(expr)
 #endif
 
 #define I915_NUM_ENGINES 5
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d32cbba25d98..383083ef2210 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2238,6 +2238,7 @@  int intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
 
 	ring->space -= bytes;
 	GEM_BUG_ON(ring->space < 0);
+	GEM_BUG_ONLY(ring->advance = ring->tail + bytes);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index b9c15cd40fbf..2c6d3655985e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -144,6 +144,8 @@  struct intel_ring {
 
 	u32 head;
 	u32 tail;
+	GEM_BUG_ONLY_DECLARE(u32 advance);
+
 	int space;
 	int size;
 	int effective_size;
@@ -516,6 +518,7 @@  static inline void intel_ring_advance(struct intel_ring *ring)
 	 * reserved for the command packet (i.e. the value passed to
 	 * intel_ring_begin()).
 	 */
+	GEM_BUG_ONLY_ON(ring->tail != ring->advance);
 }
 
 static inline u32 intel_ring_offset(struct intel_ring *ring, void *addr)