diff mbox

[3/3] drm/i915: Track & check calls to intel(_logical)_ring_{begin, advance}

Message ID 1418400783-5524-4-git-send-email-david.s.gordon@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Dave Gordon Dec. 12, 2014, 4:13 p.m. UTC
When adding instructions to a legacy or LRC ringbuffer, the sequence of
emit() calls must be preceded by a call to intel(_logical)_ring_begin()
to reserve the required amount of space, and followed by a matching call
to intel(_logical)_ring_advance() (note that this used to trigger
immediate submission to the h/w, but now actual submission is deferred
until all the instructions for a single batch submission have been
assembled). Historically some (display) code didn't use begin/advance,
but just inserted instructions ad hoc, which would then be sent to the
hardware along with the current or next batch, but this is not supported
and is now regarded as incorrect.

This commit therefore adds begin/advance tracking, with WARNings where
various forms of misuse are detected. These include:
* advance without begin
* begin without advance before submission to h/w
* multiple begins without an advance between
* exceeding the space reserved by begin
* leaving the ring misaligned
* ring buffer overrun (negative freespace)

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c        |    4 ++-
 drivers/gpu/drm/i915/intel_lrc.h        |   11 ++++++-
 drivers/gpu/drm/i915/intel_ringbuffer.c |    9 ++++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |   54 ++++++++++++++++++++++++++++++-
 4 files changed, 73 insertions(+), 5 deletions(-)

Comments

Chris Wilson Dec. 12, 2014, 5:12 p.m. UTC | #1
On Fri, Dec 12, 2014 at 04:13:03PM +0000, Dave Gordon wrote:
>  static inline void intel_ring_advance(struct intel_engine_cs *ring)
>  {
>  	struct intel_ringbuffer *ringbuf = ring->buffer;
> -	ringbuf->tail &= ringbuf->size - 1;
> +
> +	__intel_ringbuffer_check(ringbuf);
> +
> +	/*
> +	 * Tail == effecive_size is legitimate (buffer exactly full).
> +	 * Tail > effective_size is not, and should give a warning,
> +	 * but we'll reset tail in both cases to prevent further chaos
> +	 */
> +	if (ringbuf->tail >= ringbuf->effective_size)
> +		ringbuf->tail -= ringbuf->effective_size;

Urm. No. If you never write into the reserved pair of cachelines at the
end of the ringbuffer but the hw reads garbage from it, you lose.
tail &= size - 1; is a nice description of how the hw works that is
suitable for inlining, with all the magic in begin().

The goal is to remove the duplicated logic from intel_lrc.c, use
requests completely, and remove the dri1 hangover. To repeat what I said
last time:
http://cgit.freedesktop.org/~ickle/linux-2.6/tree/drivers/gpu/drm/i915/intel_ringbuffer.c?h=requests#n2447
is where I want us to go, a single piece of logic for ring management.
-Chris
Dave Gordon Dec. 12, 2014, 6:28 p.m. UTC | #2
On 12/12/14 17:12, Chris Wilson wrote:
> On Fri, Dec 12, 2014 at 04:13:03PM +0000, Dave Gordon wrote:
>>  static inline void intel_ring_advance(struct intel_engine_cs *ring)
>>  {
>>  	struct intel_ringbuffer *ringbuf = ring->buffer;
>> -	ringbuf->tail &= ringbuf->size - 1;
>> +
>> +	__intel_ringbuffer_check(ringbuf);
>> +
>> +	/*
>> +	 * Tail == effecive_size is legitimate (buffer exactly full).
>> +	 * Tail > effective_size is not, and should give a warning,
>> +	 * but we'll reset tail in both cases to prevent further chaos
>> +	 */
>> +	if (ringbuf->tail >= ringbuf->effective_size)
>> +		ringbuf->tail -= ringbuf->effective_size;
> 
> Urm. No. If you never write into the reserved pair of cachelines at the
> end of the ringbuffer but the hw reads garbage from it, you lose.

This won't happen, because (with the first patch of this set applied)
the check for sufficient space uses effective_size, but the wrap code
uses the actual size. Thus, once the next chunk won't fit in the
available space up to the effective_size, we write NOPs all over
whatever is left up to the total size. The part between the effective
and the actual sizes is therefore always (and only) written with NOPs.

> tail &= size - 1; is a nice description of how the hw works that is
> suitable for inlining, with all the magic in begin().
> 
> The goal is to remove the duplicated logic from intel_lrc.c, use
> requests completely, and remove the dri1 hangover. To repeat what I said
> last time:
> http://cgit.freedesktop.org/~ickle/linux-2.6/tree/drivers/gpu/drm/i915/intel_ringbuffer.c?h=requests#n2447
> is where I want us to go, a single piece of logic for ring management.
> -Chris

I agree with the deduplication, which is why nearly all these changes
are in functions operating on ringbuffers (not rings), and located in
intel_ringbuffer.h, with only minimal changes in the execlist and legacy
paths to use these common functions rather than having two implementations.

So, what I think I'll do is rework this third patch to merge the
tail-masking into the common function; rename it to something other than
/check/; and then the legacy and LRC versions become trivial wrappers
round this.

Are you happy with the first two patches?

.Dave.
Shuang He Dec. 14, 2014, 6:06 a.m. UTC | #3
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                                  364/364              364/364
ILK              +3                 362/366              365/366
SNB                                  448/450              448/450
IVB                                  497/498              497/498
BYT                                  289/289              289/289
HSW                                  563/564              563/564
BDW                                  417/417              417/417
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
 ILK  igt_kms_flip_nonexisting-fb      DMESG_WARN(1, M26)PASS(3, M37M26)      PASS(1, M37)
 ILK  igt_kms_flip_rcs-flip-vs-panning-interruptible      DMESG_WARN(1, M26)PASS(3, M37M26)      PASS(1, M37)
 ILK  igt_kms_flip_rcs-wf_vblank-vs-dpms-interruptible      DMESG_WARN(1, M26)PASS(2, M26M37)      PASS(1, M37)
Note: You need to pay more attention to line start with '*'
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 69b042f..422f3b9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -825,6 +825,7 @@  void intel_logical_ring_advance_and_submit(struct intel_ringbuffer *ringbuf)
 	struct intel_context *ctx = ringbuf->FIXME_lrc_ctx;
 
 	intel_logical_ring_advance(ringbuf);
+	WARN_ON(ringbuf->rsv_level != 0);
 
 	if (intel_ring_stopped(ring))
 		return;
@@ -1093,7 +1094,8 @@  int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf, int num_dwords)
 	if (ret)
 		return ret;
 
-	ringbuf->space -= num_dwords * sizeof(uint32_t);
+	__intel_ringbuffer_begin(ringbuf, num_dwords);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 14b216b..9a0457e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -48,8 +48,17 @@  void intel_logical_ring_advance_and_submit(struct intel_ringbuffer *ringbuf);
  */
 static inline void intel_logical_ring_advance(struct intel_ringbuffer *ringbuf)
 {
-	ringbuf->tail &= ringbuf->size - 1;
+	__intel_ringbuffer_check(ringbuf);
+
+	/*
+	 * Tail == effecive_size is legitimate (buffer exactly full).
+	 * Tail > effective_size is not, and should give a warning,
+	 * but we'll reset tail in both cases to prevent further chaos
+	 */
+	if (ringbuf->tail >= ringbuf->effective_size)
+		ringbuf->tail -= ringbuf->effective_size;
 }
+
 /**
  * intel_logical_ring_emit() - write a DWORD to the ringbuffer.
  * @ringbuf: Ringbuffer to write to.
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 660d10d..7ffad3a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -55,6 +55,7 @@  int __intel_ring_space(int head, int tail, int size)
 	int space = head - tail;
 	if (space <= 0)
 		space += size;
+	WARN_ON(space < I915_RING_FREE_SPACE);
 	return space - I915_RING_FREE_SPACE;
 }
 
@@ -85,7 +86,10 @@  bool intel_ring_stopped(struct intel_engine_cs *ring)
 void __intel_ring_advance(struct intel_engine_cs *ring)
 {
 	struct intel_ringbuffer *ringbuf = ring->buffer;
-	ringbuf->tail &= ringbuf->size - 1;
+
+	intel_ring_advance(ring);
+	WARN_ON(ringbuf->rsv_level != 0);
+
 	if (intel_ring_stopped(ring))
 		return;
 	ring->write_tail(ring, ringbuf->tail);
@@ -2107,7 +2111,8 @@  int intel_ring_begin(struct intel_engine_cs *ring,
 	if (ret)
 		return ret;
 
-	ring->buffer->space -= num_dwords * sizeof(uint32_t);
+	__intel_ringbuffer_begin(ring->buffer, num_dwords);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 6dbb6f4..a6660c1 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -112,6 +112,11 @@  struct intel_ringbuffer {
 	int size;
 	int effective_size;
 
+	/* these let advance() check for misuse */
+	int rsv_level;	/* reservation nesting level */
+	int rsv_size;	/* size passed to begin() */
+	int rsv_start;	/* tail when begin() last returned */
+
 	/** We track the position of the requests in the ring buffer, and
 	 * when each is retired we increment last_retired_head as the GPU
 	 * must have finished processing the request and so we know we
@@ -401,11 +406,58 @@  static inline void intel_ring_emit(struct intel_engine_cs *ring,
 	iowrite32(data, ringbuf->virtual_start + ringbuf->tail);
 	ringbuf->tail += 4;
 }
+
+static inline void __intel_ringbuffer_begin(struct intel_ringbuffer *ringbuf,
+					    int num_dwords)
+{
+	int nbytes = num_dwords * sizeof(uint32_t);
+
+	WARN_ON(num_dwords & 1);
+	WARN_ON(nbytes <= 0);
+
+	if (ringbuf->rsv_level++) {
+		/* begin() called twice or more without advance() */
+		WARN_ON(1);
+	} else {
+		/*
+		 * A new reservation; validate and record the start and
+		 * size, then deduct the size from the remaining space
+		 */
+		WARN_ON(ringbuf->tail & 7);
+		WARN_ON(ringbuf->tail > ringbuf->effective_size-8);
+		WARN_ON(ringbuf->tail + nbytes > ringbuf->effective_size);
+		WARN_ON(ringbuf->space < nbytes);
+	}
+
+	ringbuf->rsv_start = ringbuf->tail;
+	ringbuf->rsv_size = nbytes;
+	ringbuf->space -= nbytes;
+}
+
+static inline void __intel_ringbuffer_check(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(ringbuf->rsv_level-- != 1);
+	WARN_ON(ringbuf->rsv_start < 0 || ringbuf->rsv_size < 0);
+	WARN_ON(ringbuf->tail & 7);
+	WARN_ON(ringbuf->tail > ringbuf->rsv_start + ringbuf->rsv_size);
+	WARN_ON(ringbuf->tail > ringbuf->effective_size);
+}
+
 static inline void intel_ring_advance(struct intel_engine_cs *ring)
 {
 	struct intel_ringbuffer *ringbuf = ring->buffer;
-	ringbuf->tail &= ringbuf->size - 1;
+
+	__intel_ringbuffer_check(ringbuf);
+
+	/*
+	 * Tail == effecive_size is legitimate (buffer exactly full).
+	 * Tail > effective_size is not, and should give a warning,
+	 * but we'll reset tail in both cases to prevent further chaos
+	 */
+	if (ringbuf->tail >= ringbuf->effective_size)
+		ringbuf->tail -= ringbuf->effective_size;
 }
+
 int __intel_ring_space(int head, int tail, int size);
 void intel_ring_update_space(struct intel_ringbuffer *ringbuf);
 int intel_ring_space(struct intel_ringbuffer *ringbuf);