diff mbox

[v10,6/6] drm/i915: Cache last IRQ seqno to reduce IRQ overhead

Message ID 1466081680-2344-7-git-send-email-John.C.Harrison@Intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

John Harrison June 16, 2016, 12:54 p.m. UTC
From: John Harrison <John.C.Harrison@Intel.com>

The notify function can be called many times without the seqno
changing. Some are to prevent races due to the requirement of not
enabling interrupts until requested. However, when interrupts are
enabled the IRQ handler can be called multiple times without the
ring's seqno value changing. E.g. two interrupts are generated by
batch buffers completing in quick succession, the first call to the
handler processes both completions but the handler still gets executed
a second time. This patch reduces the overhead of these extra calls by
caching the last processed seqno value and early exiting if it has not
changed.

v3: New patch for series.

v5: Added comment about last_irq_seqno usage due to code review
feedback (Tvrtko Ursulin).

v6: Minor update to resolve a race condition with the wait_request
optimisation.

v7: Updated to newer nightly - lots of ring -> engine renaming plus an
interface change to get_seqno().

v10: Renamed the cached variable as it is no longer used at IRQ time.
[Review comment from Tvrtko Ursulin]

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         | 26 ++++++++++++++++++++++----
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 2 files changed, 23 insertions(+), 4 deletions(-)

Comments

Maarten Lankhorst June 21, 2016, 12:29 p.m. UTC | #1
Op 16-06-16 om 14:54 schreef John.C.Harrison@Intel.com:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The notify function can be called many times without the seqno
> changing. Some are to prevent races due to the requirement of not
> enabling interrupts until requested. However, when interrupts are
> enabled the IRQ handler can be called multiple times without the
> ring's seqno value changing. E.g. two interrupts are generated by
> batch buffers completing in quick succession, the first call to the
> handler processes both completions but the handler still gets executed
> a second time. This patch reduces the overhead of these extra calls by
> caching the last processed seqno value and early exiting if it has not
> changed.
>
> v3: New patch for series.
>
> v5: Added comment about last_irq_seqno usage due to code review
> feedback (Tvrtko Ursulin).
>
> v6: Minor update to resolve a race condition with the wait_request
> optimisation.
>
> v7: Updated to newer nightly - lots of ring -> engine renaming plus an
> interface change to get_seqno().
>
> v10: Renamed the cached variable as it is no longer used at IRQ time.
> [Review comment from Tvrtko Ursulin]
>
> For: VIZ-5190
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

I think it would be useful to add in the commit message that this happens in most cases.

From your earlier mail:

"Doing the cache check hits the early exit approx 98% of the time when 
running GLBenchmark. Although the vast majority of duplicate calls are 
from having to call the notify function from 
i915_gem_retire_requests_ring() and that being called at least once for 
every execbuf IOCTL (possibly multiple times). I have just made a couple 
of tweaks to further reduce the number of these calls and their impact, 
but there are still a lot of them."

If added,

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d7b88b1..405b1b7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1382,6 +1382,7 @@  out:
 			 * request has not actually been fully processed yet.
 			 */
 			spin_lock_irq(&req->engine->fence_lock);
+			req->engine->last_notify_seqno = 0;
 			i915_gem_request_notify(req->engine, true, true);
 			spin_unlock_irq(&req->engine->fence_lock);
 		}
@@ -2628,9 +2629,12 @@  i915_gem_init_seqno(struct drm_i915_private *dev_priv, u32 seqno)
 	i915_gem_retire_requests(dev_priv);
 
 	/* Finally reset hw state */
-	for_each_engine(engine, dev_priv)
+	for_each_engine(engine, dev_priv) {
 		intel_ring_init_seqno(engine, seqno);
 
+		engine->last_notify_seqno = 0;
+	}
+
 	return 0;
 }
 
@@ -2972,13 +2976,24 @@  void i915_gem_request_notify(struct intel_engine_cs *engine, bool fence_locked,
 		return;
 	}
 
-	if (!fence_locked)
-		spin_lock_irq(&engine->fence_lock);
-
+	/*
+	 * Check for a new seqno. If it hasn't actually changed then early
+	 * exit without even grabbing the spinlock. Note that this is safe
+	 * because any corruption of last_notify_seqno merely results in doing
+	 * the full processing when there is potentially no work to be done.
+	 * It can never lead to not processing work that does need to happen.
+	 */
 	if (!lazy_coherency && engine->irq_seqno_barrier)
 		engine->irq_seqno_barrier(engine);
 	seqno = engine->get_seqno(engine);
 	trace_i915_gem_request_notify(engine, seqno);
+	if (seqno == engine->last_notify_seqno)
+		return;
+
+	if (!fence_locked)
+		spin_lock_irq(&engine->fence_lock);
+
+	engine->last_notify_seqno = seqno;
 
 	list_for_each_entry_safe(req, req_next, &engine->fence_signal_list, signal_link) {
 		if (!req->cancelled && !i915_seqno_passed(seqno, req->seqno))
@@ -3270,7 +3285,10 @@  static void i915_gem_reset_engine_cleanup(struct drm_i915_private *dev_priv,
 	/*
 	 * Make sure that any requests that were on the signal pending list also
 	 * get cleaned up.
+	 * NB: The seqno cache must be cleared first otherwise the notify call
+	 * will simply return immediately.
 	 */
+	engine->last_notify_seqno = 0;
 	i915_gem_request_notify(engine, false, false);
 
 	/* Having flushed all requests from all queues, we know that all
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 9e79fbd..929f2f2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -354,6 +354,7 @@  struct intel_engine_cs {
 	 */
 	spinlock_t fence_lock;
 	struct list_head fence_signal_list;
+	uint32_t last_notify_seqno;
 
 	struct work_struct request_work;
 };