Message ID | 1466081680-2344-7-git-send-email-John.C.Harrison@Intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Op 16-06-16 om 14:54 schreef John.C.Harrison@Intel.com: > From: John Harrison <John.C.Harrison@Intel.com> > > The notify function can be called many times without the seqno > changing. Some are to prevent races due to the requirement of not > enabling interrupts until requested. However, when interrupts are > enabled the IRQ handler can be called multiple times without the > ring's seqno value changing. E.g. two interrupts are generated by > batch buffers completing in quick succession, the first call to the > handler processes both completions but the handler still gets executed > a second time. This patch reduces the overhead of these extra calls by > caching the last processed seqno value and early exiting if it has not > changed. > > v3: New patch for series. > > v5: Added comment about last_irq_seqno usage due to code review > feedback (Tvrtko Ursulin). > > v6: Minor update to resolve a race condition with the wait_request > optimisation. > > v7: Updated to newer nightly - lots of ring -> engine renaming plus an > interface change to get_seqno(). > > v10: Renamed the cached variable as it is no longer used at IRQ time. > [Review comment from Tvrtko Ursulin] > > For: VIZ-5190 > Signed-off-by: John Harrison <John.C.Harrison@Intel.com> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> I think it would be useful to add in the commit message that this happens in most cases. From your earlier mail: "Doing the cache check hits the early exit approx 98% of the time when running GLBenchmark. Although the vast majority of duplicate calls are from having to call the notify function from i915_gem_retire_requests_ring() and that being called at least once for every execbuf IOCTL (possibly multiple times). I have just made a couple of tweaks to further reduce the number of these calls and their impact, but there are still a lot of them." If added, Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index d7b88b1..405b1b7 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1382,6 +1382,7 @@ out: * request has not actually been fully processed yet. */ spin_lock_irq(&req->engine->fence_lock); + req->engine->last_notify_seqno = 0; i915_gem_request_notify(req->engine, true, true); spin_unlock_irq(&req->engine->fence_lock); } @@ -2628,9 +2629,12 @@ i915_gem_init_seqno(struct drm_i915_private *dev_priv, u32 seqno) i915_gem_retire_requests(dev_priv); /* Finally reset hw state */ - for_each_engine(engine, dev_priv) + for_each_engine(engine, dev_priv) { intel_ring_init_seqno(engine, seqno); + engine->last_notify_seqno = 0; + } + return 0; } @@ -2972,13 +2976,24 @@ void i915_gem_request_notify(struct intel_engine_cs *engine, bool fence_locked, return; } - if (!fence_locked) - spin_lock_irq(&engine->fence_lock); - + /* + * Check for a new seqno. If it hasn't actually changed then early + * exit without even grabbing the spinlock. Note that this is safe + * because any corruption of last_notify_seqno merely results in doing + * the full processing when there is potentially no work to be done. + * It can never lead to not processing work that does need to happen. + */ if (!lazy_coherency && engine->irq_seqno_barrier) engine->irq_seqno_barrier(engine); seqno = engine->get_seqno(engine); trace_i915_gem_request_notify(engine, seqno); + if (seqno == engine->last_notify_seqno) + return; + + if (!fence_locked) + spin_lock_irq(&engine->fence_lock); + + engine->last_notify_seqno = seqno; list_for_each_entry_safe(req, req_next, &engine->fence_signal_list, signal_link) { if (!req->cancelled && !i915_seqno_passed(seqno, req->seqno)) @@ -3270,7 +3285,10 @@ static void i915_gem_reset_engine_cleanup(struct drm_i915_private *dev_priv, /* * Make sure that any requests that were on the signal pending list also * get cleaned up. + * NB: The seqno cache must be cleared first otherwise the notify call + * will simply return immediately. */ + engine->last_notify_seqno = 0; i915_gem_request_notify(engine, false, false); /* Having flushed all requests from all queues, we know that all diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 9e79fbd..929f2f2 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -354,6 +354,7 @@ struct intel_engine_cs { */ spinlock_t fence_lock; struct list_head fence_signal_list; + uint32_t last_notify_seqno; struct work_struct request_work; };