
drm/i915: Only emit one semaphore per request

Message ID 20190401141838.1311-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Series drm/i915: Only emit one semaphore per request

Commit Message

Chris Wilson April 1, 2019, 2:18 p.m. UTC
Ideally we only need one semaphore per ring to accommodate waiting on
multiple engines in parallel. However, since we do not know which fences
we will finally be waiting on, we emit a semaphore for every fence. It
turns out to be quite easy to trick ourselves into exhausting our
ringbuffer and causing an error, just by feeding in a batch that depends
on several thousand contexts.

Since we can never be waiting on more than one semaphore in parallel
(other than perhaps the desire to busywait on multiple engines), just
pick the first fence for our semaphore. If we pick the wrong fence to
busywait on, we merely miss an opportunity to reduce latency.

An adaptation might be to use sched.flags as either a semaphore counter,
or to track the first busywait on each engine, converting it back to a
single-use bit prior to closing the request.

Testcase: igt/gem_exec_fence/long-history
Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
Fixes: e88619646971 ("drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 6 ++++++
 1 file changed, 6 insertions(+)
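
For a rough sense of scale, here is a standalone back-of-the-envelope
sketch (not driver code; the per-wait dword count and the ring size are
assumed values for illustration, not taken from i915) of how per-fence
semaphore waits can outgrow a request's ring space once a batch depends
on several thousand contexts:

#include <stdio.h>

int main(void)
{
        /* Assumed MI_SEMAPHORE_WAIT footprint and ring size, for illustration only. */
        const unsigned int dwords_per_wait = 4;
        const unsigned int bytes_per_dword = 4;
        const unsigned int ring_size = 16 << 10;        /* 16 KiB */
        const unsigned int fences = 4000;               /* "several thousand contexts" */
        const unsigned int needed = fences * dwords_per_wait * bytes_per_dword;

        printf("per-fence semaphore waits: %u bytes, ring: %u bytes\n",
               needed, ring_size);
        return 0;
}

With one semaphore per request instead of one per fence, the cost stays
constant no matter how many fences the request depends on.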

Comments

Chris Wilson April 1, 2019, 3:04 p.m. UTC | #1
Quoting Chris Wilson (2019-04-01 15:18:38)
> Ideally we only need one semaphore per ring to accommodate waiting on
> multiple engines in parallel. However, since we do not know which fences
> we will finally be waiting on, we emit a semaphore for every fence. It
> turns out to be quite easy to trick ourselves into exhausting our
> ringbuffer and causing an error, just by feeding in a batch that depends
> on several thousand contexts.
> 
> Since we can never be waiting on more than one semaphore in parallel
> (other than perhaps the desire to busywait on multiple engines), just
> pick the first fence for our semaphore. If we pick the wrong fence to
> busywait on, we merely miss an opportunity to reduce latency.
> 
> An adaptation might be to use sched.flags as either a semaphore counter,
> or to track the first busywait on each engine, converting it back to a
> single-use bit prior to closing the request.

Grrr, this disables semaphores for basic-sequential gem_exec_nop. Maybe
allow semaphores for the same GEM context regardless.
-Chris
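
A minimal sketch of what that tweak might look like (not the actual
follow-up patch; the gem_context field name is an assumption about the
request layout at this point in the driver's history) would be to skip
the software-fence fallback whenever both requests belong to the same
GEM context:

        /*
         * Sketch only: keep emitting semaphores between requests of the
         * same GEM context, and only fall back to a software fence wait
         * for cross-context dependencies once a semaphore has been used.
         */
        if (to->sched.flags & I915_SCHED_HAS_SEMAPHORE &&
            to->gem_context != from->gem_context)
                return i915_sw_fence_await_dma_fence(&to->submit,
                                                     &from->fence, 0,
                                                     I915_FENCE_GFP);

That keeps sequential same-context workloads such as gem_exec_nop on the
semaphore path, at the cost of again letting a long same-context
dependency chain consume ring space.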

Patch

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index e9c2094ab8ea..941c89e2fe82 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -783,6 +783,12 @@  emit_semaphore_wait(struct i915_request *to,
 	GEM_BUG_ON(!from->timeline->has_initial_breadcrumb);
 	GEM_BUG_ON(INTEL_GEN(to->i915) < 8);
 
+	/* Just emit the first semaphore we see as request space is limited. */
+	if (to->sched.flags & I915_SCHED_HAS_SEMAPHORE)
+		return i915_sw_fence_await_dma_fence(&to->submit,
+						     &from->fence, 0,
+						     I915_FENCE_GFP);
+
 	/* We need to pin the signaler's HWSP until we are finished reading. */
 	err = i915_timeline_read_hwsp(from, to, &hwsp_offset);
 	if (err)
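
The net effect of the hunk is that only the first dependency of a
request can claim a hardware semaphore busywait; every later dependency
degrades to a plain software fence wait, which costs no additional ring
space. Below is a standalone model of that decision (not driver code;
in i915 the flag is presumably set later in emit_semaphore_wait(), after
the semaphore instructions are actually emitted):

#include <stdbool.h>
#include <stdio.h>

enum wait_kind { HW_SEMAPHORE, SW_FENCE };

struct request_model {
        bool has_semaphore;     /* models I915_SCHED_HAS_SEMAPHORE */
};

static enum wait_kind emit_wait(struct request_model *rq)
{
        if (rq->has_semaphore)
                return SW_FENCE;        /* the i915_sw_fence_await_dma_fence() path */

        rq->has_semaphore = true;       /* first wait claims the semaphore slot */
        return HW_SEMAPHORE;
}

int main(void)
{
        struct request_model rq = { .has_semaphore = false };

        for (int i = 0; i < 4; i++)
                printf("dependency %d -> %s\n", i,
                       emit_wait(&rq) == HW_SEMAPHORE ? "semaphore busywait"
                                                      : "software fence wait");
        return 0;
}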