From patchwork Thu Dec 12 01:03:53 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Wilson <chris@chris-wilson.co.uk>
X-Patchwork-Id: 11286665
From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Date: Thu, 12 Dec 2019 01:03:53 +0000
Message-Id: <20191212010353.736593-1-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 2.24.0
Subject: [Intel-gfx] [PATCH] drm/i915/gt: Pull intel_timeline.requests list
 under a spinlock
List-Id: Intel graphics driver community testing & development
 <intel-gfx.lists.freedesktop.org>

Currently we use intel_timeline.mutex to guard constructing and
retiring requests in the timeline, including adding the request to and
removing it from the list of requests (intel_timeline.requests).
However, we also want to peek at neighbouring elements in the request
list while constructing a request on another timeline (see
i915_request_await_start), which implies nesting timeline mutexes. To
avoid the nested mutex, we currently use mutex_trylock(), but that
races with whoever holds the mutex and reports -EBUSY whenever the
trylock fails. We can eliminate the nested mutex by adding a spinlock
that guards just the list operations within the broader mutex, so
callers can choose between locking everything with the mutex or only
the list with the spinlock. (The mutex caters for virtually all of the
current users, but once peeking at the request list is this easy, we
may well do so more often in the future.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
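Note for reviewers: the two-level locking described above can be
sketched in userspace as follows. This is only an illustration under
stated assumptions, not the driver code: struct request, struct
timeline and the timeline_*() helpers are hypothetical names, a pthread
mutex stands in for intel_timeline.mutex, and a C11 atomic_flag stands
in for the new request_lock spinlock. Writers hold the outer mutex and
take the inner lock only around the list update; a reader on another
timeline takes just the inner lock to look at a neighbour.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins; none of these names are i915 API. */
struct request {
	struct request *prev, *next;	/* neighbours on the timeline */
	int seqno;
};

struct timeline {
	pthread_mutex_t mutex;		/* outer lock: construct and retire */
	atomic_flag request_lock;	/* inner lock: list links only */
	struct request head;		/* sentinel of the circular list */
};

static void list_lock(struct timeline *tl)
{
	/* Busy-wait until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&tl->request_lock,
						 memory_order_acquire))
		;
}

static void list_unlock(struct timeline *tl)
{
	atomic_flag_clear_explicit(&tl->request_lock, memory_order_release);
}

void timeline_init(struct timeline *tl)
{
	pthread_mutex_init(&tl->mutex, NULL);
	atomic_flag_clear(&tl->request_lock);
	tl->head.prev = tl->head.next = &tl->head;
	tl->head.seqno = -1;
}

/* Caller holds tl->mutex; the inner lock covers only the link update. */
void timeline_add(struct timeline *tl, struct request *rq)
{
	list_lock(tl);
	rq->prev = tl->head.prev;
	rq->next = &tl->head;
	tl->head.prev->next = rq;
	tl->head.prev = rq;
	list_unlock(tl);
}

/* Caller holds tl->mutex and retires requests in order. */
void timeline_retire(struct timeline *tl, struct request *rq)
{
	assert(tl->head.next == rq);	/* must be the oldest request */

	list_lock(tl);
	rq->prev->next = rq->next;
	rq->next->prev = rq->prev;
	list_unlock(tl);
}

/*
 * Peek at the request queued before rq without taking tl->mutex.
 * Returns its seqno, or -1 if rq is first on the timeline.
 */
int timeline_peek_prev(struct timeline *tl, const struct request *rq)
{
	int seqno = -1;

	list_lock(tl);
	if (rq->prev != &tl->head)
		seqno = rq->prev->seqno;
	list_unlock(tl);

	return seqno;
}
```

Because timeline_peek_prev() never touches the outer mutex, it cannot
fail the way a trylock can; it only waits for the short window in which
another thread is relinking the list.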
 drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
 .../gpu/drm/i915/gt/intel_timeline_types.h    |  1 +
 .../gpu/drm/i915/gt/selftests/mock_timeline.c |  1 +
 drivers/gpu/drm/i915/i915_request.c           | 54 +++++++++++--------
 4 files changed, 34 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 038e05a6336c..06cbd0777a0c 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -256,6 +256,7 @@ int intel_timeline_init(struct intel_timeline *timeline,
 
 	INIT_ACTIVE_FENCE(&timeline->last_request);
 	INIT_LIST_HEAD(&timeline->requests);
+	spin_lock_init(&timeline->request_lock);
 
 	i915_syncmap_init(&timeline->sync);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
index aaf15cbe1ce1..7c9f49f46626 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
@@ -57,6 +57,7 @@ struct intel_timeline {
 	 * outstanding.
 	 */
 	struct list_head requests;
+	spinlock_t request_lock;
 
 	/*
 	 * Contains an RCU guarded pointer to the last request. No reference is
diff --git a/drivers/gpu/drm/i915/gt/selftests/mock_timeline.c b/drivers/gpu/drm/i915/gt/selftests/mock_timeline.c
index aeb1d1f616e8..540729250fef 100644
--- a/drivers/gpu/drm/i915/gt/selftests/mock_timeline.c
+++ b/drivers/gpu/drm/i915/gt/selftests/mock_timeline.c
@@ -17,6 +17,7 @@ void mock_timeline_init(struct intel_timeline *timeline, u64 context)
 
 	INIT_ACTIVE_FENCE(&timeline->last_request);
 	INIT_LIST_HEAD(&timeline->requests);
+	spin_lock_init(&timeline->request_lock);
 
 	i915_syncmap_init(&timeline->sync);
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 51bb8a0812a1..219e1e3ed440 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -184,6 +184,27 @@ remove_from_client(struct i915_request *request)
 	rcu_read_unlock();
 }
 
+static inline void remove_from_timeline(struct i915_request *rq)
+{
+	struct intel_timeline *tl = i915_request_timeline(rq);
+
+	/*
+	 * We know the GPU must have read the request to have
+	 * sent us the seqno + interrupt, so use the position
+	 * of tail of the request to update the last known position
+	 * of the GPU head.
+	 *
+	 * Note this requires that we are always called in request
+	 * completion order.
+	 */
+	GEM_BUG_ON(!list_is_first(&rq->link, &tl->requests));
+	rq->ring->head = rq->postfix;
+
+	spin_lock(&tl->request_lock);
+	list_del(&rq->link);
+	spin_unlock(&tl->request_lock);
+}
+
 static void free_capture_list(struct i915_request *request)
 {
 	struct i915_capture_list *capture;
@@ -231,19 +252,6 @@ bool i915_request_retire(struct i915_request *rq)
 	GEM_BUG_ON(!i915_sw_fence_signaled(&rq->submit));
 	trace_i915_request_retire(rq);
 
-	/*
-	 * We know the GPU must have read the request to have
-	 * sent us the seqno + interrupt, so use the position
-	 * of tail of the request to update the last known position
-	 * of the GPU head.
-	 *
-	 * Note this requires that we are always called in request
-	 * completion order.
-	 */
-	GEM_BUG_ON(!list_is_first(&rq->link,
-				  &i915_request_timeline(rq)->requests));
-	rq->ring->head = rq->postfix;
-
 	/*
 	 * We only loosely track inflight requests across preemption,
 	 * and so we may find ourselves attempting to retire a _completed_
@@ -270,7 +278,7 @@ bool i915_request_retire(struct i915_request *rq)
 	spin_unlock_irq(&rq->lock);
 
 	remove_from_client(rq);
-	list_del(&rq->link);
+	remove_from_timeline(rq);
 
 	intel_context_exit(rq->hw_context);
 	intel_context_unpin(rq->hw_context);
@@ -783,16 +791,14 @@ i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
 	if (!tl) /* already started or maybe even completed */
 		return 0;
 
-	fence = ERR_PTR(-EBUSY);
-	if (mutex_trylock(&tl->mutex)) {
-		fence = NULL;
-		if (!i915_request_started(signal) &&
-		    !list_is_first(&signal->link, &tl->requests)) {
-			signal = list_prev_entry(signal, link);
-			fence = dma_fence_get(&signal->fence);
-		}
-		mutex_unlock(&tl->mutex);
+	fence = NULL;
+	spin_lock(&tl->request_lock);
+	if (!i915_request_started(signal) &&
+	    !list_is_first(&signal->link, &tl->requests)) {
+		signal = list_prev_entry(signal, link);
+		fence = dma_fence_get(&signal->fence);
 	}
+	spin_unlock(&tl->request_lock);
 	intel_timeline_put(tl);
 	if (IS_ERR_OR_NULL(fence))
 		return PTR_ERR_OR_ZERO(fence);
@@ -1238,7 +1244,9 @@ __i915_request_add_to_timeline(struct i915_request *rq)
 							 0);
 	}
 
+	spin_lock(&timeline->request_lock);
 	list_add_tail(&rq->link, &timeline->requests);
+	spin_unlock(&timeline->request_lock);
 
 	/*
 	 * Make sure that no request gazumped us - if it was allocated after