From patchwork Thu Oct 30 18:41:21 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Harrison <John.C.Harrison@Intel.com>
X-Patchwork-Id: 5199881
Return-Path: <intel-gfx-bounces@lists.freedesktop.org>
X-Original-To: patchwork-intel-gfx@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.19.201])
	by patchwork2.web.kernel.org (Postfix) with ESMTP id 77D60C11AC
	for <patchwork-intel-gfx@patchwork.kernel.org>;
	Thu, 30 Oct 2014 18:42:10 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id 6B745201FE
	for <patchwork-intel-gfx@patchwork.kernel.org>;
	Thu, 30 Oct 2014 18:42:09 +0000 (UTC)
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	by mail.kernel.org (Postfix) with ESMTP id 56300201DD
	for <patchwork-intel-gfx@patchwork.kernel.org>;
	Thu, 30 Oct 2014 18:42:08 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id AA9A56E55C;
	Thu, 30 Oct 2014 11:42:07 -0700 (PDT)
X-Original-To: Intel-GFX@lists.freedesktop.org
Delivered-To: Intel-GFX@lists.freedesktop.org
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
	by gabe.freedesktop.org (Postfix) with ESMTP id B43016E0D1
	for <Intel-GFX@lists.freedesktop.org>;
	Thu, 30 Oct 2014 11:41:57 -0700 (PDT)
Received: from fmsmga002.fm.intel.com ([10.253.24.26])
	by fmsmga101.fm.intel.com with ESMTP; 30 Oct 2014 11:41:57 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.07,287,1413270000"; d="scan'208";a="623664553"
Received: from johnharr-linux.isw.intel.com ([10.102.226.51])
	by fmsmga002.fm.intel.com with ESMTP; 30 Oct 2014 11:41:56 -0700
From: John.C.Harrison@Intel.com
To: Intel-GFX@Lists.FreeDesktop.Org
Date: Thu, 30 Oct 2014 18:41:21 +0000
Message-Id: <1414694481-15724-30-git-send-email-John.C.Harrison@Intel.com>
X-Mailer: git-send-email 1.7.9.5
In-Reply-To: <1414694481-15724-1-git-send-email-John.C.Harrison@Intel.com>
References: <1414694481-15724-1-git-send-email-John.C.Harrison@Intel.com>
Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way,
	Swindon SN3 1RJ
Subject: [Intel-gfx] [PATCH 29/29] WIP: Defer seqno allocation until actual
	hardware submission time
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Intel graphics driver community testing & development
	<intel-gfx.lists.freedesktop.org>
List-Unsubscribe: <http://lists.freedesktop.org/mailman/options/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <http://lists.freedesktop.org/archives/intel-gfx>
List-Post: <mailto:intel-gfx@lists.freedesktop.org>
List-Help: <mailto:intel-gfx-request@lists.freedesktop.org?subject=help>
List-Subscribe: <http://lists.freedesktop.org/mailman/listinfo/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=subscribe>
MIME-Version: 1.0
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
X-Spam-Status: No, score=-4.8 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
	RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: John Harrison <John.C.Harrison@Intel.com>

*** Work in progress. Do not submit - currently broken! ***
This patch is being included in the series simply to show the intention.

The seqno value is now only used for the final test for completion of a request.
It is no longer used to track the request through the software stack. Thus it is
no longer necessary to allocate the seqno immediately with the request. Instead,
it can be done lazily and left until the request is actually sent to the
hardware. This is particular advantageous with a GPU scheduler as the requests
can then be re-ordered between their creation and their hardware submission.

The problem with lazy seqno allocation is that wrapping the seqno around zero
currently involves idling the hardware for reasons that are not entirely clear.
This is something that is not safe to do half way through ring submission. Thus
the wrapping must be done in advance even if the actual seqno assignment is not
done until i915_add_request().

Unfortunately, there is no clearly defined point at which to do the
semi-lazy-allocation or advance wrapping. The problem is that requests can be
submitted asynchronously half way through do_execbuffer() processing! The
solution is to ensure that all commands are actually submitted as requests at
the appropriate time by adding extra add_request() calls. That is future work...

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
---
 drivers/gpu/drm/i915/i915_drv.h         |    2 +-
 drivers/gpu/drm/i915/i915_gem.c         |   31 ++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_lrc.c        |   10 ++++------
 drivers/gpu/drm/i915/intel_ringbuffer.c |   10 ++++------
 4 files changed, 39 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2284ecef..fcc9a4b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2526,7 +2526,7 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
 	return (int32_t)(seq1 - seq2) >= 0;
 }
 
-int __must_check i915_gem_get_seqno(struct drm_device *dev, u32 *seqno);
+int __must_check i915_gem_prepare_next_seqno(struct drm_device *dev);
 int __must_check i915_gem_set_seqno(struct drm_device *dev, u32 seqno);
 int __must_check i915_gem_object_get_fence(struct drm_i915_gem_object *obj);
 int __must_check i915_gem_object_put_fence(struct drm_i915_gem_object *obj);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ad0458b..da2371a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2314,12 +2314,15 @@ int i915_gem_set_seqno(struct drm_device *dev, u32 seqno)
 }
 
 int
-i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
+i915_gem_prepare_next_seqno(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
 	/* reserve 0 for non-seqno */
 	if (dev_priv->next_seqno == 0) {
+		/* Why is the full re-initialisation required? Is it only for
+		 * hardware semaphores? If so, could skip it in the case where
+		 * semaphores are disabled? */
 		int ret = i915_gem_init_seqno(dev, 0);
 		if (ret)
 			return ret;
@@ -2327,6 +2330,24 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
 		dev_priv->next_seqno = 1;
 	}
 
+	return 0;
+}
+
+static int
+i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	/* reserve 0 for non-seqno */
+	if (dev_priv->next_seqno == 0) {
+		/* Should never get here! Must always call 'prepare_next' in
+		 * advance. This code is called during request submission.
+		 * Trying to wrap the seqno and the implicit idle() calls that
+		 * the wrap code makes are a bad idea at this point! */
+		DRM_ERROR("Need to wrap seqno at inopportune moment!\n");
+		return -EBUSY;
+	}
+
 	*seqno = dev_priv->last_seqno = dev_priv->next_seqno++;
 	return 0;
 }
@@ -2369,6 +2390,11 @@ int __i915_add_request(struct intel_engine_cs *ring,
 			return ret;
 	}
 
+	/* Assign an identifier to track this request through the hardware: */
+	ret = i915_gem_get_seqno(ring->dev, &request->seqno);
+	if (ret)
+		return ret;
+
 	/* Record the position of the start of the request so that
 	 * should we detect the updated seqno part-way through the
 	 * GPU processing the request, we never over-estimate the
@@ -2673,6 +2699,9 @@ void i915_gem_complete_requests_ring(struct intel_engine_cs *ring,
 		if (req->complete)
 			continue;
 
+		if (req->seqno == 0)
+			continue;
+
 		if (i915_seqno_passed(seqno, req->seqno))
 			req->complete = true;
 	}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index edf1033..4d62978 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -803,6 +803,10 @@ static int logical_ring_alloc_request(struct intel_engine_cs *ring,
 	if (ring->outstanding_lazy_request)
 		return 0;
 
+	ret = i915_gem_prepare_next_seqno(ring->dev);
+	if (ret)
+		return ret;
+
 	request = kzalloc(sizeof(*request), GFP_KERNEL);
 	if (request == NULL)
 		return -ENOMEM;
@@ -811,12 +815,6 @@ static int logical_ring_alloc_request(struct intel_engine_cs *ring,
 	request->ring = ring;
 	request->uniq = dev_private->request_uniq++;
 
-	ret = i915_gem_get_seqno(ring->dev, &request->seqno);
-	if (ret) {
-		kfree(request);
-		return ret;
-	}
-
 	/* Hold a reference to the context this request belongs to
 	 * (we will need it when the time comes to emit/retire the
 	 * request).
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 7b5c8b8..9ff4ea6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2034,6 +2034,10 @@ intel_ring_alloc_request(struct intel_engine_cs *ring)
 	if (ring->outstanding_lazy_request)
 		return 0;
 
+	ret = i915_gem_prepare_next_seqno(ring->dev);
+	if (ret)
+		return ret;
+
 	request = kzalloc(sizeof(*request), GFP_KERNEL);
 	if (request == NULL)
 		return -ENOMEM;
@@ -2042,12 +2046,6 @@ intel_ring_alloc_request(struct intel_engine_cs *ring)
 	request->ring = ring;
 	request->uniq = dev_private->request_uniq++;
 
-	ret = i915_gem_get_seqno(ring->dev, &request->seqno);
-	if (ret) {
-		kfree(request);
-		return ret;
-	}
-
 	ring->outstanding_lazy_request = request;
 	return 0;
 }