From patchwork Tue Aug 20 13:52:32 2019
From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Cc: Matthew Auld
Date: Tue, 20 Aug 2019 14:52:32 +0100
Message-Id: <20190820135232.31961-1-chris@chris-wilson.co.uk>
In-Reply-To: <20190820134847.22991-1-chris@chris-wilson.co.uk>
References: <20190820134847.22991-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH] drm/i915: Serialise the fill BLT with the vma pinning

Make sure that we wait for the vma to be pinned prior to telling the
GPU to fill the pages through that vma. However, since our async
operations fight over obj->resv->excl_fence, we must manually order
them. This makes the code much more fragile, and gives an outside
observer the chance to see the intermediate fences.

To be discussed!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld
Reviewed-by: Matthew Auld
---
 .../gpu/drm/i915/gem/i915_gem_client_blt.c    | 46 ++++++++++++++-----
 1 file changed, 35 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
index 3502071e1391..bbbc10499099 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
@@ -71,10 +71,30 @@ static struct i915_sleeve *create_sleeve(struct i915_address_space *vm,
 		goto err_free;
 	}
 
+	/*
+	 * XXX fix scheduling with get_pages & clear workers
+	 *
+	 * The complication is that we end up overwriting the same
+	 * obj->resv->excl_fence for each stage of the operation. That fence
+	 * should be set on scheduling the work, and only signaled upon
+	 * completion of the entire workqueue.
+	 *
+	 * Within the workqueue, we use the fence to schedule each individual
+	 * task. Each individual task knows to use obj->resv->fence.
+	 *
+	 * To an outsider, they must wait until the end and so the
+	 * obj->resv->fence must be the composite.
+	 *
+	 * Ideas?
+	 */
+	err = i915_vma_pin(vma, 0, 0, PIN_USER);
+	if (unlikely(err))
+		goto err_free;
+
 	vma->private = sleeve;
 	vma->ops = &proxy_vma_ops;
 
-	sleeve->vma = vma;
+	sleeve->vma = i915_vma_get(vma);
 	sleeve->pages = pages;
 	sleeve->page_sizes = *page_sizes;
 
@@ -87,6 +107,13 @@ static struct i915_sleeve *create_sleeve(struct i915_address_space *vm,
 
 static void destroy_sleeve(struct i915_sleeve *sleeve)
 {
+	struct i915_vma *vma = sleeve->vma;
+
+	if (vma) {
+		i915_vma_unpin(vma);
+		i915_vma_put(vma);
+	}
+
 	kfree(sleeve);
 }
 
@@ -155,8 +182,8 @@ static void clear_pages_dma_fence_cb(struct dma_fence *fence,
 static void clear_pages_worker(struct work_struct *work)
 {
 	struct clear_pages_work *w = container_of(work, typeof(*w), work);
-	struct drm_i915_gem_object *obj = w->sleeve->vma->obj;
-	struct i915_vma *vma = w->sleeve->vma;
+	struct i915_vma *vma = fetch_and_zero(&w->sleeve->vma);
+	struct drm_i915_gem_object *obj = vma->obj;
 	struct i915_request *rq;
 	struct i915_vma *batch;
 	int err = w->dma.error;
@@ -166,20 +193,16 @@ static void clear_pages_worker(struct work_struct *work)
 
 	if (obj->cache_dirty) {
 		if (i915_gem_object_has_struct_page(obj))
-			drm_clflush_sg(w->sleeve->pages);
+			drm_clflush_sg(vma->pages);
 		obj->cache_dirty = false;
 	}
 	obj->read_domains = I915_GEM_GPU_DOMAINS;
 	obj->write_domain = 0;
 
-	err = i915_vma_pin(vma, 0, 0, PIN_USER);
-	if (unlikely(err))
-		goto out_signal;
-
 	batch = intel_emit_vma_fill_blt(w->ce, vma, w->value);
 	if (IS_ERR(batch)) {
 		err = PTR_ERR(batch);
-		goto out_unpin;
+		goto out_signal;
 	}
 
 	rq = intel_context_create_request(w->ce);
@@ -224,14 +247,15 @@ static void clear_pages_worker(struct work_struct *work)
 	i915_request_add(rq);
 out_batch:
 	intel_emit_vma_release(w->ce, batch);
-out_unpin:
-	i915_vma_unpin(vma);
 out_signal:
 	if (unlikely(err)) {
 		dma_fence_set_error(&w->dma, err);
 		dma_fence_signal(&w->dma);
 		dma_fence_put(&w->dma);
 	}
+
+	i915_vma_unpin(vma);
+	i915_vma_put(vma);
 }
 
 static int __i915_sw_fence_call
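
One possible shape for the composite fence asked about in the XXX
comment ("Ideas?") would be a dma_fence_array: gather the per-stage
fences, wrap them in a single array fence, and publish only that as
the exclusive fence, so an outside observer never sees the
intermediate fences. Below is a minimal sketch, not part of this
patch, assuming the stage fences can be collected up front;
compose_stage_fences() and its parameters are illustrative names,
while dma_fence_array_create() is the real API from
<linux/dma-fence-array.h>:

#include <linux/dma-fence.h>
#include <linux/dma-fence-array.h>

/*
 * Sketch only: wrap the per-stage fences in one dma_fence_array so
 * that the published fence signals only once every stage completes.
 *
 * dma_fence_array_create() takes ownership of the @fences array and
 * of the references it holds, so @fences must be kmalloc'd and each
 * entry must carry its own reference.
 */
static struct dma_fence *
compose_stage_fences(struct dma_fence **fences, int num_fences)
{
	struct dma_fence_array *array;

	array = dma_fence_array_create(num_fences, fences,
				       dma_fence_context_alloc(1), 0,
				       false); /* signal when all complete */
	if (!array)
		return NULL;

	return &array->base;
}

The workers would still chain on the individual stage fences
internally; only the composite would ever be installed as the
object's exclusive fence.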