From patchwork Fri Jun 8 12:55:47 2018
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 10454361
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Fri, 8 Jun 2018 13:55:47 +0100
Message-Id: <20180608125602.17693-4-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180608125602.17693-1-chris@chris-wilson.co.uk>
References: <20180608125602.17693-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH 03/18] drm/i915/ringbuffer: Fix context restore upon reset

The discovery while trying to enable full-ppgtt was that we were
completely failing to load both the mm and the context following the
reset. Although we were performing mmio writes to set the PP_DIR
(per-process GTT) and CCID (context), these were taking no effect (the
assumption was that they would trigger a reload of the context and
restore the page tables). It was not until we performed the LRI +
MI_SET_CONTEXT in a following context switch that anything would occur.
Since we are therefore required to reset the context image and PP_DIR
using CS commands, we place those commands into every batch.
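As an aside, a minimal standalone sketch of the idea (not driver code;
the names, register offsets and dword encodings below are illustrative
stand-ins only): every request gains a small header of CS commands that
reloads the ppgtt page directory and the context image ahead of the
user payload, and the boundary between header and payload is recorded
so the reset path can later skip just the payload.

/* Illustrative sketch only -- not the i915 implementation. */
#include <stdint.h>

#define MI_NOOP		0u
#define MI_SET_CONTEXT	(0x18u << 23)	/* MI_INSTR(0x18, 0) */

struct toy_ring {
	uint32_t cmds[1024];
	uint32_t emit;			/* next free dword */
};

void emit(struct toy_ring *ring, uint32_t dw)
{
	ring->cmds[ring->emit++] = dw;
}

/*
 * Emit the per-request restore header and return the "infix": the
 * boundary between the header and the user payload.  The register
 * offsets and encodings here are placeholders, not bspec values.
 */
uint32_t toy_begin_request(struct toy_ring *ring,
			   uint32_t pd_offset, uint32_t ctx_offset)
{
	/* Reload the per-process GTT page directory from the CS. */
	emit(ring, 0x11000001u);	/* MI_LOAD_REGISTER_IMM(1), illustrative */
	emit(ring, 0x2228u);		/* RING_PP_DIR_BASE, illustrative */
	emit(ring, pd_offset);

	/* Reload the logical context image. */
	emit(ring, MI_SET_CONTEXT);
	emit(ring, ctx_offset | 1u);	/* validity bit, illustrative */
	emit(ring, MI_NOOP);		/* keep the packet even-length */

	return ring->emit * sizeof(uint32_t);	/* byte offset of payload */
}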
The hardware should recognise the no-ops and eliminate the expensive
context loads, but we still have to pay the cost of using
cross-powerwell register writes. In practice, this has no effect on
actual context switch times, and only adds a few hundred nanoseconds to
no-op switches. We can improve the latter by eliminating the w/a around
known no-op switches, but there is an ulterior motive to keeping them.
Always emitting the context switch at the beginning of the request (and
relying on HW to skip unneeded switches) does have one key advantage.
Should we implement request reordering on Haswell, we will not know in
advance what the previously executing context on the GPU was, and so we
would not be able to elide the MI_SET_CONTEXT commands ourselves and
would always have to emit them. Having our hand forced now actually
prepares us for later.

Signed-off-by: Chris Wilson
Cc: Joonas Lahtinen
Cc: Mika Kuoppala
Cc: Matthew Auld
Cc: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/i915_request.c     |  2 +
 drivers/gpu/drm/i915/i915_request.h     |  3 +
 drivers/gpu/drm/i915/intel_engine_cs.c  |  3 -
 drivers/gpu/drm/i915/intel_ringbuffer.c | 75 ++++++++-----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  9 ---
 5 files changed, 28 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index f187250e60c6..9092f5464c24 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -817,6 +817,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	/* Keep a second pin for the dual retirement along engine and ring */
 	__intel_context_pin(ce);
 
+	rq->infix = rq->ring->emit; /* end of header; start of user payload */
+
 	/* Check that we didn't interrupt ourselves with a new request */
 	GEM_BUG_ON(rq->timeline->seqno != rq->fence.seqno);
 	return rq;
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 491ff81d0fea..0e9aba53d0e4 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -134,6 +134,9 @@ struct i915_request {
 	/** Position in the ring of the start of the request */
 	u32 head;
 
+	/** Position in the ring of the start of the user packets */
+	u32 infix;
+
 	/**
 	 * Position in the ring of the start of the postfix.
	 * This is required to calculate the maximum available ring space
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 2ec2e60dc670..d1cf8b4926ab 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1168,9 +1168,6 @@ void intel_engine_lost_context(struct intel_engine_cs *engine)
 
 	lockdep_assert_held(&engine->i915->drm.struct_mutex);
 
-	engine->legacy_active_context = NULL;
-	engine->legacy_active_ppgtt = NULL;
-
 	ce = fetch_and_zero(&engine->last_retired_context);
 	if (ce)
 		intel_context_unpin(ce);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 332d97bc5c27..1b3805adbd57 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -541,6 +541,21 @@ static struct i915_request *reset_prepare(struct intel_engine_cs *engine)
 	return i915_gem_find_active_request(engine);
 }
 
+static void skip_request(struct i915_request *request)
+{
+	void *vaddr = request->ring->vaddr;
+	u32 head;
+
+	head = request->infix;
+	if (request->postfix < head) {
+		memset32(vaddr + head, MI_NOOP,
+			 (request->ring->size - head) / sizeof(u32));
+		head = 0;
+	}
+	memset32(vaddr + head, MI_NOOP,
+		 (request->postfix - head) / sizeof(u32));
+}
+
 static void reset_ring(struct intel_engine_cs *engine,
 		       struct i915_request *request)
 {
@@ -570,42 +585,10 @@ static void reset_ring(struct intel_engine_cs *engine,
 	 * the restored context.
 	 */
 	if (request) {
-		struct drm_i915_private *dev_priv = request->i915;
-		struct intel_context *ce = request->hw_context;
-		struct i915_hw_ppgtt *ppgtt;
-
-		if (ce->state) {
-			I915_WRITE(CCID,
-				   i915_ggtt_offset(ce->state) |
-				   BIT(8) /* must be set! */ |
-				   CCID_EXTENDED_STATE_SAVE |
-				   CCID_EXTENDED_STATE_RESTORE |
-				   CCID_EN);
-		}
-
-		ppgtt = request->gem_context->ppgtt ?: engine->i915->mm.aliasing_ppgtt;
-		if (ppgtt) {
-			u32 pd_offset = ppgtt->pd.base.ggtt_offset << 10;
-
-			I915_WRITE(RING_PP_DIR_DCLV(engine), PP_DIR_DCLV_2G);
-			I915_WRITE(RING_PP_DIR_BASE(engine), pd_offset);
-
-			/* Wait for the PD reload to complete */
-			if (intel_wait_for_register(dev_priv,
-						    RING_PP_DIR_BASE(engine),
-						    BIT(0), 0,
-						    10))
-				DRM_ERROR("Wait for reload of ppgtt page-directory timed out\n");
-
-			ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
-		}
-
 		/* If the rq hung, jump to its breadcrumb and skip the batch */
+		request->ring->head = request->head;
 		if (request->fence.error == -EIO)
-			request->ring->head = request->postfix;
-	} else {
-		engine->legacy_active_context = NULL;
-		engine->legacy_active_ppgtt = NULL;
+			skip_request(request);
 	}
 }
 
@@ -1589,31 +1572,25 @@ static int switch_context(struct i915_request *rq)
 	struct i915_gem_context *to_ctx = rq->gem_context;
 	struct i915_hw_ppgtt *to_mm =
 		to_ctx->ppgtt ?: rq->i915->mm.aliasing_ppgtt;
-	struct i915_gem_context *from_ctx = engine->legacy_active_context;
-	struct i915_hw_ppgtt *from_mm = engine->legacy_active_ppgtt;
 	u32 hw_flags = 0;
 	int ret, i;
 
 	lockdep_assert_held(&rq->i915->drm.struct_mutex);
 	GEM_BUG_ON(HAS_EXECLISTS(rq->i915));
 
-	if (to_mm != from_mm ||
-	    (to_mm && intel_engine_flag(engine) & to_mm->pd_dirty_rings)) {
+	if (to_mm) {
 		trace_switch_mm(engine, to_ctx);
 		ret = to_mm->switch_mm(to_mm, rq);
 		if (ret)
 			goto err;
-		to_mm->pd_dirty_rings &= ~intel_engine_flag(engine);
-		engine->legacy_active_ppgtt = to_mm;
-
-		if (to_ctx == from_ctx) {
+		if (intel_engine_flag(engine) & to_mm->pd_dirty_rings) {
 			hw_flags = MI_FORCE_RESTORE;
-			from_ctx = NULL;
+			to_mm->pd_dirty_rings &=
+				~intel_engine_flag(engine);
 		}
 	}
 
-	if (rq->hw_context->state && to_ctx != from_ctx) {
+	if (rq->hw_context->state) {
 		GEM_BUG_ON(engine->id != RCS);
 
 		/*
@@ -1628,9 +1605,7 @@ static int switch_context(struct i915_request *rq)
 
 		ret = mi_set_context(rq, hw_flags);
 		if (ret)
-			goto err_mm;
-
-		engine->legacy_active_context = to_ctx;
+			goto err;
 	}
 
 	if (to_ctx->remap_slice) {
@@ -1640,7 +1615,7 @@ static int switch_context(struct i915_request *rq)
 
 			ret = remap_l3(rq, i);
 			if (ret)
-				goto err_ctx;
+				goto err;
 		}
 
 		to_ctx->remap_slice = 0;
@@ -1648,10 +1623,6 @@ static int switch_context(struct i915_request *rq)
 
 	return 0;
 
-err_ctx:
-	engine->legacy_active_context = from_ctx;
-err_mm:
-	engine->legacy_active_ppgtt = from_mm;
 err:
 	return ret;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index acef385c4c80..b44c67849749 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -557,15 +557,6 @@ struct intel_engine_cs {
 	 */
 	struct intel_context *last_retired_context;
 
-	/* We track the current MI_SET_CONTEXT in order to eliminate
-	 * redudant context switches. This presumes that requests are not
-	 * reordered! Or when they are the tracking is updated along with
-	 * the emission of individual requests into the legacy command
-	 * stream (ring).
-	 */
-	struct i915_gem_context *legacy_active_context;
-	struct i915_hw_ppgtt *legacy_active_ppgtt;
-
 	/* status_notifier: list of callbacks for context-switch changes */
 	struct atomic_notifier_head context_status_notifier;
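
For reference, a simplified userspace model of the skip_request()
wraparound fill introduced above, assuming byte offsets for infix and
postfix as in the patch; the types and names below are stand-ins, not
the driver's own.

#include <stdint.h>

#define MI_NOOP 0u

struct toy_ring {
	uint32_t *vaddr;	/* CPU mapping of the ring contents */
	uint32_t size;		/* ring size in bytes */
};

struct toy_request {
	struct toy_ring *ring;
	uint32_t infix;		/* byte offset: end of header, start of payload */
	uint32_t postfix;	/* byte offset: start of the breadcrumb */
};

/* Overwrite [start, end) of the ring with MI_NOOP dwords. */
static void fill_noop(struct toy_ring *ring, uint32_t start, uint32_t end)
{
	uint32_t *dw = ring->vaddr + start / sizeof(uint32_t);
	uint32_t n = (end - start) / sizeof(uint32_t);

	while (n--)
		*dw++ = MI_NOOP;
}

/*
 * No-op everything between a hung request's header and its breadcrumb,
 * handling the case where the payload wraps past the end of the ring.
 */
void toy_skip_request(struct toy_request *rq)
{
	uint32_t head = rq->infix;

	if (rq->postfix < head) {	/* payload wraps the ring */
		fill_noop(rq->ring, head, rq->ring->size);
		head = 0;
	}
	fill_noop(rq->ring, head, rq->postfix);
}

Only the guilty payload between infix and postfix is replaced; the
restore header and the breadcrumb are left intact so the request still
reloads its context and signals completion.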