From patchwork Wed May 23 14:53:13 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10421703 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id A9F9260327 for ; Wed, 23 May 2018 14:53:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 998FB2893C for ; Wed, 23 May 2018 14:53:37 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8E5AB28E7C; Wed, 23 May 2018 14:53:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id E73B628B10 for ; Wed, 23 May 2018 14:53:36 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4451F6E4F7; Wed, 23 May 2018 14:53:36 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9C32A6E4F7 for ; Wed, 23 May 2018 14:53:34 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 11807747-1500050 for multiple; Wed, 23 May 2018 15:53:13 +0100 Received: by haswell.alporthouse.com (sSMTP sendmail emulation); Wed, 23 May 2018 15:53:16 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 23 May 2018 15:53:13 +0100 Message-Id: <20180523145313.15149-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180523145313.15149-1-chris@chris-wilson.co.uk> References: <20180523145313.15149-1-chris@chris-wilson.co.uk> X-Originating-IP: 78.156.65.138 X-Country: code=GB country="United Kingdom" ip=78.156.65.138 Subject: [Intel-gfx] [PATCH 2/2] drm/i915: Look for an active kernel context before switching X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mika Kuoppala MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We were not very carefully checking to see if an older request on the engine was an earlier switch-to-kernel-context before deciding to emit a new switch. The end result would be that we could get into a permanent loop of trying to emit a new request to perform the switch simply to flush the existing switch. What we need is a means of tracking the completion of each timeline versus the kernel context, that is to detect if a more recent request has been submitted that would result in a switch away from the kernel context. To realise this, we need only to look in our syncmap on the kernel context and check that we have synchronized against all active rings. v2: Since all ringbuffer clients currently share the same timeline, we do have to use the gem_context to distinguish clients. As a bonus, include all the tracing used to debug the death inside suspend. Reported-by: Mika Kuoppala Fixes: a89d1f921c15 ("drm/i915: Split i915_gem_timeline into individual timelines") Signed-off-by: Chris Wilson Cc: Mika Kuoppala --- drivers/gpu/drm/i915/i915_gem.c | 7 +++ drivers/gpu/drm/i915/i915_gem_context.c | 67 +++++++++++++++++++++---- drivers/gpu/drm/i915/i915_request.c | 5 +- 3 files changed, 67 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 03874b50ada9..05f44ca35a06 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3703,6 +3703,9 @@ static int wait_for_engines(struct drm_i915_private *i915) int i915_gem_wait_for_idle(struct drm_i915_private *i915, unsigned int flags) { + GEM_TRACE("flags=%x (%s)\n", + flags, flags & I915_WAIT_LOCKED ? "locked" : "unlocked"); + /* If the device is asleep, we have no requests outstanding */ if (!READ_ONCE(i915->gt.awake)) return 0; @@ -3719,6 +3722,7 @@ int i915_gem_wait_for_idle(struct drm_i915_private *i915, unsigned int flags) return err; } i915_retire_requests(i915); + GEM_BUG_ON(i915->gt.active_requests); return wait_for_engines(i915); } else { @@ -4901,6 +4905,7 @@ static void assert_kernel_context_is_current(struct drm_i915_private *i915) struct intel_engine_cs *engine; enum intel_engine_id id; + GEM_BUG_ON(i915->gt.active_requests); for_each_engine(engine, i915, id) { GEM_BUG_ON(__i915_gem_active_peek(&engine->timeline.last_request)); GEM_BUG_ON(engine->last_retired_context->gem_context != kctx); @@ -4932,6 +4937,8 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv) struct drm_device *dev = &dev_priv->drm; int ret; + GEM_TRACE("\n"); + intel_runtime_pm_get(dev_priv); intel_suspend_gt_powersave(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index b69b18ef8120..82198bfacb7f 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -581,24 +581,58 @@ last_request_on_engine(struct i915_timeline *timeline, rq = i915_gem_active_raw(&timeline->last_request, &engine->i915->drm.struct_mutex); - if (rq && rq->engine == engine) + if (rq && rq->engine == engine) { + GEM_TRACE("last request for %s on engine %s: %llx:%d\n", + timeline->name, engine->name, + rq->fence.context, rq->fence.seqno); return rq; + } return NULL; } -static bool engine_has_idle_kernel_context(struct intel_engine_cs *engine) +static bool engine_has_kernel_context_barrier(struct intel_engine_cs *engine) { - struct list_head * const active_rings = &engine->i915->gt.active_rings; + struct drm_i915_private *i915 = engine->i915; + struct i915_timeline *barrier = + to_intel_context(i915->kernel_context, engine)->ring->timeline; struct intel_ring *ring; + bool any_active = false; - lockdep_assert_held(&engine->i915->drm.struct_mutex); + lockdep_assert_held(&i915->drm.struct_mutex); + list_for_each_entry(ring, &i915->gt.active_rings, active_link) { + struct i915_request *rq; + + rq = last_request_on_engine(ring->timeline, engine); + if (!rq) + continue; - list_for_each_entry(ring, active_rings, active_link) { - if (last_request_on_engine(ring->timeline, engine)) + if (rq->gem_context == i915->kernel_context) + continue; + + /* + * Was this request submitted after the previous + * switch-to-kernel-context? + */ + if (!i915_timeline_sync_is_later(barrier, &rq->fence)) { + GEM_TRACE("%s needs barrier\n", ring->timeline->name); return false; + } + + GEM_TRACE("%s has barrier\n", ring->timeline->name); + any_active = true; } + /* + * If any other timeline was still active and behind the last barrier, + * then our last switch-to-kernel-context must still be queued and + * will run last (leaving the engine in the kernel context when it + * eventually idles). + */ + if (any_active) + return true; + + /* The engine is idle; check that it is idling in the kernel context. */ return intel_engine_has_kernel_context(engine); } @@ -607,7 +641,10 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915) struct intel_engine_cs *engine; enum intel_engine_id id; + GEM_TRACE("\n"); + lockdep_assert_held(&i915->drm.struct_mutex); + GEM_BUG_ON(!i915->kernel_context); i915_retire_requests(i915); @@ -615,9 +652,12 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915) struct intel_ring *ring; struct i915_request *rq; - if (engine_has_idle_kernel_context(engine)) + GEM_BUG_ON(!to_intel_context(i915->kernel_context, engine)); + if (engine_has_kernel_context_barrier(engine)) continue; + GEM_TRACE("emit barrier on %s\n", engine->name); + rq = i915_request_alloc(engine, i915->kernel_context); if (IS_ERR(rq)) return PTR_ERR(rq); @@ -627,10 +667,15 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915) struct i915_request *prev; prev = last_request_on_engine(ring->timeline, engine); - if (prev) - i915_sw_fence_await_sw_fence_gfp(&rq->submit, - &prev->submit, - I915_FENCE_GFP); + if (!prev) + continue; + + if (prev->gem_context == i915->kernel_context) + continue; + + i915_sw_fence_await_sw_fence_gfp(&rq->submit, + &prev->submit, + I915_FENCE_GFP); } /* diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index fc499bcbd105..f187250e60c6 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -320,6 +320,7 @@ static void advance_ring(struct i915_request *request) * is just about to be. Either works, if we miss the last two * noops - they are safe to be replayed on a reset. */ + GEM_TRACE("marking %s as inactive\n", ring->timeline->name); tail = READ_ONCE(request->tail); list_del(&ring->active_link); } else { @@ -1095,8 +1096,10 @@ void __i915_request_add(struct i915_request *request, bool flush_caches) i915_gem_active_set(&timeline->last_request, request); list_add_tail(&request->ring_link, &ring->request_list); - if (list_is_first(&request->ring_link, &ring->request_list)) + if (list_is_first(&request->ring_link, &ring->request_list)) { + GEM_TRACE("marking %s as active\n", ring->timeline->name); list_add(&ring->active_link, &request->i915->gt.active_rings); + } request->emitted_jiffies = jiffies; /*