From patchwork Thu Mar 15 09:26:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10284139 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id B2B48602BD for ; Thu, 15 Mar 2018 09:27:09 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BAE7D28949 for ; Thu, 15 Mar 2018 09:27:09 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AEB9128962; Thu, 15 Mar 2018 09:27:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 189A228949 for ; Thu, 15 Mar 2018 09:27:08 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2E14A6E7CD; Thu, 15 Mar 2018 09:27:08 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id EFFF06E7CD for ; Thu, 15 Mar 2018 09:27:06 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 11035204-1500050 for multiple; Thu, 15 Mar 2018 09:26:49 +0000 Received: by haswell.alporthouse.com (sSMTP sendmail emulation); Thu, 15 Mar 2018 09:26:49 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 15 Mar 2018 09:26:46 +0000 Message-Id: <20180315092646.27471-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.16.2 X-Originating-IP: 78.156.65.138 X-Country: code=GB country="United Kingdom" ip=78.156.65.138 Subject: [Intel-gfx] [PATCH] drm/i915: Trace GEM steps between submit and wedging X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We still have an odd race with wedging/unwedging as shown by igt/gem_eio that defies expectations. Add some more trace_printks to try and visualize the flow over the precipice. Signed-off-by: Chris Wilson Cc: Mika Kuoppala Cc: Tvrtko Ursulin Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++++ drivers/gpu/drm/i915/i915_request.c | 18 ++++++++++++++++++ 2 files changed, 32 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 13d4b0e74641..2fbd622bba30 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3193,6 +3193,9 @@ void i915_gem_reset_finish(struct drm_i915_private *dev_priv) static void nop_submit_request(struct i915_request *request) { + GEM_TRACE("%s fence %llx:%d -> -EIO\n", + request->engine->name, + request->fence.context, request->fence.seqno); dma_fence_set_error(&request->fence, -EIO); i915_request_submit(request); @@ -3202,6 +3205,9 @@ static void nop_complete_submit_request(struct i915_request *request) { unsigned long flags; + GEM_TRACE("%s fence %llx:%d -> -EIO\n", + request->engine->name, + request->fence.context, request->fence.seqno); dma_fence_set_error(&request->fence, -EIO); spin_lock_irqsave(&request->engine->timeline->lock, flags); @@ -3215,6 +3221,8 @@ void i915_gem_set_wedged(struct drm_i915_private *i915) struct intel_engine_cs *engine; enum intel_engine_id id; + GEM_TRACE("start\n"); + if (drm_debug & DRM_UT_DRIVER) { struct drm_printer p = drm_debug_printer(__func__); @@ -3279,6 +3287,8 @@ void i915_gem_set_wedged(struct drm_i915_private *i915) i915_gem_reset_finish_engine(engine); } + GEM_TRACE("end\n"); + wake_up_all(&i915->gpu_error.reset_queue); } @@ -3291,6 +3301,8 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915) if (!test_bit(I915_WEDGED, &i915->gpu_error.flags)) return true; + GEM_TRACE("start\n"); + /* * Before unwedging, make sure that all pending operations * are flushed and errored out - we may have requests waiting upon @@ -3341,6 +3353,8 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915) intel_engines_reset_default_submission(i915); i915_gem_contexts_lost(i915); + GEM_TRACE("end\n"); + smp_mb__before_atomic(); /* complete takeover before enabling execbuf */ clear_bit(I915_WEDGED, &i915->gpu_error.flags); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 1810fa1b81cb..fac1056422a5 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -381,6 +381,11 @@ static void i915_request_retire(struct i915_request *request) struct intel_engine_cs *engine = request->engine; struct i915_gem_active *active, *next; + GEM_TRACE("%s fence %llx:%d, global_seqno %d\n", + engine->name, + request->fence.context, request->fence.seqno, + request->global_seqno); + lockdep_assert_held(&request->i915->drm.struct_mutex); GEM_BUG_ON(!i915_sw_fence_signaled(&request->submit)); GEM_BUG_ON(!i915_request_completed(request)); @@ -488,6 +493,11 @@ void __i915_request_submit(struct i915_request *request) struct intel_timeline *timeline; u32 seqno; + GEM_TRACE("%s fence %llx:%d -> global_seqno %d\n", + request->engine->name, + request->fence.context, request->fence.seqno, + engine->timeline->seqno); + GEM_BUG_ON(!irqs_disabled()); lockdep_assert_held(&engine->timeline->lock); @@ -537,6 +547,11 @@ void __i915_request_unsubmit(struct i915_request *request) struct intel_engine_cs *engine = request->engine; struct intel_timeline *timeline; + GEM_TRACE("%s fence %llx:%d <- global_seqno %d\n", + request->engine->name, + request->fence.context, request->fence.seqno, + request->global_seqno); + GEM_BUG_ON(!irqs_disabled()); lockdep_assert_held(&engine->timeline->lock); @@ -996,6 +1011,9 @@ void __i915_request_add(struct i915_request *request, bool flush_caches) u32 *cs; int err; + GEM_TRACE("%s fence %llx:%d\n", + engine->name, request->fence.context, request->fence.seqno); + lockdep_assert_held(&request->i915->drm.struct_mutex); trace_i915_request_add(request);