From patchwork Thu Apr 27 23:12:46 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michel Thierry X-Patchwork-Id: 9703743 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id D3A40602CC for ; Thu, 27 Apr 2017 23:13:24 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D21EF285EA for ; Thu, 27 Apr 2017 23:13:24 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C5B9B2863B; Thu, 27 Apr 2017 23:13:24 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 739E6285EA for ; Thu, 27 Apr 2017 23:13:24 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 052DA6E71C; Thu, 27 Apr 2017 23:13:08 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id B666E6E194 for ; Thu, 27 Apr 2017 23:13:02 +0000 (UTC) Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga104.jf.intel.com with ESMTP; 27 Apr 2017 16:13:02 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.37,386,1488873600"; d="scan'208";a="81571426" Received: from relo-linux-11.sc.intel.com ([10.3.160.214]) by orsmga004.jf.intel.com with ESMTP; 27 Apr 2017 16:13:01 -0700 From: Michel Thierry To: intel-gfx@lists.freedesktop.org Date: Thu, 27 Apr 2017 16:12:46 -0700 Message-Id: <20170427231300.32841-7-michel.thierry@intel.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20170427231300.32841-1-michel.thierry@intel.com> References: <20170427231300.32841-1-michel.thierry@intel.com> Subject: [Intel-gfx] [PATCH v7 06/20] drm/i915: Add engine reset count to error state X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Arun Siluvery Driver maintains count of how many times a given engine is reset, useful to capture this in error state also. It gives an idea of how engine is coping up with the workloads it is executing before this error state. A follow-up patch will provide this information in debugfs. v2: s/engine_reset/reset_engine/ (Chris) Define count as unsigned int (Tvrtko) Cc: Chris Wilson Cc: Mika Kuoppala Signed-off-by: Arun Siluvery Signed-off-by: Michel Thierry --- drivers/gpu/drm/i915/i915_drv.c | 3 ++- drivers/gpu/drm/i915/i915_drv.h | 10 ++++++++++ drivers/gpu/drm/i915/i915_gpu_error.c | 3 +++ 3 files changed, 15 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index a64e9b63cdbc..426db8756e95 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1941,8 +1941,9 @@ int i915_reset_engine(struct intel_engine_cs *engine) /* replay remaining requests in the queue */ ret = engine->init_hw(engine); if (ret) - goto out; //XXX: ignore this line for now + goto out; + error->reset_engine_count[engine->id]++; out: return ret; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 8e93189c2104..b00ea523a634 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -982,6 +982,7 @@ struct i915_gpu_state { enum intel_engine_hangcheck_action hangcheck_action; struct i915_address_space *vm; int num_requests; + u32 reset_count; /* position of active request inside the ring */ u32 rq_head, rq_post, rq_tail; @@ -1617,6 +1618,9 @@ struct i915_gpu_error { #define I915_RESET_HANDOFF 1 #define I915_WEDGED (BITS_PER_LONG - 1) + /** Number of times an engine has been reset */ + u32 reset_engine_count[I915_NUM_ENGINES]; + /** * Waitqueue to signal when a hang is detected. Used to for waiters * to release the struct_mutex for the reset to procede. @@ -3439,6 +3443,12 @@ static inline u32 i915_reset_count(struct i915_gpu_error *error) return READ_ONCE(error->reset_count); } +static inline u32 i915_reset_engine_count(struct i915_gpu_error *error, + struct intel_engine_cs *engine) +{ + return READ_ONCE(error->reset_engine_count[engine->id]); +} + struct drm_i915_gem_request * i915_gem_reset_prepare_engine(struct intel_engine_cs *engine); int i915_gem_reset_prepare(struct drm_i915_private *dev_priv, diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 14e2064b7653..a2ffb1ef2cfa 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -463,6 +463,7 @@ static void error_print_engine(struct drm_i915_error_state_buf *m, err_printf(m, " hangcheck action timestamp: %lu, %u ms ago\n", ee->hangcheck_timestamp, jiffies_to_msecs(jiffies - ee->hangcheck_timestamp)); + err_printf(m, " engine reset count: %u\n", ee->reset_count); error_print_request(m, " ELSP[0]: ", &ee->execlist[0]); error_print_request(m, " ELSP[1]: ", &ee->execlist[1]); @@ -1244,6 +1245,8 @@ static void error_record_engine_registers(struct i915_gpu_state *error, ee->hangcheck_timestamp = engine->hangcheck.action_timestamp; ee->hangcheck_action = engine->hangcheck.action; ee->hangcheck_stalled = engine->hangcheck.stalled; + ee->reset_count = i915_reset_engine_count(&dev_priv->gpu_error, + engine); if (USES_PPGTT(dev_priv)) { int i;