From patchwork Tue Oct 21 10:48:12 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dheeraj Jamwal X-Patchwork-Id: 5115911 Return-Path: X-Original-To: patchwork-ltsi-dev@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id E2D4CC11AC for ; Tue, 21 Oct 2014 11:22:24 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 7B24E2010F for ; Tue, 21 Oct 2014 11:22:23 +0000 (UTC) Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 26E41200CA for ; Tue, 21 Oct 2014 11:22:22 +0000 (UTC) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 1A230DDA; Tue, 21 Oct 2014 11:03:09 +0000 (UTC) X-Original-To: ltsi-dev@lists.linuxfoundation.org Delivered-To: ltsi-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 951F2B39 for ; Tue, 21 Oct 2014 11:03:08 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by smtp1.linuxfoundation.org (Postfix) with ESMTP id 1DB5E1FFFA for ; Tue, 21 Oct 2014 11:03:07 +0000 (UTC) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga102.fm.intel.com with ESMTP; 21 Oct 2014 04:03:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.04,761,1406617200"; d="scan'208";a="617794529" Received: from ubuntu-desktop.png.intel.com ([10.221.122.25]) by fmsmga002.fm.intel.com with ESMTP; 21 Oct 2014 04:03:00 -0700 From: Dheeraj Jamwal To: ltsi-dev@lists.linuxfoundation.org Date: Tue, 21 Oct 2014 18:48:12 +0800 Message-Id: <1413889294-31328-293-git-send-email-dheerajx.s.jamwal@intel.com> X-Mailer: git-send-email 1.7.9.5 In-Reply-To: <1413889294-31328-1-git-send-email-dheerajx.s.jamwal@intel.com> References: <1413889294-31328-1-git-send-email-dheerajx.s.jamwal@intel.com> X-Spam-Status: No, score=-5.6 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org Subject: [LTSI-dev] [PATCH 0292/1094] drm/i915: Add error code into error state X-BeenThere: ltsi-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: "A list to discuss patches, development, and other things related to the LTSI project" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ltsi-dev-bounces@lists.linuxfoundation.org Errors-To: ltsi-dev-bounces@lists.linuxfoundation.org X-Virus-Scanned: ClamAV using ClamSMTP From: Mika Kuoppala commit 011cf577b2531dfbd2254bd9ec147ad71471abaf Author: Ben Widawsky Date: Tue Feb 4 12:18:55 2014 +0000 drm/i915: Generate a hang error code added error code debug into dmesg. Store this also with error state to make matching dmesg logs and error states easier. As we need to have full ring state for error code generation, do full capture always, print hang message into log and then decide if we need to keep the error state. Signed-off-by: Mika Kuoppala Reviewed-by: Chris Wilson Signed-off-by: Daniel Vetter (cherry picked from commit cb38300215dc24886347bfc6400ccfed806dac21) Signed-off-by: Dheeraj Jamwal --- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gpu_error.c | 61 ++++++++++++++++++++------------- 2 files changed, 40 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 71fddee..ae5bc91 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -303,6 +303,8 @@ struct drm_i915_error_state { struct kref ref; struct timeval time; + char error_msg[128]; + /* Generic register state */ u32 eir; u32 pgtbl_er; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 8360362..4879a2a 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -332,6 +332,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, goto out; } + err_printf(m, "%s\n", error->error_msg); err_printf(m, "Time: %ld s %ld us\n", error->time.tv_sec, error->time.tv_usec); err_printf(m, "Kernel: " UTS_RELEASE "\n"); @@ -692,7 +693,8 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err, * It's only a small step better than a random number in its current form. */ static uint32_t i915_error_generate_code(struct drm_i915_private *dev_priv, - struct drm_i915_error_state *error) + struct drm_i915_error_state *error, + int *ring_id) { uint32_t error_code = 0; int i; @@ -702,9 +704,14 @@ static uint32_t i915_error_generate_code(struct drm_i915_private *dev_priv, * synchronization commands which almost always appear in the case * strictly a client bug. Use instdone to differentiate those some. */ - for (i = 0; i < I915_NUM_RINGS; i++) - if (error->ring[i].hangcheck_action == HANGCHECK_HUNG) + for (i = 0; i < I915_NUM_RINGS; i++) { + if (error->ring[i].hangcheck_action == HANGCHECK_HUNG) { + if (ring_id) + *ring_id = i; + return error->ring[i].ipehr ^ error->ring[i].instdone; + } + } return error_code; } @@ -1089,6 +1096,19 @@ static void i915_capture_reg_state(struct drm_i915_private *dev_priv, i915_get_extra_instdone(dev, error->extra_instdone); } +static void i915_error_capture_msg(struct drm_device *dev, + struct drm_i915_error_state *error) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + u32 ecode; + int ring_id = -1; + + ecode = i915_error_generate_code(dev_priv, error, &ring_id); + + scnprintf(error->error_msg, sizeof(error->error_msg), + "GPU HANG: ecode %d:0x%08x", ring_id, ecode); +} + /** * i915_capture_error_state - capture an error record for later analysis * @dev: drm device @@ -1104,13 +1124,6 @@ void i915_capture_error_state(struct drm_device *dev) struct drm_i915_private *dev_priv = dev->dev_private; struct drm_i915_error_state *error; unsigned long flags; - uint32_t ecode; - - spin_lock_irqsave(&dev_priv->gpu_error.lock, flags); - error = dev_priv->gpu_error.first_error; - spin_unlock_irqrestore(&dev_priv->gpu_error.lock, flags); - if (error) - return; /* Account for pipe specific data like PIPE*STAT */ error = kzalloc(sizeof(*error), GFP_ATOMIC); @@ -1119,30 +1132,21 @@ void i915_capture_error_state(struct drm_device *dev) return; } - DRM_INFO("GPU crash dump saved to /sys/class/drm/card%d/error\n", - dev->primary->index); kref_init(&error->ref); i915_capture_reg_state(dev_priv, error); i915_gem_capture_buffers(dev_priv, error); i915_gem_record_fences(dev, error); i915_gem_record_rings(dev, error); - ecode = i915_error_generate_code(dev_priv, error); - - if (!warned) { - DRM_INFO("GPU HANG [%x]\n", ecode); - DRM_INFO("GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.\n"); - DRM_INFO("Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel\n"); - DRM_INFO("drm/i915 developers can then reassign to the right component if it's not a kernel issue.\n"); - DRM_INFO("The gpu crash dump is required to analyze gpu hangs, so please always attach it.\n"); - warned = true; - } do_gettimeofday(&error->time); error->overlay = intel_overlay_capture_error_state(dev); error->display = intel_display_capture_error_state(dev); + i915_error_capture_msg(dev, error); + DRM_INFO("%s\n", error->error_msg); + spin_lock_irqsave(&dev_priv->gpu_error.lock, flags); if (dev_priv->gpu_error.first_error == NULL) { dev_priv->gpu_error.first_error = error; @@ -1150,8 +1154,19 @@ void i915_capture_error_state(struct drm_device *dev) } spin_unlock_irqrestore(&dev_priv->gpu_error.lock, flags); - if (error) + if (error) { i915_error_state_free(&error->ref); + return; + } + + if (!warned) { + DRM_INFO("GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.\n"); + DRM_INFO("Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel\n"); + DRM_INFO("drm/i915 developers can then reassign to the right component if it's not a kernel issue.\n"); + DRM_INFO("The gpu crash dump is required to analyze gpu hangs, so please always attach it.\n"); + DRM_INFO("GPU crash dump saved to /sys/class/drm/card%d/error\n", dev->primary->index); + warned = true; + } } void i915_error_state_get(struct drm_device *dev,