From: Ben Widawsky
To: Intel GFX
Cc: Ben Widawsky, Ben Widawsky
Date: Tue, 1 Jul 2014 11:17:51 -0700
Subject: [Intel-gfx] [PATCH 16/16] drm/i915: Get the error state over the wire (HACKish)
Message-Id: <1404238671-18760-17-git-send-email-benjamin.widawsky@intel.com>
In-Reply-To: <1404238671-18760-1-git-send-email-benjamin.widawsky@intel.com>
References: <1404238671-18760-1-git-send-email-benjamin.widawsky@intel.com>

I was dealing with a bug recently where the system would hard hang
somewhere between hangcheck and reset. There was time after error
collection to actually get my error state out, but I couldn't get the
reads to work.

This patch is also useful for when reset kills the machine, and you
want to keep reset enabled but still get error state.

Since I found the patch pretty useful, I decided to clean it up and
submit it. It was mostly meant as a one-off hack originally though. If
a maintainer decides it's useful, then here it is.

Signed-off-by: Ben Widawsky
---
 drivers/gpu/drm/i915/i915_debugfs.c   |  2 +-
 drivers/gpu/drm/i915/i915_drv.h       |  3 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c | 31 +++++++++++++++++++++++++------
 drivers/gpu/drm/i915/i915_sysfs.c     |  2 +-
 4 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 6b7b32b..2daad46 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -929,7 +929,7 @@ static ssize_t i915_error_state_read(struct file *file, char __user *userbuf,
 	if (ret)
 		return ret;
 
-	ret = i915_error_state_to_str(&error_str, error_priv);
+	ret = i915_error_state_to_str(&error_str, error_priv->dev, error_priv->error);
 	if (ret)
 		goto out;
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1045006..b6a4f1e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2544,7 +2544,8 @@ static inline void intel_display_crc_init(struct drm_device *dev) {}
 __printf(2, 3)
 void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...);
 int i915_error_state_to_str(struct drm_i915_error_state_buf *estr,
-			    const struct i915_error_state_file_priv *error);
+			    struct drm_device *dev,
+			    const struct drm_i915_error_state *error);
 int i915_error_state_buf_init(struct drm_i915_error_state_buf *eb,
 			      size_t count, loff_t pos);
 static inline void i915_error_state_buf_release(
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index e82e590..1540bf6 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -184,8 +184,22 @@ static void i915_error_puts(struct drm_i915_error_state_buf *e,
 	__i915_error_advance(e, len);
 }
 
-#define err_printf(e, ...) i915_error_printf(e, __VA_ARGS__)
-#define err_puts(e, s) i915_error_puts(e, s)
+
+static bool wire = false;
+#define err_printf(e, ...) do { \
+	if (wire) { \
+		printk(__VA_ARGS__); \
+	} else { \
+		i915_error_printf(e, __VA_ARGS__); \
+	} \
+} while (0)
+#define err_puts(e, s) do { \
+	if (wire) { \
+		printk(s); \
+	} else { \
+		i915_error_puts(e, s); \
+	} \
+} while (0)
 
 static void print_error_buffers(struct drm_i915_error_state_buf *m,
 				const char *name,
@@ -240,7 +254,7 @@ static const char *hangcheck_action_to_str(enum intel_ring_hangcheck_action a)
 
 static void i915_ring_error_state(struct drm_i915_error_state_buf *m,
 				  struct drm_device *dev,
-				  struct drm_i915_error_ring *ring)
+				  const struct drm_i915_error_ring *ring)
 {
 	if (!ring->valid)
 		return;
@@ -322,11 +336,10 @@ static void print_error_obj(struct drm_i915_error_state_buf *m,
 }
 
 int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
-			    const struct i915_error_state_file_priv *error_priv)
+			    struct drm_device *dev,
+			    const struct drm_i915_error_state *error)
 {
-	struct drm_device *dev = error_priv->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct drm_i915_error_state *error = error_priv->error;
 	int i, j, offset, elt;
 	int max_hangcheck_score;
 
@@ -1197,6 +1210,12 @@ void i915_capture_error_state(struct drm_device *dev, bool wedged,
 	spin_lock_irqsave(&dev_priv->gpu_error.lock, flags);
 	if (dev_priv->gpu_error.first_error == NULL) {
 		dev_priv->gpu_error.first_error = error;
+#ifdef PUSH_TO_WIRE
+		/* Probably racy, but this is emergency debug */
+		wire = true;
+		i915_error_state_to_str(NULL, dev, error);
+		wire = false;
+#endif
 		error = NULL;
 	}
 	spin_unlock_irqrestore(&dev_priv->gpu_error.lock, flags);
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 86ce39a..6f4be9d 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -512,7 +512,7 @@ static ssize_t error_state_read(struct file *filp, struct kobject *kobj,
 	error_priv.dev = dev;
 	i915_error_state_get(dev, &error_priv);
 
-	ret = i915_error_state_to_str(&error_str, &error_priv);
+	ret = i915_error_state_to_str(&error_str, dev, error_priv.error);
 	if (ret)
 		goto out;
 