From: Mika Kuoppala
To: intel-gfx@lists.freedesktop.org
Date: Fri, 17 Jan 2014 16:20:30 +0200
Message-Id: <1389968431-24123-2-git-send-email-mika.kuoppala@intel.com>
In-Reply-To: <1389968431-24123-1-git-send-email-mika.kuoppala@intel.com>
References: <1389968431-24123-1-git-send-email-mika.kuoppala@intel.com>
Subject: [Intel-gfx] [PATCH 2/3] drm/i915: Seek only one guilty batch per hanged ring
X-Patchwork-Submitter: Mika Kuoppala
X-Patchwork-Id: 3504841

Instead of going through all the requests to find a batch that hung
the machine, use the hangcheck score and the fact that the first
non-completed request on a hung ring is, with great probability, the
guilty one.

This also ensures that we get one guilty batch per hang instead of
possibly more (for each ring).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73652
Signed-off-by: Mika Kuoppala
---
 drivers/gpu/drm/i915/i915_gem.c         | 19 ++++++++++---------
 drivers/gpu/drm/i915/i915_irq.c         |  3 +--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d270351..27a97c3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2322,20 +2322,17 @@ static bool i915_context_is_banned(const struct i915_ctx_hang_stats *hs)
 
 static void i915_set_reset_status(struct intel_ring_buffer *ring,
 				  struct drm_i915_gem_request *request,
-				  u32 acthd)
+				  u32 acthd, const bool guilty)
 {
 	struct i915_ctx_hang_stats *hs = NULL;
-	bool inside, guilty;
+	bool inside;
 	unsigned long offset = 0;
 
-	/* Innocent until proven guilty */
-	guilty = false;
-
 	if (request->batch_obj)
 		offset = i915_gem_obj_offset(request->batch_obj,
 					     request_to_vm(request));
 
-	if (ring->hangcheck.action != HANGCHECK_WAIT &&
+	if (guilty &&
 	    i915_request_guilty(request, acthd, &inside)) {
 		DRM_DEBUG("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
 			  ring->name,
@@ -2343,8 +2340,6 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring,
 			  offset,
 			  request->ctx ? request->ctx->id : 0,
 			  acthd);
-
-		guilty = true;
 	}
 
 	/* If contexts are disabled or this is the default context, use
@@ -2383,12 +2378,18 @@ static void i915_gem_reset_ring_status(struct drm_i915_private *dev_priv,
 	u32 completed_seqno = ring->get_seqno(ring, false);
 	u32 acthd = intel_ring_get_active_head(ring);
 	struct drm_i915_gem_request *request;
+	bool guilty = false;
 
 	list_for_each_entry(request, &ring->request_list, list) {
 		if (i915_seqno_passed(completed_seqno, request->seqno))
 			continue;
 
-		i915_set_reset_status(ring, request, acthd);
+		if (!guilty && ring->hangcheck.score >= HANGCHECK_SCORE_GUILTY) {
+			guilty = true;
+			i915_set_reset_status(ring, request, acthd, true);
+		} else {
+			i915_set_reset_status(ring, request, acthd, false);
+		}
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 6d11e25..e24f9ef 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2473,7 +2473,6 @@ static void i915_hangcheck_elapsed(unsigned long data)
 #define BUSY 1
 #define KICK 5
 #define HUNG 20
-#define FIRE 30
 
 	if (!i915_enable_hangcheck)
 		return;
@@ -2557,7 +2556,7 @@ static void i915_hangcheck_elapsed(unsigned long data)
 	}
 
 	for_each_ring(ring, dev_priv, i) {
-		if (ring->hangcheck.score > FIRE) {
+		if (ring->hangcheck.score >= HANGCHECK_SCORE_GUILTY) {
 			DRM_INFO("%s on %s\n",
 				 stuck[i] ? "stuck" : "no progress",
 				 ring->name);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 71a73f4..6018793 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -41,6 +41,8 @@ enum intel_ring_hangcheck_action {
 	HANGCHECK_HUNG,
 };
 
+#define HANGCHECK_SCORE_GUILTY 31
+
 struct intel_ring_hangcheck {
 	bool deadlock;
 	u32 seqno;