From patchwork Wed Jun 12 12:13:20 2013
From: Mika Kuoppala
To: intel-gfx@lists.freedesktop.org
Date: Wed, 12 Jun 2013 15:13:20 +0300
Message-Id: <1371039200-29713-1-git-send-email-mika.kuoppala@intel.com>
X-Mailer: git-send-email 1.7.9.5
In-Reply-To: <20130612101703.GD32648@cantiga.alporthouse.com>
References: <20130612101703.GD32648@cantiga.alporthouse.com>
Subject: [Intel-gfx] [PATCH 6/7] drm/i915: find guilty batch buffer on ring resets

After the hangcheck timer has declared the GPU hung, the rings are reset. During the ring reset, while clearing the request list, do a post-mortem analysis to find the guilty batch buffer.

Select requests for further analysis by inspecting the completed sequence number, which has been updated into the HWS page. If a request was already completed, it cannot be related to the hang.

For non-completed requests, mark the batch as guilty if the ring was not waiting and the ring head was stuck inside the batch buffer object or in the flush region right after it. Mark everything else as innocent.
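[Editor's note: because requests live in a circular ring buffer, the head-inside-request test in the patch below must handle a request that wraps past the end of the ring, so a plain "start <= head < end" comparison is not enough. Here is a minimal, self-contained user-space sketch of that wrap-aware check; SKETCH_HEAD_MASK and the sample offsets are hypothetical, chosen only for illustration, while the driver itself masks the head with HEAD_ADDR.]

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical head-pointer mask; the real driver masks with HEAD_ADDR. */
#define SKETCH_HEAD_MASK 0x001ffffc

/* Wrap-aware "is head inside [start, end)" test, mirroring the logic
 * of i915_head_inside_request() in the patch below. */
static bool head_inside_request(uint32_t head_unmasked,
				uint32_t start, uint32_t end)
{
	uint32_t head = head_unmasked & SKETCH_HEAD_MASK;

	if (start < end)	/* request does not wrap */
		return head >= start && head < end;
	if (start > end)	/* request wraps past the end of the ring */
		return head >= start || head < end;
	return false;		/* start == end: empty request */
}

int main(void)
{
	/* A wrapping request: starts near the ring end, tail wraps to 0x100. */
	uint32_t start = 0x1ff00, end = 0x00100;

	printf("%d\n", head_inside_request(0x1ff80, start, end)); /* 1: before wrap */
	printf("%d\n", head_inside_request(0x00080, start, end)); /* 1: after wrap */
	printf("%d\n", head_inside_request(0x10000, start, end)); /* 0: outside */
	return 0;
}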
v2: Fixed a typo in commit message (Ville Syrjälä)

v3: - more descriptive function parameters (Chris Wilson)
    - use masked head address when inspecting if request is in ring
    - s/hangcheck.last_action/hangcheck.action
    - added comment about unmasked head hitting batch_obj range

Reviewed-by: Chris Wilson
Signed-off-by: Mika Kuoppala
---
 drivers/gpu/drm/i915/i915_gem.c | 97 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 97 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dc32fae..9b20fbb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2110,6 +2110,94 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
 	spin_unlock(&file_priv->mm.lock);
 }
 
+static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
+{
+	if (acthd >= obj->gtt_offset &&
+	    acthd < obj->gtt_offset + obj->base.size)
+		return true;
+
+	return false;
+}
+
+static bool i915_head_inside_request(const u32 acthd_unmasked,
+				     const u32 request_start,
+				     const u32 request_end)
+{
+	const u32 acthd = acthd_unmasked & HEAD_ADDR;
+
+	if (request_start < request_end) {
+		if (acthd >= request_start && acthd < request_end)
+			return true;
+	} else if (request_start > request_end) {
+		if (acthd >= request_start || acthd < request_end)
+			return true;
+	}
+
+	return false;
+}
+
+static bool i915_request_guilty(struct drm_i915_gem_request *request,
+				const u32 acthd, bool *inside)
+{
+	/* There is a possibility that unmasked head address
+	 * pointing inside the ring, matches the batch_obj address range.
+	 * However this is extremely unlikely.
+	 */
+
+	if (request->batch_obj) {
+		if (i915_head_inside_object(acthd, request->batch_obj)) {
+			*inside = true;
+			return true;
+		}
+	}
+
+	if (i915_head_inside_request(acthd, request->head, request->tail)) {
+		*inside = false;
+		return true;
+	}
+
+	return false;
+}
+
+static void i915_set_reset_status(struct intel_ring_buffer *ring,
+				  struct drm_i915_gem_request *request,
+				  u32 acthd)
+{
+	struct i915_ctx_hang_stats *hs = NULL;
+	bool inside, guilty;
+
+	/* Innocent until proven guilty */
+	guilty = false;
+
+	if (ring->hangcheck.action != wait &&
+	    i915_request_guilty(request, acthd, &inside)) {
+		DRM_ERROR("%s hung %s bo (0x%x ctx %d) at 0x%x\n",
+			  ring->name,
+			  inside ? "inside" : "flushing",
+			  request->batch_obj ?
+			  request->batch_obj->gtt_offset : 0,
+			  request->ctx ?
+			  request->ctx->id : 0,
+			  acthd);
+
+		guilty = true;
+	}
+
+	/* If contexts are disabled or this is the default context, use
+	 * file_priv->reset_state
+	 */
+	if (request->ctx && request->ctx->id != DEFAULT_CONTEXT_ID)
+		hs = &request->ctx->hang_stats;
+	else if (request->file_priv)
+		hs = &request->file_priv->hang_stats;
+
+	if (hs) {
+		if (guilty)
+			hs->batch_active++;
+		else
+			hs->batch_pending++;
+	}
+}
+
 static void i915_gem_free_request(struct drm_i915_gem_request *request)
 {
 	list_del(&request->list);
@@ -2124,6 +2212,12 @@ static void i915_gem_free_request(struct drm_i915_gem_request *request)
 static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
 				      struct intel_ring_buffer *ring)
 {
+	u32 completed_seqno;
+	u32 acthd;
+
+	acthd = intel_ring_get_active_head(ring);
+	completed_seqno = ring->get_seqno(ring, false);
+
 	while (!list_empty(&ring->request_list)) {
 		struct drm_i915_gem_request *request;
 
@@ -2131,6 +2225,9 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
 					   struct drm_i915_gem_request,
 					   list);
 
+		if (request->seqno > completed_seqno)
+			i915_set_reset_status(ring, request, acthd);
+
 		i915_gem_free_request(request);
 	}
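[Editor's note: taken together, the reset path reads ACTHD and the last completed seqno once, then walks the request list and only classifies requests that had not yet completed. The following self-contained user-space simulation sketches that triage under simplifying assumptions: all types, seqnos and addresses are made-up stand-ins for the driver's request list, HWS seqno and ACTHD; seqno wrap-around and the flush-region/waiting cases are deliberately omitted.]

#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a pending request on a ring. */
struct fake_request {
	uint32_t seqno;		/* breadcrumb the GPU writes on completion */
	uint32_t batch_start;	/* batch object GTT range, made-up values */
	uint32_t batch_end;
};

int main(void)
{
	/* Last seqno the GPU reported as completed (from the HWS page)
	 * and where the ring head was stuck (ACTHD); made-up values. */
	uint32_t completed_seqno = 102;
	uint32_t acthd = 0x5100;

	struct fake_request reqs[] = {
		{ 101, 0x1000, 0x2000 },	/* completed before the hang */
		{ 103, 0x5000, 0x6000 },	/* head stuck here: guilty */
		{ 104, 0x9000, 0xa000 },	/* queued behind: innocent */
	};
	unsigned int active = 0, pending = 0;

	for (unsigned int i = 0; i < 3; i++) {
		if (reqs[i].seqno <= completed_seqno)
			continue;	/* finished, unrelated to the hang */
		if (acthd >= reqs[i].batch_start && acthd < reqs[i].batch_end)
			active++;	/* would bump hang_stats.batch_active */
		else
			pending++;	/* would bump hang_stats.batch_pending */
	}
	printf("guilty: %u, innocent: %u\n", active, pending);
	return 0;
}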