From patchwork Tue Jan 17 15:59:06 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 9521437 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id B8E0B601C3 for ; Tue, 17 Jan 2017 16:00:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A85EF285DC for ; Tue, 17 Jan 2017 16:00:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9CBBC285E8; Tue, 17 Jan 2017 16:00:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 56AC8285DC for ; Tue, 17 Jan 2017 16:00:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CBCC96E6A7; Tue, 17 Jan 2017 16:00:55 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7BB636E6A8 for ; Tue, 17 Jan 2017 16:00:55 +0000 (UTC) Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga104.jf.intel.com with ESMTP; 17 Jan 2017 08:00:55 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.33,245,1477983600"; d="scan'208";a="31691175" Received: from rosetta.fi.intel.com ([10.237.72.176]) by orsmga002.jf.intel.com with ESMTP; 17 Jan 2017 08:00:54 -0800 Received: by rosetta.fi.intel.com (Postfix, from userid 1000) id C0D2E840473; Tue, 17 Jan 2017 17:59:09 +0200 (EET) From: Mika Kuoppala To: intel-gfx@lists.freedesktop.org Date: Tue, 17 Jan 2017 17:59:06 +0200 Message-Id: <1484668747-9120-6-git-send-email-mika.kuoppala@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1484668747-9120-1-git-send-email-mika.kuoppala@intel.com> References: <1484668747-9120-1-git-send-email-mika.kuoppala@intel.com> Subject: [Intel-gfx] [PATCH 6/7] drm/i915: Detect a failed GPU reset+recovery X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Chris Wilson If we can't recover the GPU after the reset, mark it as wedged to cancel the outstanding tasks and to prevent new users from trying to use the broken GPU. v2: Check the same ring is hung again before declaring the reset broken. v3: use engine_stalled (Mika) Signed-off-by: Chris Wilson (v2) Cc: Mika Kuoppala Cc: Tvrtko Ursulin Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/i915/i915_drv.c | 7 ++++++- drivers/gpu/drm/i915/i915_drv.h | 2 +- drivers/gpu/drm/i915/i915_gem.c | 16 ++++++++++++++-- 3 files changed, 21 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index bb747ae..5ee6297 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1761,7 +1761,12 @@ void i915_reset(struct drm_i915_private *dev_priv) pr_notice("drm/i915: Resetting chip after gpu hang\n"); disable_irq(dev_priv->drm.irq); - i915_gem_reset_prepare(dev_priv); + ret = i915_gem_reset_prepare(dev_priv); + if (ret) { + DRM_ERROR("GPU recovery failed\n"); + intel_gpu_reset(dev_priv, ALL_ENGINES); + goto error; + } ret = intel_gpu_reset(dev_priv, ALL_ENGINES); if (ret) { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f861418..3850950 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3328,7 +3328,7 @@ static inline u32 i915_reset_count(struct i915_gpu_error *error) return READ_ONCE(error->reset_count); } -void i915_gem_reset_prepare(struct drm_i915_private *dev_priv); +int i915_gem_reset_prepare(struct drm_i915_private *dev_priv); void i915_gem_reset_finish(struct drm_i915_private *dev_priv); void i915_gem_set_wedged(struct drm_i915_private *dev_priv); void i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 832ad09..3e10e81 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2625,16 +2625,28 @@ static bool engine_stalled(struct intel_engine_cs *engine) return true; } -void i915_gem_reset_prepare(struct drm_i915_private *dev_priv) +int i915_gem_reset_prepare(struct drm_i915_private *dev_priv) { struct intel_engine_cs *engine; enum intel_engine_id id; + int err = 0; /* Ensure irq handler finishes, and not run again. */ - for_each_engine(engine, dev_priv, id) + for_each_engine(engine, dev_priv, id) { + struct drm_i915_gem_request *request; + tasklet_kill(&engine->irq_tasklet); + if (engine_stalled(engine)) { + request = i915_gem_find_active_request(engine); + if (request && request->fence.error == -EIO) + err = -EIO; /* Previous reset failed! */ + } + } + i915_gem_revoke_fences(dev_priv); + + return err; } static void skip_request(struct drm_i915_gem_request *request)