From patchwork Wed Apr 11 12:03:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10335373 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id F28FD60134 for ; Wed, 11 Apr 2018 12:04:04 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DA6DC28346 for ; Wed, 11 Apr 2018 12:04:04 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CDFD928905; Wed, 11 Apr 2018 12:04:04 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5EDD428346 for ; Wed, 11 Apr 2018 12:04:04 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7909C6E5B0; Wed, 11 Apr 2018 12:04:03 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2779B6E5B0 for ; Wed, 11 Apr 2018 12:04:01 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 11334653-1500050 for multiple; Wed, 11 Apr 2018 13:03:46 +0100 Received: by haswell.alporthouse.com (sSMTP sendmail emulation); Wed, 11 Apr 2018 13:03:46 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 11 Apr 2018 13:03:46 +0100 Message-Id: <20180411120346.27618-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.17.0 MIME-Version: 1.0 X-Originating-IP: 78.156.65.138 X-Country: code=GB country="United Kingdom" ip=78.156.65.138 Subject: [Intel-gfx] [PATCH] drm/i915/selftests: Wait for idle between idle resets as well X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Even though we weren't injecting guilty requests to be reset, we could still fall over the issue of resetting the same request too fast -- where the GPU refuses to start again. (Although it is interesting to note that reloading the driver is sufficient, suggesting that we could recover if we delayed the setup after reset?) Continue to paper over the problem by adding a small delay by waiting for the engine to idle between tests, and ensure that the engines are idle before starting the idle tests. References: 028666793a02 ("drm/i915/selftests: Avoid repeatedly harming the same innocent context") Signed-off-by: Chris Wilson Cc: Mika Kuoppala Cc: Michał Winiarski Cc: Michel Thierry Cc: Tvrtko Ursulin Reviewed-by: Michał Winiarski --- .../gpu/drm/i915/selftests/intel_hangcheck.c | 48 ++++++++++++++++++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c index 24f913f26a7b..7e23e6a6719c 100644 --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c @@ -454,6 +454,11 @@ static int igt_global_reset(void *arg) return err; } +static bool wait_for_idle(struct intel_engine_cs *engine) +{ + return wait_for(intel_engine_is_idle(engine), 50) == 0; +} + static int __igt_reset_engine(struct drm_i915_private *i915, bool active) { struct intel_engine_cs *engine; @@ -481,6 +486,13 @@ static int __igt_reset_engine(struct drm_i915_private *i915, bool active) if (active && !intel_engine_can_store_dword(engine)) continue; + if (!wait_for_idle(engine)) { + pr_err("%s failed to idle before reset\n", + engine->name); + err = -EIO; + break; + } + reset_count = i915_reset_count(&i915->gpu_error); reset_engine_count = i915_reset_engine_count(&i915->gpu_error, engine); @@ -542,6 +554,19 @@ static int __igt_reset_engine(struct drm_i915_private *i915, bool active) err = -EINVAL; break; } + + if (!wait_for_idle(engine)) { + struct drm_printer p = + drm_info_printer(i915->drm.dev); + + pr_err("%s failed to idle after reset\n", + engine->name); + intel_engine_dump(engine, &p, + "%s\n", engine->name); + + err = -EIO; + break; + } } while (time_before(jiffies, end_time)); clear_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags); @@ -696,6 +721,13 @@ static int __igt_reset_engines(struct drm_i915_private *i915, !intel_engine_can_store_dword(engine)) continue; + if (!wait_for_idle(engine)) { + pr_err("i915_reset_engine(%s:%s): failed to idle before reset\n", + engine->name, test_name); + err = -EIO; + break; + } + memset(threads, 0, sizeof(threads)); for_each_engine(other, i915, tmp) { struct task_struct *tsk; @@ -772,6 +804,20 @@ static int __igt_reset_engines(struct drm_i915_private *i915, i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT); i915_request_put(rq); } + + if (!(flags & TEST_SELF) && !wait_for_idle(engine)) { + struct drm_printer p = + drm_info_printer(i915->drm.dev); + + pr_err("i915_reset_engine(%s:%s):" + " failed to idle after reset\n", + engine->name, test_name); + intel_engine_dump(engine, &p, + "%s\n", engine->name); + + err = -EIO; + break; + } } while (time_before(jiffies, end_time)); clear_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags); pr_info("i915_reset_engine(%s:%s): %lu resets\n", @@ -981,7 +1027,7 @@ static int wait_for_others(struct drm_i915_private *i915, if (engine == exclude) continue; - if (wait_for(intel_engine_is_idle(engine), 10)) + if (!wait_for_idle(engine)) return -EIO; }