From patchwork Thu May 17 14:24:41 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10406851 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 21A3E60247 for ; Thu, 17 May 2018 14:25:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1341C2856F for ; Thu, 17 May 2018 14:25:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 080922853E; Thu, 17 May 2018 14:25:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id AEF2628511 for ; Thu, 17 May 2018 14:24:59 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C8EF16E937; Thu, 17 May 2018 14:24:58 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 614876E926 for ; Thu, 17 May 2018 14:24:55 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 11738206-1500050 for multiple; Thu, 17 May 2018 15:24:44 +0100 Received: by haswell.alporthouse.com (sSMTP sendmail emulation); Thu, 17 May 2018 15:24:43 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 17 May 2018 15:24:41 +0100 Message-Id: <20180517142442.16979-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.17.0 X-Originating-IP: 78.156.65.138 X-Country: code=GB country="United Kingdom" ip=78.156.65.138 Subject: [Intel-gfx] [PATCH 1/2] drm/i915/selftests: Wait longer for the old active request X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP When testing reset, we wait for 1s on the main thread for the hang to start. Meanwhile, we continue submitting requests on all the background threads, and we may have more threads than cores and so potentially starve the waiter from being woken within the timeout. As the hang timeout and the active timeouts are the same, it is hard to distinguish which caused the timeout. Bump the active thread timeouts to 5s, compared to the 1s timeout for the hang, so that we preferentially report the hang timing out, while hopefully ensuring that we do at least wake up the hang thread first before declaring the background active timeout. Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin --- .../gpu/drm/i915/selftests/intel_hangcheck.c | 48 +++++++++++++------ 1 file changed, 34 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c index 438e0b045a2c..f1dc42a171c8 100644 --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c @@ -560,6 +560,30 @@ struct active_engine { #define TEST_SELF BIT(2) #define TEST_PRIORITY BIT(3) +static int active_request_put(struct i915_request *rq) +{ + int err = 0; + + if (!rq) + return 0; + + if (i915_request_wait(rq, 0, 5 * HZ) < 0) { + GEM_TRACE("%s timed out waiting for completion of fence %llx:%d, seqno %d.\n", + rq->engine->name, + rq->fence.context, + rq->fence.seqno, + i915_request_global_seqno(rq)); + GEM_TRACE_DUMP(); + + i915_gem_set_wedged(rq->i915); + err = -EIO; + } + + i915_request_put(rq); + + return err; +} + static int active_engine(void *data) { I915_RND_STATE(prng); @@ -608,24 +632,20 @@ static int active_engine(void *data) i915_request_add(new); mutex_unlock(&engine->i915->drm.struct_mutex); - if (old) { - if (i915_request_wait(old, 0, HZ) < 0) { - GEM_TRACE("%s timed out.\n", engine->name); - GEM_TRACE_DUMP(); - - i915_gem_set_wedged(engine->i915); - i915_request_put(old); - err = -EIO; - break; - } - i915_request_put(old); - } + err = active_request_put(old); + if (err) + break; cond_resched(); } - for (count = 0; count < ARRAY_SIZE(rq); count++) - i915_request_put(rq[count]); + for (count = 0; count < ARRAY_SIZE(rq); count++) { + int err__ = active_request_put(rq[count]); + + /* Keep the first error */ + if (!err) + err = err__; + } err_file: mock_file_free(engine->i915, file);