From patchwork Thu May 28 07:41:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11574891 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 05B421391 for ; Thu, 28 May 2020 07:41:46 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E29CE208E4 for ; Thu, 28 May 2020 07:41:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E29CE208E4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 600A76E3E7; Thu, 28 May 2020 07:41:43 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5154C6E3EB for ; Thu, 28 May 2020 07:41:41 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21318021-1500050 for multiple; Thu, 28 May 2020 08:41:10 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 28 May 2020 08:41:00 +0100 Message-Id: <20200528074109.28235-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200528074109.28235-1-chris@chris-wilson.co.uk> References: <20200528074109.28235-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 02/11] drm/i915/gt: Don't declare hangs if engine is stalled X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" If the ring submission is stalled on an external request, nothing can be submitted, not even the heartbeat in the kernel context. Since nothing is running, resetting the engine/device does not unblock the system and is pointless. We can see if the heartbeat is supposed to be running before declaring foul. Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c index 5136c8bf112d..f67ad937eefb 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c @@ -48,8 +48,10 @@ static void show_heartbeat(const struct i915_request *rq, struct drm_printer p = drm_debug_printer("heartbeat"); intel_engine_dump(engine, &p, - "%s heartbeat {prio:%d} not ticking\n", + "%s heartbeat {seqno:%llx:%lld, prio:%d} not ticking\n", engine->name, + rq->fence.context, + rq->fence.seqno, rq->sched.attr.priority); } @@ -76,8 +78,19 @@ static void heartbeat(struct work_struct *wrk) goto out; if (engine->heartbeat.systole) { - if (engine->schedule && - rq->sched.attr.priority < I915_PRIORITY_BARRIER) { + if (!i915_sw_fence_signaled(&rq->submit)) { + /* + * Not yet submitted, system is stalled. + * + * This more often happens for ring submission, + * where all contexts are funnelled into a common + * ringbuffer. If one context is blocked on an + * external fence, not only is it not submitted, + * but all other contexts, including the kernel + * context are stuck waiting for the signal. + */ + } else if (engine->schedule && + rq->sched.attr.priority < I915_PRIORITY_BARRIER) { /* * Gradually raise the priority of the heartbeat to * give high priority work [which presumably desires