From patchwork Fri Jul 10 12:16:09 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 11656447
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DD94413B1
	for ; Fri, 10 Jul 2020 12:16:21 +0000 (UTC)
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id C4D9F20748
	for ; Fri, 10 Jul 2020 12:16:21 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C4D9F20748
Authentication-Results: mail.kernel.org;
	dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk
Authentication-Results: mail.kernel.org;
	spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id 522DB6E0BC;
	Fri, 10 Jul 2020 12:16:21 +0000 (UTC)
X-Original-To: intel-gfx@lists.freedesktop.org
Delivered-To: intel-gfx@lists.freedesktop.org
Received: from fireflyinternet.com (unknown [77.68.26.236])
	by gabe.freedesktop.org (Postfix) with ESMTPS id C48356E0BC
	for ; Fri, 10 Jul 2020 12:16:19 +0000 (UTC)
X-Default-Received-SPF: pass (skip=forwardok (res=PASS))
	x-ip-name=78.156.65.138;
Received: from build.alporthouse.com (unverified [78.156.65.138])
	by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21777162-1500050
	for multiple; Fri, 10 Jul 2020 13:16:13 +0100
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Fri, 10 Jul 2020 13:16:09 +0100
Message-Id: <20200710121609.6775-1-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200710120717.32484-1-chris@chris-wilson.co.uk>
References: <20200710120717.32484-1-chris@chris-wilson.co.uk>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH] drm/i915/gt: Be defensive in the face of false CS events
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Intel graphics driver community testing & development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Chris Wilson
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"

If the HW throws a curve ball and reports either an event before it is
possible, or just a completely impossible event, we have to grin and
bear it. The first few events, we will likely not notice as we would be
expecting some event, but as soon as we stop expecting an event and yet
they still keep coming, then we enter into undefined state territory.
In which case, bail out, stop processing the events, and reset the
engine and our set of queued requests to recover.

The sporadic hangs and warnings will continue to plague CI, but at least
system stability should not be compromised.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045
Signed-off-by: Chris Wilson
Cc: Tvrtko Ursulin
Reviewed-by: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index fbcfeaed6441..c86324d2d2bb 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2567,6 +2567,7 @@ static void process_csb(struct intel_engine_cs *engine)
 	tail = READ_ONCE(*execlists->csb_write);
 	if (unlikely(head == tail))
 		return;
+	execlists->csb_head = tail;
 
 	/*
 	 * Hopefully paired with a wmb() in HW!
@@ -2613,6 +2614,9 @@ static void process_csb(struct intel_engine_cs *engine)
 		if (promote) {
 			struct i915_request * const *old = execlists->active;
 
+			if (GEM_WARN_ON(!*execlists->pending))
+				break;
+
 			ring_set_paused(engine, 0);
 
 			/* Point active to the new ELSP; prevent overwriting */
@@ -2635,7 +2639,8 @@ static void process_csb(struct intel_engine_cs *engine)
 
 			WRITE_ONCE(execlists->pending[0], NULL);
 		} else {
-			GEM_BUG_ON(!*execlists->active);
+			if (GEM_WARN_ON(!*execlists->active))
+				break;
 
 			/* port0 completed, advanced to port1 */
 			trace_ports(execlists, "completed", execlists->active);
@@ -2686,7 +2691,6 @@ static void process_csb(struct intel_engine_cs *engine)
 		}
 	} while (head != tail);
 
-	execlists->csb_head = head;
 	set_timeslice(engine);
 
 	/*
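
For readers unfamiliar with process_csb(), the pattern in this patch can be
sketched as a standalone toy in plain C. This is only an illustration under
stated assumptions, not the driver code: struct toy_engine, enum toy_event,
TOY_RING_SIZE and toy_process_events() are all hypothetical names, and the
pending/active ports are reduced to simple counters. It shows the two ideas
from the commit message: claim the whole batch of events up front (so an
early exit never replays them), and stop on an impossible event so the
caller can reset instead of trusting corrupt state.

```c
#include <stdbool.h>

#define TOY_RING_SIZE 8 /* hypothetical ring size, not the real CSB size */

enum toy_event { TOY_EV_PROMOTE, TOY_EV_COMPLETE };

/* Hypothetical stand-in for the execlists state touched by the patch. */
struct toy_engine {
	enum toy_event ring[TOY_RING_SIZE];
	unsigned int head;  /* index of the last entry we consumed */
	unsigned int write; /* index of the last entry the "HW" wrote */
	int pending;        /* submitted, not yet acknowledged */
	int active;         /* acknowledged and still running */
	bool needs_reset;   /* set when an impossible event is seen */
};

/* Returns the number of events handled before finishing or bailing out. */
static unsigned int toy_process_events(struct toy_engine *e)
{
	unsigned int head = e->head;
	unsigned int tail = e->write;
	unsigned int handled = 0;

	if (head == tail)
		return 0;

	/* Claim the whole batch now; an early exit must not replay it. */
	e->head = tail;

	do {
		if (++head == TOY_RING_SIZE)
			head = 0;

		if (e->ring[head] == TOY_EV_PROMOTE) {
			/* Promotion with nothing pending is impossible. */
			if (e->pending == 0) {
				e->needs_reset = true;
				return handled;
			}
			e->active = e->pending;
			e->pending = 0;
		} else {
			/* Completion with nothing active is impossible. */
			if (e->active == 0) {
				e->needs_reset = true;
				return handled;
			}
			e->active--;
		}
		handled++;
	} while (head != tail);

	return handled;
}
```

A caller would check needs_reset after toy_process_events() returns and, if
it is set, reset the engine and its queue rather than process more events.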