From patchwork Tue Apr 24 01:24:12 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Wilson <chris@chris-wilson.co.uk>
X-Patchwork-Id: 10358455
From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Date: Tue, 24 Apr 2018 02:24:12 +0100
Message-Id: <20180424012412.18770-1-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 2.17.0
X-Originating-IP: 78.156.65.138
X-Country: code=GB country="United Kingdom" ip=78.156.65.138
Subject: [Intel-gfx] [PATCH] drm/i915: Don't dump umpteen thousand requests
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Intel graphics driver community testing & development
	<intel-gfx.lists.freedesktop.org>
List-Unsubscribe: <https://lists.freedesktop.org/mailman/options/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <https://lists.freedesktop.org/archives/intel-gfx>
List-Post: <mailto:intel-gfx@lists.freedesktop.org>
List-Help: <mailto:intel-gfx-request@lists.freedesktop.org?subject=help>
List-Subscribe: <https://lists.freedesktop.org/mailman/listinfo/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=subscribe>
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
X-Virus-Scanned: ClamAV using ClamSMTP

If we have more than a few, possibly several thousand, requests in the
queue, don't show the central portion, just the first few and the last
being executed and/or queued. The first few should be enough to help
identify a problem in execution, and most often comparing the first/last
in the queue is enough to identify problems in the scheduling.

We may need some fine tuning to set MAX_REQUESTS_TO_SHOW for common
debug scenarios, but for the moment, if we can avoid spending more
than a few seconds dumping the GPU state, that will avoid a nasty
livelock (where hangcheck spends so long dumping the state that it fires
again and starts to dump the state again in parallel, ad infinitum).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
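For reviewers who want to poke at the truncation logic outside the
driver, here is a rough userspace sketch of the same head/tail idea. It
is not the kernel code: struct request, its seqno/next fields and
dump_truncated() are made up for illustration, and a plain singly
linked list stands in for the engine timeline. Only the counting rule
(print the first MAX_REQUESTS_TO_SHOW - 1 entries, summarise the
middle, print the last entry) is meant to match the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct i915_request; only what the sketch needs. */
struct request {
	int seqno;
	struct request *next;
};

enum { MAX_REQUESTS_TO_SHOW = 8 };

/*
 * Print the first MAX_REQUESTS_TO_SHOW - 1 entries and the final entry,
 * replacing everything in between with a one-line summary.
 * Returns the number of lines written to @out.
 */
static int dump_truncated(FILE *out, const struct request *head,
			  const char *prefix)
{
	const struct request *rq, *last = NULL;
	int count = 0, lines = 0;

	assert(out && prefix);

	for (rq = head; rq; rq = rq->next) {
		if (count++ < MAX_REQUESTS_TO_SHOW - 1) {
			fprintf(out, "%s%d\n", prefix, rq->seqno);
			lines++;
		} else {
			/* Past the head: remember only the most recent entry. */
			last = rq;
		}
	}

	if (last) {
		/* Only mention skipping if something was actually left out. */
		if (count > MAX_REQUESTS_TO_SHOW) {
			fprintf(out, "\t\t...skipping %d requests...\n",
				count - MAX_REQUESTS_TO_SHOW);
			lines++;
		}
		/* Use the saved pointer; the loop cursor is stale by now. */
		fprintf(out, "%s%d\n", prefix, last->seqno);
		lines++;
	}

	return lines;
}
```

With this rule a list of exactly MAX_REQUESTS_TO_SHOW entries is printed
in full with no "skipping" line, and any longer list costs one extra
line for the summary, so the output size stays bounded however deep the
queue gets.
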
 drivers/gpu/drm/i915/intel_engine_cs.c | 41 +++++++++++++++++++++++---
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 66cddd059666..db28f9e3c306 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1307,11 +1307,13 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 		       struct drm_printer *m,
 		       const char *header, ...)
 {
+	const int MAX_REQUESTS_TO_SHOW = 8;
 	struct intel_breadcrumbs * const b = &engine->breadcrumbs;
 	const struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_gpu_error * const error = &engine->i915->gpu_error;
-	struct i915_request *rq;
+	struct i915_request *rq, *last;
 	struct rb_node *rb;
+	int count;
 
 	if (header) {
 		va_list ap;
@@ -1378,16 +1380,47 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 	}
 
 	spin_lock_irq(&engine->timeline->lock);
-	list_for_each_entry(rq, &engine->timeline->requests, link)
+
+	last = NULL;
+	count = 0;
+	list_for_each_entry(rq, &engine->timeline->requests, link) {
+		if (count++ < MAX_REQUESTS_TO_SHOW - 1)
+			print_request(m, rq, "\t\tE ");
+		else
+			last = rq;
+	}
+	if (last) {
+		if (count > MAX_REQUESTS_TO_SHOW) {
+			drm_printf(m,
+				   "\t\t...skipping %d executing requests...\n",
+				   count - MAX_REQUESTS_TO_SHOW);
+		}
-		print_request(m, rq, "\t\tE ");
+		print_request(m, last, "\t\tE ");
+	}
+
+	last = NULL;
+	count = 0;
 	drm_printf(m, "\t\tQueue priority: %d\n", execlists->queue_priority);
 	for (rb = execlists->first; rb; rb = rb_next(rb)) {
 		struct i915_priolist *p =
 			rb_entry(rb, typeof(*p), node);
 
-		list_for_each_entry(rq, &p->requests, sched.link)
-			print_request(m, rq, "\t\tQ ");
+		list_for_each_entry(rq, &p->requests, sched.link) {
+			if (count++ < MAX_REQUESTS_TO_SHOW - 1)
+				print_request(m, rq, "\t\tQ ");
+			else
+				last = rq;
+		}
 	}
+	if (last) {
+		if (count > MAX_REQUESTS_TO_SHOW) {
+			drm_printf(m,
+				   "\t\t...skipping %d queued requests...\n",
+				   count - MAX_REQUESTS_TO_SHOW);
+		}
+		print_request(m, last, "\t\tQ ");
+	}
+
 	spin_unlock_irq(&engine->timeline->lock);
 
 	spin_lock_irq(&b->rb_lock);