From patchwork Thu Jul 19 09:35:59 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tvrtko Ursulin <tursulin@ursulin.net>
X-Patchwork-Id: 10533995
Return-Path: <intel-gfx-bounces@lists.freedesktop.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	6222A600F4 for <patchwork-intel-gfx@patchwork.kernel.org>;
	Thu, 19 Jul 2018 09:36:14 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5092629667
	for <patchwork-intel-gfx@patchwork.kernel.org>;
	Thu, 19 Jul 2018 09:36:14 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 44B5C29676; Thu, 19 Jul 2018 09:36:14 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI,
	RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	(using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id ADE9829667
	for <patchwork-intel-gfx@patchwork.kernel.org>;
	Thu, 19 Jul 2018 09:36:13 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id 09E2D6ED69;
	Thu, 19 Jul 2018 09:36:13 +0000 (UTC)
X-Original-To: intel-gfx@lists.freedesktop.org
Delivered-To: intel-gfx@lists.freedesktop.org
Received: from mail-wr1-x433.google.com (mail-wr1-x433.google.com
	[IPv6:2a00:1450:4864:20::433])
	by gabe.freedesktop.org (Postfix) with ESMTPS id AB1596ED69
	for <intel-gfx@lists.freedesktop.org>;
	Thu, 19 Jul 2018 09:36:10 +0000 (UTC)
Received: by mail-wr1-x433.google.com with SMTP id h10-v6so7356703wre.6
	for <intel-gfx@lists.freedesktop.org>;
	Thu, 19 Jul 2018 02:36:10 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
	:references;
	bh=JOzzQjlcu2hLaAGIyWJgBGfuL4sBVJjTbUo+/zfuEyo=;
	b=PqhvZ9G17qq9WsEKDCdsADxLH0Tvgzdk7oKvlyNxauIxJixRQsQMdPjbVtBUCtBF0u
	6nPJmW9h2UVf+k5DL9J3/OYNIAonvxO27EuOt34ks4dOzWWqqVi2RmSVhb36xopU7JjJ
	ESPjZcaNL4Mf8hRF7REzPj2m+FzmCTpKZ4qagCUEIWn6bZMQ3wfrNCymoag1pePInPI7
	IYEhbn9oABgnsUeLmYqX/wOb5ufS37nH5kWZ7qZfJC9aUmvdhYKi4BYIhMgc4iUMurd1
	m/mDKBy5fQGqIcGe+X7T8FIXkQfq499ARniS/MXsdDTJqKmlR0Btilz4TFx0dsKzl3P6
	IxQQ==
X-Gm-Message-State: AOUpUlFWHOnTBAEZmn9imTjUQTPj4GVvsPOFTIsw0/dKebcx/7IJJb7e
	wQCsd3JxC1wzh2Q06klW712nyw==
X-Google-Smtp-Source: 
 AAOMgpewBu8JhYrnwyuYOWgXNjbxtoYhHDdl7NW5DBMgCfZam+0JSN1KODRli+CCG/xQ9ngn4D4PQg==
X-Received: by 2002:adf:8ec2:: with SMTP id
	q60-v6mr6552791wrb.275.1531992969191;
	Thu, 19 Jul 2018 02:36:09 -0700 (PDT)
Received: from localhost.localdomain ([95.146.151.144])
	by smtp.gmail.com with ESMTPSA id
	a11-v6sm8422004wrr.81.2018.07.19.02.36.08
	(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Thu, 19 Jul 2018 02:36:08 -0700 (PDT)
From: Tvrtko Ursulin <tursulin@ursulin.net>
X-Google-Original-From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: igt-dev@lists.freedesktop.org
Date: Thu, 19 Jul 2018 10:35:59 +0100
Message-Id: <20180719093601.11788-3-tvrtko.ursulin@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180719093601.11788-1-tvrtko.ursulin@linux.intel.com>
References: <20180719093601.11788-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [PATCH i-g-t 2/4] trace.pl: Fix request split mode
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Intel graphics driver community testing & development
	<intel-gfx.lists.freedesktop.org>
List-Unsubscribe: <https://lists.freedesktop.org/mailman/options/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <https://lists.freedesktop.org/archives/intel-gfx>
List-Post: <mailto:intel-gfx@lists.freedesktop.org>
List-Help: <mailto:intel-gfx-request@lists.freedesktop.org?subject=help>
List-Subscribe: <https://lists.freedesktop.org/mailman/listinfo/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=subscribe>
Cc: intel-gfx@lists.freedesktop.org
MIME-Version: 1.0
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
X-Virus-Scanned: ClamAV using ClamSMTP

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Request split mode had several bugs, both in the original version and also
after the recent refactorings.

The main ones were that it did not consider different submit ports as a
reason to split execution, and that it relied too heavily on timestamps
instead of looking at the relevant timelines.

In this refactoring we address the former by using the engine timelines
introduced in the previous patch. Secondary port submissions are moved
to follow the preceding submission as a first step in the correction
process.

In the second step, we add context timelines and use them in a similar
fashion to separate start and end time of coalesced requests. For each
coalesced request we know its boundaries by looking at the engine
timeline (via global seqnos), and we know the previous request it should
only start after, by looking at the context timeline.

v2:
 * Remove some dead code.
 * Fix !port0 shifting logic.

v3:
 * Refactor for less list walking as with incomplete handling.

v4:
 * Database of context timelines should not contain duplicates!
   (Converted from array into a hash.)

v5:
 * Avoid over-accounting runnable time for a coalesced group by recording
   the time the first request entered the GPU and ending the execute delay
   at that point for the whole group.

v6:
 * Update for engine class:instance.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: John Harrison <John.C.Harrison@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
---
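For readers who want to try the idea in isolation, here is a minimal
standalone sketch of the two correction steps and the execute-delay
accounting described above. It is an illustration under assumptions, not
the code in scripts/trace.pl: the helpers engine_timeline(), ctx_timeline()
and execute_delay(), the request keys req1..req4 and all timestamps are
hypothetical. Only the %db field names are taken from the patch.

```perl
use strict;
use warnings;
use 5.010;

# Toy request database. Field names follow the patch; every value
# (timestamps in us) is invented for illustration.
my %db = (
	req1 => { ring => 'rcs0', ctx => 1, seqno => 1, global => 1, port => 0,
		  submit => 90, start => 100, notify => 150, end => 160 },
	req2 => { ring => 'rcs0', ctx => 2, seqno => 1, global => 2, port => 1,
		  submit => 95, start => 100, notify => 220, end => 230 },
	# req3 and req4 were coalesced: req3 has no request_out of its own.
	req3 => { ring => 'rcs0', ctx => 3, seqno => 1, global => 3, port => 0,
		  submit => 280, start => 300, notify => 340, end => 400,
		  'no-end' => 1 },
	req4 => { ring => 'rcs0', ctx => 3, seqno => 2, global => 4, port => 0,
		  submit => 290, start => 300, notify => 390, end => 400 },
);

# Hypothetical helper: requests on one engine, ordered by global seqno.
sub engine_timeline {
	my ($ring) = @_;
	return sort { $db{$a}{global} <=> $db{$b}{global} }
	       grep { $db{$_}{ring} eq $ring } keys %db;
}

# Hypothetical helper: requests of one context on one engine, by seqno.
sub ctx_timeline {
	my ($ctx, $ring) = @_;
	return sort { $db{$a}{seqno} <=> $db{$b}{seqno} }
	       grep { $db{$_}{ring} eq $ring and $db{$_}{ctx} == $ctx } keys %db;
}

# Step 1: a request on a secondary port starts only after the preceding
# completed request on the same engine timeline has ended.
my $prev_complete;
foreach my $key (engine_timeline('rcs0')) {
	my $pkey = $prev_complete;

	$prev_complete = $key unless exists $db{$key}{'no-end'};
	next if $db{$key}{port} == 0;
	die "no preceding request for $key" unless defined $pkey;

	$db{$key}{start} = $db{$pkey}{end};
}

# Step 2: within a context timeline, the request following a 'no-end'
# request starts at that request's notify time. The original start is
# kept as 'engine-start', the time the coalesced group entered the GPU.
my %ctxtimelines;
$ctxtimelines{$db{$_}{ctx} . '/' . $db{$_}{ring}} = 1 foreach keys %db;

foreach my $tkey (sort keys %ctxtimelines) {
	my ($ctx, $ring) = split '/', $tkey;
	my @timeline = ctx_timeline($ctx, $ring);

	foreach my $pos (0 .. $#timeline - 1) {
		my ($key, $next_key) = @timeline[$pos, $pos + 1];

		next unless exists $db{$key}{'no-end'};
		next unless exists $db{$key}{notify};

		$db{$next_key}{'engine-start'} = $db{$next_key}{start};
		$db{$next_key}{start} = $db{$key}{notify};
	}
}

# Execute delay ends when the group entered the GPU, if that is known.
sub execute_delay {
	my ($key) = @_;
	my $engine_start = $db{$key}{'engine-start'} // $db{$key}{start};

	return $engine_start - $db{$key}{submit};
}

foreach my $key (sort keys %db) {
	printf "%s: start=%d execute-delay=%d\n",
	       $key, $db{$key}{start}, execute_delay($key);
}
```

With this toy data req2 (port 1) is moved to start when req1 ends, and
req4 is moved to start at req3's notify while its execute delay is still
measured against the time the coalesced pair entered the GPU.
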
 scripts/trace.pl | 138 ++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 108 insertions(+), 30 deletions(-)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 41bedeefb776..59f6d32dc3c8 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -27,7 +27,7 @@ use warnings;
 use 5.010;
 
 my $gid = 0;
-my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait);
+my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait, %ctxtimelines);
 my @freqs;
 
 my $max_items = 3000;
@@ -418,6 +418,7 @@ while (<>) {
 		$req{'ring'} = $ring;
 		$req{'seqno'} = $seqno;
 		$req{'ctx'} = $ctx;
+		$ctxtimelines{$ctx . '/' . $ring} = 1;
 		$req{'name'} = $ctx . '/' . $seqno;
 		$req{'global'} = $tp{'global'};
 		$req{'port'} = $tp{'port'};
@@ -573,41 +574,113 @@ sub sortStart {
 	return $val;
 }
 
-my @sorted_keys = sort sortStart keys %db;
-my $re_sort = 0;
+my $re_sort = 1;
+my @sorted_keys;
 
-die "Database changed size?!" unless scalar(@sorted_keys) == $key_count;
+sub maybe_sort_keys
+{
+	if ($re_sort) {
+		@sorted_keys = sort sortStart keys %db;
+		$re_sort = 0;
+		die "Database changed size?!" unless scalar(@sorted_keys) ==
+						     $key_count;
+	}
+}
 
-foreach my $key (@sorted_keys) {
-	my $ring = $db{$key}->{'ring'};
-	my $end = $db{$key}->{'end'};
+maybe_sort_keys();
+
+my %ctx_timelines;
+
+sub sortContext {
+	my $as = $db{$a}->{'seqno'};
+	my $bs = $db{$b}->{'seqno'};
+	my $val;
+
+	$val = $as <=> $bs;
+
+	die if $val == 0;
+
+	return $val;
+}
+
+sub get_ctx_timeline {
+	my ($ctx, $ring, $key) = @_;
+	my @timeline;
+
+	return $ctx_timelines{$key} if exists $ctx_timelines{$key};
+
+	@timeline = grep { $db{$_}->{'ring'} eq $ring and
+			   $db{$_}->{'ctx'} == $ctx } @sorted_keys;
+	# FIXME seqno restart
+	@timeline = sort sortContext @timeline;
+
+	$ctx_timelines{$key} = \@timeline;
+
+	return \@timeline;
+}
+
+# Split out merged batches if requested.
+if ($correct_durations) {
+	# Shift !port0 requests start time to after the previous context on the
+	# same timeline has finished.
+	foreach my $gid (sort keys %rings) {
+		my $ring = $ringmap{$rings{$gid}};
+		my $timeline = get_engine_timeline($ring);
+		my $complete;
+
+		foreach my $pos (0..$#{$timeline}) {
+			my $key = @{$timeline}[$pos];
+			my $prev = $complete;
+			my $pkey;
+
+			$complete = $key unless exists $db{$key}->{'no-end'};
+			$pkey = $complete;
+
+			next if $db{$key}->{'port'} == 0;
+
+			$pkey = $prev if $complete eq $key;
+
+			die unless defined $pkey;
+
+			$db{$key}->{'start'} = $db{$pkey}->{'end'};
+			$db{$key}->{'start'} = $db{$pkey}->{'notify'} if $db{$key}->{'start'} > $db{$key}->{'end'};
+
+			die if $db{$key}->{'start'} > $db{$key}->{'end'};
 
-	# correct duration of merged batches
-	if ($correct_durations and exists $db{$key}->{'no-end'}) {
-		my $ctx = $db{$key}->{'ctx'};
-		my $seqno = $db{$key}->{'seqno'};
-		my $start = $db{$key}->{'start'};
-		my $next_key;
-		my $i = 1;
-
-		do {
-			$next_key = db_key($ring, $ctx, $seqno + $i);
-			$i++;
-		} until (exists $db{$next_key} or $i > $key_count);  # ugly stop hack
-
-		# 20us tolerance
-		if (exists $db{$next_key} and $db{$next_key}->{'start'} < $start + 20) {
-			my $notify = $db{$key}->{'notify'};
 			$re_sort = 1;
-			$db{$next_key}->{'start'} = $notify;
-			$db{$next_key}->{'start'} = $db{$next_key}->{'end'} if $db{$next_key}->{'start'} > $db{$next_key}->{'end'};
-			die if $db{$next_key}->{'start'} > $db{$next_key}->{'end'};
 		}
-		die if $start > $end;
+	}
+
+	maybe_sort_keys();
+
+	# Batch with no-end (no request_out) means it was submitted as part of
+	# coalesced context. This means its start time should be set to the end
+	# time of a following request on this context timeline.
+	foreach my $tkey (sort keys %ctxtimelines) {
+		my ($ctx, $ring) = split '/', $tkey;
+		my $timeline = get_ctx_timeline($ctx, $ring, $tkey);
+		my $last_complete = -1;
+		my $complete;
+
+		foreach my $pos (0..$#{$timeline}) {
+			my $key = @{$timeline}[$pos];
+			my $next_key;
+
+			next unless exists $db{$key}->{'no-end'};
+			last if $pos == $#{$timeline};
+
+			# Shift following request to start after the current one
+			$next_key = ${$timeline}[$pos + 1];
+			if (exists $db{$key}->{'notify'}) {
+				$db{$next_key}->{'engine-start'} = $db{$next_key}->{'start'};
+				$db{$next_key}->{'start'} = $db{$key}->{'notify'};
+				$re_sort = 1;
+			}
+		}
 	}
 }
 
-@sorted_keys = sort sortStart keys %db if $re_sort;
+maybe_sort_keys();
 
 # GPU time accounting
 my (%running, %runnable, %queued, %batch_avg, %batch_total_avg, %batch_count);
@@ -621,6 +694,7 @@ foreach my $key (@sorted_keys) {
 	my $ring = $db{$key}->{'ring'};
 	my $end = $db{$key}->{'end'};
 	my $start = $db{$key}->{'start'};
+	my $engine_start = $db{$key}->{'engine-start'};
 	my $notify = $db{$key}->{'notify'};
 
 	$first_ts = $db{$key}->{'queue'} if not defined $first_ts or $db{$key}->{'queue'} < $first_ts;
@@ -633,7 +707,9 @@ foreach my $key (@sorted_keys) {
 	} else {
 		$db{$key}->{'context-complete-delay'} = 0;
 	}
-	$db{$key}->{'execute-delay'} = $start - $db{$key}->{'submit'};
+
+	$engine_start = $db{$key}->{'start'} unless defined $engine_start;
+	$db{$key}->{'execute-delay'} = $engine_start - $db{$key}->{'submit'};
 	$db{$key}->{'submit-delay'} = $db{$key}->{'submit'} - $db{$key}->{'queue'};
 	unless (exists $db{$key}->{'no-notify'}) {
 		$db{$key}->{'duration'} = $notify - $start;
@@ -1059,6 +1135,7 @@ my $i = 0;
 foreach my $key (sort sortQueue keys %db) {
 	my ($name, $ctx, $seqno) = ($db{$key}->{'name'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
 	my ($queue, $start, $notify, $end) = ($db{$key}->{'queue'}, $db{$key}->{'start'}, $db{$key}->{'notify'}, $db{$key}->{'end'});
+	my $engine_start = $db{$key}->{'engine-start'};
 	my $submit = $queue + $db{$key}->{'submit-delay'};
 	my ($content, $style);
 	my $group = $engine_start_id + $rings{$db{$key}->{'ring'}};
@@ -1078,11 +1155,12 @@ foreach my $key (sort sortQueue keys %db) {
 	}
 
 	# execute to start
+	$engine_start = $db{$key}->{'start'} unless defined $engine_start;
 	unless (exists $skip_box{'ready'}) {
 		$skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1;
 		$style = box_style($ctx, 'ready');
 		$content = "<small>$name<br>$db{$key}->{'execute-delay'}us</small>";
-		$startend = 'start: ' . $submit . ', end: ' . $start;
+		$startend = 'start: ' . $submit . ', end: ' . $engine_start;
 		print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n";
 		$i++;
 	}