From patchwork Thu May 10 10:19:24 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10391603
From: Tvrtko Ursulin
To: igt-dev@lists.freedesktop.org
Cc: Intel-gfx@lists.freedesktop.org
Date: Thu, 10 May 2018 11:19:24 +0100
Message-Id: <20180510101924.20814-9-tvrtko.ursulin@linux.intel.com>
X-Mailer: git-send-email 2.14.1
In-Reply-To: <20180510101924.20814-1-tvrtko.ursulin@linux.intel.com>
References: <20180510101924.20814-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [PATCH i-g-t 9/9] trace.pl: Fix request split mode
List-Id: Intel graphics driver community testing & development

From: Tvrtko Ursulin

Request split mode had several bugs, both in the original version and
also after the recent refactorings.

The biggest ones were that it did not consider different submit ports as
a reason to split execution, and that it was too time based instead of
looking at the relevant timelines.

In this refactoring we address the former by using the engine timelines
introduced in the previous patch. Secondary port submissions are moved
to follow the preceding submission as a first step in the correction
process.

In the second step, we add context timelines and use them in a similar
fashion to separate the start and end times of coalesced requests.

For each coalesced request we know its boundaries by looking at the
engine timeline (via global seqnos), and we know which previous request
it may only start after by looking at the context timeline.

v2:
 * Remove some dead code.
 * Fix !port0 shifting logic.

v3:
 * Refactor for less list walking as with incomplete handling.

v4:
 * Database of context timelines should not contain duplicates!
   (Converted from array into a hash.)

v5:
 * Avoid over-accounting runnable time for a coalesced group by
   recording the time the first request entered the GPU and ending the
   execute delay at that point for the whole group.
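
The second correction step described above can be illustrated in isolation. What follows is a minimal sketch, not the code from the patch: the toy %db contents and timestamps are made up, and ctx_timeline() and split_coalesced() are hypothetical helper names used only here. It assumes, as the patch does, that a request without a request_out event is flagged 'no-end' and that the next request on the same context timeline should start at the flagged request's notify time, with the original start preserved as 'engine-start'.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy request database keyed as ctx/seqno. All timestamps are invented
# microsecond values, purely for illustration.
my %db = (
	'1/1' => { ctx => 1, ring => 0, seqno => 1, start => 100, notify => 150, 'no-end' => 1 },
	'1/2' => { ctx => 1, ring => 0, seqno => 2, start => 100, notify => 220, end => 230 },
	'2/1' => { ctx => 2, ring => 0, seqno => 1, start => 240, notify => 300, end => 310 },
);

# Hypothetical helper: keys of one context on one engine, ordered by seqno.
sub ctx_timeline {
	my ($ctx, $ring) = @_;

	return sort { $db{$a}{seqno} <=> $db{$b}{seqno} }
	       grep { $db{$_}{ctx} == $ctx and $db{$_}{ring} == $ring }
	       keys %db;
}

# Hypothetical helper: for every coalesced ('no-end') request, shift the
# following request on the same context timeline to start at its notify
# time, remembering when the group originally entered the engine.
sub split_coalesced {
	my ($ctx, $ring) = @_;
	my @timeline = ctx_timeline($ctx, $ring);

	foreach my $pos (0 .. $#timeline - 1) {
		my $key = $timeline[$pos];

		next unless exists $db{$key}{'no-end'};
		next unless exists $db{$key}{notify};

		my $next = $timeline[$pos + 1];
		$db{$next}{'engine-start'} = $db{$next}{start};
		$db{$next}{start} = $db{$key}{notify};
	}
}

split_coalesced(1, 0);

# prints "1/2: start=150 engine-start=100"
printf "%s: start=%d engine-start=%d\n",
       '1/2', $db{'1/2'}{start}, $db{'1/2'}{'engine-start'};
```

Keeping the original start under a separate key is what lets the execute delay end when the first request of the group entered the GPU, while the duration is still measured from the corrected start.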

Signed-off-by: Tvrtko Ursulin
Cc: John Harrison
---
 scripts/trace.pl | 140 +++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 109 insertions(+), 31 deletions(-)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 92431974296a..e1fc2c7366d2 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -27,7 +27,7 @@ use warnings;
 use 5.010;

 my $gid = 0;
-my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait);
+my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait, %ctxtimelines);
 my @freqs;

 my $max_items = 3000;
@@ -436,6 +436,7 @@ while (<>) {
 		$req{'ring'} = $ring;
 		$req{'seqno'} = $seqno;
 		$req{'ctx'} = $ctx;
+		$ctxtimelines{$ctx . '/' . $ring} = 1;
 		$req{'name'} = $ctx . '/' . $seqno;
 		$req{'global'} = $tp{'global'};
 		$req{'port'} = $tp{'port'};
@@ -591,41 +592,113 @@ sub sortStart {
 	return $val;
 }

-my @sorted_keys = sort sortStart keys %db;
-my $re_sort = 0;
+my $re_sort = 1;
+my @sorted_keys;

-die "Database changed size?!" unless scalar(@sorted_keys) == $key_count;
+sub maybe_sort_keys
+{
+	if ($re_sort) {
+		@sorted_keys = sort sortStart keys %db;
+		$re_sort = 0;
+		die "Database changed size?!" unless scalar(@sorted_keys) ==
+						     $key_count;
+	}
+}

-foreach my $key (@sorted_keys) {
-	my $ring = $db{$key}->{'ring'};
-	my $end = $db{$key}->{'end'};
+maybe_sort_keys();
+
+my %ctx_timelines;
+
+sub sortContext {
+	my $as = $db{$a}->{'seqno'};
+	my $bs = $db{$b}->{'seqno'};
+	my $val;
+
+	$val = $as <=> $bs;
+
+	die if $val == 0;
+
+	return $val;
+}
+
+sub get_ctx_timeline {
+	my ($ctx, $ring, $key) = @_;
+	my @timeline;
+
+	return $ctx_timelines{$key} if exists $ctx_timelines{$key};
+
+	@timeline = grep { $db{$_}->{'ring'} == $ring and
+			   $db{$_}->{'ctx'} == $ctx } @sorted_keys;
+	# FIXME seqno restart
+	@timeline = sort sortContext @timeline;
+
+	$ctx_timelines{$key} = \@timeline;
+
+	return \@timeline;
+}
+
+# Split out merged batches if requested.
+if ($correct_durations) {
+	# Shift !port0 requests start time to after the previous context on the
+	# same timeline has finished.
+	foreach my $gid (sort keys %rings) {
+		my $ring = $ringmap{$rings{$gid}};
+		my $timeline = get_engine_timeline($ring);
+		my $complete;
+
+		foreach my $pos (0..$#{$timeline}) {
+			my $key = @{$timeline}[$pos];
+			my $prev = $complete;
+			my $pkey;
+
+			$complete = $key unless exists $db{$key}->{'no-end'};
+			$pkey = $complete;
+
+			next if $db{$key}->{'port'} == 0;
+
+			$pkey = $prev if $complete eq $key;
+
+			die unless defined $pkey;
+
+			$db{$key}->{'start'} = $db{$pkey}->{'end'};
+			$db{$key}->{'start'} = $db{$pkey}->{'notify'} if $db{$key}->{'start'} > $db{$key}->{'end'};
+
+			die if $db{$key}->{'start'} > $db{$key}->{'end'};

-	# correct duration of merged batches
-	if ($correct_durations and exists $db{$key}->{'no-end'}) {
-		my $ctx = $db{$key}->{'ctx'};
-		my $seqno = $db{$key}->{'seqno'};
-		my $start = $db{$key}->{'start'};
-		my $next_key;
-		my $i = 1;
-
-		do {
-			$next_key = db_key($ring, $ctx, $seqno + $i);
-			$i++;
-		} until (exists $db{$next_key} or $i > $key_count); # ugly stop hack
-
-		# 20us tolerance
-		if (exists $db{$next_key} and $db{$next_key}->{'start'} < $start + 20) {
-			my $notify = $db{$key}->{'notify'};
 			$re_sort = 1;
-			$db{$next_key}->{'start'} = $notify;
-			$db{$next_key}->{'start'} = $db{$next_key}->{'end'} if $db{$next_key}->{'start'} > $db{$next_key}->{'end'};
-			die if $db{$next_key}->{'start'} > $db{$next_key}->{'end'};
 		}
-		die if $start > $end;
+	}
+
+	maybe_sort_keys();
+
+	# Batch with no-end (no request_out) means it was submitted as part of
+	# coalesced context. This means its start time should be set to the end
+	# time of a following request on this context timeline.
+	foreach my $tkey (sort keys %ctxtimelines) {
+		my ($ctx, $ring) = split '/', $tkey;
+		my $timeline = get_ctx_timeline($ctx, $ring, $tkey);
+		my $last_complete = -1;
+		my $complete;
+
+		foreach my $pos (0..$#{$timeline}) {
+			my $key = @{$timeline}[$pos];
+			my $next_key;
+
+			next unless exists $db{$key}->{'no-end'};
+			last if $pos == $#{$timeline};
+
+			# Shift following request to start after the current one
+			$next_key = ${$timeline}[$pos + 1];
+			if (exists $db{$key}->{'notify'}) {
+				$db{$next_key}->{'engine-start'} = $db{$next_key}->{'start'};
+				$db{$next_key}->{'start'} = $db{$key}->{'notify'};
+				$re_sort = 1;
+			}
+		}
 	}
 }

-@sorted_keys = sort sortStart keys %db if $re_sort;
+maybe_sort_keys();

 # GPU time accounting
 my (%running, %runnable, %queued, %batch_avg, %batch_total_avg, %batch_count);
@@ -639,6 +712,7 @@ foreach my $key (@sorted_keys) {
 	my $ring = $db{$key}->{'ring'};
 	my $end = $db{$key}->{'end'};
 	my $start = $db{$key}->{'start'};
+	my $engine_start = $db{$key}->{'engine-start'};
 	my $notify = $db{$key}->{'notify'};

 	$first_ts = $db{$key}->{'queue'} if not defined $first_ts or $db{$key}->{'queue'} < $first_ts;
@@ -653,7 +727,9 @@ foreach my $key (@sorted_keys) {
 	} else {
 		$db{$key}->{'context-complete-delay'} = 0;
 	}
-	$db{$key}->{'execute-delay'} = $start - $db{$key}->{'submit'};
+
+	$engine_start = $db{$key}->{'start'} unless defined $engine_start;
+	$db{$key}->{'execute-delay'} = $engine_start - $db{$key}->{'submit'};
 	$db{$key}->{'submit-delay'} = $db{$key}->{'submit'} - $db{$key}->{'queue'};
 	unless (exists $db{$key}->{'no-notify'}) {
 		$db{$key}->{'duration'} = $notify - $start;
@@ -958,6 +1034,7 @@ my $i = 0;
 foreach my $key (sort sortQueue keys %db) {
 	my ($name, $ctx, $seqno) = ($db{$key}->{'name'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
 	my ($queue, $start, $notify, $end) = ($db{$key}->{'queue'}, $db{$key}->{'start'}, $db{$key}->{'notify'}, $db{$key}->{'end'});
+	my $engine_start = $db{$key}->{'engine-start'};
 	my $submit = $queue + $db{$key}->{'submit-delay'};
 	my ($content, $style, $duration);
 	my $group = $engine_start_id + $rings{$db{$key}->{'ring'}};
@@ -977,12 +1054,13 @@ foreach my $key (sort sortQueue keys %db) {
 	}

 	# execute to start
-	$duration = $start - $submit;
+	$engine_start = $db{$key}->{'start'} unless defined $engine_start;
+	$duration = $engine_start - $submit;
 	unless (exists $skip_box{'ready'} or $duration < $min_duration) {
 		$skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1;
 		$style = box_style($ctx, 45, 35, 45);
 		$content = "$name<br>$db{$key}->{'execute-delay'}us<br>";
-		$startend = 'start: \'' . ts($submit) . '\', end: \'' . ts($start) . '\'';
+		$startend = 'start: \'' . ts($submit) . '\', end: \'' . ts($engine_start) . '\'';
 		print "\t{id: $i, key: $skey, $type group: $group, subgroup: 1, subgroupOrder: 2, content: '$content', $startend, style: \'$style\'},\n";
 		$i++;
 	}