Message ID | 20200731113820.5765-9-ian.jackson@eu.citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Performance work | expand |
> On Jul 31, 2020, at 12:37 PM, Ian Jackson <ian.jackson@eu.citrix.com> wrote: > > Specifically, we narrow the initial query to flights which have at > least some job with the built_revision_foo we are looking for. > > This condition is strictly broader than that implemented inside the > flight search loop, so there is no functional change. Assuming this is true, that job / runvar is filtered after extracting this information, then... > > Perf: runtime of my test case now ~300s-500s. > > Example query before (from the Perl DBI trace): > > SELECT * FROM ( > SELECT flight, blessing FROM flights > WHERE (branch='xen-unstable') > AND EXISTS (SELECT 1 > FROM jobs > WHERE jobs.flight = flights.flight > AND jobs.job = ?) > > AND ( (TRUE AND flight <= 151903) AND (blessing='real') ) > ORDER BY flight DESC > LIMIT 1000 > ) AS sub > ORDER BY blessing ASC, flight DESC > > With these bind variables: > > "test-armhf-armhf-libvirt" > > After: > > SELECT * FROM ( > SELECT DISTINCT flight, blessing > FROM flights > JOIN runvars r1 USING (flight) > > WHERE (branch='xen-unstable') > AND ( (TRUE AND flight <= 151903) AND (blessing='real') ) > AND EXISTS (SELECT 1 > FROM jobs > WHERE jobs.flight = flights.flight > AND jobs.job = ?) > > AND r1.name LIKE 'built\_revision\_%' > AND r1.name = ? > AND r1.val= ? > > ORDER BY flight DESC > LIMIT 1000 > ) AS sub > ORDER BY blessing ASC, flight DESC …I agree that this shoud introduce no other changes. Reviewed-by: George Dunlap <george.dunlap@citrix.com>
George Dunlap writes ("Re: [OSSTEST PATCH v2 08/41] sg-report-flight: Ask the db for flights of interest"): > > On Jul 31, 2020, at 12:37 PM, Ian Jackson <ian.jackson@eu.citrix.com> wrote: > > Specifically, we narrow the initial query to flights which have at > > least some job with the built_revision_foo we are looking for. > > > > This condition is strictly broader than that implemented inside the > > flight search loop, so there is no functional change. > > Assuming this is true, that job / runvar is filtered after extracting this information, then... ... > …I agree that this shoud introduce no other changes. > > Reviewed-by: George Dunlap <george.dunlap@citrix.com> Thanks. Just to convince myself, I ran through the argument based on the perl code. I found a lacuna. 1. The job of findaflight is to find a flight, and it doesn't have significant side effects - just a return value. 2. If it returns a flight from the loop, $whynot must have been undef. $whynot is never unset. Consider some tree in %{ $specver{$thisthat} }. 3. If @revisions is 0 for that tree, $whynot is set. So one of the two queries $revisionsq or $revisionsosstestq must have returned some rows. 4. Furthermore, none of those rows must have passed the $wronginfo grep. If they had, $whynot would have been set. Any row whose val doesn't contain a colon, and which doesn't end up in $wronginfo, had a val equal to the requested specver. 5. Colons in this field appear only in mercurial revisions. These are now obsoelete - we have no mercurial trees. A consequence of this commit is actually that we should explicitly abolish mercurial support, at least pending a change to osstest to arrange for the val column to contain only the hash part and not the number part. 6. Together, these conditons means that if $whynot wasn't set, there must have been some row whose val matched the specver. 7. Both the $revisionsq and $revisionsosstestq queries take a flight bound variable condition. This is bound by a value that came out of @binfos. @binfos is made from %binfos, where the flight number is the key. %binfos is populated by the @binfos_todo loop, where it gets the flight number from a @binfos_todos entry - but it filters them for $bflight == $tflight. 8. So some row must have matched the flight, and the specver, and of course the name. This is precisely the new condition. I think this means I should put a commit earlier in this series which disables mercurial support until the colon version situation is rationalised. Ian.
diff --git a/schema/runvars-built-index.sql b/schema/runvars-built-index.sql index 7108e0af..128e69e9 100644 --- a/schema/runvars-built-index.sql +++ b/schema/runvars-built-index.sql @@ -1,4 +1,4 @@ --- ##OSSTEST## 007 Preparatory +-- ##OSSTEST## 007 Needed -- -- This index helps sg-report-flight find relevant flights. diff --git a/sg-report-flight b/sg-report-flight index 7f2790ce..10127582 100755 --- a/sg-report-flight +++ b/sg-report-flight @@ -185,19 +185,77 @@ END if (defined $job) { push @flightsq_params, $job; $flightsq_jobcond = <<END; - EXISTS (SELECT 1 + AND EXISTS (SELECT 1 FROM jobs WHERE jobs.flight = flights.flight AND jobs.job = ?) END } + # We build a slightly complicated query to find possibly-relevant + # flights. A "possibly-relevant" flight is one which the main + # flight categorisation algorithm below (the loop over $tflight) + # *might* decide is of interest. + # + # That algorithm produces a table of which revision(s) of what + # %specver trees the build jobs for the relevant test job used. + # And then it insists (amongst other things) that for each such + # tree the revision in question appears. + # + # It only looks at build jobs within the flight. So any flight + # that the main algorithm finds interesting will have *some* job + # (in the same flight) mentioning that revision in a built + # revision runvar. So we can search the runvars table by its + # index on the revision. + # + # So we look for flights that have an appropriate entry in runvars + # for each %specver tree. We can do this by joining the runvar + # table once for each tree. + # + # The "osstest" tree is handled specially. as ever. (We use + # "r$ri" there too for orthogonality of the code, not because + # there could be multiple specifiations for the osstest revision.) + # + # This complex query is an optimisation: for correctness, we must + # still execute the full job-specific recursive examination, for + # each possibly-relevant flight - that's the $tflight loop body. + + my $runvars_joins = ''; + my $runvars_conds = ''; + my $ri=0; + foreach my $tree (sort keys %{ $specver{$thisthat} }) { + $ri++; + if ($tree ne 'osstest') { + $runvars_joins .= <<END; + JOIN runvars r$ri USING (flight) +END + $runvars_conds .= <<END; + AND r$ri.name LIKE 'built\_revision\_%' + AND r$ri.name = ? + AND r$ri.val= ? +END + push @flightsq_params, "built_revision_$tree", + $specver{$thisthat}{$tree}; + } else { + $runvars_joins .= <<END; + JOIN flights_harness_touched r$ri USING (flight) +END + $runvars_conds .= <<END; + AND r$ri.harness= ? +END + push @flightsq_params, $specver{$thisthat}{$tree}; + } + } + my $flightsq= <<END; SELECT * FROM ( - SELECT flight, blessing FROM flights + SELECT DISTINCT flight, blessing + FROM flights +$runvars_joins WHERE $branches_cond_q - AND $flightsq_jobcond AND $blessingscond +$flightsq_jobcond +$runvars_conds ORDER BY flight DESC LIMIT 1000 ) AS sub
Specifically, we narrow the initial query to flights which have at least some job with the built_revision_foo we are looking for. This condition is strictly broader than that implemented inside the flight search loop, so there is no functional change. Perf: runtime of my test case now ~300s-500s. Example query before (from the Perl DBI trace): SELECT * FROM ( SELECT flight, blessing FROM flights WHERE (branch='xen-unstable') AND EXISTS (SELECT 1 FROM jobs WHERE jobs.flight = flights.flight AND jobs.job = ?) AND ( (TRUE AND flight <= 151903) AND (blessing='real') ) ORDER BY flight DESC LIMIT 1000 ) AS sub ORDER BY blessing ASC, flight DESC With these bind variables: "test-armhf-armhf-libvirt" After: SELECT * FROM ( SELECT DISTINCT flight, blessing FROM flights JOIN runvars r1 USING (flight) WHERE (branch='xen-unstable') AND ( (TRUE AND flight <= 151903) AND (blessing='real') ) AND EXISTS (SELECT 1 FROM jobs WHERE jobs.flight = flights.flight AND jobs.job = ?) AND r1.name LIKE 'built\_revision\_%' AND r1.name = ? AND r1.val= ? ORDER BY flight DESC LIMIT 1000 ) AS sub ORDER BY blessing ASC, flight DESC With these bind variables: "test-armhf-armhf-libvirt" 'built_revision_xen' '165f3afbfc3db70fcfdccad07085cde0a03c858b' Diff to the query: SELECT * FROM ( - SELECT flight, blessing FROM flights + SELECT DISTINCT flight, blessing + FROM flights + JOIN runvars r1 USING (flight) + WHERE (branch='xen-unstable') + AND ( (TRUE AND flight <= 151903) AND (blessing='real') ) AND EXISTS (SELECT 1 FROM jobs WHERE jobs.flight = flights.flight AND jobs.job = ?) - AND ( (TRUE AND flight <= 151903) AND (blessing='real') ) + AND r1.name LIKE 'built\_revision\_%' + AND r1.name = ? + AND r1.val= ? + ORDER BY flight DESC LIMIT 1000 ) AS sub CC: George Dunlap <George.Dunlap@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> --- v2: Use proper \ escaping for underscores in LIKE --- schema/runvars-built-index.sql | 2 +- sg-report-flight | 64 ++++++++++++++++++++++++++++++++-- 2 files changed, 62 insertions(+), 4 deletions(-)