diff mbox series

[OSSTEST,v2,08/41] sg-report-flight: Ask the db for flights of interest

Message ID 20200731113820.5765-9-ian.jackson@eu.citrix.com (mailing list archive)
State New, archived
Headers show
Series Performance work | expand

Commit Message

Ian Jackson July 31, 2020, 11:37 a.m. UTC
Specifically, we narrow the initial query to flights which have at
least some job with the built_revision_foo we are looking for.

This condition is strictly broader than that implemented inside the
flight search loop, so there is no functional change.

Perf: runtime of my test case now ~300s-500s.

Example query before (from the Perl DBI trace):

      SELECT * FROM (
        SELECT flight, blessing FROM flights
            WHERE (branch='xen-unstable')
              AND                   EXISTS (SELECT 1
                            FROM jobs
                           WHERE jobs.flight = flights.flight
                             AND jobs.job = ?)

              AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
            ORDER BY flight DESC
            LIMIT 1000
      ) AS sub
      ORDER BY blessing ASC, flight DESC

With these bind variables:

    "test-armhf-armhf-libvirt"

After:

      SELECT * FROM (
        SELECT DISTINCT flight, blessing
             FROM flights
             JOIN runvars r1 USING (flight)

            WHERE (branch='xen-unstable')
              AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
                  AND EXISTS (SELECT 1
                            FROM jobs
                           WHERE jobs.flight = flights.flight
                             AND jobs.job = ?)

              AND r1.name LIKE 'built\_revision\_%'
              AND r1.name = ?
              AND r1.val= ?

            ORDER BY flight DESC
            LIMIT 1000
      ) AS sub
      ORDER BY blessing ASC, flight DESC

With these bind variables:

      "test-armhf-armhf-libvirt"
      'built_revision_xen'
      '165f3afbfc3db70fcfdccad07085cde0a03c858b'

Diff to the query:

      SELECT * FROM (
-        SELECT flight, blessing FROM flights
+        SELECT DISTINCT flight, blessing
+             FROM flights
+             JOIN runvars r1 USING (flight)
+
             WHERE (branch='xen-unstable')
+              AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
               AND                   EXISTS (SELECT 1
                             FROM jobs
                            WHERE jobs.flight = flights.flight
                              AND jobs.job = ?)

-              AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
+              AND r1.name LIKE 'built\_revision\_%'
+              AND r1.name = ?
+              AND r1.val= ?
+
             ORDER BY flight DESC
             LIMIT 1000
       ) AS sub

CC: George Dunlap <George.Dunlap@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v2: Use proper \ escaping for underscores in LIKE
---
 schema/runvars-built-index.sql |  2 +-
 sg-report-flight               | 64 ++++++++++++++++++++++++++++++++--
 2 files changed, 62 insertions(+), 4 deletions(-)

Comments

George Dunlap July 31, 2020, 2:17 p.m. UTC | #1
> On Jul 31, 2020, at 12:37 PM, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
> 
> Specifically, we narrow the initial query to flights which have at
> least some job with the built_revision_foo we are looking for.
> 
> This condition is strictly broader than that implemented inside the
> flight search loop, so there is no functional change.

Assuming this is true, that job / runvar is filtered after extracting this information, then...

> 
> Perf: runtime of my test case now ~300s-500s.
> 
> Example query before (from the Perl DBI trace):
> 
>      SELECT * FROM (
>        SELECT flight, blessing FROM flights
>            WHERE (branch='xen-unstable')
>              AND                   EXISTS (SELECT 1
>                            FROM jobs
>                           WHERE jobs.flight = flights.flight
>                             AND jobs.job = ?)
> 
>              AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
>            ORDER BY flight DESC
>            LIMIT 1000
>      ) AS sub
>      ORDER BY blessing ASC, flight DESC
> 
> With these bind variables:
> 
>    "test-armhf-armhf-libvirt"
> 
> After:
> 
>      SELECT * FROM (
>        SELECT DISTINCT flight, blessing
>             FROM flights
>             JOIN runvars r1 USING (flight)
> 
>            WHERE (branch='xen-unstable')
>              AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
>                  AND EXISTS (SELECT 1
>                            FROM jobs
>                           WHERE jobs.flight = flights.flight
>                             AND jobs.job = ?)
> 
>              AND r1.name LIKE 'built\_revision\_%'
>              AND r1.name = ?
>              AND r1.val= ?
> 
>            ORDER BY flight DESC
>            LIMIT 1000
>      ) AS sub
>      ORDER BY blessing ASC, flight DESC

…I agree that this shoud introduce no other changes.

Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Ian Jackson July 31, 2020, 3:43 p.m. UTC | #2
George Dunlap writes ("Re: [OSSTEST PATCH v2 08/41] sg-report-flight: Ask the db for flights of interest"):
> > On Jul 31, 2020, at 12:37 PM, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
> > Specifically, we narrow the initial query to flights which have at
> > least some job with the built_revision_foo we are looking for.
> > 
> > This condition is strictly broader than that implemented inside the
> > flight search loop, so there is no functional change.
> 
> Assuming this is true, that job / runvar is filtered after extracting this information, then...
...
> …I agree that this shoud introduce no other changes.
> 
> Reviewed-by: George Dunlap <george.dunlap@citrix.com>

Thanks.

Just to convince myself, I ran through the argument based on the perl
code.  I found a lacuna.

1. The job of findaflight is to find a flight, and it doesn't have
   significant side effects - just a return value.

2. If it returns a flight from the loop, $whynot must have been
   undef.  $whynot is never unset.

Consider some tree in %{ $specver{$thisthat} }.

3. If @revisions is 0 for that tree, $whynot is set.  So one of the
   two queries $revisionsq or $revisionsosstestq must have returned
   some rows.

4. Furthermore, none of those rows must have passed the $wronginfo
   grep.  If they had, $whynot would have been set.  Any row
   whose val doesn't contain a colon, and which doesn't end up
   in $wronginfo, had a val equal to the requested specver.

5. Colons in this field appear only in mercurial revisions.  These are
   now obsoelete - we have no mercurial trees.  A consequence of this
   commit is actually that we should explicitly abolish mercurial
   support, at least pending a change to osstest to arrange for the
   val column to contain only the hash part and not the number part.

6. Together, these conditons means that if $whynot wasn't set,
   there must have been some row whose val matched the specver.

7. Both the $revisionsq and $revisionsosstestq queries take a flight
   bound variable condition.  This is bound by a value that came out
   of @binfos.  @binfos is made from %binfos, where the flight number
   is the key.  %binfos is populated by the @binfos_todo loop, where
   it gets the flight number from a @binfos_todos entry - but it
   filters them for $bflight == $tflight.

8. So some row must have matched the flight, and the specver, and
   of course the name.  This is precisely the new condition.

I think this means I should put a commit earlier in this series which
disables mercurial support until the colon version situation is
rationalised.

Ian.
diff mbox series

Patch

diff --git a/schema/runvars-built-index.sql b/schema/runvars-built-index.sql
index 7108e0af..128e69e9 100644
--- a/schema/runvars-built-index.sql
+++ b/schema/runvars-built-index.sql
@@ -1,4 +1,4 @@ 
--- ##OSSTEST## 007 Preparatory
+-- ##OSSTEST## 007 Needed
 --
 -- This index helps sg-report-flight find relevant flights.
 
diff --git a/sg-report-flight b/sg-report-flight
index 7f2790ce..10127582 100755
--- a/sg-report-flight
+++ b/sg-report-flight
@@ -185,19 +185,77 @@  END
     if (defined $job) {
 	push @flightsq_params, $job;
 	$flightsq_jobcond = <<END;
-                  EXISTS (SELECT 1
+                  AND EXISTS (SELECT 1
 			    FROM jobs
 			   WHERE jobs.flight = flights.flight
 			     AND jobs.job = ?)
 END
     }
 
+    # We build a slightly complicated query to find possibly-relevant
+    # flights.  A "possibly-relevant" flight is one which the main
+    # flight categorisation algorithm below (the loop over $tflight)
+    # *might* decide is of interest.
+    #
+    # That algorithm produces a table of which revision(s) of what
+    # %specver trees the build jobs for the relevant test job used.
+    # And then it insists (amongst other things) that for each such
+    # tree the revision in question appears.
+    #
+    # It only looks at build jobs within the flight.  So any flight
+    # that the main algorithm finds interesting will have *some* job
+    # (in the same flight) mentioning that revision in a built
+    # revision runvar.  So we can search the runvars table by its
+    # index on the revision.
+    #
+    # So we look for flights that have an appropriate entry in runvars
+    # for each %specver tree.  We can do this by joining the runvar
+    # table once for each tree.
+    #
+    # The "osstest" tree is handled specially. as ever.  (We use
+    # "r$ri" there too for orthogonality of the code, not because
+    # there could be multiple specifiations for the osstest revision.)
+    #
+    # This complex query is an optimisation: for correctness, we must
+    # still execute the full job-specific recursive examination, for
+    # each possibly-relevant flight - that's the $tflight loop body.
+
+    my $runvars_joins = '';
+    my $runvars_conds = '';
+    my $ri=0;
+    foreach my $tree (sort keys %{ $specver{$thisthat} }) {
+      $ri++;
+      if ($tree ne 'osstest') {
+	  $runvars_joins .= <<END;
+             JOIN runvars r$ri USING (flight)
+END
+	  $runvars_conds .= <<END;
+              AND r$ri.name LIKE 'built\_revision\_%' 
+              AND r$ri.name = ?
+              AND r$ri.val= ?
+END
+	  push @flightsq_params, "built_revision_$tree",
+	                     $specver{$thisthat}{$tree};
+      } else {
+	  $runvars_joins .= <<END;
+             JOIN flights_harness_touched r$ri USING (flight)
+END
+	  $runvars_conds .= <<END;
+              AND r$ri.harness= ?
+END
+	  push @flightsq_params, $specver{$thisthat}{$tree};
+      }
+    }
+
     my $flightsq= <<END;
       SELECT * FROM (
-        SELECT flight, blessing FROM flights
+        SELECT DISTINCT flight, blessing
+             FROM flights
+$runvars_joins
             WHERE $branches_cond_q
-              AND $flightsq_jobcond
               AND $blessingscond
+$flightsq_jobcond
+$runvars_conds
             ORDER BY flight DESC
             LIMIT 1000
       ) AS sub