[OSSTEST,v2,07/41] schema: Provide indices for sg-report-flight
diff mbox series

Message ID 20200731113820.5765-8-ian.jackson@eu.citrix.com
State New
Headers show
Series
  • Performance work
Related show

Commit Message

Ian Jackson July 31, 2020, 11:37 a.m. UTC
These indexes allow very fast lookup of "relevant" flights eg when
trying to justify failures.

In my ad-hoc test case, these indices (along with the subsequent
changes to sg-report-flight and Executive.pm, reduce the runtime of
sg-report-flight from 2-3ks (unacceptably long!) to as little as
5-7s seconds - a speedup of about 500x.

(Getting the database snapshot may take a while first, but deploying
this code should help with that too by reducing long-running
transactions.  Quoted perf timings are from snapshot acquisition.)

Without these new indexes there may be a performance change from the
query changes.  I haven't benchmarked this so I am setting the schema
updates to be Preparatory/Needed (ie, "Schema first" as
schema/README.updates has it), to say that the index should be created
before the new code is deployed.

Testing: I have tested this series by creating experimental indices
"trial_..." in the actual production instance.  (Transactional DDL was
very helpful with this.)  I have verified with \d that schema update
instructions in this commit generate indexes which are equivalent to
the trial indices.

Deployment: AFter these schema updates are applied, the trial indices
are redundant duplicates and should be deleted.

CC: George Dunlap <George.Dunlap@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v2: Use proper \ escaping for underscores in LIKE
---
 schema/runvars-built-index.sql    | 7 +++++++
 schema/runvars-revision-index.sql | 7 +++++++
 schema/steps-job-index.sql        | 7 +++++++
 3 files changed, 21 insertions(+)
 create mode 100644 schema/runvars-built-index.sql
 create mode 100644 schema/runvars-revision-index.sql
 create mode 100644 schema/steps-job-index.sql

Comments

George Dunlap July 31, 2020, 2:21 p.m. UTC | #1
> On Jul 31, 2020, at 12:37 PM, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
> 
> These indexes allow very fast lookup of "relevant" flights eg when
> trying to justify failures.
> 
> In my ad-hoc test case, these indices (along with the subsequent
> changes to sg-report-flight and Executive.pm, reduce the runtime of
> sg-report-flight from 2-3ks (unacceptably long!) to as little as
> 5-7s seconds - a speedup of about 500x.
> 
> (Getting the database snapshot may take a while first, but deploying
> this code should help with that too by reducing long-running
> transactions.  Quoted perf timings are from snapshot acquisition.)
> 
> Without these new indexes there may be a performance change from the
> query changes.  I haven't benchmarked this so I am setting the schema
> updates to be Preparatory/Needed (ie, "Schema first" as
> schema/README.updates has it), to say that the index should be created
> before the new code is deployed.
> 
> Testing: I have tested this series by creating experimental indices
> "trial_..." in the actual production instance.  (Transactional DDL was
> very helpful with this.)  I have verified with \d that schema update
> instructions in this commit generate indexes which are equivalent to
> the trial indices.
> 
> Deployment: AFter these schema updates are applied, the trial indices
> are redundant duplicates and should be deleted.
> 
> CC: George Dunlap <George.Dunlap@citrix.com>
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>

I have no idea if building an index on a LIKE is a good idea or not, but it certainly seems to be useful, so:

Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Ian Jackson July 31, 2020, 2:55 p.m. UTC | #2
George Dunlap writes ("Re: [OSSTEST PATCH v2 07/41] schema: Provide indices for sg-report-flight"):
> 
> 
> > On Jul 31, 2020, at 12:37 PM, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
> > 
> > These indexes allow very fast lookup of "relevant" flights eg when
> > trying to justify failures.
> > 
> > In my ad-hoc test case, these indices (along with the subsequent
> > changes to sg-report-flight and Executive.pm, reduce the runtime of
> > sg-report-flight from 2-3ks (unacceptably long!) to as little as
> > 5-7s seconds - a speedup of about 500x.
> > 
> > (Getting the database snapshot may take a while first, but deploying
> > this code should help with that too by reducing long-running
> > transactions.  Quoted perf timings are from snapshot acquisition.)
> > 
> > Without these new indexes there may be a performance change from the
> > query changes.  I haven't benchmarked this so I am setting the schema
> > updates to be Preparatory/Needed (ie, "Schema first" as
> > schema/README.updates has it), to say that the index should be created
> > before the new code is deployed.
> > 
> > Testing: I have tested this series by creating experimental indices
> > "trial_..." in the actual production instance.  (Transactional DDL was
> > very helpful with this.)  I have verified with \d that schema update
> > instructions in this commit generate indexes which are equivalent to
> > the trial indices.
> > 
> > Deployment: AFter these schema updates are applied, the trial indices
> > are redundant duplicates and should be deleted.
...
> 
> I have no idea if building an index on a LIKE is a good idea or not, but it certainly seems to be useful, so:
> 
> Reviewed-by: George Dunlap <george.dunlap@citrix.com>

Thanks.

This is a thing called a "partial index", where the index only covers
some subset of the rows.  The subset is determined a condition on the
row contents.

Such an index can be a lot smaller than an index on the whole table
and also avoids slowing down updates that don't match the index
condition.

The idea is that when the query contains a condition that matches the
index condition, the query planner can use this small on-topic index
instead of wading through something large and irrelevant.

The query planner is not always very bright about what conditions are
subsets of what other conditions, and it runs without seeing the
contents of bind variables.  So with LIKE, for example, it's generally
necessary to precisely replicate the index condition in the queries.
That's why some of the queries in this series have things like this:

              AND r$ri.name LIKE 'built\_revision\_%'
              AND r$ri.name = ?

where the Perl code passes in 'built_revison_something'.

I hope this explanation was interesting :-).

Ian.

Patch
diff mbox series

diff --git a/schema/runvars-built-index.sql b/schema/runvars-built-index.sql
new file mode 100644
index 00000000..7108e0af
--- /dev/null
+++ b/schema/runvars-built-index.sql
@@ -0,0 +1,7 @@ 
+-- ##OSSTEST## 007 Preparatory
+--
+-- This index helps sg-report-flight find relevant flights.
+
+CREATE INDEX runvars_built_revision_idx
+    ON runvars (val)
+ WHERE name LIKE 'built\_revision\_%';
diff --git a/schema/runvars-revision-index.sql b/schema/runvars-revision-index.sql
new file mode 100644
index 00000000..8871b528
--- /dev/null
+++ b/schema/runvars-revision-index.sql
@@ -0,0 +1,7 @@ 
+-- ##OSSTEST## 008 Preparatory
+--
+-- This index helps Executive::report__find_test find relevant flights.
+
+CREATE INDEX runvars_revision_idx
+    ON runvars (val)
+ WHERE name LIKE 'revision\_%';
diff --git a/schema/steps-job-index.sql b/schema/steps-job-index.sql
new file mode 100644
index 00000000..07dc5a30
--- /dev/null
+++ b/schema/steps-job-index.sql
@@ -0,0 +1,7 @@ 
+-- ##OSSTEST## 006 Preparatory
+--
+-- This index helps sg-report-flight find if a test ever passed.
+
+CREATE INDEX steps_job_testid_status_idx
+    ON steps (job, testid, status);
+