[RFC,5/5] line-log: try to use generation number-based topo-ordering
diff mbox series

Message ID 20190818182801.7580-6-szeder.dev@gmail.com
State New
Headers show
Series
  • line-log: towards a more responsive, incremental 'git log -L'
Related show

Commit Message

SZEDER Gábor Aug. 18, 2019, 6:28 p.m. UTC
The previous patch made it possible to perform line-level filtering
during history traversal instead of in an expensive preprocessing
step, but it still requires some simpler preprocessing steps, notably
topo-ordering.  However, nowadays we have commit-graphs storing
generation numbers, which make it possible to incrementally traverse
the history in topological order, without the preparatory limit_list()
and sort_in_topological_order() steps; see b45424181e (revision.c:
generation-based topo-order algorithm, 2018-11-01).

This patch combines the two, so we can do both the topo-ordering and
the line-level filtering during history traversal, eliminating even
those simpler preprocessing steps, and thus further reducing the delay
before showing the first commit modifying the given line range.

The 'revs->limited' flag plays the central role in this, because, due
to limitations of the current implementation, the generation
number-based topo-ordering is only enabled when this flag remains
unset.  Line-level log, however, always sets this flag in
setup_revisions() ever since the feature was introduced in 12da1d1f6f
(Implement line-history search (git log -L), 2013-03-28).  The reason
for setting 'limited' is unclear, though, because the line-level log
itself doesn't directly depend on it, and it doesn't affect how the
limit_list() function limits the revision range.  However, there is an
indirect dependency: the line-level log requires topo-ordering, and
the "traditional" sort_in_topological_order() requires an already
limited commit list since e6c3505b44 (Make sure we generate the whole
commit list before trying to sort it topologically, 2005-07-06).  The
new, generation numbers-based topo-ordering doesn't require a limited
commit list anymore.

So don't set 'revs->limited' for line-level log, unless it is really
necessary, namely:

  - The user explicitly requested parent rewriting, because that is
    still done in the line_log_filter() preprocessing step (see
    previous patch), which requires sort_in_topological_order() and in
    turn limit_list() as well.

  - A commit-graph file is not available or it doesn't yet contain
    generation numbers.  In these cases we had to fall back on
    sort_in_topological_order() and in turn limit_list().  The
    existing condition with generation_numbers_enabled() has already
    ensured that the 'limited' flag is set in these cases; this patch
    just makes sure that the line-level log sets 'revs->topo_order'
    before that condition.

While the reduced delay before showing the first commit is measurable
in git.git, it takes a bigger repository to make it clearly noticable.
In both cases below the line ranges were chosen so that they were
modified rather close to the starting revisions, so the effect of this
change is most noticable.

  # git.git
  $ time git --no-pager log -L:read_alternate_refs:sha1-file.c -1 v2.23.0

  Before:

    real    0m0.107s
    user    0m0.091s
    sys     0m0.013s

  After:

    real    0m0.058s
    user    0m0.050s
    sys     0m0.005s

  # linux.git
  $ time git --no-pager log \
    -L:build_restore_work_registers:arch/mips/mm/tlbex.c -1 v5.2

  Before:

    real   0m1.129s
    user   0m1.061s
    sys    0m0.069s

  After:

    real   0m0.096s
    user   0m0.087s
    sys    0m0.009s

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---

Notes:
    FWIW, that line-level log sets 'limited' in addition to 'topo_order'
    thing was already part of the first submitted iteration of the
    line-level log patch series:
    
    https://public-inbox.org/git/1277558857-23103-4-git-send-email-struggleyb.nku@gmail.com/#Z30revision.c
    
    But it was never discussed during review.

 revision.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Patch
diff mbox series

diff --git a/revision.c b/revision.c
index 6bdfcb38cd..7a9dc54771 100644
--- a/revision.c
+++ b/revision.c
@@ -2658,6 +2658,12 @@  int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 	if (revs->diffopt.objfind)
 		revs->simplify_history = 0;
 
+	if (revs->line_level_traverse) {
+		if (want_ancestry(revs))
+			revs->limited = 1;
+		revs->topo_order = 1;
+	}
+
 	if (revs->topo_order && !generation_numbers_enabled(the_repository))
 		revs->limited = 1;
 
@@ -2677,11 +2683,6 @@  int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	revs->diffopt.abbrev = revs->abbrev;
 
-	if (revs->line_level_traverse) {
-		revs->limited = 1;
-		revs->topo_order = 1;
-	}
-
 	diff_setup_done(&revs->diffopt);
 
 	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,