Message ID | pull.25.v3.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Use generation numbers for --topo-order | expand |
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > Changes in V3: I added a new patch that updates the tab-alignment for flags > in revision.h before adding new ones (Thanks, Ævar!). This is most unwelcome while other topics are in flight that caused unnecessary conflict. It would have been very welcomed if the codebase was dormant, though. I'll live, and there is no need to resend, but this change may not appear in today's pushout (I'll have to push out the result of integration before I saw this new reroll with all the other topics).
This patch series performs a decently-sized refactoring of the revision-walk machinery. Well, "refactoring" is probably the wrong word, as I don't actually remove the old code. Instead, when we see certain options in the 'rev_info' struct, we redirect the commit-walk logic to a new set of methods that distribute the workload differently. By using generation numbers in the commit-graph, we can significantly improve 'git log --graph' commands (and the underlying 'git rev-list --topo-order'). On the Linux repository, I got the following performance results when comparing to the previous version with or without a commit-graph: Test: git rev-list --topo-order -100 HEAD HEAD~1, no commit-graph: 6.80 s HEAD~1, w/ commit-graph: 0.77 s HEAD, w/ commit-graph: 0.02 s Test: git rev-list --topo-order -100 HEAD -- tools HEAD~1, no commit-graph: 9.63 s HEAD~1, w/ commit-graph: 6.06 s HEAD, w/ commit-graph: 0.06 s If you want to read this series but are unfamiliar with the commit-graph and generation numbers, then I recommend reading Documentation/technical/commit-graph.txt or a blob post [1] I wrote on the subject. In particular, the three-part walk described in "revision.c: refactor basic topo-order logic" is present (but underexplained) as an animated PNG [2]. Since revision.c is an incredibly important (and old) portion of the codebase -- and because there are so many orthogonal options in 'struct rev_info' -- I consider this submission to be "RFC quality". That is, I am not confident that I am not missing anything, or that my solution is the best it can be. I did merge this branch with ds/commit-graph-with-grafts and the "DO-NOT-MERGE: write and read commit-graph always" commit that computes a commit-graph with every 'git commit' command. The test suite passed with that change, available on GitHub [3]. To ensure that I cover at least the case I think are interesting, I added tests to t6600-test-reach.sh to verify the walks report the correct results for the three cases there (no commit-graph, full commit-graph, and a partial commit-graph so the walk starts at GENERATION_NUMBER_INFINITY). One notable case that is not included in this series is the case of a history comparison such as 'git rev-list --topo-order A..B'. The existing code in limit_list() has ways to cut the walk short when all pending commits are UNINTERESTING. Since this code depends on commit_list instead of the prio_queue we are using here, I chose to leave it untouched for now. We can revisit it in a separate series later. Since handle_commit() turns on revs->limited when a commit is UNINTERESTING, we do not hit the new code in this case. Removing this 'revs->limited = 1;' line yields correct results, but the performance is worse. This series was based on ds/reachable, but is now based on 'master' to not conflict with 182070 "commit: use timestamp_t for author_date_slab". There is a small conflict with md/filter-trees, because it renamed a flag in revisions.h in the line before I add new flags. Hopefully this conflict is not too difficult to resolve. Changes in V3: I added a new patch that updates the tab-alignment for flags in revision.h before adding new ones (Thanks, Ævar!). Also, I squashed the recommended changes to run_three_modes and test_three_modes from Szeder and Junio. Thanks! Thanks, -Stolee [1] https://blogs.msdn.microsoft.com/devops/2018/07/09/supercharging-the-git-commit-graph-iii-generations/ Supercharging the Git Commit Graph III: Generations and Graph Algorithms [2] https://msdnshared.blob.core.windows.net/media/2018/06/commit-graph-topo-order-b-a.png Animation showing three-part walk [3] https://github.com/derrickstolee/git/tree/topo-order/testA branch containing this series along with commits to compute commit-graph in entire test suite. Cc: avarab@gmail.comCc: szeder.dev@gmail.com Derrick Stolee (7): prio-queue: add 'peek' operation test-reach: add run_three_modes method test-reach: add rev-list tests revision.c: begin refactoring --topo-order logic commit/revisions: bookkeeping before refactoring revision.h: add whitespace in flag definitions revision.c: refactor basic topo-order logic commit.c | 11 +- commit.h | 8 ++ object.h | 4 +- prio-queue.c | 9 ++ prio-queue.h | 6 + revision.c | 232 ++++++++++++++++++++++++++++++++++++- revision.h | 34 +++--- t/helper/test-prio-queue.c | 10 +- t/t6600-test-reach.sh | 96 ++++++++++++++- 9 files changed, 374 insertions(+), 36 deletions(-) base-commit: 2d3b1c576c85b7f5db1f418907af00ab88e0c303 Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-25%2Fderrickstolee%2Ftopo-order%2Fprogress-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-25/derrickstolee/topo-order/progress-v3 Pull-Request: https://github.com/gitgitgadget/git/pull/25 Range-diff vs v2: 1: cc1ec4c270 = 1: cc1ec4c270 prio-queue: add 'peek' operation 2: 404c918608 ! 2: b2a1ade148 test-reach: add run_three_modes method @@ -11,10 +11,6 @@ run_three_modes method that executes the given command and tests the actual output to the expected output. - While inspecting this code, I realized that the final test for - 'commit_contains --tag' is silently dropping the '--tag' argument. - It should be quoted to include both. - Signed-off-by: Derrick Stolee <dstolee@microsoft.com> diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh @@ -28,31 +24,22 @@ +run_three_modes () { test_when_finished rm -rf .git/objects/info/commit-graph && - test-tool reach $1 <input >actual && -+ $1 <input >actual && ++ "$@" <input >actual && test_cmp expect actual && cp commit-graph-full .git/objects/info/commit-graph && - test-tool reach $1 <input >actual && -+ $1 <input >actual && ++ "$@" <input >actual && test_cmp expect actual && cp commit-graph-half .git/objects/info/commit-graph && - test-tool reach $1 <input >actual && -+ $1 <input >actual && ++ "$@" <input >actual && test_cmp expect actual } +test_three_modes () { -+ run_three_modes "test-tool reach $1" ++ run_three_modes test-tool reach "$@" +} + test_expect_success 'ref_newer:miss' ' cat >input <<-\EOF && A:commit-5-7 -@@ - EOF - echo "commit_contains(_,A,X,_):1" >expect && - test_three_modes commit_contains && -- test_three_modes commit_contains --tag -+ test_three_modes "commit_contains --tag" - ' - - test_expect_success 'commit_contains:miss' ' 3: 30dee58c61 ! 3: b0ceb96076 test-reach: add rev-list tests @@ -30,7 +30,7 @@ + commit-6-2 commit-5-2 commit-4-2 commit-3-2 commit-2-2 commit-1-2 \ + commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 commit-1-1 \ + >expect && -+ run_three_modes "git rev-list --topo-order commit-6-6" ++ run_three_modes git rev-list --topo-order commit-6-6 +' + +test_expect_success 'rev-list: first-parent topo-order' ' @@ -42,7 +42,7 @@ + commit-6-2 \ + commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 commit-1-1 \ + >expect && -+ run_three_modes "git rev-list --first-parent --topo-order commit-6-6" ++ run_three_modes git rev-list --first-parent --topo-order commit-6-6 +' + +test_expect_success 'rev-list: range topo-order' ' @@ -54,7 +54,7 @@ + commit-6-2 commit-5-2 commit-4-2 \ + commit-6-1 commit-5-1 commit-4-1 \ + >expect && -+ run_three_modes "git rev-list --topo-order commit-3-3..commit-6-6" ++ run_three_modes git rev-list --topo-order commit-3-3..commit-6-6 +' + +test_expect_success 'rev-list: range topo-order' ' @@ -66,7 +66,7 @@ + commit-6-2 commit-5-2 commit-4-2 \ + commit-6-1 commit-5-1 commit-4-1 \ + >expect && -+ run_three_modes "git rev-list --topo-order commit-3-8..commit-6-6" ++ run_three_modes git rev-list --topo-order commit-3-8..commit-6-6 +' + +test_expect_success 'rev-list: first-parent range topo-order' ' @@ -78,7 +78,7 @@ + commit-6-2 \ + commit-6-1 commit-5-1 commit-4-1 \ + >expect && -+ run_three_modes "git rev-list --first-parent --topo-order commit-3-8..commit-6-6" ++ run_three_modes git rev-list --first-parent --topo-order commit-3-8..commit-6-6 +' + +test_expect_success 'rev-list: ancestry-path topo-order' ' @@ -88,7 +88,7 @@ + commit-6-4 commit-5-4 commit-4-4 commit-3-4 \ + commit-6-3 commit-5-3 commit-4-3 \ + >expect && -+ run_three_modes "git rev-list --topo-order --ancestry-path commit-3-3..commit-6-6" ++ run_three_modes git rev-list --topo-order --ancestry-path commit-3-3..commit-6-6 +' + +test_expect_success 'rev-list: symmetric difference topo-order' ' @@ -102,7 +102,7 @@ + commit-3-8 commit-2-8 commit-1-8 \ + commit-3-7 commit-2-7 commit-1-7 \ + >expect && -+ run_three_modes "git rev-list --topo-order commit-3-8...commit-6-6" ++ run_three_modes git rev-list --topo-order commit-3-8...commit-6-6 +' + test_done 4: a74ae13d4e = 4: fd1a0ab7cd revision.c: begin refactoring --topo-order logic 5: 0e64fc144c = 5: e86f304082 commit/revisions: bookkeeping before refactoring -: ---------- > 6: fa6d5ef152 revision.h: add whitespace in flag definitions 6: 3b185ac3b1 ! 7: 020b2f50c5 revision.c: refactor basic topo-order logic @@ -404,11 +404,11 @@ --- a/revision.h +++ b/revision.h @@ - #define USER_GIVEN (1u<<25) /* given directly by the user */ - #define TRACK_LINEAR (1u<<26) - #define ALL_REV_FLAGS (((1u<<11)-1) | USER_GIVEN | TRACK_LINEAR) -+#define TOPO_WALK_EXPLORED (1u<<27) -+#define TOPO_WALK_INDEGREE (1u<<28) + #define USER_GIVEN (1u<<25) /* given directly by the user */ + #define TRACK_LINEAR (1u<<26) + #define ALL_REV_FLAGS (((1u<<11)-1) | USER_GIVEN | TRACK_LINEAR) ++#define TOPO_WALK_EXPLORED (1u<<27) ++#define TOPO_WALK_INDEGREE (1u<<28) #define DECORATE_SHORT_REFS 1 #define DECORATE_FULL_REFS 2