mbox series

[v3,00/10] commit-graph write: progress output improvements

Message ID 20181122132823.9883-1-avarab@gmail.com (mailing list archive)
Headers show
Series commit-graph write: progress output improvements | expand

Message

Ævar Arnfjörð Bjarmason Nov. 22, 2018, 1:28 p.m. UTC
This incorporates SZEDER's recent two-part series, rebases mine on
top, and fixes a few things while I'm at it. Now there's no progress
output where we don't show a completion percentage.

SZEDER Gábor (2):
  commit-graph: rename 'num_extra_edges' variable to 'num_large_edges'
  commit-graph: don't call write_graph_chunk_large_edges() unnecessarily

Ævar Arnfjörð Bjarmason (8):
  commit-graph write: rephrase confusing progress output
  commit-graph write: add "Writing out" progress output
  commit-graph write: more descriptive "writing out" output
  commit-graph write: show progress for object search
  commit-graph write: add more descriptive progress output
  commit-graph write: remove empty line for readability
  commit-graph write: add itermediate progress
  commit-graph write: emit a percentage for all progress

 commit-graph.c | 130 ++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 102 insertions(+), 28 deletions(-)

Range-diff:

By the way, is there any way to....

 [.. snipped lots of irrelevant commits...]
 -:  ---------- > 14:  07d06c50c0 commit-graph: rename 'num_extra_edges' variable to 'num_large_edges'
 -:  ---------- > 15:  904dda1e7a commit-graph: don't call write_graph_chunk_large_edges() unnecessarily

Pass the equivalent of "git range-diff origin/master topic-2 topic-3"
to git-format-patch?

 1:  9f7fb459bd = 16:  1126c7e29d commit-graph write: rephrase confusing progress output
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
 2:  093c63e99f ! 17:  2b52ad2284 commit-graph write: add more progress output
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
    @@ -1,9 +1,10 @@
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    -    commit-graph write: add more progress output
    +    commit-graph write: add "Writing out" progress output
     
    -    Add more progress output to the output already added in
    -    7b0f229222 ("commit-graph write: add progress output", 2018-09-17).
    +    Add progress output to be shown when we're writing out the
    +    commit-graph, this adds to the output already added in 7b0f229222
    +    ("commit-graph write: add progress output", 2018-09-17).
     
         As noted in that commit most of the progress output isn't displayed on
         small repositories, but before this change we'd noticeably hang for
    @@ -13,30 +14,13 @@
         point at which we're not producing progress output:
     
             $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
    -        Finding commits for commit graph: 6365492, done.
    +        Finding commits for commit graph: 6365442, done.
             Computing commit graph generation numbers: 100% (797222/797222), done.
    -        Writing out commit graph: 2399912, done.
    +        Writing out commit graph: 100% (3986110/3986110), done.
     
    -    This "writing out" number is not meant to be meaningful to the user,
    -    but just to show that we're doing work and the command isn't
    -    hanging.
    -
    -    In the current implementation it's approximately 4x the number of
    -    commits. As noted in on-list discussion[1] we could add the loops up
    -    and show percentage progress here, but I don't think it's worth it. It
    -    would make the implementation more complex and harder to maintain for
    -    very little gain.
    -
    -    On a much larger in-house repository I have we'll show (note how we
    -    also say "Annotating[...]"):
    -
    -        $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
    -        Finding commits for commit graph: 50026015, done.
    -        Annotating commit graph: 21567407, done.
    -        Computing commit graph generation numbers: 100% (7144680/7144680), done.
    -        Writing out commit graph: 21434417, done.
    -
    -    1. https://public-inbox.org/git/20181120165800.GB30222@szeder.dev/
    +    This "Writing out" number is 4x or 5x the number of commits, depending
    +    on the graph we're processing. A later change will make this explicit
    +    to the user.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    @@ -55,13 +39,13 @@
      	int i, count = 0;
      	struct commit **list = commits;
     @@
    - 	 */
    - 	for (i = 0; i < 256; i++) {
      		while (count < nr_commits) {
    -+			display_progress(progress, ++*progress_cnt);
      			if ((*list)->object.oid.hash[0] != i)
      				break;
    ++			display_progress(progress, ++*progress_cnt);
      			count++;
    + 			list++;
    + 		}
     @@
      }
      
    @@ -112,15 +96,17 @@
      	struct commit **list = commits;
      	struct commit **last = commits + nr_commits;
     @@
    - 						  commits,
    - 						  nr_commits,
    - 						  commit_to_sha1);
    -+			display_progress(progress, ++*progress_cnt);
      
    - 			if (edge_value < 0)
    - 				edge_value = GRAPH_PARENT_MISSING;
    + 	while (list < last) {
    + 		int num_parents = 0;
    ++
    ++		display_progress(progress, ++*progress_cnt);
    ++
    + 		for (parent = (*list)->parents; num_parents < 3 && parent;
    + 		     parent = parent->next)
    + 			num_parents++;
     @@
    - 	int num_extra_edges;
    + 	int num_large_edges;
      	struct commit_list *parent;
      	struct progress *progress = NULL;
     +	uint64_t progress_cnt = 0;
    @@ -134,19 +120,25 @@
     -	write_graph_chunk_fanout(f, commits.list, commits.nr);
     -	write_graph_chunk_oids(f, GRAPH_OID_LEN, commits.list, commits.nr);
     -	write_graph_chunk_data(f, GRAPH_OID_LEN, commits.list, commits.nr);
    --	write_graph_chunk_large_edges(f, commits.list, commits.nr);
    -+	if (report_progress)
    ++	if (report_progress) {
    ++		/*
    ++		 * Each of the write_graph_chunk_*() functions just
    ++		 * below loops over our N commits. This number must be
    ++		 * kept in sync with the number of passes we're doing.
    ++		 */
    ++		int graph_passes = 4;
    ++		if (num_large_edges)
    ++			graph_passes++;
     +		progress = start_delayed_progress(
     +			_("Writing out commit graph"),
    -+			0);
    -+	write_graph_chunk_fanout(f, commits.list, commits.nr, progress,
    -+				 &progress_cnt);
    -+	write_graph_chunk_oids(f, GRAPH_OID_LEN, commits.list, commits.nr,
    -+			       progress, &progress_cnt);
    -+	write_graph_chunk_data(f, GRAPH_OID_LEN, commits.list, commits.nr,
    -+			       progress, &progress_cnt);
    -+	write_graph_chunk_large_edges(f, commits.list, commits.nr, progress,
    -+				      &progress_cnt);
    ++			graph_passes * commits.nr);
    ++	}
    ++	write_graph_chunk_fanout(f, commits.list, commits.nr, progress, &progress_cnt);
    ++	write_graph_chunk_oids(f, GRAPH_OID_LEN, commits.list, commits.nr, progress, &progress_cnt);
    ++	write_graph_chunk_data(f, GRAPH_OID_LEN, commits.list, commits.nr, progress, &progress_cnt);
    + 	if (num_large_edges)
    +-		write_graph_chunk_large_edges(f, commits.list, commits.nr);
    ++		write_graph_chunk_large_edges(f, commits.list, commits.nr, progress, &progress_cnt);
     +	stop_progress(&progress);
      
      	close_commit_graph(the_repository);
 -:  ---------- > 18:  b1773677b1 commit-graph write: more descriptive "writing out" output
 3:  6c71de9460 ! 19:  3138b00a2c commit-graph write: show progress for object search
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
    @@ -37,9 +37,9 @@
      --- a/commit-graph.c
      +++ b/commit-graph.c
     @@
    - 	struct commit_list *parent;
      	struct progress *progress = NULL;
      	uint64_t progress_cnt = 0;
    + 	struct strbuf progress_title = STRBUF_INIT;
     +	unsigned long approx_nr_objects;
      
      	if (!commit_graph_compatible(the_repository))
 4:  c665dbdacb ! 20:  f41e3b3eb3 commit-graph write: add more describing progress output
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
    @@ -1,6 +1,6 @@
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    -    commit-graph write: add more describing progress output
    +    commit-graph write: add more descriptive progress output
     
         Make the progress output shown when we're searching for commits to
         include in the graph more descriptive. This amends code I added in
    @@ -36,14 +36,6 @@
      diff --git a/commit-graph.c b/commit-graph.c
      --- a/commit-graph.c
      +++ b/commit-graph.c
    -@@
    - 	struct progress *progress = NULL;
    - 	uint64_t progress_cnt = 0;
    - 	unsigned long approx_nr_objects;
    -+	struct strbuf progress_title = STRBUF_INIT;
    - 
    - 	if (!commit_graph_compatible(the_repository))
    - 		return;
     @@
      		strbuf_addf(&packname, "%s/pack/", obj_dir);
      		dirlen = packname.len;
    @@ -99,12 +91,3 @@
      				approx_nr_objects);
      		for_each_packed_object(add_packed_commits, &oids, 0);
      		if (oids.progress_done < approx_nr_objects)
    -@@
    - 				      &progress_cnt);
    - 	stop_progress(&progress);
    - 
    -+	strbuf_release(&progress_title);
    -+
    - 	close_commit_graph(the_repository);
    - 	finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_FSYNC);
    - 	commit_lock_file(&lk);
 5:  f70fc5045d = 21:  74037032d3 commit-graph write: remove empty line for readability
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
 6:  2e943fa925 ! 22:  502da68d14 commit-graph write: add even more progress output
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
    @@ -1,11 +1,13 @@
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    -    commit-graph write: add even more progress output
    +    commit-graph write: add itermediate progress
     
    -    Add more progress output to sections of code that can collectively
    -    take 5-10 seconds on a large enough repository. On a test repository
    -    with I have with ~7 million commits and ~50 million objects we'll now
    -    emit:
    +    Add progress output to sections of code between "Annotating[...]" and
    +    "Computing[...]generation numbers". This can collectively take 5-10
    +    seconds on a large enough repository.
    +
    +    On a test repository with I have with ~7 million commits and ~50
    +    million objects we'll now emit:
     
             $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
             Finding commits for commit graph among packed objects: 100% (50026015/50026015), done.
    @@ -57,7 +59,7 @@
     @@
      	ALLOC_ARRAY(commits.list, commits.alloc);
      
    - 	num_extra_edges = 0;
    + 	num_large_edges = 0;
     +	if (report_progress)
     +		progress = start_delayed_progress(
     +			_("Finding extra edges in commit graph"),
    @@ -71,7 +73,7 @@
     @@
      		commits.nr++;
      	}
    - 	num_chunks = num_extra_edges ? 4 : 3;
    + 	num_chunks = num_large_edges ? 4 : 3;
     +	stop_progress(&progress);
      
      	if (commits.nr >= GRAPH_PARENT_MISSING)
 -:  ---------- > 23:  dfaf840983 commit-graph write: emit a percentage for all progress

Comments

Ævar Arnfjörð Bjarmason Nov. 22, 2018, 3:39 p.m. UTC | #1
The "Writing out" progress output was off-by-one because I'd screwed
up a merge conflict. Fix that, and update the various progress output.

On my test setup the "Annotating commit graph" progress sometimes
shows up on linux.git, sometimes not, it's right on that edge of
taking 1 second. So always show it in the commit message examples,
that's less confusing for the reader.

SZEDER Gábor (2):
  commit-graph: rename 'num_extra_edges' variable to 'num_large_edges'
  commit-graph: don't call write_graph_chunk_large_edges() unnecessarily

Ævar Arnfjörð Bjarmason (8):
  commit-graph write: rephrase confusing progress output
  commit-graph write: add "Writing out" progress output
  commit-graph write: more descriptive "writing out" output
  commit-graph write: show progress for object search
  commit-graph write: add more descriptive progress output
  commit-graph write: remove empty line for readability
  commit-graph write: add itermediate progress
  commit-graph write: emit a percentage for all progress

 commit-graph.c | 130 ++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 102 insertions(+), 28 deletions(-)

Range-diff:
1:  2b52ad2284 ! 1:  9c17f56ed3 commit-graph write: add "Writing out" progress output
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
    @@ -15,10 +15,11 @@
     
             $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
             Finding commits for commit graph: 6365442, done.
    +        Annotating commit graph: 2391666, done.
             Computing commit graph generation numbers: 100% (797222/797222), done.
    -        Writing out commit graph: 100% (3986110/3986110), done.
    +        Writing out commit graph: 100% (3188888/3188888), done.
     
    -    This "Writing out" number is 4x or 5x the number of commits, depending
    +    This "Writing out" number is 3x or 4x the number of commits, depending
         on the graph we're processing. A later change will make this explicit
         to the user.
     
    @@ -126,7 +127,7 @@
     +		 * below loops over our N commits. This number must be
     +		 * kept in sync with the number of passes we're doing.
     +		 */
    -+		int graph_passes = 4;
    ++		int graph_passes = 3;
     +		if (num_large_edges)
     +			graph_passes++;
     +		progress = start_delayed_progress(
2:  b1773677b1 ! 2:  79b0a467d9 commit-graph write: more descriptive "writing out" output
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
    @@ -3,7 +3,7 @@
         commit-graph write: more descriptive "writing out" output
     
         Make the "Writing out" part of the progress output more
    -    descriptive. Depending on the shape of the graph we either make 4 or 5
    +    descriptive. Depending on the shape of the graph we either make 3 or 4
         passes over it.
     
         Let's present this information to the user in case they're wondering
    @@ -13,8 +13,9 @@
     
             $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
             Finding commits for commit graph: 6365442, done.
    +        Annotating commit graph: 2391666, done.
             Computing commit graph generation numbers: 100% (797222/797222), done.
    -        Writing out commit graph in 5 passes: 100% (3986110/3986110), done.
    +        Writing out commit graph in 4 passes: 100% (3188888/3188888), done.
     
         A note on i18n: Why are we using the Q_() function and passing a
         number & English text for a singular which'll never be used? Because
    @@ -35,7 +36,7 @@
      	if (!commit_graph_compatible(the_repository))
      		return;
     @@
    - 		int graph_passes = 4;
    + 		int graph_passes = 3;
      		if (num_large_edges)
      			graph_passes++;
     +		strbuf_addf(&progress_title,
3:  3138b00a2c ! 3:  b32be83b38 commit-graph write: show progress for object search
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
    @@ -8,12 +8,12 @@
     
         Before we'd emit on e.g. linux.git with "commit-graph write":
     
    -        Finding commits for commit graph: 6365492, done.
    +        Finding commits for commit graph: 6365442, done.
             [...]
     
         And now:
     
    -        Finding commits for commit graph: 100% (6365492/6365492), done.
    +        Finding commits for commit graph: 100% (6365442/6365442), done.
             [...]
     
         Since the commit graph only includes those commits that are packed
4:  f41e3b3eb3 ! 4:  54276723c0 commit-graph write: add more descriptive progress output
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
    @@ -10,7 +10,7 @@
         we support:
     
             $ git commit-graph write
    -        Finding commits for commit graph among packed objects: 100% (6365492/6365492), done.
    +        Finding commits for commit graph among packed objects: 100% (6365442/6365442), done.
             [...]
     
             # Actually we don't emit this since this takes almost no time at
    @@ -20,7 +20,7 @@
             [...]
     
             $ (cd .git/objects/pack/ && ls *idx) | git commit-graph write --stdin-pack
    -        Finding commits for commit graph in 3 packs: 6365492, done.
    +        Finding commits for commit graph in 2 packs: 6365442, done.
             [...]
     
         The middle on of those is going to be the output users might see in
5:  74037032d3 = 5:  0e847366e1 commit-graph write: remove empty line for readability
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
6:  502da68d14 ! 6:  c388aff73e commit-graph write: add itermediate progress
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
    @@ -10,22 +10,22 @@
         million objects we'll now emit:
     
             $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
    -        Finding commits for commit graph among packed objects: 100% (50026015/50026015), done.
    -        Annotating commit graph: 21567407, done.
    -        Counting distinct commits in commit graph: 100% (7189147/7189147), done.
    -        Finding extra edges in commit graph: 100% (7189147/7189147), done.
    -        Computing commit graph generation numbers: 100% (7144680/7144680), done.
    -        Writing out commit graph: 21434417, done.
    +        Finding commits for commit graph among packed objects: 100% (48333911/48333911), done.
    +        Annotating commit graph: 21435984, done.
    +        Counting distinct commits in commit graph: 100% (7145328/7145328), done.
    +        Finding extra edges in commit graph: 100% (7145328/7145328), done.
    +        Computing commit graph generation numbers: 100% (7145328/7145328), done.
    +        Writing out commit graph in 4 passes: 100% (28581312/28581312), done.
     
         Whereas on a medium-sized repository such as linux.git these new
         progress bars won't have time to kick in and as before and we'll still
         emit output like:
     
             $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
    -        Finding commits for commit graph among packed objects: 100% (6365492/6365492), done.
    +        Finding commits for commit graph among packed objects: 100% (6365442/6365442), done.
             Annotating commit graph: 2391666, done.
             Computing commit graph generation numbers: 100% (797222/797222), done.
    -        Writing out commit graph: 2399912, done.
    +        Writing out commit graph in 4 passes: 100% (3188888/3188888), done.
     
         The "Counting distinct commits in commit graph" phase will spend most
         of its time paused at "0/*" as we QSORT(...) the list. That's not
7:  dfaf840983 ! 7:  fd692499e0 commit-graph write: emit a percentage for all progress
     a => b | 0
     1 file changed, 0 insertions(+), 0 deletions(-)
    
    @@ -7,22 +7,13 @@
         write: add progress output", 2018-09-17) and evidently didn't notice
         how easy it was to add a completion percentage.
     
    -    Now for the very large test repository mentioned in previous commits
    -    we'll emit (shows all progress output):
    -
    -        Finding commits for commit graph among packed objects: 100% (48333911/48333911), done.
    -        Annotating commit graph: 100% (21435984/21435984), done.
    -        Counting distinct commits in commit graph: 100% (7145328/7145328), done.
    -        Finding extra edges in commit graph: 100% (7145328/7145328), done.
    -        Computing commit graph generation numbers: 100% (7145328/7145328), done.
    -        Writing out commit graph in 5 passes: 100% (35726640/35726640), done.
    -
    -    And for linux.git:
    +    Now for e.g. linux.git we'll emit:
     
    +        ~/g/git/git --exec-path=$HOME/g/git commit-graph write
             Finding commits for commit graph among packed objects: 100% (6365442/6365442), done.
             Annotating commit graph: 100% (2391666/2391666), done.
             Computing commit graph generation numbers: 100% (797222/797222), done.
    -        Writing out commit graph in 5 passes: 100% (3986110/3986110), done.
    +        Writing out commit graph in 4 passes: 100% (3188888/3188888), done.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Eric Sunshine Nov. 22, 2018, 6:59 p.m. UTC | #2
On Thu, Nov 22, 2018 at 8:28 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> Range-diff:
> By the way, is there any way to....
> Pass the equivalent of "git range-diff origin/master topic-2 topic-3"
> to git-format-patch?

git-range-diff documentations says that the three-argument form:

    git range-diff <base> <rev1> <rev2>

is equivalent to passing two ranges:

    git range-diff <base>..<rev1> <base>..<rev2>

git-format-patch synopsis shows:

    git format-patch --range-diff=<previous> <rev-range>

where <rev-range> is the range of commits to format, and <previous>
can be a range specifying the previous version, so:

    git format-patch --range-diff=<base>..<rev1> <base>..<rev2>

should do what you ask.

However, since the two versions in your example both derive from
origin/master, you should be able to get by with the simpler:

    git format-patch --range-diff=<rev1> <base>..<rev2>

which, if you were running git-range-diff manually, would be the equivalent of:

    git range-diff <rev1>...<rev2>

for which the range-diff machinery figures out the common base
(origin/master) automatically.