[5/8] builtin/commit-graph.c: dereference tags in builtin

Message ID 513a634f14e90ec0c2e80a6aaf8cb66bbedf8966.1588641176.git.me@ttaylorr.com (mailing list archive)
State New, archived
Series commit-graph: drop CHECK_OIDS, peel in callers

Commit Message

Taylor Blau May 5, 2020, 1:13 a.m. UTC
When given a list of commits, the commit-graph machinery calls
'lookup_commit_reference_gently()' on each element in the set and treats
the resulting set of OIDs as the base over which to close for
reachability.

In an earlier collection of commits, the 'git commit-graph write
--reachable' case made the inner-most call to
'lookup_commit_reference_gently()' a noop by peeling references before they
were passed over to the commit-graph internals.

Do the analog for 'git commit-graph write --stdin-commits' by calling
'lookup_commit_reference_gently()' outside of the commit-graph
machinery, making the inner-most call a noop.

Since this may incur additional processing time, surround
'read_one_commit' with a progress meter to provide output to the caller.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/commit-graph.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

Comments

Derrick Stolee May 5, 2020, 12:01 p.m. UTC | #1
On 5/4/2020 9:13 PM, Taylor Blau wrote:
> @@ -228,18 +240,25 @@ static int graph_write(int argc, const char **argv)
>  		if (opts.stdin_commits) {
>  			oidset_init(&commits, 0);
>  			flags |= COMMIT_GRAPH_WRITE_CHECK_OIDS;
> +			if (opts.progress)
> +				progress = start_delayed_progress(
> +					_("Analyzing commits from stdin"), 0);
The code functions as you intend and is an improvement. Similar to my
earlier suggestion to use something like "Collecting referenced commits"
for the --reachable case, perhaps this could be "Collecting commits from input"?

Thanks,
-Stolee

Taylor Blau May 5, 2020, 4:14 p.m. UTC | #2
On Tue, May 05, 2020 at 08:01:29AM -0400, Derrick Stolee wrote:
> On 5/4/2020 9:13 PM, Taylor Blau wrote:
> > @@ -228,18 +240,25 @@ static int graph_write(int argc, const char **argv)
> >  		if (opts.stdin_commits) {
> >  			oidset_init(&commits, 0);
> >  			flags |= COMMIT_GRAPH_WRITE_CHECK_OIDS;
> > +			if (opts.progress)
> > +				progress = start_delayed_progress(
> > +					_("Analyzing commits from stdin"), 0);
> The code functions as you intend and is an improvement. Similar to my
> earlier suggestion to use something like "Collecting referenced commits"
> for the --reachable case, perhaps this could be "Collecting commits from input"?

Yep, making these consistent with one another is a good thing to do,
thanks.

> Thanks,
> -Stolee

Thanks,
Taylor

Jeff King May 7, 2020, 8:14 p.m. UTC | #3
On Mon, May 04, 2020 at 07:13:49PM -0600, Taylor Blau wrote:

> When given a list of commits, the commit-graph machinery calls
> 'lookup_commit_reference_gently()' on each element in the set and treats
> the resulting set of OIDs as the base over which to close for
> reachability.
> 
> In an earlier collection of commits, the 'git commit-graph write
> --reachable' case made the inner-most call to
> 'lookup_commit_reference_gently()' a noop by peeling references before they
> were passed over to the commit-graph internals.
> 
> Do the analog for 'git commit-graph write --stdin-commits' by calling
> 'lookup_commit_reference_gently()' outside of the commit-graph
> machinery, making the inner-most call a noop.

Yep, I think this is a good direction.
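
To make "peeling" concrete, here is a minimal, self-contained sketch of the idea: follow tag targets until a commit is reached, and reject anything else. It is a toy model only. The demo_* types and function are hypothetical stand-ins and do not reflect git's real object API or how lookup_commit_reference_gently() is implemented.

```c
#include <stddef.h>

/* Hypothetical object kinds, for illustration only. */
enum demo_type { DEMO_COMMIT, DEMO_TAG, DEMO_TREE };

struct demo_object {
	enum demo_type type;
	const struct demo_object *target; /* only meaningful for tags */
};

/*
 * Follow tag targets until a non-tag object is reached.
 * Returns the commit, or NULL if the chain ends in something
 * that is not a commit (or in a missing object).
 */
static const struct demo_object *
demo_peel_to_commit(const struct demo_object *obj)
{
	while (obj && obj->type == DEMO_TAG)
		obj = obj->target;
	if (!obj || obj->type != DEMO_COMMIT)
		return NULL;
	return obj;
}
```

In this model a commit peels to itself, so once the caller has peeled its input, peeling again inside the callee changes nothing. That is the sense in which the inner call becomes a noop.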

> @@ -148,7 +151,15 @@ static int read_one_commit(struct oidset *commits, char *hash)
>  		return 1;
>  	}
>  
> -	oidset_insert(commits, &oid);
> +	display_progress(progress, oidset_size(commits) + 1);

Most of our meters increment progress _after_ doing work.

This is especially important for meters with percentages (if we knew we
had 1 commit, we'd print "100%" and _then_ start to peel it, which is
silly).

For this instance we don't know the total number, so we're just counting
up. But I think we should be consistent about when we update meters.
Plus it's shorter to say:

  display_progress(progress, oidset_size(commits));

after having done the work. ;)
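
The ordering can be sketched outside of git as follows. This is only an illustration under stated assumptions: demo_progress, demo_display_progress(), demo_peel() and demo_read_commits() are hypothetical stand-ins, not git's progress API, and "peeling" is reduced to a trivial check.

```c
#include <stddef.h>

/* Hypothetical meter: it only records what was last shown. */
struct demo_progress {
	size_t last_shown;
	size_t updates;
};

static void demo_display_progress(struct demo_progress *p, size_t n)
{
	if (!p)
		return; /* in this sketch a NULL meter is simply ignored */
	p->last_shown = n;
	p->updates++;
}

/* Hypothetical unit of work: succeeds for any non-zero id. */
static int demo_peel(int id)
{
	return id != 0;
}

/*
 * Process each id, bumping the meter only after the work for that
 * id has succeeded. Returns the number processed, or -1 on the
 * first failure.
 */
static int demo_read_commits(const int *ids, size_t nr,
			     struct demo_progress *p)
{
	size_t done = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!demo_peel(ids[i]))
			return -1;
		done++;
		demo_display_progress(p, done);
	}
	return (int)done;
}
```

With this ordering the meter never reports an item whose work has not finished, so a failure on the second of three inputs leaves the count at 1 rather than 2.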

> +	result = lookup_commit_reference_gently(the_repository, &oid, 1);

Would we want to pass quiet==0 here? If we see an error we're going to
bail loudly below, so getting more details from the low-level functions
is helpful (I think the only one you'd really get is "I'm looking for a
commit, but it's a tree" or similar).

lookup_commit_reference_gently() is pretty aggressive about parsing
objects. We'll have to parse commits eventually, but we could possibly
do so using their graph representations. It may not be worth optimizing,
because it would only matter if you fed a lot of --stdin-commits inputs
that were already graphed. (And if it is worth optimizing, it should
probably come in a separate commit anyway; this is just about moving the
existing peeling).

> +	if (result)
> +		oidset_insert(commits, &result->object.oid);
> +	else {
> +		error(_("invalid commit object id: %s"), hash);
> +		return 1;
> +	}

If you follow my "return -1" suggestion from earlier, this would need
it, too.

>  		while (strbuf_getline(&buf, stdin) != EOF) {
>  			char *line = strbuf_detach(&buf, NULL);
>  			if (opts.stdin_commits) {
> -				int result = read_one_commit(&commits, line);
> +				int result = read_one_commit(&commits, progress,
> +							     line);
>  				if (result)
>  					return result;
>  			} else
>  				string_list_append(&pack_indexes, line);
>  		}
>  
> +		if (progress)
> +			stop_progress(&progress);

If we return early in the loop, we'd leave this progress meter hanging.
It might be worth converting that return to a break or goto that handles
cleanup (it also needs to handle releasing the strbuf).
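
One possible shape for that cleanup path, as a self-contained sketch rather than a proposed patch. Everything named demo_* is hypothetical: the state struct stands in for the progress meter and the line buffer, and plain malloc()/free() stand in for the strbuf handling.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping so the cleanup is observable. */
struct demo_state {
	int progress_active; /* 1 while the stand-in meter is running */
	int buffers_live;    /* line copies allocated but not yet freed */
};

/* Hypothetical per-line handler: the line "bad" is an error. */
static int demo_handle_line(const char *line)
{
	return strcmp(line, "bad") == 0 ? -1 : 0;
}

static int demo_process(const char **lines, size_t nr,
			struct demo_state *st)
{
	int result = 0;
	size_t i;

	st->progress_active = 1;
	for (i = 0; i < nr; i++) {
		size_t len = strlen(lines[i]) + 1;
		char *copy = malloc(len);

		if (!copy) {
			result = -1;
			goto cleanup;
		}
		memcpy(copy, lines[i], len);
		st->buffers_live++;

		result = demo_handle_line(copy);
		free(copy);
		st->buffers_live--;
		if (result)
			goto cleanup; /* not "return result" */
	}

cleanup:
	/* A single exit point stops the meter on success and on error. */
	st->progress_active = 0;
	return result;
}
```

The point of the single label is that the error path and the success path both pass through the code that stops the meter and releases resources, which a bare return inside the loop would skip.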

-Peff
Patch

diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index f550d8489a..9eec68572f 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -6,6 +6,7 @@ 
 #include "repository.h"
 #include "commit-graph.h"
 #include "object-store.h"
+#include "progress.h"
 
 static char const * const builtin_commit_graph_usage[] = {
 	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
@@ -138,8 +139,10 @@  static int write_option_parse_split(const struct option *opt, const char *arg,
 	return 0;
 }
 
-static int read_one_commit(struct oidset *commits, char *hash)
+static int read_one_commit(struct oidset *commits, struct progress *progress,
+			   char *hash)
 {
+	struct commit *result;
 	struct object_id oid;
 	const char *end;
 
@@ -148,7 +151,15 @@  static int read_one_commit(struct oidset *commits, char *hash)
 		return 1;
 	}
 
-	oidset_insert(commits, &oid);
+	display_progress(progress, oidset_size(commits) + 1);
+
+	result = lookup_commit_reference_gently(the_repository, &oid, 1);
+	if (result)
+		oidset_insert(commits, &result->object.oid);
+	else {
+		error(_("invalid commit object id: %s"), hash);
+		return 1;
+	}
 	return 0;
 }
 
@@ -159,6 +170,7 @@  static int graph_write(int argc, const char **argv)
 	struct object_directory *odb = NULL;
 	int result = 0;
 	enum commit_graph_write_flags flags = 0;
+	struct progress *progress = NULL;
 
 	static struct option builtin_commit_graph_write_options[] = {
 		OPT_STRING(0, "object-dir", &opts.obj_dir,
@@ -228,18 +240,25 @@  static int graph_write(int argc, const char **argv)
 		if (opts.stdin_commits) {
 			oidset_init(&commits, 0);
 			flags |= COMMIT_GRAPH_WRITE_CHECK_OIDS;
+			if (opts.progress)
+				progress = start_delayed_progress(
+					_("Analyzing commits from stdin"), 0);
 		}
 
 		while (strbuf_getline(&buf, stdin) != EOF) {
 			char *line = strbuf_detach(&buf, NULL);
 			if (opts.stdin_commits) {
-				int result = read_one_commit(&commits, line);
+				int result = read_one_commit(&commits, progress,
+							     line);
 				if (result)
 					return result;
 			} else
 				string_list_append(&pack_indexes, line);
 		}
 
+		if (progress)
+			stop_progress(&progress);
+
 		UNLEAK(buf);
 	}