diff mbox series

[6/8] commit-graph.c: simplify 'fill_oids_from_commits'

Message ID 7e9d8c1f1a124171ebb5b4d874718053d1c9064a.1588641176.git.me@ttaylorr.com (mailing list archive)
State New, archived
Headers show
Series commit-graph: drop CHECK_OIDS, peel in callers | expand

Commit Message

Taylor Blau May 5, 2020, 1:13 a.m. UTC
In the previous handful of commits, both 'git commit-graph write
--reachable' and '--stdin-commits' learned to peel tags down to the
commits which they refer to before passing them into the commit-graph
internals.

This makes the call to 'lookup_commit_reference_gently()' inside of
'fill_oids_from_commits()' a noop, since all OIDs are commits by that
point.

As such, remove the call entirely, as well as the progress meter, which
has been split and moved out to the callers in the aforementioned
earlier commits.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 commit-graph.c | 33 +++------------------------------
 1 file changed, 3 insertions(+), 30 deletions(-)

Comments

Derrick Stolee May 5, 2020, 12:05 p.m. UTC | #1
On 5/4/2020 9:13 PM, Taylor Blau wrote:
> In the previous handful of commits, both 'git commit-graph write
> --reachable' and '--stdin-commits' learned to peel tags down to the
> commits which they refer to before passing them into the commit-graph
> internals.
> 
> This makes the call to 'lookup_commit_reference_gently()' inside of
> 'fill_oids_from_commits()' a noop, since all OIDs are commits by that
> point.
> 
> As such, remove the call entirely, as well as the progress meter, which
> has been split and moved out to the callers in the aforementioned
> earlier commits.
...
>  	oidset_iter_init(commits, &iter);
>  	while ((oid = oidset_iter_next(&iter))) {
[snip removed code]
> +		ALLOC_GROW(ctx->oids.list, ctx->oids.nr + 1, ctx->oids.alloc);
> +		oidcpy(&ctx->oids.list[ctx->oids.nr], oid);
> +		ctx->oids.nr++;
>  	}

After looking at this loop, I wondered why we can't just use the
oidset inside the context struct. But then I realized that the
oids list expands with reachable commits and then is sorted
lexicographically. Thus, this is the best time to copy from the
oidset to the commit list.

I agree that dropping the progress is valuable here, since this
_should_ be a very fast operation even for enormous commit sets.

Thanks,
-Stolee
Jeff King May 7, 2020, 8:21 p.m. UTC | #2
On Mon, May 04, 2020 at 07:13:54PM -0600, Taylor Blau wrote:

> In the previous handful of commits, both 'git commit-graph write
> --reachable' and '--stdin-commits' learned to peel tags down to the
> commits which they refer to before passing them into the commit-graph
> internals.
> 
> This makes the call to 'lookup_commit_reference_gently()' inside of
> 'fill_oids_from_commits()' a noop, since all OIDs are commits by that
> point.
> 
> As such, remove the call entirely, as well as the progress meter, which
> has been split and moved out to the callers in the aforementioned
> earlier commits.

Yep, all this makes sense. I agree with Stolee that it's unfortunate we
can't just reuse the oidset now, but we do need the flattened array view
here. We could perhaps create such an array from the beginning (perhaps
using an oid_array), but we do need to care about de-duping the input.
That can be done easily with the sorted list, but there are pathological
corner cases where the performance is worse (e.g., if you have a ton of
refs all pointing to the same tags, like if you happened to be storing
the refs for 20,000 forks of the kernel all in one giant repo).

I think we'd eventually turn all these into "struct commit" (and indeed,
we already do in --stdin-commits when we try to peel). So another
alternative would be passing in an object_array(). But I guess that
would require surgery to the rest of the commit-graph code to work with
that instead of the oid list.

>  	while ((oid = oidset_iter_next(&iter))) {
> -		struct commit *result;
> -
> -		display_progress(ctx->progress, ++i);
> -
> -		result = lookup_commit_reference_gently(ctx->r, oid, 1);
> -		if (result) {
> -			ALLOC_GROW(ctx->oids.list, ctx->oids.nr + 1, ctx->oids.alloc);
> -			oidcpy(&ctx->oids.list[ctx->oids.nr], &(result->object.oid));
> -			ctx->oids.nr++;
> -		} else if (ctx->check_oids) {
> -			error(_("invalid commit object id: %s"),
> -			      oid_to_hex(oid));
> -			return -1;
> -		}
> +		ALLOC_GROW(ctx->oids.list, ctx->oids.nr + 1, ctx->oids.alloc);
> +		oidcpy(&ctx->oids.list[ctx->oids.nr], oid);
> +		ctx->oids.nr++;
>  	}

I wondered if it's worth asserting that everything we got here is a
commit. But it's not cheap to make that check, and anyway we'd
presumably just barf later on when we try to resolve the oids to
commits. So there's little point in spending cycles to catch it here.

-Peff
diff mbox series

Patch

diff --git a/commit-graph.c b/commit-graph.c
index 5c3fad0dd7..24c2d80935 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1411,46 +1411,19 @@  static int fill_oids_from_packs(struct write_commit_graph_context *ctx,
 static int fill_oids_from_commits(struct write_commit_graph_context *ctx,
 				  struct oidset *commits)
 {
-	uint32_t i = 0;
-	struct strbuf progress_title = STRBUF_INIT;
 	struct oidset_iter iter;
 	struct object_id *oid;
 
 	if (!oidset_size(commits))
 		return 0;
 
-	if (ctx->report_progress) {
-		strbuf_addf(&progress_title,
-			    Q_("Finding commits for commit graph from %d ref",
-			       "Finding commits for commit graph from %d refs",
-			       oidset_size(commits)),
-			    oidset_size(commits));
-		ctx->progress = start_delayed_progress(
-					progress_title.buf,
-					oidset_size(commits));
-	}
-
 	oidset_iter_init(commits, &iter);
 	while ((oid = oidset_iter_next(&iter))) {
-		struct commit *result;
-
-		display_progress(ctx->progress, ++i);
-
-		result = lookup_commit_reference_gently(ctx->r, oid, 1);
-		if (result) {
-			ALLOC_GROW(ctx->oids.list, ctx->oids.nr + 1, ctx->oids.alloc);
-			oidcpy(&ctx->oids.list[ctx->oids.nr], &(result->object.oid));
-			ctx->oids.nr++;
-		} else if (ctx->check_oids) {
-			error(_("invalid commit object id: %s"),
-			      oid_to_hex(oid));
-			return -1;
-		}
+		ALLOC_GROW(ctx->oids.list, ctx->oids.nr + 1, ctx->oids.alloc);
+		oidcpy(&ctx->oids.list[ctx->oids.nr], oid);
+		ctx->oids.nr++;
 	}
 
-	stop_progress(&ctx->progress);
-	strbuf_release(&progress_title);
-
 	return 0;
 }