[3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
diff mbox series

Message ID 20190805080240.30892-4-szeder.dev@gmail.com
State New
Headers show
Series
  • commit-graph: error out on invalid commit oids in 'write --stdin-commits'
Related show

Commit Message

SZEDER Gábor Aug. 5, 2019, 8:02 a.m. UTC
While 'git commit-graph write --stdin-commits' expects commit object
ids as input, it accepts and silently skips over any invalid commit
object ids, and still exits with success:

  # nonsense
  $ echo not-a-commit-oid | git commit-graph write --stdin-commits
  $ echo $?
  0
  # sometimes I forgot that refs are not good...
  $ echo HEAD | git commit-graph write --stdin-commits
  $ echo $?
  0
  # valid tree OID, but not a commit OID
  $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
  $ echo $?
  0
  $ ls -l .git/objects/info/commit-graph
  ls: cannot access '.git/objects/info/commit-graph': No such file or directory

Check that all input records are indeed valid commit object ids and
return with error otherwise, the same way '--stdin-packs' handles
invalid input; see e103f7276f (commit-graph: return with errors during
write, 2019-06-12).

Note that it should only return with error when encountering an
invalid commit object id coming from standard input.  However,
'--reachable' uses the same code path to process object ids pointed to
by all refs, and that includes tag object ids as well, which should
still be skipped over.  Therefore add a new flag to 'enum
commit_graph_write_flags' and a corresponding field to 'struct
write_commit_graph_context', so we can differentiate between those two
cases.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 builtin/commit-graph.c  |  4 +++-
 commit-graph.c          | 29 +++++++++++++++++------------
 commit-graph.h          |  4 +++-
 t/t5318-commit-graph.sh | 11 ++++++++++-
 4 files changed, 33 insertions(+), 15 deletions(-)

Comments

Derrick Stolee Aug. 5, 2019, 1:57 p.m. UTC | #1
On 8/5/2019 4:02 AM, SZEDER Gábor wrote:
> While 'git commit-graph write --stdin-commits' expects commit object
> ids as input, it accepts and silently skips over any invalid commit
> object ids, and still exits with success:
> 
>   # nonsense
>   $ echo not-a-commit-oid | git commit-graph write --stdin-commits
>   $ echo $?
>   0
>   # sometimes I forgot that refs are not good...
>   $ echo HEAD | git commit-graph write --stdin-commits
>   $ echo $?
>   0
>   # valid tree OID, but not a commit OID
>   $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
>   $ echo $?
>   0
>   $ ls -l .git/objects/info/commit-graph
>   ls: cannot access '.git/objects/info/commit-graph': No such file or directory
> 
> Check that all input records are indeed valid commit object ids and
> return with error otherwise, the same way '--stdin-packs' handles
> invalid input; see e103f7276f (commit-graph: return with errors during
> write, 2019-06-12).

Consistency is good. We should definitely make these modes match.

> Note that it should only return with error when encountering an
> invalid commit object id coming from standard input.  However,
> '--reachable' uses the same code path to process object ids pointed to
> by all refs, and that includes tag object ids as well, which should
> still be skipped over.  Therefore add a new flag to 'enum
> commit_graph_write_flags' and a corresponding field to 'struct
> write_commit_graph_context', so we can differentiate between those two
> cases.

Thank you for the care here.

[snip]
> @@ -1215,20 +1216,21 @@ static void fill_oids_from_commit_hex(struct write_commit_graph_context *ctx,
>  		struct commit *result;
>  
>  		display_progress(ctx->progress, i + 1);
> -		if (commit_hex->items[i].string &&
> -		    parse_oid_hex(commit_hex->items[i].string, &oid, &end))
> -			continue;
> -
> -		result = lookup_commit_reference_gently(ctx->r, &oid, 1);
> -
> -		if (result) {
> +		if (!parse_oid_hex(commit_hex->items[i].string, &oid, &end) &&
> +		    (result = lookup_commit_reference_gently(ctx->r, &oid, 1))) {
>  			ALLOC_GROW(ctx->oids.list, ctx->oids.nr + 1, ctx->oids.alloc);
>  			oidcpy(&ctx->oids.list[ctx->oids.nr], &(result->object.oid));
>  			ctx->oids.nr++;
> +		} else if (ctx->check_oids) {
> +			error(_("invalid commit object id: %s"),
> +			    commit_hex->items[i].string);
> +			return -1;
>  		}
>  	}
>  	stop_progress(&ctx->progress);
>  	strbuf_release(&progress_title);
> +
> +	return 0;
>  }

This is the critical bit. I notice that you are not checking commit_hex->items[i].string
for NULL, but it should never be NULL here anyway.

> @@ -1775,6 +1777,7 @@ int write_commit_graph(const char *obj_dir,
>  	ctx->append = flags & COMMIT_GRAPH_WRITE_APPEND ? 1 : 0;
>  	ctx->report_progress = flags & COMMIT_GRAPH_WRITE_PROGRESS ? 1 : 0;
>  	ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0;
> +	ctx->check_oids = flags & COMMIT_GRAPH_WRITE_CHECK_OIDS ? 1 : 0;
>  	ctx->split_opts = split_opts;

Using the enum for the function and the bitfield for internal logic matches the
existing pattern. Thanks.

> @@ -1829,8 +1832,10 @@ int write_commit_graph(const char *obj_dir,
>  			goto cleanup;
>  	}
>  
> -	if (commit_hex)
> -		fill_oids_from_commit_hex(ctx, commit_hex);
> +	if (commit_hex) {
> +		if ((res = fill_oids_from_commit_hex(ctx, commit_hex)))
> +			goto cleanup;
> +	}

And this links the low-level error to a return code.

Thanks for this! The changes here look good and justify the two cleanup
patches.

-Stolee
SZEDER Gábor Aug. 5, 2019, 5:57 p.m. UTC | #2
On Mon, Aug 05, 2019 at 09:57:19AM -0400, Derrick Stolee wrote:
> On 8/5/2019 4:02 AM, SZEDER Gábor wrote:
> > While 'git commit-graph write --stdin-commits' expects commit object
> > ids as input, it accepts and silently skips over any invalid commit
> > object ids, and still exits with success:
> > 
> >   # nonsense
> >   $ echo not-a-commit-oid | git commit-graph write --stdin-commits
> >   $ echo $?
> >   0
> >   # sometimes I forgot that refs are not good...
> >   $ echo HEAD | git commit-graph write --stdin-commits
> >   $ echo $?
> >   0
> >   # valid tree OID, but not a commit OID
> >   $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
> >   $ echo $?
> >   0
> >   $ ls -l .git/objects/info/commit-graph
> >   ls: cannot access '.git/objects/info/commit-graph': No such file or directory
> > 
> > Check that all input records are indeed valid commit object ids and
> > return with error otherwise, the same way '--stdin-packs' handles
> > invalid input; see e103f7276f (commit-graph: return with errors during
> > write, 2019-06-12).
> 
> Consistency is good. We should definitely make these modes match.

I was also wondering whether it would be worth accepting refs as well,
either as DWIMery or only when a '--revs' option is given (similar to
'git pack-objects --revs').  Dunno, I'm a bit hesitant about always
accepting refs as a DWIMery, this is plumbing after all.  And I don't
really care whether I correct my bogus command by replacing 'echo'
with 'git rev-parse' or by adding a '--revs' argument; the important
thing is that the command should tell me that I gave it junk.  And
that would be a new feature, while this patch is a bugfix IMO.

> > Note that it should only return with error when encountering an
> > invalid commit object id coming from standard input.  However,
> > '--reachable' uses the same code path to process object ids pointed to
> > by all refs, and that includes tag object ids as well, which should
> > still be skipped over.  Therefore add a new flag to 'enum
> > commit_graph_write_flags' and a corresponding field to 'struct
> > write_commit_graph_context', so we can differentiate between those two
> > cases.
> 
> Thank you for the care here.

Well, to be honest, I wasn't careful... :)  but running the test suite
with GIT_TEST_COMMIT_GRAPH=1 resulted in about a dozen failed test
scripts that traced back to this.

Patch
diff mbox series

diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 64eccde314..57863619b7 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -213,8 +213,10 @@  static int graph_write(int argc, const char **argv)
 
 		if (opts.stdin_packs)
 			pack_indexes = &lines;
-		if (opts.stdin_commits)
+		if (opts.stdin_commits) {
 			commit_hex = &lines;
+			flags |= COMMIT_GRAPH_WRITE_CHECK_OIDS;
+		}
 
 		UNLEAK(buf);
 	}
diff --git a/commit-graph.c b/commit-graph.c
index 04324a4648..821900675b 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -782,7 +782,8 @@  struct write_commit_graph_context {
 
 	unsigned append:1,
 		 report_progress:1,
-		 split:1;
+		 split:1,
+		 check_oids:1;
 
 	const struct split_commit_graph_opts *split_opts;
 };
@@ -1193,8 +1194,8 @@  static int fill_oids_from_packs(struct write_commit_graph_context *ctx,
 	return 0;
 }
 
-static void fill_oids_from_commit_hex(struct write_commit_graph_context *ctx,
-				      struct string_list *commit_hex)
+static int fill_oids_from_commit_hex(struct write_commit_graph_context *ctx,
+				     struct string_list *commit_hex)
 {
 	uint32_t i;
 	struct strbuf progress_title = STRBUF_INIT;
@@ -1215,20 +1216,21 @@  static void fill_oids_from_commit_hex(struct write_commit_graph_context *ctx,
 		struct commit *result;
 
 		display_progress(ctx->progress, i + 1);
-		if (commit_hex->items[i].string &&
-		    parse_oid_hex(commit_hex->items[i].string, &oid, &end))
-			continue;
-
-		result = lookup_commit_reference_gently(ctx->r, &oid, 1);
-
-		if (result) {
+		if (!parse_oid_hex(commit_hex->items[i].string, &oid, &end) &&
+		    (result = lookup_commit_reference_gently(ctx->r, &oid, 1))) {
 			ALLOC_GROW(ctx->oids.list, ctx->oids.nr + 1, ctx->oids.alloc);
 			oidcpy(&ctx->oids.list[ctx->oids.nr], &(result->object.oid));
 			ctx->oids.nr++;
+		} else if (ctx->check_oids) {
+			error(_("invalid commit object id: %s"),
+			    commit_hex->items[i].string);
+			return -1;
 		}
 	}
 	stop_progress(&ctx->progress);
 	strbuf_release(&progress_title);
+
+	return 0;
 }
 
 static void fill_oids_from_all_packs(struct write_commit_graph_context *ctx)
@@ -1775,6 +1777,7 @@  int write_commit_graph(const char *obj_dir,
 	ctx->append = flags & COMMIT_GRAPH_WRITE_APPEND ? 1 : 0;
 	ctx->report_progress = flags & COMMIT_GRAPH_WRITE_PROGRESS ? 1 : 0;
 	ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0;
+	ctx->check_oids = flags & COMMIT_GRAPH_WRITE_CHECK_OIDS ? 1 : 0;
 	ctx->split_opts = split_opts;
 
 	if (ctx->split) {
@@ -1829,8 +1832,10 @@  int write_commit_graph(const char *obj_dir,
 			goto cleanup;
 	}
 
-	if (commit_hex)
-		fill_oids_from_commit_hex(ctx, commit_hex);
+	if (commit_hex) {
+		if ((res = fill_oids_from_commit_hex(ctx, commit_hex)))
+			goto cleanup;
+	}
 
 	if (!pack_indexes && !commit_hex)
 		fill_oids_from_all_packs(ctx);
diff --git a/commit-graph.h b/commit-graph.h
index ae4db87fb5..486e64e591 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -74,7 +74,9 @@  int generation_numbers_enabled(struct repository *r);
 enum commit_graph_write_flags {
 	COMMIT_GRAPH_WRITE_APPEND     = (1 << 0),
 	COMMIT_GRAPH_WRITE_PROGRESS   = (1 << 1),
-	COMMIT_GRAPH_WRITE_SPLIT      = (1 << 2)
+	COMMIT_GRAPH_WRITE_SPLIT      = (1 << 2),
+	/* Make sure that each OID in the input is a valid commit OID. */
+	COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3)
 };
 
 struct split_commit_graph_opts {
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 4391007f4c..ab3eccf0fa 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -23,7 +23,7 @@  test_expect_success 'write graph with no packs' '
 	test_path_is_missing info/commit-graph
 '
 
-test_expect_success 'close with correct error on bad input' '
+test_expect_success 'exit with correct error on bad input to --stdin-packs' '
 	cd "$TRASH_DIRECTORY/full" &&
 	echo doesnotexist >in &&
 	test_expect_code 1 git commit-graph write --stdin-packs <in 2>stderr &&
@@ -40,6 +40,15 @@  test_expect_success 'create commits and repack' '
 	git repack
 '
 
+test_expect_success 'exit with correct error on bad input to --stdin-commits' '
+	cd "$TRASH_DIRECTORY/full" &&
+	echo HEAD | test_expect_code 1 git commit-graph write --stdin-commits 2>stderr &&
+	test_i18ngrep "invalid commit object id" stderr &&
+	# valid tree OID, but not a commit OID
+	git rev-parse HEAD^{tree} | test_expect_code 1 git commit-graph write --stdin-commits 2>stderr &&
+	test_i18ngrep "invalid commit object id" stderr
+'
+
 graph_git_two_modes() {
 	git -c core.commitGraph=true $1 >output
 	git -c core.commitGraph=false $1 >expect