diff mbox series

[02/23] parse_commit_buffer(): treat lookup_tree() failure as parse error

Message ID 20191018044328.GB17879@sigill.intra.peff.net (mailing list archive)
State New, archived
Headers show
Series parsing and fsck cleanups | expand

Commit Message

Jeff King Oct. 18, 2019, 4:43 a.m. UTC
If parsing a commit yields a valid tree oid, but we've seen that same
oid as a non-tree in the same process, the resulting commit struct will
end up with a NULL tree pointer, but not otherwise report a parsing
failure.

That's perhaps convenient for callers which want to operate on even
partially corrupt commits (e.g., by still looking at the parents). But
it leaves a potential trap for most callers, who now have to manually
check for a NULL tree. Most do not, and it's likely that there are
possible segfaults lurking. I say "possible" because there are many
candidates, and I don't think it's worth following through on
reproducing them when we can just fix them all in one spot. And
certainly we _have_ seen real-world cases, such as the one fixed by
806278dead (commit-graph.c: handle corrupt/missing trees, 2019-09-05).

Note that we can't quite drop the check in the caller added by that
commit yet, as there's some subtlety with repeated parsings (which will
be addressed in a future commit).

Signed-off-by: Jeff King <peff@peff.net>
---
 commit.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Jonathan Tan Oct. 24, 2019, 11:12 p.m. UTC | #1
> And
> certainly we _have_ seen real-world cases, such as the one fixed by
> 806278dead (commit-graph.c: handle corrupt/missing trees, 2019-09-05).
> 
> Note that we can't quite drop the check in the caller added by that
> commit yet, as there's some subtlety with repeated parsings (which will
> be addressed in a future commit).

I tried figuring out what the subtlety is by undoing the check you
describe, and did get a segfault in one of the tests, but couldn't
figure out exactly what is going on. Looking at the code, is it because
load_tree_for_commit() in commit-graph.c uses the return value of
lookup_tree() indiscriminately (which is the issue that this patch
fixes)?

This patch itself and patch 1 looks good (with the reasoning in the
commit message), of course.
Jeff King Oct. 24, 2019, 11:22 p.m. UTC | #2
On Thu, Oct 24, 2019 at 04:12:20PM -0700, Jonathan Tan wrote:

> > And
> > certainly we _have_ seen real-world cases, such as the one fixed by
> > 806278dead (commit-graph.c: handle corrupt/missing trees, 2019-09-05).
> > 
> > Note that we can't quite drop the check in the caller added by that
> > commit yet, as there's some subtlety with repeated parsings (which will
> > be addressed in a future commit).
> 
> I tried figuring out what the subtlety is by undoing the check you
> describe, and did get a segfault in one of the tests, but couldn't
> figure out exactly what is going on. Looking at the code, is it because
> load_tree_for_commit() in commit-graph.c uses the return value of
> lookup_tree() indiscriminately (which is the issue that this patch
> fixes)?
> 
> This patch itself and patch 1 looks good (with the reasoning in the
> commit message), of course.

It's the issue addressed in patch 4. The issue is that some _other_ code
already called parse_commit(), got an error, but then for whatever
reason ignored it. Now when we call parse_commit(), the "parsed" flag is
set, so it returns success even though the tree is still NULL.

That commit fixes the problem and does drop the extra tree check.

-Peff
diff mbox series

Patch

diff --git a/commit.c b/commit.c
index 6467c9e175..810419a168 100644
--- a/commit.c
+++ b/commit.c
@@ -401,6 +401,7 @@  int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
 	struct commit_graft *graft;
 	const int tree_entry_len = the_hash_algo->hexsz + 5;
 	const int parent_entry_len = the_hash_algo->hexsz + 7;
+	struct tree *tree;
 
 	if (item->object.parsed)
 		return 0;
@@ -412,7 +413,12 @@  int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
 	if (get_oid_hex(bufptr + 5, &parent) < 0)
 		return error("bad tree pointer in commit %s",
 			     oid_to_hex(&item->object.oid));
-	set_commit_tree(item, lookup_tree(r, &parent));
+	tree = lookup_tree(r, &parent);
+	if (!tree)
+		return error("bad tree pointer %s in commit %s",
+			     oid_to_hex(&parent),
+			     oid_to_hex(&item->object.oid));
+	set_commit_tree(item, tree);
 	bufptr += tree_entry_len + 1; /* "tree " + "hex sha1" + "\n" */
 	pptr = &item->parents;