diff mbox series

[5/9] commit-graph: abort as soon as we see a bogus chunk

Message ID 20231109071711.GE2698043@coredump.intra.peff.net (mailing list archive)
State Accepted
Commit 8bd40ed2aed2baee235299cdbf8482c752f980a3
Headers show
Series some more chunk-file bounds-checks fixes | expand

Commit Message

Jeff King Nov. 9, 2023, 7:17 a.m. UTC
The code to read commit-graph files tries to read all of the required
chunks, but doesn't abort if we can't find one (or if it's corrupted).
It's only at the end of reading the file that we then do some sanity
checks for NULL entries. But it's preferable to detect the errors and
bail immediately, for a few reasons:

  1. It's less error-prone. It's easy in the reader functions to flag an
     error but still end up setting some struct fields (an error I in
     fact made while working on this patch series).

  2. It's safer. Since verifying some chunks depends on the values of
     other chunks, we may be depending on not-yet-verified data. I don't
     know offhand of any case where this can cause problems, but it's
     one less subtle thing to worry about in the reader code.

  3. It prevents the user from seeing nonsense errors. If we're missing
     an OIDL chunk, then g->num_commits will be zero. And so we may
     complain that the size of our CDAT chunk (which should have a
     fixed-size record for each commit) is wrong unless it's also zero.
     But that's misleading; the problem is the missing OIDL chunk; the
     CDAT one might be fine!

So let's just check the return value from read_chunk(). This is exactly
how the midx chunk-reading code does it.

Signed-off-by: Jeff King <peff@peff.net>
---
 commit-graph.c | 29 +++++++++++++----------------
 1 file changed, 13 insertions(+), 16 deletions(-)

Comments

Taylor Blau Nov. 9, 2023, 9:18 p.m. UTC | #1
On Thu, Nov 09, 2023 at 02:17:11AM -0500, Jeff King wrote:
> The code to read commit-graph files tries to read all of the required
> chunks, but doesn't abort if we can't find one (or if it's corrupted).
> It's only at the end of reading the file that we then do some sanity
> checks for NULL entries. But it's preferable to detect the errors and
> bail immediately, for a few reasons:
>
>   1. It's less error-prone. It's easy in the reader functions to flag an
>      error but still end up setting some struct fields (an error I in
>      fact made while working on this patch series).
>
>   2. It's safer. Since verifying some chunks depends on the values of
>      other chunks, we may be depending on not-yet-verified data. I don't
>      know offhand of any case where this can cause problems, but it's
>      one less subtle thing to worry about in the reader code.
>
>   3. It prevents the user from seeing nonsense errors. If we're missing
>      an OIDL chunk, then g->num_commits will be zero. And so we may
>      complain that the size of our CDAT chunk (which should have a
>      fixed-size record for each commit) is wrong unless it's also zero.
>      But that's misleading; the problem is the missing OIDL chunk; the
>      CDAT one might be fine!
>
> So let's just check the return value from read_chunk(). This is exactly
> how the midx chunk-reading code does it.

All very well explained. I hit that same snag as you did when I was
working on the few patches I proposed we put on top of your earlier
chunk-format hardening series.

I'm glad to see this getting cleaned up, and I'm very happy with the
post-image of this patch.

Thanks,
Taylor
diff mbox series

Patch

diff --git a/commit-graph.c b/commit-graph.c
index 989ebbe816..374575b484 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -291,19 +291,6 @@  static int verify_commit_graph_lite(struct commit_graph *g)
 	 * over g->num_commits, or runs a checksum on the commit-graph
 	 * itself.
 	 */
-	if (!g->chunk_oid_fanout) {
-		error(_("commit-graph required OID fanout chunk missing or corrupted"));
-		return 1;
-	}
-	if (!g->chunk_oid_lookup) {
-		error(_("commit-graph required OID lookup chunk missing or corrupted"));
-		return 1;
-	}
-	if (!g->chunk_commit_data) {
-		error(_("commit-graph required commit data chunk missing or corrupted"));
-		return 1;
-	}
-
 	for (i = 0; i < 255; i++) {
 		uint32_t oid_fanout1 = ntohl(g->chunk_oid_fanout[i]);
 		uint32_t oid_fanout2 = ntohl(g->chunk_oid_fanout[i + 1]);
@@ -462,9 +449,19 @@  struct commit_graph *parse_commit_graph(struct repo_settings *s,
 				   GRAPH_HEADER_SIZE, graph->num_chunks, 1))
 		goto free_and_return;
 
-	read_chunk(cf, GRAPH_CHUNKID_OIDFANOUT, graph_read_oid_fanout, graph);
-	read_chunk(cf, GRAPH_CHUNKID_OIDLOOKUP, graph_read_oid_lookup, graph);
-	read_chunk(cf, GRAPH_CHUNKID_DATA, graph_read_commit_data, graph);
+	if (read_chunk(cf, GRAPH_CHUNKID_OIDFANOUT, graph_read_oid_fanout, graph)) {
+		error(_("commit-graph required OID fanout chunk missing or corrupted"));
+		goto free_and_return;
+	}
+	if (read_chunk(cf, GRAPH_CHUNKID_OIDLOOKUP, graph_read_oid_lookup, graph)) {
+		error(_("commit-graph required OID lookup chunk missing or corrupted"));
+		goto free_and_return;
+	}
+	if (read_chunk(cf, GRAPH_CHUNKID_DATA, graph_read_commit_data, graph)) {
+		error(_("commit-graph required commit data chunk missing or corrupted"));
+		goto free_and_return;
+	}
+
 	pair_chunk(cf, GRAPH_CHUNKID_EXTRAEDGES, &graph->chunk_extra_edges,
 		   &graph->chunk_extra_edges_size);
 	pair_chunk(cf, GRAPH_CHUNKID_BASE, &graph->chunk_base_graphs,