diff mbox series

[07/10] pack-check: push oid lookup into loop

Message ID 20200224043631.GG1018190@coredump.intra.peff.net (mailing list archive)
State New, archived
Headers show
Series removing nth_packed_object_sha1() | expand

Commit Message

Jeff King Feb. 24, 2020, 4:36 a.m. UTC
When we're checking a pack with fsck or verify-pack, we first sort the
idx entries by offset, since accessing them in pack order is more
efficient. To do so, we loop over them and fill in an array of structs
with the offset, object_id, and index position of each, sort the result,
and only then do we iterate over the sorted array and process each
entry.

In order to avoid the memory cost of storing the hash of each object, we
just store a pointer into the copy in the mmap'd pack index file. To
keep that property even as the rest of the code converted to "struct
object_id", commit 9fd750461b (Convert the verify_pack callback to
struct object_id, 2017-05-06) introduced a union in order to type-pun
the pointer-to-hash into an object_id struct.

But we can make this even simpler by observing that the sort operation
doesn't need the object id at all! We only need them one at a time while
we actually process each entry. So we can just omit the oid from the
struct entirely and load it on the fly into a local variable in the
second loop.

This gets rid of the type-punning, and lets us directly use the more
type-safe nth_packed_object_id(), simplifying the code. And as a bonus,
it saves 8 bytes of memory per object.

Note that this does mean we'll do the offset lookup for each object
before the oid lookup. The oid lookup has more safety checks in it
(e.g., for looking past p->num_objects) which in theory protected the
offset lookup. But since violating those checks was already a BUG()
condition (as described in the previous commit), it's not worth worrying
about.

Signed-off-by: Jeff King <peff@peff.net>
---
 pack-check.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)
diff mbox series

Patch

diff --git a/pack-check.c b/pack-check.c
index 39196ecfbc..dad6d8ae7f 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -8,10 +8,6 @@ 
 
 struct idx_entry {
 	off_t                offset;
-	union idx_entry_object {
-		const unsigned char *hash;
-		struct object_id *oid;
-	} oid;
 	unsigned int nr;
 };
 
@@ -97,30 +93,31 @@  static int verify_packfile(struct repository *r,
 	entries[nr_objects].offset = pack_sig_ofs;
 	/* first sort entries by pack offset, since unpacking them is more efficient that way */
 	for (i = 0; i < nr_objects; i++) {
-		entries[i].oid.hash = nth_packed_object_sha1(p, i);
-		if (!entries[i].oid.hash)
-			BUG("unable to get oid of object %lu from %s",
-			    (unsigned long)i, p->pack_name);
 		entries[i].offset = nth_packed_object_offset(p, i);
 		entries[i].nr = i;
 	}
 	QSORT(entries, nr_objects, compare_entries);
 
 	for (i = 0; i < nr_objects; i++) {
 		void *data;
+		struct object_id oid;
 		enum object_type type;
 		unsigned long size;
 		off_t curpos;
 		int data_valid;
 
+		if (nth_packed_object_id(&oid, p, entries[i].nr) < 0)
+			BUG("unable to get oid of object %lu from %s",
+			    (unsigned long)entries[i].nr, p->pack_name);
+
 		if (p->index_version > 1) {
 			off_t offset = entries[i].offset;
 			off_t len = entries[i+1].offset - offset;
 			unsigned int nr = entries[i].nr;
 			if (check_pack_crc(p, w_curs, offset, len, nr))
 				err = error("index CRC mismatch for object %s "
 					    "from %s at offset %"PRIuMAX"",
-					    oid_to_hex(entries[i].oid.oid),
+					    oid_to_hex(&oid),
 					    p->pack_name, (uintmax_t)offset);
 		}
 
@@ -143,14 +140,14 @@  static int verify_packfile(struct repository *r,
 
 		if (data_valid && !data)
 			err = error("cannot unpack %s from %s at offset %"PRIuMAX"",
-				    oid_to_hex(entries[i].oid.oid), p->pack_name,
+				    oid_to_hex(&oid), p->pack_name,
 				    (uintmax_t)entries[i].offset);
-		else if (check_object_signature(r, entries[i].oid.oid, data, size, type_name(type)))
+		else if (check_object_signature(r, &oid, data, size, type_name(type)))
 			err = error("packed %s from %s is corrupt",
-				    oid_to_hex(entries[i].oid.oid), p->pack_name);
+				    oid_to_hex(&oid), p->pack_name);
 		else if (fn) {
 			int eaten = 0;
-			err |= fn(entries[i].oid.oid, type, size, data, &eaten);
+			err |= fn(&oid, type, size, data, &eaten);
 			if (eaten)
 				data = NULL;
 		}