Message ID | 20200513005424.81369-26-sandals@crustytoothpaste.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | SHA-256 part 2/3: protocol functionality | expand |
On Wed, 13 May 2020 at 02:56, brian m. carlson <sandals@crustytoothpaste.net> wrote: > > Both v2 pack index files and the v3 format specified as part of the > NewHash work have similar data starting at the CRC table. Much of the > existing code wants to read either this table or the offset entries > following it, and in doing so computes the offset each time. > > In order to share as much code between v2 and v3, compute the offset of > the CRC table and store it when the pack is opened. Use this value to > compute offsets to not only the CRC table, but to the offset entries > beyond it. > --- a/builtin/index-pack.c > +++ b/builtin/index-pack.c > @@ -1555,13 +1555,9 @@ static void read_v2_anomalous_offsets(struct packed_git *p, > { > const uint32_t *idx1, *idx2; > uint32_t i; > - const uint32_t hashwords = the_hash_algo->rawsz / sizeof(uint32_t); > > /* The address of the 4-byte offset table */ > - idx1 = (((const uint32_t *)p->index_data) > - + 2 /* 8-byte header */ > - + 256 /* fan out */ > - + hashwords * p->num_objects /* object ID table */ > + idx1 = (((const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset)) > + p->num_objects /* CRC32 table */ > ); This counts in four-byte words (so `+ 2` skips ahead 8B as the comment notes). And that's why we need to use "rawsz/4". Not new in this patch, but that outer pair of parenthesis just makes this harder to read, IMHO. I keep scanning back and forth wondering, "where is this whole thing going to get multiplied or something?" idx1 = (const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset) + p->num_objects /* CRC32 table */; The double-casting can be avoided with something like this, but I'm not sure it's really any better: idx1 = (const uint32_t *)p->index_data + p->crc_offset/sizeof(uint32_t) + p->num_objects /* CRC32 table */; > --- a/packfile.c > +++ b/packfile.c > @@ -178,6 +178,7 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map, > */ > (sizeof(off_t) <= 4)) > return error("pack too large for current definition of off_t in %s", path); > + p->crc_offset = 8 + 4 * 256 + nr * hashsz; > } > > p->index_version = version; It doesn't fit in the context, but `nr` will be assigned to `p->num_objects`. And now we can just use `hashsz` without dividing by 4, so this does the same calculation as the old one above. Martin
diff --git a/builtin/index-pack.c b/builtin/index-pack.c index f176dd28c8..7bea1fba52 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1555,13 +1555,9 @@ static void read_v2_anomalous_offsets(struct packed_git *p, { const uint32_t *idx1, *idx2; uint32_t i; - const uint32_t hashwords = the_hash_algo->rawsz / sizeof(uint32_t); /* The address of the 4-byte offset table */ - idx1 = (((const uint32_t *)p->index_data) - + 2 /* 8-byte header */ - + 256 /* fan out */ - + hashwords * p->num_objects /* object ID table */ + idx1 = (((const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset)) + p->num_objects /* CRC32 table */ ); diff --git a/object-store.h b/object-store.h index d1e490f203..f439d47af8 100644 --- a/object-store.h +++ b/object-store.h @@ -70,6 +70,7 @@ struct packed_git { size_t index_size; uint32_t num_objects; uint32_t num_bad_objects; + uint32_t crc_offset; unsigned char *bad_object_sha1; int index_version; time_t mtime; diff --git a/packfile.c b/packfile.c index f4e752996d..6ab5233613 100644 --- a/packfile.c +++ b/packfile.c @@ -178,6 +178,7 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map, */ (sizeof(off_t) <= 4)) return error("pack too large for current definition of off_t in %s", path); + p->crc_offset = 8 + 4 * 256 + nr * hashsz; } p->index_version = version;
Both v2 pack index files and the v3 format specified as part of the NewHash work have similar data starting at the CRC table. Much of the existing code wants to read either this table or the offset entries following it, and in doing so computes the offset each time. In order to share as much code between v2 and v3, compute the offset of the CRC table and store it when the pack is opened. Use this value to compute offsets to not only the CRC table, but to the offset entries beyond it. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> --- builtin/index-pack.c | 6 +----- object-store.h | 1 + packfile.c | 1 + 3 files changed, 3 insertions(+), 5 deletions(-)