Message ID | 13c49ed40ca72b7ab50939244616f0a90b5bf7f6.1610129796.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | pack-revindex: prepare for on-disk reverse index | expand |
On Fri, Jan 08, 2021 at 01:17:39PM -0500, Taylor Blau wrote: > diff --git a/packfile.c b/packfile.c > index 469c8d4f57..34467ea4a3 100644 > --- a/packfile.c > +++ b/packfile.c > @@ -1692,11 +1692,19 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset, > } > > if (do_check_packed_object_crc && p->index_version > 1) { > - struct revindex_entry *revidx = find_pack_revindex(p, obj_offset); > - off_t len = revidx[1].offset - obj_offset; > - if (check_pack_crc(p, &w_curs, obj_offset, len, revidx->nr)) { > + uint32_t pos, nr; We have "pos" and "nr". What's the difference? :) I think pack_pos and index_pos might be harder to get confused. > + off_t len; > + > + if (offset_to_pack_pos(p, obj_offset, &pos) < 0) { > + data = NULL; > + goto out; > + } Nice to see the error check here. As with the previous commit, we probably want to error(), just as we would for errors below. Do we also need to call mark_bad_packed_object()? I guess we can't, because we only have the offset, and not the oid (the code below uses nth_packed_object_id(), but it is relying on the revindex, which we know just failed to work). I'm just wondering if an error here is going to put us into an infinite loop of retrying the lookup in the same pack over and over. Let's see...our caller is ultimately packed_object_info(), but it too does not have the oid. It returns an error up to do_oid_object_info_extended(). Which yes, does mark_bad_packed_object() itself. Good. So I think we are fine, and arguably these lower-level calls to mark_bad_packed_object() are not necessary. But they do not hurt either. -Peff
On Tue, Jan 12, 2021 at 04:22:26AM -0500, Jeff King wrote: > We have "pos" and "nr". What's the difference? :) > > I think pack_pos and index_pos might be harder to get confused. Much clearer, thanks. > > + off_t len; > > + > > + if (offset_to_pack_pos(p, obj_offset, &pos) < 0) { > > + data = NULL; > > + goto out; > > + } > > Nice to see the error check here. As with the previous commit, we > probably want to error(), just as we would for errors below. Yup, good call. > Do we also need to call mark_bad_packed_object()? I guess we can't, > because we only have the offset, and not the oid (the code below uses > nth_packed_object_id(), but it is relying on the revindex, which we know > just failed to work). > > I'm just wondering if an error here is going to put us into an infinite > loop of retrying the lookup in the same pack over and over. Let's > see...our caller is ultimately packed_object_info(), but it too does not > have the oid. It returns an error up to do_oid_object_info_extended(). > Which yes, does mark_bad_packed_object() itself. Good. So I think we are > fine, and arguably these lower-level calls to mark_bad_packed_object() > are not necessary. But they do not hurt either. Thanks, this rationale is helpful to have. I included an abridged version of it in the patch message. > -Peff Thanks, Taylor
diff --git a/packfile.c b/packfile.c index 469c8d4f57..34467ea4a3 100644 --- a/packfile.c +++ b/packfile.c @@ -1692,11 +1692,19 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset, } if (do_check_packed_object_crc && p->index_version > 1) { - struct revindex_entry *revidx = find_pack_revindex(p, obj_offset); - off_t len = revidx[1].offset - obj_offset; - if (check_pack_crc(p, &w_curs, obj_offset, len, revidx->nr)) { + uint32_t pos, nr; + off_t len; + + if (offset_to_pack_pos(p, obj_offset, &pos) < 0) { + data = NULL; + goto out; + } + + len = pack_pos_to_offset(p, pos + 1) - obj_offset; + nr = pack_pos_to_index(p, pos); + if (check_pack_crc(p, &w_curs, obj_offset, len, nr)) { struct object_id oid; - nth_packed_object_id(&oid, p, revidx->nr); + nth_packed_object_id(&oid, p, nr); error("bad packed object CRC for %s", oid_to_hex(&oid)); mark_bad_packed_object(p, oid.hash); @@ -1779,11 +1787,11 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset, * This is costly but should happen only in the presence * of a corrupted pack, and is better than failing outright. */ - struct revindex_entry *revidx; + uint32_t pos; struct object_id base_oid; - revidx = find_pack_revindex(p, obj_offset); - if (revidx) { - nth_packed_object_id(&base_oid, p, revidx->nr); + if (!(offset_to_pack_pos(p, obj_offset, &pos))) { + nth_packed_object_id(&base_oid, p, + pack_pos_to_index(p, pos)); error("failed to read delta base object %s" " at offset %"PRIuMAX" from %s", oid_to_hex(&base_oid), (uintmax_t)obj_offset,
Remove direct manipulation of the 'struct revindex_entry' type as well as calls to the deprecated API in 'packfile.c:unpack_entry()'. Usual clean-up is performed (replacing '->nr' with calls to 'pack_pos_to_index()' and so on). Add an additional check to make sure that 'obj_offset()' points at a valid object. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- packfile.c | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-)