Message ID | patch-v6-08.22-38e4266772d-20210907T104559Z-avarab@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | fsck: lib-ify object-file.c & better fsck "invalid object" error reporting | expand |
On Tue, Sep 07, 2021 at 12:58:03PM +0200, Ævar Arnfjörð Bjarmason wrote: > When the loose_object_info() function returns an error stop faking up > the "oi->typep" to OBJ_BAD. Let the return value of the function > itself suffice. This code cleanup simplifies subsequent changes. The obvious danger (which you mention) is that somebody is relying on what typep points to, and is reading it even if we returned non-zero from whatever called this function. Hopefully nobody is, but this change makes me a little uncomfortable nonetheless, since there are so many potential callers (even though this function has only one caller, it doesn't take long before the number of indirect callers explodes). So it would be nice if we could do without it, but you claim that it simplifies changes that happen later on. So let's continue to see if we really do need it... Thanks, Taylor
On Thu, Sep 16, 2021 at 05:29:30PM -0400, Taylor Blau wrote: > On Tue, Sep 07, 2021 at 12:58:03PM +0200, Ævar Arnfjörð Bjarmason wrote: > > When the loose_object_info() function returns an error stop faking up > > the "oi->typep" to OBJ_BAD. Let the return value of the function > > itself suffice. This code cleanup simplifies subsequent changes. > > The obvious danger (which you mention) is that somebody is relying on > what typep points to, and is reading it even if we returned non-zero > from whatever called this function. > > Hopefully nobody is, but this change makes me a little uncomfortable > nonetheless, since there are so many potential callers (even though this > function has only one caller, it doesn't take long before the number of > indirect callers explodes). > > So it would be nice if we could do without it, but you claim that it > simplifies changes that happen later on. So let's continue to see if we > really do need it... I'm actually reasonable comfortable with this patch. If we return an error from the *_object_info() functions, then I think all bets are off on what is in the resulting object_info struct. E.g., we'd already leave sizep uninitialized in such a case. It feels like oi->typep may be a little bit special because we conflate "error" and "type" in the return from oid_object_info(). But oid_object_info_extended() does not do that, and the innards of oid_object_info() do the right thing. Of course we _have_ been setting typep in this way for a while, so it's worth making sure nobody is depending on. Notably packed_object_info() does not behave in this way; if it hits an error, typep may be left unset. So any oid_object_info_extended() callers depending on this were already potentially buggy. I'd be OK with a quick sweep of the hits of "git grep typep" here. I just did that, and all the sites look pretty reasonable (they call oid_object_info_extended() and bail as soon as they see that it fails). -Peff
diff --git a/object-file.c b/object-file.c index a8be8994814..bda3497d5ca 100644 --- a/object-file.c +++ b/object-file.c @@ -1503,8 +1503,6 @@ static int loose_object_info(struct repository *r, git_inflate_end(&stream); munmap(map, mapsize); - if (status && oi->typep) - *oi->typep = status; if (oi->sizep == &size_scratch) oi->sizep = NULL; strbuf_release(&hdrbuf);
When the loose_object_info() function returns an error stop faking up the "oi->typep" to OBJ_BAD. Let the return value of the function itself suffice. This code cleanup simplifies subsequent changes. That we set this at all is a relic from the past. Before 052fe5eaca9 (sha1_loose_object_info: make type lookup optional, 2013-07-12) we would always return the type_from_string(type) via the parse_sha1_header() function, or -1 (i.e. OBJ_BAD) if we couldn't parse it. Then in a combination of 46f034483eb (sha1_file: support reading from a loose object of unknown type, 2015-05-03) and b3ea7dd32d6 (sha1_loose_object_info: handle errors from unpack_sha1_rest, 2017-10-05) our API drifted even further towards conflating the two again. Having read the code paths involved carefully I think this is OK. We are just about to return -1, and we have only one caller: do_oid_object_info_extended(). That function will in turn go on to return -1 when we return -1 here. This might be introducing a subtle bug where a caller of oid_object_info_extended() would inspect its "typep" and expect a meaningful value if the function returned -1. Such a problem would not occur for its simpler oid_object_info() sister function. That one always returns the "enum object_type", which in the case of -1 would be the OBJ_BAD. Having read the code for all the callers of these functions I don't believe any such bug is being introduced here, and in any case we'd likely already have such a bug for the "sizep" member (although blindly checking "typep" first would be a more common case). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- object-file.c | 2 -- 1 file changed, 2 deletions(-)