diff mbox series

[v13,2/7] object-file.c: do fsync() and close() before post-write die()

Message ID patch-v13-2.7-b3568f0c5c0-20220604T095113Z-avarab@gmail.com (mailing list archive)
State New, archived
Headers show
Series unpack-objects: support streaming blobs to disk | expand

Commit Message

Ævar Arnfjörð Bjarmason June 4, 2022, 10:10 a.m. UTC
Change write_loose_object() to do an fsync() and close() before the
oideq() sanity check at the end. This change re-joins code that was
split up by the die() sanity check added in 748af44c63e (sha1_file: be
paranoid when creating loose objects, 2010-02-21).

I don't think that this change matters in itself, if we called die()
it was possible that our data wouldn't fully make it to disk, but in
any case we were writing data that we'd consider corrupted. It's
possible that a subsequent "git fsck" will be less confused now.

The real reason to make this change is that in a subsequent commit
we'll split this code in write_loose_object() into a utility function,
all its callers will want the preceding sanity checks, but not the
"oideq" check. By moving the close_loose_object() earlier it'll be
easier to reason about the introduction of the utility function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-file.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Junio C Hamano June 6, 2022, 6:45 p.m. UTC | #1
Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Change write_loose_object() to do an fsync() and close() before the
> oideq() sanity check at the end. This change re-joins code that was
> split up by the die() sanity check added in 748af44c63e (sha1_file: be
> paranoid when creating loose objects, 2010-02-21).
>
> I don't think that this change matters in itself, if we called die()
> it was possible that our data wouldn't fully make it to disk, but in
> any case we were writing data that we'd consider corrupted. It's
> possible that a subsequent "git fsck" will be less confused now.

write_loose_object() 

 - prepares a temporary file
 - deflates into the temporary file
 - closes and syncs it
 - moves the temporary file to the final locaiton

And any die() inserted in between any of these steps will cause the
corrupt temporary file not to become the final loose object file.

So, "git fsck" does not need this change at all.

> The real reason to make this change is that in a subsequent commit
> we'll split this code in write_loose_object() into a utility function,
> all its callers will want the preceding sanity checks, but not the
> "oideq" check. By moving the close_loose_object() earlier it'll be
> easier to reason about the introduction of the utility function.

If a "split" relies on the "close and sync" step being in any
particular place, that smells really fishy.  Is the series loosening
the object integrity check?  Are we adding some exploitable hole
into our codebase without people knowing, or something?  I am not
sure if I am following the above logic.



> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  object-file.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/object-file.c b/object-file.c
> index 79eb8339b60..e4a83012ba4 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -2012,12 +2012,12 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
>  		die(_("deflateEnd on object %s failed (%d)"), oid_to_hex(oid),
>  		    ret);
>  	the_hash_algo->final_oid_fn(&parano_oid, &c);
> +	close_loose_object(fd, tmp_file.buf);
> +
>  	if (!oideq(oid, &parano_oid))
>  		die(_("confused by unstable object source data for %s"),
>  		    oid_to_hex(oid));
>  
> -	close_loose_object(fd, tmp_file.buf);
> -
>  	if (mtime) {
>  		struct utimbuf utb;
>  		utb.actime = mtime;
diff mbox series

Patch

diff --git a/object-file.c b/object-file.c
index 79eb8339b60..e4a83012ba4 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2012,12 +2012,12 @@  static int write_loose_object(const struct object_id *oid, char *hdr,
 		die(_("deflateEnd on object %s failed (%d)"), oid_to_hex(oid),
 		    ret);
 	the_hash_algo->final_oid_fn(&parano_oid, &c);
+	close_loose_object(fd, tmp_file.buf);
+
 	if (!oideq(oid, &parano_oid))
 		die(_("confused by unstable object source data for %s"),
 		    oid_to_hex(oid));
 
-	close_loose_object(fd, tmp_file.buf);
-
 	if (mtime) {
 		struct utimbuf utb;
 		utb.actime = mtime;