mbox series

[v2,0/9] packfile: avoid .idx rename races

Message ID cover.1631228928.git.me@ttaylorr.com (mailing list archive)
Headers show
Series packfile: avoid .idx rename races | expand

Message

Taylor Blau Sept. 9, 2021, 11:24 p.m. UTC
Here is a relatively small reroll of mine and Ævar's series to prevent
packfile-related races when moving the `.idx` into place before other auxiliary
files are renamed.

All changes are cosmetic, but all of the feedback on the previous round was a
strict improvement in the overall quality, so it seemed pertinent to send an
improved version. Range-diff is below, and thanks in advance for your review.

Taylor Blau (4):
  bulk-checkin.c: store checksum directly
  pack-write.c: rename `.idx` files after `*.rev`
  builtin/repack.c: move `.idx` files into place last
  builtin/index-pack.c: move `.idx` files into place last

Ævar Arnfjörð Bjarmason (5):
  pack.h: line-wrap the definition of finish_tmp_packfile()
  pack-write: refactor renaming in finish_tmp_packfile()
  index-pack: refactor renaming in final()
  pack-write: split up finish_tmp_packfile() function
  pack-objects: rename .idx files into place after .bitmap files

 builtin/index-pack.c   | 48 +++++++++++++++++------------------
 builtin/pack-objects.c | 15 ++++++++---
 builtin/repack.c       |  2 +-
 bulk-checkin.c         | 31 +++++++++++++++++------
 pack-write.c           | 57 +++++++++++++++++++++---------------------
 pack.h                 | 10 +++++++-
 6 files changed, 96 insertions(+), 67 deletions(-)

Range-diff against v1:
 -:  ---------- >  1:  0b07aa4947 pack.h: line-wrap the definition of finish_tmp_packfile()
 1:  20b35ce050 !  2:  c46d3c29b4 bulk-checkin.c: store checksum directly
    @@ Commit message
         Store the hash directly in an unsigned char array. This behaves the same
         as writing to the `hash` member, but makes the intent clearer (and
         avoids allocating an extra four bytes for the `algo` member of `struct
    -    object_id`).
    +    object_id`). It likewise prevents the possibility of a segfault when
    +    reading `algo` (e.g., by calling `oid_to_hex()`) if it is uninitialized.
     
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
 2:  35052ef494 !  3:  2e907e309d pack-write: refactor renaming in finish_tmp_packfile()
    @@ pack-write.c: struct hashfile *create_tmp_packfile(char **pack_tmp_name)
      	return hashfd(fd, *pack_tmp_name);
      }
      
    -+static void rename_tmp_packfile(struct strbuf *nb, const char *source,
    ++static void rename_tmp_packfile(struct strbuf *name_prefix, const char *source,
     +				const char *ext)
     +{
    -+	size_t nb_len = nb->len;
    ++	size_t name_prefix_len = name_prefix->len;
     +
    -+	strbuf_addstr(nb, ext);
    -+	if (rename(source, nb->buf))
    -+		die_errno("unable to rename temporary '*.%s' file to '%s'",
    -+			  ext, nb->buf);
    -+	strbuf_setlen(nb, nb_len);
    ++	strbuf_addstr(name_prefix, ext);
    ++	if (rename(source, name_prefix->buf))
    ++		die_errno("unable to rename temporary file to '%s'",
    ++			  name_prefix->buf);
    ++	strbuf_setlen(name_prefix, name_prefix_len);
     +}
     +
      void finish_tmp_packfile(struct strbuf *name_buffer,
 3:  0fb2c25f5a =  4:  41d34b1f18 pack-write.c: rename `.idx` files after `*.rev`
 4:  3b10a97ec0 =  5:  6b340b7eba builtin/repack.c: move `.idx` files into place last
 5:  3c9b515907 !  6:  1e4d0ea0f3 index-pack: refactor renaming in final()
    @@ Commit message
         helper, this is both a net decrease in lines, and improves the
         readability, since we can easily see at a glance that the logic for
         writing these three types of files is exactly the same, aside from the
    -    obviously differing cases of "*final_xyz_name" being NULL, and
    -    "else_chmod_if" being different.
    +    obviously differing cases of "*final_name" being NULL, and
    +    "make_read_only_if_same" being different.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
    @@ builtin/index-pack.c: static void write_special_file(const char *suffix, const c
      	strbuf_release(&name_buf);
      }
      
    -+static void rename_tmp_packfile(const char **final_xyz_name,
    -+				const char *curr_xyz_name,
    -+				struct strbuf *xyz_name, unsigned char *hash,
    -+				const char *ext, int else_chmod_if)
    ++static void rename_tmp_packfile(const char **final_name,
    ++				const char *curr_name,
    ++				struct strbuf *name, unsigned char *hash,
    ++				const char *ext, int make_read_only_if_same)
     +{
    -+	if (*final_xyz_name != curr_xyz_name) {
    -+		if (!*final_xyz_name)
    -+			*final_xyz_name = odb_pack_name(xyz_name, hash, ext);
    -+		if (finalize_object_file(curr_xyz_name, *final_xyz_name))
    ++	if (*final_name != curr_name) {
    ++		if (!*final_name)
    ++			*final_name = odb_pack_name(name, hash, ext);
    ++		if (finalize_object_file(curr_name, *final_name))
     +			die(_("unable to rename temporary '*.%s' file to '%s"),
    -+			    ext, *final_xyz_name);
    -+	} else if (else_chmod_if) {
    -+		chmod(*final_xyz_name, 0444);
    ++			    ext, *final_name);
    ++	} else if (make_read_only_if_same) {
    ++		chmod(*final_name, 0444);
     +	}
     +}
     +
 6:  8d67a71501 =  7:  906e75d707 builtin/index-pack.c: move `.idx` files into place last
 7:  5c553229b0 !  8:  90bebe4e51 pack-write: split up finish_tmp_packfile() function
    @@ bulk-checkin.c: static struct bulk_checkin_state {
      	unsigned char hash[GIT_MAX_RAWSZ];
     
      ## pack-write.c ##
    -@@ pack-write.c: static void rename_tmp_packfile(struct strbuf *nb, const char *source,
    - 	strbuf_setlen(nb, nb_len);
    +@@ pack-write.c: static void rename_tmp_packfile(struct strbuf *name_prefix, const char *source,
    + 	strbuf_setlen(name_prefix, name_prefix_len);
      }
      
     -void finish_tmp_packfile(struct strbuf *name_buffer,
 8:  d8286cf107 !  9:  1409725509 pack-objects: rename .idx files into place after .bitmap files
    @@ Commit message
         about filesystem races in the face of doing and not doing fsync() (and
         if doing fsync(), not doing it properly).
     
    -    In particular, in this case of writing to ".git/objects/pack" we only
    -    write and fsync() the individual files, but if we wanted to guarantee
    -    that the metadata update was seen in that way by concurrent processes
    -    we'd need to fsync() on the "fd" of the containing directory. That
    -    concern is probably more theoretical than not, modern OS's tend to be
    -    more on the forgiving side than the overly pedantic side of
    -    implementing POSIX FS semantics.
    +    We may want to fsync the containing directory once after renaming the
    +    *.idx file into place, but that is outside the scope of this series.
     
         1. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/

Comments

Junio C Hamano Sept. 10, 2021, 1:35 a.m. UTC | #1
Taylor Blau <me@ttaylorr.com> writes:

> Here is a relatively small reroll of mine and Ævar's series to prevent
> packfile-related races when moving the `.idx` into place before other auxiliary
> files are renamed.

Will replace.  I didn't see anything questionable in range-diff (of
course, lack of anything is hard to spot but at least there is no
new iffy stuff and there wasn't anything iffy in the original, so
... ;-)