Message ID | patch-v2-4.4-70f4a9767d-20210908T003631Z-avarab@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | rename *.idx file into place last (also after *.bitmap) | expand |
On Wed, Sep 08 2021, Ævar Arnfjörð Bjarmason wrote: > Follow-up a preceding commit (pack-write.c: rename `.idx` file into > place last, 2021-08-16)[1] and rename the *.idx file in-place after we > write the *.bitmap. The preceding commit fixed the issue of *.idx > being written before *.rev files, but did not do so for *.idx files. > > See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) > for commentary at the time when *.bitmap was implemented about how > those files are written out, nothing in that commit contradicts what's > being done here. > > Note that the referenced earlier commit[1] is overly optimistic about > "clos[ing the] race", i.e. yes we'll now write the files in the right > order, but we might still race due to our sloppy use of fsync(). See > the thread at [2] for a rabbit hole of various discussions about > filesystem races in the face of doing and not doing fsync() (and if > doing fsync(), not doing it properly). Actually I think it's a bit worse than that, we will unconditionally fsync() the *.pack we write out, but in stage_tmp_packfiles() (the behavior pre-dates both this series and its parent, I just think my stage_tmp_packfiles() is easier to follow) we'll not write the *.idx file with fsync() since we won't pass WRITE_IDX_VERIFY. The same goes for *.rev (which oddly makes its fsync() conditional on WRITE_IDX_VERIFY), but not *.bitmap, which fsyncs unconditionally just like *.pack does. And then of course we'll do all these in-place renames but nothing fsyncs the fd of the directory, so the metadata and new names being committed to disk & visible to other processes is anyone's guess. But not only is that metadata commit not made, but due to the inconsistent fsync() we might end up with an *.idx that's partial and renamed in-place. In any case, any such issues pre-date this series and the series by Taylor it depends on, just adding some #leftoverbits for future fsync() fixes since I spent time looking into it.
On Wed, Sep 08, 2021 at 02:38:28AM +0200, Ævar Arnfjörð Bjarmason wrote: > Follow-up a preceding commit (pack-write.c: rename `.idx` file into > place last, 2021-08-16)[1] and rename the *.idx file in-place after we > write the *.bitmap. The preceding commit fixed the issue of *.idx > being written before *.rev files, but did not do so for *.idx files. s/idx/bitmap/ in the last line, I believe. Otherwise this patch looks good to me. Thanks, Taylor
On Wed, Sep 08 2021, Ævar Arnfjörð Bjarmason wrote: > On Wed, Sep 08 2021, Ævar Arnfjörð Bjarmason wrote: > >> Follow-up a preceding commit (pack-write.c: rename `.idx` file into >> place last, 2021-08-16)[1] and rename the *.idx file in-place after we >> write the *.bitmap. The preceding commit fixed the issue of *.idx >> being written before *.rev files, but did not do so for *.idx files. >> >> See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) >> for commentary at the time when *.bitmap was implemented about how >> those files are written out, nothing in that commit contradicts what's >> being done here. >> >> Note that the referenced earlier commit[1] is overly optimistic about >> "clos[ing the] race", i.e. yes we'll now write the files in the right >> order, but we might still race due to our sloppy use of fsync(). See >> the thread at [2] for a rabbit hole of various discussions about >> filesystem races in the face of doing and not doing fsync() (and if >> doing fsync(), not doing it properly). > > Actually I think it's a bit worse than that, we will unconditionally > fsync() the *.pack we write out, but in stage_tmp_packfiles() (the > behavior pre-dates both this series and its parent, I just think my > stage_tmp_packfiles() is easier to follow) we'll not write the *.idx > file with fsync() since we won't pass WRITE_IDX_VERIFY. > > The same goes for *.rev (which oddly makes its fsync() conditional on > WRITE_IDX_VERIFY), but not *.bitmap, which fsyncs unconditionally just > like *.pack does. I was confused here, we use fsync() with *.bitmap because there's no --verify mode for it, i.e. we'll always fsync() the *.rev files if we're actually writing them "for real", the non-fsync() path is when we're doing the "noop-write" to /dev/null with "index-pack --verify", i.e. writing to an FD without an associated "real" filename. That we use WRITE_IDX_VERIFY to decide on that fsync() for the *.rev files, as opposed to WRITE_REV_VERIFY is a bug in 8ef50d9958 (pack-write.c: prepare to write 'pack-*.rev' files, 2021-01-25). In practice if WRITE_REV_VERIFY is true then WRITE_IDX_VERIFY is true as well. See e37d0b8730 (builtin/index-pack.c: write reverse indexes, 2021-01-25) and its interaction with 68be2fea50 (receive-pack, fetch-pack: reject bogus pack that records objects twice, 2011-11-16). I.e. both flags are set if --verify is there. So it's not a bug as far as the behavior of the program goes, it just makes for some very confusing code. Why should *.rev fsyncing be contingent on whether or not we want to fsync *.idx files on writes? It shouldn't, but it turns out those two flags always go hand in hand if we're writing *.rev at all. But re the thread started at <patch-1.1-366ba928bd-20210908T010743Z-avarab@gmail.com> about how exactly we inspect these flags I think we should just take that change as-is for now, and refactor & fix this properly later. Once write_rev_file{_order}() don't pretend to take the trie-state bitflags of WRITE_REV|WRITE_REV_VERIFY|WRITE_IDX_VERIFY but just take a boolean "int write_file" the whole callchain & confusion around "do we have more than 1 flag?" goes away. I.e. something like the end-state shown by this patch, which I'm not submitting now because it conflicts with the other in-flight topic (and it's on top of some trivial but not-included whitespace changes in pack.h): diff --git a/builtin/fast-import.c b/builtin/fast-import.c index 20406f6775..69301fc044 100644 --- a/builtin/fast-import.c +++ b/builtin/fast-import.c @@ -778,7 +778,7 @@ static const char *create_index(void) die("internal consistency error creating the index"); tmpfile = write_idx_file(NULL, idx, object_count, &pack_idx_opts, - pack_data->hash); + pack_data->hash, 0); free(idx); return tmpfile; } diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 8336466865..b9c2a21216 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1590,12 +1590,9 @@ static int git_index_pack_config(const char *k, const char *v, void *cb) } return 0; } - if (!strcmp(k, "pack.writereverseindex")) { - if (git_config_bool(k, v)) - opts->flags |= WRITE_REV; - else - opts->flags &= ~WRITE_REV; - } + if (!strcmp(k, "pack.writereverseindex")) + opts->write_rev = git_config_bool(k, v); + return git_default_config(k, v, cb); } @@ -1713,7 +1710,7 @@ static void show_pack_info(int stat_only) int cmd_index_pack(int argc, const char **argv, const char *prefix) { - int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index; + int i, fix_thin_pack = 0, verify = 0, stat_only = 0; const char *curr_index; const char *curr_rev_index = NULL; const char *index_name = NULL, *pack_name = NULL, *rev_index_name = NULL; @@ -1747,10 +1744,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (prefix && chdir(prefix)) die(_("Cannot come back to cwd")); - if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) - rev_index = 1; - else - rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV)); + opts.write_rev = git_env_bool(GIT_TEST_WRITE_REV_INDEX, + opts.write_rev); for (i = 1; i < argc; i++) { const char *arg = argv[i]; @@ -1831,9 +1826,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) die(_("unknown hash algorithm '%s'"), arg); repo_set_hash_algo(the_repository, hash_algo); } else if (!strcmp(arg, "--rev-index")) { - rev_index = 1; + opts.write_rev = 1; } else if (!strcmp(arg, "--no-rev-index")) { - rev_index = 0; + opts.write_rev = 0; } else usage(index_pack_usage); continue; @@ -1855,9 +1850,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (!index_name && pack_name) index_name = derive_filename(pack_name, "pack", "idx", &index_name_buf); - opts.flags &= ~(WRITE_REV | WRITE_REV_VERIFY); - if (rev_index) { - opts.flags |= verify ? WRITE_REV_VERIFY : WRITE_REV; + if (opts.write_rev) { if (index_name) rev_index_name = derive_filename(index_name, "idx", "rev", @@ -1868,10 +1861,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (!index_name) die(_("--verify with no packfile name given")); read_idx_option(&opts, index_name); - opts.flags |= WRITE_IDX_VERIFY | WRITE_IDX_STRICT; } if (strict) - opts.flags |= WRITE_IDX_STRICT; + opts.write_idx_strict = 1; if (HAVE_THREADS && !nr_threads) { nr_threads = online_cpus(); @@ -1915,11 +1907,12 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) ALLOC_ARRAY(idx_objects, nr_objects); for (i = 0; i < nr_objects; i++) idx_objects[i] = &objects[i].idx; - curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts, pack_hash); - if (rev_index) + curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts, + pack_hash, verify); + if (opts.write_rev) curr_rev_index = write_rev_file(rev_index_name, idx_objects, nr_objects, pack_hash, - opts.flags); + verify); free(idx_objects); if (!verify) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index df49f656b9..1dae01800f 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -3139,10 +3139,7 @@ static int git_pack_config(const char *k, const char *v, void *cb) return 0; } if (!strcmp(k, "pack.writereverseindex")) { - if (git_config_bool(k, v)) - pack_idx_opts.flags |= WRITE_REV; - else - pack_idx_opts.flags &= ~WRITE_REV; + pack_idx_opts.write_rev = git_config_bool(k, v); return 0; } if (!strcmp(k, "uploadpack.blobpackfileuri")) { @@ -4030,8 +4027,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) reset_pack_idx_option(&pack_idx_opts); git_config(git_pack_config, NULL); - if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) - pack_idx_opts.flags |= WRITE_REV; + pack_idx_opts.write_rev = git_env_bool(GIT_TEST_WRITE_REV_INDEX, + pack_idx_opts.write_rev); progress = isatty(2); argc = parse_options(argc, argv, prefix, pack_objects_options, diff --git a/midx.c b/midx.c index 321c6fdd2f..63da740cd4 100644 --- a/midx.c +++ b/midx.c @@ -874,7 +874,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash, strbuf_addf(&buf, "%s-%s.rev", midx_name, hash_to_hex(midx_hash)); tmp_file = write_rev_file_order(NULL, ctx->pack_order, ctx->entries_nr, - midx_hash, WRITE_REV); + midx_hash, 0); if (finalize_object_file(tmp_file, buf.buf)) die(_("cannot store reverse index file")); diff --git a/pack-write.c b/pack-write.c index 1883848e7c..4f54191499 100644 --- a/pack-write.c +++ b/pack-write.c @@ -44,7 +44,7 @@ static int need_large_offset(off_t offset, const struct pack_idx_option *opts) */ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects, int nr_objects, const struct pack_idx_option *opts, - const unsigned char *sha1) + const unsigned char *sha1, int verify) { struct hashfile *f; struct pack_idx_entry **sorted_by_sha, **list, **last; @@ -65,7 +65,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec else sorted_by_sha = list = last = NULL; - if (opts->flags & WRITE_IDX_VERIFY) { + if (verify) { assert(index_name); f = hashfd_check(index_name); } else { @@ -119,7 +119,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec if (index_version < 2) hashwrite_be32(f, obj->offset); hashwrite(f, obj->oid.hash, the_hash_algo->rawsz); - if ((opts->flags & WRITE_IDX_STRICT) && + if (opts->write_idx_strict && (i && oideq(&list[-2]->oid, &obj->oid))) die("The same object %s appears twice in the pack", oid_to_hex(&obj->oid)); @@ -162,8 +162,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec hashwrite(f, sha1, the_hash_algo->rawsz); finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_CLOSE | - ((opts->flags & WRITE_IDX_VERIFY) - ? 0 : CSUM_FSYNC)); + (verify ? 0 : CSUM_FSYNC)); return index_name; } @@ -218,22 +217,19 @@ const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, - unsigned flags) + int verify) { uint32_t *pack_order; uint32_t i; const char *ret; - if (!(flags & WRITE_REV) && !(flags & WRITE_REV_VERIFY)) - return NULL; - ALLOC_ARRAY(pack_order, nr_objects); for (i = 0; i < nr_objects; i++) pack_order[i] = i; QSORT_S(pack_order, nr_objects, pack_order_cmp, objects); ret = write_rev_file_order(rev_name, pack_order, nr_objects, hash, - flags); + verify); free(pack_order); @@ -244,15 +240,12 @@ const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, - unsigned flags) + int verify) { struct hashfile *f; int fd; - if ((flags & WRITE_REV) && (flags & WRITE_REV_VERIFY)) - die(_("cannot both write and verify reverse index")); - - if (flags & WRITE_REV) { + if (!verify) { if (!rev_name) { struct strbuf tmp_file = STRBUF_INIT; fd = odb_mkstemp(&tmp_file, "pack/tmp_rev_XXXXXX"); @@ -264,7 +257,7 @@ const char *write_rev_file_order(const char *rev_name, die_errno("unable to create '%s'", rev_name); } f = hashfd(fd, rev_name); - } else if (flags & WRITE_REV_VERIFY) { + } else { struct stat statbuf; if (stat(rev_name, &statbuf)) { if (errno == ENOENT) { @@ -274,8 +267,7 @@ const char *write_rev_file_order(const char *rev_name, die_errno(_("could not stat: %s"), rev_name); } f = hashfd_check(rev_name); - } else - return NULL; + } write_rev_header(f); @@ -286,7 +278,7 @@ const char *write_rev_file_order(const char *rev_name, die(_("failed to make %s readable"), rev_name); finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_CLOSE | - ((flags & WRITE_IDX_VERIFY) ? 0 : CSUM_FSYNC)); + (verify ? 0 : CSUM_FSYNC)); return rev_name; } @@ -479,12 +471,13 @@ void finish_tmp_packfile(struct strbuf *name_buffer, die_errno("unable to make temporary pack file readable"); idx_tmp_name = write_idx_file(NULL, written_list, nr_written, - pack_idx_opts, hash); + pack_idx_opts, hash, 0); if (adjust_shared_perm(idx_tmp_name)) die_errno("unable to make temporary index file readable"); - rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, - pack_idx_opts->flags); + if (pack_idx_opts->write_rev) + rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, + 0); strbuf_addf(name_buffer, "%s.pack", hash_to_hex(hash)); diff --git a/pack.h b/pack.h index aa23be74df..e096dea4c9 100644 --- a/pack.h +++ b/pack.h @@ -38,12 +38,8 @@ struct pack_header { #define PACK_IDX_SIGNATURE 0xff744f63 /* "\377tOc" */ struct pack_idx_option { - unsigned flags; - /* flag bits */ -#define WRITE_IDX_VERIFY 01 /* verify only, do not write the idx file */ -#define WRITE_IDX_STRICT 02 -#define WRITE_REV 04 -#define WRITE_REV_VERIFY 010 + unsigned int write_idx_strict:1, + write_rev:1; uint32_t version; uint32_t off32_limit; @@ -84,7 +80,8 @@ typedef int (*verify_fn)(const struct object_id *, enum object_type, unsigned lo const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects, int nr_objects, const struct pack_idx_option *opts, - const unsigned char *sha1); + const unsigned char *sha1, + int verify); int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr); int verify_pack_index(struct packed_git *); int verify_pack(struct repository *, struct packed_git *, verify_fn fn, struct progress *, uint32_t); @@ -99,11 +96,11 @@ void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_ const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, - const unsigned char *hash, unsigned flags); + const unsigned char *hash, int verify); const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, - unsigned flags); + int verify); /* * The "hdr" output buffer should be at least this big, which will handle sizes
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 33567aaa74..396240c25a 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1249,7 +1249,6 @@ static void write_pack_file(void) stage_tmp_packfiles(&tmpname, pack_tmp_name, written_list, nr_written, &pack_idx_opts, hash, &idx_tmp_name); - rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name); if (write_bitmap_index) { struct strbuf sb = STRBUF_INIT; @@ -1268,6 +1267,8 @@ static void write_pack_file(void) strbuf_release(&sb); } + rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name); + free(idx_tmp_name); strbuf_release(&tmpname); free(pack_tmp_name);
Follow-up a preceding commit (pack-write.c: rename `.idx` file into place last, 2021-08-16)[1] and rename the *.idx file in-place after we write the *.bitmap. The preceding commit fixed the issue of *.idx being written before *.rev files, but did not do so for *.idx files. See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) for commentary at the time when *.bitmap was implemented about how those files are written out, nothing in that commit contradicts what's being done here. Note that the referenced earlier commit[1] is overly optimistic about "clos[ing the] race", i.e. yes we'll now write the files in the right order, but we might still race due to our sloppy use of fsync(). See the thread at [2] for a rabbit hole of various discussions about filesystem races in the face of doing and not doing fsync() (and if doing fsync(), not doing it properly). 1. https://lore.kernel.org/git/a6a4d2154e83d41c10986c5f455279ab249a910c.1630461918.git.me@ttaylorr.com/ 2. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/pack-objects.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)