diff mbox series

[v2,4/4] pack-write: rename *.idx file into place last (really!)

Message ID patch-v2-4.4-70f4a9767d-20210908T003631Z-avarab@gmail.com (mailing list archive)
State New, archived
Headers show
Series rename *.idx file into place last (also after *.bitmap) | expand

Commit Message

Ævar Arnfjörð Bjarmason Sept. 8, 2021, 12:38 a.m. UTC
Follow-up a preceding commit (pack-write.c: rename `.idx` file into
place last, 2021-08-16)[1] and rename the *.idx file in-place after we
write the *.bitmap. The preceding commit fixed the issue of *.idx
being written before *.rev files, but did not do so for *.idx files.

See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21)
for commentary at the time when *.bitmap was implemented about how
those files are written out, nothing in that commit contradicts what's
being done here.

Note that the referenced earlier commit[1] is overly optimistic about
"clos[ing the] race", i.e. yes we'll now write the files in the right
order, but we might still race due to our sloppy use of fsync(). See
the thread at [2] for a rabbit hole of various discussions about
filesystem races in the face of doing and not doing fsync() (and if
doing fsync(), not doing it properly).

1. https://lore.kernel.org/git/a6a4d2154e83d41c10986c5f455279ab249a910c.1630461918.git.me@ttaylorr.com/
2. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/pack-objects.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Ævar Arnfjörð Bjarmason Sept. 8, 2021, 1:14 a.m. UTC | #1
On Wed, Sep 08 2021, Ævar Arnfjörð Bjarmason wrote:

> Follow-up a preceding commit (pack-write.c: rename `.idx` file into
> place last, 2021-08-16)[1] and rename the *.idx file in-place after we
> write the *.bitmap. The preceding commit fixed the issue of *.idx
> being written before *.rev files, but did not do so for *.idx files.
>
> See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21)
> for commentary at the time when *.bitmap was implemented about how
> those files are written out, nothing in that commit contradicts what's
> being done here.
>
> Note that the referenced earlier commit[1] is overly optimistic about
> "clos[ing the] race", i.e. yes we'll now write the files in the right
> order, but we might still race due to our sloppy use of fsync(). See
> the thread at [2] for a rabbit hole of various discussions about
> filesystem races in the face of doing and not doing fsync() (and if
> doing fsync(), not doing it properly).

Actually I think it's a bit worse than that, we will unconditionally
fsync() the *.pack we write out, but in stage_tmp_packfiles() (the
behavior pre-dates both this series and its parent, I just think my
stage_tmp_packfiles() is easier to follow) we'll not write the *.idx
file with fsync() since we won't pass WRITE_IDX_VERIFY.

The same goes for *.rev (which oddly makes its fsync() conditional on
WRITE_IDX_VERIFY), but not *.bitmap, which fsyncs unconditionally just
like *.pack does.

And then of course we'll do all these in-place renames but nothing
fsyncs the fd of the directory, so the metadata and new names being
committed to disk & visible to other processes is anyone's guess.

But not only is that metadata commit not made, but due to the
inconsistent fsync() we might end up with an *.idx that's partial and
renamed in-place.

In any case, any such issues pre-date this series and the series by
Taylor it depends on, just adding some #leftoverbits for future fsync()
fixes since I spent time looking into it.
Taylor Blau Sept. 8, 2021, 4:24 a.m. UTC | #2
On Wed, Sep 08, 2021 at 02:38:28AM +0200, Ævar Arnfjörð Bjarmason wrote:
> Follow-up a preceding commit (pack-write.c: rename `.idx` file into
> place last, 2021-08-16)[1] and rename the *.idx file in-place after we
> write the *.bitmap. The preceding commit fixed the issue of *.idx
> being written before *.rev files, but did not do so for *.idx files.

s/idx/bitmap/ in the last line, I believe.

Otherwise this patch looks good to me.

Thanks,
Taylor
Ævar Arnfjörð Bjarmason Sept. 8, 2021, 9:18 a.m. UTC | #3
On Wed, Sep 08 2021, Ævar Arnfjörð Bjarmason wrote:

> On Wed, Sep 08 2021, Ævar Arnfjörð Bjarmason wrote:
>
>> Follow-up a preceding commit (pack-write.c: rename `.idx` file into
>> place last, 2021-08-16)[1] and rename the *.idx file in-place after we
>> write the *.bitmap. The preceding commit fixed the issue of *.idx
>> being written before *.rev files, but did not do so for *.idx files.
>>
>> See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21)
>> for commentary at the time when *.bitmap was implemented about how
>> those files are written out, nothing in that commit contradicts what's
>> being done here.
>>
>> Note that the referenced earlier commit[1] is overly optimistic about
>> "clos[ing the] race", i.e. yes we'll now write the files in the right
>> order, but we might still race due to our sloppy use of fsync(). See
>> the thread at [2] for a rabbit hole of various discussions about
>> filesystem races in the face of doing and not doing fsync() (and if
>> doing fsync(), not doing it properly).
>
> Actually I think it's a bit worse than that, we will unconditionally
> fsync() the *.pack we write out, but in stage_tmp_packfiles() (the
> behavior pre-dates both this series and its parent, I just think my
> stage_tmp_packfiles() is easier to follow) we'll not write the *.idx
> file with fsync() since we won't pass WRITE_IDX_VERIFY.
>
> The same goes for *.rev (which oddly makes its fsync() conditional on
> WRITE_IDX_VERIFY), but not *.bitmap, which fsyncs unconditionally just
> like *.pack does.

I was confused here, we use fsync() with *.bitmap because there's no
--verify mode for it, i.e. we'll always fsync() the *.rev files if we're
actually writing them "for real", the non-fsync() path is when we're
doing the "noop-write" to /dev/null with "index-pack --verify",
i.e. writing to an FD without an associated "real" filename.

That we use WRITE_IDX_VERIFY to decide on that fsync() for the *.rev
files, as opposed to WRITE_REV_VERIFY is a bug in 8ef50d9958
(pack-write.c: prepare to write 'pack-*.rev' files, 2021-01-25).

In practice if WRITE_REV_VERIFY is true then WRITE_IDX_VERIFY is true as
well. See e37d0b8730 (builtin/index-pack.c: write reverse indexes,
2021-01-25) and its interaction with 68be2fea50 (receive-pack,
fetch-pack: reject bogus pack that records objects twice,
2011-11-16). I.e. both flags are set if --verify is there.

So it's not a bug as far as the behavior of the program goes, it just
makes for some very confusing code. Why should *.rev fsyncing be
contingent on whether or not we want to fsync *.idx files on writes? It
shouldn't, but it turns out those two flags always go hand in hand if
we're writing *.rev at all.

But re the thread started at
<patch-1.1-366ba928bd-20210908T010743Z-avarab@gmail.com> about how
exactly we inspect these flags I think we should just take that change
as-is for now, and refactor & fix this properly later.

Once write_rev_file{_order}() don't pretend to take the trie-state
bitflags of WRITE_REV|WRITE_REV_VERIFY|WRITE_IDX_VERIFY but just take a
boolean "int write_file" the whole callchain & confusion around "do we
have more than 1 flag?" goes away.

I.e. something like the end-state shown by this patch, which I'm not
submitting now because it conflicts with the other in-flight topic (and
it's on top of some trivial but not-included whitespace changes in
pack.h):

diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 20406f6775..69301fc044 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -778,7 +778,7 @@ static const char *create_index(void)
 		die("internal consistency error creating the index");
 
 	tmpfile = write_idx_file(NULL, idx, object_count, &pack_idx_opts,
-				 pack_data->hash);
+				 pack_data->hash, 0);
 	free(idx);
 	return tmpfile;
 }
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 8336466865..b9c2a21216 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1590,12 +1590,9 @@ static int git_index_pack_config(const char *k, const char *v, void *cb)
 		}
 		return 0;
 	}
-	if (!strcmp(k, "pack.writereverseindex")) {
-		if (git_config_bool(k, v))
-			opts->flags |= WRITE_REV;
-		else
-			opts->flags &= ~WRITE_REV;
-	}
+	if (!strcmp(k, "pack.writereverseindex"))
+		opts->write_rev = git_config_bool(k, v);
+
 	return git_default_config(k, v, cb);
 }
 
@@ -1713,7 +1710,7 @@ static void show_pack_info(int stat_only)
 
 int cmd_index_pack(int argc, const char **argv, const char *prefix)
 {
-	int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index;
+	int i, fix_thin_pack = 0, verify = 0, stat_only = 0;
 	const char *curr_index;
 	const char *curr_rev_index = NULL;
 	const char *index_name = NULL, *pack_name = NULL, *rev_index_name = NULL;
@@ -1747,10 +1744,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	if (prefix && chdir(prefix))
 		die(_("Cannot come back to cwd"));
 
-	if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0))
-		rev_index = 1;
-	else
-		rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV));
+	opts.write_rev = git_env_bool(GIT_TEST_WRITE_REV_INDEX,
+				       opts.write_rev);
 
 	for (i = 1; i < argc; i++) {
 		const char *arg = argv[i];
@@ -1831,9 +1826,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 					die(_("unknown hash algorithm '%s'"), arg);
 				repo_set_hash_algo(the_repository, hash_algo);
 			} else if (!strcmp(arg, "--rev-index")) {
-				rev_index = 1;
+				opts.write_rev = 1;
 			} else if (!strcmp(arg, "--no-rev-index")) {
-				rev_index = 0;
+				opts.write_rev = 0;
 			} else
 				usage(index_pack_usage);
 			continue;
@@ -1855,9 +1850,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	if (!index_name && pack_name)
 		index_name = derive_filename(pack_name, "pack", "idx", &index_name_buf);
 
-	opts.flags &= ~(WRITE_REV | WRITE_REV_VERIFY);
-	if (rev_index) {
-		opts.flags |= verify ? WRITE_REV_VERIFY : WRITE_REV;
+	if (opts.write_rev) {
 		if (index_name)
 			rev_index_name = derive_filename(index_name,
 							 "idx", "rev",
@@ -1868,10 +1861,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 		if (!index_name)
 			die(_("--verify with no packfile name given"));
 		read_idx_option(&opts, index_name);
-		opts.flags |= WRITE_IDX_VERIFY | WRITE_IDX_STRICT;
 	}
 	if (strict)
-		opts.flags |= WRITE_IDX_STRICT;
+		opts.write_idx_strict = 1;
 
 	if (HAVE_THREADS && !nr_threads) {
 		nr_threads = online_cpus();
@@ -1915,11 +1907,12 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	ALLOC_ARRAY(idx_objects, nr_objects);
 	for (i = 0; i < nr_objects; i++)
 		idx_objects[i] = &objects[i].idx;
-	curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts, pack_hash);
-	if (rev_index)
+	curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts,
+				    pack_hash, verify);
+	if (opts.write_rev)
 		curr_rev_index = write_rev_file(rev_index_name, idx_objects,
 						nr_objects, pack_hash,
-						opts.flags);
+						verify);
 	free(idx_objects);
 
 	if (!verify)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index df49f656b9..1dae01800f 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3139,10 +3139,7 @@ static int git_pack_config(const char *k, const char *v, void *cb)
 		return 0;
 	}
 	if (!strcmp(k, "pack.writereverseindex")) {
-		if (git_config_bool(k, v))
-			pack_idx_opts.flags |= WRITE_REV;
-		else
-			pack_idx_opts.flags &= ~WRITE_REV;
+		pack_idx_opts.write_rev = git_config_bool(k, v);
 		return 0;
 	}
 	if (!strcmp(k, "uploadpack.blobpackfileuri")) {
@@ -4030,8 +4027,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 
 	reset_pack_idx_option(&pack_idx_opts);
 	git_config(git_pack_config, NULL);
-	if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0))
-		pack_idx_opts.flags |= WRITE_REV;
+	pack_idx_opts.write_rev = git_env_bool(GIT_TEST_WRITE_REV_INDEX,
+					       pack_idx_opts.write_rev);
 
 	progress = isatty(2);
 	argc = parse_options(argc, argv, prefix, pack_objects_options,
diff --git a/midx.c b/midx.c
index 321c6fdd2f..63da740cd4 100644
--- a/midx.c
+++ b/midx.c
@@ -874,7 +874,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
 	strbuf_addf(&buf, "%s-%s.rev", midx_name, hash_to_hex(midx_hash));
 
 	tmp_file = write_rev_file_order(NULL, ctx->pack_order, ctx->entries_nr,
-					midx_hash, WRITE_REV);
+					midx_hash, 0);
 
 	if (finalize_object_file(tmp_file, buf.buf))
 		die(_("cannot store reverse index file"));
diff --git a/pack-write.c b/pack-write.c
index 1883848e7c..4f54191499 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -44,7 +44,7 @@ static int need_large_offset(off_t offset, const struct pack_idx_option *opts)
  */
 const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects,
 			   int nr_objects, const struct pack_idx_option *opts,
-			   const unsigned char *sha1)
+			   const unsigned char *sha1, int verify)
 {
 	struct hashfile *f;
 	struct pack_idx_entry **sorted_by_sha, **list, **last;
@@ -65,7 +65,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
 	else
 		sorted_by_sha = list = last = NULL;
 
-	if (opts->flags & WRITE_IDX_VERIFY) {
+	if (verify) {
 		assert(index_name);
 		f = hashfd_check(index_name);
 	} else {
@@ -119,7 +119,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
 		if (index_version < 2)
 			hashwrite_be32(f, obj->offset);
 		hashwrite(f, obj->oid.hash, the_hash_algo->rawsz);
-		if ((opts->flags & WRITE_IDX_STRICT) &&
+		if (opts->write_idx_strict &&
 		    (i && oideq(&list[-2]->oid, &obj->oid)))
 			die("The same object %s appears twice in the pack",
 			    oid_to_hex(&obj->oid));
@@ -162,8 +162,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
 
 	hashwrite(f, sha1, the_hash_algo->rawsz);
 	finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_CLOSE |
-				    ((opts->flags & WRITE_IDX_VERIFY)
-				    ? 0 : CSUM_FSYNC));
+			  (verify ? 0 : CSUM_FSYNC));
 	return index_name;
 }
 
@@ -218,22 +217,19 @@ const char *write_rev_file(const char *rev_name,
 			   struct pack_idx_entry **objects,
 			   uint32_t nr_objects,
 			   const unsigned char *hash,
-			   unsigned flags)
+			   int verify)
 {
 	uint32_t *pack_order;
 	uint32_t i;
 	const char *ret;
 
-	if (!(flags & WRITE_REV) && !(flags & WRITE_REV_VERIFY))
-		return NULL;
-
 	ALLOC_ARRAY(pack_order, nr_objects);
 	for (i = 0; i < nr_objects; i++)
 		pack_order[i] = i;
 	QSORT_S(pack_order, nr_objects, pack_order_cmp, objects);
 
 	ret = write_rev_file_order(rev_name, pack_order, nr_objects, hash,
-				   flags);
+				   verify);
 
 	free(pack_order);
 
@@ -244,15 +240,12 @@ const char *write_rev_file_order(const char *rev_name,
 				 uint32_t *pack_order,
 				 uint32_t nr_objects,
 				 const unsigned char *hash,
-				 unsigned flags)
+				 int verify)
 {
 	struct hashfile *f;
 	int fd;
 
-	if ((flags & WRITE_REV) && (flags & WRITE_REV_VERIFY))
-		die(_("cannot both write and verify reverse index"));
-
-	if (flags & WRITE_REV) {
+	if (!verify) {
 		if (!rev_name) {
 			struct strbuf tmp_file = STRBUF_INIT;
 			fd = odb_mkstemp(&tmp_file, "pack/tmp_rev_XXXXXX");
@@ -264,7 +257,7 @@ const char *write_rev_file_order(const char *rev_name,
 				die_errno("unable to create '%s'", rev_name);
 		}
 		f = hashfd(fd, rev_name);
-	} else if (flags & WRITE_REV_VERIFY) {
+	} else {
 		struct stat statbuf;
 		if (stat(rev_name, &statbuf)) {
 			if (errno == ENOENT) {
@@ -274,8 +267,7 @@ const char *write_rev_file_order(const char *rev_name,
 				die_errno(_("could not stat: %s"), rev_name);
 		}
 		f = hashfd_check(rev_name);
-	} else
-		return NULL;
+	}
 
 	write_rev_header(f);
 
@@ -286,7 +278,7 @@ const char *write_rev_file_order(const char *rev_name,
 		die(_("failed to make %s readable"), rev_name);
 
 	finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_CLOSE |
-				    ((flags & WRITE_IDX_VERIFY) ? 0 : CSUM_FSYNC));
+			  (verify ? 0 : CSUM_FSYNC));
 
 	return rev_name;
 }
@@ -479,12 +471,13 @@ void finish_tmp_packfile(struct strbuf *name_buffer,
 		die_errno("unable to make temporary pack file readable");
 
 	idx_tmp_name = write_idx_file(NULL, written_list, nr_written,
-				      pack_idx_opts, hash);
+				      pack_idx_opts, hash, 0);
 	if (adjust_shared_perm(idx_tmp_name))
 		die_errno("unable to make temporary index file readable");
 
-	rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash,
-				      pack_idx_opts->flags);
+	if (pack_idx_opts->write_rev)
+		rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash,
+					      0);
 
 	strbuf_addf(name_buffer, "%s.pack", hash_to_hex(hash));
 
diff --git a/pack.h b/pack.h
index aa23be74df..e096dea4c9 100644
--- a/pack.h
+++ b/pack.h
@@ -38,12 +38,8 @@ struct pack_header {
 #define PACK_IDX_SIGNATURE 0xff744f63	/* "\377tOc" */
 
 struct pack_idx_option {
-	unsigned flags;
-	/* flag bits */
-#define WRITE_IDX_VERIFY 01 /* verify only, do not write the idx file */
-#define WRITE_IDX_STRICT 02
-#define WRITE_REV 04
-#define WRITE_REV_VERIFY 010
+	unsigned int write_idx_strict:1,
+		     write_rev:1;
 
 	uint32_t version;
 	uint32_t off32_limit;
@@ -84,7 +80,8 @@ typedef int (*verify_fn)(const struct object_id *, enum object_type, unsigned lo
 const char *write_idx_file(const char *index_name,
 			   struct pack_idx_entry **objects, int nr_objects,
 			   const struct pack_idx_option *opts,
-			   const unsigned char *sha1);
+			   const unsigned char *sha1,
+			   int verify);
 int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr);
 int verify_pack_index(struct packed_git *);
 int verify_pack(struct repository *, struct packed_git *, verify_fn fn, struct progress *, uint32_t);
@@ -99,11 +96,11 @@ void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_
 const char *write_rev_file(const char *rev_name,
 			   struct pack_idx_entry **objects,
 			   uint32_t nr_objects,
-			   const unsigned char *hash, unsigned flags);
+			   const unsigned char *hash, int verify);
 const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order,
 				 uint32_t nr_objects,
 				 const unsigned char *hash,
-				 unsigned flags);
+				 int verify);
 
 /*
  * The "hdr" output buffer should be at least this big, which will handle sizes
diff mbox series

Patch

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 33567aaa74..396240c25a 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1249,7 +1249,6 @@  static void write_pack_file(void)
 			stage_tmp_packfiles(&tmpname, pack_tmp_name,
 					    written_list, nr_written,
 					    &pack_idx_opts, hash, &idx_tmp_name);
-			rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name);
 
 			if (write_bitmap_index) {
 				struct strbuf sb = STRBUF_INIT;
@@ -1268,6 +1267,8 @@  static void write_pack_file(void)
 				strbuf_release(&sb);
 			}
 
+			rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name);
+
 			free(idx_tmp_name);
 			strbuf_release(&tmpname);
 			free(pack_tmp_name);