Message ID | d8286cf1075dc85231128145c5abb0db3881032b.1631157880.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | packfile: avoid .idx rename races | expand |
On Wed, Sep 08 2021, Taylor Blau wrote: > From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > > In preceding commits the race of renaming .idx files in place before > .rev files and other auxiliary files was fixed in pack-write.c's > finish_tmp_packfile(), builtin/repack.c's "struct exts", and > builtin/index-pack.c's final(). As noted in the change to pack-write.c > we left in place the issue of writing *.bitmap files after the *.idx, > let's fix that issue. > > See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) > for commentary at the time when *.bitmap was implemented about how > those files are written out, nothing in that commit contradicts what's > being done here. > > Note that this commit and preceding ones only close any race condition > with *.idx files being written before their auxiliary files if we're > optimistic about our lack of fsync()-ing in this are not tripping us > over. See the thread at [1] for a rabbit hole of various discussions > about filesystem races in the face of doing and not doing fsync() (and > if doing fsync(), not doing it properly). > > In particular, in this case of writing to ".git/objects/pack" we only > write and fsync() the individual files, but if we wanted to guarantee > that the metadata update was seen in that way by concurrent processes > we'd need to fsync() on the "fd" of the containing directory. That > concern is probably more theoretical than not, modern OS's tend to be > more on the forgiving side than the overly pedantic side of > implementing POSIX FS semantics. Some weird gramma/phrasing of mine left over here, i.e. the "[...] in this are not tripping us over.". On reflection perhaps it's better to replace these last two paragraphs with say: With this and preceding commits we've covered all the cases where we wrote another auxiliary file before the *.idx file, and should thus never have another concurrent process try to use an incomplete set of pack-OID.* files. This assumes that the renames we make here and elsewhere appear on the filesystem in the order we performed them. This assumption is known to be false in the face of pedantic POSIX FS semantics. See the thread around [1] for discussions on that point. We fsync() the file descriptors we're writing out for all the auxiliary files, but we don't yet fsync() the file descriptor for the containing directory. Therefore our data might have been written out, but it's anyone's guess what the state of the directory containing the file is after we write the *.idx. In practice modern OS's are known to be forgiving on that point, so this will probably solve races in practice for most users. It will almost certainly make them better than they were before when we didn't write *.idx files last. We should more generally improve our use of fsync() to cover containing directories, but that'll hopefully be addressed by some follow-up series.
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > We fsync() the file descriptors we're writing out for all the > auxiliary files, but we don't yet fsync() the file descriptor for > the containing directory. Therefore our data might have been written > out, but it's anyone's guess what the state of the directory > containing the file is after we write the *.idx. > > In practice modern OS's are known to be forgiving on that point, so > this will probably solve races in practice for most users. It will > almost certainly make them better than they were before when we > didn't write *.idx files last. We should more generally improve our > use of fsync() to cover containing directories, but that'll > hopefully be addressed by some follow-up series. I'd probably drop the last paragraph, and replace it with a single sentence "we may want to fsync the containing directory once after placing *.idx file in place, but it is outside of the scope of this series", if I were doing this series. Other than that (and I made a few comments on other patches), these were pleasant read. Thanks.
On Thu, Sep 09, 2021 at 12:52:03PM -0700, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > > > We fsync() the file descriptors we're writing out for all the > > auxiliary files, but we don't yet fsync() the file descriptor for > > the containing directory. Therefore our data might have been written > > out, but it's anyone's guess what the state of the directory > > containing the file is after we write the *.idx. > > > > In practice modern OS's are known to be forgiving on that point, so > > this will probably solve races in practice for most users. It will > > almost certainly make them better than they were before when we > > didn't write *.idx files last. We should more generally improve our > > use of fsync() to cover containing directories, but that'll > > hopefully be addressed by some follow-up series. > > I'd probably drop the last paragraph, and replace it with a single > sentence "we may want to fsync the containing directory once after > placing *.idx file in place, but it is outside of the scope of this > series", if I were doing this series. Yep, I think that this makes things clearer, and I prefer it to Ævar's suggestion (which conveys more information than I think is necessary here). Thanks, Taylor
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 944134b6f2..a01767a384 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1250,7 +1250,6 @@ static void write_pack_file(void) stage_tmp_packfiles(&tmpname, pack_tmp_name, written_list, nr_written, &pack_idx_opts, hash, &idx_tmp_name); - rename_tmp_packfile_idx(&tmpname, &idx_tmp_name); if (write_bitmap_index) { size_t tmpname_len = tmpname.len; @@ -1267,6 +1266,8 @@ static void write_pack_file(void) strbuf_setlen(&tmpname, tmpname_len); } + rename_tmp_packfile_idx(&tmpname, &idx_tmp_name); + free(idx_tmp_name); strbuf_release(&tmpname); free(pack_tmp_name);