diff mbox series

[v2,3/3] object-file: retry linking file into place when occluding file vanishes

Message ID 20250106-b4-pks-object-file-racy-collision-check-v2-3-8b3984ecbb18@pks.im (mailing list archive)
State Accepted
Commit d7fcbe2c56468ac780c689b02c6a9e056ce39c12
Headers show
Series object-file: retry linking file into place when occluding file vanishes | expand

Commit Message

Patrick Steinhardt Jan. 6, 2025, 9:24 a.m. UTC
Prior to 0ad3d65652 (object-file: fix race in object collision check,
2024-12-30), callers could expect that a successful return from
`finalize_object_file()` means that either the file was moved into
place, or the identical bytes were already present. If neither of those
happens, we'd return an error.

Since that commit, if the destination file disappears between our
link(3p) call and the collision check, we'd return success without
actually checking the contents, and without retrying the link. This
solves the common case that the files were indeed the same, but it means
that we may corrupt the repository if they weren't (this implies a hash
collision, but the whole point of this function is protecting against
hash collisions).

We can't be pessimistic and assume they're different; that hurts the
common case that the mentioned commit was trying to fix. But after
seeing that the destination file went away, we can retry linking again.
Adapt the code to do so when we see that the destination file has racily
vanished. This should generally succeed as we have just observed that
the destination file does not exist anymore, except in the very unlikely
event that it gets recreated by another concurrent process again.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 object-file.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)
diff mbox series

Patch

diff --git a/object-file.c b/object-file.c
index acfda5e303659195109c94f5b54b8a081fe92c59..aeca61b8ae3aadecef1ce11306b38e7368af829f 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1970,6 +1970,8 @@  static void write_object_file_prepare_literally(const struct git_hash_algo *algo
 	hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen);
 }
 
+#define CHECK_COLLISION_DEST_VANISHED -2
+
 static int check_collision(const char *source, const char *dest)
 {
 	char buf_source[4096], buf_dest[4096];
@@ -1986,6 +1988,8 @@  static int check_collision(const char *source, const char *dest)
 	if (fd_dest < 0) {
 		if (errno != ENOENT)
 			ret = error_errno(_("unable to open %s"), dest);
+		else
+			ret = CHECK_COLLISION_DEST_VANISHED;
 		goto out;
 	}
 
@@ -2033,8 +2037,11 @@  int finalize_object_file(const char *tmpfile, const char *filename)
 int finalize_object_file_flags(const char *tmpfile, const char *filename,
 			       enum finalize_object_file_flags flags)
 {
-	struct stat st;
-	int ret = 0;
+	unsigned retries = 0;
+	int ret;
+
+retry:
+	ret = 0;
 
 	if (object_creation_mode == OBJECT_CREATION_USES_RENAMES)
 		goto try_rename;
@@ -2055,6 +2062,8 @@  int finalize_object_file_flags(const char *tmpfile, const char *filename,
 	 * left to unlink.
 	 */
 	if (ret && ret != EEXIST) {
+		struct stat st;
+
 	try_rename:
 		if (!stat(filename, &st))
 			ret = EEXIST;
@@ -2070,9 +2079,17 @@  int finalize_object_file_flags(const char *tmpfile, const char *filename,
 			errno = saved_errno;
 			return error_errno(_("unable to write file %s"), filename);
 		}
-		if (!(flags & FOF_SKIP_COLLISION_CHECK) &&
-		    check_collision(tmpfile, filename))
+		if (!(flags & FOF_SKIP_COLLISION_CHECK)) {
+			ret = check_collision(tmpfile, filename);
+			if (ret == CHECK_COLLISION_DEST_VANISHED) {
+				if (retries++ > 5)
+					return error(_("unable to write repeatedly vanishing file %s"),
+						     filename);
+				goto retry;
+			}
+			else if (ret)
 				return -1;
+		}
 		unlink_or_warn(tmpfile);
 	}