diff mbox series

clone: clean up directory after transport_fetch_refs() failure

Message ID YKTzu96fN7XlpDoG@coredump.intra.peff.net (mailing list archive)
State Accepted
Commit 6aacb7d8619c50b3c92fe3e99b93ffa9b6065dfc
Headers show
Series clone: clean up directory after transport_fetch_refs() failure | expand

Commit Message

Jeff King May 19, 2021, 11:17 a.m. UTC
git-clone started respecting errors from the transport subsystem in
aab179d937 (builtin/clone.c: don't ignore transport_fetch_refs() errors,
2020-12-03). However, that commit didn't handle the cleanup of the
filesystem quite right.

The cleanup of the directory that cmd_clone() creates is done by an
atexit() handler, which we control with a flag. It starts as
JUNK_LEAVE_NONE ("clean up everything"), then progresses to
JUNK_LEAVE_REPO when we know we have a valid repo but not working tree,
and then finally JUNK_LEAVE_ALL when we have a successful checkout.

Most errors cause us to die(), which then triggers the handler to do the
right thing based on how far into cmd_clone() we got. But the checks
added by aab179d937 instead set the "err" variable and then jump to a
new "cleanup" label, which then returns our non-zero status. However,
the code after the cleanup label includes setting the flag to
JUNK_LEAVE_ALL, and so we accidentally leave the repository and working
tree in place.

One obvious option to fix this is to reorder the end of the function to
set the flag first, before cleanup code, and put the label between them.

But we can observe another small bug: the error return from
transport_fetch_refs() is generally "-1", and we propagate that to the
return value of cmd_clone(), which ultimately becomes the exit code of
the process. And we try to avoid transmitting negative values via exit
codes (only the low 8 bits are passed along as an unsigned value, though
in practice for "-1" this at least retains the property that it's
non-zero).

Instead, let's just die(). That makes us consistent with rest of the
code in the function. It does add a new "fatal:" line to the output, but
I'd argue that's a good thing:

  - in the rare case that the transport code didn't say anything, now
    the user gets _some_ error message

  - even if the transport code said something like "error: ssh died of
    signal 9", it's nice to also say "fatal" to indicate that we
    considered that to be a show-stopper.

Triggering this in the test suite turns out to be surprisingly
difficult. Almost every error we'd encounter, including ones deep inside
the transport code, cause us to just die() right there! However, one way
is to put a fake wrapper around git-upload-pack that sends the complete
packfile but exits with a failure code.

Signed-off-by: Jeff King <peff@peff.net>
---
Pulling this out of the discussion in:

  https://lore.kernel.org/git/YKTg8nYjSGpKbq8W@coredump.intra.peff.net/

The problem is in v2.30.0, so this doesn't need to be part of the
upcoming v2.32 release (but it's probably a good candidate for maint).
Note that it isn't even a regression in v2.30. Before then, we ignored
the error completely! So the fix there simply didn't go far enough. :)

 builtin/clone.c               | 11 ++++-------
 t/t5600-clone-fail-cleanup.sh |  7 +++++++
 2 files changed, 11 insertions(+), 7 deletions(-)

Comments

Taylor Blau May 19, 2021, 2:30 p.m. UTC | #1
On Wed, May 19, 2021 at 07:17:15AM -0400, Jeff King wrote:
> One obvious option to fix this is to reorder the end of the function to
> set the flag first, before cleanup code, and put the label between them.
>
> But we can observe another small bug: the error return from
> transport_fetch_refs() is generally "-1", and we propagate that to the
> return value of cmd_clone(), which ultimately becomes the exit code of
> the process. And we try to avoid transmitting negative values via exit
> codes (only the low 8 bits are passed along as an unsigned value, though
> in practice for "-1" this at least retains the property that it's
> non-zero).
>
> Instead, let's just die(). That makes us consistent with rest of the
> code in the function. It does add a new "fatal:" line to the output, but
> I'd argue that's a good thing:

This reasoning makes sense to me, and thanks for cleaning this up.

    Reviewed-by: Taylor Blau <me@ttaylorr.com>

Thanks,
Taylor
diff mbox series

Patch

diff --git a/builtin/clone.c b/builtin/clone.c
index eeb74c0217..66fe66679c 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1320,9 +1320,8 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 			}
 
 		if (!is_local && !complete_refs_before_fetch) {
-			err = transport_fetch_refs(transport, mapped_refs);
-			if (err)
-				goto cleanup;
+			if (transport_fetch_refs(transport, mapped_refs))
+				die(_("remote transport reported error"));
 		}
 
 		remote_head = find_ref_by_name(refs, "HEAD");
@@ -1380,9 +1379,8 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (is_local)
 		clone_local(path, git_dir);
 	else if (refs && complete_refs_before_fetch) {
-		err = transport_fetch_refs(transport, mapped_refs);
-		if (err)
-			goto cleanup;
+		if (transport_fetch_refs(transport, mapped_refs))
+			die(_("remote transport reported error"));
 	}
 
 	update_remote_refs(refs, mapped_refs, remote_head_points_at,
@@ -1410,7 +1408,6 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 	junk_mode = JUNK_LEAVE_REPO;
 	err = checkout(submodule_progress);
 
-cleanup:
 	free(remote_name);
 	strbuf_release(&reflog_msg);
 	strbuf_release(&branch_top);
diff --git a/t/t5600-clone-fail-cleanup.sh b/t/t5600-clone-fail-cleanup.sh
index 4a1a912e03..5bf10261d3 100755
--- a/t/t5600-clone-fail-cleanup.sh
+++ b/t/t5600-clone-fail-cleanup.sh
@@ -97,4 +97,11 @@  test_expect_success 'failed clone into empty leaves directory (separate, wt)' '
 	test_dir_is_empty empty-wt
 '
 
+test_expect_success 'transport failure cleans up directory' '
+	test_must_fail git clone --no-local \
+		-u "f() { git-upload-pack \"\$@\"; return 1; }; f" \
+		foo broken-clone &&
+	test_path_is_missing broken-clone
+'
+
 test_done