diff mbox series

builtin/ls-remote: fall back to SHA1 outside of a repo

Message ID c52112d3946b2fd8d030580cd7acb809fa54012a.1722573777.git.ps@pks.im (mailing list archive)
State Accepted
Commit 9e89dcb66a70902fef966083998a75272d47d998
Headers show
Series builtin/ls-remote: fall back to SHA1 outside of a repo | expand

Commit Message

Patrick Steinhardt Aug. 2, 2024, 4:44 a.m. UTC
In c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), we have stopped setting the default hash algorithm for
`the_repository`. Consequently, code that relies on `the_hash_algo` will
now crash when it hasn't explicitly been initialized, which may be the
case when running outside of a Git repository.

It was reported that git-ls-remote(1) may crash in such a way when using
a remote helper that advertises refspecs. This is because the refspec
announced by the helper will get parsed during capability negotiation.
At that point we haven't yet figured out what object format the remote
uses though, so when run outside of a repository then we will fail.

The course of action is somewhat dubious in the first place. Ideally, we
should only parse object IDs once we have asked the remote helper for
the object format. And if the helper didn't announce the "object-format"
capability, then we should always assume SHA256. But instead, we used to
take either SHA1 if there was no repository, or we used the hash of the
local repository, which is wrong.

Arguably though, crashing hard may not be in the best interest of our
users, either. So while the old behaviour was buggy, let's restore it
for now as a short-term fix. We should eventually revisit, potentially
by deferring the point in time when we parse the refspec until after we
have figured out the remote's object hash.

Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---

I didn't spot this in the "What's cooking" report. I guess that's my own
fault for not sending it as a proper patch, so let me fix that now :)

Patrick

 builtin/ls-remote.c  | 15 +++++++++++++++
 t/t5512-ls-remote.sh | 13 +++++++++++++
 2 files changed, 28 insertions(+)

Comments

Junio C Hamano Aug. 2, 2024, 3:26 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> I didn't spot this in the "What's cooking" report. I guess that's my own
> fault for not sending it as a proper patch, so let me fix that now :)

Yeah, I had a fix already when I gave my response to the initial
problem report, but felt that it was too premature to commit to the
approach before listening to others (you included) for potentially
better alternative approaches, and then forgot about it.

The fix here is in line with my thoughts, after seeing how other
parts of the transport work.  Thanks for tying the loose ends.
diff mbox series

Patch

diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index debf2d4f88..6da63a67f5 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -91,6 +91,21 @@  int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 			     PARSE_OPT_STOP_AT_NON_OPTION);
 	dest = argv[0];
 
+	/*
+	 * TODO: This is buggy, but required for transport helpers. When a
+	 * transport helper advertises a "refspec", then we'd add that to a
+	 * list of refspecs via `refspec_append()`, which transitively depends
+	 * on `the_hash_algo`. Thus, when the hash algorithm isn't properly set
+	 * up, this would lead to a segfault.
+	 *
+	 * We really should fix this in the transport helper logic such that we
+	 * lazily parse refspec capabilities _after_ we have learned about the
+	 * remote's object format. Otherwise, we may end up misparsing refspecs
+	 * depending on what object hash the remote uses.
+	 */
+	if (!the_repository->hash_algo)
+		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
 	packet_trace_identity("ls-remote");
 
 	if (argc > 1) {
diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
index 42e77eb5a9..bc442ec221 100755
--- a/t/t5512-ls-remote.sh
+++ b/t/t5512-ls-remote.sh
@@ -402,4 +402,17 @@  test_expect_success 'v0 clients can handle multiple symrefs' '
 	test_cmp expect actual
 '
 
+test_expect_success 'helper with refspec capability fails gracefully' '
+	mkdir test-bin &&
+	write_script test-bin/git-remote-foo <<-EOF &&
+	echo import
+	echo refspec ${SQ}*:*${SQ}
+	EOF
+	(
+		PATH="$PWD/test-bin:$PATH" &&
+		export PATH &&
+		test_must_fail nongit git ls-remote foo::bar
+	)
+'
+
 test_done