diff mbox series

[v4,4/5] builtin/hash-object: fix uninitialized hash function

Message ID 20240514011437.3779151-5-gitster@pobox.com (mailing list archive)
State Superseded
Headers show
Series Fix use of uninitialized hash algorithms | expand

Commit Message

Junio C Hamano May 14, 2024, 1:14 a.m. UTC
From: Patrick Steinhardt <ps@pks.im>

The git-hash-object(1) command allows users to hash an object even
without a repository. Starting with c8aed5e8da (repository: stop setting
SHA1 as the default object hash, 2024-05-07), this will make us hit an
uninitialized hash function, which subsequently leads to a segfault.

Fix this by falling back to SHA-1 explicitly when running outside of
a Git repository. Users can use GIT_DEFAULT_HASH environment to
specify what hash algorithm they want, so arguably this code should
not be needed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/hash-object.c   | 3 +++
 t/t1007-hash-object.sh  | 6 ++++++
 t/t1517-outside-repo.sh | 2 +-
 3 files changed, 10 insertions(+), 1 deletion(-)

Comments

Junio C Hamano May 17, 2024, 11:49 p.m. UTC | #1
Junio C Hamano <gitster@pobox.com> writes:

> From: Patrick Steinhardt <ps@pks.im>
>
> The git-hash-object(1) command allows users to hash an object even
> without a repository. Starting with c8aed5e8da (repository: stop setting
> SHA1 as the default object hash, 2024-05-07), this will make us hit an
> uninitialized hash function, which subsequently leads to a segfault.
>
> Fix this by falling back to SHA-1 explicitly when running outside of
> a Git repository. Users can use GIT_DEFAULT_HASH environment to
> specify what hash algorithm they want, so arguably this code should
> not be needed.

By the way, this breaks the expectation of t1007 and t1517 when run
with GIT_DEFAULT_HASH=sha256 (I recall that they passed with the
earlier iteration of my "fall back to GIT_DEFAULT_HASH" in [1/5],
but we decided abusing GIT_DEFAULT_HASH is a bad idea).

https://github.com/git/git/actions/runs/9135149292/job/25122031380
Junio C Hamano May 20, 2024, 9:19 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> From: Patrick Steinhardt <ps@pks.im>
>>
>> The git-hash-object(1) command allows users to hash an object even
>> without a repository. Starting with c8aed5e8da (repository: stop setting
>> SHA1 as the default object hash, 2024-05-07), this will make us hit an
>> uninitialized hash function, which subsequently leads to a segfault.
>>
>> Fix this by falling back to SHA-1 explicitly when running outside of
>> a Git repository. Users can use GIT_DEFAULT_HASH environment to
>> specify what hash algorithm they want, so arguably this code should
>> not be needed.
>
> By the way, this breaks the expectation of t1007 and t1517 when run
> with GIT_DEFAULT_HASH=sha256 (I recall that they passed with the
> earlier iteration of my "fall back to GIT_DEFAULT_HASH" in [1/5],
> but we decided abusing GIT_DEFAULT_HASH is a bad idea).

The breakage in 1517 turns out to be a thinko on my part.  Outside a
repository, we do use SHA-1 with our "if the_hash_algo is not set
yet, default to SHA-1" in patch-id and hash-object but the way I
prepared the expected output was to use whatever GIT_TEST_DEFAULT_HASH
was in use.  A fix would be to go outside a repository and force the
hash with GIT_DEFAULT_HASH to sha1 when preparing the expected output.

I haven't looked at the breakage in 1007 yet, though.

diff --git i/t/t1517-outside-repo.sh w/t/t1517-outside-repo.sh
index 6f32a40c6d..6363c8e3c4 100755
--- i/t/t1517-outside-repo.sh
+++ w/t/t1517-outside-repo.sh
@@ -21,22 +21,24 @@ test_expect_success 'set up a non-repo directory and test file' '
 	git diff >sample.patch
 '
 
-test_expect_success 'compute a patch-id outside repository' '
-	git patch-id <sample.patch >patch-id.expect &&
+test_expect_success 'compute a patch-id outside repository (default to SHA-1)' '
 	(
 		cd non-repo &&
-		git patch-id <../sample.patch >../patch-id.actual
-	) &&
-	test_cmp patch-id.expect patch-id.actual
+		GIT_DEFAULT_HASH=sha1 \
+		git patch-id <../sample.patch >patch-id.expect &&
+		git patch-id <../sample.patch >patch-id.actual &&
+		test_cmp patch-id.expect patch-id.actual
+	)
 '
 
-test_expect_success 'hash-object outside repository' '
-	git hash-object --stdin <sample.patch >hash.expect &&
+test_expect_success 'hash-object outside repository (default to SHA-1)' '
 	(
 		cd non-repo &&
-		git hash-object --stdin <../sample.patch >../hash.actual
-	) &&
-	test_cmp hash.expect hash.actual
+		GIT_DEFAULT_HASH=sha1 \
+		git hash-object --stdin <../sample.patch >hash.expect &&
+		git hash-object --stdin <../sample.patch >hash.actual &&
+		test_cmp hash.expect hash.actual
+	)
 '
 
 test_expect_success 'apply a patch outside repository' '
Junio C Hamano May 20, 2024, 10:45 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> I haven't looked at the breakage in 1007 yet, though.

This turned out to be almost trivial.  A fix relative to an earlier
version is that the call to test_oid helper needs to explicitly ask
for SHA-1 variant, as the command invocation outside a repository
uses SHA-1 (not due to falling back to a hardcoded default, but by
an explicit fallback in the "git hash-object" itself.

I'll send a v5 of the whole series sometime later (if I have time it
may happen today but otherwise tomorrow).

Thanks.

diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index 64aea38486..d73a5cc237 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -260,4 +260,10 @@ test_expect_success '--literally with extra-long type' '
 	echo example | git hash-object -t $t --literally --stdin
 '
 
+test_expect_success '--stdin outside of repository (uses SHA-1)' '
+	nongit git hash-object --stdin <hello >actual &&
+	echo "$(test_oid --hash=sha1 hello)" >expect &&
+	test_cmp expect actual
+'
+
 test_done

diff --git a/builtin/hash-object.c b/builtin/hash-object.c
index 82ca6d2bfd..c767414a0c 100644
--- a/builtin/hash-object.c
+++ b/builtin/hash-object.c
@@ -123,6 +123,9 @@ int cmd_hash_object(int argc, const char **argv, const char *prefix)
 	else
 		prefix = setup_git_directory_gently(&nongit);
 
+	if (nongit && !the_hash_algo)
+		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
 	if (vpath && prefix) {
 		vpath_free = prefix_filename(prefix, vpath);
 		vpath = vpath_free;
diff mbox series

Patch

diff --git a/builtin/hash-object.c b/builtin/hash-object.c
index 82ca6d2bfd..c767414a0c 100644
--- a/builtin/hash-object.c
+++ b/builtin/hash-object.c
@@ -123,6 +123,9 @@  int cmd_hash_object(int argc, const char **argv, const char *prefix)
 	else
 		prefix = setup_git_directory_gently(&nongit);
 
+	if (nongit && !the_hash_algo)
+		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
 	if (vpath && prefix) {
 		vpath_free = prefix_filename(prefix, vpath);
 		vpath = vpath_free;
diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index 64aea38486..4c138c6ca4 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -260,4 +260,10 @@  test_expect_success '--literally with extra-long type' '
 	echo example | git hash-object -t $t --literally --stdin
 '
 
+test_expect_success '--stdin outside of repository' '
+	nongit git hash-object --stdin <hello >actual &&
+	echo "$(test_oid hello)" >expect &&
+	test_cmp expect actual
+'
+
 test_done
diff --git a/t/t1517-outside-repo.sh b/t/t1517-outside-repo.sh
index ac5f3191cc..854bb8f343 100755
--- a/t/t1517-outside-repo.sh
+++ b/t/t1517-outside-repo.sh
@@ -30,7 +30,7 @@  test_expect_success 'compute a patch-id outside repository' '
 	test_cmp patch-id.expect patch-id.actual
 '
 
-test_expect_failure 'hash-object outside repository' '
+test_expect_success 'hash-object outside repository' '
 	git hash-object --stdin <sample.patch >hash.expect &&
 	(
 		cd non-repo &&