diff mbox series

[v5,1/5] setup: add an escape hatch for "no more default hash algorithm" change

Message ID 20240520231434.1816979-2-gitster@pobox.com (mailing list archive)
State Accepted
Commit d3b2ff75fd7056bc88ea838e8ed5c772ba149e2c
Headers show
Series Fix use of uninitialized hash algorithms | expand

Commit Message

Junio C Hamano May 20, 2024, 11:14 p.m. UTC
Partially revert c8aed5e8 (repository: stop setting SHA1 as the
default object hash, 2024-05-07), to keep end-user systems still
broken when we have gap in our test coverage but yet give them an
escape hatch to set the GIT_TEST_DEFAULT_HASH_ALGO environment
variable to "sha1" in order to revert to the previous behaviour.

Due to the way the end-user facing GIT_DEFAULT_HASH environment
variable is used in our test suite, we unfortunately cannot reuse it
for this purpose.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 repository.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

Comments

Patrick Steinhardt May 21, 2024, 7:57 a.m. UTC | #1
On Mon, May 20, 2024 at 04:14:30PM -0700, Junio C Hamano wrote:
> Partially revert c8aed5e8 (repository: stop setting SHA1 as the
> default object hash, 2024-05-07), to keep end-user systems still
> broken when we have gap in our test coverage but yet give them an
> escape hatch to set the GIT_TEST_DEFAULT_HASH_ALGO environment
> variable to "sha1" in order to revert to the previous behaviour.
> 
> Due to the way the end-user facing GIT_DEFAULT_HASH environment
> variable is used in our test suite, we unfortunately cannot reuse it
> for this purpose.

Okay, so this now really only is an escape hatch for users as we do not
specify it in our tests anymore. It does make me wonder whether we want
to name the envvar accordingly, but don't mind it much either way. It is
only intended to be a stop gap solution anyway that we can eventually
drop once we are sufficiently sure that there is no further breakage.

Patrick
Junio C Hamano May 21, 2024, 3:59 p.m. UTC | #2
Patrick Steinhardt <ps@pks.im> writes:

> On Mon, May 20, 2024 at 04:14:30PM -0700, Junio C Hamano wrote:
>> Partially revert c8aed5e8 (repository: stop setting SHA1 as the
>> default object hash, 2024-05-07), to keep end-user systems still
>> broken when we have gap in our test coverage but yet give them an
>> escape hatch to set the GIT_TEST_DEFAULT_HASH_ALGO environment
>> variable to "sha1" in order to revert to the previous behaviour.
>> 
>> Due to the way the end-user facing GIT_DEFAULT_HASH environment
>> variable is used in our test suite, we unfortunately cannot reuse it
>> for this purpose.
>
> Okay, so this now really only is an escape hatch for users as we do not
> specify it in our tests anymore. It does make me wonder whether we want
> to name the envvar accordingly, but don't mind it much either way. It is
> only intended to be a stop gap solution anyway that we can eventually
> drop once we are sufficiently sure that there is no further breakage.

Yes, I think the name of that escape hatch variable should be
irrelevant by the time we are reasonably confident that we could
remove it ;-)

But the proposed log message should make that intention more clear.
How does this read?

    ... give them an escape hatch ... in order to revert to the
    previous behaviour, just in case we haven't done a thorough job
    in fixing the fallout from c8aed5e8.  After we build confidence,
    we should remove the escape hatch support, but we are not there
    yet after only fixing three commands (hash-object, apply, and
    patch-id) in this series.

Thanks.
diff mbox series

Patch

diff --git a/repository.c b/repository.c
index 15c10015b0..c62e329878 100644
--- a/repository.c
+++ b/repository.c
@@ -19,6 +19,28 @@ 
 static struct repository the_repo;
 struct repository *the_repository = &the_repo;
 
+/*
+ * An escape hatch: if we hit a bug in the production code that fails
+ * to set an appropriate hash algorithm (most likely to happen when
+ * running outside a repository), we can tell the user who reported
+ * the crash to set the environment variable to "sha1" (all lowercase)
+ * to revert to the historical behaviour of defaulting to SHA-1.
+ */
+static void set_default_hash_algo(struct repository *repo)
+{
+	const char *hash_name;
+	int algo;
+
+	hash_name = getenv("GIT_TEST_DEFAULT_HASH_ALGO");
+	if (!hash_name)
+		return;
+	algo = hash_algo_by_name(hash_name);
+	if (algo == GIT_HASH_UNKNOWN)
+		return;
+
+	repo_set_hash_algo(repo, algo);
+}
+
 void initialize_repository(struct repository *repo)
 {
 	repo->objects = raw_object_store_new();
@@ -26,6 +48,28 @@  void initialize_repository(struct repository *repo)
 	repo->parsed_objects = parsed_object_pool_new();
 	ALLOC_ARRAY(repo->index, 1);
 	index_state_init(repo->index, repo);
+
+	/*
+	 * When a command runs inside a repository, it learns what
+	 * hash algorithm is in use from the repository, but some
+	 * commands are designed to work outside a repository, yet
+	 * they want to access the_hash_algo, if only for the length
+	 * of the hashed value to see if their input looks like a
+	 * plausible hash value.
+	 *
+	 * We are in the process of identifying such code paths and
+	 * giving them an appropriate default individually; any
+	 * unconverted code paths that try to access the_hash_algo
+	 * will thus fail.  The end-users however have an escape hatch
+	 * to set GIT_TEST_DEFAULT_HASH_ALGO environment variable to
+	 * "sha1" to get back the old behaviour of defaulting to SHA-1.
+	 *
+	 * This escape hatch is deliberately kept unadvertised, so
+	 * that they see crashes and we can get a report before
+	 * telling them about it.
+	 */
+	if (repo == the_repository)
+		set_default_hash_algo(repo);
 }
 
 static void expand_base_dir(char **out, const char *in,