show-index: fix uninitialized hash function

Message ID 20240712142326.266533-1-abhijeet.nkt@gmail.com (mailing list archive)
State Superseded

Commit Message

Abhijeet Sonar July 12, 2024, 2:23 p.m. UTC
As stated in the docs, show-index should use SHA1 as the default hash algorithm
when run outsize of a repository.  However, 'the_hash_algo' is currently left
uninitialized if we are not in a repository and no explicit hash funciton is
specified, causing a crash.  Fix it by falling back to SHA1 when it is found
uninitialized.

Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
---
 builtin/show-index.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Junio C Hamano July 12, 2024, 3:35 p.m. UTC | #1
Abhijeet Sonar <abhijeet.nkt@gmail.com> writes:

> As stated in the docs, show-index should use SHA1 as the default hash algorithm
> when run outsize of a repository.  However, 'the_hash_algo' is currently left
> uninitialized if we are not in a repository and no explicit hash funciton is
> specified, causing a crash.  Fix it by falling back to SHA1 when it is found
> uninitialized.
>
> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
> ---
>  builtin/show-index.c | 3 +++
>  1 file changed, 3 insertions(+)

Nicely described.

We'd probably want to protect this with a new test, so that
regardless of the choice of GIT_TEST_DEFAULT_HASH, the command
should behave as advertised.

Having said that, I am not sure if --object-format specified on the
command line, or picked up from the repository, makes much sense in
the context of the command, especially for the longer term [*].  The
command is designed to read from its standard input a byte-stream,
which is assumed to be an .idx file of _any_ origin, so ideally it
should be able to tell what hash the incoming data uses and use that
hash algorithm, without being told from the command line?

But that longer-term worry has nothing to do with the validity of
this patch (but the lack of test does).  Thanks.

[Footnote]

 * Perhaps the file format does not make it obvious what hash
   algorithm it uses, so it may be hard to auto-detect without
   additional code.  But if that is the case, it would be something
   we may want to eventually fix.

Eric Sunshine July 12, 2024, 4:53 p.m. UTC | #2
On Fri, Jul 12, 2024 at 10:24 AM Abhijeet Sonar <abhijeet.nkt@gmail.com> wrote:
> As stated in the docs, show-index should use SHA1 as the default hash algorithm
> when run outsize of a repository.  However, 'the_hash_algo' is currently left
> uninitialized if we are not in a repository and no explicit hash funciton is

s/funciton/function/

> specified, causing a crash.  Fix it by falling back to SHA1 when it is found
> uninitialized.
>
> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>

Abhijeet Sonar July 15, 2024, 10:31 a.m. UTC | #3
On 12/07/24 21:05, Junio C Hamano wrote:
> Abhijeet Sonar <abhijeet.nkt@gmail.com> writes:
> 
>> As stated in the docs, show-index should use SHA1 as the default hash algorithm
>> when run outsize of a repository.  However, 'the_hash_algo' is currently left
>> uninitialized if we are not in a repository and no explicit hash funciton is
>> specified, causing a crash.  Fix it by falling back to SHA1 when it is found
>> uninitialized.
>>
>> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
>> ---
>>  builtin/show-index.c | 3 +++
>>  1 file changed, 3 insertions(+)
> 
> Nicely described.
> 
> We'd probably want to protect this with a new test, so that
> regardless of the choice of GIT_TEST_DEFAULT_HASH, the command
> should behave as advertised.

I wrote a test which builds an index file using a `hash-object |
pack-objects` chain.  I am not sure if it's the best way to do this, and
I would appreciate some guidance.

Another way I can think of is having an index file sit along with the
tests in the codebase which will be read by `show-index` instead of
generating one on the fly.  Thoughts?

Thanks.

> 
> Having said that, I am not sure if --object-format specified on the
> command line, or picked up from the repository, makes much sense in
> the context of the command, especially for the longer term [*].  The
> command is designed to read from its standard input a byte-stream,
> which is assumed to be an .idx file of _any_ origin, so ideally it
> should be able to tell what hash the incoming data uses and use that
> hash algorithm, without being told from the command line?
> 
> But that longer-term worry has nothing to do with the validity of
> this patch (but the lack of test does).  Thanks.
> 
> [Footnote]
> 
>  * Perhaps the file format does not make it obvious what hash
>    algorithm it uses, so it may be hard to auto-detect without
>    additional code.  But if that is the case, it would be something
>    we may want to eventually fix.
>

Patch

diff --git a/builtin/show-index.c b/builtin/show-index.c
index 540dc3dad1..bb6d9e3c40 100644
--- a/builtin/show-index.c
+++ b/builtin/show-index.c
@@ -35,6 +35,9 @@  int cmd_show_index(int argc, const char **argv, const char *prefix)
 		repo_set_hash_algo(the_repository, hash_algo);
 	}
 
+	if (!the_hash_algo)
+		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
 	hashsz = the_hash_algo->rawsz;
 
 	if (fread(top_index, 2 * 4, 1, stdin) != 1)
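
For readers without a Git checkout at hand, the fallback in the hunk above can be sketched in isolation.  This is only an illustration of the pattern, not Git's code: `struct hash_algo_sketch`, `struct repo_sketch`, `ensure_hash_algo()` and `raw_hash_size()` are hypothetical stand-ins for `the_hash_algo`, `the_repository` and `repo_set_hash_algo()`, and only the raw digest sizes (20 bytes for SHA-1, 32 for SHA-256) are taken as given.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Git's hash algorithm descriptor. */
struct hash_algo_sketch {
	const char *name;
	size_t rawsz;	/* raw digest size in bytes */
};

const struct hash_algo_sketch sketch_sha1 = { "sha1", 20 };
const struct hash_algo_sketch sketch_sha256 = { "sha256", 32 };

/* Hypothetical stand-in for a repository whose hash may still be unset. */
struct repo_sketch {
	const struct hash_algo_sketch *hash_algo;	/* NULL until chosen */
};

/* Apply the documented default only if nothing has picked a hash yet. */
void ensure_hash_algo(struct repo_sketch *repo)
{
	if (!repo->hash_algo)
		repo->hash_algo = &sketch_sha1;
}

/* Dereferencing is safe here because the fallback runs first. */
size_t raw_hash_size(struct repo_sketch *repo)
{
	ensure_hash_algo(repo);
	return repo->hash_algo->rawsz;
}

/* Small self-check covering the unset case and an explicit choice. */
void sketch_self_check(void)
{
	struct repo_sketch unset = { NULL };
	struct repo_sketch chosen = { &sketch_sha256 };

	assert(raw_hash_size(&unset) == sketch_sha1.rawsz);
	assert(raw_hash_size(&chosen) == sketch_sha256.rawsz);
}
```

The fallback is deliberately placed after any explicit selection and before the first dereference, mirroring where the patch puts its check, so an explicitly chosen algorithm is never overridden.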