diff mbox series

[v5,02/12] sha1-file: provide functions to look up hash algorithms

Message ID 20181104234458.139223-3-sandals@crustytoothpaste.net (mailing list archive)
State New, archived
Headers show
Series Base SHA-256 implementation | expand

Commit Message

brian m. carlson Nov. 4, 2018, 11:44 p.m. UTC
There are several ways we might refer to a hash algorithm: by name, such
as in the config file; by format ID, such as in a pack; or internally,
by a pointer to the hash_algos array.  Provide functions to look up hash
algorithms based on these various forms and return the internal constant
used for them.  If conversion to another form is necessary, this
internal constant can be used to look up the proper data in the
hash_algos array.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 hash.h      | 13 +++++++++++++
 sha1-file.c | 21 +++++++++++++++++++++
 2 files changed, 34 insertions(+)

Comments

Derrick Stolee Nov. 13, 2018, 6:42 p.m. UTC | #1
On 11/4/2018 6:44 PM, brian m. carlson wrote:
> +int hash_algo_by_name(const char *name)
> +{
> +	int i;
> +	if (!name)
> +		return GIT_HASH_UNKNOWN;
> +	for (i = 1; i < GIT_HASH_NALGOS; i++)
> +		if (!strcmp(name, hash_algos[i].name))
> +			return i;
> +	return GIT_HASH_UNKNOWN;
> +}
> +

Today's test coverage report [1] shows this method is not covered in the 
test suite. Looking at 'pu', it doesn't have any callers.

Do you have a work in progress series that will use this? Could we add a 
test-tool to exercise this somehow?

Thanks,
-Stolee

[1] 
https://public-inbox.org/git/97be3e21-6a8c-9718-5161-37318f6d685f@gmail.com/
Duy Nguyen Nov. 13, 2018, 6:45 p.m. UTC | #2
On Tue, Nov 13, 2018 at 7:42 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 11/4/2018 6:44 PM, brian m. carlson wrote:
> > +int hash_algo_by_name(const char *name)
> > +{
> > +     int i;
> > +     if (!name)
> > +             return GIT_HASH_UNKNOWN;
> > +     for (i = 1; i < GIT_HASH_NALGOS; i++)
> > +             if (!strcmp(name, hash_algos[i].name))
> > +                     return i;
> > +     return GIT_HASH_UNKNOWN;
> > +}
> > +
>
> Today's test coverage report [1] shows this method is not covered in the
> test suite. Looking at 'pu', it doesn't have any callers.
>
> Do you have a work in progress series that will use this? Could we add a
> test-tool to exercise this somehow?

Going by the function name, it should be used when hash-object or
other commands learn about --object-format=<name>.
Ramsay Jones Nov. 14, 2018, 12:11 a.m. UTC | #3
On 13/11/2018 18:42, Derrick Stolee wrote:
> On 11/4/2018 6:44 PM, brian m. carlson wrote:
>> +int hash_algo_by_name(const char *name)
>> +{
>> +    int i;
>> +    if (!name)
>> +        return GIT_HASH_UNKNOWN;
>> +    for (i = 1; i < GIT_HASH_NALGOS; i++)
>> +        if (!strcmp(name, hash_algos[i].name))
>> +            return i;
>> +    return GIT_HASH_UNKNOWN;
>> +}
>> +
> 
> Today's test coverage report [1] shows this method is not covered in the test suite. Looking at 'pu', it doesn't have any callers.
> 
> Do you have a work in progress series that will use this? Could we add a test-tool to exercise this somehow?

There are actually 4 unused external symbols resulting from Brian's
'bc/sha-256' branch. The new unused externals in 'pu' looks like:

    $ diff nsc psc
    37a38,39
    > hex.o	- hash_to_hex
    > hex.o	- hash_to_hex_algop_r
    48a51
    > parse-options.o	- optname
    71a75
    > sha1-file.o	- for_each_file_in_obj_subdir
    72a77,78
    > sha1-file.o	- hash_algo_by_id
    > sha1-file.o	- hash_algo_by_name
    $ 

The symbols from hex.o and sha1-file.o being the 4 symbols from
this branch.

I suspect that upcoming patches will make use of them. ;-)

ATB,
Ramsay Jones
Ramsay Jones Nov. 14, 2018, 12:42 a.m. UTC | #4
On 14/11/2018 00:11, Ramsay Jones wrote:
> 
> 
> On 13/11/2018 18:42, Derrick Stolee wrote:
>> On 11/4/2018 6:44 PM, brian m. carlson wrote:
>>> +int hash_algo_by_name(const char *name)
>>> +{
>>> +    int i;
>>> +    if (!name)
>>> +        return GIT_HASH_UNKNOWN;
>>> +    for (i = 1; i < GIT_HASH_NALGOS; i++)
>>> +        if (!strcmp(name, hash_algos[i].name))
>>> +            return i;
>>> +    return GIT_HASH_UNKNOWN;
>>> +}
>>> +
>>
>> Today's test coverage report [1] shows this method is not covered in the test suite. Looking at 'pu', it doesn't have any callers.
>>
>> Do you have a work in progress series that will use this? Could we add a test-tool to exercise this somehow?
> 
> There are actually 4 unused external symbols resulting from Brian's
> 'bc/sha-256' branch. The new unused externals in 'pu' looks like:
> 
>     $ diff nsc psc
>     37a38,39
>     > hex.o	- hash_to_hex
>     > hex.o	- hash_to_hex_algop_r
>     48a51
>     > parse-options.o	- optname
>     71a75
>     > sha1-file.o	- for_each_file_in_obj_subdir
>     72a77,78
>     > sha1-file.o	- hash_algo_by_id
>     > sha1-file.o	- hash_algo_by_name
>     $ 
> 
> The symbols from hex.o and sha1-file.o being the 4 symbols from
> this branch.
> 
> I suspect that upcoming patches will make use of them. ;-)

BTW, if you were puzzling over the 3rd symbol from sha1-file.o
(which I wasn't counting in the 4 symbols above! ;-) ), then I
believe that is because Jeff's commit 3a2e08245c ("object-store: 
provide helpers for loose_objects_cache", 2018-11-12) effectively
moved the only call outside of sha1-file.c (in sha1-name.c) back
into sha1-file.c

So, for_each_file_in_obj_subdir() could now be marked 'static'.
(whether it should is a different issue).

ATB,
Ramsay Jones
Jeff King Nov. 14, 2018, 12:51 a.m. UTC | #5
On Wed, Nov 14, 2018 at 12:42:45AM +0000, Ramsay Jones wrote:

> BTW, if you were puzzling over the 3rd symbol from sha1-file.o
> (which I wasn't counting in the 4 symbols above! ;-) ), then I
> believe that is because Jeff's commit 3a2e08245c ("object-store: 
> provide helpers for loose_objects_cache", 2018-11-12) effectively
> moved the only call outside of sha1-file.c (in sha1-name.c) back
> into sha1-file.c
> 
> So, for_each_file_in_obj_subdir() could now be marked 'static'.
> (whether it should is a different issue).

I think it would be reasonable to do so. Most code shouldn't have to
care about the on-disk hashing structure, so ideally it wouldn't be part
of the public interface.

-Peff
brian m. carlson Nov. 14, 2018, 1:01 a.m. UTC | #6
On Tue, Nov 13, 2018 at 07:45:41PM +0100, Duy Nguyen wrote:
> On Tue, Nov 13, 2018 at 7:42 PM Derrick Stolee <stolee@gmail.com> wrote:
> >
> > On 11/4/2018 6:44 PM, brian m. carlson wrote:
> > > +int hash_algo_by_name(const char *name)
> > > +{
> > > +     int i;
> > > +     if (!name)
> > > +             return GIT_HASH_UNKNOWN;
> > > +     for (i = 1; i < GIT_HASH_NALGOS; i++)
> > > +             if (!strcmp(name, hash_algos[i].name))
> > > +                     return i;
> > > +     return GIT_HASH_UNKNOWN;
> > > +}
> > > +
> >
> > Today's test coverage report [1] shows this method is not covered in the
> > test suite. Looking at 'pu', it doesn't have any callers.
> >
> > Do you have a work in progress series that will use this? Could we add a
> > test-tool to exercise this somehow?
> 
> Going by the function name, it should be used when hash-object or
> other commands learn about --object-format=<name>.

Correct.  I have extensions.objectFormat code that will use this.
brian m. carlson Nov. 14, 2018, 2:11 a.m. UTC | #7
On Wed, Nov 14, 2018 at 12:11:07AM +0000, Ramsay Jones wrote:
> 
> 
> On 13/11/2018 18:42, Derrick Stolee wrote:
> > On 11/4/2018 6:44 PM, brian m. carlson wrote:
> >> +int hash_algo_by_name(const char *name)
> >> +{
> >> +    int i;
> >> +    if (!name)
> >> +        return GIT_HASH_UNKNOWN;
> >> +    for (i = 1; i < GIT_HASH_NALGOS; i++)
> >> +        if (!strcmp(name, hash_algos[i].name))
> >> +            return i;
> >> +    return GIT_HASH_UNKNOWN;
> >> +}
> >> +
> > 
> > Today's test coverage report [1] shows this method is not covered in the test suite. Looking at 'pu', it doesn't have any callers.
> > 
> > Do you have a work in progress series that will use this? Could we add a test-tool to exercise this somehow?
> 
> There are actually 4 unused external symbols resulting from Brian's
> 'bc/sha-256' branch. The new unused externals in 'pu' looks like:
> 
>     $ diff nsc psc
>     37a38,39
>     > hex.o	- hash_to_hex

I have code that uses this in my object-id-part15 series.  I also have
another series coming after this one that makes heavy use of it.

>     > hex.o	- hash_to_hex_algop_r

I believe this is because it's inline, since it is indeed used just a
few lines below its definition.  I'll drop the inline, since it's meant
to be externally visible.

>     > sha1-file.o	- hash_algo_by_id

This will be used when I write pack index v3, which will be in my
object-id-part15 series.

>     > sha1-file.o	- hash_algo_by_name

This is used in my object-id-part15 series.
Ramsay Jones Nov. 14, 2018, 3:53 a.m. UTC | #8
On 14/11/2018 02:11, brian m. carlson wrote:
> On Wed, Nov 14, 2018 at 12:11:07AM +0000, Ramsay Jones wrote:
>>
>>
>> On 13/11/2018 18:42, Derrick Stolee wrote:
>>> On 11/4/2018 6:44 PM, brian m. carlson wrote:
>>>> +int hash_algo_by_name(const char *name)
>>>> +{
>>>> +    int i;
>>>> +    if (!name)
>>>> +        return GIT_HASH_UNKNOWN;
>>>> +    for (i = 1; i < GIT_HASH_NALGOS; i++)
>>>> +        if (!strcmp(name, hash_algos[i].name))
>>>> +            return i;
>>>> +    return GIT_HASH_UNKNOWN;
>>>> +}
>>>> +
>>>
>>> Today's test coverage report [1] shows this method is not covered in the test suite. Looking at 'pu', it doesn't have any callers.
>>>
>>> Do you have a work in progress series that will use this? Could we add a test-tool to exercise this somehow?
>>
>> There are actually 4 unused external symbols resulting from Brian's
>> 'bc/sha-256' branch. The new unused externals in 'pu' looks like:
>>
>>     $ diff nsc psc
>>     37a38,39
>>     > hex.o	- hash_to_hex
> 
> I have code that uses this in my object-id-part15 series.  I also have
> another series coming after this one that makes heavy use of it.
> 
>>     > hex.o	- hash_to_hex_algop_r
> 
> I believe this is because it's inline, since it is indeed used just a
> few lines below its definition.  I'll drop the inline, since it's meant
> to be externally visible.

No, this has nothing to do with the 'inline', it is simply not
called outside of hex.c (at present). If you look at the assembler
(objdump -d hex.o), you will find practically all of the function
calls in that file are inlined (even those not marked with 'inline').

[I think the external declaration in cache.h forces the compiler to
add the external definition, despite the 'inline'. If you remove the
'inline' and re-compile and disassemble again, the result is identical.]

Thanks for confirming upcoming patches will add uses for all of
these functions - I suspected that would be the case.

Thanks!

ATB,
Ramsay Jones

> 
>>     > sha1-file.o	- hash_algo_by_id
> 
> This will be used when I write pack index v3, which will be in my
> object-id-part15 series.
> 
>>     > sha1-file.o	- hash_algo_by_name
> 
> This is used in my object-id-part15 series.
>
diff mbox series

Patch

diff --git a/hash.h b/hash.h
index 7c8238bc2e..80881eea47 100644
--- a/hash.h
+++ b/hash.h
@@ -98,4 +98,17 @@  struct git_hash_algo {
 };
 extern const struct git_hash_algo hash_algos[GIT_HASH_NALGOS];
 
+/*
+ * Return a GIT_HASH_* constant based on the name.  Returns GIT_HASH_UNKNOWN if
+ * the name doesn't match a known algorithm.
+ */
+int hash_algo_by_name(const char *name);
+/* Identical, except based on the format ID. */
+int hash_algo_by_id(uint32_t format_id);
+/* Identical, except for a pointer to struct git_hash_algo. */
+static inline int hash_algo_by_ptr(const struct git_hash_algo *p)
+{
+	return p - hash_algos;
+}
+
 #endif
diff --git a/sha1-file.c b/sha1-file.c
index 91311ebb3d..7e9dedc744 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -122,6 +122,27 @@  const char *empty_blob_oid_hex(void)
 	return oid_to_hex_r(buf, the_hash_algo->empty_blob);
 }
 
+int hash_algo_by_name(const char *name)
+{
+	int i;
+	if (!name)
+		return GIT_HASH_UNKNOWN;
+	for (i = 1; i < GIT_HASH_NALGOS; i++)
+		if (!strcmp(name, hash_algos[i].name))
+			return i;
+	return GIT_HASH_UNKNOWN;
+}
+
+int hash_algo_by_id(uint32_t format_id)
+{
+	int i;
+	for (i = 1; i < GIT_HASH_NALGOS; i++)
+		if (format_id == hash_algos[i].format_id)
+			return i;
+	return GIT_HASH_UNKNOWN;
+}
+
+
 /*
  * This is meant to hold a *small* number of objects that you would
  * want read_sha1_file() to be able to return, but yet you do not want