diff mbox series

[v4,3/9] config API: add and use a "git_config_get()" family of functions

Message ID patch-v4-3.9-998b11ae4bc-20230202T131155Z-avarab@gmail.com (mailing list archive)
State New, archived
Headers show
Series config API: make "multi" safe, fix numerous segfaults | expand

Commit Message

Ævar Arnfjörð Bjarmason Feb. 2, 2023, 1:27 p.m. UTC
We already have the basic "git_config_get_value()" function and its
"repo_*" and "configset" siblings to get a given "key" and assign the
last key found to a provided "value".

But some callers don't care about that value, but just want to use the
return value of the "get_value()" function to check whether the key
exist (or another non-zero return value).

The immediate motivation for this is that a subsequent commit will
need to change all callers of the "*_get_value_multi()" family of
functions. In two cases here we (ab)used it to check whether we had
any values for the given key, but didn't care about the return value.

The rest of the callers here used various other config API functions
to do the same, all of which resolved to the same underlying functions
to provide the answer.

Some of these were using either git_config_get_string() or
git_config_get_string_tmp(), see fe4c750fb13 (submodule--helper: fix a
configure_added_submodule() leak, 2022-09-01) for a recent example. We
can now use a helper function that doesn't require a throwaway
variable.

We could have changed git_configset_get_value_multi() to accept a
"NULL" as a "dest" for all callers, but let's avoid changing the
behavior of existing API users. Having an "unused" value that we throw
away internal to config.c is cheap.

Another name for this function could have been
"*_config_key_exists()", as suggested in [1]. That would work for all
of these callers, and would currently be equivalent to this function,
as the git_configset_get_value() API normalizes all non-zero return
values to a "1".

But adding that API would set us up to lose information, as e.g. if
git_config_parse_key() in the underlying configset_find_element()
fails we'd like to return -1, not 1.

Let's change the underlying configset_find_element() function to
support this use-case, we'll make further use of it in a subsequent
commit where the git_configset_get_value_multi() function itself will
expose this new return value.

1. https://lore.kernel.org/git/xmqqczadkq9f.fsf@gitster.g/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c                |  5 +---
 builtin/submodule--helper.c |  7 +++---
 builtin/worktree.c          |  3 +--
 config.c                    | 50 +++++++++++++++++++++++++++++++------
 config.h                    | 19 +++++++++++++-
 5 files changed, 66 insertions(+), 18 deletions(-)

Comments

Junio C Hamano Feb. 2, 2023, 11:56 p.m. UTC | #1
Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> We already have the basic "git_config_get_value()" function and its
> "repo_*" and "configset" siblings to get a given "key" and assign the
> last key found to a provided "value".
>
> But some callers don't care about that value, but just want to use the
> return value of the "get_value()" function to check whether the key
> exist (or another non-zero return value).
>
> The immediate motivation for this is that a subsequent commit will
> need to change all callers of the "*_get_value_multi()" family of
> functions. In two cases here we (ab)used it to check whether we had
> any values for the given key, but didn't care about the return value.

So, the idea is that 

	if (!git_config_get_string(key, &discard))
		free(discard);
	else
		... the key is missing ...

becomes

	if (git_config_get(key))
		... the key is missing ...

In other words, git_config_get() returns 0 only when the key is
used, and non-zero return signals that the key is not used?

Similarly, get_value_multi() was an interface to get to the
value_list associated to the given key, and was abused like

	if (git_config_get_value_multi(key))
		... the key exists ...

which will become

	if (!git_config_get(key))
		... the key exists ...

right?

>  	/* Set maintenance strategy, if unset */
> -	if (!git_config_get_string("maintenance.strategy", &config_value))
> -		free(config_value);
> -	else
> +	if (git_config_get("maintenance.strategy"))
>  		git_config_set("maintenance.strategy", "incremental");

OK.  config_get() says "true" meaning the key is missing.

> -	if (!argc && git_config_get_value_multi("submodule.active"))
> +	if (!argc && !git_config_get("submodule.active"))
>  		module_list_active(&list);

OK.

> @@ -2743,7 +2743,7 @@ static int module_update(int argc, const char **argv, const char *prefix)
>  		 * If there are no path args and submodule.active is set then,
>  		 * by default, only initialize 'active' modules.
>  		 */
> -		if (!argc && git_config_get_value_multi("submodule.active"))
> +		if (!argc && !git_config_get("submodule.active"))
>  			module_list_active(&list);

OK.

> @@ -3185,7 +3184,7 @@ static void configure_added_submodule(struct add_data *add_data)
>  	 * is_submodule_active(), since that function needs to find
>  	 * out the value of "submodule.active" again anyway.
>  	 */
> -	if (!git_config_get_string_tmp("submodule.active", &val)) {
> +	if (!git_config_get("submodule.active")) {

string_tmp() variant is to retrieve borrowed value, and it returns 0
when there is a value.  If it is a valueless true, we get -1 back
with an error message.  What does the updated version do in the
valueless true case?

> diff --git a/builtin/worktree.c b/builtin/worktree.c
> index f51c40f1e1e..6ba42d4ad20 100644
> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -338,7 +337,7 @@ static void copy_filtered_worktree_config(const char *worktree_git_dir)
>  				to_file, "core.bare", NULL, "true", 0))
>  			error(_("failed to unset '%s' in '%s'"),
>  				"core.bare", to_file);
> -		if (!git_configset_get_value(&cs, "core.worktree", &core_worktree) &&
> +		if (!git_configset_get(&cs, "core.worktree") &&

OK.

> diff --git a/config.c b/config.c
> index 00090a32fc3..b88da70c664 100644
> --- a/config.c
> +++ b/config.c
> @@ -2289,23 +2289,28 @@ void read_very_early_config(config_fn_t cb, void *data)
>  	config_with_options(cb, data, NULL, &opts);
>  }
>  
> -static struct config_set_element *configset_find_element(struct config_set *cs, const char *key)
> +static int configset_find_element(struct config_set *cs, const char *key,
> +				  struct config_set_element **dest)
>  {
>  	struct config_set_element k;
>  	struct config_set_element *found_entry;
>  	char *normalized_key;
> +	int ret;
> +
>  	/*
>  	 * `key` may come from the user, so normalize it before using it
>  	 * for querying entries from the hashmap.
>  	 */
> -	if (git_config_parse_key(key, &normalized_key, NULL))
> -		return NULL;
> +	ret = git_config_parse_key(key, &normalized_key, NULL);
> +	if (ret)
> +		return ret;
>  
>  	hashmap_entry_init(&k.ent, strhash(normalized_key));
>  	k.key = normalized_key;
>  	found_entry = hashmap_get_entry(&cs->config_hash, &k, ent, NULL);
>  	free(normalized_key);
> -	return found_entry;
> +	*dest = found_entry;
> +	return 0;

OK, so we used to return NULL when the key is not parseable, and
otherwise we returned the config_set_element we found for the key,
or NULL if there is no such element.  Now we return error code as
the return value and allow the caller to peek the element via *dest
parameter.

So, from the caller's point of view (dest != NULL) is how it checks
if the key is used.  The function returning 0 is a sign that the key
passed to it is healthy.  OK.

> @@ -2314,8 +2319,11 @@ static int configset_add_value(struct config_set *cs, const char *key, const cha
>  	struct string_list_item *si;
>  	struct configset_list_item *l_item;
>  	struct key_value_info *kv_info = xmalloc(sizeof(*kv_info));
> +	int ret;
>  
> -	e = configset_find_element(cs, key);
> +	ret = configset_find_element(cs, key, &e);
> +	if (ret)
> +		return ret;

The function never returned any meaningful error, so the callers may
not be prepared to see such an error return.  But now we at least
notice an error at this level.

> @@ -2425,8 +2433,25 @@ int git_configset_get_value(struct config_set *cs, const char *key, const char *
>  
>  const struct string_list *git_configset_get_value_multi(struct config_set *cs, const char *key)
>  {
> -	struct config_set_element *e = configset_find_element(cs, key);
> -	return e ? &e->value_list : NULL;
> +	struct config_set_element *e;
> +
> +	if (configset_find_element(cs, key, &e))
> +		return NULL;
> +	else if (!e)
> +		return NULL;
> +	return &e->value_list;
> +}

OK.  !e means "we got a healthy key and peeking into the hash table,
there wasn't any entry for the key", and that is reported with NULL.
Do we evern return a string list with .nr == 0, I wonder.  Having to
deal with such a list would make the caller's job more complex, but
perhaps we are not allowing the code to shrink value_list.nr to
avoid such a situation?

> +int git_configset_get(struct config_set *cs, const char *key)
> +{
> +	struct config_set_element *e;
> +	int ret;
> +
> +	if ((ret = configset_find_element(cs, key, &e)))
> +		return ret;
> +	else if (!e)
> +		return 1;
> +	return 0;
>  }

OK.  So 0 return from the function means there is a value (or more)
for a given key.  Good.

> diff --git a/config.h b/config.h
> index ef9eade6414..04c5e594015 100644
> --- a/config.h
> +++ b/config.h
> @@ -471,9 +471,12 @@ void git_configset_clear(struct config_set *cs);
>  
>  /*
>   * These functions return 1 if not found, and 0 if found, leaving the found
> - * value in the 'dest' pointer.
> + * value in the 'dest' pointer (if any).
>   */

Now the returned non-zero values are no longer 1 alone, no?
Whatever lower-level functions use to signal an error is propagated
up with the

	if ((ret = func())
		return ret;

pattern.
Glen Choo Feb. 6, 2023, 12:36 p.m. UTC | #2
I think introducing a new function to replace the *_get_value_multi()
abuses is a good way forward. Junio has already adequately commented on
your implementation, so I'll focus this review on a different approach
that this patch could have taken.

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> We already have the basic "git_config_get_value()" function and its
> "repo_*" and "configset" siblings to get a given "key" and assign the
> last key found to a provided "value".
>
> But some callers don't care about that value, but just want to use the
> return value of the "get_value()" function to check whether the key
> exist (or another non-zero return value).

[...]

> We could have changed git_configset_get_value_multi() to accept a
> "NULL" as a "dest" for all callers, but let's avoid changing the
> behavior of existing API users. Having an "unused" value that we throw
> away internal to config.c is cheap.

There is yet another option, which is to teach "git_config_get_value()"
(mentioned earlier) to accept NULL to mean "I just want to know if there
is a value, I don't care what it is". That's what the *_get_<type>()
functions use under the hood (i.e. the ones that return either 0 or 1 or
exit).

This amounts to implementing the "*_config_key_exists()" API you
mentioned, but I think this is better fit for the current set of
semantics. At the very least, that would be an easy 1-1 replacement for
the *_get_string[_tmp]() replacements we make here. There's also the
small benefit of saving one function implementation.

> Another name for this function could have been
> "*_config_key_exists()", as suggested in [1]. That would work for all
> of these callers, and would currently be equivalent to this function,
> as the git_configset_get_value() API normalizes all non-zero return
> values to a "1".
>
> But adding that API would set us up to lose information, as e.g. if
> git_config_parse_key() in the underlying configset_find_element()
> fails we'd like to return -1, not 1.

We were already 'losing' (or rather, not caring about) this information
with the *_get_<type>() functions. The only reason we'd care about this
is if we using git_configset_get_value_multi() or similar.

We replace two callers of git_configset_get_value_multi() in this patch,
but they didn't care about the -1 case anyway...

> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -557,7 +557,7 @@ static int module_init(int argc, const char **argv, const char *prefix)
>  	 * If there are no path args and submodule.active is set then,
>  	 * by default, only initialize 'active' modules.
>  	 */
> -	if (!argc && git_config_get_value_multi("submodule.active"))
> +	if (!argc && !git_config_get("submodule.active"))
>  		module_list_active(&list);
>  
>  	info.prefix = prefix;
> @@ -2743,7 +2743,7 @@ static int module_update(int argc, const char **argv, const char *prefix)
>  		 * If there are no path args and submodule.active is set then,
>  		 * by default, only initialize 'active' modules.
>  		 */
> -		if (!argc && git_config_get_value_multi("submodule.active"))
> +		if (!argc && !git_config_get("submodule.active"))
>  			module_list_active(&list);
>  
>  		info.prefix = opt.prefix;

Here they are.

> diff --git a/config.h b/config.h
> index ef9eade6414..04c5e594015 100644
> --- a/config.h
> +++ b/config.h
> @@ -471,9 +471,12 @@ void git_configset_clear(struct config_set *cs);
>  
>  /*
>   * These functions return 1 if not found, and 0 if found, leaving the found
> - * value in the 'dest' pointer.
> + * value in the 'dest' pointer (if any).
>   */
>  
> +RESULT_MUST_BE_USED
> +int git_configset_get(struct config_set *cs, const char *key);
> +

As Junio pointed out, git_configset_get() can now return -1, so this
isn't so accurate any more. git_configset_get() is really the exception
here, since all the other functions in this section are the
git_configset_get_*() functions that use git_configset_get_value(). I'd
prefer returning only 0 or 1 for consistency.

>  /*
>   * Finds the highest-priority value for the configuration variable `key`
>   * and config set `cs`, stores the pointer to it in `value` and returns 0.
> @@ -494,6 +497,14 @@ int git_configset_get_pathname(struct config_set *cs, const char *key, const cha
>  /* Functions for reading a repository's config */
>  struct repository;
>  void repo_config(struct repository *repo, config_fn_t fn, void *data);
> +
> +/**
> + * Run only the discover part of the repo_config_get_*() functions
> + * below, in addition to 1 if not found, returns negative values on
> + * error (e.g. if the key itself is invalid).
> + */
> +RESULT_MUST_BE_USED
> +int repo_config_get(struct repository *repo, const char *key);

This comment is quite a welcome addition. I've found myself losing track
of this information quite often
Glen Choo Feb. 6, 2023, 12:37 p.m. UTC | #3
I think introducing a new function to replace the *_get_value_multi()
abuses is a good way forward. Junio has already adequately commented on
your implementation, so I'll focus this review on a different approach
that this patch could have taken.

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> We already have the basic "git_config_get_value()" function and its
> "repo_*" and "configset" siblings to get a given "key" and assign the
> last key found to a provided "value".
>
> But some callers don't care about that value, but just want to use the
> return value of the "get_value()" function to check whether the key
> exist (or another non-zero return value).

[...]

> We could have changed git_configset_get_value_multi() to accept a
> "NULL" as a "dest" for all callers, but let's avoid changing the
> behavior of existing API users. Having an "unused" value that we throw
> away internal to config.c is cheap.

There is yet another option, which is to teach "git_config_get_value()"
(mentioned earlier) to accept NULL to mean "I just want to know if there
is a value, I don't care what it is". That's what the *_get_<type>()
functions use under the hood (i.e. the ones that return either 0 or 1 or
exit).

This amounts to implementing the "*_config_key_exists()" API you
mentioned, but I think this is better fit for the current set of
semantics. At the very least, that would be an easy 1-1 replacement for
the *_get_string[_tmp]() replacements we make here. There's also the
small benefit of saving one function implementation.

> Another name for this function could have been
> "*_config_key_exists()", as suggested in [1]. That would work for all
> of these callers, and would currently be equivalent to this function,
> as the git_configset_get_value() API normalizes all non-zero return
> values to a "1".
>
> But adding that API would set us up to lose information, as e.g. if
> git_config_parse_key() in the underlying configset_find_element()
> fails we'd like to return -1, not 1.

We were already 'losing' (or rather, not caring about) this information
with the *_get_<type>() functions. The only reason we'd care about this
is if we using git_configset_get_value_multi() or similar.

We replace two callers of git_configset_get_value_multi() in this patch,
but they didn't care about the -1 case anyway...

> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -557,7 +557,7 @@ static int module_init(int argc, const char **argv, const char *prefix)
>  	 * If there are no path args and submodule.active is set then,
>  	 * by default, only initialize 'active' modules.
>  	 */
> -	if (!argc && git_config_get_value_multi("submodule.active"))
> +	if (!argc && !git_config_get("submodule.active"))
>  		module_list_active(&list);
>  
>  	info.prefix = prefix;
> @@ -2743,7 +2743,7 @@ static int module_update(int argc, const char **argv, const char *prefix)
>  		 * If there are no path args and submodule.active is set then,
>  		 * by default, only initialize 'active' modules.
>  		 */
> -		if (!argc && git_config_get_value_multi("submodule.active"))
> +		if (!argc && !git_config_get("submodule.active"))
>  			module_list_active(&list);
>  
>  		info.prefix = opt.prefix;

Here they are.

> diff --git a/config.h b/config.h
> index ef9eade6414..04c5e594015 100644
> --- a/config.h
> +++ b/config.h
> @@ -471,9 +471,12 @@ void git_configset_clear(struct config_set *cs);
>  
>  /*
>   * These functions return 1 if not found, and 0 if found, leaving the found
> - * value in the 'dest' pointer.
> + * value in the 'dest' pointer (if any).
>   */
>  
> +RESULT_MUST_BE_USED
> +int git_configset_get(struct config_set *cs, const char *key);
> +

As Junio pointed out, git_configset_get() can now return -1, so this
isn't so accurate any more. git_configset_get() is really the exception
here, since all the other functions in this section are the
git_configset_get_*() functions that use git_configset_get_value(). I'd
prefer returning only 0 or 1 for consistency.

>  /*
>   * Finds the highest-priority value for the configuration variable `key`
>   * and config set `cs`, stores the pointer to it in `value` and returns 0.
> @@ -494,6 +497,14 @@ int git_configset_get_pathname(struct config_set *cs, const char *key, const cha
>  /* Functions for reading a repository's config */
>  struct repository;
>  void repo_config(struct repository *repo, config_fn_t fn, void *data);
> +
> +/**
> + * Run only the discover part of the repo_config_get_*() functions
> + * below, in addition to 1 if not found, returns negative values on
> + * error (e.g. if the key itself is invalid).
> + */
> +RESULT_MUST_BE_USED
> +int repo_config_get(struct repository *repo, const char *key);

This comment is quite a welcome addition. I've found myself losing track
of this information quite often
Ævar Arnfjörð Bjarmason Feb. 7, 2023, 10:29 a.m. UTC | #4
On Thu, Feb 02 2023, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> We already have the basic "git_config_get_value()" function and its
>> "repo_*" and "configset" siblings to get a given "key" and assign the
>> last key found to a provided "value".
>>
>> But some callers don't care about that value, but just want to use the
>> return value of the "get_value()" function to check whether the key
>> exist (or another non-zero return value).
>>
>> The immediate motivation for this is that a subsequent commit will
>> need to change all callers of the "*_get_value_multi()" family of
>> functions. In two cases here we (ab)used it to check whether we had
>> any values for the given key, but didn't care about the return value.
>
> So, the idea is that 
>
> 	if (!git_config_get_string(key, &discard))
> 		free(discard);
> 	else
> 		... the key is missing ...
>
> becomes
>
> 	if (git_config_get(key))
> 		... the key is missing ...
>
> In other words, git_config_get() returns 0 only when the key is
> used, and non-zero return signals that the key is not used?
>
> Similarly, get_value_multi() was an interface to get to the
> value_list associated to the given key, and was abused like
>
> 	if (git_config_get_value_multi(key))
> 		... the key exists ...
>
> which will become
>
> 	if (!git_config_get(key))
> 		... the key exists ...
>
> right?

Yes, I've amended this in a re-roll to add tests to the configset tests,
so how the new API should be used becomes obvious.

>> @@ -3185,7 +3184,7 @@ static void configure_added_submodule(struct add_data *add_data)
>>  	 * is_submodule_active(), since that function needs to find
>>  	 * out the value of "submodule.active" again anyway.
>>  	 */
>> -	if (!git_config_get_string_tmp("submodule.active", &val)) {
>> +	if (!git_config_get("submodule.active")) {
>
> string_tmp() variant is to retrieve borrowed value, and it returns 0
> when there is a value.  If it is a valueless true, we get -1 back
> with an error message.  What does the updated version do in the
> valueless true case?

No, we'll get back 0, as a value-less key exists. 

This sort of code would be correct in general, as if we really want to
check if the key exists a value-less is a valid boolean true.

In this case we happen to segfault later on if we encounter such a key,
but that's unrelated to this being correct.

The segfault is then fixed later in this topic.

>> @@ -2425,8 +2433,25 @@ int git_configset_get_value(struct config_set *cs, const char *key, const char *
>>  
>>  const struct string_list *git_configset_get_value_multi(struct config_set *cs, const char *key)
>>  {
>> -	struct config_set_element *e = configset_find_element(cs, key);
>> -	return e ? &e->value_list : NULL;
>> +	struct config_set_element *e;
>> +
>> +	if (configset_find_element(cs, key, &e))
>> +		return NULL;
>> +	else if (!e)
>> +		return NULL;
>> +	return &e->value_list;
>> +}
>
> OK.  !e means "we got a healthy key and peeking into the hash table,
> there wasn't any entry for the key", and that is reported with NULL.
> Do we evern return a string list with .nr == 0, I wonder.  Having to
> deal with such a list would make the caller's job more complex, but
> perhaps we are not allowing the code to shrink value_list.nr to
> avoid such a situation?

No, we never return a list with .nr == 0. That's why Stolee's earlier
RFC tried to solve a subset of the problem this topic addresse by
returning such a list as a way to indicate "does not exist".

That would work to an extent, but would leave the main problem (which
Stolee wasn't aware of at the time) of having a ".nr > 0" list with
"NULL" elements in it.

It also wouldn't be idiomatic with the rest of the API, as this topic
shows, and means you can't distinguish non-existence from other errors.

>> +int git_configset_get(struct config_set *cs, const char *key)
>> +{
>> +	struct config_set_element *e;
>> +	int ret;
>> +
>> +	if ((ret = configset_find_element(cs, key, &e)))
>> +		return ret;
>> +	else if (!e)
>> +		return 1;
>> +	return 0;
>>  }
>
> OK.  So 0 return from the function means there is a value (or more)
> for a given key.  Good.
>
>> diff --git a/config.h b/config.h
>> index ef9eade6414..04c5e594015 100644
>> --- a/config.h
>> +++ b/config.h
>> @@ -471,9 +471,12 @@ void git_configset_clear(struct config_set *cs);
>>  
>>  /*
>>   * These functions return 1 if not found, and 0 if found, leaving the found
>> - * value in the 'dest' pointer.
>> + * value in the 'dest' pointer (if any).
>>   */
>
> Now the returned non-zero values are no longer 1 alone, no?
> Whatever lower-level functions use to signal an error is propagated
> up with the
>
> 	if ((ret = func())
> 		return ret;

That's still mostly true, but I should have added the new
git_configset_get() below this comment, and split it up, i.e. it's still
true for the functions that populate "dest".

I have a topic-on-top to fix those (to propagate them up), and it was
hard to know where to draw the line, in this case I got it wrong. Will
fix it.
Ævar Arnfjörð Bjarmason Feb. 7, 2023, 11:52 a.m. UTC | #5
On Mon, Feb 06 2023, Glen Choo wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> We already have the basic "git_config_get_value()" function and its
>> "repo_*" and "configset" siblings to get a given "key" and assign the
>> last key found to a provided "value".
>>
>> But some callers don't care about that value, but just want to use the
>> return value of the "get_value()" function to check whether the key
>> exist (or another non-zero return value).
>
> [...]
>
>> We could have changed git_configset_get_value_multi() to accept a
>> "NULL" as a "dest" for all callers, but let's avoid changing the
>> behavior of existing API users. Having an "unused" value that we throw
>> away internal to config.c is cheap.
>
> There is yet another option, which is to teach "git_config_get_value()"
> (mentioned earlier) to accept NULL to mean "I just want to know if there
> is a value, I don't care what it is". That's what the *_get_<type>()
> functions use under the hood (i.e. the ones that return either 0 or 1 or
> exit).

I've clarified the commit message, but that's the same as what this is
describing. I.e. I meant "git_config_get_value_multi() and the functions
that are wrapping it", which is "git_config_get_value()" etc.

> This amounts to implementing the "*_config_key_exists()" API you
> mentioned, but I think this is better fit for the current set of
> semantics. At the very least, that would be an easy 1-1 replacement for
> the *_get_string[_tmp]() replacements we make here. There's also the
> small benefit of saving one function implementation.

I think this is the wrong approach, and have updated the commit message
further to advocate for this one.

>> Another name for this function could have been
>> "*_config_key_exists()", as suggested in [1]. That would work for all
>> of these callers, and would currently be equivalent to this function,
>> as the git_configset_get_value() API normalizes all non-zero return
>> values to a "1".
>>
>> But adding that API would set us up to lose information, as e.g. if
>> git_config_parse_key() in the underlying configset_find_element()
>> fails we'd like to return -1, not 1.
>
> We were already 'losing' (or rather, not caring about) this information
> with the *_get_<type>() functions. The only reason we'd care about this
> is if we using git_configset_get_value_multi() or similar.
>
> We replace two callers of git_configset_get_value_multi() in this patch,
> but they didn't care about the -1 case anyway...
> [...]
> As Junio pointed out, git_configset_get() can now return -1, so this
> isn't so accurate any more. git_configset_get() is really the exception
> here, since all the other functions in this section are the
> git_configset_get_*() functions that use git_configset_get_value(). I'd
> prefer returning only 0 or 1 for consistency.

I really prefer not clobbering these return values, but I take your
point that the end-state here is inconsistent.

I figured that I could fix it for the APIs added here, and follow-up
(with a patch I already had mostly ready) after this series to fix the
remaining config API warts.

I won't fix all of those in the incoming re-roll of this, but I'll fix
this issue, i.e. we'll consistently ferry up "ret", and stop normalizing
non-zero-non-1 to "return 1".
diff mbox series

Patch

diff --git a/builtin/gc.c b/builtin/gc.c
index 02455fdcd73..e38d1783f30 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1493,7 +1493,6 @@  static int maintenance_register(int argc, const char **argv, const char *prefix)
 	};
 	int found = 0;
 	const char *key = "maintenance.repo";
-	char *config_value;
 	char *maintpath = get_maintpath();
 	struct string_list_item *item;
 	const struct string_list *list;
@@ -1508,9 +1507,7 @@  static int maintenance_register(int argc, const char **argv, const char *prefix)
 	git_config_set("maintenance.auto", "false");
 
 	/* Set maintenance strategy, if unset */
-	if (!git_config_get_string("maintenance.strategy", &config_value))
-		free(config_value);
-	else
+	if (git_config_get("maintenance.strategy"))
 		git_config_set("maintenance.strategy", "incremental");
 
 	list = git_config_get_value_multi(key);
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 4c173d8b37a..2278e8c91cb 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -557,7 +557,7 @@  static int module_init(int argc, const char **argv, const char *prefix)
 	 * If there are no path args and submodule.active is set then,
 	 * by default, only initialize 'active' modules.
 	 */
-	if (!argc && git_config_get_value_multi("submodule.active"))
+	if (!argc && !git_config_get("submodule.active"))
 		module_list_active(&list);
 
 	info.prefix = prefix;
@@ -2743,7 +2743,7 @@  static int module_update(int argc, const char **argv, const char *prefix)
 		 * If there are no path args and submodule.active is set then,
 		 * by default, only initialize 'active' modules.
 		 */
-		if (!argc && git_config_get_value_multi("submodule.active"))
+		if (!argc && !git_config_get("submodule.active"))
 			module_list_active(&list);
 
 		info.prefix = opt.prefix;
@@ -3140,7 +3140,6 @@  static int config_submodule_in_gitmodules(const char *name, const char *var, con
 static void configure_added_submodule(struct add_data *add_data)
 {
 	char *key;
-	const char *val;
 	struct child_process add_submod = CHILD_PROCESS_INIT;
 	struct child_process add_gitmodules = CHILD_PROCESS_INIT;
 
@@ -3185,7 +3184,7 @@  static void configure_added_submodule(struct add_data *add_data)
 	 * is_submodule_active(), since that function needs to find
 	 * out the value of "submodule.active" again anyway.
 	 */
-	if (!git_config_get_string_tmp("submodule.active", &val)) {
+	if (!git_config_get("submodule.active")) {
 		/*
 		 * If the submodule being added isn't already covered by the
 		 * current configured pathspec, set the submodule's active flag
diff --git a/builtin/worktree.c b/builtin/worktree.c
index f51c40f1e1e..6ba42d4ad20 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -319,7 +319,6 @@  static void copy_filtered_worktree_config(const char *worktree_git_dir)
 
 	if (file_exists(from_file)) {
 		struct config_set cs = { { 0 } };
-		const char *core_worktree;
 		int bare;
 
 		if (safe_create_leading_directories(to_file) ||
@@ -338,7 +337,7 @@  static void copy_filtered_worktree_config(const char *worktree_git_dir)
 				to_file, "core.bare", NULL, "true", 0))
 			error(_("failed to unset '%s' in '%s'"),
 				"core.bare", to_file);
-		if (!git_configset_get_value(&cs, "core.worktree", &core_worktree) &&
+		if (!git_configset_get(&cs, "core.worktree") &&
 			git_config_set_in_file_gently(to_file,
 							"core.worktree", NULL))
 			error(_("failed to unset '%s' in '%s'"),
diff --git a/config.c b/config.c
index 00090a32fc3..b88da70c664 100644
--- a/config.c
+++ b/config.c
@@ -2289,23 +2289,28 @@  void read_very_early_config(config_fn_t cb, void *data)
 	config_with_options(cb, data, NULL, &opts);
 }
 
-static struct config_set_element *configset_find_element(struct config_set *cs, const char *key)
+static int configset_find_element(struct config_set *cs, const char *key,
+				  struct config_set_element **dest)
 {
 	struct config_set_element k;
 	struct config_set_element *found_entry;
 	char *normalized_key;
+	int ret;
+
 	/*
 	 * `key` may come from the user, so normalize it before using it
 	 * for querying entries from the hashmap.
 	 */
-	if (git_config_parse_key(key, &normalized_key, NULL))
-		return NULL;
+	ret = git_config_parse_key(key, &normalized_key, NULL);
+	if (ret)
+		return ret;
 
 	hashmap_entry_init(&k.ent, strhash(normalized_key));
 	k.key = normalized_key;
 	found_entry = hashmap_get_entry(&cs->config_hash, &k, ent, NULL);
 	free(normalized_key);
-	return found_entry;
+	*dest = found_entry;
+	return 0;
 }
 
 static int configset_add_value(struct config_set *cs, const char *key, const char *value)
@@ -2314,8 +2319,11 @@  static int configset_add_value(struct config_set *cs, const char *key, const cha
 	struct string_list_item *si;
 	struct configset_list_item *l_item;
 	struct key_value_info *kv_info = xmalloc(sizeof(*kv_info));
+	int ret;
 
-	e = configset_find_element(cs, key);
+	ret = configset_find_element(cs, key, &e);
+	if (ret)
+		return ret;
 	/*
 	 * Since the keys are being fed by git_config*() callback mechanism, they
 	 * are already normalized. So simply add them without any further munging.
@@ -2425,8 +2433,25 @@  int git_configset_get_value(struct config_set *cs, const char *key, const char *
 
 const struct string_list *git_configset_get_value_multi(struct config_set *cs, const char *key)
 {
-	struct config_set_element *e = configset_find_element(cs, key);
-	return e ? &e->value_list : NULL;
+	struct config_set_element *e;
+
+	if (configset_find_element(cs, key, &e))
+		return NULL;
+	else if (!e)
+		return NULL;
+	return &e->value_list;
+}
+
+int git_configset_get(struct config_set *cs, const char *key)
+{
+	struct config_set_element *e;
+	int ret;
+
+	if ((ret = configset_find_element(cs, key, &e)))
+		return ret;
+	else if (!e)
+		return 1;
+	return 0;
 }
 
 int git_configset_get_string(struct config_set *cs, const char *key, char **dest)
@@ -2565,6 +2590,12 @@  void repo_config(struct repository *repo, config_fn_t fn, void *data)
 	configset_iter(repo->config, fn, data);
 }
 
+int repo_config_get(struct repository *repo, const char *key)
+{
+	git_config_check_init(repo);
+	return git_configset_get(repo->config, key);
+}
+
 int repo_config_get_value(struct repository *repo,
 			  const char *key, const char **value)
 {
@@ -2679,6 +2710,11 @@  void git_config_clear(void)
 	repo_config_clear(the_repository);
 }
 
+int git_config_get(const char *key)
+{
+	return repo_config_get(the_repository, key);
+}
+
 int git_config_get_value(const char *key, const char **value)
 {
 	return repo_config_get_value(the_repository, key, value);
diff --git a/config.h b/config.h
index ef9eade6414..04c5e594015 100644
--- a/config.h
+++ b/config.h
@@ -471,9 +471,12 @@  void git_configset_clear(struct config_set *cs);
 
 /*
  * These functions return 1 if not found, and 0 if found, leaving the found
- * value in the 'dest' pointer.
+ * value in the 'dest' pointer (if any).
  */
 
+RESULT_MUST_BE_USED
+int git_configset_get(struct config_set *cs, const char *key);
+
 /*
  * Finds the highest-priority value for the configuration variable `key`
  * and config set `cs`, stores the pointer to it in `value` and returns 0.
@@ -494,6 +497,14 @@  int git_configset_get_pathname(struct config_set *cs, const char *key, const cha
 /* Functions for reading a repository's config */
 struct repository;
 void repo_config(struct repository *repo, config_fn_t fn, void *data);
+
+/**
+ * Run only the discover part of the repo_config_get_*() functions
+ * below, in addition to 1 if not found, returns negative values on
+ * error (e.g. if the key itself is invalid).
+ */
+RESULT_MUST_BE_USED
+int repo_config_get(struct repository *repo, const char *key);
 int repo_config_get_value(struct repository *repo,
 			  const char *key, const char **value);
 const struct string_list *repo_config_get_value_multi(struct repository *repo,
@@ -530,8 +541,14 @@  void git_protected_config(config_fn_t fn, void *data);
  * manner, the config API provides two functions `git_config_get_value`
  * and `git_config_get_value_multi`. They both read values from an internal
  * cache generated previously from reading the config files.
+ *
+ * For those git_config_get*() functions that aren't documented,
+ * consult the corresponding repo_config_get*() function's
+ * documentation.
  */
 
+int git_config_get(const char *key);
+
 /**
  * Finds the highest-priority value for the configuration variable `key`,
  * stores the pointer to it in `value` and returns 0. When the