diff mbox series

[v12,04/26] ident: add the ability to provide a "fallback identity"

Message ID 57a654887e652251ae966ec31b4604dc8222f9c6.1545331726.git.ungureanupaulsebastian@gmail.com (mailing list archive)
State New, archived
Headers show
Series Convert "git stash" to C builtin | expand

Commit Message

Paul-Sebastian Ungureanu Dec. 20, 2018, 7:44 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

In 3bc2111fc2e9 (stash: tolerate missing user identity, 2018-11-18),
`git stash` learned to provide a fallback identity for the case that no
proper name/email was given (and `git stash` does not really care about
a correct identity anyway, but it does want to create a commit object).

In preparation for the same functionality in the upcoming built-in
version of `git stash`, let's offer the same functionality as an API
function.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 cache.h |  1 +
 ident.c | 20 ++++++++++++++++++++
 2 files changed, 21 insertions(+)

Comments

Junio C Hamano Dec. 26, 2018, 9:21 p.m. UTC | #1
Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com> writes:

> +static void set_env_if(const char *key, const char *value, int *given, int bit)
> +{
> +	if ((*given & bit) || getenv(key))
> +		return; /* nothing to do */
> +	setenv(key, value, 0);
> +	*given |= bit;
> +}

We call setenv(3) with overwrite=0 but we protect the call with a
check for existing value with getenv(3), which feels a bit like an
anti-pattern.  Wouldn't the following be simpler to follow, I wonder?

	if (!(*given & bit)) {
		setenv(key, value, 1);
		*given |= bit;
	}

The only case these two may behave differently is when '*given' does
not have the 'bit' set but the environment 'key' already exists.
The proposed patch will leave 'bit' in '*given' unset, so when a
later code says "let's see if author_ident is explicitly given, and
complain otherwise", such a check will trigger and cause complaint.

On the other hand, the simplified version does not allow the
"explicitly-given" bits to be left unset, so it won't cause
complaint.

Isn't it a BUG() if *given lacks 'bit' when the corresponding
environment variable 'key' is missing?  IOW, I would understand
an implementation that is more elaborate than the simplified one I
just gave above were something like

	if (!(*given & bit)) {
		if (getenv(key))
			BUG("why does %s exist and no %x bit set???", key, bit);
		setenv(key, value, 0);
		*given |= bit;
	}

but I do not quite understand the reasoning behind the "check either
the bit, or the environment variable" in the proposed patch.

> +void prepare_fallback_ident(const char *name, const char *email)
> +{
> +	set_env_if("GIT_AUTHOR_NAME", name,
> +		   &author_ident_explicitly_given, IDENT_NAME_GIVEN);
> +	set_env_if("GIT_AUTHOR_EMAIL", email,
> +		   &author_ident_explicitly_given, IDENT_MAIL_GIVEN);
> +	set_env_if("GIT_COMMITTER_NAME", name,
> +		   &committer_ident_explicitly_given, IDENT_NAME_GIVEN);
> +	set_env_if("GIT_COMMITTER_EMAIL", email,
> +		   &committer_ident_explicitly_given, IDENT_MAIL_GIVEN);
> +}

Introducing this function alone without a caller and without
function doc is a bit unfriendly to future callers, who must be
careful when to call it, I think.  For example, they must know that
it will be a disaster if they call this before they call
git_ident_config(), right?

> +
>  static int buf_cmp(const char *a_begin, const char *a_end,
>  		   const char *b_begin, const char *b_end)
>  {
Johannes Schindelin Dec. 27, 2018, 9:24 p.m. UTC | #2
Hi Junio,

On Wed, 26 Dec 2018, Junio C Hamano wrote:

> Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com> writes:
> 
> > +static void set_env_if(const char *key, const char *value, int *given, int bit)
> > +{
> > +	if ((*given & bit) || getenv(key))
> > +		return; /* nothing to do */
> > +	setenv(key, value, 0);
> > +	*given |= bit;
> > +}
> 
> We call setenv(3) with overwrite=0 but we protect the call with a
> check for existing value with getenv(3), which feels a bit like an
> anti-pattern.  Wouldn't the following be simpler to follow, I wonder?
> 
> 	if (!(*given & bit)) {
> 		setenv(key, value, 1);
> 		*given |= bit;
> 	}
> 
> The only case these two may behave differently is when '*given' does
> not have the 'bit' set but the environment 'key' already exists.

Indeed, this is the case where your version would actually do the wrong
thing. Imagine that GIT_AUTHOR_NAME is set already. Your code would
*override* it. But that is not what we want to do here. We want to *fall
back* if there is no already-configured value.

And of course we won't set the `given` bit if we don't fall back here;
that should be done somewhere else, where that environment variable (that
we *refuse* to overwrite) is *actually* used.

Ciao,
Dscho

> The proposed patch will leave 'bit' in '*given' unset, so when a
> later code says "let's see if author_ident is explicitly given, and
> complain otherwise", such a check will trigger and cause complaint.
> 
> On the other hand, the simplified version does not allow the
> "explicitly-given" bits to be left unset, so it won't cause
> complaint.
> 
> Isn't it a BUG() if *given lacks 'bit' when the corresponding
> environment variable 'key' is missing?  IOW, I would understand
> an implementation that is more elaborate than the simplified one I
> just gave above were something like
> 
> 	if (!(*given & bit)) {
> 		if (getenv(key))
> 			BUG("why does %s exist and no %x bit set???", key, bit);
> 		setenv(key, value, 0);
> 		*given |= bit;
> 	}
> 
> but I do not quite understand the reasoning behind the "check either
> the bit, or the environment variable" in the proposed patch.
> 
> > +void prepare_fallback_ident(const char *name, const char *email)
> > +{
> > +	set_env_if("GIT_AUTHOR_NAME", name,
> > +		   &author_ident_explicitly_given, IDENT_NAME_GIVEN);
> > +	set_env_if("GIT_AUTHOR_EMAIL", email,
> > +		   &author_ident_explicitly_given, IDENT_MAIL_GIVEN);
> > +	set_env_if("GIT_COMMITTER_NAME", name,
> > +		   &committer_ident_explicitly_given, IDENT_NAME_GIVEN);
> > +	set_env_if("GIT_COMMITTER_EMAIL", email,
> > +		   &committer_ident_explicitly_given, IDENT_MAIL_GIVEN);
> > +}
> 
> Introducing this function alone without a caller and without
> function doc is a bit unfriendly to future callers, who must be
> careful when to call it, I think.  For example, they must know that
> it will be a disaster if they call this before they call
> git_ident_config(), right?
> 
> > +
> >  static int buf_cmp(const char *a_begin, const char *a_end,
> >  		   const char *b_begin, const char *b_end)
> >  {
> 
> 
>
Junio C Hamano Dec. 28, 2018, 7:40 p.m. UTC | #3
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Junio,
>
> On Wed, 26 Dec 2018, Junio C Hamano wrote:
>
>> Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com> writes:
>> 
>> > +static void set_env_if(const char *key, const char *value, int *given, int bit)
>> > +{
>> > +	if ((*given & bit) || getenv(key))
>> > +		return; /* nothing to do */
>> > +	setenv(key, value, 0);
>> > +	*given |= bit;
>> > +}
>> 
>> We call setenv(3) with overwrite=0 but we protect the call with a
>> check for existing value with getenv(3), which feels a bit like an
>> anti-pattern.  Wouldn't the following be simpler to follow, I wonder?
>> 
>> 	if (!(*given & bit)) {
>> 		setenv(key, value, 1);
>> 		*given |= bit;
>> 	}
>> 
>> The only case these two may behave differently is when '*given' does
>> not have the 'bit' set but the environment 'key' already exists.
>
> Indeed, this is the case where your version would actually do the wrong
> thing. Imagine that GIT_AUTHOR_NAME is set already. Your code would
> *override* it. But that is not what we want to do here. We want to *fall
> back* if there is no already-configured value.
>
> And of course we won't set the `given` bit if we don't fall back here;
> that should be done somewhere else, where that environment variable (that
> we *refuse* to overwrite) is *actually* used.

OK, so the designed calling sequence of the new "prepare fallback
ident" is that any process that wants to use fallback ident must
call the "prepare" function _before_ making a call to any other
functions (IOW, it is a BUG() if things like git_committer_info() is
called before prepare_fallback_ident() gets called).  Under that
condition, you are absolutely right that the two implementation
behaves differently.

That indeed indicates that this is unfriendly to future callers,
which was the main issue I had with the patch.  A comment before the
prepare_fallback_ident() function to explain that would have helped.

Also the first condition checking bit in *given does not help---it
is quite misleading.  It would have helped if it were

	static void set_env_if( ... )
	{
		if (*given & bit)
			BUG("%s was checked before prepare_fallback got called", key);
		if (getenv(key))
			return; /* nothing to do */
 		setenv(key, value, 1);
		*given |= bit;
	}

Because (author|committer)_ident_explicitly_given would never say
"yes, already" if the setting of fallback MUST be done before using
other API functions in ident.c, it we were to have a check for that
condition, it would be testing for a BUG().  And I wouldn't have
been confused by the code while reviewing.

>> > +void prepare_fallback_ident(const char *name, const char *email)
>> > +{
>> > +	set_env_if("GIT_AUTHOR_NAME", name,
>> > +		   &author_ident_explicitly_given, IDENT_NAME_GIVEN);
>> > +	set_env_if("GIT_AUTHOR_EMAIL", email,
>> > +		   &author_ident_explicitly_given, IDENT_MAIL_GIVEN);
>> > +	set_env_if("GIT_COMMITTER_NAME", name,
>> > +		   &committer_ident_explicitly_given, IDENT_NAME_GIVEN);
>> > +	set_env_if("GIT_COMMITTER_EMAIL", email,
>> > +		   &committer_ident_explicitly_given, IDENT_MAIL_GIVEN);
>> > +}
>> 
>> Introducing this function alone without a caller and without
>> function doc is a bit unfriendly to future callers, who must be
>> careful when to call it, I think.  For example, they must know that
>> it will be a disaster if they call this before they call
>> git_ident_config(), right?

As you can see from this "For example,...", the review comment (and
the "simplified but does the wrong thing" version) was written under
an assumption different from the expected calling sequence the
posted version makes.  What future writers of callers that use the
fallback ident feature must know is (unlike the above) that they
must call the "prepare" before they call git_ident_config() or
anything in ident.c.  A comment before this function that explain
how and when in the program flow this is designed to be used is
needed.

Thanks for a clarification.
diff mbox series

Patch

diff --git a/cache.h b/cache.h
index 28ccc97e10..dd2154bcd3 100644
--- a/cache.h
+++ b/cache.h
@@ -1493,6 +1493,7 @@  extern const char *git_sequence_editor(void);
 extern const char *git_pager(int stdout_is_tty);
 extern int is_terminal_dumb(void);
 extern int git_ident_config(const char *, const char *, void *);
+void prepare_fallback_ident(const char *name, const char *email);
 extern void reset_ident_date(void);
 
 struct ident_split {
diff --git a/ident.c b/ident.c
index 33bcf40644..bce20e8652 100644
--- a/ident.c
+++ b/ident.c
@@ -505,6 +505,26 @@  int git_ident_config(const char *var, const char *value, void *data)
 	return 0;
 }
 
+static void set_env_if(const char *key, const char *value, int *given, int bit)
+{
+	if ((*given & bit) || getenv(key))
+		return; /* nothing to do */
+	setenv(key, value, 0);
+	*given |= bit;
+}
+
+void prepare_fallback_ident(const char *name, const char *email)
+{
+	set_env_if("GIT_AUTHOR_NAME", name,
+		   &author_ident_explicitly_given, IDENT_NAME_GIVEN);
+	set_env_if("GIT_AUTHOR_EMAIL", email,
+		   &author_ident_explicitly_given, IDENT_MAIL_GIVEN);
+	set_env_if("GIT_COMMITTER_NAME", name,
+		   &committer_ident_explicitly_given, IDENT_NAME_GIVEN);
+	set_env_if("GIT_COMMITTER_EMAIL", email,
+		   &committer_ident_explicitly_given, IDENT_MAIL_GIVEN);
+}
+
 static int buf_cmp(const char *a_begin, const char *a_end,
 		   const char *b_begin, const char *b_end)
 {