mbox series

[v2,0/2] config: allow specifying config entries via envvar pairs

Message ID cover.1606214397.git.ps@pks.im (mailing list archive)
Headers show
Series config: allow specifying config entries via envvar pairs | expand

Message

Patrick Steinhardt Nov. 24, 2020, 10:50 a.m. UTC
Hi,

this is the second version of my patch series which aims to implement a
way to pass config entries via the environment while avoiding any
requirements to perform shell quoting on the user's side.

There's been quite some feedback on the first version, which I tried to
include in this version. Changes include:

    - I reworked how git detects which variables it should process.
      Instead of iterating from GIT_CONFIG_KEY_0 to $n until we find the
      first gap, this now uses a third environment variable
      GIT_CONFIG_COUNT which specifies show many environment config
      pairs should be processed. I've added this variable to the local
      environment variables, so that it's properly unset when moving
      between repos and printed by `git rev-parse --local-env-vars`.

    - Missing GIT_CONFIG_VALUE_$n keys for a given key are now treated
      as an error. The same is true for any environment value which
      should exist based on the value of GIT_CONFIG_COUNT.

    - I've changed priorities. The envvars are treated as command-level
      and as such override all values configured in files. But any
      explicit `git -c key=value` will now override these envvars.

    - I've improved test coverage to also nail down priorities.

Patrick

Patrick Steinhardt (2):
  config: extract function to parse config pairs
  config: allow specifying config entries via envvar pairs

 Documentation/git-config.txt |   9 +++
 cache.h                      |   1 +
 config.c                     |  96 +++++++++++++++++++++++++-------
 environment.c                |   1 +
 t/t1300-config.sh            | 105 ++++++++++++++++++++++++++++++++++-
 5 files changed, 190 insertions(+), 22 deletions(-)

Comments

Jeff King Nov. 25, 2020, 10:41 a.m. UTC | #1
On Tue, Nov 24, 2020 at 11:50:46AM +0100, Patrick Steinhardt wrote:

>     - I've changed priorities. The envvars are treated as command-level
>       and as such override all values configured in files. But any
>       explicit `git -c key=value` will now override these envvars.

That ordering makes sense. Those get passed through the environment,
too, but at some point there is a process where your new ones are in the
environment and the "-c" ones are on the command-line.

I do still think that a "--config-env" option solves your problem in a
much simpler way (especially in terms of interface we expose to users
that we'll be locked into forever). I sketched out the solution below if
it's of interest (and I'd be happy to polish it up, or hand it off to
you if so). But if you're unconvinced, I'll stop mentioning it.

diff --git a/config.c b/config.c
index 8f324ed3a6..d8cf6a5d6b 100644
--- a/config.c
+++ b/config.c
@@ -345,6 +345,27 @@ void git_config_push_parameter(const char *text)
 	strbuf_release(&env);
 }
 
+void git_config_push_env(const char *spec)
+{
+	struct strbuf buf = STRBUF_INIT;
+	const char *env_name;
+	const char *env_value;
+
+	env_name = strchr(spec, '=');
+	if (!env_name)
+		return; /* die or warn? */
+	env_name++;
+
+	env_value = getenv(env_name);
+	if (!env_value)
+		return; /* die or warn? */
+
+	strbuf_add(&buf, spec, env_name - spec);
+	strbuf_addstr(&buf, env_value);
+	git_config_push_parameter(buf.buf);
+	strbuf_release(&buf);
+}
+
 static inline int iskeychar(int c)
 {
 	return isalnum(c) || c == '-';
diff --git a/config.h b/config.h
index 91cdfbfb41..d05651c96c 100644
--- a/config.h
+++ b/config.h
@@ -138,6 +138,7 @@ int git_config_from_mem(config_fn_t fn,
 int git_config_from_blob_oid(config_fn_t fn, const char *name,
 			     const struct object_id *oid, void *data);
 void git_config_push_parameter(const char *text);
+void git_config_push_env(const char *spec);
 int git_config_from_parameters(config_fn_t fn, void *data);
 void read_early_config(config_fn_t cb, void *data);
 void read_very_early_config(config_fn_t cb, void *data);
diff --git a/git.c b/git.c
index 4b7bd77b80..342f2fb0c9 100644
--- a/git.c
+++ b/git.c
@@ -254,6 +254,8 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
 			git_config_push_parameter((*argv)[1]);
 			(*argv)++;
 			(*argc)--;
+		} else if (skip_prefix(cmd, "--config-env=", &cmd)) {
+			git_config_push_env(cmd);
 		} else if (!strcmp(cmd, "--literal-pathspecs")) {
 			setenv(GIT_LITERAL_PATHSPECS_ENVIRONMENT, "1", 1);
 			if (envchanged)
Patrick Steinhardt Nov. 25, 2020, 1:16 p.m. UTC | #2
On Wed, Nov 25, 2020 at 05:41:14AM -0500, Jeff King wrote:
> On Tue, Nov 24, 2020 at 11:50:46AM +0100, Patrick Steinhardt wrote:
> 
> >     - I've changed priorities. The envvars are treated as command-level
> >       and as such override all values configured in files. But any
> >       explicit `git -c key=value` will now override these envvars.
> 
> That ordering makes sense. Those get passed through the environment,
> too, but at some point there is a process where your new ones are in the
> environment and the "-c" ones are on the command-line.
> 
> I do still think that a "--config-env" option solves your problem in a
> much simpler way (especially in terms of interface we expose to users
> that we'll be locked into forever). I sketched out the solution below if
> it's of interest (and I'd be happy to polish it up, or hand it off to
> you if so). But if you're unconvinced, I'll stop mentioning it.

The thing I like more about using envvars only is that you only need to
modify a single part, while with `--config-env` there's two moving
parts. E.g. assume you have a script and want certain configuration to
apply to all git commands in that script. It's trivial in the envvar
case, while for `--config-env` you'll also have to modify each single
git call. You could get around that by using a wrapper, but it's still a
tad more involved. A second thing I briefly wondered about is the
maximum command line length, which may be easier to hit in case you want
to pass a lot of config entries.

None of these complaints apply to my original usecase, where
`--config-env` would work equally well. But I do think that for
scripting use, which is going to be most of all cases where my patch
series is useful, GIT_CONFIG_COUNT is easier to use.

There probably are good arguments for `--config-env`, for example that
it's easier to spot when executing a git command. I stil lean towards my
current implementation, but I'm obviously biased. So if there is
consensus that we should use `--config-env` instead, I'm not opposed.

Patrick
Junio C Hamano Nov. 25, 2020, 8:28 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

> I do still think that a "--config-env" option solves your problem in a
> much simpler way (especially in terms of interface we expose to users
> that we'll be locked into forever).

As a mechanism to allow a custom configuration for a single
invocation of a command, I tend to agree.  For a mechansim to affect
multiple commands in a sequence (read: scripts), I am not so sure.

The simplicity of the implementation we see below is also very
attractive.

> I sketched out the solution below if
> it's of interest (and I'd be happy to polish it up, or hand it off to
> you if so). But if you're unconvinced, I'll stop mentioning it.
>
> diff --git a/config.c b/config.c
> index 8f324ed3a6..d8cf6a5d6b 100644
> --- a/config.c
> +++ b/config.c
> @@ -345,6 +345,27 @@ void git_config_push_parameter(const char *text)
>  	strbuf_release(&env);
>  }
>  
> +void git_config_push_env(const char *spec)
> +{
> +	struct strbuf buf = STRBUF_INIT;
> +	const char *env_name;
> +	const char *env_value;
> +
> +	env_name = strchr(spec, '=');
> +	if (!env_name)
> +		return; /* die or warn? */
> +	env_name++;
> +
> +	env_value = getenv(env_name);
> +	if (!env_value)
> +		return; /* die or warn? */
> +
> +	strbuf_add(&buf, spec, env_name - spec);
> +	strbuf_addstr(&buf, env_value);
> +	git_config_push_parameter(buf.buf);
> +	strbuf_release(&buf);
> +}
> +
>  static inline int iskeychar(int c)
>  {
>  	return isalnum(c) || c == '-';
> diff --git a/config.h b/config.h
> index 91cdfbfb41..d05651c96c 100644
> --- a/config.h
> +++ b/config.h
> @@ -138,6 +138,7 @@ int git_config_from_mem(config_fn_t fn,
>  int git_config_from_blob_oid(config_fn_t fn, const char *name,
>  			     const struct object_id *oid, void *data);
>  void git_config_push_parameter(const char *text);
> +void git_config_push_env(const char *spec);
>  int git_config_from_parameters(config_fn_t fn, void *data);
>  void read_early_config(config_fn_t cb, void *data);
>  void read_very_early_config(config_fn_t cb, void *data);
> diff --git a/git.c b/git.c
> index 4b7bd77b80..342f2fb0c9 100644
> --- a/git.c
> +++ b/git.c
> @@ -254,6 +254,8 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
>  			git_config_push_parameter((*argv)[1]);
>  			(*argv)++;
>  			(*argc)--;
> +		} else if (skip_prefix(cmd, "--config-env=", &cmd)) {
> +			git_config_push_env(cmd);
>  		} else if (!strcmp(cmd, "--literal-pathspecs")) {
>  			setenv(GIT_LITERAL_PATHSPECS_ENVIRONMENT, "1", 1);
>  			if (envchanged)
brian m. carlson Nov. 25, 2020, 10:47 p.m. UTC | #4
On 2020-11-25 at 10:41:14, Jeff King wrote:
> On Tue, Nov 24, 2020 at 11:50:46AM +0100, Patrick Steinhardt wrote:
> 
> >     - I've changed priorities. The envvars are treated as command-level
> >       and as such override all values configured in files. But any
> >       explicit `git -c key=value` will now override these envvars.
> 
> That ordering makes sense. Those get passed through the environment,
> too, but at some point there is a process where your new ones are in the
> environment and the "-c" ones are on the command-line.

Yeah, I agree this would be the right way to go.

> I do still think that a "--config-env" option solves your problem in a
> much simpler way (especially in terms of interface we expose to users
> that we'll be locked into forever). I sketched out the solution below if
> it's of interest (and I'd be happy to polish it up, or hand it off to
> you if so). But if you're unconvinced, I'll stop mentioning it.

I do rather prefer this approach over the multiple key-value pairs.  I
think the use case of scripts could probably be easily solved with an
additional environment variable like so:

  args="--config-env abc.def=GHI --config-env jkl.mno=PQR"

This isn't necessarily super elegant, but I like it more than needing
to handle many key-value pairs.

But while I do have a moderately strong preference, I'm not going to
argue for blocking the series if you still want to go this way.
Jeff King Nov. 26, 2020, 12:36 a.m. UTC | #5
On Wed, Nov 25, 2020 at 02:16:38PM +0100, Patrick Steinhardt wrote:

> > I do still think that a "--config-env" option solves your problem in a
> > much simpler way (especially in terms of interface we expose to users
> > that we'll be locked into forever). I sketched out the solution below if
> > it's of interest (and I'd be happy to polish it up, or hand it off to
> > you if so). But if you're unconvinced, I'll stop mentioning it.
> 
> The thing I like more about using envvars only is that you only need to
> modify a single part, while with `--config-env` there's two moving
> parts. E.g. assume you have a script and want certain configuration to
> apply to all git commands in that script. It's trivial in the envvar
> case, while for `--config-env` you'll also have to modify each single
> git call. You could get around that by using a wrapper, but it's still a
> tad more involved. A second thing I briefly wondered about is the
> maximum command line length, which may be easier to hit in case you want
> to pass a lot of config entries.

Yeah, that's true. I haven't typically run across this myself because
usually such a script ends up invoked by git itself. I.e., it is
git-foo, and then I do:

  git -c some.var=value foo

which puts everything in the environment, but it's done by Git itself,
so the exact environment format remains opaque.

-Peff
Patrick Steinhardt Nov. 26, 2020, 6:31 a.m. UTC | #6
On Wed, Nov 25, 2020 at 10:47:37PM +0000, brian m. carlson wrote:
> On 2020-11-25 at 10:41:14, Jeff King wrote:
> > On Tue, Nov 24, 2020 at 11:50:46AM +0100, Patrick Steinhardt wrote:
> > I do still think that a "--config-env" option solves your problem in a
> > much simpler way (especially in terms of interface we expose to users
> > that we'll be locked into forever). I sketched out the solution below if
> > it's of interest (and I'd be happy to polish it up, or hand it off to
> > you if so). But if you're unconvinced, I'll stop mentioning it.
> 
> I do rather prefer this approach over the multiple key-value pairs.  I
> think the use case of scripts could probably be easily solved with an
> additional environment variable like so:
> 
>   args="--config-env abc.def=GHI --config-env jkl.mno=PQR"
> 
> This isn't necessarily super elegant, but I like it more than needing
> to handle many key-value pairs.
> 
> But while I do have a moderately strong preference, I'm not going to
> argue for blocking the series if you still want to go this way.

In the end, it probably boils down to taste. Both work to solve the
problem at hand while there are tradeoffs for other usecases for both.

Ultimately, those two ways are not mutually exclusive and we could even
implement both. So I might as well include Peffs patch in this series,
even though I'm not sure whether adding two new ways of doing things at
the same time would be welcome.

Patrick
Patrick Steinhardt Dec. 1, 2020, 9:47 a.m. UTC | #7
On Wed, Nov 25, 2020 at 05:41:14AM -0500, Jeff King wrote:
> On Tue, Nov 24, 2020 at 11:50:46AM +0100, Patrick Steinhardt wrote:
> 
> >     - I've changed priorities. The envvars are treated as command-level
> >       and as such override all values configured in files. But any
> >       explicit `git -c key=value` will now override these envvars.
> 
> That ordering makes sense. Those get passed through the environment,
> too, but at some point there is a process where your new ones are in the
> environment and the "-c" ones are on the command-line.
> 
> I do still think that a "--config-env" option solves your problem in a
> much simpler way (especially in terms of interface we expose to users
> that we'll be locked into forever). I sketched out the solution below if
> it's of interest (and I'd be happy to polish it up, or hand it off to
> you if so). But if you're unconvinced, I'll stop mentioning it.
> 
> diff --git a/config.c b/config.c
> index 8f324ed3a6..d8cf6a5d6b 100644
> --- a/config.c
> +++ b/config.c
> @@ -345,6 +345,27 @@ void git_config_push_parameter(const char *text)
>  	strbuf_release(&env);
>  }
>  
> +void git_config_push_env(const char *spec)
> +{
> +	struct strbuf buf = STRBUF_INIT;
> +	const char *env_name;
> +	const char *env_value;
> +
> +	env_name = strchr(spec, '=');
> +	if (!env_name)
> +		return; /* die or warn? */
> +	env_name++;
> +
> +	env_value = getenv(env_name);
> +	if (!env_value)
> +		return; /* die or warn? */
> +
> +	strbuf_add(&buf, spec, env_name - spec);
> +	strbuf_addstr(&buf, env_value);
> +	git_config_push_parameter(buf.buf);
> +	strbuf_release(&buf);
> +}

I realize that you say it's yet unpolished, but doesn't this have
parsing issues? The first strchr(3P) probably needs to be a strrchr(3P)
to correctly parse `includeIf./home/foo/=repo.path=MY_PATH_ENV`. But
we'd also have to handle shell quoting for the user, don't we?

Anyway, I'd be happy to adopt is as part of the series if we care
enough. For now I'll send out the current state I have though.

Patrick

>  static inline int iskeychar(int c)
>  {
>  	return isalnum(c) || c == '-';
> diff --git a/config.h b/config.h
> index 91cdfbfb41..d05651c96c 100644
> --- a/config.h
> +++ b/config.h
> @@ -138,6 +138,7 @@ int git_config_from_mem(config_fn_t fn,
>  int git_config_from_blob_oid(config_fn_t fn, const char *name,
>  			     const struct object_id *oid, void *data);
>  void git_config_push_parameter(const char *text);
> +void git_config_push_env(const char *spec);
>  int git_config_from_parameters(config_fn_t fn, void *data);
>  void read_early_config(config_fn_t cb, void *data);
>  void read_very_early_config(config_fn_t cb, void *data);
> diff --git a/git.c b/git.c
> index 4b7bd77b80..342f2fb0c9 100644
> --- a/git.c
> +++ b/git.c
> @@ -254,6 +254,8 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
>  			git_config_push_parameter((*argv)[1]);
>  			(*argv)++;
>  			(*argc)--;
> +		} else if (skip_prefix(cmd, "--config-env=", &cmd)) {
> +			git_config_push_env(cmd);
>  		} else if (!strcmp(cmd, "--literal-pathspecs")) {
>  			setenv(GIT_LITERAL_PATHSPECS_ENVIRONMENT, "1", 1);
>  			if (envchanged)
Jeff King Dec. 1, 2020, 11:30 a.m. UTC | #8
On Tue, Dec 01, 2020 at 10:47:56AM +0100, Patrick Steinhardt wrote:

> > +void git_config_push_env(const char *spec)
> > +{
> > +	struct strbuf buf = STRBUF_INIT;
> > +	const char *env_name;
> > +	const char *env_value;
> > +
> > +	env_name = strchr(spec, '=');
> > +	if (!env_name)
> > +		return; /* die or warn? */
> > +	env_name++;
> > +
> > +	env_value = getenv(env_name);
> > +	if (!env_value)
> > +		return; /* die or warn? */
> > +
> > +	strbuf_add(&buf, spec, env_name - spec);
> > +	strbuf_addstr(&buf, env_value);
> > +	git_config_push_parameter(buf.buf);
> > +	strbuf_release(&buf);
> > +}
> 
> I realize that you say it's yet unpolished, but doesn't this have
> parsing issues? The first strchr(3P) probably needs to be a strrchr(3P)
> to correctly parse `includeIf./home/foo/=repo.path=MY_PATH_ENV`.

Without further changes to $GIT_CONFIG_PARAMETERS, there'd be little
point. The value we put in there has the same parsing issue when read
out of the environment (which we resolve by disallowing "=" in the
subsection, just as here).

I don't think it's actually that big of a deal in practice (it _could_
be an injection source, but it seems somewhat implausible that somebody
is generating complex config keys based on untrusted input). But if we
care, then we could pretty easily change the reading side to separately
quote the key/value in this case:

  'foo.subsection=with=equals.bar'='value=with=equals'

And then doing strrchr() would make sense, with the explicitly
documented rule that the environment variable name cannot contain an
equals sign. (Doing a raw "git -c" wouldn't work unless we introduce
another option that lets you specify the key and value separately; that
might be worthwhile, too).

> But we'd also have to handle shell quoting for the user, don't we?

I'm not sure exactly what you mean here. We wouldn't typically see any
shell quoting from the user, since the shell would dequote it and give
us a NUL-terminated argv. Or if you meant we'd have to adjust the shell
quoting in $GIT_CONFIG_PARAMETERS, then see above.

-Peff