diff mbox series

config: add an includeIf.env{Exists,Bool,Is,Match}

Message ID patch-1.1-1fe6f60d2bf-20210924T005553Z-avarab@gmail.com (mailing list archive)
State New, archived
Headers show
Series config: add an includeIf.env{Exists,Bool,Is,Match} | expand

Commit Message

Ævar Arnfjörð Bjarmason Sept. 24, 2021, 12:58 a.m. UTC
Add an "includeIf" directive that's conditional on the value of an
environment variable. This has been suggested at least a couple of
times[1][2] as a proposed mechanism to drive dynamic includes, and to
e.g. match based on TERM=xterm in the user's environment.

I initially tried to implement just an "env" keyword, but found that
splitting them up was both easier to implement and to explain to
users. I.e. an "env" will need to handle an optional "=" for matching
the value, and should the semantics be if the variable exists? If it's
boolean?

By splitting them up we present a less ambiguous interface to users,
and make it easy to extend this in the future. I didn't see any point
in implementing a "/i" variant, that only makes sense for "gitdir"'s
matching of FS paths.

I would like syntax that used a "=" better, i.e. "envIs:TERM=xterm"
instead of the "envIs:TERM:xterm" implemented here, but the problem
with that is that it isn't possible to specify those sorts of config
keys on the command-line:

    $ git -c includeIf.someVerb:FOO:BAR.path=bar status --short
    $ git -c includeIf.someVerb:FOO=BAR.path=bar status --short
    error: invalid key: includeIf.someVerb:FOO
    fatal: unable to parse command-line config
    $

I.e. not only isn't the "someVerb" in that scenario not understood by
an older git, but it'll hard error on seeing such a key. See
1ff21c05ba9 (config: store "git -c" variables using more robust
format, 2021-01-12) for a discussion about "=" in config keys. By
picking any other character (e.g. ":") we avoid that whole issue.

1. https://lore.kernel.org/git/YUzvhLUmvsdF5w+r@coredump.intra.peff.net/
2. https://lore.kernel.org/git/X9OB7ek8fVRXUBdK@coredump.intra.peff.net/
3. https://lore.kernel.org/git/87in9ucsbb.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

> On Wed, Sep 22, 2021 at 10:21:22PM -0700, The Grey Wolf wrote:
>
>> Anything else you want to add:
>>      I searched google and the documentation as best I was able for
>>      this, but I am unable to find anywhere that will let me disable
>>      (or enable) colour for a particular term type.  Sometimes I'm on
>>      an xterm, for which this is GREAT.  Sometimes I'm on a Wyse WY60,
>>      for which this is sub-optimal.  My workaround is to disable colour
>>      completely, which is reluctantly acceptable, but it would be nice
>>      to say "If I'm on an xterm/aterm/urxvt/ansi terminal, enable
>>      colour or cursor-positioning, otherwise shut it off."  If this
>>      seems too much of a one-off to handle, fine, but most things that
>>      talk fancy to screens are kind enough to allow an opt-out based on
>>      terminal type. :)
>
> Git doesn't have any kind of list of terminals, beyond knowing that
> "dumb" should disable auto-color. It's possible we could expand that if
> there are known terminals that don't understand ANSI colors. I'm a bit
> wary of having a laundry list of obscure terminals, though.
>
> If we built against ncurses or some other terminfo-aware library we
> could outsource that, but that would be a new dependency. I'm hesitant
> to do that even as an optional dependency given the bang-for-the-buck
> (and certainly making it require would be right out).
>
> Obviously you can wrap Git with a script to tweak the config based on
> the current setting of the $TERM variable. It would be nice if you could
> have conditional config for that. E.g., something like:
>
>   [includeIf "env:TERM==xterm"]
>   path = gitconfig-color
>
> That doesn't exist, but would fit in reasonably well with our other
> conditional config options.

Perhaps something like this?

 Documentation/config.txt  | 35 ++++++++++++++++++++
 config.c                  | 61 +++++++++++++++++++++++++++++++++--
 t/t1305-config-include.sh | 67 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 161 insertions(+), 2 deletions(-)

Comments

Jeff King Sept. 24, 2021, 9:07 p.m. UTC | #1
On Fri, Sep 24, 2021 at 02:58:04AM +0200, Ævar Arnfjörð Bjarmason wrote:

> Add an "includeIf" directive that's conditional on the value of an
> environment variable. This has been suggested at least a couple of
> times[1][2] as a proposed mechanism to drive dynamic includes, and to
> e.g. match based on TERM=xterm in the user's environment.

Thanks. I think this is a reasonable thing to have (not surprising,
since both of those suggestion references point to me!), and it's not
much of a burden to carry, even if it isn't all that commonly used.

I probably would have started with a smaller set of variants (just
equality, with a missing variable presented as an empty string). But I
don't think the bool/glob/exists variants are a lot of extra code or
complexity.

> I initially tried to implement just an "env" keyword, but found that
> splitting them up was both easier to implement and to explain to
> users. I.e. an "env" will need to handle an optional "=" for matching
> the value, and should the semantics be if the variable exists? If it's
> boolean?
> 
> By splitting them up we present a less ambiguous interface to users,
> and make it easy to extend this in the future. I didn't see any point
> in implementing a "/i" variant, that only makes sense for "gitdir"'s
> matching of FS paths.

I had thought to extend with the operator, like:

  # equality
  [includeIf "env:FOO==value"]
  # regex
  [includeIf "env:FOO=~v[a]l"]

But as you note, "=" is somewhat problematic, and without that we can't
use the "usual" operators. Plus there's no usual operator for globbing. ;)
So embedding it in the name is fine by me (and mostly a bikeshed thing
anyway).

I agree we don't really need a "/i" variant here.

> I would like syntax that used a "=" better, i.e. "envIs:TERM=xterm"
> instead of the "envIs:TERM:xterm" implemented here, but the problem
> with that is that it isn't possible to specify those sorts of config
> keys on the command-line:
> 
>     $ git -c includeIf.someVerb:FOO:BAR.path=bar status --short
>     $ git -c includeIf.someVerb:FOO=BAR.path=bar status --short
>     error: invalid key: includeIf.someVerb:FOO
>     fatal: unable to parse command-line config
>     $

Yeah, it's annoying that it doesn't work, and it's nice to think about
that when designing the syntax. OTOH, I kind of wonder how often folks
would write such a thing anyway. For one-offs like this, you'd just do
an unconditional include (or set the actual variables you care about)
anyway. This kind of conditional stuff is much more likely to appear in
an actual file.

Plus we will be stuck with whatever syntax we design here forever.
Whereas we may eventually provide a split version of "-c" that can
handle names with equals, at which point our syntax decision here will
just be a historical wart. In fact, we already have "--config-env" that
can do this.

>  Documentation/config.txt  | 35 ++++++++++++++++++++
>  config.c                  | 61 +++++++++++++++++++++++++++++++++--
>  t/t1305-config-include.sh | 67 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 161 insertions(+), 2 deletions(-)

The patch itself looks correct to me. Everything below is bikeshedding
(as if the parts above were not!), but you may or may not find some of
it compelling.

> +`envExists`::
> [...]
> +`envMatch`::

As I said, these four variants seem OK. Two other variants we might
consider:

  - regex vs glob; I think globbing is probably fine for now and regex
    would be overkill. We might want to reconsider the use of the word
    "match" though, since I assumed it to be a regex at first.

  - negation; for the TERM example discussed recently, I wonder if TERM
    != xterm would be a more natural fit. I had imagined "!=" as the
    operator, but in your scheme, it would probably be "!envIs", etc.

> +There is no way to match an environment variable name with any of the
> +`env*` directives if that variable name contains a ":" character. Such
> +names are allowed by POSIX, but it is assumed that nobody will need to
> +match such a variable name in practice.

That seems like a perfectly reasonable restriction.

Should we allow whitespace around key names and values? E.g.:

  [includeIf "env: FOO: bar"]

is IMHO more readable (even more so if we had infix operators like
"==").

> +static int include_by_env_exists(const char *cond, size_t cond_len)
> +{
> +	char *cfg = xstrndup(cond, cond_len);
> +	int ret = !!getenv(cfg);
> +	free(cfg);
> +	return ret;
> +}

Having to xstrndup() in each one of these is ugly. But doing it in the
caller would be even uglier, as we don't discover cond/cond_len until we
match via skip_prefix() in the big if/else chain. And certainly it's not
new to your patch anyway, just something I noticed.

BTW, I notice you used xmemdupz() in one case later on. I generally
prefer it to xstrndup() because it's more straight-forward: the string
is this long, and it needs to be NUL-terminated. Whereas xstrndup() is
_mostly_ equivalent, but will produce a smaller string if there are
embedded NULs. We would not expect them in this case, so I don't think
it matters functionally. It just seemed funny to me to see them mixed.

(I actually suspect 99% or more of xstrndup() calls should just be
xmemdupz(), and I'd be happy to consolidate and drop one of them, but it
would be finicky looking at each one to see if that's really true).

> +static int include_by_env_match(const char *cond, size_t cond_len, int glob,
> +				int *err)
> +{
> +	const char *eq;
> +	const char *value;
> +	const char *env;
> +	char *cfg = xstrndup(cond, cond_len);
> +	char *key = NULL;
> +	int ret = 0;
> +
> +	eq = strchr(cfg, ':');
> +	if (!eq) {
> +		*err = error(_("'%s:%.*s' missing a ':' to match the value"),
> +			     glob ? "envMatch" : "envIs", (int)(cond_len),
> +			     cond);

You made a string out of (cond, cond_len) already, so you could just use
"cfg" here in the error (what you have isn't wrong, but I always find
%.*s hard to read).

> +	key = xmemdupz(cfg, eq - cfg);

And this is the mixed xstrndup()/xmemdupz() case I mentioned. :)

>  static int include_condition_is_true(const struct config_options *opts,
> -				     const char *cond, size_t cond_len)
> +				     const char *cond, size_t cond_len,
> +				     int *err)

OK, we need to return not just "true" or "not true" from the return now,
so we stuff it into an out-parameter. We could use a tri-state instead.
Our if-else would already propagate it:

> @@ -301,6 +350,14 @@ static int include_condition_is_true(const struct config_options *opts,
>  		return include_by_gitdir(opts, cond, cond_len, 1);
>  	else if (skip_prefix_mem(cond, cond_len, "onbranch:", &cond, &cond_len))
>  		return include_by_branch(cond, cond_len);
> +	else if (skip_prefix_mem(cond, cond_len, "envExists:", &cond, &cond_len))
> +		return include_by_env_exists(cond, cond_len);
> +	else if (skip_prefix_mem(cond, cond_len, "envBool:", &cond, &cond_len))
> +		return include_by_env_bool(cond, cond_len);
> +	else if (skip_prefix_mem(cond, cond_len, "envIs:", &cond, &cond_len))
> +		return include_by_env_match(cond, cond_len, 0, err);
> +	else if (skip_prefix_mem(cond, cond_len, "envMatch:", &cond, &cond_len))
> +		return include_by_env_match(cond, cond_len, 1, err);

But it would mess up this:

>  	if (!parse_config_key(var, "includeif", &cond, &cond_len, &key) &&
> -	    (cond && include_condition_is_true(inc->opts, cond, cond_len)) &&
> +	    (cond && include_condition_is_true(inc->opts, cond, cond_len, &ret)) &&
>  	    !strcmp(key, "path"))
>  		ret = handle_path_include(value, inc);

So the out-parameter seems like a reasonable path.

I did find it interesting that there are no other error-cases in the
existing conditions. They mostly just evaluate to false (including if we
don't recognize the condition at all). But in the cases you're catching
here, it really is syntactic nonsense.

I guess you could define "there is no colon" as "the value we'd parse
after it is assumed to be the empty string" which makes the error case
go away. I'm not sure it really matters all that much either way in
practice.

> +test_expect_success 'conditional include, envExists:*' '
> +	echo value >expect &&
> +	git config -f envExists.cfg some.key $(cat expect) &&
> +
> +	test_must_fail git -c includeIf.envExists:VAR.path="$PWD/envExists.cfg" config some.key 2>err &&
> +	test_must_be_empty err &&

The tests all look sane to me. We do define some exit codes for
git-config here, so:

  test_expect_code 1 git -c ...

would be tighter (and you could probably just ditch the empty-err check
then).

Obviously avoiding the "=" question is beneficial here for testing via
"-c".  As I said earlier, I think it's more realistic to expect these in
actual files, but I think testing them either way is fine. If you did
test them in a file, though, you could use a relative path in the
include. Plus all three invocations could use the same file, so you
don't have to repeat it over and over. I.e.:

  test_config includeIf.envExists:VAR.path envExists.cfg &&

  test_expect_code 1 git config some.key &&

  VAR= git config some.key >actual &&
  test_cmp expect actual &&

  VAR=0 git config some.key >actual &&
  test_cmp expect actual

should work.

-Peff
Junio C Hamano Sept. 24, 2021, 9:28 p.m. UTC | #2
Jeff King <peff@peff.net> writes:

> I had thought to extend with the operator, like:
>
>   # equality
>   [includeIf "env:FOO==value"]
>   # regex
>   [includeIf "env:FOO=~v[a]l"]

Yup, that matched my aesthetics better ;-)

> But as you note, "=" is somewhat problematic, and without that we can't
> use the "usual" operators. Plus there's no usual operator for globbing. ;)
> So embedding it in the name is fine by me (and mostly a bikeshed thing
> anyway).

Perhaps.  I am not sure if we deeply care about "git -c var=val" in
this case, especially since this is part of includeif, though.  It
may be more important to keep the syntax useful and extensible for
everyday use than for one-off "git -c" testing.

> I agree we don't really need a "/i" variant here.

Case insensitive environment variable names, no, but case
insensitive matching of values, maybe?  But I'd be happy to see us
start very minimally (even just envEQ alone without any other
frills, or optionally envNE to negate it, would be fine by me).

> Should we allow whitespace around key names and values? E.g.:
>
>   [includeIf "env: FOO: bar"]
>
> is IMHO more readable (even more so if we had infix operators like
> "==").

This asserts what? FOO=" bar"?
Jeff King Sept. 24, 2021, 9:59 p.m. UTC | #3
On Fri, Sep 24, 2021 at 02:28:29PM -0700, Junio C Hamano wrote:

> > But as you note, "=" is somewhat problematic, and without that we can't
> > use the "usual" operators. Plus there's no usual operator for globbing. ;)
> > So embedding it in the name is fine by me (and mostly a bikeshed thing
> > anyway).
> 
> Perhaps.  I am not sure if we deeply care about "git -c var=val" in
> this case, especially since this is part of includeif, though.  It
> may be more important to keep the syntax useful and extensible for
> everyday use than for one-off "git -c" testing.

Yeah, see my comments later in that mail. :)

> > I agree we don't really need a "/i" variant here.
> 
> Case insensitive environment variable names, no, but case
> insensitive matching of values, maybe?  But I'd be happy to see us
> start very minimally (even just envEQ alone without any other
> frills, or optionally envNE to negate it, would be fine by me).

Yeah, as long as we leave the door open syntactically, I think it is OK.

> > Should we allow whitespace around key names and values? E.g.:
> >
> >   [includeIf "env: FOO: bar"]
> >
> > is IMHO more readable (even more so if we had infix operators like
> > "==").
> 
> This asserts what? FOO=" bar"?

Whoops, that should have been "envIs", asserting that $FOO contains
"bar".

As I said, I think it matters more with the infix operators, as:

  [includeIf "env:FOO == bar"]

is more readable than:

  [includeIf "env:FOO==bar"]

But I do think:

  [includeIf "envIs:FOO:bar"]

is harder to read than even:

  [includeIf "envIs:FOO: bar"]

-Peff
Junio C Hamano Sept. 27, 2021, 4:30 p.m. UTC | #4
Jeff King <peff@peff.net> writes:

>> > Should we allow whitespace around key names and values? E.g.:
>> >
>> >   [includeIf "env: FOO: bar"]
>> >
>> > is IMHO more readable (even more so if we had infix operators like
>> > "==").
>> 
>> This asserts what? FOO=" bar"?
>
> Whoops, that should have been "envIs", asserting that $FOO contains
> "bar".

Oh, "can we check with a literal with leading whitespace?" was what
my question was about ;-)

> As I said, I think it matters more with the infix operators, as:
>
>   [includeIf "env:FOO == bar"]
>
> is more readable than:
>
>   [includeIf "env:FOO==bar"]

Sure, but at that point, we'd probably want some quoting mechanism
for the literal to be compared, e.g.

	[includeIf "env:PATH ~= \"(:|^)/usr/bin(:|$)\""]

> But I do think:
>
>   [includeIf "envIs:FOO:bar"]
>
> is harder to read than even:
>
>   [includeIf "envIs:FOO: bar"]

Hmph, that's quite subjective, I am afraid.  When I see the latter
in the configuration file, "do I have to have a single space before
'bar' in the value of $FOO" would be the first question that would
come to my mind.

With an understanding that our syntax is so limited that we cannot
even write '=' and need to resort to Is: instead, I'd actually find
that the former less confusing than the latter.

Thanks.
Jeff King Sept. 27, 2021, 8:15 p.m. UTC | #5
On Mon, Sep 27, 2021 at 09:30:41AM -0700, Junio C Hamano wrote:

> >> This asserts what? FOO=" bar"?
> >
> > Whoops, that should have been "envIs", asserting that $FOO contains
> > "bar".
> 
> Oh, "can we check with a literal with leading whitespace?" was what
> my question was about ;-)

My assumption was that nobody would really care about doing so. It is
true that it's less flexible, though (and is a decision we can't easily
take back later).

> > As I said, I think it matters more with the infix operators, as:
> >
> >   [includeIf "env:FOO == bar"]
> >
> > is more readable than:
> >
> >   [includeIf "env:FOO==bar"]
> 
> Sure, but at that point, we'd probably want some quoting mechanism
> for the literal to be compared, e.g.
> 
> 	[includeIf "env:PATH ~= \"(:|^)/usr/bin(:|$)\""]

Ick. The extra quoting of the internal double-quotes is pretty horrid to
look at. Also, how does one match a double-quote in the value? \\\"?

If it were optional, that would make the common cases easy (no dq, no
whitespace), and the hard ones possible.

I think this is getting into a bit of a digression, though. I'm willing
to defer to Ævar, who is doing the actual work, and I don't know if he
has found any of this compelling. ;)

> > But I do think:
> >
> >   [includeIf "envIs:FOO:bar"]
> >
> > is harder to read than even:
> >
> >   [includeIf "envIs:FOO: bar"]
> 
> Hmph, that's quite subjective, I am afraid.  When I see the latter
> in the configuration file, "do I have to have a single space before
> 'bar' in the value of $FOO" would be the first question that would
> come to my mind.

I think it's just the mashed-up colons that I find ugly in the first
one. But I agree the latter isn't that nice either, and introduces the
ambiguity you describe.

> With an understanding that our syntax is so limited that we cannot
> even write '=' and need to resort to Is: instead, I'd actually find
> that the former less confusing than the latter.

That I think is the most interesting question: is the "=" actually
out-of-bounds? I tend to think not, based on our responses earlier in
the thread.

-Peff
Randall S. Becker Sept. 27, 2021, 8:53 p.m. UTC | #6
On September 27, 2021 4:16 PM, Jeff King wrote:
>Subject: Re: [PATCH] config: add an includeIf.env{Exists,Bool,Is,Match}
>
>On Mon, Sep 27, 2021 at 09:30:41AM -0700, Junio C Hamano wrote:
>
>> >> This asserts what? FOO=" bar"?
>> >
>> > Whoops, that should have been "envIs", asserting that $FOO contains
>> > "bar".
>>
>> Oh, "can we check with a literal with leading whitespace?" was what my
>> question was about ;-)
>
>My assumption was that nobody would really care about doing so. It is true that it's less flexible, though (and is a decision we can't easily
>take back later).
>
>> > As I said, I think it matters more with the infix operators, as:
>> >
>> >   [includeIf "env:FOO == bar"]
>> >
>> > is more readable than:
>> >
>> >   [includeIf "env:FOO==bar"]
>>
>> Sure, but at that point, we'd probably want some quoting mechanism for
>> the literal to be compared, e.g.
>>
>> 	[includeIf "env:PATH ~= \"(:|^)/usr/bin(:|$)\""]
>
>Ick. The extra quoting of the internal double-quotes is pretty horrid to look at. Also, how does one match a double-quote in the value? \\\"?
>
>If it were optional, that would make the common cases easy (no dq, no whitespace), and the hard ones possible.
>
>I think this is getting into a bit of a digression, though. I'm willing to defer to Ævar, who is doing the actual work, and I don't know if he has
>found any of this compelling. ;)

What about something like:

	[includeIf "env:PATH ~= '^(.*
Jeff King Sept. 27, 2021, 9:37 p.m. UTC | #7
On Mon, Sep 27, 2021 at 04:53:59PM -0400, Randall S. Becker wrote:

> What about something like:
> 
> 	[includeIf "env:PATH ~= '^(.*
Randall S. Becker Sept. 27, 2021, 9:56 p.m. UTC | #8
On September 27, 2021 5:38 PM, Jeff King wrote:
>Subject: Re: [PATCH] config: add an includeIf.env{Exists,Bool,Is,Match}
>
>On Mon, Sep 27, 2021 at 04:53:59PM -0400, Randall S. Becker wrote:
>
>> What about something like:
>>
>> 	[includeIf "env:PATH ~= '^(.*
Ævar Arnfjörð Bjarmason Sept. 27, 2021, 11:52 p.m. UTC | #9
On Mon, Sep 27 2021, Jeff King wrote:

> On Mon, Sep 27, 2021 at 09:30:41AM -0700, Junio C Hamano wrote:
>
>> >> This asserts what? FOO=" bar"?
>> >
>> > Whoops, that should have been "envIs", asserting that $FOO contains
>> > "bar".
>> 
>> Oh, "can we check with a literal with leading whitespace?" was what
>> my question was about ;-)
>
> My assumption was that nobody would really care about doing so. It is
> true that it's less flexible, though (and is a decision we can't easily
> take back later).

Yeah, I think nobody really cares about stripspace() or not for this
sort of feature.

I do think having implicit and explicit complexity like that makes it
harder to document, implement and understand for users though. I.e. is
"env:FOO == bar" the same as "env:FOO ==bar" etc., what whitespace
exactly is accepted etc.

>> > As I said, I think it matters more with the infix operators, as:
>> >
>> >   [includeIf "env:FOO == bar"]
>> >
>> > is more readable than:
>> >
>> >   [includeIf "env:FOO==bar"]
>> 
>> Sure, but at that point, we'd probably want some quoting mechanism
>> for the literal to be compared, e.g.
>> 
>> 	[includeIf "env:PATH ~= \"(:|^)/usr/bin(:|$)\""]
>
> Ick. The extra quoting of the internal double-quotes is pretty horrid to
> look at. Also, how does one match a double-quote in the value? \\\"?
>
> If it were optional, that would make the common cases easy (no dq, no
> whitespace), and the hard ones possible.

An implicit assumption of mine in the simpler positive-match-only
version (which I should have made clear) is that anyone who needs this
sort of complexity can just arrange to wrap their "git" in a function,
or do this sort of thing in their ~/.bashrc, i.e. just:

    if code_of_arbitrary_complexity
    then
        export GIT_DO_XYZ_INCLUDES=1
    fi

Then in your config:

    includeIf.envBool:GIT_DO_XYZ_INCLUDES.path=~/.gitconfig.d/xyz.cfg

And having written that out I think the best thing to do is probably to
have a version that only does the envExists and envBool version (or just
envBool), and skip envIs and envMatch entirely.

In the case of env:PATH we're just setting users up for some buggy or
unexpected interaction with something that would be better done either
via a gitdir include, or if they really need $PATH they can just wrap
"git" in a function that sets a boolean inclusion variable.

That would get us out of having to support emergent behavior where some
git tool invoked via run_command() or something chdir's somewhere as an
implementation detail, and such an env:PATH match means we'd need to
support that, or potentially break existing user config.

Or, since we might not be invoked via a shell, the same issue with a
$PATH being "stale" from the POV of a user who's wondering why say a
command like:

    # status in the "t" subdirectory
    git -C t <cmd> <question>

Doesn't have the "right" $PWD, which we might not have as some future
shortcut in <cmd> decided not to bother chdir()-ing to answer the user's
<question>.

> I think this is getting into a bit of a digression, though. I'm willing
> to defer to Ævar, who is doing the actual work, and I don't know if he
> has found any of this compelling. ;)
>
>> > But I do think:
>> >
>> >   [includeIf "envIs:FOO:bar"]
>> >
>> > is harder to read than even:
>> >
>> >   [includeIf "envIs:FOO: bar"]
>> 
>> Hmph, that's quite subjective, I am afraid.  When I see the latter
>> in the configuration file, "do I have to have a single space before
>> 'bar' in the value of $FOO" would be the first question that would
>> come to my mind.
>
> I think it's just the mashed-up colons that I find ugly in the first
> one. But I agree the latter isn't that nice either, and introduces the
> ambiguity you describe.

FWIW I hacked up a --config-key --config-value pairing so you could set
keys with "=" in them on the command-line, I'm not sure I like the
interface, but it gets rid of that ":" v.s. "=" edge case:
https://github.com/avar/git/commit/a86053df48b

>> With an understanding that our syntax is so limited that we cannot
>> even write '=' and need to resort to Is: instead, I'd actually find
>> that the former less confusing than the latter.
>
> That I think is the most interesting question: is the "=" actually
> out-of-bounds? I tend to think not, based on our responses earlier in
> the thread.
Junio C Hamano Sept. 28, 2021, 12:24 a.m. UTC | #10
Jeff King <peff@peff.net> writes:

>> Sure, but at that point, we'd probably want some quoting mechanism
>> for the literal to be compared, e.g.
>> 
>> 	[includeIf "env:PATH ~= \"(:|^)/usr/bin(:|$)\""]
>
> Ick. The extra quoting of the internal double-quotes is pretty horrid to
> look at. Also, how does one match a double-quote in the value? \\\"?

Ick indeed.  I didn't mean to say you must always dq quote.  It
started more like 

	includeIf "env:VAR == ' value with leading whitespace'"

(or use \" inside ""-pair to mean a double-quote) as a demonstration
of an escape hatch needed if we took your "let's by default strip
the whitespace around the value" example in the message I was
responding to.

Just like we in most cases do not have to quote the value in the
configuration files, unless you have strange needs like wanting to
express a value with leading whitespace that should not be stripped,
if we were to go this route, 

> If it were optional, that would make the common cases easy (no dq, no
> whitespace), and the hard ones possible.

Yup.
Jeff King Sept. 28, 2021, 12:41 a.m. UTC | #11
On Tue, Sep 28, 2021 at 01:52:26AM +0200, Ævar Arnfjörð Bjarmason wrote:

> An implicit assumption of mine in the simpler positive-match-only
> version (which I should have made clear) is that anyone who needs this
> sort of complexity can just arrange to wrap their "git" in a function,
> or do this sort of thing in their ~/.bashrc, i.e. just:
> 
>     if code_of_arbitrary_complexity
>     then
>         export GIT_DO_XYZ_INCLUDES=1
>     fi
> 
> Then in your config:
> 
>     includeIf.envBool:GIT_DO_XYZ_INCLUDES.path=~/.gitconfig.d/xyz.cfg
> 
> And having written that out I think the best thing to do is probably to
> have a version that only does the envExists and envBool version (or just
> envBool), and skip envIs and envMatch entirely.

I'm not sure I agree. If you are willing to wrap git, then you can just
add:

  git -c include.path=~/.gitconfig.d/xyz.cfg

to the command-line in the first place. Or if you're willing to use our
undocumented interface, you can even do it in your .bashrc:

  if code_of_arbitrary_complexity
  then
          GIT_CONFIG_PARAMETERS="'include.path'='~/.gitconfig.d/xyz.cfg'"
  fi

The value of this env matching is that it is done at run-time without
wrapping, and can meaningfully inspect the state of the world. E.g., the
$TERM thing that started this thread.

> In the case of env:PATH we're just setting users up for some buggy or
> unexpected interaction with something that would be better done either
> via a gitdir include, or if they really need $PATH they can just wrap
> "git" in a function that sets a boolean inclusion variable.

Yes, I have trouble imagining why any matching on env:PATH would be
useful (or $PWD, since we have the much less confusing gitdir
conditional). Which isn't to say I want to forbid it, but just because
people can shoot themselves in the foot with complexity doesn't mean
that "envIs" is a bad thing when it's not misused.

> > I think it's just the mashed-up colons that I find ugly in the first
> > one. But I agree the latter isn't that nice either, and introduces the
> > ambiguity you describe.
> 
> FWIW I hacked up a --config-key --config-value pairing so you could set
> keys with "=" in them on the command-line, I'm not sure I like the
> interface, but it gets rid of that ":" v.s. "=" edge case:
> https://github.com/avar/git/commit/a86053df48b

Yeah, we talked about that a while ago, but nobody liked the interface
enough to actually code it (and as far as I know, it's really
theoretical; nobody has actually wanted to set such an option from the
command-line yet, and we have the --config-env stuff for people who want
to robustly pass along arbitrary keys).

A perhaps more subtle but less awkward to type version is to just
require two arguments, like:

  git --config <key> <value> ...

but I'd just as soon continue to leave it un-implemented if nobody has
actually needed it in practice.

-Peff
Ævar Arnfjörð Bjarmason Sept. 28, 2021, 2:42 a.m. UTC | #12
On Mon, Sep 27 2021, Jeff King wrote:

> On Tue, Sep 28, 2021 at 01:52:26AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> An implicit assumption of mine in the simpler positive-match-only
>> version (which I should have made clear) is that anyone who needs this
>> sort of complexity can just arrange to wrap their "git" in a function,
>> or do this sort of thing in their ~/.bashrc, i.e. just:
>> 
>>     if code_of_arbitrary_complexity
>>     then
>>         export GIT_DO_XYZ_INCLUDES=1
>>     fi
>> 
>> Then in your config:
>> 
>>     includeIf.envBool:GIT_DO_XYZ_INCLUDES.path=~/.gitconfig.d/xyz.cfg
>> 
>> And having written that out I think the best thing to do is probably to
>> have a version that only does the envExists and envBool version (or just
>> envBool), and skip envIs and envMatch entirely.
>
> I'm not sure I agree. If you are willing to wrap git, then you can just
> add:
>
>   git -c include.path=~/.gitconfig.d/xyz.cfg
>
> to the command-line in the first place. Or if you're willing to use our
> undocumented interface, you can even do it in your .bashrc:
>
>   if code_of_arbitrary_complexity
>   then
>           GIT_CONFIG_PARAMETERS="'include.path'='~/.gitconfig.d/xyz.cfg'"
>   fi

Sort of, that'll give you unconditional inclusion, but won't e.g. handle
a case where the env include only runs in some .git/config, or depending
on other inclusion (e.g. in ~/dev/git, but only if XYZ env var).

But yeah, it won't handle all potential cases. I figured for this sort
of thing it was better to start small and see if the provided interface
was enough..

> The value of this env matching is that it is done at run-time without
> wrapping, and can meaningfully inspect the state of the world. E.g., the
> $TERM thing that started this thread.

Yeah, maybe we should have at least an ifStrEQ, whatever we call it...

>> In the case of env:PATH we're just setting users up for some buggy or
>> unexpected interaction with something that would be better done either
>> via a gitdir include, or if they really need $PATH they can just wrap
>> "git" in a function that sets a boolean inclusion variable.
>
> Yes, I have trouble imagining why any matching on env:PATH would be
> useful (or $PWD, since we have the much less confusing gitdir
> conditional). Which isn't to say I want to forbid it, but just because
> people can shoot themselves in the foot with complexity doesn't mean
> that "envIs" is a bad thing when it's not misused.

I'm biased by past on-list discussions where existing behavior, no
matter if unintentional or emergent can be really hard to fix once
established.

>> > I think it's just the mashed-up colons that I find ugly in the first
>> > one. But I agree the latter isn't that nice either, and introduces the
>> > ambiguity you describe.
>> 
>> FWIW I hacked up a --config-key --config-value pairing so you could set
>> keys with "=" in them on the command-line, I'm not sure I like the
>> interface, but it gets rid of that ":" v.s. "=" edge case:
>> https://github.com/avar/git/commit/a86053df48b
>
> Yeah, we talked about that a while ago, but nobody liked the interface
> enough to actually code it (and as far as I know, it's really
> theoretical; nobody has actually wanted to set such an option from the
> command-line yet, and we have the --config-env stuff for people who want
> to robustly pass along arbitrary keys).
>
> A perhaps more subtle but less awkward to type version is to just
> require two arguments, like:
>
>   git --config <key> <value> ...

I suppose --config would work like that, you can'd to it with "-c". I
think it's more confusing to have a "-c" and "--config" which unlike
most other things don't follow the obvious long and short option names
working the same way.

> but I'd just as soon continue to leave it un-implemented if nobody has
> actually needed it in practice.

*nod*. I do think it's bad design to introduce an "env" inclusion
feature that relies on "=" though while we don't have something like
that, i.e.

I think we should probably not add that --config-{key,value}, but
avoiding the arbitrary limitation of not being able to specify certain
config keys seems prudent in that case, and since the "=" v.s. ":" is
only an aesthetic preference I think being able to compose things
without limitations wins out.

We do have the "=" key limitation now, but I don't think it's there for
any key we currently define, except things like "url.<base>.insteadOf"
if the "<base> has a "=" in it (and maybe just that one).
Jeff King Sept. 28, 2021, 5:42 a.m. UTC | #13
On Tue, Sep 28, 2021 at 04:42:51AM +0200, Ævar Arnfjörð Bjarmason wrote:

> > A perhaps more subtle but less awkward to type version is to just
> > require two arguments, like:
> >
> >   git --config <key> <value> ...
> 
> I suppose --config would work like that, you can'd to it with "-c". I
> think it's more confusing to have a "-c" and "--config" which unlike
> most other things don't follow the obvious long and short option names
> working the same way.

Yeah, probably "--config-pair" or something might be less confusing.
Anyway...

> > but I'd just as soon continue to leave it un-implemented if nobody has
> > actually needed it in practice.
> 
> *nod*. I do think it's bad design to introduce an "env" inclusion
> feature that relies on "=" though while we don't have something like
> that, i.e.
> 
> I think we should probably not add that --config-{key,value}, but
> avoiding the arbitrary limitation of not being able to specify certain
> config keys seems prudent in that case, and since the "=" v.s. ":" is
> only an aesthetic preference I think being able to compose things
> without limitations wins out.

I don't really agree with that. Whatever syntax we use now, we'll be
stuck with forever. It seems a shame to predicate that choice only on
the "-c doesn't support =" thing that nobody has actually run across in
practice (and I don't think is something people will run into with
this).

> We do have the "=" key limitation now, but I don't think it's there for
> any key we currently define, except things like "url.<base>.insteadOf"
> if the "<base> has a "=" in it (and maybe just that one).

It's really a potential problem for any 3-level config key. So urls,
branch names, remote names, various tool names, filter/diff drivers,
existing includeIf conditions. This might be the first one where we
really _encourage_ the use of "=" signs, but it still strikes me as
weird that you'd want to do so on the command-line in practice.

-Peff
Ævar Arnfjörð Bjarmason Sept. 28, 2021, 7:28 p.m. UTC | #14
On Tue, Sep 28 2021, Jeff King wrote:

> On Tue, Sep 28, 2021 at 04:42:51AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > A perhaps more subtle but less awkward to type version is to just
>> > require two arguments, like:
>> >
>> >   git --config <key> <value> ...
>> 
>> I suppose --config would work like that, you can'd to it with "-c". I
>> think it's more confusing to have a "-c" and "--config" which unlike
>> most other things don't follow the obvious long and short option names
>> working the same way.
>
> Yeah, probably "--config-pair" or something might be less confusing.
> Anyway...

*nod*

>> > but I'd just as soon continue to leave it un-implemented if nobody has
>> > actually needed it in practice.
>> 
>> *nod*. I do think it's bad design to introduce an "env" inclusion
>> feature that relies on "=" though while we don't have something like
>> that, i.e.
>> 
>> I think we should probably not add that --config-{key,value}, but
>> avoiding the arbitrary limitation of not being able to specify certain
>> config keys seems prudent in that case, and since the "=" v.s. ":" is
>> only an aesthetic preference I think being able to compose things
>> without limitations wins out.
>
> I don't really agree with that. Whatever syntax we use now, we'll be
> stuck with forever. It seems a shame to predicate that choice only on
> the "-c doesn't support =" thing that nobody has actually run across in
> practice (and I don't think is something people will run into with
> this).

Yeah, anyway. I don't care much either way, and not enough to drive this
forward. I.e. it seemed like an easy thing to hack up, but if anyone
else is interested in driving it forward...

>> We do have the "=" key limitation now, but I don't think it's there for
>> any key we currently define, except things like "url.<base>.insteadOf"
>> if the "<base> has a "=" in it (and maybe just that one).
>
> It's really a potential problem for any 3-level config key. So urls,
> branch names, remote names, various tool names, filter/diff drivers,
> existing includeIf conditions. This might be the first one where we
> really _encourage_ the use of "=" signs, but it still strikes me as
> weird that you'd want to do so on the command-line in practice.

Just for future reference:

I think given the discussion in the thread, and particularly if we're
going to have some regex syntax for these keys that the artificial
straitjacket of putting this all in one config key is something we
should just do away with.

I.e. I'm not proposing a *specific* schema other than noting that
there's no law that forces us to take say Junio's (in
https://lore.kernel.org/git/xmqqo88eq8um.fsf@gitster.g/):

        [includeIf "env:PATH ~= \"(:|^)/usr/bin(:|$)\""]

Over say:

    [includeCondition]
        type = envRegex
        envVariable = PATH
        envRegex = "(:|^)/usr/bin(:|$)"
        path = ~/.gitconfig.d/env-stuff.cfg

Or whatever, i.e. the state machine of seeing an "includeCondition" in
the config's event parser, and then erroring unless the next N config
keys satisfy some mandatory minimum set of config keys is rather simple.

We could then make any such syntax optional for existing constructs,
i.e. you could write:

    [includeIf "gitdir:/path/to/group/"]
    path = /path/to/foo.inc

As:

    [includeCondition]
        type = gitdir
        path = /path/to/foo.inc
        gitdirPath = /path/to/group/

Or something. And say add "includeCondition.negated = true" to that for
a "!=" match.

The shorthand syntax could then be omitted for anything new if it's
deemed too gnarly to represent it all in one key.

The key names & schema is something I came up with offhand, please don't
read too much into it. The point is that we don't need to make it all
one key.

That approach also fits nicely in with the rest of the config framework,
i.e. you can incrementally edit things using "git config" options, and
we could add a "--type regex" or whatever which we could then
set/validate say the "includeCondition.envRegex" key with.
diff mbox series

Patch

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0c0e6b859f1..58f6d49216d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -159,6 +159,41 @@  all branches that begin with `foo/`. This is useful if your branches are
 organized hierarchically and you would like to apply a configuration to
 all the branches in that hierarchy.
 
+`envExists`::
+	The data that follows the keyword `envExists:` is taken to be the
+	name of an environment variable, e.g. `envExists:NAME`. If the
+	variable (`NAME` in this case) exists the include condition is
+	met.
+
+`envBool`::
+	The data that follows the keyword `envBool:` is taken to be the
+	name of an environment variable, e.g. `envBool:NAME`. If the
+	variable (`NAME` in this case) exists, its value will be
+	checked for boolean truth.
++
+The accepted boolean values are the same as those that `--type bool`
+would accept. A value that's normalized to "true" will satisfy the
+include condition (a nonexisting environment variable is "false").
+
+`envIs`::
+	The data that follows the keyword `envIs:` is taken to be the
+	name of an environment variable, followed by a mandatory ":"
+	character, followed by the expected value of the environment
+	variable. If the value matches the include condition is
+	satisfied.
++
+Values may contain any characters otherwise accepted by the config
+mechanism (including ":").
+
+`envMatch`::
+	Like `envIs`, except that the value is matched using standard
+	globbing wildcards.
+
+There is no way to match an environment variable name with any of the
+`env*` directives if that variable name contains a ":" character. Such
+names are allowed by POSIX, but it is assumed that nobody will need to
+match such a variable name in practice.
+
 A few more notes on matching via `gitdir` and `gitdir/i`:
 
  * Symlinks in `$GIT_DIR` are not resolved before matching.
diff --git a/config.c b/config.c
index 2edf835262f..064059b0d6d 100644
--- a/config.c
+++ b/config.c
@@ -271,6 +271,54 @@  static int include_by_gitdir(const struct config_options *opts,
 	return ret;
 }
 
+static int include_by_env_exists(const char *cond, size_t cond_len)
+{
+	char *cfg = xstrndup(cond, cond_len);
+	int ret = !!getenv(cfg);
+	free(cfg);
+	return ret;
+}
+
+static int include_by_env_bool(const char *cond, size_t cond_len)
+{
+	char *cfg = xstrndup(cond, cond_len);
+	int ret = git_env_bool(cfg, 0);
+	free(cfg);
+	return ret;
+}
+
+static int include_by_env_match(const char *cond, size_t cond_len, int glob,
+				int *err)
+{
+	const char *eq;
+	const char *value;
+	const char *env;
+	char *cfg = xstrndup(cond, cond_len);
+	char *key = NULL;
+	int ret = 0;
+
+	eq = strchr(cfg, ':');
+	if (!eq) {
+		*err = error(_("'%s:%.*s' missing a ':' to match the value"),
+			     glob ? "envMatch" : "envIs", (int)(cond_len),
+			     cond);
+		goto cleanup;
+	}
+	value = eq + 1;
+
+	key = xmemdupz(cfg, eq - cfg);
+	env = getenv(key);
+	if (!env)
+		goto cleanup;
+
+	ret = glob ? !wildmatch(value, env, 0) : !strcmp(value, env);
+
+cleanup:
+	free(key);
+	free(cfg);
+	return ret;
+}
+
 static int include_by_branch(const char *cond, size_t cond_len)
 {
 	int flags;
@@ -292,7 +340,8 @@  static int include_by_branch(const char *cond, size_t cond_len)
 }
 
 static int include_condition_is_true(const struct config_options *opts,
-				     const char *cond, size_t cond_len)
+				     const char *cond, size_t cond_len,
+				     int *err)
 {
 
 	if (skip_prefix_mem(cond, cond_len, "gitdir:", &cond, &cond_len))
@@ -301,6 +350,14 @@  static int include_condition_is_true(const struct config_options *opts,
 		return include_by_gitdir(opts, cond, cond_len, 1);
 	else if (skip_prefix_mem(cond, cond_len, "onbranch:", &cond, &cond_len))
 		return include_by_branch(cond, cond_len);
+	else if (skip_prefix_mem(cond, cond_len, "envExists:", &cond, &cond_len))
+		return include_by_env_exists(cond, cond_len);
+	else if (skip_prefix_mem(cond, cond_len, "envBool:", &cond, &cond_len))
+		return include_by_env_bool(cond, cond_len);
+	else if (skip_prefix_mem(cond, cond_len, "envIs:", &cond, &cond_len))
+		return include_by_env_match(cond, cond_len, 0, err);
+	else if (skip_prefix_mem(cond, cond_len, "envMatch:", &cond, &cond_len))
+		return include_by_env_match(cond, cond_len, 1, err);
 
 	/* unknown conditionals are always false */
 	return 0;
@@ -325,7 +382,7 @@  int git_config_include(const char *var, const char *value, void *data)
 		ret = handle_path_include(value, inc);
 
 	if (!parse_config_key(var, "includeif", &cond, &cond_len, &key) &&
-	    (cond && include_condition_is_true(inc->opts, cond, cond_len)) &&
+	    (cond && include_condition_is_true(inc->opts, cond, cond_len, &ret)) &&
 	    !strcmp(key, "path"))
 		ret = handle_path_include(value, inc);
 
diff --git a/t/t1305-config-include.sh b/t/t1305-config-include.sh
index ccbb116c016..cebe2bb75f1 100755
--- a/t/t1305-config-include.sh
+++ b/t/t1305-config-include.sh
@@ -348,6 +348,73 @@  test_expect_success 'conditional include, onbranch, implicit /** for /' '
 	test_cmp expect actual
 '
 
+test_expect_success 'conditional include, envExists:*' '
+	echo value >expect &&
+	git config -f envExists.cfg some.key $(cat expect) &&
+
+	test_must_fail git -c includeIf.envExists:VAR.path="$PWD/envExists.cfg" config some.key 2>err &&
+	test_must_be_empty err &&
+
+	VAR= git -c includeIf.envExists:VAR.path="$PWD/envExists.cfg" config some.key >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp expect actual &&
+
+	VAR=0 git -c includeIf.envExists:VAR.path="$PWD/envExists.cfg" config some.key >actual 2>err &&
+	test_cmp expect actual &&
+	test_must_be_empty err
+'
+
+test_expect_success 'conditional include, envBool:*' '
+	echo value >expect &&
+	git config -f envBool.cfg some.key $(cat expect) &&
+
+	test_must_fail env VAR= git -c includeIf.envBool:VAR.path="$PWD/envBool.cfg" config some.key 2>err &&
+	test_must_be_empty err &&
+
+	test_must_fail env VAR=0 git -c includeIf.envBool:VAR.path="$PWD/envBool.cfg" config some.key 2>err &&
+	test_must_be_empty err &&
+
+	test_must_fail env VAR=false git -c includeIf.envBool:VAR.path="$PWD/envBool.cfg" config some.key 2>err &&
+	test_must_be_empty err &&
+
+	# envBool:* bad value
+	cat >expect.err <<-\EOF &&
+	fatal: bad boolean config value '\''gibberish'\'' for '\''VAR'\''
+	EOF
+	test_must_fail env VAR=gibberish git -c includeIf.envBool:VAR.path="$PWD/envBool.cfg" config some.key 2>err.actual &&
+	test_cmp expect.err err.actual
+'
+
+test_expect_success 'conditional include, envIs:*' '
+	echo value >expect &&
+	git config -f envIs.cfg some.key $(cat expect) &&
+
+	VAR=foo git -c includeIf.envIs:VAR:foo.path="$PWD/envExists.cfg" config some.key &&
+	test_must_fail env VAR=foo git -c includeIf.envIs:VAR:*f*.path="$PWD/envExists.cfg" config some.key &&
+
+	cat >expect.err <<-\EOF &&
+	error: '\''envIs:VAR'\'' missing a '\'':'\'' to match the value
+	fatal: unable to parse command-line config
+	EOF
+	test_must_fail env VAR=x git -c includeIf.envIs:VAR.path="$PWD/envBool.cfg" config some.key 2>err.actual &&
+	test_cmp expect.err err.actual
+'
+
+test_expect_success 'conditional include, envMatch:*' '
+	echo value >expect &&
+	git config -f envMatch.cfg some.key $(cat expect) &&
+
+	VAR=foo git -c includeIf.envMatch:VAR:foo.path="$PWD/envExists.cfg" config some.key &&
+	VAR=foo git -c includeIf.envMatch:VAR:*f*.path="$PWD/envExists.cfg" config some.key &&
+
+	cat >expect.err <<-\EOF &&
+	error: '\''envMatch:VAR'\'' missing a '\'':'\'' to match the value
+	fatal: unable to parse command-line config
+	EOF
+	test_must_fail env VAR=x git -c includeIf.envMatch:VAR.path="$PWD/envBool.cfg" config some.key 2>err.actual &&
+	test_cmp expect.err err.actual
+'
+
 test_expect_success 'include cycles are detected' '
 	git init --bare cycle &&
 	git -C cycle config include.path cycle &&