Message ID | cover.1638487815.git.jonathantanmy@google.com (mailing list archive) |
---|---|
Series | Conditional config includes based on remote URL |
On Thu, Dec 02 2021, Jonathan Tan wrote:

> Thanks, Junio, for your comments. I think the code is more clearly laid
> out now.
>
> The main changes from v4 are that I've maintained the existing code
> structure more, and changed the keyword used to something that hopefully
> will be more forwards compatible. I've also updated the documentation to
> explain the forwards compatibility idea.

I read through this and came up with the below as a proposed squash-in
just while reading through it. These may or may not help. Changes:

 * There was some needless "$(pwd)" in the tests

 * Inlining the "remote_urls" in the struct makes its management easier;
   and the free/NULL checks just check .nr now, and string_list_clear() can be
   unconditional.

 * Created a include_by_remote_url() function. Makes the overall diff smaller
   since you don't need to add braces to everything in include_condition_is_true()

Other comments (not related to the below):

 * It would be nice if e.g. the "includeIf.hasconfig:remote.*.url globs" test
   were split up by condition, but maybe that's a hassle (would need a small helper).

   Just something that would have helped while hacking on this, i.e. now most of it
   was an all-or-nothing failure & peek at the trace output

 * Your last test appears to entirely forbid recursion. I.e. we die if you include config
   which in turn tries to use this include mechanism, right?

   That's probably wise, and it is explicitly documented.

   But as far as the documentation about this being a forward-compatible facility, do we
   think that this limitation would apply to any future config key? I.e. if I include based
   on "user.email" nothing in that to-be-included can set user.email?

   That's probably OK, just wondering. In any case it can always be expanded later on.

diff --git a/config.c b/config.c
index 39ac38e0e78..91b0a328e59 100644
--- a/config.c
+++ b/config.c
@@ -130,9 +130,11 @@ struct config_include_data {
 	/*
 	 * All remote URLs discovered when reading all config files.
 	 */
-	struct string_list *remote_urls;
+	struct string_list remote_urls;
 };
-#define CONFIG_INCLUDE_INIT { 0 }
+#define CONFIG_INCLUDE_INIT { \
+	.remote_urls = STRING_LIST_INIT_DUP, \
+}
 
 static int git_config_include(const char *var, const char *value, void *data);
 
@@ -340,9 +342,7 @@ static void populate_remote_urls(struct config_include_data *inc)
 	current_config_kvi = NULL;
 	current_parsing_scope = 0;
 
-	inc->remote_urls = xmalloc(sizeof(*inc->remote_urls));
-	string_list_init_dup(inc->remote_urls);
-	config_with_options(add_remote_url, inc->remote_urls, inc->config_source, &opts);
+	config_with_options(add_remote_url, &inc->remote_urls, inc->config_source, &opts);
 
 	cf = store_cf;
 	current_config_kvi = store_kvi;
@@ -381,26 +381,31 @@ static int at_least_one_url_matches_glob(const char *glob, int glob_len,
 	return found;
 }
 
+static int include_by_remote_url(struct config_include_data *inc,
+				 const char *cond, size_t cond_len)
+{
+	if (inc->opts->unconditional_remote_url)
+		return 1;
+	if (!inc->remote_urls.nr)
+		populate_remote_urls(inc);
+	return at_least_one_url_matches_glob(cond, cond_len,
+					     &inc->remote_urls);
+}
+
 static int include_condition_is_true(struct config_include_data *inc,
 				     const char *cond, size_t cond_len)
 {
 	const struct config_options *opts = inc->opts;
 
-	if (skip_prefix_mem(cond, cond_len, "gitdir:", &cond, &cond_len)) {
+	if (skip_prefix_mem(cond, cond_len, "gitdir:", &cond, &cond_len))
 		return include_by_gitdir(opts, cond, cond_len, 0);
-	} else if (skip_prefix_mem(cond, cond_len, "gitdir/i:", &cond, &cond_len)) {
+	else if (skip_prefix_mem(cond, cond_len, "gitdir/i:", &cond, &cond_len))
 		return include_by_gitdir(opts, cond, cond_len, 1);
-	} else if (skip_prefix_mem(cond, cond_len, "onbranch:", &cond, &cond_len)) {
+	else if (skip_prefix_mem(cond, cond_len, "onbranch:", &cond, &cond_len))
 		return include_by_branch(cond, cond_len);
-	} else if (skip_prefix_mem(cond, cond_len, "hasconfig:remote.*.url:", &cond,
-				   &cond_len)) {
-		if (inc->opts->unconditional_remote_url)
-			return 1;
-		if (!inc->remote_urls)
-			populate_remote_urls(inc);
-		return at_least_one_url_matches_glob(cond, cond_len,
-						     inc->remote_urls);
-	}
+	else if (skip_prefix_mem(cond, cond_len, "hasconfig:remote.*.url:", &cond,
+				 &cond_len))
+		return include_by_remote_url(inc, cond, cond_len);
 
 	/* unknown conditionals are always false */
 	return 0;
@@ -2061,10 +2066,7 @@ int config_with_options(config_fn_t fn, void *data,
 		ret = do_git_config_sequence(opts, fn, data);
 	}
 
-	if (inc.remote_urls) {
-		string_list_clear(inc.remote_urls, 0);
-		FREE_AND_NULL(inc.remote_urls);
-	}
+	string_list_clear(&inc.remote_urls, 0);
 	return ret;
 }
 
diff --git a/t/t1300-config.sh b/t/t1300-config.sh
index 0f7bae31b4b..8310562b842 100755
--- a/t/t1300-config.sh
+++ b/t/t1300-config.sh
@@ -2391,11 +2391,11 @@ test_expect_success 'includeIf.hasconfig:remote.*.url' '
 	git init hasremoteurlTest &&
 	test_when_finished "rm -rf hasremoteurlTest" &&
 
-	cat >"$(pwd)"/include-this <<-\EOF &&
+	cat >include-this <<-\EOF &&
 	[user]
 		this = this-is-included
 	EOF
-	cat >"$(pwd)"/dont-include-that <<-\EOF &&
+	cat >dont-include-that <<-\EOF &&
 	[user]
 		that = that-is-not-included
 	EOF
@@ -2419,7 +2419,7 @@ test_expect_success 'includeIf.hasconfig:remote.*.url respects last-config-wins'
 	git init hasremoteurlTest &&
 	test_when_finished "rm -rf hasremoteurlTest" &&
 
-	cat >"$(pwd)"/include-two-three <<-\EOF &&
+	cat >include-two-three <<-\EOF &&
 	[user]
 		two = included-config
 		three = included-config
@@ -2453,11 +2453,11 @@ test_expect_success 'includeIf.hasconfig:remote.*.url globs' '
 	git init hasremoteurlTest &&
 	test_when_finished "rm -rf hasremoteurlTest" &&
 
-	printf "[user]\ndss = yes\n" >"$(pwd)/double-star-start" &&
-	printf "[user]\ndse = yes\n" >"$(pwd)/double-star-end" &&
-	printf "[user]\ndsm = yes\n" >"$(pwd)/double-star-middle" &&
-	printf "[user]\nssm = yes\n" >"$(pwd)/single-star-middle" &&
-	printf "[user]\nno = no\n" >"$(pwd)/no" &&
+	printf "[user]\ndss = yes\n" >double-star-start &&
+	printf "[user]\ndse = yes\n" >double-star-end &&
+	printf "[user]\ndsm = yes\n" >double-star-middle &&
+	printf "[user]\nssm = yes\n" >single-star-middle &&
+	printf "[user]\nno = no\n" >no &&
 
 	cat >>hasremoteurlTest/.git/config <<-EOF &&
 	[remote "foo"]
@@ -2491,7 +2491,7 @@ test_expect_success 'includeIf.hasconfig:remote.*.url forbids remote url in such
 	git init hasremoteurlTest &&
 	test_when_finished "rm -rf hasremoteurlTest" &&
 
-	cat >"$(pwd)"/include-with-url <<-\EOF &&
+	cat >include-with-url <<-\EOF &&
 	[remote "bar"]
 		url = bar
 	EOF

> I read through this and came up with the below as a proposed squash-in
> just while reading through it. These may or may not help. Changes:
>
>  * There was some needless "$(pwd)" in the tests

Ah, thanks for catching that.

>  * Inlining the "remote_urls" in the struct makes its management easier;
>    and the free/NULL checks just check .nr now, and string_list_clear() can be
>    unconditional.

I don't think we can do this - nr might still be 0 after a scan if we
don't have remote URLs for some reason, so we still need to distinguish
between not-scanned and scanned-with-zero-URLs.

>  * Created a include_by_remote_url() function. Makes the overall diff smaller
>    since you don't need to add braces to everything in include_condition_is_true()

Ah, good idea. I'll do this.

> Other comments (not related to the below):
>
>  * It would be nice if e.g. the "includeIf.hasconfig:remote.*.url globs" test
>    were split up by condition, but maybe that's a hassle (would need a small helper).
>
>    Just something that would have helped while hacking on this, i.e. now most of it
>    was an all-or-nothing failure & peek at the trace output

What do you mean by condition? There seems to only be one condition
(whether the URL is there or not), unless you were thinking of smaller
subdivisions.

>  * Your last test appears to entirely forbid recursion. I.e. we die if you include config
>    which in turn tries to use this include mechanism, right?
>
>    That's probably wise, and it is explicitly documented.
>
>    But as far as the documentation about this being a forward-compatible facility, do we
>    think that this limitation would apply to any future config key? I.e. if I include based
>    on "user.email" nothing in that to-be-included can set user.email?
>
>    That's probably OK, just wondering. In any case it can always be expanded later on.

We can decide later what the future facility will be, but I envision
that we will not allow included files to set config that can affect any
include directives in use. So, for example, if I have a user.email-based
include, none of my config-conditionally included files can set user.email.

On Tue, Dec 07 2021, Jonathan Tan wrote:

>> I read through this and came up with the below as a proposed squash-in
>> just while reading through it. These may or may not help. Changes:
>>
>>  * There was some needless "$(pwd)" in the tests
>
> Ah, thanks for catching that.
>
>>  * Inlining the "remote_urls" in the struct makes its management easier;
>>    and the free/NULL checks just check .nr now, and string_list_clear() can be
>>    unconditional.
>
> I don't think we can do this - nr might still be 0 after a scan if we
> don't have remote URLs for some reason, so we still need to distinguish
> between not-scanned and scanned-with-zero-URLs.

You mean so that we don't double-free? The way string_list_clear()
protects against that, but maybe there's something else.

Whatever it is (if there's anything) it could use test coverage then :)

>>  * Created a include_by_remote_url() function. Makes the overall diff smaller
>>    since you don't need to add braces to everything in include_condition_is_true()
>
> Ah, good idea. I'll do this.
>
>> Other comments (not related to the below):
>>
>>  * It would be nice if e.g. the "includeIf.hasconfig:remote.*.url globs" test
>>    were split up by condition, but maybe that's a hassle (would need a small helper).
>>
>>    Just something that would have helped while hacking on this, i.e. now most of it
>>    was an all-or-nothing failure & peek at the trace output
>
> What do you mean by condition? There seems to only be one condition
> (whether the URL is there or not), unless you were thinking of smaller
> subdivisions.

Maybe I'm just misunderstanding the intent here, but aren't you trying
to guard against the case of having a ~/.gitconfig that includes
~/.gitconfig.d/for-this-url, and *that* file in turns changes the
remote's "url" in its config, followed by another "include if url
matches" condition therein?

I.e. I read (more like skimmed) the documentation & test at the end as
forbidding that, but maybe that's OK?

>>  * Your last test appears to entirely forbid recursion. I.e. we die if you include config
>>    which in turn tries to use this include mechanism, right?
>>
>>    That's probably wise, and it is explicitly documented.
>>
>>    But as far as the documentation about this being a forward-compatible facility, do we
>>    think that this limitation would apply to any future config key? I.e. if I include based
>>    on "user.email" nothing in that to-be-included can set user.email?
>>
>>    That's probably OK, just wondering. In any case it can always be expanded later on.
>
> We can decide later what the future facility will be, but I envision
> that we will not allow included files to set config that can affect any
> include directives in use. So, for example, if I have a user.email-based
> include, none of my config-conditionally included files can set user.email.

I didn't look deeply at the implementation at all, but why would this be
a problem?

You parse ~/.gitconfig, it has user.name=foo, then right after in that
file we do:

    [includeIf "hasconfig:user.name:*foo*"]
        path = ~/.gitconfig.d/foo

Now the top of ~/.gitconfig.d/foo we have:

    [user]
        name = bar
    [includeIf "hasconfig:user.name:*bar*"]
        path = ~/.gitconfig.d/bar

Why would it matter that we included on user.name=foo before?

Doesn't that only matter *while* we process that first "path" line? Once
we move past it we update our configset to user.name=bar once we hit the
"name" line of the included file.

Then when we get another "hasconfig:user.name" we just match it to our
current user.name=*bar*.

No?

Anyway, I think it's fine to punt on it for now or whatever, just
curious...

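The evaluation order described in the message above can be sketched as a toy model. This is not git's config parser, and "hasconfig:user.name" is only the hypothetical key from the example; every name below (struct entry, process(), current_name, includes_taken) is invented for illustration. Each conditional include is tested against whatever value is current at the moment its line is reached. Note that the series as posted does not behave this way: it dies when an included file sets remote.*.url.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names exist in git. */
enum entry_kind { SET_NAME, INCLUDE_IF_NAME_CONTAINS };

struct entry {
	enum entry_kind kind;
	const char *arg;          /* new user.name, or substring to match */
	const struct entry *file; /* entries of the included file, if any */
	size_t file_len;
};

static const char *current_name = "";
static int includes_taken;

/* Walk entries in order; a condition sees the value current at that point. */
static void process(const struct entry *entries, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		const struct entry *e = &entries[i];

		if (e->kind == SET_NAME) {
			current_name = e->arg;
		} else if (strstr(current_name, e->arg)) {
			includes_taken++;
			process(e->file, e->file_len);
		}
	}
}

/* ~/.gitconfig.d/foo: sets name = bar, then includes on *bar* (empty file). */
static const struct entry foo_file[] = {
	{ SET_NAME, "bar", NULL, 0 },
	{ INCLUDE_IF_NAME_CONTAINS, "bar", NULL, 0 },
};

/* ~/.gitconfig: sets name = foo, then includes foo_file on *foo*. */
static const struct entry top_file[] = {
	{ SET_NAME, "foo", NULL, 0 },
	{ INCLUDE_IF_NAME_CONTAINS, "foo", foo_file, 2 },
};
```

Calling process(top_file, 2) in this model takes both includes and leaves current_name as "bar", which is the outcome the question above assumes.
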
> >>  * Inlining the "remote_urls" in the struct makes its management easier;
> >>    and the free/NULL checks just check .nr now, and string_list_clear() can be
> >>    unconditional.
> >
> > I don't think we can do this - nr might still be 0 after a scan if we
> > don't have remote URLs for some reason, so we still need to distinguish
> > between not-scanned and scanned-with-zero-URLs.
>
> You mean so that we don't double-free? The way string_list_clear()
> protects against that, but maybe there's something else.
>
> Whatever it is (if there's anything) it could use test coverage then :)

No - we only want to do one scan per config read. If we scan and there
are no remote URLs, with your scheme, next time we encounter another
includeIf.hasconfig, we would need to scan again (because nr is still 0).
With my scheme, we can see that the pointer is non-NULL, so we know that
we have already scanned.

> >>  * It would be nice if e.g. the "includeIf.hasconfig:remote.*.url globs" test
> >>    were split up by condition, but maybe that's a hassle (would need a small helper).
> >>
> >>    Just something that would have helped while hacking on this, i.e. now most of it
> >>    was an all-or-nothing failure & peek at the trace output
> >
> > What do you mean by condition? There seems to only be one condition
> > (whether the URL is there or not), unless you were thinking of smaller
> > subdivisions.
>
> Maybe I'm just misunderstanding the intent here, but aren't you trying
> to guard against the case of having a ~/.gitconfig that includes
> ~/.gitconfig.d/for-this-url, and *that* file in turns changes the
> remote's "url" in its config, followed by another "include if url
> matches" condition therein?
>
> I.e. I read (more like skimmed) the documentation & test at the end as
> forbidding that, but maybe that's OK?

If we're including "~/.gitconfig.d/for-this-url" by includeIf.hasconfig,
then yes, I'm guarding against that and other similar conditions.

> > We can decide later what the future facility will be, but I envision
> > that we will not allow included files to set config that can affect any
> > include directives in use. So, for example, if I have a user.email-based
> > include, none of my config-conditionally included files can set user.email.
>
> I didn't look deeply at the implementation at all, but why would this be
> a problem?
>
> You parse ~/.gitconfig, it has user.name=foo, then right after in that
> file we do:
>
>     [includeIf "hasconfig:user.name:*foo*"]
>         path = ~/.gitconfig.d/foo
>
> Now the top of ~/.gitconfig.d/foo we have:
>
>     [user]
>         name = bar
>     [includeIf "hasconfig:user.name:*bar*"]
>         path = ~/.gitconfig.d/bar
>
> Why would it matter that we included on user.name=foo before?
>
> Doesn't that only matter *while* we process that first "path" line? Once
> we move past it we update our configset to user.name=bar once we hit the
> "name" line of the included file.
>
> Then when we get another "hasconfig:user.name" we just match it to our
> current user.name=*bar*.
>
> No?
>
> Anyway, I think it's fine to punt on it for now or whatever, just
> curious...

Well, we can't punt on it because what you describe also applies to
remote URL :-)

So what you're saying is that once we have decided to include a file, we
always include it in its entirety regardless of whether the condition
changes during the file's include. That's reasonable, but other people
could have differing opinions. In this case, I think it's fine just to
prohibit it entirely. In the future, we may look into relaxing this
condition.
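
The "one scan per config read" point can be illustrated with a small standalone sketch. It is not git's code: struct inline_cache, struct flagged_cache, scan_remote_urls() and scan_calls are hypothetical stand-ins, and the pretend scan always finds zero URLs. When "nr == 0" doubles as "not scanned yet", a repository without remotes is rescanned on every lookup; an explicit "scanned" marker (playing the role of the non-NULL pointer in the original patch) avoids that.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from git. */
static int scan_calls;

struct inline_cache {
	size_t nr;   /* number of remote URLs found so far */
};

struct flagged_cache {
	size_t nr;
	int scanned; /* set once the scan has run, even if nr stays 0 */
};

/* Pretend scan of all config files: this repository has no remotes. */
static size_t scan_remote_urls(void)
{
	scan_calls++;
	return 0;
}

/* Variant A: "nr == 0" is also taken to mean "not scanned yet". */
static size_t urls_inline(struct inline_cache *c)
{
	if (!c->nr)
		c->nr = scan_remote_urls();
	return c->nr;
}

/* Variant B: a separate marker records that the scan already happened. */
static size_t urls_flagged(struct flagged_cache *c)
{
	if (!c->scanned) {
		c->nr = scan_remote_urls();
		c->scanned = 1;
	}
	return c->nr;
}
```

With two lookups on a remote-less repository, variant A scans twice and variant B scans once. The repeated scan only costs time; it is not a correctness problem, so whether it matters is a judgment call.
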