Message ID | pull.710.git.1598456751674.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | clone: add remote.cloneDefault config option | expand |
"Sean Barag via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Sean Barag <sean@barag.org> > > While the default remote name of `origin` can be overridden both > pre-clone (with `git clone --origin foo`) and post-clone (with `git > remote rename origin foo`), it'd be handy to have a configurable > system-wide default for clones! I doubt it is handy enough to deserve an explamation point. Replace it with a plain-vanilla full-stop instead. I however tend to agree that, even evidenced by the manual page description of "git clone", i.e. -o <name>:: --origin <name>:: Instead of using the remote name `origin` to keep track of the upstream repository, use `<name>`. that it is understandable if many users and projects wish to call it "upstream". > This commit implements > `remote.cloneDefault` as a parallel to `remote.pushDefault`, > with prioritized name resolution: I highly doubt that .cloneDefault is a good name. After reading only the title of the patch e-mail, i.e. when the only available information on the change available to me was the name of the configuration variable and the fact that it pertains to the command "git clone", I thought it is to specify a URL, from which "git clone" without the URL would clone from that single repository. And the name will cause the same misunderstanding to normal users, not just to reviewers of your patch, after this change hits a future Git release. Taking a parallel from init.defaultBranchName, I would probably call it clone.defaultUpstreamName if I were writing this feature. > diff --git a/builtin/clone.c b/builtin/clone.c > index b087ee40c2..b0dbb848c6 100644 > --- a/builtin/clone.c > +++ b/builtin/clone.c > @@ -941,6 +941,29 @@ static int path_exists(const char *path) > return !stat(path, &sb); > } > > +struct clone_default_info > +{ > + enum config_scope scope; > + struct strbuf remote_name; > + int linenr; > +}; > + > +static int config_read_clone_default(const char *key, const char *value, > + void *cb) > +{ > + struct clone_default_info* info = cb; > + if (strcmp(key, "remote.clonedefault") || !value) { > + return 0; > + } > + > + info->scope = current_config_scope(); > + strbuf_reset(&info->remote_name); > + strbuf_addstr(&info->remote_name, value); > + info->linenr = current_config_line(); > + > + return 0; > +} This feels way overkill and insufficient at the same time. It does not need scope, it does not need linenr, and we already have a place to store end-user specified name for the upstream in the form of the variable option_origin. And the code is not diagnosing any error. static int git_clone_config(const char *k, const char *v, void *cb) { if (option_origin) return 0; /* ignore -- the user gave us an option */ if (!strcmp(k, "clone.defaultupstreamname")) { if (!v) return config_error_nonbool(k); if (strchr(v, '/') || check_refname_format(v, REFNAME_ALLOW_ONELEVEL)) return error(_("invalid upstream name '%s'"), v); option_origin = xstrdup(v); return 0; } return 0; } would be sufficient, and at the same time makes sure it rejects names like 'o..ri..gin', 'o/ri/gin', etc. > @@ -992,8 +1015,15 @@ int cmd_clone(int argc, const char **argv, const char *prefix) > option_no_checkout = 1; > } > > - if (!option_origin) > - option_origin = "origin"; > + if (!option_origin) { > + struct clone_default_info clone_default = { CONFIG_SCOPE_UNKNOWN, STRBUF_INIT, -1 }; > + git_config(config_read_clone_default, &clone_default); > + if (strcmp("", (const char*) clone_default.remote_name.buf)) > + option_origin = clone_default.remote_name.buf; > + else > + option_origin = "origin"; > + } > + It is somewhat sad that we have the git_config(git_default_config) call so late in the control flow. I wonder if we can update the start-up sequence to match the usual flow, e.g. these things happen in the following order in a normal program. - variables are given their default value. - call git_config(git_commandspecific_config, ...) early in the program. - git_commandspecific_config() interprets the command specific variables and passes everything else to git_default_config() variables like option_origin are given values of configuration variable at this point. - call parse_options() next, which may override the variables from the value on the command line. - main control flow uses the variable. "Command-line wins over configuration which wins over the default" falls out naturally. One oddity "git clone" has is that it wants to delay the reading of configuration files (they are read only once, and second and subsequent git_config() calls will reuse what was read before [*]) so that it can read what clone.c::write_config() wrote, so if we were to "fix" the start-up sequence to match the usual flow, we need to satisfy what that odd arrangement wanted to achieve in some other way (e.g. feed what is in option_config to git_default_config ourselves, without using git_config(), as part of the "main control flow uses the variable" part), but it should be doable. [*Side note*]. The above means that this patch, even when the configuration variable does not give upstream name, may break the option_config feature by breaking the second call to git_config(). We need to have a test for that. > diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh > index e69427f881..8aac67b385 100755 > --- a/t/t5606-clone-options.sh > +++ b/t/t5606-clone-options.sh > @@ -19,6 +19,20 @@ test_expect_success 'clone -o' ' > > ' > > +test_expect_success 'clone respects remote.cloneDefault' ' > + > + git -c remote.cloneDefault=bar clone parent clone-config && > + (cd clone-config && git rev-parse --verify refs/remotes/bar/master) > + > +' > + > +test_expect_success 'clone chooses correct remote name' ' > + > + git -c remote.cloneDefault=bar clone -o foo parent clone-o-and-config && > + (cd clone-o-and-config && git rev-parse --verify refs/remotes/foo/master) > + > +' These two are "showing off my shiny new toy" tests, which are needed, but we also need negative tests where the shiny new toy does not kick in when it should not. For example git -c remote.cloneDefault="bad.../...name" clone parent should fail, no? Thanks.
On 8/26/2020 2:46 PM, Junio C Hamano wrote: > "Sean Barag via GitGitGadget" <gitgitgadget@gmail.com> writes: >> This commit implements >> `remote.cloneDefault` as a parallel to `remote.pushDefault`, >> with prioritized name resolution: > > I highly doubt that .cloneDefault is a good name. After reading > only the title of the patch e-mail, i.e. when the only available > information on the change available to me was the name of the > configuration variable and the fact that it pertains to the command > "git clone", I thought it is to specify a URL, from which "git > clone" without the URL would clone from that single repository. > > And the name will cause the same misunderstanding to normal users, > not just to reviewers of your patch, after this change hits a future > Git release. > > Taking a parallel from init.defaultBranchName, I would probably call > it clone.defaultUpstreamName if I were writing this feature. I was thinking "clone.defaultRemoteName" makes it clear we are naming the remote for the provided <url> in the command. Having clone.defaultUpstreamName = upstream may look a bit confusing, while clone.defaultRemoteName = upstream makes it a bit clearer that we will set remote.upstream.url = <url> >> diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh >> index e69427f881..8aac67b385 100755 >> --- a/t/t5606-clone-options.sh >> +++ b/t/t5606-clone-options.sh >> @@ -19,6 +19,20 @@ test_expect_success 'clone -o' ' >> >> ' >> >> +test_expect_success 'clone respects remote.cloneDefault' ' >> + >> + git -c remote.cloneDefault=bar clone parent clone-config && >> + (cd clone-config && git rev-parse --verify refs/remotes/bar/master) >> + >> +' >> + >> +test_expect_success 'clone chooses correct remote name' ' >> + >> + git -c remote.cloneDefault=bar clone -o foo parent clone-o-and-config && >> + (cd clone-o-and-config && git rev-parse --verify refs/remotes/foo/master) >> + >> +' > > These two are "showing off my shiny new toy" tests, which are > needed, but we also need negative tests where the shiny new toy does > not kick in when it should not. For example > > git -c remote.cloneDefault="bad.../...name" clone parent > > should fail, no? This is an important suggestion. Thanks, -Stolee
Derrick Stolee <stolee@gmail.com> writes: > On 8/26/2020 2:46 PM, Junio C Hamano wrote: >> "Sean Barag via GitGitGadget" <gitgitgadget@gmail.com> writes: >>> This commit implements >>> `remote.cloneDefault` as a parallel to `remote.pushDefault`, >>> with prioritized name resolution: >> >> I highly doubt that .cloneDefault is a good name. After reading >> only the title of the patch e-mail, i.e. when the only available >> information on the change available to me was the name of the >> configuration variable and the fact that it pertains to the command >> "git clone", I thought it is to specify a URL, from which "git >> clone" without the URL would clone from that single repository. >> >> And the name will cause the same misunderstanding to normal users, >> not just to reviewers of your patch, after this change hits a future >> Git release. >> >> Taking a parallel from init.defaultBranchName, I would probably call >> it clone.defaultUpstreamName if I were writing this feature. > > I was thinking "clone.defaultRemoteName" makes it clear we are naming > the remote for the provided <url> in the command. I 100% agree that defaultremotename is much better. >> ... For example >> >> git -c remote.cloneDefault="bad.../...name" clone parent >> >> should fail, no? > > This is an important suggestion. To be fair, the current code does not handle the "--origin" command line option not so carefully. Back when the command was scripted, e.g. 47874d6d (revamp git-clone (take #2)., 2006-03-21), had both ref-format check and */* multi-level check, and these checks been retained throughout its life until 8434c2f1 (Build in clone, 2008-04-27) rewrote the whole thing while discarding these checks for --origin=bad.../...name It would make an excellent #leftoverbits or #microproject. Thanks.
> Derrick Stolee <stolee@gmail.com> writes: > >> On 8/26/2020 2:46 PM, Junio C Hamano wrote: >>> "Sean Barag via GitGitGadget" <gitgitgadget@gmail.com> writes: >>>> This commit implements >>>> `remote.cloneDefault` as a parallel to `remote.pushDefault`, >>>> with prioritized name resolution: >>> >>> I highly doubt that .cloneDefault is a good name. After reading >>> only the title of the patch e-mail, i.e. when the only available >>> information on the change available to me was the name of the >>> configuration variable and the fact that it pertains to the command >>> "git clone", I thought it is to specify a URL, from which "git >>> clone" without the URL would clone from that single repository. >>> >>> And the name will cause the same misunderstanding to normal users, >>> not just to reviewers of your patch, after this change hits a future >>> Git release. >>> >>> Taking a parallel from init.defaultBranchName, I would probably call >>> it clone.defaultUpstreamName if I were writing this feature. >> >> I was thinking "clone.defaultRemoteName" makes it clear we are naming >> the remote for the provided <url> in the command. > >I 100% agree that defaultremotename is much better. Perfect, I'll move forward with `clone.defaultRemoteName`. Thanks for the recommendation. This would be the first config variable inside the a "clone" subsection -- is there anything special that needs to happen when a new subsection is added? >>>> ... For example >>> >>> git -c remote.cloneDefault="bad.../...name" clone parent >>> >>> should fail, no? >> >> This is an important suggestion. > > To be fair, the current code does not handle the "--origin" command > line option not so carefully. Agreed - I'm sorry for not including those tests. They'll be present in v2. I'll be sure to include some validation for `clone.defaultRemoteName` within `git_config` as well. > It is somewhat sad that we have the git_config(git_default_config) > call so late in the control flow. I wonder if we can update the > start-up sequence to match the usual flow > ... > One oddity "git clone" has is that it wants to delay the reading of > configuration files (they are read only once, and second and > subsequent git_config() calls will reuse what was read before [*]) so > that it can read what clone.c::write_config() wrote, so if we were to > "fix" the start-up sequence to match the usual flow, we need to > satisfy what that odd arrangement wanted to achieve in some other way > (e.g. feed what is in option_config to git_default_config ourselves, > without using git_config(), as part of the "main control flow uses the > variable" part), but it should be doable. Sounds like a pretty big change! I'm willing to take a crack at it, but given that this is my first patch I'm frankly a bit intimidated :) How would you feel about that being a separate patch? Thanks for all the guidance, folks. Sean
Sean Barag <sean@barag.org> writes: >> It is somewhat sad that we have the git_config(git_default_config) >> call so late in the control flow. I wonder if we can update the >> start-up sequence to match the usual flow >> ... >> One oddity "git clone" has is that it wants to delay the reading of >> configuration files (they are read only once, and second and >> subsequent git_config() calls will reuse what was read before [*]) so >> that it can read what clone.c::write_config() wrote, so if we were to >> "fix" the start-up sequence to match the usual flow, we need to >> satisfy what that odd arrangement wanted to achieve in some other way >> (e.g. feed what is in option_config to git_default_config ourselves, >> without using git_config(), as part of the "main control flow uses the >> variable" part), but it should be doable. > > Sounds like a pretty big change! I'm willing to take a crack at it, > but given that this is my first patch I'm frankly a bit intimidated :) > How would you feel about that being a separate patch? It definitely needs to be a separate patch. In order to realize any new feature that needs to read the existing (i.e. per-machine or per-user) configuration files to affect the behaviour of "git clone", whether it is the "give default to --origin option" or any other thing, first needs to fix the start-up sequence so that the configuration is read once before we process command line options, which is the norm. Only after that is done, we can build the clone.defaultRemoteName and other features that would be affected by the settings of clone.* variables on top, and it is not wise to mix the foundation with a new feature that uses the foundation. So I would imagine it would at least be 3-patch series: - [1/3] would be to whip the start-up sequence into the normal order. This may be the most tricky part. I identified that the handling of option_config might need adjustment, but there may be others. This may not need any new tests, but if the existing tests for "clone -c var=val" do not catch breakage when we naively move the git_config(git_default_config) call early without doing any other adjustment, we might need to add more tests to cover the option. If we find things other than option_config needs adjustment, they also need test coverage. - [2/3] would be to tighten the error checking of option_origin that was lost when the command was reimplemented in C (already discussed). This would need new tests to see "--origin $bogus" command line option is flagged as an error. - [3/3] would be to read option_origin from the configuration file. The start-up sequence fixed by [1/3] would allow us to write the config callback in a more natural way, compared to what you wrote and what I suggested to rewrite. Error checking code in [2/3] would directly be reusable in the code added in this step. We'd need tests similar to we add in [2/3] for the configuration, and combination of configuration and command line (you already wrote most of these). Thanks.
> In order to realize any new feature that needs to read the existing > (i.e. per-machine or per-user) configuration files to affect the > behaviour of "git clone", whether it is the "give default to > --origin option" or any other thing, first needs to fix the start-up > sequence so that the configuration is read once before we process > command line options, which is the norm. Only after that is done, > we can build the clone.defaultRemoteName and other features that > would be affected by the settings of clone.* variables on top, and > it is not wise to mix the foundation with a new feature that uses > the foundation. Makes sense! Thanks for all your help -- I _really_ appreciate it. I'll give it a try over the next few days.
diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt index a8e6437a90..debb21ecbf 100644 --- a/Documentation/config/remote.txt +++ b/Documentation/config/remote.txt @@ -3,6 +3,10 @@ remote.pushDefault:: `branch.<name>.remote` for all branches, and is overridden by `branch.<name>.pushRemote` for specific branches. +remote.cloneDefault:: + The name of the remote to create during `git clone <url>`. + Defaults to "origin". + remote.<name>.url:: The URL of a remote repository. See linkgit:git-fetch[1] or linkgit:git-push[1]. diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt index c898310099..2e101ba4f4 100644 --- a/Documentation/git-clone.txt +++ b/Documentation/git-clone.txt @@ -183,8 +183,9 @@ objects from the source repository into a pack in the cloned repository. -o <name>:: --origin <name>:: - Instead of using the remote name `origin` to keep track - of the upstream repository, use `<name>`. + The remote name used to keep track of the upstream repository. + Overrides `remote.cloneDefault` from the config, and defaults + to `origin`. -b <name>:: --branch <name>:: diff --git a/builtin/clone.c b/builtin/clone.c index b087ee40c2..b0dbb848c6 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -941,6 +941,29 @@ static int path_exists(const char *path) return !stat(path, &sb); } +struct clone_default_info +{ + enum config_scope scope; + struct strbuf remote_name; + int linenr; +}; + +static int config_read_clone_default(const char *key, const char *value, + void *cb) +{ + struct clone_default_info* info = cb; + if (strcmp(key, "remote.clonedefault") || !value) { + return 0; + } + + info->scope = current_config_scope(); + strbuf_reset(&info->remote_name); + strbuf_addstr(&info->remote_name, value); + info->linenr = current_config_line(); + + return 0; +} + int cmd_clone(int argc, const char **argv, const char *prefix) { int is_bundle = 0, is_local; @@ -992,8 +1015,15 @@ int cmd_clone(int argc, const char **argv, const char *prefix) option_no_checkout = 1; } - if (!option_origin) - option_origin = "origin"; + if (!option_origin) { + struct clone_default_info clone_default = { CONFIG_SCOPE_UNKNOWN, STRBUF_INIT, -1 }; + git_config(config_read_clone_default, &clone_default); + if (strcmp("", (const char*) clone_default.remote_name.buf)) + option_origin = clone_default.remote_name.buf; + else + option_origin = "origin"; + } + repo_name = argv[0]; diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh index e69427f881..8aac67b385 100755 --- a/t/t5606-clone-options.sh +++ b/t/t5606-clone-options.sh @@ -19,6 +19,20 @@ test_expect_success 'clone -o' ' ' +test_expect_success 'clone respects remote.cloneDefault' ' + + git -c remote.cloneDefault=bar clone parent clone-config && + (cd clone-config && git rev-parse --verify refs/remotes/bar/master) + +' + +test_expect_success 'clone chooses correct remote name' ' + + git -c remote.cloneDefault=bar clone -o foo parent clone-o-and-config && + (cd clone-o-and-config && git rev-parse --verify refs/remotes/foo/master) + +' + test_expect_success 'redirected clone does not show progress' ' git clone "file://$(pwd)/parent" clone-redirected >out 2>err &&