Message ID | pull.695.v2.git.git.1580851963616.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] clone: use submodules.recurse option for automatically clone submodules |
Hi Markus,

On Tue, 4 Feb 2020, Markus Klein via GitGitGadget wrote:

> From: Markus Klein <masmiseim@gmx.de>
>
> Simplify cloning repositories with submodules when the option
> submodules.recurse is set by the user. This makes it transparent to the
> user if submodules are used. The user doesn’t have to know if he has to add
> an extra parameter to get the full project including the used submodules.
> This makes clone behave identical to other commands like fetch, pull,
> checkout, ... which include the submodules automatically if this option is
> set.
>
> It is implemented analog to the pull command by using an own config
> function instead of using just the default config. In contrast to the pull
> command, the submodule.recurse state is saved as an array of strings as it
> can take an optionally pathspec argument which describes which submodules
> should be recursively initialized and cloned. To recursively initialize and
> clone all submodules a pathspec of "." has to be used.
> The regression test is simplified compared to the test for "git clone
> --recursive" as the general functionality is already checked there.
>
> Changes since v1:
> * Fixed the commit author to match the Signed-off-by line

This changelog should go...

>
> Signed-off-by: Markus Klein <masmiseim@gmx.de>
> ---

... after the `---`. I.e. it should go into the PR description (which is
the first comment on the PR) instead of the commit message.

Ciao,
Johannes

>     Add the usage of the submodules.recurse parameter on clone
>
>     I try to finish the pullrequest #573 from Maddimax. This adds the usage
>     of the submodules.recurse parameter on clone
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-695%2FMasmiseim36%2Fdev%2FCloneWithSubmodule-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-695/Masmiseim36/dev/CloneWithSubmodule-v2
> Pull-Request: https://github.com/git/git/pull/695
>
> Range-diff vs v1:
>
>  1:  7fa8d19faf ! 1:  c75835268a clone: use submodules.recurse option for automatically clone submodules
>      @@ -1,4 +1,4 @@
>      -Author: Markus <masmiseim@gmx.de>
>      +Author: Markus Klein <masmiseim@gmx.de>
>
>           clone: use submodules.recurse option for automatically clone submodules
>
>      @@ -19,6 +19,9 @@
>           The regression test is simplified compared to the test for "git clone
>           --recursive" as the general functionality is already checked there.
>
>      +    Changes since v1:
>      +    * Fixed the commit author to match the Signed-off-by line
>      +
>           Signed-off-by: Markus Klein <masmiseim@gmx.de>
>
>           diff --git a/builtin/clone.c b/builtin/clone.c
>
>
>  builtin/clone.c              | 16 +++++++++++++++-
>  t/t7407-submodule-foreach.sh | 11 +++++++++++
>  2 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 0fc89ae2b9..21b9d927a2 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -26,6 +26,8 @@
>  #include "dir-iterator.h"
>  #include "iterator.h"
>  #include "sigchain.h"
> +#include "submodule-config.h"
> +#include "submodule.h"
>  #include "branch.h"
>  #include "remote.h"
>  #include "run-command.h"
> @@ -929,6 +931,18 @@ static int path_exists(const char *path)
>  	return !stat(path, &sb);
>  }
>
> +/**
> + * Read config variables.
> + */
> +static int git_clone_config(const char *var, const char *value, void *cb)
> +{
> +	if (!strcmp(var, "submodule.recurse") && git_config_bool(var, value)) {
> +		string_list_append(&option_recurse_submodules, "true");
> +		return 0;
> +	}
> +	return git_default_config(var, value, cb);
> +}
> +
>  int cmd_clone(int argc, const char **argv, const char *prefix)
>  {
>  	int is_bundle = 0, is_local;
> @@ -1103,7 +1117,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>
>  	write_config(&option_config);
>
> -	git_config(git_default_config, NULL);
> +	git_config(git_clone_config, NULL);
>
>  	if (option_bare) {
>  		if (option_mirror)
> diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
> index 6b2aa917e1..44b32f7b27 100755
> --- a/t/t7407-submodule-foreach.sh
> +++ b/t/t7407-submodule-foreach.sh
> @@ -383,6 +383,17 @@ test_expect_success 'use "update --recursive nested1" to checkout all submodules
>  		git rev-parse --resolve-git-dir nested1/nested2/nested3/submodule/.git
>  	)
>  '
> +test_expect_success 'use "git clone" with submodule.recurse=true to checkout all submodules' '
> +	git clone -c submodule.recurse=true super clone7 &&
> +	(
> +		git -C clone7 rev-parse --resolve-git-dir .git --resolve-git-dir nested1/nested2/nested3/submodule/.git >actual &&
> +		cat >expect <<-EOF &&
> +		.git
> +		$(pwd)/clone7/.git/modules/nested1/modules/nested2/modules/nested3/modules/submodule
> +		EOF
> +		test_cmp expect actual
> +	)
> +'
>
>  test_expect_success 'command passed to foreach retains notion of stdin' '
>  	(
>
> base-commit: d0654dc308b0ba76dd8ed7bbb33c8d8f7aacd783
> --
> gitgitgadget
>
"Markus Klein via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Markus Klein <masmiseim@gmx.de> > > Simplify cloning repositories with submodules when the option > submodules.recurse is set by the user. This makes it transparent to the > user if submodules are used. The user doesn’t have to know if he has to add > an extra parameter to get the full project including the used submodules. > This makes clone behave identical to other commands like fetch, pull, > checkout, ... which include the submodules automatically if this option is > set. I am not sure if it is even a good idea to make clone behave identically to fetch and pull. We cannot escape from the fact that the initial cloning of the top-level superproject is a special event---we do not even have a place to put the configuration specific to that superproject (e.g. which submodules are good ones to clone by default) before that happens. You misspelt "submodule.recurse" everywhere in the log message, by the way, even though the code seems to react to the right variable. > It is implemented analog to the pull command by using an own config > function instead of using just the default config. I am not sure if this is worth saying, but it is not incorrect per-se. > In contrast to the pull > command, the submodule.recurse state is saved as an array of strings as it > can take an optionally pathspec argument which describes which submodules > should be recursively initialized and cloned. Sorry, but I do not think I get this part at all. Your callback seems to add a fixed string "true" to option_recurse_submodules string list as many times as submodule.recurse variable is defined in various configuration files. Does anybody count how many and react differently? You mention "pathspec" here, but how does one specify a pathspec beforehand (remember, this is clone and there is no superproject repository or its per-repository configuration file yet before we clone it)? 
> To recursively initialize and > clone all submodules a pathspec of "." has to be used. > The regression test is simplified compared to the test for "git clone > --recursive" as the general functionality is already checked there. Documentation/config/submodule.txt says submodule.recurse says Specifies if commands recurse into submodules by default. This applies to all commands that have a `--recurse-submodules` option, except `clone`. Defaults to false. so I take that the value must be a boolean. So I am lost what pathspec you are talking about here. > +/** > + * Read config variables. > + */ That's a fairly useless comment that does not say more than what the name of the function already tells us X-<. > +static int git_clone_config(const char *var, const char *value, void *cb) > +{ > + if (!strcmp(var, "submodule.recurse") && git_config_bool(var, value)) { > + string_list_append(&option_recurse_submodules, "true"); > + return 0; The breakage of this is not apparent, but this is misleading. If submodule.recurse is set to a value that git_config_bool() would say "false", the if statement is skipped, and you end up calling git_default_config() with "submodule.recurse", even though you are supposed to have already dealt with the setting. if (!strcmp(var, "submodule.recurse")) { if (git_config_bool(var, value)) ... return 0; /* done with the variable either way */ } is more appropriate. I still do not know what this code is trying to do by appending "true" as many times as submodule.recurse appears in the configuration file(s), though. When given from the command line, i.e. git clone --no-recurse-submodules ... git clone --recurse-submodules ... git clone --recurse-submodules=<something> ... recurse_submodules_cb() reacts to them by (1) clearing what have been accumulated so far, (2) appending the match-all "." pathspec, and (3) appending the <something> string to option_recurse_submodules string list. 
But given that submodule.recurse is not (and will not be without an involved transition plan) a pathspec but merely a boolean, I would think appending hardcoded string constant "true" makes little sense. After sorting the list, these values become values of the submodule.active configuration variable whose values are pathspec elements in cmd_clone(); see the part of the code before it makes a call to init_db(). So, I would sort-of understand if you pretend --recurse-submodules was given from the command line when submodule.recurse is set to true (which would mean that you'd append "." to the string list). But I do not understand why appending "true" is a good thing at all here. Another thing I noticed. If you have "[submodule] recurse" in your $HOME/.gitconfig, you'd want to be able to countermand from the command line with git clone --no-recurse-submodules ... so that the clone would not go recursive. And that should be tested. You'd also want the opposite, i.e. with "[submodule] recurse=no" in your $HOME/.gitconfig and running git clone --recurse-submodules ... should countermand the configuration. Thanks. > +test_expect_success 'use "git clone" with submodule.recurse=true to checkout all submodules' ' > + git clone -c submodule.recurse=true super clone7 && > + ( > + git -C clone7 rev-parse --resolve-git-dir .git --resolve-git-dir nested1/nested2/nested3/submodule/.git >actual && > + cat >expect <<-EOF && > + .git > + $(pwd)/clone7/.git/modules/nested1/modules/nested2/modules/nested3/modules/submodule > + EOF > + test_cmp expect actual > + ) > +'
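
The control-flow point raised in this review (consume `submodule.recurse` whether it parses as true or false, so that it never falls through to the default handler) can be tried outside of Git. The following is only a minimal, self-contained sketch and not Git's code: `parse_bool()`, `default_config_cb()` and `clone_config_cb()` are hypothetical stand-ins for `git_config_bool()`, `git_default_config()` and the patch's `git_clone_config()`, and the boolean parser recognizes just a few spellings.

```c
#include <string.h>
#include <strings.h>

/* Hypothetical stand-in for git_config_bool(); not Git's real parser. */
static int parse_bool(const char *value)
{
	if (!value)
		return 1; /* a bare "[submodule] recurse" is treated as true */
	return !strcasecmp(value, "true") || !strcasecmp(value, "yes") ||
	       !strcasecmp(value, "on") || !strcmp(value, "1");
}

static int recurse_enabled;   /* last value seen for submodule.recurse */
static int fallthrough_calls; /* how often the default handler ran */

/* Hypothetical stand-in for git_default_config(). */
static int default_config_cb(const char *var, const char *value)
{
	(void)var;
	(void)value;
	fallthrough_calls++;
	return 0;
}

/* Handles the key once and returns, whatever the parsed value is. */
static int clone_config_cb(const char *var, const char *value)
{
	if (!strcmp(var, "submodule.recurse")) {
		recurse_enabled = parse_bool(value);
		return 0; /* done with the variable either way */
	}
	return default_config_cb(var, value);
}
```

Feeding the callback `submodule.recurse=true` followed by `submodule.recurse=false` leaves `recurse_enabled` at 0 and never reaches the default handler; storing a single flag instead of appending to a list also keeps the variable an ordinary boolean.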
Hi Junio,

On Thu, 6 Feb 2020, Junio C Hamano wrote:

> "Markus Klein via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > +static int git_clone_config(const char *var, const char *value, void *cb)
> > +{
> > +	if (!strcmp(var, "submodule.recurse") && git_config_bool(var, value)) {
> > +		string_list_append(&option_recurse_submodules, "true");
> > +		return 0;
>
> The breakage of this is not apparent, but this is misleading. If
> submodule.recurse is set to a value that git_config_bool() would say
> "false", the if statement is skipped, and you end up calling
> git_default_config() with "submodule.recurse", even though you are
> supposed to have already dealt with the setting.
>
> 	if (!strcmp(var, "submodule.recurse")) {
> 		if (git_config_bool(var, value))
> 			...
> 		return 0; /* done with the variable either way */
> 	}
>
> is more appropriate.

Good catch, and I think you will have to do even more: in the "else" case,
it is possible that the user overrode a `submodule.recurse` from the
system config in their user-wide config, so we must _undo_ the
`string_list_append().

Further, it is probably not a good idea to append "true" _twice_ if
multiple configs in the chain specify `submodule.recurse = true`.

> I still do not know what this code is trying to do by appending "true"
> as many times as submodule.recurse appears in the configuration file(s),
> though.
>
> When given from the command line, i.e.
>
> 	git clone --no-recurse-submodules ...
> 	git clone --recurse-submodules ...
> 	git clone --recurse-submodules=<something> ...
>
> recurse_submodules_cb() reacts to them by
>
>  (1) clearing what have been accumulated so far,
>  (2) appending the match-all "." pathspec, and
>  (3) appending the <something> string
>
> to option_recurse_submodules string list. But given that
> submodule.recurse is not (and will not be without an involved
> transition plan) a pathspec but merely a boolean, I would think
> appending hardcoded string constant "true" makes little sense.
> After sorting the list, these values become values of the
> submodule.active configuration variable whose values are pathspec
> elements in cmd_clone(); see the part of the code before it makes a
> call to init_db().

Indeed, I think I even pointed out that "true" is not an appropriate value
to use here: https://github.com/git/git/pull/695/#discussion_r367866708

Ciao,
Dscho
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> 	if (!strcmp(var, "submodule.recurse")) {
>> 		if (git_config_bool(var, value))
>> 			...
>> 		return 0; /* done with the variable either way */
>> 	}
>>
>> is more appropriate.
>
> Good catch, and I think you will have to do even more: in the "else" case,
> it is possible that the user overrode a `submodule.recurse` from the
> system config in their user-wide config, so we must _undo_ the
> `string_list_append().

Yeah, I tend to agree that submodule.recurse should not be made into
a multi-valued fields with this change; it should stay to be the
usual last-one-wins single boolean.

> Further, it is probably not a good idea to append "true" _twice_ if
> multiple configs in the chain specify `submodule.recurse = true`.

The user of this list in cmd_clone() first sorts and dedups, so
appending the same is OK, even though it may appear sloppy.
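
Why a duplicate append is harmless once the list is sorted and deduplicated can be seen in a small stand-alone sketch. This is not the code in cmd_clone(): `sort_and_dedup()` and `cmp_str()` are hypothetical helpers operating on a plain array of C strings rather than on Git's string list type.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of C string pointers. */
static int cmp_str(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/*
 * Sort the array in place, then squeeze out adjacent duplicates.
 * Returns the number of unique entries left at the front of the array.
 */
static size_t sort_and_dedup(const char **items, size_t nr)
{
	size_t src, dst = 0;

	if (!nr)
		return 0;
	qsort(items, nr, sizeof(*items), cmp_str);
	for (src = 1; src < nr; src++)
		if (strcmp(items[dst], items[src]))
			items[++dst] = items[src];
	return dst + 1;
}
```

With the entries ".", "sub" and "." as input, two unique pathspecs remain, so appending the same value once per configuration file does not change the final result.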
Junio C Hamano <gitster@pobox.com> writes:

> So, I would sort-of understand if you pretend --recurse-submodules
> was given from the command line when submodule.recurse is set to
> true (which would mean that you'd append "." to the string list).
> But I do not understand why appending "true" is a good thing at all
> here.
>
> Another thing I noticed.
>
> If you have "[submodule] recurse" in your $HOME/.gitconfig, you'd
> want to be able to countermand from the command line with
>
> 	git clone --no-recurse-submodules ...
>
> so that the clone would not go recursive. And that should be
> tested.
>
> You'd also want the opposite, i.e. with "[submodule] recurse=no" in
> your $HOME/.gitconfig and running
>
> 	git clone --recurse-submodules ...
>
> should countermand the configuration.

Totally untested, but just to illustrate the approach, here is a
sample patch to implement "Pretend --recurse-submodules=. is set on
the command line when submodule.recurse is set (in the 'last one
wins' sense) and there is no --recurse-submodules command line
option."

It should outline the right interactions between the command line
options and configuration variable, like allowing "git clone
--no-recurse-submodules" to defeat submodule.recurse configuration.

Not that I agree that "[submodules] recurse" set in the
$HOME/.gitconfig should affect "git clone". It is merely to
illustrate how it could be done, if it were a good idea.

 builtin/clone.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 0fc89ae2b9..163803d89e 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -32,6 +32,7 @@
 #include "connected.h"
 #include "packfile.h"
 #include "list-objects-filter-options.h"
+#include "submodule.h"
 
 /*
  * Overall FIXMEs:
@@ -71,6 +72,8 @@ static struct list_objects_filter_options filter_options;
 static struct string_list server_options = STRING_LIST_INIT_NODUP;
 static int option_remote_submodules;
 
+static int recurse_submodules_option_given;
+
 static int recurse_submodules_cb(const struct option *opt,
 				 const char *arg, int unset)
 {
@@ -81,7 +84,7 @@ static int recurse_submodules_cb(const struct option *opt,
 	else
 		string_list_append((struct string_list *)opt->value,
 				   (const char *)opt->defval);
-
+	recurse_submodules_option_given = 1;
 	return 0;
 }
 
@@ -929,6 +932,13 @@ static int path_exists(const char *path)
 	return !stat(path, &sb);
 }
 
+static int git_clone_config(const char *var, const char *value, void *cb)
+{
+	if (starts_with(var, "submodule."))
+		return git_default_submodule_config(var, value, NULL);
+	return git_default_config(var, value, cb);
+}
+
 int cmd_clone(int argc, const char **argv, const char *prefix)
 {
 	int is_bundle = 0, is_local;
@@ -1103,7 +1113,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 
 	write_config(&option_config);
 
-	git_config(git_default_config, NULL);
+	git_config(git_clone_config, NULL);
+	if (!recurse_submodules_option_given && should_update_submodules())
+		string_list_append(&option_recurse_submodules, ".");
 
 	if (option_bare) {
 		if (option_mirror)
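
The precedence that the sample patch aims for (any form of the command-line option, including the negated one, defeats the configuration) can be modelled without Git's option parser. The sketch below is an illustration under that assumption only: `struct recurse_state`, `on_cli_option()` and `apply_config_default()` are hypothetical names, and a fixed-size array stands in for the string list.

```c
#include <stddef.h>

#define MAX_PATHSPECS 8

/* Hypothetical model of the recurse-submodules state during clone. */
struct recurse_state {
	const char *pathspecs[MAX_PATHSPECS];
	size_t nr;
	int option_given; /* set once any form of the option was seen */
};

/* Models the option callback: clear on --no-..., else append a pathspec. */
static void on_cli_option(struct recurse_state *s, const char *arg, int unset)
{
	if (unset)
		s->nr = 0;
	else if (s->nr < MAX_PATHSPECS)
		s->pathspecs[s->nr++] = arg ? arg : "."; /* "." matches all */
	s->option_given = 1;
}

/* The configuration only counts when the option was absent altogether. */
static void apply_config_default(struct recurse_state *s, int config_recurse)
{
	if (!s->option_given && config_recurse && s->nr < MAX_PATHSPECS)
		s->pathspecs[s->nr++] = ".";
}
```

Recording that the option was seen, rather than checking whether the list is empty, is what lets the negated option leave the list empty even when the configuration says true.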
> "Markus Klein via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Markus Klein <masmiseim@gmx.de>
> >
> > Simplify cloning repositories with submodules when the option
> > submodules.recurse is set by the user. This makes it transparent to
> > the user if submodules are used. The user doesn’t have to know if he
> > has to add an extra parameter to get the full project including the used submodules.
> > This makes clone behave identical to other commands like fetch, pull,
> > checkout, ... which include the submodules automatically if this
> > option is set.
>
> I am not sure if it is even a good idea to make clone behave identically to fetch and pull.
> We cannot escape from the fact that the initial cloning of the top-level superproject is a special
> event---we do not even have a place to put the configuration specific to that superproject
> (e.g. which submodules are good ones to clone by default) before that happens.

It behaves only identical if the option "submodule.recurse" is set in
the global .gitconfig. So, it is optional for people who know what they
do. For people which use submodules heavily, this is very useful.
For the case where you don't like to get all submodules but have this
option set, you can disable it via --no-recurse-submodules

>
> You misspelt "submodule.recurse" everywhere in the log message, by the way, even though the code seems
> to react to the right variable.
>
> > It is implemented analog to the pull command by using an own config
> > function instead of using just the default config.
>
> I am not sure if this is worth saying, but it is not incorrect per-se.
>
> > In contrast to the pull
> > command, the submodule.recurse state is saved as an array of strings
> > as it can take an optionally pathspec argument which describes which
> > submodules should be recursively initialized and cloned.
>
> Sorry, but I do not think I get this part at all. Your callback seems to add a fixed string "true"
> to option_recurse_submodules string list as many times as submodule.recurse variable is defined in
> various configuration files. Does anybody count how many and react differently? You mention "pathspec"
> here, but how does one specify a pathspec beforehand (remember, this is clone and there is no superproject
> repository or its per-repository configuration file yet before we clone it)?

I'm so sorry for the confusing with the true. This is definitely wrong.
Johannes already pointed this out to me and I had already fixed it.
Shame on me, as I had uploaded an old version :-(

>
> > To recursively initialize and
> > clone all submodules a pathspec of "." has to be used.
> > The regression test is simplified compared to the test for "git clone
> > --recursive" as the general functionality is already checked there.
>
> Documentation/config/submodule.txt says submodule.recurse says
>
> 	Specifies if commands recurse into submodules by default. This
> 	applies to all commands that have a `--recurse-submodules`
> 	option, except `clone`. Defaults to false.
>
> so I take that the value must be a boolean. So I am lost what pathspec you are talking about here.
>
> > +/**
> > + * Read config variables.
> > + */
>
> That's a fairly useless comment that does not say more than what the name of the function already tells us X-<.

True, this was copy'pasted from the pull implementation. So it should be
useless there also.

>
> > +static int git_clone_config(const char *var, const char *value, void
> > +*cb) {
> > +	if (!strcmp(var, "submodule.recurse") && git_config_bool(var, value)) {
> > +		string_list_append(&option_recurse_submodules, "true");
> > +		return 0;
>
> The breakage of this is not apparent, but this is misleading. If submodule.recurse is set to a value
> that git_config_bool() would say "false", the if statement is skipped, and you end up calling
> git_default_config() with "submodule.recurse", even though you are supposed to have already dealt with
> the setting.
>
> 	if (!strcmp(var, "submodule.recurse")) {
> 		if (git_config_bool(var, value))
> 			...
> 		return 0; /* done with the variable either way */
> 	}
>
> is more appropriate. I still do not know what this code is trying to do by appending "true" as many
> times as submodule.recurse appears in the configuration file(s), though.
>
> When given from the command line, i.e.
>
> 	git clone --no-recurse-submodules ...
> 	git clone --recurse-submodules ...
> 	git clone --recurse-submodules=<something> ...
>
> recurse_submodules_cb() reacts to them by
>
>  (1) clearing what have been accumulated so far,
>  (2) appending the match-all "." pathspec, and
>  (3) appending the <something> string
>
> to option_recurse_submodules string list. But given that submodule.recurse is not (and will not be without
> an involved transition plan) a pathspec but merely a boolean, I would think appending hardcoded string
> constant "true" makes little sense.
> After sorting the list, these values become values of the submodule.active configuration variable whose
> values are pathspec elements in cmd_clone(); see the part of the code before it makes a call to init_db().
>
> So, I would sort-of understand if you pretend --recurse-submodules was given from the command line when
> submodule.recurse is set to true (which would mean that you'd append "." to the string list).
> But I do not understand why appending "true" is a good thing at all here.
>
> Another thing I noticed.
>
> If you have "[submodule] recurse" in your $HOME/.gitconfig, you'd want to be able to countermand
> from the command line with
>
> 	git clone --no-recurse-submodules ...
>
> so that the clone would not go recursive. And that should be tested.
>
> You'd also want the opposite, i.e. with "[submodule] recurse=no" in your $HOME/.gitconfig and running
>
> 	git clone --recurse-submodules ...
>
> should countermand the configuration.

Thanks for the hint. I added this tests, and it was very helpful, as it
pointed out, that the disabling via --no-recurse-submodules was not
working.

>
> Thanks.
>
> > +test_expect_success 'use "git clone" with submodule.recurse=true to checkout all submodules' '
> > +	git clone -c submodule.recurse=true super clone7 &&
> > +	(
> > +		git -C clone7 rev-parse --resolve-git-dir .git --resolve-git-dir nested1/nested2/nested3/submodule/.git >actual &&
> > +		cat >expect <<-EOF &&
> > +		.git
> > +		$(pwd)/clone7/.git/modules/nested1/modules/nested2/modules/nested3/modules/submodule
> > +		EOF
> > +		test_cmp expect actual
> > +	)
> > +'

Thanks for the feedback
diff --git a/builtin/clone.c b/builtin/clone.c
index 0fc89ae2b9..21b9d927a2 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -26,6 +26,8 @@
 #include "dir-iterator.h"
 #include "iterator.h"
 #include "sigchain.h"
+#include "submodule-config.h"
+#include "submodule.h"
 #include "branch.h"
 #include "remote.h"
 #include "run-command.h"
@@ -929,6 +931,18 @@ static int path_exists(const char *path)
 	return !stat(path, &sb);
 }
 
+/**
+ * Read config variables.
+ */
+static int git_clone_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, "submodule.recurse") && git_config_bool(var, value)) {
+		string_list_append(&option_recurse_submodules, "true");
+		return 0;
+	}
+	return git_default_config(var, value, cb);
+}
+
 int cmd_clone(int argc, const char **argv, const char *prefix)
 {
 	int is_bundle = 0, is_local;
@@ -1103,7 +1117,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 
 	write_config(&option_config);
 
-	git_config(git_default_config, NULL);
+	git_config(git_clone_config, NULL);
 
 	if (option_bare) {
 		if (option_mirror)
diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
index 6b2aa917e1..44b32f7b27 100755
--- a/t/t7407-submodule-foreach.sh
+++ b/t/t7407-submodule-foreach.sh
@@ -383,6 +383,17 @@ test_expect_success 'use "update --recursive nested1" to checkout all submodules
 		git rev-parse --resolve-git-dir nested1/nested2/nested3/submodule/.git
 	)
 '
+test_expect_success 'use "git clone" with submodule.recurse=true to checkout all submodules' '
+	git clone -c submodule.recurse=true super clone7 &&
+	(
+		git -C clone7 rev-parse --resolve-git-dir .git --resolve-git-dir nested1/nested2/nested3/submodule/.git >actual &&
+		cat >expect <<-EOF &&
+		.git
+		$(pwd)/clone7/.git/modules/nested1/modules/nested2/modules/nested3/modules/submodule
+		EOF
+		test_cmp expect actual
+	)
+'
 
 test_expect_success 'command passed to foreach retains notion of stdin' '
 	(