Message ID | 1607095412-40109-1-git-send-email-peter.kaestle@nokia.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | submodules: fix of regression on fetching of non-init subsub-repo | expand |
On Fri, Dec 4, 2020 at 10:25 AM Peter Kaestle <peter.kaestle@nokia.com> wrote: > [...] > Furthermore a regression test case is added, which tests for recursive > fetches on a superproject with uninitialized sub repositories. This > issue was leading to an infinite loop when doing a revert of a62387b. Just a few small comments (nothing comprehensive) from a quick scan of the patch... Mostly they are just minor style issues, not necessarily worth a re-roll, but there is one actionable item. > Signed-off-by: Peter Kaestle <peter.kaestle@nokia.com> > --- > diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh > @@ -719,4 +719,98 @@ test_expect_success 'fetch new submodule commit intermittently referenced by sup > +add_commit_push () { > + dir="$1" > + msg="$2" > + shift 2 We typically recommend including these assignments in the &&-chain to future-proof against someone later inserting code above them and not realizing that that code is not part of the &&-chain, in which case if the new code fails, the failure might go unnoticed. > + git -C "$dir" add "$@" && > + git -C "$dir" commit -a -m "$msg" && > + git -C "$dir" push > +} > + > +compare_refs_in_dir () { > + fail= && > + if test "x$1" = 'x!' > + then > + fail='!' && > + shift > + fi && > + git -C "$1" rev-parse --verify "$2" >expect && > + git -C "$3" rev-parse --verify "$4" >actual && > + eval $fail test_cmp expect actual > +} We have a test_cmp_rev() similar to this but it doesn't support -C as some of our other test functions do. I briefly wondered if it would make sense to extend it to understand -C, but even that wouldn't help this case since compare_refs_in_dir() introduced here involves two distinct directories. The need here is so special-purpose that it likely would not make sense to upgrade test_cmp_rev() to accommodate it. Okay. > +test_expect_success 'setup nested submodule fetch test' ' > + # does not depend on any previous test setups > + > + for repo in outer middle inner > + do > + ( > + git init --bare $repo && > + git clone $repo ${repo}_content && > + echo "$repo" >"${repo}_content/file" && > + add_commit_push ${repo}_content "initial" file > + ) || return 1 > + done && What is the purpose of the subshell here? Is it to ensure that commits in each repo have identical timestamps? Or is it just for making the && and || expression more clear? If the latter, we normally don't bother with the parentheses. > + git clone outer A && > + git -C A submodule add "$pwd/middle" && > + git -C A/middle/ submodule add "$pwd/inner" && > + add_commit_push A/middle/ "adding inner sub" .gitmodules inner && > + add_commit_push A/ "adding middle sub" .gitmodules middle && > + > + git clone outer B && > + git -C B/ submodule update --init middle && > + > + compare_refs_in_dir A HEAD B HEAD && > + compare_refs_in_dir A/middle HEAD B/middle HEAD && > + test -f B/file && > + test -f B/middle/file && > + ! test -f B/middle/inner/file && These days we typically use test_path_exists() (or test_path_is_file()) and test_path_is_missing() rather than bare `test`. > +test_expect_success 'setup recursive fetch with uninit submodule' ' > + # does not depend on any previous test setups > + > + git init main && > + git init sub && > + > + touch sub/file && Unless the timestamp of the file is significant to the test, in which case `touch` is used, we normally create empty files like this: >sub/file && > +test_expect_success 'recursive fetch with uninit submodule' ' > + git -C main submodule deinit -f sub && > + ! git -C main fetch --recurse-submodules |& > + grep -v -m1 "Fetching submodule sub$" && We want the test scripts to be portable, thus avoid Bashisms such as `|&`. We also avoid placing a Git command upstream in a pipe since doing so causes the exit code of the Git command to be lost. Instead, we would normally send the Git output to a file and then send that file to whatever would be downstream of the Git command in the pipe. So, a mechanical rewrite of the above (without thinking too hard about it) might be: git -C main fetch --recurse-submodules >out 2>&1 && ! grep -v -m1 "Fetching submodule sub$" && > + git -C main submodule status | > + sed -e "s/^-//" -e "s/ sub$//" >actual && Same comment about avoiding Git upstream in a pipe, so perhaps: git -C main submodule status >out && sed -e "s/^-//" -e "s/ sub$//" out >actual && > + test_cmp expect actual > +'
Hi Eric, On 04.12.20 19:06, Eric Sunshine wrote: > On Fri, Dec 4, 2020 at 10:25 AM Peter Kaestle <peter.kaestle@nokia.com> wrote: >> [...] >> Furthermore a regression test case is added, which tests for recursive >> fetches on a superproject with uninitialized sub repositories. This >> issue was leading to an infinite loop when doing a revert of a62387b. > > Just a few small comments (nothing comprehensive) from a quick scan of > the patch... > > Mostly they are just minor style issues, not necessarily worth a > re-roll, but there is one actionable item. thanks for your comments. A new patch will follow soon. >> Signed-off-by: Peter Kaestle <peter.kaestle@nokia.com> >> --- >> diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh >> @@ -719,4 +719,98 @@ test_expect_success 'fetch new submodule commit intermittently referenced by sup >> +add_commit_push () { >> + dir="$1" >> + msg="$2" >> + shift 2 > > We typically recommend including these assignments in the &&-chain to > future-proof against someone later inserting code above them and not > realizing that that code is not part of the &&-chain, in which case if > the new code fails, the failure might go unnoticed. ok > >> + git -C "$dir" add "$@" && >> + git -C "$dir" commit -a -m "$msg" && >> + git -C "$dir" push >> +} >> + >> +compare_refs_in_dir () { >> + fail= && >> + if test "x$1" = 'x!' >> + then >> + fail='!' && >> + shift >> + fi && >> + git -C "$1" rev-parse --verify "$2" >expect && >> + git -C "$3" rev-parse --verify "$4" >actual && >> + eval $fail test_cmp expect actual >> +} > > We have a test_cmp_rev() similar to this but it doesn't support -C as > some of our other test functions do. I briefly wondered if it would > make sense to extend it to understand -C, but even that wouldn't help > this case since compare_refs_in_dir() introduced here involves two > distinct directories. The need here is so special-purpose that it > likely would not make sense to upgrade test_cmp_rev() to accommodate > it. Okay. Yes, I saw that there's a similar function and I tried to modify this one first. Unfortunately this didn't work without touching much unaffected test code. So I propose to continue with this additional function. >> +test_expect_success 'setup nested submodule fetch test' ' >> + # does not depend on any previous test setups >> + >> + for repo in outer middle inner >> + do >> + ( >> + git init --bare $repo && >> + git clone $repo ${repo}_content && >> + echo "$repo" >"${repo}_content/file" && >> + add_commit_push ${repo}_content "initial" file >> + ) || return 1 >> + done && > > What is the purpose of the subshell here? Is it to ensure that commits > in each repo have identical timestamps? Or is it just for making the > && and || expression more clear? If the latter, we normally don't > bother with the parentheses. It was intended to make the correlation of && and || clear. I have experienced many cases in the past where things were screwed because it was not clearly understood by everybody. I'll propose next patch without this subshell. >> + git clone outer A && >> + git -C A submodule add "$pwd/middle" && >> + git -C A/middle/ submodule add "$pwd/inner" && >> + add_commit_push A/middle/ "adding inner sub" .gitmodules inner && >> + add_commit_push A/ "adding middle sub" .gitmodules middle && >> + >> + git clone outer B && >> + git -C B/ submodule update --init middle && >> + >> + compare_refs_in_dir A HEAD B HEAD && >> + compare_refs_in_dir A/middle HEAD B/middle HEAD && >> + test -f B/file && >> + test -f B/middle/file && >> + ! test -f B/middle/inner/file && > > These days we typically use test_path_exists() (or > test_path_is_file()) and test_path_is_missing() rather than bare > `test`. ok. >> +test_expect_success 'setup recursive fetch with uninit submodule' ' >> + # does not depend on any previous test setups >> + >> + git init main && >> + git init sub && >> + >> + touch sub/file && > > Unless the timestamp of the file is significant to the test, in which > case `touch` is used, we normally create empty files like this: > > >sub/file && ok. > >> +test_expect_success 'recursive fetch with uninit submodule' ' >> + git -C main submodule deinit -f sub && >> + ! git -C main fetch --recurse-submodules |& >> + grep -v -m1 "Fetching submodule sub$" && > > We want the test scripts to be portable, thus avoid Bashisms such as `|&`. > We also avoid placing a Git command upstream in a pipe since doing so > causes the exit code of the Git command to be lost. Instead, we would > normally send the Git output to a file and then send that file to > whatever would be downstream of the Git command in the pipe. So, a > mechanical rewrite of the above (without thinking too hard about it) > might be: > > git -C main fetch --recurse-submodules >out 2>&1 && > ! grep -v -m1 "Fetching submodule sub$" && In general I agree, but for this special test case, it's required to have the two commands connected by a pipe, as the grep needs to kill the git call in error case. Otherwise for this regression git would go for an infinite recursion loop. Of course, we can go for a "git 2>&1 | grep" solution. >> + git -C main submodule status | >> + sed -e "s/^-//" -e "s/ sub$//" >actual && > > Same comment about avoiding Git upstream in a pipe, so perhaps: > > git -C main submodule status >out && > sed -e "s/^-//" -e "s/ sub$//" out >actual && > >> + test_cmp expect actual >> +' ok.
On Mon, Dec 7, 2020 at 3:29 AM Peter Kästle <peter.kaestle@nokia.com> wrote: > On 04.12.20 19:06, Eric Sunshine wrote: > > On Fri, Dec 4, 2020 at 10:25 AM Peter Kaestle <peter.kaestle@nokia.com> wrote: > >> + ! git -C main fetch --recurse-submodules |& > >> + grep -v -m1 "Fetching submodule sub$" && > > > > We want the test scripts to be portable, thus avoid Bashisms such as `|&`. > > git -C main fetch --recurse-submodules >out 2>&1 && > > ! grep -v -m1 "Fetching submodule sub$" && > > In general I agree, but for this special test case, it's required to > have the two commands connected by a pipe, as the grep needs to kill the > git call in error case. Otherwise for this regression git would go for > an infinite recursion loop. > > Of course, we can go for a "git 2>&1 | grep" solution. In that case, an in-code comment explaining why the output of `git` must be piped to `grep` would be helpful since it is not uncommon for people to "modernize" test code as they are working in the vicinity, and removing such pipes is one of the modernizations often made. A comment would help avoid such a change.
diff --git a/submodule.c b/submodule.c index b3bb59f066..b561445329 100644 --- a/submodule.c +++ b/submodule.c @@ -1477,6 +1477,7 @@ static int get_next_submodule(struct child_process *cp, strbuf_release(&submodule_prefix); return 1; } else { + struct strbuf empty_submodule_path = STRBUF_INIT; fetch_task_release(task); free(task); @@ -1485,13 +1486,17 @@ static int get_next_submodule(struct child_process *cp, * An empty directory is normal, * the submodule is not initialized */ + strbuf_addf(&empty_submodule_path, "%s/%s/", + spf->r->worktree, + ce->name); if (S_ISGITLINK(ce->ce_mode) && - !is_empty_dir(ce->name)) { + !is_empty_dir(empty_submodule_path.buf)) { spf->result = 1; strbuf_addf(err, _("Could not access submodule '%s'\n"), ce->name); } + strbuf_release(&empty_submodule_path); } } diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh index dd8e423d25..04eb587862 100755 --- a/t/t5526-fetch-submodules.sh +++ b/t/t5526-fetch-submodules.sh @@ -719,4 +719,98 @@ test_expect_success 'fetch new submodule commit intermittently referenced by sup ) ' +add_commit_push () { + dir="$1" + msg="$2" + shift 2 + git -C "$dir" add "$@" && + git -C "$dir" commit -a -m "$msg" && + git -C "$dir" push +} + +compare_refs_in_dir () { + fail= && + if test "x$1" = 'x!' + then + fail='!' && + shift + fi && + git -C "$1" rev-parse --verify "$2" >expect && + git -C "$3" rev-parse --verify "$4" >actual && + eval $fail test_cmp expect actual +} + + +test_expect_success 'setup nested submodule fetch test' ' + # does not depend on any previous test setups + + for repo in outer middle inner + do + ( + git init --bare $repo && + git clone $repo ${repo}_content && + echo "$repo" >"${repo}_content/file" && + add_commit_push ${repo}_content "initial" file + ) || return 1 + done && + + git clone outer A && + git -C A submodule add "$pwd/middle" && + git -C A/middle/ submodule add "$pwd/inner" && + add_commit_push A/middle/ "adding inner sub" .gitmodules inner && + add_commit_push A/ "adding middle sub" .gitmodules middle && + + git clone outer B && + git -C B/ submodule update --init middle && + + compare_refs_in_dir A HEAD B HEAD && + compare_refs_in_dir A/middle HEAD B/middle HEAD && + test -f B/file && + test -f B/middle/file && + ! test -f B/middle/inner/file && + + echo "change on inner repo of A" >"A/middle/inner/file" && + add_commit_push A/middle/inner "change on inner" file && + add_commit_push A/middle "change on inner" inner && + add_commit_push A "change on inner" middle +' + +test_expect_success 'fetching a superproject containing an uninitialized sub/sub project' ' + # depends on previous test for setup + + git -C B/ fetch && + compare_refs_in_dir A origin/master B origin/master +' + + +test_expect_success 'setup recursive fetch with uninit submodule' ' + # does not depend on any previous test setups + + git init main && + git init sub && + + touch sub/file && + git -C sub add file && + git -C sub commit -m "add file" && + git -C sub rev-parse HEAD >expect && + + git -C main submodule add ../sub && + git -C main submodule init && + git -C main submodule update --checkout && + git -C main submodule status | + sed -e "s/^ //" -e "s/ sub .*$//" >actual && + test_cmp expect actual +' + +test_expect_success 'recursive fetch with uninit submodule' ' + # depends on previous test for setup + + git -C main submodule deinit -f sub && + ! git -C main fetch --recurse-submodules |& + grep -v -m1 "Fetching submodule sub$" && + git -C main submodule status | + sed -e "s/^-//" -e "s/ sub$//" >actual && + test_cmp expect actual +' + test_done
A regression has been introduced by a62387b (submodule.c: fetch in submodules git directory instead of in worktree, 2018-11-28). The scenario in which it triggers is when one has a remote repository with a subrepository inside a subrepository like this: superproject/middle_repo/inner_repo Person A and B have both a clone of it, while Person B is not working with the inner_repo and thus does not have it initialized in his working copy. Now person A introduces a change to the inner_repo and propagates it through the middle_repo and the superproject. Once person A pushed the changes and person B wants to fetch them using "git fetch" on superproject level, B's git call will return with error saying: Could not access submodule 'inner_repo' Errors during submodule fetch: middle_repo Expectation is that in this case the inner submodule will be recognized as uninitialized subrepository and skipped by the git fetch command. This used to work correctly before 'a62387b (submodule.c: fetch in submodules git directory instead of in worktree, 2018-11-28)'. Starting with a62387b the code wants to evaluate "is_empty_dir()" inside .git/modules for a directory only existing in the worktree, delivering then of course wrong return value. This patch ensures is_empty_dir() is getting the correct path of the uninitialized submodule by concatenation of the actual worktree and the name of the uninitialized submodule. Furthermore a regression test case is added, which tests for recursive fetches on a superproject with uninitialized sub repositories. This issue was leading to an infinite loop when doing a revert of a62387b. Signed-off-by: Peter Kaestle <peter.kaestle@nokia.com> CC: Junio C Hamano <gitster@pobox.com> CC: Philippe Blain <levraiphilippeblain@gmail.com> CC: Ralf Thielow <ralf.thielow@gmail.com> --- submodule.c | 7 ++- t/t5526-fetch-submodules.sh | 94 +++++++++++++++++++++++++++++++++++++ 2 files changed, 100 insertions(+), 1 deletion(-)