Message ID | 3cdb1abd43779844b8e8dc094e2fd2da1adc461a.1656381667.git.hanxin.hx@bytedance.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | no lazy fetch in lookup_commit_in_graph() | expand |
On Tue, Jun 28 2022, Han Xin wrote: > The commit-graph is used to opportunistically optimize accesses to > certain pieces of information on commit objects, and > lookup_commit_in_graph() tries to say "no" when the requested commit > does not locally exist by returning NULL, in which case the caller > can ask for (which may result in on-demand fetching from a promisor > remote) and parse the commit object itself. > > However, it uses a wrong helper, repo_has_object_file(), to do so. > This helper not only checks if an object is mmediately available in > the local object store, but also tries to fetch from a promisor remote. > But the fetch machinery calls lookup_commit_in_graph(), thus causing an > infinite loop. > > We should make lookup_commit_in_graph() expect that a commit given to it > can be legitimately missing from the local object store, by using the > has_object_file() helper instead. > > Signed-off-by: Han Xin <hanxin.hx@bytedance.com> > --- > commit-graph.c | 2 +- > t/t5329-no-lazy-fetch-with-commit-graph.sh | 53 ++++++++++++++++++++++ > 2 files changed, 54 insertions(+), 1 deletion(-) > create mode 100755 t/t5329-no-lazy-fetch-with-commit-graph.sh > > diff --git a/commit-graph.c b/commit-graph.c > index 2b52818731..2dd9bcc7ea 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -907,7 +907,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje > return NULL; > if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos)) > return NULL; > - if (!repo_has_object_file(repo, id)) > + if (!has_object(repo, id, 0)) > return NULL; > > commit = lookup_commit(repo, id); > diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh > new file mode 100755 > index 0000000000..d7877a5758 > --- /dev/null > +++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh > @@ -0,0 +1,53 @@ > +#!/bin/sh > + > +test_description='test for no lazy fetch with the commit-graph' > + > +. ./test-lib.sh > + > +if ! test_have_prereq ULIMIT_PROCESSES I think the prereq in 1/2 would be better off squashed into this commit. > +then > + skip_all='skipping tests for no lazy fetch with the commit-graph, ulimit processes not available' > + test_done > +fi > + > +test_expect_success 'setup: prepare a repository with a commit' ' > + git init with-commit && > + test_commit -C with-commit the-commit && > + oid=$(git -C with-commit rev-parse HEAD) > +' > + > +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' ' > + git init with-commit-graph && > + echo "$(pwd)/with-commit/.git/objects" \ nit: you can use $PWD instead of $(pwd). > + >with-commit-graph/.git/objects/info/alternates && > + # create a ref that points to the commit in alternates > + git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" && > + # prepare some other objects to commit-graph > + test_commit -C with-commit-graph something && > + git -c gc.writeCommitGraph=true -C with-commit-graph gc && > + test_path_is_file with-commit-graph/.git/objects/info/commit-graph > +' > + > +test_expect_success 'setup: change the alternates to what without the commit' ' > + git init --bare without-commit && Maybe run a successful cat-file here... > + echo "$(pwd)/without-commit/objects" \ > + >with-commit-graph/.git/objects/info/alternates && > + test_must_fail git -C with-commit-graph cat-file -e $oid ...to show that it fails after the "echo" above? > +' > + > +test_expect_success 'setup: prepare any commit to fetch' ' > + test_commit -C with-commit any-commit && > + anycommit=$(git -C with-commit rev-parse HEAD) I think this would be better just added before a \n\n in the next test. > +' > + > +test_expect_success 'fetch any commit from promisor with the usage of the commit graph' ' > + git -C with-commit-graph remote add origin "$(pwd)/with-commit" && > + git -C with-commit-graph config remote.origin.promisor true && > + git -C with-commit-graph config remote.origin.partialclonefilter blob:none && > + run_with_limited_processses env GIT_TRACE="$(pwd)/trace" \ > + git -C with-commit-graph fetch origin $anycommit 2>err && > + test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err && > + test $(grep "fetch origin" trace | wc -l) -eq 1 Use "grep", not "test_i18ngrep", and this should use "test_line_count". But actually better yet: this whole thing looks like it could use "test_subcommand" instead, couldn't it?
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >> +test_description='test for no lazy fetch with the commit-graph' >> + >> +. ./test-lib.sh >> + >> +if ! test_have_prereq ULIMIT_PROCESSES > > I think the prereq in 1/2 would be better off squashed into this commit. Good thinking. It also may make sense to implement it in this file, without touching test-lib.sh at all. >> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' ' >> + git init with-commit-graph && >> + echo "$(pwd)/with-commit/.git/objects" \ >> + >with-commit-graph/.git/objects/info/alternates && > > nit: you can use $PWD instead of $(pwd). We can, and it would not make any difference on non-Windows. But which one should we use to cater to Windows? $(pwd) is a full path in Windows notation "C:\Program Files\Git\..." while $PWD is MSYS style "/C/Program Files/Git/..." or something like that, IIRC?
On Tue, Jun 28, 2022 at 3:53 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > > + test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err && > > + test $(grep "fetch origin" trace | wc -l) -eq 1 > > > Use "grep", not "test_i18ngrep", and this should use "test_line_count". > > But actually better yet: this whole thing looks like it could use > "test_subcommand" instead, couldn't it? When using test_subcommand() we should give all the args, if we remove or add any args later, this test case will always pass even without this fix. So, is this test case still strict? run_with_limited_processses env GIT_TRACE2_EVENT="$(PWD)/trace.txt" \ git -C with-commit-graph fetch origin $anycommit && test_subcommand ! git -c fetch.negotiationAlgorithm=noop \ fetch origin --no-tags --no-write-fetch-head \ --recurse-submodules=no --filter=blob:none \ --stdin <trace.txt
Hi Junio, On Tue, 28 Jun 2022, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > > >> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' ' > >> + git init with-commit-graph && > >> + echo "$(pwd)/with-commit/.git/objects" \ > >> + >with-commit-graph/.git/objects/info/alternates && > > > > nit: you can use $PWD instead of $(pwd). > > We can, and it would not make any difference on non-Windows. > > But which one should we use to cater to Windows? $(pwd) is a full > path in Windows notation "C:\Program Files\Git\..." while $PWD is > MSYS style "/C/Program Files/Git/..." or something like that, IIRC? Indeed, and since the `alternates` file is supposed to be read by `git.exe`, a non-MSYS program, the original was good, and the nit suggested the incorrect form. Thank you for catching this before it was worsimproved, Dscho
On Thu, Jun 30 2022, Johannes Schindelin wrote: > Hi Junio, > > On Tue, 28 Jun 2022, Junio C Hamano wrote: > >> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >> >> >> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' ' >> >> + git init with-commit-graph && >> >> + echo "$(pwd)/with-commit/.git/objects" \ >> >> + >with-commit-graph/.git/objects/info/alternates && >> > >> > nit: you can use $PWD instead of $(pwd). >> >> We can, and it would not make any difference on non-Windows. >> >> But which one should we use to cater to Windows? $(pwd) is a full >> path in Windows notation "C:\Program Files\Git\..." while $PWD is >> MSYS style "/C/Program Files/Git/..." or something like that, IIRC? > > Indeed, and since the `alternates` file is supposed to be read by > `git.exe`, a non-MSYS program, the original was good, and the nit > suggested the incorrect form. I looked at t5615-alternate-env.sh which does the equivalent of: GIT_ALTERNATE_OBJECT_DIRECTORIES="$PWD/one.git/objects:$PWD/two.git/objects" \ git cat-file [...] We run that test on all our platforms, does the $PWD form work in the environment variable, but not when we write it to the "alternates" file? Or is there some other subtlety there that I'm missing?
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > On Thu, Jun 30 2022, Johannes Schindelin wrote: > >> Hi Junio, >> >> On Tue, 28 Jun 2022, Junio C Hamano wrote: >> >>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >>> >>> >> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' ' >>> >> + git init with-commit-graph && >>> >> + echo "$(pwd)/with-commit/.git/objects" \ >>> >> + >with-commit-graph/.git/objects/info/alternates && >>> > >>> > nit: you can use $PWD instead of $(pwd). >>> >>> We can, and it would not make any difference on non-Windows. >>> >>> But which one should we use to cater to Windows? $(pwd) is a full >>> path in Windows notation "C:\Program Files\Git\..." while $PWD is >>> MSYS style "/C/Program Files/Git/..." or something like that, IIRC? >> >> Indeed, and since the `alternates` file is supposed to be read by >> `git.exe`, a non-MSYS program, the original was good, and the nit >> suggested the incorrect form. > > I looked at t5615-alternate-env.sh which does the equivalent of: > > GIT_ALTERNATE_OBJECT_DIRECTORIES="$PWD/one.git/objects:$PWD/two.git/objects" \ > git cat-file [...] > > We run that test on all our platforms, does the $PWD form work in the > environment variable, but not when we write it to the "alternates" file? > Or is there some other subtlety there that I'm missing? I am also curious to see a clear and concise explanation so that we do not have to repeat this discussion later. We have - When a test checks for an absolute path that a git command generated, construct the expected value using $(pwd) rather than $PWD, $TEST_DIRECTORY, or $TRASH_DIRECTORY. It makes a difference on Windows, where the shell (MSYS bash) mangles absolute path names. For details, see the commit message of 4114156ae9. in t/README, but even with the log mesasge of 4114156a (Tests on Windows: $(pwd) must return Windows-style paths, 2009-03-13) [*1*], I have no idea what makes the thing you found in t5615 work and your suggestion to use $PWD in the new one not work. GIT_ALTERNATE_OBJECT_DIRECTORIES is a PATH_SEP (not necessarily a colon) separated list, and I think the way t5615 uses it is broken on Windows where PATH_SEP is defined as semicolon without the $PWD vs $(pwd) issue. Is the test checking the right thing? [Footnote] *1* Tests on Windows: $(pwd) must return Windows-style paths Many tests pass $(pwd) in some form to git and later test that the output of git contains the correct value of $(pwd). For example, the test of 'git remote show' sets up a remote that contains $(pwd) and then the expected result must contain $(pwd). Again, MSYS-bash's path mangling kicks in: Plain $(pwd) uses the MSYS style absolute path /c/path/to/git. The test case would write this name into the 'expect' file. But when git is invoked, MSYS-bash converts this name to the Windows style path c:/path/to/git, and git would produce this form in the result; the test would fail. We fix this by passing -W to bash's pwd that produces the Windows-style path. There are a two cases that need an accompanying change: - In t1504 the value of $(pwd) becomes part of a path list. In this case, the lone 'c' in something like /foo:c:/path/to/git:/bar inhibits MSYS-bashes path mangling; IOW in this case we want the /c/path/to/git form to allow path mangling. We use $PWD instead of $(pwd), which always has the latter form. - In t6200, $(pwd) - the Windows style path - must be used to construct the expected result because that is the path form that git sees. (The change in the test itself is just for consistency: 'git fetch' always sees the Windows-style path, with or without the change.) Signed-off-by: Johannes Sixt <j6t@kdbg.org>
On Thu, Jun 30 2022, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > >> On Thu, Jun 30 2022, Johannes Schindelin wrote: >> >>> Hi Junio, >>> >>> On Tue, 28 Jun 2022, Junio C Hamano wrote: >>> >>>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >>>> >>>> >> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' ' >>>> >> + git init with-commit-graph && >>>> >> + echo "$(pwd)/with-commit/.git/objects" \ >>>> >> + >with-commit-graph/.git/objects/info/alternates && >>>> > >>>> > nit: you can use $PWD instead of $(pwd). >>>> >>>> We can, and it would not make any difference on non-Windows. >>>> >>>> But which one should we use to cater to Windows? $(pwd) is a full >>>> path in Windows notation "C:\Program Files\Git\..." while $PWD is >>>> MSYS style "/C/Program Files/Git/..." or something like that, IIRC? >>> >>> Indeed, and since the `alternates` file is supposed to be read by >>> `git.exe`, a non-MSYS program, the original was good, and the nit >>> suggested the incorrect form. >> >> I looked at t5615-alternate-env.sh which does the equivalent of: >> >> GIT_ALTERNATE_OBJECT_DIRECTORIES="$PWD/one.git/objects:$PWD/two.git/objects" \ >> git cat-file [...] >> >> We run that test on all our platforms, does the $PWD form work in the >> environment variable, but not when we write it to the "alternates" file? >> Or is there some other subtlety there that I'm missing? > > I am also curious to see a clear and concise explanation so that we > do not have to repeat this discussion later. We have > > - When a test checks for an absolute path that a git command generated, > construct the expected value using $(pwd) rather than $PWD, > $TEST_DIRECTORY, or $TRASH_DIRECTORY. It makes a difference on > Windows, where the shell (MSYS bash) mangles absolute path names. > For details, see the commit message of 4114156ae9. > > in t/README, but even with the log mesasge of 4114156a (Tests on > Windows: $(pwd) must return Windows-style paths, 2009-03-13) [*1*], > I have no idea what makes the thing you found in t5615 work and your > suggestion to use $PWD in the new one not work. > > GIT_ALTERNATE_OBJECT_DIRECTORIES is a PATH_SEP (not necessarily a > colon) separated list, and I think the way t5615 uses it is broken > on Windows where PATH_SEP is defined as semicolon without the $PWD > vs $(pwd) issue. Is the test checking the right thing? Whatever th explanation is CI passed with a $(pwd) -> $PWD repacement in the test being introduced here: https://github.com/avar/git/runs/7136686130?check_suite_focus=true Diff here: https://github.com/avar/git/commit/606d0060a57b7021396919044c7696489b7835cd So either $PWD is fine there, or our Windows CI doesn't reflect this particular caveat on some Windows systems, or the test is erroneously passing with an invalid value. Knowing which of those it is would be very useful...
Hi Junio, On Thu, 30 Jun 2022, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > > > On Thu, Jun 30 2022, Johannes Schindelin wrote: > > > >> On Tue, 28 Jun 2022, Junio C Hamano wrote: > >> > >>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > >>> > >>> >> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' ' > >>> >> + git init with-commit-graph && > >>> >> + echo "$(pwd)/with-commit/.git/objects" \ > >>> >> + >with-commit-graph/.git/objects/info/alternates && > >>> > > >>> > nit: you can use $PWD instead of $(pwd). > >>> > >>> We can, and it would not make any difference on non-Windows. > >>> > >>> But which one should we use to cater to Windows? $(pwd) is a full > >>> path in Windows notation "C:\Program Files\Git\..." while $PWD is > >>> MSYS style "/C/Program Files/Git/..." or something like that, IIRC? > >> > >> Indeed, and since the `alternates` file is supposed to be read by > >> `git.exe`, a non-MSYS program, the original was good, and the nit > >> suggested the incorrect form. > > > > I looked at t5615-alternate-env.sh which does the equivalent of: > > > > GIT_ALTERNATE_OBJECT_DIRECTORIES="$PWD/one.git/objects:$PWD/two.git/objects" \ > > git cat-file [...] > > > > We run that test on all our platforms, does the $PWD form work in the > > environment variable, but not when we write it to the "alternates" file? > > Or is there some other subtlety there that I'm missing? > > I am also curious to see a clear and concise explanation so that we > do not have to repeat this discussion later. Unfortunately, I do not see any way to explain this concisely: MSYS2 does hard-to-explain things here, in the hopes to Do The Right Thing (most of the time, anyways). Whenever you call a non-MSYS program from an MSYS program (and remember, an MSYS program is a program that uses the MSYS2 runtime that acts as a POSIX emulation layer), "magic" things are done. In our context, `bash.exe` is an MSYS program, and the non-MSYS program that is called is `git.exe`. So what are those "magic" things? The command-line arguments and the environment variables are auto-converted: everything that looks like a Unix-style path (or path list, like the `PATH` environment variable) is converted to a Windows-style path or path list (on Windows, the colon cannot be the separator in `PATH`, therefore the semicolon is used). And this is where it gets _really_ tricky to explain what is going on: what _does_ look like a Unix-style path? The exact rules are convoluted and hard to explain, but they work _most of the time_. For example, `/usr/bin:/hello` is converted to `C:\Program Files\Git\usr\bin;C:\Program Files\Git\hello` or something like that. But `kernel.org:/home/gitster` is not, because it looks more like an SSH path. Similarly, `C:/Program Files` is interpreted as a Windows-style path, even if it could technically be a Unix-style path list. Now, if you call `git.exe -C /blabla <command>`, it works, because `git.exe` is a non-MSYS program, therefore that `/blabla` is converted to a Windows-style path before executing `git.exe`. However, when you write a file via `echo /blabla >file`, that `echo` is either the Bash built-in, or it is an MSYS program, and no argument conversion takes place. If you _then_ ask `git.exe` to read and interpret the file as a path, it won't know what to do with that Unix-style path. You can substitute `$PWD` for `/blabla` in all of this, and it will hold true just the same. So what makes `pwd` special? Well, `pwd.exe` itself is an MSYS program, so it would still report a path that `git.exe` cannot understand. But in Git's test suite, we specifically override `pwd` to be a shell function that calls `pwd.exe -W`, which does output Windows-style paths. The thing that makes that `GIT_*=$PWD git ...` call work is that the environment is automagically converted because `git` is a non-MSYS program. The thing that makes `echo $PWD >.git/objects/info/alternates` not work is that `echo` _is_ an MSYS program (or Bash built-in, which is the same thing here, for all practical purposes), so it writes the path verbatim into that file, but then we expect `git.exe` to read this file and interpret it as a list of paths. Hopefully that clarifies the scenario a bit, even if it is far from a concise explanation (I did edit this mail multiple times for clarity and brevity, though, as I do with pretty much all of my mails). Ciao, Dscho > We have > > - When a test checks for an absolute path that a git command generated, > construct the expected value using $(pwd) rather than $PWD, > $TEST_DIRECTORY, or $TRASH_DIRECTORY. It makes a difference on > Windows, where the shell (MSYS bash) mangles absolute path names. > For details, see the commit message of 4114156ae9. > > in t/README, but even with the log mesasge of 4114156a (Tests on > Windows: $(pwd) must return Windows-style paths, 2009-03-13) [*1*], > I have no idea what makes the thing you found in t5615 work and your > suggestion to use $PWD in the new one not work. > > GIT_ALTERNATE_OBJECT_DIRECTORIES is a PATH_SEP (not necessarily a > colon) separated list, and I think the way t5615 uses it is broken > on Windows where PATH_SEP is defined as semicolon without the $PWD > vs $(pwd) issue. Is the test checking the right thing? > > > [Footnote] > > *1* > > Tests on Windows: $(pwd) must return Windows-style paths > > Many tests pass $(pwd) in some form to git and later test that the output > of git contains the correct value of $(pwd). For example, the test of > 'git remote show' sets up a remote that contains $(pwd) and then the > expected result must contain $(pwd). > > Again, MSYS-bash's path mangling kicks in: Plain $(pwd) uses the MSYS style > absolute path /c/path/to/git. The test case would write this name into > the 'expect' file. But when git is invoked, MSYS-bash converts this name to > the Windows style path c:/path/to/git, and git would produce this form in > the result; the test would fail. > > We fix this by passing -W to bash's pwd that produces the Windows-style > path. > > There are a two cases that need an accompanying change: > > - In t1504 the value of $(pwd) becomes part of a path list. In this case, > the lone 'c' in something like /foo:c:/path/to/git:/bar inhibits > MSYS-bashes path mangling; IOW in this case we want the /c/path/to/git > form to allow path mangling. We use $PWD instead of $(pwd), which always > has the latter form. > > - In t6200, $(pwd) - the Windows style path - must be used to construct the > expected result because that is the path form that git sees. (The change > in the test itself is just for consistency: 'git fetch' always sees the > Windows-style path, with or without the change.) > > Signed-off-by: Johannes Sixt <j6t@kdbg.org> > >
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > Whenever you call a non-MSYS program from an MSYS program (and remember, > an MSYS program is a program that uses the MSYS2 runtime that acts as a > POSIX emulation layer), "magic" things are done. In our context, > `bash.exe` is an MSYS program, and the non-MSYS program that is called is > `git.exe`. > > So what are those "magic" things? The command-line arguments and the > environment variables are auto-converted: everything that looks like a > Unix-style path (or path list, like the `PATH` environment variable) is > converted to a Windows-style path or path list (on Windows, the colon > cannot be the separator in `PATH`, therefore the semicolon is used). > > And this is where it gets _really_ tricky to explain what is going on: > what _does_ look like a Unix-style path? The exact rules are convoluted > and hard to explain, but they work _most of the time_. For example, > `/usr/bin:/hello` is converted to `C:\Program Files\Git\usr\bin;C:\Program > Files\Git\hello` or something like that. But `kernel.org:/home/gitster` is > not, because it looks more like an SSH path. Similarly, `C:/Program Files` > is interpreted as a Windows-style path, even if it could technically be a > Unix-style path list. > > Now, if you call `git.exe -C /blabla <command>`, it works, because > `git.exe` is a non-MSYS program, therefore that `/blabla` is converted to > a Windows-style path before executing `git.exe`. However, when you write a > file via `echo /blabla >file`, that `echo` is either the Bash built-in, or > it is an MSYS program, and no argument conversion takes place. If you > _then_ ask `git.exe` to read and interpret the file as a path, it won't > know what to do with that Unix-style path. > > You can substitute `$PWD` for `/blabla` in all of this, and it will hold > true just the same. > > So what makes `pwd` special? > > Well, `pwd.exe` itself is an MSYS program, so it would still report a path > that `git.exe` cannot understand. But in Git's test suite, we specifically > override `pwd` to be a shell function that calls `pwd.exe -W`, which does > output Windows-style paths. > > The thing that makes that `GIT_*=$PWD git ...` call work is that the > environment is automagically converted because `git` is a non-MSYS > program. The thing that makes `echo $PWD >.git/objects/info/alternates` > not work is that `echo` _is_ an MSYS program (or Bash built-in, which is > the same thing here, for all practical purposes), so it writes the path > verbatim into that file, but then we expect `git.exe` to read this file > and interpret it as a list of paths. ----- 8< --------- 8< --------- 8< --------- 8< --------- 8< ----- > Hopefully that clarifies the scenario a bit, even if it is far from a > concise explanation (I did edit this mail multiple times for clarity and > brevity, though, as I do with pretty much all of my mails). Certainly it does help. Thanks. I wonder if it makes sense to keep a copy of the bulk of your response in t/ somewhere, and refer to it from t/README, to help fellow non-Windows developers to avoid breaking tests on Windows without knowing.
diff --git a/commit-graph.c b/commit-graph.c index 2b52818731..2dd9bcc7ea 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -907,7 +907,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje return NULL; if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos)) return NULL; - if (!repo_has_object_file(repo, id)) + if (!has_object(repo, id, 0)) return NULL; commit = lookup_commit(repo, id); diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh new file mode 100755 index 0000000000..d7877a5758 --- /dev/null +++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh @@ -0,0 +1,53 @@ +#!/bin/sh + +test_description='test for no lazy fetch with the commit-graph' + +. ./test-lib.sh + +if ! test_have_prereq ULIMIT_PROCESSES +then + skip_all='skipping tests for no lazy fetch with the commit-graph, ulimit processes not available' + test_done +fi + +test_expect_success 'setup: prepare a repository with a commit' ' + git init with-commit && + test_commit -C with-commit the-commit && + oid=$(git -C with-commit rev-parse HEAD) +' + +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' ' + git init with-commit-graph && + echo "$(pwd)/with-commit/.git/objects" \ + >with-commit-graph/.git/objects/info/alternates && + # create a ref that points to the commit in alternates + git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" && + # prepare some other objects to commit-graph + test_commit -C with-commit-graph something && + git -c gc.writeCommitGraph=true -C with-commit-graph gc && + test_path_is_file with-commit-graph/.git/objects/info/commit-graph +' + +test_expect_success 'setup: change the alternates to what without the commit' ' + git init --bare without-commit && + echo "$(pwd)/without-commit/objects" \ + >with-commit-graph/.git/objects/info/alternates && + test_must_fail git -C with-commit-graph cat-file -e $oid +' + +test_expect_success 'setup: prepare any commit to fetch' ' + test_commit -C with-commit any-commit && + anycommit=$(git -C with-commit rev-parse HEAD) +' + +test_expect_success 'fetch any commit from promisor with the usage of the commit graph' ' + git -C with-commit-graph remote add origin "$(pwd)/with-commit" && + git -C with-commit-graph config remote.origin.promisor true && + git -C with-commit-graph config remote.origin.partialclonefilter blob:none && + run_with_limited_processses env GIT_TRACE="$(pwd)/trace" \ + git -C with-commit-graph fetch origin $anycommit 2>err && + test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err && + test $(grep "fetch origin" trace | wc -l) -eq 1 +' + +test_done
The commit-graph is used to opportunistically optimize accesses to certain pieces of information on commit objects, and lookup_commit_in_graph() tries to say "no" when the requested commit does not locally exist by returning NULL, in which case the caller can ask for (which may result in on-demand fetching from a promisor remote) and parse the commit object itself. However, it uses a wrong helper, repo_has_object_file(), to do so. This helper not only checks if an object is mmediately available in the local object store, but also tries to fetch from a promisor remote. But the fetch machinery calls lookup_commit_in_graph(), thus causing an infinite loop. We should make lookup_commit_in_graph() expect that a commit given to it can be legitimately missing from the local object store, by using the has_object_file() helper instead. Signed-off-by: Han Xin <hanxin.hx@bytedance.com> --- commit-graph.c | 2 +- t/t5329-no-lazy-fetch-with-commit-graph.sh | 53 ++++++++++++++++++++++ 2 files changed, 54 insertions(+), 1 deletion(-) create mode 100755 t/t5329-no-lazy-fetch-with-commit-graph.sh