Message ID | xmqqwn4sq73f.fsf@gitster.g (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | test: make SYMLINKS prerequisite more robust | expand |
On Wed, Feb 08 2023, Junio C Hamano wrote: > I see many failures around SYMLINKS prerequisite in Windows tests. > There are too many to point at, but the pattern seems to be the > same. Here is one example: > > https://github.com/git/git/actions/runs/4127147009/jobs/7130175639#step:5:502 > > where "ln -s x y && test -h y" succeeds and declares SYMLINKS > lazy prerequisite is satisfied, but then it fails to run > > "ln -s unrelated DS && git update-index --add DS" > > with: > > error: readlink("DS"): Function not implemented > > I wonder if something like this is in order? I don't have much to contribute on that front, but this is really missing some "why", this worked before, why is it failing now? Do we have any idea. We use "windows-latest", at first I suspected that this was the 2019->2022 migration, which seems to be happening around now: https://github.blog/changelog/2022-01-11-github-actions-jobs-running-on-windows-latest-are-now-running-on-windows-server-2022/ But looking at a previous successful run on master: https://github.com/git/git/actions/runs/4089369223/jobs/7052074004 V.s. this current failure: https://github.com/git/git/actions/runs/4127146869/jobs/7130173444 Shows that both run 2022, and seem to be running the same software, except that in "set up job" this is different, it was: Download action repository 'git-for-windows/setup-git-for-windows-sdk@v1' (SHA:cbe017cd7ae39629bf4e34fce8b1ccd211fec009) Now: Download action repository 'git-for-windows/setup-git-for-windows-sdk@v1' (SHA:848609620edfa4c2fc64838b85fbe19e534236ee) I have no idea if that's related though... > diff --git a/t/helper/test-readlink.c b/t/helper/test-readlink.c > new file mode 100644 > index 0000000000..c300dc8a1a > --- /dev/null > +++ b/t/helper/test-readlink.c > @@ -0,0 +1,19 @@ > +#include "test-tool.h" > +#include "strbuf.h" > + > +static const char *usage_msg = "test-tool readlink file"; > + > +int cmd__readlink(int ac, const char **av) > +{ > + struct strbuf buf; You're leaving the strbuf uninitialized here, use STRBUF_INIT... > + int ret; > + > + if (ac != 2 || !av[1]) > + usage(usage_msg); > + > + ret = strbuf_readlink(&buf, av[1], 0); ...it's used here, and there's no implicit strbuf_init() on strbuf_readlink(). The first thing it does is inspect sb->alloc. > + if (!ret) > + printf("%s\n", buf.buf); Nit: puts(buf.buf); All in all a simple helper, but isn't this redundant to "test_readlink"? That requires perl, and once we have this we could just replace it, but then let's do that here, no? > diff --git a/t/test-lib.sh b/t/test-lib.sh > index 01e88781dd..c8094f643b 100644 > --- a/t/test-lib.sh > +++ b/t/test-lib.sh > @@ -1773,7 +1773,8 @@ test_lazy_prereq PIPE ' > > test_lazy_prereq SYMLINKS ' > # test whether the filesystem supports symbolic links > - ln -s x y && test -h y > + ln -s x y && test -h y && test-tool readlink y >/dev/null && > + test "$(test-tool readlink y)" = x Why get the exit code, and then proceed to hide the exit code with the test "$()" pattern, we can just (untested): echo x >expect && test-tool readlink y >actual && test_cmp expect actual If you're trying to avoid leaving litter or cleaning up that's not needed anymore with these lazy prereqs for a while now (they get their throw-away temporary directory).
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >> I wonder if something like this is in order? > > I don't have much to contribute on that front, but this is really > missing some "why", this worked before, why is it failing now? Do we > have any idea. Your guess is as good as mine. I do not do Windows. Thanks for independently noticing the uninitialized strbuf. What I have queued has it fixed already locally but there isn't much point to send another copy out to the list ;-). > All in all a simple helper, but isn't this redundant to "test_readlink"? Not at all. We need to avoid the Perl one for this purpose. What matters is whether "git" considers if symlink is working. Perhaps our readlink(3) emulation we have in compat/ may hardcode our "knowledge" that symlink is not available in Windows, which may not match what the POSIX XCU emulation in our Windows environment offers, which apparently ran "ln -s x y && test -h y" successfully, and who knows what test_readlink that is written in Perl thinks? We are testing "git" with the test suite, so even if with some magic that is still unknown to compat/mingw.h it knows how to read what "ln -s x y" left in "y", until compat/mingw.h::readlink() learns the same trick, asking Perl to decide SYMLINKS prerequisite would not help our test scripts at all. > echo x >expect && > test-tool readlink y >actual && > test_cmp expect actual > > If you're trying to avoid leaving litter or cleaning up that's not > needed anymore with these lazy prereqs for a while now (they get their > throw-away temporary directory). Indeed, I just did not want to add another cruft, but 'x' and 'y' are already such crufts, so I could have just done ln -s x y && test -h y && test-tool readlink y >x && test $(cat x) = x to use one of them ;-)
On Wed, Feb 08 2023, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > >>> I wonder if something like this is in order? >> >> I don't have much to contribute on that front, but this is really >> missing some "why", this worked before, why is it failing now? Do we >> have any idea. > > Your guess is as good as mine. I do not do Windows. > > Thanks for independently noticing the uninitialized strbuf. What I > have queued has it fixed already locally but there isn't much point > to send another copy out to the list ;-). > >> All in all a simple helper, but isn't this redundant to "test_readlink"? > > Not at all. We need to avoid the Perl one for this purpose. What > matters is whether "git" considers if symlink is working. > > Perhaps our readlink(3) emulation we have in compat/ may hardcode > our "knowledge" that symlink is not available in Windows, which may > not match what the POSIX XCU emulation in our Windows environment > offers, which apparently ran "ln -s x y && test -h y" successfully, > and who knows what test_readlink that is written in Perl thinks? Yeah that's fair, I wonder why we haven't replaced this already... FWIW I think this is what perl will dispatch to on Windows, so it makes your point, it has its own NIH Windows emulation layer: https://github.com/Perl/perl5/blob/blead/win32/win32.c#L1983-L2026 > We > are testing "git" with the test suite, so even if with some magic > that is still unknown to compat/mingw.h it knows how to read what > "ln -s x y" left in "y", until compat/mingw.h::readlink() learns the > same trick, asking Perl to decide SYMLINKS prerequisite would not > help our test scripts at all. We could always see if they return the same answer :) But not worth pursuing in this case. >> If you're trying to avoid leaving litter or cleaning up that's not >> needed anymore with these lazy prereqs for a while now (they get their >> throw-away temporary directory). > > Indeed, I just did not want to add another cruft, but 'x' and 'y' > are already such crufts, so I could have just done > > ln -s x y && > test -h y && > test-tool readlink y >x && > test $(cat x) = x > > to use one of them ;-) Yeah, I mean you don't need to avoid cruft, because the whole directory is about to be rm -rf'd Now that I've dug it up I see you implemented that in 04083f278d1 (test: allow prerequisite to be evaluated lazily, 2012-07-26). I was recalling 53ff3b96a87 (tests: make sure nested lazy prereqs work reliably, 2020-11-18) as the more recent change, but that just solved a nested prereq edge case.
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >> We >> are testing "git" with the test suite, so even if with some magic >> that is still unknown to compat/mingw.h it knows how to read what >> "ln -s x y" left in "y", until compat/mingw.h::readlink() learns the >> same trick, asking Perl to decide SYMLINKS prerequisite would not >> help our test scripts at all. > > We could always see if they return the same answer :) But not worth > pursuing in this case. Actually we probably should test if "git" can create a symbolic link, because that is what matters more than what "ln -s" can or cannot do. We also are interested in the "symbolic link" does behave as a symbolic link, i.e. writing into it should modify the contents of the target of the link. In any case, here is what I have in mind. There is only a symbolic link 'y' without 'x' (because this happens in a new and empty directory for evaluating the lazy prerequisite), redirecting output from the command, which should be 'x', into 'y' will vivify file 'x' with contents also 'x', and reading 'y' should yield 'x' because it points at 'x'. diff --git c/t/test-lib.sh w/t/test-lib.sh index 6db377f68b..3fb5957bd2 100644 --- c/t/test-lib.sh +++ w/t/test-lib.sh @@ -1773,7 +1773,9 @@ test_lazy_prereq PIPE ' test_lazy_prereq SYMLINKS ' # test whether the filesystem supports symbolic links - ln -s x y && test -h y + ln -s x y && test -h y && test-tool readlink y >y && + test "$(cat y)" = x && + test "$(cat x)" = x ' test_lazy_prereq SYMLINKS_WINDOWS '
Junio C Hamano <gitster@pobox.com> writes: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > >>> I wonder if something like this is in order? >> >> I don't have much to contribute on that front, but this is really >> missing some "why", this worked before, why is it failing now? Do we >> have any idea. > > Your guess is as good as mine. I do not do Windows. This morning, I notice that those CI jobs I ran last night that failed with "whoa, windows tests are somehow reporting that symlinks are available but not really" issue the patch in this thread were attempting to address has passed even for branches like 'master' and 'next' that do not yet have it, and it seems to be because you re-run these failed jobs. Whatever magic you used to fix these failing tests, thanks. Do you have an insight on why and how these were failing? The patch in this thread was a band-aid without knowing why all of a sudden "ln -s x y && test -h y" started passing (while compat/mingw.c still says readlink() is not supported). If we know that such a breakage is not expected, we can drop this workaroun, which would be great. Thanks.
On Fri, Feb 10 2023, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > >> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >> >>>> I wonder if something like this is in order? >>> >>> I don't have much to contribute on that front, but this is really >>> missing some "why", this worked before, why is it failing now? Do we >>> have any idea. >> >> Your guess is as good as mine. I do not do Windows. > > This morning, I notice that those CI jobs I ran last night that > failed with "whoa, windows tests are somehow reporting that symlinks > are available but not really" issue the patch in this thread were > attempting to address has passed even for branches like 'master' and > 'next' that do not yet have it, and it seems to be because you > re-run these failed jobs. > > Whatever magic you used to fix these failing tests, thanks. > > Do you have an insight on why and how these were failing? The patch > in this thread was a band-aid without knowing why all of a sudden > "ln -s x y && test -h y" started passing (while compat/mingw.c still > says readlink() is not supported). If we know that such a breakage > is not expected, we can drop this workaroun, which would be great. I'm not Johannes :) But as I noted upthread this failed when we went from: Download action repository 'git-for-windows/setup-git-for-windows-sdk@v1' (SHA:cbe017cd7ae39629bf4e34fce8b1ccd211fec009) To: Download action repository 'git-for-windows/setup-git-for-windows-sdk@v1' (SHA:848609620edfa4c2fc64838b85fbe19e534236ee) And now our passing "next" has: Download action repository 'git-for-windows/setup-git-for-windows-sdk@v1' (SHA:a8e2a23eb07129d628ff6f9d5f11047b0662aeba) If you then look at that range in that repository you'll find the release includes: https://github.com/git-for-windows/setup-git-for-windows-sdk/pull/595 https://github.com/git-for-windows/setup-git-for-windows-sdk/pull/596 https://github.com/git-for-windows/setup-git-for-windows-sdk/pull/597 The release notes for the last of those then has: "Fix issue with symlink restoration on windows". Which from some looking around seems like it might be the issue we've been seeing, it's this commit & PR: https://github.com/actions/toolkit/commit/b2d865f18051b214475dae41766e9970fd68ca12 https://github.com/actions/toolkit/pull/1291 And following that rabbit hole leads to Johannes noting that this (or a related change) was breaking GFW: https://github.com/actions/toolkit/pull/1291/commits/2867e318d4d0de11b10a2887fb29dcf713559a71#r1098571737 So I think we can drop the workaround, at least as far as fixing the CI breakages goes.
Hi, On Fri, 10 Feb 2023, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > > > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > > > >>> I wonder if something like this is in order? > >> > >> I don't have much to contribute on that front, but this is really > >> missing some "why", this worked before, why is it failing now? Do we > >> have any idea. > > > > Your guess is as good as mine. I do not do Windows. > > This morning, I notice that those CI jobs I ran last night that > failed with "whoa, windows tests are somehow reporting that symlinks > are available but not really" issue the patch in this thread were > attempting to address has passed even for branches like 'master' and > 'next' that do not yet have it, and it seems to be because you > re-run these failed jobs. > > Whatever magic you used to fix these failing tests, thanks. The magic was to pester the maintainers of the dependency of the `setup-git-for-windows-sdk` GitHub Action that broke said action. Background (you can skip this unless you're interested in details): Since Git's test suite relies so heavily on Unix shell scripting, which is really foreign on Windows, we have to use the MSYS2 Bash to run it. And when I say "MSYS2 Bash", it is short-hand for "a Bash that uses the POSIX emulation layer provided by the MSYS2 runtime". The MSYS2 runtime _does_ have support for actual symbolic links, it just depends on certain Windows features (and then it still is not quite the same as Unix symbolic links because on Windows, the symbolic links are either pointing to files or to directories and you have to specify that when creating them). This support is toggled via the environment variable `MSYS`. To enable support for symbolic links, set it to `winsymlinks:nativestrict`. Now, the dependency of `setup-git-for-windows-sdk` that is used for caching the minimal subset of Git for Windows' SDK needed for git/git's CI runs is called `@actions/cache`, and it recently saw a contribution to allow for caching symbolic links even on Windows. This change set the `MSYS` environment variable in the way indicated above. Unfortunately, it set it not only for the `tar` invocation that needed it, but it was generous enough to extend that setting to the entirety of the containing workflow run. Affecting, among other things, Git's test suite. The `@actions/cache` PR with the bug fix was actually already opened when I noticed the problem, which was only because I released a new version of the `setup-git-for-windows-sdk` Action that supports Windows/ARM64 better (and which unfortunately dragged in the borked dependency). But that PR had not been merged yet. Once it had been merged, and a new version of that dependency had been fast-tracked (I want to believe that my gentle nudging played a part in expediting this), I released a new version of the `setup-git-for-windows-sdk` Action and re-ran the failed workflow runs. And subsequently I ran out of time and could not update anyone about what I had done to fix this. > Do you have an insight on why and how these were failing? The patch > in this thread was a band-aid without knowing why all of a sudden > "ln -s x y && test -h y" started passing (while compat/mingw.c still > says readlink() is not supported). If we know that such a breakage > is not expected, we can drop this workaroun, which would be great. The patch that started this thread looks a lot like papering over the issue to me. And a cleaner way to accomplish that would be to force-set `MSYS` to an explicit value. It would be nice to eventually get to a point to have Git's test suite pass with symbolic links enabled on Windows, but introducing a new test helper won't do that. The actual root cause of the failures we saw when `MSYS=winsymlinks:nativestrict` was graced onto us was that Git's test suite creates a symbolic link to `/dev/null` and then expects Git's `opendir()` call on the symbolic link to do something which it does not do on Windows. I _guess_ that the symptom is caused by `opendir()` resolving the symbolic link and then trying to open the `NUL` device as a directory, which makes no sense whatsoever. But I would contend that the actual culprit is (from the cursory look I took) that Git's test suite is abusing `/dev/null` as being some kind of empty directory instead of explicitly creating an empty directory and using _that_. This is _not_ at all the full and thorough analysis I expect of code reviews, especially of my own reviews, please do not mistake my reply for that. I have more pressing things to attend to than to fix what seems like a "wouldn't it be nice" rather than an "OMG must fix rn!" problem, and therefore I cannot justify spending more time on it for the time being. Ciao, Johannes
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> Whatever magic you used to fix these failing tests, thanks. > > The magic was to pester the maintainers of the dependency of the > `setup-git-for-windows-sdk` GitHub Action that broke said action. > > Background (you can skip this unless you're interested in details): > > Since Git's test suite relies so heavily on Unix shell > scripting, which is really foreign on Windows, we have to use the MSYS2 > Bash to run it. And when I say "MSYS2 Bash", it is short-hand for "a Bash > that uses the POSIX emulation layer provided by the MSYS2 runtime". > > The MSYS2 runtime _does_ have support for actual symbolic links, it just > depends on certain Windows features (and then it still is not quite the > same as Unix symbolic links because on Windows, the symbolic links are > either pointing to files or to directories and you have to specify that > when creating them). OK, so bash runtime accidentally enabled symbolic links. But of course "git" thinks that Windows never do symbolic links by having {ENOSYS; return -1;} in compat/mingw.h for readlink and symlink, running any git operation that depends on SYMLINKS prerequisite would not work and that is where it fails. We do not even have to hit a test that points a symbolic link at any special file. Thanks for fixing the environment. The patch in the thread only meant to ensure that the test suite does not get fooled by only bash supporting symbolic links without updating "git" supporting any by being more careful. Now "test -h" and "ln -s" behave without contradiction with readlink(2)/symlink(2) emulation "git" uses, we can safely drop it.
diff --git a/Makefile b/Makefile index 45bd6ac9c3..2261e56d31 100644 --- a/Makefile +++ b/Makefile @@ -837,6 +837,7 @@ TEST_BUILTINS_OBJS += test-reach.o TEST_BUILTINS_OBJS += test-read-cache.o TEST_BUILTINS_OBJS += test-read-graph.o TEST_BUILTINS_OBJS += test-read-midx.o +TEST_BUILTINS_OBJS += test-readlink.o TEST_BUILTINS_OBJS += test-ref-store.o TEST_BUILTINS_OBJS += test-reftable.o TEST_BUILTINS_OBJS += test-regex.o diff --git a/t/helper/test-readlink.c b/t/helper/test-readlink.c new file mode 100644 index 0000000000..c300dc8a1a --- /dev/null +++ b/t/helper/test-readlink.c @@ -0,0 +1,19 @@ +#include "test-tool.h" +#include "strbuf.h" + +static const char *usage_msg = "test-tool readlink file"; + +int cmd__readlink(int ac, const char **av) +{ + struct strbuf buf; + int ret; + + if (ac != 2 || !av[1]) + usage(usage_msg); + + ret = strbuf_readlink(&buf, av[1], 0); + if (!ret) + printf("%s\n", buf.buf); + strbuf_release(&buf); + return ret; +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index abe8a785eb..12054d13d4 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -64,6 +64,7 @@ static struct test_cmd cmds[] = { { "read-cache", cmd__read_cache }, { "read-graph", cmd__read_graph }, { "read-midx", cmd__read_midx }, + { "readlink", cmd__readlink }, { "ref-store", cmd__ref_store }, { "reftable", cmd__reftable }, { "rot13-filter", cmd__rot13_filter }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index ea2672436c..efe382d48e 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -57,6 +57,7 @@ int cmd__reach(int argc, const char **argv); int cmd__read_cache(int argc, const char **argv); int cmd__read_graph(int argc, const char **argv); int cmd__read_midx(int argc, const char **argv); +int cmd__readlink(int argc, const char **argv); int cmd__ref_store(int argc, const char **argv); int cmd__rot13_filter(int argc, const char **argv); int cmd__reftable(int argc, const char **argv); diff --git a/t/test-lib.sh b/t/test-lib.sh index 01e88781dd..c8094f643b 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1773,7 +1773,8 @@ test_lazy_prereq PIPE ' test_lazy_prereq SYMLINKS ' # test whether the filesystem supports symbolic links - ln -s x y && test -h y + ln -s x y && test -h y && test-tool readlink y >/dev/null && + test "$(test-tool readlink y)" = x ' test_lazy_prereq SYMLINKS_WINDOWS '
I see many failures around SYMLINKS prerequisite in Windows tests. There are too many to point at, but the pattern seems to be the same. Here is one example: https://github.com/git/git/actions/runs/4127147009/jobs/7130175639#step:5:502 where "ln -s x y && test -h y" succeeds and declares SYMLINKS lazy prerequisite is satisfied, but then it fails to run "ln -s unrelated DS && git update-index --add DS" with: error: readlink("DS"): Function not implemented I wonder if something like this is in order? ----- >8 ----- We should not just ensure "ln -s" and "test -h" works, but readlink yields the expected value. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- Makefile | 1 + t/helper/test-readlink.c | 19 +++++++++++++++++++ t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + t/test-lib.sh | 3 ++- 5 files changed, 24 insertions(+), 1 deletion(-) create mode 100644 t/helper/test-readlink.c