Message ID | 1f5366f137967cbec30041b40eedd86ce5f6e953.1662469859.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | tests: replace mingw_test_cmp with a helper in C |
On Tue, Sep 06 2022, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> [...]
> +++ b/t/helper/test-text-cmp.c
> @@ -0,0 +1,78 @@
> +#include "test-tool.h"
> +#include "git-compat-util.h"
> +#include "strbuf.h"
> +#include "gettext.h"

Superfluous header? Compiles without gettext.h for me (and we shouldn't
use i18n in test helpers).

> [...]
> +int cmd__text_cmp(int argc, const char **argv)
> +{
> +	FILE *f0, *f1;
> +	struct strbuf b0 = STRBUF_INIT, b1 = STRBUF_INIT;
> +
> +	if (argc != 3)
> +		die("Require exactly 2 arguments, got %d", argc);

Here you conflate the argc vs. arguments minus the "text-cmp",
resulting in:

	helper/test-tool text-cmp 2
	fatal: Require exactly 2 arguments, got 2

An argc-- argv++ at the beginning seems like the easiest way out of
this. Also s/Require/require/ per CodingGuidelines.

> +	if (!strcmp(argv[1], "-") && !strcmp(argv[2], "-"))
> +		die("only one parameter can refer to `stdin` but not both");
> +
> +	if (!(f0 = !strcmp(argv[1], "-") ? stdin : fopen(argv[1], "r")))
> +		return error_errno("could not open '%s'", argv[1]);
> +	if (!(f1 = !strcmp(argv[2], "-") ? stdin : fopen(argv[2], "r"))) {
> +		fclose(f0);
> +		return error_errno("could not open '%s'", argv[2]);
> +	}

Faithfully emulating the old version. I do wonder if we couldn't simply
adjust the handful of tests that actually make use of the "-" diff(1)
feature. AFAICT there's around 10 of those at most, and they all seem
like cases where it would be easy to change:

	(echo foo) | test_cmp - actual

Or whatever, to:

	echo foo >expected &&
	test_cmp expected actual

...

> +			if (!strcmp(argv[1], "-") || !strcmp(argv[2], "-"))
> +				warning("cannot show diff because `stdin` was already consumed");

...

Which means we wouldn't need to punt on this.

> +			else if (!run_diff(argv[1], argv[2]))
> +				die("Huh? 'diff --no-index %s %s' succeeded",
> +				    argv[1], argv[2]);

I tried manually testing this with:

	GIT_TRACE=1 GIT_TEST_CMP="/home/avar/g/git/git diff --no-index --" ./t0021-conversion.sh -vixd

vs.:

	GIT_TRACE=1 GIT_TEST_CMP="$PWD/helper/test-tool text-cmp" ./t0021-conversion.sh -vixd

Your version doesn't get confused by the same, but AFAICT this is by
fragile accident.

I.e. you run your own equivalent of "cmp", so because the files are the
same in that case we don't run the "diff --no-index".

But the "diff --no-index" in that t0021*.sh case *would* report
differences, even though the files are byte-for-byte identical.

So the "cmp"-a-like here isn't just an optimization to avoid forking the
"git diff" process, it's an entirely different comparison method in
cases where we have a "filter".

It just so happens that our test suite doesn't currently combine them in
a way that causes a current failure.

>  test_cmp () {
>  	test "$#" -ne 2 && BUG "2 param"
> -	eval "$GIT_TEST_CMP" '"$@"'
> +	GIT_ALLOC_LIMIT=0 eval "$GIT_TEST_CMP" '"$@"'
>  }

Further, we have a clear boundary in the test suite between "git" and
"test-tool" things we invoke, and third party tools. The former we put
in "test_must_fail_acceptable".

When using this new helper we'd hide potential segfaults and BUGs in any
"! test_cmp" invocation.

To avoid the introduction of such a blindspot we'd need to change
"test_cmp" to take an optional "!" as the 1st argument, and convert the
existing "! test_cmp" to "test_cmp !", then carry some flag to indicate
that our "GIT_TEST_CMP" is a git or test-tool invocation, and check it
appropriately.

> [...]
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 7726d1da88a..0be25ecbd59 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1546,7 +1546,7 @@ case $uname_s in
>  	test_set_prereq SED_STRIPS_CR
>  	test_set_prereq GREP_STRIPS_CR
>  	test_set_prereq WINDOWS
> -	GIT_TEST_CMP=mingw_test_cmp
> +	GIT_TEST_CMP="test-tool text-cmp"
>  	;;
>  *CYGWIN*)
>  	test_set_prereq POSIXPERM

Not a new problem, but this is incompatible with
GIT_TEST_CMP_USE_COPIED_CONTEXT.

What is new though is that with this series there's no longer a good
reason AFAICT to carry GIT_TEST_CMP_USE_COPIED_CONTEXT at all. I.e. we
have it for a "diff" that doesn't understand "-u".

If (after getting past the caveats noted above) we could simply invoke
our own test-tool we could drop that special-casing & just always invoke
our own test_cmp helper.
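The "argc-- argv++" suggestion in the review above could look roughly like
the following. This is only a sketch in plain C: parse_two_paths() is a
hypothetical name that is not part of the patch, and it reports the error
with fprintf() instead of git's die() so that it stands alone.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not in the patch): skip the subcommand name so
 * that argc counts only the user-supplied arguments.
 * Returns 0 when exactly two arguments remain, -1 otherwise.
 */
static int parse_two_paths(int argc, const char **argv,
			   const char **path1, const char **path2)
{
	assert(argc >= 1 && argv);

	/* Drop argv[0] ("text-cmp"), as suggested in the review. */
	argc--;
	argv++;

	if (argc != 2) {
		/* Lower-case message, and argc now matches what the user typed. */
		fprintf(stderr, "require exactly 2 arguments, got %d\n", argc);
		return -1;
	}
	*path1 = argv[0];
	*path2 = argv[1];
	return 0;
}
```

With this shape, "test-tool text-cmp 2" would report "got 1" rather than
the confusing "got 2" shown in the review.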
On Wed, Sep 07 2022, Ævar Arnfjörð Bjarmason wrote:

> On Tue, Sep 06 2022, Johannes Schindelin via GitGitGadget wrote:
>
>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>> [...]
>> +++ b/t/helper/test-text-cmp.c
>> @@ -0,0 +1,78 @@
>> +#include "test-tool.h"
>> +#include "git-compat-util.h"
>> +#include "strbuf.h"
>> +#include "gettext.h"
>
> Superfluous header? Compiles without gettext.h for me (and we shouldn't
> use i18n in test helpers).
>
>> [...]
>> +int cmd__text_cmp(int argc, const char **argv)
>> +{
>> +	FILE *f0, *f1;
>> +	struct strbuf b0 = STRBUF_INIT, b1 = STRBUF_INIT;
>> +
>> +	if (argc != 3)
>> +		die("Require exactly 2 arguments, got %d", argc);
>
> Here you conflate the argc vs. arguments minus the "text-cmp",
> resulting in:
>
> 	helper/test-tool text-cmp 2
> 	fatal: Require exactly 2 arguments, got 2
>
> An argc-- argv++ at the beginning seems like the easiest way out of
> this. Also s/Require/require/ per CodingGuidelines.
>
>> +	if (!strcmp(argv[1], "-") && !strcmp(argv[2], "-"))
>> +		die("only one parameter can refer to `stdin` but not both");
>> +
>> +	if (!(f0 = !strcmp(argv[1], "-") ? stdin : fopen(argv[1], "r")))
>> +		return error_errno("could not open '%s'", argv[1]);
>> +	if (!(f1 = !strcmp(argv[2], "-") ? stdin : fopen(argv[2], "r"))) {
>> +		fclose(f0);
>> +		return error_errno("could not open '%s'", argv[2]);
>> +	}
>
> Faithfully emulating the old version. I do wonder if we couldn't simply
> adjust the handful of tests that actually make use of the "-" diff(1)
> feature. AFAICT there's around 10 of those at most, and they all seem
> like cases where it would be easy to change:
>
> 	(echo foo) | test_cmp - actual
>
> Or whatever, to:
>
> 	echo foo >expected &&
> 	test_cmp expected actual
>
> ...
>
>> +			if (!strcmp(argv[1], "-") || !strcmp(argv[2], "-"))
>> +				warning("cannot show diff because `stdin` was already consumed");
>
> ...
>
> Which means we wouldn't need to punt on this.
>
>> +			else if (!run_diff(argv[1], argv[2]))
>> +				die("Huh? 'diff --no-index %s %s' succeeded",
>> +				    argv[1], argv[2]);
>
> I tried manually testing this with:
>
> 	GIT_TRACE=1 GIT_TEST_CMP="/home/avar/g/git/git diff --no-index --" ./t0021-conversion.sh -vixd
>
> vs.:
>
> 	GIT_TRACE=1 GIT_TEST_CMP="$PWD/helper/test-tool text-cmp" ./t0021-conversion.sh -vixd
>
> Your version doesn't get confused by the same, but AFAICT this is by
> fragile accident.
>
> I.e. you run your own equivalent of "cmp", so because the files are the
> same in that case we don't run the "diff --no-index".
>
> But the "diff --no-index" in that t0021*.sh case *would* report
> differences, even though the files are byte-for-byte identical.
>
> So the "cmp"-a-like here isn't just an optimization to avoid forking the
> "git diff" process, it's an entirely different comparison method in
> cases where we have a "filter".
>
> It just so happens that our test suite doesn't currently combine them in
> a way that causes a current failure.

Ah (partially?) I spoke too soon on this part. I.e. the
GIT_DIR=/dev/null precludes reading the filter/repo in this case. So I
*think* we're out of the woods as far as this is concerned.

Still, it would be nice to document in a code comment or commit message
that the "not read the local repo's filter" is absolutely critical here.

But I think that re-raises the point René had in:
https://lore.kernel.org/git/b21d2b60-428f-58ec-28b6-3c617b9f2e45@web.de/

I ran the full test suite with:

	GIT_TEST_CMP='GIT_DIR=/dev/null HOME=/dev/null /usr/bin/git diff --no-index --ignore-cr-at-eol --'

And all of it passes, except for a test in t0001-init.sh which we could
fix up as:

diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index d479303efa0..d65afe7cceb 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -426,7 +426,7 @@ test_expect_success SYMLINKS 're-init to move gitdir symlink' '
 		git init --separate-git-dir ../realgitdir
 	) &&
 	echo "gitdir: $(pwd)/realgitdir" >expected &&
-	test_cmp expected newdir/.git &&
+	test "$(test_readlink newdir/.git)" = here &&
 	test_cmp expected newdir/here &&
 	test_path_is_dir realgitdir/refs
 '

Which without this series is more correct, as all we're re-testing there
is whether the symlink is pointing to what we expect.

A hypothetical "--dereference" to "git diff" would also take care of it
(the equivalent of "--no-dereference" being the default).

But with that all tests pass for me, so I'm puzzled as to the need for
the new helper, as opposed to just constructing the command above and
sticking it in GIT_TEST_CMP ...
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> But I think that re-raises the point René had in:
> https://lore.kernel.org/git/b21d2b60-428f-58ec-28b6-3c617b9f2e45@web.de/

As the primary point of no-index mode was to expose fancy options
"git diff" has to comparisons of files outside version control,
without having to go through the trouble of upstreaming changes to
GNU diff, I do think "--ignore-cr-at-eol" should work fine with it,
and René's idea sounds like the best implementation for the
test-text-cmp helper command.

Thanks.
diff --git a/Makefile b/Makefile
index 1624471badc..73db55bba0f 100644
--- a/Makefile
+++ b/Makefile
@@ -786,6 +786,7 @@ TEST_BUILTINS_OBJS += test-string-list.o
 TEST_BUILTINS_OBJS += test-submodule-config.o
 TEST_BUILTINS_OBJS += test-submodule-nested-repo-config.o
 TEST_BUILTINS_OBJS += test-subprocess.o
+TEST_BUILTINS_OBJS += test-text-cmp.o
 TEST_BUILTINS_OBJS += test-trace2.o
 TEST_BUILTINS_OBJS += test-urlmatch-normalization.o
 TEST_BUILTINS_OBJS += test-userdiff.o
diff --git a/t/helper/test-text-cmp.c b/t/helper/test-text-cmp.c
new file mode 100644
index 00000000000..7c26d925086
--- /dev/null
+++ b/t/helper/test-text-cmp.c
@@ -0,0 +1,78 @@
+#include "test-tool.h"
+#include "git-compat-util.h"
+#include "strbuf.h"
+#include "gettext.h"
+#include "parse-options.h"
+#include "run-command.h"
+
+#ifdef WIN32
+#define NO_SUCH_DIR "\\\\.\\GLOBALROOT\\invalid"
+#else
+#define NO_SUCH_DIR "/dev/null"
+#endif
+
+static int run_diff(const char *path1, const char *path2)
+{
+	const char *argv[] = {
+		"diff", "--no-index", "--", NULL, NULL, NULL
+	};
+	const char *env[] = {
+		"GIT_PAGER=cat",
+		"GIT_DIR=" NO_SUCH_DIR,
+		"HOME=" NO_SUCH_DIR,
+		NULL
+	};
+
+	argv[3] = path1;
+	argv[4] = path2;
+	return run_command_v_opt_cd_env(argv,
+					RUN_COMMAND_NO_STDIN | RUN_GIT_CMD,
+					NULL, env);
+}
+
+int cmd__text_cmp(int argc, const char **argv)
+{
+	FILE *f0, *f1;
+	struct strbuf b0 = STRBUF_INIT, b1 = STRBUF_INIT;
+
+	if (argc != 3)
+		die("Require exactly 2 arguments, got %d", argc);
+
+	if (!strcmp(argv[1], "-") && !strcmp(argv[2], "-"))
+		die("only one parameter can refer to `stdin` but not both");
+
+	if (!(f0 = !strcmp(argv[1], "-") ? stdin : fopen(argv[1], "r")))
+		return error_errno("could not open '%s'", argv[1]);
+	if (!(f1 = !strcmp(argv[2], "-") ? stdin : fopen(argv[2], "r"))) {
+		fclose(f0);
+		return error_errno("could not open '%s'", argv[2]);
+	}
+
+	for (;;) {
+		int r0 = strbuf_getline(&b0, f0);
+		int r1 = strbuf_getline(&b1, f1);
+
+		if (r0 == EOF) {
+			fclose(f0);
+			fclose(f1);
+			strbuf_release(&b0);
+			strbuf_release(&b1);
+			if (r1 == EOF)
+				return 0;
+cmp_failed:
+			if (!strcmp(argv[1], "-") || !strcmp(argv[2], "-"))
+				warning("cannot show diff because `stdin` was already consumed");
+			else if (!run_diff(argv[1], argv[2]))
+				die("Huh? 'diff --no-index %s %s' succeeded",
+				    argv[1], argv[2]);
+			return 1;
+		}
+		if (r1 == EOF || strbuf_cmp(&b0, &b1)) {
+			fclose(f0);
+			fclose(f1);
+			strbuf_release(&b0);
+			strbuf_release(&b1);
+			goto cmp_failed;
+		}
+	}
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 318fdbab0c3..c6654ebc48b 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -81,6 +81,7 @@ static struct test_cmd cmds[] = {
 	{ "submodule-config", cmd__submodule_config },
 	{ "submodule-nested-repo-config", cmd__submodule_nested_repo_config },
 	{ "subprocess", cmd__subprocess },
+	{ "text-cmp", cmd__text_cmp },
 	{ "trace2", cmd__trace2 },
 	{ "userdiff", cmd__userdiff },
 	{ "urlmatch-normalization", cmd__urlmatch_normalization },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index bb799271631..2acfd2bcabc 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -71,6 +71,7 @@ int cmd__string_list(int argc, const char **argv);
 int cmd__submodule_config(int argc, const char **argv);
 int cmd__submodule_nested_repo_config(int argc, const char **argv);
 int cmd__subprocess(int argc, const char **argv);
+int cmd__text_cmp(int argc, const char **argv);
 int cmd__trace2(int argc, const char **argv);
 int cmd__userdiff(int argc, const char **argv);
 int cmd__urlmatch_normalization(int argc, const char **argv);
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 8c44856eaec..28eddbc8e36 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1240,7 +1240,7 @@ test_expect_code () {
 
 test_cmp () {
 	test "$#" -ne 2 && BUG "2 param"
-	eval "$GIT_TEST_CMP" '"$@"'
+	GIT_ALLOC_LIMIT=0 eval "$GIT_TEST_CMP" '"$@"'
 }
 
 # Check that the given config key has the expected value.
@@ -1541,72 +1541,6 @@ test_skip_or_die () {
 	error "$2"
 }
 
-# The following mingw_* functions obey POSIX shell syntax, but are actually
-# bash scripts, and are meant to be used only with bash on Windows.
-
-# A test_cmp function that treats LF and CRLF equal and avoids to fork
-# diff when possible.
-mingw_test_cmp () {
-	# Read text into shell variables and compare them. If the results
-	# are different, use regular diff to report the difference.
-	local test_cmp_a= test_cmp_b=
-
-	# When text came from stdin (one argument is '-') we must feed it
-	# to diff.
-	local stdin_for_diff=
-
-	# Since it is difficult to detect the difference between an
-	# empty input file and a failure to read the files, we go straight
-	# to diff if one of the inputs is empty.
-	if test -s "$1" && test -s "$2"
-	then
-		# regular case: both files non-empty
-		mingw_read_file_strip_cr_ test_cmp_a <"$1"
-		mingw_read_file_strip_cr_ test_cmp_b <"$2"
-	elif test -s "$1" && test "$2" = -
-	then
-		# read 2nd file from stdin
-		mingw_read_file_strip_cr_ test_cmp_a <"$1"
-		mingw_read_file_strip_cr_ test_cmp_b
-		stdin_for_diff='<<<"$test_cmp_b"'
-	elif test "$1" = - && test -s "$2"
-	then
-		# read 1st file from stdin
-		mingw_read_file_strip_cr_ test_cmp_a
-		mingw_read_file_strip_cr_ test_cmp_b <"$2"
-		stdin_for_diff='<<<"$test_cmp_a"'
-	fi
-	test -n "$test_cmp_a" &&
-	test -n "$test_cmp_b" &&
-	test "$test_cmp_a" = "$test_cmp_b" ||
-	eval "diff -u \"\$@\" $stdin_for_diff"
-}
-
-# $1 is the name of the shell variable to fill in
-mingw_read_file_strip_cr_ () {
-	# Read line-wise using LF as the line separator
-	# and use IFS to strip CR.
-	local line
-	while :
-	do
-		if IFS=$'\r' read -r -d $'\n' line
-		then
-			# good
-			line=$line$'\n'
-		else
-			# we get here at EOF, but also if the last line
-			# was not terminated by LF; in the latter case,
-			# some text was read
-			if test -z "$line"
-			then
-				# EOF, really
-				break
-			fi
-		fi
-		eval "$1=\$$1\$line"
-	done
-}
-
 # Like "env FOO=BAR some-program", but run inside a subshell, which means
 # it also works for shell functions (though those functions cannot impact
 # the environment outside of the test_env invocation).
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 7726d1da88a..0be25ecbd59 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1546,7 +1546,7 @@ case $uname_s in
 	test_set_prereq SED_STRIPS_CR
 	test_set_prereq GREP_STRIPS_CR
 	test_set_prereq WINDOWS
-	GIT_TEST_CMP=mingw_test_cmp
+	GIT_TEST_CMP="test-tool text-cmp"
 	;;
 *CYGWIN*)
 	test_set_prereq POSIXPERM
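
For readers without a git checkout at hand, the comparison that both
mingw_test_cmp and the helper's loop aim for (treat CRLF and LF line
endings as equal) can be sketched with stdio alone. This is an
illustration only, not the patch's code: text_streams_differ() and
next_char_crlf_folded() are hypothetical names, the sketch works character
by character instead of using strbuf_getline(), and it may differ from the
helper in edge cases such as a missing final newline.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read one character, folding "\r\n" into "\n". A '\r' that is not
 * directly followed by '\n' is returned unchanged.
 */
static int next_char_crlf_folded(FILE *f)
{
	int c = fgetc(f);

	if (c == '\r') {
		int next = fgetc(f);

		if (next == '\n')
			return '\n';
		if (next != EOF)
			ungetc(next, f);
	}
	return c;
}

/* Return 0 if both streams hold the same text modulo CRLF vs LF, else 1. */
static int text_streams_differ(FILE *a, FILE *b)
{
	assert(a && b);

	for (;;) {
		int ca = next_char_crlf_folded(a);
		int cb = next_char_crlf_folded(b);

		if (ca != cb)
			return 1;
		if (ca == EOF)
			return 0;
	}
}
```

A caller would open both files in binary mode so that the C runtime does
not translate line endings itself, and fall back to a real diff only when
the function returns 1.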