[00/11,RFH] Introduce support for GETTEXT_POISON=rot13

Message ID	pull.836.git.1610441262.gitgitgadget@gmail.com (mailing list archive)
Headers	show Return-Path: <git-owner@kernel.org> Message-Id: <pull.836.git.1610441262.gitgitgadget@gmail.com> From: "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> Date: Tue, 12 Jan 2021 08:47:31 +0000 Subject: [PATCH 00/11] [RFH] Introduce support for GETTEXT_POISON=rot13 Fcc: Sent Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 To: git@vger.kernel.org Cc: =?utf-8?b?Tmd1eeG7hW4gVGjDoWkgTmfhu41j?= Duy <pclouds@gmail.com>, =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= <avarab@gmail.com>, SZEDER =?utf-8?b?R8OhYm9y?= <szeder.dev@gmail.com>, Johannes Schindelin <johannes.schindelin@gmx.de> Precedence: bulk
Series	Introduce support for GETTEXT_POISON=rot13 \| expand [00/11,RFH] Introduce support for GETTEXT_POISON=rot13 [01/11] tests: remove unnecessary GIT_TEST_GETTEXT_POISON=false constructs [02/11] Support GIT_TEST_GETTEXT_POISON=rot13 [03/11] parse-options: add forgotten translations [04/11] sha1dc: mark forgotten message for translation [05/11] t0006: use `test_i18ncmp` only for C locales [06/11] t0041: stop using `test_i18ngrep` on untranslated output [07/11] t0027: mark a function as requiring the C locale [08/11] t6301: do not expect the output of `for-each-ref` to be translated [09/11] GETTEXT_POISON=rot13: do compare the output in `test_i18ncmp` [10/11] GETTEXT_POISON=rot13: perform actual checks in `test_i18ngrep` [11/11] test-tool i18n: do not complain about empty messages

Philippe Blain via GitGitGadget Jan. 12, 2021, 8:47 a.m. UTC

This patch series is incomplete because I lack the time to finish it, and I
hope that there are interested developers who can help with it.

In https://lore.kernel.org/git/20181022153633.31757-1-pclouds@gmail.com/,
Duy proposed introducing a fake language they called "Ook" to be able to
test translations better. "Ook" would "translate" messages without any
printf format specifiers to "Eek", otherwise convert all upper-case letters
to "Ook" and lower-case letters to "ook".

Gábor replied with a different proposal that has the advantage of being
bijective, i.e. it is possible to reconstruct the untranslated message from
the translated one.

Ævar suggested recently
[https://lore.kernel.org/git/xmqqim836v6m.fsf@gitster.c.googlers.com/T/#m6fdc43d4f1eb3f20f841096c59e985b69c84875e]
that this whole GETTEXT_POISON business is totally useless.

I do not believe that it is useless. To back up my belief with something
tangible, I implemented a GETTEXT_POISON=rot13 mode and ran the test suite
to see whether we erroneously expect translated messages where they aren't,
and related problems.

And the experiment delivered. It is just a demonstration (I only addressed a
handful of the test suite failures, 124 test scripts still need to be
inspected to find out why they fail), of course. Yet I think that finding
e.g. the missing translations of sha1dc-cb and parse-options show that it
would be very useful to complete this patch series and then use the rot13
mode in our CI jobs (instead of the current GETTEXT_POISON=true mode, which
really is less useful).

Note: in its current form, this patch series fails t0013 and t9902 in the
GETTEXT_POISON job
[https://github.com/gitgitgadget/git/pull/836/checks?check_run_id=1686715225#step:4:1590],
therefore it cannot be integrated into Git as-is, even if it does not switch
the GETTEXT_POISON job to use the rot13 mode yet.

Sadly, even though I want to see this patch series go further, I am
operating under serious time constraints and therefore had to admit to
myself, grudgingly, that I can't. So again, I hope that somebody else can
push this effort further.

Johannes Schindelin (11):
  tests: remove unnecessary GIT_TEST_GETTEXT_POISON=false constructs
  Support GIT_TEST_GETTEXT_POISON=rot13
  parse-options: add forgotten translations
  sha1dc: mark forgotten message for translation
  t0006: use `test_i18ncmp` only for C locales
  t0041: stop using `test_i18ngrep` on untranslated output
  t0027: mark a function as requiring the C locale
  t6301: do not expect the output of `for-each-ref` to be translated
  GETTEXT_POISON=rot13: do compare the output in `test_i18ncmp`
  GETTEXT_POISON=rot13: perform actual checks in `test_i18ngrep`
  test-tool i18n: do not complain about empty messages

 Makefile                       |   1 +
 gettext.c                      |  65 ++++++++++++++++-
 gettext.h                      |   5 +-
 parse-options-cb.c             |   4 +-
 sha1dc_git.c                   |   2 +-
 t/helper/test-i18n.c           | 129 +++++++++++++++++++++++++++++++++
 t/helper/test-tool.c           |   1 +
 t/helper/test-tool.h           |   1 +
 t/t0006-date.sh                |   4 +-
 t/t0017-env-helper.sh          |   6 +-
 t/t0027-auto-crlf.sh           |   3 +
 t/t0041-usage.sh               |   2 +-
 t/t1305-config-include.sh      |   6 +-
 t/t6301-for-each-ref-errors.sh |   2 +-
 t/t7201-co.sh                  |   4 +-
 t/t9902-completion.sh          |   1 -
 t/test-lib-functions.sh        |  14 +++-
 t/test-lib.sh                  |   1 +
 18 files changed, 228 insertions(+), 23 deletions(-)
 create mode 100644 t/helper/test-i18n.c


base-commit: 72c4083ddf91b489b7b7b812df67ee8842177d98
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-836%2Fdscho%2Fgettext-poison-should-rot13-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-836/dscho/gettext-poison-should-rot13-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/836

Jeff King Jan. 12, 2021, 11:34 a.m. UTC | #1

On Tue, Jan 12, 2021 at 08:47:31AM +0000, Johannes Schindelin via GitGitGadget wrote:

> Ævar suggested recently
> [https://lore.kernel.org/git/xmqqim836v6m.fsf@gitster.c.googlers.com/T/#m6fdc43d4f1eb3f20f841096c59e985b69c84875e]
> that this whole GETTEXT_POISON business is totally useless.
> 
> I do not believe that it is useless. To back up my belief with something
> tangible, I implemented a GETTEXT_POISON=rot13 mode and ran the test suite
> to see whether we erroneously expect translated messages where they aren't,
> and related problems.
> 
> And the experiment delivered. It is just a demonstration (I only addressed a
> handful of the test suite failures, 124 test scripts still need to be
> inspected to find out why they fail), of course. Yet I think that finding
> e.g. the missing translations of sha1dc-cb and parse-options show that it
> would be very useful to complete this patch series and then use the rot13
> mode in our CI jobs (instead of the current GETTEXT_POISON=true mode, which
> really is less useful).

I'm not entirely convinced by this. The original point of the poison
code was not to find opportunities to translate strings, but to make
sure we did not accidentally translate a string that some script was
relying on. And I don't see any fixes for the latter here (and as Ævar
suggested in the linked thread, the fact that we're not combing through
existing code looking for translations makes such an error a lot less
likely).

Which isn't to say repurposing it in the other direction might not be
worthwhile. But I suspect a lot of the test failures are just false
positives. Until now it was always reasonable to conservatively use
test_i18ngrep for cases which could reasonably translated, even if they
were not yet.

Likewise, I'm not sure that one can reliably rot13 an output for
test_i18ncmp. It could contain mixed translated and untranslated bits
from different messages.

So I dunno. You did find two spots where translations could be used.
But if nobody actually saw them to care that they got translated, were
they important? I may be a bit biased as somebody who would not use the
translations in the first place, of course.

-Peff

Ævar Arnfjörð Bjarmason Jan. 12, 2021, 1:32 p.m. UTC | #2

On Tue, Jan 12 2021, Johannes Schindelin via GitGitGadget wrote:

> This patch series is incomplete because I lack the time to finish it, and I
> hope that there are interested developers who can help with it.
>
> In https://lore.kernel.org/git/20181022153633.31757-1-pclouds@gmail.com/,
> Duy proposed introducing a fake language they called "Ook" to be able to
> test translations better. "Ook" would "translate" messages without any
> printf format specifiers to "Eek", otherwise convert all upper-case letters
> to "Ook" and lower-case letters to "ook".
>
> Gábor replied with a different proposal that has the advantage of being
> bijective, i.e. it is possible to reconstruct the untranslated message from
> the translated one.
>
> Ævar suggested recently
> [https://lore.kernel.org/git/xmqqim836v6m.fsf@gitster.c.googlers.com/T/#m6fdc43d4f1eb3f20f841096c59e985b69c84875e]
> that this whole GETTEXT_POISON business is totally useless.

To clarify what I really mean, but admittedly am not always the best at
articulating: I do not think GETTEXT_POISON is useless, it's useful.

The interesting question is "useful enough to bother?", or in economics
terms "is this investment in time/money worth the opportunity cost?".

Any hoops we make people jump through when writing tests takes away time
and attention from other things.

My motivation here is that I feel this whole poison business is all my
fault. Every time some newbie submits a patch and needs to re-roll a v2
because they used "grep" instead of "test_i18ngrep" in a case that
clearly didn't involve a plumbing string I cringe a little about the
technical debt.

So while I still think my series (just nuke it) would be better overall,
I'm secretly rooting for yours. If the consensus is that this is worth
keeping and improving perhaps we haven't been mostly wasting our time :)

On to discussing this series:

> I do not believe that it is useless. To back up my belief with something
> tangible, I implemented a GETTEXT_POISON=rot13 mode and ran the test suite
> to see whether we erroneously expect translated messages where they aren't,
> and related problems.

I agree that rot13ing it makes it much more useful in the sense that
before we could over-tag things with test_i18n* and we'd just "return 0"
there in the poison mode, now we actually check if the string is
poisoned.

We could get most of the way there by checking e.g. if "GETTEXT POISON"
is in the output, but it wouldn't assert that the message is 100%
translated as this WIP series does.

In our off-list discussion you maintained that
"GIT_TEST_GETTEXT_POISON=false <cmd>" was an anti-pattern. Was it
because you thought "test_i18n*" et al was actually doing some "is
poison?" comparison as we now do, or just because you didn't
conceptually like a pattern like:

    GIT_TEST_GETTEXT_POISON=false git cmd >out &&
    grep string out

Being changed later to:

    GIT_TEST_GETTEXT_POISON=false git cmd >out &&
    grep string out &&
    grep plumbing-string out

Or whatever, as opposed to:

    git cmd >out &&
    test_i18ngrep string out &&
    test_i18ngrep plumbing-string out

I get the aesthetic preference, I'm just wondering if I'm missing
something else about it...

> And the experiment delivered. It is just a demonstration (I only addressed a
> handful of the test suite failures, 124 test scripts still need to be
> inspected to find out why they fail), of course. Yet I think that finding
> e.g. the missing translations of sha1dc-cb and parse-options show that it
> would be very useful to complete this patch series and then use the rot13
> mode in our CI jobs (instead of the current GETTEXT_POISON=true mode, which
> really is less useful).

The following patch on top cuts down on the failures by more than half:

-----------------
diff --git a/git-sh-i18n.sh b/git-sh-i18n.sh
index 8eef60b43f..f9239d2917 100644
--- a/git-sh-i18n.sh
+++ b/git-sh-i18n.sh
@@ -17,7 +17,10 @@ export TEXTDOMAINDIR
 
 # First decide what scheme to use...
 GIT_INTERNAL_GETTEXT_SH_SCHEME=fallthrough
-if test -n "$GIT_TEST_GETTEXT_POISON" &&
+if test -n "$GIT_TEST_GETTEXT_POISON" -a "$GIT_TEST_GETTEXT_POISON" = "rot13"
+then
+	GIT_INTERNAL_GETTEXT_SH_SCHEME=rot13poison
+elif  test -n "$GIT_TEST_GETTEXT_POISON" &&
 	    git env--helper --type=bool --default=0 --exit-code \
 		GIT_TEST_GETTEXT_POISON
 then
@@ -63,6 +66,21 @@ gettext_without_eval_gettext)
 		)
 	}
 	;;
+rot13poison)
+	# Emit garbage so that tests that incorrectly rely on translatable
+	# strings will fail.
+	gettext () {
+		printf "%s" "# SHELL GETTEXT POISON #"
+	}
+
+	eval_gettext () {
+		printf "%s" "# SHELL GETTEXT POISON #"
+	}
+
+	eval_ngettext () {
+		printf "%s" "# SHELL GETTEXT POISON #"
+	}
+	;;
 poison)
 	# Emit garbage so that tests that incorrectly rely on translatable
 	# strings will fail.
diff --git a/t/helper/test-i18n.c b/t/helper/test-i18n.c
index 82efc66f1f..1f0fa3d041 100644
--- a/t/helper/test-i18n.c
+++ b/t/helper/test-i18n.c
@@ -1,6 +1,8 @@
 #include "test-tool.h"
 #include "cache.h"
 #include "grep.h"
+#include "config.h"
 
 static const char *usage_msg = "\n"
 "  test-tool i18n cmp <file1> <file2>\n"
@@ -34,8 +36,12 @@ static size_t unrot13(char *buf)
 
 		p += strlen("<rot13>");
 		end = strstr(p, "</rot13>");
-		if (!end)
-			BUG("could not find </rot13> in\n%s", buf);
+		if (!end) {
+			if (git_env_bool("GIT_TEST_GETTEXT_POISON_PEDANTIC", 0))
+				BUG("could not find </rot13> in\n%s", buf);
+			else
+				break;
+		}
 
 		while (p != end)
 			*(q++) = do_rot13(*(p++));
@@ -67,8 +73,12 @@ static int i18n_cmp(const char **argv)
 
 	if (strbuf_read_file(&a, path1, 0) < 0)
 		die_errno("could not read %s", path1);
+	if (strstr(a.buf, "# SHELL GETTEXT POISON #"))
+		return 0;
 	if (strbuf_read_file(&b, path2, 0) < 0)
 		die_errno("could not read %s", path2);
+	if (strstr(b.buf, "# SHELL GETTEXT POISON #"))
+		return 0;
 	unrot13_strbuf(&b);
 
 	if (a.len != b.len || memcmp(a.buf, b.buf, a.len))
@@ -79,7 +89,6 @@ static int i18n_cmp(const char **argv)
 
 static int i18n_grep(const char **argv)
 {
-	int dont_match = 0;
 	const char *pattern, *path;
 
 	struct grep_opt opt;
@@ -87,11 +96,6 @@ static int i18n_grep(const char **argv)
 	struct strbuf buf = STRBUF_INIT;
 	int hit;
 
-	if (*argv && !strcmp("!", *argv)) {
-		dont_match = 1;
-		argv++;
-	}
-
 	pattern = *(argv++);
 	path = *(argv++);
 
@@ -104,13 +108,15 @@ static int i18n_grep(const char **argv)
 
 	if (strbuf_read_file(&buf, path, 0) < 0)
 		die_errno("could not read %s", path);
+	if (strstr(buf.buf, "# SHELL GETTEXT POISON #"))
+		return 0;
 	unrot13_strbuf(&buf);
 	grep_source_init(&source, GREP_SOURCE_BUF, path, path, path);
 	source.buf = buf.buf;
 	source.size = buf.len;
 	hit = grep_source(&opt, &source);
 	strbuf_release(&buf);
-	return dont_match ^ !hit;
+	return !hit;
 }
 
 int cmd__i18n(int argc, const char **argv)
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 394fd2ef5b..6c580a3000 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1019,14 +1019,12 @@ test_i18ngrep () {
 		BUG "too few parameters to test_i18ngrep"
 	fi
 
-	if test_have_prereq !C_LOCALE_OUTPUT
+	grep_cmd=grep
+	if test "$GIT_TEST_GETTEXT_POISON" == "rot13"
+	then
+		grep_cmd="test-tool i18n grep"
+	elif test_have_prereq !C_LOCALE_OUTPUT
 	then
-		if test rot13 = "$GIT_TEST_GETTEXT_POISON"
-		then
-			test-tool i18n grep "$@"
-			return $?
-		fi
-
 		# pretend success
 		return 0
 	fi
@@ -1034,13 +1032,13 @@ test_i18ngrep () {
 	if test "x!" = "x$1"
 	then
 		shift
-		! grep "$@" && return 0
+		! $grep_cmd "$@" && return 0
 
-		echo >&4 "error: '! grep $@' did find a match in:"
+		echo >&4 "error: '! $grep_cmd $@' did find a match in:"
 	else
-		grep "$@" && return 0
+		$grep_cmd "$@" && return 0
 
-		echo >&4 "error: 'grep $@' didn't find a match in:"
+		echo >&4 "error: '$grep_cmd $@' didn't find a match in:"
 	fi
 
 	if test -s "$last_arg"
-----------------

I.e. most of this was because you didn't add any support for this to the
shellscript translations.

We still punt on that here, it should ideally preserve the shell
variables like the format specifiers, but I think that's not worth the
effort with us actively cutting down our shell code to nothing.

We still have failures e.g. due to "test_i18ngrep -F fixed file" not
being supported. Wouldn't a better/simpler approach be to just have a
light rot13 helper, and then call the actual "cmp"/"grep" with the
munged input/output?

Junio C Hamano Jan. 12, 2021, 7:50 p.m. UTC | #3

Jeff King <peff@peff.net> writes:

> Likewise, I'm not sure that one can reliably rot13 an output for
> test_i18ncmp. It could contain mixed translated and untranslated bits
> from different messages.

As long as <rot13>...</rot13> marking can be trusted (and it cannot
be, but if we declare it is OK to punt, then by definition we can),
I think the messages can be compared.

> So I dunno. You did find two spots where translations could be used.
> But if nobody actually saw them to care that they got translated, were
> they important? I may be a bit biased as somebody who would not use the
> translations in the first place, of course.

I viewed the series with interest, mostly (i.e. 80%+) for its "fun
value"; I tend to agree with you that I doubt its usefulness.

Jeff King Jan. 13, 2021, 7:32 a.m. UTC | #4

On Tue, Jan 12, 2021 at 11:50:52AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Likewise, I'm not sure that one can reliably rot13 an output for
> > test_i18ncmp. It could contain mixed translated and untranslated bits
> > from different messages.
> 
> As long as <rot13>...</rot13> marking can be trusted (and it cannot
> be, but if we declare it is OK to punt, then by definition we can),
> I think the messages can be compared.

Good point. I should have looked at the implementation more carefully
before responding (and I agree that while not foolproof, tagging like
that would work OK in practice).

-Peff

Johannes Schindelin Jan. 15, 2021, 4:13 p.m. UTC | #5

Hi Ævar,


On Tue, 12 Jan 2021, Ævar Arnfjörð Bjarmason wrote:

> On Tue, Jan 12 2021, Johannes Schindelin via GitGitGadget wrote:
>
> > I implemented a GETTEXT_POISON=rot13 mode and ran the test suite to
> > see whether we erroneously expect translated messages where they
> > aren't, and related problems.
>
> I agree that rot13ing it makes it much more useful in the sense that
> before we could over-tag things with test_i18n* and we'd just "return 0"
> there in the poison mode, now we actually check if the string is
> poisoned.

Precisely.

> In our off-list discussion you maintained that
> "GIT_TEST_GETTEXT_POISON=false <cmd>" was an anti-pattern. Was it
> because you thought "test_i18n*" et al was actually doing some "is
> poison?" [...]

I maintain this view because the whole point of having a GETTEXT_POISON
job is to have the entire test suite run in that mode. Using this
anti-pattern says to me "the author could not make this work correctly in
GETTEXT_POISON mode".

Even more, _iff_ we expect certain validations to be skipped in the
GETTEXT_POISON job, using `GIT_TEST_GETTEXT_POISON=false` opens the
possibility that this job fails when it never should have failed.

> > And the experiment delivered. It is just a demonstration (I only addressed a
> > handful of the test suite failures, 124 test scripts still need to be
> > inspected to find out why they fail), of course. Yet I think that finding
> > e.g. the missing translations of sha1dc-cb and parse-options show that it
> > would be very useful to complete this patch series and then use the rot13
> > mode in our CI jobs (instead of the current GETTEXT_POISON=true mode, which
> > really is less useful).
>
> The following patch on top cuts down on the failures by more than half:

Interesting! I forgot the shell half... Curious that the scripted parts
_still_ affect so much of our test suite :-(

> -----------------
> diff --git a/git-sh-i18n.sh b/git-sh-i18n.sh
> index 8eef60b43f..f9239d2917 100644
> --- a/git-sh-i18n.sh
> +++ b/git-sh-i18n.sh
> @@ -17,7 +17,10 @@ export TEXTDOMAINDIR
>
>  # First decide what scheme to use...
>  GIT_INTERNAL_GETTEXT_SH_SCHEME=fallthrough
> -if test -n "$GIT_TEST_GETTEXT_POISON" &&
> +if test -n "$GIT_TEST_GETTEXT_POISON" -a "$GIT_TEST_GETTEXT_POISON" = "rot13"
> +then
> +	GIT_INTERNAL_GETTEXT_SH_SCHEME=rot13poison
> +elif  test -n "$GIT_TEST_GETTEXT_POISON" &&
>  	    git env--helper --type=bool --default=0 --exit-code \
>  		GIT_TEST_GETTEXT_POISON
>  then
> @@ -63,6 +66,21 @@ gettext_without_eval_gettext)
>  		)
>  	}
>  	;;
> +rot13poison)
> +	# Emit garbage so that tests that incorrectly rely on translatable
> +	# strings will fail.
> +	gettext () {
> +		printf "%s" "# SHELL GETTEXT POISON #"
> +	}
> +
> +	eval_gettext () {
> +		printf "%s" "# SHELL GETTEXT POISON #"
> +	}
> +
> +	eval_ngettext () {
> +		printf "%s" "# SHELL GETTEXT POISON #"
> +	}

Right, and for that, I'd rather use the `test-tool i18n` helper (it is
appropriate to expect `test-tool` to be in the `PATH` in GETTEXT_POISON
mode, right?)

> +	;;
>  poison)
>  	# Emit garbage so that tests that incorrectly rely on translatable
>  	# strings will fail.
> diff --git a/t/helper/test-i18n.c b/t/helper/test-i18n.c
> index 82efc66f1f..1f0fa3d041 100644
> --- a/t/helper/test-i18n.c
> +++ b/t/helper/test-i18n.c
> @@ -1,6 +1,8 @@
>  #include "test-tool.h"
>  #include "cache.h"
>  #include "grep.h"
> +#include "config.h"
>
>  static const char *usage_msg = "\n"
>  "  test-tool i18n cmp <file1> <file2>\n"
> @@ -34,8 +36,12 @@ static size_t unrot13(char *buf)
>
>  		p += strlen("<rot13>");
>  		end = strstr(p, "</rot13>");
> -		if (!end)
> -			BUG("could not find </rot13> in\n%s", buf);
> +		if (!end) {
> +			if (git_env_bool("GIT_TEST_GETTEXT_POISON_PEDANTIC", 0))
> +				BUG("could not find </rot13> in\n%s", buf);
> +			else
> +				break;
> +		}
>
>  		while (p != end)
>  			*(q++) = do_rot13(*(p++));
> @@ -67,8 +73,12 @@ static int i18n_cmp(const char **argv)
>
>  	if (strbuf_read_file(&a, path1, 0) < 0)
>  		die_errno("could not read %s", path1);
> +	if (strstr(a.buf, "# SHELL GETTEXT POISON #"))
> +		return 0;
>  	if (strbuf_read_file(&b, path2, 0) < 0)
>  		die_errno("could not read %s", path2);
> +	if (strstr(b.buf, "# SHELL GETTEXT POISON #"))
> +		return 0;
>  	unrot13_strbuf(&b);
>
>  	if (a.len != b.len || memcmp(a.buf, b.buf, a.len))
> @@ -79,7 +89,6 @@ static int i18n_cmp(const char **argv)
>
>  static int i18n_grep(const char **argv)
>  {
> -	int dont_match = 0;
>  	const char *pattern, *path;
>
>  	struct grep_opt opt;
> @@ -87,11 +96,6 @@ static int i18n_grep(const char **argv)
>  	struct strbuf buf = STRBUF_INIT;
>  	int hit;
>
> -	if (*argv && !strcmp("!", *argv)) {
> -		dont_match = 1;
> -		argv++;
> -	}
> -
>  	pattern = *(argv++);
>  	path = *(argv++);
>
> @@ -104,13 +108,15 @@ static int i18n_grep(const char **argv)
>
>  	if (strbuf_read_file(&buf, path, 0) < 0)
>  		die_errno("could not read %s", path);
> +	if (strstr(buf.buf, "# SHELL GETTEXT POISON #"))
> +		return 0;
>  	unrot13_strbuf(&buf);
>  	grep_source_init(&source, GREP_SOURCE_BUF, path, path, path);
>  	source.buf = buf.buf;
>  	source.size = buf.len;
>  	hit = grep_source(&opt, &source);
>  	strbuf_release(&buf);
> -	return dont_match ^ !hit;
> +	return !hit;
>  }
>
>  int cmd__i18n(int argc, const char **argv)
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index 394fd2ef5b..6c580a3000 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1019,14 +1019,12 @@ test_i18ngrep () {
>  		BUG "too few parameters to test_i18ngrep"
>  	fi
>
> -	if test_have_prereq !C_LOCALE_OUTPUT
> +	grep_cmd=grep
> +	if test "$GIT_TEST_GETTEXT_POISON" == "rot13"
> +	then
> +		grep_cmd="test-tool i18n grep"
> +	elif test_have_prereq !C_LOCALE_OUTPUT
>  	then
> -		if test rot13 = "$GIT_TEST_GETTEXT_POISON"
> -		then
> -			test-tool i18n grep "$@"
> -			return $?
> -		fi
> -
>  		# pretend success
>  		return 0
>  	fi
> @@ -1034,13 +1032,13 @@ test_i18ngrep () {
>  	if test "x!" = "x$1"
>  	then
>  		shift
> -		! grep "$@" && return 0
> +		! $grep_cmd "$@" && return 0
>
> -		echo >&4 "error: '! grep $@' did find a match in:"
> +		echo >&4 "error: '! $grep_cmd $@' did find a match in:"
>  	else
> -		grep "$@" && return 0
> +		$grep_cmd "$@" && return 0
>
> -		echo >&4 "error: 'grep $@' didn't find a match in:"
> +		echo >&4 "error: '$grep_cmd $@' didn't find a match in:"
>  	fi
>
>  	if test -s "$last_arg"
> -----------------
>
> I.e. most of this was because you didn't add any support for this to the
> shellscript translations.
>
> We still punt on that here, it should ideally preserve the shell
> variables like the format specifiers, but I think that's not worth the
> effort with us actively cutting down our shell code to nothing.

I'm not sure how fast we'll get there, what with `git submodule` being
slow to get over the finish line, and with `mergetool`/`difftool` still
being scripted, and probably for a long time, too.

> We still have failures e.g. due to "test_i18ngrep -F fixed file" not
> being supported. Wouldn't a better/simpler approach be to just have a
> light rot13 helper, and then call the actual "cmp"/"grep" with the
> munged input/output?

Hmm. I hoped that this work would open the door to introducing more C
helpers in the test suite, thereby accelerating it on Windows.

We could also potentially move the `cmp`/`grep` parts into separate
helpers that give more useful output in case of a failure. (Right now, if
anything fails in CI involving `grep` or `cmp`, the user investigating
this pretty much _has_ to try to reproduce the issue locally because the
output is so not helpful, and reproducing it might be a challenge e.g.
when the failure is macOS-specific and the developer does not have access
to a macOS setup).

Ciao,
Dscho

Junio C Hamano Jan. 16, 2021, 6:38 a.m. UTC | #6

Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
> ...
>> So I dunno. You did find two spots where translations could be used.
>> But if nobody actually saw them to care that they got translated, were
>> they important? I may be a bit biased as somebody who would not use the
>> translations in the first place, of course.
>
> I viewed the series with interest, mostly (i.e. 80%+) for its "fun
> value"; I tend to agree with you that I doubt its usefulness.

Actually, "I doubt its usefulness" is way too strong than what I
really wanted to say.  I was just skeptical, and I still somewhat
am.  I'd just need a bit more convincing ;-)

[00/11,RFH] Introduce support for GETTEXT_POISON=rot13

Message

Comments