diff mbox series

[RFC,v2] t4210: detect REG_ILLSEQ dynamically

Message ID 20200513180213.45866-1-carenas@gmail.com (mailing list archive)
State New, archived
Headers show
Series [RFC,v2] t4210: detect REG_ILLSEQ dynamically | expand

Commit Message

Carlo Marcelo Arenas Belón May 13, 2020, 6:02 p.m. UTC
7187c7bbb8 (t4210: skip i18n tests that don't work on FreeBSD, 2019-11-27)
adds a REG_ILLSEQ prerequisite to avoid failures from the tests added with
4e2443b181 (log tests: test regex backends in "--encode=<enc>" tests,
2019-06-28), but hardcodes it to be only enabled for FreeBSD.

Instead of hardcoding the affected platform, add a test using test-tool to
make sure it can be dynamically detected in other affected systems (like
DragonFlyBSD or macOS), and while at it, enhance the tool so it can report
better on the cause of failure and be silent when needed.

To minimize changes, it is assumed the regcomp error is of the right type
since we control the only caller, and is also assumed to affect both basic
and extended syntax (only extended is tested, but both behave the same in
all three affected platforms).

The description of the first test which wasn't accurate has been corrected,
and unlike the original fix from 7187c7bbb8, all added entries to test-lib
are no longer needed and only the 2 affected engines will have their tests
suppressed.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
v2:
- nicer variable names and hopefully easier to understand logic

 t/helper/test-regex.c | 20 +++++++++++++++----
 t/t4210-log-i18n.sh   | 45 ++++++++++++++++++++++++++++---------------
 t/test-lib.sh         |  6 ------
 3 files changed, 45 insertions(+), 26 deletions(-)

Comments

Eric Sunshine May 13, 2020, 8:40 p.m. UTC | #1
On Wed, May 13, 2020 at 2:02 PM Carlo Marcelo Arenas Belón
<carenas@gmail.com> wrote:
> [...]
> The description of the first test which wasn't accurate has been corrected,
> and unlike the original fix from 7187c7bbb8, all added entries to test-lib
> are no longer needed and only the 2 affected engines will have their tests
> suppressed.

I see this paragraph was updated in response to my question about why
those additional test-lib.sh variable assignments were being dropped
by the patch. However, this explanation gives no actual detail about
why those assignments are unneeded, thus their removal is just as much
of a head-scratcher as was v1 in which the commit message did not talk
about them at all.

> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---
> @@ -41,16 +41,21 @@ int cmd__regex(int argc, const char **argv)
> +       ret = regcomp(&r, pat, flags);
> +       if (ret) {
> +               if (!silent) {
> +                       regerror(ret, &r, errbuf, sizeof(errbuf));
> +                       die("failed regcomp() for pattern '%s' (%s)",
> +                               pat, errbuf);
> +               } else
> +                       return 1;
> +       }

This is pure nit, and I wouldn't necessarily want to see a re-roll
just for this, but you could lose an indentation level and make the
code a bit easier to grok by structuring it like this:

    if (ret) {
        if (silent)
            return 1;
        regerror(...);
        die(...);
    }
Junio C Hamano May 15, 2020, 6 p.m. UTC | #2
Is https://github.com/git/git/runs/678964255 related to this change?

Carlo Marcelo Arenas Belón  <carenas@gmail.com> writes:

> 7187c7bbb8 (t4210: skip i18n tests that don't work on FreeBSD, 2019-11-27)
> adds a REG_ILLSEQ prerequisite to avoid failures from the tests added with
> 4e2443b181 (log tests: test regex backends in "--encode=<enc>" tests,
> 2019-06-28), but hardcodes it to be only enabled for FreeBSD.
>
> Instead of hardcoding the affected platform, add a test using test-tool to
> make sure it can be dynamically detected in other affected systems (like
> DragonFlyBSD or macOS), and while at it, enhance the tool so it can report
> better on the cause of failure and be silent when needed.
>
> To minimize changes, it is assumed the regcomp error is of the right type
> since we control the only caller, and is also assumed to affect both basic
> and extended syntax (only extended is tested, but both behave the same in
> all three affected platforms).
>
> The description of the first test which wasn't accurate has been corrected,
> and unlike the original fix from 7187c7bbb8, all added entries to test-lib
> are no longer needed and only the 2 affected engines will have their tests
> suppressed.
>
> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---
> v2:
> - nicer variable names and hopefully easier to understand logic



>
>  t/helper/test-regex.c | 20 +++++++++++++++----
>  t/t4210-log-i18n.sh   | 45 ++++++++++++++++++++++++++++---------------
>  t/test-lib.sh         |  6 ------
>  3 files changed, 45 insertions(+), 26 deletions(-)
>
> diff --git a/t/helper/test-regex.c b/t/helper/test-regex.c
> index 10284cc56f..a4bd1f3a36 100644
> --- a/t/helper/test-regex.c
> +++ b/t/helper/test-regex.c
> @@ -41,16 +41,21 @@ int cmd__regex(int argc, const char **argv)
>  {
>  	const char *pat;
>  	const char *str;
> -	int flags = 0;
> +	int ret, silent = 0, flags = 0;
>  	regex_t r;
>  	regmatch_t m[1];
> +	char errbuf[64];
>  
>  	if (argc == 2 && !strcmp(argv[1], "--bug"))
>  		return test_regex_bug();
>  	else if (argc < 3)
>  		usage("test-tool regex --bug\n"
> -		      "test-tool regex <pattern> <string> [<options>]");
> +		      "test-tool regex [--silent] <pattern> <string> [<options>]");
>  
> +	if (!strcmp(argv[1], "--silent")) {
> +		silent = 1;
> +		argv++;
> +	}
>  	argv++;
>  	pat = *argv++;
>  	str = *argv++;
> @@ -67,8 +72,15 @@ int cmd__regex(int argc, const char **argv)
>  	}
>  	git_setup_gettext();
>  
> -	if (regcomp(&r, pat, flags))
> -		die("failed regcomp() for pattern '%s'", pat);
> +	ret = regcomp(&r, pat, flags);
> +	if (ret) {
> +		if (!silent) {
> +			regerror(ret, &r, errbuf, sizeof(errbuf));
> +			die("failed regcomp() for pattern '%s' (%s)",
> +				pat, errbuf);
> +		} else
> +			return 1;
> +	}
>  	if (regexec(&r, str, 1, m, 0))
>  		return 1;
>  
> diff --git a/t/t4210-log-i18n.sh b/t/t4210-log-i18n.sh
> index c3792081e6..8fccb83020 100755
> --- a/t/t4210-log-i18n.sh
> +++ b/t/t4210-log-i18n.sh
> @@ -10,6 +10,12 @@ latin1_e=$(printf '\351')
>  # invalid UTF-8
>  invalid_e=$(printf '\303\50)') # ")" at end to close opening "("
>  
> +if test_have_prereq GETTEXT_LOCALE &&
> +	! LC_ALL=$is_IS_locale test-tool regex --silent $latin1_e $latin1_e EXTENDED
> +then
> +	have_reg_illseq=1
> +fi
> +
>  test_expect_success 'create commits in different encodings' '
>  	test_tick &&
>  	cat >msg <<-EOF &&
> @@ -61,33 +67,40 @@ do
>  	prereq=
>  	if test $engine = "perl"
>  	then
> -		prereq="PCRE"
> -	else
> -		prereq=""
> +		prereq=PCRE
>  	fi
>  	force_regex=
>  	if test $engine != "fixed"
>  	then
> -	    force_regex=.*
> +		force_regex=.*
> +	fi
> +
> +	if test -z "$have_reg_illseq" ||
> +		test $engine = "fixed" || test $engine = "perl"
> +	then
> +		test_expect_success !MINGW,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep searches in log output encoding (latin1 + locale)" "
> +			cat >expect <<-\EOF &&
> +			latin1
> +			utf8
> +			EOF
> +			LC_ALL=$is_IS_locale git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$latin1_e\" >actual &&
> +			test_cmp expect actual
> +		"
>  	fi
> -	test_expect_success !MINGW,!REGEX_ILLSEQ,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep does not find non-reencoded values (latin1 + locale)" "
> -		cat >expect <<-\EOF &&
> -		latin1
> -		utf8
> -		EOF
> -		LC_ALL=\"$is_IS_locale\" git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$latin1_e\" >actual &&
> -		test_cmp expect actual
> -	"
>  
>  	test_expect_success !MINGW,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep does not find non-reencoded values (latin1 + locale)" "
>  		LC_ALL=\"$is_IS_locale\" git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$utf8_e\" >actual &&
>  		test_must_be_empty actual
>  	"
>  
> -	test_expect_success !MINGW,!REGEX_ILLSEQ,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep does not die on invalid UTF-8 value (latin1 + locale + invalid needle)" "
> -		LC_ALL=\"$is_IS_locale\" git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$invalid_e\" >actual &&
> -		test_must_be_empty actual
> -	"
> +	if test -z "$have_reg_illseq" ||
> +		test $engine = "fixed" || test $engine = "perl"
> +	then
> +		test_expect_success !MINGW,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep does not die on invalid UTF-8 value (latin1 + locale + invalid needle)" "
> +			LC_ALL=$is_IS_locale git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$invalid_e\" >actual &&
> +			test_must_be_empty actual
> +		"
> +	fi
>  done
>  
>  test_done
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 0ea1e5a05e..81473fea1d 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1454,12 +1454,6 @@ case $uname_s in
>  	test_set_prereq SED_STRIPS_CR
>  	test_set_prereq GREP_STRIPS_CR
>  	;;
> -FreeBSD)
> -	test_set_prereq REGEX_ILLSEQ
> -	test_set_prereq POSIXPERM
> -	test_set_prereq BSLASHPSPEC
> -	test_set_prereq EXECKEEPSPID
> -	;;
>  *)
>  	test_set_prereq POSIXPERM
>  	test_set_prereq BSLASHPSPEC
Carlo Marcelo Arenas Belón May 15, 2020, 6:18 p.m. UTC | #3
On Fri, May 15, 2020 at 11:00:50AM -0700, Junio C Hamano wrote:
> Is https://github.com/git/git/runs/678964255 related to this change?

yes; when built without PCRE then fixed can also trigger REG_SIGILL

Carlo
diff mbox series

Patch

diff --git a/t/helper/test-regex.c b/t/helper/test-regex.c
index 10284cc56f..a4bd1f3a36 100644
--- a/t/helper/test-regex.c
+++ b/t/helper/test-regex.c
@@ -41,16 +41,21 @@  int cmd__regex(int argc, const char **argv)
 {
 	const char *pat;
 	const char *str;
-	int flags = 0;
+	int ret, silent = 0, flags = 0;
 	regex_t r;
 	regmatch_t m[1];
+	char errbuf[64];
 
 	if (argc == 2 && !strcmp(argv[1], "--bug"))
 		return test_regex_bug();
 	else if (argc < 3)
 		usage("test-tool regex --bug\n"
-		      "test-tool regex <pattern> <string> [<options>]");
+		      "test-tool regex [--silent] <pattern> <string> [<options>]");
 
+	if (!strcmp(argv[1], "--silent")) {
+		silent = 1;
+		argv++;
+	}
 	argv++;
 	pat = *argv++;
 	str = *argv++;
@@ -67,8 +72,15 @@  int cmd__regex(int argc, const char **argv)
 	}
 	git_setup_gettext();
 
-	if (regcomp(&r, pat, flags))
-		die("failed regcomp() for pattern '%s'", pat);
+	ret = regcomp(&r, pat, flags);
+	if (ret) {
+		if (!silent) {
+			regerror(ret, &r, errbuf, sizeof(errbuf));
+			die("failed regcomp() for pattern '%s' (%s)",
+				pat, errbuf);
+		} else
+			return 1;
+	}
 	if (regexec(&r, str, 1, m, 0))
 		return 1;
 
diff --git a/t/t4210-log-i18n.sh b/t/t4210-log-i18n.sh
index c3792081e6..8fccb83020 100755
--- a/t/t4210-log-i18n.sh
+++ b/t/t4210-log-i18n.sh
@@ -10,6 +10,12 @@  latin1_e=$(printf '\351')
 # invalid UTF-8
 invalid_e=$(printf '\303\50)') # ")" at end to close opening "("
 
+if test_have_prereq GETTEXT_LOCALE &&
+	! LC_ALL=$is_IS_locale test-tool regex --silent $latin1_e $latin1_e EXTENDED
+then
+	have_reg_illseq=1
+fi
+
 test_expect_success 'create commits in different encodings' '
 	test_tick &&
 	cat >msg <<-EOF &&
@@ -61,33 +67,40 @@  do
 	prereq=
 	if test $engine = "perl"
 	then
-		prereq="PCRE"
-	else
-		prereq=""
+		prereq=PCRE
 	fi
 	force_regex=
 	if test $engine != "fixed"
 	then
-	    force_regex=.*
+		force_regex=.*
+	fi
+
+	if test -z "$have_reg_illseq" ||
+		test $engine = "fixed" || test $engine = "perl"
+	then
+		test_expect_success !MINGW,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep searches in log output encoding (latin1 + locale)" "
+			cat >expect <<-\EOF &&
+			latin1
+			utf8
+			EOF
+			LC_ALL=$is_IS_locale git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$latin1_e\" >actual &&
+			test_cmp expect actual
+		"
 	fi
-	test_expect_success !MINGW,!REGEX_ILLSEQ,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep does not find non-reencoded values (latin1 + locale)" "
-		cat >expect <<-\EOF &&
-		latin1
-		utf8
-		EOF
-		LC_ALL=\"$is_IS_locale\" git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$latin1_e\" >actual &&
-		test_cmp expect actual
-	"
 
 	test_expect_success !MINGW,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep does not find non-reencoded values (latin1 + locale)" "
 		LC_ALL=\"$is_IS_locale\" git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$utf8_e\" >actual &&
 		test_must_be_empty actual
 	"
 
-	test_expect_success !MINGW,!REGEX_ILLSEQ,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep does not die on invalid UTF-8 value (latin1 + locale + invalid needle)" "
-		LC_ALL=\"$is_IS_locale\" git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$invalid_e\" >actual &&
-		test_must_be_empty actual
-	"
+	if test -z "$have_reg_illseq" ||
+		test $engine = "fixed" || test $engine = "perl"
+	then
+		test_expect_success !MINGW,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep does not die on invalid UTF-8 value (latin1 + locale + invalid needle)" "
+			LC_ALL=$is_IS_locale git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$invalid_e\" >actual &&
+			test_must_be_empty actual
+		"
+	fi
 done
 
 test_done
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 0ea1e5a05e..81473fea1d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1454,12 +1454,6 @@  case $uname_s in
 	test_set_prereq SED_STRIPS_CR
 	test_set_prereq GREP_STRIPS_CR
 	;;
-FreeBSD)
-	test_set_prereq REGEX_ILLSEQ
-	test_set_prereq POSIXPERM
-	test_set_prereq BSLASHPSPEC
-	test_set_prereq EXECKEEPSPID
-	;;
 *)
 	test_set_prereq POSIXPERM
 	test_set_prereq BSLASHPSPEC