diff mbox series

[v3,01/12] check-non-portable-shell: check for "test <cond> -a/-o <cond>"

Message ID patch-01.12-a8b483bc77-20210420T121833Z-avarab@gmail.com (mailing list archive)
State Superseded
Headers show
Series test-lib.sh: new test_commit args, simplification & fixes | expand

Commit Message

Ævar Arnfjörð Bjarmason April 20, 2021, 12:21 p.m. UTC
Add a check for -a/-o in "test", as a follow-up to the
CodingGuidelines having recommended against their use since
897f964c0dc (CodingGuidelines: avoid "test <cond> -a/-o <cond>",
2014-05-20).

These constructs are considered obsolescent by POSIX[1]. GNU has
likewise warned against them[2] for a few decades.

These will only match the simplistic forms of `test -X blah` (where
"-X" is some single letter option), but will miss expressions such as
`test "$foo" = bar`. We stop at "&" or "|" to try not to overmatch
things like:

    test whatever && ls -a foo
    test whatever && foo -o outfile

1. https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html#tag_20_128_16
2. https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.70/html_node/Limitations-of-Builtins.html

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/check-non-portable-shell.pl | 2 ++
 1 file changed, 2 insertions(+)

Comments

Junio C Hamano April 20, 2021, 10:25 p.m. UTC | #1
Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> These will only match the simplistic forms of `test -X blah` (where
> "-X" is some single letter option), but will miss expressions such as
> `test "$foo" = bar`. We stop at "&" or "|" to try not to overmatch
> things like:
>
>     test whatever && ls -a foo
>     test whatever && foo -o outfile

I still do not understand why you have to insist on dashed operator
as the first thing given to "test", like this:

> +	/\btest\s+-[a-z]\s+[^&|]+\s+-a\s+/ and err '"test A && test B" ...
> +	/\btest\s+-[a-z]\s+[^&|]+\s+-o\s+/ and err '"test A || test B" ...

IOW, what over-matching would we get if we simplified the condition
like so?

    /\btest\s+[^&|]+\s+-a\s/
    /\btest\s+[^&|]+\s+-o\s/

The one in the patch would miss things like

	test "$a" = "$b" -o "$a" -lt "$b"
	test "$n" -a "$n" -lt 4

but the only thing that we care about is that a command that started
with "test " has "-a" or "-o" before we see "&" or "|", no?
Ævar Arnfjörð Bjarmason April 21, 2021, 8:46 a.m. UTC | #2
On Wed, Apr 21 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> These will only match the simplistic forms of `test -X blah` (where
>> "-X" is some single letter option), but will miss expressions such as
>> `test "$foo" = bar`. We stop at "&" or "|" to try not to overmatch
>> things like:
>>
>>     test whatever && ls -a foo
>>     test whatever && foo -o outfile
>
> I still do not understand why you have to insist on dashed operator
> as the first thing given to "test", like this:
>
>> +	/\btest\s+-[a-z]\s+[^&|]+\s+-a\s+/ and err '"test A && test B" ...
>> +	/\btest\s+-[a-z]\s+[^&|]+\s+-o\s+/ and err '"test A || test B" ...
>
> IOW, what over-matching would we get if we simplified the condition
> like so?
>
>     /\btest\s+[^&|]+\s+-a\s/
>     /\btest\s+[^&|]+\s+-o\s/
>
> The one in the patch would miss things like
>
> 	test "$a" = "$b" -o "$a" -lt "$b"
> 	test "$n" -a "$n" -lt 4
>
> but the only thing that we care about is that a command that started
> with "test " has "-a" or "-o" before we see "&" or "|", no?

Applying your suggestion results in these false positives:
	
	t4038-diff-combined.sh:135: error: "test A && test B" preferred to "test A -a B": git commit -m "test space change" -a &&
	t4038-diff-combined.sh:147: error: "test A && test B" preferred to "test A -a B": git commit -m "test other space changes" -a &&
	t6400-merge-df.sh:89: error: "test A || test B" preferred to "test A -o B": test 0 -eq $(git ls-files -o | wc -l)
	t6400-merge-df.sh:91: error: "test A || test B" preferred to "test A -o B": test 1 -eq $(git ls-files -o | wc -l)
	t6400-merge-df.sh:110: error: "test A || test B" preferred to "test A -o B": test 0 -eq $(git ls-files -o | wc -l)
	t6400-merge-df.sh:112: error: "test A || test B" preferred to "test A -o B": test 1 -eq $(git ls-files -o | wc -l)
	t6402-merge-rename.sh:639: error: "test A || test B" preferred to "test A -o B": test 0 -eq "$(git ls-files -o | wc -l)"
	t6402-merge-rename.sh:646: error: "test A || test B" preferred to "test A -o B": test 2 -eq "$(git ls-files -o | wc -l)"
	t6402-merge-rename.sh:686: error: "test A || test B" preferred to "test A -o B": test 0 -eq "$(git ls-files -o | wc -l)" &&
	t6402-merge-rename.sh:865: error: "test A || test B" preferred to "test A -o B": test 0 -eq $(git ls-files -o | wc -l) &&
	annotate-tests.sh:201: error: "test A && test B" preferred to "test A -a B": GIT_AUTHOR_NAME="E" GIT_AUTHOR_EMAIL="E at test dot git" git commit -a -m "norobots"

I'll just drop this from the re-roll of this series. Maybe you/someone
has a better suggestion for something that's simple but still catches
these cases, but in any case I'd like to not have this series blocked on
this minor thing (which none of the rest of it needs).
Đoàn Trần Công Danh April 21, 2021, 10:39 a.m. UTC | #3
On 2021-04-21 10:46:08+0200, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> 
> On Wed, Apr 21 2021, Junio C Hamano wrote:
> 
> > Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
> >
> >> These will only match the simplistic forms of `test -X blah` (where
> >> "-X" is some single letter option), but will miss expressions such as
> >> `test "$foo" = bar`. We stop at "&" or "|" to try not to overmatch
> >> things like:
> >>
> >>     test whatever && ls -a foo
> >>     test whatever && foo -o outfile
> >
> > I still do not understand why you have to insist on dashed operator
> > as the first thing given to "test", like this:
> >
> >> +	/\btest\s+-[a-z]\s+[^&|]+\s+-a\s+/ and err '"test A && test B" ...
> >> +	/\btest\s+-[a-z]\s+[^&|]+\s+-o\s+/ and err '"test A || test B" ...
> >
> > IOW, what over-matching would we get if we simplified the condition
> > like so?
> >
> >     /\btest\s+[^&|]+\s+-a\s/
> >     /\btest\s+[^&|]+\s+-o\s/
> >
> > The one in the patch would miss things like
> >
> > 	test "$a" = "$b" -o "$a" -lt "$b"
> > 	test "$n" -a "$n" -lt 4
> >
> > but the only thing that we care about is that a command that started
> > with "test " has "-a" or "-o" before we see "&" or "|", no?
> 
> Applying your suggestion results in these false positives:
> 	
> 	t4038-diff-combined.sh:135: error: "test A && test B" preferred to "test A -a B": git commit -m "test space change" -a &&
> 	t4038-diff-combined.sh:147: error: "test A && test B" preferred to "test A -a B": git commit -m "test other space changes" -a &&
> 	t6400-merge-df.sh:89: error: "test A || test B" preferred to "test A -o B": test 0 -eq $(git ls-files -o | wc -l)
> 	t6400-merge-df.sh:91: error: "test A || test B" preferred to "test A -o B": test 1 -eq $(git ls-files -o | wc -l)
> 	t6400-merge-df.sh:110: error: "test A || test B" preferred to "test A -o B": test 0 -eq $(git ls-files -o | wc -l)
> 	t6400-merge-df.sh:112: error: "test A || test B" preferred to "test A -o B": test 1 -eq $(git ls-files -o | wc -l)
> 	t6402-merge-rename.sh:639: error: "test A || test B" preferred to "test A -o B": test 0 -eq "$(git ls-files -o | wc -l)"
> 	t6402-merge-rename.sh:646: error: "test A || test B" preferred to "test A -o B": test 2 -eq "$(git ls-files -o | wc -l)"
> 	t6402-merge-rename.sh:686: error: "test A || test B" preferred to "test A -o B": test 0 -eq "$(git ls-files -o | wc -l)" &&
> 	t6402-merge-rename.sh:865: error: "test A || test B" preferred to "test A -o B": test 0 -eq $(git ls-files -o | wc -l) &&

With: 7dbe8c8003, (check-non-portable-shell.pl: `wc -l` may have
leading WS, 2017-12-21)
Unless the situation has been changed, since.
I think those tests with quoted "$(.. | wc -l)" has been deemed
unportable and should be replaced with test_line_count anyway?
Does "test -eq" strip spaces from integer strings?

And I think we're working on moving "git" command to its own commmand
instead of put it in the left of a pipe.

2 followed patch will clean them out
Ævar Arnfjörð Bjarmason April 21, 2021, 2:18 p.m. UTC | #4
On Wed, Apr 21 2021, Đoàn Trần Công Danh wrote:

> On 2021-04-21 10:46:08+0200, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>> 
>> On Wed, Apr 21 2021, Junio C Hamano wrote:
>> 
>> > Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>> >
>> >> These will only match the simplistic forms of `test -X blah` (where
>> >> "-X" is some single letter option), but will miss expressions such as
>> >> `test "$foo" = bar`. We stop at "&" or "|" to try not to overmatch
>> >> things like:
>> >>
>> >>     test whatever && ls -a foo
>> >>     test whatever && foo -o outfile
>> >
>> > I still do not understand why you have to insist on dashed operator
>> > as the first thing given to "test", like this:
>> >
>> >> +	/\btest\s+-[a-z]\s+[^&|]+\s+-a\s+/ and err '"test A && test B" ...
>> >> +	/\btest\s+-[a-z]\s+[^&|]+\s+-o\s+/ and err '"test A || test B" ...
>> >
>> > IOW, what over-matching would we get if we simplified the condition
>> > like so?
>> >
>> >     /\btest\s+[^&|]+\s+-a\s/
>> >     /\btest\s+[^&|]+\s+-o\s/
>> >
>> > The one in the patch would miss things like
>> >
>> > 	test "$a" = "$b" -o "$a" -lt "$b"
>> > 	test "$n" -a "$n" -lt 4
>> >
>> > but the only thing that we care about is that a command that started
>> > with "test " has "-a" or "-o" before we see "&" or "|", no?
>> 
>> Applying your suggestion results in these false positives:
>> 	
>> 	t4038-diff-combined.sh:135: error: "test A && test B" preferred to "test A -a B": git commit -m "test space change" -a &&
>> 	t4038-diff-combined.sh:147: error: "test A && test B" preferred to "test A -a B": git commit -m "test other space changes" -a &&
>> 	t6400-merge-df.sh:89: error: "test A || test B" preferred to "test A -o B": test 0 -eq $(git ls-files -o | wc -l)
>> 	t6400-merge-df.sh:91: error: "test A || test B" preferred to "test A -o B": test 1 -eq $(git ls-files -o | wc -l)
>> 	t6400-merge-df.sh:110: error: "test A || test B" preferred to "test A -o B": test 0 -eq $(git ls-files -o | wc -l)
>> 	t6400-merge-df.sh:112: error: "test A || test B" preferred to "test A -o B": test 1 -eq $(git ls-files -o | wc -l)
>> 	t6402-merge-rename.sh:639: error: "test A || test B" preferred to "test A -o B": test 0 -eq "$(git ls-files -o | wc -l)"
>> 	t6402-merge-rename.sh:646: error: "test A || test B" preferred to "test A -o B": test 2 -eq "$(git ls-files -o | wc -l)"
>> 	t6402-merge-rename.sh:686: error: "test A || test B" preferred to "test A -o B": test 0 -eq "$(git ls-files -o | wc -l)" &&
>> 	t6402-merge-rename.sh:865: error: "test A || test B" preferred to "test A -o B": test 0 -eq $(git ls-files -o | wc -l) &&
>
> With: 7dbe8c8003, (check-non-portable-shell.pl: `wc -l` may have
> leading WS, 2017-12-21)
> Unless the situation has been changed, since.
> I think those tests with quoted "$(.. | wc -l)" has been deemed
> unportable and should be replaced with test_line_count anyway?
> Does "test -eq" strip spaces from integer strings?
>
> And I think we're working on moving "git" command to its own commmand
> instead of put it in the left of a pipe.
>
> 2 followed patch will clean them out

I think those patches are good in their own right, i.e. replacing things
with more incremental helpers and test_cmp-like functions.

But I believe the code you're changing is not non-portable. It was using
the output of "wc -l" with the "=" operator that wasn't portable.

These ones are all occurances that use "-eq".

And:

    test "0" -eq " 0"

etc., is true, which is why these pass on OSX and beyond.
Junio C Hamano April 21, 2021, 4:32 p.m. UTC | #5
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>>> 	t6402-merge-rename.sh:865: error: "test A || test B" preferred to "test A -o B": test 0 -eq $(git ls-files -o | wc -l) &&
>>
>> With: 7dbe8c8003, (check-non-portable-shell.pl: `wc -l` may have
> ...
> I think those patches are good in their own right, i.e. replacing things
> with more incremental helpers and test_cmp-like functions.
>
> But I believe the code you're changing is not non-portable. It was using
> the output of "wc -l" with the "=" operator that wasn't portable.

Not necessarily.  The last one shown above

	test 0 -eq $(git ls-files -o | wc -l) &&

is still valid and correct with s/-eq/=/.  It lets shell to remove
excess whitespaces around output from some implementations of "wc",
so textual '=' is perfectly fine.

> These ones are all occurances that use "-eq".
>
> And:
>
>     test "0" -eq " 0"
>
> etc., is true, which is why these pass on OSX and beyond.

Perhaps, but I am not sure how portable it is to rely on "-eq"
ignoring the leading whitespaces.

A more conventional (read: time tested) way is to $(... | wc -l)
*NOT* enclosed inside dq, so that the shell will strip out any
excess spaces around it, and use '=', i.e.

    test 3 = $(... | wc -l)

and that is what we end up evaluating when we write

    test_line_count = 3 file

This does rely on that "wc -l" will run and produce _some_ output to
be syntactically correct, though.
Junio C Hamano April 21, 2021, 6:56 p.m. UTC | #6
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> Applying your suggestion results in these false positives:
> 	
> 	t4038-diff-combined.sh:135: error: "test A && test B" preferred to "test A -a B": git commit -m "test space change" -a &&

This is utterly amusing.  Consider the suggestion withdrawn.
diff mbox series

Patch

diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl
index fd3303552b..894aa0a4f9 100755
--- a/t/check-non-portable-shell.pl
+++ b/t/check-non-portable-shell.pl
@@ -41,6 +41,8 @@  sub err {
 	/^\s*declare\s+/ and err 'arrays/declare not portable';
 	/^\s*[^#]\s*which\s/ and err 'which is not portable (use type)';
 	/\btest\s+[^=]*==/ and err '"test a == b" is not portable (use =)';
+	/\btest\s+-[a-z]\s+[^&|]+\s+-a\s+/ and err '"test A && test B" preferred to "test A -a B"';
+	/\btest\s+-[a-z]\s+[^&|]+\s+-o\s+/ and err '"test A || test B" preferred to "test A -o B"';
 	/\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use test_line_count)';
 	/\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes BYTES <file >out)';
 	/(?:\$\(seq|^\s*seq\b)/ and err 'seq is not portable (use test_seq)';