diff mbox series

[6/6] t: teach lint that RHS of 'local VAR=VAL' needs to be quoted

Message ID 20240406000902.3082301-7-gitster@pobox.com (mailing list archive)
State Accepted
Commit 8bfe4861913843b6aac8656aabfd43ac405362e8
Series local VAR="VAL"

Commit Message

Junio C Hamano April 6, 2024, 12:09 a.m. UTC
Teach t/check-non-portable-shell.pl that the right-hand side of the
assignment done with "local VAR=VAL" needs to be quoted.  We
deliberately target only VAL that begins with $ so that we can catch

 - $variable_reference and positional parameter reference like $4
 - $(command substitution)
 - ${variable_reference-with_magic}

while excluding

 - $'\n' that is a bash-ism freely usable in t990[23]
 - $(( arithmetic )) whose result should be $IFS safe.
 - $? that also is $IFS safe

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/check-non-portable-shell.pl | 2 ++
 1 file changed, 2 insertions(+)
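
As a rough illustration of the two lists in the commit message, the sketch below re-expresses the Perl pattern added by this patch as a POSIX ERE and runs it over a few sample lines. The helper name `flags_local` and the sample lines are made up for this example, and `(^|[^A-Za-z0-9_])` only approximates Perl's `\b`; the real check is the one in t/check-non-portable-shell.pl.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a line would be flagged by an ERE
# approximation of the pattern this patch adds.
flags_local () {
	printf '%s\n' "$1" |
	grep -E -q '(^|[^A-Za-z0-9_])local[[:space:]]+[A-Za-z0-9_]*=\$([A-Za-z0-9_{]|[(][^(])'
}

# The quoted here-doc delimiter keeps "$" and "\" in the samples literal.
while IFS= read -r line
do
	if flags_local "$line"
	then
		echo "flagged: $line"
	else
		echo "ignored: $line"
	fi
done <<\EOF
local a=$4
local b=$(git rev-parse HEAD)
local c=${1-default}
local d=$'\n'
local e=$((1 + 2))
local f=$?
local g="$1"
EOF
```

With these samples, the first three lines should be reported as flagged and the remaining four as ignored, which mirrors the "catch" and "exclude" lists above.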

Comments

Jeff King April 7, 2024, 1:43 a.m. UTC | #1
On Fri, Apr 05, 2024 at 05:09:02PM -0700, Junio C Hamano wrote:

> Teach t/check-non-portable-shell.pl that right hand side of the
> assignment done with "local VAR=VAL" need to be quoted.  We
> deliberately target only VAL that begins with $ so that we can catch
> 
>  - $variable_reference and positional parameter reference like $4
>  - $(command substitution)
>  - ${variable_reference-with_magic}
> 
> while excluding
> 
>  - $'\n' that is a bash-ism freely usable in t990[23]
>  - $(( arithmetic )) whose result should be $IFS safe.
>  - $? that also is $IFS safe

Hmm. Just porting over my comment from the other thread (before I
realized you'd written this series), this misses:

  local foo=bar/$1

etc. Should we look for the "$" anywhere on the line? I doubt we can get
things foolproof, but requiring somebody to quote:

  local foo=$((1+2))

does not seem like the worst outcome. I dunno.

-Peff
Junio C Hamano April 8, 2024, 5:31 p.m. UTC | #2
Jeff King <peff@peff.net> writes:

> On Fri, Apr 05, 2024 at 05:09:02PM -0700, Junio C Hamano wrote:
>
>> Teach t/check-non-portable-shell.pl that right hand side of the
>> assignment done with "local VAR=VAL" need to be quoted.  We
>> deliberately target only VAL that begins with $ so that we can catch
>> 
>>  - $variable_reference and positional parameter reference like $4
>>  - $(command substitution)
>>  - ${variable_reference-with_magic}
>> 
>> while excluding
>> 
>>  - $'\n' that is a bash-ism freely usable in t990[23]
>>  - $(( arithmetic )) whose result should be $IFS safe.
>>  - $? that also is $IFS safe
>
> Hmm. Just porting over my comment from the other thread (before I
> realized you'd written this series), this misses:
>
>   local foo=bar/$1
>
> etc. Should we look for the "$" anywhere on the line? I doubt we can get
> things foolproof, but requiring somebody to quote:
>
>   local foo=$((1+2))
>
> does not seem like the worst outcome. I dunno.

Looking at the output from

    $ git grep -E -e 'local [a-zA-Z0-9_]+=[^"]*[$]' t/

the ones listed in the proposed commit log message are the false
positives.  Luckily we do not have anything that tries to
concatenate a parameter reference with something else.

But with the pattern we do miss

    local var=$*

and possibly many others.  So I am not sure.  The false positives
do look moderately bad, so I'd rather start with the simplest one
proposed in the patch.
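
To make the trade-off concrete, here is a small sketch that compares the broad pattern from the "git grep" command above with an ERE approximation of the narrower pattern in the patch. The helper `hits` and the sample lines are hypothetical, and the narrow ERE replaces Perl's `\b` and `\s` with rough POSIX equivalents, so treat it as an illustration rather than the actual lint.

```shell
#!/bin/sh
# ERE taken from the "git grep" command above: "$" anywhere in an unquoted RHS.
broad='local [a-zA-Z0-9_]+=[^"]*[$]'
# ERE approximation of the narrower Perl pattern in the patch (assumption:
# "\b" and "\s" are replaced with rough POSIX equivalents).
narrow='(^|[^A-Za-z0-9_])local[[:space:]]+[A-Za-z0-9_]*=\$([A-Za-z0-9_{]|[(][^(])'

# Hypothetical helper: print "yes" or "no" depending on whether $2 matches ERE $1.
hits () {
	if printf '%s\n' "$2" | grep -E -q -e "$1"
	then
		echo yes
	else
		echo no
	fi
}

while IFS= read -r line
do
	echo "broad=$(hits "$broad" "$line") narrow=$(hits "$narrow" "$line"): $line"
done <<\EOF
local h=bar/$1
local i=$*
local e=$((1 + 2))
local g="$1"
EOF
```

In this sketch the broad pattern matches the first three samples while the narrow one matches none of them: the first two are misses of the narrow pattern, and the third is one of the false positives the broad pattern would introduce. The quoted form on the last line is left alone by both.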
Jeff King April 8, 2024, 8:40 p.m. UTC | #3
On Mon, Apr 08, 2024 at 10:31:34AM -0700, Junio C Hamano wrote:

> > Hmm. Just porting over my comment from the other thread (before I
> > realized you'd written this series), this misses:
> >
> >   local foo=bar/$1
> >
> > etc. Should we look for the "$" anywhere on the line? I doubt we can get
> > things foolproof, but requiring somebody to quote:
> >
> >   local foo=$((1+2))
> >
> > does not seem like the worst outcome. I dunno.
> 
> Looking at the output from
> 
>     $ git grep -E -e 'local [a-zA-Z0-9_]+=[^"]*[$]' t/
> 
> the listed ones in the proposed commit log message are the false
> positives.  Luckily we didn't have anything that tries to
> concatenate parameter reference to something else.
> 
> But with the pattern we do miss
> 
>     local var=$*
> 
> and possibly many others.  So I am not sure.  The false positives
> do look moderately bad, so I'd rather start with the simplest one
> proposed in the patch.

Yeah, I think a regex is probably going to end up with either false
positives or false negatives. It probably does not matter too much which
way we err, if we expect them to be rare on either side.

My thinking was mostly that false negatives are worse, because they only
matter on old buggy versions of dash (and only if the tests actually
pass a value with spaces). And so most developers will not notice them
immediately. Whereas false positives, while annoying, are reported to
them immediately by the linter. And generally, dealing with problems
closer to the time of writing means less work overall.

But I am happy to take your series as-is and we can see which cases (if
any!) we miss in practice.

I do hope that eventually we could just say "that buggy version of dash
does not matter anymore", but I think it is too soon for that (it sounds
like it is still being used in CI).

-Peff
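
For readers wondering why a value with spaces matters, the following sketch shows the word splitting involved. It assumes, as the discussion implies, that the affected dash versions split the unquoted words after "local" the same way they would for any ordinary command; `count_words` is a hypothetical stand-in for such a command and is not part of the test suite.

```shell
#!/bin/sh
# Hypothetical stand-in for a command whose arguments are split like those
# of any ordinary command; it reports how many words it received.
count_words () {
	echo "$#"
}

val="two words"

# Unquoted: the expansion is split on $IFS, so two words arrive
# ("var=two" and "words").
count_words var=$val

# Quoted: the whole assignment stays a single word.
count_words var="$val"
```

With the default $IFS this prints 2 and then 1. Under the stated assumption, the unquoted form would make such a shell see a second, unintended word after the assignment.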

Patch

diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl
index dd8107cd7d..b2b28c2ced 100755
--- a/t/check-non-portable-shell.pl
+++ b/t/check-non-portable-shell.pl
@@ -47,6 +47,8 @@  sub err {
 	/\bgrep\b.*--file\b/ and err 'grep --file FILE is not portable (use grep -f FILE)';
 	/\b[ef]grep\b/ and err 'egrep/fgrep obsolescent (use grep -E/-F)';
 	/\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)';
+	/\blocal\s+[A-Za-z0-9_]*=\$([A-Za-z0-9_{]|[(][^(])/ and
+		err q(quote "$val" in 'local var=$val');
 	/^\s*([A-Z0-9_]+=(\w*|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and
 		err '"FOO=bar shell_func" assignment extends beyond "shell_func"';
 	$line = '';
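
For reference, a minimal sketch of the style the new check asks for, with every right-hand side of "local" that starts with "$" wrapped in double quotes. The function `describe_arg` is hypothetical and only serves as an example; "local" itself is not in POSIX, and the sketch assumes a shell that supports it, as the test suite already does.

```shell
#!/bin/sh
# Hypothetical helper written in the style the lint accepts: each
# right-hand side that begins with "$" is enclosed in double quotes.
describe_arg () {
	local first="$1" &&
	local upper="$(printf '%s' "$first" | tr a-z A-Z)" &&
	local fallback="${2-none}" &&
	echo "$first|$upper|$fallback"
}

describe_arg "two words"
```

This prints `two words|TWO WORDS|none`; the value containing a space survives intact because none of the expansions is left unquoted.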