diff mbox series

[1/4] t3430: modernize one-shot "VAR=val shell-func" invocation

Message ID 20240722065915.80760-2-ericsunshine@charter.net (mailing list archive)
State Superseded
Headers show
Series improve one-shot variable detection with shell function | expand

Commit Message

Eric Sunshine July 22, 2024, 6:59 a.m. UTC
From: Eric Sunshine <sunshine@sunshineco.com>

Unlike "VAR=val cmd" one-shot environment variable assignments which
exist only for the invocation of 'cmd', those assigned by "VAR=val
shell-func" exist within the running shell and continue to do so until
the process exits (or are explicitly unset). check-non-portable-shell.pl
warns when it detects such usage since, more often than not, the author
who writes such an invocation is unaware of the undesirable behavior.

A common way to work around the problem is to wrap a subshell around the
variable assignments and function call, thus ensuring that the
assignments are short-lived. However, these days, a more ergonomic
approach is to employ test_env() which is tailor-made for this specific
use-case.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/t3430-rebase-merges.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Phillip Wood July 22, 2024, 3:09 p.m. UTC | #1
Hi Eric

On 22/07/2024 07:59, Eric Sunshine wrote:
> From: Eric Sunshine <sunshine@sunshineco.com>
> 
> Unlike "VAR=val cmd" one-shot environment variable assignments which
> exist only for the invocation of 'cmd', those assigned by "VAR=val
> shell-func" exist within the running shell and continue to do so until
> the process exits (or are explicitly unset).

I'm not sure I follow. If I run

sh -c 'f() {
     echo "f: HELLO=$HELLO"
     env | grep HELLO
}
HELLO=x f; echo "HELLO=$HELLO"'

Then I see

f: HELLO=x
HELLO=x
HELLO=

which seems to contradict the commit message as $HELLO is unset when the 
function returns. I see the same result if I replace "sh" (which is bash 
on my system) with an explicit "bash", "dash" or "zsh".

I'm also confused as to why this caused a problem for Rubén's test as 
$HELLO is set in the environment so I'm don't understand why git wasn't 
picking up the right pager.

> check-non-portable-shell.pl
> warns when it detects such usage since, more often than not, the author
> who writes such an invocation is unaware of the undesirable behavior.
> 
> A common way to work around the problem is to wrap a subshell around the
> variable assignments and function call, thus ensuring that the
> assignments are short-lived. However, these days, a more ergonomic
> approach is to employ test_env() which is tailor-made for this specific
> use-case.

Oh, that sounds useful, I didn't know it existed.

Best Wishes

Phillip

> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
> ---
>   t/t3430-rebase-merges.sh | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
> index 36ca126bcd..e851ede4f9 100755
> --- a/t/t3430-rebase-merges.sh
> +++ b/t/t3430-rebase-merges.sh
> @@ -392,8 +392,8 @@ test_expect_success 'refuse to merge ancestors of HEAD' '
>   
>   test_expect_success 'root commits' '
>   	git checkout --orphan unrelated &&
> -	(GIT_AUTHOR_NAME="Parsnip" GIT_AUTHOR_EMAIL="root@example.com" \
> -	 test_commit second-root) &&
> +	test_env GIT_AUTHOR_NAME="Parsnip" GIT_AUTHOR_EMAIL="root@example.com" \
> +		test_commit second-root &&
>   	test_commit third-root &&
>   	cat >script-from-scratch <<-\EOF &&
>   	pick third-root
Junio C Hamano July 22, 2024, 6:24 p.m. UTC | #2
Eric Sunshine <ericsunshine@charter.net> writes:

> A common way to work around the problem is to wrap a subshell around the
> variable assignments and function call, thus ensuring that the
> assignments are short-lived. However, these days, a more ergonomic
> approach is to employ test_env() which is tailor-made for this specific
> use-case.

OK.  I am not sure if that is "ergonomic", though.  An explict
subshell has a good documentation value that even though we call
test_commit there, we do not care about the committer timestamps
subsequent commits would record, and we do not mind losing the
effect of test_tick from this invocation of test_commit.  Hiding
all of that behind test_env loses the documentation value.

We could resurrect it by explicitly passing "--no-tick" to
test_commit, though ;-).

> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
> ---
>  t/t3430-rebase-merges.sh | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
> index 36ca126bcd..e851ede4f9 100755
> --- a/t/t3430-rebase-merges.sh
> +++ b/t/t3430-rebase-merges.sh
> @@ -392,8 +392,8 @@ test_expect_success 'refuse to merge ancestors of HEAD' '
>  
>  test_expect_success 'root commits' '
>  	git checkout --orphan unrelated &&
> -	(GIT_AUTHOR_NAME="Parsnip" GIT_AUTHOR_EMAIL="root@example.com" \
> -	 test_commit second-root) &&
> +	test_env GIT_AUTHOR_NAME="Parsnip" GIT_AUTHOR_EMAIL="root@example.com" \
> +		test_commit second-root &&
>  	test_commit third-root &&
>  	cat >script-from-scratch <<-\EOF &&
>  	pick third-root
Phillip Wood July 23, 2024, 9:26 a.m. UTC | #3
On 22/07/2024 16:09, Phillip Wood wrote:
> Hi Eric
> 
> On 22/07/2024 07:59, Eric Sunshine wrote:
>> From: Eric Sunshine <sunshine@sunshineco.com>
>>
>> Unlike "VAR=val cmd" one-shot environment variable assignments which
>> exist only for the invocation of 'cmd', those assigned by "VAR=val
>> shell-func" exist within the running shell and continue to do so until
>> the process exits (or are explicitly unset).

Having seen the parallel discussion about the behavior of hash this 
construct is non-portable because the behavior differs between shells so 
perhaps the commit message could say something like

Unlike "VAR=val cmd" one-shot environment variable assignments which 
only exist for the invocation of external command "cmd", the behavior of 
"VAR=val func" where "func" is a shell function or builtin command 
varies between shells and so we should not use it in our test suite.

Best Wishes

Phillip

> I'm not sure I follow. If I run
> 
> sh -c 'f() {
>      echo "f: HELLO=$HELLO"
>      env | grep HELLO
> }
> HELLO=x f; echo "HELLO=$HELLO"'
> 
> Then I see
> 
> f: HELLO=x
> HELLO=x
> HELLO=
> 
> which seems to contradict the commit message as $HELLO is unset when the 
> function returns. I see the same result if I replace "sh" (which is bash 
> on my system) with an explicit "bash", "dash" or "zsh".
> 
> I'm also confused as to why this caused a problem for Rubén's test as 
> $HELLO is set in the environment so I'm don't understand why git wasn't 
> picking up the right pager.
> 
>> check-non-portable-shell.pl
>> warns when it detects such usage since, more often than not, the author
>> who writes such an invocation is unaware of the undesirable behavior.
>>
>> A common way to work around the problem is to wrap a subshell around the
>> variable assignments and function call, thus ensuring that the
>> assignments are short-lived. However, these days, a more ergonomic
>> approach is to employ test_env() which is tailor-made for this specific
>> use-case.
> 
> Oh, that sounds useful, I didn't know it existed.
> 
> Best Wishes
> 
> Phillip
> 
>> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
>> ---
>>   t/t3430-rebase-merges.sh | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
>> index 36ca126bcd..e851ede4f9 100755
>> --- a/t/t3430-rebase-merges.sh
>> +++ b/t/t3430-rebase-merges.sh
>> @@ -392,8 +392,8 @@ test_expect_success 'refuse to merge ancestors of 
>> HEAD' '
>>   test_expect_success 'root commits' '
>>       git checkout --orphan unrelated &&
>> -    (GIT_AUTHOR_NAME="Parsnip" GIT_AUTHOR_EMAIL="root@example.com" \
>> -     test_commit second-root) &&
>> +    test_env GIT_AUTHOR_NAME="Parsnip" 
>> GIT_AUTHOR_EMAIL="root@example.com" \
>> +        test_commit second-root &&
>>       test_commit third-root &&
>>       cat >script-from-scratch <<-\EOF &&
>>       pick third-root
Eric Sunshine July 26, 2024, 6:15 a.m. UTC | #4
On Mon, Jul 22, 2024 at 11:10 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> On 22/07/2024 07:59, Eric Sunshine wrote:
> > Unlike "VAR=val cmd" one-shot environment variable assignments which
> > exist only for the invocation of 'cmd', those assigned by "VAR=val
> > shell-func" exist within the running shell and continue to do so until
> > the process exits (or are explicitly unset).
>
> I'm not sure I follow. If I run
>
> sh -c 'f() {
>      echo "f: HELLO=$HELLO"
>      env | grep HELLO
> }
> HELLO=x f; echo "HELLO=$HELLO"'
>
> Then I see
>
> f: HELLO=x
> HELLO=x
> HELLO=
>
> which seems to contradict the commit message as $HELLO is unset when the
> function returns. I see the same result if I replace "sh" (which is bash
> on my system) with an explicit "bash", "dash" or "zsh".

I believe downstream discussion[1][2] established that the behavior is
inconsistent between various shells and versions of shells, and is
considered undefined by POSIX.

[1]: https://lore.kernel.org/git/xmqq34o1cn6b.fsf@gitster.g/
[2]: https://lore.kernel.org/git/xmqqbk2p9lwi.fsf_-_@gitster.g/

> I'm also confused as to why this caused a problem for Rubén's test as
> $HELLO is set in the environment so I'm don't understand why git wasn't
> picking up the right pager.

Junio summarized the problem and explanation[3].

[3]: https://lore.kernel.org/git/xmqq7cdd9l0m.fsf@gitster.g/

> > A common way to work around the problem is to wrap a subshell around the
> > variable assignments and function call, thus ensuring that the
> > assignments are short-lived. However, these days, a more ergonomic
> > approach is to employ test_env() which is tailor-made for this specific
> > use-case.
>
> Oh, that sounds useful, I didn't know it existed.

I didn't know about it either, and only discovered it upon my initial
attempt at making check-non-portable-shell.pl recognize the case Rubén
identified, at which point it started showing false-positives on
`test_env` invocations. Actually, considering that I was involved[4]
in the conversation which led to the introduction[5] of `test_env` by
Peff, it may be that I did know about it but forgot.

[4]: https://lore.kernel.org/git/CAPig+cR989yU4+JNTFREaeXqY61nusUOhufeBGGVCi29tR1P5w@mail.gmail.com/
[5]: https://lore.kernel.org/git/20160601070425.GA13648@sigill.intra.peff.net/
Eric Sunshine July 26, 2024, 6:30 a.m. UTC | #5
On Mon, Jul 22, 2024 at 2:24 PM Junio C Hamano <gitster@pobox.com> wrote:
> Eric Sunshine <ericsunshine@charter.net> writes:
> > A common way to work around the problem is to wrap a subshell around the
> > variable assignments and function call, thus ensuring that the
> > assignments are short-lived. However, these days, a more ergonomic
> > approach is to employ test_env() which is tailor-made for this specific
> > use-case.
>
> OK.  I am not sure if that is "ergonomic", though.  An explict
> subshell has a good documentation value that even though we call
> test_commit there, we do not care about the committer timestamps
> subsequent commits would record, and we do not mind losing the
> effect of test_tick from this invocation of test_commit.  Hiding
> all of that behind test_env loses the documentation value.
>
> We could resurrect it by explicitly passing "--no-tick" to
> test_commit, though ;-).

Your mention of `test_commit` reminded me that, these days, it accepts
an --author switch which seems tailor-made for what this chunk of code
in this test is actually checking: namely, that the root commit of the
orphan branch has "Parsnip <root@example.com>" as its author. So an
even simpler approach is to change it to:

    test_commit --author "Parsnip <root@example.com>" second-root &&

That conveys precisely that the test is interested in overriding the
author for that one commit.
Eric Sunshine July 26, 2024, 6:33 a.m. UTC | #6
On Tue, Jul 23, 2024 at 5:26 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> On 22/07/2024 16:09, Phillip Wood wrote:
> > On 22/07/2024 07:59, Eric Sunshine wrote:
> >> Unlike "VAR=val cmd" one-shot environment variable assignments which
> >> exist only for the invocation of 'cmd', those assigned by "VAR=val
> >> shell-func" exist within the running shell and continue to do so until
> >> the process exits (or are explicitly unset).
>
> Having seen the parallel discussion about the behavior of hash this
> construct is non-portable because the behavior differs between shells so
> perhaps the commit message could say something like
>
> Unlike "VAR=val cmd" one-shot environment variable assignments which
> only exist for the invocation of external command "cmd", the behavior of
> "VAR=val func" where "func" is a shell function or builtin command
> varies between shells and so we should not use it in our test suite.

Indeed, given all the subsequent discussion, it is now apparent that
multiple undesirable behaviors have been experienced, not just the one
mentioned by this commit message, and that POSIX states that the
behavior is undefined.
diff mbox series

Patch

diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index 36ca126bcd..e851ede4f9 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -392,8 +392,8 @@  test_expect_success 'refuse to merge ancestors of HEAD' '
 
 test_expect_success 'root commits' '
 	git checkout --orphan unrelated &&
-	(GIT_AUTHOR_NAME="Parsnip" GIT_AUTHOR_EMAIL="root@example.com" \
-	 test_commit second-root) &&
+	test_env GIT_AUTHOR_NAME="Parsnip" GIT_AUTHOR_EMAIL="root@example.com" \
+		test_commit second-root &&
 	test_commit third-root &&
 	cat >script-from-scratch <<-\EOF &&
 	pick third-root