diff mbox series

[1/2] test-lib: allow test snippets as here-docs

Message ID YHDVAxxKDzfTlq3h@coredump.intra.peff.net (mailing list archive)
State New, archived
Headers show
Series here-doc test bodies | expand

Commit Message

Jeff King April 9, 2021, 10:28 p.m. UTC
Most test snippets are wrapped in single quotes, like:

  test_expect_success 'some description' '
          do_something
  '

This sometimes makes the snippets awkward to write, because you can't
easily use single quotes. We sometimes work around this with $SQ, or by
loosening regexes to use "." instead of a literal quote, or by using
double quotes when we'd prefer to use single-quotes (and just adding
extra backslash-escapes to avoid interpolation).

This commit adds another option: feeding the snippet on the function's
stdin. This doesn't conflict with anything the snippet would want to do,
because we always redirect its stdin from /dev/null anyway (which we'll
continue to do).

A few notes on the implementation:

  - it would be nice to push this down into test_run_, but we can't, as
    test_expect_success and test_expect_failure want to see the actual
    script content to report it for verbose-mode. A helper function
    limits the amount of duplication in those callers here.

  - The helper function is a little awkward to call, as you feed it the
    name of the variable you want to set. The more natural thing in
    shell would be command substitution like:

      body=$(body_or_stdin "$2")

    but that loses trailing whitespace. There are tricks around this,
    like:

      body=$(body_or_stdin "$2"; printf '.)
      body=${body%.}

    but we'd prefer to keep such tricks in the helper, not in each
    caller.

  - I implemented the helper using a sequence of "read" calls. Together
    with "-r" and unsetting the IFS, this preserves incoming whitespace.
    An alternative is to use "cat" (which then requires the gross "."
    trick above). But this saves us a process, which is probably a good
    thing. The "read" builtin does use more read() syscalls than
    necessary (one per byte), but that is almost certainly a win over a
    separate process.

    Both are probably slower than passing a single-quoted string, but
    the difference is lost in the noise for a script that I converted as
    an experiment.

  - I handle test_expect_success and test_expect_failure here. If we
    like this style, we could easily extend it to other spots (e.g.,
    lazy_prereq bodies) on top of this patch.

  - even though we are using "local", we have to be careful about our
    variable names. Within test_expect_success, any variable we declare
    with local will be seen by the test snippets themselves (so it won't
    persist between tests like normal variables would).

Signed-off-by: Jeff King <peff@peff.net>
---
 t/README                |  8 ++++++++
 t/test-lib-functions.sh | 30 ++++++++++++++++++++++++++----
 2 files changed, 34 insertions(+), 4 deletions(-)

Comments

Jeff King April 9, 2021, 10:30 p.m. UTC | #1
On Fri, Apr 09, 2021 at 06:28:19PM -0400, Jeff King wrote:

> +   If <script> is `-` (a single dash), then the script to run is read
> +   from stdin. This lets you more easily use single quotes within the
> +   script by using a here-doc. For example:
> +
> +        test_expect_success 'output contains expected string' <<-\EOT
> +	        grep "this string has 'quotes' in it" output
> +	EOT

Whoops, this should have an extra "-", of course (I got this wrong in
the cover letter, too). I.e.:

  test_expect_success 'whatever' - <<\EOT
     ...
  EOT

It would be nice to drop it, but then:

  test_expect_success PREREQ 'whatever' <<\EOT

becomes ambiguous (and I don't think there is a portable and reliable
way to decide that our input is coming from a here-doc versus the
original stdin).

-Peff
Junio C Hamano April 9, 2021, 10:56 p.m. UTC | #2
Jeff King <peff@peff.net> writes:

> +   If <script> is `-` (a single dash), then the script to run is read
> +   from stdin. This lets you more easily use single quotes within the
> +   script by using a here-doc. For example:
> +
> +        test_expect_success 'output contains expected string' <<-\EOT

Missing '-'?

> +	        grep "this string has 'quotes' in it" output
> +	EOT
> +
> ...
> +	# start with a newline, to match hanging newline from open-quote style
> +	eval "$1=\$LF"
> +	local test_line
> +	while IFS= read -r test_line
> +	do
> +		eval "$1=\${$1}\${test_line}\${LF}"
> +	done

I wonder if we can do this without relying on "read -r" (which I
distrust, perhaps out of superstition)?  Perhaps by slurping the
whole thing with "$(cat)"?

Thanks.
Junio C Hamano April 10, 2021, 12:57 a.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

>> +	# start with a newline, to match hanging newline from open-quote style
>> +	eval "$1=\$LF"
>> +	local test_line
>> +	while IFS= read -r test_line
>> +	do
>> +		eval "$1=\${$1}\${test_line}\${LF}"
>> +	done
>
> I wonder if we can do this without relying on "read -r" (which I
> distrust, perhaps out of superstition)?  Perhaps by slurping the
> whole thing with "$(cat)"?

Meaning, something along this line...

----- >8 --------- >8 --------- >8 --------- >8 ----
#!/bin/sh
LF='
'
ttt () {
	eval "$1"='$LF$(cat)'
}
ttt script <<\EOT
	a
	b
EOT
echo "<<<$script>>>"
----- 8< --------- 8< --------- 8< --------- 8< ----
Jeff King April 10, 2021, 1:26 a.m. UTC | #4
On Fri, Apr 09, 2021 at 05:57:10PM -0700, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> >> +	# start with a newline, to match hanging newline from open-quote style
> >> +	eval "$1=\$LF"
> >> +	local test_line
> >> +	while IFS= read -r test_line
> >> +	do
> >> +		eval "$1=\${$1}\${test_line}\${LF}"
> >> +	done
> >
> > I wonder if we can do this without relying on "read -r" (which I
> > distrust, perhaps out of superstition)?  Perhaps by slurping the
> > whole thing with "$(cat)"?
> 
> Meaning, something along this line...
> 
> ----- >8 --------- >8 --------- >8 --------- >8 ----
> #!/bin/sh
> LF='
> '
> ttt () {
> 	eval "$1"='$LF$(cat)'
> }
> ttt script <<\EOT
> 	a
> 	b
> EOT
> echo "<<<$script>>>"
> ----- 8< --------- 8< --------- 8< --------- 8< ----

I wrote it using cat initially. If you want to preserve trailing
whitespace, you have to do something like:

  val=$(printf '\n'
        cat
	printf '.')
  val=${val%.}

I was worried about adding the overhead of the extra subshell and
process for the command substitution, but perhaps that is overblown.
TBH, worrying about whitespace may be overblown, too. Some test snippets
have extra blank lines at the end, but possibly we should just not care.

I imagine "read -r" does not work reliably for binary junk, but keep in
mind that we are talking about text shell script snippets (that are
already in shell strings anyway).

-Peff
Ævar Arnfjörð Bjarmason April 10, 2021, 8:30 a.m. UTC | #5
On Sat, Apr 10 2021, Jeff King wrote:

> Most test snippets are wrapped in single quotes, like:
>
>   test_expect_success 'some description' '
>           do_something
>   '
>
> This sometimes makes the snippets awkward to write, because you can't
> easily use single quotes. We sometimes work around this with $SQ, or by
> loosening regexes to use "." instead of a literal quote, or by using
> double quotes when we'd prefer to use single-quotes (and just adding
> extra backslash-escapes to avoid interpolation).
>
> This commit adds another option: feeding the snippet on the function's
> stdin. This doesn't conflict with anything the snippet would want to do,
> because we always redirect its stdin from /dev/null anyway (which we'll
> continue to do).

I like this, and not having to write $SQ, '"'"' etc.

> A few notes on the implementation:
>
>   - it would be nice to push this down into test_run_, but we can't, as
>     test_expect_success and test_expect_failure want to see the actual
>     script content to report it for verbose-mode. A helper function
>     limits the amount of duplication in those callers here.

I've got an unsubmitted series (a bigger part of the -V rewrite) which
conflicted with this one, because I'd moved that message into those
helper functions.

But in that case I end up having to have this in
test_expect_{success,failure} anyway, because the change I had was to
move it into test_{ok,failure}_, i.e. to color the emitted body under
verbose differently depending on test ok/failure (which means deferring
the "this is our test body" until after the run).

It got slightly awkward because before I could pass "$@" to those (they
pass "$1" now), but with your change there's a "-" left on the argument
list, so we need to pass "$1" and "$test_body".

Anyway, it's no problem, just musings on having re-arranged this code
you're pointing out needs/could be re-arranged.

Maybe it would be easier to pass test_run arguments saying whether we
expect failure or not, and then move the whole if/else after it into its
own body. It already takes the "expecting_failure" parameter, so this on
top of master:

	diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
	index 6348e8d733..9e20bd607d 100644
	--- a/t/test-lib-functions.sh
	+++ b/t/test-lib-functions.sh
	@@ -611,8 +611,7 @@ test_expect_failure () {
	 	export test_prereq
	 	if ! test_skip "$@"
	 	then
	-		say >&3 "checking known breakage of $TEST_NUMBER.$test_count '$1': $2"
	-		if test_run_ "$2" expecting_failure
	+		if test_run_ "$1" "$2" expecting_failure
	 		then
	 			test_known_broken_ok_ "$1"
	 		else
	@@ -631,8 +630,7 @@ test_expect_success () {
	 	export test_prereq
	 	if ! test_skip "$@"
	 	then
	-		say >&3 "expecting success of $TEST_NUMBER.$test_count '$1': $2"
	-		if test_run_ "$2"
	+		if test_run_ "$1" "$2"
	 		then
	 			test_ok_ "$1"
	 		else
	diff --git a/t/test-lib.sh b/t/test-lib.sh
	index d3f6af6a65..5a1192e80c 100644
	--- a/t/test-lib.sh
	+++ b/t/test-lib.sh
	@@ -935,9 +935,20 @@ test_eval_ () {
	 }
	 
	 test_run_ () {
	+	local description
	+	description="$1"
	+	shift
	+
	 	test_cleanup=:
	 	expecting_failure=$2
	 
	+	if test -n "$expecting_failure"
	+	then
	+		say >&3 "checking known breakage of $TEST_NUMBER.$test_count '$description': $1"
	+	else
	+		say >&3 "expecting success of $TEST_NUMBER.$test_count '$description': $1"
	+	fi
	+
	 	if test "${GIT_TEST_CHAIN_LINT:-1}" != 0; then
	 		# turn off tracing for this test-eval, as it simply creates
	 		# confusing noise in the "-x" output

... or maybe not, but in any case, if the verbose mode was what was
stopping you from moving this down to "test_run_" just that seems like
an easy change.

I like your current implementation better, i.e. to have the stdin
consumption happen ASAP and have the others be low-level utility
functions, but I don't care much, but if you wanted it the other way
maybe the above diff helps.
	
>   - The helper function is a little awkward to call, as you feed it the
>     name of the variable you want to set. The more natural thing in
>     shell would be command substitution like:
>
>       body=$(body_or_stdin "$2")
>
>     but that loses trailing whitespace. There are tricks around this,
>     like:
>
>       body=$(body_or_stdin "$2"; printf '.)
>       body=${body%.}
>
>     but we'd prefer to keep such tricks in the helper, not in each
>     caller.

I see why you did this, and for a narrow change it's a good thing.

FWIW having spent some more time on making the TAP format more pruttah
in a parallel WIP series I think this is ultimately a losing
game. You're inserting the extra LF because you don't want to have the
"checking..." and the first line of the test body on the same line;

But we have all of:

    test_expect_success 'foo' 'true'
    test_expect_success 'foo' '
        true
    '

And now:

    test_expect_success 'foo' - <<\EOT
        true
    '

So if (definitely not needed for your change) wanted to always make this
pretty/indented we'd need to push that logic down to the formatter,
which would insert a leading LF and/or indentation as appropriate.

I just declared that if you use the first form you don't get
indentation :)

>   - I implemented the helper using a sequence of "read" calls. Together
>     with "-r" and unsetting the IFS, this preserves incoming whitespace.
>     An alternative is to use "cat" (which then requires the gross "."
>     trick above). But this saves us a process, which is probably a good
>     thing. The "read" builtin does use more read() syscalls than
>     necessary (one per byte), but that is almost certainly a win over a
>     separate process.
>
>     Both are probably slower than passing a single-quoted string, but
>     the difference is lost in the noise for a script that I converted as
>     an experiment.
>
>   - I handle test_expect_success and test_expect_failure here. If we
>     like this style, we could easily extend it to other spots (e.g.,
>     lazy_prereq bodies) on top of this patch.
>
>   - even though we are using "local", we have to be careful about our
>     variable names. Within test_expect_success, any variable we declare
>     with local will be seen by the test snippets themselves (so it won't
>     persist between tests like normal variables would).
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
diff mbox series

Patch

diff --git a/t/README b/t/README
index fd9375b146..a234c87792 100644
--- a/t/README
+++ b/t/README
@@ -755,6 +755,14 @@  library for your script to use.
 	    'git-write-tree should be able to write an empty tree.' \
 	    'tree=$(git-write-tree)'
 
+   If <script> is `-` (a single dash), then the script to run is read
+   from stdin. This lets you more easily use single quotes within the
+   script by using a here-doc. For example:
+
+        test_expect_success 'output contains expected string' <<-\EOT
+	        grep "this string has 'quotes' in it" output
+	EOT
+
    If you supply three parameters the first will be taken to be a
    prerequisite; see the test_set_prereq and test_have_prereq
    documentation below:
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 6348e8d733..3c8081b256 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -602,6 +602,24 @@  test_verify_prereq () {
 	BUG "'$test_prereq' does not look like a prereq"
 }
 
+# assign the variable named by "$1" with the contents of "$2";
+# if "$2" is "-", then read stdin into "$1" instead
+test_body_or_stdin () {
+	if test "$2" != "-"
+	then
+		eval "$1=\$2"
+		return
+	fi
+
+	# start with a newline, to match hanging newline from open-quote style
+	eval "$1=\$LF"
+	local test_line
+	while IFS= read -r test_line
+	do
+		eval "$1=\${$1}\${test_line}\${LF}"
+	done
+}
+
 test_expect_failure () {
 	test_start_
 	test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
@@ -611,8 +629,10 @@  test_expect_failure () {
 	export test_prereq
 	if ! test_skip "$@"
 	then
-		say >&3 "checking known breakage of $TEST_NUMBER.$test_count '$1': $2"
-		if test_run_ "$2" expecting_failure
+		local test_body
+		test_body_or_stdin test_body "$2"
+		say >&3 "checking known breakage of $TEST_NUMBER.$test_count '$1': $test_body"
+		if test_run_ "$test_body" expecting_failure
 		then
 			test_known_broken_ok_ "$1"
 		else
@@ -631,8 +651,10 @@  test_expect_success () {
 	export test_prereq
 	if ! test_skip "$@"
 	then
-		say >&3 "expecting success of $TEST_NUMBER.$test_count '$1': $2"
-		if test_run_ "$2"
+		local test_body
+		test_body_or_stdin test_body "$2"
+		say >&3 "expecting success of $TEST_NUMBER.$test_count '$1': $test_body"
+		if test_run_ "$test_body"
 		then
 			test_ok_ "$1"
 		else