[1/4] test-lib-functions: introduce test_line_count_cmd

Message ID	20210612042755.28342-2-congdanhqx@gmail.com (mailing list archive)
State	Superseded
Headers	show Return-Path: <git-owner@kernel.org> From: =?utf-8?b?xJBvw6BuIFRy4bqnbiBDw7RuZyBEYW5o?= <congdanhqx@gmail.com> To: git@vger.kernel.org Cc: =?utf-8?b?xJBvw6BuIFRy4bqnbiBDw7RuZyBEYW5o?= <congdanhqx@gmail.com>, Eric Sunshine <sunshine@sunshineco.com>, =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBC?= =?utf-8?b?amFybWFzb24=?= <avarab@gmail.com>, Junio C Hamano <gitster@pobox.com> Subject: [PATCH 1/4] test-lib-functions: introduce test_line_count_cmd Date: Sat, 12 Jun 2021 11:27:52 +0700 Message-Id: <20210612042755.28342-2-congdanhqx@gmail.com> In-Reply-To: <20210612042755.28342-1-congdanhqx@gmail.com> References: <20210612042755.28342-1-congdanhqx@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	t: new helper test_line_count_cmd \| expand [0/4] t: new helper test_line_count_cmd [1/4] test-lib-functions: introduce test_line_count_cmd [2/4] t6402: use find(1) builtin to filter instead of grep [3/4] t6400: use test_line_count_cmd to count # of lines in stdout [4/4] t6402: use test_line_count_cmd to count # of lines in stdout

Đoàn Trần Công Danh June 12, 2021, 4:27 a.m. UTC

In Git project, we have multiple occasions that requires checking number
of lines of text in stdout and/or stderr of a command. One of such
example is t6400, which checks number of files in various states.

Some of those commands are Git command, and we would like to check their
exit status.  In some of those checks, we pipe the stdout of those
commands to "wc -l" to check for line count, thus loosing the exit status.

Introduce a helper function to check for number of lines in stdout and
stderr from those commands.

This helper will create 2 temporary files in process, thus it may affect
output of some checks.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---

Notes:
    Theoretically, we could avoid those temporary files by this shenanigan:
    
    	! (
    	  test $(
    		(
    		  test $(
    			  ( "$@" || echo "'$*' run into failure" >&3) | wc -l
    			)
    			"$out_ops" "$out_val" ||
    		  echo "stdout: !$outop $outval '$*'" >&3
    		) 2>&1 | wc -l
    	  ) "$errop" "$errval" ||
    	  echo "stderr: !$errop $errval '$*'" >&3
    	) 3>&1 | grep .
    
    However, it looks too complicated.

 t/test-lib-functions.sh | 64 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

Eric Sunshine June 13, 2021, 3:10 a.m. UTC | #1

On Sat, Jun 12, 2021 at 12:28 AM Đoàn Trần Công Danh
<congdanhqx@gmail.com> wrote:
> In Git project, we have multiple occasions that requires checking number
> of lines of text in stdout and/or stderr of a command. One of such
> example is t6400, which checks number of files in various states.
>
> Some of those commands are Git command, and we would like to check their
> exit status.  In some of those checks, we pipe the stdout of those
> commands to "wc -l" to check for line count, thus loosing the exit status.
>
> Introduce a helper function to check for number of lines in stdout and
> stderr from those commands.
>
> This helper will create 2 temporary files in process, thus it may affect
> output of some checks.

If the presence of these files is a concern, I wonder if it would make
sense to turn these into dot-files (leading dot in name) or shove them
into the .git/ directory? (Not necessarily an actionable comment; just
tossing around some thoughts.)

> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> ---
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> @@ -817,6 +817,70 @@ test_line_count () {
> +# test_line_count_cmd checks the number of lines of captured stdout and/or
> +# stderr of a command.
> +#
> +# NOTE: this helper function will create 2 temporary files named:
> +# * test_line_count_cmd_.out; and
> +# * test_line_count_cmd_.err
> +#
> +# And this helper function will remove those 2 files if the check is succeed.
> +# In case of failure, those files will be preserved.
> +test_line_count_cmd () {

It would be good to have some usage information in the above comment
so that people aren't forced to consult the implementation to learn
what options this function takes. At minimum, it should mention
`--out`, `--err`, and `!`, and should explain the arguments each
option takes (even if just through examples).

> +       local outop outval
> +       local errop errval
> +
> +       while test $# -ge 3
> +       do
> +               case "$1" in
> +               --out)
> +                       outop="$2"
> +                       outval="$3"
> +                       ;;
> +               --err)
> +                       errop="$2"
> +                       errval="$3"
> +                       ;;
> +               *)
> +                       break
> +                       ;;
> +               esac
> +               shift 3
> +       done &&

This is really minor, but if test_line_count_cmd() ever learns some
new option and that option does not consume two arguments, then the
`shift 3` at the end of the `case/esac` will need to be adjusted in
some fashion. It might make more sense, therefore, to perform the
`shift 3` closer to where it is needed (that is, in the `--out` case
and in the `--err` case) rather than delaying it as is done here. (Not
necessarily worth a re-roll.)

Another minor comment: Since you're &&-chaining everything else in the
function, it wouldn't hurt to also do so for the `local` declarations
and the assignments within each `case` arm, and to chain `esac` with
`shift 3`. Doing so could help some future programmer who might (for
some reason) insert code above the `while` loop, thinking that a
failure in the new code will abort the function, but not realizing
that the &&-chain is not intact in this area of the code.

> +       if test $# = 0 ||
> +          { test "x$1" = "x!" && test $# = 1 ; }
> +       then
> +               BUG "test_line_count_cmd: no command to be run"
> +       fi &&
> +       if test -z "$outop$errop"
> +       then
> +               BUG "test_line_count_cmd: check which stream?"
> +       fi &&
> +
> +       if test "x$1" = "x!"
> +       then
> +               shift &&
> +               if "$@" >test_line_count_cmd_.out 2>test_line_count_cmd_.err
> +               then
> +                       echo "error: '$@' succeed!"
> +                       return 1
> +               fi
> +       elif ! "$@" >test_line_count_cmd_.out 2>test_line_count_cmd_.err
> +       then
> +               echo "error: '$@' run into failure!"
> +               return 1
> +       fi &&

Do we care that the "!" negated case doesn't have the same semantics
as test_must_fail()? If we do care, then should there be a way to tell
the function whether we want test_must_fail() semantics or `!`
semantics (i.e. whether we're running a Git command or a non-Git
command) or should it infer it on its own? (These are genuine
questions -- not necessarily requests for changes -- as I'm trying to
reason through the implications of this implementation choice.)

> +       if test -n "$outop"
> +       then
> +               test_line_count "$outop" "$outval" test_line_count_cmd_.out
> +       fi &&
> +       if test -n "$errop"
> +       then
> +               test_line_count "$errop" "$errval" test_line_count_cmd_.err
> +       fi &&
> +       rm -f test_line_count_cmd_.out test_line_count_cmd_.err
> +}

Đoàn Trần Công Danh June 13, 2021, 7:36 a.m. UTC | #2

On 2021-06-12 23:10:19-0400, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Sat, Jun 12, 2021 at 12:28 AM Đoàn Trần Công Danh
> <congdanhqx@gmail.com> wrote:
> > In Git project, we have multiple occasions that requires checking number
> > of lines of text in stdout and/or stderr of a command. One of such
> > example is t6400, which checks number of files in various states.
> >
> > Some of those commands are Git command, and we would like to check their
> > exit status.  In some of those checks, we pipe the stdout of those
> > commands to "wc -l" to check for line count, thus loosing the exit status.
> >
> > Introduce a helper function to check for number of lines in stdout and
> > stderr from those commands.
> >
> > This helper will create 2 temporary files in process, thus it may affect
> > output of some checks.
> 
> If the presence of these files is a concern, I wonder if it would make
> sense to turn these into dot-files (leading dot in name) or shove them
> into the .git/ directory? (Not necessarily an actionable comment; just
> tossing around some thoughts.)

The presence of those files is concern to some command likes:

* git ls-files -o; or
* ls -a

Shoving into ".git" would be actionable if we are testing inside a git
repository. Should we testing outside, we don't have that luxury.

I think we can accept that limitation (only allow this helper inside
a Git worktree to avoid surpising behavior).

> 
> > Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> > ---
> > diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> > @@ -817,6 +817,70 @@ test_line_count () {
> > +# test_line_count_cmd checks the number of lines of captured stdout and/or
> > +# stderr of a command.
> > +#
> > +# NOTE: this helper function will create 2 temporary files named:
> > +# * test_line_count_cmd_.out; and
> > +# * test_line_count_cmd_.err
> > +#
> > +# And this helper function will remove those 2 files if the check is succeed.
> > +# In case of failure, those files will be preserved.
> > +test_line_count_cmd () {
> 
> It would be good to have some usage information in the above comment
> so that people aren't forced to consult the implementation to learn
> what options this function takes. At minimum, it should mention
> `--out`, `--err`, and `!`, and should explain the arguments each
> option takes (even if just through examples).

Make sense!

> 
> > +       local outop outval
> > +       local errop errval
> > +
> > +       while test $# -ge 3
> > +       do
> > +               case "$1" in
> > +               --out)
> > +                       outop="$2"
> > +                       outval="$3"
> > +                       ;;
> > +               --err)
> > +                       errop="$2"
> > +                       errval="$3"
> > +                       ;;
> > +               *)
> > +                       break
> > +                       ;;
> > +               esac
> > +               shift 3
> > +       done &&
> 
> This is really minor, but if test_line_count_cmd() ever learns some
> new option and that option does not consume two arguments, then the
> `shift 3` at the end of the `case/esac` will need to be adjusted in
> some fashion. It might make more sense, therefore, to perform the
> `shift 3` closer to where it is needed (that is, in the `--out` case
> and in the `--err` case) rather than delaying it as is done here. (Not
> necessarily worth a re-roll.)

Make sense, too. I think I'll add another leg here for "-*" to avoid
potential break when adding/removing options. Not that I imagine about
new options, but I can't think of any command starts with "-"

> Another minor comment: Since you're &&-chaining everything else in the
> function, it wouldn't hurt to also do so for the `local` declarations
> and the assignments within each `case` arm, and to chain `esac` with
> `shift 3`. Doing so could help some future programmer who might (for
> some reason) insert code above the `while` loop, thinking that a
> failure in the new code will abort the function, but not realizing
> that the &&-chain is not intact in this area of the code.

Will do, too!

> > +       if test $# = 0 ||
> > +          { test "x$1" = "x!" && test $# = 1 ; }
> > +       then
> > +               BUG "test_line_count_cmd: no command to be run"
> > +       fi &&
> > +       if test -z "$outop$errop"
> > +       then
> > +               BUG "test_line_count_cmd: check which stream?"
> > +       fi &&
> > +
> > +       if test "x$1" = "x!"
> > +       then
> > +               shift &&
> > +               if "$@" >test_line_count_cmd_.out 2>test_line_count_cmd_.err
> > +               then
> > +                       echo "error: '$@' succeed!"
> > +                       return 1
> > +               fi
> > +       elif ! "$@" >test_line_count_cmd_.out 2>test_line_count_cmd_.err
> > +       then
> > +               echo "error: '$@' run into failure!"
> > +               return 1
> > +       fi &&
> 
> Do we care that the "!" negated case doesn't have the same semantics
> as test_must_fail()? If we do care, then should there be a way to tell
> the function whether we want test_must_fail() semantics or `!`
> semantics (i.e. whether we're running a Git command or a non-Git
> command) or should it infer it on its own? (These are genuine
> questions -- not necessarily requests for changes -- as I'm trying to
> reason through the implications of this implementation choice.)

I think that we shouldn't care, in my mind, I would let users to chain
test_line_count and test_must_fail if they do care about that sematic.
So, we have the freedom to use test_line_count_cmd with both Git and
non-Git commands.  E.g:

---------8<---------------
diff --git a/t/t0040-parse-options.sh b/t/t0040-parse-options.sh
index ad4746d899..94c8cfc9f2 100755
--- a/t/t0040-parse-options.sh
+++ b/t/t0040-parse-options.sh
@@ -288,9 +288,8 @@ test_expect_success 'OPT_CALLBACK() and OPT_BIT() work' '
 '
 
 test_expect_success 'OPT_CALLBACK() and callback errors work' '
-	test_must_fail test-tool parse-options --no-length >output 2>output.err &&
-	test_must_be_empty output &&
-	test_must_be_empty output.err
+	test_line_count_cmd --out = 0 --err = 0 \
+		test_must_fail test-tool parse-options --no-length
 '
 
 cat >expect <<\EOF
------------>8--------------
> 
> > +       if test -n "$outop"
> > +       then
> > +               test_line_count "$outop" "$outval" test_line_count_cmd_.out
> > +       fi &&
> > +       if test -n "$errop"
> > +       then
> > +               test_line_count "$errop" "$errval" test_line_count_cmd_.err
> > +       fi &&
> > +       rm -f test_line_count_cmd_.out test_line_count_cmd_.err
> > +}

Ævar Arnfjörð Bjarmason June 13, 2021, 1:28 p.m. UTC | #3

On Sat, Jun 12 2021, Đoàn Trần Công Danh wrote:

> +	   { test "x$1" = "x!" && test $# = 1 ; }
> [...]
> +	if test "x$1" = "x!" 

We don't use this test idiom in other places, it's OK to just use "$1" =
"!". I think we're past whatever portability concern made that popular
in some older shell code in the wild.

Ævar Arnfjörð Bjarmason June 13, 2021, 1:36 p.m. UTC | #4

On Sat, Jun 12 2021, Đoàn Trần Công Danh wrote:

> In Git project, we have multiple occasions that requires checking number
> of lines of text in stdout and/or stderr of a command. One of such
> example is t6400, which checks number of files in various states.

Thanks for following up on this.

> Some of those commands are Git command, and we would like to check their
> exit status.  In some of those checks, we pipe the stdout of those
> commands to "wc -l" to check for line count, thus loosing the exit status.
>
> Introduce a helper function to check for number of lines in stdout and
> stderr from those commands.
>
> This helper will create 2 temporary files in process, thus it may affect
> output of some checks.

I think it's fine to just blindly stick the name into a file in the CWD
as long as it's not "actual" or some other such obviously likely to
conflict name.

The convention you've picked here (I'm not sure if it's existing
already) of naming the temp files after the test lib function name is a
good one.

More generally speaking we have a bunch of helpers that have this
potential issue/bug, in practice it's not a big deal.  A test that's
being overly specific and doing a test_cmp on unbounded "find" output or
whatever is likely buggy anyway.

If it ever becomes a bigger issue we can easily set up two scratch
directories during the test, one for the use of the test itself, and one
for the internals of the test run itself.

> +# test_line_count_cmd checks the number of lines of captured stdout and/or
> +# stderr of a command.
> +#
> +# NOTE: this helper function will create 2 temporary files named:
> +# * test_line_count_cmd_.out; and
> +# * test_line_count_cmd_.err
> +#
> +# And this helper function will remove those 2 files if the check is succeed.
> +# In case of failure, those files will be preserved.
> +test_line_count_cmd () {
> +	local outop outval
> +	local errop errval
> +
> +	while test $# -ge 3
> +	do
> +		case "$1" in
> +		--out)
> +			outop="$2"
> +			outval="$3"
> +			;;
> +		--err)
> +			errop="$2"
> +			errval="$3"
> +			;;

It looks like the end-state of the series leaves us with no user of the
--err option; Maybe it's good to have it anyway for completeness, or
just skip it? ...

> +		*)
> +			break
> +			;;
> +		esac
> +		shift 3
> +	done &&
> +	if test $# = 0 ||
> +	   { test "x$1" = "x!" && test $# = 1 ; }
> +	then
> +		BUG "test_line_count_cmd: no command to be run"
> +	fi &&
> +	if test -z "$outop$errop"
> +	then
> +		BUG "test_line_count_cmd: check which stream?"
> +	fi &&
> +
> +	if test "x$1" = "x!" 
> +	then
> +		shift &&
> +		if "$@" >test_line_count_cmd_.out 2>test_line_count_cmd_.err
> +		then
> +			echo "error: '$@' succeed!"
> +			return 1
> +		fi
> +	elif ! "$@" >test_line_count_cmd_.out 2>test_line_count_cmd_.err
> +	then
> +		echo "error: '$@' run into failure!"
> +		return 1
> +	fi &&

...I think it's better to not pipe to *.err if we haven't requested it,
so under "-v" etc. we can get the stderr.

If we're unifying them I think a better pattern is to only run that "$@"
once, get $?, and then act differently on that in the "!" and ""
cases. It requires less careful reading of the critical function path,
especially with the differing indentation.

> +	if test -n "$outop"
> +	then
> +		test_line_count "$outop" "$outval" test_line_count_cmd_.out
> +	fi &&
> +	if test -n "$errop"
> +	then
> +		test_line_count "$errop" "$errval" test_line_count_cmd_.err
> +	fi &&
> +	rm -f test_line_count_cmd_.out test_line_count_cmd_.err

Let's do that "rm -f" as a "test_when_finished" before we first pipe to
them. It's fine to do that in a test lib function, see e.g. test_config.

We'll get the benefit of preseving these files under -di etc.

> +}
> +
>  test_file_size () {
>  	test "$#" -ne 1 && BUG "1 param"
>  	test-tool path-utils file-size "$1"

Đoàn Trần Công Danh June 13, 2021, 4:37 p.m. UTC | #5

On 2021-06-13 15:28:58+0200, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> 
> On Sat, Jun 12 2021, Đoàn Trần Công Danh wrote:
> 
> > +	   { test "x$1" = "x!" && test $# = 1 ; }
> > [...]
> > +	if test "x$1" = "x!" 
> 
> We don't use this test idiom in other places, it's OK to just use "$1" =
> "!". I think we're past whatever portability concern made that popular
> in some older shell code in the wild.

We actually use this test idiom in test_i18ngrep and test_cmp_rev.

I don't know enough to see where should I cut the base line.

Phillip Wood June 13, 2021, 6:18 p.m. UTC | #6

On 13/06/2021 14:28, Ævar Arnfjörð Bjarmason wrote:
> 
> On Sat, Jun 12 2021, Đoàn Trần Công Danh wrote:
> 
>> +	   { test "x$1" = "x!" && test $# = 1 ; }
>> [...]
>> +	if test "x$1" = "x!"
> 
> We don't use this test idiom in other places, it's OK to just use "$1" =
> "!". I think we're past whatever portability concern made that popular
> in some older shell code in the wild.

Slightly off topic but if anyone is interested in the history of this 
test idiom and why it is no longer needed there is a good article at 
https://www.vidarholen.net/contents/blog/?p=1035

Best Wishes

Phillip

Felipe Contreras June 13, 2021, 9:42 p.m. UTC | #7

Phillip Wood wrote:
> On 13/06/2021 14:28, Ævar Arnfjörð Bjarmason wrote:
> > 
> > On Sat, Jun 12 2021, Đoàn Trần Công Danh wrote:
> > 
> >> +	   { test "x$1" = "x!" && test $# = 1 ; }
> >> [...]
> >> +	if test "x$1" = "x!"
> > 
> > We don't use this test idiom in other places, it's OK to just use "$1" =
> > "!". I think we're past whatever portability concern made that popular
> > in some older shell code in the wild.
> 
> Slightly off topic but if anyone is interested in the history of this 
> test idiom and why it is no longer needed there is a good article at 
> https://www.vidarholen.net/contents/blog/?p=1035

Very nice! I often wondered why people did that.

Eric Sunshine June 13, 2021, 11:43 p.m. UTC | #8

On Sun, Jun 13, 2021 at 2:18 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
> On 13/06/2021 14:28, Ævar Arnfjörð Bjarmason wrote:
> > On Sat, Jun 12 2021, Đoàn Trần Công Danh wrote:
> >> +       { test "x$1" = "x!" && test $# = 1 ; }
> >> [...]
> >> +    if test "x$1" = "x!"
> >
> > We don't use this test idiom in other places, it's OK to just use "$1" =
> > "!". I think we're past whatever portability concern made that popular
> > in some older shell code in the wild.
>
> Slightly off topic but if anyone is interested in the history of this
> test idiom and why it is no longer needed there is a good article at
> https://www.vidarholen.net/contents/blog/?p=1035

Thanks for the link to the article; it was an interesting read.
However, the article does seem to say that such idioms and care may
still be warranted. In particular, the epilog gives an example which
is still relevant on macOS today. (Indeed, I just tried it and it does
error out as the article states.) Even discounting macOS, it also
talks about such bugs existing as late as 2015, which isn't long ago
by any stretch. (And, as someone whose primary -- indeed only --
development machine is ten years old, some of the other bugs it
mentions -- which existed as recently as ten years ago -- don't seem
all that long ago either.)

At any rate, for those of us who are old-timers, the `"x$foo"` idiom
is habit and only costs a couple extra characters, so I for one have
no problem with its presence in the proposed patch.

Junio C Hamano June 14, 2021, 2:56 a.m. UTC | #9

Eric Sunshine <sunshine@sunshineco.com> writes:

> ... However, the article does seem to say that such idioms and care may
> still be warranted. In particular, the epilog gives an example which
> is still relevant on macOS today. ...
>
> At any rate, for those of us who are old-timers, the `"x$foo"` idiom
> is habit and only costs a couple extra characters, so I for one have
> no problem with its presence in the proposed patch.

Yes.  I think it is prudent to use the x-prefix idiom, especially
for the case cited upthread like comparing $1 with an exclamation
mark '!'.

Junio C Hamano June 14, 2021, 3:01 a.m. UTC | #10

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> More generally speaking we have a bunch of helpers that have this
> potential issue/bug, in practice it's not a big deal.  A test that's
> being overly specific and doing a test_cmp on unbounded "find" output or
> whatever is likely buggy anyway.

Unless we are testing ".gitignore", "ls-files -o", or "git status".
Carving out $TRASH_DIRECTORY/.git/trash/ directory to store these
throwaway files would be less error prone from that point of view.

>> +		case "$1" in
>> +		--out)
>> +			outop="$2"
>> +			outval="$3"
>> +			;;
>> +		--err)
>> +			errop="$2"
>> +			errval="$3"
>> +			;;
>
> It looks like the end-state of the series leaves us with no user of the
> --err option; Maybe it's good to have it anyway for completeness, or
> just skip it? ...

I'd rather not to see --out added if we have not yet any use for the
other one.  When the time comes that we want to validate the error
stream, we may find that we want to check _both_, and we'd have to
discard dead code and replace with what we want, instead of just
adding what we want to a working code without dead code.

Đoàn Trần Công Danh June 15, 2021, 3:40 p.m. UTC | #11

On 2021-06-13 15:36:11+0200, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> 
> On Sat, Jun 12 2021, Đoàn Trần Công Danh wrote:
> 
> > In Git project, we have multiple occasions that requires checking number
> > of lines of text in stdout and/or stderr of a command. One of such
> > example is t6400, which checks number of files in various states.
> 
> Thanks for following up on this.
> 
> > Some of those commands are Git command, and we would like to check their
> > exit status.  In some of those checks, we pipe the stdout of those
> > commands to "wc -l" to check for line count, thus loosing the exit status.
> >
> > Introduce a helper function to check for number of lines in stdout and
> > stderr from those commands.
> >
> > This helper will create 2 temporary files in process, thus it may affect
> > output of some checks.
> 
> I think it's fine to just blindly stick the name into a file in the CWD
> as long as it's not "actual" or some other such obviously likely to
> conflict name.
> 
> The convention you've picked here (I'm not sure if it's existing
> already) of naming the temp files after the test lib function name is a
> good one.
> 
> More generally speaking we have a bunch of helpers that have this
> potential issue/bug, in practice it's not a big deal.  A test that's
> being overly specific and doing a test_cmp on unbounded "find" output or
> whatever is likely buggy anyway.

I'm going to write into $TRASH_DIRECTORY/.git/trash if .git is-a-dir,
otherwise to $TRASH_DIRECTORY in the next reroll.

> If it ever becomes a bigger issue we can easily set up two scratch
> directories during the test, one for the use of the test itself, and one
> for the internals of the test run itself.
> 
> > +# test_line_count_cmd checks the number of lines of captured stdout and/or
> > +# stderr of a command.
> > +#
> > +# NOTE: this helper function will create 2 temporary files named:
> > +# * test_line_count_cmd_.out; and
> > +# * test_line_count_cmd_.err
> > +#
> > +# And this helper function will remove those 2 files if the check is succeed.
> > +# In case of failure, those files will be preserved.
> > +test_line_count_cmd () {
> > +	local outop outval
> > +	local errop errval
> > +
> > +	while test $# -ge 3
> > +	do
> > +		case "$1" in
> > +		--out)
> > +			outop="$2"
> > +			outval="$3"
> > +			;;
> > +		--err)
> > +			errop="$2"
> > +			errval="$3"
> > +			;;
> 
> It looks like the end-state of the series leaves us with no user of the
> --err option; Maybe it's good to have it anyway for completeness, or
> just skip it? ...

In the re-roll that being prepared, t0041 will be converted, too.
So --err will have a user.

> > +		*)
> > +			break
> > +			;;
> > +		esac
> > +		shift 3
> > +	done &&
> > +	if test $# = 0 ||
> > +	   { test "x$1" = "x!" && test $# = 1 ; }
> > +	then
> > +		BUG "test_line_count_cmd: no command to be run"
> > +	fi &&
> > +	if test -z "$outop$errop"
> > +	then
> > +		BUG "test_line_count_cmd: check which stream?"
> > +	fi &&
> > +
> > +	if test "x$1" = "x!" 
> > +	then
> > +		shift &&
> > +		if "$@" >test_line_count_cmd_.out 2>test_line_count_cmd_.err
> > +		then
> > +			echo "error: '$@' succeed!"
> > +			return 1
> > +		fi
> > +	elif ! "$@" >test_line_count_cmd_.out 2>test_line_count_cmd_.err
> > +	then
> > +		echo "error: '$@' run into failure!"
> > +		return 1
> > +	fi &&
> 
> ...I think it's better to not pipe to *.err if we haven't requested it,
> so under "-v" etc. we can get the stderr.

And with t0041 converted, it reveals some difficulty when implementing
this suggestion.

Yes, it would be nice if we can get the stderr under -v,
however, -x will make our life hard, since -x also writes into
/dev/stderr, and below command is broken:

	test_line_count_cmd --out = 0 test_must_fail git for-each-ref --no-contains something 2>actual.err &&
	test_i18ngrep ! usage actual.err

> If we're unifying them I think a better pattern is to only run that "$@"
> once, get $?, and then act differently on that in the "!" and ""
> cases. It requires less careful reading of the critical function path,
> especially with the differing indentation.

This is definitely doable.

This is a replacement patch that still has trouble with "-x",
I guess I should always exec 2>*.err
---------8<--------
Subject: [PATCH] test-lib-functions: introduce test_line_count_cmd

In Git project, we have multiple occasions that requires checking number
of lines of text in stdout and/or stderr of a command. One of such
example is t6400, which checks number of files in various states.

Some of those commands are Git command, and we would like to check their
exit status.  In some of those checks, we pipe the stdout of those
commands to "wc -l" to check for line count, thus loosing the exit status.

Introduce a helper function to check for number of lines in stdout and
stderr from those commands.

This helper will create 2 temporary files in process, thus it may affect
output of some checks.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 t/test-lib-functions.sh | 116 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 116 insertions(+)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index f0448daa74..2261de7caa 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -845,6 +845,122 @@ test_line_count () {
 	fi
 }
 
+# test_line_count_cmd checks exit status, and the number of lines in
+# captured stdout and/or stderr of a command.
+#
+# Usage:
+#
+# test_line_count_cmd [--[out|err] <binop> <value>]... [--] [!] cmd [args...]
+#
+# Options:
+# --out <binop> <value>:
+# --err <binop> <value>:
+#	Run sh's "test <# of lines> <binop> <value>" on # of lines in stdout
+#	(for --out) or stderr (for --err)
+# !:
+#	Instead of expecting "cmd [args...]" succeed, expect its failure.
+#	Note, if command under testing is "git", test_must_fail should be used
+#	instead of "!".
+#
+# Example:
+#	test_line_count_cmd --out -ge 10 --err = 0 git tag --no-contains v1.0.0
+#	test_line_count_cmd --out -le 10 ! grep some-text a-file
+#	test_line_count_cmd --out = 0 test_must_fail git rev-parse --verify abcd1234
+#
+# NOTE:
+# * if "--out" is specified, a temporary file named test_line_count_cmd_.out
+#   will be created.
+# * if "--err" is specified, a temporary file named test_line_count_cmd_.err
+#   will be created.
+# Those temporary files will be created under $TRASH_DIRECTORY/.git/trash
+# if $TRASH_DIRECTORY/.git directory existed.
+# Otherwise, they will be created in $TRASH_DIRECTORY.
+# Those temporary files will be cleant by test_when_finished
+test_line_count_cmd () {
+	local outop outval outfile
+	local errop errval errfile
+	local expect_failure actual_failure
+	local trashdir="$TRASH_DIRECTORY"
+
+	if test -d "$TRASH_DIRECTORY/.git"
+	then
+		trashdir="$TRASH_DIRECTORY/.git/trash" &&
+		mkdir -p "$trashdir"
+	fi &&
+	while test $# != 0
+	do
+		case "$1" in
+		--out)
+			outop="$2" &&
+			outval="$3" &&
+			outfile="$trashdir/test_line_count_cmd_.out" &&
+			shift 3
+			;;
+		--err)
+			errop="$2" &&
+			errval="$3" &&
+			errfile="$trashdir/test_line_count_cmd_.err" &&
+			shift 3
+			;;
+		--)
+			shift &&
+			break
+			;;
+		-*)
+			BUG "test_line_count_cmd: unknown options: '$1'"
+			;;
+		*)
+			break
+			;;
+		esac
+	done &&
+	if test "x$1" = "x!"
+	then
+		shift &&
+		expect_failure=yes
+	fi &&
+	if test $# = 0
+	then
+		BUG "test_line_count_cmd: no command to be run"
+	elif test -z "$outop$errop"
+	then
+		BUG "test_line_count_cmd: check which stream?"
+	else
+		if test -n "$outfile"
+		then
+			test_when_finished "rm -f '$outfile'" &&
+			exec 3>"$outfile"
+		fi &&
+		if test -n "$errfile"
+		then
+			test_when_finished "rm -f '$errfile'" &&
+			exec 5>"$errfile"
+		fi &&
+		if ! "$@" >&3 2>&5
+		then
+			actual_failure=yes
+		fi
+	# Don't use &4, it's used for test_must_fail
+	fi 3>&1 5>&2 &&
+	case "$expect_failure,$actual_failure" in
+	yes,)
+		echo "error: '$@' succeed!"
+		return 1
+		;;
+	,yes)
+		echo "error: '$@' run into failure!"
+		return 1
+	esac &&
+	if test -n "$outop"
+	then
+		test_line_count "$outop" "$outval" "$outfile"
+	fi &&
+	if test -n "$errop"
+	then
+		test_line_count "$errop" "$errval" "$errfile"
+	fi
+}
+
 test_file_size () {
 	test "$#" -ne 1 && BUG "1 param"
 	test-tool path-utils file-size "$1"

Ævar Arnfjörð Bjarmason June 24, 2021, 11:19 p.m. UTC | #12

On Sun, Jun 13 2021, Eric Sunshine wrote:

> On Sun, Jun 13, 2021 at 2:18 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
>> On 13/06/2021 14:28, Ævar Arnfjörð Bjarmason wrote:
>> > On Sat, Jun 12 2021, Đoàn Trần Công Danh wrote:
>> >> +       { test "x$1" = "x!" && test $# = 1 ; }
>> >> [...]
>> >> +    if test "x$1" = "x!"
>> >
>> > We don't use this test idiom in other places, it's OK to just use "$1" =
>> > "!". I think we're past whatever portability concern made that popular
>> > in some older shell code in the wild.
>>
>> Slightly off topic but if anyone is interested in the history of this
>> test idiom and why it is no longer needed there is a good article at
>> https://www.vidarholen.net/contents/blog/?p=1035
>
> Thanks for the link to the article; it was an interesting read.
> However, the article does seem to say that such idioms and care may
> still be warranted. In particular, the epilog gives an example which
> is still relevant on macOS today. (Indeed, I just tried it and it does
> error out as the article states.) Even discounting macOS, it also
> talks about such bugs existing as late as 2015, which isn't long ago
> by any stretch. (And, as someone whose primary -- indeed only --
> development machine is ten years old, some of the other bugs it
> mentions -- which existed as recently as ten years ago -- don't seem
> all that long ago either.)

It's only for the case where the first byte is "(" or ")" though, so
e.g. the use of this to compare things like command-line options and
other things that don't start with those characters is portable, if I'm
understanding the article correctly.

> At any rate, for those of us who are old-timers, the `"x$foo"` idiom
> is habit and only costs a couple extra characters, so I for one have
> no problem with its presence in the proposed patch.

Indeed, but it's interesting to dig and see if there's any reason for
such workarounds still.

[1/4] test-lib-functions: introduce test_line_count_cmd

Commit Message

Comments

Patch