diff mbox series

fstests: reduce runtime of check -n

Message ID 20230619105058.2711467-1-amir73il@gmail.com (mailing list archive)
State New, archived
Headers show
Series fstests: reduce runtime of check -n | expand

Commit Message

Amir Goldstein June 19, 2023, 10:50 a.m. UTC
kvm-xfstests invokes check -n twice to pre-process and generate the
tests-to-run list, which is then being passed as a long list of tests
for invkoing check in the command line.

check invokes dirname, basename and sed several times per test just
for doing basic string prefix/suffix trimming.
Use bash string pattern matching instead which is much faster.

Note that the following pattern matching expression change:
< test_dir=${test_dir#$SRC_DIR/*}
> t=${t#$SRC_DIR/}
does not change the meaning of the expression, because the
shortest match of "$SRC_DIR/*" that is being trimmed is "$SRC_DIR/"
and removing the tests/ prefix is what this code intended to do.

With check -n, there is no need to cleanup the results dir,
but check -n is doing that for every single listed test.
Move the cleanup of results dir to before actually running the test.

These improvements to check pre-test code cut down several minutes
from the time until tests actually start to run with kvm-xfstests.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

Zorro,

Just to clarify, this change is not expected to change behavior -
only to improve runtime of check -n and the things that happen
before the first test is invoked.

Thanks,
Amir.

 check | 22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

Comments

Zorro Lang June 21, 2023, 1:13 a.m. UTC | #1
On Mon, Jun 19, 2023 at 01:50:58PM +0300, Amir Goldstein wrote:
> kvm-xfstests invokes check -n twice to pre-process and generate the
> tests-to-run list, which is then being passed as a long list of tests
> for invkoing check in the command line.
> 
> check invokes dirname, basename and sed several times per test just
> for doing basic string prefix/suffix trimming.
> Use bash string pattern matching instead which is much faster.
> 
> Note that the following pattern matching expression change:
> < test_dir=${test_dir#$SRC_DIR/*}
> > t=${t#$SRC_DIR/}
> does not change the meaning of the expression, because the
> shortest match of "$SRC_DIR/*" that is being trimmed is "$SRC_DIR/"
> and removing the tests/ prefix is what this code intended to do.
> 
> With check -n, there is no need to cleanup the results dir,
> but check -n is doing that for every single listed test.
> Move the cleanup of results dir to before actually running the test.
> 
> These improvements to check pre-test code cut down several minutes
> from the time until tests actually start to run with kvm-xfstests.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
> 
> Zorro,
> 
> Just to clarify, this change is not expected to change behavior -
> only to improve runtime of check -n and the things that happen
> before the first test is invoked.

Hi Amir,

Sure, it's good to me to improve the performace of 'check -n'. But as this patch
changes the common logic of check file, I need to make sure it won't bring in
regression at first by some testing. If it works fine, I'll ack and merge it.

Thanks,
Zorro

> 
> Thanks,
> Amir.
> 
>  check | 22 +++++++++-------------
>  1 file changed, 9 insertions(+), 13 deletions(-)
> 
> diff --git a/check b/check
> index e36978c1..1a16eccb 100755
> --- a/check
> +++ b/check
> @@ -399,9 +399,9 @@ if $have_test_arg; then
>  		*)	# Expand test pattern (e.g. xfs/???, *fs/001)
>  			list=$(cd $SRC_DIR; echo $1)
>  			for t in $list; do
> -				test_dir=`dirname $t`
> -				test_dir=${test_dir#$SRC_DIR/*}
> -				test_name=`basename $t`
> +				t=${t#$SRC_DIR/}
> +				test_dir=${t%/*}
> +				test_name=${t#*/}
>  				group_file=$SRC_DIR/$test_dir/group.list
>  
>  				if grep -Eq "^$test_name" $group_file; then
> @@ -849,18 +849,14 @@ function run_section()
>  
>  		# the filename for the test and the name output are different.
>  		# we don't include the tests/ directory in the name output.
> -		export seqnum=`echo $seq | sed -e "s;$SRC_DIR/;;"`
> -
> -		# Similarly, the result directory needs to replace the tests/
> -		# part of the test location.
> -		group=`dirname $seq`
> +		export seqnum=${seq#$SRC_DIR/}
> +		group=${seqnum%/*}
>  		if $OPTIONS_HAVE_SECTIONS; then
> -			export RESULT_DIR=`echo $group | sed -e "s;$SRC_DIR;${RESULT_BASE}/$section;"`
>  			REPORT_DIR="$RESULT_BASE/$section"
>  		else
> -			export RESULT_DIR=`echo $group | sed -e "s;$SRC_DIR;$RESULT_BASE;"`
>  			REPORT_DIR="$RESULT_BASE"
>  		fi
> +		export RESULT_DIR="$REPORT_DIR/$group"
>  		seqres="$REPORT_DIR/$seqnum"
>  
>  		# Generate the entire section report with whatever test results
> @@ -872,9 +868,6 @@ function run_section()
>  					     "" &> /dev/null
>  		fi
>  
> -		mkdir -p $RESULT_DIR
> -		rm -f ${RESULT_DIR}/require_scratch*
> -		rm -f ${RESULT_DIR}/require_test*
>  		echo -n "$seqnum"
>  
>  		if $showme; then
> @@ -899,6 +892,9 @@ function run_section()
>  		fi
>  
>  		# really going to try and run this one
> +		mkdir -p $RESULT_DIR
> +		rm -f ${RESULT_DIR}/require_scratch*
> +		rm -f ${RESULT_DIR}/require_test*
>  		rm -f $seqres.out.bad $seqres.hints
>  
>  		# check if we really should run it
> -- 
> 2.34.1
>
Zorro Lang June 21, 2023, 2:23 a.m. UTC | #2
On Mon, Jun 19, 2023 at 01:50:58PM +0300, Amir Goldstein wrote:
> kvm-xfstests invokes check -n twice to pre-process and generate the
> tests-to-run list, which is then being passed as a long list of tests
> for invkoing check in the command line.
> 
> check invokes dirname, basename and sed several times per test just
> for doing basic string prefix/suffix trimming.
> Use bash string pattern matching instead which is much faster.
> 
> Note that the following pattern matching expression change:
> < test_dir=${test_dir#$SRC_DIR/*}
> > t=${t#$SRC_DIR/}
> does not change the meaning of the expression, because the
> shortest match of "$SRC_DIR/*" that is being trimmed is "$SRC_DIR/"
> and removing the tests/ prefix is what this code intended to do.
> 
> With check -n, there is no need to cleanup the results dir,
> but check -n is doing that for every single listed test.
> Move the cleanup of results dir to before actually running the test.
> 
> These improvements to check pre-test code cut down several minutes
> from the time until tests actually start to run with kvm-xfstests.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
> 
> Zorro,
> 
> Just to clarify, this change is not expected to change behavior -
> only to improve runtime of check -n and the things that happen
> before the first test is invoked.
> 
> Thanks,
> Amir.
> 
>  check | 22 +++++++++-------------
>  1 file changed, 9 insertions(+), 13 deletions(-)
> 
> diff --git a/check b/check
> index e36978c1..1a16eccb 100755
> --- a/check
> +++ b/check
> @@ -399,9 +399,9 @@ if $have_test_arg; then
>  		*)	# Expand test pattern (e.g. xfs/???, *fs/001)
>  			list=$(cd $SRC_DIR; echo $1)
>  			for t in $list; do
> -				test_dir=`dirname $t`
> -				test_dir=${test_dir#$SRC_DIR/*}
> -				test_name=`basename $t`
> +				t=${t#$SRC_DIR/}

Actually I'm not sure what's this line for. We get $list by "cd $SRC_DIR; echo $1",
is there any chance to have "$SRC_DIR" in $t (or $test_dir) ?

> +				test_dir=${t%/*}
> +				test_name=${t#*/}

This change looks isn't related with the "check -n". Any reason to change
that? Does the 'dirname' and 'basename' have poor performance, or some systems
miss them?

I prefer the 'dirname' and 'basename' ways, if there's not a specific reason
to change that. Due to they can deal with more complicated path name, e.g.
"xfs//001". But the "$(cd $SRC_DIR; echo $1)" looks won't generate a path name
likes that, so if you have a proper reason to make this change, that's fine.

>  				group_file=$SRC_DIR/$test_dir/group.list
>  
>  				if grep -Eq "^$test_name" $group_file; then
> @@ -849,18 +849,14 @@ function run_section()
>  
>  		# the filename for the test and the name output are different.
>  		# we don't include the tests/ directory in the name output.
> -		export seqnum=`echo $seq | sed -e "s;$SRC_DIR/;;"`
> -
> -		# Similarly, the result directory needs to replace the tests/
> -		# part of the test location.
> -		group=`dirname $seq`
> +		export seqnum=${seq#$SRC_DIR/}
> +		group=${seqnum%/*}
>  		if $OPTIONS_HAVE_SECTIONS; then
> -			export RESULT_DIR=`echo $group | sed -e "s;$SRC_DIR;${RESULT_BASE}/$section;"`
>  			REPORT_DIR="$RESULT_BASE/$section"
>  		else
> -			export RESULT_DIR=`echo $group | sed -e "s;$SRC_DIR;$RESULT_BASE;"`
>  			REPORT_DIR="$RESULT_BASE"
>  		fi
> +		export RESULT_DIR="$REPORT_DIR/$group"

Hmm, this code logic looks more concise.

>  		seqres="$REPORT_DIR/$seqnum"
>  
>  		# Generate the entire section report with whatever test results
> @@ -872,9 +868,6 @@ function run_section()
>  					     "" &> /dev/null
>  		fi
>  
> -		mkdir -p $RESULT_DIR
> -		rm -f ${RESULT_DIR}/require_scratch*
> -		rm -f ${RESULT_DIR}/require_test*
>  		echo -n "$seqnum"
>  
>  		if $showme; then
> @@ -899,6 +892,9 @@ function run_section()
>  		fi
>  
>  		# really going to try and run this one
> +		mkdir -p $RESULT_DIR
> +		rm -f ${RESULT_DIR}/require_scratch*
> +		rm -f ${RESULT_DIR}/require_test*

Nice catch, loop running these lines really takes much time.

Thanks,
Zorro

>  		rm -f $seqres.out.bad $seqres.hints
>  
>  		# check if we really should run it
> -- 
> 2.34.1
>
David Disseldorp June 21, 2023, 9:26 a.m. UTC | #3
Hi Zorro and Amir,

On Wed, 21 Jun 2023 10:23:55 +0800, Zorro Lang wrote:

> On Mon, Jun 19, 2023 at 01:50:58PM +0300, Amir Goldstein wrote:
> > kvm-xfstests invokes check -n twice to pre-process and generate the
> > tests-to-run list, which is then being passed as a long list of tests
> > for invkoing check in the command line.
> > 
> > check invokes dirname, basename and sed several times per test just
> > for doing basic string prefix/suffix trimming.
> > Use bash string pattern matching instead which is much faster.
> > 
> > Note that the following pattern matching expression change:
> > < test_dir=${test_dir#$SRC_DIR/*}  
> > > t=${t#$SRC_DIR/}  
> > does not change the meaning of the expression, because the
> > shortest match of "$SRC_DIR/*" that is being trimmed is "$SRC_DIR/"
> > and removing the tests/ prefix is what this code intended to do.
> > 
> > With check -n, there is no need to cleanup the results dir,
> > but check -n is doing that for every single listed test.
> > Move the cleanup of results dir to before actually running the test.
> > 
> > These improvements to check pre-test code cut down several minutes
> > from the time until tests actually start to run with kvm-xfstests.
> > 
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> > 
> > Zorro,
> > 
> > Just to clarify, this change is not expected to change behavior -
> > only to improve runtime of check -n and the things that happen
> > before the first test is invoked.
> > 
> > Thanks,
> > Amir.
> > 
> >  check | 22 +++++++++-------------
> >  1 file changed, 9 insertions(+), 13 deletions(-)
> > 
> > diff --git a/check b/check
> > index e36978c1..1a16eccb 100755
> > --- a/check
> > +++ b/check
> > @@ -399,9 +399,9 @@ if $have_test_arg; then
> >  		*)	# Expand test pattern (e.g. xfs/???, *fs/001)
> >  			list=$(cd $SRC_DIR; echo $1)
> >  			for t in $list; do
> > -				test_dir=`dirname $t`
> > -				test_dir=${test_dir#$SRC_DIR/*}
> > -				test_name=`basename $t`
> > +				t=${t#$SRC_DIR/}  
> 
> Actually I'm not sure what's this line for. We get $list by "cd $SRC_DIR; echo $1",
> is there any chance to have "$SRC_DIR" in $t (or $test_dir) ?

Indeed, this line can be dropped. One minor change in behaviour here is
that a test pattern with a wacky path prefix e.g. ../tests/xfs/00? will
no longer work. I think breaking such patterns is okay, but it could
also be resolved by more carefully extracting the last two path
components.

> > +				test_dir=${t%/*}
> > +				test_name=${t#*/}  
> 
> This change looks isn't related with the "check -n". Any reason to change
> that? Does the 'dirname' and 'basename' have poor performance, or some systems
> miss them?
> 
> I prefer the 'dirname' and 'basename' ways, if there's not a specific reason
> to change that. Due to they can deal with more complicated path name, e.g.
> "xfs//001". But the "$(cd $SRC_DIR; echo $1)" looks won't generate a path name
> likes that, so if you have a proper reason to make this change, that's fine.

The per-iteration `dirname` and `basename` subshells add multiple
seconds of latency to `./check -n *fs/*` in my test env.

Cheers, David
Amir Goldstein June 21, 2023, 12:09 p.m. UTC | #4
On Wed, Jun 21, 2023 at 12:24 PM David Disseldorp <ddiss@suse.de> wrote:
>
> Hi Zorro and Amir,
>
> On Wed, 21 Jun 2023 10:23:55 +0800, Zorro Lang wrote:
>
> > On Mon, Jun 19, 2023 at 01:50:58PM +0300, Amir Goldstein wrote:
> > > kvm-xfstests invokes check -n twice to pre-process and generate the
> > > tests-to-run list, which is then being passed as a long list of tests
> > > for invkoing check in the command line.
> > >
> > > check invokes dirname, basename and sed several times per test just
> > > for doing basic string prefix/suffix trimming.
> > > Use bash string pattern matching instead which is much faster.
> > >
> > > Note that the following pattern matching expression change:
> > > < test_dir=${test_dir#$SRC_DIR/*}
> > > > t=${t#$SRC_DIR/}
> > > does not change the meaning of the expression, because the
> > > shortest match of "$SRC_DIR/*" that is being trimmed is "$SRC_DIR/"
> > > and removing the tests/ prefix is what this code intended to do.
> > >
> > > With check -n, there is no need to cleanup the results dir,
> > > but check -n is doing that for every single listed test.
> > > Move the cleanup of results dir to before actually running the test.
> > >
> > > These improvements to check pre-test code cut down several minutes
> > > from the time until tests actually start to run with kvm-xfstests.
> > >
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > ---
> > >
> > > Zorro,
> > >
> > > Just to clarify, this change is not expected to change behavior -
> > > only to improve runtime of check -n and the things that happen
> > > before the first test is invoked.
> > >
> > > Thanks,
> > > Amir.
> > >
> > >  check | 22 +++++++++-------------
> > >  1 file changed, 9 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/check b/check
> > > index e36978c1..1a16eccb 100755
> > > --- a/check
> > > +++ b/check
> > > @@ -399,9 +399,9 @@ if $have_test_arg; then
> > >             *)      # Expand test pattern (e.g. xfs/???, *fs/001)
> > >                     list=$(cd $SRC_DIR; echo $1)
> > >                     for t in $list; do
> > > -                           test_dir=`dirname $t`
> > > -                           test_dir=${test_dir#$SRC_DIR/*}
> > > -                           test_name=`basename $t`
> > > +                           t=${t#$SRC_DIR/}
> >
> > Actually I'm not sure what's this line for. We get $list by "cd $SRC_DIR; echo $1",
> > is there any chance to have "$SRC_DIR" in $t (or $test_dir) ?
>
> Indeed, this line can be dropped. One minor change in behaviour here is
> that a test pattern with a wacky path prefix e.g. ../tests/xfs/00? will
> no longer work. I think breaking such patterns is okay, but it could
> also be resolved by more carefully extracting the last two path
> components.
>

I didn't want to drop it because I didn't want to change legacy behavior.
When user's cwd is fstests root directory, it may be natural for some
users to use bash auto completion to run the command
./check tests/generic/001

so there may well be users out these that pass test names including
the tests/ prefix and check has always trimmed this prefix.

There is something a bit confusing about expanding of bash
patterns:

When the user uses this syntax:
./check tests/generic/00?

the shell expands that pattern to a list of 9 arguments passed to
./check tests/generic/001  tests/generic/002 ...

But when the user uses the documented syntax:
./check generic/00?
check gets a single argument and $(cd $SRC_DIR; echo $1)
expands this single argument to a list

All the above is existing behavior which my patch should
not have changed.

> > > +                           test_dir=${t%/*}
> > > +                           test_name=${t#*/}
> >

You may change the above to:
+                           test_dir=${t%%/*}
+                           test_name=${t##*/}

I tested and this deals with "xfs//001" and "xfs//00?"
correctly (it trimms the longest match to */ prefix or /* suffix
instead of the shortest match.

The line below group= could be changed to %%/* as well
but it does not matter because $group is used as part of
the results path which can live with //.

Unfortunately, trimming tests// from tests//xfs/001 without
also trimming xfs/ is less trivial and while I gave a reason
why tests/ prefix may be used in the wild, but I can think
of no good reason why tests// prefix would be used in the wild.

> > This change looks isn't related with the "check -n". Any reason to change
> > that? Does the 'dirname' and 'basename' have poor performance, or some systems
> > miss them?
> >
> > I prefer the 'dirname' and 'basename' ways, if there's not a specific reason
> > to change that. Due to they can deal with more complicated path name, e.g.
> > "xfs//001". But the "$(cd $SRC_DIR; echo $1)" looks won't generate a path name
> > likes that, so if you have a proper reason to make this change, that's fine.
>
> The per-iteration `dirname` and `basename` subshells add multiple
> seconds of latency to `./check -n *fs/*` in my test env.
>

Exactly.
Whoever looks at strace of check will observe that it is very noisy with
execve of dirname and basename.

The next step would be to use bash dictionary instead of multiple
invocations of grep to filter tests by match to -g <group> or -g xfs/<group>,
but that is left for another day...

Thanks,
Amir.
David Disseldorp June 21, 2023, 12:42 p.m. UTC | #5
On Wed, 21 Jun 2023 15:09:35 +0300, Amir Goldstein wrote:

> On Wed, Jun 21, 2023 at 12:24 PM David Disseldorp <ddiss@suse.de> wrote:
...
> > > Actually I'm not sure what's this line for. We get $list by "cd $SRC_DIR; echo $1",
> > > is there any chance to have "$SRC_DIR" in $t (or $test_dir) ?  
> >
> > Indeed, this line can be dropped. One minor change in behaviour here is
> > that a test pattern with a wacky path prefix e.g. ../tests/xfs/00? will
> > no longer work. I think breaking such patterns is okay, but it could
> > also be resolved by more carefully extracting the last two path
> > components.
> >  
> 
> I didn't want to drop it because I didn't want to change legacy behavior.
> When user's cwd is fstests root directory, it may be natural for some
> users to use bash auto completion to run the command
> ./check tests/generic/001
> 
> so there may well be users out these that pass test names including
> the tests/ prefix and check has always trimmed this prefix.

Ah, yes, understood.

> There is something a bit confusing about expanding of bash
> patterns:
> 
> When the user uses this syntax:
> ./check tests/generic/00?
> 
> the shell expands that pattern to a list of 9 arguments passed to
> ./check tests/generic/001  tests/generic/002 ...
> 
> But when the user uses the documented syntax:
> ./check generic/00?
> check gets a single argument and $(cd $SRC_DIR; echo $1)
> expands this single argument to a list
> 
> All the above is existing behavior which my patch should
> not have changed.

The example I gave was "../tests/xfs/00?", which does get broken by this
change. It's relative to SRC_DIR and independent of check parameter
expansion. As mentioned, I think it's obscure enough usage that this
breakage shouldn't be worth bothering about.

Cheers, David
Amir Goldstein June 21, 2023, 1:01 p.m. UTC | #6
On Wed, Jun 21, 2023 at 3:40 PM David Disseldorp <ddiss@suse.de> wrote:
>
> On Wed, 21 Jun 2023 15:09:35 +0300, Amir Goldstein wrote:
>
> > On Wed, Jun 21, 2023 at 12:24 PM David Disseldorp <ddiss@suse.de> wrote:
> ...
> > > > Actually I'm not sure what's this line for. We get $list by "cd $SRC_DIR; echo $1",
> > > > is there any chance to have "$SRC_DIR" in $t (or $test_dir) ?
> > >
> > > Indeed, this line can be dropped. One minor change in behaviour here is
> > > that a test pattern with a wacky path prefix e.g. ../tests/xfs/00? will
> > > no longer work. I think breaking such patterns is okay, but it could
> > > also be resolved by more carefully extracting the last two path
> > > components.
> > >
> >
> > I didn't want to drop it because I didn't want to change legacy behavior.
> > When user's cwd is fstests root directory, it may be natural for some
> > users to use bash auto completion to run the command
> > ./check tests/generic/001
> >
> > so there may well be users out these that pass test names including
> > the tests/ prefix and check has always trimmed this prefix.
>
> Ah, yes, understood.
>
> > There is something a bit confusing about expanding of bash
> > patterns:
> >
> > When the user uses this syntax:
> > ./check tests/generic/00?
> >
> > the shell expands that pattern to a list of 9 arguments passed to
> > ./check tests/generic/001  tests/generic/002 ...
> >
> > But when the user uses the documented syntax:
> > ./check generic/00?
> > check gets a single argument and $(cd $SRC_DIR; echo $1)
> > expands this single argument to a list
> >
> > All the above is existing behavior which my patch should
> > not have changed.
>
> The example I gave was "../tests/xfs/00?", which does get broken by this
> change. It's relative to SRC_DIR and independent of check parameter
> expansion. As mentioned, I think it's obscure enough usage that this
> breakage shouldn't be worth bothering about.
>

I agree.
Users of tests/xfs/* may be out there (due to bash auto complete)
Any other prefix, including ../tests/ is far less likely to exist in the wild
(famous last words)

Thanks,
Amir.
Zorro Lang June 23, 2023, 3:29 p.m. UTC | #7
On Wed, Jun 21, 2023 at 03:09:35PM +0300, Amir Goldstein wrote:
> On Wed, Jun 21, 2023 at 12:24 PM David Disseldorp <ddiss@suse.de> wrote:
> >
> > Hi Zorro and Amir,
> >
> > On Wed, 21 Jun 2023 10:23:55 +0800, Zorro Lang wrote:
> >
> > > On Mon, Jun 19, 2023 at 01:50:58PM +0300, Amir Goldstein wrote:
> > > > kvm-xfstests invokes check -n twice to pre-process and generate the
> > > > tests-to-run list, which is then being passed as a long list of tests
> > > > for invkoing check in the command line.
> > > >
> > > > check invokes dirname, basename and sed several times per test just
> > > > for doing basic string prefix/suffix trimming.
> > > > Use bash string pattern matching instead which is much faster.
> > > >
> > > > Note that the following pattern matching expression change:
> > > > < test_dir=${test_dir#$SRC_DIR/*}
> > > > > t=${t#$SRC_DIR/}
> > > > does not change the meaning of the expression, because the
> > > > shortest match of "$SRC_DIR/*" that is being trimmed is "$SRC_DIR/"
> > > > and removing the tests/ prefix is what this code intended to do.
> > > >
> > > > With check -n, there is no need to cleanup the results dir,
> > > > but check -n is doing that for every single listed test.
> > > > Move the cleanup of results dir to before actually running the test.
> > > >
> > > > These improvements to check pre-test code cut down several minutes
> > > > from the time until tests actually start to run with kvm-xfstests.
> > > >
> > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > > ---
> > > >
> > > > Zorro,
> > > >
> > > > Just to clarify, this change is not expected to change behavior -
> > > > only to improve runtime of check -n and the things that happen
> > > > before the first test is invoked.
> > > >
> > > > Thanks,
> > > > Amir.
> > > >
> > > >  check | 22 +++++++++-------------
> > > >  1 file changed, 9 insertions(+), 13 deletions(-)
> > > >
> > > > diff --git a/check b/check
> > > > index e36978c1..1a16eccb 100755
> > > > --- a/check
> > > > +++ b/check
> > > > @@ -399,9 +399,9 @@ if $have_test_arg; then
> > > >             *)      # Expand test pattern (e.g. xfs/???, *fs/001)
> > > >                     list=$(cd $SRC_DIR; echo $1)
> > > >                     for t in $list; do
> > > > -                           test_dir=`dirname $t`
> > > > -                           test_dir=${test_dir#$SRC_DIR/*}
> > > > -                           test_name=`basename $t`
> > > > +                           t=${t#$SRC_DIR/}
> > >
> > > Actually I'm not sure what's this line for. We get $list by "cd $SRC_DIR; echo $1",
> > > is there any chance to have "$SRC_DIR" in $t (or $test_dir) ?
> >
> > Indeed, this line can be dropped. One minor change in behaviour here is
> > that a test pattern with a wacky path prefix e.g. ../tests/xfs/00? will
> > no longer work. I think breaking such patterns is okay, but it could
> > also be resolved by more carefully extracting the last two path
> > components.
> >

Hi Amir and David,

Sorry for the late reply, I'm on a holiday of Dragon Boat Festival recently,
will be back tomorrow :)

> 
> I didn't want to drop it because I didn't want to change legacy behavior.

Sure, I didn't want to remove this line in this patch, due to it's not
match the target of this patch. Just speak out to check with you and
others :)

As we agree this line is useless, we can deal with it in later patch
which improves ./check.

> When user's cwd is fstests root directory, it may be natural for some
> users to use bash auto completion to run the command
> ./check tests/generic/001
> 
> so there may well be users out these that pass test names including
> the tests/ prefix and check has always trimmed this prefix.
> 
> There is something a bit confusing about expanding of bash
> patterns:
> 
> When the user uses this syntax:
> ./check tests/generic/00?

I think the "tests/" prefix is not necessary to support :)

> 
> the shell expands that pattern to a list of 9 arguments passed to
> ./check tests/generic/001  tests/generic/002 ...
> 
> But when the user uses the documented syntax:
> ./check generic/00?
> check gets a single argument and $(cd $SRC_DIR; echo $1)
> expands this single argument to a list
> 
> All the above is existing behavior which my patch should
> not have changed.
> 
> > > > +                           test_dir=${t%/*}
> > > > +                           test_name=${t#*/}
> > >
> 
> You may change the above to:
> +                           test_dir=${t%%/*}
> +                           test_name=${t##*/}

OK, this makes more sense to me. I'll give this a test, and merge this
change if no regression. Or you'd like to send a V2 to change more.

> 
> I tested and this deals with "xfs//001" and "xfs//00?"
> correctly (it trimms the longest match to */ prefix or /* suffix
> instead of the shortest match.
> 
> The line below group= could be changed to %%/* as well
> but it does not matter because $group is used as part of
> the results path which can live with //.
> 
> Unfortunately, trimming tests// from tests//xfs/001 without
> also trimming xfs/ is less trivial and while I gave a reason
> why tests/ prefix may be used in the wild, but I can think
> of no good reason why tests// prefix would be used in the wild.
> 
> > > This change looks isn't related with the "check -n". Any reason to change
> > > that? Does the 'dirname' and 'basename' have poor performance, or some systems
> > > miss them?
> > >
> > > I prefer the 'dirname' and 'basename' ways, if there's not a specific reason
> > > to change that. Due to they can deal with more complicated path name, e.g.
> > > "xfs//001". But the "$(cd $SRC_DIR; echo $1)" looks won't generate a path name
> > > likes that, so if you have a proper reason to make this change, that's fine.
> >
> > The per-iteration `dirname` and `basename` subshells add multiple
> > seconds of latency to `./check -n *fs/*` in my test env.

Thanks for this feedback. I understand now.

Thanks,
Zorro

> >
> 
> Exactly.
> Whoever looks at strace of check will observe that it is very noisy with
> execve of dirname and basename.
> 
> The next step would be to use bash dictionary instead of multiple
> invocations of grep to filter tests by match to -g <group> or -g xfs/<group>,
> but that is left for another day...
> 
> Thanks,
> Amir.
>
Amir Goldstein June 24, 2023, 5:32 a.m. UTC | #8
On Fri, Jun 23, 2023 at 6:29 PM Zorro Lang <zlang@redhat.com> wrote:
>
> On Wed, Jun 21, 2023 at 03:09:35PM +0300, Amir Goldstein wrote:
> > On Wed, Jun 21, 2023 at 12:24 PM David Disseldorp <ddiss@suse.de> wrote:
> > >
> > > Hi Zorro and Amir,
> > >
> > > On Wed, 21 Jun 2023 10:23:55 +0800, Zorro Lang wrote:
> > >
> > > > On Mon, Jun 19, 2023 at 01:50:58PM +0300, Amir Goldstein wrote:
> > > > > kvm-xfstests invokes check -n twice to pre-process and generate the
> > > > > tests-to-run list, which is then being passed as a long list of tests
> > > > > for invkoing check in the command line.
> > > > >
> > > > > check invokes dirname, basename and sed several times per test just
> > > > > for doing basic string prefix/suffix trimming.
> > > > > Use bash string pattern matching instead which is much faster.
> > > > >
> > > > > Note that the following pattern matching expression change:
> > > > > < test_dir=${test_dir#$SRC_DIR/*}
> > > > > > t=${t#$SRC_DIR/}
> > > > > does not change the meaning of the expression, because the
> > > > > shortest match of "$SRC_DIR/*" that is being trimmed is "$SRC_DIR/"
> > > > > and removing the tests/ prefix is what this code intended to do.
> > > > >
> > > > > With check -n, there is no need to cleanup the results dir,
> > > > > but check -n is doing that for every single listed test.
> > > > > Move the cleanup of results dir to before actually running the test.
> > > > >
> > > > > These improvements to check pre-test code cut down several minutes
> > > > > from the time until tests actually start to run with kvm-xfstests.
> > > > >
> > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > > > ---
> > > > >
> > > > > Zorro,
> > > > >
> > > > > Just to clarify, this change is not expected to change behavior -
> > > > > only to improve runtime of check -n and the things that happen
> > > > > before the first test is invoked.
> > > > >
> > > > > Thanks,
> > > > > Amir.
> > > > >
> > > > >  check | 22 +++++++++-------------
> > > > >  1 file changed, 9 insertions(+), 13 deletions(-)
> > > > >
> > > > > diff --git a/check b/check
> > > > > index e36978c1..1a16eccb 100755
> > > > > --- a/check
> > > > > +++ b/check
> > > > > @@ -399,9 +399,9 @@ if $have_test_arg; then
> > > > >             *)      # Expand test pattern (e.g. xfs/???, *fs/001)
> > > > >                     list=$(cd $SRC_DIR; echo $1)
> > > > >                     for t in $list; do
> > > > > -                           test_dir=`dirname $t`
> > > > > -                           test_dir=${test_dir#$SRC_DIR/*}
> > > > > -                           test_name=`basename $t`
> > > > > +                           t=${t#$SRC_DIR/}
> > > >
> > > > Actually I'm not sure what's this line for. We get $list by "cd $SRC_DIR; echo $1",
> > > > is there any chance to have "$SRC_DIR" in $t (or $test_dir) ?
> > >
> > > Indeed, this line can be dropped. One minor change in behaviour here is
> > > that a test pattern with a wacky path prefix e.g. ../tests/xfs/00? will
> > > no longer work. I think breaking such patterns is okay, but it could
> > > also be resolved by more carefully extracting the last two path
> > > components.
> > >
>
> Hi Amir and David,
>
> Sorry for the late reply, I'm on a holiday of Dragon Boat Festival recently,
> will be back tomorrow :)
>
> >
> > I didn't want to drop it because I didn't want to change legacy behavior.
>
> Sure, I didn't want to remove this line in this patch, due to it's not
> match the target of this patch. Just speak out to check with you and
> others :)
>
> As we agree this line is useless, we can deal with it in later patch
> which improves ./check.
>
> > When user's cwd is fstests root directory, it may be natural for some
> > users to use bash auto completion to run the command
> > ./check tests/generic/001
> >
> > so there may well be users out these that pass test names including
> > the tests/ prefix and check has always trimmed this prefix.
> >
> > There is something a bit confusing about expanding of bash
> > patterns:
> >
> > When the user uses this syntax:
> > ./check tests/generic/00?
>
> I think the "tests/" prefix is not necessary to support :)
>

Fine by me.

> >
> > the shell expands that pattern to a list of 9 arguments passed to
> > ./check tests/generic/001  tests/generic/002 ...
> >
> > But when the user uses the documented syntax:
> > ./check generic/00?
> > check gets a single argument and $(cd $SRC_DIR; echo $1)
> > expands this single argument to a list
> >
> > All the above is existing behavior which my patch should
> > not have changed.
> >
> > > > > +                           test_dir=${t%/*}
> > > > > +                           test_name=${t#*/}
> > > >
> >
> > You may change the above to:
> > +                           test_dir=${t%%/*}
> > +                           test_name=${t##*/}
>
> OK, this makes more sense to me. I'll give this a test, and merge this
> change if no regression. Or you'd like to send a V2 to change more.
>

I have no plans to send v2 with more changes.
Please make this change on merge.

Thanks,
Amir.
diff mbox series

Patch

diff --git a/check b/check
index e36978c1..1a16eccb 100755
--- a/check
+++ b/check
@@ -399,9 +399,9 @@  if $have_test_arg; then
 		*)	# Expand test pattern (e.g. xfs/???, *fs/001)
 			list=$(cd $SRC_DIR; echo $1)
 			for t in $list; do
-				test_dir=`dirname $t`
-				test_dir=${test_dir#$SRC_DIR/*}
-				test_name=`basename $t`
+				t=${t#$SRC_DIR/}
+				test_dir=${t%/*}
+				test_name=${t#*/}
 				group_file=$SRC_DIR/$test_dir/group.list
 
 				if grep -Eq "^$test_name" $group_file; then
@@ -849,18 +849,14 @@  function run_section()
 
 		# the filename for the test and the name output are different.
 		# we don't include the tests/ directory in the name output.
-		export seqnum=`echo $seq | sed -e "s;$SRC_DIR/;;"`
-
-		# Similarly, the result directory needs to replace the tests/
-		# part of the test location.
-		group=`dirname $seq`
+		export seqnum=${seq#$SRC_DIR/}
+		group=${seqnum%/*}
 		if $OPTIONS_HAVE_SECTIONS; then
-			export RESULT_DIR=`echo $group | sed -e "s;$SRC_DIR;${RESULT_BASE}/$section;"`
 			REPORT_DIR="$RESULT_BASE/$section"
 		else
-			export RESULT_DIR=`echo $group | sed -e "s;$SRC_DIR;$RESULT_BASE;"`
 			REPORT_DIR="$RESULT_BASE"
 		fi
+		export RESULT_DIR="$REPORT_DIR/$group"
 		seqres="$REPORT_DIR/$seqnum"
 
 		# Generate the entire section report with whatever test results
@@ -872,9 +868,6 @@  function run_section()
 					     "" &> /dev/null
 		fi
 
-		mkdir -p $RESULT_DIR
-		rm -f ${RESULT_DIR}/require_scratch*
-		rm -f ${RESULT_DIR}/require_test*
 		echo -n "$seqnum"
 
 		if $showme; then
@@ -899,6 +892,9 @@  function run_section()
 		fi
 
 		# really going to try and run this one
+		mkdir -p $RESULT_DIR
+		rm -f ${RESULT_DIR}/require_scratch*
+		rm -f ${RESULT_DIR}/require_test*
 		rm -f $seqres.out.bad $seqres.hints
 
 		# check if we really should run it