[V3] check: fail tests if check/dmesg are not clean

On Tue, May 29, 2018 at 09:07:37AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Currently a test appears to pass even if it leaves a corrupt
> filesystem behind, or a splat in the system logs that should not be
> there. While the test is internally tracked as failed (and the
> summary reports it as failed) the per-test output exits with a
> success and so emits a completion time before the post-test checks
> are run by the test harness.  Rework the check code to report
> post-test check failures as specific test failures rather than as
> separate failure line items in the overall harness output.
> 
> Reworking where we emit the errors this also allows us to include
> the post-test filesystem checking in the test runtime. This is
> currently not accounted to the test and can be substantial. Hence
> the real elapsed time of each test is not accurately reflected in
> the time stats being reported and so regressions in filesystem
> checking performance go unnoticed.
> 
> Changing the output reporting requires a complete reworking of the
> main test check loop. It's a bunch of spaghetti at the moment
> because it has post test reporting code at the end of the loop which
> must run regardless of the test result.  By moving the post test
> reporting to the start of the next loop iteration, we can clean up
> the code substantially by using continue directives where
> appropriate.
> 
> Also, for cases where we haven't run the test or it's already been
> marked as failed, don't bother running the filesystem/dmesg checks
> for failure as we're already going to report the test as failed.
> 
> This touches almost all of the loop, so get rid of the remaining
> 4 space indents inside the loop while moving all this code around.
> 
> Signed-Off-By: Dave Chinner <dchinner@redhat.com>

I hit some other errors when testing the xUnit report result, please see
below. (Sorry for not finding it in previous review..)

> ---
>  check     | 260 +++++++++++++++++++++++++++++-------------------------
>  common/rc |   6 ++
>  2 files changed, 144 insertions(+), 122 deletions(-)
> 
> diff --git a/check b/check
> index f6fb352bcbb9..0cdf95e9f5e7 100755
> --- a/check
> +++ b/check
> @@ -38,10 +38,12 @@ randomize=false
>  export here=`pwd`
>  xfile=""
>  brief_test_summary=false
> -_err_msg=""
>  do_report=false
>  DUMP_OUTPUT=false
>  
> +# This is a global variable used to pass test failure text to reporting gunk
> +_err_msg=""
> +
>  # start the initialisation work now
>  iam=check
>  
> @@ -643,78 +645,97 @@ for section in $HOST_OPTIONS_SECTIONS; do
>  	seqres="$check"
>  	_check_test_fs
>  
> -	for seq in $list
> -	do
> -	    err=false
> -	    _err_msg=""
> -	    if [ ! -f $seq ]; then
> -	        # Try to get full name in case the user supplied only seq id
> -	        # and the test has a name. A bit of hassle to find really
> -	        # the test and not its sample output or helping files.
> -	        bname=$(basename $seq)
> -	        full_seq=$(find $(dirname $seq) -name $bname* -executable |
> -	            awk '(NR == 1 || length < length(shortest)) { shortest = $0 }\
> -	                END { print shortest }')
> -	        if [ -f $full_seq ] \
> -	            && [ x$(echo $bname | grep -o "^$VALID_TEST_ID") != x ]; then
> -	            seq=$full_seq
> -	        fi
> -	    fi
> +	err=false
> +	first_test=true
> +	for seq in $list ; do
> +		# Run report for previous test!
> +		if $err ; then
> +			bad="$bad $seqnum"
> +			n_bad=`expr $n_bad + 1`
> +			tc_status="fail"
> +		fi
> +		if $do_report && ! $first_test ; then
> +			if [ $tc_status != "expunge" ] ; then
> +				_make_testcase_report "$tc_status"
> +			fi

This report is based on $seq variable, so when the test is _notrun (thus
"continue" in the for loop), it's reporting the wrong test seq name, it
should report the previous test seq not current. I saw reports like:

        <testcase classname="xfstests.xfs_4k_reflink" name="generic/403" time="1">
                <skipped message="no kernel support for y2038 sysfs switch" />
        </testcase>
        <testcase classname="xfstests.xfs_4k_reflink" name="generic/403" time="2">

Note that both results reported test name as "generic/403", but the
first one should really be "generic/402".

I appended a fix patch in the end, if it looks fine I can fold the fix
into this patch.

...
<snip>
...
> +	if $do_report -a ! $first_test -a $tc_status != "expunge" ; then
>  		_make_testcase_report "$tc_status"
> -	    fi
> -	    seq="after_$seqnum"
> -	done
> +	fi
> +

This if check should be fixed too as above one.

Thanks,
Eryu

--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[V3] check: fail tests if check/dmesg are not clean

Commit Message

Comments

Patch