check: fail tests if check/dmesg are not clean

Message ID	20180223011632.28893-1-david@fromorbit.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <fstests-owner@kernel.org> From: Dave Chinner <david@fromorbit.com> To: fstests@vger.kernel.org Subject: [PATCH] check: fail tests if check/dmesg are not clean Date: Fri, 23 Feb 2018 12:16:32 +1100 Message-Id: <20180223011632.28893-1-david@fromorbit.com> Sender: fstests-owner@vger.kernel.org Precedence: bulk

Message ID

20180223011632.28893-1-david@fromorbit.com (mailing list archive)

State

New, archived

Headers

From: Dave Chinner <david@fromorbit.com>
To: fstests@vger.kernel.org
Subject: [PATCH] check: fail tests if check/dmesg are not clean
Date: Fri, 23 Feb 2018 12:16:32 +1100
Message-Id: <20180223011632.28893-1-david@fromorbit.com>
Sender: fstests-owner@vger.kernel.org
Precedence: bulk

Commit Message

Dave Chinner Feb. 23, 2018, 1:16 a.m. UTC

From: Dave Chinner <dchinner@redhat.com>

Currently a test passes even if it leaves a corrupt filesystem
behind, or a splat in the system logs that should not be there.
Rework the check code to consider these as test failures so they can
be accounted and tracked correctly. This also allows us to include
the post-test filesystem checking in the test runtime - that is
currently not accounted to the test, either, so the real runtime of
each test is not accurately reflected in the time stats being
reported.

This requires a complete reworking of the main test check loop. It's
a bunch of spaghetti at the moment because it has post test
reporting code preventing use from using continue when a test is
done. Move that post test reporting to the start of the next loop
iteration and clean up the code to use continues where appropriate.

Also, for cases where we haven't run the test or it's already been
marked as failed, don't bother running the filesystem/dmesg checks
for failure as we're already going to report the test as failed.

This touches almost all of the loop, so get rid of the remaining
4 space indents inside the loop while moving all this code around.

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
---
 check | 250 +++++++++++++++++++++++++++++++++++-------------------------------
 1 file changed, 132 insertions(+), 118 deletions(-)

Comments

Eryu Guan Feb. 23, 2018, 4:34 a.m. UTC | #1

On Fri, Feb 23, 2018 at 12:16:32PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Currently a test passes even if it leaves a corrupt filesystem
> behind, or a splat in the system logs that should not be there.
> Rework the check code to consider these as test failures so they can
> be accounted and tracked correctly. This also allows us to include
> the post-test filesystem checking in the test runtime - that is
> currently not accounted to the test, either, so the real runtime of
> each test is not accurately reflected in the time stats being
> reported.
> 
> This requires a complete reworking of the main test check loop. It's
> a bunch of spaghetti at the moment because it has post test
> reporting code preventing use from using continue when a test is
> done. Move that post test reporting to the start of the next loop
> iteration and clean up the code to use continues where appropriate.
> 
> Also, for cases where we haven't run the test or it's already been
> marked as failed, don't bother running the filesystem/dmesg checks
> for failure as we're already going to report the test as failed.
> 
> This touches almost all of the loop, so get rid of the remaining
> 4 space indents inside the loop while moving all this code around.
> 
> Signed-Off-By: Dave Chinner <dchinner@redhat.com>

Thanks for the rework! It needs careful testing and most probably won't
catch this week's update. I'll take it in next week if I see no problems
in testing.

Thanks,
Eryu
--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Eryu Guan Feb. 26, 2018, 10:10 a.m. UTC | #2

Hi Dave,

On Fri, Feb 23, 2018 at 12:16:32PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Currently a test passes even if it leaves a corrupt filesystem
> behind, or a splat in the system logs that should not be there.
> Rework the check code to consider these as test failures so they can
> be accounted and tracked correctly. This also allows us to include
> the post-test filesystem checking in the test runtime - that is
> currently not accounted to the test, either, so the real runtime of
> each test is not accurately reflected in the time stats being
> reported.
> 
> This requires a complete reworking of the main test check loop. It's
> a bunch of spaghetti at the moment because it has post test
> reporting code preventing use from using continue when a test is
> done. Move that post test reporting to the start of the next loop
> iteration and clean up the code to use continues where appropriate.
> 
> Also, for cases where we haven't run the test or it's already been
> marked as failed, don't bother running the filesystem/dmesg checks
> for failure as we're already going to report the test as failed.
> 
> This touches almost all of the loop, so get rid of the remaining
> 4 space indents inside the loop while moving all this code around.
> 
> Signed-Off-By: Dave Chinner <dchinner@redhat.com>
> ---
>  check | 250 +++++++++++++++++++++++++++++++++++-------------------------------
>  1 file changed, 132 insertions(+), 118 deletions(-)
> 

[snip]

> @@ -724,90 +742,86 @@ for section in $HOST_OPTIONS_SECTIONS; do

Patch doesn't apply starting from this hunk.

Applying: check: fail tests if check/dmesg are not clean
error: patch failed: check:724

>  			./$seq >$tmp.out 2>&1

This context doesn't look right, it should be

			./$seq >$tmp.rawout 2>&1

>  			sts=$?
>  		fi
> -		$timestamp && _timestamp
> -		stop=`_wallclock`

And there're two more lines below the two removed lines:

                _fix_malloc <$tmp.rawout >$tmp.out                                                                                                                                             
                rm -f $tmp.rawout

And the code has been this way since 2015, so it's not likely some
random conflicts caused by recent patches.

$ git blame -L 724,734 check
7fd308513c0ae (Jan Tulak       2015-08-04 14:10:49 +1000 724)                   ./$seq >$tmp.rawout 2>&1
7fd308513c0ae (Jan Tulak       2015-08-04 14:10:49 +1000 725)                   sts=$?
7fd308513c0ae (Jan Tulak       2015-08-04 14:10:49 +1000 726)           fi
bf4445942d2ba (Lukas Czerner   2014-04-04 17:17:53 +1100 727)           $timestamp && _timestamp
bf4445942d2ba (Lukas Czerner   2014-04-04 17:17:53 +1100 728)           stop=`_wallclock`
774f4dd775340 (Tomas Racek     2013-09-19 16:20:37 +0000 729) 
bf4445942d2ba (Lukas Czerner   2014-04-04 17:17:53 +1100 730)           _fix_malloc <$tmp.rawout >$tmp.out
bf4445942d2ba (Lukas Czerner   2014-04-04 17:17:53 +1100 731)           rm -f $tmp.rawout
bf4445942d2ba (Lukas Czerner   2014-04-04 17:17:53 +1100 732) 
bf4445942d2ba (Lukas Czerner   2014-04-04 17:17:53 +1100 733)           if [ -f core ]
bf4445942d2ba (Lukas Czerner   2014-04-04 17:17:53 +1100 734)           then

Do you have some local patches applied?

Thanks,
Eryu
--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/check b/check
index 05eb2835918b..db43fb03963b 100755
--- a/check
+++ b/check
@@ -631,78 +631,96 @@  for section in $HOST_OPTIONS_SECTIONS; do
 	seqres="$check"
 	_check_test_fs
 
-	for seq in $list
-	do
-	    err=false
-	    err_msg=""
-	    if [ ! -f $seq ]; then
-	        # Try to get full name in case the user supplied only seq id
-	        # and the test has a name. A bit of hassle to find really
-	        # the test and not its sample output or helping files.
-	        bname=$(basename $seq)
-	        full_seq=$(find $(dirname $seq) -name $bname* -executable |
-	            awk '(NR == 1 || length < length(shortest)) { shortest = $0 }\
-	                END { print shortest }')
-	        if [ -f $full_seq ] \
-	            && [ x$(echo $bname | grep -o "^$VALID_TEST_ID") != x ]; then
-	            seq=$full_seq
-	        fi
-	    fi
+	err=false
+	first_test=true
+	for seq in $list ; do
+		# Run report for previous test!
+		if $err ; then
+			bad="$bad $seqnum"
+			n_bad=`expr $n_bad + 1`
+			tc_status="fail"
+		fi
+		if $do_report -a ! $first_test -a $tc_status != "expunge" ; then
+			_make_testcase_report "$tc_status"
+		fi
+		first_test=false
+
+		err=false
+		err_msg=""
+		if [ ! -f $seq ]; then
+			# Try to get full name in case the user supplied only
+			# seq id and the test has a name. A bit of hassle to
+			# find really the test and not its sample output or
+			# helping files.
+			bname=$(basename $seq)
+			full_seq=$(find $(dirname $seq) -name $bname* -executable |
+				awk '(NR == 1 || length < length(shortest)) { shortest = $0 }\
+				     END { print shortest }')
+			if [ -f $full_seq ] && \
+			   [ x$(echo $bname | grep -o "^$VALID_TEST_ID") != x ]; then
+				seq=$full_seq
+			fi
+		fi
 
-	    # the filename for the test and the name output are different.
-	    # we don't include the tests/ directory in the name output.
-	    export seqnum=`echo $seq | sed -e "s;$SRC_DIR/;;"`
-
-	    # Similarly, the result directory needs to replace the tests/
-	    # part of the test location.
-	    group=`dirname $seq`
-	    if $OPTIONS_HAVE_SECTIONS; then
-		export RESULT_DIR=`echo $group | sed -e "s;$SRC_DIR;${RESULT_BASE}/$section;"`
-		REPORT_DIR="$RESULT_BASE/$section"
-	    else
-		export RESULT_DIR=`echo $group | sed -e "s;$SRC_DIR;$RESULT_BASE;"`
-		REPORT_DIR="$RESULT_BASE"
-	    fi
-	    seqres="$REPORT_DIR/$seqnum"
+		# the filename for the test and the name output are different.
+		# we don't include the tests/ directory in the name output.
+		export seqnum=`echo $seq | sed -e "s;$SRC_DIR/;;"`
 
-	    mkdir -p $RESULT_DIR
+		# Similarly, the result directory needs to replace the tests/
+		# part of the test location.
+		group=`dirname $seq`
+		if $OPTIONS_HAVE_SECTIONS; then
+			export RESULT_DIR=`echo $group | sed -e "s;$SRC_DIR;${RESULT_BASE}/$section;"`
+			REPORT_DIR="$RESULT_BASE/$section"
+		else
+			export RESULT_DIR=`echo $group | sed -e "s;$SRC_DIR;$RESULT_BASE;"`
+			REPORT_DIR="$RESULT_BASE"
+		fi
+		seqres="$REPORT_DIR/$seqnum"
 
-	    echo -n "$seqnum"
+		mkdir -p $RESULT_DIR
+		echo -n "$seqnum"
 
-	    if $showme; then
-		_expunge_test $seqnum
-		if [ $? -eq 1 ]; then
+		if $showme; then
+			_expunge_test $seqnum
+			if [ $? -eq 1 ]; then
+			    tc_status="expunge"
+			    continue
+			fi
+			echo
+			start=0
+			stop=0
+			tc_status="list"
+			n_notrun=`expr $n_notrun + 1`
 			continue
 		fi
-		echo
-		start=0
-		stop=0
-		n_notrun=`expr $n_notrun + 1`
-		if $do_report; then
-			_make_testcase_report "list"
+
+		tc_status="pass"
+		if [ ! -f $seq ]; then
+			echo " - no such test?"
+			continue
 		fi
-		continue
-	    fi
-	    tc_status="pass"
-	    if [ ! -f $seq ]; then
-		echo " - no such test?"
-	    else
+
 		# really going to try and run this one
-		#
 		rm -f $seqres.out.bad
 
 		# check if we really should run it
 		_expunge_test $seqnum
 		if [ $? -eq 1 ]; then
+			tc_status="expunge"
 			continue
 		fi
 
+		# record that we really tried to run this test.
+		try="$try $seqnum"
+		n_try=`expr $n_try + 1`
+
 		# slashes now in names, sed barfs on them so use grep
 		lasttime=`grep -w ^$seqnum $check.time | awk '// {print $2}'`
 		if [ "X$lasttime" != X ]; then
 			echo -n " ${lasttime}s ..."
 		else
-			echo -n "	"	# prettier output with timestamps.
+			echo -n "	" # prettier output with timestamps.
 		fi
 		rm -f core $seqres.notrun
 
@@ -724,90 +742,86 @@  for section in $HOST_OPTIONS_SECTIONS; do
 			./$seq >$tmp.out 2>&1
 			sts=$?
 		fi
-		$timestamp && _timestamp
-		stop=`_wallclock`
 
-		if [ -f core ]
-		then
-		    err_msg="[dumped core]"
-		    echo -n " $err_msg"
-		    mv core $RESULT_BASE/$seqnum.core
-		    err=true
+		if [ -f core ]; then
+			err_msg="[dumped core]"
+			echo -n " $err_msg"
+			mv core $RESULT_BASE/$seqnum.core
+			err=true
 		fi
 
-		if [ -f $seqres.notrun ]
-		then
-		    $timestamp || echo -n " [not run] "
-		    $timestamp && echo " [not run]" && echo -n "	$seqnum -- "
-		    cat $seqres.notrun
-		    notrun="$notrun $seqnum"
-		    n_notrun=`expr $n_notrun + 1`
-		    tc_status="notrun"
-		else
-		    if [ $sts -ne 0 ]
-		    then
+		if [ -f $seqres.notrun ]; then
+			$timestamp && _timestamp
+			stop=`_wallclock`
+			$timestamp || echo -n " [not run] "
+			$timestamp && echo " [not run]" && \
+				      echo -n "	$seqnum -- "
+			cat $seqres.notrun
+			notrun="$notrun $seqnum"
+			n_notrun=`expr $n_notrun + 1`
+			tc_status="notrun"
+			continue;
+		fi
+
+		if [ $sts -ne 0 ]; then
 			err_msg="[failed, exit status $sts]"
 			echo -n " $err_msg"
 			err=true
-		    fi
-		    if [ ! -f $seq.out ]
-		    then
+		else
+			# the test apparently passed, so check for corruption
+			# and log messages that shouldn't be there.
+			_check_filesystems
+			_check_dmesg || err=true
+			_check_kmemleak || err=true
+		fi
+
+		# test ends after all checks are done.
+		$timestamp && _timestamp
+		stop=`_wallclock`
+
+		if [ ! -f $seq.out ]; then
 			_dump_err "no qualified output"
 			err=true
-		    else
-
-			# coreutils 8.16+ changed quote formats in error messages from
-			# `foo' to 'foo'. Filter old versions to match the new version.
-			sed -i "s/\`/\'/g" $tmp.out
-			if diff $seq.out $tmp.out >/dev/null 2>&1
-			then
-			    if $err
-			    then
-				:
-			    else
+			continue;
+		fi
+		# coreutils 8.16+ changed quote formats in error messages
+		# from `foo' to 'foo'. Filter old versions to match the new
+		# version.
+		sed -i "s/\`/\'/g" $tmp.out
+		if diff $seq.out $tmp.out >/dev/null 2>&1 ; then
+			if ! $err ; then
 				echo "$seqnum `expr $stop - $start`" >>$tmp.time
 				echo -n " `expr $stop - $start`s"
-			    fi
-			    echo ""
-			else
-			    echo " - output mismatch (see $seqres.out.bad)"
-			    mv $tmp.out $seqres.out.bad
-			    $diff $seq.out $seqres.out.bad | {
-				if test "$DIFF_LENGTH" -le 0; then
-					cat
-				else
-					head -n "$DIFF_LENGTH"
-					echo "..."
-					echo "(Run '$diff $seq.out $seqres.out.bad'" \
-						" to see the entire diff)"
-				fi; } | \
-				sed -e 's/^\(.\)/    \1/'
-			    err_msg="output mismatch (see $diff $seq.out $seqres.out.bad)"
-			    err=true
 			fi
-		    fi
-		    try="$try $seqnum"
-		    n_try=`expr $n_try + 1`
-		    _check_filesystems
-		    _check_dmesg || err=true
-		    _check_kmemleak || err=true
-		fi
+			echo ""
+		else
+			echo " - output mismatch (see $seqres.out.bad)"
+			mv $tmp.out $seqres.out.bad
+			$diff $seq.out $seqres.out.bad | {
+			if test "$DIFF_LENGTH" -le 0; then
+				cat
+			else
+				head -n "$DIFF_LENGTH"
+				echo "..."
+				echo "(Run '$diff $seq.out $seqres.out.bad'" \
+					" to see the entire diff)"
+			fi; } | sed -e 's/^\(.\)/    \1/'
 
-	    fi
+			err_msg="output mismatch (see $diff $seq.out $seqres.out.bad)"
+			err=true
+		fi
+	done
 
-	    # come here for each test, except when $showme is true
-	    #
-	    if $err
-	    then
+	# make sure we record the status of the last test we ran.
+	if $err ; then
 		bad="$bad $seqnum"
 		n_bad=`expr $n_bad + 1`
 		tc_status="fail"
-	    fi
-	    if $do_report; then
+	fi
+	if $do_report -a ! $first_test -a $tc_status != "expunge" ; then
 		_make_testcase_report "$tc_status"
-	    fi
-	    seq="after_$seqnum"
-	done
+	fi
+
 	sect_stop=`_wallclock`
 	interrupt=false
 	_wrapup