[14/24] common/fuzzy: fix some problems with the online-then-offline repair strategy

Message ID 167243878089.730387.3339474427317162674.stgit@magnolia (mailing list archive)
State New, archived
Series fstests: improve xfs fuzzing

Commit Message

Darrick J. Wong Dec. 30, 2022, 10:19 p.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

While auditing the fuzz tester code, I noticed there were numerous
problems with the online-then-offline repair strategy: the stages of
the strategy are not consistently logged to the kernel log, some of the
error messages don't identify /which/ scrubber we're calling, we don't
do a pre-repair check to make sure we detect the fuzzed fields, and we
don't actually re-run online scrub after a repair to make sure that the
filesystem is ok.  Fix all of these.  While we're at it, disable
xfs_repair prefetch (-P) to reduce the possibility of OOM kills, and
rework the error messages to make reading the golden output easier.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 common/fuzzy |   80 ++++++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 53 insertions(+), 27 deletions(-)

Patch

diff --git a/common/fuzzy b/common/fuzzy
index 16fca67534..a33c230b40 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -306,45 +306,71 @@  __scratch_xfs_fuzz_field_norepair() {
 __scratch_xfs_fuzz_field_both() {
 	local fuzz_action="$1"
 
+	# Make sure offline scrub will catch whatever we fuzzed
+	__fuzz_notify "+ Detect fuzzed field (offline)"
+	_scratch_xfs_repair -P -n 2>&1
+	res=$?
+	test $res -eq 0 && \
+		(>&2 echo "${fuzz_action}: offline scrub didn't fail.")
+
 	# Mount or else we can't do anything in both repair mode
-	echo "+ Mount filesystem to try both repairs"
+	__fuzz_notify "+ Mount filesystem to try both repairs"
 	_try_scratch_mount 2>&1
 	res=$?
 	if [ $res -ne 0 ]; then
-		(>&2 echo "mount failed ($res) with ${fuzz_action}.")
-		return 0
+		(>&2 echo "${fuzz_action}: mount failed ($res).")
+	else
+		# Make sure online scrub will catch whatever we fuzzed
+		__fuzz_notify "++ Detect fuzzed field (online)"
+		_scratch_scrub -n -a 1 -e continue 2>&1
+		res=$?
+		test $res -eq 0 && \
+			(>&2 echo "${fuzz_action}: online scrub didn't fail.")
+
+		# Try fixing the filesystem online
+		__fuzz_notify "++ Try to repair filesystem (online)"
+		_scratch_scrub 2>&1
+		res=$?
+		test $res -ne 0 && \
+			(>&2 echo "${fuzz_action}: online repair failed ($res).")
+
+		__scratch_xfs_fuzz_unmount
+	fi
+
+	# Repair the filesystem offline if online repair failed?
+	if [ $res -ne 0 ]; then
+		__fuzz_notify "+ Try to repair the filesystem (offline)"
+		_repair_scratch_fs -P 2>&1
+		res=$?
+		test $res -ne 0 && \
+			(>&2 echo "${fuzz_action}: offline repair failed ($res).")
+	fi
+
+	# See if repair finds a clean fs
+	__fuzz_notify "+ Make sure error is gone (offline)"
+	_scratch_xfs_repair -P -n 2>&1
+	res=$?
+	test $res -ne 0 && \
+		(>&2 echo "${fuzz_action}: offline re-scrub failed ($res).")
+
+	# Mount so that we can see what scrub says after we've fixed the fs
+	__fuzz_notify "+ Re-mount filesystem to re-try online scan"
+	_try_scratch_mount 2>&1
+	res=$?
+	if [ $res -ne 0 ]; then
+		(>&2 echo "${fuzz_action}: mount failed ($res).")
+		return 1
 	fi
 
-	# Make sure online scrub will catch whatever we fuzzed
-	echo "++ Online scrub"
+	# Online scrub should pass now
+	__fuzz_notify "++ Make sure error is gone (online)"
 	_scratch_scrub -n -a 1 -e continue 2>&1
 	res=$?
-	test $res -eq 0 && \
-		(>&2 echo "online scrub didn't fail with ${fuzz_action}.")
-
-	# Try fixing the filesystem online
-	__fuzz_notify "++ Try to repair filesystem online"
-	_scratch_scrub 2>&1
-	res=$?
 	test $res -ne 0 && \
-		(>&2 echo "online repair failed ($res) with ${fuzz_action}.")
+		(>&2 echo "${fuzz_action}: online re-scrub failed ($res).")
 
 	__scratch_xfs_fuzz_unmount
 
-	# Repair the filesystem offline?
-	echo "+ Try to repair the filesystem offline"
-	_repair_scratch_fs 2>&1
-	res=$?
-	test $res -ne 0 && \
-		(>&2 echo "offline repair failed ($res) with ${fuzz_action}.")
-
-	# See if repair finds a clean fs
-	echo "+ Make sure error is gone (offline)"
-	_scratch_xfs_repair -n 2>&1
-	res=$?
-	test $res -ne 0 && \
-		(>&2 echo "offline re-scrub ($res) with ${fuzz_action}.")
-
 	return 0
 }
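
For readers who want to see the reworked control flow in isolation, here
is a minimal sketch of the "try online repair, fall back to offline
repair only if that failed" step.  It is not the fstests code:
stub_online_repair, stub_offline_repair, repair_both, ONLINE_RC and
OFFLINE_RC are hypothetical stand-ins invented for illustration, and the
stubs only return a preset exit code instead of touching a filesystem.
The message format ("${fuzz_action}: ... failed ($res).") follows the
patch above.

```shell
#!/bin/bash
# Sketch only: the stub_* helpers are hypothetical and merely return a
# preset exit code so the control flow can be exercised without a real
# scratch filesystem.
stub_online_repair() { return "${ONLINE_RC:-0}"; }
stub_offline_repair() { return "${OFFLINE_RC:-0}"; }

repair_both() {
	local fuzz_action="$1"
	local res

	# Try fixing the filesystem online first
	echo "++ Try to repair filesystem (online)"
	stub_online_repair
	res=$?
	test $res -ne 0 && \
		(>&2 echo "${fuzz_action}: online repair failed ($res).")

	# Fall back to offline repair only if the online step failed
	if [ $res -ne 0 ]; then
		echo "+ Try to repair the filesystem (offline)"
		stub_offline_repair
		res=$?
		test $res -ne 0 && \
			(>&2 echo "${fuzz_action}: offline repair failed ($res).")
	fi

	return 0
}

# Example: pretend online repair failed and offline repair succeeded
ONLINE_RC=1
OFFLINE_RC=0
repair_both "zeroes"
```

Putting the fuzz action at the start of each complaint, as the patch
does, means that every failure line in the output begins with the thing
that was fuzzed, which should make the golden output easier to scan.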