diff mbox series

[blktests] block/035: test return EIO from BLKRRPART

Message ID 20240410170241.66966-1-saranyamohan@google.com (mailing list archive)
State New
Headers show
Series [blktests] block/035: test return EIO from BLKRRPART | expand

Commit Message

Saranya Muruganandam April 10, 2024, 5:02 p.m. UTC
When we fail to reread the partition superblock from the disk, due to
bad sector or bad disk etc, BLKRRPART should fail with EIO.
Simulate failure for the entire block device and run
"blockdev --rereadpt" and expect it to fail and return EIO instead of
pass.

Link: https://lore.kernel.org/all/20240405014253.748627-1-saranyamohan@google.com/
Signed-off-by: Saranya Muruganandam <saranyamohan@google.com>
---
 tests/block/035     | 86 +++++++++++++++++++++++++++++++++++++++++++++
 tests/block/035.out |  4 +++
 2 files changed, 90 insertions(+)
 create mode 100755 tests/block/035
 create mode 100644 tests/block/035.out

Comments

Shin'ichiro Kawasaki April 11, 2024, 5:29 a.m. UTC | #1
On Apr 10, 2024 / 17:02, Saranya Muruganandam wrote:
> When we fail to reread the partition superblock from the disk, due to
> bad sector or bad disk etc, BLKRRPART should fail with EIO.
> Simulate failure for the entire block device and run
> "blockdev --rereadpt" and expect it to fail and return EIO instead of
> pass.
> 
> Link: https://lore.kernel.org/all/20240405014253.748627-1-saranyamohan@google.com/
> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com>

I ran the test case with this patch on the kernel v6.9-rc3, then observed the
output below.

block/035 => sdc (test return EIO from BLKRRPART for whole-dev) [failed]
    runtime  0.049s  ...  0.041s
    --- tests/block/035.out     2024-04-11 13:40:17.938655578 +0900
    +++ /home/shin/Blktests/blktests/results/sdc/block/035.out.bad      2024-04-11 13:40:19.808650194 +0900
    @@ -1,4 +1,3 @@
     Running block/035
    -blockdev: ioctl error on BLKRRPART: Input/output error
     Return EIO for BLKRRPART on bad disk
     Test complete

Failure is expected but the output is weird. The line "Return EIO for BLKRRPART
on bad disk" indicates that EIO was returned, which is wrong. Please find my
comments in line about this.

BTW, "The canonical patch format" section in Linux kernel Documentation/process/
submitting-patches.rst describes the patch version descriptor (v1, v2...) and
the patch changelog below the '---' separator line. Those practices help
reviewing kernel patches. I encourage to follow them for blktests patches also.

> ---
>  tests/block/035     | 86 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/block/035.out |  4 +++
>  2 files changed, 90 insertions(+)
>  create mode 100755 tests/block/035
>  create mode 100644 tests/block/035.out
> 
> diff --git a/tests/block/035 b/tests/block/035
> new file mode 100755
> index 0000000..0ba6292
> --- /dev/null
> +++ b/tests/block/035
> @@ -0,0 +1,86 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-3.0+
> +# Copyright (C) 2024 Google LLC
> +#
> +# Regression test for BLKRRPART.
> +#
> +# If we fail to read the partition table due to bad sector or other IO
> +# failures, running "blockdev --rereadpt" should fail and return
> +# -EIO.  On a buggy kernel, it passes unexpectedly.
> +
> +. tests/block/rc
> +
> +DESCRIPTION="test return EIO from BLKRRPART for whole-dev"
> +QUICK=1
> +
> +DEBUGFS_MNT="/sys/kernel/debug/fail_make_request"
> +PROBABILITY=0
> +TIMES=0
> +VERBOSE=0
> +MAKE_FAIL=0
> +
> +_have_debugfs() {
> +	if [[ ! -d "${DEBUGFS_MNT}" ]]; then
> +		SKIP_REASONS+=("debugfs does not exist")
> +		return 1
> +	fi
> +	return 0
> +}
> +
> +requires() {
> +	_have_debugfs
> +}
> +
> +save_fail_make_request()
> +{
> +	# Save existing global fail_make_request settings
> +	PROBABILITY=$(cat "${DEBUGFS_MNT}"/probability)
> +	TIMES=$(cat "${DEBUGFS_MNT}"/times)
> +	VERBOSE=$(cat "${DEBUGFS_MNT}"/verbose)
> +
> +	# Save TEST_DEV make-it-fail setting
> +	MAKE_FAIL=$(cat "${TEST_DEV_SYSFS}"/make-it-fail)
> +}
> +
> +allow_fail_make_request()
> +{
> +	# Allow global fail_make_request feature
> +	echo 100 > "${DEBUGFS_MNT}"/probability
> +	echo 9999999 > "${DEBUGFS_MNT}"/times
> +	echo 0 > "${DEBUGFS_MNT}"/verbose
> +
> +	# Force TEST_DEV device failure
> +	echo 1 > "${TEST_DEV_SYSFS}"/make-it-fail
> +}
> +
> +restore_fail_make_request()
> +{
> +	echo "${MAKE_FAIL}" > "${TEST_DEV_SYSFS}"/make-it-fail
> +
> +	# Disallow global fail_make_request feature
> +	echo "${PROBABILITY}" > "${DEBUGFS_MNT}"/probability
> +	echo "${TIMES}" > "${DEBUGFS_MNT}"/times
> +	echo "${VERBOSE}" > "${DEBUGFS_MNT}"/verbose
> +}
> +
> +test_device() {
> +	echo "Running ${TEST_NAME}"
> +
> +	# Save configuration
> +	save_fail_make_request
> +
> +	# set up device for failure
> +	allow_fail_make_request
> +
> +	# Check rereading partitions on bad disk cannot open $TEST_DEV: Input/output error
> +	if blockdev --rereadpt "${TEST_DEV}" | grep -q "Input/output error" &> "$FULL"; then

On the v6.9-rc3 kernel, blockdev --rereadpt command succeeds and prints nothing.
Then grep -q command receives nothing. It fails to match and returns non-zero
status. Then "Return EIO for BLKRRPART on bad disk" is printed even when
blockdev does not output "Input/output error". The if statement logic should be
reverted.

The grep -q command prints nothing, then the redirect to "$FULL" is meaningless.
I think it's the better to redirect blockdev command output to $FULL

The line above is not working as expected on the kernel with your fix patch
either. The blockdev command prints the "Input/output error" message to stderr.
However the pipeline '|' only passes stdout of the blockdev command. Then the
grep command does not receive the message. That's why the message "blockdev:
ioctl error on BLKRRPART: Input/output error" is printed in the test case
output. I suggest not to use the pipeline.

> +		echo "Did not return EIO for BLKRRPART on bad disk"
> +	else
> +		echo "Return EIO for BLKRRPART on bad disk"
> +	fi
> +
> +	# Restore TEST_DEV device to original state
> +	restore_fail_make_request
> +
> +	echo "Test complete"
> +}
> diff --git a/tests/block/035.out b/tests/block/035.out
> new file mode 100644
> index 0000000..125f4b8
> --- /dev/null
> +++ b/tests/block/035.out
> @@ -0,0 +1,4 @@
> +Running block/035
> +blockdev: ioctl error on BLKRRPART: Input/output error
> +Return EIO for BLKRRPART on bad disk
> +Test complete
> -- 
> 2.44.0.683.g7961c838ac-goog
>

Based on my understandings, I suggest the additional change below on top of this
patch. If you are okay with it, I can fold it in when I apply the patch. Or you
can respin the patch and repost. I'm fine with either way. Please let me know
your preference. (If you repost, please modify the test case number from
block/035 to block/036.) Thanks!

diff --git a/tests/block/035 b/tests/block/035
index 0ba6292..d4e67fd 100755
--- a/tests/block/035
+++ b/tests/block/035
@@ -73,10 +73,11 @@ test_device() {
 	allow_fail_make_request
 
 	# Check rereading partitions on bad disk cannot open $TEST_DEV: Input/output error
-	if blockdev --rereadpt "${TEST_DEV}" | grep -q "Input/output error" &> "$FULL"; then
-		echo "Did not return EIO for BLKRRPART on bad disk"
-	else
+	blockdev --rereadpt "${TEST_DEV}" &> "$FULL"
+	if grep -q "Input/output error" "$FULL"; then
 		echo "Return EIO for BLKRRPART on bad disk"
+	else
+		echo "Did not return EIO for BLKRRPART on bad disk"
 	fi
 
 	# Restore TEST_DEV device to original state
diff --git a/tests/block/035.out b/tests/block/035.out
index 125f4b8..0f97f6b 100644
--- a/tests/block/035.out
+++ b/tests/block/035.out
@@ -1,4 +1,3 @@
 Running block/035
-blockdev: ioctl error on BLKRRPART: Input/output error
 Return EIO for BLKRRPART on bad disk
 Test complete
diff mbox series

Patch

diff --git a/tests/block/035 b/tests/block/035
new file mode 100755
index 0000000..0ba6292
--- /dev/null
+++ b/tests/block/035
@@ -0,0 +1,86 @@ 
+#!/bin/bash
+# SPDX-License-Identifier: GPL-3.0+
+# Copyright (C) 2024 Google LLC
+#
+# Regression test for BLKRRPART.
+#
+# If we fail to read the partition table due to bad sector or other IO
+# failures, running "blockdev --rereadpt" should fail and return
+# -EIO.  On a buggy kernel, it passes unexpectedly.
+
+. tests/block/rc
+
+DESCRIPTION="test return EIO from BLKRRPART for whole-dev"
+QUICK=1
+
+DEBUGFS_MNT="/sys/kernel/debug/fail_make_request"
+PROBABILITY=0
+TIMES=0
+VERBOSE=0
+MAKE_FAIL=0
+
+_have_debugfs() {
+	if [[ ! -d "${DEBUGFS_MNT}" ]]; then
+		SKIP_REASONS+=("debugfs does not exist")
+		return 1
+	fi
+	return 0
+}
+
+requires() {
+	_have_debugfs
+}
+
+save_fail_make_request()
+{
+	# Save existing global fail_make_request settings
+	PROBABILITY=$(cat "${DEBUGFS_MNT}"/probability)
+	TIMES=$(cat "${DEBUGFS_MNT}"/times)
+	VERBOSE=$(cat "${DEBUGFS_MNT}"/verbose)
+
+	# Save TEST_DEV make-it-fail setting
+	MAKE_FAIL=$(cat "${TEST_DEV_SYSFS}"/make-it-fail)
+}
+
+allow_fail_make_request()
+{
+	# Allow global fail_make_request feature
+	echo 100 > "${DEBUGFS_MNT}"/probability
+	echo 9999999 > "${DEBUGFS_MNT}"/times
+	echo 0 > "${DEBUGFS_MNT}"/verbose
+
+	# Force TEST_DEV device failure
+	echo 1 > "${TEST_DEV_SYSFS}"/make-it-fail
+}
+
+restore_fail_make_request()
+{
+	echo "${MAKE_FAIL}" > "${TEST_DEV_SYSFS}"/make-it-fail
+
+	# Disallow global fail_make_request feature
+	echo "${PROBABILITY}" > "${DEBUGFS_MNT}"/probability
+	echo "${TIMES}" > "${DEBUGFS_MNT}"/times
+	echo "${VERBOSE}" > "${DEBUGFS_MNT}"/verbose
+}
+
+test_device() {
+	echo "Running ${TEST_NAME}"
+
+	# Save configuration
+	save_fail_make_request
+
+	# set up device for failure
+	allow_fail_make_request
+
+	# Check rereading partitions on bad disk cannot open $TEST_DEV: Input/output error
+	if blockdev --rereadpt "${TEST_DEV}" | grep -q "Input/output error" &> "$FULL"; then
+		echo "Did not return EIO for BLKRRPART on bad disk"
+	else
+		echo "Return EIO for BLKRRPART on bad disk"
+	fi
+
+	# Restore TEST_DEV device to original state
+	restore_fail_make_request
+
+	echo "Test complete"
+}
diff --git a/tests/block/035.out b/tests/block/035.out
new file mode 100644
index 0000000..125f4b8
--- /dev/null
+++ b/tests/block/035.out
@@ -0,0 +1,4 @@ 
+Running block/035
+blockdev: ioctl error on BLKRRPART: Input/output error
+Return EIO for BLKRRPART on bad disk
+Test complete