diff mbox series

btrfs: add a test case to verify the scrub error reports

Message ID 20230606073233.75900-1-wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: add a test case to verify the scrub error reports | expand

Commit Message

Qu Wenruo June 6, 2023, 7:32 a.m. UTC
There is a regression in recent v6.4 cycle where a scrub rewrite changed
how we report errors, especially repairable errors.

Before the rewrite, we report the initial errors hit, and the amount of
repairable errors.
While after the rewrite, we no longer report the initial errors, but
only the number of repairable errors.

This behavior change is a regression, thus needs a test case to prevent
such problem from happening again.

The test case itself would:

- Create a btrfs using DUP data profile and 4K sector size

- Create a file with one 128K extent

- Corrupt the first mirror of that 128K extent

- Scrub and checks the detailed report
  Both corrected errors and csum errors should be 32.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 tests/btrfs/289     | 67 +++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/289.out |  2 ++
 2 files changed, 69 insertions(+)
 create mode 100755 tests/btrfs/289
 create mode 100644 tests/btrfs/289.out

Comments

Filipe Manana June 6, 2023, 9:45 a.m. UTC | #1
On Tue, Jun 6, 2023 at 8:41 AM Qu Wenruo <wqu@suse.com> wrote:
>
> There is a regression in recent v6.4 cycle where a scrub rewrite changed
> how we report errors, especially repairable errors.
>
> Before the rewrite, we report the initial errors hit, and the amount of
> repairable errors.
> While after the rewrite, we no longer report the initial errors, but
> only the number of repairable errors.
>
> This behavior change is a regression, thus needs a test case to prevent
> such problem from happening again.
>
> The test case itself would:
>
> - Create a btrfs using DUP data profile and 4K sector size
>
> - Create a file with one 128K extent
>
> - Corrupt the first mirror of that 128K extent
>
> - Scrub and checks the detailed report
>   Both corrected errors and csum errors should be 32.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  tests/btrfs/289     | 67 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/289.out |  2 ++
>  2 files changed, 69 insertions(+)
>  create mode 100755 tests/btrfs/289
>  create mode 100644 tests/btrfs/289.out
>
> diff --git a/tests/btrfs/289 b/tests/btrfs/289
> new file mode 100755
> index 00000000..914b6280
> --- /dev/null
> +++ b/tests/btrfs/289
> @@ -0,0 +1,67 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2023 SUSE Linux Products GmbH. All Rights Reserved.
> +#
> +# FS QA Test 289
> +#
> +# Make sure btrfs-scrub reports errors correctly for repaired sectors.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick scrub repair
> +
> +# For filedefrag and all the filters

So almost the same comment as in the other test:

This comment is a bit confusing. File defrag? The test doesn't exercise defrag.
I'm not seeing the test using filters either, the test is redirecting
xfs_io's stdout to /dev/null

> +. ./common/filter
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.
> +_supported_fs btrfs
> +_require_scratch
> +
> +_require_odirect
> +# Overwriting data is forbidden on a zoned block device
> +_require_non_zoned_device "${SCRATCH_DEV}"
> +
> +# The errors reported would be in the unit of sector, thus the number
> +# is dependent on the sectorsize.
> +_require_btrfs_support_sectorsize 4096

So same as before, can we please get a _fixed_by_kernel_commit to
identify the patch that fixes the regression?

> +
> +# Create a single btrfs with DUP data profile, and create one 128K file.
> +_scratch_mkfs -s 4k -d dup -b 1G >> $seqres.full 2>&1
> +_scratch_mount
> +$XFS_IO_PROG -f -d -c "pwrite -S 0xaa -b 128K 0 128K" "$SCRATCH_MNT/foobar" \
> +       > /dev/null
> +sync
> +
> +logical=$(_btrfs_get_first_logical "$SCRATCH_MNT/foobar")
> +
> +physical1=$(_btrfs_get_physical ${logical} 1)
> +devpath1=$(_btrfs_get_device_path ${logical} 1)
> +_scratch_unmount
> +
> +echo " corrupt stripe #1, devpath $devpath1 physical $physical1" \
> +       >> $seqres.full
> +$XFS_IO_PROG -d -c "pwrite -S 0xf1 -b 64K $physical1 128K" $devpath1 \
> +       > /dev/null
> +
> +# Mount and do a scrub and compare the ouput

ouput -> output

> +_scratch_mount
> +$BTRFS_UTIL_PROG scrub start -BR $SCRATCH_MNT >> $tmp.scrub_report 2>&1
> +cat $tmp.scrub_report >> $seqres.full
> +
> +# Csum errors should be 128K/4K = 32
> +csum_errors=$(grep "csum_errors" $tmp.scrub_report | awk '{print $2}')

Use $AWK_PROG instead.

> +if [ $csum_errors -ne 32 ]; then
> +       echo "csum_errors incorrect, expect 32 has $csum_errors"
> +fi
> +
> +# And all errors should be repaired, thus corrected errors should also be 32.
> +corrected_errors=$(grep "corrected_errors" $tmp.scrub_report | awk '{print $2}')

Same here, $AWK_PROG instead.

Otherwise, it looks fine, thanks.

> +if [ $corrected_errors -ne 32 ]; then
> +       echo "csum_errors incorrect, expect 32 has $corrected_errors"
> +fi
> +
> +echo "Silence is golden"
> +
> +status=0
> +exit
> diff --git a/tests/btrfs/289.out b/tests/btrfs/289.out
> new file mode 100644
> index 00000000..7d3b7f80
> --- /dev/null
> +++ b/tests/btrfs/289.out
> @@ -0,0 +1,2 @@
> +QA output created by 289
> +Silence is golden
> --
> 2.39.0
>
diff mbox series

Patch

diff --git a/tests/btrfs/289 b/tests/btrfs/289
new file mode 100755
index 00000000..914b6280
--- /dev/null
+++ b/tests/btrfs/289
@@ -0,0 +1,67 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2023 SUSE Linux Products GmbH. All Rights Reserved.
+#
+# FS QA Test 289
+#
+# Make sure btrfs-scrub reports errors correctly for repaired sectors.
+#
+. ./common/preamble
+_begin_fstest auto quick scrub repair
+
+# For filedefrag and all the filters
+. ./common/filter
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs btrfs
+_require_scratch
+
+_require_odirect
+# Overwriting data is forbidden on a zoned block device
+_require_non_zoned_device "${SCRATCH_DEV}"
+
+# The errors reported would be in the unit of sector, thus the number
+# is dependent on the sectorsize.
+_require_btrfs_support_sectorsize 4096
+
+# Create a single btrfs with DUP data profile, and create one 128K file.
+_scratch_mkfs -s 4k -d dup -b 1G >> $seqres.full 2>&1
+_scratch_mount
+$XFS_IO_PROG -f -d -c "pwrite -S 0xaa -b 128K 0 128K" "$SCRATCH_MNT/foobar" \
+	> /dev/null
+sync
+
+logical=$(_btrfs_get_first_logical "$SCRATCH_MNT/foobar")
+
+physical1=$(_btrfs_get_physical ${logical} 1)
+devpath1=$(_btrfs_get_device_path ${logical} 1)
+_scratch_unmount
+
+echo " corrupt stripe #1, devpath $devpath1 physical $physical1" \
+	>> $seqres.full
+$XFS_IO_PROG -d -c "pwrite -S 0xf1 -b 64K $physical1 128K" $devpath1 \
+	> /dev/null
+
+# Mount and do a scrub and compare the ouput
+_scratch_mount
+$BTRFS_UTIL_PROG scrub start -BR $SCRATCH_MNT >> $tmp.scrub_report 2>&1
+cat $tmp.scrub_report >> $seqres.full
+
+# Csum errors should be 128K/4K = 32
+csum_errors=$(grep "csum_errors" $tmp.scrub_report | awk '{print $2}')
+if [ $csum_errors -ne 32 ]; then
+	echo "csum_errors incorrect, expect 32 has $csum_errors"
+fi
+
+# And all errors should be repaired, thus corrected errors should also be 32. 
+corrected_errors=$(grep "corrected_errors" $tmp.scrub_report | awk '{print $2}')
+if [ $corrected_errors -ne 32 ]; then
+	echo "csum_errors incorrect, expect 32 has $corrected_errors"
+fi
+
+echo "Silence is golden"
+
+status=0
+exit
diff --git a/tests/btrfs/289.out b/tests/btrfs/289.out
new file mode 100644
index 00000000..7d3b7f80
--- /dev/null
+++ b/tests/btrfs/289.out
@@ -0,0 +1,2 @@ 
+QA output created by 289
+Silence is golden