[6/7] btrfs-progs: Add test for super block recovery

Message ID	1512463189-24724-7-git-send-email-nborisov@suse.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-btrfs-owner@kernel.org> From: Nikolay Borisov <nborisov@suse.com> To: linux-btrfs@vger.kernel.org Cc: Nikolay Borisov <nborisov@suse.com> Subject: [PATCH 6/7] btrfs-progs: Add test for super block recovery Date: Tue, 5 Dec 2017 10:39:48 +0200 Message-Id: <1512463189-24724-7-git-send-email-nborisov@suse.com> In-Reply-To: <1512463189-24724-1-git-send-email-nborisov@suse.com> References: <1512463189-24724-1-git-send-email-nborisov@suse.com> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk

Nikolay Borisov Dec. 5, 2017, 8:39 a.m. UTC

This functionality regressed some time ago and it was never caught. Seems no
one complained of that, but to be sure add a regression test to prevent future 
regressions.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
 tests/fsck-tests/029-superblock-recovery/test.sh | 64 ++++++++++++++++++++++++
 1 file changed, 64 insertions(+)
 create mode 100755 tests/fsck-tests/029-superblock-recovery/test.sh

Qu Wenruo Dec. 5, 2017, 9:33 a.m. UTC | #1

On 2017年12月05日 16:39, Nikolay Borisov wrote:
> This functionality regressed some time ago and it was never caught. Seems no
> one complained of that, but to be sure add a regression test to prevent future 
> regressions.
> 
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>

One nitpick for the patch sequence, normally we put fix before test
case, to avoid breaking bisect.

> ---
>  tests/fsck-tests/029-superblock-recovery/test.sh | 64 ++++++++++++++++++++++++
>  1 file changed, 64 insertions(+)
>  create mode 100755 tests/fsck-tests/029-superblock-recovery/test.sh
> 
> diff --git a/tests/fsck-tests/029-superblock-recovery/test.sh b/tests/fsck-tests/029-superblock-recovery/test.sh
> new file mode 100755
> index 000000000000..beb78d6ccc22
> --- /dev/null
> +++ b/tests/fsck-tests/029-superblock-recovery/test.sh
> @@ -0,0 +1,64 @@
> +#!/bin/bash
> +# Test that any superblock is correctly detected
> +# and fixed by btrfs rescue
> +
> +source "$TOP/tests/common"
> +
> +check_prereq btrfs
> +check_prereq mkfs.btrfs
> +check_prereq btrfs-select-super
> +
> +setup_root_helper
> +
> +rm -f dev1
> +run_check truncate -s 260G dev1
> +loop=$(run_check_stdout $SUDO_HELPER losetup --find --show dev1)

We have function to do it already.
prepare_test_dev will use loopback device as fallback if $TEST_DEV is
not specified.
Tt can handle size well, and it also uses sparse file so no need to
worry about disk usage.

> +
> +# Create the test file system.
> +run_check $SUDO_HELPER "$TOP"/mkfs.btrfs -f "$loop"
> +
> +function check_corruption {
> +	local sb_offset=$1
> +	local source_sb=$2
> +
> +
> +	# First we ensure we can mount it successfully
> +	run_check $SUDO_HELPER mount $loop "$TEST_MNT"
> +	run_check $SUDO_HELPER umount "$TEST_MNT"
> +
> +	# Now corrupt 1k of the superblock at sb_offset
> +	run_check $SUDO_HELPER dd bs=1K count=1 seek=$(($sb_offset + 1)) if=/dev/zero of="$loop"
> +
> +	#if corrupting one of the sb copies, copy it over the initial superblock
> +	if [ ! -z $source_sb ]; then
> +		local shift_val=$((16 << $source_sb * 12 ))
> +		run_check $SUDO_HELPER dd bs=1K count=4 seek=64 skip=$shift_val if="$loop" of="$loop"
> +	fi

Personally speaking, corrupt 64K (1st super) then corrupt the desired
copy could make the function easier.
Although we need to split the check part from this function, resulting
something like:

corrupt_super 64k
corrupt_super 64m
check_super_recover

> +
> +	run_mustfail "Mounted fs with corrupted superblock" \
> +		$SUDO_HELPER mount $loop "$TEST_MNT"
> +
> +	# Now run btrfs rescue which should fix the superblock. It uses 2
> +	# to signal success of recovery use mayfail to ignore that retval
> +	# but still log the output of the command
> +	run_mayfail $SUDO_HELPER "$TOP"/btrfs rescue super-recover -yv "$loop"
> +	if [ $? != 2 ]; then
> +		_fail "couldn't rescue super"
> +	fi

It's understandable to have return value other than 0 to distinguish
health fs from repairable fs.
But at least let's also put this into man page.

Thanks,
Qu

> +
> +	run_check $SUDO_HELPER mount $loop "$TEST_MNT"
> +	run_check $SUDO_HELPER umount "$TEST_MNT"
> +}
> +
> +_log "Corrupting first superblock"
> +check_corruption 64
> +
> +_log "Corrupting second superblock"
> +check_corruption 65536 1
> +
> +_log "Corrupting third superblock"
> +check_corruption 268435456 2
> +
> +# Cleanup
> +run_check $SUDO_HELPER losetup -d "$loop"
> +rm -f dev1
>

Nikolay Borisov Dec. 5, 2017, 10:04 a.m. UTC | #2

On  5.12.2017 11:33, Qu Wenruo wrote:
> 
> 
> On 2017年12月05日 16:39, Nikolay Borisov wrote:
>> This functionality regressed some time ago and it was never caught. Seems no
>> one complained of that, but to be sure add a regression test to prevent future 
>> regressions.
>>
>> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> 
> One nitpick for the patch sequence, normally we put fix before test
> case, to avoid breaking bisect.
> 
>> ---
>>  tests/fsck-tests/029-superblock-recovery/test.sh | 64 ++++++++++++++++++++++++
>>  1 file changed, 64 insertions(+)
>>  create mode 100755 tests/fsck-tests/029-superblock-recovery/test.sh
>>
>> diff --git a/tests/fsck-tests/029-superblock-recovery/test.sh b/tests/fsck-tests/029-superblock-recovery/test.sh
>> new file mode 100755
>> index 000000000000..beb78d6ccc22
>> --- /dev/null
>> +++ b/tests/fsck-tests/029-superblock-recovery/test.sh
>> @@ -0,0 +1,64 @@
>> +#!/bin/bash
>> +# Test that any superblock is correctly detected
>> +# and fixed by btrfs rescue
>> +
>> +source "$TOP/tests/common"
>> +
>> +check_prereq btrfs
>> +check_prereq mkfs.btrfs
>> +check_prereq btrfs-select-super
>> +
>> +setup_root_helper
>> +
>> +rm -f dev1
>> +run_check truncate -s 260G dev1
>> +loop=$(run_check_stdout $SUDO_HELPER losetup --find --show dev1)
> 
> We have function to do it already.
> prepare_test_dev will use loopback device as fallback if $TEST_DEV is
> not specified.
> Tt can handle size well, and it also uses sparse file so no need to
> worry about disk usage.

Then the test suite is not very consistent, since I copied this loopback
handling from some other test.

> 
>> +
>> +# Create the test file system.
>> +run_check $SUDO_HELPER "$TOP"/mkfs.btrfs -f "$loop"
>> +
>> +function check_corruption {
>> +	local sb_offset=$1
>> +	local source_sb=$2
>> +
>> +
>> +	# First we ensure we can mount it successfully
>> +	run_check $SUDO_HELPER mount $loop "$TEST_MNT"
>> +	run_check $SUDO_HELPER umount "$TEST_MNT"
>> +
>> +	# Now corrupt 1k of the superblock at sb_offset
>> +	run_check $SUDO_HELPER dd bs=1K count=1 seek=$(($sb_offset + 1)) if=/dev/zero of="$loop"
>> +
>> +	#if corrupting one of the sb copies, copy it over the initial superblock
>> +	if [ ! -z $source_sb ]; then
>> +		local shift_val=$((16 << $source_sb * 12 ))
>> +		run_check $SUDO_HELPER dd bs=1K count=4 seek=64 skip=$shift_val if="$loop" of="$loop"
>> +	fi
> 
> Personally speaking, corrupt 64K (1st super) then corrupt the desired
> copy could make the function easier.
> Although we need to split the check part from this function, resulting
> something like:
> 
> corrupt_super 64k
> corrupt_super 64m
> check_super_recover
I'm reluctant to change this function any more.  It has comments on all
logical steps and is self-contained and I'd rather keep it that way.

> 
>> +
>> +	run_mustfail "Mounted fs with corrupted superblock" \
>> +		$SUDO_HELPER mount $loop "$TEST_MNT"
>> +
>> +	# Now run btrfs rescue which should fix the superblock. It uses 2
>> +	# to signal success of recovery use mayfail to ignore that retval
>> +	# but still log the output of the command
>> +	run_mayfail $SUDO_HELPER "$TOP"/btrfs rescue super-recover -yv "$loop"
>> +	if [ $? != 2 ]; then
>> +		_fail "couldn't rescue super"
>> +	fi
> 
> It's understandable to have return value other than 0 to distinguish
> health fs from repairable fs.
> But at least let's also put this into man page.

Yeah, tell me about it, super recovery actually has 5 return values:

7985fe64e0e2 ("Btrfs-progs: add super-recover to recover bad supers")

    There will be five kinds of return values:

    0: all supers are valid, no need to recover
    1: usage or syntax error
    2: recover all bad superblocks successfully
    3: fail to recover bad superblocks
    4: abort to recover bad superblocks


> 
> Thanks,
> Qu
> 
>> +
>> +	run_check $SUDO_HELPER mount $loop "$TEST_MNT"
>> +	run_check $SUDO_HELPER umount "$TEST_MNT"
>> +}
>> +
>> +_log "Corrupting first superblock"
>> +check_corruption 64
>> +
>> +_log "Corrupting second superblock"
>> +check_corruption 65536 1
>> +
>> +_log "Corrupting third superblock"
>> +check_corruption 268435456 2
>> +
>> +# Cleanup
>> +run_check $SUDO_HELPER losetup -d "$loop"
>> +rm -f dev1
>>
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Qu Wenruo Dec. 5, 2017, 11:12 a.m. UTC | #3

On 2017年12月05日 18:04, Nikolay Borisov wrote:
> 
> 
> On  5.12.2017 11:33, Qu Wenruo wrote:
>>
>>
>> On 2017年12月05日 16:39, Nikolay Borisov wrote:
>>> This functionality regressed some time ago and it was never caught. Seems no
>>> one complained of that, but to be sure add a regression test to prevent future 
>>> regressions.
>>>
>>> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
>>
>> One nitpick for the patch sequence, normally we put fix before test
>> case, to avoid breaking bisect.
>>
>>> ---
>>>  tests/fsck-tests/029-superblock-recovery/test.sh | 64 ++++++++++++++++++++++++
>>>  1 file changed, 64 insertions(+)
>>>  create mode 100755 tests/fsck-tests/029-superblock-recovery/test.sh
>>>
>>> diff --git a/tests/fsck-tests/029-superblock-recovery/test.sh b/tests/fsck-tests/029-superblock-recovery/test.sh
>>> new file mode 100755
>>> index 000000000000..beb78d6ccc22
>>> --- /dev/null
>>> +++ b/tests/fsck-tests/029-superblock-recovery/test.sh
>>> @@ -0,0 +1,64 @@
>>> +#!/bin/bash
>>> +# Test that any superblock is correctly detected
>>> +# and fixed by btrfs rescue
>>> +
>>> +source "$TOP/tests/common"
>>> +
>>> +check_prereq btrfs
>>> +check_prereq mkfs.btrfs
>>> +check_prereq btrfs-select-super
>>> +
>>> +setup_root_helper
>>> +
>>> +rm -f dev1
>>> +run_check truncate -s 260G dev1
>>> +loop=$(run_check_stdout $SUDO_HELPER losetup --find --show dev1)
>>
>> We have function to do it already.
>> prepare_test_dev will use loopback device as fallback if $TEST_DEV is
>> not specified.
>> Tt can handle size well, and it also uses sparse file so no need to
>> worry about disk usage.
> 
> Then the test suite is not very consistent, since I copied this loopback
> handling from some other test.

The same feeling when I am pointed that something can be replaced by
wrappers in fstests.

Some of them can be cleaned up later.

> 
>>
>>> +
>>> +# Create the test file system.
>>> +run_check $SUDO_HELPER "$TOP"/mkfs.btrfs -f "$loop"
>>> +
>>> +function check_corruption {
>>> +	local sb_offset=$1
>>> +	local source_sb=$2
>>> +
>>> +
>>> +	# First we ensure we can mount it successfully
>>> +	run_check $SUDO_HELPER mount $loop "$TEST_MNT"
>>> +	run_check $SUDO_HELPER umount "$TEST_MNT"
>>> +
>>> +	# Now corrupt 1k of the superblock at sb_offset
>>> +	run_check $SUDO_HELPER dd bs=1K count=1 seek=$(($sb_offset + 1)) if=/dev/zero of="$loop"
>>> +
>>> +	#if corrupting one of the sb copies, copy it over the initial superblock
>>> +	if [ ! -z $source_sb ]; then
>>> +		local shift_val=$((16 << $source_sb * 12 ))
>>> +		run_check $SUDO_HELPER dd bs=1K count=4 seek=64 skip=$shift_val if="$loop" of="$loop"
>>> +	fi
>>
>> Personally speaking, corrupt 64K (1st super) then corrupt the desired
>> copy could make the function easier.
>> Although we need to split the check part from this function, resulting
>> something like:
>>
>> corrupt_super 64k
>> corrupt_super 64m
>> check_super_recover
> I'm reluctant to change this function any more.  It has comments on all
> logical steps and is self-contained and I'd rather keep it that way.
> 
>>
>>> +
>>> +	run_mustfail "Mounted fs with corrupted superblock" \
>>> +		$SUDO_HELPER mount $loop "$TEST_MNT"
>>> +
>>> +	# Now run btrfs rescue which should fix the superblock. It uses 2
>>> +	# to signal success of recovery use mayfail to ignore that retval
>>> +	# but still log the output of the command
>>> +	run_mayfail $SUDO_HELPER "$TOP"/btrfs rescue super-recover -yv "$loop"
>>> +	if [ $? != 2 ]; then
>>> +		_fail "couldn't rescue super"
>>> +	fi
>>
>> It's understandable to have return value other than 0 to distinguish
>> health fs from repairable fs.
>> But at least let's also put this into man page.
> 
> Yeah, tell me about it, super recovery actually has 5 return values:
> 
> 7985fe64e0e2 ("Btrfs-progs: add super-recover to recover bad supers")
> 
>     There will be five kinds of return values:
> 
>     0: all supers are valid, no need to recover
>     1: usage or syntax error
>     2: recover all bad superblocks successfully
>     3: fail to recover bad superblocks
>     4: abort to recover bad superblocks

Since we all agree that the return value is a messy,
maybe we could just simplify it to 0 (all valid or successful recover)
and 1 (the rest)?

Thanks,
Qu

> 
> 
>>
>> Thanks,
>> Qu
>>
>>> +
>>> +	run_check $SUDO_HELPER mount $loop "$TEST_MNT"
>>> +	run_check $SUDO_HELPER umount "$TEST_MNT"
>>> +}
>>> +
>>> +_log "Corrupting first superblock"
>>> +check_corruption 64
>>> +
>>> +_log "Corrupting second superblock"
>>> +check_corruption 65536 1
>>> +
>>> +_log "Corrupting third superblock"
>>> +check_corruption 268435456 2
>>> +
>>> +# Cleanup
>>> +run_check $SUDO_HELPER losetup -d "$loop"
>>> +rm -f dev1
>>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Nikolay Borisov Dec. 5, 2017, 11:26 a.m. UTC | #4

On  5.12.2017 13:12, Qu Wenruo wrote:
> 
> 
> On 2017年12月05日 18:04, Nikolay Borisov wrote:
>>
>>
>> On  5.12.2017 11:33, Qu Wenruo wrote:
>>>
>>>
>>> On 2017年12月05日 16:39, Nikolay Borisov wrote:
>>>> This functionality regressed some time ago and it was never caught. Seems no
>>>> one complained of that, but to be sure add a regression test to prevent future 
>>>> regressions.
>>>>
>>>> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
>>>
>>> One nitpick for the patch sequence, normally we put fix before test
>>> case, to avoid breaking bisect.
>>>
>>>> ---
>>>>  tests/fsck-tests/029-superblock-recovery/test.sh | 64 ++++++++++++++++++++++++
>>>>  1 file changed, 64 insertions(+)
>>>>  create mode 100755 tests/fsck-tests/029-superblock-recovery/test.sh
>>>>
>>>> diff --git a/tests/fsck-tests/029-superblock-recovery/test.sh b/tests/fsck-tests/029-superblock-recovery/test.sh
>>>> new file mode 100755
>>>> index 000000000000..beb78d6ccc22
>>>> --- /dev/null
>>>> +++ b/tests/fsck-tests/029-superblock-recovery/test.sh
>>>> @@ -0,0 +1,64 @@
>>>> +#!/bin/bash
>>>> +# Test that any superblock is correctly detected
>>>> +# and fixed by btrfs rescue
>>>> +
>>>> +source "$TOP/tests/common"
>>>> +
>>>> +check_prereq btrfs
>>>> +check_prereq mkfs.btrfs
>>>> +check_prereq btrfs-select-super
>>>> +
>>>> +setup_root_helper
>>>> +
>>>> +rm -f dev1
>>>> +run_check truncate -s 260G dev1
>>>> +loop=$(run_check_stdout $SUDO_HELPER losetup --find --show dev1)
>>>
>>> We have function to do it already.
>>> prepare_test_dev will use loopback device as fallback if $TEST_DEV is
>>> not specified.
>>> Tt can handle size well, and it also uses sparse file so no need to
>>> worry about disk usage.
>>
>> Then the test suite is not very consistent, since I copied this loopback
>> handling from some other test.
> 
> The same feeling when I am pointed that something can be replaced by
> wrappers in fstests.
> 
> Some of them can be cleaned up later.
> 
>>
>>>
>>>> +
>>>> +# Create the test file system.
>>>> +run_check $SUDO_HELPER "$TOP"/mkfs.btrfs -f "$loop"
>>>> +
>>>> +function check_corruption {
>>>> +	local sb_offset=$1
>>>> +	local source_sb=$2
>>>> +
>>>> +
>>>> +	# First we ensure we can mount it successfully
>>>> +	run_check $SUDO_HELPER mount $loop "$TEST_MNT"
>>>> +	run_check $SUDO_HELPER umount "$TEST_MNT"
>>>> +
>>>> +	# Now corrupt 1k of the superblock at sb_offset
>>>> +	run_check $SUDO_HELPER dd bs=1K count=1 seek=$(($sb_offset + 1)) if=/dev/zero of="$loop"
>>>> +
>>>> +	#if corrupting one of the sb copies, copy it over the initial superblock
>>>> +	if [ ! -z $source_sb ]; then
>>>> +		local shift_val=$((16 << $source_sb * 12 ))
>>>> +		run_check $SUDO_HELPER dd bs=1K count=4 seek=64 skip=$shift_val if="$loop" of="$loop"
>>>> +	fi
>>>
>>> Personally speaking, corrupt 64K (1st super) then corrupt the desired
>>> copy could make the function easier.
>>> Although we need to split the check part from this function, resulting
>>> something like:
>>>
>>> corrupt_super 64k
>>> corrupt_super 64m
>>> check_super_recover
>> I'm reluctant to change this function any more.  It has comments on all
>> logical steps and is self-contained and I'd rather keep it that way.
>>
>>>
>>>> +
>>>> +	run_mustfail "Mounted fs with corrupted superblock" \
>>>> +		$SUDO_HELPER mount $loop "$TEST_MNT"
>>>> +
>>>> +	# Now run btrfs rescue which should fix the superblock. It uses 2
>>>> +	# to signal success of recovery use mayfail to ignore that retval
>>>> +	# but still log the output of the command
>>>> +	run_mayfail $SUDO_HELPER "$TOP"/btrfs rescue super-recover -yv "$loop"
>>>> +	if [ $? != 2 ]; then
>>>> +		_fail "couldn't rescue super"
>>>> +	fi
>>>
>>> It's understandable to have return value other than 0 to distinguish
>>> health fs from repairable fs.
>>> But at least let's also put this into man page.
>>
>> Yeah, tell me about it, super recovery actually has 5 return values:
>>
>> 7985fe64e0e2 ("Btrfs-progs: add super-recover to recover bad supers")
>>
>>     There will be five kinds of return values:
>>
>>     0: all supers are valid, no need to recover
>>     1: usage or syntax error
>>     2: recover all bad superblocks successfully
>>     3: fail to recover bad superblocks
>>     4: abort to recover bad superblocks
> 
> Since we all agree that the return value is a messy,
> maybe we could just simplify it to 0 (all valid or successful recover)
> and 1 (the rest)?

I have no objection, but it's out of the scope of the current series.

> 
> Thanks,
> Qu
> 
>>
>>
>>>
>>> Thanks,
>>> Qu
>>>
>>>> +
>>>> +	run_check $SUDO_HELPER mount $loop "$TEST_MNT"
>>>> +	run_check $SUDO_HELPER umount "$TEST_MNT"
>>>> +}
>>>> +
>>>> +_log "Corrupting first superblock"
>>>> +check_corruption 64
>>>> +
>>>> +_log "Corrupting second superblock"
>>>> +check_corruption 65536 1
>>>> +
>>>> +_log "Corrupting third superblock"
>>>> +check_corruption 268435456 2
>>>> +
>>>> +# Cleanup
>>>> +run_check $SUDO_HELPER losetup -d "$loop"
>>>> +rm -f dev1
>>>>
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Qu Wenruo Dec. 5, 2017, 12:13 p.m. UTC | #5

On 2017年12月05日 19:26, Nikolay Borisov wrote:
> 
> 
> On  5.12.2017 13:12, Qu Wenruo wrote:
>>
>>
>> On 2017年12月05日 18:04, Nikolay Borisov wrote:
>>>
>>>
>>> On  5.12.2017 11:33, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2017年12月05日 16:39, Nikolay Borisov wrote:
>>>>> This functionality regressed some time ago and it was never caught. Seems no
>>>>> one complained of that, but to be sure add a regression test to prevent future 
>>>>> regressions.
>>>>>
>>>>> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
>>>>
>>>> One nitpick for the patch sequence, normally we put fix before test
>>>> case, to avoid breaking bisect.
>>>>
>>>>> ---
>>>>>  tests/fsck-tests/029-superblock-recovery/test.sh | 64 ++++++++++++++++++++++++
>>>>>  1 file changed, 64 insertions(+)
>>>>>  create mode 100755 tests/fsck-tests/029-superblock-recovery/test.sh
>>>>>
>>>>> diff --git a/tests/fsck-tests/029-superblock-recovery/test.sh b/tests/fsck-tests/029-superblock-recovery/test.sh
>>>>> new file mode 100755
>>>>> index 000000000000..beb78d6ccc22
>>>>> --- /dev/null
>>>>> +++ b/tests/fsck-tests/029-superblock-recovery/test.sh
>>>>> @@ -0,0 +1,64 @@
>>>>> +#!/bin/bash
>>>>> +# Test that any superblock is correctly detected
>>>>> +# and fixed by btrfs rescue
>>>>> +
>>>>> +source "$TOP/tests/common"
>>>>> +
>>>>> +check_prereq btrfs
>>>>> +check_prereq mkfs.btrfs
>>>>> +check_prereq btrfs-select-super
>>>>> +
>>>>> +setup_root_helper
>>>>> +
>>>>> +rm -f dev1
>>>>> +run_check truncate -s 260G dev1
>>>>> +loop=$(run_check_stdout $SUDO_HELPER losetup --find --show dev1)
>>>>
>>>> We have function to do it already.
>>>> prepare_test_dev will use loopback device as fallback if $TEST_DEV is
>>>> not specified.
>>>> Tt can handle size well, and it also uses sparse file so no need to
>>>> worry about disk usage.
>>>
>>> Then the test suite is not very consistent, since I copied this loopback
>>> handling from some other test.
>>
>> The same feeling when I am pointed that something can be replaced by
>> wrappers in fstests.
>>
>> Some of them can be cleaned up later.
>>
>>>
>>>>
>>>>> +
>>>>> +# Create the test file system.
>>>>> +run_check $SUDO_HELPER "$TOP"/mkfs.btrfs -f "$loop"
>>>>> +
>>>>> +function check_corruption {
>>>>> +	local sb_offset=$1
>>>>> +	local source_sb=$2
>>>>> +
>>>>> +
>>>>> +	# First we ensure we can mount it successfully
>>>>> +	run_check $SUDO_HELPER mount $loop "$TEST_MNT"
>>>>> +	run_check $SUDO_HELPER umount "$TEST_MNT"
>>>>> +
>>>>> +	# Now corrupt 1k of the superblock at sb_offset
>>>>> +	run_check $SUDO_HELPER dd bs=1K count=1 seek=$(($sb_offset + 1)) if=/dev/zero of="$loop"
>>>>> +
>>>>> +	#if corrupting one of the sb copies, copy it over the initial superblock
>>>>> +	if [ ! -z $source_sb ]; then
>>>>> +		local shift_val=$((16 << $source_sb * 12 ))
>>>>> +		run_check $SUDO_HELPER dd bs=1K count=4 seek=64 skip=$shift_val if="$loop" of="$loop"
>>>>> +	fi
>>>>
>>>> Personally speaking, corrupt 64K (1st super) then corrupt the desired
>>>> copy could make the function easier.
>>>> Although we need to split the check part from this function, resulting
>>>> something like:
>>>>
>>>> corrupt_super 64k
>>>> corrupt_super 64m
>>>> check_super_recover
>>> I'm reluctant to change this function any more.  It has comments on all
>>> logical steps and is self-contained and I'd rather keep it that way.
>>>
>>>>
>>>>> +
>>>>> +	run_mustfail "Mounted fs with corrupted superblock" \
>>>>> +		$SUDO_HELPER mount $loop "$TEST_MNT"
>>>>> +
>>>>> +	# Now run btrfs rescue which should fix the superblock. It uses 2
>>>>> +	# to signal success of recovery use mayfail to ignore that retval
>>>>> +	# but still log the output of the command
>>>>> +	run_mayfail $SUDO_HELPER "$TOP"/btrfs rescue super-recover -yv "$loop"
>>>>> +	if [ $? != 2 ]; then
>>>>> +		_fail "couldn't rescue super"
>>>>> +	fi
>>>>
>>>> It's understandable to have return value other than 0 to distinguish
>>>> health fs from repairable fs.
>>>> But at least let's also put this into man page.
>>>
>>> Yeah, tell me about it, super recovery actually has 5 return values:
>>>
>>> 7985fe64e0e2 ("Btrfs-progs: add super-recover to recover bad supers")
>>>
>>>     There will be five kinds of return values:
>>>
>>>     0: all supers are valid, no need to recover
>>>     1: usage or syntax error
>>>     2: recover all bad superblocks successfully
>>>     3: fail to recover bad superblocks
>>>     4: abort to recover bad superblocks
>>
>> Since we all agree that the return value is a messy,
>> maybe we could just simplify it to 0 (all valid or successful recover)
>> and 1 (the rest)?
> 
> I have no objection, but it's out of the scope of the current series.

Yep, could be done as another patchset.

Thanks,
Qu
> 
>>
>> Thanks,
>> Qu
>>
>>>
>>>
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>> +
>>>>> +	run_check $SUDO_HELPER mount $loop "$TEST_MNT"
>>>>> +	run_check $SUDO_HELPER umount "$TEST_MNT"
>>>>> +}
>>>>> +
>>>>> +_log "Corrupting first superblock"
>>>>> +check_corruption 64
>>>>> +
>>>>> +_log "Corrupting second superblock"
>>>>> +check_corruption 65536 1
>>>>> +
>>>>> +_log "Corrupting third superblock"
>>>>> +check_corruption 268435456 2
>>>>> +
>>>>> +# Cleanup
>>>>> +run_check $SUDO_HELPER losetup -d "$loop"
>>>>> +rm -f dev1
>>>>>
>>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

David Sterba Jan. 23, 2018, 3:07 p.m. UTC | #6

On Tue, Dec 05, 2017 at 12:04:49PM +0200, Nikolay Borisov wrote:
> >> --- /dev/null
> >> +++ b/tests/fsck-tests/029-superblock-recovery/test.sh
> >> @@ -0,0 +1,64 @@
> >> +#!/bin/bash
> >> +# Test that any superblock is correctly detected
> >> +# and fixed by btrfs rescue
> >> +
> >> +source "$TOP/tests/common"
> >> +
> >> +check_prereq btrfs
> >> +check_prereq mkfs.btrfs
> >> +check_prereq btrfs-select-super
> >> +
> >> +setup_root_helper
> >> +
> >> +rm -f dev1
> >> +run_check truncate -s 260G dev1
> >> +loop=$(run_check_stdout $SUDO_HELPER losetup --find --show dev1)
> > 
> > We have function to do it already.
> > prepare_test_dev will use loopback device as fallback if $TEST_DEV is
> > not specified.
> > Tt can handle size well, and it also uses sparse file so no need to
> > worry about disk usage.
> 
> Then the test suite is not very consistent, since I copied this loopback
> handling from some other test.

From which one? Some tests have special needs and may set up the loop
device on their own. The rest should use the helpers in tests/common and
I don't see anything that would not be provided by them.

The consistency in the testsuite is being improved as we find and
identify the patterns worth wrapping into helpers so you may see some
code duplication.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Nikolay Borisov Jan. 23, 2018, 3:29 p.m. UTC | #7

On 23.01.2018 17:07, David Sterba wrote:
> On Tue, Dec 05, 2017 at 12:04:49PM +0200, Nikolay Borisov wrote:
>>>> --- /dev/null
>>>> +++ b/tests/fsck-tests/029-superblock-recovery/test.sh
>>>> @@ -0,0 +1,64 @@
>>>> +#!/bin/bash
>>>> +# Test that any superblock is correctly detected
>>>> +# and fixed by btrfs rescue
>>>> +
>>>> +source "$TOP/tests/common"
>>>> +
>>>> +check_prereq btrfs
>>>> +check_prereq mkfs.btrfs
>>>> +check_prereq btrfs-select-super
>>>> +
>>>> +setup_root_helper
>>>> +
>>>> +rm -f dev1
>>>> +run_check truncate -s 260G dev1
>>>> +loop=$(run_check_stdout $SUDO_HELPER losetup --find --show dev1)
>>>
>>> We have function to do it already.
>>> prepare_test_dev will use loopback device as fallback if $TEST_DEV is
>>> not specified.
>>> Tt can handle size well, and it also uses sparse file so no need to
>>> worry about disk usage.
>>
>> Then the test suite is not very consistent, since I copied this loopback
>> handling from some other test.
> 
> From which one? Some tests have special needs and may set up the loop
> device on their own. The rest should use the helpers in tests/common and
> I don't see anything that would not be provided by them.

Looking back at the code it seems it could have been from
021-image-multi-devices.

Anyway, I haven't resubmitted the sequence since I was waiting for
feedback. Qu already mentioned this could be refactored.


> 
> The consistency in the testsuite is being improved as we find and
> identify the patterns worth wrapping into helpers so you may see some
> code duplication.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

David Sterba Jan. 23, 2018, 3:39 p.m. UTC | #8

On Tue, Jan 23, 2018 at 05:29:57PM +0200, Nikolay Borisov wrote:
> On 23.01.2018 17:07, David Sterba wrote:
> > On Tue, Dec 05, 2017 at 12:04:49PM +0200, Nikolay Borisov wrote:
> >>>> --- /dev/null
> >>>> +++ b/tests/fsck-tests/029-superblock-recovery/test.sh
> >>>> @@ -0,0 +1,64 @@
> >>>> +#!/bin/bash
> >>>> +# Test that any superblock is correctly detected
> >>>> +# and fixed by btrfs rescue
> >>>> +
> >>>> +source "$TOP/tests/common"
> >>>> +
> >>>> +check_prereq btrfs
> >>>> +check_prereq mkfs.btrfs
> >>>> +check_prereq btrfs-select-super
> >>>> +
> >>>> +setup_root_helper
> >>>> +
> >>>> +rm -f dev1
> >>>> +run_check truncate -s 260G dev1
> >>>> +loop=$(run_check_stdout $SUDO_HELPER losetup --find --show dev1)
> >>>
> >>> We have function to do it already.
> >>> prepare_test_dev will use loopback device as fallback if $TEST_DEV is
> >>> not specified.
> >>> Tt can handle size well, and it also uses sparse file so no need to
> >>> worry about disk usage.
> >>
> >> Then the test suite is not very consistent, since I copied this loopback
> >> handling from some other test.
> > 
> > From which one? Some tests have special needs and may set up the loop
> > device on their own. The rest should use the helpers in tests/common and
> > I don't see anything that would not be provided by them.
> 
> Looking back at the code it seems it could have been from
> 021-image-multi-devices.

That was my guess too. There are now multi-loop device helpers, this
test has been added before that.

> Anyway, I haven't resubmitted the sequence since I was waiting for
> feedback. Qu already mentioned this could be refactored.

I've applied your patch without changes and updated it in another commit
as it changed more than just some trivial bits.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[6/7] btrfs-progs: Add test for super block recovery

Commit Message

Comments

Patch