diff mbox series

[v2] btrfs: add a test case to make sure scrub can repair parity corruption

Message ID 20230724023606.91107-1-wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series [v2] btrfs: add a test case to make sure scrub can repair parity corruption | expand

Commit Message

Qu Wenruo July 24, 2023, 2:36 a.m. UTC
There is a kernel regression caused by commit 75b470332965 ("btrfs:
raid56: migrate recovery and scrub recovery path to use error_bitmap"),
which leads to scrub not repairing corrupted parity stripes.

So here we add a test case to verify the P/Q stripe scrub behavior by:

- Create a RAID5 or RAID6 btrfs with the minimal number of devices
  This means 2 devices for RAID5, and 3 devices for RAID6.
  As a result, the parity stripe is a mirror of the only data
  stripe.

  And since we control the content of the data stripe, the content
  of the P stripe is also fixed.

- Create a 64K file
  The file covers exactly one data stripe.

- Corrupt the P stripe

- Scrub the fs
  If scrub is working, the P stripe would be repaired.

  Unfortunately scrub cannot report any P/Q corruption, limited by its
  reporting structure.
  So we cannot use the return value of scrub to determine whether we
  repaired anything.

- Verify the content of the P stripe

- Use "btrfs check --check-data-csum" to double check

With the above steps, we can verify whether the P stripe is properly fixed.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Rebase to the latest misc-next
- Use the space_cache=v2 mount option instead of nospace_cache
  New features like the block group tree and extent tree v2 require the
  v2 cache
- Fix a white space error
---
 tests/btrfs/297     | 85 +++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/297.out |  2 ++
 2 files changed, 87 insertions(+)
 create mode 100755 tests/btrfs/297
 create mode 100644 tests/btrfs/297.out
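The commit message's premise — that on a minimal array the P stripe mirrors the sole data stripe — follows from RAID5 parity being the byte-wise XOR of all data stripes: the XOR "sum" over a single operand is the operand itself. A quick standalone illustration (not part of the patch; the byte values are arbitrary):

```shell
#!/bin/sh
# RAID5 parity P is the byte-wise XOR of all data stripes.
d1=$((0xaa)); d2=$((0x5c))

# Two data stripes (3-device RAID5): P = d1 XOR d2.
printf 'P over two stripes: 0x%02x\n' $((d1 ^ d2))

# One data stripe (2-device RAID5): XOR of a single operand is
# itself, so P is a byte-for-byte mirror of the data stripe.
printf 'P over one stripe:  0x%02x\n' $((d1))
```

This is why the test can write a file of known content (0xaa) and then compare the raw P stripe bytes against that same pattern.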

Comments

Anand Jain July 24, 2023, 7:48 a.m. UTC | #1
On 24/7/23 10:36, Qu Wenruo wrote:
> [...]
> +# SPDX-License-Identifier: GPL-2.0

> +# Copyright (c) 2023 YOUR NAME HERE.  All Rights Reserved.

NIT: Actual name is required here.

Rest of the code looks good.

Thanks.

> [...]
Zorro Lang July 25, 2023, 10:36 a.m. UTC | #2
On Mon, Jul 24, 2023 at 03:48:52PM +0800, Anand Jain wrote:
> On 24/7/23 10:36, Qu Wenruo wrote:
> > [...]
> > +# Copyright (c) 2023 YOUR NAME HERE.  All Rights Reserved.
> 
> NIT: Actual name is required here.
> 
> Rest of the code looks good.

If there's not more review points, I can help to
   s/YOUR NAME HERE/SUSE Linux Products GmbH
when I merge it.

BTW, do you mean there's a RVB from you?

Thanks,
Zorro

> [...]
Anand Jain July 25, 2023, 11:13 a.m. UTC | #3
On 25/07/2023 18:36, Zorro Lang wrote:
> On Mon, Jul 24, 2023 at 03:48:52PM +0800, Anand Jain wrote:
>> On 24/7/23 10:36, Qu Wenruo wrote:
>>> [...]
>>> +# Copyright (c) 2023 YOUR NAME HERE.  All Rights Reserved.
>>
>> NIT: Actual name is required here.
>>
>> Rest of the code looks good.
> 
> If there's not more review points, I can help to
>     s/YOUR NAME HERE/SUSE Linux Products GmbH
> when I merge it.
> 
> BTW, do you mean there's a RVB from you?

   Qu has sent v3 for this patch and there is my RVB as well.
Thanks.

> [...]
Zorro Lang July 25, 2023, 11:33 a.m. UTC | #4
On Tue, Jul 25, 2023 at 07:13:02PM +0800, Anand Jain wrote:
> 
> 
> On 25/07/2023 18:36, Zorro Lang wrote:
> > On Mon, Jul 24, 2023 at 03:48:52PM +0800, Anand Jain wrote:
> > > On 24/7/23 10:36, Qu Wenruo wrote:
> > > [...]
> > If there's not more review points, I can help to
> >     s/YOUR NAME HERE/SUSE Linux Products GmbH
> > when I merge it.
> > 
> > BTW, do you mean there's a RVB from you?
> 
>   Qu has sent v3 for this patch and there is my RVB as well.
> Thanks.

OK, got it, thanks:)

> [...]

Patch

diff --git a/tests/btrfs/297 b/tests/btrfs/297
new file mode 100755
index 00000000..852c3ace
--- /dev/null
+++ b/tests/btrfs/297
@@ -0,0 +1,85 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2023 YOUR NAME HERE.  All Rights Reserved.
+#
+# FS QA Test 297
+#
+# Make sure btrfs scrub can fix parity stripe corruption
+#
+. ./common/preamble
+_begin_fstest auto quick raid scrub
+
+. ./common/filter
+
+_supported_fs btrfs
+_require_odirect
+_require_non_zoned_device "${SCRATCH_DEV}"
+_require_scratch_dev_pool 3
+_fixed_by_kernel_commit 486c737f7fdc \
+	"btrfs: raid56: always verify the P/Q contents for scrub"
+
+workload()
+{
+	local profile=$1
+	local nr_devs=$2
+
+	echo "=== Testing $nr_devs devices $profile ===" >> $seqres.full
+	_scratch_dev_pool_get $nr_devs
+
+	_scratch_pool_mkfs -d $profile -m single >> $seqres.full 2>&1
+	# Use the v2 space cache to prevent the v1 space cache from
+	# affecting the result.
+	_scratch_mount -o space_cache=v2
+
+	# Create one 64K extent which covers one data stripe.
+	$XFS_IO_PROG -f -d -c "pwrite -S 0xaa -b 64K 0 64K" \
+		"$SCRATCH_MNT/foobar" > /dev/null
+	sync
+
+	# Corrupt the P/Q stripe
+	local logical=$(_btrfs_get_first_logical $SCRATCH_MNT/foobar)
+
+	# The 2nd copy points directly to the P stripe.
+	physical_p=$(_btrfs_get_physical ${logical} 2)
+	devpath_p=$(_btrfs_get_device_path ${logical} 2)
+
+	_scratch_unmount
+
+	echo "Corrupt stripe P at devpath $devpath_p physical $physical_p" \
+		>> $seqres.full
+	$XFS_IO_PROG -d -c "pwrite -S 0xff -b 64K $physical_p 64K" $devpath_p \
+		> /dev/null
+
+	# Do a scrub to try to repair the P stripe.
+	_scratch_mount -o space_cache=v2
+	$BTRFS_UTIL_PROG scrub start -BdR $SCRATCH_MNT >> $seqres.full 2>&1
+	_scratch_unmount
+
+	# Verify the repaired content directly
+	local output=$($XFS_IO_PROG -c "pread -qv $physical_p 16" $devpath_p | _filter_xfs_io_offset)
+	local expect="XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa  ................"
+
+	echo "The first 16 bytes of parity stripe after scrub:" >> $seqres.full
+	echo "$output" >> $seqres.full
+	if [ "$output" != "$expect" ]; then
+		echo "Unexpected parity content"
+		echo "has:"
+		echo "$output"
+		echo "expect:"
+		echo "$expect"
+	fi
+
+	# As a last safety net, let btrfs check --check-data-csum do an offline scrub.
+	$BTRFS_UTIL_PROG check --check-data-csum $SCRATCH_DEV >> $seqres.full 2>&1
+	if [ $? -ne 0 ]; then
+		echo "Error detected after the scrub"
+	fi
+	_scratch_dev_pool_put
+}
+
+workload raid5 2
+workload raid6 3
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/btrfs/297.out b/tests/btrfs/297.out
new file mode 100644
index 00000000..41c373c4
--- /dev/null
+++ b/tests/btrfs/297.out
@@ -0,0 +1,2 @@ 
+QA output created by 297
+Silence is golden
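A side note on the verification step above: `pread -v` prints the absolute offset of each hexdump line, which varies with where the P stripe landed on disk, so the test pipes the output through `_filter_xfs_io_offset` before comparing against the `XXXXXXXX:`-prefixed expected string. A rough stand-in for what that filter does (a sketch; the real fstests helper may differ in details):

```shell
#!/bin/sh
# xfs_io "pread -v" emits hexdump lines whose leading offset depends on
# the physical location being read; mask it so the comparison is stable
# across devices and layouts.
line='00080000:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa  ................'
printf '%s\n' "$line" | sed -E 's/^[0-9a-fA-F]+:/XXXXXXXX:/'
```

After masking, the line matches the `$expect` string regardless of the physical offset of the parity stripe.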